How is the standard deviation calculated in Pipeline Builder?

Standard deviation can be computed for a population or for a sample.
The calculation differs (dividing by N vs. N-1), which changes the final result.

Which one does Pipeline Builder use? Is it possible to tweak/configure this?

Hi @VincentF!

According to the example that pops up when you hover over the (?) icon next to “Standard Deviation” in Pipeline Builder, the formula used is the population standard deviation (/N). See the details in the table below (leaving it here for reference):

| value | mean | (value-mean)^2 | sample st. dev. (/(N-1)) | pop st. dev. (/N) |
|---|---|---|---|---|
| 2 | 3 | 1 | 1.00000 | 0.816496581 |
| 4 | | 1 | | |
| 3 | | 0 | | |
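
For reference, here is how the last two columns work out for the three values above (N = 3, mean = 3). The notation (σ for the population version, s for the sample version) is my own shorthand, not something shown in Pipeline Builder:

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2}
       = \sqrt{\frac{1+1+0}{3}} \approx 0.816496581
\qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2}
  = \sqrt{\frac{1+1+0}{2}} = 1
```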

As far as I know, there’s no way to configure this directly. If your use case requires the sample standard deviation, you would have to either create a transform step that manually computes the average and then applies the steps above as a window aggregate operation, or just take the population standard deviation and multiply it by sqrt(N) / sqrt(N-1).
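
To sanity-check the rescaling outside of Pipeline Builder, here is a minimal Python sketch using the numbers from the table. It only illustrates the arithmetic; the function name `sample_std_from_population` is made up for this example and is not a Pipeline Builder expression.

```python
import math
import statistics


def sample_std_from_population(pop_std: float, n: int) -> float:
    """Rescale a population standard deviation (/N) to a sample one (/(N-1))."""
    if n < 2:
        raise ValueError("the sample standard deviation needs at least 2 values")
    # sqrt(N) / sqrt(N-1) is the same factor as sqrt(N / (N-1))
    return pop_std * math.sqrt(n / (n - 1))


values = [2, 4, 3]
pop_std = statistics.pstdev(values)  # divides by N, like Pipeline Builder
sample_std = sample_std_from_population(pop_std, len(values))

print(pop_std, sample_std)
```

The rescaled value should agree with `statistics.stdev(values)`, which divides by N-1 directly. In a pipeline you would also need the row count N per group or window to apply the same factor.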

Hope this helps!

Best,
