How is the standard deviation calculated in Pipeline Builder?

Standard deviation can be computed for a population or for a sample.
The two formulas differ in the divisor (N vs. N-1), which changes the final result.

Which one does Pipeline Builder use? Is it possible to configure this?

Hi @VincentF !

As per the example that pops up when you hover over the (?) sign next to “Standard Deviation” in Pipeline Builder:

the standard deviation formula used is the population standard deviation (/N). See the details in this table below (leaving here for reference):

| value | mean | (value-mean)^2 | sample st. dev. (/(N-1)) | pop st. dev. (/N) |
|---|---|---|---|---|
| 2 | 3 | 1 | 1.00000 | 0.816496581 |
| 4 | | 1 | | |
| 3 | | 0 | | |
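
If you want to double-check the arithmetic in the table outside Pipeline Builder, here is a minimal sketch using Python's standard `statistics` module. It only reproduces the two formulas on the values 2, 4, 3; it is not how Pipeline Builder computes the result internally, and the variable names are just for illustration.

```python
import statistics

# Values from the table above
values = [2, 4, 3]

# Population standard deviation: sqrt(sum((x - mean)^2) / N)
pop_sd = statistics.pstdev(values)

# Sample standard deviation: sqrt(sum((x - mean)^2) / (N - 1))
sample_sd = statistics.stdev(values)

print("population st. dev.:", pop_sd)   # roughly 0.8165
print("sample st. dev.:", sample_sd)    # 1.0
```

The population figure is the one that matches what Pipeline Builder returns for this example.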

As far as I know, there’s no way to configure this directly. If your use case requires the sample standard deviation, you would have to either create a transform step to manually compute the average and then do the steps above as a window aggregate operation, or just take the population standard deviation and multiply it by sqrt(N) / sqrt(N-1), i.e. sqrt(N / (N-1)).
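
To illustrate the conversion factor, here is a small Python sketch. The function name `sample_from_population` is hypothetical (it is not a Pipeline Builder expression); in Pipeline Builder you would express the same arithmetic with the row count and the built-in standard deviation.

```python
import math

def sample_from_population(pop_sd: float, n: int) -> float:
    """Convert a population st. dev. (/N) to a sample st. dev. (/(N-1))."""
    if n < 2:
        # The sample standard deviation is undefined for fewer than 2 values
        raise ValueError("need at least 2 values")
    return pop_sd * math.sqrt(n / (n - 1))

# Using the example values 2, 4, 3: population st. dev. is sqrt(2/3), N = 3
converted = sample_from_population(math.sqrt(2 / 3), 3)
print(converted)  # approximately 1.0, the sample st. dev. for these values
```

Note that N here is the number of rows in each group or window, so the factor has to be computed per group if the group sizes differ.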

Hope this helps!

Best,