How is the standard deviation calculated in Pipeline Builder?

VincentF · February 3, 2025, 4:14pm

Standard deviation can be computed for a population or for a sample.
The calculation differs (dividing by N vs. N-1), which changes the final result.

Is Pipeline Builder using one or the other? Is it possible to tweak/configure this?

sboari · February 4, 2025, 1:17pm

Hi @VincentF !

As per the example that pops up when you hover over the (?) sign next to “Standard Deviation” in Pipeline Builder, the standard deviation formula used is the population standard deviation (/N). See the details in the table below (leaving it here for reference):

value	mean	(value - mean)^2	sample std. dev. (/(N-1))	population std. dev. (/N)
2	3	1	1.00000	0.816496581
4		1
3		0
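
For anyone who wants to double-check the numbers in the table, here is a minimal sketch in plain Python (outside Pipeline Builder, using only the standard library `statistics` module). The variable names are just for illustration.

```python
import statistics

# The three values from the table above
values = [2, 4, 3]

# Population standard deviation: sum of squared deviations divided by N
pop_sd = statistics.pstdev(values)

# Sample standard deviation: sum of squared deviations divided by N-1
sample_sd = statistics.stdev(values)

print(f"population st. dev. (/N):   {pop_sd:.9f}")
print(f"sample st. dev. (/(N-1)):   {sample_sd:.5f}")
```

The population figure is the one that should match what Pipeline Builder returns for the same three values, according to the tooltip example.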

As far as I know, there’s no way to configure this directly. If your use case requires the sample standard deviation, you can either create a transform step that manually computes the average and then applies the steps above as a window aggregate operation, or just take the population standard deviation and multiply it by sqrt(N) / sqrt(N-1).
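
The conversion can be sketched like this in plain Python. This is not Pipeline Builder syntax, and `sample_std_from_population` is a hypothetical helper name. It assumes N is the number of rows in the group or window the standard deviation was computed over, and that N is at least 2.

```python
import math


def sample_std_from_population(pop_std: float, n: int) -> float:
    """Rescale a population standard deviation (/N) to a sample one (/(N-1))."""
    if n < 2:
        # The sample standard deviation is undefined for fewer than 2 values
        raise ValueError("n must be at least 2")
    # sqrt(N) / sqrt(N-1) is the same factor as sqrt(N / (N-1))
    return pop_std * math.sqrt(n / (n - 1))


# Population st. dev. from the table above, with N = 3
converted = sample_std_from_population(0.816496581, 3)
print(f"{converted:.5f}")
```

In Pipeline Builder itself you would need a row count per group alongside the standard deviation to apply the same factor.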

Hope this helps!

Best,
