Standard deviation can be computed for populations or for samples.
The calculation differs (dividing by N vs. N-1), which changes the final result.
Does Pipeline Builder use one or the other? Is it possible to tweak/configure this?
Hi @VincentF !
According to the example that pops up when you hover over the (?) icon next to “Standard Deviation” in Pipeline Builder, the formula used is the population standard deviation (dividing by N). The table below works through that example (leaving it here for reference):
| value | mean | (value-mean)^2 | sample st. dev. (/(N-1)) | pop st. dev. (/N) |
|---|---|---|---|---|
| 2 | 3 | 1 | 1.00000 | 0.816496581 |
| 4 | | 1 | | |
| 3 | | 0 | | |
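If you want to double-check the numbers in the table outside of Pipeline Builder, here is a minimal Python sketch using only the standard library. It recomputes both variants for the values 2, 4, 3; the variable names are just for illustration and nothing here is Pipeline Builder code.

```python
import math
import statistics

values = [2, 4, 3]
n = len(values)

# Mean and squared deviations, matching the table columns.
mean = statistics.mean(values)
squared_devs = [(v - mean) ** 2 for v in values]

# Population standard deviation divides the sum of squares by N.
pop_sd = math.sqrt(sum(squared_devs) / n)

# Sample standard deviation divides by N-1 instead.
sample_sd = math.sqrt(sum(squared_devs) / (n - 1))

# Cross-check against the standard library's own implementations.
assert math.isclose(pop_sd, statistics.pstdev(values))
assert math.isclose(sample_sd, statistics.stdev(values))

print(f"population: {pop_sd:.9f}, sample: {sample_sd:.5f}")
```

The population value is the one that matches what the Pipeline Builder tooltip example shows.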
As far as I know, there’s no way to configure this directly. If your use case requires the sample standard deviation, you would have to either create a transform step that computes the average manually and then perform the steps above as a window aggregate operation, or simply take the population standard deviation and multiply it by sqrt(N) / sqrt(N-1).
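To illustrate the second workaround, here is a rough Python sketch of the correction factor. The function name `sample_sd_from_population_sd` is hypothetical, and I'm assuming N is the number of non-null values that went into the aggregate; in a pipeline you would express the same arithmetic with a count and a square root on the aggregated columns.

```python
import math


def sample_sd_from_population_sd(pop_sd: float, n: int) -> float:
    """Convert a population standard deviation into a sample one.

    Hypothetical helper: multiplies by sqrt(N) / sqrt(N-1), which is
    the same as sqrt(N / (N-1)).
    """
    if n < 2:
        # With fewer than two values, N-1 is zero or negative and the
        # sample standard deviation is undefined.
        raise ValueError("sample standard deviation needs at least 2 values")
    return pop_sd * math.sqrt(n / (n - 1))


# Using the population value from the table above, with N = 3.
print(sample_sd_from_population_sd(0.816496581, 3))
```

Note the correction is only defined for N of 2 or more, so groups with a single row need separate handling (for example, returning null).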
Hope this helps!
Best,