Pipeline Builder - any way to detect columns that have only null values and drop them?

Hi, I would like to check whether there are any out-of-the-box functions in Pipeline Builder that I can use to detect columns that contain only null values and drop them.

It is easy to do this in code using Spark, but I am trying to find a low-code way for my user.

Thank you!

As far as I know, this is not currently possible in Pipeline Builder.

The difficulty with doing this dynamically in Pipeline Builder is that the transform's output schema would depend on the data: a column would be present or absent depending on whether it contains only nulls in the input, which could cause unpredictable behaviour downstream.

For example, if a column contains non-null values today and I use it in transforms in my pipeline, and then tomorrow all of its values are suddenly null, the column would be dropped and the pipeline could break in unexpected ways.
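
To make the schema problem concrete, here is a minimal sketch in plain Python (not Pipeline Builder or Spark code). Rows are modelled as dictionaries, `drop_all_null_columns` and `schema` are hypothetical helpers invented for this illustration, and the column names and values are made up.

```python
from typing import Any

Row = dict[str, Any]


def drop_all_null_columns(rows: list[Row]) -> list[Row]:
    """Return rows without the columns whose value is None in every row.

    Hypothetical helper for illustration only; not a Pipeline Builder function.
    """
    if not rows:
        return []
    # Keep a column if at least one row has a non-null value for it.
    keep = [col for col in rows[0] if any(r.get(col) is not None for r in rows)]
    return [{col: r.get(col) for col in keep} for r in rows]


def schema(rows: list[Row]) -> list[str]:
    """Column names of the first row, standing in for the dataset schema."""
    return list(rows[0]) if rows else []


# Made-up data: "discount" has a value today but is entirely null tomorrow.
today = [
    {"id": 1, "price": 10.0, "discount": 0.5},
    {"id": 2, "price": 20.0, "discount": None},
]
tomorrow = [
    {"id": 3, "price": 30.0, "discount": None},
    {"id": 4, "price": 40.0, "discount": None},
]

schema_today = schema(drop_all_null_columns(today))
schema_tomorrow = schema(drop_all_null_columns(tomorrow))

print(schema_today)     # ['id', 'price', 'discount']
print(schema_tomorrow)  # ['id', 'price']
```

The same logic produces two different schemas from one day to the next, so any downstream step that reads "discount" would work today and fail tomorrow.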

As part of data cleansing in a pipeline, in my opinion it’s important to look at each column and decide whether you need it, even if it currently contains only null values, as that might change in future!
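
One way to picture this, sketched in plain Python rather than Pipeline Builder: choose the columns explicitly so the output schema is fixed no matter what the data contains. `KEEP_COLUMNS` and `select_columns` are hypothetical names for this illustration, and the sample rows are made up.

```python
from typing import Any

# Hypothetical, hand-picked list of columns the pipeline should output.
KEEP_COLUMNS = ["id", "price", "discount"]


def select_columns(rows: list[dict[str, Any]], columns: list[str]) -> list[dict[str, Any]]:
    """Project every row onto a fixed column list, keeping nulls as None."""
    return [{col: row.get(col) for col in columns} for row in rows]


# Made-up input where "discount" happens to be null in every row.
rows = [
    {"id": 1, "price": 10.0, "discount": None, "internal_note": "x"},
    {"id": 2, "price": 20.0, "discount": None, "internal_note": "y"},
]

selected = select_columns(rows, KEEP_COLUMNS)
print(list(selected[0]))  # ['id', 'price', 'discount']
```

Here "discount" stays in the output even though it is all null, and "internal_note" is dropped because it was deliberately left out of the list, not because of its values.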

Understood! Thank you @jgreensmith
