PB Feature Request: Get dataframe().columns

In a code repository, I use input_dataset.dataframe().columns to get all the columns. For example:

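from pyspark.sql import Column, DataFrame, functions as F, types as T
# Cast every column to string, then join them into a single change-detection key
string_columns: list[Column] = [F.col(c).cast(T.StringType()) for c in df.columns]
df: DataFrame = df.withColumn('check_changes', F.concat_ws('_', *string_columns))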

In Pipeline Builder, I need to carefully click “Add Item” for all ~100 columns, in the correct order, in the concatenate string transform. Please enable an option to either select all, or preferably a way to paste a list of columns, similar to how the Select transform allows for pasting a list of columns. (This would also help with the pain of manually dragging the handles to re-order 100 columns.)
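
To make the "paste a list of columns" request concrete, here is a rough sketch of the behavior in code repo terms. The helper name parse_pasted_columns and its validation rules are hypothetical, not an existing Pipeline Builder or Foundry API; it only illustrates turning pasted text into an ordered, validated column subset.

```python
import re


def parse_pasted_columns(pasted: str, available: list[str]) -> list[str]:
    """Hypothetical helper: turn pasted text into an ordered list of column names.

    Names may be separated by commas or newlines. The order of the pasted
    text is kept, since it determines the concatenation order.
    """
    names = [n.strip() for n in re.split(r"[,\n]", pasted) if n.strip()]

    # Fail loudly on names that are not in the dataset, rather than skipping them
    unknown = [n for n in names if n not in available]
    if unknown:
        raise ValueError(f"Unknown columns: {unknown}")

    # Drop repeated names, keeping the first occurrence of each
    return list(dict.fromkeys(names))
```

In a code repository, the returned list could replace df.columns in the snippet above, for example F.concat_ws('_', *[F.col(c).cast(T.StringType()) for c in selected]), where selected is the helper's result.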


Hey @Joel, just to clarify: you want an "Add all" option for the Concatenate strings board?

Yes please! “Add all” would be okay; the ability to paste in a subset of the columns would be even better.

Understood! I've filed a feature request that the Pipeline Builder team will track, and I'll let you know when it's available to users.
