DataFrame-based UDFs in Pipeline Builder?

tstearns · April 21, 2025, 12:40pm

I’m using Python functions as UDFs in Pipeline Builder, via this guide: https://www.palantir.com/docs/foundry/functions/python-functions-builder/

The examples given in those docs are all row-by-row UDFs which take scalar inputs and produce an output per row. Is it possible instead to pass in something like a DataFrame so that I can compute my own aggregation across rows? I believe PySpark calls this a “UDTF”.

I’ve seen cases where an array is stored per row in order to pass an array of values to a UDF to emulate this, but I’m curious whether it’s possible to directly operate across rows in a UDF in Pipeline Builder. Thank you!
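
For reference, here is a rough sketch of the array workaround I mean. It assumes an upstream step has already collected each group's values into one array column, so the row-by-row UDF receives a whole list per row. The function name `trimmed_mean`, the `trim_fraction` parameter, and the sample rows are hypothetical, and the step of registering the function for Pipeline Builder (per the guide above) is left out.

```python
from typing import List, Optional


def trimmed_mean(values: List[float], trim_fraction: float = 0.1) -> Optional[float]:
    """Mean of `values` after dropping the lowest and highest `trim_fraction` share.

    Hypothetical custom aggregation: it is called once per row, but each row
    carries an array holding all values for one group.
    """
    if not values:
        return None  # an empty group has no mean
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Simulated rows after the upstream "collect into array" step (made-up data).
rows = [
    {"group": "a", "values": [1.0, 2.0, 3.0, 4.0, 100.0]},
    {"group": "b", "values": []},
]

# One call per row, which is all a row-by-row UDF can do.
results = {row["group"]: trimmed_mean(row["values"], 0.2) for row in rows}
```

The downside is that every group's values have to fit in a single array cell, which is why I'd prefer to operate across rows directly.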

sperchanok · April 21, 2025, 2:10pm

Hi @tstearns. Thanks for your question. This is currently not possible in Pipeline Builder. We only support row-by-row UDFs at the moment.

As a follow-up, what are you trying to use a UDTF for?
