Update transform input dynamically

I am trying to change a transform input to include all the datasets in a given folder. I figured out that the dataset RIDs for a given folder can be fetched via a Compass API, but I was wondering what the best way would be to update this in a transform.
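For context, this is roughly how I'm fetching the child dataset RIDs right now. The endpoint path, response shape, and hostname below are what I pieced together myself, so treat them as assumptions and double-check against your stack:

```python
import requests

FOUNDRY_URL = "https://your-stack.palantirfoundry.com"  # placeholder hostname
TOKEN = "<bearer-token>"                                 # e.g. a service-user token
FOLDER_RID = "ri.compass.main.folder.xxxx"               # placeholder folder RID


def list_child_dataset_rids(folder_rid: str) -> list:
    """Return the dataset RIDs directly contained in a Compass folder
    (assumed endpoint shape -- verify against your stack's API docs)."""
    response = requests.get(
        f"{FOUNDRY_URL}/compass/api/folders/{folder_rid}/children",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    response.raise_for_status()
    children = response.json().get("values", [])
    # Keep only dataset resources (RIDs of the form ri.foundry.main.dataset.*)
    return [c["rid"] for c in children if c["rid"].startswith("ri.foundry.main.dataset.")]
```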

Chiming in from the Pipeline Builder side -- we currently don't support a dynamic number of inputs, so Code Repositories would be your best bet here.

I think the official solution is this one:

https://palantir.com/docs/foundry/building-pipelines/compass-file-lister/


Hm. I guess the main issue is not how to get the RIDs in a folder, or how to wire the inputs into a transform, e.g. via a factory pattern that produces your transforms based on data. The main issue that I see is also re-running checks when the data updates so that the lineage stays complete…
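To illustrate the factory-pattern idea, here's a rough sketch only -- the RIDs and output paths are placeholders, and the generated transforms still have to be registered with your repo's pipeline (e.g. via `add_transforms` in pipeline.py):

```python
from transforms.api import transform_df, Input, Output

# Placeholder RIDs -- in practice this list would come from the Compass lookup
# (or be auto-committed into the repo, as in @nicornk's approach).
SOURCE_RIDS = [
    "ri.foundry.main.dataset.aaaa",
    "ri.foundry.main.dataset.bbbb",
]


def make_transform(source_rid, index):
    """Factory: build one transform per source dataset."""
    @transform_df(
        Output(f"/My/Project/cleaned/source_{index}"),  # hypothetical output path
        source=Input(source_rid),
    )
    def compute(source):
        return source  # real cleaning logic would go here
    return compute


# Register these in pipeline.py, e.g. my_pipeline.add_transforms(*TRANSFORMS)
TRANSFORMS = [make_transform(rid, i) for i, rid in enumerate(SOURCE_RIDS)]
```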

I think the solution of @nicornk does this by auto-committing to your repo, is that correct? Unfortunately, it seems that feature is already facing sunset.

What speaks against unioning the datasets in a prior step? Do you have no control over the uploaded resources? Maybe you need a mechanism where you append to existing datasets instead of adding new ones next to them…

How often do you expect the data to arrive? If you're looking for an easy way to do the initial load in case you have hundreds or thousands of datasets that you would like to union, you could try using a dict comprehension over the list of RIDs to generate the kwargs required by the @transform decorator…
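Roughly like this -- a sketch only, where the RID list and output path are placeholders and I'm assuming all sources share a compatible schema:

```python
from functools import reduce

from transforms.api import transform, Input, Output

# Placeholder list -- e.g. produced from the Compass folder lookup.
SOURCE_RIDS = [
    "ri.foundry.main.dataset.aaaa",
    "ri.foundry.main.dataset.bbbb",
    # ...hundreds more
]

# Build the Input kwargs dynamically; keys must be valid Python identifiers.
input_kwargs = {f"source_{i}": Input(rid) for i, rid in enumerate(SOURCE_RIDS)}


@transform(
    out=Output("/My/Project/combined"),  # hypothetical output path
    **input_kwargs,
)
def combine(out, **sources):
    # Union all inputs by column name; assumes the schemas are compatible.
    dfs = [inp.dataframe() for inp in sources.values()]
    out.write_dataframe(reduce(lambda a, b: a.unionByName(b), dfs))
```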

just some thoughts… depends on what exactly you’re after.

The use case is a user uploading CSV files manually and then automatically running transforms on the new dataset plus all the existing datasets combined. Looking at using Attachments in an Action to have users upload the .csv files right now!


This sounds similar to this issue: How to hide the “replace” radio button for manually imported data? - Ask the Community - Palantir Developer Community

My suggested solution was to create a mini Slate app that assists in uploading the CSVs into a schemaless dataset containing the raw files. That way you only need a single input to your transform.
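Something along these lines for the transform side -- just a sketch, where the paths are placeholders and I'm assuming the uploads are plain CSVs that all share the same header:

```python
import csv

from pyspark.sql import Row
from transforms.api import transform, Input, Output


@transform(
    out=Output("/My/Project/combined"),       # hypothetical output path
    raw=Input("/My/Project/uploaded_csvs"),   # schemaless dataset holding the raw files
)
def parse_uploads(ctx, out, raw):
    rows = []
    # Iterate over every CSV file uploaded into the schemaless dataset.
    for f in raw.filesystem().ls(glob="*.csv"):
        with raw.filesystem().open(f.path, "r") as fh:
            for record in csv.DictReader(fh):
                rows.append(Row(**record))
    # Assumes all uploads share the same header; adjust if schemas can drift.
    out.write_dataframe(ctx.spark_session.createDataFrame(rows))
```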
