Appending to a Dataset (instead of rewriting) from Pipeline Builder

I have a pipeline that takes as input a series of PDFs that are manually uploaded via a Workshop front-end. As new PDFs come in, I want to keep the existing dataset and append the new media sets. It seems that Pipeline Builder has no functionality for appending to a dataset rather than repopulating it completely.

Does anyone know of a workaround?

Pipeline Builder does support incremental datasets (and therefore the ability to append). Take a look at the docs here: https://www.palantir.com/docs/foundry/building-pipelines/create-incremental-pipeline-pb

Pipeline Builder does not yet support incremental media sets; for that you’ll need to use Python transforms. See this community post for information on incremental media sets: https://community.palantir.com/t/pipeline-builder-incremental-mediasets-pdf-extraction/1279
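
To make the "append instead of rewrite" idea concrete, here is a rough sketch in plain Python. It is only an illustration of the semantics and does not use the Foundry transforms API: `AppendOnlyDataset`, `extract_text`, and `run_incremental` are hypothetical names made up for this example, and the PDF extraction is a placeholder. Each run processes only the files it has not seen before and appends their rows, leaving earlier rows untouched.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyDataset:
    """Toy stand-in for an output dataset whose rows persist across runs."""
    rows: list = field(default_factory=list)
    processed_paths: set = field(default_factory=set)


def extract_text(pdf_path: str) -> str:
    # Hypothetical placeholder; a real pipeline would parse the PDF here.
    return f"text extracted from {pdf_path}"


def run_incremental(uploaded_paths, output: AppendOnlyDataset) -> list:
    """Process only PDFs not handled in earlier runs and append their rows.

    Returns the list of paths that were newly processed in this run.
    """
    new_paths = []
    for path in uploaded_paths:
        if path in output.processed_paths:
            continue  # already in the dataset, so leave the existing row alone
        output.rows.append({"path": path, "text": extract_text(path)})
        output.processed_paths.add(path)
        new_paths.append(path)
    return new_paths
```

For example, calling `run_incremental(["a.pdf", "b.pdf"], out)` and then `run_incremental(["a.pdf", "b.pdf", "c.pdf"], out)` on the same `out` processes only `c.pdf` on the second call, so the dataset ends up with three rows rather than being rebuilt from scratch.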
