Make a change to a pipeline that will only affect future stream inputs

willstone · December 23, 2024, 4:29pm

Hi there,

I have an incremental pipeline whose input dataset is a stream that I continuously append data to.

As part of this pipeline, I have some LLM nodes. I want to change the prompts of those LLM nodes without re-processing my entire input dataset, as that would be expensive.

Is there a way I can do this without starting a fresh new input dataset?

Thanks

helenq · January 6, 2025, 2:16pm

Hey @willstone, if it’s just changes to the prompt text, you should be able to do this without re-processing. Another option is the Skip recomputing rows feature: https://www.palantir.com/docs/foundry/pipeline-builder/pipeline-builder-llm#skip-computing-already-processed-rows

When Skip recomputing rows is enabled, rows will be compared with previously processed rows based on the columns and parameters passed into the input prompt. Matching rows with the same column and parameter values will get the cached output value without reprocessing in future deployments.
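
To make the caching behaviour concrete, here is a rough Python sketch of the idea as I understand it from the description above. This is not Pipeline Builder's actual implementation or API; `make_key`, `process_rows`, `llm_call` and the in-memory `cache` dict are all hypothetical names used only for illustration, and the assumption is that the cache key is built purely from the column and parameter values passed into the prompt.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical cache key: the sorted column values plus the sorted parameter values.
CacheKey = Tuple[Tuple[str, str], ...]


def make_key(row: Dict[str, str], params: Dict[str, str]) -> CacheKey:
    """Build a key from the column and parameter values fed to the prompt."""
    # Prefix and sort the items so the key does not depend on dict ordering
    # and a column cannot collide with a parameter of the same name.
    cols = sorted(("col:" + name, value) for name, value in row.items())
    prms = sorted(("param:" + name, value) for name, value in params.items())
    return tuple(cols) + tuple(prms)


def process_rows(
    rows: List[Dict[str, str]],
    params: Dict[str, str],
    llm_call: Callable[[Dict[str, str], Dict[str, str]], str],
    cache: Dict[CacheKey, str],
) -> List[str]:
    """Return one output per row, calling the LLM only for unseen keys."""
    outputs = []
    for row in rows:
        key = make_key(row, params)
        if key not in cache:
            # Only rows whose values were not seen before reach the LLM.
            cache[key] = llm_call(row, params)
        outputs.append(cache[key])
    return outputs
```

In this sketch, running `process_rows` a second time with the same `cache` and some newly appended rows would call `llm_call` only for the new rows; rows with matching column and parameter values would reuse the stored output. How the real feature treats other kinds of changes should be checked against the linked docs.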