Appending to a global table across transforms of a single transform generator

1353bbf351a3cbbb4eda · June 4, 2025, 8:21pm

Hi, I’m trying to combine incremental and non-incremental functionality in a code repo, and I’m unsure how to approach it. I’d like advice on implementing a series of transforms within a transform generator, where each transform writes a new transformed output dataset for its input dataset. In addition, I’d like a single lookup dataset that every transform appends to incrementally: it should start as a new dataset at the beginning of the loop of transforms and then be appended to by each transform.

Caveats:

- You can’t have multiple transforms writing to the same dataset, so even using the incremental decorator won’t work here.
- Using **inputs to bring multiple inputs into a single transform and process all the files there would make it difficult to avoid OOMs, as the datasets contain billions of rows.