I am working with a dataset that is a snapshot of inventory positions. Is there a way to build a recursive dataset in Foundry, where the positions dataset is updated using its own previous value, given that cyclical dependencies are not allowed?
The best solution I can think of so far is to export the data out of Foundry and then re-ingest it on a schedule.
Is this problem best solved with the Ontology, or are there other good options?
Basic Flow Diagram:
Possible solution where dataset B is outside of Foundry:
This sort of architecture is supported in a first-class way with incremental transforms, which allow you to read the existing output. The documentation is comprehensive and should hopefully be enough to get you on your way!
Thank you for the information.
How would this resolve the cyclical dependency? As far as I can tell, regardless of the incremental decorator, I cannot use the same dataset as both an input and the output of a transform:
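from transforms.api import transform, Input, Output


@transform(
    product=Input('/examples/datasetA'),
    newData=Input('/examples/datasetB'),
    processed=Output('/examples/datasetA'),  # same dataset as the `product` input
)
def compute(product, newData, processed):
    ...  # transform logic omitted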
You don’t declare the dataset as an input; you declare it only as an output, and you take advantage of the fact that the IncrementalTransformOutput class has methods like filesystem and dataframe that allow you to retrieve the existing data.
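To make the idea concrete, here is a minimal plain-Python model of the pattern, not Foundry code. It assumes the new data arrives as per-SKU quantity deltas; `SnapshotOutput`, `run_build`, and the sample SKUs are hypothetical names invented for illustration, standing in for an output that can hand back what it last wrote. The point is that the state lives in the output itself, so the build never needs the dataset as a declared input.

```python
from typing import Dict, Optional

Positions = Dict[str, int]  # SKU -> quantity on hand


class SnapshotOutput:
    """Hypothetical stand-in for an output dataset that remembers its last written version."""

    def __init__(self) -> None:
        self._committed: Optional[Positions] = None

    def previous(self) -> Positions:
        # Returns a copy of the last written snapshot; empty on the very first run.
        return dict(self._committed or {})

    def write(self, positions: Positions) -> None:
        # Replaces the stored snapshot with the new one.
        self._committed = dict(positions)


def run_build(new_data: Positions, processed: SnapshotOutput) -> None:
    """One build: positions_t = positions_(t-1) + deltas_t."""
    positions = processed.previous()  # read the existing output, not a declared input
    for sku, delta in new_data.items():
        positions[sku] = positions.get(sku, 0) + delta
    processed.write(positions)


dataset_a = SnapshotOutput()
run_build({"widget": 10, "gadget": 4}, dataset_a)  # first run: no previous data
run_build({"widget": -3, "gizmo": 7}, dataset_a)   # second run: builds on the first
result = dataset_a.previous()
print(result)
```

In a real incremental transform, the role of `previous()` would be played by the output's own read methods mentioned above; check the incremental transforms documentation for the exact modes and arguments they accept.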