Breaking Cyclic Dependencies in Foundry Code Repository Transforms
Problem Description
I’m encountering persistent cyclic dependency errors in my Python transforms. My workflow involves:
A materialization dataset (mat-rid) that contains study details
A transform that needs to:
Process some basic study details
Merge them with data from the materialization
Output a dataset that eventually feeds back into the ontology that creates the materialization
This creates a cyclic dependency: transform → output → ontology → materialization → transform input
What I’ve Tried
Creating a snapshot transform in a separate Python file that copies the materialization to a new dataset, then using that snapshot in my main transform
Splitting my transform into two separate transforms (one for processing, one for merging with materialization)
Creating a separate pipeline that syncs the materialization to an intermediate dataset
Despite these approaches, I’m still encountering cyclic dependency errors.
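One way to see why the workarounds fail: a snapshot or sync step inserted between the materialization and the transform only lengthens the loop, it does not remove it. A small illustration (the node names are just labels for the chain described above, not Foundry identifiers):

```python
# The dependency chain from the post, as a directed graph.
edges = {
    "transform": ["output"],
    "output": ["ontology"],
    "ontology": ["materialization"],
    "materialization": ["transform"],  # materialization feeds the transform's input
}

def has_cycle(graph):
    """Detect a cycle via depth-first search with a 'currently visiting' set."""
    visiting, done = set(), set()

    def dfs(node):
        if node in visiting:
            return True   # back-edge: we returned to a node on the current path
        if node in done:
            return False
        visiting.add(node)
        if any(dfs(n) for n in graph.get(node, [])):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(n) for n in graph)
```

Inserting a "snapshot" node between `materialization` and `transform` (as in the attempts above) still leaves `has_cycle` returning `True`; only removing one of the edges, e.g. by sourcing the old data from somewhere outside the build graph, breaks the loop.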
My Transform Structure
My main transform looks like this:
```python
@transform(
    output=Output("output-rid"),
    basic_study_details=Input("input-rid"),
    db=Input("db-rid"),
    basic_study_details_mat=Input("mat-rid"),
)
def process_datasets(ctx, basic_study_details, db, output, basic_study_details_mat):
    # Process data and merge with materialization
    # ...
```
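For context, the merge step inside `process_datasets` amounts to preferring newly processed rows over previously materialized ones. A framework-free sketch of that logic, assuming a hypothetical key column `study_id` (in the real transform this would be a Spark join/union on DataFrames, not dicts):

```python
def merge_with_materialization(new_rows, materialized_rows, key="study_id"):
    """Union new and previously materialized rows, preferring new ones.

    Rows are modeled as dicts keyed by a hypothetical `study_id` column.
    """
    merged = {row[key]: row for row in materialized_rows}
    merged.update({row[key]: row for row in new_rows})  # new rows win on key clash
    return list(merged.values())
```

For example, if the materialization holds studies 1 and 2 and the new batch holds an updated study 2 plus a new study 3, the result is three rows with study 2 taken from the new batch.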
Original Problem:
I arrived at this approach of merging the old data (from the ontology materialization) with newly processed data because we want to move away from Palantir's @incremental approach. We release frequently (new features and updates), which keeps forcing snapshot builds instead of incremental ones, re-executing already processed data, which is strictly not acceptable for us.
Questions
What’s the most effective way to break this cyclic dependency while still ensuring my transform has access to the latest materialization data?
Are there specific patterns or best practices for working with materializations in transforms that I should follow?
Is there a way to configure Foundry to ignore certain dependencies for cycle detection?
Any insights or examples from similar situations would be greatly appreciated!
It would be great if it were possible to move this into the Ontology.
From your reply, I could not understand how action logs would help me here; could you please elaborate?
“What’s the most effective way to break this cyclic dependency while still ensuring my transform has access to the latest materialization data?”
Using action logs could work here.
As for your other question, I am open to any solution that addresses my original problem statement.
The other question is: does this have to be done in the Python transform, or could all the logic move north of the Ontology?
If this flow truly has to exist, then my only recommendation is a mirrored or second object type (OT) in your Ontology plus automations.
I have a similar flow that I did not entirely do in transforms or north of the ontology because the processing is too complex and needs to happen in Pipeline Builder. Specifically, I am incrementally processing rows of data, and each time I do, I get a ‘cursor’ to tell me where I left off, but that ‘cursor’ has to be parsed from some pretty complex code that I wanted to handle in Builder.
So, to get around the dependency problem: when the cursor information is parsed and reaches the ontology, I trigger an automation (the condition being any change to that OT) that edits the value on another object type that was feeding into my transforms in the first place.
Automate does have warnings about cycling that you can turn on, which will help you avoid creating an infinite loop, but this got me around the dependency.
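The cursor pattern described above can be sketched in plain Python; `process_batch` and the `seq` field are illustrative names, not Foundry APIs:

```python
def process_batch(rows, cursor=0):
    """Process only rows past `cursor`; return (processed_rows, new_cursor).

    Each run picks up where the previous one left off, so already
    processed rows are never re-executed.
    """
    pending = [r for r in rows if r["seq"] > cursor]
    processed = [{**r, "processed": True} for r in pending]
    new_cursor = max((r["seq"] for r in pending), default=cursor)
    return processed, new_cursor
```

On the first run the cursor starts at 0 and everything is processed; feeding the returned cursor into the next run processes only rows that arrived since, which is the behavior the automation-driven loop above preserves without an @incremental transform.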