I have an external transform that makes a call to an external system, and I want to structure that call to run a filter on the data that only pulls data since the last time my output dataset built. Is there a way to get the last successful build start time so I can use that in the transform API call? I don’t want to make an incremental dataset, as I want to use the transform to make other API calls based on the response of the first call, and then finally write some new rows to my dataset.
Do you want the last successful Foundry build start time, or the time at which the most recent external filter call was made? Those are subtly different. I would expect the latter, since it ensures the filtered time range in the next build runs from the last time the external system was queried to the current time. A few options for handling that timestamp statefully in external transforms:
- Use incremental processing to store a “state” file with the latest query timestamp in the output of your external transform. On the next build, read that file back from the output and use the timestamp for the next set of API calls.
- Store the time of the last build in the output dataset as a column. Use incremental transforms to access the `current` output and collect the value in that column. Use that value as part of your filter in the external call, then write the new rows to the dataset.
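As a rough illustration of the state-file option, here is a minimal sketch in plain Python. It uses a local directory as a stand-in for the output dataset's files, so it deliberately avoids any Foundry-specific calls. The file name `last_query_state.json`, the helper names, and the `fetch_since` callback are all hypothetical; in a real transform the read and write would go through the incremental output instead of `pathlib`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = "last_query_state.json"  # hypothetical file name


def read_last_query_time(state_dir):
    """Return the stored query time, or None on the very first run."""
    path = Path(state_dir) / STATE_FILE
    if not path.exists():
        return None
    payload = json.loads(path.read_text())
    return datetime.fromisoformat(payload["last_query_time"])


def write_last_query_time(state_dir, queried_at):
    """Persist the query time so the next run can start from it."""
    path = Path(state_dir) / STATE_FILE
    path.write_text(json.dumps({"last_query_time": queried_at.isoformat()}))


def run_once(state_dir, fetch_since, now=None):
    """Query the external system for the window (last query time, now]."""
    # Capture the time before calling out, so the next window starts
    # exactly where this one ended.
    queried_at = now or datetime.now(timezone.utc)
    since = read_last_query_time(state_dir)
    rows = fetch_since(since, queried_at)  # hypothetical external API call
    # Only advance the state after the call has succeeded.
    write_last_query_time(state_dir, queried_at)
    return rows
```

Because the timestamp is recorded before the external call and saved only after it returns, a failed run leaves the old state in place and the next run retries the same window.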
To write to the output, if you only want to add new rows, you should be able to use an output write mode of `modify`. Either way, you’ll probably want to use an Incremental Transform to have access to more flexibility to read state from the prior output and control how you are writing / overwriting data as needed.
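For the column-based option, the logic around the Foundry read and write might look roughly like the sketch below. It is plain Python over lists of dicts, not the transforms API: `existing_rows` stands in for whatever you collect from the current output, and the returned rows stand in for what you would append with the `modify` write mode. The column name `queried_at` and both helper names are assumptions for illustration.

```python
from datetime import datetime, timezone


def last_query_time(existing_rows, column="queried_at"):
    """Return the latest timestamp stored in the column, or None if empty."""
    values = [row[column] for row in existing_rows if row.get(column) is not None]
    return max(values) if values else None


def build_new_rows(api_records, queried_at, column="queried_at"):
    """Stamp each new record with the time the external query was made."""
    return [{**record, column: queried_at} for record in api_records]


# Example usage with stand-in data
existing = [
    {"id": 1, "queried_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "queried_at": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]
since = last_query_time(existing)          # filter lower bound for the API call
now = datetime(2024, 1, 3, tzinfo=timezone.utc)
to_append = build_new_rows([{"id": 3}], now)
```

One consequence of this design is that a run returning no records appends no rows, so the stored timestamp does not advance; the state-file approach does not have that limitation.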