How can you force a pipeline to build transforms that use recomputed UDFs?

I have a simple Pipeline Builder pipeline that refreshes an object set every day for training purposes. Using Python user-defined functions (UDFs), it sets each object’s creation date to today’s date and randomises a couple of other properties on each object to keep the training objects interesting.
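As a rough sketch of the kind of logic the UDFs apply (plain Python, not the actual Pipeline Builder UDF definition; the function name, property names, and values below are hypothetical and only for illustration):

```python
import random
from datetime import date

# Hypothetical property values, made up for this illustration.
STATUSES = ["open", "in_progress", "closed"]


def refresh_object(row, today=None, rng=None):
    """Return a copy of `row` with a fresh creation date and randomised properties."""
    today = today or date.today()   # defaults to the date of the run
    rng = rng or random             # defaults to the module-level generator
    refreshed = dict(row)           # copy, so the input row is left untouched
    refreshed["created_date"] = today.isoformat()
    refreshed["status"] = rng.choice(STATUSES)
    refreshed["priority"] = rng.randint(1, 5)
    return refreshed
```

Because the result depends on the current date and on random values, the output differs on every run even when the skeleton input has not changed.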

When I build the output dataset, it runs the UDF transforms correctly and updates the output dataset correctly. However, when I schedule a build it appears to either not run at all or run without recomputing the UDF values in my transform.

How can I force Pipeline Builder to recompute the transforms as part of a scheduled build? I am looking for an option something like “Force full build”, but I can’t see this, or any option to include re-running the transforms as part of the build.

Assuming this can’t be done with core functionality, how can I modify my pipeline to trigger the transforms as part of a scheduled build? My pipeline currently looks like:

object_set_skeleton (Fusion table) > transforms_udf (Transforms) > object_set (Output dataset)

In your schedule, do you have this option checked? This is the schedule view from Data Lineage.

Thank you, I hadn’t even thought to access the scheduled build via Data Lineage! This has resolved my issue.
