Summary
In this post, we share strategies for accelerating Foundry Rules (Taurus) builds. By default, these builds run as snapshots. By ejecting to a self-managed transform, we can adjust the Java transform code so that only new or edited rules are considered in each build, reducing run times and costs drastically.
In one example on a Foundry instance, build times dropped from over 2 hours to 3 minutes, reducing the daily average cost from hundreds of dollars to just a few dollars.
Before we go into the details, be aware that this setup requires a self-managed transform.
Customizing your Foundry Rules pipeline is an advanced feature intended for experienced Foundry pipeline authors. This customization can result in increased implementation and maintenance burden for workflow administrators.
You can find instructions on how to eject to a self-managed transform here.
Problem
- The standard Foundry Rules transform is a great starting point, but it evaluates all rules across all historic data in every single build.
- So even when just a single rule is edited, all rules are re-evaluated and the outcome is snapshotted.
- This becomes a problem as the input datasets grow and/or the number of rules increases, which directly leads to longer build times and higher resource usage.
At the same time, users of Foundry Rules expect to see the results of their changes within a reasonable timeframe after an edit. So the user experience increasingly degrades with larger data scale and a higher number of rules.
Solution
Key strategy
- We want to run the build only for relevant rules: those that should be evaluated and where we expect a change in the outcome dataset. Using the last evaluation timestamp and the current timestamp at build time, we can dynamically identify relevant rules with our own logic.
- We use the previous transaction of the build output to merge new rule evaluations with still-valid previous evaluations. This way, all up-to-date evaluations end up in one transaction again.
- Please note: this is close to an incremental build, but purely on the level of the rules. When a rule is identified for evaluation, the transform still uses the entirety of all input datasets to evaluate that rule.
How?
Leveraging the Rule Status dataset
Alongside the rules output, Foundry Rules also creates a rule status dataset which contains a property last_modified_timestamp for the last edit of a rule. Additionally, we can change the transform to also save the last evaluation timestamp as last_evaluation_timestamp.
results.ruleStatusOutput().getValue().withColumn(
    "last_evaluation_timestamp", functions.current_timestamp()
)
With these two timestamps, we can filter out rules that are not relevant for evaluation and only keep the ones where last_modified_timestamp > last_evaluation_timestamp.
This is the baseline for our incremental strategy, but we can also define further criteria to still evaluate rules that have not changed. Here is a code snippet defining a new boolean property should_evaluate that indicates for each rule whether it should be evaluated.
.withColumn("should_evaluate",
    // Only run rules which have been edited since the last transform run
    rules.col("last_modified_timestamp").gt(latestStatus.col("last_evaluation_timestamp"))
    // or rules which haven't been evaluated in more than 7 days
    .or(latestStatus.col("last_evaluation_timestamp").lt(date_sub(current_date(), 7).cast(DataTypes.TimestampType)))
    // or rules that have never been run
    .or(latestStatus.col("last_evaluation_timestamp").isNull())
)
Then we perform a leftsemi join from the rules dataset with the status dataset, which results in a dataset of rules that should be evaluated.
Dataset<Row> rulesToBeEvaluated = rules.join(
rulesToEvaluate.filter(col("should_evaluate").equalTo(true)),
rulesToEvaluate.col("ruleId").equalTo(rules.col("rule_id")),
"leftsemi"
);
Leveraging transactions
When evaluating only a subset of the rules in each build, we need to reuse previous transform outcomes. outcome is the actual rule output, and with the following code we access its previous transaction.
Dataset<Row> previousOutcome = outcome.getExistingOutputDataFrame().get();
Dataset<Row> previousStatus = rulesStatus.getExistingOutputDataFrame().get();
We take our existing outcome dataset and remove every row with a rule_id that is to be evaluated in this transform. In technical terms, we run a left_anti join on the previous outcome with rulesToBeEvaluated on rule_id. (We do the same for the status dataset.)
Dataset<Row> cleanedPreviousOutcome = previousOutcome.join(
rulesToBeEvaluated, previousOutcome.col("taurus_rule_id").equalTo(rulesToBeEvaluated.col("rule_id")),
"left_anti"
);
After the rule evaluation, we union the new outcome with the cleanedPreviousOutcome.
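Conceptually, the left_anti join plus union means "drop every row for a re-evaluated rule, then append the fresh rows". A plain-Java sketch of that merge on maps keyed by rule id (the method and variable names are illustrative, not the actual transform API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class MergeOutcomes {
    // previousOutcome: rule_id -> result from the last transaction.
    // newOutcome:      rule_id -> result for rules evaluated in this build.
    // evaluatedRuleIds: all rule ids selected for evaluation this build.
    static Map<String, String> merge(Map<String, String> previousOutcome,
                                     Map<String, String> newOutcome,
                                     Set<String> evaluatedRuleIds) {
        Map<String, String> merged = new HashMap<>(previousOutcome);
        // left_anti step: drop every previous row whose rule was re-evaluated,
        // even if the new evaluation produced no rows for it.
        merged.keySet().removeAll(evaluatedRuleIds);
        // union step: append the fresh evaluation results.
        merged.putAll(newOutcome);
        return merged;
    }
}
```

Removing by evaluatedRuleIds first (rather than simply overwriting) matters: a re-evaluated rule that no longer produces any outcome rows must not leave stale rows from the previous transaction behind.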
Here is a schematic of the whole strategy:
Schedule Strategy
- We run the transform whenever there is an edit to the rules writeback dataset. (If there are other critical datasets on which the transform should be run, you can also add them as triggers.)
- (Optional) Force full builds on the weekends via a CRON schedule. You can combine this with extra logic within the transform that checks the weekday and, on weekends, evaluates all rules.
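The optional weekend check could look like the following sketch, which decides at runtime whether to force a full (non-incremental) evaluation. The method name and the weekend choice are assumptions for illustration:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

public class ScheduleCheck {
    // Force a full evaluation of all rules on weekends, when the
    // CRON-scheduled build runs and interactive usage is typically low.
    static boolean evaluateAllRules(LocalDate buildDate) {
        DayOfWeek day = buildDate.getDayOfWeek();
        return day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
    }
}
```

When this returns true, the transform would skip the should_evaluate filtering and evaluate every rule, which also bounds how stale any evaluation can get.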