Hello Community,
I have an automation setup and have a few doubts regarding failure handling and debugging.
Scenario
My automation takes an object set as input and applies a filter. Objects that pass the filter trigger the automation, which then runs a function to update another object set.
For example:
- Object Set A contains a large list of drivers across multiple regions along with their license numbers.
- Since the dataset is very large, I created multiple automations, each filtering drivers for a smaller region.
- These filtered objects trigger a function that updates Object Set B, where driver details and license numbers (as foreign key) need to be populated.
Doubt 1: Behavior when automation fails
Suppose the automation is triggered by 100 objects.
- The function processes the first 70 objects successfully.
- At the 71st object, there is an issue (for example, a validation condition causes the loop to get stuck and the function eventually times out).
In this situation, I want to understand:
- Are the first 70 successful updates committed/written back, or is the entire transaction rolled back?
- Does the automation continue processing objects 72 onward, or does execution stop completely once the failure occurs?
- Is the behavior transactional or partial-success based?
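To make the question concrete, here is a minimal sketch of the two behaviors I am asking about. It is plain Python and does not use or describe the actual platform API; `run_all_or_nothing`, `run_partial_success`, `update_driver`, and the `driver-N` keys are hypothetical names for illustration only. A raised error stands in for the failure at the 71st object.

```python
from typing import Callable, Dict, List


def run_all_or_nothing(keys: List[str], update: Callable[[str], str]) -> Dict[str, str]:
    """Stage every edit; the caller only gets edits if all objects succeed."""
    staged: Dict[str, str] = {}
    for key in keys:
        # Any exception propagates, so nothing staged so far is returned.
        staged[key] = update(key)
    return staged


def run_partial_success(keys: List[str], update: Callable[[str], str]) -> Dict[str, str]:
    """Keep each edit as it succeeds and stop at the first failure."""
    committed: Dict[str, str] = {}
    for key in keys:
        try:
            committed[key] = update(key)
        except Exception:
            break  # objects after the failing one are never processed
    return committed


def update_driver(key: str) -> str:
    # Hypothetical update that fails on the 71st driver.
    if key == "driver-71":
        raise ValueError(f"validation failed for {key}")
    return f"license-for-{key}"


keys = [f"driver-{i}" for i in range(1, 101)]

try:
    all_or_nothing = run_all_or_nothing(keys, update_driver)
except ValueError:
    all_or_nothing = {}  # the whole batch is discarded

partial = run_partial_success(keys, update_driver)

print(len(all_or_nothing), len(partial))  # prints: 0 70
```

In the first model none of the 100 updates survive; in the second, the first 70 are kept and objects 71 to 100 are left untouched. I would like to know which of these (if either) matches what the automation actually does.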
Doubt 2: Reasons for automation failure
I am aware of some common failure scenarios such as:
- Function timeout
- Automation being triggered by more than 10,000 objects
- Attempting to write/update more than 100,000 objects
However, I am unable to clearly identify the exact reason for failures in some cases.
For example:
- My automation runs hourly.
- I received an email saying: “Please verify the automation’s configuration if failures are unexpected. There was 1 failure for effects of type action.”
But when I checked the automation page:
- The run status showed as “Succeeded”.
- None of the expected updates were present in the output dataset, and I could not find where to access the error logs.
So, I would like to understand:
- How can we properly debug such cases?
- Where can we see detailed failure logs/error traces?
- Is there a way to identify which specific object caused the issue?
- Can partial failures occur even when the automation status shows as succeeded?
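On identifying the failing object, one generic pattern is to catch errors per object inside the function and record the key together with the error message. The sketch below is plain Python under that assumption, not the platform's API; `process_with_error_capture`, `update_driver`, and the `driver-N` keys are hypothetical. It only covers errors that are raised, so it would not help with a loop that hangs until the function times out.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def process_with_error_capture(
    keys: Iterable[str], update: Callable[[str], str]
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Apply `update` to each key, collecting failures instead of aborting."""
    succeeded: Dict[str, str] = {}
    failed: List[Tuple[str, str]] = []
    for key in keys:
        try:
            succeeded[key] = update(key)
        except Exception as exc:
            # Record which object failed and why, then continue with the rest.
            failed.append((key, f"{type(exc).__name__}: {exc}"))
    return succeeded, failed


def update_driver(key: str) -> str:
    # Hypothetical update that rejects one driver.
    if key == "driver-71":
        raise ValueError("missing license number")
    return f"license-for-{key}"


keys = [f"driver-{i}" for i in range(1, 101)]
succeeded, failed = process_with_error_capture(keys, update_driver)

print(len(succeeded))  # prints: 99
print(failed)          # prints: [('driver-71', 'ValueError: missing license number')]
```

If something like this is the recommended approach, I would also like to know where the collected failures should be written so they are visible alongside the automation run.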
I would appreciate any guidance or best practices for debugging and designing resilient automations.