I have hundreds of datasets in my pipelines, and the developers didn’t create health-checks as they built them. Now I need to create health-checks to monitor the pipeline that backs our production workflow.
How can I create those health-checks in bulk?
There are several ways to create health-checks:
- Create them in bulk from Data Lineage: Select all the datasets you want to create health-checks for, right-click, choose “add Health Checks” and configure the health-check there. It will be created on all the selected datasets.
Tip: Create a Health-check group or a Monitoring View beforehand, so that you can assign those health-checks directly to a “central place” that you can monitor and receive regular digests from.
- Create them from Monitoring Views: You can create a monitor for a given scope (for example, all schedules in a given set of projects), and new resources within that scope will be monitored automatically. This ensures that all current resources are covered and that future resources will be covered as well.
As an alternative to Monitoring Views, you can create Health-check groups. Monitoring Views are more recent and might cover more use-cases (namely, automatic coverage of new resources).
Note: you can also create health-checks such as Data Expectations from Pipeline Builder, which can then be added to Health-check groups and Monitoring Views.
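To make the difference in coverage concrete, here is a small conceptual sketch in Python. It does not use any Foundry API: the `Resource`, `HealthCheckGroup` and `MonitoringView` classes are hypothetical models written only for this illustration, and it assumes a group covers just the resources explicitly added to it while a view covers everything inside its scope.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """Hypothetical model of a monitored resource (dataset, schedule, ...)."""
    name: str
    project: str


@dataclass
class HealthCheckGroup:
    """Explicit membership: only resources added by hand are covered."""
    members: set = field(default_factory=set)

    def covers(self, resource: Resource) -> bool:
        return resource in self.members


@dataclass
class MonitoringView:
    """Scope-based membership: any resource in a scoped project is covered."""
    scoped_projects: set = field(default_factory=set)

    def covers(self, resource: Resource) -> bool:
        return resource.project in self.scoped_projects


def coverage_demo() -> dict:
    existing = Resource("orders_clean", "production")
    group = HealthCheckGroup(members={existing})
    view = MonitoringView(scoped_projects={"production"})

    # A dataset created after monitoring was set up.
    new = Resource("orders_enriched", "production")

    return {
        "group_existing": group.covers(existing),
        "group_new": group.covers(new),
        "view_existing": view.covers(existing),
        "view_new": view.covers(new),
    }


print(coverage_demo())
```

In this toy model, the new dataset is picked up by the scope-based view without any extra step, whereas the group would need it to be added explicitly.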
Most of those methods are documented here.
Note: All pictures contain only notional data e.g. pulled from official documentation