Personalising DQ alerts with K

Integrating Great Expectations and DBT Tests results into our metadata observability platform.

KADA
3 min read · Jun 27, 2022

Our vision for K (www.kada.ai) is to be that one trusted place, that one trusted channel, that provides information about changes, updates and important news about your data ecosystem. Personal news. About the data you use. That matters to you.

One type of data related news that we can deliver in a much better way is data quality alerts.

Why?

Because nothing creates as much confusion as receiving a typical, broadcast DQ alert.

The problem with these typical DQ alerts is that they are not personalised. Is this a table I use or care about? If there are 25 alerts, how many of them are relevant to me?

Consumers quickly opt out of paying attention to alerts that feel like spam.

So to be useful, data quality alerts need to be personal.

Making DQ alerts personal

In K, there is a metrics framework for measuring usage. K profiles usage to understand who, when and how data is used. We have extended this framework to capture data quality metadata and results from popular tools like Great Expectations and DBT.

What can we do by combining usage and DQ metrics? Lots!

Let’s take a quick look at one example: making DQ alerts personal.

1. The first step is loading and processing results from GE checkpoints and DBT tests. We have native integrations with GE and DBT Cloud/Core for this.

A profile for a DBT Test

2. K links the DQ results to the data asset and tracks each DQ test over time. You can see the overall DQ trend and drill through to any specific day to find out which tests ran, and which passed or failed on that day.

DQ included in our Data asset profiles

3. K then uses its personalisation engine to alert affected consumers of the data asset when a test fails. Consumers who recently used this table and are likely to use it again in the near future will get this notification.

4. To close the loop, users (Owners, Stewards, Users) can then take actions such as creating a JIRA ticket that is automatically linked to the DQ test failure and the data asset.

Going end to end to power a faster and more effective Data Incident Management process
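
To make steps 1 to 3 concrete, here is a minimal sketch of the idea, not K's actual implementation. It assumes test results shaped like the `results` list in a dbt `run_results.json` (using only the `unique_id` and `status` fields) and a hypothetical `test_to_table` mapping from each test to the table it checks. The names `UsageEvent`, `failed_tests_by_table` and `personal_alerts` are made up for illustration, and "likely to use it again" is approximated here by recent usage only.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class UsageEvent:
    """One observed use of a table by a user (hypothetical shape)."""
    user: str
    table: str
    used_at: datetime


def failed_tests_by_table(run_results: dict, test_to_table: dict[str, str]) -> dict[str, list[str]]:
    """Group failing test ids by the table they are linked to."""
    failed: dict[str, list[str]] = {}
    for result in run_results["results"]:
        # Treat both "fail" and "error" as a failed check; ignore pass/warn/skipped.
        if result["status"] in ("fail", "error"):
            table = test_to_table.get(result["unique_id"])
            if table is not None:
                failed.setdefault(table, []).append(result["unique_id"])
    return failed


def personal_alerts(
    failed: dict[str, list[str]],
    usage: list[UsageEvent],
    now: datetime,
    lookback: timedelta = timedelta(days=30),  # assumed window, not a K default
) -> dict[str, dict[str, list[str]]]:
    """Return, per user, the failing tests on tables they used within the lookback window."""
    alerts: dict[str, dict[str, list[str]]] = {}
    for event in usage:
        tests = failed.get(event.table)
        if tests and now - event.used_at <= lookback:
            alerts.setdefault(event.user, {})[event.table] = tests
    return alerts


# Illustrative data only.
run_results = {
    "results": [
        {"unique_id": "test.shop.not_null_orders_id", "status": "fail"},
        {"unique_id": "test.shop.unique_customers_id", "status": "pass"},
    ]
}
test_to_table = {
    "test.shop.not_null_orders_id": "orders",
    "test.shop.unique_customers_id": "customers",
}
now = datetime(2022, 6, 27)
usage = [
    UsageEvent("ana", "orders", now - timedelta(days=2)),
    UsageEvent("ben", "orders", now - timedelta(days=90)),
    UsageEvent("cy", "customers", now - timedelta(days=1)),
]

failed = failed_tests_by_table(run_results, test_to_table)
alerts = personal_alerts(failed, usage, now)
print(alerts)
```

In this toy data only `ana` is alerted: `ben` last used `orders` outside the 30-day window, and `cy` uses a table with no failing tests. A real personalisation engine would presumably weigh more signals than recency alone.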

It’s exciting to now have both usage and DQ observability in our platform. What’s next? Looking at DQ trends to reduce noise; predicting DQ errors; and tracking other metrics such as ETL job performance (DBT model run time post change) and BI load times (Power BI dataset refresh).

Reach out to find out more.

KADA

A metadata platform for tracking how your data is used & where it comes from. KADA enables you to govern data that is important, reduce data risk & scale trust