November 7, 2022

Using Bigeye data lineage for actionable root cause and impact analysis

Kendall Lovett

Bigeye gives data teams comprehensive visibility into the health of their data pipelines so they can find and resolve issues faster. In this post, we’ll explore how data engineers use Bigeye’s built-in data lineage features to see where data problems are impacting critical downstream tables and identify the upstream root cause for faster, easier resolution.

Lineage-driven insight into which problems are impacting your business—and how to fix them.

In a Data Operations context, lineage is the path data takes from creation, through various databases and transformation jobs, all the way down to an analytics dashboard, machine learning model, or application. Visualizing lineage is useful for any DataOps task that benefits from knowing where data is flowing to (for example, understanding the impact of a data quality problem) or where it is flowing from (for example, understanding where a data quality issue originated).

At Bigeye, we use data lineage to give our customers a clear, complete view into how data issues are impacting their environment. As soon as you connect Bigeye to your data warehouse, it begins automatically parsing your query history and creating your lineage graph. The graph is available from the catalog as well as from the issue management workflow. This allows users to see exactly which downstream tables, BI tools, and applications are being impacted by a particular data issue, and which upstream data sources it may have originated from.
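To make the idea of deriving lineage from query history concrete, here is a minimal Python sketch. It is not Bigeye's implementation: the query log, the table names, and the `build_lineage` helper are all hypothetical, and the regular expressions only handle simple `INSERT INTO ... SELECT` and `CREATE TABLE ... AS SELECT` statements. A production system would need a real SQL parser.

```python
import re
from collections import defaultdict

# Hypothetical query log; every table name here is illustrative only.
QUERY_HISTORY = [
    "INSERT INTO staging.users SELECT email FROM raw.signups",
    "CREATE TABLE analytics.daily_users AS "
    "SELECT u.id FROM staging.users u JOIN staging.events e ON u.id = e.user_id",
    "SELECT COUNT(*) FROM analytics.daily_users",  # read-only, creates no edge
]

# Naive patterns: the table a statement writes to, and the tables it reads from.
TARGET_RE = re.compile(
    r"(?:INSERT\s+INTO|CREATE\s+(?:OR\s+REPLACE\s+)?TABLE)\s+([\w.]+)",
    re.IGNORECASE,
)
SOURCE_RE = re.compile(r"(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def build_lineage(queries):
    """Return a mapping of upstream table -> set of direct downstream tables."""
    downstream = defaultdict(set)
    for sql in queries:
        target = TARGET_RE.search(sql)
        if target is None:
            continue  # statements that write nothing add no lineage edge
        for source in SOURCE_RE.findall(sql):
            if source != target.group(1):
                downstream[source].add(target.group(1))
    return dict(downstream)


lineage = build_lineage(QUERY_HISTORY)
```

Each write statement contributes one edge per table it reads, so the graph grows as more of the query history is processed.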

Armed with this knowledge, Bigeye users can quickly prioritize an issue based on its impact radius, take steps to mitigate and alert business users, locate the root cause of the issue, and use helpful investigation and debug tools to quickly triage and fix it.

A data lineage use case

Let’s take a look at a day in the life of Juan, a data engineer. A few months ago, before his company implemented Bigeye, Juan found out about a data issue after an executive noticed an error in her BI report and notified Juan’s boss. This kicked off a painful, time-consuming investigation and resolution process. Now that Juan has Bigeye, let’s see how he can take advantage of lineage-driven impact and root cause analysis to solve issues more effectively and improve the quality of his work.

Prioritizing issue response with impact analysis

One afternoon, Juan receives two alerts from Bigeye in Slack. Without impact analysis, Juan might have tackled the two issues in the order they arrived, or focused on whichever alert was further outside its threshold. Now, however, he can quickly review the lineage graph and see that the first issue isn’t directly impacting anything downstream. He updates the issue priority to “low” and hops over to the second notification.

Juan filters the alerting path in the lineage graph to show Tableau dashboards and immediately discovers that this issue is directly impacting a critical executive dashboard. Within seconds, Juan has the context and insight he needs to take action. He sets the Bigeye issue priority to “high”, updates the status to “investigating”, and alerts the downstream dashboard users.
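The impact analysis Juan performs can be thought of as a downstream walk over the lineage graph. The Python sketch below illustrates that idea under stated assumptions: the graph, the node names, the `CRITICAL` set, and the `suggest_priority` rule (including the "medium" tier, which the post does not mention) are hypothetical and are not Bigeye's actual logic.

```python
from collections import deque

# Hypothetical lineage edges: node -> its direct downstream nodes.
DOWNSTREAM = {
    "raw.signups": ["staging.users"],
    "staging.users": ["analytics.daily_users"],
    "analytics.daily_users": ["tableau:exec_kpis"],
    "raw.clickstream_debug": [],  # nothing reads from this table
}

# Hypothetical set of assets the business treats as critical.
CRITICAL = {"tableau:exec_kpis"}


def impacted_nodes(graph, start):
    """Breadth-first search for everything downstream of `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def suggest_priority(graph, table, critical):
    """Assumed rule: critical asset hit -> high, anything hit -> medium, else low."""
    impacted = impacted_nodes(graph, table)
    if impacted & critical:
        return "high"
    return "medium" if impacted else "low"


first_issue = suggest_priority(DOWNSTREAM, "raw.clickstream_debug", CRITICAL)
second_issue = suggest_priority(DOWNSTREAM, "staging.users", CRITICAL)
```

Under these assumptions the first alert, on a table with no consumers, is rated low, while the second reaches the executive dashboard and is rated high.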

Investigating issue source with root cause analysis

From the same graph, Juan can now switch his view from downstream to upstream and investigate where this problem originated. He finds the furthest upstream table with issues that may be related to the one he’s investigating. By drilling into the table, Juan can see that the cardinality of the ‘user_email’ column has dropped dramatically, and he suspects someone may have broken the ingestion on the source list. Once Juan has identified the pattern, he chooses to mute all downstream alerts from the source table, allowing him to focus on resolving the issue without extra noise.
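Finding "the furthest upstream table with issues" can be sketched as an upstream traversal that keeps only those affected tables with no affected ancestors. The example below is an illustration, not Bigeye's algorithm: the graph, the table names, and the `root_cause_candidates` helper are hypothetical.

```python
# Hypothetical lineage edges: table -> its direct upstream tables.
UPSTREAM = {
    "analytics.daily_users": ["staging.users", "staging.events"],
    "staging.users": ["raw.signups"],
    "staging.events": [],
    "raw.signups": [],
}

# Hypothetical set of tables that currently have open issues.
TABLES_WITH_ISSUES = {"analytics.daily_users", "staging.users", "raw.signups"}


def ancestors(upstream, node, seen=None):
    """Collect every table reachable by walking upstream from `node`."""
    seen = set() if seen is None else seen
    for parent in upstream.get(node, []):
        if parent not in seen:
            seen.add(parent)
            ancestors(upstream, parent, seen)
    return seen


def root_cause_candidates(upstream, table, with_issues):
    """Tables with issues, at or above `table`, that have no ancestor with issues."""
    in_scope = ancestors(upstream, table) | {table}
    affected = in_scope & with_issues
    return {t for t in affected if not (ancestors(upstream, t) & with_issues)}


candidates = root_cause_candidates(
    UPSTREAM, "analytics.daily_users", TABLES_WITH_ISSUES
)
```

Every other affected table in this example sits downstream of the single candidate, which is why muting alerts below it removes noise without hiding the source of the problem.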

Triaging with row-level anomaly debugging

Once he’s confirmed the pattern, Juan can use Bigeye’s automatically generated debug query to investigate affected rows in the source table without leaving Bigeye. This gives him the context and information he needs to quickly fix the ingestion job that caused the issue.
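The debug query Bigeye generates is not shown in this post, so the following is only a sketch of what row-level investigation of a cardinality drop might look like. It uses an in-memory SQLite table with made-up data, and it assumes the drop in distinct ‘user_email’ values is caused by NULLs arriving from a broken ingestion job; the table name, the `loaded_on` column, and both helper functions are hypothetical.

```python
import sqlite3

# Hypothetical source table with a day of healthy data and a day of broken data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (id INTEGER, user_email TEXT, loaded_on TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "2022-11-01"),
        (2, "b@example.com", "2022-11-01"),
        (3, "c@example.com", "2022-11-01"),
        (4, None, "2022-11-02"),  # ingestion started dropping emails
        (5, None, "2022-11-02"),
        (6, "d@example.com", "2022-11-02"),
    ],
)


def cardinality_by_day(conn):
    """Distinct non-NULL emails per load date (the metric that dropped)."""
    rows = conn.execute(
        "SELECT loaded_on, COUNT(DISTINCT user_email) "
        "FROM signups GROUP BY loaded_on ORDER BY loaded_on"
    )
    return dict(rows.fetchall())


def affected_rows(conn, day):
    """Assumed debug query: rows loaded on `day` that are missing an email."""
    rows = conn.execute(
        "SELECT id FROM signups "
        "WHERE loaded_on = ? AND user_email IS NULL ORDER BY id",
        (day,),
    )
    return [row[0] for row in rows]


daily = cardinality_by_day(conn)
bad_ids = affected_rows(conn, "2022-11-02")
```

Comparing the per-day metric with the rows behind it narrows the problem to a specific load, which is the kind of context needed to fix the ingestion job.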

Once the issue is resolved on the source table, Juan can use bulk actions to close the seven related issues on downstream tables in a single click.

Conclusion

Data engineering teams don’t have the time or resources to babysit data pipelines or write and maintain tests for every potential data quality issue in their environment. Even if they find a problem through a basic test, complexity and interdependence make it hard to properly prioritize the issue and resolve it. Bigeye gives data teams the tools they need to get clear insights into which issues are impacting the business, and how to fix them.

