Engineering

How to calculate the ROI for data observability

On the one hand, businesses are more data driven than ever before. On the other hand, data pipelines are increasingly complex and error prone. Is it time to invest in data observability?

Kyle Kirwan

What is data observability?

Analogous to observability in software engineering, data observability refers to the practice of instrumenting your data systems to give a comprehensive view of what is going on in each component of your data stack at any given time.

You can read more about data observability and why it’s important here.

Convincing your organization that you need a data observability solution

Building a data observability practice in your organization often requires upfront investments – in engineering hours, process changes, and the purchase of technical solutions. Often, before leadership is willing to commit, they’ll want to understand the return on investment (“ROI”). Data teams looking to invest in data observability will need to prove that better quality, fresher data maps directly to increased revenue and/or cost savings. 

Calculating ROI

ROI is a generic performance metric that measures the efficiency of a particular investment: the return it generates relative to its cost, typically expressed as ROI = (return − cost) / cost. It's especially helpful when comparing multiple potential investments. (A short code sketch of the formula follows the list below.)

There are two components to calculating ROI:

  • calculating the return

  • calculating the initial investment/cost
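
The calculation itself is simple. Here is a minimal Python sketch, with purely illustrative dollar figures:

```python
def roi(annual_return: float, cost: float) -> float:
    """ROI as a percentage: (return - cost) / cost * 100."""
    return (annual_return - cost) / cost * 100

# Illustrative figures: $150,000/year in savings against an $80,000 investment
print(f"{roi(150_000, 80_000):.1f}%")  # 87.5%
```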

Since you’re trying to justify an investment to improve data, make sure that your argument is data-driven. This means starting with the most quantifiable impact: how will implementing data observability either increase revenue or decrease costs?

Here are a few examples of "pathways" that bad data might take to affect the company's bottom line:

  • If a data outage impacts a company's machine learning models, the loss of revenue can be significant. For example, a data outage that stops Uber's surge pricing algorithm from updating could plausibly cost millions in lost revenue over the course of a single hour.

  • Data quality issues might result in direct costs. For example, if the format of customer names and addresses is not validated, multiple mailers might be sent to the same actual customer, creating waste.

  • Data quality issues eat into developer productivity. Even before accounting for opportunity cost, the time engineers spend chasing down avoidable data reliability issues maps directly to salaries and equity compensation.

To ensure that you're quantifying the potential return in a comprehensive, methodical way rather than adding up random impacts, we recommend the following steps.

Step 1: Identify specific business issues within the company

Some examples here might include:

  • Users are registering for “new user” promo codes more than once.

  • Fraud detection is not catching fraudulent users.

  • Analytics dashboard showing orders is not up to date.

Step 2: Determine the cost of these specific business issues

The respective answers here might be:

  • Cost of users using “new user” promo codes when they should not be allowed to: $100,000/year

  • Cost of fraudulent users: $200,000/year

  • Cost of inadequate inventory in different locations due to lack of up-to-date analytics dashboard: $300,000/year

Step 3: Determine whether bad data is at the root of the issue

The respective answers here might be:

  • Yes, because there’s no validation on new user names or emails so there are duplicate entries of a single user in the database

  • Yes, because there’s missing data

  • Yes, because there’s often a delay in the transformation of data

Step 4: Set data SLAs to improve the quality of the data

The respective answers here might be:

  • The users database table must be deduplicated; all future writes must be standardized in format, and checked against existing entries.

  • Missing training data must be interpolated. 

  • Max delay from orders data being produced in Shopify to orders data at rest in Snowflake should be 24 hours. This should allow for timely inventory projections. (A minimal freshness check is sketched below.)
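
To make the freshness SLA concrete, here is a minimal Python sketch of a check against the 24-hour limit. The `latest_loaded_at` timestamp is an assumed input (in practice it would come from warehouse metadata or an observability tool), and the function name is illustrative:

```python
from datetime import datetime, timedelta, timezone

# Max allowed delay between orders landing in Shopify and at rest in Snowflake
FRESHNESS_SLA = timedelta(hours=24)

def orders_within_sla(latest_loaded_at: datetime) -> bool:
    """Return True if the newest orders row landed within the freshness SLA.

    Assumes `latest_loaded_at` is a timezone-aware UTC timestamp.
    """
    lag = datetime.now(timezone.utc) - latest_loaded_at
    if lag > FRESHNESS_SLA:
        print(f"SLA breach: orders data is {lag} old (limit: {FRESHNESS_SLA})")
        return False
    return True
```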

Step 5: Determine the updated cost of the issue to the business

The respective answers here might be (the arithmetic is sketched after this list):

  • This should reduce the cost of duplicate new user orders by 100%. 

    • Savings of $100,000/year.

  • This should bring the false negative rate down to 2% from 4%. 

    • Savings of $100,000/year.

  • This should bring the leftover inventory percentage down by 50%.

    • Savings of $150,000/year.
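
The arithmetic behind these estimates is a one-liner per issue. A quick sketch using the Step 2 costs, assuming losses scale linearly with each metric:

```python
# Step 2 costs ($/year)
promo_abuse_cost = 100_000
fraud_cost = 200_000
inventory_cost = 300_000

# Expected savings once the SLAs are met
promo_savings = promo_abuse_cost * 1.00          # duplicate promo use eliminated
fraud_savings = fraud_cost * (1 - 0.02 / 0.04)   # false negatives halved: 4% -> 2%
inventory_savings = inventory_cost * 0.50        # leftover inventory halved

print(promo_savings, fraud_savings, inventory_savings)  # 100000.0 100000.0 150000.0
```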

Less quantifiable metrics

While things like engineering time and software outages can be more or less mapped to dollars and cents, there are other potential “returns” for data observability that are less quantifiable but arguably even more significant. These include:

  • Improved ability to make good business decisions

  • Reduced PR and legal risk

  • Improved employee retention

Our recommendation is that you do not attempt to include these “soft” metrics in the quantitative calculation, as you would have to make potentially ungrounded estimates. However, you can include a qualitative writeup of them along with your final ROI report. This provides decision makers with an additional data point if they’re on the fence, and allows them to value the soft impact as they choose.

Calculating Investment/Cost

In addition to determining the return, data teams will also need to calculate the cost. A simple strategy for determining the cost is to examine three categories: 

People: the cost of data engineers to whom the issue will be assigned. 

Process: the cost of hiring, training, and general change management. 

Technology: data observability tool purchase, implementation, and maintenance as well as infrastructure like servers or databases. 

When evaluating all of these categories, it is important to consider both short- and long-term costs.
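
As a rough sketch, these categories roll up into a single cost figure. The split below is hypothetical, chosen to match the $80,000 implementation cost used in the case study:

```python
def total_cost(people: float, process: float, technology: float) -> float:
    """Roll up the three cost categories for a given period."""
    return people + process + technology

# Hypothetical year-one breakdown behind the $80,000 figure used below
year_one = total_cost(people=40_000, process=10_000, technology=30_000)
print(f"${year_one:,.0f}")  # $80,000
```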

Case Study 

Let's say that you are an e-commerce brand, and your business issues are those described above. Let's look at each issue in turn to determine the overall ROI of a data observability tool (a short script reproducing the arithmetic follows the breakdown):

Issue: Users are registering for “new user” promo codes more than once

  • Potential savings after observability tool implementation: $100,000

  • Implementation cost: $80,000

  • Net savings: $20,000

  • ROI: 25%

Issue: Fraud detection is not catching fraudulent users

  • Potential savings after observability tool implementation: $100,000

  • Implementation cost: $80,000

  • Net savings: $20,000

  • ROI: 25%

Issue: Analytics dashboard showing orders is not up to date

  • Potential savings after observability tool implementation: $150,000

  • Implementation cost: $80,000

  • Net savings: $70,000

  • ROI: 87.5%
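
A short script, reusing the ROI formula from earlier, reproduces these figures:

```python
# (potential savings, implementation cost) per issue, from the case study above
issues = {
    "Duplicate promo code registrations": (100_000, 80_000),
    "Missed fraudulent users": (100_000, 80_000),
    "Stale orders dashboard": (150_000, 80_000),
}

for name, (savings, cost) in issues.items():
    net = savings - cost
    print(f"{name}: net ${net:,}, ROI {net / cost:.1%}")
```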

Conclusion 

Before companies invest in data observability, they will often want to calculate the ROI. They can do this by enumerating business issues, determining their data quality root causes, and then setting SLAs that will ameliorate those issues. In arguments made to decision-makers, the quantitative ROI can be supplemented by the "intangible" effects of data quality improvements, e.g., developer morale and better business decision-making.
