Monitoring Stripe data with Bigeye

How do you make sure your Stripe data is being accurately monitored? With a data observability tool like Bigeye. Let's walk through it.

Liz Elfman

Many companies rely on Stripe data to make key business decisions. One way to reinforce the quality of this data is to use a data observability tool. In this blog post, we'll demonstrate how to quickly spin up deep data monitoring on your critical Stripe datasets with Bigeye.

Understanding Stripe datasets

Depending on which Stripe product you use and how you’re using it, you'll produce different types of raw Stripe tables. For example, a SaaS software company might use the Stripe subscription product, while an e-commerce company might use Stripe to accept payment for orders. A lending company might use Stripe for credit notes. Below, we show how raw Stripe data might look as it arrives in your data warehouse through an ETL tool like Fivetran.

It's common to ingest raw Stripe data into your data warehouse or lake, then have your data engineering team perform transformations and grant your BI teams access for analysis. Improper financial data landing in front of your BI teams can lead to disastrous business decisions.

There are two critical tables for every organization that uses Stripe:

  • Balance transaction: a running log of every transaction hitting your Stripe account

  • Customer: a running log of all customer information 

Other tables, containing data like payment methods, payment intents, and payouts, may also be available as metadata on the account itself.

Generally speaking, you will do some preliminary cleanup of the raw Stripe data. Then, the data will be aggregated into higher-level tables that make business metrics like sales, revenue, and refunds easily accessible.
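To make that aggregation step concrete, here is a minimal sketch in Python using SQLite. The table and column names (`balance_transaction`, `amount`, `currency`, `created`) are modeled loosely on Stripe's balance transaction object; they are illustrative assumptions, not the exact schema your ETL tool will produce.

```python
import sqlite3

# Illustrative sketch: roll a simplified balance_transaction table up into
# a daily net-revenue-by-currency summary. Schema is an assumption, not
# the exact Fivetran/Stripe layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balance_transaction (
        id TEXT PRIMARY KEY,
        amount INTEGER,       -- in minor units, e.g. cents
        currency TEXT,
        type TEXT,            -- 'charge', 'refund', ...
        created TEXT          -- ISO-8601 timestamp
    );
    INSERT INTO balance_transaction VALUES
        ('txn_1',  5000, 'usd', 'charge', '2023-05-01T10:00:00'),
        ('txn_2',  3000, 'usd', 'charge', '2023-05-01T12:00:00'),
        ('txn_3', -1000, 'usd', 'refund', '2023-05-01T13:00:00'),
        ('txn_4',  2000, 'eur', 'charge', '2023-05-02T09:00:00');
""")

# Aggregate into a higher-level table: net amount per day per currency.
rows = conn.execute("""
    SELECT date(created) AS day, currency, SUM(amount) AS net_amount
    FROM balance_transaction
    GROUP BY day, currency
    ORDER BY day, currency
""").fetchall()

for row in rows:
    print(row)
```

In a real warehouse the same `GROUP BY` would run in your transformation layer (dbt, etc.) rather than in SQLite.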

How to monitor Stripe data with Bigeye

We suggest taking a programmatic approach to deploying data monitoring on your critical datasets. Why? 

Stripe tables can be repetitive: many of the same columns appear across multiple tables. As a result, when deploying metrics against Stripe data, it's useful to use Bigconfig, which allows you to specify that all columns with certain names should be monitored in a certain way.


Bigconfig consists of two components: tag definitions and metric definitions. Tag definitions allow you to use selectors with wildcards to identify repetitive columns like transaction IDs, amounts, and currencies that occur across multiple tables. In the example below, the tag definition OBJECT_IDS states that every column whose name matches SAAS*.STRIPE_RAW.*.id belongs to the OBJECT_IDS tag.
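A tag definition along those lines might look like the following fragment. This is a sketch based on our understanding of the Bigconfig YAML schema; verify field names such as `tag_definitions`, `tag_id`, and `column_selectors` against the current Bigconfig documentation before relying on them.

```yaml
# Sketch of a Bigconfig tag definition; field names are approximate.
type: BIGCONFIG_FILE
tag_definitions:
  - tag_id: OBJECT_IDS
    column_selectors:
      # Wildcards match the id column in every raw Stripe table
      - name: SAAS*.STRIPE_RAW.*.id
```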

Metric definitions allow you to apply certain kinds of metrics to each tag, without having to enable the appropriate metrics for each column. Metric definitions simplify your monitoring deployment.

Metric definitions also allow you to create custom metrics. For instance, you might want to track your refund data, grouping it by currency and calculating the average refund per currency. 
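Put together, saved metric definitions for those two cases might look roughly like this. Again, a hedged sketch: the predefined metric names (e.g. `PERCENT_NULL`, `AVERAGE`) and threshold fields follow the general shape of Bigconfig's schema but should be checked against the current documentation.

```yaml
# Sketch of Bigconfig saved metric definitions; names and fields are
# illustrative and should be verified against the Bigconfig docs.
saved_metric_definitions:
  metrics:
    - saved_metric_id: ids_not_null
      metric_type:
        predefined_metric: PERCENT_NULL
      threshold:
        type: CONSTANT
        upper_bound: 0          # any null IDs should alert
    - saved_metric_id: avg_refund_by_currency
      metric_type:
        predefined_metric: AVERAGE
      group_by:
        - currency              # one series per currency
      threshold:
        type: AUTO              # let anomaly detection set bounds
```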

Once you have defined your tags and your metrics, you can apply certain metrics to certain tags in a deployment.
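A deployment section tying metrics to tags might then look something like the fragment below. The exact keys (`tag_deployments`, `collection`, `deployments`) are our understanding of the Bigconfig schema and should be confirmed against the documentation.

```yaml
# Sketch of a Bigconfig deployment: apply saved metrics to tagged columns.
tag_deployments:
  - collection:
      name: Stripe - General Data Integrity
    deployments:
      - tag_id: OBJECT_IDS
        metrics:
          - saved_metric_id: ids_not_null
```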

In Bigconfig, you can also create Collections to organize the tag and metric definitions. For example, we may create one collection that displays all sales data, one for customer data integrity, and another for general data integrity. Each SLA in the Bigconfig file maps to a collection in the Bigeye UI. Each collection can be configured to send notifications to specific individuals.

What metrics should you monitor on your Stripe data?

You’ll notice that in the UI above, we’ve created several collections. These collections roughly correspond to three types of metrics that we suggest you monitor on your Stripe data:

General data integrity/balance transaction integrity metrics: These collections include checks that all transaction IDs are unique, non-null, and consistent across multiple tables.
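As a standalone illustration of what such integrity checks assert, here is a small Python sketch over hypothetical rows. The table and column names are invented for the example; in practice Bigeye runs the equivalent checks against your warehouse.

```python
# Hypothetical sketch of the three ID integrity checks: unique, non-null,
# and matching across tables. Row shapes are invented for illustration.
balance_transactions = [
    {"id": "txn_1", "amount": 5000},
    {"id": "txn_2", "amount": -1000},
    {"id": "txn_3", "amount": 2000},
]
refunds = [
    {"balance_transaction_id": "txn_2"},
]

ids = [row["id"] for row in balance_transactions]

# Non-null: no missing transaction IDs
assert all(i is not None for i in ids)
# Unique: no duplicate transaction IDs
assert len(ids) == len(set(ids))
# Matching: every refund references a known balance transaction
assert all(r["balance_transaction_id"] in set(ids) for r in refunds)
print("all integrity checks passed")
```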

Accurate sales data metrics: These collections include checks that numerical values like number of sales made, number of refunds issued, chargebacks, invoices, and fees fall within expected ranges. We recommend grouping these metrics by currency to ensure each currency is operating correctly.

Customizable metrics: Finally, Bigeye’s metric templates allow you to define custom checks. For example, maybe one of your Stripe datasets contains some JSON data. Checking only whether that JSON data is empty might not give granular enough guarantees that specific key/value pairs are present. Instead, with Bigeye, you can define queries that expand the JSON data into specific columns and check those columns for specific values.
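Here is a minimal Python sketch of that idea, using invented column and key names: rather than only checking that a metadata JSON blob is non-empty, expand it into columns and check each key individually.

```python
import json

# Illustrative sketch: a raw Stripe-style table with a JSON metadata column.
# Column/key names (metadata, order_id, channel) are assumptions.
rows = [
    {"id": "ch_1", "metadata": '{"order_id": "o-100", "channel": "web"}'},
    {"id": "ch_2", "metadata": '{"order_id": "o-101", "channel": "mobile"}'},
]

# Expand the JSON blob into dedicated columns...
expanded = []
for row in rows:
    meta = json.loads(row["metadata"])
    expanded.append({
        "id": row["id"],
        "order_id": meta.get("order_id"),
        "channel": meta.get("channel"),
    })

# ...so granular checks become possible, e.g. order_id must never be null.
assert all(r["order_id"] is not None for r in expanded)
print(expanded[0]["channel"])
```

In a warehouse you would do the expansion with its JSON functions (e.g. a `json_extract`-style call) inside the templated query.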

Step-by-step instructions for monitoring your Stripe data

  1. Add the data warehouse that contains your Stripe data as a data source in Bigeye

  2. Copy the Stripe Bigconfig template into a local YAML file

  3. Follow the instructions in the recipe to add your own custom tag definitions and metric definitions.

  4. Follow the instructions here to install the Bigconfig CLI tool

  5. Apply the Bigconfig to your dataset using a single CLI command
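The last two steps might look like this in a terminal. The package name and subcommands below are our understanding of the Bigeye CLI and may differ from the current release, so consult the installation instructions linked above.

```shell
# Install the Bigeye CLI (package name assumed; see the docs linked above)
pip install bigeye-cli

# Preview the changes the Bigconfig would make, then apply them.
# Subcommand names are illustrative; run `bigeye --help` to confirm.
bigeye bigconfig plan
bigeye bigconfig apply
```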


Monitoring your Stripe data with Bigeye helps ensure the integrity of your financial data. Bigconfig makes the process easy, allowing for efficient data monitoring and notification in case of any issues. As we continue to expand our turn-key coverage of other popular SaaS data sources, like HubSpot, we look forward to offering even more low-friction service to our users.

If you would like to try this out, contact us now. You just need a read-only account with access to your Stripe data, or you can use our dummy Stripe datasets to give it a spin!
