customer story

Bringing Reliability to CI/CD Data Pipelines

Mayan uses Bigeye to validate data pipeline merges and deliver self-serve analytics dashboards at scale.

Use case: Data model testing

Outcome: 60% reduction in time to merge data

About Mayan: Delivering powerful analytics to Amazon Sellers

Mayan provides Amazon sellers with powerful self-service analytics dashboards—giving them deep insights into how to optimize ad spend and more effectively grow their business.

Mayan powers its advanced analytics with a modern data stack that includes Snowflake as the primary data warehouse, Airbyte and Fivetran for extraction, dbt for transformation, and Looker to create and embed custom client dashboards. Mayan handles the scale and complexity of its operation by running all data model changes through a data-as-code continuous integration (CI) process with GitLab.

Challenge: Building trust in data pipelines

Mayan’s small, centralized data engineering team sought to optimize its data model testing process and provide self-service access to data analysts by implementing a continuous integration (CI) data pipeline with GitLab and dbt. While this data-as-code approach helped simplify and streamline data model testing, the team lacked confidence in the results because dbt jobs frequently had bugs and the team could not validate a successful merge or pinpoint the cause of a failure. Each time a test failed, the engineering team had to go through a slow, manual debugging process, and data analysts had no visibility into what changes would help their merge request succeed.

As a result, data merges took an average of 4 to 5 days to complete, with some taking over two weeks. Mayan needed a way to increase throughput and reduce toil on the team by monitoring the success or failure of tests and immediately identifying the point of failure for fast, easy resolution.

Solution: Blue-Green Deployment Tests

After evaluating several solutions, Mayan selected Bigeye’s data observability platform to help monitor and identify issues in data pipeline CI tests. With Bigeye, Mayan is now able to use blue-green testing to compare the staging and production tables and get instant insights into whether the ELT job did what they expected. If there’s an issue, the engineering team can now identify the exact point of failure and correct it—no manual debugging required. In addition, the team now has a historical view into merge performance over time, allowing them to track trends and provide feedback to analysts on changes they can make to help ensure their merge requests pass the first time.
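To make the blue-green comparison concrete, here is a minimal Python sketch of the general idea: profile the production ("blue") and staging ("green") versions of a table, then flag any metric that moved more than a tolerance. This is an illustration only, not Bigeye's API or Mayan's implementation. The helper names (`profile`, `compare_tables`), the chosen metrics, the sample columns, and the 5% tolerance are all assumptions, and the tables are plain in-memory lists rather than warehouse queries.

```python
from dataclasses import dataclass


@dataclass
class MetricDiff:
    column: str
    metric: str
    production: float
    staging: float


def profile(rows, columns):
    """Compute simple per-column metrics for a table given as a list of dicts."""
    total = len(rows)
    metrics = {("*", "row_count"): float(total)}
    for col in columns:
        values = [row.get(col) for row in rows]
        nulls = sum(1 for v in values if v is None)
        metrics[(col, "null_rate")] = nulls / total if total else 0.0
        metrics[(col, "distinct_count")] = float(
            len({v for v in values if v is not None})
        )
    return metrics


def compare_tables(production, staging, columns, tolerance=0.05):
    """Return the metrics whose relative change exceeds the tolerance."""
    prod = profile(production, columns)
    stag = profile(staging, columns)
    diffs = []
    for key, prod_value in prod.items():
        stag_value = stag[key]
        # Tiny floor avoids division by zero; with a zero baseline,
        # any non-zero change is flagged.
        baseline = max(abs(prod_value), 1e-9)
        if abs(stag_value - prod_value) / baseline > tolerance:
            diffs.append(MetricDiff(key[0], key[1], prod_value, stag_value))
    return diffs


# Hypothetical sample data: the staging build introduced a null ad_spend.
production_rows = [
    {"order_id": 1, "ad_spend": 10.0},
    {"order_id": 2, "ad_spend": 12.5},
    {"order_id": 3, "ad_spend": 7.0},
    {"order_id": 4, "ad_spend": 9.0},
]
staging_rows = [
    {"order_id": 1, "ad_spend": 10.0},
    {"order_id": 2, "ad_spend": None},
    {"order_id": 3, "ad_spend": 7.0},
    {"order_id": 4, "ad_spend": 9.0},
]

drift = compare_tables(production_rows, staging_rows, ["order_id", "ad_spend"])
for d in drift:
    print(f"{d.column}.{d.metric}: production={d.production} staging={d.staging}")
```

In a CI job, a check along these lines could fail the merge request whenever `drift` is non-empty and attach the list of differing metrics, so the analyst sees which column changed instead of debugging the whole job by hand.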

Results: 60% reduction in time to deploy and insights into the health of data pipelines over time

By implementing Bigeye for data model testing validation, Mayan reduced the average time to merge updates into a data model from 4 to 5 days down to 1 day. This 60% reduction in time to push changes allows data analysts to move faster, increases confidence in the data, and frees up the data engineering team to focus on high-value projects instead of chasing ELT bugs. Looking forward, the Mayan team plans to use Bigeye anomaly detection to observe the health of their analytics data and monitor the quality of training data being fed into ML applications.

Bigeye has helped us cut the time to merge by 60% while also providing artifacts on each of our merge requests. This gives us transparency to look back at past requests to see where we failed and what we can improve to decrease failures in the future.

Loc Nguyen

Data Engineer
