April 7, 2023

Bigconfig continues to raise the bar for data monitoring as code

We're excited to announce new updates and improvements to Bigconfig, Bigeye’s industry-leading data monitoring as code solution.

Kendall Lovett

Last year we launched Bigconfig, the first monitoring-as-code solution to support enterprise-scale data observability. Bigconfig allows data engineering teams to define data monitoring as code and deploy it across their enterprise data pipelines in a fast, automated, and version-controlled way.

Bigeye customers love that Bigconfig allows them to…

  • deploy metrics automatically on any new data that matches their specifications with dynamic tagging
  • apply any of Bigeye’s 60+ out-of-the-box data quality checks, or specify and customize their monitoring with full granular control
  • manage data observability from a central location and track changes with version-controlled audit logs
  • use optimized, human-readable YAML so there’s less code to write and no new language to learn
  • control their entire Bigeye operation from the command line

Since launching Bigconfig, we’ve had the opportunity to partner with dozens of top data engineering teams to develop additional improvements.

We’re excited to share what’s new.

Support for multiple Bigconfig files

Bigconfig allows teams to define data monitoring as code in a simple, human-readable YAML template. With the latest release, teams can now use multiple Bigconfig files to configure and manage monitoring for specific data sources, tables, or pipelines individually.

This is especially useful for teams that divide ownership of data quality across different parts of their data stack. Alternatively, you may want to store different Bigconfig modules in different files. For example, you may want to share your saved metric definitions across teams while each team maintains its own tag deployment files. Learn more about multiple file support.
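To make the shared-definitions-plus-team-files layout concrete, here is a minimal Python sketch of how several config files might be combined into one. It is an illustration under assumptions, not Bigconfig's schema or Bigeye's merge logic: the `merge_configs` helper, the `saved_metrics` and `deployments` keys, and the file names are all hypothetical, and each file is represented as an already-parsed dictionary.

```python
def merge_configs(files: dict) -> dict:
    """Combine per-file sections into one config, rejecting duplicate metric IDs."""
    merged = {"saved_metrics": {}, "deployments": []}
    for filename, content in files.items():
        for metric_id, definition in content.get("saved_metrics", {}).items():
            if metric_id in merged["saved_metrics"]:
                raise ValueError(
                    f"{metric_id!r} is defined twice (second copy in {filename})"
                )
            merged["saved_metrics"][metric_id] = definition
        merged["deployments"].extend(content.get("deployments", []))
    return merged


# Hypothetical layout: one shared file of metric definitions, one file per team.
files = {
    "shared_metrics": {"saved_metrics": {"null_check": {"metric": "percent_null"}}},
    "team_growth": {"deployments": [{"tag": "growth_tables", "metrics": ["null_check"]}]},
    "team_finance": {"deployments": [{"tag": "finance_tables", "metrics": ["null_check"]}]},
}

merged = merge_configs(files)
print(sorted(merged["saved_metrics"]), len(merged["deployments"]))
```

Rejecting a metric ID that appears in two files is a design choice of this sketch: it keeps a shared definition from being silently overridden by a team file.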

Tags by column type

Bigconfig has always included dynamic wildcard tagging and reusable monitoring definitions so teams can define what they want to monitor and how they want to monitor it with just a few lines of code. While users can still choose to tag columns and tables by name, they can now also tag by column type.

This allows users to easily define metrics for specific types of data and apply them globally, or in conjunction with other dynamic tags, for increased granular control.

For example, one customer used a Bigconfig template file to create a tag for integer-type columns in all tables whose names include “analytics_warehouse”. This allows for even more fine-tuned control over which metrics are deployed and where. Learn more about the flexibility of tag definitions.
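The selection logic in this example, a wildcard on the table name combined with a filter on the column type, can be modeled in a few lines of Python. This is a conceptual sketch only, not Bigconfig syntax or Bigeye's implementation: the `Column` record, the `matches_tag` helper, and the sample catalog are hypothetical, and the wildcard matching is borrowed from Python's standard `fnmatch` module.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Column:
    """Hypothetical catalog entry: a qualified table name, a column name, a type."""
    table: str
    name: str
    type: str


def matches_tag(column: Column, table_pattern: str, column_type: str) -> bool:
    """True when the table name matches the wildcard AND the column type matches."""
    return (
        fnmatchcase(column.table, table_pattern)
        and column.type.upper() == column_type.upper()
    )


# Made-up catalog used only for illustration.
catalog = [
    Column("prod.analytics_warehouse.orders", "order_id", "INTEGER"),
    Column("prod.analytics_warehouse.orders", "status", "STRING"),
    Column("prod.raw_events.clicks", "click_count", "INTEGER"),
]

tagged = [c for c in catalog if matches_tag(c, "*analytics_warehouse*", "INTEGER")]
print([f"{c.table}.{c.name}" for c in tagged])
# prints ['prod.analytics_warehouse.orders.order_id']
```

Only `order_id` is selected: `status` sits in a matching table but has the wrong type, and `click_count` has the right type but sits in a table the wildcard does not match.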

Auto apply on indexing

Now administrators can set Bigeye to apply Bigconfig files automatically each time sources are indexed. Indexing occurs automatically once a day and can be triggered on demand by selecting "rescan" in the catalog.

With auto-apply enabled, Bigconfig will automatically apply monitoring to any new tables or columns that match tag definitions in the Bigconfig file. This ensures all new tables get monitoring applied on day 1 and reduces the risk of new tables or columns slipping through the cracks. Learn more.
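The effect of auto-apply can be pictured as a set difference followed by a pattern match: take what the latest indexing run found, subtract what is already monitored, and keep whatever matches a tag definition. The Python sketch below illustrates that idea only. It does not describe how Bigeye implements indexing, and the `columns_needing_monitoring` helper, the column names, and the `warehouse.*` pattern are all hypothetical.

```python
from fnmatch import fnmatchcase


def columns_needing_monitoring(indexed, monitored, patterns):
    """Return indexed columns that match a tag pattern but are not yet monitored."""
    return sorted(
        col
        for col in indexed - monitored
        if any(fnmatchcase(col, pattern) for pattern in patterns)
    )


# Made-up state: what the last apply covered vs. what today's indexing found.
monitored = {"warehouse.orders.order_id"}
indexed = {
    "warehouse.orders.order_id",
    "warehouse.orders.discount_code",  # new column on an existing table
    "warehouse.refunds.refund_id",     # column on a brand-new table
    "scratch.tmp.debug_flag",          # new, but matches no tag pattern
}
tag_patterns = ["warehouse.*"]  # hypothetical tag definition

print(columns_needing_monitoring(indexed, monitored, tag_patterns))
# prints ['warehouse.orders.discount_code', 'warehouse.refunds.refund_id']
```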

Queueing for CLI commands

In addition to the above feature enhancements, we’ve also invested in the scalability of Bigconfig and the Bigeye command-line interface. Bigconfig "apply" commands are now queued on the backend to eliminate timeout errors and ensure enterprise-scale support.

With these new enhancements, along with many other performance improvements and bug fixes, Bigconfig is better equipped than ever to support enterprise data observability for your organization.

“As a team of engineers, we love that Bigeye gives us the option to create version-controlled data monitoring as code with an elegant, ‘Terraform-like’ solution. With Bigconfig, we use a simple YAML file to define data monitoring rules and then let Bigeye automatically apply them across our entire data warehouse, including new tables that come online.”

Simon Dong, Sr Manager, Data Engineering, Udacity

Check out an on-demand overview of Bigconfig or request a demo to see it in action.

