Product
-
March 31, 2023

Streamlined data reliability engineering notifications with Bigeye

Introducing many new Slack-related features! In this post, we'll walk through them.

Jon Hsieh

We’re excited to announce improvements to notifications in Bigeye. Data engineers spend large amounts of time in Slack, MS Teams, PagerDuty, and email, so it's important to get real-time notifications from Bigeye in the tools where engineers already collaborate. With our latest release, notifications are now comprehensive: data engineers get status updates regardless of whether the actions happen in Slack, in Bigeye, or via the Bigeye API. Notifications are now consolidated into threads, making it easier to collaborate with teammates when debugging specific issues. Slack notifications are now also interactive, allowing data engineers to triage, track, and resolve issues directly within Slack. Finally, we’ve streamlined notifications so that every alert is relevant and actionable.

Comprehensive

As an issue goes through its lifecycle, Slack notifications about it are sent to the same Slack thread. The thread’s root message stays in sync with the issue’s status and summarizes its current state. This continuity lets Slack users quickly scan a channel’s timeline and get a sense of the status of the issues in it.

Actions that cause transitions show up in the notification stream, regardless of whether the action is taken in Slack, in the Bigeye app, or via the Bigeye API. You get the context and can track the lifecycle of most issues entirely from within Slack, even if the folks on your team access Bigeye through different mechanisms.
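To make the threading model concrete, here is a minimal in-memory sketch of the behavior described above: one thread per issue, a root message that always reflects the latest status, and a reply for each later transition, whatever its source. It is not Bigeye's implementation. The class names, status strings, and source labels are all hypothetical, and no Slack call is made.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    """Hypothetical stand-in for one chat thread tied to one issue."""
    root: str                                    # root message, kept in sync with the issue status
    replies: list = field(default_factory=list)  # one entry per later transition


class IssueNotifier:
    """Illustrative only: keeps one thread per issue in a dictionary."""

    def __init__(self):
        self.threads = {}  # issue_id -> Thread

    def record(self, issue_id, new_status, source):
        """Record a status transition; `source` is where the action was taken."""
        summary = f"Issue {issue_id}: {new_status}"
        thread = self.threads.get(issue_id)
        if thread is None:
            # First notification for this issue: it becomes the thread root.
            thread = Thread(root=summary)
            self.threads[issue_id] = thread
        else:
            # Later transitions are appended to the same thread,
            # and the root is rewritten to summarize the current state.
            thread.replies.append(f"Status changed to {new_status} (via {source})")
            thread.root = summary
        return thread


notifier = IssueNotifier()
notifier.record(42, "new", "bigeye-app")
notifier.record(42, "acknowledged", "slack")
latest = notifier.record(42, "closed", "api")
print(latest.root)
for reply in latest.replies:
    print("  ", reply)
```

Against the real Slack Web API, the two operations in this sketch would correspond to posting a reply with `chat.postMessage` and a `thread_ts`, and editing the root with `chat.update`.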

Collaborative

Since notifications are threaded, you now have a place where you and your teammates can collaborate on the issue, inline with the updates on the issue. You can use core Slack functionality such as @-mentioning folks to bring in teammates who may have expertise in the area. All correspondence can now live in one place and can serve as the basis for a retrospective or runbook after a data reliability incident is resolved.

Interactive

Now that you are getting updates in Slack and collaborating with teammates there, you’ll likely reach a conclusion on an issue. You can now take action on a metric directly from Slack: you can mute items, close issues, and give feedback on whether thresholds need to be updated.

As these actions are taken, the state in the app and in the Slack thread is updated. Everyone can track the state of their data pipelines, whether they are in Slack or viewing the issue in the Bigeye app.

Streamlined

We know the value of a data engineer’s time, and work hard to ensure every notification is worthy of their attention. The latest update streamlines notifications so they only occur when data changes cause a metric to alert, when an alerting metric becomes healthy, when the state of an issue changes, or when there are configuration changes that affect an issue. Getting a notification therefore means something has changed, which makes it more likely to be actionable and reduces unnecessary alert noise.
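The four conditions above can be read as a simple filter over events. The sketch below is one hedged interpretation, not Bigeye's actual logic. The `Event` fields and the `kind` strings are hypothetical, and it assumes that a metric which was already alerting and alerts again does not trigger a new notification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are illustrative assumptions."""
    kind: str                        # "metric_run", "issue_state", or "config_change"
    was_alerting: bool = False       # metric status before this run
    is_alerting: bool = False        # metric status after this run
    old_state: Optional[str] = None  # issue state before the event
    new_state: Optional[str] = None  # issue state after the event
    affects_issue: bool = False      # whether a config change touches an open issue


def should_notify(event: Event) -> bool:
    """Return True only when something relevant has actually changed."""
    if event.kind == "metric_run":
        # Healthy -> alerting, or alerting -> healthy. Assumption: a repeat
        # alerting run (alerting -> alerting) stays quiet.
        return event.was_alerting != event.is_alerting
    if event.kind == "issue_state":
        return event.old_state != event.new_state
    if event.kind == "config_change":
        return event.affects_issue
    return False  # unknown event kinds never notify


events = [
    Event("metric_run", was_alerting=False, is_alerting=True),   # starts alerting
    Event("metric_run", was_alerting=True, is_alerting=True),    # still alerting
    Event("metric_run", was_alerting=True, is_alerting=False),   # recovers
    Event("issue_state", old_state="new", new_state="closed"),   # state change
    Event("config_change", affects_issue=False),                 # unrelated change
]
for e in events:
    print(e.kind, should_notify(e))
```

Keeping the rule in one pure function makes it easy to test each condition in isolation and to add further conditions later.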

Summary

The updates to the notification experience make notifications more actionable and meaningful. They enable data reliability engineers (DREs) to interact with notifications and react to issues in their infrastructure from a computer or from devices such as tablets and phones. Because more can be tackled on the first pass, subsequent triage and review of data quality issues are faster and more streamlined.

