The Data + AI Iceberg: What's Below the Surface

Bigeye Staff
April 23, 2026

Garrett Flynn, Principal at KPMG, on the Data + AI iceberg: the chatbots, agents, and AI workflows organizations invest in are the visible tip, but whether they're trusted, sustainable, scalable, and safe is determined by eleven foundational disciplines below the waterline.


Garrett Flynn, Principal at KPMG, where he leads the Trusted Data offering, presented the iceberg framework in the closing session of the Bigeye AI Trust Summit. His published work on the topic includes Data Governance in the Age of AI and The Governance Shift: Building AI-Ready Data Models.

When organizations talk about AI, the conversation tends to center on the part that's visible: the chatbots, intelligent assistants, forecasting models, and generative search experiences the business can interact with. These are the use cases that earn executive attention and budget approval. They're also, as Garrett Flynn puts it, the tip of the iceberg. What determines whether they actually work is everything below it.

"What sits below the surface is what determines whether those use cases are trusted, sustainable, scalable, and safe," Flynn said. "If that submerged foundation is weak, the use cases may still launch, but they're going to struggle to survive contact with the real world."

The foundation Flynn maps consists of eleven disciplines. Taken individually, they're familiar categories. Taken together, they form the infrastructure that makes the visible part of the stack viable.

The foundation organizes into three layers, each doing different work

Flynn walks through all eleven disciplines in the session. Grouped by function, they form three distinct layers.

The first layer covers governance and risk: the authority structures that establish what AI is allowed to do and who is responsible for it. Risk management addresses the legal, regulatory, operational, and reputational risks that come with deploying AI in a given context. Without it, Flynn said, "innovation can move faster than control." Policies, standards, and guardrails define the rules of the road: what acceptable AI use looks like, what development standards apply, and where the boundaries are. Governance and accountability determines ownership across the full lifecycle: who approves a use case, how it gets prioritized, and how decisions get made from build through retire.

The second layer covers data trust: what AI models learn from and operate on. Data quality management addresses whether the underlying data is complete and fit for purpose. AI outputs, Flynn said, "will only be as reliable as the data underneath them." Data access defines who and what can interact with enterprise information, through entitlement models and controlled permissions. Data classification determines how data gets labeled, what permissions apply, and how agents understand what they're allowed to query in a given context. Security covers protection from misuse, compromise, disruption, and integrity loss across development and production environments.

The third layer covers runtime confidence: what makes deployments observable and defensible over time. Integrated trust controls address transparency and explainability. Monitoring and observability tracks model performance, data health, incidents, and drift on a continuous basis. Human oversight defines when people need to be in the loop and what the escalation paths are when something requires judgment. Business glossaries, data catalogs, lineage, and auditability form the metadata layer, connecting AI outputs back to the data and decisions that produced them.

Two of the eleven deserve closer attention: the access control shift that agentic AI requires, and the observability discipline that makes the rest of the foundation operational.

Data access requirements change as AI moves from chatbots to agents

Flynn highlights a shift in the data trust layer that most organizations haven't planned for yet.

Role-based access control (RBAC) works well when a human user is on the other side of every request; it's calibrated for predictable interactions at human pace. Agents query data at a different scale and scope. They operate across operational systems, cross-geography data, and multi-step workflows where what they need changes at each stage of execution.

"As we move from basic GenAI and chatbots to agentic," Flynn said, "that means there also needs to be an evolution from RBAC to ABAC."

Attribute-based access control (ABAC) evaluates permissions dynamically, based on the attributes of a specific request: what data is being accessed, who is requesting it, in what operational context, and for what purpose. Flynn also notes that data classification needs to evolve alongside it. Sensitivity labels alone are insufficient for agents. What's needed is business metadata classification: enough context for an agent to determine whether access is appropriate given the specifics of a query, not just whether a sensitivity tier permits it in the abstract.

The two components are linked in practice. ABAC operates on the attributes that classification defines. Organizations that haven't updated their classification frameworks will find that ABAC can't function at the level agentic AI requires.
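The RBAC-to-ABAC shift can be sketched in a few lines. This is a minimal illustration, not anything from Flynn's session or a real policy engine: the `AccessRequest` attributes and the sample policy are hypothetical, chosen to show how ABAC evaluates each request dynamically while RBAC relies on a static grant table.

```python
from dataclasses import dataclass

# Hypothetical request attributes an ABAC policy might evaluate.
@dataclass
class AccessRequest:
    principal: str        # who (or what agent) is asking
    purpose: str          # declared purpose of the query
    classification: str   # business-metadata label on the data
    region: str           # where the data resides

def rbac_allows(role: str, dataset: str, grants: dict) -> bool:
    # RBAC: a static role-to-dataset grant, decided ahead of time.
    return dataset in grants.get(role, set())

def abac_allows(req: AccessRequest) -> bool:
    # ABAC: permission evaluated per request from its attributes.
    # Example (invented) policy: customer PII may only be read for
    # a "support" purpose, and only within its home region.
    if req.classification == "customer_pii":
        return req.purpose == "support" and req.region == "us"
    return True

print(abac_allows(AccessRequest("agent-42", "support", "customer_pii", "us")))    # True
print(abac_allows(AccessRequest("agent-42", "marketing", "customer_pii", "us")))  # False
```

Note that `abac_allows` can only make this decision because the data carries a business-metadata classification; strip that attribute and the policy has nothing to evaluate, which is the dependency described above.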

Observability is what makes governance operational rather than documented

"If you can't observe it, you can't govern it, and you certainly can't scale it."

Flynn's line from the session closes the loop on all eleven disciplines. Governance policies define what should happen. Monitoring and observability is how the organization knows whether it is. Without continuous visibility into model performance, data health, incidents, and drift, governance stays as documented intent: policies written, accountability assigned, but no signal that either is working.

Flynn is specific about what monitoring needs to cover: model performance, data health, incident tracking, drift detection, and operational behavior tracked continuously throughout the system's life rather than as a one-time check after deployment.
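One of those checks, drift detection, can be sketched simply. This is an assumed, minimal approach (not Flynn's tooling or any particular vendor's): compare a recent window of a model input against a baseline window and alert when the mean shifts by more than a few baseline standard deviations.

```python
import statistics

def drifted(baseline: list[float], recent: list[float],
            threshold: float = 3.0) -> bool:
    # Flag drift when the recent window's mean sits more than
    # `threshold` baseline standard deviations from the baseline mean.
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline) or 1e-9  # guard against zero variance
    return abs(statistics.mean(recent) - mu) / sigma > threshold

baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
print(drifted(baseline, [10.1, 9.9, 10.0]))  # False: stable input
print(drifted(baseline, [14.0, 14.2, 13.8])) # True: distribution has shifted
```

The point of running a check like this continuously, rather than once at deployment, is exactly the one Flynn makes: the signal only exists if someone is watching throughout the system's life.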

Metadata closes the same loop from a different direction. "Without metadata," Flynn said, "it becomes much harder to explain, trust, or audit what AI is actually doing." Business glossaries, data catalogs, lineage, and traceability make it possible to trace an AI output back through the data sources and transformations that produced it, which is what regulators, auditors, risk teams, and leadership will ask for when an output needs to be explained.
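Traceability of this kind reduces to walking a lineage graph upstream. The record structure below is a toy assumption for illustration (real catalogs model this far more richly), but it shows the mechanic: given an AI output, recover the source datasets that fed it.

```python
# Toy lineage record (assumed structure): each derived asset lists
# the upstream assets that produced it; sources list nothing.
lineage = {
    "churn_forecast": ["features_v3"],
    "features_v3": ["crm_accounts", "billing_events"],
    "crm_accounts": [],
    "billing_events": [],
}

def trace(asset: str) -> list[str]:
    """Walk upstream from an AI output to its root source datasets."""
    sources = []
    for parent in lineage.get(asset, []):
        if not lineage.get(parent):
            sources.append(parent)        # a root source: nothing upstream
        sources.extend(trace(parent))     # keep walking through derived assets
    return sources

print(trace("churn_forecast"))  # ['crm_accounts', 'billing_events']
```

When an auditor asks why the churn forecast said what it did, this walk is the first answer: which data, through which transformations.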

The monitoring and metadata components together form the foundation's feedback mechanism. The other nine disciplines define what good looks like. These two tell you whether the stack is achieving it.

Watch Flynn's full session from the Bigeye AI Trust Summit.

about the author

Bigeye Staff

Bigeye Staff represents the collective voice of the Bigeye team. Each article is informed by the expertise of individual contributors and strengthened through collaboration across our engineers, data experts, and product leaders, reflecting our shared mission to help teams build trust in their data.

about the author

Garrett Flynn

Garrett Flynn is a Principal at KPMG, where he leads the Trusted Data offering.
