The Data + AI Iceberg: What's Below the Surface
Garrett Flynn, Principal at KPMG, on the Data + AI iceberg: the chatbots, agents, and AI workflows organizations invest in are the visible tip, but what determines whether they're trusted, sustainable, scalable, and safe is the eleven foundational disciplines below the waterline.

Garrett Flynn, Principal at KPMG, where he leads the firm's Trusted Data offering, presented the iceberg framework in the closing session of the Bigeye AI Trust Summit. His published work on the topic includes Data Governance in the Age of AI and The Governance Shift: Building AI-Ready Data Models.
When organizations talk about AI, the conversation tends to center on the part that's visible: the chatbots, intelligent assistants, forecasting models, and generative search experiences the business can interact with. These are the use cases that earn executive attention and budget approval. They're also, as Garrett Flynn puts it, the tip of the iceberg. What determines whether they actually work is everything below it.
"What sits below the surface is what determines whether those use cases are trusted, sustainable, scalable, and safe," Flynn said. "If that submerged foundation is weak, the use cases may still launch, but they're going to struggle to survive contact with the real world."
The foundation Flynn maps consists of eleven disciplines. Taken individually, they're familiar categories. Taken together, they form the infrastructure that makes the visible part of the stack viable.
The foundation organizes into three layers, each doing different work
Flynn walks through all eleven disciplines in the session. Grouped by function, they form three distinct layers.
The first layer covers governance and risk: the authority structures that establish what AI is allowed to do and who is responsible for it. Risk management addresses the legal, regulatory, operational, and reputational risks that come with deploying AI in a given context. Without it, Flynn said, "innovation can move faster than control." Policies, standards, and guardrails define the rules of the road: what acceptable AI use looks like, what development standards apply, and where the boundaries are. Governance and accountability determines ownership across the full lifecycle: who approves a use case, how it gets prioritized, and how decisions get made from build through retire.
The second layer covers data trust: what AI models learn from and operate on. Data quality management addresses whether the underlying data is complete and fit for purpose. AI outputs, Flynn said, "will only be as reliable as the data underneath them." Data access defines who and what can interact with enterprise information, through entitlement models and controlled permissions. Data classification determines how data gets labeled, what permissions apply, and how agents understand what they're allowed to query in a given context. Security covers protection from misuse, compromise, disruption, and integrity loss across development and production environments.
The third layer covers runtime confidence: what makes deployments observable and defensible over time. Integrated trust controls address transparency and explainability. Monitoring and observability tracks model performance, data health, incidents, and drift on a continuous basis. Human oversight defines when people need to be in the loop and what the escalation paths are when something requires judgment. Business glossaries, data catalogs, lineage, and auditability form the metadata layer, connecting AI outputs back to the data and decisions that produced them.
Two of the eleven deserve closer attention: the access control shift that agentic AI requires, and the observability discipline that makes the rest of the foundation operational.
Data access requirements change as AI moves from chatbots to agents
Flynn highlights a shift in the data trust layer that most organizations haven't planned for yet.
Role-based access control (RBAC) works well when human users are on the other side of every request; it's calibrated for predictable interactions at human pace. Agents query data at a different scale and scope. They operate across operational systems, cross-geography data, and multi-step workflows where what they need changes at each stage of execution.
"As we move from basic GenAI and chatbots to agentic," Flynn said, "that means there also needs to be an evolution from RBAC to ABAC."
Attribute-based access control (ABAC) evaluates permissions dynamically, based on the attributes of a specific request: what data is being accessed, who is requesting it, in what operational context, and for what purpose. Flynn also notes that data classification needs to evolve alongside it. Sensitivity labels alone are insufficient for agents. What's needed is business metadata classification: enough context for an agent to determine whether access is appropriate given the specifics of a query, not just whether a sensitivity tier permits it in the abstract.
The two components are linked in practice. ABAC operates on the attributes that classification defines. Organizations that haven't updated their classification frameworks will find that ABAC can't function at the level agentic AI requires.
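A minimal sketch of the contrast, with hypothetical attribute names and policy rules (not Flynn's or KPMG's specific model): RBAC answers from a static role table, while ABAC re-evaluates each request against the business-metadata classification carried by the data itself.

```python
from dataclasses import dataclass

@dataclass
class Request:
    principal: str          # the agent or user making the call
    purpose: str            # declared purpose of the query
    region: str             # operational / geographic context
    resource_labels: dict   # business-metadata classification on the data

# RBAC: a static role -> permission table, calibrated for human users.
ROLE_GRANTS = {"analyst": {"read:sales"}, "forecast-agent": {"read:sales"}}

def rbac_allows(role: str, permission: str) -> bool:
    return permission in ROLE_GRANTS.get(role, set())

# ABAC: evaluate the attributes of this specific request against the
# classification metadata on the data being accessed.
def abac_allows(req: Request) -> bool:
    labels = req.resource_labels
    # The declared purpose must be one the data was classified for.
    if req.purpose not in labels.get("approved_purposes", []):
        return False
    # Cross-geography access is denied unless the data is cleared for it.
    if labels.get("residency") not in (req.region, "global"):
        return False
    # Sensitivity tier still applies, but it is no longer the only signal.
    return labels.get("sensitivity") != "restricted"

req = Request(
    principal="forecast-agent",
    purpose="demand_forecasting",
    region="eu",
    resource_labels={
        "approved_purposes": ["demand_forecasting"],
        "residency": "eu",
        "sensitivity": "internal",
    },
)
print(rbac_allows("forecast-agent", "read:sales"))  # True, regardless of context
print(abac_allows(req))                             # True for this request's attributes
```

The contrast matters for agentic workflows because the ABAC decision can flip between steps of the same workflow: every request carries its own attributes, so the same agent can be allowed at one stage and denied at the next.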
Observability is what makes governance operational rather than documented
"If you can't observe it, you can't govern it, and you certainly can't scale it."
Flynn's line from the session closes the loop on all eleven disciplines. Governance policies define what should happen. Monitoring and observability is how the organization knows whether it is. Without continuous visibility into model performance, data health, incidents, and drift, governance stays as documented intent: policies written, accountability assigned, but no signal that either is working.
Flynn is specific about what monitoring needs to cover: model performance, data health, incident tracking, drift detection, and operational behavior tracked continuously throughout the system's life rather than as a one-time check after deployment.
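Drift detection is the easiest of those to make concrete. Here's a minimal sketch using one common statistic, the population stability index, with a hypothetical alert threshold; it illustrates continuous distribution monitoring in general, not an implementation prescribed in the session.

```python
import numpy as np

def population_stability_index(baseline, current, bins=10):
    """Compare a feature's current distribution against its baseline.

    PSI above roughly 0.2 is a common rule of thumb for meaningful drift.
    """
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    curr_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Clip empty bins to avoid division by zero and log(0).
    base_pct = np.clip(base_pct, 1e-6, None)
    curr_pct = np.clip(curr_pct, 1e-6, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)  # distribution at deployment
current = rng.normal(0.4, 1.2, 10_000)   # distribution observed today
psi = population_stability_index(baseline, current)
if psi > 0.2:  # hypothetical alerting threshold
    print(f"drift detected (PSI={psi:.2f}); open an incident")
```

Run continuously against production inputs rather than as a one-time check, this kind of signal is what turns a written policy about model reliability into something the organization can actually act on.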
Metadata closes the same loop from a different direction. "Without metadata," Flynn said, "it becomes much harder to explain, trust, or audit what AI is actually doing." Business glossaries, data catalogs, lineage, and traceability make it possible to trace an AI output back through the data sources and transformations that produced it, which is what regulators, auditors, risk teams, and leadership will ask for when an output needs to be explained.
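A toy sketch of what that traceability looks like in data terms (the record shapes and names here are hypothetical; real catalogs and lineage standards such as OpenLineage define far richer schemas): an output node that can be walked back through its upstream assets on demand.

```python
from dataclasses import dataclass, field

@dataclass
class LineageNode:
    name: str                 # dataset, transformation, model, or output
    kind: str                 # "source" | "transform" | "model" | "output"
    upstream: list["LineageNode"] = field(default_factory=list)

def trace(node: LineageNode, depth: int = 0) -> None:
    """Walk an output back through the assets that produced it --
    the question auditors and risk teams ask of an AI answer."""
    print("  " * depth + f"{node.kind}: {node.name}")
    for parent in node.upstream:
        trace(parent, depth + 1)

crm = LineageNode("crm.accounts", "source")
dedup = LineageNode("dedup_accounts_job", "transform", upstream=[crm])
model = LineageNode("churn_model_v3", "model", upstream=[dedup])
answer = LineageNode("agent_response_8f2c", "output", upstream=[model])
trace(answer)
```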
The monitoring and metadata components together form the foundation's feedback mechanism. The other nine disciplines define what good looks like. These two tell you whether the stack is achieving it.
Watch Flynn's full session from the Bigeye AI Trust Summit.