Jim Barker
Thought leadership
March 23, 2026

The House of Data Series: Data Literacy

This paper focuses on what it means to read, work with, and communicate about data at different skill levels — and what literacy requires in the age of AI. It does not cover training program design or enablement tooling in depth — those are addressed in the Data Enablement whitepaper.

House of Data Series

Every strong data program is built like a house. Data Architecture forms the foundation — the platforms, pipelines, and operating model that everything else depends on. Seven domain pillars rise from that foundation, each one essential to a complete data program: Data Quality, Privacy, Data Security, DataOps, Compliance, Data Enablement, and Data Consumption. Data Literacy runs across all seven as a connecting beam, ensuring people at every level can read, interpret, and act on data. At the top, People & Leadership sets the direction, accountability, and culture that holds the whole structure together.

This series of whitepapers covers each component of the House of Data in depth. Each paper was written by a practitioner with direct experience in that domain. Together, they form a practical guide to building data programs that earn — and keep — trust.

This paper covers Data Literacy — the connecting beam of the House of Data, running across all seven pillars and serving as the prerequisite for any of them working as intended. A data program with strong architecture, quality controls, and security posture still fails if the people using the data can't read, interpret, and question it.

Data literacy

We often hear about organizations being data-driven, data-first, or guided by a data strategy. We hear politicians call for data-driven legislation, and state assessments test our kids on data analysis. What we hear far less about is data literacy. Much of the material available on the topic starts with literacy and then falls back into the same tired discussion of data analytics we see elsewhere.

A lack of data literacy leads to confusion, distortion, and delay. This white paper focuses on the need to build data literacy in our organizations so that we make better decisions and use data more appropriately.

Val Logan went from being a top-flight analytics consultant, to leading the charge for data literacy at Gartner, to founding The Data Lodge, a company focused on data literacy. The Data Lodge defines data literacy as: "The ability to Read, Write, and Communicate with data in context in both work and life."

The Data Lodge then breaks this notion down into mindset, language, and skills. This paper is organized around those three areas and also discusses: (1) getting started with data literacy; (2) evaluating data literacy progress; (3) conflation of data literacy; (4) the importance of data communication; (5) the varied relationship between data storytelling and data literacy.

Data literacy defined

There are a few definitions for data literacy worth considering.

The Oceans of Data Institute defines the data literate individual as one who "understands, explains and documents the utility and limitations of data by becoming a critical consumer of data, controlling [one's] personal data trail, finding meaning and taking action based on data. [One] can identify, collect, evaluate, analyze, interpret, present and protect data."

The Data Literacy Project defines data literacy as the ability to explore, understand, and communicate with data in a meaningful way. This can happen at different levels: technical and advanced, or much more basic.

A public domain definition is: data literacy is the competence to access, interpret, evaluate, and communicate data to derive insights, make informed decisions, and question conclusions based on data.

Another public domain definition is: data literacy is the ability to read, understand, analyze, and communicate with data in a meaningful way. It combines technical understanding with critical thinking and business context, enabling individuals to use data effectively for decision-making.

For this white paper we will use The Data Lodge definition: the ability to Read, Write, and Communicate with data in context in both work and life.

Extended definition and classes of data use

To expand that context, consider some classes of data use that describe the maturity of the data user. The more experience people have with data, the more skill they bring to drawing out its meaning and growing the business benefit.

Data literacy isn't binary. People across an organization sit at different points on a spectrum from avoidance to expertise, and each level requires a different approach. Understanding where your people are is the starting point for any meaningful literacy program.

Experience level | Description
Data avoider | Often skeptical of data-driven decisions and processes, data avoiders rely on intuition and tribal knowledge. They need to see concrete benefits before engaging. Awareness training is the prerequisite; the goal is to help them recognize their existing strengths as a foundation for building data literacy.
Data newcomer | Still in the early stages of data literacy, newcomers have recognized the value of working with data in their current roles. They need foundational learning in data and analysis, critical thinking, and eventually more advanced concepts in visualization and storytelling.
Data apprentice | Motivated to go deeper, apprentices are developing skills in data science, algorithms, and statistical analysis. They're also building leadership and mentoring capabilities with an eye toward becoming internal data advocates. Storytelling skills are the key lever for this group.
Data guru | The most data-literate employees in the organization, gurus have advanced analytical skills and often carry data science experience. Their development focus shifts to leadership, mentoring, and staying current on emerging methodologies so they can serve as evangelists and help raise the floor across the rest of the organization.

Data literacy in 2025

In 2025, our attention span is gone. With the advent of the GenAI "Ask a Question, Get an Answer" culture, people rarely take the time to read, comprehend, integrate, and critically analyze; they just take the answer and run with it.

Additionally, the need for data literacy has changed. It used to be about reading, interpreting, and manipulating data. Now we need to critically analyze what we find and not be afraid to ask questions after reviewing an answer, result, or report. Questions like these:

  1. Wait a minute. Does this make any sense?
  2. This can't be right. How do I dig deeper to verify this result?
  3. Is this data of adequate quality? Should I even be seeing this?

Further, we need to build programs that promote consistency: anchor data sets on known truths, be clear about the goals of the research, cross-check and validate findings, and lean on subject matter experts for validation.

Document the work by recording the goals of the research, the key points or data anchors, the findings, and the next steps. Some organizations focus on being right at the end of the meeting, so take care to use data to be helpful, not as a blunt instrument to attack with.
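For teams that want to make the "does this make any sense?" question routine, anchoring on a known truth can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the Anchor class, the check_against_anchor function, the revenue figures, and the 2 percent tolerance are all hypothetical and would be replaced by whatever certified values and thresholds an organization actually trusts.

```python
from dataclasses import dataclass


@dataclass
class Anchor:
    """A known-good reference value that a reported figure can be checked against."""
    name: str
    expected: float
    tolerance_pct: float  # allowed deviation, as a percentage of the expected value


def check_against_anchor(reported: float, anchor: Anchor) -> str:
    """Return 'ok' if the reported value is within tolerance of the anchor, else 'investigate'."""
    if anchor.expected == 0:
        # Percent deviation is undefined for a zero anchor; fall back to an exact match.
        return "ok" if reported == 0 else "investigate"
    deviation_pct = abs(reported - anchor.expected) / abs(anchor.expected) * 100
    return "ok" if deviation_pct <= anchor.tolerance_pct else "investigate"


# Hypothetical anchor: a monthly revenue figure already certified by finance.
revenue_anchor = Anchor(name="certified_monthly_revenue", expected=1_200_000, tolerance_pct=2.0)

print(check_against_anchor(1_210_000, revenue_anchor))  # under 1 percent off the anchor
print(check_against_anchor(1_900_000, revenue_anchor))  # far outside the tolerance
```

An "investigate" result is not a verdict that the number is wrong. It is a prompt to dig deeper, cross-check, and bring in a subject matter expert before the figure is used in a decision.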

Mindset

Develop a positive mindset in your people that is focused on data trust. Further, promote the idea that data literacy isn't just about data; it is about growing intellectual capability across your organization.

Place a premium on staff who read, review, and build a broader understanding of the world we compete in and of what our competitive and cooperative climate looks like, then use that understanding to expand ideas and learning.

Also encourage doubt. Grow the idea that it is not only acceptable but encouraged to doubt what we think and to come up with new and different ideas that make everyone more helpful. This should be done in a way that is helpful, not hurtful, and that encourages growth.

Language

Take the time to build out a common language. This can be related to the way you run your business, interact in your teams, and relate to data and non-data objects.

Nearly all firms benefit greatly from a common set of business glossaries that unify language, solidify the use of acronyms, and bring people together. Maintain glossaries at the company level, not the departmental level. Focus on aligning the use of language even when it is difficult.

Some firms segregate terms across business units, geographies, or functions and then have the same term used to mean different things. Don't do that. Do the hard work to get alignment across these very different constituencies.
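One way to find where that alignment work is needed is to compare existing departmental glossaries and flag terms that carry more than one definition. The Python sketch below assumes each glossary can be exported as a simple term-to-definition mapping. The function name find_conflicting_terms and the sample sales and support entries are hypothetical.

```python
from collections import defaultdict


def find_conflicting_terms(glossaries: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return terms that at least two glossaries define differently.

    glossaries maps a glossary owner (for example a department) to its {term: definition} entries.
    The result maps each conflicting term to {owner: normalized definition}.
    """
    definitions_by_term: dict[str, dict[str, str]] = defaultdict(dict)
    for owner, entries in glossaries.items():
        for term, definition in entries.items():
            # Normalize case and whitespace so trivial differences are not flagged.
            normalized = " ".join(definition.lower().split())
            definitions_by_term[term.strip().lower()][owner] = normalized
    return {
        term: by_owner
        for term, by_owner in definitions_by_term.items()
        if len(set(by_owner.values())) > 1
    }


# Hypothetical departmental glossaries.
glossaries = {
    "sales": {
        "Customer": "Any account with a signed contract",
        "Churn": "A customer that did not renew",
    },
    "support": {
        "customer": "Any account with an open ticket in the last year",
        "churn": "A customer that did  not renew",
    },
}

conflicts = find_conflicting_terms(glossaries)
print(sorted(conflicts))  # only "customer" is defined two different ways
```

The output is only a starting list. Deciding which definition becomes the company definition is still the hard, human part of the work.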

Skills

Work with your teams and all levels of staff to build out their data literacy skills. These can include but should not be limited to:

Data literacy isn't a single skill — it's a cluster. The capabilities below form the foundation that every employee benefits from developing, regardless of role. Technical depth can be added on top, but none of it works without these fundamentals in place.

Skill | Description
Critical thinking | The ability to analyze and question data: challenging assumptions, applying logic to problems, and drawing from diverse sources rather than accepting the first answer. This is the most important skill in the set.
Research | Understanding what good research looks like: reliable, consistent, and free of bias. Includes evaluating sources, narrowing searches, and identifying both implicit and explicit bias in the data or the analysis.
Communication | The ability to translate what data is saying into clear, concise, and unbiased findings for a range of audiences, including stakeholders who don't work with data directly.
Vocabulary | A shared, consistent vocabulary that evolves with the organization. Misaligned terminology is one of the most common and underestimated sources of data confusion and bad decisions.
Domain knowledge | Staying current with industry trends and evolving data capabilities: reading, following relevant publications, and continually expanding the context in which data is interpreted.
AI savviness | As AI-generated outputs become part of day-to-day work, the ability to evaluate correctness, identify hallucinations, and distinguish fact from plausible-sounding fiction becomes a core literacy skill, not just a technical one.

Continue to build the other skills that relate to data literacy. Only with a strong command of those core skills can you focus on the more specific and challenging data literacy skills listed above.

Adapted from: Tableau

Conflation of data literacy

The current issue with data literacy is the same problem we see with data culture. Many organizations and people use the term to make a case for what they care about, which may have nothing to do with what data literacy really is.

As an example, some treat aspects of data analysis, design, development, and roll-out as data literacy. Others paint a picture in which data analysis, data wrangling, data visualization, the data ecosystem, and governance are data literacy. While these are important topics, they are not data literacy. They should be considered "Conflation of Data Literacy."

Importance of data communication

There is a widely held belief that communicating with data is a method of delivering messages generated through data analytics. Most believe that when this is done correctly, it helps the audience quickly and easily assimilate the material and draw the desired conclusions from it. Not that long ago, the debate was about the difference between data, information, and knowledge. Things have evolved, and now we try to tell data stories about the situation.

Varied relationship between data literacy and data storytelling

There are two buckets of data storytellers. The first holds a narrow view of data storytelling, based on analyzing the data they see and telling its story. This view is most completely laid out in the books from Mike Cisneros, which frame data storytelling as Data + Narrative + Call to Action. His latest book is Storytelling with Data Before-After. The second bucket is about inspiring leaders to take advantage of data in their organizations. Scott Taylor (billed as the "Data Whisperer") speaks loudly about what should be done, not just the most recent hot trend, and focuses on the broad set of needs around data management, spanning Master Data Management (MDM) and all other critical operational systems.

There is a need to tell the story about data literacy, get people to use data, doubt data, and draw the right conclusions.

Data literacy in relation to AI trust

Nowhere does data literacy have a bigger role to play than in AI trust. While AI trust is aligned with data quality, sensitivity, and the use of certified data, its core focus is fully aligned with the core focus of data literacy.

In the world of AI, one of the biggest criticisms is that people draw poor conclusions from what gets generated and don't take the time to understand what the data means. While there is a push for AI governance, the real need is to educate all users of AI in data literacy so they make good decisions about how they use AI-generated conclusions.

Data literacy should become the base for AI trust. By having educated staff who understand data and are literate in its meaning, risks, and benefits, entire organizations can reap vast benefits.

Getting started with data literacy

One question that is often asked relative to data literacy is "How do I get started?" This can look like a daunting task. The following is an idea of how to move forward, originally shared by The Data Literacy Project.

1. Plan and assess

  • Assess current state: Conduct a skills gap analysis to understand your organization's starting point and identify areas for improvement.
  • Define goals: Determine what you want to achieve with data literacy, who the learners are, and why you are launching the program.
  • Form a task force: Create a team to lead the initiative, and ensure stakeholders are aligned on the program's vision and objectives.

2. Build the foundation

  • Develop a framework: Create a data literacy roadmap, including key concepts like data governance, data ethics, and data quality.
  • Curate data resources: Identify reliable data sources and create a "measure library" with common metrics and definitions to ensure consistency.
  • Choose your tools: Select user-friendly analytics tools that help with data access, visualization, and a clear understanding of the data.

3. Enable and engage your team

  • Provide training: Design and deliver comprehensive training programs, which can be formal or informal, to teach essential skills like interpreting and visualizing data.
  • Offer hands-on practice: Create a "sandbox" environment for employees to safely experiment with data and practice new skills.
  • Empower champions: Identify and support data champions within the organization to help spread knowledge and encourage others.

4. Foster a continuous learning culture

  • Communicate the vision: Share the program's goals and progress with the entire organization to build buy-in and excitement.
  • Emphasize storytelling: Teach employees to not just analyze data, but to also use it to tell a compelling story that drives decisions.
  • Measure and iterate: Continuously evaluate the program's impact and use feedback to make improvements and scale the initiative.
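The skills gap analysis in step 1 can start very simply: record where each person sits on the avoider-to-guru spectrum and count who falls below the level a role calls for. The Python sketch below is one hypothetical way to tabulate that. The skills_gap function, the sample names, and the choice of "apprentice" as the target are assumptions for illustration, and how people are assessed in the first place is outside its scope.

```python
from collections import Counter

# Experience levels ordered from least to most experienced.
LEVELS = ["avoider", "newcomer", "apprentice", "guru"]


def skills_gap(assessed: dict[str, str], target: str) -> dict:
    """Summarize how many people sit at each level and who is below the target level.

    assessed maps a person to their assessed level; target is the level the role calls for.
    """
    target_rank = LEVELS.index(target)
    counts = Counter(assessed.values())
    below = sorted(
        person for person, level in assessed.items()
        if LEVELS.index(level) < target_rank
    )
    return {
        "counts": {level: counts.get(level, 0) for level in LEVELS},
        "below_target": below,
    }


# Hypothetical assessment results for one team.
team = {"ana": "guru", "ben": "newcomer", "cy": "avoider", "dee": "apprentice"}

report = skills_gap(team, target="apprentice")
print(report["counts"])
print(report["below_target"])
```

Re-running the same tally after each training cycle gives a basic way to measure and iterate, as step 4 recommends.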

House of Data reference

Data literacy is present throughout the House of Data. It requires that data is created, managed, and handled correctly. To build a data-literate business community, the data must be:

  1. Collected, transformed, and tested for accuracy and business fit.
  2. Of high quality, trusted by the business community, and fit for purpose.
  3. Classified and categorized for privacy so all understand what can be used, shared, or highly restricted.
  4. Secured, and only made available to those who have a business purpose for it.
  5. Shared inside analytical applications and rolled out with appropriate support so it is used properly.
  6. Kept within the policies the organization has established.

In short, data literacy needs a solid base, and then collaboration for effective and efficient usage.

Role of Bigeye in data literacy

Data literacy is important to Bigeye. Because we help customers move their data programs forward by building confidence in their data, our focus on data literacy itself is more limited and indirect. In practice, though, our renewed focus on AI trust aligns with what data literacy programs try to implement in the area of AI.

Bigeye offers functionality to show data quality, an important concept for data literacy. Bigeye also shows where data comes from (provenance or root-cause analysis), but the real benefits are in the area of AI trust.

Bigeye combines data quality checks and lineage to help users comprehend where data came from, how it is processed, and other information that increases understanding of their data.

Summary

Data literacy should be embraced. The smarter staff are about their data, their processes, and the possibilities that data can drive, the greater the benefits that will be realized.

about the author

Jim Barker

Director of Professional Services

Jim Barker is a lifelong data practitioner, industry thought leader, and passionate advocate for treating data as a strategic asset. With more than four decades of experience spanning data quality, governance, warehousing, migration, and architecture, Jim brings a rare blend of hands-on expertise and executive perspective to the evolving data landscape.

Jim’s journey in data began at just 14 years old. Since then, he has held leadership roles across organizations including Honeywell, Informatica, Thomson Reuters, Winshuttle (Precisely), Alation, nCloud Integrators, and Wavicle, contributing to advancements in data governance, migration methodologies, and enterprise data strategies. His work has included building global data quality programs, developing scalable governance frameworks, and driving innovation recognized across the industry.

His research and writing focus on lean data management, governance strategies, and the intersection of AI, data quality, and enterprise value creation.

Now at Bigeye as Director of Professional Services, Jim is energized by the company’s vision for data observability and its role in shaping the future of trusted data. He continues to share his perspectives through writing and speaking, aiming to elevate the conversation around data, cut through industry noise, and help organizations do data the right way.

Outside of work, Jim enjoys coaching and spending time with his family, often on the basketball court or soccer field, where many of the same lessons about teamwork, discipline, and leadership apply.

As Jim puts it: “Data matters.”
