Last Thursday I had the pleasure of presenting "AI Trust Supported by Governance" at the DAMA NYC Data Day. This was thanks to Danette McGilvray, who recommended me because we share the view that the fundamentals of data quality come first in making AI work.
Those who know me understand that I don't do the whole selfie thing, but I do like to share some of the lessons I pick up at these sorts of events.
Today's post is going to cover three areas: (1) A plug for getting involved in DAMA locally; (2) A review of some great points from the DAMA event, mainly two presentations I found extremely valuable; (3) A recap of my session and the 'other items' that came up while I was presenting.
Why getting involved in DAMA matters
Eighteen months ago, when my daughter graduated from the Ag School at Colorado State, the keynote speaker was a past graduate and former president of the US Potato Growers Association. He really pushed the students not just to live their lives and do their jobs, but to get involved. It was a great speech, and it matches my perspective on life and the value we bring to our communities. In a world where the AI and data space is changing so much, we should get involved. We should also get involved in efforts that are about making people better and making a real difference, not stroking our own egos; to me, DAMA is the best place to do that. Whenever I am asked, I try to help out DAMA; I have done this in NYC, Ohio, Wisconsin, and Vancouver, among others. You should do the same, both by presenting and by attending their events and encouraging your younger staff to join as well. DAMA sessions often draw a very 'experienced' set of attendees, while the more junior folks tend to be missing. Try to help with this.
What stood out at DAMA NYC Data Day
Using AI to lighten the data steward's load
Kelle O'Neal presented the keynote, "Unleashing the Power of AI for Data Management". While some may think she shared some 'captain obvious' content, I found her presentation insightful and helpful for nearly everyone. She created an imaginary company and covered how AI capabilities can lighten the load of data professionals, mostly data stewards, and make a big difference. This idea of using AI to lighten the load is one I have been very involved with and have spoken about before; to me it is the great benefit AI gives to data professionals when we focus on outcomes.
In her session she spoke of Data Capabilities, Metadata Management, Data Quality Management, Master Data Management, Data Lineage, and Privacy & Compliance. She used this fictional company to describe its challenges and to show how far an AI-Powered and Steward-Elevated team can go.
She spoke often about using AI to build these capabilities. The one thing I would add is that there are often other capabilities in the marketplace that can deliver that value more effectively, and I would look to those either as an investment or as a way to better understand what is possible. This approach gives you a yes (check-the-box) that it can be done, but also a recognition of what you are capable of. Firms should be looking for capabilities that deliver AI without big development cycles.
Escaping the shiny object
Nicolas Averseng of DataGalaxy also gave a good presentation on AI, and a few things struck a chord with me. He spoke of escaping the "shiny object": focusing on the new LLM or other technology without an outcome focus means you are destined to fail. Focus on context. I wholeheartedly believe in this; a few years ago, when prompt engineering was all the rage, folks would talk about clarity, concision, and context. Nicolas did a good job discussing how to bring context together to make a difference. This includes automating or speeding up curation, understanding where data comes from, and moving from a catalog for humans to a library (like MCP) for agents, combining technical solutions such as agents with human involvement.
The bottom line for me is that we all need a solid understanding of the fundamentals to build AI applications with context, along with data management approaches that address a wide range of capabilities within our data programs.
The idea of using AI in the data stewardship realm to build out descriptions is a nice one that many are pursuing today. I expanded on this when discussing AI Trust and, more specifically, “Sensitive Data Scanning.”
Overview: AI trust for governance
While I presented on AI Trust in Governance, I also found great value in the Q&A and other off-book discussions. In this next section I will provide an overview of AI Trust in Governance, and then follow up with some other insights.
First, I tried to share my evolved doctoral content as a whole and tie it to the AI Trust components. Those of you who have read my former white papers, attended data governance training I have run, or listened to other blogcasts or seminars I have done over the years will find this familiar. The difference is that AI Trust focuses on Data Quality, Data Sensitivity, and Certification of Assets, which I will discuss briefly here; more information on the “House of Data” can be found here.
Data Quality - This is a focus area that must be in place to achieve AI Trust. That is, to get AI to work correctly you need data quality. That is old news; we hear it all the time from consultants, software companies, CDOs, and evangelists. The key is to get more specific. Data Quality for AI Trust falls into two buckets: pipeline reliability and data quality.
Pipeline Reliability - This means confirming that your data is updated on time and arrives in the volumes you would expect based on past experience. It's great to have schedulers run your pipelines, detect errors, and act on those errors, but schedulers alone aren’t enough; pipeline reliability takes it a step further.
Data Quality - Data Quality has been around for years, and we have gone through perhaps four generations of it. The world of AI makes it absolutely mandatory, and it needs to be done efficiently and effectively. The ability to use out-of-the-box metrics, functional business rules, and engineering-enhanced strategies to get data right and keep it right is critical.
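To make the pipeline reliability idea concrete, here is a minimal sketch of the two checks described above: is the data updated on time, and is the volume in line with past experience. It is an illustration under stated assumptions, not a description of any particular product. The `LoadRecord` structure, the function names, and the three-standard-deviation tolerance are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, stdev


@dataclass
class LoadRecord:
    """One completed pipeline load (hypothetical structure for this sketch)."""
    finished_at: datetime
    row_count: int


def is_fresh(latest: LoadRecord, now: datetime, max_age: timedelta) -> bool:
    """Return True when the most recent load finished within the allowed window."""
    return now - latest.finished_at <= max_age


def volume_is_expected(history: list[LoadRecord], latest: LoadRecord,
                       tolerance: float = 3.0) -> bool:
    """Return True when the latest row count sits within `tolerance`
    standard deviations of the mean row count of past loads.

    The tolerance of 3.0 is an assumed default, not a recommendation.
    """
    counts = [record.row_count for record in history]
    if len(counts) < 2:
        # Not enough history to judge, so do not raise an alarm.
        return True
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        # Every past load had the same size; any change is unexpected.
        return latest.row_count == mu
    return abs(latest.row_count - mu) <= tolerance * sigma
```

A scheduler can report that a job ran without error; checks like these ask the further question of whether what arrived looks right. In practice the thresholds would be tuned per dataset.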
Data Sensitivity - There is a need to find the data that should or should not be used by AI applications. There is some data that should never, never, never be used for AI applications, and probably not for BI applications either. AI and BI applications need to know the classifications (public, private, internal, confidential, restricted) of their data, as well as key characteristics such as PII, PCI, and PHI, among others. AI Trust needs the capability to use AI classifiers (an AI industry term) to automate these categorizations. Asking privacy and stewardship teams to maintain this manually doesn’t scale, so use sensitive data scanning to move things forward.
Certification - There is also a need to look at data quality, sensitive data scanning, and intent and bias prevention together to certify what data can be used. Certifying data that is appropriate for use is the final area to focus on to achieve AI Trust.
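As a rough illustration of how sensitive data scanning and certification might fit together, the sketch below tags sample values and then gates an asset for AI use. Note the substitution: it uses two simple regular expressions as a stand-in for the AI classifiers mentioned above, and it leaves out intent and bias prevention entirely. The `Asset` structure, the pattern set, and the choice of blocked classification levels are assumptions made for this example only.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only. A real scanner would rely on trained
# classifiers and a far broader set of detectors.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Assumed policy for this sketch: these levels are never certified for AI.
BLOCKED_LEVELS = {"confidential", "restricted"}


def scan_values(values) -> set:
    """Return the set of sensitivity tags found in a sample of values."""
    tags = set()
    for value in values:
        for tag, pattern in PATTERNS.items():
            if pattern.search(str(value)):
                tags.add(tag)
    return tags


@dataclass
class Asset:
    """A data asset under review (hypothetical structure)."""
    name: str
    quality_passed: bool
    classification: str  # public, private, internal, confidential, restricted
    sensitive_tags: set = field(default_factory=set)


def certify_for_ai(asset: Asset):
    """Return (certified, reasons). An asset is certified only when
    every check passes; reasons lists each failed check."""
    reasons = []
    if not asset.quality_passed:
        reasons.append("data quality checks failed")
    if asset.classification in BLOCKED_LEVELS:
        reasons.append(f"classification '{asset.classification}' is blocked")
    if asset.sensitive_tags:
        reasons.append("sensitive data found: " + ", ".join(sorted(asset.sensitive_tags)))
    return (not reasons, reasons)
```

Returning the reasons along with the decision is a deliberate choice here: a steward reviewing a rejected asset needs to know which check failed, not just that certification was denied.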
There are plenty of stories to tell about what happens when firms don’t place operational focus on data quality, sensitive data scanning, and certification. For more information on these topics, reach out to me on LinkedIn.
What came up off-script
I admit I was comfortable on stage at the DAMA NYC Data Day, largely due to being in a familiar part of New York. I have many great memories of expanding data capabilities and of talking about a topic I love, but the off-book discussions were the highlight of my day. The following are some of the comments I made:
1. Data Governance - I shared with the group that these fundamentals are vital to work across in a solid OpsModel: (1) People; (2) Data Literacy; (3) DataOps; (4) Data Quality; (5) Data Privacy; (6) Data Security; (7) Compliance; (8) Data Enablement; (9) Data Consumption; (10) Data Architecture. But I am done trying to convince others of the importance of Data Governance. It's been 18 years since I started my doctoral work on the ‘House of Data’, and as an industry we are finally progressing toward the ideas I started speaking about in 2010, but I will let others be the cheerleaders who convince people that “Data Governance is Important.”
2. Barker Venn - I shared that I think this OpsModel is best achieved through a Venn diagram of core capabilities: not another framework, and not the DAMA or DCAM wheels, but one that brings together:
   - Six Sigma - The set of tools and capabilities to focus on continuous improvement as you run your data program. By leveraging these tools, everything can improve and drive greater business value.
   - Agile - Take advantage of the various flavors that work for your firm to get the most value out of your data program. Don’t invent new things; take advantage of internal inertia to get alignment and build momentum for your program.
   - Customer Success - Most data people haven’t spent much time delivering as a CSM (Customer Success Manager) but have been a CSM’s customer. If you ever have a chance to be a CSM, it's helpful to compare what they do with what you should be doing when running a data program: things such as calls-to-action (CTAs), focusing on the voice of the customer, promoting success, building relationships, and monitoring success.
   - PMP - Spend time to really leverage great project management skills, build out team capabilities, and bring things together in building, running, and promoting success.
   - No Heroes - I strongly advocate for the development of strong cross-functional teams and try to fight the urge for data heroes. I see the idea of data heroes, pushed by some software companies, as a weak way to cover up for the sins of software and teams.
3. Two Words - When I am working with companies to modernize their data management and data governance capabilities, I use two words:
   - Alignment - Many data programs fail due to a lack of alignment: too much infighting, too little cooperation, and too little alignment across key groups such as Audit, GRC, CISO, Functional Teams, Data Privacy, and many others. Strive to get better alignment and you will go far.
   - Momentum - Do things to promote success, educate folks, and build momentum. Build a data riptide that has these different groups working together with alignment and pushing forward to make real progress. If alignment gets you to go far, momentum gets you there faster.
Closing thoughts
While everyone wants to talk about dirty-in, dirty-out, or how you can’t be successful without data for AI, to really make things work we need an OpsModel that is focused on AI Trust. That means, at a minimum, bringing together data quality, sensitive data scanning, and certification of data to make sure the AI applications and prompts in use get great outcomes and help your business prosper.
And remember, in all that we do in AI and Data... at the end of the day “Data Matters.”