Leading your organization to success through Agile Data Governance
Laura Madsen wants to challenge your outdated ideas about Data Governance. “I’m pretty sure that we wouldn’t use software that we used 20 years ago, but we’re still using Data Governance and Data Governance methodologies the same way we did 20 years ago.” And although she advocates for Agile, she’s not an Agile coach or a Scrum Master; rather, she wants companies to consider agility in a broader sense. “Very briefly, when we think about Agile, essentially, we think about reducing process steps.” She paraphrases David Hussman’s belief that there is no inherent value in “process”: it exists in order to prove to other people that “we’re doing something.” To that end, most organizations create an enormous number of process steps, which she refers to as “flaming hoops,” that show a lot of work went into activities such as status updates but provide no actual value.
Madsen is the author of Disrupting Data Governance, Chief Executive Guru at Via Gurus, and Mastermind at the Sisterhood of Technology Professionals (Sistech).
Resource Use
Historically, Data Governance has always been resource-intensive, and with Agile Data Governance in particular, she said, the most important resource is the individuals who do the work. The need for a data owner and a data steward for each domain, often with multiple stewards or owners covering the same data domain, etc., emerged as a system designed to serve data warehouses with hundreds of tables, and thousands of rows per table. “That’s a rather parochial idea in 2020, when we have petabytes of data blown through our data warehouses on any given day.”
One resource-heavy relic from the past is the standing committee, which always starts off with a lot of participation and enthusiasm, but over time people disengage and participation dwindles. Another historical shortcoming in Data Governance is the reliance on one or two individuals who hold the bulk of the institutional knowledge. With the amount of risk attached to Data Governance processes, the people who serve as the governance linchpin are under a lot of pressure to do more with less, so when they leave, the Data Governance program often collapses.
Instead, she recommends developing responsive and resilient capabilities by creating a dependency on multiple people with similar capabilities instead of one person who knows everything.
To make best use of time and resources, committees should be self-forming and project-based. Distinct functions must be created for participating resources: “And we should be laser clear about what people are doing.”
The Kitchen Sink
Still another legacy from the past is the tendency to take a “kitchen sink” approach, throwing compliance, risk, security, quality, and training all under the aegis of Data Governance, creating a lack of clarity in roles. “When you do everything, then you’re really doing nothing,” she said. Data stewards aren’t given formal roles or capabilities, and as such, they treat their governance duties as something they do on the side, “for fun.”
Madsen sees this as arising out of the very broad scope of the historical definition of Data Governance. Intersecting with so many different critical areas, Data Governance has become a catch-all. In truth, she said, instead of being wholly responsible for compliance, risk, security, protection, data usage, and quality, Data Governance lives in the small area where all of those domains overlap.
She considers this narrower focus essential to survival in modern data environments, especially now, when there are entire departments devoted to these areas. Expecting a Data Governance person to be fully accountable for compliance, privacy, risk, security, protection, data quality, and data usage “is a recipe for absolute disaster.” Today, she said, there is no excuse for being haphazard about what people are doing in those intersecting areas.
Four Aspects of Success
To succeed, companies must move away from the kitchen sink definition of Data Governance and focus on four aspects: increased use of data, data quality, data management, and data protection.
These categories will not need equal focus in every organization, and it’s expected that priorities will shift over time. Madsen showed a slide with some sample priorities that could be set with management input:
- Increased data use at 40% importance
- Quality at 25%
- Management at 25%
- Protection at 10%
From an Agile perspective, every sprint or increment can be measured against those values, creating “an enormous amount of transparency.” And although executives may not care about the specific tasks used to address those priorities, they will care that they are being tackled strategically, she said.
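As a rough illustration of measuring a sprint against those sample weights, the sketch below compares where a sprint's effort actually went with the target percentages from the slide. The category keys, the use of story points as the effort measure, and the sprint data are all assumptions made for the example; they are not taken from Madsen's talk.

```python
# Sample priority weights from the slide, expressed as fractions of 1.0.
# The dictionary keys are hypothetical labels chosen for this sketch.
TARGET_WEIGHTS = {
    "data_use": 0.40,
    "quality": 0.25,
    "management": 0.25,
    "protection": 0.10,
}


def effort_share(completed_items):
    """Return each category's share of the total effort in a sprint.

    completed_items is a list of (category, points) pairs, where points
    is whatever effort measure the team uses (story points assumed here).
    """
    total = sum(points for _, points in completed_items)
    shares = {category: 0.0 for category in TARGET_WEIGHTS}
    if total == 0:
        return shares
    for category, points in completed_items:
        shares[category] += points / total
    return shares


def gap_to_target(shares):
    """Positive values mean more effort than the target, negative less."""
    return {
        category: round(shares[category] - TARGET_WEIGHTS[category], 2)
        for category in TARGET_WEIGHTS
    }


# Made-up sprint for illustration: (category, story points).
sprint = [
    ("data_use", 4),
    ("quality", 8),
    ("management", 6),
    ("protection", 2),
]

shares = effort_share(sprint)
gaps = gap_to_target(shares)
for category, gap in gaps.items():
    print(f"{category}: actual {shares[category]:.0%}, gap {gap:+.0%}")
```

In this made-up sprint, data use received half of its target share while quality received more than its target, which is the kind of drift the weights are meant to expose.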
Increased Use of Data
If the work of Data Governance isn’t leading to greater use of data, she says, “What the heck are we doing?” Building data warehouses, creating dashboards, and delivering ad hoc analytics are only useful if they enable greater data use. All governance activity should be focused toward that end. The only way to get broad support for Data Governance is to increase the usage of the data.
Data Quality
Record counts and data profiling can show what’s in the data and whether or not the data is right, but analysis is not the same as data quality. “What we’re really driving towards here is the context of the data,” Madsen said, and context leads to increased data use. The core of Data Quality Management is ensuring the data has meaning, and the only way for the data to have meaning is to provide context.
Data Management
She talks specifically about the importance of lineage within the context of Data Management. Most end users only interact with their data at the front end when they’re entering something, and at the back end, when they see it on a report or a dashboard. Everything that happens in between those two points is a mystery to them, which creates anxiety or confusion about the accuracy or meaning of the end result. “Without lineage tools, without the capability of looking at and knowing exactly what happened from the source to the target, we lose our ability to support our end users.” For a long time those tools didn’t exist, but now they do, and those questions can be answered very quickly, she said.
Data Protection
Although Data Governance has a part in mitigating risk and protecting data, again, these are areas where governance should not be fully responsible. Instead, governance should be creating what Madsen calls “happy alliances” with those departments directly responsible for data protection, and focusing on facilitating increased data usage. Many organizations get this backward: If data is locked down to the point where it’s considered “completely safe,” risk may be under control, but no one is using it.
Moving into the Future/Sweeping Away the Past—Fixing Committees
Committees, she said, are not responsive, they’re not Agile and they don’t contribute to a resilient Data Governance structure. Theoretically, they do create a communication path of sorts, because a standing meeting at least assumes participants are paying attention for a specific period of time — until they lose interest.
What works better, she said, is self-forming Scrum teams or self-forming Agile teams that are on-demand or project-based, using a “backlog” (list of tasks) that becomes the working document for completing the company’s project list. “You come together, you work on the thing, and then you go your own separate ways.”
A sample self-forming Agile team might consist of a CDO serving as product owner; representatives from security, privacy, and IT, who create regulatory and IT standards; and executives from business departments like finance, sales, or operations, who might also serve as subject matter experts.
The backlog serves as a centralized document where data issues are tracked, responsibilities are outlined and milestones on the way to completion are logged.
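One way to picture such a backlog entry is as a small record that holds the issue, the responsible person, logged milestones, and a definition of "done." The sketch below is only an assumption about how a team might model this; the class name, field names, and sample data are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BacklogItem:
    """Hypothetical shape for one entry in a Data Governance backlog."""

    issue: str                      # the data issue being tracked
    owner: str                      # who is responsible for it
    exit_criteria: List[str]        # the agreed definition of "done"
    milestones: List[str] = field(default_factory=list)
    met: Set[str] = field(default_factory=set)

    def log_milestone(self, note: str) -> None:
        """Record a milestone on the way to completion."""
        self.milestones.append(note)

    def mark_met(self, criterion: str) -> None:
        """Mark one exit criterion as satisfied."""
        if criterion not in self.exit_criteria:
            raise ValueError(f"unknown exit criterion: {criterion}")
        self.met.add(criterion)

    def is_done(self) -> bool:
        """Done only when criteria exist and every one has been met."""
        return bool(self.exit_criteria) and set(self.exit_criteria) <= self.met


# Made-up example entry.
item = BacklogItem(
    issue="Customer address field has inconsistent formats",
    owner="Customer data steward",
    exit_criteria=["format standard agreed", "source system updated"],
)
item.log_milestone("Profiled address field across source systems")
item.mark_met("format standard agreed")
print(item.is_done())
item.mark_met("source system updated")
print(item.is_done())
```

Keeping the exit criteria on the item itself means anyone reading the backlog can see what remains before the work counts as finished.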
Traditional concepts like data ownership and data stewardship still have a part, but they are in service to a project or initiative rather than a fixed area or department. When the project is completed, the team disbands.
Named Data Stewards
Named data stewards serve as a resource for a particular project or area, such as the customer data domain. Named data stewards or owners for each area of responsibility should be published so that anyone can quickly and easily find the data steward for any particular domain.
On-Demand Data Stewards
“Everyone’s a data steward, just like everyone’s in charge of sales.” Anyone who has a question about the data and wants to know more is, in that moment, a data steward, she said, whether they are trained for it or not. By taking ownership of a question and being willing to find an answer, the “on-demand” steward gains the ability to help the organization do a better job in that particular moment. “Ownership is so integral to successful deployment of any data function in an organization.”
Ensuring Success
To sum up, Madsen recommends starting a backlog, using it to consistently document exit criteria (your definition of “done”), and committing to actively managing it. Start thinking like a Data Governance product owner, keep communications open among intersecting areas — those “happy alliances” — and keep the ultimate goal of increased data use in mind. Focus on progress over perfection, she says, “And then just keep swimming, just keep swimming …”
Author: Amber Lee Dennis
Source: Dataversity