Data governance is a term that has gained importance in recent years, as enterprises started realizing the importance and impact of master data. Data is all around us, and it is not an exaggeration to say that data governance is part of the environment. I was exploring standard frameworks for data governance on the internet and found the one below, which is very relevant (thanks to The Data Governance Institute).
Sourced from The DGI Data Governance Framework © The Data Governance Institute – https://profisee.com/data-governance-what-why-how-who/
What is Master Data Governance?
Master data governance is a set of policies, rules and processes that ensure good-quality, connected and trusted master data for an organization and also make that data business ready. In any enterprise, data is distributed across different sources and can be updated many times by different stakeholders during its lifecycle. So it has become vital to apply data governance at each and every touch point.
Data governance is neither reactive nor proactive; it should be inherent in the business process. As an example, consider product master data governance and its expected flow. Here is the typical product life cycle.
Most of the time, companies apply governance only at setup and launch, but ideally it should run end to end.
Following are the challenges that enterprises face today, which make governance complex.
- Distributed systems
- No one source of truth
- Lack of system controls at touch points
- Connecting Unstructured and structured data
- Manual reports and audits.
- Many stakeholders
- Different needs of data from cross functional teams.
- No Standard definition of a data row or attributes.
All of these challenges can be overcome with the many advanced technologies, tools and products that have evolved in the recent past. (I will talk about them in my next blog.)
What Contributes to Data?
Coming back to the data components, here are the 2 important ones.
- Structured – Almost all enterprises have governance in some form or other
- Unstructured, semi-structured – Governance is evolving
How to enable a simple, automated Master Data Governance with all the expected outcomes?
- Data analytics techniques – Supervised machine learning techniques can be used to recommend values at the time of data creation, and similar techniques can be used to audit data quality.
- Programmatic rules for Data integration across different systems
- Assign a token to data that can control the data definition at each system and can also track the updates.
- Graph system to provide a 360-degree view of information around master data
- Unstructured data
- Digitize unstructured data and convert it into a useful form.
- Standard Templates
- Standard Publish Mechanism
- Automated Machine Learning process to make sure Governance rules are followed while creating the content.
- Governance of data at the time of curating it. If it is a product master, then governance should start from the concept commit of a product.
- Controlled Roles and Responsibilities & Programmatic rules to ensure data quality
- Minimize the touch points
- Use of supervised learning to make sure master data will be business ready without any issues
- Simplify the rules to avoid conflict between rules.
- Simplify compliance and secure data, no matter where it resides, and create more fluidity in data distribution.
- Reduce the cost of data governance by leveraging a programmatic or data analytics approach
- High quality, Connected and Accurate data.
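To make two of the points above more concrete (programmatic rules for data quality, and a token that tracks updates), here is a minimal Python sketch. Everything in it is an assumption for illustration: the helper names (`require`, `one_of`, `validate`, `governance_token`), the product fields and the lifecycle status values are hypothetical, and a content hash is only one possible way to read the "token" idea.

```python
import hashlib
import json
from typing import Callable, List, Optional

# A rule inspects one record and returns an error message, or None if it passes.
Rule = Callable[[dict], Optional[str]]


def require(field_name: str) -> Rule:
    """Build a rule that fails when the field is missing or empty."""
    def check(record: dict) -> Optional[str]:
        if not record.get(field_name):
            return f"missing required field: {field_name}"
        return None
    return check


def one_of(field_name: str, allowed: set) -> Rule:
    """Build a rule that fails when a present value is outside the allowed set."""
    def check(record: dict) -> Optional[str]:
        value = record.get(field_name)
        if value is not None and value not in allowed:
            return f"{field_name}={value!r} is not an allowed value"
        return None
    return check


def validate(record: dict, rules: List[Rule]) -> List[str]:
    """Run every rule against the record and collect the error messages."""
    errors = []
    for rule in rules:
        message = rule(record)
        if message is not None:
            errors.append(message)
    return errors


def governance_token(record: dict) -> str:
    """Content hash of the record: any attribute change yields a new token."""
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical rule set for a product master record.
PRODUCT_RULES = [
    require("sku"),
    require("name"),
    one_of("status", {"concept", "setup", "launch", "retired"}),
]

record = {"sku": "P-1001", "name": "Widget", "status": "launch"}
errors_before = validate(record, PRODUCT_RULES)
token_before = governance_token(record)

# A downstream system edits the record with a status the rules do not allow.
updated = dict(record, status="archived")
errors_after = validate(updated, PRODUCT_RULES)
token_after = governance_token(updated)

print(errors_before)
print(errors_after)
print(token_before != token_after)
```

The same rule list can run at every touch point, so each system applies one shared definition of a valid record, and comparing tokens between systems shows whether a record has changed since it was last checked.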
Next topic – Can a knowledge graph with ML be a good data steward to ensure data governance?
A formal definition of a data steward: a data steward is an oversight or data governance role within an organization, responsible for ensuring the quality and fitness for purpose of the organization’s data assets, including the metadata for those data assets. To fulfill this role, we need a person who has a strong understanding of the business function and the corresponding data. The challenge starts here: how do we identify the data stewards in an organization, and how do we measure what is expected of the role? Let me share my thoughts on it.
Let me switch topics and talk about graph systems. For everyone's understanding, here is a one-line definition:
A centralized, connected system that holds the properties of an entity and its relationships with other entities.
So in short, you can see every detail of a particular piece of master data (customer, product, supplier, partner, distributor) in one place, along with all the related information present in the organization. Following are the key benefits:
- Connecting all distributed systems in one place
- This will become the single source of truth
- Controls are taken care of at the time of creating the graph
- Any edit in the graph entities needs to adhere to the rules
- Systematically any outliers can be identified.
- A good graph ensures 100% good quality data.
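As a rough illustration of the graph idea, the sketch below keeps entities and their relationships in one place, rejects edits that break a relationship rule, and flags unconnected entities as outliers. It uses only the Python standard library; the class name `MasterDataGraph`, the entity types, the relation names and the sample IDs are all hypothetical, and a real graph system would look quite different.

```python
class MasterDataGraph:
    """Tiny in-memory graph of master data entities and their relationships."""

    def __init__(self, allowed_relations):
        # Set of (source_type, relation, target_type) triples that edits must obey.
        self.allowed_relations = allowed_relations
        self.nodes = {}   # entity_id -> {"type": str, "props": dict}
        self.edges = []   # list of (source_id, relation, target_id)

    def add_entity(self, entity_id, entity_type, **props):
        if entity_id in self.nodes:
            raise ValueError(f"duplicate entity: {entity_id}")
        self.nodes[entity_id] = {"type": entity_type, "props": props}

    def relate(self, source_id, relation, target_id):
        """Add a relationship only if both entities exist and the rule allows it."""
        for entity_id in (source_id, target_id):
            if entity_id not in self.nodes:
                raise KeyError(f"unknown entity: {entity_id}")
        triple = (self.nodes[source_id]["type"], relation,
                  self.nodes[target_id]["type"])
        if triple not in self.allowed_relations:
            raise ValueError(f"relation not allowed: {triple}")
        self.edges.append((source_id, relation, target_id))

    def neighbors(self, entity_id):
        """Everything directly connected to an entity, in either direction."""
        outgoing = [(rel, dst) for src, rel, dst in self.edges if src == entity_id]
        incoming = [(rel, src) for src, rel, dst in self.edges if dst == entity_id]
        return outgoing + incoming

    def orphans(self):
        """Entities with no relationships at all: candidates for review."""
        connected = {src for src, _, _ in self.edges}
        connected |= {dst for _, _, dst in self.edges}
        return sorted(eid for eid in self.nodes if eid not in connected)


graph = MasterDataGraph({
    ("supplier", "supplies", "product"),
    ("customer", "buys", "product"),
})
graph.add_entity("P-1001", "product", name="Widget")
graph.add_entity("S-01", "supplier", name="Acme")
graph.add_entity("C-77", "customer", name="Jane")
graph.relate("S-01", "supplies", "P-1001")

# An edit that breaks the rules is rejected instead of being stored.
try:
    graph.relate("C-77", "supplies", "P-1001")
    rejected = False
except ValueError:
    rejected = True

print(graph.neighbors("P-1001"))
print(graph.orphans())
print(rejected)
```

Here the rule check sits inside `relate`, so the control is applied at the moment the graph is edited rather than in a later audit, and `orphans` is one simple example of identifying outliers systematically.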
Stay tuned for my next blog on this topic.