Branded Content

Centralising Data Governance through a Data Catalog

Data governance is the linchpin of effective data management, encompassing practices for quality, security, and utility. It establishes order, transparency, and trust, defining roles, implementing standardised processes, and leveraging technology. Essential in today’s data-driven landscape, it safeguards against risks, fosters collaboration, and supports confident, innovative decision-making.

In a recent conversation with Elliot Huebler, manager of data engineering and governance at Tredence, we delved into the intricate world of data governance and how data cataloguing plays a pivotal role in centralising and streamlining these efforts.

“I love governance and cataloguing because it’s the intersection of human and data,” said Huebler, who brings a wealth of experience from his background in galactic evolution astrophysics at the University of Michigan. He sheds light on the challenges organisations face and the innovative solutions Tredence employs to overcome them.

Huebler provides insights into Tredence’s journey, highlighting its evolution from an AI/ML solutions firm to one focused on data engineering. With a strong emphasis on data governance, he describes the pillars that constitute effective governance, ranging from data cataloguing to data quality, lineage, master data management, security, privacy policy, and organisational structure.

The Role of Data Catalogues in Governance

As Huebler explains, data cataloguing emerged as one of the first governance programs at Tredence. Recognising the complexity of implementing various niche solutions across different pillars, Tredence identified the need for a centralised approach.

The data catalogue, a tool designed not just for democratising data but also for centralising governance initiatives, became a key player in their strategy. “Data cataloguing, data quality, data lineage, master data management, security, privacy policy are all the pillars of data governance,” he said. “We needed a niche and centralised solution for all of this.”

Acknowledging the diversity of governance needs, Huebler provides an overview of the multitude of tools used by Tredence for different governance pillars. “From data catalogue tools like Alation, Collibra, and Microsoft Purview to enterprise data catalogues like Databricks’ Unity Catalog, the landscape is vast,” he explained. Custom solutions and vendor tools for data quality, security, and master data management also play crucial roles in their approach.

Huebler elaborates on Tredence’s approach to building custom data catalogues. “By starting small and focusing on a specific use case or domain, we create a robust data user journey, incorporating data quality checks, lineage, and other relevant metadata,” he said. This iterative process allows them to demonstrate the holistic value of a customised catalogue, paving the way for further scaling.

Improving Accessibility and Transparency with Data Catalogues

Comparing data catalogues to the Dewey Decimal System in libraries, Huebler emphasises the fundamental role of catalogues in making data easily navigable. “One of the beautiful things about a data catalogue is that it hardly ever actually looks at data. It’s just looking at metadata,” said Huebler. The metadata-centric approach ensures scalability and adaptability, allowing the catalogue to automatically pick up changes and additions without impacting data quality or security.

For security, Huebler clarifies that while data catalogues focus on metadata, tools or solutions analysing the actual data are necessary for assessing and improving data quality and security. The catalogue can, however, capture and display the results of these assessments, contributing to a comprehensive governance overview.

On scalability, Huebler said, “If new tables were to be added or old tables were to be deleted, it would automatically pick those up, scan them, and then once it’s scanned, you get a page in the catalogue for that asset. And that page in the catalogue has a wide range of different fields which you can fill in about the metadata. And those fields could be the description of the table.”
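The scan-and-sync behaviour Huebler describes can be illustrated with a minimal sketch. It is not Tredence’s implementation or any vendor’s API: the in-memory SQLite database, the `scan_metadata` and `sync_catalogue` helpers, and the table names are all assumptions made for illustration. The point it shows is that the catalogue reads only schema metadata, never table rows, and keeps a steward’s curated fields across rescans.

```python
import sqlite3


def scan_metadata(conn):
    """Return {table_name: [column names]} using only schema metadata."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info describes the columns; it does not read table rows.
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in cols]
    return schema


def sync_catalogue(catalogue, schema):
    """Add pages for new tables, drop pages for deleted ones, keep curated fields."""
    for table, columns in schema.items():
        # Existing pages keep their description; new pages start empty.
        page = catalogue.setdefault(table, {"description": "", "columns": []})
        page["columns"] = columns
    for table in list(catalogue):
        if table not in schema:
            del catalogue[table]
    return catalogue


# Hypothetical source system with two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE legacy (id INTEGER)")
catalogue = sync_catalogue({}, scan_metadata(conn))
catalogue["orders"]["description"] = "Customer orders"  # filled in by a steward

# A table is added and another is deleted; the next scan picks both up.
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("DROP TABLE legacy")
catalogue = sync_catalogue(catalogue, scan_metadata(conn))
print(sorted(catalogue))  # ['customers', 'orders']
```

After the second scan, the `orders` page still carries the description the steward entered, while the `legacy` page is gone and a blank `customers` page is waiting to be curated.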

Huebler also said that Tredence is experimenting with generative AI capabilities such as LLMs to make the data catalogue easier to use and more interactive, which he expects would drive more adoption among customers.

Challenges and Solutions in Data Catalog Adoption

“One of the most common things we see is that users just don’t engage with the catalogue,” explained Huebler. A common challenge faced by Tredence’s clients is the adoption of data catalogues. Huebler identifies the lack of user engagement as a significant hurdle, attributing it to factors such as bandwidth constraints, absence of executive sponsorship, and a perceived lack of value.

To address this, Tredence employs creative strategies, including curation competitions and gamified approaches, to make the data experience enjoyable and rewarding. “We make a bunch of materials, demos, to really just socialise our initiative across the enterprise. And hopefully we see that that leads to a higher level of stewardship engagement,” Huebler explained.

A layer of visualisation can also be built on top of a data catalogue. “You can get the user base engagement of the catalogue and a Power BI dashboard, and also the catalogue curation process progress. So, for that user-base engagement piece, we want to hold stewards accountable,” he added.

Tredence spent the last three months onboarding 20 users into the catalogue. “That’s likely because of our webinar session, and also there’s the other monitoring piece which is the catalogue curation progress,” he added. Beyond the user base, Tredence also wants to measure progress towards its governance goals, and the company is working on new methods to do so.
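The two monitoring signals mentioned here, user engagement and curation progress, could be computed along the following lines. This is only a sketch under stated assumptions: the `curation_progress` and `active_users` helpers, the required fields, and all sample pages, users, and dates are made up for illustration and do not come from Tredence or from any particular catalogue product.

```python
from datetime import date


def curation_progress(pages, required_fields=("description", "owner")):
    """Fraction of catalogue pages with every required field filled in."""
    if not pages:
        return 0.0
    curated = sum(
        1 for page in pages.values()
        if all(page.get(field) for field in required_fields)
    )
    return curated / len(pages)


def active_users(events, since):
    """Distinct users with at least one catalogue event on or after `since`."""
    return {user for user, day in events if day >= since}


# Illustrative catalogue pages; an empty string means the field is not curated yet.
pages = {
    "orders": {"description": "Customer orders", "owner": "sales-steward"},
    "customers": {"description": "", "owner": "crm-steward"},
    "shipments": {"description": "Outbound shipments", "owner": ""},
    "returns": {"description": "Returned items", "owner": "sales-steward"},
}

# Illustrative usage log of (user, day) pairs.
events = [
    ("ana", date(2024, 1, 5)),
    ("ben", date(2024, 2, 10)),
    ("ana", date(2024, 3, 1)),
    ("cy", date(2024, 3, 2)),
]

progress = curation_progress(pages)
recent = active_users(events, since=date(2024, 2, 1))
print(f"{progress:.0%} curated, {len(recent)} active users")
```

Figures like these are what a dashboard layer would chart over time, so that stewards can see how much of the catalogue remains to be curated.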

Huebler highlights the importance of executive buy-in for driving adoption and emphasises the need for monitoring tools. Success metrics include user engagement, content growth, and progress towards governance goals. Looking to the future, Huebler envisions data catalogues evolving to take on more governance features and anticipates exciting developments in AI, particularly in the realm of language models, making data interactions more intuitive and user-friendly.

Contributed as part of AIM Branded Content.

This article is contributed by Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.