Intelligence and unified data governance in the age of multi-cloud

Data mesh is a type of data architecture that makes data accessible, available, discoverable, secure and interoperable.

Today, organisations must adapt to an increasingly data-driven world and build analytic agility. However, that is easier said than done, given the varied sources of information organisations handle and the complexity of data handling, which includes data movement, data discovery, cleansing and preparing trusted data for analytics. The challenge is magnified twofold when you are unsure where your data is coming from and what it means. At the Data Engineering Summit 2022, Kirthi Ganapathy, customer engineering manager at Google Cloud, shared insights, key learnings and best practices around intelligent management of metadata, security and governance in a diverse and largely distributed data environment.

What is data governance?

Data governance, at its most basic level, is the practice of enhancing an organisation’s data to make it discoverable, understood, protected and trusted. Every enterprise should think about the entire data lifecycle, starting with data intake and ingestion and continuing through cataloguing, persistence, retention, storage, management, sharing, archiving, backup, recovery, disposition, and data removal and deletion.

A data governance framework has four main pillars:

  1. Data discoverability: Data classification, data lineage, metadata and catalogue, and data quality
  2. Data management: Lifecycle and records management, reference data, master data and SRE
  3. Data protection: Masking, encryption, access management, audit and compliance, residency and recoverability
  4. Data accountability: Ownership, policies and standards, domain-based governance and ethics
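
As a rough illustration of how the four pillars could be used as a checklist, the sketch below records which controls are in place for a dataset and reports any pillar that has none. The class name `GovernanceRecord`, the dataset name `finance.invoices` and the control labels are hypothetical and chosen only for this example; they do not come from the talk.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The four pillars named above, used here as checklist keys.
PILLARS = ("discoverability", "management", "protection", "accountability")


@dataclass
class GovernanceRecord:
    """Hypothetical record of the governance controls applied to one dataset."""

    dataset: str
    controls: dict[str, set[str]] = field(default_factory=dict)

    def missing_pillars(self) -> list[str]:
        # A pillar counts as missing when no control is recorded for it.
        return [p for p in PILLARS if not self.controls.get(p)]


record = GovernanceRecord(
    dataset="finance.invoices",  # illustrative name only
    controls={
        "discoverability": {"classification", "lineage"},
        "protection": {"encryption", "access management"},
        "accountability": {"ownership"},
    },
)

print(record.missing_pillars())  # ['management']
```

A real catalogue would track far more detail per control; the point here is only that each dataset can be checked against all four pillars rather than one.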

“Data governance encompasses the ways that people, processes and technology can work together to enable auditable compliance with defined and agreed upon policies across different technical solutions and different infrastructure boundaries,” Kirthi said.

Data priorities

“What organisations really want is to be able to derive insights from the data they have, without any restrictions, without necessarily moving it and in a way that makes sense to them,” Kirthi said.

An intelligent data fabric enables organisations to centrally manage, monitor and govern data across data lakes, data warehouses and data marts with consistent controls, providing access to trusted data and powering analytics at scale. It offers unified, metadata-led data management through a single pane of glass; centralised security and governance that enables distributed ownership with global control; built-in intelligence to unify distributed data without data movement; and an open platform with support for open source tools and a robust partner ecosystem.

What is a data mesh?

Data mesh is a type of data architecture that makes data accessible, available, discoverable, secure and interoperable. It combines two principles: domain-driven decentralisation and data as a product.

In domain-driven decentralisation, data is owned by the people who understand it best. For example, the finance team owns the finance data, and the HR team owns the HR and employee data. So no single centralised entity owns the whole organisation’s data. 

Under the second principle, data is considered a product. A team owns its data just as it would own a set of services for its business. In other words, you treat other teams as internal customers of your data.

Now let us delve into how to build a data mesh architecture. Building a data mesh involves:

  1. Organising data to map to your business: Logically organising data based on how it is used instead of where it is stored.
  2. Uniformly managing and governing data: Setting up standardised policies for access control, data quality, classification and lifecycle management.
  3. Accessing data from a variety of tools: Accessing distributed data from Google Cloud-native and open source tools with automatic metadata propagation and a unified experience.
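
The first two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any Google Cloud API: the `Asset` class, the storage URIs and the contents of `STANDARD_POLICY` are all hypothetical. It groups assets by the business domain that uses them, regardless of where they are stored, and attaches the same standard policy to every domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    domain: str    # business domain that owns and uses the data
    location: str  # physical storage location (hypothetical URI)


ASSETS = [
    Asset("orders", "sales", "warehouse://sales_dw/orders"),
    Asset("leads", "sales", "objectstore://raw-bucket/leads/"),
    Asset("payroll", "hr", "warehouse://hr_dw/payroll"),
]

# One standardised policy applied uniformly to every domain (illustrative values).
STANDARD_POLICY = {"retention_days": 365, "classification_required": True}


def organise_by_domain(assets):
    """Group assets by domain rather than by storage location."""
    mesh = {}
    for asset in assets:
        entry = mesh.setdefault(
            asset.domain, {"policy": dict(STANDARD_POLICY), "assets": []}
        )
        entry["assets"].append(asset.name)
    return mesh


mesh = organise_by_domain(ASSETS)
print(mesh["sales"]["assets"])  # ['orders', 'leads']
```

Note that the two sales assets end up in the same domain even though one sits in a warehouse and the other in object storage, which is the sense in which data is organised by use rather than by location.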

The Google Cloud way

“We have three data domains here, sales data, CRM data or customer data and product data, each of which can be implemented as a different data lake, with its respective data pipelines, enabling the respective product teams to set up a very fine-grained permission control, including at a sub-lake or zone level on each of these data lakes independently, as defined by the organisation’s best practices,” said Kirthi.
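
One way to picture permissions set at the lake level and at the zone level is the sketch below. It is a plain-Python model written for this article, not the Google Cloud permission system: the `Lake` class, the zone names `raw` and `curated`, and the principals are assumptions made for illustration. A principal can read a zone if it holds a grant on the whole lake or on that specific zone.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Lake:
    """Hypothetical model of one domain's data lake with per-zone grants."""

    name: str
    lake_readers: set[str] = field(default_factory=set)
    zone_readers: dict[str, set[str]] = field(default_factory=dict)

    def can_read(self, principal: str, zone: str) -> bool:
        # Unknown zones are never readable.
        if zone not in self.zone_readers:
            return False
        # A lake-level grant covers every zone; otherwise check the zone grant.
        return principal in self.lake_readers or principal in self.zone_readers[zone]


# Each domain team configures its own lake independently.
sales = Lake(
    "sales",
    lake_readers={"sales-team"},
    zone_readers={"raw": set(), "curated": {"finance-analysts"}},
)

print(sales.can_read("sales-team", "raw"))            # True
print(sales.can_read("finance-analysts", "curated"))  # True
print(sales.can_read("finance-analysts", "raw"))      # False
```

Separate `Lake` objects for the customer and product domains would carry their own grants, so a change made by one team does not affect the others.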

She further stated that with this architecture:

  1. Your organisation gets the freedom to store data where it wants, choose the best analytics tools and have flexibility in pricing and consumption models to meet financial governance needs.
  2. Built-in data intelligence leverages best-in-class AI/ML capabilities to automate data management and reduce manual toil.
  3. Metadata, security policies and data classification are standardised and unified.


Kartik Wali

A writer by passion, Kartik strives to gain a deep understanding of AI, data analytics and their implementation in all walks of life. As a Senior Technology Journalist, Kartik looks forward to writing about the latest technological trends that transform the way of life!