Data mesh vs data fabric: What’s the difference?

Data mesh and data fabric are not mutually exclusive concepts.

Data mesh is a highly decentralised data architecture designed to address challenges such as unclear data ownership, poor data quality and scaling bottlenecks. Its goal is to treat data as a product, with each source having a data product owner who may be part of a cross-functional team of data engineers. Data mesh, introduced by Zhamak Dehghani of Thoughtworks in May 2019, aims to overcome the problems of traditional data lakes and data warehouses.

Data fabric is an all-in-one integrated architectural layer that connects data and analytical processes. It leverages existing metadata assets to support the design, deployment and proper utilisation of data across all environments and platforms. Data fabric aims to accelerate inference from data through automated processes and provide real-time insights. It integrates data, analytics and dashboarding into one layer and serves as a management solution, allowing frictionless access in a distributed environment.

According to Gartner, data fabric is a design concept. The approach leverages continuous analytics over existing, discoverable and inferenced metadata assets to enable the design, deployment and utilisation of integrated and reusable data across all environments.

Approach: Automation vs human inclusion

Data mesh approaches data from a people- and process-centric view and treats data as a product. Data fabric leverages human and machine capabilities to access data in place or support its consolidation where appropriate. It combines technologies that connect data sources, types and locations with different methods for accessing the data. Gartner used the analogy of a self-driving car to explain the concept: data fabric first monitors the data pipelines as a passive observer and then suggests more productive alternatives. When both the data “driver” and the machine learning are comfortable with repeated scenarios, they complement each other by automating improvisational tasks while leaving the leadership free to focus on innovation.

Data fabric continuously identifies, connects, and enriches real-time data from different applications to discover relationships between data points. It does so by building a graph storing interlinked data descriptions that algorithms can use for business analytics.
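The graph idea can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular data fabric product works: the dataset names, the column sets and the rule "link two datasets when they share a column name" are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical metadata descriptions: dataset name -> set of column names.
catalog = {
    "crm.customers": {"customer_id", "email", "region"},
    "sales.orders": {"order_id", "customer_id", "amount"},
    "support.tickets": {"ticket_id", "customer_id", "opened_at"},
    "hr.employees": {"employee_id", "department"},
}


def build_metadata_graph(catalog):
    """Link two datasets whenever they share at least one column name."""
    graph = defaultdict(dict)
    names = sorted(catalog)
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            shared = catalog[left] & catalog[right]
            if shared:
                # Store the shared columns on the edge, in both directions.
                graph[left][right] = shared
                graph[right][left] = shared
    return graph


def related_datasets(graph, name):
    """Return the datasets directly linked to `name`, sorted by name."""
    return sorted(graph.get(name, {}))


graph = build_metadata_graph(catalog)
print(related_datasets(graph, "sales.orders"))
```

An analytics algorithm could then walk this graph to find datasets that can be joined, which is the kind of relationship discovery described above.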

Data storage: Centralised vs decentralised

In data mesh, data is stored in a decentralised way, within the domains of a company. Each node has local storage and computation power, and no single point of control is necessary for operation. Essentially, original data remains within domains and copies of datasets are generated for specific use cases.
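As a rough illustration of that ownership model, the sketch below keeps the original records inside a domain-owned data product and hands out independent copies for a given use case. The class name `DataProduct`, the method `copy_for` and the sample records are hypothetical and exist only for this example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset owned by one domain team (hypothetical model)."""
    domain: str
    owner: str
    records: list = field(default_factory=list)

    def copy_for(self, use_case, columns):
        """Return an independent copy limited to `columns` for one use case."""
        rows = [{c: copy.deepcopy(r[c]) for c in columns} for r in self.records]
        return {"use_case": use_case, "source_domain": self.domain, "rows": rows}


orders = DataProduct(
    domain="sales",
    owner="sales-data-team",
    records=[
        {"order_id": 1, "customer_id": 7, "amount": 120.0},
        {"order_id": 2, "customer_id": 9, "amount": 35.5},
    ],
)

# A consumer gets its own copy; changing it does not touch the original.
churn_view = orders.copy_for("churn-model", ["customer_id", "amount"])
churn_view["rows"][0]["amount"] = 0.0
print(orders.records[0]["amount"])
```

The original stays under the control of the owning domain, while each consumer works on its own derived dataset.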

In data fabric, data access is centralised: high-speed server clusters provide the networking and high-performance resource sharing.

Architecture

According to Thoughtworks, the data mesh paradigm is a strong candidate to supersede the data lake as the dominant architectural pattern in data and analytics. Data mesh introduces an organisational perspective, independent of specific technologies. Its architecture follows domain-driven design and product thinking to overcome challenges related to data. Data mesh culture is about connecting people and creating a federated structure of responsibilities.

While data fabric leverages metadata to drive recommendations, data mesh relies on subject-matter experts to oversee domains. These domains are independently deployable clusters of microservices that communicate with users. Each domain consists of code, workflows, a team and a technical environment.

Data fabrics work with and are mostly compatible with technical, business and operational data. Visualisation tools make the technical infrastructure easy to interpret and help organisations manage their storage costs, performance, security and efficiency. In addition, companies can deploy a singular data fabric virtually over various data repositories to manage disparate data sources and downstream consumers.
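A single virtual layer over several repositories can be pictured with the minimal Python sketch below. It is an assumption-laden toy: the `DataFabric` class, its `register`, `read` and `catalog` methods, and the in-memory "warehouse" and "lake" stores are invented for illustration and do not correspond to any vendor's API.

```python
class DataFabric:
    """Toy virtual access layer that routes reads to registered repositories."""

    def __init__(self):
        # dataset name -> (repository label, zero-argument reader function)
        self._sources = {}

    def register(self, dataset, repository, reader):
        """Record where a dataset lives and how to read it in place."""
        self._sources[dataset] = (repository, reader)

    def read(self, dataset):
        """Fetch a dataset through the fabric without knowing its location."""
        _, reader = self._sources[dataset]
        return reader()

    def catalog(self):
        """List every known dataset with the repository that holds it."""
        return {name: self._sources[name][0] for name in sorted(self._sources)}


# Two hypothetical repositories, modelled here as plain dictionaries.
warehouse = {"orders": [{"order_id": 1, "amount": 120.0}]}
lake = {"clickstream": [{"user": "u1", "page": "/home"}]}

fabric = DataFabric()
fabric.register("orders", "warehouse", lambda: warehouse["orders"])
fabric.register("clickstream", "lake", lambda: lake["clickstream"])

print(fabric.catalog())
print(fabric.read("clickstream"))
```

Downstream consumers call `read` with a dataset name only, so the data can stay in its original repository while access goes through one entry point.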

Data access: APIs vs controlled datasets

In data mesh, data is made available via controlled datasets. The information is first copied from the department data store to a shared location.

In data fabric, data is made available via objective-based APIs. The data is copied into specific datasets for specific use-cases, and the business unit that owns the data is in control.

Use cases

Data mesh is ideal for hybrid cloud networks. Data fabric enables single-point data access, addresses data quality and storage issues, and handles security threats.

It is critical to note that data mesh and data fabric are not mutually exclusive concepts. Organisations can leverage both approaches across different use cases. According to James Serra, Data & AI Solution Architect at Microsoft, the difference between the two concepts lies in how users access data. Data fabric and data mesh provide architecture to access data across multiple technologies and platforms, he said. “But a data fabric is technology-centric, while a data mesh focuses on organisational change. [A] data mesh is more about people and process than architecture, while a data fabric is an architectural approach that tackles the complexity of data and metadata in a smart way that works well together,” he added.

Avi Gopani

Avi Gopani is a technology journalist at Analytics India Magazine who seeks to analyse industry trends and developments from an interdisciplinary perspective. Her articles chronicle cultural, political and social stories that are curated with a focus on the evolving technologies of artificial intelligence and data analytics.