Data mesh is a highly decentralised data architecture designed to address challenges such as unclear data ownership, poor data quality and scaling bottlenecks. Its goal is to treat data as a product, with each source having a data product owner who could be part of a cross-functional team of data engineers. Introduced by Zhamak Dehghani of Thoughtworks in May 2019, data mesh aims to overcome the problems of traditional data lakes and data warehouses.
Data fabric is an all-in-one integrated architectural layer that connects data and analytical processes. It leverages existing metadata assets to support the design, deployment and proper utilisation of data across all environments and platforms. Data fabric aims to accelerate inference from data through automated processes and to provide real-time insights. It integrates data, analytics and dashboarding into one layer and serves as a management solution, allowing frictionless access in a distributed environment.
According to Gartner, data fabric is a design concept. The approach leverages continuous analytics over existing, discoverable and inferenced metadata assets to enable the design, deployment and utilisation of integrated and reusable data across all environments.
Approach: Automation vs human inclusion
Data mesh approaches data from a people- and process-centric view and treats data as a product. Data fabric leverages both human and machine capabilities to access data in place or to support its consolidation where appropriate. It combines technologies that connect data sources, types and locations with different methods for accessing the data. Gartner used the analogy of a self-driving car to explain the concept: data fabric first monitors the data pipelines as a passive observer and then suggests more productive alternatives. When both the data “driver” and the machine learning are comfortable with repeated scenarios, they complement each other by automating improvisational tasks while leaving the leadership free to focus on innovation.
Data fabric continuously identifies, connects, and enriches real-time data from different applications to discover relationships between data points. It does so by building a graph storing interlinked data descriptions that algorithms can use for business analytics.
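To make the graph idea concrete, here is a minimal sketch in Python. It is not how any particular data fabric product works: the dataset names, the column sets and the rule "two datasets are related if they share a column name" are all assumptions made for illustration. Real fabrics infer relationships from far richer metadata.

```python
from collections import defaultdict, deque

# Hypothetical metadata: dataset name -> set of column names.
DATASETS = {
    "crm.customers": {"customer_id", "email", "region"},
    "billing.invoices": {"invoice_id", "customer_id", "amount"},
    "support.tickets": {"ticket_id", "email", "status"},
    "ops.servers": {"host", "cpu_load"},
}


def build_metadata_graph(datasets):
    """Link two datasets whenever they share at least one column name."""
    graph = defaultdict(dict)
    names = sorted(datasets)
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            shared = datasets[left] & datasets[right]
            if shared:
                # Store the shared columns on the edge, in both directions.
                graph[left][right] = sorted(shared)
                graph[right][left] = sorted(shared)
    return graph


def related(graph, start):
    """Return every dataset reachable from `start`, found breadth-first."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, {}):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(seen - {start})


graph = build_metadata_graph(DATASETS)
print(related(graph, "billing.invoices"))
```

In this toy example, invoices link to customers through `customer_id`, and customers link to support tickets through `email`, so an analytics algorithm walking the graph can discover that invoices and tickets are indirectly related even though they share no column.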
Data storage: Centralised vs decentralised
In data mesh, the data is stored decentrally within its domains inside a company. Each node has local storage and computation power, and no single point of control is necessary for operation. Essentially, original data remains within domains and copies of datasets are generated for specific use cases.
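The ownership model described above can be sketched in a few lines of Python. This is an illustrative assumption rather than a reference design: the `Domain` class, its `ingest` and `publish` methods and the sample records are hypothetical names invented for this example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A hypothetical domain that keeps its original data in local storage."""
    name: str
    owner: str  # the data product owner for this domain
    _records: list = field(default_factory=list)

    def ingest(self, record):
        # Original data stays inside the domain.
        self._records.append(record)

    def publish(self, use_case, fields):
        """Return an independent copy limited to the fields a use case needs."""
        rows = [
            {key: copy.deepcopy(record[key]) for key in fields}
            for record in self._records
        ]
        return {
            "domain": self.name,
            "owner": self.owner,
            "use_case": use_case,
            "rows": rows,
        }


sales = Domain(name="sales", owner="sales-data-team")
sales.ingest({"order_id": 1, "customer": "acme", "total": 120.0})
sales.ingest({"order_id": 2, "customer": "globex", "total": 75.5})

# A consumer receives a copy tailored to its use case.
revenue_view = sales.publish("revenue-report", ["order_id", "total"])
revenue_view["rows"][0]["total"] = 0.0  # changing the copy...
print(sales._records[0]["total"])       # ...leaves the domain's original intact
```

Because each published dataset is a copy, consumers can reshape it freely while the owning domain remains the single authority over the original records.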
In data fabric, data access is centralised: high-speed server clusters handle networking and high-performance resource sharing.
According to Thoughtworks, the data mesh paradigm is a strong candidate to supersede the data lake as the dominant architectural pattern in data and analytics. Data mesh introduces an organisational perspective, independent of specific technologies. Its architecture follows a domain-driven design and product thinking to overcome challenges related to data. Data mesh culture is about connecting people and creating a federated responsibilities structure.
While data fabric leverages metadata to drive recommendations, data mesh relies on subject-matter experts to oversee domains. These domains are independently deployable clusters of microservices that communicate with users. Each domain consists of code, workflows, a team and a technical environment.
Data fabrics work with, and are largely compatible with, technical, business and operational data. Visualisation tools make the technical infrastructure easy to interpret and help organisations manage their storage costs, performance, security and efficiency. In addition, companies can deploy a single data fabric virtually over various data repositories to manage disparate data sources and downstream consumers.
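As a rough illustration of one virtual layer sitting over several repositories, consider the Python sketch below. The `DataFabric` class, its `register`, `catalog` and `read` methods, and the in-memory "warehouse" and "lake" dictionaries are assumptions made for this example; they do not correspond to any vendor's API.

```python
class DataFabric:
    """Hypothetical single access point that routes reads to repositories."""

    def __init__(self):
        # dataset name -> (repository label, callable that reads the data)
        self._sources = {}

    def register(self, dataset, repository, reader):
        self._sources[dataset] = (repository, reader)

    def catalog(self):
        """Map each known dataset to the repository that holds it."""
        return {name: repo for name, (repo, _) in sorted(self._sources.items())}

    def read(self, dataset):
        if dataset not in self._sources:
            raise KeyError(f"unknown dataset: {dataset}")
        _, reader = self._sources[dataset]
        return reader()  # data is read in place, not copied into the fabric


# Two stand-in repositories; in practice these would be separate systems.
warehouse = {"orders": [{"order_id": 1, "total": 40.0}]}
lake = {"clicks": [{"user": "a", "page": "/home"}]}

fabric = DataFabric()
fabric.register("orders", "warehouse", lambda: warehouse["orders"])
fabric.register("clicks", "lake", lambda: lake["clicks"])

print(fabric.catalog())
print(fabric.read("orders"))
```

Downstream consumers only talk to the fabric object, so a repository can be moved or replaced by re-registering its reader without changing consumer code.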
Data access: APIs vs controlled datasets
In data mesh, data is made available via controlled datasets. First, the information is copied from the department data store to a shared location.
In data fabric, data is made available via objective-based APIs. The data is copied into specific datasets for specific use cases, and the business unit that owns the data is in control.
Data mesh is ideal for hybrid cloud networks. Data fabric enables single-point data access, addresses data quality and storage issues, and handles security threats.
It is critical to note that data mesh and data fabric are not mutually exclusive concepts. Organisations can leverage both approaches across different use cases. According to James Serra, Data & AI Solution Architect at Microsoft, the difference between the two concepts lies in how users access data. Both data fabric and data mesh provide an architecture to access data across multiple technologies and platforms, he said. “But a data fabric is technology-centric, while a data mesh focuses on organisational change. [A] data mesh is more about people and process than architecture, while a data fabric is an architectural approach that tackles the complexity of data and metadata in a smart way that works well together,” he added.