
Data Mesh: Moving Away From Monolithic & Centralised Data Lakes

Data mesh marks an architectural and organisational shift in the way enterprises manage big data.


“My ask before reading on is to momentarily suspend the deep assumptions and biases that the current paradigm of traditional data platform architecture has established; Be open to the possibility of moving beyond the monolithic and centralised data lakes to an intentionally distributed data mesh architecture; Embrace the reality of ever-present, ubiquitous and distributed nature of data,” said Zhamak Dehghani, currently the director of emerging technologies at Thoughtworks.

Data mesh, a decentralised data architecture, marks an architectural and organisational shift in the way enterprises manage big data.

Data mesh

As per Zhamak, the planning and building process of data and intelligence platforms can be divided into three generations.

  • In the first generation, organisations employed proprietary enterprise data warehouses and business intelligence platforms. It was a costly approach and often left companies reeling under technical debt.
  • The second generation had a big data ecosystem and long-running batch jobs operated by a central team of data engineers who created data lakes.
  • Industries are currently developing the third generation of data platforms, which is similar to the previous generation but addresses some of its gaps, such as real-time data analytics and the cost of managing big data infrastructure.

Zhamak suggested that the next enterprise data platform architecture should bring together distributed domain-driven architecture, product thinking applied to data, and self-serve platform design. This convergence gives rise to data mesh.

The shift to data mesh is founded on four principles:

  • Decentralisation of data ownership and architecture
  • Domain-oriented data presented as a product
  • Using self-serve data infrastructure as a platform to enable autonomous domain-oriented data teams
  • Enabling interoperability through federated governance

Data mesh is a highly decentralised data architecture that aims to solve challenges such as the lack of data ownership and the lack of quality data, and to remove the bottlenecks that hold back organisational scaling.

The goal of data mesh is to treat data as a product, with each source having a data product owner who would ideally be part of a cross-functional team of data engineers. Although each product has its own owner, the data should remain domain-focused and be offered autonomously, which leads to a domain-driven distributed architecture.
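To make the idea of "data as a product" a little more concrete, here is a minimal Python sketch. It is not taken from Zhamak's work or from any data mesh tool: the names `DataProduct` and `MeshCatalog`, and the two governance rules they enforce, are hypothetical and chosen only for illustration. It shows each data product carrying a domain and an owner, while one shared set of rules is applied to every domain at registration time.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for one domain-owned data product."""
    name: str
    domain: str    # the business domain that publishes the data
    owner: str     # the data product owner for this source
    schema: dict   # column name -> type, the contract offered to consumers


class MeshCatalog:
    """Hypothetical catalogue that applies the same global rules to every domain."""

    def __init__(self):
        self._products = {}

    def register(self, product: DataProduct) -> None:
        # Federated governance, reduced here to two illustrative global rules.
        if not product.owner:
            raise ValueError(f"{product.name}: every data product needs an owner")
        if not product.schema:
            raise ValueError(f"{product.name}: a published schema is required")
        key = (product.domain, product.name)
        if key in self._products:
            raise ValueError(f"{product.name}: already registered in {product.domain}")
        self._products[key] = product

    def by_domain(self, domain: str) -> list:
        # Each domain only ever lists the products it owns.
        return [p for p in self._products.values() if p.domain == domain]


catalog = MeshCatalog()
catalog.register(DataProduct("orders", "sales", "alice", {"order_id": "string"}))
catalog.register(DataProduct("shipments", "logistics", "bob", {"shipment_id": "string"}))
```

In this sketch the domains stay autonomous (nothing central owns the data itself), and the only shared piece is the small rule set inside `register`. A real platform would need far richer contracts, such as quality guarantees and access policies.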

When to consider it?

Current data platform architectures are primarily built on a data lake or data warehouse. Contrary to popular belief, the goal of data mesh is not to replace them completely. A centralised data platform with a specialised team generally works well for small and medium-sized enterprises.

However, as an organisation grows, its data domains become more diverse and new data sources are introduced. In such cases, the existing architecture starts creating unnecessary friction and may slow down processes.

It is difficult, though, to tell when an organisation becomes big enough to render existing approaches ineffective. Further, even large organisations can remain effective with a centralised data platform. A better method could be to consider the size of the IT team and evaluate whether the size of the data platform slows the cycle of innovation and turns into a bottleneck. The symptoms include steadily lengthening lead times, the appearance of data solutions separate from the centralised data platform, and a need for temporary solutions for integrating new data sources.
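The three symptoms can be turned into a rough self-check. The Python sketch below is only an illustration under stated assumptions: the function names, the inputs (lead times in days, counts of off-platform solutions and of temporary integrations) and the simple "recent average versus early average" comparison are all hypothetical choices, not an established method for deciding when to adopt data mesh.

```python
from statistics import mean


def lead_times_growing(lead_times_days, window=3):
    """Return True if the latest `window` lead times average higher than the earliest `window`.

    The window size is an arbitrary assumption; with too little history
    the function reports no trend rather than guessing.
    """
    if len(lead_times_days) < 2 * window:
        return False
    return mean(lead_times_days[-window:]) > mean(lead_times_days[:window])


def bottleneck_symptoms(lead_times_days, off_platform_solutions, temporary_integrations):
    """List which of the three symptoms are present, given simple counts."""
    symptoms = []
    if lead_times_growing(lead_times_days):
        symptoms.append("lead times keep getting longer")
    if off_platform_solutions > 0:
        symptoms.append("data solutions exist outside the centralised platform")
    if temporary_integrations > 0:
        symptoms.append("new data sources rely on temporary integrations")
    return symptoms


# Example with made-up numbers: lead times in days for the last six requests.
found = bottleneck_symptoms([5, 6, 7, 12, 15, 20], off_platform_solutions=2, temporary_integrations=0)
```

A check like this only flags that friction exists. Whether a decentralised architecture is the right response still depends on the organisation, since, as noted, even large organisations can remain effective with a centralised platform.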

Wrapping up

Data mesh is not a plug-and-play solution. It comes with its own set of challenges, including:

  • Need for domain specialisation: Domain-specific ETL, data lakes and tools will require teams with expertise in complex data systems such as Kafka and Spark.
  • Creating more copies of data can prove to be a governance challenge. This problem will be further compounded by multi-cloud and hybrid-cloud infrastructure.
  • If the company has not covered all its bases for the transition to a decentralised approach, any success could be short-lived.

Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.