
Why Databricks Acquired MosaicML

Databricks' acquisition of MosaicML aims to democratise AI by providing enterprises with accessible tools to build, own, and secure generative AI models using their own data.



Last month, data processing giant Databricks announced its definitive agreement to acquire MosaicML, a generative AI platform, in a transaction valued at approximately $1.3 billion. 

The acquisition was aimed at making generative AI accessible to enterprises, allowing them to “build, own and secure best-in-class generative AI models while maintaining control of their data.” Databricks CEO Ali Ghodsi emphasised the goal of democratising AI and making the “Lakehouse the best place to build generative AI and LLMs.”

MosaicML is recognised for its state-of-the-art MPT large language models and provides enterprises with a way to quickly and cost-effectively build and train models on their own data. MosaicML is also an attractive target because it claims to offer inexpensive services on par with open-source front-runners like LLaMA and Falcon.

What MosaicML offers

Founded in 2021 by Naveen Rao, a former Intel AI executive, and Hanlin Tang, a Harvard graduate, MosaicML has raised a total of $37M in funding over two rounds, the most recent in January 2023.

MosaicML offers developers several ways to use its platform, including an API for easy integration into front-end applications and for customising models with their own data.

Developers can also use MosaicML’s tools to pre-train custom models from scratch and serve them through the platform. The compatibility of MosaicML with third-party tools like LangChain allows developers to leverage those tools on top of their customised models, providing flexibility and ownership over the entire model.

While MosaicML, like its competitors, rents out its technology, it differentiates itself by also offering its code to customers. This allows customers to run the code on their own hardware, keeping their data confidential even from MosaicML. This approach appeals to corporate customers who prioritise data privacy, as the value of an AI system is heavily dependent on the training data used.

The company emphasises the importance of open-source models, particularly in industries handling sensitive data, where local customisation and control over the model are crucial. MosaicML focuses on serving specific industries like healthcare and banking, providing the ability to interpret and summarise large amounts of data securely. 

The company claims to make its technology accessible to all organisations at a significantly lower price, up to 15 times cheaper than its competitors. Notable customers such as AI2, Generally Intelligent, Hippocratic AI, Replit, and Scatter Labs leverage MosaicML for various generative AI use cases.

MosaicML’s MPT-30B LLM is a 30-billion parameter model that the company claims surpasses the quality of OpenAI’s GPT-3 despite having significantly fewer parameters, making it easier to run on local hardware and more cost-effective for deployment. 

MPT models use FlashAttention, an efficient attention implementation that speeds up both inference and training, making them more efficient than Falcon and LLaMA. MPT-30B is also deliberately sized to fit the constraints of real hardware, so it can be deployed on a single datacenter GPU.
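FlashAttention computes standard scaled dot-product attention exactly; its speedup comes from tiling the computation so the full score matrix never has to be read from and written to GPU memory. As a point of reference, here is a minimal plain-Python sketch of the quantity being computed, softmax(QKᵀ/√d)V, not of the tiled kernel itself:

```python
import math

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    Q, K, V are lists of row vectors. FlashAttention produces exactly
    this output, but tiles the computation for GPU memory efficiency."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]       # softmax over keys; sums to 1
        # Output is the attention-weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

The key observation behind FlashAttention is that this softmax can be computed incrementally over blocks of keys, so the scores list never needs to exist in full.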

MosaicML also claims that MPT-30B compares favourably to LLaMA and Falcon in terms of performance. The model requires less compute for training while delivering similar results, and it excels in particular at coding tasks. However, these claims are yet to be independently verified using Stanford’s HELM benchmark.

Despite these comparisons and criticisms, MosaicML ultimately sees open-source LLMs, including LLaMA and Falcon, as part of the same team. The company views proprietary providers like OpenAI as the real competition and emphasises the empowering nature of open-source LLMs, which put power back into the hands of enterprise developers. MosaicML believes open LLMs are closing the gap with closed-source models and have reached a point where they are extremely useful, even if they haven’t completely surpassed them yet.

Databricks’ Motive

Databricks has positioned itself strongly in the market through several strategic moves. The introduction of LakehouseIQ, the acquisition of MosaicML, and the development of Unity Catalog have placed Databricks in a favourable position to maintain its market position and compete for incremental market share.

MosaicML’s platform will be integrated and scaled over time to provide a unified platform where customers can “build, own and secure their generative AI models.” For Databricks, the acquisition of MosaicML is a strategic move aimed at providing enterprises with tools to easily and cost-effectively build their own large language models (LLMs) using their proprietary data.

By integrating this process into the broader Databricks toolchain and workflow, the company aims to reduce the costs associated with training and running LLMs. This strategy recognises the market demand for specialised LLMs that are more cost-effective and finely tuned for specific tasks. While general-purpose LLMs will continue to exist, Databricks sees an opportunity to cater to the need for tailored solutions. Both Snowflake and Databricks are actively working to provide enterprise-class governance and intellectual property protection as part of their specialised LLM offerings.

The Databricks Lakehouse Platform, combined with MosaicML’s technology, will provide customers with a “simple, fast way to retain control, security, and ownership over their valuable data without high costs.” MosaicML claims that its automatic optimisation of model training enables 2x-7x faster training compared to standard approaches.


Shyam Nandan Upadhyay

Shyam is a tech journalist with expertise in policy and politics, and exhibits a fervent interest in scrutinising the convergence of AI and analytics in society. In his leisure time, he indulges in anime binges and mountain hikes.