Why Databricks Acquired MosaicML

Databricks' acquisition of MosaicML aims to democratise AI by providing enterprises with accessible tools to build, own, and secure generative AI models using their own data.

Last month, data processing giant Databricks announced its definitive agreement to acquire MosaicML, a generative AI platform, in a transaction valued at approximately $1.3 billion. 

The acquisition aims to make generative AI accessible to enterprises, allowing them to “build, own and secure best-in-class generative AI models while maintaining control of their data.” Databricks CEO Ali Ghodsi emphasised the goal of democratising AI and making the “Lakehouse the best place to build generative AI and LLMs.”

MosaicML is recognised for its state-of-the-art MPT large language models and provides enterprises with a way to quickly and cost-effectively build and train models on their own data. The company is also an attractive target because it claims to offer inexpensive services on par with open-source front-runners like LLaMA and Falcon.


What MosaicML offers

Founded in 2021 by Naveen Rao, a Stanford electrical engineer, and Hanlin Tang, a Harvard graduate, MosaicML had raised a total of $37M in funding over two rounds, the latest closing on Jan 1, 2023.

MosaicML offers various options for developers to utilise its platform, including an API for easy integration into front-end applications and customisation of models with their data. 

Developers can also use MosaicML’s tools to pre-train custom models from scratch and serve them through the platform. The compatibility of MosaicML with third-party tools like LangChain allows developers to leverage those tools on top of their customised models, providing flexibility and ownership over the entire model.

While MosaicML, like its competitors, provides its technology for rent, it differentiates itself by also offering its code to customers. This allows customers to run the code on their own hardware, keeping their data confidential even from MosaicML. This approach appeals to corporate customers who prioritise data privacy, as the value of an AI system is heavily dependent on the training data used.

The company emphasises the importance of open-source models, particularly in industries handling sensitive data, where local customisation and control over the model are crucial. MosaicML focuses on serving specific industries like healthcare and banking, providing the ability to interpret and summarise large amounts of data securely. 

The company claims to make its technology accessible to all organisations at a significantly lower price, up to 15 times cheaper than its competitors. Notable customers such as AI2, Generally Intelligent, Hippocratic AI, Replit, and Scatter Labs leverage MosaicML for various generative AI use cases.

MosaicML’s MPT-30B LLM is a 30-billion parameter model that the company claims surpasses the quality of OpenAI’s GPT-3 despite having significantly fewer parameters, making it easier to run on local hardware and more cost-effective for deployment. 

MosaicML’s models use FlashAttention, an efficient attention implementation that speeds up both inference and training and which, the company says, makes MPT faster than Falcon and LLaMA. Additionally, MPT-30B is sized to fit the constraints of real hardware, optimising its performance on deep-learning GPUs.
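To give a sense of what makes this kind of attention efficient: the core idea behind FlashAttention is to compute softmax(QKᵀ/√d)·V in tiles with a running (“online”) softmax, so the full N×N score matrix is never materialised in memory. The NumPy sketch below illustrates that trick only; it is not MosaicML’s code, and the function names and block size are made up for illustration.

```python
# Illustrative sketch of the tiled, online-softmax idea underlying
# FlashAttention (not MosaicML's implementation).
import numpy as np

def naive_attention(Q, K, V):
    """Standard attention: materialises the full N x N score matrix."""
    d = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d)
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def tiled_attention(Q, K, V, block=16):
    """Block-wise attention with an online softmax, FlashAttention-style."""
    N, d = Q.shape
    out = np.zeros_like(Q)
    m = np.full(N, -np.inf)        # running row-wise max of scores
    l = np.zeros(N)                # running softmax denominator
    for j in range(0, N, block):   # stream over key/value tiles
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T / np.sqrt(d)  # scores for this tile only
        m_new = np.maximum(m, S.max(axis=-1))
        # rescale earlier accumulators to the new max, then fold in the tile
        alpha = np.exp(m - m_new)
        P = np.exp(S - m_new[:, None])
        l = l * alpha + P.sum(axis=-1)
        out = out * alpha[:, None] + P @ Vj
        m = m_new
    return out / l[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((64, 8)) for _ in range(3))
print(np.allclose(naive_attention(Q, K, V), tiled_attention(Q, K, V)))  # True
```

The real kernel gains its speed by fusing these tile updates on-GPU so the scores stay in fast on-chip memory; the NumPy version only demonstrates that the tiled maths gives the same answer as the naive formula.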

Additionally, MosaicML claims that MPT-30B compares favourably to LLaMA and Falcon in terms of performance. The model requires less compute power for training while delivering similar results, excelling especially in coding tasks. However, these claims are yet to be independently verified on benchmarks such as Stanford’s HELM.

Despite some comparisons and criticisms, MosaicML ultimately sees open-source LLMs, including LLaMA and Falcon, as part of the same team. The company believes proprietary platforms like OpenAI pose real competition and emphasises the empowering nature of open-source LLMs, putting the power back into the hands of enterprise developers. MosaicML believes open LLMs are closing the gap with closed-source models and have reached a point where they are extremely useful, even if they haven’t completely surpassed them yet.

Databricks’ Motive

Databricks has positioned itself strongly in the market through several strategic moves. The introduction of LakehouseIQ, the acquisition of MosaicML, and the development of Unity Catalog have placed Databricks in a favourable position to maintain its market position and compete for incremental market share.

MosaicML’s platform will be integrated and scaled over time to provide a unified platform where customers can “build, own and secure their generative AI models.” For Databricks, the acquisition of MosaicML is a strategic move aimed at providing enterprises with tools to easily and cost-effectively build their own large language models (LLMs) using their proprietary data.

By integrating this process into the broader Databricks toolchain and workflow, the company aims to reduce the costs associated with training and running LLMs. This strategy recognises the market demand for specialised LLMs that are more cost-effective and finely tuned for specific tasks. While general-purpose LLMs will continue to exist, Databricks sees an opportunity to cater to the need for tailored solutions. Both Snowflake and Databricks are actively working to provide enterprise-class governance and intellectual property protection as part of their specialised LLM offerings.

The Databricks Lakehouse Platform, combined with MosaicML’s technology, will provide customers with a “simple, fast way to retain control, security, and ownership over their valuable data without high costs.” MosaicML claims that its automatic optimisation of model training enables 2x-7x faster training compared to standard approaches.

Shyam Nandan Upadhyay
Shyam is a tech journalist with expertise in policy and politics, and exhibits a fervent interest in scrutinising the convergence of AI and analytics in society. In his leisure time, he indulges in anime binges and mountain hikes.
