Last month, data processing giant Databricks announced its definitive agreement to acquire MosaicML, a generative AI platform, in a transaction valued at approximately $1.3 billion.
The acquisition aims to make generative AI accessible to enterprises, allowing them to “build, own and secure best-in-class generative AI models while maintaining control of their data.” Databricks CEO Ali Ghodsi emphasised the goal of democratising AI and making the “Lakehouse the best place to build generative AI and LLMs.”
MosaicML is recognised for its state-of-the-art MPT large language models and gives enterprises a way to build and train their own models quickly and cost-effectively using their own data. The company is also an attractive acquisition target because it claims to offer inexpensive services on par with open-source front runners like LLaMA and Falcon.
What MosaicML offers
Founded in 2021 by Naveen Rao, a Stanford electrical engineer, and Hanlin Tang, a Harvard graduate, MosaicML has raised a total of $37M in funding over two rounds. Its latest funding was raised on Jan 1, 2023.
MosaicML offers various options for developers to utilise its platform, including an API for easy integration into front-end applications and customisation of models with their data.
Developers can also use MosaicML’s tools to pre-train custom models from scratch and serve them through the platform. The compatibility of MosaicML with third-party tools like LangChain allows developers to leverage those tools on top of their customised models, providing flexibility and ownership over the entire model.
While MosaicML, like its competitors, provides its technology for rent, it differentiates itself by also offering its code to customers. Customers can run that code on their own hardware, which keeps their data confidential from MosaicML. This approach appeals to corporate customers who prioritise data privacy, as the value of an AI system depends heavily on the training data used.
The company emphasises the importance of open-source models, particularly in industries handling sensitive data, where local customisation and control over the model are crucial. MosaicML focuses on serving specific industries like healthcare and banking, providing the ability to interpret and summarise large amounts of data securely.
The company claims to make its technology accessible to all organisations at a significantly lower price, up to 15 times cheaper than its competitors. Notable customers such as AI2, Generally Intelligent, Hippocratic AI, Replit, and Scatter Labs leverage MosaicML for various generative AI use cases.
MosaicML’s MPT-30B LLM is a 30-billion parameter model that the company claims surpasses the quality of OpenAI’s GPT-3 despite having significantly fewer parameters, making it easier to run on local hardware and more cost-effective for deployment.
The attention mechanism used by MosaicML, called “FlashAttention,” offers faster inference and training, making it more efficient than Falcon and LLaMA. Additionally, MPT-30B is designed to fit the constraints of real hardware, optimising its performance on deep-learning GPUs.
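To put the "easier to run on local hardware" claim in rough numbers, the sketch below estimates how much memory is needed just to hold a model's weights. It is a back-of-the-envelope illustration, not MosaicML's own sizing method: the function name `weight_memory_gb` is made up for this example, the 175-billion parameter count for GPT-3 is the widely reported figure rather than one given in this story, and the estimate ignores activations, attention caches and other runtime overhead.

```python
# Rough estimate of the memory needed to store model weights only.
# Assumptions (not from the article): GPT-3 has ~175 billion parameters;
# 16-bit weights take 2 bytes each and 8-bit weights take 1 byte each.
# Real deployments also need memory for activations and caches.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


MODELS = {
    "MPT-30B": 30e9,   # parameter count stated in the article
    "GPT-3": 175e9,    # widely reported figure, assumed here
}

PRECISIONS = {
    "16-bit": 2,  # bytes per parameter
    "8-bit": 1,
}

if __name__ == "__main__":
    for model_name, num_params in MODELS.items():
        for precision_name, nbytes in PRECISIONS.items():
            gb = weight_memory_gb(num_params, nbytes)
            print(f"{model_name} at {precision_name}: ~{gb:.0f} GB of weights")
```

Under these assumptions, a 30-billion parameter model needs roughly 60 GB for 16-bit weights, against roughly 350 GB for a 175-billion parameter model, which is the difference between a single high-memory GPU and a multi-GPU server.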
MosaicML also claims that MPT-30B compares favourably to LLaMA and Falcon in terms of performance. The model requires less compute power for training while delivering similar results, and it does especially well in coding tasks. However, the claims made by MosaicML are yet to be independently verified using Stanford’s HELM measure.
Despite some comparisons and criticisms, MosaicML ultimately sees open-source LLMs, including LLaMA and Falcon, as part of the same team. The company believes proprietary platforms like OpenAI pose real competition and emphasises the empowering nature of open-source LLMs, putting the power back into the hands of enterprise developers. MosaicML believes open LLMs are closing the gap with closed-source models and have reached a point where they are extremely useful, even if they haven’t completely surpassed them yet.
Databricks has positioned itself strongly in the market through several strategic moves. The introduction of LakehouseIQ, the acquisition of MosaicML, and the development of Unity Catalog have placed Databricks in a favourable position to defend its standing and compete for incremental market share.
MosaicML’s platform will be integrated and scaled over time to provide a unified platform where customers can “build, own and secure their generative AI models.” For Databricks, the acquisition of MosaicML is a strategic move aimed at providing enterprises with tools to easily and cost-effectively build their own large language models (LLMs) using their proprietary data.
By integrating this process into the broader Databricks toolchain and workflow, the company aims to reduce the costs associated with training and running LLMs. This strategy recognises the market demand for specialised LLMs that are more cost-effective and finely tuned for specific tasks. While general-purpose LLMs will continue to exist, Databricks sees an opportunity to cater to the need for tailored solutions. Both Snowflake and Databricks are actively working to provide enterprise-class governance and intellectual property protection as part of their specialised LLM offerings.
The Databricks Lakehouse Platform, combined with MosaicML’s technology, will provide customers with a “simple, fast way to retain control, security, and ownership over their valuable data without high costs.” MosaicML claims that its automatic optimisation of model training enables 2x-7x faster training compared to standard approaches.