Databricks Unveils Dolly 2.0, a Game-Changer in Open-Source LLMs

Unlike other 'open' source LLMs, Dolly 2.0 is available for commercial purposes.

Until now, large language models trained on ChatGPT output have sat in a legal grey area. Databricks seems to have figured out a way around this with Dolly 2.0, the successor to the large language model with ChatGPT-like human interactivity that the company released just two weeks ago. What sets Dolly 2.0 apart from other ‘open source’ models is that it is available for commercial purposes, without the need to pay for API access or share data with third parties.

According to the company’s official statement, Dolly 2.0 is the world’s first open-source, instruction-following LLM fine-tuned on a transparent and openly available dataset. The LLM, based on the EleutherAI Pythia model family, has 12 billion parameters and has been fine-tuned exclusively on an open-source corpus, databricks-dolly-15k.

Databricks’ employees generated this dataset, and its licensing terms allow it to be used, modified, and extended for any purpose, including academic or commercial applications. There has been a wave of LLM releases that are considered open-source by many definitions but are bound by industrial licences. The trailblazer was Meta’s LLaMA, followed by Stanford’s Alpaca, as well as Koala and Vicuna.

The Stanford project’s dataset of 52k questions and answers was generated from ChatGPT’s outputs. But as per OpenAI’s terms of use, you can’t use its output to develop models that compete with OpenAI. Databricks seems to have figured out how to get around this with Dolly 2.0.

According to Ali Ghodsi, the CEO of Databricks, Dolly 2.0 is set to create a “snowball” effect in the AI community. He believes that this will inspire others to contribute and collaborate on developing alternative models. The limit on commercial use was a big obstacle to overcome, he explained.

Tasmia Ansari
Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.
