
Leaked LLaMA Unveils the Power of Open Source for AI

In just a month, LLaMA has become the darling of the open-source community

Meta’s AI research team has earned a positive reputation for open-sourcing its models — the latest being LLaMA, whose weights were made available to academics and researchers on a case-by-case basis. However, one of these parties leaked the weights on GitHub, giving programmers all over the world open access to their first GPT-level LLM. 

The developer community has since had a field day with the model — optimising it to run on low-powered devices, extending its functionality, and even building new use cases for LLMs on top of it. The open-source community is the biggest multiplier for AI research, and developers are the reason why. 

Optimising the model

When LLaMA was launched, budding LLM enthusiasts found that even the 7-billion-parameter version of the model required more than 16GB of VRAM to run. However, they quickly found ways to cut down the model’s memory requirements. The first step in optimising the model was a community project known as llama.cpp, which rewrote the model’s inference code in C++.

This, along with a community effort to quantise the weights, allowed the model to run on a wide range of hardware. One programmer was even able to run the 7B model on a Google Pixel 5, generating one token per second. llama.cpp was then ported to Rust, allowing for faster inference on CPUs, but the community was just getting started.
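Quantisation trades a little precision for a large memory saving by storing each weight as a low-bit integer plus a shared floating-point scale. The sketch below is illustrative Python showing the idea for symmetric 8-bit quantisation — it is not llama.cpp’s actual block format, which groups weights and goes down to 4 bits:

```python
import numpy as np

def quantize_q8(weights: np.ndarray):
    """Symmetric 8-bit quantisation: int8 values plus one float scale.
    An illustrative sketch of the idea, not llama.cpp's real layout."""
    scale = np.abs(weights).max() / 127.0   # map the largest weight to ±127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize_q8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights at inference time."""
    return q.astype(np.float32) * scale

# Memory per weight drops from 4 bytes (float32) to roughly 1 byte (int8),
# and the round-trip error is bounded by half a quantisation step.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_q8(w)
w_hat = dequantize_q8(q, s)
assert np.max(np.abs(w - w_hat)) <= s
```

Grouping weights into small blocks with one scale each, and shrinking the integers to 4 bits, is what lets the 7B model fit in a few gigabytes of RAM.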

Researchers at Stanford University then created another model, a fine-tune based on LLaMA 7B. Using over 50,000 instruction-following demonstrations generated with OpenAI’s GPT-3.5, the researchers trained LLaMA to give outputs similar to OpenAI’s model. What’s more, the model, called Alpaca, cost only around $600 to build — far less than the millions of dollars it takes to train these models from scratch. 
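Each of those demonstrations is a simple record with an instruction, an optional input, and an output, rendered into a fixed prompt template at training time. Here is a minimal Python sketch of that template, following the format published in the Stanford Alpaca repository:

```python
def build_prompt(record: dict) -> str:
    """Render one instruction-following record into Alpaca's training prompt.
    Sketch based on the template in the Stanford Alpaca repository."""
    if record.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            "### Response:"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        "### Response:"
    )

# During fine-tuning, the model's "output" field is appended after
# "### Response:" and the model learns to produce it.
prompt = build_prompt({"instruction": "Give three tips for staying healthy.",
                       "input": ""})
assert prompt.endswith("### Response:")
```

Because the demonstrations are plain text pairs, anyone can swap in their own data and repeat the recipe — which is exactly what the community went on to do.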

Alpaca marked a democratisation of LLMs, bringing LLaMA to the masses. By bringing the fine-tuning cost down to a few hundred dollars and open-sourcing the model, Alpaca put the power of LLMs in the hands of developers all over the world, who promptly began building on top of it. 

Unique use-cases

After the model was open-sourced and researchers began to harness the power of Alpaca, programmers and developers began to find new use cases for this LLM. While it started slowly, with one developer using Alpaca to create a Homer Simpson bot, the model soon saw many useful applications. 

User ‘LXE’ on GitHub created a simple web UI that allowed anyone in the community to fine-tune the model on their own text. Similarly, user ‘Sahil280114’ created CodeAlpaca, a code-generation model fine-tuned from Alpaca. LlamaIndex, a project to connect LLMs with external data, also migrated from GPT to LLaMA due to the latter’s open-source nature. 

Dalai was launched as an easy way to get both Alpaca and LLaMA running on any platform with just a command, further reducing the barrier to entry for LLMs. Another model, called GPT4All, was built upon the legacy of Alpaca. This was trained on around 800,000 GPT 3.5 generations, further increasing the power of LLaMA. The use cases just kept pouring in.

Colossal-AI created a ChatGPT alternative by training LLaMA with reinforcement learning from human feedback (RLHF), and the community created LlamaHub to keep track of all the ways one can connect data to the model. The best part is that all of this occurred within a month of the model’s release, showing the true power of the open-source community. 

Open-source does the work

Not only has the community built on and improved the model released by Meta, it has also created a host of use cases — all from one LLM. While LLMs might be all the rage right now, they are only a single type of model in the vast AI landscape. Similar to LLaMA, another model that gained users, a community, and various offshoots is Stable Diffusion.

While the open-source rise of Stable Diffusion warrants an essay of its own, suffice it to say that the model is the go-to option for image generation for developers. One only needs to look at the number of forks that the Stable Diffusion GitHub page has — over 7,600 at the time of writing — to see the impact this diffusion model has had on the open-source community.

As models get bigger, training them becomes more expensive and difficult, centralising the power of LLMs in big-tech companies like OpenAI, Microsoft, Google and Meta. When models are open-sourced, that power shifts towards the communities building products around them, eventually laying the groundwork for a free AI world. 

PS: The story was written using a keyboard.

Anirudh VK

I am an AI enthusiast and love keeping up with the latest events in the space. I love video games and pizza.