The Battle of LLMs: Vicuna vs Alpaca

Both emerged as hot commodities based on Meta’s LLaMA

Large language models have become the internet’s hottest commodity. The trend sparked by OpenAI’s ChatGPT is being carried forward by open-source models, since OpenAI declines to share the details of its own. Two such models, Vicuna and Alpaca, both released in March, have caught the AI community’s attention even though neither can be used commercially.

Meta has broken the mould and shown its dedication to the academic community by open-sourcing its latest model, LLaMA. The weights of the model are available to researchers upon request, setting the stage for the newest contenders in the AI realm. Stanford’s Alpaca and Vicuna-13B, the latter a collaboration between researchers at UC Berkeley, CMU, Stanford, and UC San Diego, gained momentum soon after their release.

GitHub and Code

The best part about both models is that they are open-sourced. The worst is that their terms of use do not let users commercialise them. The models have also made headlines because of their low price tags. Training the 7B and 13B versions of Vicuna costs about $140 and $300, respectively. Alpaca’s 7B model, on the other hand, required about $500 for data and $100 for training.
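To put those figures side by side, here is a minimal Python sketch that simply adds up the amounts quoted above. The dictionary layout and the function names are illustrative assumptions, and the Vicuna entries cover training only, since no data cost is given for it here.

```python
# Costs in USD, exactly as quoted in this article.
# The structure and names below are illustrative, not taken from either project.
COSTS = {
    "Vicuna-7B": {"training": 140},
    "Vicuna-13B": {"training": 300},
    "Alpaca-7B": {"data": 500, "training": 100},
}


def total_cost(model: str) -> int:
    """Sum every quoted cost component for one model."""
    return sum(COSTS[model].values())


def totals() -> dict[str, int]:
    """Return the quoted total for every model in COSTS."""
    return {model: total_cost(model) for model in COSTS}


if __name__ == "__main__":
    for model, cost in totals().items():
        print(f"{model}: ${cost}")
```

On these numbers, Alpaca’s quoted total comes to $600, most of it spent on generating data rather than on training.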

The training code for both Vicuna and Alpaca is publicly available. Vicuna is trained on about 70k user-shared conversations. In contrast, Alpaca is trained on 52k self-instruct samples generated with OpenAI’s text-davinci-003 API.

At the time of writing, Vicuna had 13.3k GitHub stars and Alpaca had 20.2k. The repositories contain weights, fine-tuning code and data-generation code. An API is also available for Vicuna. Check out Vicuna and Alpaca’s GitHub repositories.

GPT-4 thinks…

When releasing Vicuna, the researchers evaluated it using GPT-4, whereas Alpaca was evaluated by an author. However, evaluating AI chatbots is like trying to judge a fish on its ability to climb a tree. Many things need to be considered, such as language skills, reasoning and understanding of context. The models were evaluated across nine categories, ranging from common sense to maths.

As per GPT-4, Alpaca scored 7/10 and Vicuna-13B got a 10/10 in ‘writing’. The reason: Alpaca provided an overview of the travel blog post but did not actually compose the post as requested, hence the lower score. Vicuna, on the other hand, composed a detailed blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions, which fully addressed the user’s request and earned the higher score.
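A judge-based comparison like this boils down to collecting a pair of scores per category and seeing who comes out ahead. The sketch below shows one way that tally could be written in Python. The function name and data layout are hypothetical, it is not the researchers’ evaluation code, and it uses only the single ‘writing’ score pair reported above.

```python
# Per-category scores out of 10, as (Alpaca, Vicuna-13B).
# Only the 'writing' pair is reported in this article; other categories are omitted.
SCORES = {"writing": (7, 10)}


def pick_winners(scores: dict[str, tuple[int, int]],
                 names: tuple[str, str] = ("Alpaca", "Vicuna-13B")) -> dict[str, str]:
    """Return the higher-scoring model for each category, or 'tie'."""
    winners = {}
    for category, (first, second) in scores.items():
        if first > second:
            winners[category] = names[0]
        elif second > first:
            winners[category] = names[1]
        else:
            winners[category] = "tie"
    return winners


if __name__ == "__main__":
    for category, winner in pick_winners(SCORES).items():
        print(f"{category}: {winner}")
```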

Despite their capabilities, both models have their limitations. Vicuna is particularly vulnerable to training data contamination, so new benchmarks may need to be created to test it.

In comparison, Alpaca’s answers are typically shorter than ChatGPT’s, reflecting text-davinci-003’s shorter outputs. The model also exhibits problems common to language models, including hallucination, toxicity, and stereotypes. Hallucination, in particular, seems to be a common failure mode for Alpaca, even when compared to text-davinci-003. For instance, Alpaca wrongly states that the capital of Tanzania is Dar es Salaam, which was the capital until 1974, when it was replaced by Dodoma. The researchers stated that Alpaca likely has other limitations associated with both the underlying language model and the instruction-tuning data.

In conclusion, both Vicuna and Alpaca have their strengths and limitations, so it is essential to evaluate which model aligns with a particular project’s requirements. Vicuna’s training on user-shared conversations and its GPT-4 assessment work in its favour, while Alpaca’s use of self-instruct data from the text-davinci-003 API sets it apart. Although the terms of use restrict commercialisation, the open-source nature of Vicuna and Alpaca is valuable.

Tasmia Ansari
Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.
