Large language models have become the internet’s hottest commodity. The trend ignited by OpenAI’s ChatGPT is being carried forward by open-source models, as OpenAI declines to share the details of its own. Even though they cannot be used commercially, two models released in March – Vicuna and Alpaca – have caught the AI community’s attention.
Meta has broken the mould and shown its dedication to the academic community by open-sourcing its latest model, LLaMA. The weights of the model are available to researchers upon request, setting the stage for the newest contenders in the AI realm. Stanford’s Alpaca and Vicuna-13B, which is a collaborative work of UC Berkeley, CMU, Stanford, and UC San Diego researchers, gained momentum soon after their release.
GitHub and Code
The training code for both Vicuna and Alpaca is publicly available. Vicuna is fine-tuned on roughly 70k user-shared ChatGPT conversations, while Alpaca is trained on 52k instruction-following samples generated via self-instruct from OpenAI’s text-davinci-003 API.
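Alpaca’s released dataset stores each sample as a JSON record with `instruction`, `input` and `output` fields. The sketch below illustrates that schema and renders records into training prompts; the records and the `to_prompt` template here are simplified, hypothetical stand-ins (the actual Stanford prompt template is worded differently):

```python
# A minimal sketch of Alpaca-style instruction-tuning records, assuming
# the instruction/input/output JSON schema of the released dataset.
# The records and the prompt template below are illustrative, not the
# exact text used by the Stanford researchers.
records = [
    {
        "instruction": "Give three tips for staying healthy.",
        "input": "",  # empty when the instruction needs no extra context
        "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Sleep well.",
    },
    {
        "instruction": "Summarize the text below.",
        "input": "Large language models are trained on web-scale corpora...",
        "output": "LLMs learn from massive web text.",
    },
]

def to_prompt(rec):
    """Render one record into a single training-prompt string."""
    if rec["input"]:
        return (f"Instruction: {rec['instruction']}\n"
                f"Input: {rec['input']}\n"
                f"Response: {rec['output']}")
    return (f"Instruction: {rec['instruction']}\n"
            f"Response: {rec['output']}")

prompts = [to_prompt(r) for r in records]
```

The two-branch template mirrors how the dataset distinguishes instructions that stand alone from those that operate on a provided input.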
At the time of writing, Vicuna had 13.3k GitHub stars and Alpaca had 20.2k. The repositories contain the weights, fine-tuning code and data-generation code; an API is also available for Vicuna. Check out Vicuna and Alpaca’s GitHub repositories.
The Vicuna researchers evaluated their model using GPT-4 as a judge, while Alpaca was evaluated by one of its authors. However, evaluating AI chatbots is like judging a fish on its ability to climb a tree: many factors must be weighed, including language skills, reasoning and understanding of context. The models were evaluated across nine categories, ranging from common sense to maths.
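The GPT-4-as-judge setup boils down to prompting GPT-4 to score two candidate answers, then parsing the numeric scores from its reply. The sketch below shows that flow in miniature; the prompt wording, reply format and `parse_scores` helper are hypothetical simplifications, not the exact evaluation prompt used by the Vicuna team:

```python
import re

def judge_prompt(question, answer_a, answer_b):
    """Build a hypothetical GPT-4 judge prompt comparing two answers.
    The wording is illustrative, not the actual Vicuna evaluation prompt."""
    return (
        "You are an impartial judge of AI assistants.\n"
        f"Question: {question}\n"
        f"Assistant A: {answer_a}\n"
        f"Assistant B: {answer_b}\n"
        "Rate each assistant on a scale of 1-10. "
        "Reply strictly in the form 'A: x/10, B: y/10'."
    )

def parse_scores(reply):
    """Extract the two numeric scores from a judge reply such as
    'A: 7/10, B: 10/10'."""
    match = re.search(r"A:\s*(\d+)/10.*?B:\s*(\d+)/10", reply)
    if not match:
        raise ValueError("unparseable judge reply")
    return int(match.group(1)), int(match.group(2))

# In the real setup, the prompt would be sent to the GPT-4 API and the
# model's reply fed to parse_scores; here we parse a sample reply.
scores = parse_scores("A: 7/10, B: 10/10")
```

Averaging such scores per category over a set of questions yields the category-level comparison described above.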
As per GPT-4, Alpaca scored 7/10 in ‘writing’ while Vicuna-13B got a 10/10. The reason: Alpaca provided an overview of the requested travel blog post but did not actually compose it, hence the lower score. Vicuna, on the other hand, composed a detailed blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions, fully addressing the user’s request and earning the higher score.
Despite their capabilities, both models have their limitations. Vicuna is particularly vulnerable to training-data contamination, and new benchmarks may be needed to test it reliably.
In comparison, Alpaca’s answers are typically shorter than ChatGPT’s, reflecting text-davinci-003’s more concise outputs. The model also exhibits problems common to language models, including hallucination, toxicity and stereotyping. Hallucination in particular seems to be a frequent failure mode for Alpaca, even compared with text-davinci-003. For instance, Alpaca wrongly states that the capital of Tanzania is Dar es Salaam, which was the capital only until 1974, when it was replaced by Dodoma. The researchers noted that Alpaca likely has other limitations associated with both the underlying language model and the instruction-tuning data.