Why Falcon Sucks

Falcon at this point is actually a little on the outdated side.

Illustration by Nikhil Kumar

When UAE’s TII first launched Falcon, its own LLM, in June, it sat atop the Hugging Face Open LLM Leaderboard, ahead of Meta’s Llama 2 on several benchmarks. Understandably, people had their doubts, as the researchers did not have a paper to back up the claims.

Cut to the present, and the team has still not released the model’s paper. The open source champion, which started with a 40-billion-parameter model, now also has a 180-billion-parameter version which, though open source, does not have many users because of its massive compute requirements. It seems as though something is wrong with the model.

The Open LLM Leaderboard currently hosts a bunch of models fine-tuned on top of Mistral, Llama 2, and even Chinese models such as Yi-34B and Qwen-7B, but not a single Falcon-based model.

Is it a boycott?

The first thing to understand is that the model was not born out of the US or Europe, whose big-tech companies are leading the LLM race. That did not stop the developer community from experimenting with it. For months, Falcon was touted as the open replacement for OpenAI and Google’s models, but now it is nowhere to be seen.

Maxime Labonne, the open source researcher and creator of several models and LLM courses, told AIM that though he experimented with Falcon, he found Llama and Mistral’s models much better at tokenising different European languages. “I find it surprising when people still use it,” he said.

Labonne emphasised that Falcon was good enough when it was released, as there were no alternatives apart from the first version of Meta’s LLaMA, and it had a slightly different licence as well. Interestingly, even LLaMA was not as open source as people wanted a model to be.

“Now Llama 2 is a lot better than Falcon, and Mistral is a lot better than Llama 2. It makes no sense to use Falcon anymore,” he added. 

Moreover, when it comes to Indic models, Llama is touted as the best option. Mistral, though good for European languages, struggles to tokenise Indic languages. Adarsh Shirawalmath, the creator of Kannada Llama, aka Kan-Llama, said that the team decided not to use Mistral or Falcon because they are very hard to fine-tune with Indic language tokens.

That is why most of the current Indic language models are built on top of Llama 2. When AIM asked the creators of BharatGPT about Falcon, they said its strength is that it was built through an academic, government, and private partnership — which is also what IIT Bombay is aiming for with BharatGPT, a large open source model.

Come of age, in a few months

Recently, researchers from IIT Patna released a healthcare dataset for India, with a focus on Indic languages. AIM asked Aman Chadha, the lead researcher, why the team did not use Falcon but did use models such as Llama and Zephyr. “Falcon was great when it came out. But with Mistral and Zephyr and a bunch of them, they seem to have done incredibly well in the area that we wanted,” he explained.

“Falcon at this point is actually a little on the outdated side,” he added. “There’s been so much in terms of new models coming such as Tiny Llama. While Falcon is great depending on what the targeted use cases are, we look at what’s out there and how performance based models are and accordingly choose what you want to go out with,” Chadha asserted.

Undoubtedly, at 180 billion parameters, Falcon looks like a credible open source alternative to GPT-4 and others on sheer size. But as the conversation increasingly moves towards edge use cases and multilingual models, Falcon seems to be getting too heavy and outdated, especially for the Indian landscape and multilingual tasks.

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.