
We are Entering an Era of ‘LLM Pollution’

LLM, LLM everywhere and not a single one to use.


Illustration by Nikhil Kumar


Everyone is building LLMs. Be it closed or open, the number of language models out there significantly outpaces the number of extensions and applications based on them. Some are small and some are big, but only a few companies are actually able to build something tangible out of them.

Arguably, building on these models is just as important as building them. But many extensions that come out of open language models are just language additions or smaller-scale efficiency tweaks. Though a noble cause, that doesn’t really affect how these models get adopted. When it comes to India, there are models built on top of Llama 2 in various Indic languages, but beyond that, nothing significant has come out.

The proliferation of LLMs is undoubtedly a milestone in AI advancement. Yet the sheer volume of models being produced is outpacing the development of meaningful extensions and practical applications, and that gap is at the core of the adoption problem.

A waste of time?

For example, there are thousands of language models on the Hugging Face Leaderboard. As soon as a new model drops, people start tinkering with it, testing its capabilities, and benchmarking it for their use case before moving on. The next day, the cycle repeats with the latest model in town.
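That tinkering is usually a quick, throwaway evaluation rather than an actual application. As a minimal sketch, assuming the Hugging Face transformers library and using a placeholder model ID and prompts (not a recommendation of any particular model), it can be as little as:

# Illustrative only: a quick spot-check of a newly released model from the
# Hugging Face Hub. The model ID and prompts below are placeholders.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # swap in whichever model just dropped
    device_map="auto",
)

prompts = [
    "Summarise the plot of Hamlet in two sentences.",
    "Write a SQL query returning the top five customers by revenue.",
]

for prompt in prompts:
    result = generator(prompt, max_new_tokens=128, do_sample=False)
    print(result[0]["generated_text"])

A few prompts later, the verdict is in, the tab is closed, and the model joins the pile.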

Falcon, one of the largest open-source language models, was tested and applauded by a lot of developers when it launched. But soon after testing its capabilities, people found that Meta’s Llama 2 performed a lot better despite its smaller size. The same happened with Mistral’s new models, and with OpenAI’s GPT-2 after so many years.

Speaking of Falcon, it is still around, but people rarely use it and no significant applications are being built on it. Yet TII, the institute behind the model, is likely already working on another AI model that it would want to see on top of the leaderboards.

Undoubtedly, this is how competition works. Databricks’ new AI model, DBRX, currently outperforms other open models in the market at a much lower price, and enterprises are ready to adopt it given its capabilities. The same rush would arguably be witnessed again when Meta drops Llama 3. There would be choices for sure, but people would then forget about Llama 2 as well.

This abundance of foundation language models without any added innovation is now being encapsulated as ‘LLM pollution’. Rather than facilitating innovation or transformative applications, the surplus of LLMs risks inundating the field with redundant or underutilised models.

What next then?

Naveen Rao, the VP of generative AI at Databricks, told AIM that a vast majority of foundational model companies will fail. “You’ve got to do something better than they [OpenAI] do. And, if you don’t, and it’s cheap enough to move, then why would you use somebody else’s model? So it doesn’t make sense to me just to try to be ahead unless you can beat them,” he added. 

Rao also said that everyone wants to have their own take, but many of them just build models and call it a victory. “Woohoo! You built a model. Great,” he quipped. But, he said, it will not work without differentiation or problem-solving.

“Just building a piece of technology because you said you can do it doesn’t really prove that you can solve a problem,” said Rao.

Ankush Sabharwal of CoRover.ai told AIM that there is no need to build more foundational models when the existing ones already work for most use cases, and that it is time the industry took that approach.

Throwing billions of dollars at the next GPT might create an excellent model for OpenAI, but the billions spent building GPT-4 would arguably go to dust. People might use it for a while, but it would soon become the next GPT-2. The point is not to stop accelerating AI, but to accelerate, alongside it, the measurement of its impact, positive and negative, on the adoption side.

There is a pressing need for greater emphasis on practical applications and real-world problem-solving with LLMs. Rather than fixating solely on the technical prowess of language models, attention should also be devoted to their practical utility and societal implications.

Companies will certainly not all use the same LLMs, and more options are needed. But it is also necessary to define the exact use cases for these models before building a bunch of them in different languages. The era of ‘LLM pollution’ is here, and many LLMs that were once on top of the charts will end up languishing in a heap that no one uses.


Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.