India vs China vs US in Open Source AI

It is time for India to be on top of the Hugging Face Open LLM Leaderboard.


Illustration by Nikhil Kumar

It takes a lot for an open source model to reach the top of the Hugging Face Open LLM Leaderboard. Falcon, LLaMA and Mistral, the models that have had their moment at the top, are now making way for open source models from China. But there is no need to be worried about the Chinese models (as long as they are open source).

Currently, an unfamiliar model, Smaug-72B, is on top of the leaderboard. It boasts an average score of 80, outperforming Mistral. Developed by Abacus AI, Smaug-72B is a fine-tuned version of Qwen-72B, which was developed by the Chinese tech giant Alibaba and released in December last year, along with a 1.8-billion parameter model.

Qwen-72B is only one of the larger models that Alibaba has developed. The company also released its latest open source model, Qwen1.5-72B, which surpassed Claude-2.1 and GPT-3.5-Turbo-0613 on several benchmarks. Notably, Qwen is also an organisation building LLMs, large multimodal models (LMMs), and other AGI-related projects.

China’s open source dominance

Clearly, the fear of China rising up against US AI models is becoming a reality. Models from the country are increasingly dominating open source, and will continue to do so in the coming years. But as the case of Abacus AI’s Smaug model shows, researchers are more interested in building on existing open source models than in taking risks, which is how research works.

Apart from Qwen, Tencent released its ‘Hunyuan’ LLM for enterprise usage in September last year, a significant move as companies in the country strive to establish themselves as leaders in the technology industry and, more specifically, in generative AI. Tencent’s vice president, Jiang Jie, highlighted the competitive landscape, stating that over 130 LLMs had surfaced in China by July.

In December, DeepSeek, a company based in China which aims to “unravel the mystery of AGI with curiosity”, open sourced DeepSeek LLM, a 67-billion parameter model trained from scratch on a dataset of 2 trillion tokens in both English and Chinese, clearly hinting at its bid to go global. It also outperformed Llama 2, Claude-2, and Grok-1 on various metrics.

Moreover, Kai-Fu Lee’s AI startup, 01.AI, has open sourced its foundational LLM, Yi-34B, which outperforms Llama 2 on various key metrics. Lee said that all he wanted to do was provide an alternative to Meta’s Llama 2, which has been “the gold standard and a big contribution to the open-source community”.

Speaking of Meta, the company has clear plans to release Llama 3 as soon as possible. Though open source is clearly the winner of the AI race, there is no single definite winner in the open source race, as everyone is aiming for the top spot and hoping to stay on top of the leaderboard for a significant amount of time, be it Llama, Mistral, or UAE’s Falcon. It is clearly the right time to also adopt the impressive open source LLMs from China.

What about India?

When it comes to India and generative AI, all the recent Indic language models, such as those for Kannada, Tamil, or Telugu, have been built on top of Meta’s Llama 2 or Mistral. There is a dire need for India to build its own open source model from scratch and let others build on top of it.

Speaking to AIM, Ganesh Ramakrishnan, the IIT Bombay professor who is leading the BharatGPT initiative, said that there is definitely a need for a foundational model for Indic languages. “We are building foundational models from scratch and that is what is keeping us busy,” said Ramakrishnan about how BharatGPT will put India on the global AI map.

But even now, while Indian researchers are focusing on Indic languages and taking on the herculean task of collecting Indic language datasets, there is still not a single Indian model built from scratch, even in English, that has reached the Hugging Face leaderboard. It is hard to assume that an Indic language model, even if open source, would be used by researchers from other countries.

That is why China releases its models in both English and Chinese, so that others can use them, instead of forming a bubble within the country.

Apart from building Indic LLMs to be used in several ways within India, it is also necessary to have an open source model from India that outperforms others globally, even if it is just an English LLM. “Mistral got France on the AI map. We want India to get on the AI map with BharatGPT,” said Ramakrishnan.

“We want everyone to use generative AI,” Vishnu Vardhan, the founder of Vizzhy, who is also the GPU buddy of BharatGPT, told AIM. He said that the initiative would not just be about releasing weights, but about making the model available to everyone, and highlighted that the first model would be open source. He wants developers to help make it better. “The more people use it, the better it will become,” he said, adding that they would eventually release more versions of the model.


Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.