
Beware of Chinese Open-Source LLMs 

Not really.


Illustration by Nikhil Kumar


If you’ve seen or even heard of the popular American comedy series Silicon Valley, you may be familiar with the shady Chinese app developer Jian-Yang, who goes back to China to build a knock-off version of Pied Piper, the show’s fictional cloud-based compression platform that lets users compress and share files across devices. 

A similar drama is unfolding at OpenAI, which has filed trademark applications for “GPT-6” and “GPT-7” in China, not in the US, obviously to avoid a Pied Piper situation of its own. 

This new development also highlights the advancements in open source AI research in China, which even OpenAI is concerned about.

When it comes to open source AI research, a common refrain is that releasing powerful AI models is risky because Chinese competitors would get all the model weights and eventually come out on top. Meanwhile, large language models (LLMs) from China are increasingly topping the leaderboards. 

Most recently, DeepSeek LLM, a 67-billion-parameter model, outperformed Llama 2, Claude 2, and Grok-1 on various benchmarks. The best part is that the Chinese model is open sourced and uses the same architecture as LLaMA, and companies like Abacus AI are ready to host it on their platforms.
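Because the published checkpoints reuse the LLaMA architecture, they can be loaded through the standard Llama code path in Hugging Face transformers. The sketch below is a hypothetical illustration, not something from DeepSeek’s release notes: it only inspects the model config (actually loading a 67-billion-parameter checkpoint needs multiple high-memory GPUs), and the repository ID shown is the chat variant listed on DeepSeek’s Hugging Face page.

```python
# Minimal sketch (assumes the `transformers` library is installed and the
# machine has network access to Hugging Face). It checks that DeepSeek's
# 67B checkpoint declares the standard Llama architecture without
# downloading the full weights.
from transformers import AutoConfig

# Repository ID as listed on DeepSeek's Hugging Face page (assumption).
config = AutoConfig.from_pretrained("deepseek-ai/deepseek-llm-67b-chat")

print(config.model_type)        # expected: "llama"
print(config.architectures)     # expected: ["LlamaForCausalLM"]
print(config.num_hidden_layers, config.hidden_size)
```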

Topping the leaderboards

Given the geopolitical conflict between the US and China, restrictions on chip exports to the country keep tightening, making it harder for Chinese companies to build AI models and grow their business. But the increasing number of open source models indicates that China does not really rely on US technology to further its AI field. 

For instance, the Open LLM Leaderboard on Hugging Face, which has itself been criticised several times over its benchmarks and evaluations, currently features AI models from China, and they are topping the list. 

Tigerbot-70b-chat-v2 is currently leading the pack. The model, available on GitHub and Hugging Face, is built on top of the Llama 2 70B architecture and weights. It seems open source models such as Llama 2 are actually helping the AI community in China build models that, at the moment, beat their US counterparts. 

Tiger Research, a company that “believes in open innovations”, is a research lab in China under Tigerobo, dedicated to building AI models to make the world a better place for humankind. Similarly, DeepSeek is a research lab with the mission of “unravelling the mystery of AGI with curiosity”.

Another hint of China’s open source AI dominance is the Yi-34B model released by the startup 01.AI, which reached unicorn status after the release. The AI startup, founded by Kai-Fu Lee, is developing AI systems for the Chinese market. The interesting part is that the second and third models on the Open LLM Leaderboard are also based on Yi-34B, combined with Llama 2 and Mistral-7B.

Not just this, Chinese tech giant Alibaba also released Qwen-72B, trained on 3 trillion tokens and supporting a 32K context length. It is available on GitHub and Hugging Face alongside the smaller Qwen-1.8B, which requires just 3GB of GPU memory to run, making it a great fit for the research community. Notably, Qwen is also an organisation building LLMs, large multimodal models (LMMs), and other AGI-related projects.
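As a rough illustration of that small memory footprint, the hypothetical sketch below loads the 1.8B chat checkpoint with 4-bit quantisation, which is how a figure of around 3GB is typically reached. The repository ID, the quantisation choice, and the exact memory use are assumptions for illustration, not details confirmed in Qwen’s announcement.

```python
# Minimal sketch (assumes `transformers`, `accelerate` and `bitsandbytes`
# are installed and a CUDA GPU is available). Loads Qwen-1.8B-Chat with
# 4-bit weights so it fits in a few gigabytes of VRAM.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID as listed under the Qwen organisation on Hugging Face (assumption).
model_id = "Qwen/Qwen-1_8B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # place layers on the available GPU
    trust_remote_code=True,  # Qwen ships custom modelling code
    load_in_4bit=True,       # quantise weights to shrink the memory footprint
)

prompt = "Summarise the Qwen model family in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```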

Is Chinese open source a threat?

Clearly, the fear of China rising up against US AI models is becoming a reality. Models from the country increasingly dominate the open source landscape, and will continue to do so in the coming year. And regulations are clearly not making it any better for the US. But how big a threat is it really?

The recent slew of open source model releases from China highlights that the country does not need US assistance for its AI development. Moreover, if the US continues to crush its own open source ecosystem with regulation, China will rise even further in this respect. Then again, as long as China continues to open source its powerful AI models, there is no real threat at the moment. 

Furthermore, China going head-to-head with US AI models is not a new phenomenon. When OpenAI announced GPT-3.5, Baidu released its Ernie 3.0 model, which was almost double the size of the former. It has been the same story ever since Ernie 2.0 was released in 2019 to compete with the early versions of OpenAI’s GPT.

Even though these models sit at the top of the Open LLM Leaderboard, many researchers have pointed out that this is largely down to the evaluation metrics used for benchmarking. A lot of the researchers in China are also hired from the US. 

Moreover, many of these models are extremely restrictive. Given the information controls in the country, they might be fast, but they are extremely poor when it comes to implementation in real use cases. “Don’t use Chinese models. They are going to nerf them the Chinese way, which is to alter behaviour even worse than current US censored models,” said a user on X.


Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words. He also holds a keen interest in photography, filmmaking, and the gaming industry.