Baidu Conquers The Next-Gen AI Race By Beating Tech Giants In A Language Test

Although conversing with artificial intelligence (AI) has been a common plot point in science fiction movies, real-life applications are still some way off. However, according to recent media reports, Chinese technology giant Baidu has made bold strides in AI by beating Google and Microsoft in a competition designed to test how well machines understand human language.

Baidu, often called China’s Google, has surpassed established players in AI language understanding. It achieved the highest score yet on the General Language Understanding Evaluation (GLUE), which is widely considered the benchmark for AI language comprehension. Humans typically score 87 out of 100, whereas Baidu’s model, ERNIE (Enhanced Representation through knowledge Integration), scored 90, a first for any AI model. The model was initially developed to understand Chinese, but researchers soon realised that it could handle English as well.


Hao Tian, chief architect at Baidu Research, said, “When we first started this work, we were thinking specifically about certain characteristics of the Chinese language. But we quickly discovered that it was applicable beyond that.”

Behind The Scenes

ERNIE was inspired by Google’s BERT (Bidirectional Encoder Representations from Transformers), a “masked” language model created in 2018 and used to train AI to understand human language. Both models are named after characters from the children’s show Sesame Street. They interpret the meaning of a word by examining the words that appear both before and after it in a sentence, in order to fully establish context.

While Google’s model hides 15 per cent of the words in each sequence and then tries to predict them from the context, Baidu’s approach is different. Many Chinese characters have no inherent meaning until they are strung together with other characters, which is a key linguistic difference from English. Baidu’s team therefore took the training a step further: its model hides a string of characters that together form a meaningful unit and then predicts the masked ones. The company has already started using the model to improve results for its search engine and to make its AI assistant Xiao Du more accurate.
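The word-level masking step described above can be sketched in a few lines of Python. This is a simplified illustration rather than Google’s or Baidu’s actual training code: the function name `mask_tokens`, the `[MASK]` placeholder string, the fixed seed and the sample sentence are all assumptions made for the example, and real pre-training pipelines add further details that are omitted here.

```python
import random

MASK = "[MASK]"  # placeholder symbol; an assumption for this sketch


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide roughly `mask_rate` of the tokens at random positions.

    Returns the masked copy and a dict mapping each hidden position
    to the original token the model would be asked to predict.
    """
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = MASK
    return masked, targets


# Hypothetical 15-word sentence, so about 15 per cent is two words.
sentence = (
    "the quick brown fox jumps over the lazy dog "
    "near the quiet river bank today"
).split()
masked, targets = mask_tokens(sentence)
print(" ".join(masked))
print(targets)
```

Each hidden word is chosen independently of its neighbours, which is why this scheme can split a multi-character unit and leave only part of it hidden.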

To illustrate the technique, the company used the sentence ‘Harry Potter is a series of fantasy novels written by J. K. Rowling’ as an example on its GitHub page. Google’s BERT was able to identify the letter ‘K’ through the neighbouring tokens ‘J.’ and ‘Rowling’, but could not learn anything about the name ‘J. K. Rowling’ as a whole. ERNIE, on the other hand, was able to capture the relationship between ‘Harry Potter’ and ‘J. K. Rowling’ by analysing the underlying knowledge of words and phrases, and to conclude that ‘Harry Potter’ is a series of novels written by ‘J. K. Rowling’.
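The difference between the two masking strategies in this example can be shown with a small Python sketch. It only prepares the masked inputs and does not run any model. The helper names `find_phrase` and `mask_span` are hypothetical, and the name is located by a hand-written match here, whereas a real system would need its own way of identifying phrases and entities.

```python
MASK = "[MASK]"  # placeholder symbol; an assumption for this sketch


def find_phrase(tokens, phrase):
    """Return the (start, end) slice indices of `phrase` inside `tokens`."""
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return i, i + n
    raise ValueError("phrase not found")


def mask_span(tokens, start, end):
    """Mask tokens[start:end]; return the masked copy and the hidden tokens."""
    masked = list(tokens)
    hidden = masked[start:end]
    masked[start:end] = [MASK] * (end - start)
    return masked, hidden


tokens = (
    "Harry Potter is a series of fantasy novels written by J. K. Rowling"
).split()

# Token-level masking: only "K." is hidden, so its neighbours give it away.
k_start, k_end = find_phrase(tokens, ["K."])
token_level, hidden_token = mask_span(tokens, k_start, k_end)

# Phrase-level masking: the whole name is hidden, so it can only be
# recovered from the rest of the sentence, such as "Harry Potter".
e_start, e_end = find_phrase(tokens, ["J.", "K.", "Rowling"])
phrase_level, hidden_phrase = mask_span(tokens, e_start, e_end)

print(" ".join(token_level))
print(" ".join(phrase_level))
```

In the first output the model still sees ‘J.’ and ‘Rowling’ next to the gap; in the second it has to rely on the rest of the sentence to fill in the author’s name.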

Because it learns from meaningful words and phrases instead of individual characters, ERNIE performs better in both English and Chinese. It is now being used in real-world applications, where it is deployed to answer questions on Baidu’s search engine and deliver better results.

According to the Baidu Research team, “Although language understanding will always remain a difficult challenge, our results on GLUE indicate that pre-training language models with continual training and multi-task learning are a promising direction for NLP research. And therefore, we will keep improving the performance of the ERNIE model via the continual pre-training framework.”

Business Prospect

Baidu, with a total of 5,712 AI-related patents, is currently expanding into sectors such as virtual assistants, smart speakers and autonomous cars. It is followed in patent applications by Tencent (4,115), Microsoft (3,978), Inspur (3,755), and Huawei (3,656), according to a report issued by the China Industrial Control Systems Cyber Emergency Response Team, a research unit under the MIIT. The report also noted that Baidu leads in patent applications in several key areas of AI, including deep learning (1,429), NLP (938), and speech recognition (933). The company also leads in the highly competitive area of intelligent driving, with 1,237 patent applications.

In fact, earlier this month, the Chinese giant partnered with Samsung to develop power-efficient AI chips that could be used for large-scale AI workloads such as search ranking, speech recognition, image processing, natural language processing, autonomous driving, and deep learning platforms. The partnership is expected to help Baidu’s NLP framework, ERNIE, process language much faster than it can on its current GPUs.

In addition, once Baidu moves away from NVIDIA’s AI accelerators, it will reduce its dependency on American companies. The move should also cut the cost of its data centres and give it an upper hand over its AI rival Alibaba, which recently launched its own AI accelerator chip.

After several years of research, the Chinese giant has developed a comprehensive AI ecosystem that has brought it to the forefront of the global AI industry. According to media reports, Baidu will continue to push real-world applications of AI into more verticals in the near future. The company is also expected to continue its research in the core areas of AI, with the aim of contributing to technological innovation in China.

Sejuti Das
Sejuti currently works as Associate Editor at Analytics India Magazine (AIM). Reach out at
