BERT Is So Popular That Google Has To Release A Website To Collate All Developments


With the advent of transformer-based models, originally developed for machine translation, researchers have achieved state-of-the-art performance on a wide range of natural language processing (NLP) tasks. In 2018, Google open-sourced its groundbreaking technique for NLP pre-training called Bidirectional Encoder Representations from Transformers, or BERT. With the help of this model, one can train a state-of-the-art NLP model in a few hours using a single GPU or a single Cloud TPU. The power of the model lies in the fact that BERT can be easily fine-tuned on specific downstream tasks to achieve state-of-the-art results.
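
To make the fine-tuning idea concrete, here is a minimal sketch of adapting a pre-trained BERT checkpoint to a downstream sentence-classification task. The article does not prescribe any tooling, so the Hugging Face transformers library, the bert-base-uncased checkpoint, the toy examples and the hyperparameters below are all illustrative assumptions.

# Minimal fine-tuning sketch (illustrative; not from the article).
import torch
from transformers import BertForSequenceClassification, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy labelled examples standing in for a real downstream dataset.
texts = ["a great movie", "a waste of time"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps; real fine-tuning runs for full epochs
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

Only the small classification head is trained from scratch here; the pre-trained encoder weights are merely adjusted, which is why fine-tuning typically takes hours rather than days.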

Driven by the potential of BERT, developers working in the NLP domain have produced a number of BERT models that are fine-tuned and trained for a specific language and tested on a particular data domain and task. The researchers at Google have also released a multilingual language model, known as multilingual BERT or mBERT, which supports more than 100 languages, while the language-specific models are trained on data from different domains, such as social media posts or newspaper articles. The main aim of this project is to provide a quick and easy overview of the similarities and differences between language-specific BERT models and the multilingual BERT model.

Behind mBERT

Multilingual BERT is a single language model pre-trained on monolingual Wikipedia corpora in 104 languages. The model allows for zero-shot learning across languages: one can train a model on data in one language and then apply it to data in another language without any further training. As a result, mBERT has obtained impressive results on a zero-shot cross-lingual natural language inference task.
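
A minimal sketch of the building block behind this zero-shot transfer is shown below, assuming the Hugging Face transformers library and the public bert-base-multilingual-cased checkpoint (both are assumptions; the paper does not specify tooling). A task head fine-tuned only on labelled data in one language sits on top of this single shared encoder, so it can be applied unchanged to sentences in other languages.

# Illustrative sketch: one shared mBERT encoder represents text from
# different languages in the same vector space.
import torch
from transformers import BertModel, BertTokenizerFast

name = "bert-base-multilingual-cased"  # public mBERT checkpoint
tokenizer = BertTokenizerFast.from_pretrained(name)
encoder = BertModel.from_pretrained(name)

sentences = {"en": "The weather is nice today.", "de": "Das Wetter ist heute schön."}

with torch.no_grad():
    for lang, text in sentences.items():
        batch = tokenizer(text, return_tensors="pt")
        # The [CLS] vector lives in the same 768-dimensional space for every language.
        cls_vector = encoder(**batch).last_hidden_state[:, 0]
        print(lang, cls_vector.shape)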

BERT has since been extended in several directions. Besides Multilingual BERT (mBERT), a number of BERT variants are available, such as A Lite BERT (ALBERT) and RoBERTa. According to the researchers, while the multilingual and cross-lingual BERT representations allow zero-shot learning and capture universal semantic structures, they gloss over language-specific differences. This is the main reason behind the development of the many language-specific BERT models.
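
One concrete place where such differences surface is tokenisation: a language-specific vocabulary usually splits text into fewer, more natural subword pieces than the shared multilingual vocabulary. The sketch below compares the two, assuming the Hugging Face transformers library; camembert-base (a French BERT-style checkpoint) and bert-base-multilingual-cased are public checkpoints used purely as examples, not models named in the article.

# Illustrative comparison of subword tokenisation (assumed checkpoints).
from transformers import AutoTokenizer

sentence = "Le film était vraiment magnifique."

for name in ["camembert-base", "bert-base-multilingual-cased"]:
    tok = AutoTokenizer.from_pretrained(name)
    pieces = tok.tokenize(sentence)
    print(f"{name}: {len(pieces)} pieces -> {pieces}")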

To keep track of the constant development of these BERT models, the researchers introduced a website called BertLang. On this website, the researchers gather the different language-specific models that have been introduced, along with the tasks and datasets on which they have been evaluated.

BertLang Street

The BertLang website currently includes 30 BERT-based models, 18 languages and 28 tasks. According to the researchers, the website is a collaborative resource to help researchers understand and find the best BERT model for a given dataset, task and language. It provides a searchable interface as well as the possibility of adding new information.

Wrapping Up

With the impressive progress of NLP techniques in various domains, there has been an increase in the number of language-specific BERT models developed by NLP researchers, but which model provides the best performance for a given setting remains unclear. In this project, the researchers evaluated the potential of mBERT as a universal language model by comparing its performance with that of the language-specific models. The BERT Lang Street website provided by the researchers will help in evaluating the pros and cons of each language-specific model along different dimensions, such as architecture, data domain and task.

The contributions of this project, as described by the researchers, are mentioned below:

  • An overall picture of language-specific BERT models from an architectural, task-related and domain-related point of view has been presented.
  • The researchers summarised the performance of language-specific BERT models and compared it with the performance of the multilingual BERT model.
  • A new website, known as BERT Lang Street, has been introduced to interactively explore the state-of-the-art models.

Read the paper here.

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.