Google’s New AI Milestone: Neural Machine Translation Engine Can Now Translate 103 Languages

Neural Machine Translation (NMT), one of the most prominent topics in deep learning, has gained much attention from industry and academia over the last few years. In an effort to replace complex models with simpler ones, tech giant Google has been innovating in machine translation for several years now.

Back in 2017, the tech giant introduced a simple solution that uses a single Neural Machine Translation (NMT) model to translate between multiple languages, with the researchers merging 12 language pairs into one model. The researchers categorised multilingual NMT models into three types: many-to-one, one-to-many and many-to-many.
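To make the three categories concrete, here is a minimal Python sketch. It assumes one widely used convention for training a single model on several language pairs: prepending a token that names the target language to each source sentence. The token format `<2xx>` and the helper names `tag_source` and `classify_setup` are illustrative assumptions, not details taken from the article.

```python
from typing import Iterable, Tuple


def tag_source(source_text: str, target_lang: str) -> str:
    """Prepend a (hypothetical) target-language token, e.g. '<2es>'."""
    return f"<2{target_lang}> {source_text}"


def classify_setup(pairs: Iterable[Tuple[str, str]]) -> str:
    """Name the multilingual setup implied by (source, target) language pairs."""
    pairs = list(pairs)
    sources = {src for src, _ in pairs}
    targets = {tgt for _, tgt in pairs}
    if len(sources) > 1 and len(targets) > 1:
        return "many-to-many"
    if len(sources) > 1:
        return "many-to-one"
    if len(targets) > 1:
        return "one-to-many"
    return "one-to-one"


# Toy language pairs, chosen only for illustration.
pairs = [("en", "es"), ("en", "de"), ("fr", "en")]
print(classify_setup(pairs))
print(tag_source("Hello, how are you?", "es"))
```

With a tag like this, the same model parameters serve every direction, and the tag alone tells the model which language to produce.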


Recently, researchers on the Google AI team built an enhanced NMT system and published a paper titled “Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges”. They built a single massively multilingual NMT model that handles 103 languages.

The researchers used a massive open-domain dataset containing over 25 billion parallel sentences in 103 languages. The goal is to enable a single model to translate between arbitrary language pairs. To accomplish this, the researchers model a massively multi-way input-output mapping task under strong constraints: a huge number of languages, different writing systems, and heavy data imbalance across languages and domains, among others.

The desired features of this universal NMT system are:

  1. Maximum throughput in terms of the number of languages considered within a single model.
  2. Maximum inductive (positive) transfer towards low-resource languages.
  3. Minimum interference (negative transfer) for high-resource languages.
  4. Robust multilingual NMT models that perform well in realistic, open-domain settings. 

How It Is Different

The researchers at the tech giant claim that this model is the largest multilingual NMT system to date in terms of the amount of training data and the number of languages considered. The underlying concept is not new, however; the model can be seen as an advanced version of the previously proposed one.

This state-of-the-art model can be used as a one-to-many model, that is, one model for many languages, which reduces both training and serving costs. In a blog post, Graham Neubig, an Assistant Professor at the Language Technologies Institute of Carnegie Mellon University, said that machine translation researchers and practitioners can gain insights from this research on a number of points, such as the importance of large models and the effects of techniques for choosing how large to make the model vocabularies.


The work also comes with several limitations and open challenges:

  • Data and Supervision: The model is limited to 103 languages, a minuscule fraction of the thousands of existing languages. The approach becomes less applicable as more languages are included.
  • Learning: The model's heuristic sampling strategy takes only dataset size into account when determining the fraction of per-task samples seen by the model.
  • Increasing Capacity: The results highlight the need for sufficient model capacity when training large multitask networks. The researchers also faced significant trainability challenges when training deep, high-capacity neural networks.
  • Architecture and Vocabulary: As the researchers scale up to thousands of languages, vocabulary handling becomes a significantly harder challenge.
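
The Learning point above concerns how often each language's data is sampled during training. A minimal Python sketch of one size-based heuristic, temperature-scaled sampling, is shown below. The function name `sampling_probs`, the temperature value and the toy dataset sizes are assumptions for illustration; the article does not specify the exact formula used.

```python
from typing import Dict


def sampling_probs(sizes: Dict[str, int], temperature: float = 1.0) -> Dict[str, float]:
    """Turn per-language dataset sizes into sampling probabilities.

    temperature = 1.0 samples in proportion to dataset size;
    larger values flatten the distribution towards uniform,
    so low-resource languages are seen more often.
    """
    total = sum(sizes.values())
    weights = {lang: (n / total) ** (1.0 / temperature) for lang, n in sizes.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


# Hypothetical sizes: one high-resource and one low-resource language.
sizes = {"high_resource": 1_000_000, "low_resource": 1_000}

proportional = sampling_probs(sizes, temperature=1.0)
flattened = sampling_probs(sizes, temperature=5.0)
print(proportional)
print(flattened)
```

Because the only input is dataset size, a heuristic like this ignores other factors such as task difficulty or data quality, which is the limitation the bullet describes.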

Wrapping Up

There are several benefits to using multilingual models. A carefully designed multilingual model can handle all translation directions within a single model. This not only reduces operational costs but also improves performance on low- and zero-resource language pairs and simplifies deployment in production systems.

Read the paper here

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
