Microsoft Translator Can Now Translate More Than 100 Languages And Dialects

With this milestone, Microsoft says it has broken the language barrier for 72% of the world's population.

Microsoft recently announced that Microsoft Translator, its AI-powered text translation service, now supports more than 100 languages and dialects. With the addition of 12 new languages, including Georgian, Macedonian, Tibetan, and Uyghur, Microsoft claims that Translator can now make text and information in documents accessible to 5.66 billion people worldwide.

Translator today covers the world’s most spoken languages, including English, Chinese, Hindi, Arabic and Spanish. In recent years, advances in AI technology have allowed the company to grow its language library with low-resource and endangered languages, such as Inuktitut, a dialect of Inuktut that about 40,000 Inuit speak in Canada.

“One hundred languages is a good milestone for us to achieve our ambition for everyone to be able to communicate regardless of the language they speak. Not only do we celebrate what we have done on translation – reach 100 languages – but also for speech and OCR as well, we want to remove language barriers,” said Xuedong Huang, Microsoft technical fellow and Azure AI chief technology officer.

The new languages and dialects taking Translator over the 100-language milestone are Bashkir, Dhivehi, Georgian, Kyrgyz, Macedonian, Mongolian (Cyrillic), Mongolian (Traditional), Tatar, Tibetan, Turkmen, Uyghur and Uzbek (Latin), which collectively are natively spoken by 84.6 million people.

The frontier of machine translation technology at Microsoft is a multilingual AI model called Z-code. The model combines several languages from a language family, such as the Indian languages of Hindi, Marathi and Gujarati. In this way, the individual language models learn from each other, which reduces the amount of data needed to achieve high-quality translations.

The reduced data requirements also enabled the Translator team to build models for languages that have limited resources or are endangered due to dwindling populations of native speakers. Several of the languages carrying Translator over the 100-language milestone are low-resource or endangered.

Z-code is part of a larger initiative to combine AI models for text, vision, audio, and language to enable AI systems that can speak, see, hear, and understand and thus more efficiently augment human capabilities. 

Victor Dey

Victor is an aspiring Data Scientist & is a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.