Amazon makes MASSIVE announcements around a 51-language dataset 

The MASSIVE dataset and the Massively Multilingual NLU (MMNLU-22) competition and workshop will help researchers scale natural-language-understanding technology to every language on Earth.

In the world of multilingual voice assistance, Amazon announced a new dataset called MASSIVE, a new competition using MASSIVE, and a workshop, Massively Multilingual NLU 2022. 

Imagine if everyone in the world could use voice AI systems such as Alexa in their native tongues. A promising approach to realising this vision is massively multilingual natural-language understanding (MMNLU), a paradigm in which a single machine learning model can parse and understand input from many typologically diverse languages. By learning a shared data representation that spans languages, such a model can transfer knowledge from languages with abundant training data to those in which training data is scarce.



Amazon made three announcements related to MMNLU:

  1. The release of a new dataset called MASSIVE, composed of one million labelled utterances spanning 51 languages, along with open-source code that provides examples of massively multilingual NLU modelling and allows practitioners to re-create baseline results for intent classification and slot filling.
  2. A new competition using the MASSIVE dataset, called Massively Multilingual NLU 2022 (MMNLU-22).
  3. A workshop, also called Massively Multilingual NLU 2022, which Amazon will co-host at EMNLP 2022 in Abu Dhabi and online.
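
For readers unfamiliar with the two baseline tasks named above, here is a minimal Python sketch of what a labelled utterance involves: intent classification assigns one label to the whole utterance, while slot filling tags the spans that carry its parameters. The example utterance, the `alarm_set` label, the bracketed `[slot : value]` markup and the `parse_annotated_utterance` helper are all assumptions made for illustration; they are not taken from the announcement or from Amazon's released code.

```python
import re

# Assumed markup for illustration: slots are written inline as "[slot : value]".
SLOT_PATTERN = re.compile(r"\[\s*([^\]:]+?)\s*:\s*([^\]]+?)\s*\]")


def parse_annotated_utterance(annotated):
    """Split an annotated utterance into plain text and (slot, value) pairs."""
    # Slot filling target: every labelled span in the utterance.
    slots = [(m.group(1), m.group(2)) for m in SLOT_PATTERN.finditer(annotated)]
    # Plain text: the same utterance with the markup stripped out.
    plain = SLOT_PATTERN.sub(lambda m: m.group(2), annotated)
    return plain, slots


# Hypothetical labelled example (not copied from the dataset).
example = {
    "intent": "alarm_set",  # intent classification target: one label per utterance
    "annotated": "wake me up at [time : nine am] on [date : friday]",
}

text, slots = parse_annotated_utterance(example["annotated"])
print(example["intent"])
print(text)
print(slots)
```

A multilingual model would be trained to predict the intent label and the slot spans from the plain text alone, in any of the supported languages.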


Prem Natarajan, VP of Alexa AI Natural Understanding, said, “We are very excited to share this large multilingual dataset with the worldwide language research community. We hope the dataset will help researchers worldwide to drive new advances in multilingual language understanding that expand the availability and reach of conversational-AI technologies”.




Poornima Nataraj
Poornima Nataraj has worked in the mainstream media as a journalist for 12 years and is always eager to learn anything new and evolving. Witnessing a revolution in the world of analytics, she thinks she is in the right place at the right time.
