Say Hello To STACL, Baidu’s New Innovation For Improving Simultaneous Translation


Translating between languages is demanding work that requires listening, speaking and a thorough command of both languages. A task that demands a high degree of skill from humans may prove far easier for modern computing techniques such as machine learning and artificial intelligence.




Baidu has recently developed a system that uses ML and AI to translate between languages, with anticipation capabilities and controllable latency. It is an automated system that aims to deliver high-quality translation between two languages. It could prove highly advantageous over traditional consecutive interpretation, in which a translator waits until the speaker pauses before starting to translate. While that method usually doubles the time needed, simultaneous interpretation is a faster option in which the translator begins translating just a few seconds into the speaker’s speech.

Issues With Current Translation System

The usage of verbs and figures of speech can vary significantly across languages. For example, in English the verb comes early in the sentence, whereas in German it often comes at the end. The same issue arises in Chinese-to-English translation. This variance in word order is a major hindrance for real-time human translators, causing undesirable latency and leaving the translation out of sync with the speaker. Simultaneous Translation with Anticipation and Controllable Latency (STACL) promises to address the issue of when and how to use words.

How Does It Work?

STACL works on the principle of prediction. As the researchers from Baidu explain, the model doesn’t predict the source-language words in the speaker’s speech but instead directly predicts the target-language words in the translation. The model fuses translation and anticipation in a single “wait-k” model. This means that the translation is always k words behind the speaker’s speech, which gives the model context for prediction. The model is trained to use the available prefix of the source sentence at each step to decide the next word in the translation.


The researchers give an example: given the Chinese prefix Bùshí Zǒngtǒng zài Mòsīkē (“Bush President in Moscow”) and the English translation so far, “President Bush”, which is k=2 words behind the Chinese, their system accurately predicts that the next translated word must be “meet”, because Bush is likely “meeting” someone (e.g., Putin) in Moscow. This prediction is made long before the verb appears in the source.
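The wait-k idea described above can be sketched in a few lines of Python. This is only an illustration of the schedule, not Baidu's implementation: the function names (`wait_k_schedule`, `wait_k_translate`, `toy_predictor`) are hypothetical, and the "model" is a scripted stand-in that simply replays the three English words from the example.

```python
from typing import Callable, List, Sequence


def wait_k_schedule(source_len: int, k: int, target_len: int) -> List[int]:
    """Number of source words visible when each target word is produced.

    Target word number t (0-based) is produced after reading k + t source
    words, capped at the full source length.
    """
    return [min(k + t, source_len) for t in range(target_len)]


def wait_k_translate(
    source: Sequence[str],
    k: int,
    predict_next: Callable[[Sequence[str], Sequence[str]], str],
    max_len: int = 50,
    eos: str = "</s>",
) -> List[str]:
    """Emit target words one by one, each from a limited source prefix."""
    target: List[str] = []
    while len(target) < max_len:
        # Only the first k + len(target) source words have been "heard" so far.
        visible = source[: min(k + len(target), len(source))]
        word = predict_next(visible, target)
        if word == eos:
            break
        target.append(word)
    return target


# Hypothetical stand-in for a trained model: it replays a fixed script.
SCRIPT = ["President", "Bush", "meet"]


def toy_predictor(visible: Sequence[str], target: Sequence[str]) -> str:
    return SCRIPT[len(target)] if len(target) < len(SCRIPT) else "</s>"


prefix = ["Bùshí", "Zǒngtǒng", "zài", "Mòsīkē"]
schedule = wait_k_schedule(len(prefix), k=2, target_len=3)
output = wait_k_translate(prefix, k=2, predict_next=toy_predictor)
print(schedule)  # source words visible for each of the three target words
print(output)
```

With k=2, the third English word is chosen when only the four-word Chinese prefix is available, which matches the situation in the example: the verb has to be anticipated rather than read.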


The model, however, needs to be prepared for the speaker’s topic and style beforehand, just as human translators do. This is done by training on large amounts of data with similar sentence structures, which enables the model to anticipate, with reasonable accuracy, the words most likely to be spoken in a sentence. With its current capabilities, STACL aims at dealing with latency.

How STACL Can Match Human Interpretation

Baidu evaluates the system using BLEU (Bilingual Evaluation Understudy), quoting a figure of 3.4 BLEU points. BLEU is not part of the architecture itself; it is a standard algorithm for estimating the quality of text that has been machine-translated from one natural language to another. “It is a standard evaluation metric for full-sentence translation quality by comparing a machine translation result with a human reference translation”, notes the website.
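To make the metric concrete, here is a minimal sketch of a simplified, sentence-level BLEU score in Python. It is an illustration under stated assumptions, not the evaluation script Baidu used: it handles a single reference, applies no smoothing, and returns a value between 0 and 1, whereas published scores are normally computed over a whole test set and quoted on a 0 to 100 scale. The function names and the example sentences are made up for this sketch.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Simplified BLEU: clipped n-gram precisions, geometric mean, brevity penalty."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Each candidate n-gram is credited at most as often as it occurs in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Penalise candidates that are shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


machine = "President Bush meets with Putin".split()
human = "President Bush met with Putin".split()
score = simple_bleu(machine, human, max_n=2)
print(round(score * 100, 1))  # scaled to the usual 0-100 range
```

On the 0 to 100 scale, a difference of a few BLEU points between two systems corresponds to a difference of a few hundredths in the raw score computed here.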


Human translators can cover up to 60 percent of the source material with a delay of about three seconds, and the new simultaneous system is said to be much more efficient. Whereas in earlier Chinese-to-English simultaneous translation the translator lagged behind by 3 Chinese words, or about 1.5 to 2 seconds, the translation quality with the new ML model is about 5 BLEU points higher.


While STACL shows significant potential, the researchers have yet to overcome many limitations of the simultaneous machine translation system. STACL is not intended to replace human interpreters yet, but its capabilities may be used to offer an improved service in the years to come.


Bharat Adibhatla
Bharat is a voracious reader of biographies and political tomes. He is also an avid astrologer and storyteller who is very active on social media.
