Google Releases New Version of Translatotron: Its End-to-end Speech Translation Model

The Translatotron 2 model consists of a source speech encoder, a target phoneme decoder, and a target mel-spectrogram synthesiser.
Google released Translatotron, an end-to-end speech-to-speech translation model, in 2019. The tech giant claimed that the single sequence-to-sequence model was the first end-to-end framework to translate speech in one language directly into speech in another. The system could also preserve the original speaker’s voice in the synthesised translation. However, this feature could be misused to generate speech in a different voice and create deepfake audio. This month, researchers at Google published a paper detailing ‘Translatotron 2’, an updated version that addresses the deepfake problem. “The trained model is restricted to retain the source speaker’s voice, and unlike the original Translatotron, it is not able to generate spee

Avi Gopani
Avi Gopani is a technology journalist at Analytics India Magazine who analyses industry trends and developments from an interdisciplinary perspective. Her articles chronicle cultural, political and social stories, with a focus on the evolving technologies of artificial intelligence and data analytics.