
AssemblyAI releases Conformer-1 API, the SOTA Speech Recognition Model

The team took inspiration from DeepMind’s data scaling laws in the Chinchilla paper and adapted them to the ASR domain, curating 650K hours of English audio and making Conformer-1 the largest supervised speech recognition model trained to date.


AssemblyAI, the company focused on building speech, voice, and text models, announced Conformer-1, its latest state-of-the-art speech recognition model. Built on the Conformer architecture and trained on 650K hours of audio data, the model attains near-human accuracy and makes up to 43% fewer errors on noisy data than alternative ASR models.

To improve on the Conformer architecture, the company leveraged the Efficient Conformer, a modification of the original architecture that uses progressive downsampling, inspired by ContextNet, together with grouped attention. These changes speed up inference by 29% and training by 36%.
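As an illustration of those two ideas, here is a minimal sketch in PyTorch of grouped self-attention and strided, progressive downsampling. This is not AssemblyAI’s code; the module names, group size, and dimensions are assumptions for demonstration only.

```python
# A minimal sketch, assuming PyTorch, of the two Efficient Conformer ideas:
# grouped self-attention and progressive (strided) downsampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedSelfAttention(nn.Module):
    """Attend over frames merged in groups of `group_size`,
    roughly dividing attention cost by group_size**2."""
    def __init__(self, dim: int, num_heads: int = 4, group_size: int = 3):
        super().__init__()
        self.group_size = group_size
        self.attn = nn.MultiheadAttention(dim * group_size, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        g = self.group_size
        pad = (g - t % g) % g                          # pad so the time axis divides evenly
        x = F.pad(x, (0, 0, 0, pad))
        grouped = x.reshape(b, -1, d * g)              # merge g consecutive frames per token
        out, _ = self.attn(grouped, grouped, grouped)  # attention over T/g tokens
        return out.reshape(b, -1, d)[:, :t]            # split back to frames, drop padding

class StridedDownsample(nn.Module):
    """Progressive downsampling: halve the time axis with a strided convolution."""
    def __init__(self, dim: int):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, stride=2, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x.transpose(1, 2)).transpose(1, 2)

feats = torch.randn(2, 100, 144)             # (batch, audio frames, feature dim)
shorter = StridedDownsample(144)(feats)      # ~50 frames after downsampling
attended = GroupedSelfAttention(144)(shorter)
print(shorter.shape, attended.shape)         # torch.Size([2, 50, 144]) twice
```

Fewer, coarser tokens mean the quadratic attention step runs on a much shorter sequence, which is where the reported inference and training speedups come from.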



To overcome one of the biggest problems in speech recognition, noise, the team also implemented a modified version of sparse attention: a pruning method that induces sparsity in the model’s weights and acts as a form of regularisation. This robustness to noise is one of the model’s greatest achievements.
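AssemblyAI has not published the exact pruning scheme, so the sketch below stands in with plain magnitude-based weight pruning via torch.nn.utils.prune; the layer and the 30% sparsity level are illustrative assumptions, not the model’s actual configuration.

```python
# A minimal sketch, assuming simple magnitude pruning as a stand-in for the
# unpublished "modified sparse attention"; the layer and 30% sparsity are illustrative.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

attn_proj = nn.Linear(256, 256)   # e.g. an attention projection layer (hypothetical)

# Zero the 30% of weights with the smallest magnitude; the induced sparsity
# acts as a regulariser on the layer.
prune.l1_unstructured(attn_proj, name="weight", amount=0.3)

sparsity = (attn_proj.weight == 0).float().mean().item()
print(f"share of zeroed weights: {sparsity:.2f}")   # ~0.30

# Fold the pruning mask into the weight tensor permanently.
prune.remove(attn_proj, "weight")
```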

In 2020, Google Brain released the Conformer, a neural network designed for speech recognition. It is based on the Transformer architecture, which is widely used and known for its attention mechanism and parallel processing capabilities. The Conformer architecture enhances the Transformer by incorporating convolutional layers, allowing it to effectively capture both local and global dependencies, while remaining a compact neural network design.
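For readers who want to see what that combination looks like in practice, below is a simplified, illustrative sketch of a single Conformer block: macaron feed-forward halves, multi-head self-attention for global context, and a depthwise convolution for local context. The dimensions and kernel size are assumptions, not Conformer-1’s actual configuration.

```python
# A simplified, illustrative Conformer block: macaron feed-forward halves,
# multi-head self-attention (global context) and a depthwise convolution
# (local context). Dimensions and kernel size are assumptions.
import torch
import torch.nn as nn

class ConformerBlock(nn.Module):
    def __init__(self, dim: int = 144, heads: int = 4, kernel: int = 31):
        super().__init__()
        self.ff1 = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 4 * dim),
                                 nn.SiLU(), nn.Linear(4 * dim, dim))
        self.attn_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv_norm = nn.LayerNorm(dim)
        self.dw_conv = nn.Conv1d(dim, dim, kernel, padding=kernel // 2, groups=dim)
        self.ff2 = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 4 * dim),
                                 nn.SiLU(), nn.Linear(4 * dim, dim))
        self.out_norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + 0.5 * self.ff1(x)                      # first half-step feed-forward
        h = self.attn_norm(x)
        x = x + self.attn(h, h, h)[0]                  # global dependencies via attention
        c = self.dw_conv(self.conv_norm(x).transpose(1, 2)).transpose(1, 2)
        x = x + c                                      # local dependencies via depthwise conv
        x = x + 0.5 * self.ff2(x)                      # second half-step feed-forward
        return self.out_norm(x)

frames = torch.randn(2, 100, 144)                      # (batch, time, features)
print(ConformerBlock()(frames).shape)                  # torch.Size([2, 100, 144])
```

The depthwise convolution handles short-range patterns in the audio frames, while self-attention relates frames across the whole utterance, which is the pairing the paragraph above describes.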
