7 Free Online Resources To Learn NVIDIA NeMo

NVIDIA recently concluded its GTC 2020 virtual conference, where it announced NeMo, a toolkit for building speech and language models for state-of-the-art conversational AI. At the event, NVIDIA described NeMo as an open-source PyTorch toolkit for building and training GPU-accelerated conversational AI models.

NeMo's core components, namely models, neural modules, and neural types, serve as the building blocks for conversational AI, letting users create and train state-of-the-art neural network architectures.
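To make the idea of typed building blocks concrete, here is a toy sketch in plain Python. It does not use NeMo's actual classes: ToyNeuralType, ToyModule and chain are hypothetical names invented for illustration, and the sketch only assumes that a neural type describes a tensor's axes and contents so that mismatched connections between modules can be caught early.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ToyNeuralType:
    """Hypothetical stand-in for a neural type: axis layout plus element kind."""
    axes: Tuple[str, ...]  # e.g. ("batch", "time")
    element: str           # e.g. "audio_signal"


@dataclass
class ToyModule:
    """Hypothetical stand-in for a neural module with one typed input and output."""
    name: str
    input_type: ToyNeuralType
    output_type: ToyNeuralType
    fn: Callable


def chain(first: ToyModule, second: ToyModule) -> Callable:
    """Compose two modules, refusing to connect mismatched types."""
    if first.output_type != second.input_type:
        raise TypeError(
            f"{first.name} outputs {first.output_type}, "
            f"but {second.name} expects {second.input_type}"
        )
    return lambda x: second.fn(first.fn(x))


# Illustrative types only; real toolkits define far richer ones.
AUDIO = ToyNeuralType(("batch", "time"), "audio_signal")
FEATURES = ToyNeuralType(("batch", "feature", "time"), "spectrogram")
LOGITS = ToyNeuralType(("batch", "time", "vocab"), "log_probs")

# The lambdas are placeholders that just label their input.
preprocessor = ToyModule("preprocessor", AUDIO, FEATURES, lambda x: f"features({x})")
encoder = ToyModule("encoder", FEATURES, LOGITS, lambda x: f"logits({x})")

# A "model" here is simply a chain of compatible modules.
toy_model = chain(preprocessor, encoder)
```

Connecting the modules in the wrong order, for example chain(encoder, preprocessor), raises a TypeError before any data flows through, which is the kind of early check that typed inputs and outputs are meant to enable.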


This year, the pandemic lockdown has significantly changed how businesses work, creating massive demand for AI-based services such as chatbots that can interact with customers and bridge the language gap between humans and machines. This surge has made conversational AI a necessity in many industries, where businesses are deploying applications to understand and communicate with their customers.

NVIDIA's release of NeMo is therefore timely. However, the open-source toolkit may not be easy for developers to pick up. That is why we have curated a list of free online resources to help you understand it and get hands-on.

Introducing NVIDIA NeMo


What could be better than learning NeMo from the company that developed it? Here, NVIDIA itself introduces the recently released toolkit and shows how to build a simple Automatic Speech Recognition (ASR) model with it. In this tutorial, learners will see how an ASR model can be developed using only three lines of code. The tutorial also covers how to fine-tune hyperparameters using config files and how to scale NeMo to multiple GPUs with minimal effort.

Check out the tutorial here.

NeMo Voice Swap Demo

By Google’s Colab

This tutorial teaches learners how to use NVIDIA NeMo to build a toy demo that swaps the voice in an audio fragment for a computer-generated one. To build this simple application, the demo first runs automatic speech recognition on the conversation in the file, i.e. converts the audio to text, and then adds punctuation and capitalisation to the text. The tutorial also shows how to generate a spectrogram from the resulting text and waveform audio from that spectrogram.
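The data flow of such a demo can be sketched as a chain of four stages. The sketch below is plain Python and does not call NeMo: every stage function is a hypothetical placeholder that returns dummy data, standing in for the pre-trained model the tutorial would load at that step.

```python
from typing import Callable, List


def build_pipeline(stages: List[Callable]) -> Callable:
    """Return a function that feeds its input through each stage in order."""
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run


# Hypothetical placeholder: a real ASR model would transcribe the audio file.
def recognise_speech(audio_path: str) -> str:
    return "hello world"


# Hypothetical placeholder: a real model would predict punctuation and casing.
def restore_punctuation(text: str) -> str:
    return text[:1].upper() + text[1:] + "."


# Hypothetical placeholder: a real spectrogram generator would emit mel frames.
def text_to_spectrogram(text: str) -> List[List[float]]:
    return [[float(ord(char))] for char in text]


# Hypothetical placeholder: a real vocoder would synthesise audio samples.
def spectrogram_to_waveform(spectrogram: List[List[float]]) -> List[float]:
    return [frame[0] / 255.0 for frame in spectrogram]


voice_swap = build_pipeline([
    recognise_speech,         # audio -> raw text
    restore_punctuation,      # raw text -> punctuated, capitalised text
    text_to_spectrogram,      # text -> spectrogram
    spectrogram_to_waveform,  # spectrogram -> waveform audio
])

waveform = voice_swap("conversation.wav")  # file name is illustrative only
```

Keeping each stage as a separate function mirrors the order described above, so any one placeholder could be swapped for a real model without touching the rest of the chain.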

Check out the tutorial here.

Beginner’s Guide to NVIDIA NeMo

By Ng Wai Foong, Senior AI Engineer at Yoozoo

Published on Medium, this article gives learners a glimpse of the basic concepts behind NVIDIA NeMo. It starts with a brief introduction to the toolkit and to a few of the pre-trained models available on NVIDIA GPU Cloud, such as ASR, NLP and TTS models. The article then shows how to install the toolkit, either via Docker or locally with pip install, and explores in depth how to program the model and the NeuralType that make up the building blocks of NeMo. It also walks through and tests a few examples for ASR, NLP and text-to-speech tasks.

Check out the tutorial here.

Building Custom Speech Recognition Model

By Jaganadh Gopinadhan, AI & Analytics Leader at Cognizant

Another Medium publication, this tutorial teaches how to create an ASR model using the LibriSpeech dataset. The model is trained on a sizeable dataset, such as the entire LibriSpeech dev-clean set. The tutorial starts by setting up the prerequisites, then creates a training manifest file, and then trains the model. The author trained for 1,000 epochs and reports that the resulting transcriptions looked good.
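For readers unfamiliar with the manifest step, a minimal sketch follows. It assumes the JSON-lines layout that NeMo's ASR data loaders read, with one object per utterance holding audio_filepath, duration and text, and it assumes the audio has already been converted to WAV. The helper names write_manifest and wav_duration_seconds are hypothetical and are not taken from the tutorial.

```python
import json
import wave
from pathlib import Path
from typing import Iterable, Tuple


def wav_duration_seconds(path: Path) -> float:
    """Length of a WAV file in seconds, read from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_manifest(entries: Iterable[Tuple[Path, str]], manifest_path: Path) -> int:
    """Write one JSON object per line; return the number of utterances written.

    `entries` yields (wav_path, transcript) pairs.
    """
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        for wav_path, transcript in entries:
            record = {
                "audio_filepath": str(wav_path),
                "duration": wav_duration_seconds(Path(wav_path)),
                # Lower-casing is a common normalisation choice, assumed here.
                "text": transcript.lower().strip(),
            }
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Calling write_manifest with a list of (path, transcript) pairs and an output path produces a file that can then be referenced from the training configuration.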

Check out the tutorial here.

ASR with NeMo

By Google’s Colab

Another Google Colab tutorial, this one not only shows how to build an Automatic Speech Recognition model using NVIDIA NeMo but also provides a conceptual overview of end-to-end ASR models. It uses the AN4 dataset, also known as the Alphanumeric dataset, and explains spectrograms and Mel spectrograms along with convolutional ASR models. The ASR walkthrough covers training from scratch, inference, model improvements, and a look under the hood.
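As a small aid to the Mel spectrogram concept, the sketch below shows how frequencies in hertz map onto the mel scale, which spaces frequency bands the way human hearing roughly perceives pitch. It uses one widely used conversion formula (the 2595 and 700 constants); other variants exist, and the tutorial's own tooling may use a different one. The function names are illustrative.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels using a common log-based formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centres(n_bands: int, f_min: float, f_max: float) -> List[float]:
    """Centre frequencies (Hz) of n_bands bands equally spaced on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_bands + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_bands)]


# Eight bands between 0 Hz and 8 kHz: evenly spaced in mels, not in Hz.
centres = mel_band_centres(8, 0.0, 8000.0)
```

Because the bands are evenly spaced in mels, their centres sit close together at low frequencies and spread further apart at high frequencies, which is why a Mel spectrogram devotes more resolution to the lower part of the spectrum than a linear-frequency spectrogram does.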

Check out the tutorial here.

Neural Modules and Models for Conversational AI

By PyTorch

Published on PyTorch's Medium page, this article introducing NVIDIA NeMo is authored by Oleksii Kuchaiev, a senior applied scientist, and Poonam Chitale, a senior product manager, both at NVIDIA. It starts with a comprehensive overview of NVIDIA NeMo and then demonstrates building prototypes such as the voice swap example. It also shows how to train and fine-tune models with NeMo. The article gives a well-rounded understanding of the toolkit.

Check out the article here.

Speech recognition and synthesis (ASR and TTS)

By DeepPavlov

This tutorial covers DeepPavlov, which contains models for automatic speech recognition and text-to-speech synthesis based on pre-built modules from NeMo. It discusses speech recognition, speech synthesis, and audio encoding and decoding. With this tutorial, one can quickly build models that recognise and synthesise speech using NVIDIA NeMo.

Check out the tutorial here.

Sejuti Das
Sejuti currently works as Associate Editor at Analytics India Magazine (AIM). Reach out at
