7 Free Online Resources To Learn NVIDIA NeMo

NVIDIA recently concluded its GTC 2020 virtual conference, where it announced NeMo, a toolkit for building speech and language models for state-of-the-art conversational AI. At the event, NVIDIA introduced NeMo as an open-source PyTorch toolkit for building and training GPU-accelerated conversational AI models.

With components like the model, neural module, and neural type, NeMo provides the building blocks for developing conversational AI models, which users can use to create and train state-of-the-art neural network architectures.

This year, the pandemic lockdown has significantly changed how businesses work, creating massive demand for AI-based services such as chatbots that interact with customers and bridge the language gap between humans and machines. This surge has made conversational AI a necessity in many industries, where businesses are deploying applications to understand and communicate with their customers.

The release of NeMo by NVIDIA is therefore timely. However, this open-source toolkit may not be easy for developers to pick up. That is why we have curated a list of free online resources that can help you understand it and get hands-on.

Introducing NVIDIA NeMo


What better way to understand NeMo than from the company that developed it? Here, NVIDIA itself talks about its recently released NeMo and shows how to build a simple Automatic Speech Recognition (ASR) model with it. In this tutorial, learners will see how an ASR model can be developed using only three lines of code. The tutorial also covers how to fine-tune hyperparameters easily using config files and how NeMo can scale to multiple GPUs with minimal effort.

Check out the tutorial here.

NeMo Voice Swap Demo

On Google Colab

This tutorial teaches learners how to use NVIDIA NeMo to build a toy demo that swaps the voice in an audio fragment with a computer-generated one. The demo first runs automatic speech recognition on the speech in the file, i.e. converts the audio to text, and then adds punctuation and capitalisation to that text. The tutorial also showcases how to generate a spectrogram from the resulting text and waveform audio from that spectrogram.

Check out the tutorial here.

Beginner’s Guide to NVIDIA NeMo

By Ng Wai Foong, Senior AI Engineer at Yoozoo

Published on Medium, this article gives learners a glimpse of the basic concepts behind NVIDIA NeMo. It starts with a brief introduction to the NeMo toolkit and introduces a few of the pre-trained models available on NVIDIA GPU Cloud, covering ASR, NLP and TTS. The article then shows how to install the toolkit, either via Docker or locally with pip install, and explores in depth how to program the model and the NeuralType that make up the building blocks of NeMo. It also walks through and tests a few examples for ASR, NLP and text-to-speech tasks.

Check out the tutorial here.

Building Custom Speech Recognition Model

By Jaganadh Gopinadhan, AI & Analytics Leader at Cognizant

Also published on Medium, this tutorial teaches how to create an ASR model using the LibriSpeech dataset. The model is trained on the entire LibriSpeech dev-clean subset. The tutorial starts with setting up the prerequisites, then creates a training manifest file, and then moves on to model training. The author trained for 1,000 epochs and reports that the resulting transcriptions looked good.
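
For readers new to the training manifest step, here is a minimal sketch of what such a file can look like. NeMo's ASR data loaders read a manifest with one JSON object per line, holding `audio_filepath`, `duration` (in seconds) and `text`. Everything else below is an assumption made for illustration and is not taken from the tutorial: the helper names `wav_duration_seconds` and `write_manifest`, the `utterances` mapping, and the premise that the audio has already been converted to WAV (LibriSpeech ships as FLAC).

```python
import json
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_manifest(utterances, manifest_path):
    """Write one JSON object per line: audio_filepath, duration, text.

    `utterances` is a hypothetical mapping of WAV path -> transcript.
    Returns the number of entries written.
    """
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        for audio_path, transcript in utterances.items():
            entry = {
                "audio_filepath": str(Path(audio_path)),
                "duration": round(wav_duration_seconds(audio_path), 3),
                # Lowercasing is an illustrative normalisation choice.
                "text": transcript.strip().lower(),
            }
            out.write(json.dumps(entry) + "\n")
            count += 1
    return count
```

A call such as `write_manifest({"utt1.wav": "HELLO WORLD"}, "train_manifest.json")` would produce a single-line manifest, assuming `utt1.wav` exists.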

Check out the tutorial here.

ASR with NeMo

On Google Colab

Another Google Colab tutorial, this one not only showcases how to build an Automatic Speech Recognition model using NVIDIA NeMo but also provides a conceptual overview of end-to-end ASR models. It uses the AN4 dataset, also known as the Alphanumeric dataset, and explains spectrograms and Mel spectrograms, along with convolutional ASR models. For building the ASR model, it covers training from scratch, inference, model improvements and a look under the hood.
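
To make the Mel spectrogram idea a little more concrete, here is a minimal sketch of the Hz-to-mel mapping behind it. It uses one widely cited formula, m = 2595 * log10(1 + f / 700). Other variants exist, and this is not claimed to be the exact one used in the tutorial or in NeMo's preprocessing. The function names and the 64-band, 0 to 8000 Hz example are assumptions for illustration.

```python
import math


def hz_to_mel(freq_hz):
    """Map a frequency in Hz to the mel scale (2595 * log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands, f_min, f_max):
    """Center frequencies (Hz) of n_bands filters spaced evenly in mel."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_bands + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    centers = mel_band_centers(n_bands=64, f_min=0.0, f_max=8000.0)
    # Spacing between neighbouring bands widens as frequency rises.
    print(centers[1] - centers[0], centers[-1] - centers[-2])
```

Because the bands are evenly spaced in mel rather than in Hz, low frequencies get finer resolution than high ones, which is the property a Mel spectrogram relies on.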

Check out the tutorial here.

Neural Modules and Models for Conversational AI

By PyTorch

Published on PyTorch's Medium publication, this article introducing NVIDIA NeMo is authored by Oleksii Kuchaiev, a senior applied scientist, and Poonam Chitale, a senior product manager, both at NVIDIA. It starts with a comprehensive overview of NVIDIA NeMo and then goes on to showcase building prototypes such as the voice swap example. It also highlights how to train and fine-tune models with NeMo. This article gives a well-rounded understanding of NeMo.

Check out the article here.

Speech recognition and synthesis (ASR and TTS)

By DeepPavlov

This tutorial covers DeepPavlov, which contains models for automatic speech recognition and speech synthesis based on pre-built modules from NeMo. It discusses speech recognition, speech synthesis, and audio encoding and decoding. With this tutorial, one can quickly build models that recognise as well as synthesise speech using NVIDIA NeMo.
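
One plausible reading of "audio encoding and decoding" is turning raw audio bytes into a text-safe form so they can travel inside a JSON request, then turning them back into a binary stream on the other side. The sketch below illustrates that idea with Python's standard library only. It is an assumption about what the step involves, not DeepPavlov's actual API, and the function names `encode_audio` and `decode_audio` are hypothetical.

```python
import base64
import io
import json


def encode_audio(audio_bytes):
    """Base64-encode raw audio bytes into an ASCII string (JSON-safe)."""
    return base64.b64encode(audio_bytes).decode("ascii")


def decode_audio(encoded):
    """Decode a base64 string back into an in-memory binary stream."""
    return io.BytesIO(base64.b64decode(encoded))


if __name__ == "__main__":
    fake_wav = b"RIFF" + bytes(range(16))  # stand-in bytes, not a real WAV file
    payload = json.dumps({"audio": encode_audio(fake_wav)})
    restored = decode_audio(json.loads(payload)["audio"])
    print(restored.read() == fake_wav)
```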

Check out the tutorial here.

Copyright Analytics India Magazine Pvt Ltd