
A guide to GluonNLP: Deep Learning framework for NLP

GluonNLP is a deep learning-based toolkit for natural language processing. It includes cutting-edge pre-trained models, training scripts, and training logs to help with rapid prototyping and reproducible research.

Natural language processing is one of the most explored and currently trending topics in machine learning. It addresses everyday digital needs such as smart assistants, language translation, and text prediction. Among the various libraries used in this field, this post discusses GluonNLP, a deep learning-based toolkit for natural language processing. Besides pre-trained models, training scripts, and training logs, it offers modular APIs with flexible building blocks for easy customization. The following are the major points that we are going to discuss in this post.

Table of contents

  1. The GluonNLP
  2. Features of the library
  3. Generating text sequence with GluonNLP

Let’s first understand the library structure.



The GluonNLP

Deep learning has spurred rapid progress in artificial intelligence research, resulting in remarkable results on long-standing problems in a wide range of natural language processing areas. Deep learning frameworks like MXNet, PyTorch, TensorFlow, Caffe, and Theano make this possible.

These frameworks have been crucial in the transmission of ideas in the field. In particular, imperative tools, which were perhaps popularized by Chainer, are straightforward to develop, learn, read, and debug. Such benefits have hastened the adoption of the imperative programming interface.

Jian Guo et al. developed GluonNLP, a toolkit for deep learning in natural language processing, using MXNet’s imperative Gluon API. GluonNLP simultaneously provides modular APIs to allow customization by reusing efficient building blocks; pre-trained state-of-the-art models, training scripts, and training logs to enable fast prototyping and promote reproducible research; and models that can be deployed in a wide variety of programming languages, including C++, Clojure, Java, Julia, Perl, Python, R, and Scala.

Features of the library

Here we’ll discuss the major highlights of this library. 

Modular API

Users may tailor their model design, training, and inference by reusing efficient components across various models with GluonNLP’s modular APIs. Data processing tools, models with individual components, initialization procedures, and loss functions are examples of common components.

Take the data API of GluonNLP, which is used to design efficient data pipelines with data provided by users, as an example of how the modular API supports efficient implementation. In natural language processing tasks, inputs are frequently of various shapes, such as sentences of various lengths. As a result, the data API includes a set of utilities for sampling inputs and converting them into mini-batches that can be computed efficiently.
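To make the batching idea concrete, here is a minimal pure-Python sketch of the two steps described above: grouping sentences of similar length, then padding each group to a common length. The helper names `pad_batch` and `bucket_batches` are hypothetical and written only for illustration; they are not GluonNLP's data API, which ships its own utilities for this.

```python
# Hypothetical helpers for illustration only; not part of GluonNLP.

def pad_batch(sequences, pad_value=0):
    """Pad lists of token ids to a common length; also return the true lengths."""
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in sequences]
    lengths = [len(seq) for seq in sequences]
    return padded, lengths


def bucket_batches(sequences, batch_size):
    """Sort by length so that each mini-batch needs as little padding as possible."""
    ordered = sorted(sequences, key=len)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Four "sentences" of different lengths, already converted to token ids.
sentences = [[4, 8, 15], [16, 23], [42], [7, 7, 7, 7]]
for batch in bucket_batches(sentences, batch_size=2):
    padded, lengths = pad_batch(batch)
    print(padded, lengths)
```

Keeping the true lengths alongside the padded batch lets a model ignore the padding positions later on.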

Pre-trained models

Building on such modular APIs, GluonCV/NLP provides pre-trained state-of-the-art models, training scripts, and training logs via the model zoo, enabling fast prototyping and encouraging reproducible research. Over 200 models have been supplied, covering natural language processing tasks such as word embedding, language modelling, machine translation, sentiment analysis, natural language inference, dependency parsing, and question answering.

Generating text sequence with GluonNLP

In this section, we use the library's API to sample and generate a text sequence from a pre-trained language model. Using a language model, we can sample sequences according to the likelihood the model assigns to them, for a particular vocabulary size and sequence length.

Given the context from previous time steps, a language model predicts the likelihood of each word occurring at each time step. GluonNLP provides two samplers for generating from a language model: BeamSearchSampler and SequenceSampler, of which we will use SequenceSampler.
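The idea of sampling word by word from a language model can be sketched without any deep learning library. The example below uses a hand-written table of next-word probabilities (`TOY_LM`, with made-up numbers) in place of a trained model, and a hypothetical `sample_sequence` helper; it illustrates the principle only and is not how GluonNLP's samplers are implemented.

```python
import random

# Made-up next-word probabilities standing in for a trained language model.
TOY_LM = {
    'I': {'love': 0.7, 'swim': 0.3},
    'love': {'to': 0.9, '.': 0.1},
    'to': {'swim': 0.8, 'love': 0.2},
    'swim': {'.': 1.0},
}


def sample_sequence(start, max_length, rng):
    """Draw one word at a time until '.' is produced or max_length is reached."""
    sequence = [start]
    while len(sequence) < max_length and sequence[-1] != '.':
        choices = TOY_LM[sequence[-1]]
        words = list(choices)
        weights = [choices[word] for word in words]
        # Sample the next word in proportion to its probability.
        sequence.append(rng.choices(words, weights=weights, k=1)[0])
    return sequence


rng = random.Random(0)
for _ in range(3):
    print(' '.join(sample_sequence('I', max_length=10, rng=rng)))
```

Because each word is drawn at random, repeated calls generally give different sentences, with likelier continuations appearing more often.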

Let’s now quickly install the dependencies.  

# install dependencies
!pip install gluonnlp 
!pip install mxnet 

To begin, load an AWD LSTM language model, a state-of-the-art pre-trained RNN language model from which we will sample sequences.

# loading the pre-trained model
import mxnet as mx
import gluonnlp as nlp
ctx = mx.cpu()
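# the remaining arguments (dataset and pretrained weights) are assumed, following the GluonNLP sampling tutorial
lm_model, vocab = nlp.model.get_model(name='awd_lstm_lm_1150',
                                      dataset_name='wikitext-2',
                                      pretrained=True,
                                      ctx=ctx)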

A scorer function is required by BeamSearchSampler, while SequenceSampler draws samples directly and does not take one. For completeness, we define the BeamSearchScorer here, which implements the scoring function with a length penalty.

# scorer
scorer = nlp.model.BeamSearchScorer(alpha=0, K=5, from_logits=False)
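The effect of `alpha` and `K` can be seen in a small standalone sketch. It assumes the length penalty has the form lp(Y) = ((K + |Y|) / (K + 1))^alpha and that a candidate's score is its log-probability divided by that penalty; the function names below are hypothetical and this is not the library's own code.

```python
def length_penalty(length, alpha, K):
    """Assumed penalty: ((K + length) / (K + 1)) ** alpha."""
    return ((K + length) / (K + 1)) ** alpha


def penalized_score(log_prob, length, alpha=0.0, K=5):
    """Divide a candidate's log-probability by its length penalty."""
    return log_prob / length_penalty(length, alpha, K)


# With alpha=0 the penalty equals 1, so the score is the raw log-probability.
print(penalized_score(-6.0, length=10, alpha=0.0))
# With alpha=1 a longer candidate is divided by a larger penalty,
# which pulls its (negative) score closer to zero.
print(penalized_score(-6.0, length=10, alpha=1.0))
```

Under this assumption, the setting `alpha=0` used above means no length normalization is applied at all.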

Next, we need to define a decoder based on the pre-trained language model.

class LMDecoder(object):
    """Wraps the language model so a sampler can call it one step at a time."""
    def __init__(self, model):
        self._model = model
    def __call__(self, inputs, states):
        # add a leading time axis of length 1, run one step, then drop that axis
        outputs, states = self._model(mx.nd.expand_dims(inputs, axis=0), states)
        return outputs[0], states
    def state_info(self, *arg, **kwargs):
        return self._model.state_info(*arg, **kwargs)
decoder = LMDecoder(lm_model)

With the decoder in place, we’re ready to construct a sampler. The example code below shows how to make a sequence sampler. We’ll make a sampler with 5 beams, a maximum sample length of 100, and a temperature that controls the softmax activation.

# create sampler
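# eos_id and the temperature value are assumed, following the GluonNLP sampling tutorial
seq_sampler = nlp.model.SequenceSampler(beam_size=5,
                                        decoder=decoder,
                                        eos_id=vocab['.'],
                                        max_length=100,
                                        temperature=0.97)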
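How a temperature shapes the softmax can be shown with a short, library-free sketch. The functions `softmax_with_temperature` and `sample_index` are hypothetical helpers written for this illustration, and the logits are arbitrary numbers rather than model outputs.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng):
    """Sample one position in proportion to its temperature-scaled probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # arbitrary example scores
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
print(sample_index(logits, temperature=0.97, rng=random.Random(0)))
```

A temperature below 1 sharpens the distribution toward the highest-scoring word, while a temperature above 1 flattens it and makes the samples more varied.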

Next, we’ll produce sentences that begin with “I love to swim”. We feed the language model [‘I’, ‘love’, ‘to’] to retrieve the starting states and set the initial input to be the word ‘swim’.

# generate samples
bos = 'I love to swim'.split()
bos_ids = [vocab[ele] for ele in bos]
begin_states = lm_model.begin_state(batch_size=1, ctx=ctx)
if len(bos_ids) > 1:
    _, begin_states = lm_model(mx.nd.expand_dims(mx.nd.array(bos_ids[:-1]), axis=1),
                               begin_states)
inputs = mx.nd.full(shape=(1,), ctx=ctx, val=bos_ids[-1])

All of this can be wrapped in a helper function so that a sequence can be generated with a single call.

# helper function
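def generate_sequences(sampler, inputs, begin_states, num_print_outcomes):
    # relies on the module-level `bos` and `vocab` defined above
    samples, scores, valid_lengths = sampler(inputs, begin_states)
    # keep the first (and only) batch element and convert it to NumPy
    samples = samples[0].asnumpy()
    scores = scores[0].asnumpy()
    valid_lengths = valid_lengths[0].asnumpy()
    print('Generation Result:')
    for i in range(num_print_outcomes):
        # start from the prefix without its last word, which was passed to the sampler as the initial input
        sentence = bos[:-1]
        for ele in samples[i][:valid_lengths[i]]:
            # map each sampled token id back to its word
            sentence.append(vocab.idx_to_token[ele])
        print([' '.join(sentence), scores[i]])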

Now we can generate the sequences.

generate_sequences(seq_sampler, inputs, begin_states, 5)

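The function prints the requested number of sampled sentences, each paired with its score. Because the sampler draws words at random, the exact sentences vary from run to run.

In our run, the generated text was a suitable continuation of the original sentence.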

Final words

In this post, we have discussed GluonNLP, a deep learning-based library that addresses various NLP tasks such as sentiment analysis, word embeddings, and sequence generation. By leveraging its modular APIs and pre-trained models, we can experiment with a range of natural language processing applications.



Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.
