Natural language processing (NLP) is one of the most actively explored topics in machine learning. NLP addresses everyday digital needs such as smart assistants, language translation, and text prediction. Among the many libraries used in this field, in this post we discuss GluonNLP, a deep-learning-based natural language processing toolkit. This toolkit includes cutting-edge pre-trained models, training scripts, and training logs to help with rapid prototyping and reproducible research. It also offers modular APIs with flexible building blocks for easy customization. Following are the major points that we are going to discuss in this post.
Table of contents
- The GluonNLP
- Features of the library
- Generating text sequence with GluonNLP
Let’s first understand the library structure.
Deep learning has spurred rapid progress in artificial intelligence research, resulting in remarkable advances on long-standing problems in a wide range of natural language processing areas. Deep learning frameworks like Apache MXNet, PyTorch, TensorFlow, Caffe, and Theano make this possible.
These frameworks have been crucial to the spread of ideas in the field. In particular, imperative tools, arguably popularized by Chainer, are straightforward to develop, learn, read, and debug. These benefits have driven the adoption of the imperative programming interface.
Jian Guo et al. created and developed the GluonNLP toolkit for deep learning in natural language processing using MXNet’s imperative Gluon API. GluonNLP simultaneously provides modular APIs to allow customization by reusing efficient building blocks; pre-trained state-of-the-art models, training scripts, and training logs to enable fast prototyping and promote reproducible research; and models that can be deployed in a wide variety of programming languages, including C++, Clojure, Java, Julia, Perl, Python, R, and Scala.
Features of the library
Here we’ll discuss the major highlights of this library.
With GluonNLP’s modular APIs, users can tailor their model design, training, and inference by reusing efficient components across various models. Common components include data processing utilities, models with individual building blocks, initialization procedures, and loss functions.
Take the data API of GluonNLP, which is used to build efficient data pipelines from user-supplied data, as an example of how the modular API supports efficient implementation. In natural language processing tasks, inputs frequently vary in shape, such as sentences of different lengths. The data API therefore includes a set of utilities for sampling inputs and converting them into mini-batches that can be computed efficiently.
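To make the batching idea concrete, here is a minimal pure-Python sketch (an illustration of the concept, not GluonNLP’s actual API) that pads variable-length token-id sequences into a rectangular mini-batch while remembering each sequence’s valid length:

```python
def pad_batch(seqs, pad_val=0):
    """Pad variable-length token-id sequences to a common length.

    Returns the padded batch plus the original (valid) lengths,
    mirroring what batching utilities in NLP toolkits produce.
    """
    max_len = max(len(s) for s in seqs)
    padded = [s + [pad_val] * (max_len - len(s)) for s in seqs]
    lengths = [len(s) for s in seqs]
    return padded, lengths

batch, lengths = pad_batch([[1, 2, 3], [4, 5], [6]])
# batch   -> [[1, 2, 3], [4, 5, 0], [6, 0, 0]]
# lengths -> [3, 2, 1]
```

Keeping the valid lengths alongside the padded batch lets downstream code mask out the padding positions during computation.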
Building on such modular APIs, GluonNLP provides pre-trained state-of-the-art models, training scripts, and training logs via its model zoo, enabling fast prototyping and encouraging reproducible research. GluonNLP supplies over 200 models for natural language processing tasks such as word embedding, language modelling, machine translation, sentiment analysis, natural language inference, dependency parsing, and question answering.
Generating text sequence with GluonNLP
In this section, we show how to sample and generate a text sequence from a pre-trained language model by leveraging this library’s API. Using a language model, we can sample sequences according to the likelihood that they appear under the model, for a given vocabulary size and sequence length.
Given the context from previous time steps, a language model predicts the likelihood of each word occurring at each time step. GluonNLP provides two samplers for generating from a language model: BeamSearchSampler and SequenceSampler, of which we will use SequenceSampler.
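To see what sequence sampling means at a single time step, here is a small pure-Python sketch (an illustration, not the library’s implementation) of drawing the next token from a temperature-scaled softmax over the model’s output scores:

```python
import math
import random

def sample_next(logits, temperature=1.0):
    """Sample a token index from softmax(logits / temperature)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # inverse-CDF sampling from the categorical distribution
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1
```

A lower temperature sharpens the distribution so high-scoring tokens dominate; a temperature near 1 keeps sampling closer to the model’s raw probabilities.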
Let’s now quickly install the dependencies.
# install dependencies
!pip install gluonnlp
!pip install mxnet
To begin, load an AWD LSTM language model, which is a state-of-the-art RNN language pre-trained language model from which we will sample sequences.
# loading the pre-trained model
import mxnet as mx
import gluonnlp as nlp

ctx = mx.cpu()
lm_model, vocab = nlp.model.get_model(name='awd_lstm_lm_1150',
                                      dataset_name='wikitext-2',
                                      pretrained=True,
                                      ctx=ctx)
A scorer function is required for SequenceSampler to work. We will use BeamSearchScorer, which implements a scoring function with a length penalty.
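The length penalty in this kind of scorer follows the formulation from Google’s neural machine translation work: the cumulative log-probability is divided by ((K + length) / (K + 1)) ** alpha. Here is a hedged pure-Python sketch of that scoring rule (the parameter names mirror the call below, but this is an illustration, not GluonNLP’s code):

```python
def length_penalty(length, alpha=0.0, K=5):
    # GNMT-style penalty; alpha=0 disables it (penalty == 1)
    return ((K + length) / (K + 1)) ** alpha

def penalized_score(log_probs, alpha=0.0, K=5):
    # divide the summed log-probabilities by the length penalty
    return sum(log_probs) / length_penalty(len(log_probs), alpha, K)

penalized_score([-1.0, -2.0, -3.0])             # alpha=0: plain sum, -6.0
penalized_score([-1.0, -2.0, -3.0], alpha=1.0)  # alpha>0 favours longer sequences
```

With alpha set to 0, as in this tutorial, the penalty is 1 and the score is simply the sum of log-probabilities; raising alpha counteracts the bias toward short sequences.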
# scorer
scorer = nlp.model.BeamSearchScorer(alpha=0, K=5, from_logits=False)
Next, we need to define a decoder based on the pre-trained language model.
# decoder based on the pre-trained language model
class LMDecoder(object):
    def __init__(self, model):
        self._model = model

    def __call__(self, inputs, states):
        outputs, states = self._model(mx.nd.expand_dims(inputs, axis=0), states)
        return outputs, states

    def state_info(self, *arg, **kwargs):
        return self._model.state_info(*arg, **kwargs)

decoder = LMDecoder(lm_model)
Now that we have a scorer and a decoder, we’re ready to construct a sampler. The example code below creates a sequence sampler with 5 beams, a maximum sample length of 100, and a temperature of 0.97 to control the sharpness of the softmax distribution.
# create sampler; treat '.' as the end-of-sequence token
eos_id = vocab['.']
seq_sampler = nlp.model.SequenceSampler(beam_size=5,
                                        decoder=decoder,
                                        eos_id=eos_id,
                                        max_length=100,
                                        temperature=0.97)
Next, we’ll produce sentences that begin with “I love to swim.” We feed the language model [‘I’, ‘love’, ‘to’] to obtain the starting states and set the initial input to the word ‘swim’.
# generate samples
bos = 'I love to swim'.split()
bos_ids = [vocab[ele] for ele in bos]
begin_states = lm_model.begin_state(batch_size=1, ctx=ctx)
if len(bos_ids) > 1:
    _, begin_states = lm_model(mx.nd.expand_dims(mx.nd.array(bos_ids[:-1]), axis=1),
                               begin_states)
inputs = mx.nd.full(shape=(1,), ctx=ctx, val=bos_ids[-1])
All of this can be combined into a helper function that generates sequences in a single call.
# helper function
def generate_sequences(sampler, inputs, begin_states, num_print_outcomes):
    samples, scores, valid_lengths = sampler(inputs, begin_states)
    samples = samples.asnumpy()
    scores = scores.asnumpy()
    valid_lengths = valid_lengths.asnumpy()
    print('Generation Result:')
    for i in range(num_print_outcomes):
        sentence = bos[:-1]
        for ele in samples[i][:valid_lengths[i]]:
            sentence.append(vocab.idx_to_token[ele])
        print([' '.join(sentence), scores[i]])
We can now generate the sequences:
generate_sequences(seq_sampler, inputs, begin_states, 5)
The function prints each sampled sequence along with its score. As we can see, the generated continuations fit our original sentence quite well.
Through this post, we have discussed GluonNLP, a deep-learning-based library that addresses various NLP tasks such as sentiment analysis, word embeddings, and sequence generation. We can experiment with many applications of natural language processing by leveraging its modular APIs and pre-trained models.