Name Language Prediction using Recurrent Neural Network in PyTorch

In this article, we will demonstrate the implementation of a Recurrent Neural Network (RNN) in PyTorch for a multi-class text classification task. The RNN model will be trained on names of people drawn from 18 language classes. After training, the model will predict, for a given name, the language category it most likely belongs to.
Recurrent Neural Network in PyTorch

Recurrent Neural Networks have been applied very successfully as deep learning models for tasks that deal with sequential data, especially Natural Language Processing (NLP). Traditional feed-forward networks take a fixed-size input all at once and produce a fixed-size output. Recurrent neural networks, by contrast, process the input one element at a time and carry a hidden state from one step to the next. This makes them well suited to many NLP applications, and RNN models are widely used for text classification problems.
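
To make the contrast concrete, here is a toy sketch of the recurrence idea in plain Python. It is not the model built later in this article: the scalar weights `w_x` and `w_h`, the `tanh` activation, and the function names are all assumptions chosen for illustration. The point is only that the same step is applied to each element in turn, with a hidden state carried along.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.8):
    # One recurrent step: combine the current input with the previous hidden state
    # (w_x, w_h and tanh are illustrative assumptions, not the article's model)
    return math.tanh(w_x * x + w_h * h)

def run_sequence(xs, h0=0.0):
    # Apply the same step, with the same weights, to every element in order
    h = h0
    for x in xs:
        h = rnn_step(x, h)
    return h

# The same elements in a different order lead to a different final state
print(run_sequence([1.0, 0.0]))
print(run_sequence([0.0, 1.0]))
```

Because the final state depends on the order of the inputs, a recurrent model can distinguish sequences that contain the same elements in a different arrangement.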

Implementation of RNN in PyTorch

This implementation was done in Google Colab, with the data set read from Google Drive. The code below mounts Google Drive in the Colab notebook and lists the text files in the data set.

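from __future__ import unicode_literals, print_function, division
from io import open
import glob
import os

from google.colab import drive

# Mount Google Drive. The mount point /content/gdrive is an assumption, chosen
# so that the relative path 'gdrive/My Drive/...' used below resolves from
# Colab's default working directory (/content).
drive.mount('/content/gdrive')

# Return the list of files matching a glob pattern
def printFiles(path):
    return glob.glob(path)

print(printFiles('gdrive/My Drive/Dataset/data/data/names/*.txt'))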

The code below defines a helper function that converts Unicode text to its plain ASCII equivalent.

import unicodedata
import string

all_let = string.ascii_letters + " .,;'"
n_let = len(all_let)

# Strip accents and drop any character that is not in all_let
def unicodeToAscii(s):
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_let
    )
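
As a quick illustration of what this conversion does, the sketch below repeats the helper in a self-contained form and applies it to two accented names. The sample names are assumptions picked for the example and are not taken from the data set.

```python
import string
import unicodedata

all_let = string.ascii_letters + " .,;'"

def unicodeToAscii(s):
    # NFD splits an accented letter into a base letter plus combining marks;
    # the marks (category 'Mn') are dropped, as is anything not in all_let
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_let
    )

# Illustrative inputs (assumed, not from the data set)
print(unicodeToAscii('Ślusàrski'))  # Slusarski
print(unicodeToAscii("O'Néàl"))     # O'Neal
```

Note that characters with no ASCII base letter are removed entirely rather than transliterated.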

The following code snippet builds a dictionary that maps each language category to its list of names, along with a list of all categories.

cat_line = {}
all_cats = []

# Read a file and split into lines
def readLines(filename):
    lines = open(filename, encoding='utf-8').read().strip().split('\n')
    return [unicodeToAscii(line) for line in lines]

for filename in printFiles('gdrive/My Drive/Dataset/data/data/names/*.txt'):
    # The category is the file name without its extension
    category = os.path.splitext(os.path.basename(filename))[0]
    all_cats.append(category)
    lines = readLines(filename)
    cat_line[category] = lines

n_categories = len(all_cats)

We will check the result by printing 4 Japanese names.

# Check names in a category
print(cat_line['Japanese'][:4])

In the next step, helper functions are defined to turn names into tensors so that they can be fed to the RNN model.

import torch
# Find letter index from all_let, e.g. "a" = 0
def letterToIndex(letter):
    return all_let.find(letter)

# Turn a letter into a <1 x n_let> Tensor
def letterToTensor(letter):
    tensor = torch.zeros(1, n_let)
    tensor[0][letterToIndex(letter)] = 1
    return tensor

# Turn a line into a <line_length x 1 x n_let>,
# or an array of one-hot letter vectors
def lineToTensor(line):
    tensor = torch.zeros(len(line), 1, n_let)
    for li, letter in enumerate(line):
        tensor[li][0][letterToIndex(letter)] = 1
    return tensor

We will check these functions by converting a letter and a line to tensors.
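
Such a check might look like the following sketch. It repeats the alphabet and `lineToTensor` so that it runs on its own; the letter 'J' and the name 'Jones' are assumed sample inputs.

```python
import string
import torch

# Same alphabet as in the article: ASCII letters plus a few punctuation marks
all_let = string.ascii_letters + " .,;'"
n_let = len(all_let)

def lineToTensor(line):
    # One one-hot row per letter, shaped <line_length x 1 x n_let>
    tensor = torch.zeros(len(line), 1, n_let)
    for li, letter in enumerate(line):
        tensor[li][0][all_let.find(letter)] = 1
    return tensor

j_tensor = lineToTensor('J')[0]        # <1 x n_let> one-hot row for 'J'
jones_tensor = lineToTensor('Jones')   # <5 x 1 x n_let>

print(j_tensor.size())      # torch.Size([1, 57])
print(jones_tensor.size())  # torch.Size([5, 1, 57])
# The single 1 in the row sits at the letter's index in all_let
print(j_tensor.argmax().item() == all_let.find('J'))
```

Each letter becomes a row with a single 1 at that letter's position, and the middle dimension of size 1 is the batch dimension that the model expects.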

In the next step, we will define the Recurrent Neural Network model.

import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.hidden_size = hidden_size

        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        # Concatenate the current letter with the previous hidden state
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)

n_hidden = 128
# Instantiate the model
rnn = RNN(n_let, n_hidden, n_categories)

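We will check the model by generating the output tensor for the first letter of a name.

input = lineToTensor('Aalsburg')
hidden = torch.zeros(1, n_hidden)

# Feed only the first letter; the output is a <1 x n_categories> tensor
output, next_hidden = rnn(input[0], hidden)
print(output)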

Because the model ends in a LogSoftmax layer, this output holds one log-likelihood per category for the given input name. The model is still untrained, so these values do not yet carry any meaning.
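
To relate such an output to ordinary probabilities, the sketch below applies `nn.LogSoftmax` to a random score vector and exponentiates the result. The random scores are a stand-in assumed for illustration; they are not produced by the model above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_categories = 18                      # the 18 language classes of the article
scores = torch.randn(1, n_categories)  # assumed stand-in for raw output scores

log_probs = nn.LogSoftmax(dim=1)(scores)
probs = log_probs.exp()                # back to probabilities

# Log-probabilities are never positive, and the probabilities sum to one
print(bool((log_probs <= 0).all()))
print(probs.sum().item())
```

This is why the values printed by the model are negative: the closer a value is to zero, the more likely the model considers that category.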

Now, we will define functions that supply random training examples to the network during training and that map a network output to a category.

import random

def randomChoice(l):
    return l[random.randint(0, len(l) - 1)]

def randomTrainingExample():
    category = randomChoice(all_cats)
    line = randomChoice(cat_line[category])
    category_tensor = torch.tensor([all_cats.index(category)], dtype=torch.long)
    line_tensor = lineToTensor(line)
    return category, line, category_tensor, line_tensor

#Check on a random sample
for i in range(10):
    category, line, category_tensor, line_tensor = randomTrainingExample()
    print('category =', category, '/ line =', line)

def categoryFromOutput(output):
    top_n, top_i = output.topk(1)
    category_i = top_i[0].item()
    return all_cats[category_i], category_i

# Check category for an output
print(categoryFromOutput(output))
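
The following sketch shows what `topk(1)` returns on a small hand-made output and that it selects the same index as `argmax`. The three category names and the probabilities are hypothetical values chosen for the example.

```python
import torch

categories = ['Dutch', 'French', 'Japanese']         # hypothetical category list
output = torch.log(torch.tensor([[0.2, 0.7, 0.1]]))  # hand-made log-probabilities

# topk(1) returns the largest value and its index along the last dimension
top_v, top_i = output.topk(1)
idx = top_i[0].item()

# argmax along dim=1 picks the same index
argmax_idx = output.argmax(dim=1).item()

print(categories[idx], idx)
print(idx == argmax_idx)
```

Using `topk` rather than `argmax` makes it easy to ask for more than one candidate later, simply by passing a larger `k`.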

In the next step, the loss function, the learning rate, and the training function are defined, and the RNN model is trained for 100,000 iterations, each on one randomly chosen name.

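# Negative log-likelihood loss, which pairs with the LogSoftmax output layer
criterion = nn.NLLLoss()

learning_rate = 0.005

def train(category_tensor, line_tensor):
    hidden = rnn.initHidden()

    # Clear gradients left over from the previous example
    rnn.zero_grad()

    # Feed the name one letter at a time, carrying the hidden state forward
    for i in range(line_tensor.size()[0]):
        output, hidden = rnn(line_tensor[i], hidden)

    loss = criterion(output, category_tensor)
    loss.backward()

    # Add parameters' gradients to their values, multiplied by learning rate
    for p in rnn.parameters():
        p.data.add_(p.grad.data, alpha=-learning_rate)

    return output, loss.item()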

import time
import math

n_iters = 100000
print_every = 5000
plot_every = 1000

# Keep track of losses for plotting
current_loss = 0
all_losses = []

def timeSince(since):
    now = time.time()
    s = now - since
    m = math.floor(s / 60)
    s -= m * 60
    return '%dm %ds' % (m, s)

start = time.time()

for iter in range(1, n_iters + 1):
    category, line, category_tensor, line_tensor = randomTrainingExample()
    output, loss = train(category_tensor, line_tensor)
    current_loss += loss

    # Print iter number, loss, name and guess
    if iter % print_every == 0:
        guess, guess_i = categoryFromOutput(output)
        correct = '✓' if guess == category else '✗ (%s)' % category
        print('%d %d%% (%s) %.4f %s / %s %s' % (iter, iter / n_iters * 100, timeSince(start), loss, line, guess, correct))

    # Add current loss avg to list of losses
    if iter % plot_every == 0:
        all_losses.append(current_loss / plot_every)
        current_loss = 0
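
Adding each parameter's gradient multiplied by the negative learning rate is plain stochastic gradient descent. As a sketch of an alternative, the snippet below checks on a small stand-in model that one such manual step matches one step of `torch.optim.SGD`. The `nn.Linear(4, 3)` model, the random input, and the names `model_manual` and `model_optim` are assumptions for illustration and are not part of the article's code.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
learning_rate = 0.005

# Two identical copies of a small stand-in model (hypothetical)
model_manual = nn.Linear(4, 3)
model_optim = copy.deepcopy(model_manual)
optimizer = torch.optim.SGD(model_optim.parameters(), lr=learning_rate)

x = torch.randn(1, 4)
target = torch.tensor([2])
criterion = nn.NLLLoss()
log_softmax = nn.LogSoftmax(dim=1)

# Manual update: p <- p - learning_rate * grad
model_manual.zero_grad()
loss = criterion(log_softmax(model_manual(x)), target)
loss.backward()
with torch.no_grad():
    for p in model_manual.parameters():
        p.add_(p.grad, alpha=-learning_rate)

# The same step through the optimizer
optimizer.zero_grad()
loss = criterion(log_softmax(model_optim(x)), target)
loss.backward()
optimizer.step()

same = all(torch.allclose(a, b)
           for a, b in zip(model_manual.parameters(), model_optim.parameters()))
print(same)
```

An optimizer object also makes it simpler to experiment with momentum or a different update rule without rewriting the training function.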

After training, we will visualize the loss to see the performance.

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

plt.figure()
plt.plot(all_losses)


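The code snippet below evaluates the model on 10,000 randomly sampled names and plots the confusion matrix. Note that these names are drawn from the same data that was used for training, so this is not a test on held-out data.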

# Keep track of correct guesses in a confusion matrix
confusion = torch.zeros(n_categories, n_categories)
n_confusion = 10000

# Just return an output given a line
def evaluate(line_tensor):
    hidden = rnn.initHidden()

    for i in range(line_tensor.size()[0]):
        output, hidden = rnn(line_tensor[i], hidden)

    return output

# Go through a bunch of examples and record which are correctly guessed
for i in range(n_confusion):
    category, line, category_tensor, line_tensor = randomTrainingExample()
    output = evaluate(line_tensor)
    guess, guess_i = categoryFromOutput(output)
    category_i = all_cats.index(category)
    confusion[category_i][guess_i] += 1

# Normalize by dividing every row by its sum
for i in range(n_categories):
    confusion[i] = confusion[i] / confusion[i].sum()

# Set up plot
figsize = (10, 10)
fig = plt.figure(figsize=figsize)
ax = fig.add_subplot(111)
cax = ax.matshow(confusion.numpy())

# Set up axes
ax.set_xticklabels([''] + all_cats, rotation=90)
ax.set_yticklabels([''] + all_cats)

# Force label at every tick
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
ax.yaxis.set_major_locator(ticker.MultipleLocator(1))

plt.show()
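
The row-by-row normalization in the code above divides by each row's sum, which yields NaN for any category that was never sampled. A vectorized variant that guards against this could look like the sketch below; the small 3 x 3 count matrix is made up for illustration.

```python
import torch

# Made-up counts: rows are true categories, columns are guesses
confusion = torch.tensor([[3., 1., 0.],
                          [0., 0., 0.],   # a category that was never sampled
                          [1., 0., 1.]])

# Sum each row, keeping the dimension so the division broadcasts per row;
# clamp avoids dividing by zero for empty rows
row_sums = confusion.sum(dim=1, keepdim=True)
normalized = confusion / row_sums.clamp(min=1)

print(normalized)
```

With 10,000 samples spread over 18 categories an empty row is unlikely, so the guard matters mainly when the number of evaluation samples is reduced.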

The function below prints the most likely language categories for a given name, along with their log-likelihoods.

def predict(input_line, n_predictions=3):
    print('\n> %s' % input_line)
    with torch.no_grad():
        output = evaluate(lineToTensor(input_line))

        # Get top N categories
        topv, topi = output.topk(n_predictions, 1, True)
        predictions = []

        for i in range(n_predictions):
            value = topv[0][i].item()
            category_index = topi[0][i].item()
            print('(%.2f) %s' % (value, all_cats[category_index]))
            predictions.append([value, all_cats[category_index]])

Finally, we will check the predicted likelihoods for three given names.


As we can see above, for each given name the RNN model has printed the language categories it considers most likely. For example, for the name ‘Aggelen’ it has listed its top 3 categories, of which ‘French’ has the highest value. That means that, according to the trained RNN model, the name ‘Aggelen’ most likely belongs to the ‘French’ language. All three predictions are correct. We could apply argmax to print only the language category with the highest likelihood, but for clarity the top 3 predictions are shown in the result. You can check this model with a larger number of predictions and tune the hyperparameters to improve the accuracy.


Dr. Vaibhav Kumar
Vaibhav Kumar has experience in the field of Data Science and Machine Learning, including research and development. He holds a PhD degree in which he has worked in the area of Deep Learning for Stock Market Prediction. He has published/presented more than 15 research papers in international journals and conferences. He has an interest in writing articles related to data science, machine learning and artificial intelligence.