Hands-On Tutorial On Polyglot – Python Toolkit For Multilingual NLP Applications

Polyglot is an open-source Python library used to perform a range of NLP operations. It is built on NumPy, which helps keep it fast. Its large variety of dedicated commands makes it stand out from the crowd.

Natural Language Processing (NLP) is the process of making human language understandable to machines and then performing operations on it to extract useful information. It is a branch of Artificial Intelligence concerned with the interaction between computers and human language.

There is a large variety of Python libraries that can help us perform NLP tasks. Each library has certain unique features that set it apart from the others. Generally, NLP libraries offer functions such as tokenization, stemming, lemmatization, and spell checking.


Polyglot is similar to spaCy and can be used for languages that spaCy does not support.

In this article, we will explore different NLP operations and functions which can be performed using polyglot.


Like any other Python library, polyglot can be installed with pip install polyglot.
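Before going further, it can help to confirm that polyglot and the packages it leans on are importable. The sketch below uses only the standard library. The list of companion modules (numpy, icu from PyICU, pycld2, morfessor) is an assumption based on polyglot's documented requirements, and check_modules is a hypothetical helper name, not part of polyglot.

```python
from importlib.util import find_spec

# Assumed import names of polyglot and its companion packages:
# icu is provided by PyICU; pycld2 and morfessor are separate installs.
REQUIRED_MODULES = ["polyglot", "numpy", "icu", "pycld2", "morfessor"]


def check_modules(names):
    """Map each top-level module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_modules(REQUIRED_MODULES).items():
        print(f"{name:10s} {'ok' if available else 'missing'}")
```

Any module reported as missing would need to be installed with pip before the corresponding polyglot feature can be used.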

  1. Importing Required Libraries

We will import polyglot's modules as and when they are required while exploring its functionality.

  2. Performing Operations on Data

Before performing any operations, let us first initialize some text to work with.

init = '''Analytics India Magazine chronicles technological progress in the space of  analytics, artificial intelligence, data science & big data by highlighting the innovations, players, and challenges shaping the future of India through promotion and discussion of ideas and thoughts by smart, ardent, action-oriented individuals who want to change the world.'''

  3. Language Detection

Polyglot can identify the language of the text passed to it through the language attribute of its Detector class. Let us see how to use it.

from polyglot.detect import Detector

detect = Detector(init)
print(detect.language)
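The detected language can also be unpacked into separate fields. The following is a minimal sketch, assuming the object returned by Detector(...).language exposes name, code, and confidence attributes. The describe_language helper is a hypothetical name, and it returns None when polyglot or its detection dependencies cannot be imported.

```python
def describe_language(raw_text):
    """Return (name, code, confidence) for the dominant language.

    Returns None when polyglot (or pycld2 / PyICU) is not installed.
    """
    try:
        from polyglot.detect import Detector
    except ImportError:
        return None
    language = Detector(raw_text).language
    # Assumed attributes of the language object: name, code, confidence.
    return language.name, language.code, language.confidence


result = describe_language(
    "Polyglot can identify the language of the text passed to it."
)
print(result if result is not None else "polyglot is not installed")
```

Very short inputs may not give the detector enough evidence, so a full sentence or more is a safer input.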


Language Detector
  4. Tokenize

Tokenization splits the text into its words and its sentences. We can print both the word list and the sentence list.

from polyglot.text import Text

text = Text(init)
print(text.words)      # list of word tokens
print(text.sentences)  # list of sentences
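To make concrete what a word list and a sentence list look like, here is a rough standard-library sketch. It is only a regular-expression approximation and not polyglot's own tokenizer. The helper names split_sentences and split_words are made up for illustration.

```python
import re


def split_sentences(text):
    """Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(text):
    """Words may contain inner hyphens or apostrophes; other symbols stand alone."""
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", text)


sample_text = "Polyglot supports many languages. It is easy to use!"
print(split_sentences(sample_text))
print(split_words(sample_text))
```

Rules this simple break on abbreviations such as "e.g." and on languages written without spaces, which is where a dedicated multilingual tokenizer earns its keep.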




Sentences Detection
  5. POS Tagging

Part-of-speech (POS) tagging identifies the syntactic role of each word in the text, such as noun, verb, or adjective.

from polyglot.mapping import Embedding

print(text.pos_tags)  # (word, tag) pairs; needs the POS model for the detected language
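Once the tags are available, a quick summary is often more useful than the raw list. The sketch below counts how often each tag occurs. It assumes the tagger output is a list of (word, tag) pairs. The sample pairs are hand-written for illustration and are not real polyglot output, and tag_counts is a hypothetical helper.

```python
from collections import Counter


def tag_counts(tagged_words):
    """Count how many words carry each POS tag."""
    return dict(Counter(tag for _, tag in tagged_words))


# Hand-written (word, tag) pairs in the assumed shape, not real tagger output.
tagged_sample = [
    ("Analytics", "PROPN"),
    ("India", "PROPN"),
    ("Magazine", "PROPN"),
    ("chronicles", "VERB"),
    ("technological", "ADJ"),
    ("progress", "NOUN"),
]
print(tag_counts(tagged_sample))
```

With a real tagger, the same function could be called directly on the list of pairs it returns.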


POS Tagging
  6. Named Entity Extraction

Named entity extraction pulls out of plain text the phrases that refer to entities such as locations, persons, and organizations.
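A minimal sketch of entity extraction follows. It assumes that Text(...).entities yields chunks of words, each carrying a tag such as I-PER, I-LOC, or I-ORG, and that the English NER models have already been downloaded. The helper names are hypothetical, and the sample pairs are hand-written rather than real polyglot output.

```python
from collections import defaultdict


def group_entities(tagged_chunks):
    """Group (tag, words) pairs by tag, joining each chunk into one phrase."""
    grouped = defaultdict(list)
    for tag, words in tagged_chunks:
        grouped[tag].append(" ".join(words))
    return dict(grouped)


def extract_entities(raw_text):
    """Return entities grouped by tag, or None when polyglot is not installed."""
    try:
        from polyglot.text import Text
    except ImportError:
        return None
    # Assumption: each entity is a sequence of words with a .tag attribute.
    chunks = Text(raw_text).entities
    return group_entities((chunk.tag, list(chunk)) for chunk in chunks)


# Hand-written pairs in the assumed shape, not real polyglot output.
entity_sample = [("I-PER", ["Himanshu", "Sharma"]), ("I-LOC", ["India"])]
print(group_entities(entity_sample))
```

Grouping by tag makes it easy to ask for all persons or all locations at once.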


Named Entity Extraction

Let us try this with another piece of text.

init1 = '''Hello my name is Himanshu Sharma and I am from India'''

text = Text(init1)
print(text.entities)


  7. Morphological Analysis

Morphological analysis studies the regularities behind word formation by splitting words into morphemes, their smallest meaningful units. Let us see how to use it.

from polyglot.text import Word

words = ["programming", "parallel", "inevitable", "beautiful"]

for w in words:
    w = Word(w, language="en")
    print(w, w.morphemes)

Morphological Analysis
  8. Sentiment Analysis

Sentiment analysis finds the polarity of the text. Polyglot assigns each word a polarity of +1 (positive), -1 (negative), or 0 (neutral).

text = Text("The new economic policies are quite good.")

for w in text.words:
    print(w, w.polarity)
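Word-level polarities can be rolled up into a single score for a sentence. One simple way, sketched below, is to average the non-neutral values. The function name and the sample pairs are hypothetical and hand-written for illustration. They are not real polyglot output.

```python
def sentence_polarity(word_polarities):
    """Average the non-neutral word polarities; 0.0 if every word is neutral."""
    scores = [polarity for _, polarity in word_polarities if polarity != 0]
    return sum(scores) / len(scores) if scores else 0.0


# Hand-written (word, polarity) pairs, not real polyglot output.
review = [
    ("The", 0), ("service", 0), ("was", 0), ("good", 1),
    ("but", 0), ("the", 0), ("food", 0), ("was", 0), ("terrible", -1),
]
print(sentence_polarity(review))
```

Ignoring neutral words keeps long sentences with a single opinion word from being scored as nearly neutral.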

Sentiment Extraction

These are some of the NLP operations which we can perform using polyglot.


In this article, we saw how polyglot can detect the language of a given text and tokenize it into words and sentences. We also saw how to use it for named entity recognition and sentiment analysis. Polyglot is easy to use and supports a variety of NLP operations.

Himanshu Sharma
An aspiring data scientist currently pursuing an MBA in Applied Data Science, with an interest in the financial markets. I have experience in data analytics, data visualization, machine learning, creating dashboards, and writing articles related to data science.
