Hands-On Tutorial On Polyglot – Python Toolkit For Multilingual NLP Applications

Polyglot is an open-source Python library for performing a range of NLP operations. It is built on NumPy, which helps keep it fast. Its large set of dedicated commands makes it stand out from the crowd.

Natural Language Processing (NLP) is the process of making human language understandable to machines and then performing operations on it to extract useful information. NLP is the part of Artificial Intelligence that deals with the interaction between computers and human language.

There is a large variety of Python libraries that can help us perform NLP tasks. Each library has certain unique features that set it apart from the others. Generally, NLP libraries offer functions such as tokenization, stemming, lemmatization, spell checking, etc.

Polyglot is similar to spaCy and can be used for languages that spaCy does not support.



In this article, we will explore different NLP operations and functions which can be performed using polyglot.


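Like any other Python library, polyglot can be installed with pip, using pip install polyglot. Some features depend on additional packages such as PyICU and pycld2, which may need to be installed separately.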

  1. Importing Required Libraries

We will import polyglot and explore its different functionalities. All functionalities will be imported as and when required.
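Several of the operations below rely on pretrained models that polyglot downloads separately from the library itself. The following is a minimal sketch of fetching the English models. It assumes the package names used in polyglot's documentation (embeddings2.en, pos2.en, ner2.en, sentiment2.en, morph2.en) and a working network connection. The helper name download_models is ours, not part of polyglot.

```python
# Model packages assumed from polyglot's documentation; change the ".en"
# suffix to fetch models for another language.
ENGLISH_MODELS = (
    "embeddings2.en",  # word embeddings used by the POS and NER taggers
    "pos2.en",         # part-of-speech tagger
    "ner2.en",         # named entity recognizer
    "sentiment2.en",   # word polarity lexicon
    "morph2.en",       # morphological segmentation
)


def download_models(packages=ENGLISH_MODELS):
    """Download each model package (hypothetical helper, needs network access)."""
    # Imported lazily so this module loads even if polyglot is not installed.
    from polyglot.downloader import downloader

    for package in packages:
        downloader.download(package)
```

Calling download_models() once before running the examples should be enough, since the models are stored locally after the first download.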

  2. Performing Operations on Data

Before performing any operations, let us first initialize some text to work with.

init = '''Analytics India Magazine chronicles technological progress in the space of  analytics, artificial intelligence, data science & big data by highlighting the innovations, players, and challenges shaping the future of India through promotion and discussion of ideas and thoughts by smart, ardent, action-oriented individuals who want to change the world.'''

  3. Language Detection

Polyglot can identify the language of the text passed to it using the Detector class. Let us see how to use it.

from polyglot.detect import Detector

detect = Detector(init)
print(detect.language)
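The detected language can also be read field by field. The sketch below assumes that the object returned by Detector(...).language exposes name, code, and confidence attributes, with confidence reported as a percentage, as shown in polyglot's documentation. The helper names describe_language and format_language are ours.

```python
def describe_language(text):
    """Return (name, code, confidence) for the most likely language of text."""
    # Imported lazily; polyglot's detector depends on the pycld2 package.
    from polyglot.detect import Detector

    language = Detector(text).language
    return language.name, language.code, language.confidence


def format_language(name, code, confidence):
    """Render a detection result as a single readable line."""
    return f"{name} ({code}), confidence {confidence:.0f}%"
```

With polyglot installed, print(format_language(*describe_language(init))) prints a one-line summary. Very short or mixed-language snippets may not be detected reliably.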
  4. Tokenize

Tokenization splits the text into smaller units. With polyglot we can print the word list, which contains the words in the text, as well as the sentences in the text.

from polyglot.text import Text

text = Text(init)

# Word tokens
print(text.words)

# Sentences
print(text.sentences)

  5. POS Tagging

Part-of-speech tagging identifies the syntactic role of each word in the text.

# Requires the English embeddings and POS models to be downloaded
print(text.pos_tags)

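The tagger output is a list of (word, tag) pairs, so it is easy to post-process. As an illustration, here is a small sketch that groups words by tag. The sample list is hand-written in that shape and is not real polyglot output, and the helper name group_by_tag is ours.

```python
from collections import defaultdict


def group_by_tag(pos_tags):
    """Group words by their part-of-speech tag, preserving word order."""
    groups = defaultdict(list)
    for word, tag in pos_tags:
        groups[tag].append(str(word))
    return dict(groups)


# Hand-written sample in the (word, tag) shape; not actual tagger output.
sample_tags = [
    ("Polyglot", "PROPN"), ("detects", "VERB"), ("languages", "NOUN"),
    ("and", "CONJ"), ("tags", "VERB"), ("words", "NOUN"),
]
print(group_by_tag(sample_tags))
```

With the models in place, group_by_tag(text.pos_tags) should work the same way on real tagger output.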
  6. Named Entity Extraction

Named entity extraction pulls out phrases from plain text that refer to entities such as locations, persons, and organizations.


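A minimal sketch of how the extracted entities might be collected by type. It assumes that Text(...).entities yields chunks that carry a tag attribute (such as I-PER, I-LOC, or I-ORG) and iterate over their words, as in polyglot's documentation, and that the English embeddings and NER models have been downloaded. The helper names collect_entities and extract_entities are ours.

```python
def collect_entities(tagged_chunks):
    """Map each entity tag to the phrases found, given (tag, words) pairs."""
    found = {}
    for tag, words in tagged_chunks:
        found.setdefault(tag, []).append(" ".join(words))
    return found


def extract_entities(raw_text):
    """Run polyglot NER on raw_text and group the resulting phrases by tag."""
    # Imported lazily so the pure helper above works without polyglot.
    from polyglot.text import Text

    chunks = Text(raw_text).entities
    return collect_entities(
        (chunk.tag, [str(word) for word in chunk]) for chunk in chunks
    )
```

For example, collect_entities([("I-PER", ["Himanshu", "Sharma"]), ("I-LOC", ["India"])]) returns a dictionary with one person phrase and one location phrase. That input is hand-written for illustration.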

Let us try this with another piece of text.

init1 = '''Hello my name is Himanshu Sharma and I am from India'''

text = Text(init1)

print(text.entities)


  7. Morphological Analysis

Morphological analysis breaks words into morphemes, their smallest meaningful units, and so exposes the regularities behind word formation in human language. Let us see how to use it.

from polyglot.text import Word

words = ["programming", "parallel", "inevitable", "beautiful"]

for w in words:
    w = Word(w, language="en")
    print(w, w.morphemes)

  8. Sentiment Analysis

Sentiment analysis is used to find the polarity of the text. Polyglot assigns each word a polarity of +1 (positive), 0 (neutral), or -1 (negative).

text = Text("The new economic policies are quite good.")

for w in text.words:
    print(w, w.polarity)

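Word-level polarities can be combined into a single score for the whole text. The sketch below shows one simple way to do this: averaging the non-neutral words. This aggregation rule is an illustrative choice, the helper names mean_polarity and text_polarity are ours, and the English sentiment model is assumed to be downloaded.

```python
def mean_polarity(scores):
    """Average the non-zero word polarities; 0.0 if every word is neutral."""
    nonzero = [score for score in scores if score != 0]
    return sum(nonzero) / len(nonzero) if nonzero else 0.0


def text_polarity(raw_text):
    """Score raw_text from polyglot's per-word polarities."""
    # Imported lazily so mean_polarity can be used without polyglot.
    from polyglot.text import Text

    return mean_polarity(word.polarity for word in Text(raw_text).words)
```

A result above zero suggests an overall positive tone and a result below zero a negative one. Ignoring neutral words keeps long, mostly neutral sentences from being pulled toward zero.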

These are some of the NLP operations which we can perform using polyglot.


In this article, we saw how polyglot can be used to detect the language of a particular text, followed by tokenization into words and sentences. We also saw how to use part-of-speech tagging, named entity recognition, morphological analysis, and sentiment analysis. Polyglot is easy to use and can be used for a variety of NLP operations.


Himanshu Sharma
An aspiring data scientist currently pursuing an MBA in Applied Data Science, with an interest in the financial markets. I have experience in data analytics, data visualization, machine learning, creating dashboards, and writing articles related to data science.
