
Let’s Learn TextBlob Quickstart – A Python Library For Processing Textual Data


Text processing means transforming raw text so that useful information can be extracted from it, using a range of tools and techniques. Before text can be passed to a machine learning model, it has to be processed to extract its important information and numerical features.

TextBlob is an open-source Python library for processing textual data. It supports operations such as noun phrase extraction, sentiment analysis, classification, and translation.

TextBlob is built on top of NLTK and Pattern. It is easy to use and can process text in a few lines of code, which makes it a good starting point for NLP tasks.

In this article, we will explore TextBlob and walk through its major features in a hands-on tutorial.

Implementation:

TextBlob requires certain features from NLTK, so we will start by installing both with pip: pip install nltk and pip install textblob.
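Before moving on, it can help to confirm that both packages are importable in the current environment. The snippet below is a minimal sketch using only the standard library; the helper name check_installed is ours and is not part of NLTK or TextBlob.

```python
from importlib.util import find_spec


def check_installed(packages):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in packages}


status = check_installed(["nltk", "textblob"])
for name, ok in status.items():
    hint = "installed" if ok else "missing, run: pip install " + name
    print(f"{name}: {hint}")
```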

  1. Importing required libraries

We will import both NLTK and TextBlob, and download the NLTK resources that TextBlob depends on.

from textblob import TextBlob

import nltk

nltk.download('punkt')

nltk.download('averaged_perceptron_tagger')

nltk.download('brown')

nltk.download('wordnet')  # WordNet data, needed later by lemmatize()

  2. Text selection for Processing

We can use any text for this text processing tutorial. I have taken an article from today’s newspaper. 

art = '''Among the 10 countries that have reported the highest number of case in the world, daily cases are still continuously rising in only two – India and Colombia.  Other than the US and Brazil, daily cases also appear hitting a plateau in Mexico (7th spot, 480,278 cases). Russia (4th, 892,654 cases), South Africa (5th, 563,598 cases), and Chile (9th, 375,044 cases). The remaining two – Spain (10th, 370,060 cases) and Peru (7th, 483,133 cases) – managed to control outbreaks once, but are now seeing a resurgence of cases. All caseloads are from the worldometers.info dashboard. To be sure, the global Covid-19 curve has flattened twice before — first, when the Chinese outbreak peaked and the contagion was yet to reach the West; the second, when cases dropped in Europe — however, it has risen again with more ferocity both times as the virus has spread to new regions.'''

  3. Text Processing

Before applying any of the text processing techniques, we need to wrap the text in a TextBlob object.

blob = TextBlob(art)

We will start with some basic text processing features: part-of-speech tags and noun phrases.

  • Tags

The tags property returns a list of (word, tag) tuples, where the tag identifies the part of speech of each word: noun, adjective, and so on.

blob.tags
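Because blob.tags is a list of (word, tag) tuples, it can be filtered with ordinary Python. The sketch below keeps only the nouns. The sample_tags list is hand-written for illustration (it is not actual output for the article text), the helper name is ours, and the example assumes Penn Treebank tag names, in which noun tags begin with NN.

```python
def words_with_tag_prefix(tags, prefix):
    """Return the words whose part-of-speech tag starts with `prefix`."""
    return [word for word, tag in tags if tag.startswith(prefix)]


# Hand-written sample in the same (word, tag) shape that blob.tags returns.
sample_tags = [
    ("daily", "JJ"),
    ("cases", "NNS"),
    ("are", "VBP"),
    ("rising", "VBG"),
    ("in", "IN"),
    ("India", "NNP"),
]

nouns = words_with_tag_prefix(sample_tags, "NN")
print(nouns)  # ['cases', 'India']
```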

  • Noun Phrases

The noun_phrases property returns the noun phrases found in the given text.

blob.noun_phrases 

  • Sentiments

The sentiment property returns the polarity and subjectivity of the text. Polarity is a float in the range [-1.0, 1.0], where negative values indicate negative sentiment and positive values indicate positive sentiment. Subjectivity is a float in the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective.

blob.sentiment

We can also read the two values individually with blob.sentiment.polarity and blob.sentiment.subjectivity.

  • Words

The words property splits the text into its individual words.

blob.words

  • Sentences

The sentences property splits the text into the sentences that make it up.

blob.sentences


We can also find the polarity of each individual sentence using the polarity attribute mentioned above.

for sentence in blob.sentences:

    print(sentence.sentiment.polarity)
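A raw polarity score is often easier to read as a label. The helper below is a sketch and not part of TextBlob: the name label_polarity and the 0.1 cut-off are assumptions chosen for illustration, and the scores passed in are made up.

```python
def label_polarity(score, threshold=0.1):
    """Map a polarity score in [-1, 1] to 'positive', 'negative' or 'neutral'."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Made-up scores standing in for sentence.sentiment.polarity values.
example_scores = [0.35, -0.4, 0.0, 0.05]
labels = [label_polarity(s) for s in example_scores]
print(labels)  # ['positive', 'negative', 'neutral', 'neutral']
```

In the loop above, the same helper could be applied to each sentence.sentiment.polarity value; a suitable threshold depends on the data.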

  • Singularize & Pluralize words

We can pick words from our text and convert them to singular or plural form with singularize() and pluralize(). The same works for any word we pass in ourselves.

word_text = blob.words

word_text[3]

word_text[3].singularize()

word_text[4].pluralize()

  • Lemmatize

The lemmatize() method returns the lemma (base form) of a word. It treats the word as a noun unless a part-of-speech argument is passed.

word_text[3].lemmatize()

  • Spell Check

The correct() method returns a copy of the text with spelling mistakes corrected, while Word.spellcheck() returns a list of (word, confidence) tuples with suggested spellings for a single word.

sent = TextBlob("Among the 10 countries that have reported the highest number  of case in the world")

print(sent.correct())

from textblob import Word

w = Word('amog')

w.spellcheck()
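Since spellcheck() returns (word, confidence) tuples, picking a correction comes down to taking the highest-scoring entry. The sketch below shows one way to do that; the helper best_suggestion, the 0.5 threshold, and the sample candidates are all hypothetical and are not real spellcheck output.

```python
def best_suggestion(candidates, min_confidence=0.5):
    """Return the highest-confidence word, or None if nothing clears the bar."""
    if not candidates:
        return None
    word, confidence = max(candidates, key=lambda pair: pair[1])
    return word if confidence >= min_confidence else None


# Hand-written candidates in the shape returned by Word.spellcheck().
sample = [("among", 0.8), ("amok", 0.15), ("amos", 0.05)]

print(best_suggestion(sample))                      # among
print(best_suggestion(sample, min_confidence=0.9))  # None
```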

  • Parsing Text

By default, TextBlob uses Pattern's parser. We will parse our text using the parse() method.

blob.parse()

  • N-Grams

The ngrams() method returns a list of WordList objects, each holding n successive words from the text. Pass n to choose how many words each n-gram contains.

blob.ngrams(n=5)
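To make the sliding window concrete, here is a pure-Python sketch of the idea behind n-grams. It is not TextBlob's implementation, the function name sliding_ngrams is ours, and it returns plain tuples rather than TextBlob objects; it only illustrates how n consecutive words are grouped.

```python
def sliding_ngrams(words, n):
    """Return every run of n consecutive items from `words` as a tuple."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


tokens = ["daily", "cases", "are", "still", "rising"]
trigrams = sliding_ngrams(tokens, 3)
print(trigrams)
# [('daily', 'cases', 'are'), ('cases', 'are', 'still'), ('are', 'still', 'rising')]
```

A list of k words yields k - n + 1 n-grams, and an empty list when the text has fewer than n words.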


These are some of the text processing functions TextBlob provides. It is a convenient choice for text processing because it is easy to use and ships with many predefined functions.

Conclusion:

In this article, we learned about TextBlob and how it is used for text processing. TextBlob provides a wide variety of functions for extracting properties of textual data, which helps turn raw text into a form that can be passed to a machine learning model.


Himanshu Sharma

An aspiring data scientist currently pursuing an MBA in Applied Data Science, with an interest in the financial markets. I have experience in data analytics, data visualization, machine learning, creating dashboards, and writing articles related to data science.