In natural language processing (NLP), we deal with large volumes of textual data created every second on the internet. NLP offers several techniques for understanding such data, including text classification, sentiment analysis and part-of-speech (POS) tagging. Named Entity Recognition (NER), also called entity identification, is another such technique: it assigns words or phrases in a text to predefined categories such as Organization, Place and Person.
In this article, we will explore what NER means, what it is used for, and how it assigns words to predefined categories. We will first define a few sample texts and then find the entities in them using a pre-trained model from the spaCy library.
What will you learn from this article?
- What is Named Entity Recognition?
- What are the different use cases for NER?
- How is NER used to categorize different words?
When we read a text, we automatically recognize which words refer to a person, a place, an organization and so on. NER works in a similar fashion. It is used to analyze large volumes of unstructured data such as emails and Twitter feeds, and it helps extract information from text quickly. With the help of NER, machines can understand what a piece of text is about. It is also a proven technique for initial text classification.
Let us see how NER works. Consider the following piece of text.
Text: I work for Facebook. I have my own home in the US and I earn 5 lacs in a month.
In the above text, Facebook is an organization, the US is a country, and the other words are categorized by the machine in a similar way. NER has various use cases, such as categorizing support tickets and powering recommendations. Now we will see how this actually works.
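To make the idea of predefined categories concrete, here is a minimal sketch that tags the example sentence using a hand-written lookup table. It only illustrates what goes into and comes out of NER. The names `GAZETTEER` and `tag_entities` are hypothetical, and real systems such as the spaCy model used later in this article rely on statistical models rather than a fixed dictionary.

```python
import re

# Hypothetical lookup table mapping known names to entity labels.
GAZETTEER = {
    "Facebook": "ORG",  # organization
    "US": "GPE",        # geopolitical entity (country, city, state)
}

def tag_entities(text):
    """Return (word, label) pairs for the words found in the lookup table."""
    words = re.findall(r"\w+", text)
    return [(w, GAZETTEER[w]) for w in words if w in GAZETTEER]

text = "I work for Facebook. I have my own home in the US and I earn 5 lacs in a month."
print(tag_entities(text))
# [('Facebook', 'ORG'), ('US', 'GPE')]
```

A dictionary like this cannot handle names it has never seen or words whose category depends on context, which is why pre-trained models are used in practice.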
We first need to install spaCy and load its pre-trained English model for NER. Use the code below.
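import spacy
from spacy import displacy
# Load the small English pipeline; the 'en' shortcut no longer works in spaCy 3 (install with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")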
Now we will define the text in which we want to find entities. We will take a sample text and compute its entities using this model. Use the code below.
text1 = nlp("Delhi is the capital of India. Delhi has a population of 1.3 crore. Arvind Kejriwal is the Chief Minister of Delhi")
# Print each entity found in text1 together with its predicted label
for word in text1.ents:
    print(word.text, word.label_)
text2 = nlp("I work for Facebook. I have my own home in US and I earn 5 lacs in a month.")
If you do not understand an entity label predicted by the model, pass the label to the spacy.explain function to get a short description of it, for example for 'ORG'.
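As a small sketch of that lookup, the snippet below asks spaCy for the descriptions of a few common labels. `spacy.explain` reads from spaCy's built-in glossary, so it should only need the spaCy package itself and not a downloaded model. The choice of labels and the variable name `descriptions` are only for illustration.

```python
import spacy

# Look up a short human-readable description for each entity label.
labels = ["ORG", "GPE", "PERSON"]
descriptions = {label: spacy.explain(label) for label in labels}

for label, meaning in descriptions.items():
    print(label, "->", meaning)
```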
Now we will look at a different way of visualizing the entities, using the displacy module imported earlier.
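A minimal sketch of entity visualization with displacy follows. With the pre-trained pipeline you would pass a document such as text1 directly to `displacy.render`. To keep this example runnable without a model download, it assumes a blank English pipeline and assigns the two "GPE" labels by hand, so the labels here are not model predictions.

```python
import spacy
from spacy import displacy
from spacy.tokens import Span

# Blank pipeline: tokenizes only, no trained NER component (assumption for this sketch).
nlp = spacy.blank("en")
doc = nlp("Delhi is the capital of India.")

# Hand-assigned entities: token 0 is "Delhi", token 5 is "India".
doc.ents = [Span(doc, 0, 1, label="GPE"), Span(doc, 5, 6, label="GPE")]

# jupyter=False returns the HTML markup as a string instead of displaying it.
html = displacy.render(doc, style="ent", jupyter=False)
print(html)
```

Inside a Jupyter notebook, passing `jupyter=True` displays the highlighted text inline instead of returning the markup.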
The 'ent' style highlights each entity in the text together with its label. For example, Delhi is highlighted as "GPE", and the other entities are labelled similarly. Now we will look at another style, "dep", and compute the output for both text1 and text2.
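The "dep" style can be sketched in the same way. With the pre-trained pipeline you would call `displacy.render(text1, style="dep")` and likewise for text2. The example below instead builds a tiny document by hand, assuming spaCy 3.x, where the `heads` argument of `Doc` takes absolute token positions. The part-of-speech tags and dependency labels are written by hand for illustration and are not the output of a parser.

```python
import spacy
from spacy import displacy
from spacy.tokens import Doc

nlp = spacy.blank("en")  # used only for its vocabulary

words = ["I", "work", "for", "Facebook"]
pos = ["PRON", "VERB", "ADP", "PROPN"]    # hand-written part-of-speech tags
heads = [1, 1, 1, 2]                      # index of each word's head; the root points to itself
deps = ["nsubj", "ROOT", "prep", "pobj"]  # hand-written dependency labels

doc = Doc(nlp.vocab, words=words, pos=pos, heads=heads, deps=deps)

# jupyter=False returns the SVG markup as a string instead of displaying it.
svg = displacy.render(doc, style="dep", jupyter=False)
print(svg[:200])
```

The returned string is an SVG drawing with one arc per dependency, which can be saved to a file and opened in a browser.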
The "dep" style breaks the text into sentences and, for every word in each sentence, shows its part of speech and how the words are related to each other, as shown in the above images.
There are different open-source tools for these sorts of NLP tasks, including the Stanford Named Entity Recognizer, spaCy and the Natural Language Toolkit (NLTK). We can also train custom models for named entity recognition. Open-source labelled datasets are available that can be used for creating your own NER model.
Through this article, we explored how named entity recognition can help in analyzing textual data. We used a pre-trained model from the spaCy library and categorized words into different entities. With the rapid advances in NLP, machines are getting better at processing large volumes of textual data, which enables use cases such as machine translation and text summarization.