How to perform Named Entity Recognition (NER) using a transformer?

This article aims to make NER easy and reliable to perform. Given the success of transformers in machine learning, a transformer is a dependable choice for the task.

Named Entity Recognition (NER) is a core step in many NLP pipelines, and achieving state-of-the-art performance on it is difficult. Pre-trained transformers have delivered strong results across a wide range of NLP tasks, and NER is no exception. In this article, we will discuss how to use a BERT transformer to perform NER in a very easy way. The major points to be discussed in the article are listed below.

Table of contents 

  1. What is Named Entity Recognition (NER)?
  2. What is BERT?
  3. Applying BERT for NER

Let’s start by introducing Named Entity Recognition (NER).

What is Named Entity Recognition (NER)?

As covered in one of our previous articles, named entities are the words or phrases in a text that refer to real-world objects. Examples include the name of a person, place, or thing. Each named entity has its own identity in the corpus. Virat Kohli, Delhi, and Lenovo laptop are examples of named entities that could appear in text data.

Named entities fall into different classes: Virat Kohli is the name of a person, and Lenovo is the name of a company. Named Entity Recognition is the process of locating such entities in text and assigning each one its class. Traditionally, NER has mostly been performed with libraries such as spaCy and NLTK.
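To make the input and output of the task concrete, here is a toy sketch that tags entities by plain dictionary lookup. It is not how spaCy, NLTK, or BERT work internally; the GAZETTEER table, the toy_ner function, and the example sentence are all hypothetical and exist only for illustration.

```python
# Toy illustration of the NER task: find entity mentions and assign a class.
# GAZETTEER is a hypothetical hand-made lookup table, not a trained model.
GAZETTEER = {
    "Virat Kohli": "PER",
    "Delhi": "LOC",
    "Lenovo": "ORG",
}

def toy_ner(text):
    """Return (mention, class, start, end) for every gazetteer hit in text."""
    found = []
    for mention, label in GAZETTEER.items():
        start = text.find(mention)
        while start != -1:
            found.append((mention, label, start, start + len(mention)))
            start = text.find(mention, start + 1)
    # Report hits in the order they appear in the text.
    return sorted(found, key=lambda hit: hit[2])

sentence = "Virat Kohli bought a Lenovo laptop in Delhi"
for mention, label, start, end in toy_ner(sentence):
    print(f"{mention!r} -> {label} (characters {start}-{end})")
```

A lookup table like this cannot handle names it has never seen, which is one reason learned models such as BERT are attractive for NER.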

NER has a variety of applications in natural language processing. For example, it is used for summarizing information from documents, search engine optimization, content recommendation, and identifying entities and processes in biomedical text.

In this article, we aim to make the implementation of NER easy by using a transformer like BERT. Since the implementation relies on BERT, we first need to know what BERT is, which the next section explains.

What is BERT?

In one of the previous articles, we had a detailed introduction to BERT. BERT stands for Bidirectional Encoder Representations from Transformers and is one of the best-known transformers in NLP. Like many other transformers, it is pre-trained. It learns deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context, and it was pre-trained on English Wikipedia (2,500M words) and BooksCorpus (800M words).

BERT comes in two pre-trained variants, BERT BASE and BERT LARGE. BERT BASE has 12 encoder layers, 12 attention heads, and a hidden size of 768. BERT LARGE has 24 encoder layers, 16 attention heads, and a hidden size of 1024. Our beginner’s guide to using BERT for text classification is a useful companion to this article. Next, we will use the BERT model for NER. Let’s see how we can do this.
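As a rough sanity check on how the two variants differ in size, the sketch below estimates the number of parameters in the encoder from the layer count and hidden size. The vocabulary size (30522), maximum sequence length (512), two segment types, and a feed-forward width of four times the hidden size are assumptions taken from the original uncased checkpoints; bert_param_count is a hypothetical helper, not part of any library. The number of attention heads does not appear because it only changes how the hidden size is split, not the parameter count.

```python
def bert_param_count(layers, hidden, vocab_size=30522, max_positions=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder (no task-specific head).

    vocab_size, max_positions and type_vocab are assumed values; the
    feed-forward width is assumed to be 4 * hidden.
    """
    ffn = 4 * hidden
    layer_norm = 2 * hidden  # one weight and one bias vector
    # Token, position and segment embeddings, followed by a LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections (weights + biases), plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> ffn -> hidden), plus LayerNorm.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden) + layer_norm
    # Dense pooling layer applied to the first token.
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler

print(f"BERT BASE:  {bert_param_count(12, 768) / 1e6:.1f}M parameters")
print(f"BERT LARGE: {bert_param_count(24, 1024) / 1e6:.1f}M parameters")
```

Under these assumptions the estimates land close to the commonly quoted sizes of about 110M parameters for BERT BASE and 340M for BERT LARGE.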

Applying BERT for NER

BERT is one of the most successful transformers across NLP tasks, which makes it a sensible choice for NER. We start by installing the transformers library, which gives access to many pre-trained transformer models.


The transformers library can be installed with the following command (the leading ! runs it as a shell command from a notebook cell).

!pip install transformers


After installation, we are ready to use the BERT transformer for Named Entity Recognition.

Importing libraries

In this implementation, we are going to use modules only from the transformers library.

from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

From the library, we have imported an auto tokenizer for tokenizing the text, a model class for token classification, and the pipeline helper that ties the two together.
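BERT tokenizers split text into WordPiece subwords, which is why the NER pipeline can return pieces of a name (marked with a ## prefix) rather than whole words. The sketch below illustrates the greedy longest-match-first idea behind WordPiece on a tiny vocabulary. The wordpiece function and TOY_VOCAB are hypothetical; the real tokenizer uses a vocabulary of roughly 30,000 entries and will generally split words differently.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece fits, so the whole word is treated as unknown
        pieces.append(piece)
        start = end
    return pieces

# Hypothetical mini vocabulary, chosen only to make the example work.
TOY_VOCAB = {"my", "name", "is", "india", "yu", "##ges", "##h"}

print(wordpiece("india", TOY_VOCAB))
print(wordpiece("yugesh", TOY_VOCAB))
```

With this toy vocabulary, a common word stays whole while a rarer name is broken into several pieces, each of which later receives its own entity label.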

Instantiation of BERT

In this implementation, we are going to use bert-base-NER, a BERT model fine-tuned for Named Entity Recognition. The model is reported to achieve state-of-the-art performance on the NER task. Like BERT itself, it is available in a base and a large variant, as discussed above. We can instantiate the model using the following lines of code.

tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")


After instantiation, we are ready to use bert-base-NER. 

Defining pipeline and example

Here we will use the pipeline function of the transformers library to define a pipeline through which we can pass our data.

NER = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is yugesh and I live in India"

In the code above, we have defined a pipeline and an example sentence for testing it.

Running the pipeline

With the steps above complete, we are ready to run the pipeline on the example.

results = NER(example)
print(results)  # one dictionary per token recognized as part of an entity


The pipeline returns a list with one entry for each token recognized as part of a named entity, giving the predicted class together with its score, a probability between 0 and 1. The model uses the following label abbreviations:

O: Outside of a named entity
B-MISC: Beginning of a miscellaneous entity right after another miscellaneous entity
I-MISC: Miscellaneous entity
B-PER: Beginning of a person’s name right after another person’s name
I-PER: Person’s name
B-ORG: Beginning of an organization right after another organization
I-ORG: Organization
B-LOC: Beginning of a location right after another location
I-LOC: Location
Final words

In this article, we went through a short introduction to Named Entity Recognition and the BERT model. Through a hands-on implementation, we saw how a transformer can be used for named entity recognition easily and in just a few steps.



Yugesh Verma
Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
