
A complete tutorial on zero-shot text classification


Almost all text classification models require a large amount of labelled data, and labelling data is a difficult task that demands considerable effort and processing. To avoid it, we can utilise zero-shot learning, which aims to perform modelling with little or no labelled data. When this kind of learning is applied to text classification, we call the whole process zero-shot text classification. In this article, we are going to discuss zero-shot text classification and the ways in which we can perform it. The major points to be discussed are listed below.

Table of contents

  1. What is zero-shot learning?
  2. Zero-shot text classification
  3. Implementation using transformers  
    1. Bart-large-mnli
    2. Cross-Encoder
    3. Navteca bart-large-mnli
  4. Implementation using Flair

Let’s start with understanding zero-shot learning.

What is zero-shot learning?

In one of our earlier articles, we discussed how models based on zero-shot learning do not require a huge amount of labelled data: their recognition of unseen classes relies on knowledge that is semantically related to the classes seen during training. Humans can perform zero-shot learning naturally; using existing knowledge about an unseen class, we can relate it to seen classes and recognize it.

In many cases, we find zero-shot learning used in recognition modelling, where it lets a model learn about unseen classes that were not labelled at training time. We can also say that this type of learning predicts new classes by learning intermediate semantic layers and their attributes. The article mentioned above gives more information about zero-shot learning along with an implementation. In this article, our main focus is zero-shot text classification. Let’s see what it is.

What is zero-shot text classification?

As we know, text classification is a natural language processing task in which a model predicts the classes of text documents. In the traditional process, we need a huge amount of labelled data to train the model, and the trained model cannot predict classes it has never seen. Adding zero-shot learning to text classification removes this limitation.

The main goal of any zero-shot text classification model is to classify text documents without using a single labelled example, and without ever having seen labelled text for the target classes. We mainly find implementations of zero-shot classification among transformer models; on the Hugging Face hub, there are more than 60 transformers built for zero-shot classification.
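Under the hood, most of these models reframe classification as natural language inference (NLI): the input document is treated as the premise, and each candidate label is wrapped into a hypothesis such as "This example is about sports."; an NLI model’s entailment score then serves as the label’s score. The sketch below is a purely illustrative helper (not a library function) showing this reframing:

def build_nli_pairs(text, candidate_labels, template="This example is about {}."):
    # Each (premise, hypothesis) pair is scored by an NLI model;
    # the entailment probability becomes that label's score.
    return [(text, template.format(label)) for label in candidate_labels]

print(build_nli_pairs("I love this movie", ["positive", "negative"]))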

A related idea that comes to mind when we talk about zero-shot text classification is few-shot classification, which is similar except that it uses a very small number of labelled samples during training. We find few-shot classification methods at OpenAI, where GPT-3 is a well-known few-shot classifier.

We can also utilise the Flair library for zero-shot classification; under this package we can use various transformers for NLP procedures such as named entity recognition, text tagging and text embedding. Flair provides a TARSClassifier for zero-shot classification.

In this article, we are going to discuss how we can perform zero-shot text classification using Hugging Face transformers and Flair’s TARSClassifier in Python.

Implementation using transformers 

As discussed above, there are more than 60 transformers available for performing zero-shot classification, and we cannot cover them all. In this article, we are going to implement some of the most downloaded transformers for zero-shot text classification.

Before implementing these transformers, we are required to install the transformers library in the environment. This can be done using the following line of code.

!pip install transformers

After installation, we are ready to use transformers.

Bart-large-mnli

This transformer was developed by researchers at Facebook and can be considered an upgraded version of the bart-large model, fine-tuned on the MultiNLI (MNLI) dataset.

In Hugging Face, we get the pipeline module, which we can instantiate with the "zero-shot-classification" task to perform zero-shot classification.

Let’s try this transformer.

import transformers

classifier = transformers.pipeline("zero-shot-classification",
                                   model="facebook/bart-large-mnli")

Running this line downloads the model weights and configuration on first use.

Let’s perform an operation.

sequence = "I can write an article"

labels = ["writing", "management", "checking"]

classifier(sequence, labels)

Output: the classifier returns a dictionary containing the input sequence, the candidate labels sorted from most to least likely, and the corresponding scores.
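The exact numbers depend on the model version, but the result has roughly this shape (the scores below are placeholders, not real outputs):

# Illustrative result structure; scores are placeholders
# {'sequence': 'I can write an article',
#  'labels': ['writing', 'checking', 'management'],
#  'scores': [0.62, 0.25, 0.13]}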

We can also use this transformer for multi-label classification and within a PyTorch workflow. Detailed information about the transformer can be found on its Hugging Face model page.
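When the classes are not mutually exclusive, the pipeline also accepts a multi_label argument; each label is then scored independently against the text, so the scores no longer sum to one:

classifier(sequence, labels, multi_label=True)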

Cross-Encoder

This transformer is also part of the Hugging Face model hub; it is trained on the SNLI and MNLI datasets and can be utilized for cross-encoding tasks as well as zero-shot text classification.

Let’s try this transformer.

Defining pipeline:

classifier1 = transformers.pipeline("zero-shot-classification",
                                    model="cross-encoder/nli-distilroberta-base")

Performing zero-shot classification:

classifier1(sequence, labels)

Output: as before, the candidate labels are returned sorted by score. More details about this transformer can be found on its Hugging Face model page.
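Because this checkpoint is an NLI cross-encoder, it can also be used directly through the sentence-transformers library to score premise/hypothesis pairs (a minimal sketch, assuming sentence-transformers is installed; the mapping of output columns to NLI classes should be checked on the model card):

from sentence_transformers import CrossEncoder

# Load the same checkpoint as a cross-encoder
cross_encoder = CrossEncoder('cross-encoder/nli-distilroberta-base')

# Score a (premise, hypothesis) pair; one logit per NLI class is returned
scores = cross_encoder.predict([('I can write an article', 'This example is about writing.')])

print(scores)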

Navteca bart-large-mnli

This model, published under the navteca namespace, is based on the bart-large transformer and trained on the MNLI dataset; it is packaged specifically for zero-shot text classification. Let’s try this model.

Defining pipeline:

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load model & tokenizer

bart_model = AutoModelForSequenceClassification.from_pretrained('navteca/bart-large-mnli')

bart_tokenizer = AutoTokenizer.from_pretrained('navteca/bart-large-mnli')

# Get predictions

nlp = pipeline('zero-shot-classification', model=bart_model, tokenizer=bart_tokenizer)

Performing zero-shot classification:

sequence = "I can write an article"

labels = ["writing", "management", "checking"]

nlp(sequence, labels)

Output: the result follows the same dictionary format as before, with the labels sorted by predicted score.
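The zero-shot pipeline also lets us control how each label is turned into an NLI hypothesis through the hypothesis_template argument (the default template is "This example is {}."); the template below is just an illustration:

nlp(sequence, labels, hypothesis_template="This text is about {}.")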

Here we have seen three of the most popular transformers for zero-shot text classification.

Implementation using Flair

As discussed above, we can also use the Flair library for zero-shot text classification. Under this library we have the TARSClassifier, a model based on TARS (Task-Aware Representation of Sentences) that is specially designed for zero-shot text classification.

We can install Flair using the following line of code:

!pip install flair

After installation, we are ready to perform zero-shot text classification. 

Importing model:

from flair.models import TARSClassifier

classifier2 = TARSClassifier.load('tars-base')

Output: the TARS base model is downloaded on first use.

Defining sentence:

from flair.data import Sentence

sentence = Sentence("I am so glad to use Flair")

Defining classes:

classes = ["happy", "sad"]

Generating prediction:

classifier2.predict_zero_shot(sentence, classes)

print(sentence)

Output: printing the sentence shows the predicted label attached to it, along with a confidence score.
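The exact score will vary, and the printed format differs slightly across Flair versions, but the output looks roughly like this (illustrative):

# Illustrative output; the score is a placeholder
# Sentence: "I am so glad to use Flair" → happy (0.97)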

Here we can see that the model predicted the correct label. We can also use Flair for various other NLP tasks; more information is available in the Flair documentation.

Final words

In this article, we discussed zero-shot learning and zero-shot text classification. Along with this, we covered ways to implement zero-shot text classification using Hugging Face transformers and Flair. All of these approaches provided very good results.

