
What Is Contrastive Learning?


The recent success of self-supervised models can be attributed to researchers' renewed interest in exploring contrastive learning, a paradigm of self-supervised learning. Consider, for instance, how humans can identify objects in the wild even if we do not recollect exactly what the object looks like.

We do this by remembering high-level features and ignoring the details at the microscopic level. So, the question is: can we build representation learning algorithms that do not concentrate on pixel-level details and only encode high-level features sufficient to distinguish different objects? With contrastive learning, researchers are trying to address exactly this question.

Recently, Google’s SimCLR demonstrated the potential of contrastive learning; we will briefly go into it at the end of this article.

Principle Of Contrastive Learning

(Image via Ankesh Anand)

Contrastive learning is an approach that frames the task of finding similar and dissimilar things for an ML model. Using this approach, one can train a machine learning model to distinguish between similar and dissimilar images.

The inner workings of contrastive learning can be formulated as a score function, which is a metric that measures the similarity between two features. For an encoder f, the goal is to learn representations such that

score(f(x), f(x+)) ≫ score(f(x), f(x−))

Here

x+ is a data point similar to x, referred to as a positive sample

x− is a data point dissimilar to x, referred to as a negative sample

On top of this score function, a softmax classifier can be built that classifies positive and negative samples correctly, as shown in the sketch below. A similar application of this technique can be found in the recently introduced SimCLR framework.
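To make this concrete, the softmax-over-scores idea can be written in a few lines of PyTorch. This is a minimal illustration under the definitions above, not code from any particular library: the positive sample is treated as the correct "class" among one positive and n negatives, and a cross-entropy loss is applied over the similarity scores.

```python
import torch
import torch.nn.functional as F

def contrastive_softmax_loss(anchor, positive, negatives):
    """Softmax classifier over similarity scores (illustrative sketch).

    anchor:    (d,) feature f(x)
    positive:  (d,) feature f(x+), the positive sample
    negatives: (n, d) features f(x−) for n negative samples
    """
    # Score function: dot product between L2-normalised features,
    # i.e. cosine similarity.
    anchor = F.normalize(anchor, dim=0)
    pos_score = anchor @ F.normalize(positive, dim=0)      # scalar
    neg_scores = F.normalize(negatives, dim=1) @ anchor    # (n,)
    # The positive is class 0 among (1 + n) candidates.
    logits = torch.cat([pos_score.view(1), neg_scores]).unsqueeze(0)
    return F.cross_entropy(logits, torch.zeros(1, dtype=torch.long))
```

Minimising this loss pushes score(f(x), f(x+)) above the negative scores, which is exactly the inequality stated above.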

Applying Contrastive Learning

(Image via Google AI)

Google has introduced a framework called “SimCLR” that uses contrastive learning. The framework first learns generic representations of images on an unlabelled dataset and then fine-tunes them with a small dataset of labelled images for a given classification task.

The basic representations are learned by simultaneously maximising agreement between different augmented views of the same image and minimising agreement between views of different images, using contrastive learning.

Updating the parameters of a neural network using this contrastive objective causes the representations of corresponding views to “attract” each other, while representations of non-corresponding views “repel” each other.

A finer explanation of the original paper was given in this blog.

The procedure is as follows:

  1. First, generate batches of a certain size, say N, from the raw images.
  2. For each image in the batch, apply a random transformation function to get a pair of two augmented images.
  3. Pass each augmented image in a pair through an encoder to get its image representation.
  4. Pass the representations of the two augmented images through a projection head: a dense layer followed by a ReLU non-linearity, followed by another dense layer. This series of layers applies a non-linear transformation and projects the representations into the space where the contrastive loss is applied.
  5. The result is an embedding vector for each augmented image in the batch (the whole pipeline is sketched in code after this list).
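The steps above can be sketched in PyTorch. This is a minimal illustration, with a ResNet-18 stand-in for the ResNet-50 encoder used in the paper and a simplified augmentation pipeline:

```python
import torch
import torch.nn as nn
import torchvision.transforms as T
from torchvision.models import resnet18

# Step 2: a random transformation function that produces augmented views.
augment = T.Compose([
    T.RandomResizedCrop(224),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.RandomGrayscale(p=0.2),
])

class SimCLRSketch(nn.Module):
    """Encoder plus projection head (steps 3-5), an illustrative sketch."""

    def __init__(self, proj_dim=128):
        super().__init__()
        self.encoder = resnet18()           # stand-in for the paper's ResNet-50
        feat_dim = self.encoder.fc.in_features
        self.encoder.fc = nn.Identity()     # keep the raw image representation
        # Projection head: dense layer -> ReLU -> dense layer.
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, proj_dim),
        )

    def forward(self, x):
        h = self.encoder(x)     # image representation (step 3)
        z = self.projector(h)   # embedding vector for the loss (steps 4-5)
        return z

# Steps 1-2: a batch of N raw images, each augmented twice.
images = torch.rand(8, 3, 224, 224)  # dummy batch, N = 8
view1 = torch.stack([augment(img) for img in images])
view2 = torch.stack([augment(img) for img in images])

model = SimCLRSketch()
z1, z2 = model(view1), model(view2)  # one embedding per augmented image
```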

Now, the similarity between the two augmented versions of an image is calculated using cosine similarity. SimCLR uses the “NT-Xent loss” (Normalised Temperature-Scaled Cross-Entropy Loss), a form of contrastive loss.
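For reference, cosine similarity between two projected embeddings z_i and z_j is simply their dot product after L2 normalisation; a small helper along these lines underlies the loss sketch further below:

```python
import torch.nn.functional as F

def cos_sim(z_i, z_j):
    # sim(z_i, z_j) = z_i · z_j / (‖z_i‖ ‖z_j‖)
    return (F.normalize(z_i, dim=-1) * F.normalize(z_j, dim=-1)).sum(-1)
```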

(Image via amitness)

First, the augmented pairs in the batch are taken one by one. Then, a softmax function is applied to obtain the probability of the two images being similar.

(Image via amitness)

As shown above, the softmax function can be used to calculate how similar the two augmented cat images are, while all the remaining images in the batch are sampled as dissimilar images (negative pairs).
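Putting the pieces together, here is a simplified sketch of the NT-Xent loss (illustrative, not the official SimCLR implementation): every row of the similarity matrix is treated as a softmax classification problem in which the other view of the same image is the correct answer and the remaining 2N − 2 images act as negatives.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for a batch of N augmented pairs (sketch).

    z1, z2: (N, d) projection-head outputs of the two views.
    """
    N = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d)
    sim = z @ z.T / temperature                   # cosine similarities, scaled
    sim.fill_diagonal_(float('-inf'))             # a view never pairs with itself
    # Row i's positive sits N positions away; all other entries are negatives.
    targets = torch.cat([torch.arange(N, 2 * N), torch.arange(N)])
    return F.cross_entropy(sim, targets)
```

The temperature parameter scales the similarities before the softmax and is a key hyperparameter in SimCLR.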

Based on the loss, the encoder and projection head representations improve over time, and the representations obtained place similar images closer together in the space.

The results from SimCLR showed that it outperformed previous self-supervised methods on ImageNet. 

To know more about this topic, check this and this.
