Thanks to the application of contrastive learning, self-supervised representation learning has advanced significantly in recent years. It has emerged as a widespread technique among researchers and developers alike for various image and video tasks. Contrastive learning teaches a model to pull together the representations of a target image (also known as the ‘anchor’) and a matching (positive) image in embedding space, while pushing apart the representations of non-matching (negative) images. It is applied in both supervised and self-supervised settings.
For instance, in supervised learning (which uses labelled data), positives can be generated from existing same-class examples, providing more variability in pre-training than simply augmenting the anchor. In self-supervised learning (no labelled data), on the other hand, the positive is often an augmentation of the anchor, and the negatives are the other samples in the training minibatch.
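To make the anchor/positive/negative relationship concrete, here is a minimal sketch of an InfoNCE-style contrastive loss in PyTorch; the batch layout, temperature, and function name are illustrative assumptions, not tied to any particular framework discussed below.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchors, positives, temperature=0.1):
    """Contrastive loss sketch: pull each anchor towards its positive and
    push it away from every other sample in the batch (the negatives)."""
    # Normalise embeddings so the dot product equals cosine similarity.
    anchors = F.normalize(anchors, dim=1)      # (N, D)
    positives = F.normalize(positives, dim=1)  # (N, D)

    # Similarity of every anchor to every positive in the batch.
    logits = anchors @ positives.t() / temperature  # (N, N)

    # The matching positive sits on the diagonal; all off-diagonal entries
    # act as negatives drawn from the same minibatch.
    targets = torch.arange(anchors.size(0))
    return F.cross_entropy(logits, targets)

# Example: 8 anchor/positive pairs with 128-dimensional embeddings.
loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))
```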
Because negatives are sampled randomly in self-supervised learning, false negatives can degrade representation quality. Several methods have been proposed recently to tackle this issue, and researchers worldwide are now developing new frameworks, libraries, and tools to fast-track video and image understanding.
Here are some of the top frameworks for contrastive learning.
SimCLR
Developed by Google, SimCLR is an open-source framework for advancing self-supervised and semi-supervised models for image analysis. This method not only simplifies but also improves previous approaches to self-supervised representation learning on images. For example, it significantly advances the SOTA on self-supervised and semi-supervised learning. Furthermore, it achieves a new record for image classification with a limited amount of class-labelled data – i.e. 85.8 per cent top-5 accuracy using 1 per cent of labelled images on the ImageNet dataset.
The simplicity of this approach means it can be easily incorporated into existing supervised learning pipelines. Google has released the latest version, SimCLRv2, introduced in the research paper ‘Big Self-Supervised Models Are Strong Semi-Supervised Learners’. The semi-supervised learning algorithm proposed in this study can be summarised in three steps:
- Unsupervised pre-training of a big ResNet model using SimCLRv2
- Supervised fine-tuning on a few labelled examples
- Distillation with unlabelled examples for refining and transferring the task-specific knowledge (a minimal sketch of this step follows)
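To give a rough feel for the third step, the sketch below distils a fine-tuned teacher into a smaller student by matching softened predictions on unlabelled images; the temperature, shapes, and model names are illustrative assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Step 3 (sketch): the student matches the teacher's softened
    class distribution on unlabelled images."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=1)
    # Cross-entropy between the teacher and student distributions.
    return -(teacher_probs * student_log_probs).sum(dim=1).mean()

# Illustrative shapes: a batch of 32 unlabelled images, 1000 classes.
teacher_logits = torch.randn(32, 1000)  # from the fine-tuned (big) teacher model
student_logits = torch.randn(32, 1000)  # from the smaller student model
loss = distillation_loss(teacher_logits, student_logits)
```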
The source code of SimCLRv2 architecture is available on GitHub. Also, check out the PyTorch version of SimCLR here.
Lightly
Lightly is a computer vision framework for self-supervised learning. With this, you can train deep learning models using self-supervision. In other words, you do not require any labels to train a model. The framework has been built to help you understand and work with large unlabelled datasets. Built on top of PyTorch, Lightly is fully compatible with other frameworks such as Fast.ai.
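As an illustrative sketch, the snippet below wires a torchvision backbone to Lightly's NT-Xent loss. The module names (NTXentLoss, SimCLRProjectionHead) reflect the library's documented API at the time of writing, and the augmentation pipeline is omitted for brevity, so treat this as an assumption-laden outline rather than a complete recipe.

```python
import torch
import torchvision
from lightly.loss import NTXentLoss                       # assumed import path
from lightly.models.modules import SimCLRProjectionHead   # assumed import path

# Backbone: a ResNet-18 with its classification head removed.
resnet = torchvision.models.resnet18()
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])
projection_head = SimCLRProjectionHead(512, 512, 128)
criterion = NTXentLoss()

def training_step(view_0, view_1):
    """One self-supervised step on two augmented views of the same images."""
    z0 = projection_head(backbone(view_0).flatten(start_dim=1))
    z1 = projection_head(backbone(view_1).flatten(start_dim=1))
    return criterion(z0, z1)

# Illustrative call with random tensors standing in for augmented batches.
loss = training_step(torch.randn(4, 3, 224, 224), torch.randn(4, 3, 224, 224))
```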
In the latest update, Lightly has integrated support for active learning in combination with the Lightly platform. With this, you can create embeddings of your unlabelled data and combine them with model predictions to select the most valuable samples for labelling.
Check out the detailed tutorial on ‘Active Learning Using Detectron2 on Comma10K’ here.
OWOD
Developed by researchers from IIT Hyderabad; the University of AI, UAE; the Australian National University, Australia; and Linköping University, Sweden, ORE: Open World Object Detector (OWOD) is a novel solution based on contrastive clustering and energy-based unknown identification, introduced alongside a strong evaluation protocol.
In their experiments, the researchers analysed the efficacy of ORE in achieving open world objectives. They found that identifying and characterising unknown instances helped reduce confusion in an incremental object detection setting, where the method achieved SOTA performance with no extra methodological effort.
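The contrastive clustering idea can be sketched roughly as below: each known class keeps a prototype vector, and region features are pulled towards their class prototype and pushed away from the rest. The prototype handling, margin, and hinge form here are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def contrastive_clustering_loss(features, labels, prototypes, margin=1.0):
    """features: (N, D) region features; labels: (N,) known-class ids;
    prototypes: (C, D) one prototype vector per known class."""
    # Distance of every feature to every class prototype.
    dists = torch.cdist(features, prototypes)                # (N, C)
    pos = dists.gather(1, labels.unsqueeze(1)).squeeze(1)    # distance to own prototype

    # Hinge: stay close to the own prototype, at least `margin` away from the others.
    neg_mask = torch.ones_like(dists, dtype=torch.bool)
    neg_mask.scatter_(1, labels.unsqueeze(1), False)
    neg = F.relu(margin - dists[neg_mask].view(dists.size(0), -1))
    return pos.mean() + neg.mean()

# Illustrative inputs: 16 region features, 20 known classes, 128-D feature space.
loss = contrastive_clustering_loss(torch.randn(16, 128),
                                   torch.randint(0, 20, (16,)),
                                   torch.randn(20, 128))
```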
Check out the source code of OWOD on GitHub.
TensorFlow Similarity
TensorFlow Similarity is a TensorFlow library for similarity learning, which is also known as metric learning or contrastive learning. The library offers SOTA algorithms for metric learning and all the necessary components to research, train, evaluate, and serve similarity-based models.
TensorFlow Similarity is currently in a beta testing phase and supports supervised training. In the coming months, it is expected to support both semi-supervised and self-supervised learning.
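To illustrate what similarity (metric) learning means in practice, here is a minimal triplet-loss sketch in plain TensorFlow; it does not use the TensorFlow Similarity API itself, and the margin and embedding size are arbitrary assumptions.

```python
import tensorflow as tf

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Metric learning in a nutshell: the anchor should sit closer to
    the positive than to the negative by at least `margin`."""
    anchor = tf.math.l2_normalize(anchor, axis=1)
    positive = tf.math.l2_normalize(positive, axis=1)
    negative = tf.math.l2_normalize(negative, axis=1)

    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))

# Illustrative embeddings: 16 triplets of 64-dimensional vectors.
loss = triplet_loss(tf.random.normal((16, 64)),
                    tf.random.normal((16, 64)),
                    tf.random.normal((16, 64)))
```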
Check out the source code for TensorFlow Similarity on GitHub.
solo-learn
solo-learn is a library of self-supervised methods for unsupervised visual representation learning, powered by PyTorch Lightning. The library aims to provide SOTA self-supervised methods in a comparable environment while also implementing training tricks. Furthermore, as the library is self-contained, the trained models can be used outside of solo-learn.
Check out the source code of solo-learn here.
CURL
CURL, or Contrastive Unsupervised Representations for Reinforcement Learning, extracts high-level features from raw pixels using contrastive learning and performs off-policy control on top of the extracted features. It outperformed prior pixel-based methods – both model-based and model-free – on complex tasks in the DeepMind Control Suite and Atari Games, showing 1.9x and 1.2x performance gains at the 100K environment and interaction steps benchmarks, respectively.
CURL is the first image-based algorithm to nearly match the sample efficiency of methods that use state-based features on the DeepMind Control Suite. The open-source code is available on GitHub.
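A rough sketch of the contrastive part of CURL is shown below: features from two random crops of the same observation are scored with a bilinear similarity and trained with a cross-entropy objective. The encoder is omitted, and the dimensions and variable names are placeholder assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn.functional as F

embed_dim = 50
W = torch.nn.Parameter(torch.rand(embed_dim, embed_dim))  # bilinear similarity matrix

def curl_contrastive_loss(query_feats, key_feats):
    """query_feats: features of one crop of each observation, shape (N, D).
    key_feats: features of another crop of the same observations, shape (N, D),
    typically produced by a momentum-updated key encoder."""
    logits = query_feats @ W @ key_feats.t()                    # (N, N) similarity scores
    logits = logits - logits.max(dim=1, keepdim=True).values    # numerical stability
    labels = torch.arange(query_feats.size(0))                  # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Illustrative features for a batch of 32 observations.
loss = curl_contrastive_loss(torch.randn(32, embed_dim), torch.randn(32, embed_dim))
```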
ContrastiveSeg
Inspired by unsupervised contrastive representation learning, ContrastiveSeg offers a pixel-wise contrastive framework for semantic segmentation in a fully supervised setting. It enforces that pixel embeddings belonging to the same semantic class are more similar than embeddings from other classes. In doing so, it puts forward a pixel-wise metric learning paradigm for semantic segmentation by exploring the structure of labelled pixels. ContrastiveSeg can be effortlessly incorporated into existing segmentation frameworks without extra overhead during testing.
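The sketch below illustrates the underlying idea of a supervised, pixel-wise contrastive loss: for a set of sampled pixel embeddings with class labels, embeddings of the same class are treated as positives and all others as negatives. The sampling strategy, temperature, and exact normalisation here are illustrative assumptions, not the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(pixel_embeddings, pixel_labels, temperature=0.1):
    """pixel_embeddings: (P, D) embeddings sampled from a feature map.
    pixel_labels: (P,) semantic class of each sampled pixel."""
    emb = F.normalize(pixel_embeddings, dim=1)
    sim = emb @ emb.t() / temperature                    # (P, P) pairwise similarities

    self_mask = torch.eye(pixel_labels.size(0)).bool()
    # Positives: other sampled pixels that share the same semantic class.
    positives = (pixel_labels.unsqueeze(0) == pixel_labels.unsqueeze(1)) & ~self_mask

    # Log-probability of each pair, contrasting against all non-self pixels.
    log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float('-inf')),
                                     dim=1, keepdim=True)
    loss = -(log_prob * positives.float()).sum(dim=1) / positives.sum(dim=1).clamp(min=1)
    return loss.mean()

# Illustrative inputs: 64 sampled pixels, 32-D embeddings, 5 classes.
loss = pixel_contrastive_loss(torch.randn(64, 32), torch.randint(0, 5, (64,)))
```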
Check out the source code of ContrastiveSeg on GitHub.
SalesForce PCL
Developed by SalesForce Research, Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method that bridges contrastive learning with clustering. It not only learns low-level features for the task of instance discrimination but also encodes semantic structures discovered by clustering into the learned embedding space.
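As a rough sketch of the idea, the snippet below clusters embeddings into prototypes and then contrasts each embedding against those prototypes. The clustering method (plain k-means via scikit-learn), the number of prototypes, and the temperature are illustrative assumptions rather than PCL's exact ProtoNCE formulation.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def prototype_contrastive_loss(embeddings, num_prototypes=10, temperature=0.1):
    """Cluster the embeddings into prototypes, then pull each embedding
    towards its assigned prototype and away from the others."""
    emb = F.normalize(embeddings, dim=1)

    # Discover semantic structure by clustering the embedding space.
    kmeans = KMeans(n_clusters=num_prototypes, n_init=10).fit(emb.detach().numpy())
    prototypes = F.normalize(torch.tensor(kmeans.cluster_centers_, dtype=torch.float), dim=1)
    assignments = torch.tensor(kmeans.labels_, dtype=torch.long)

    # Contrast each embedding against all prototypes; its own cluster is the positive.
    logits = emb @ prototypes.t() / temperature
    return F.cross_entropy(logits, assignments)

# Illustrative batch: 256 samples with 64-dimensional embeddings.
loss = prototype_contrastive_loss(torch.randn(256, 64))
```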
PCL has outperformed SOTA instance-wise contrastive learning methods on multiple benchmarks with substantial improvement in low-resource transfer learning. Check out the code and pre-trained models on GitHub.