Yann LeCun’s Latest Offering – Barlow Twins

Self-supervised learning (SSL) has become a handy technique in computer vision tasks. Significant advances in SSL mean its methods can now learn representations that are invariant to distortions of the input sample. These distortions are also referred to as ‘data augmentations’, and the invariance is achieved by maximising the similarity of representations extracted from different distorted versions of a sample.

However, there is a hitch. This objective admits trivial solutions, such as a constant representation that ignores the input altogether. Currently, most methods avoid such collapsed solutions through careful implementation details. To that end, Yann LeCun and his team have introduced Barlow Twins, an objective function that avoids collapse by measuring the cross-correlation matrix between the outputs of two identical networks fed with distorted versions of a sample. The aim is to make this matrix as close as possible to the identity matrix, which keeps the representations of the two distorted versions similar while minimising the redundancy between the components of the representation vectors.
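To make the collapse problem concrete, here is a minimal NumPy sketch. Everything in it is illustrative rather than taken from the paper: `collapsed_encoder` is a hypothetical degenerate network that ignores its input, and additive noise stands in for real image distortions. It shows that such an encoder scores a perfect similarity between views while carrying no information about the input.

```python
import numpy as np

def mean_cosine_similarity(z_a, z_b):
    """Average cosine similarity between matching rows of z_a and z_b."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    return float(np.mean(np.sum(z_a * z_b, axis=1)))

def collapsed_encoder(x):
    """Hypothetical degenerate 'network': returns the same vector for any input."""
    return np.ones((x.shape[0], 4))

rng = np.random.default_rng(0)
images = rng.normal(size=(8, 16))  # stand-in for a batch of 8 samples

# Assumption: additive noise plays the role of a data augmentation.
view_a = images + 0.1 * rng.normal(size=images.shape)
view_b = images + 0.1 * rng.normal(size=images.shape)

# The similarity objective is fully satisfied by the useless encoder.
similarity = mean_cosine_similarity(collapsed_encoder(view_a),
                                    collapsed_encoder(view_b))
print(similarity)
```

A similarity-only objective cannot tell this encoder apart from a good one, which is why an extra ingredient is needed to rule out collapse.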


Trivial Representations

Self-supervised learning has proved to be a good solution to the excessive data dependency of deep learning systems. LeCun, often referred to as one of the ‘Godfathers’ of deep learning and the inventor of convolutional neural networks, first gave a glimpse of self-supervised learning in 2018 during his keynote speech at the AAAI conference.

Self-supervised learning helps in creating data-efficient artificial systems. This method learns valuable representations of the input data without relying on human annotations. Self-supervised learning has major applications in the field of natural language processing. 

As discussed, maximising the similarity between augmented views of a sample can also lead to trivial representations. In the past, there have been several attempts to overcome this problem, including:

  • Contrastive methods define positive and negative sample pairs, which are treated differently in the loss function.
  • Clustering methods use one distorted sample to compute ‘targets’ for the loss, and another distorted version of the same sample to predict these targets. This is followed by an alternating optimisation scheme such as k-means.
  • BYOL and SimSiam are among the more recent methods. In both, the network architectures and parameter updates are modified to introduce asymmetry.
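As an illustration of the first family above, the sketch below implements a generic InfoNCE-style contrastive loss in NumPy. It is not the loss of any specific method named in this article. The function name `info_nce_loss`, the `temperature` value and the random toy embeddings are all assumptions made for the example. Row i of the two batches forms the positive pair, and every other row in the batch serves as a negative.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Generic InfoNCE-style loss: row i of z_a should match row i of z_b."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                    # (n, n) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal; off-diagonal entries act as negatives.
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
z = rng.normal(size=(16, 8))                              # toy embeddings

loss_aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
loss_shuffled = info_nce_loss(z, np.roll(z, 1, axis=0))   # wrong pairing
loss_collapsed = info_nce_loss(np.ones((16, 8)), np.ones((16, 8)))

print(loss_aligned, loss_shuffled, loss_collapsed)
```

Because negatives are pushed apart, a collapsed (constant) output is no longer optimal: its loss equals log(n) for a batch of n samples, which is higher than the loss of well-aligned, distinct embeddings.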

Barlow Twins

LeCun and his team have now proposed a new method called Barlow Twins. Named after neuroscientist H. Barlow, the method draws heavily on his influential 1961 article, ‘Possible Principles Underlying the Transformation of Sensory Messages’, which proposes that the goal of sensory processing is to recode highly redundant sensory inputs into a factorial code, that is, a code with statistically independent components.

The Barlow Twins method applies redundancy reduction, the principle Barlow set out in his article, to self-supervised learning. This principle has proved successful in explaining the organisation of the visual system and has led to several algorithms for supervised and unsupervised learning.

Barlow Twins is conceptually simple, easy to implement and learns useful representations. The researchers propose an objective function that pushes the cross-correlation matrix computed from the twin representations to be as close as possible to the identity matrix. Barlow Twins also benefits from the use of very high-dimensional representations.
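For readers who prefer notation, the objective can be written as below, following the formulation in the Barlow Twins paper. The symbols are not defined in this article: z^A and z^B denote the two batches of representations (mean-centred along the batch dimension), b indexes samples in the batch, i and j index vector components, and λ is a positive constant that weights the second term.

```latex
\mathcal{L}_{BT} \;=\; \sum_{i} \left(1 - \mathcal{C}_{ii}\right)^{2}
\;+\; \lambda \sum_{i} \sum_{j \neq i} \mathcal{C}_{ij}^{2},
\qquad
\mathcal{C}_{ij} \;=\;
\frac{\sum_{b} z^{A}_{b,i}\, z^{B}_{b,j}}
     {\sqrt{\sum_{b} \left(z^{A}_{b,i}\right)^{2}}\;
      \sqrt{\sum_{b} \left(z^{B}_{b,j}\right)^{2}}}
```

The first term drives the diagonal of the cross-correlation matrix towards 1, so that each component agrees across the two distorted views. The second term drives the off-diagonal entries towards 0, so that different components do not carry redundant information.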

Barlow Twins operates on a joint embedding of distorted images. For every image in a batch sampled from the dataset, it produces two distorted views, each obtained by sampling from a distribution of data augmentations. The two resulting batches of distorted views are fed to a deep network with trainable parameters, producing two batches of representations.
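The step from two batches of representations to a loss value can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `barlow_twins_loss`, the `eps` guard and the random arrays standing in for network outputs are hypothetical, and the default weight of 5e-3 on the off-diagonal term is an assumed value that would need tuning in practice.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-12):
    """Redundancy-reduction loss on two (batch, dim) arrays of representations."""
    n = z_a.shape[0]
    # Standardise every component along the batch dimension.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # (dim, dim) cross-correlation matrix between the two views.
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    on_diag = np.sum((diag - 1.0) ** 2)           # pull the diagonal to 1
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)  # push the rest to 0
    return float(on_diag + lambda_offdiag * off_diag), c

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))          # stand-in for network outputs
z_unrelated = rng.normal(size=(256, 8))

loss_same, c_same = barlow_twins_loss(z, z)          # two identical views
loss_diff, _ = barlow_twins_loss(z, z_unrelated)     # unrelated views
print(loss_same, loss_diff)
```

When both views yield the same representations, the diagonal of the matrix is 1 and only small off-diagonal correlations contribute, so the loss is close to zero. Unrelated representations leave the diagonal near 0 and the loss is much larger.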

Credit: Barlow Twins

Among the method’s advantages: it does not require large batches, nor asymmetric mechanisms such as prediction networks, momentum encoders, stop-gradients or non-differentiable operators.

Wrapping Up

Barlow Twins outperforms previous methods on ImageNet for semi-supervised classification in the low-data regime, with the added advantage of being simpler and avoiding trivial representations. It is also on par with the current state of the art for ImageNet classification with a linear classifier head, and for transfer tasks of classification and object detection. The researchers believe that further refinement of the algorithm could open doors for more effective solutions.

Read the full paper here.

Shraddha Goled
I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.
