Did you know that almost 90 percent of the world's data was created in the last two years alone, and that nearly 2.5 quintillion bytes are produced every day? Close to 95 million photos and videos are shared on platforms like Instagram daily, while around 500 million tweets go out over the same period, a relentless churn of largely unlabelled visual data.
Self-supervised learning, or unsupervised semantic feature learning, has become the go-to option for building scalable computer vision models. Self-supervised learning is the ability of a system to learn from raw data without manual annotation, typically by generating its own supervisory signal from the data itself.
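To make the idea concrete, here is a minimal sketch in PyTorch of one classic pretext task, rotation prediction. The model never sees a human label; the supervisory signal (which rotation was applied) is generated from the data itself. The tiny `backbone` network and tensor shapes are illustrative stand-ins, not any particular paper's implementation:

```python
import torch
import torch.nn as nn

def make_rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees; the rotation index is the label."""
    rotations = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, k=int(k), dims=(1, 2))
                           for img, k in zip(images, rotations)])
    return rotated, rotations

backbone = nn.Sequential(            # stand-in feature extractor; any CNN works here
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(16, 4)              # predicts which of the 4 rotations was applied
loss_fn = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 32, 32)   # a batch of *unlabelled* images
inputs, targets = make_rotation_batch(images)
loss = loss_fn(head(backbone(inputs)), targets)
loss.backward()                      # the gradients train the backbone's features
```

After pretraining like this, the backbone's features can be reused for downstream tasks that do have labels.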
Previously, for a system to learn high-level semantic image features, it required a massive amount of manually labelled data, which is time-consuming, expensive and impractical to scale. Thanks to self-supervised learning, models can now be trained seamlessly on unlabelled images, videos and other data.
Lately, there have been several developments in self-supervised learning, with researchers at large tech companies, including Google, Facebook and Microsoft, building increasingly scalable machine learning models around it.
Keeping in mind the advancements in the computer vision landscape, we have curated a list of resources to get the hang of self-supervised learning:
In a series of videos, Anuj Shah introduces self-supervised representation learning and explains why it is crucial. He walks through the entire pipeline for performing self-supervised learning, covers its various categories, and explores research on self-supervised methods such as MoCo, SimCLR and SwAV.
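For a flavour of what these contrastive methods optimise, below is a simplified InfoNCE-style loss in PyTorch. It is a sketch of the shared idea rather than any paper's exact recipe: SimCLR uses a full 2N-way NT-Xent loss, MoCo adds a momentum encoder and a queue of negatives, and SwAV replaces explicit contrastive pairs with online clustering.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmented views of the same N images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (N, N) matrix of cosine similarities
    targets = torch.arange(z1.size(0))   # matching views sit on the diagonal
    # each row is a classification problem: pick your positive among N candidates
    return F.cross_entropy(logits, targets)

# dummy projector outputs for two augmented views of a batch of 32 images
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(info_nce_loss(z1, z2))
```

The temperature controls how sharply the loss concentrates on hard negatives; values around 0.1 to 0.5 are common in the literature.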
In 122 PowerPoint slides, DeepMind's Andrew Zisserman captures the essence of self-supervised learning perfectly, touching upon its application to unlabelled images, videos and audio, alongside discussing various parameters, functions, challenges and findings.
The slides are a good starting point for beginners looking to understand the basics of self-supervised learning. Click here to view the presentation slides.
In this Microsoft Research podcast, Dr Philip Bachman, a researcher at MSR Montreal, gives an overview of the machine learning space and talks about the challenge of successfully gathering helpful information from cluttered data. Further, he discusses his ongoing work on Deep InfoMax, a novel approach to self-supervised learning, and delves into machine learning classification problems.
Check out the podcast here.
In a blog post, Facebook VP and chief AI scientist Yann LeCun and research scientist Ishan Misra explain the self-supervised learning paradigm in detail, along with its impact, its categories and the architectures used, such as Siamese networks (a form of joint embedding architecture).
Further, the post highlights SEER, a self-supervised computer vision model pretrained on a billion random, unlabelled and uncurated public Instagram images and then fine-tuned on ImageNet with supervision. The self-supervised system achieved 84.2 percent top-1 accuracy on ImageNet.
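The pretrain-then-evaluate recipe the post describes looks roughly like the sketch below. SEER itself pairs a RegNet architecture with SwAV-style pretraining; here a torchvision ResNet-50 stands in purely for illustration:

```python
import torch.nn as nn
from torchvision import models

# assume self-supervised pretraining has already produced backbone weights
backbone = models.resnet50(weights=None)   # load your pretrained weights here
backbone.fc = nn.Identity()                # discard the placeholder classifier

# linear evaluation: freeze the trunk and train only a supervised head
for p in backbone.parameters():
    p.requires_grad = False
classifier = nn.Linear(2048, 1000)         # 1000 ImageNet classes

# for full fine-tuning instead, skip the freeze and train end to end
# with a small learning rate so the pretrained features are not destroyed
```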
In this blog post, OpenAI's Lilian Weng explains the nuances of self-supervised learning and covers interesting ideas for such tasks on images, videos and control problems.
In this tutorial, AI pioneer Andrew Ng discusses effective machine learning techniques. Ng covers Silicon Valley's best practices in AI and machine learning, alongside providing a broad introduction to machine learning, data mining, statistical pattern recognition, supervised learning and real-world use cases.
Topics covered include:
- Supervised learning (parametric/non-parametric algorithms, kernels, neural networks, etc.)
- Unsupervised learning (clustering, dimensionality reduction, deep learning and more)
- Best practices in machine learning (bias/variance theory; innovation process in ML and AI)
Vaibhav Verdhan's book teaches you how to apply unsupervised learning algorithms. It covers many bases, from k-means (sketched briefly after this description) and hierarchical clustering to advanced neural networks such as GANs and restricted Boltzmann machines, and each new algorithm is introduced with a case study drawn from the retail, banking or aviation sector.
More than anything, the book bridges the gap between complex math and practical implementation, covering everything from model development to production deployment. At the end of each chapter, you'll find quizzes, practice datasets and research paper links for better comprehension.
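As a taste of where the book starts, here is a minimal k-means run using scikit-learn (an illustrative example, not code from the book): cluster some synthetic 2-D points into three groups and inspect the learned centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

# three well-separated blobs of synthetic 2-D points
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(loc, 0.5, size=(50, 2))
                    for loc in ((0, 0), (5, 5), (0, 5))])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)
print(kmeans.cluster_centers_)   # one centroid per discovered cluster
print(kmeans.labels_[:10])       # cluster assignment for the first 10 points
```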
Xiaojin Zhu and Andrew B Goldberg highlight various aspects of semi-supervised learning (which combines labelled and unlabelled data) in this book. It covers popular semi-supervised learning models such as self-training (sketched below), mixture models, co-training, graph-based methods and more.
The book further discusses the basic mathematical formulation, assumptions and limitations of each model. It is a good starting point for understanding the nuances of semi-supervised learning before experimenting with self-supervised learning.
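To illustrate the simplest of these methods, below is a hedged sketch of self-training with scikit-learn: fit on the labelled pool, pseudo-label the unlabelled points the model is confident about, and refit. The function name and confidence threshold are illustrative choices, not the book's notation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    """Iteratively absorb confident predictions on unlabelled data as labels."""
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        model = LogisticRegression(max_iter=1000).fit(X, y)
        if len(X_unlab) == 0:
            break
        proba = model.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold   # keep only confident guesses
        if not confident.any():
            break
        pseudo = model.classes_[proba[confident].argmax(axis=1)]
        X = np.vstack([X, X_unlab[confident]])       # grow the training pool
        y = np.concatenate([y, pseudo])
        X_unlab = X_unlab[~confident]
    return model
```

If you would rather not roll your own loop, scikit-learn ships a ready-made wrapper for this pattern in sklearn.semi_supervised.SelfTrainingClassifier.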