The future is already here – it’s just not evenly distributed: William Gibson
We are gravitating towards the technological singularity. Futurists like Louis Rosenberg, Ray Kurzweil, and Patrick Winston have predicted that ‘super intelligence’ will arrive sometime between 2030 and 2045. But are these timelines realistic? And which approaches (supervised, semi-supervised, or unsupervised learning) will get us there?
Andrew Ng, founder and CEO of Landing AI, swears by smart-sized, “data-centric” AI, whereas Meta’s VP & Chief AI Scientist, Yann LeCun, thinks “the revolution will not be supervised”. Instead, he proposes using self-supervised learning (SSL) to build AI systems with common sense, taking us a step closer to human-level intelligence.
Supervised learning uses a labelled dataset to teach models how to map inputs to desired outputs. The algorithm gauges its accuracy through the loss function, adjusting until the error has been minimised. Supervised learning is currently the most prevalent machine learning approach, with applications in fraud detection, sales forecasting, and inventory optimisation.
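The loop described above (predict, measure the error with a loss function, adjust) can be sketched in a few lines. This is a minimal illustration, not a production recipe: the toy dataset, the linear model, the mean squared error loss and the learning rate are all assumptions chosen for the example.

```python
# Toy labelled dataset (assumed for illustration): the labels follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse_loss(w, b, xs, ys):
    """Mean squared error between the predictions w*x + b and the labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, lr=0.05, steps=2000):
    """Fit w and b by plain gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train(xs, ys)
final_loss = mse_loss(w, b, xs, ys)
print(f"w={w:.3f}, b={b:.3f}, loss={final_loss:.6f}")
```

Real systems swap the linear model for a neural network and the hand-written gradients for automatic differentiation, but the structure of the loop stays the same.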
According to recent O’Reilly research, 82% of respondents said their company prefers supervised learning to unsupervised or semi-supervised learning. According to Gartner, supervised learning will continue to be the most popular type of machine learning in 2022.
“The last decade saw a leap towards deep learning. This decade, it may be towards data-centric AI,” Andrew Ng has said. With deep learning networks having made huge strides in the last decade, he believes the way forward is to improve the dataset while keeping the neural network architecture fixed.
Break the bias
Biased or duplicated data is another issue hurting an AI system’s performance. According to Yann LeCun, supervised learning works well in domains with well-defined boundaries, where the types of inputs seen during deployment are not significantly different from those used during training. However, building large, clean, unbiased labelled datasets isn’t easy.
Yann LeCun recommends self-supervised learning (SSL) to tackle the data problem. SSL trains a system to learn good representations of its inputs in a task-independent way. It first uses unlabelled data to learn representations from large training sets, then uses a small amount of labelled data to achieve good performance on a supervised task. The system needs only a little labelled data to learn, and it can handle inputs that differ from the training samples. SSL also reduces the sensitivity of the system to bias in the data.
According to Ng, biased data leads to biased systems. Data-centric AI makes it possible to engineer a subset of the data. If a system works for most of the dataset but performs poorly on one subset, changing the whole neural network architecture to improve performance on just that subset is counterproductive. But if you can engineer that subset of the data, you can fix the issue in a targeted manner.
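One way to find the subset that needs attention is to break evaluation results down by slice rather than looking only at overall accuracy. The sketch below assumes each evaluation record carries a slice name; the slice names, the records and the 0.8 threshold are hypothetical values made up for the example.

```python
from collections import defaultdict

# Hypothetical evaluation records: (slice_name, true_label, predicted_label).
records = [
    ("daylight", 1, 1), ("daylight", 0, 0), ("daylight", 1, 1), ("daylight", 0, 0),
    ("low_light", 1, 0), ("low_light", 0, 0), ("low_light", 1, 0), ("low_light", 1, 1),
]


def accuracy_by_slice(records):
    """Return a dict mapping each slice name to its accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for slice_name, y_true, y_pred in records:
        total[slice_name] += 1
        correct[slice_name] += int(y_true == y_pred)
    return {name: correct[name] / total[name] for name in total}


def weak_slices(per_slice, threshold=0.8):
    """Names of the slices whose accuracy falls below the threshold, sorted."""
    return sorted(name for name, acc in per_slice.items() if acc < threshold)


per_slice = accuracy_by_slice(records)
to_fix = weak_slices(per_slice)
print(per_slice, to_fix)
```

Here the overall accuracy of 75% hides the fact that every error sits in one slice, which is where relabelling or extra data collection would be targeted.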
Fake it until you make it
But what if we don’t have enough data to begin with? According to Ng, you don’t always need huge datasets to train a system: 50 carefully engineered examples can be adequate for the neural network to understand what it is supposed to learn. In other words, the focus needs to shift from big data to good data.
Gartner predicts that 60% of data used to train AI systems will be synthetic. NVIDIA has launched a powerful synthetic data generation engine for training neural networks, called Omniverse Replicator.
Synthetic data plays a pivotal role in data-centric AI. Its use goes beyond being just a pre-processing step for enlarging the dataset for a learning algorithm, Andrew Ng said. However, as the application of synthetic data is controversial, methods like data augmentation, improving labelling consistency, or collecting more data also make sense.
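Data augmentation, one of the alternatives mentioned above, creates extra training examples by transforming existing ones while keeping their labels. The following is a minimal sketch on a tiny greyscale image stored as nested lists; the image, the "scratch" label and the choice of transforms are assumptions for illustration, and a flip is only appropriate when it does not change what the label means.

```python
import random


def horizontal_flip(image):
    """Mirror each row of a greyscale image given as a list of rows."""
    return [list(reversed(row)) for row in image]


def adjust_brightness(image, delta):
    """Shift every pixel by delta, clamped to the 0-255 range."""
    return [[min(255, max(0, px + delta)) for px in row] for row in image]


def augment(image, label, rng, max_delta=20):
    """Return the original example plus two variants that keep the same label."""
    delta = rng.randint(-max_delta, max_delta)
    return [
        (image, label),
        (horizontal_flip(image), label),
        (adjust_brightness(image, delta), label),
    ]


# Hypothetical 2x3 greyscale image and label.
image = [[0, 50, 100], [150, 200, 250]]
augmented = augment(image, "scratch", random.Random(0))
print(len(augmented))
```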
Supervised learning can lead to disastrous results if the training datasets are not properly vetted. For example, an earlier version of ImageNet contained photos of naked children, porn actresses, college parties, and more — all of which were scraped from the internet without consent. Meanwhile, 80 Million Tiny Images contained a variety of racist, sexist, and otherwise offensive annotations, including nearly 2,000 images labelled with the N-word and labels such as “rape suspect” and “child molester.”
In focus: SSL
Yann LeCun said real progress in AI depends on figuring out how to get machines to learn how the world works the way humans and animals do: mostly by watching it, and a bit by acting in it. Such world models enable us to perceive, interpret, reason, plan and act. “So how can machines learn world models?” he asked. Two pertinent questions to consider here are:
First, what learning paradigm should we use to train world models? Yann LeCun’s answer is SSL. He illustrated it with an example: instruct a machine to watch a video and learn a representation of what will happen next in the video. In doing so, the machine may acquire vast amounts of background knowledge about how the world works, much as humans and animals do.
Secondly, what architecture should world models use? Yann LeCun proposed a new deep macro-architecture dubbed the Hierarchical Joint Embedding Predictive Architecture (H-JEPA). For example, instead of predicting future frames of a video clip, a JEPA learns abstract representations of the video clip and the future of the clip so that the latter is easily predictable based on its understanding of the former. This can be accomplished by using some of the latest advances in non-contrastive SSL methods, specifically a method known as VICReg (Variance, Invariance, Covariance Regularisation), he said.
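To make the three VICReg terms concrete, here is a minimal pure-Python sketch of the loss on two batches of embeddings, one per view of the same inputs. It follows the formulation as commonly described (an invariance term pulling the two views together, a variance hinge keeping each dimension's standard deviation above a target, and a covariance penalty decorrelating dimensions). The weights of 25, 25 and 1, the target of 1 and the epsilon are taken as assumptions here, and the function names are made up for the example rather than drawn from any library.

```python
import math


def covariance_matrix(z):
    """Unbiased covariance matrix of a batch of embeddings (list of rows)."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in z]
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def invariance_term(za, zb):
    """Mean squared distance between paired embeddings of the two views."""
    return sum(sum((a - b) ** 2 for a, b in zip(ra, rb))
               for ra, rb in zip(za, zb)) / len(za)


def variance_term(z, gamma=1.0, eps=1e-4):
    """Hinge penalty on dimensions whose standard deviation is below gamma."""
    cov = covariance_matrix(z)
    d = len(cov)
    return sum(max(0.0, gamma - math.sqrt(cov[j][j] + eps)) for j in range(d)) / d


def covariance_term(z):
    """Sum of squared off-diagonal covariances, scaled by the dimension."""
    cov = covariance_matrix(z)
    d = len(cov)
    return sum(cov[i][j] ** 2 for i in range(d) for j in range(d) if i != j) / d


def vicreg_loss(za, zb, lam=25.0, mu=25.0, nu=1.0):
    """Weighted sum of the invariance, variance and covariance terms."""
    inv = invariance_term(za, zb)
    var = variance_term(za) + variance_term(zb)
    cov = covariance_term(za) + covariance_term(zb)
    return lam * inv + mu * var + nu * cov


collapsed = [[1.0, 1.0]] * 4  # every input mapped to the same point
spread = [[2.0, 2.0], [2.0, -2.0], [-2.0, 2.0], [-2.0, -2.0]]
print(vicreg_loss(collapsed, collapsed), vicreg_loss(spread, spread))
```

The collapsed batch is penalised by the variance term even though its two views agree perfectly, which is how the method discourages the trivial solution without needing negative pairs.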
In practical AI systems, we are leaning towards larger architectures pre-trained with SSL on large amounts of unlabelled data for a wide variety of tasks. Meta AI has language-translation systems (a single neural net) to handle hundreds of languages. Meta also has multilingual speech-recognition systems that can deal with languages with very little data, let alone annotated data, said Yann LeCun.