For decades, building a machine capable of learning simply by observing its environment has been a dream for many researchers. Though methods like supervised and reinforcement learning have made huge advances, there is considerable debate over whether they are the right way forward.
Yann LeCun, Chief AI Scientist at Meta and a leading advocate of self-supervised learning, has a similar vision for autonomous machine intelligence. In a paper published in June 2022, LeCun proposed several solutions and architectures that can be combined and implemented to build self-supervised autonomous machines.
In September 2022, in a talk at the UC Berkeley EECS Colloquium, LeCun revisited his paper on autonomous machine intelligence and explained his predictions, speculations, and vision for the field's future.
Case for self-supervised learning
“Machines do not have common sense,” LeCun begins. While Google’s DeepMind has been a proponent of reinforcement learning, LeCun has for several years championed self-supervised learning. Animals and humans do not need large volumes of labelled samples (supervised learning) or vast numbers of trials (reinforcement learning) to learn new things: inferring information from the world around us is naturally embedded in our “software”.
Giving machines the ability to learn, reason, and plan without being trained, or rewarded for succeeding or failing at a task, is the ideal future of AI. LeCun points out that when learning to drive a car, drivers do not have to run off the road or over a cliff multiple times to figure out how to avoid doing so.
Making the case for self-supervised or unsupervised learning, LeCun argues against the prevailing consensus that scaling models with ever more parameters and layers will get us to artificial general intelligence. Training large language models and transformers, he says, requires discrete values and the elimination of irrelevant information from the data.
‘Observation and prediction is the essence of intelligence’
A lot of learning is just observation, interpretation, and interaction. Humans and animals know how to react and behave in situations they have never encountered before. As Kenneth Craik put it in his book The Nature of Explanation, “Common Sense is not just facts, it is a collection of models.”
Building on this statement, LeCun proposed a modular architecture for autonomous machine intelligence that combines different models acting as different parts of a machine’s brain, mimicking the animal brain. With all the modules differentiable, each connects to the others to drive specific functions, much like the brain, from identifying the environment to reacting to it.
- The configurator module takes input from all other modules and configures them to perform the task at hand
- The perception module receives signals from the sensors to predict and estimate the current state of the world
- The world module is a “simulator” that uses the information from the perception module to predict the future state of the world. The key challenge for this module is that it must generate multiple plausible outputs, since the natural world is not entirely predictable
- The cost module, as the name suggests, measures the scalar “energy” of a given state, which the agent tries to bring down to a minimum
- The short-term memory module stores the information from the other modules like the present and future state of the world along with the calculated energy from the cost module in scalar form
- The actor module computes possible action sequences based on the result of the world module and the cost module
The centrepiece of the architecture is the world model which predicts the future world states. It has two tasks – to estimate the missing information which is not provided by the perception module and to predict possible and plausible future states of the world.
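As a rough illustration, the perceive-predict-score-act loop formed by these modules can be sketched in code. This is a toy sketch under our own assumptions, not LeCun's implementation: every class and function name is hypothetical, the configurator and short-term memory modules are omitted, and the "world" is reduced to a single number.

```python
class Perception:
    """Estimates the current state of the world from raw sensor signals."""
    def encode(self, observation):
        # In this toy, the "state" is just the observation itself.
        return observation

class WorldModel:
    """Predicts the future state given the current state and an action."""
    def predict(self, state, action):
        # Toy dynamics: an action simply shifts the state.
        return state + action

class CostModule:
    """Computes the scalar 'energy' of a state; lower is better."""
    def __init__(self, goal):
        self.goal = goal
    def energy(self, state):
        return abs(state - self.goal)

class Actor:
    """Proposes candidate actions and keeps the one with lowest predicted cost."""
    def __init__(self, world, cost, candidates):
        self.world, self.cost, self.candidates = world, cost, candidates
    def act(self, state):
        # Score each candidate by the predicted energy of the resulting state.
        return min(self.candidates,
                   key=lambda a: self.cost.energy(self.world.predict(state, a)))

# One possible wiring of the modules: act to minimise predicted energy.
perception = Perception()
world = WorldModel()
cost = CostModule(goal=10)
actor = Actor(world, cost, candidates=[-1, 0, 1])

state = perception.encode(0)
for _ in range(12):
    state = world.predict(state, actor.act(state))
```

The key design point this sketch preserves is that the actor never queries the real environment while deliberating: it evaluates candidate actions against the world model's predictions, scored by the cost module.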
LeCun also elaborates on the possibility of machines having common sense without acquiring emotions. For a machine to effectively measure “cost”, machine emotions would form part of either the intrinsic cost or the prediction of outcomes from a trainable critic, both of which are components of the cost module.
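The two-part cost described above can be sketched as a hard-wired intrinsic cost plus a trainable critic. This is a minimal illustration under our own assumptions (the function names and the fixed critic weight are hypothetical); in the actual proposal the critic is trained to predict future intrinsic cost.

```python
def intrinsic_cost(state):
    """Hard-wired 'discomfort': here, a penalty for negative states."""
    return max(0.0, -state)

class Critic:
    """Stands in for a trainable predictor of future intrinsic cost.

    A real critic would be learned; here it is a fixed linear scaling.
    """
    def __init__(self, weight=0.5):
        self.weight = weight
    def predict(self, state):
        return self.weight * intrinsic_cost(state)

def total_cost(state, critic):
    """Total energy = immediate intrinsic cost + critic's predicted future cost."""
    return intrinsic_cost(state) + critic.predict(state)
```

Splitting the cost this way keeps basic drives fixed while letting the learned critic shape longer-horizon behaviour.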
Path towards machine common sense
LeCun said that to enable machines to think, we need to teach them to learn the way animals do, by watching and then acting, and to integrate neurobiology into the field. He also recently released VICRegL, a self-supervised learning method that learns global and local features at the same time. At Tesla AI Day, the Autopilot team likewise demonstrated the use of self-supervised learning in Tesla’s Full Self-Driving system to predict the movement of pedestrians and vehicles on the street and act accordingly.
Self-supervised learning methods are already used for learning image representations, classification, and segmentation. For self-driving cars, robots, and virtual assistants to develop, machines need general intelligence, a practical judgement of the world around them, and the ability to take immediate action as the environment changes.
While the paper has been at the centre of controversy, LeCun admits that the scope of the proposed architecture is very large and leaves many questions about how the modules would be implemented in machines unanswered. But he maintains that current supervised and reinforcement learning methods fall short of developing “consciousness” in machines. Some researchers argue that creating a model that exhibits intelligence like that of animals and humans is the end goal of artificial intelligence.