Exactly a year ago, during a fireside chat with NVIDIA’s Jensen Huang at the launch of GPT-4, OpenAI co-founder Ilya Sutskever famously proclaimed that ‘text is a projection of the world’, and that, contrary to popular perception, ChatGPT is doing much more than surface-level learning of statistical correlations.
Cut to the present: former OpenAI computer scientist Andrej Karpathy, while discussing the road to AGI, said, “There’s a lot of optimisation, and I think, roughly speaking, the way things are happening is that everyone is trying to build what I refer to as a kind of LLM OS (operating system).”
OpenAI has always been on the side of language, firmly believing that text-based models would be the next frontier and the base for building smarter AI models – or possibly the first AGI.
“I sort of felt with AGI that it wasn’t clear how it was going to happen. It was very academic and one would think about different approaches. Now I think it’s very clear. There’s a lot of space that everyone is trying to fill,” said Karpathy.
The current focus revolves around the development of what Karpathy calls an LLM OS – an operating system designed to integrate various modalities such as text, images, and audio, with the LLM Transformer at its core acting as the CPU and the context window serving as the RAM.
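Karpathy’s analogy can be sketched loosely in code. This is a hypothetical illustration only – none of these class or method names come from any real system – but it captures the mapping he describes: the transformer as the CPU, the context window as RAM, and other modalities as attached peripherals.

```python
# A loose, hypothetical sketch of the "LLM OS" analogy.
# Mapping: LLM Transformer = CPU, context window = RAM,
# other modalities (vision, audio, browsing) = peripherals.

class LLMOS:
    def __init__(self, context_limit=8192):
        self.context_limit = context_limit  # "RAM": max tokens held at once
        self.context = []                   # working memory (token buffer)
        self.peripherals = {}               # "devices": vision, audio, browser...

    def attach(self, name, handler):
        """Register a peripheral, e.g. an image or audio module."""
        self.peripherals[name] = handler

    def load(self, tokens):
        """Load tokens into the context window, evicting the oldest on overflow."""
        self.context.extend(tokens)
        overflow = len(self.context) - self.context_limit
        if overflow > 0:
            # Oldest tokens fall out of "RAM", just as a fixed context window drops them.
            self.context = self.context[overflow:]

    def step(self):
        """One 'CPU cycle': stand-in for a forward pass predicting the next token."""
        return f"<next token given {len(self.context)} tokens of context>"


os_ = LLMOS(context_limit=4)
os_.attach("vision", lambda img: "image caption")
os_.load(["the", "cat", "sat", "on", "the"])
print(len(os_.context))  # 4 -- the oldest token was evicted
print(os_.step())
```

The point of the toy eviction logic is that, in this analogy, the context window is a scarce resource the “OS” must manage – which is why context length figures so centrally in the comparison.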
Agree to Disagree
The disagreement has continued over the past year. Meta AI chief Yann LeCun disagreed with Sutskever, and possibly with Karpathy’s assessment today. LeCun believes, “Large language models have no idea of the underlying reality that language describes.” He emphasised that a majority of human knowledge is nonlinguistic.
Gary Marcus and other AI scientists also agree that “LLMs don’t reliably understand the world”. The problem is that, regardless, current AI research remains focused largely on text and language.
Weighing in on a similar discussion, Francois Chollet, the creator of Keras and a deep learning researcher at Google, said, “Language can be thought of as the *operating system* of the mind. It is not the *substrate* of the mind – you can think perfectly well without language, much like you can still run programs on a computer without an OS (albeit with much more difficulty).”
This was in response to LeCun recalling a quote: “This language system seems to be distinct from regions that are linked to our ability to plan, remember, reminisce on past and future, reason in social situations, experience empathy, make moral decisions, and construct one’s self-image.” He explained that many everyday cognitive tasks proceed without the need for language.
Chollet further said that language significantly streamlines and improves cognition, similar to how an OS simplifies and enhances computing. It aids in creating, retaining, and examining thoughts and memories. Without language, the ability to string together intricate thoughts or recall distant memories would be notably challenging, but not impossible.
The AGI Debate Continues
LeCun summed it up saying, “I’d say language is not the OS of the mind, it’s the shell. You can have a perfectly functional OS without a shell.”
Subbarao Kambhampati, AI researcher and professor at ASU, puts it wittily, “..and just as the easiest way to hack a computer is to hack its OS, the easiest way to hack the mind is via language. This is why, understanding and securing the OS err.. language is essential even if it is not the intelligence.”
He agrees with Chollet that the role of language has evolved beyond external communication, which he also explained in one of his papers. Furthermore, Kambhampati finds it intriguing that we feel compelled to assign linguistic descriptions to inherently non-verbal phenomena, including emotions, dance, and sports manoeuvres like the pick and roll.
“So yes, Chollet’s operating system metaphor is quite apt,” he added.
On the other hand, Kambhampati told AIM that his views on LLMs align with LeCun’s. “We both feel that there is no reason to believe that these n-gram models would be able to reason beyond what they actually do,” he said.
Similar to LeCun’s vision of autonomous machine intelligence, he also proposes a world model, with different models connected to a larger one to surface information. “LLMs can only guess, they cannot verify.”
It all seems to fall into place. Language models may be the OS of the future, as Karpathy believes – and, perhaps without realising it, the experts agree more than they disagree.