When GPT-3 launched in 2020, users were surprised by the huge performance leap from its predecessor, GPT-2. In the more than two years since, OpenAI has been discreet about GPT-4, letting out only dribs of information and staying silent most of the time.
But not anymore.
People have been talking about GPT-4 for months, and several sources hint that it’s already out. The new model might become available sometime between December and February.
If the rumours hold, there is little the model can’t do. But it also looks like the model is missing some elements, or perhaps not.
Rise of GPT models
In May 2020, AI research laboratory OpenAI unveiled GPT-3, then the largest neural network ever created, in a paper titled ‘Language Models are Few-Shot Learners’. The researchers released a beta API for users to toy with the system, giving birth to the new hype around generative AI.
People generated eccentric results. The new language model could transform the description of a web page into the corresponding code. It emulated human narrative, whether by writing customised poetry or by turning philosopher and pondering the true meaning of life. There seemed to be nothing the model couldn’t do. But there was also a lot it couldn’t undo.
For some, GPT-3 isn’t that big a deal, and what the name stands for remains a bit ambiguous. The model could be a fraction of the size of the bigger models that are yet to come.
Andrew Feldman, CEO of American firm Cerebras, said, “From talking to OpenAI, GPT-4 will be about 100 trillion parameters”. Unsurprisingly, this left users excited.
Sources say that OpenAI was focused on optimising data and compute according to Chinchilla-like compute-optimal laws, rather than simply increasing the parameter count. Moreover, the model would be text-only and aligned with human preferences, much like InstructGPT.
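To give a feel for what "compute-optimal" means here, the sketch below uses two widely quoted rules of thumb: roughly 20 training tokens per parameter (the approximate Chinchilla ratio) and roughly 6 FLOPs per parameter per token for training. Both constants, the function names, and the 300 billion token figure for GPT-3 are assumptions made for illustration; none of this describes how OpenAI actually sized GPT-4.

```python
# Rough Chinchilla-style arithmetic. Both constants are rules of thumb,
# not figures published by OpenAI for GPT-4.
TOKENS_PER_PARAM = 20       # approx. compute-optimal training tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6   # common estimate: training FLOPs ~ 6 * params * tokens


def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute for a dense transformer."""
    return FLOPS_PER_PARAM_TOKEN * params * tokens


def compute_optimal_tokens(params: float) -> float:
    """Training tokens the rule of thumb suggests for a given model size."""
    return TOKENS_PER_PARAM * params


def compute_optimal_params(flops_budget: float) -> float:
    """Model size the rule of thumb suggests for a given compute budget.

    Solves 6 * N * (20 * N) = budget for N.
    """
    return (flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM)) ** 0.5


if __name__ == "__main__":
    gpt3_params = 175e9   # from the article
    gpt3_tokens = 300e9   # assumption: roughly the token count reported for GPT-3
    budget = training_flops(gpt3_params, gpt3_tokens)
    n = compute_optimal_params(budget)
    print(f"GPT-3-sized budget: {budget:.2e} FLOPs")
    print(f"Rule-of-thumb model: {n / 1e9:.0f}B parameters, "
          f"{compute_optimal_tokens(n) / 1e9:.0f}B tokens")
```

Under these assumptions, the same compute budget favours a model noticeably smaller than GPT-3 trained on considerably more text, which is why a compute-optimal GPT-4 would not need anything close to 100 trillion parameters.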
The bigger the better
The bitter lesson in AI, in the words of DeepMind researcher Richard Sutton, is: “The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin.” It remains to be seen whether this holds up in the future.
Currently, GPT-3 has 175 billion parameters, which at launch was 10x more than any previous non-sparse language model.
The more than 100-fold increase in the number of parameters from GPT-2 to GPT-3 brought a qualitative leap between the two models. GPT-4 could be notably bigger than GPT-3, at least in parameters, with qualitative differences of its own. GPT-3 can learn to learn, and it’s hard to predict how GPT-4 would work.
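The scale of these jumps is easy to check with a few lines of arithmetic. The sketch below assumes 1.5 billion parameters for the largest GPT-2 model (a figure not stated in this article) and treats the 100 trillion figure quoted from Feldman as a rumour, not a confirmed specification.

```python
GPT2_PARAMS = 1.5e9          # assumption: largest GPT-2 model
GPT3_PARAMS = 175e9          # from the article
RUMOURED_GPT4_PARAMS = 100e12  # Feldman's quoted figure, unconfirmed


def scale_factor(new_params: float, old_params: float) -> float:
    """How many times larger the new model is than the old one."""
    return new_params / old_params


if __name__ == "__main__":
    print(f"GPT-2 -> GPT-3: {scale_factor(GPT3_PARAMS, GPT2_PARAMS):.0f}x")
    print(f"GPT-3 -> rumoured GPT-4: "
          f"{scale_factor(RUMOURED_GPT4_PARAMS, GPT3_PARAMS):.0f}x")
```

If the rumoured figure were accurate, the jump from GPT-3 to GPT-4 would be several times larger than the jump from GPT-2 to GPT-3.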
GPT-4 might do things GPT-3 can’t do
On August 20, 2022, Robert Scoble tweeted that OpenAI was giving beta access to GPT-4 to a small group close to the AI firm. Scoble said, “A friend has access to GPT-4 and can’t talk about it due to NDAs. Was about to tell me everything about it and then remembered who he was talking to.”
Since this is anecdotal evidence at best, such a perspective could be influenced by excitement or by the lack of a reliable testing method.
As language models advance every year, users would certainly expect enhanced performance. Going by perception alone, the claims above suggest a leap significantly larger than the shift from GPT-2 to GPT-3.
Meanwhile, one user remained sceptical, sparking further discussion on how GPT-4 could render work done on GPT-3 obsolete.
OpenAI founder Sam Altman himself tweeted:
From Scoble’s claim to the company’s CEO talking about the Turing test, which deals with the question of whether machines can think, things have taken an interesting turn.
Further, the Turing test carries historical relevance as a marker of the limits of machine intelligence. As researchers claim that no AI system can pass the test, an advanced system such as GPT-4 would surely put up a fight.
Deflating that excitement, however, the Turing test is generally regarded as obsolete. It is a test of deception, so an AI could pass it without possessing intelligence in any human sense.
Reddit user Igor Baikov posted that GPT-4 would be very sparse or very large, in contrast to the company’s history of building dense models. A direct comparison of its size with other popular models such as LaMDA, GPT-3, and PaLM would then be meaningless.
The possibility of GPT-4 being multimodal, accepting audio, text, image, and even video inputs, is also anticipated. Moreover, there is an assumption that audio datasets processed with OpenAI’s Whisper will be utilised to create the textual data needed to train GPT-4.
The major plot twist, however, is whether this entire article was written by GPT-4.