Hype can be a dangerous thing. Too much of it can tank your shares, kill your product launch and turn the excitement on its head. The wave of excitement around generative AI that OpenAI is riding has effectively become most of the world’s introduction to LLMs, and it has turned everyone’s eyes on Sam Altman’s OpenAI.
When Altman first confirmed that OpenAI was in fact building the successor to its benchmark model GPT-3, the AI community was excited. GPT-3 was a state-of-the-art language model with 175 billion parameters, which held the record for the largest-ever AI model at the time. Since its release in 2020, speculation about GPT-4 has been rife. Would it be bigger, faster, smarter? Would it be free from incorrect responses? It looked like nothing short of a perfect product could satisfy.
Release of GPT-4
Even after yesterday’s announcement, Altman was quick to admit how far from perfect GPT-4 was. ‘It is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it,’ he tweeted, adding, ‘we really appreciate feedback on its shortcomings.’
In a January interview with StrictlyVC, Altman admitted that he hadn’t expected the reaction that followed ChatGPT’s release. ‘I can see why DALL-E surprised people but was genuinely confused why ChatGPT did. We put GPT-3 out almost three years ago, put it into an API and the incremental update from that to ChatGPT should have been predictable and want to do more introspection on why I was miscalibrated on that,’ he added.
ChatGPT runs on GPT-3.5. OpenAI had seemingly planned for this model to go slightly under the radar, since it was meant as a precursor to GPT-4. But because ChatGPT brought the model to a mass audience, the average person using GPT-4 may find it not that different from GPT-3.5 (ChatGPT).
Altman also said that, when OpenAI prepared to release ChatGPT to the world, he had expected less hype and fewer users for GPT-4 than turned out to be the case. ‘Less hype is probably better as a general rule. One of the strange things about these technologies is they are impressive but not robust. Use them in a demo you think “good to go” but use them longer term and you see the weaknesses. But it is going to get better,’ he said.
Altman was well aware of how fallible LLMs are in practice. They hallucinate a lot, and so does GPT-4, though less than its predecessors.
OpenAI had also clearly learned its lessons from Google’s Bard launch. In a bid to get ahead in the race, Sundar Pichai announced a rival chatbot that was expected to be better than ChatGPT. Unfortunately, despite the message that it was smarter than OpenAI’s product (it would be connected to the internet), the demo was a flop: the demo video showed one of Bard’s hallucinated responses to the world.
Besides, OpenAI already had a large base of investors, media and users on ChatGPT who would predictably be waiting for GPT-4. By January, the chatbot had set a record for the fastest-growing user base of any platform: it was estimated to have amassed 100 million monthly active users in two months. Since GPT-4 is available through ChatGPT Plus, OpenAI has a potential user base of 100 million, and it is hard to meet the expectations of that many people.
Still much better than GPT-3.5
This explains why OpenAI’s blog carefully described the distinction between GPT-3.5 and GPT4 during its launch saying it won’t be noticeable during a ‘casual conversation.’ It stated, ‘The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.’
But for anyone in the know, GPT-4 is a major improvement over GPT-3.5. According to OpenAI, GPT-4 is 82% less likely to respond to requests for content that OpenAI does not allow and 60% less likely to hallucinate.
It also outperforms ChatGPT on human tests by a mile. On the Uniform Bar Exam, GPT-4 ranks in the 90th percentile while ChatGPT ranks in the 10th. In the Biology Olympiad, GPT-4 ranks in the 99th percentile and ChatGPT in the 31st.
Oren Etzioni, CEO and founder of the Allen Institute for AI, called the model a benchmark, and rightly so. ‘The continued improvements along many dimensions are remarkable. GPT-4 is now the standard by which all foundation models will be evaluated,’ he stated.
GPT-4 may have caused some disappointments: the multimodal feature is still being researched, OpenAI’s paper on the model reveals practically nothing, and it does still hallucinate. But the fact remains that there is no AI model better than GPT-4. It only asks users to drop preconceived notions and keep an open mind.
Hype reaches fever pitch
Since the launch of ChatGPT in November last year, things have looked vastly different. OpenAI has become a hot property and Microsoft is pouring billions into it. The chatbot’s popularity among the general public was enough to make Google worry about its search business. And everyone was looking to GPT-4.
Altman, however, has done a lot to temper expectations. During an interview with StrictlyVC in January, he said that ‘people are begging to be disappointed and they will be.’ He was dismissive of rumours that the model would be bigger than 100 trillion parameters. ‘The GPT-4 rumour mill is a ridiculous thing, I don’t know where it all comes from. It has been going for six months at this volume,’ he said.