It’s been exactly one year since Sam Altman posted a photo with his blue backpack and launched GPT-4. Cut to the present: “Patience jimmy. It would be worth the wait,” was all Altman said last week in response to a post on X asking OpenAI to release GPT-5. Nothing has happened since then.
After releasing ChatGPT, based on GPT-3.5, in November 2022, it took OpenAI five more months to release GPT-4. Now, people have been speculating about what the next version of OpenAI’s GPT will include and when it will drop.
Logan Kilpatrick, who recently left OpenAI, posted on X that many reporters have been reaching out to him, asking for the release date along with the model weights.
First and foremost, the most obvious prediction for GPT-5 is that it will be multimodal, supporting text, image, audio, and video from the ground up. This has been evident since the release of Sora. Mira Murati, the CTO of OpenAI, has said that Sora will be available for general use by the end of this year. Hopefully, it will be integrated into GPT-5 as well.
In an episode of Unconfuse Me with Bill Gates, Altman pointed out that OpenAI is on ‘this long, continuous curve’ to create newer and better models. He highlighted multimodality as the key aspect of GPT-5, enabling it to process video input and generate new videos, while confirming that work on the model has already begun.
Personalisation and Better Reasoning
Altman also spoke at length with Gates about how GPT-5 would emphasise customisation and personalisation. “The ability to know about you, your email, your calendar, how you like appointments booked, connected to other outside data sources—all of that. Those will be some of the most important areas of improvement,” said Altman.
Furthermore, he claimed that GPT-5 would have much better reasoning capabilities. “GPT-4 can reason in only extremely limited ways. Also, reliability is a concern. If you ask GPT-4 most questions 10,000 times, one of those 10,000 is probably pretty good, but it doesn’t always know which one. You’d like to get the best response of 10,000 each time,” said Altman.
“Coding is probably the only area for which we’re most excited about productivity gain today. It’s deployed massively and used at scale at this point,” said Altman. This could also be linked to the leaked Q* model, which was reported to have strong mathematical capabilities.
Smarter Than Ever Before
At the recent World Governments Summit in Dubai, Altman also said, “we’re going to make the model smarter, it’s going to be better at everything across the board,” describing how GPT-5 would be faster and smarter at every general task, rather than excelling only at individual ones.
GPT-5 might also have agentic capabilities and support several autonomous frameworks. This would help OpenAI build further on its recent collaboration with Figure, which gave ChatGPT a voice and a body, improving on-device capabilities for several edge use cases.
Increased Context Length
Given the current developments by its competitors, OpenAI should improve upon its AI models’ context length. Currently, GPT-4 has a maximum context length of 32k tokens, and GPT-4 Turbo has increased it to 128k. On the other hand, Claude 3 Opus, the strongest model released by Anthropic, offers a 200k context window, and Google’s Gemini 1.5 has a 1-million-token context window.
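To put these context windows in perspective, here is a rough back-of-the-envelope sketch of how much English text each one can hold. The conversion factors (about 0.75 words per token and about 500 words per page) are common rules of thumb, not official figures from any of the vendors:

```python
# Rough illustration: estimate how much English text fits in each
# model's context window. WORDS_PER_TOKEN and WORDS_PER_PAGE are
# rule-of-thumb assumptions, not vendor-published values.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

context_windows = {
    "GPT-4": 32_000,
    "GPT-4 Turbo": 128_000,
    "Claude 3 Opus": 200_000,
    "Gemini 1.5": 1_000_000,
}

for model, tokens in context_windows.items():
    words = tokens * WORDS_PER_TOKEN
    pages = words / WORDS_PER_PAGE
    print(f"{model}: ~{words:,.0f} words (~{pages:,.0f} pages)")
```

By this estimate, Gemini 1.5’s 1-million-token window holds on the order of 1,500 pages of text, roughly 30 times what GPT-4’s 32k window can fit.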
This is apart from the increase in the model’s parameter count. GPT-3 had 175 billion parameters, and the size of GPT-4 was speculated to be around the 1.7 trillion mark, though nothing was confirmed. People have jokingly speculated that GPT-5 will have around 69 quadrillion parameters.
Sets AGI Benchmark
Speaking of Claude 3, the different models from Anthropic have been outperforming GPT-4 on several benchmarks, including MMLU and HumanEval. The Claude 3 models also bring improved capabilities in multilingual use cases.
This is where GPT-5 is expected to take the biggest stride. Discussions on X, Reddit, and Hacker News extensively cover how OpenAI should be able to achieve near-perfect scores on several existing benchmarks. This would also include incorporating techniques like internal RAG (retrieval-augmented generation) and, crucially, taming the dreaded hallucinations.
This brings up the conversation about AGI and the much-discussed job displacement that comes along with AI. Altman said GPT-4.5 might automate 100 million jobs globally. This was misconstrued as AI replacing jobs, when he was actually talking about augmenting human intelligence. Possibly, GPT-5 might augment the intelligence of a billion users.