What can we expect from GPT-4?

GPT-4 will not have 100 trillion parameters.

Going by the release cycle of the GPT franchise, the launch of the fourth generation is imminent, if not overdue. Last year, Sam Altman, the CEO of OpenAI, spoke about the impending GPT-4 release in a Q&A session at the AC10 online meetup. The release is probably on tap for July-August this year. However, OpenAI has kept a tight lid on the release date, and no definitive information about it is available in the public domain. One thing is for sure, though: GPT-4 will not have 100 trillion parameters.

GPT-3, released in May 2020, has 175 billion parameters. The third generation in the GPT-n series uses deep learning to produce human-like text. On September 22, 2020, Microsoft licensed the exclusive use of GPT-3. Based on the available information and Sam Altman’s statements at the Q&A session, we have compiled a list of improvements to expect in GPT-4.

Size doesn’t matter

Large language models like GPT-3 have achieved outstanding results on new tasks with little or no updating of model parameters. Though GPT-4 will most likely be bigger than GPT-3 in terms of parameters, Sam Altman has clarified that size won’t be the differentiator for the next generation of OpenAI’s autoregressive language model. The parameter count is likely to fall between those of GPT-3 and Gopher, that is, between 175 billion and 280 billion.

NVIDIA and Microsoft’s love-child Megatron-Turing NLG held the title of the largest dense neural network at 530 billion parameters (roughly 3x GPT-3) until Google’s PaLM (540 billion parameters) took the cake. Interestingly, smaller models such as Gopher (280 billion parameters) and Chinchilla (70 billion parameters) have outperformed MT-NLG across several benchmarks.

In 2020, OpenAI’s Jared Kaplan and his team claimed that performance improved with the number of parameters. The PaLM model showed that performance improvements from scale have not yet plateaued. However, Sam Altman has hinted that OpenAI is taking a different approach. He said OpenAI would no longer focus on making extremely large models but rather on getting the most out of smaller models. The AI research lab will look at other aspects, such as data, algorithms, parameterisation, or alignment, to bring significant improvements.

GPT-4 – a text-only model

Multimodal models are the deep learning models of the future. Because we live in a multimodal world, our brains are multisensory. Perceiving the world in only one mode at a time severely limits AI’s ability to navigate and comprehend it. Making GPT-4 a text-only model could be an attempt to push language models to their limits, adjusting parameters like model and dataset size before moving on to the next generation of multimodal AI.


Sparse models, which use conditional computation so that different parts of the model process different inputs, have been successful. Such models scale easily beyond the 1 trillion parameter mark without incurring high computing costs. However, the benefits of mixture-of-experts (MoE) approaches taper off on very large models. GPT-4, like GPT-2 and GPT-3, will be a dense model. In other words, all parameters will be used to process any given input.
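To make the dense-versus-sparse distinction concrete, here is a minimal sketch that counts how many parameters take part in processing a single token. The function names and all the MoE sizes (shared parameters, parameters per expert, number of experts, experts chosen per token) are hypothetical figures picked for illustration and do not describe any real model.

```python
# Illustrative only: every MoE size used below is a made-up assumption.

def dense_param_counts(total_params: float) -> tuple[float, float]:
    """Return (total, active per token) for a dense model: all parameters are used."""
    return total_params, total_params


def moe_param_counts(shared_params: float,
                     params_per_expert: float,
                     num_experts: int,
                     experts_per_token: int) -> tuple[float, float]:
    """Return (total, active per token) for a simple mixture-of-experts model.

    Shared parameters are always used; only `experts_per_token` of the
    `num_experts` expert blocks are used for any given token.
    """
    total = shared_params + params_per_expert * num_experts
    active = shared_params + params_per_expert * experts_per_token
    return total, active


if __name__ == "__main__":
    dense_total, dense_active = dense_param_counts(175e9)  # GPT-3-sized dense model
    moe_total, moe_active = moe_param_counts(
        shared_params=10e9,        # hypothetical
        params_per_expert=15.5e9,  # hypothetical
        num_experts=64,            # hypothetical
        experts_per_token=2,       # hypothetical
    )
    print(f"dense:  total={dense_total:.3e}  active per token={dense_active:.3e}")
    print(f"sparse: total={moe_total:.3e}  active per token={moe_active:.3e}")
```

With these made-up sizes, the sparse model holds just over a trillion parameters in total but touches only a small fraction of them per token, whereas the dense model uses every parameter for every input.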


Assuming that GPT-4 could be larger than GPT-3, the number of training tokens required to be compute-optimal (according to DeepMind’s findings) could be around 5 trillion, an order of magnitude greater than current datasets. The number of FLOPs required to train the model to achieve minimal training loss would be 10–20x that of GPT-3. In the Q&A, Altman said GPT-4 would require more compute than GPT-3. OpenAI will focus on optimising other variables rather than scaling the model.
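The arithmetic behind such estimates can be sketched in a few lines. This is a rough back-of-the-envelope calculation, not OpenAI’s or DeepMind’s actual method. It assumes two widely used rules of thumb that the article does not state: about 20 training tokens per parameter for compute-optimal training, and training compute of roughly 6 x parameters x tokens. The GPT-3 figure of 300 billion training tokens is also not from the article, and the candidate model sizes are hypothetical.

```python
# Back-of-the-envelope sketch; the two constants below are assumed rules of thumb.
TOKENS_PER_PARAM = 20        # assumption: compute-optimal tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6    # assumption: training FLOPs ~ 6 * params * tokens

GPT3_PARAMS = 175e9          # parameter count given in the article
GPT3_TOKENS = 300e9          # GPT-3 training tokens (figure not from the article)


def compute_optimal_tokens(params: float) -> float:
    """Training tokens suggested by the assumed tokens-per-parameter ratio."""
    return TOKENS_PER_PARAM * params


def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute under the assumed 6 * N * D rule."""
    return FLOPS_PER_PARAM_TOKEN * params * tokens


def compare_to_gpt3(params: float) -> tuple[float, float, float]:
    """Return (tokens, FLOPs, FLOPs as a multiple of GPT-3) for a model size."""
    tokens = compute_optimal_tokens(params)
    flops = training_flops(params, tokens)
    ratio = flops / training_flops(GPT3_PARAMS, GPT3_TOKENS)
    return tokens, flops, ratio


if __name__ == "__main__":
    # Hypothetical model sizes within the 175-280 billion range discussed above.
    for params in (175e9, 250e9, 280e9):
        tokens, flops, ratio = compare_to_gpt3(params)
        print(f"{params:.3e} params: {tokens:.2e} tokens, "
              f"{flops:.2e} FLOPs, {ratio:.1f}x GPT-3")
```

Under these assumptions, a 250-billion-parameter model would call for 5 trillion tokens, and a model of GPT-3’s size trained compute-optimally would already need roughly 12 times GPT-3’s training compute. The multiple grows with model size, so the exact figure depends heavily on the assumed parameter count and constants.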

In alignment

OpenAI’s north star is a beneficial AGI. OpenAI is likely to build on the learnings from its InstructGPT models, which are trained with humans in the loop. InstructGPT was deployed as the default language model on OpenAI’s API. Using techniques developed through OpenAI’s alignment research, it is much better at following user intentions than GPT-3, while also being more truthful and less toxic. However, that alignment reflected only the preferences of OpenAI employees and English-speaking labellers. GPT-4 is likely to be more aligned with humans than GPT-3.

Sri Krishna
Sri Krishna is a technology enthusiast with a professional background in journalism. He believes in writing on subjects that evoke a thought process towards a better world. When not writing, he indulges his passion for automobiles and poetry.
