The great GPU crisis that Tesla chief Elon Musk warned the tech industry about is upon us. In April this year, Musk tweeted, “It seems like everyone and their dog is buying GPUs at this point,” pointing at the surging demand that would eventually lead to a shortage. Cut to the present: everyone wants to build AI products and companies. It’s an AI deluge of such magnitude that even a company like NVIDIA is struggling to build and supply enough GPUs.
The demand for high-performance GPUs, especially the NVIDIA H100, has skyrocketed. As of August 2023, the tech industry is grappling with a shortage of the highly sought-after chip, and the scarcity is significantly impacting AI companies that rely on it for model training and inference.
Gossip of the Valley
Andrej Karpathy from OpenAI said, “‘Who’s getting how many H100s and when’ is the top gossip of the Valley right now.” Interestingly, Stephen Balaban, CEO of Lambda Labs, said, “Lambda has a few thousand more H100s coming online before the end of this year — if you need 64 H100s or more, DM me.” The situation is that dire at the moment.
Various AI leaders, including Quora CEO Adam D’Angelo and OpenAI’s Sam Altman, have voiced concerns about the GPU shortage. OpenAI has revealed that the limited GPU supply is hindering its short-term plans, including model fine-tuning and dedicated capacity, which is possibly one of the reasons the company is still stuck on GPT-4 and unable to fulfil the promises it made for its LLMs.
It is not just AI companies; several categories of organisations have significant demand for H100s. These include startups building LLMs, cloud service providers (CSPs) like Azure, GCP, and AWS, larger private clouds like CoreWeave and Lambda, and other prominent companies like Musk’s Tesla. As for Musk, he had already bought thousands of NVIDIA GPUs before anyone else, reserving them for xAI. Possibly, everyone is now bidding against that stockpile, and it may even be how he convinced Altman on the ai.com domain deal, in exchange for GPUs.
According to reports, GPT-4 was probably trained on around 10,000 to 25,000 of NVIDIA’s A100s. For GPT-5, Musk suggested it might require 30,000 to 50,000 H100s, while in February 2023 Morgan Stanley predicted GPT-5 would use 25,000 GPUs. With so many GPUs required and NVIDIA the only reliable supplier in the market, only the chipmaker can make the situation better.
Who Needs How Much
According to a recent blog, OpenAI is estimated to require around 50,000 H100 GPUs, while Inflection AI is looking for approximately 22,000 units. The requirements for Meta are uncertain, but it is rumoured that they may need around 25,000 GPUs, possibly even exceeding 100,000 units.
The major cloud service providers, including Azure, Google Cloud, AWS, and Oracle, may each seek around 30,000 GPUs. Private clouds like Lambda and CoreWeave are also expected to demand a total of 100,000 GPUs. Other AI-focused companies such as Anthropic, Helsing, Mistral, and Character might individually need around 10,000 units.
It’s important to note that these figures are approximate estimates, and some overlap may occur between cloud providers and their end customers. Even so, the total demand for H100s could be around 432,000 units. At an estimated price of $35,000 per GPU, that translates to a staggering $15 billion worth of GPUs, all of which goes to NVIDIA.
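As a rough sanity check, the estimates above can be tallied in a few lines of Python. This is only a back-of-envelope sketch: the per-company figures are the approximations quoted from the blog, and the 432,000-unit overall estimate also counts buyers not itemised here, which is why the itemised subtotal comes in lower.

```python
# Rough tally of the H100 demand estimates quoted above.
# All figures are approximations from the cited blog; the blog's
# 432,000 total also includes buyers not itemised in this article.
estimates = {
    "OpenAI": 50_000,
    "Inflection AI": 22_000,
    "Meta (rumoured)": 25_000,
    "Big CSPs (Azure, GCP, AWS, Oracle, ~30k each)": 4 * 30_000,
    "Private clouds (Lambda, CoreWeave, etc.)": 100_000,
    "Other AI startups (~10k each, four named)": 4 * 10_000,
}

itemised_subtotal = sum(estimates.values())
print(f"Itemised subtotal: {itemised_subtotal:,} GPUs")  # 357,000 GPUs

# The blog's overall figure, including unlisted buyers:
overall_demand = 432_000
price_per_gpu = 35_000  # estimated USD per H100
revenue = overall_demand * price_per_gpu
print(f"Revenue at $35k/GPU: ${revenue / 1e9:.2f} billion")  # $15.12 billion
```

The gap between the 357,000 itemised subtotal and the 432,000 headline figure reflects demand from buyers the blog counts but this article does not list individually.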
Additionally, this estimate does not include Chinese companies like ByteDance (TikTok), Baidu, and Tencent, which are likely to have substantial demand for the H800, the variant designed specifically for the Chinese market.
While the future outlook remains uncertain, the industry is hopeful that increased supply and advances in GPU technology will eventually ease the shortage. NVIDIA, for example, has been talking about releasing A800s that could deliver comparable compute for building AI models, though that remains questionable. Until then, AI companies must navigate this challenging period by exploring alternative GPU options and partnerships to continue their crucial work in artificial intelligence.
GPU scarcity is the moat
To make things worse, industry experts fear that the current scarcity may set off a self-reinforcing cycle in which scarcity itself becomes a moat, prompting further GPU hoarding and exacerbating the shortage. That is probably why Musk hoarded GPUs in the first place. The next-generation successor to the H100 is not expected until late 2024, adding to the concerns.
Acquiring H100s has emerged as a significant concern for AI companies, hindering their operations and causing delays in product rollouts and model training. The AI boom’s unprecedented demand for computational power has exacerbated the situation, leading to a scarcity of essential components used in GPU manufacturing.
NVIDIA has been backing almost every AI startup in the world, seemingly funding them so that they could get off the ground and buy its GPUs. Now that it has established a GPU monopoly and made the rest of the market dependent on it, the onus is on the chip giant to fulfil the market’s demand.
But creating GPUs involves a complex manufacturing process with many critical components. Memory, interconnect speed (such as InfiniBand), caches, and cache latencies all play vital roles in determining GPU performance, and a shortage of any one of these components can delay GPU production, contributing to the overall scarcity.