In a “staggering” revelation, Meta AI chief Yann LeCun confirmed that Meta has acquired $30 billion worth of NVIDIA GPUs to train its AI models. That is enough to run a small nation, or even to put a man on the moon in 1969.
Speaking at the Forging the Future of Business with AI Summit organised by Imagination in Action, LeCun said that more variations of Llama-3 would be out over the next few months, with training and fine-tuning currently taking place.
“Despite all the computers we have on our hands, it still takes a lot of time to fine-tune, but a bunch of variations on those models are going to come out over the next few months,” he said.
Speaking of fine-tuning and training, host John Werner stated that Meta had bought an additional 500,000 GPUs from NVIDIA, taking the total number of NVIDIA GPUs up to a million, with a retail value of $30 billion.
Adding up what Meta has spent on GPUs so far, Werner pointed out that the bill exceeds the cost of the entire Apollo space programme, which came to about $25.4 billion back in the 1960s.
Agreeing, LeCun said, “Yeah, it’s staggering, isn’t it? A lot of it, not just training, but deployment, is limited by computational abilities. One of the issues that we’re facing is the supply of GPUs and the cost of them at the moment.”
Of course, adjusted for inflation, the Apollo programme still dwarfs Meta’s spending, coming to roughly $257 billion in today’s money. But it’s no secret that GPUs are an ever-growing expense for AI companies.
Recently, OpenAI’s Sam Altman said that he doesn’t care if the company spends upwards of $50 billion a year on developing AGI. As of March, the company employs as many as 720,000 NVIDIA H100 GPUs for Sora alone, which, at roughly $30,000 apiece, amounts to about $21.6 billion.
Similarly, other big tech companies are hoping to expand their GPU fleets by the end of the year, or by 2025.
Microsoft is aiming for 1.8 million GPUs by the end of the year, while OpenAI hopes to use 10 million GPUs for its latest AI model.
In the meantime, NVIDIA has been churning out hardware as well, with its latest DGX H200 system hand-delivered to Altman by CEO Jensen Huang.
Coming back to LeCun, he pointed out that the need of the hour was the ability to scale up learning algorithms so they could be parallelised across many GPUs. “Progress on this has been kind of slow in the community, so I think we’re kind of waiting for breakthroughs there,” he said.
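For readers wondering what “parallelised across many GPUs” looks like in practice, the most common approach today is data parallelism: every GPU holds a full copy of the model, each processes a different slice of the data, and gradients are averaged across devices after every backward pass. The sketch below is a minimal illustration using PyTorch’s DistributedDataParallel with a placeholder model and random data; it shows the general technique, not Meta’s actual training stack.

```python
# A minimal sketch of data-parallel training across several GPUs, using
# PyTorch's DistributedDataParallel. The model, data, and hyperparameters
# are placeholders for illustration, not anything Meta has described.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun starts one process per GPU and sets these environment variables.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each process holds a full copy of the (placeholder) model on its own GPU.
    model = torch.nn.Linear(1024, 1024).to(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):
        # Each GPU trains on a different slice of data (random here).
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).pow(2).mean()
        loss.backward()  # DDP averages gradients across all GPUs here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Such a script would typically be launched with something like `torchrun --nproc_per_node=8 train.py`, one process per GPU on the machine; the open research problem LeCun alludes to is making this kind of scaling efficient across ever-larger clusters.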
If such breakthroughs arrive, costs could fall for AI companies, though with models being scaled up ever faster, demand for GPUs may well remain just as high.