Chip War: NVIDIA and Intel Accelerator Face Off

NVIDIA’s AI inference platforms might give Intel a run for its money.

NVIDIA recently concluded its GPU Technology Conference (GTC), and this year’s was a big one. Along with launching rentable AI supercomputers in the form of DGX Cloud, the company announced a new lineup of accelerator chips: the NVIDIA H100 NVL, L4, L40, and Grace Hopper.

These chips are clearly targeted at the AI boom, with each model aimed at a distinct inferencing workload. The most significant among them are the NVIDIA H100 NVL for LLM deployment and the NVIDIA Grace Hopper for recommendation models. NVIDIA also announced the launch of cuLitho, a set of libraries for GPU-accelerated computational lithography.

With this new lineup of chips, it seems that NVIDIA is moving further away from making GPGPUs (general-purpose GPUs), capitalising on the AI wave to realise its diversification ambitions. While this move is sure to expand NVIDIA’s Total Addressable Market (TAM), it also means the company must take on a behemoth to do so: Intel.

Intel has established something of a foothold in the specialised compute market after its acquisitions of Habana and Tower Semiconductor. Will NVIDIA compete directly with Intel, or can the two coexist peacefully?

Gains from specialisation

As mentioned previously, the four new accelerators are built from the ground up for certain workloads, with each chip occupying a niche in AI compute. They are also built specifically to work well with NVIDIA’s inference software stack, which includes offerings such as ‘TensorRT’ and ‘Triton’ to parallelise large inference workloads.
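Triton, for context, serves models over the KServe v2 inference protocol, accepting JSON requests at `POST /v2/models/<model>/infer`. A minimal sketch of building such a request body; the input name and data here are hypothetical, and in practice you would POST this to a running Triton server or use the `tritonclient` package:

```python
import json

def make_infer_request(input_name: str, data: list) -> str:
    """Build a KServe-v2-style inference request body, the JSON
    format Triton accepts at POST /v2/models/<model>/infer."""
    body = {
        "inputs": [{
            "name": input_name,          # hypothetical input tensor name
            "shape": [1, len(data)],     # batch of one
            "datatype": "FP32",
            "data": data,
        }]
    }
    return json.dumps(body)

# Hypothetical model input for illustration only.
payload = make_infer_request("input__0", [0.1, 0.2, 0.3])
print(payload)
```

The same schema is used whether the backend is a TensorRT engine, ONNX Runtime, or a Python model, which is part of what makes Triton a convenient front door to NVIDIA's inference hardware.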

NVIDIA already has a considerable presence in the AI compute middleware market, with CUDA becoming the de facto way to interface closely with its omnipresent GPUs. Specialising its new hardware to work well with this software stack amplifies the impact of silicon-level optimisation, thus improving performance and efficiency.

These optimisations also work to specialise the chip for certain workloads over others. For instance, the L4 is specialised for AI video; NVIDIA claims it delivers up to 120 times the performance of CPUs with 99% better energy efficiency. However, this applies to a very specific workload: AI-powered video.

The L4 is an accelerator chip for enhanced video decoding, streaming, and generating AI video, likely developed to power the NVIDIA–Adobe partnership. The L40 provides the same functionality for graphics and AI image generation in 2D and 3D. Reportedly, this chip performs 7x better than the previous generation in Stable Diffusion.

The H100 NVL is essentially two H100 GPUs bridged together, with an emphasis on memory capacity: about 188GB of HBM3 (high-bandwidth memory) VRAM. Current SOTA LLMs require huge amounts of VRAM to hold their weights, a requirement NVIDIA tackles with this product. The last chip in the lineup, Grace Hopper, combines CPU (Grace) and GPU (Hopper) technology to process giant datasets for recommender systems and LLMs.
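To see why 188GB matters, here is a back-of-the-envelope sketch of the VRAM needed just to hold an LLM’s weights for inference. The parameter count and overhead factor are illustrative assumptions, not NVIDIA figures:

```python
def inference_vram_gb(params_billion: float, bytes_per_param: int = 2,
                      overhead: float = 1.2) -> float:
    """Rough VRAM estimate for serving a model.

    bytes_per_param=2 assumes FP16/BF16 weights; `overhead` adds
    headroom for activations and the KV cache (an assumed factor).
    """
    return params_billion * 1e9 * bytes_per_param * overhead / 1e9

# A hypothetical 70-billion-parameter model in FP16:
print(f"{inference_vram_gb(70):.0f} GB")  # -> 168 GB
```

By this rough arithmetic, such a model would overflow a single 80GB H100 but fit comfortably within the H100 NVL’s 188GB, which is the niche this product targets.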

These kinds of improvements are possible only by creating a specific silicon die and layout for the purpose. Down to the IC level, the silicon is manufactured in such a way that the flow of data between various components is optimised for the target workload. An inference platform is a leaner version of a general-purpose GPU, pointed specifically at a certain workload to beat the competition.

This approach should alarm decision-makers at Intel, as it seems that NVIDIA is coming after the company, at least in the field of AI compute. Currently, Intel’s Xeon and Habana lineups are aimed at serving this market and, while NVIDIA’s platforms haven’t directly challenged them yet, it seems they just might.

The underbelly of Intel

While many know Intel for its ‘Core’ series of chips, it’s no secret that a majority of its innovations and profits come from the enterprise sector. Intel is trying to ride the AI wave to revitalise its enterprise sales and expand the size of its market. The company’s most obvious move into AI compute was the launch of its ‘Xe Max’ GPU, which was “purpose-built for AI”.

Other offerings include AI hardware acceleration in Intel Xeon Scalable CPUs and Movidius VPUs for processing computer vision workloads. However, through its acquisition of Israeli semiconductor company ‘Habana’, Intel has gained something of a foothold in the AI cloud compute market.

While the company’s spokespersons agreed that it would “take time to gain market share from NVIDIA in cloud and data centre computing”, that has not stopped them from providing various offerings for deep learning compute. From the Gaudi platform for training workloads to the Greco platform for inference, Intel has provided a few alternatives for those who do not wish to use GPGPUs for AI compute. However, that is all set to change with NVIDIA’s new lineup of specialised chips for narrow use-cases.

Intel might, however, have an answer to NVIDIA’s incursion. Released after years of delays, Sapphire Rapids is a lineup of Xeon CPUs built with AI workloads in mind. Intel claims that silicon-level optimisations, along with support for Advanced Matrix Extensions (AMX), make these chips good at AI workloads.

Intel claims ‘10x higher PyTorch performance for real-time inference’ with Sapphire Rapids, but Grace Hopper is coming for Xeon’s lunch. NVIDIA claims the Grace CPU delivers up to 2.5x the performance at 3.5x the energy efficiency of its competitors, though it is, as yet, untested against Xeon’s ‘hybrid’ approach to AI compute.

By providing specialised compute platforms for each AI workload, NVIDIA has undermined Intel’s market strategy, provided a better alternative, and lined up an extensive network of partners ready to adopt it. In addition, NVIDIA’s cuLitho has been offered to ASML, TSMC, and Synopsys, but not Intel, further diminishing the value proposition of Intel Foundry Services.

While the AI market is a huge prize, Intel’s fledgling efforts seem to be going up against insurmountable odds. From NVIDIA’s dominating presence in the cloud to its experience in developing custom silicon to the optimisations that cuLitho can bring, it seems that Intel’s window in AI compute is slowly closing.



Anirudh VK
