
Chip War: NVIDIA and Intel Accelerators Face Off

NVIDIA’s AI inference platforms might give Intel a run for its money.

NVIDIA recently concluded its GPU Technology Conference (GTC), and this year’s edition was a big one. Along with announcing rentable AI supercomputers in the form of DGX Cloud, the company unveiled a new lineup of accelerator chips: the NVIDIA H100 NVL, L4, L40, and Grace Hopper.

These chips are clearly targeted at the AI boom, with each model aimed at a specific inferencing workload. The most significant among them are the NVIDIA H100 NVL for LLM deployment and the NVIDIA Grace Hopper for recommendation models. NVIDIA also announced cuLitho, a software library for GPU-accelerated computational lithography.

With this new lineup of chips, it seems that NVIDIA is moving further away from making GPGPUs (general-purpose GPUs), capitalising on the AI wave to realise its diversification ambitions. While this move is sure to expand NVIDIA’s Total Addressable Market (TAM), it also means the company must take on a behemoth to do so: Intel.

Intel has established somewhat of a foothold in the market for specialised compute after acquiring Habana Labs and announcing its acquisition of Tower Semiconductor. Will NVIDIA compete directly with Intel, or can the two coexist peacefully?

Gains from specialisation

As mentioned previously, the four new accelerators are built from the ground up for specific workloads, with each chip occupying a niche in AI compute. They are also designed to work closely with NVIDIA’s stack of inference software, which includes offerings such as TensorRT for optimising models and Triton Inference Server for serving and scaling large inference workloads.
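
For illustration, here is a minimal sketch of what deploying a model on that stack can look like with TensorRT’s Python API. This assumes a TensorRT 8.x installation, and ‘model.onnx’ is simply a placeholder for an exported model rather than anything NVIDIA ships:

```python
import tensorrt as trt

# Minimal sketch: build an FP16 TensorRT engine from an ONNX model.
# Assumes the TensorRT 8.x Python package; "model.onnx" is a placeholder.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # half precision to use Tensor Cores

# Serialized engine that Triton Inference Server (or a custom runtime) can serve.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
```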

NVIDIA already has a considerable presence in the AI compute middleware market, with CUDA becoming the de facto way to interface closely with its omnipresent GPUs. Specialising its new hardware to work well with this software stack amplifies the impact of silicon-level optimisation, improving both performance and efficiency.
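
To give a sense of how close to the metal that interface is, the sketch below compiles and launches a raw CUDA kernel from Python via CuPy. It assumes CuPy and a CUDA-capable GPU are available, and the vector-add kernel is purely illustrative:

```python
import cupy as cp

# Illustrative only: a raw CUDA kernel compiled and launched from Python with CuPy.
vec_add = cp.RawKernel(r'''
extern "C" __global__
void vec_add(const float* x, const float* y, float* out, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) out[i] = x[i] + y[i];
}
''', 'vec_add')

n = 1 << 20
x = cp.random.rand(n, dtype=cp.float32)
y = cp.random.rand(n, dtype=cp.float32)
out = cp.empty_like(x)

threads = 256
blocks = (n + threads - 1) // threads
vec_add((blocks,), (threads,), (x, y, out, cp.int32(n)))  # grid dims, block dims, args
```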

These optimisations also specialise each chip for certain workloads over others. For instance, NVIDIA claims the L4 delivers up to 120 times the AI video performance of CPUs while being 99% more energy efficient. However, that figure applies to a very specific workload: AI-powered video.

The L4 is an accelerator chip for enhanced video decoding, streaming, and AI video generation, likely developed to power the NVIDIA–Adobe partnership. The L40 provides the same functionality for graphics and AI image generation in 2D and 3D; reportedly, it performs 7x better than the previous generation on Stable Diffusion.

The H100 NVL is essentially two H100 GPUs bridged together, with an emphasis on memory capacity: about 188GB of HBM3 (high-bandwidth memory) in total. Current state-of-the-art LLMs need enormous amounts of VRAM just to hold their weights during inference, a requirement NVIDIA is addressing with this product. The last chip in the lineup, Grace Hopper, combines CPU (Grace) and GPU (Hopper) technology to process the giant datasets behind recommender systems and LLMs.
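
A rough back-of-envelope calculation shows why that capacity matters. This is a sketch only, assuming FP16 weights and ignoring the KV cache and activations, with illustrative model sizes:

```python
# Back-of-envelope: memory needed just to hold LLM weights at inference time.
# Assumes FP16 (2 bytes per parameter); real deployments also need KV cache and activations.
def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    return params_billion * 1e9 * bytes_per_param / 1e9

for params in (7, 70, 175):  # illustrative model sizes, in billions of parameters
    print(f"{params}B parameters -> ~{weight_memory_gb(params):.0f} GB of weights")

# A 175B-parameter model needs ~350 GB for weights alone, far more than a single
# 80 GB H100 offers, which is the gap high-memory parts like the 188 GB H100 NVL target.
```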

These kinds of improvements are only possible by designing a specific silicon die and layout for the purpose. Down to the IC level, the chip is laid out so that the flow of data between components is optimised for its target workload, such as AI-powered video in the L4’s case. An inference platform is, in effect, a leaner version of a general-purpose GPU, pointed at a specific workload to beat the competition.

This approach should alarm decision-makers at Intel, as it seems that NVIDIA is coming after the company, at least in AI compute. Currently, Intel’s Xeon and Habana lineups are aimed at serving this market and, while NVIDIA’s new platforms haven’t directly challenged them yet, it seems they just might.

The underbelly of Intel

While many know Intel for its ‘Core’ series of chips, it’s no secret that the majority of its innovation and profit comes from the enterprise sector. Here, Intel is trying to ride the AI wave to revitalise its enterprise sales and expand its market. Its most visible move into AI compute was the launch of the ‘Xe Max’ GPU, which was “purpose-built for AI”.

Other offerings include AI hardware acceleration in Intel Xeon Scalable CPUs and Movidius VPUs for computer vision workloads. More importantly, through its acquisition of the Israeli semiconductor company Habana, Intel has gained somewhat of a foothold in the AI cloud compute market.

While the company’s spokespersons admitted that it would “take time to gain market share from NVIDIA in cloud and data centre computing”, that has not stopped Intel from offering a range of products for deep learning compute. From the Gaudi platform for training workloads to the Greco platform for inference, Intel provides alternatives for those who do not wish to use GPGPUs for AI compute. However, that is all set to change with NVIDIA’s new inference lineup, which provides specialised chips for narrow use cases.

Intel might, however, have an answer to NVIDIA’s incursion. Released after years of delays, Sapphire Rapids is a lineup of Xeon CPUs built with AI workloads in mind. Intel claims that optimisations to the silicon, along with support for Advanced Matrix Extensions (AMX), make these chips well suited to AI workloads.
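
As a rough illustration of what that looks like in practice, the sketch below runs bfloat16 inference through Intel Extension for PyTorch, whose oneDNN backend can dispatch matrix maths to AMX on supporting Xeons. The toy model is an assumption for illustration, not Intel’s published benchmark setup:

```python
import torch
import intel_extension_for_pytorch as ipex  # assumption: IPEX installed alongside PyTorch

# Toy model standing in for a real inference workload.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU()).eval()

# ipex.optimize applies CPU-side weight and graph optimisations; with bfloat16,
# oneDNN can route matmuls through AMX tiles on Sapphire Rapids (falls back elsewhere).
model = ipex.optimize(model, dtype=torch.bfloat16)

x = torch.randn(8, 1024)
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    y = model(x)
print(y.shape)  # torch.Size([8, 1024])
```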

Intel claims ‘10x higher PyTorch performance for real-time inference’ using Sapphire Rapids, but Grace Hopper is coming for Xeon’s lunch. NVIDIA claims the Grace CPU delivers up to 2.5x the performance of its competitors at 3.5x the energy efficiency, but it is as yet untested against Xeon’s ‘hybrid’ approach to AI compute.

By providing a specialised compute platform for each AI workload, NVIDIA has undermined Intel’s market strategy, provided a better alternative, and lined up an extensive network of partners ready to adopt it. In addition, NVIDIA’s cuLitho has been offered to ASML, TSMC and Synopsys but not Intel, further diminishing the value proposition of Intel Foundry Services.

While the AI market is an enormous opportunity, Intel’s fledgling efforts seem to be going up against insurmountable odds. From NVIDIA’s dominant presence in the cloud, to its experience in developing custom silicon, to the optimisations that cuLitho can bring, it seems that Intel’s window in AI compute is slowly closing.


PS: The story was written using a keyboard.

Anirudh VK

I am an AI enthusiast and love keeping up with the latest events in the space. I love video games and pizza.