NVIDIA has announced that its AI computing platform has once again smashed performance records in the latest round of MLPerf, extending the company’s lead on the industry’s only independent benchmark measuring the AI performance of hardware, software and services.
NVIDIA won every test across all six application areas for data centre and edge computing systems in the second version of MLPerf Inference. The tests expand beyond the original two for computer vision to include four covering the fastest-growing areas in AI: recommendation systems, natural language understanding, speech recognition and medical imaging.
Organisations across a wide range of industries are already tapping into the NVIDIA A100 Tensor Core GPU’s exceptional inference performance to take AI from their research groups into daily operations. Financial institutions are using conversational AI to answer customer questions faster, and retailers are using AI to keep shelves stocked. Healthcare providers, meanwhile, are using AI to analyse millions of medical images to identify disease more accurately and help save lives.
Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA, said, “We’re at a tipping point as every industry seeks better ways to apply AI to offer new services and grow their business.”
“The work we’ve done to achieve these results on MLPerf gives companies a new level of AI performance to improve our everyday lives,” added Buck.
The latest MLPerf results come as NVIDIA’s footprint for AI inference has grown dramatically. Five years ago, only a handful of leading high-tech companies used GPUs for inference. Now, with NVIDIA’s AI platform available through every major cloud and data centre infrastructure provider, companies representing a wide array of industries are using its AI inference platform to improve their business operations and offer additional services.
Additionally, for the first time, NVIDIA GPUs offer more AI inference capacity in the public cloud than CPUs. Total cloud AI inference compute capacity on NVIDIA GPUs has been growing roughly 10x every two years.
NVIDIA Takes AI Inference to New Heights
NVIDIA A100, introduced earlier this year and featuring third-generation Tensor Cores and Multi-Instance GPU technology, increased its lead on the ResNet-50 test, beating CPUs by 30x versus 6x in the last round. Additionally, A100 outperformed the latest CPUs by up to 237x in the newly added recommender test for data centre inference, according to the MLPerf Inference 0.7 benchmarks.
This means a single NVIDIA DGX A100 system can provide the same performance as about 1,000 dual-socket CPU servers, offering customers extreme cost-efficiency when taking their AI recommender models from research to production.
The benchmarks also show that the NVIDIA T4 Tensor Core GPU continues to be a solid inference platform for mainstream enterprise, edge servers and cost-effective cloud instances. NVIDIA T4 GPUs beat CPUs by up to 28x in the same tests. In addition, the NVIDIA Jetson AGX Xavier is the performance leader among SoC-based edge devices.
Achieving these results required a highly optimised software stack, including the NVIDIA TensorRT inference optimiser and NVIDIA Triton inference serving software, both available on NGC, NVIDIA’s software catalogue.
In addition to NVIDIA’s own submissions, 11 NVIDIA partners submitted a total of 1,029 results using NVIDIA GPUs, representing over 85% of the total submissions in the data centre and edge categories.