Last week, we here at AIM laid down our expectations for the highly anticipated NVIDIA GTC event. And it seems we hit the jackpot: going by the announcements NVIDIA chief Jensen Huang made at the event today, in the game of hits and misses, most were hits.
Here are some key takeaways from the event:
NVIDIA DGX Cloud
NVIDIA unveiled its new DGX Cloud – an AI supercomputing service that pairs NVIDIA DGX™ AI supercomputing infrastructure with NVIDIA AI software, giving enterprises immediate access to dedicated clusters for training advanced models in generative AI and other innovative applications.
DGX Cloud will initially be hosted on Oracle Cloud Infrastructure, with Microsoft Azure, Google Cloud and other platforms following shortly after. The service will run NVIDIA AI Enterprise 3.1, featuring access to new pretrained models, optimised frameworks, and accelerated data science software libraries.
NVIDIA Omniverse
NVIDIA announced three new systems designed to power the Omniverse platform. The first is a new generation of workstations powered by NVIDIA Ada-generation RTX GPUs and Intel’s latest CPUs. The RTX desktop and laptop GPUs are built to meet the demands of the modern era of AI, design, and the metaverse.
In addition, the company revealed the next generation of its OVX computing system, which is optimised to run the Omniverse platform. NVIDIA emphasised that every layer of the Omniverse stack, from chips and systems to networking and software, has been newly engineered.
Finally, NVIDIA also announced its plans to provide the NVIDIA Omniverse Cloud, a fully managed cloud service, to select enterprises through a partnership with Microsoft, which will host it on Azure. The Omniverse Cloud will be integrated with Microsoft 365 productivity suite, which includes Teams, OneDrive, SharePoint, and Azure IoT Digital Twins solutions.
New Inference Platforms
At GTC 2023, NVIDIA unveiled a new inference platform offering four configurations under a single architecture, each optimised for a different AI workload. The L4 configuration, for instance, is optimised for video decoding and transcoding, video call features, transcription, and real-time language translation. NVIDIA L4 will be hosted on Google Cloud.
For graphics rendering and generative AI such as text-to-image and text-to-video, NVIDIA introduced the L40 configuration. For large language model inference, the company announced a new Hopper GPU, the H100 NVL, which pairs two PCIe H100 GPUs over an NVLink bridge. Because it slots into standard PCIe servers, performance can easily be scaled out.
Lastly, NVIDIA introduced Grace Hopper, a new superchip that connects a Grace CPU and a Hopper GPU over a 900 GB/s NVLink-C2C interconnect. The Grace Hopper chip is said to be ideal for processing giant datasets and large language models, completing the stack of offerings in NVIDIA’s new inference platform.
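To put that interconnect figure in perspective, a quick back-of-the-envelope calculation (the model size below is a hypothetical, assumed value; 900 GB/s is the quoted link bandwidth):

```python
# Rough time to move a large model's weights across a 900 GB/s CPU-GPU link.
LINK_BANDWIDTH_GBPS = 900   # quoted Grace Hopper NVLink-C2C bandwidth, GB/s
model_size_gb = 350         # hypothetical LLM footprint, e.g. ~175B params at fp16

transfer_seconds = model_size_gb / LINK_BANDWIDTH_GBPS
print(f"{transfer_seconds:.2f} s")  # ~0.39 s
```

In other words, at the quoted rate even a model of that assumed size would traverse the CPU-GPU link in well under a second.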
NVIDIA AI Foundations
NVIDIA AI Foundations, a cloud service for building custom large language models (LLMs) and generative AI with proprietary data, was also announced at GTC 2023. The service comprises three offerings: NeMo, for custom text-to-text generative language models; Picasso, for building custom visual generative models; and BioNeMo, which will enable medical institutes to leverage generative AI for drug discovery.
To further expand its offerings, NVIDIA has partnered with Getty Images, which will use the Picasso service to build Edify-image and Edify-video generative models. Shutterstock, meanwhile, will develop the Edify-3D generative model for the creation of 3D assets. Furthermore, NVIDIA announced a long-term partnership with Adobe to integrate generative AI for image creation, video, 3D, and animation.
One of the major announcements was a quantum control link, developed in collaboration with Quantum Machines, that connects NVIDIA GPUs to quantum computers to perform error correction at extremely high speed. The platform, which leverages the newly open-sourced CUDA Quantum, provides a high-performance, low-latency architecture for researchers working in quantum computing.
In addition, the company announced that it is bringing RAFT, a library that accelerates indexing, data loading, and batch retrieval of nearest neighbours for a query, to Meta’s FAISS, Milvus, and Redis, vector search libraries and databases that will be important for organisations building proprietary large language models.
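Batch retrieval of neighbours is, at its core, a batched distance computation followed by top-k selection. A minimal CPU-side sketch in plain NumPy of the operation such libraries accelerate on the GPU (purely illustrative; this is not RAFT's actual API):

```python
import numpy as np

# Toy "vector database": 1,000 embeddings of dimension 64.
rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 64)).astype(np.float32)

def batch_nearest_neighbours(queries: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k closest database vectors for each query.

    Brute-force squared-L2 search; GPU libraries accelerate exactly this
    kind of batched distance computation and top-k selection.
    """
    # Pairwise squared distances: one row per query, one column per vector.
    dists = ((queries[:, None, :] - database[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k smallest distances in each row.
    return np.argsort(dists, axis=1)[:, :k]

# Queries sitting right next to known database rows 0, 1 and 2.
queries = database[:3] + 0.01
print(batch_nearest_neighbours(queries))  # first column is [0, 1, 2]
```

Brute force like this scales as O(queries × database size), which is exactly why offloading it to GPUs, as RAFT does, matters once the database holds millions of embeddings.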
NVIDIA Triton Management Service, another newly announced piece of software, automates the scaling of Triton inference instances across the data centre.
The company also unveiled two new cloud-scale acceleration libraries: CV-CUDA for computer vision and VPF for video processing. CV-CUDA includes 30 computer vision operators for detection, segmentation, and classification, while VPF is a Python library that accelerates video encoding and decoding.
NVIDIA also announced the release of a new version of Parabricks (4.1), which is a suite of AI-accelerated libraries for end-to-end genomic analysis in the cloud or in-instrument. This update brings new features and improved performance to Parabricks, making it easier for researchers and scientists to analyse genomic data.
Furthermore, NVIDIA announced a partnership with the medical technology company Medtronic to develop software-defined medical devices. The platform will be used for Medtronic systems, ranging from surgical navigation to robotic-assisted surgery.
Lastly, NVIDIA also released cuLitho, a library for computational lithography. The library accelerates computational lithography by over 40 times and was built in collaboration with industry leaders such as ASML, TSMC, and Synopsys.