NVIDIA’s Large Language AI Models Are Now Available To Businesses Worldwide

NVIDIA doubles down on AI language models and inference as a platform for the metaverse, running in data centres, the cloud, and at the edge.

NVIDIA has set the stage for businesses worldwide to design and deploy large language models (LLMs), enabling them to build domain-specific chatbots, personal assistants, and other artificial intelligence systems.

The firm announced the NVIDIA NeMo Megatron framework for training trillion-parameter language models. In addition, NVIDIA Triton Inference Server offers multi-node distributed inference features for new domains and languages. When used in conjunction with NVIDIA DGX systems, these technologies provide an enterprise-grade solution for simplifying the construction and deployment of massive language models.

“Large language models have demonstrated their flexibility and capability, answering deep domain questions, translating languages, comprehending and summarising documents, writing stories, and computing programmes, all without specialised training or supervision,” said Bryan Catanzaro, NVIDIA’s vice president of applied deep learning research. “Developing huge language models for new languages and domains is perhaps the largest supercomputing use to date, and these capabilities are now accessible to the world’s corporations.”

Speed LLM Development 

NVIDIA NeMo Megatron builds on Megatron, an open-source project led by NVIDIA researchers that implements massive transformer language models at scale. Megatron 530B is the world’s largest customisable language model.

The NeMo Megatron framework helps enterprises overcome the obstacles of developing complex natural language processing models. It is designed to scale out across the large-scale accelerated computing infrastructure of NVIDIA DGX SuperPOD. NeMo Megatron automates much of the complexity of LLM training, with data processing libraries that ingest, curate, organise, and clean data. Its data, tensor, and pipeline parallelisation technologies distribute the training of huge language models efficiently across thousands of GPUs. Enterprises can use the framework to train LLMs on the domains and languages that interest them.
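The tensor-parallel idea mentioned above can be illustrated with a minimal, framework-free sketch (plain Python lists standing in for GPU memory; all names are illustrative and not NeMo Megatron's API): a layer's weight matrix is split column-wise across devices, each device computes its output shard independently, and the shards are gathered to reproduce the single-device result.

```python
# Conceptual sketch of tensor (intra-layer) parallelism.
# A linear layer's weight matrix is split column-wise across "devices";
# each device computes its shard, and shards are concatenated (all-gather).
# Illustrative only -- not the NeMo Megatron API.

def matmul(x, w):
    """Multiply vector x (length k) by a k x n weight matrix w."""
    return [sum(x[i] * w[i][j] for i in range(len(x)))
            for j in range(len(w[0]))]

def split_columns(w, shards):
    """Split weight matrix w column-wise into `shards` equal pieces."""
    step = len(w[0]) // shards
    return [[row[s * step:(s + 1) * step] for row in w]
            for s in range(shards)]

x = [1.0, 2.0, 3.0]            # input activation
w = [[1, 2, 3, 4],             # 3 x 4 weight matrix
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

full = matmul(x, w)            # single-device reference result

# "Two GPUs": each holds and computes only its column shard.
shard_outputs = [matmul(x, w_s) for w_s in split_columns(w, 2)]
parallel = shard_outputs[0] + shard_outputs[1]   # all-gather

assert parallel == full        # parallel scheme matches the reference
print(parallel)                # [38.0, 44.0, 50.0, 56.0]
```

Pipeline parallelism is the complementary scheme: instead of splitting one layer across devices, consecutive layers are placed on different devices and activations flow between them.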

Real-Time LLM Inference 

New multi-GPU, multi-node capabilities in the latest NVIDIA Triton Inference Server allow LLM inference workloads to scale across multiple GPUs and nodes in real time. These models demand more memory than a single GPU, or even a large server with many GPUs, can supply, yet inference must run quickly for applications to be useful. Megatron 530B can now run on two NVIDIA DGX systems, cutting processing time from nearly a minute on a CPU server to half a second and making LLMs practical for real-time applications.
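A back-of-the-envelope calculation shows why a 530-billion-parameter model cannot fit on one GPU. The sketch below assumes FP16 weights and an 80 GB GPU (e.g. a current high-memory data-centre card) purely for illustration:

```python
# Rough memory estimate for serving a 530B-parameter model in FP16.
# The 80 GB per-GPU figure is an assumption for illustration.
params = 530e9
bytes_per_param = 2                      # FP16 = 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: {weights_gb:.0f} GB")

gpu_mem_gb = 80
min_gpus = -(-weights_gb // gpu_mem_gb)  # ceiling division
print(f"minimum GPUs just to hold the weights: {min_gpus:.0f}")
```

The weights alone occupy roughly 1,060 GB, i.e. more than a dozen 80 GB GPUs before activations and key-value caches are even counted, which is why inference has to be distributed across many GPUs and multiple nodes.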

Custom Language Models 

SiDi, JD Explore Academy, and VinBrain are among the early adopters building huge language models on NVIDIA DGX SuperPOD. SiDi, one of Brazil’s leading artificial intelligence research and development organisations, has adapted Samsung’s virtual assistant for the country’s 200 million Portuguese speakers.

“The SiDi team has considerable expertise developing artificial intelligence (AI) virtual assistants and chatbots, which require both high AI performance and specialised software that is trained and tuned to the shifting nuances of human language,” said John Yi, SiDi’s CEO. “NVIDIA DGX SuperPOD is suitable for powering our team’s advanced work and enabling us to provide world-class AI services to Brazilian Portuguese speakers.”

JD Explore Academy, the research and development arm of JD.com, a leading supply chain technology and service provider, is utilising NVIDIA DGX SuperPOD to develop natural language processing for use in smart customer service, smart retail, smart logistics, the Internet of Things, and healthcare, among other applications.

VinBrain, a healthcare artificial intelligence firm based in Vietnam, used a DGX SuperPOD to develop and deploy a clinical language model for radiologists and telemedicine in 100 hospitals. It is currently used by over 600 healthcare practitioners.

Availability

NVIDIA Triton is available through the NVIDIA NGC catalogue, a repository for GPU-accelerated AI software that includes frameworks, toolkits, pretrained models, and Jupyter Notebooks, as well as through the Triton GitHub repository. Additionally, Triton is a component of NVIDIA’s AI Enterprise software stack, which NVIDIA optimises, certifies, and supports. As a result, enterprises can utilise the software suite to execute language model inference on commercially available accelerated servers in on-premises data centres and private clouds.


Dr. Nivash Jeevanandam

Nivash holds a doctorate in information technology and has been a research associate at a university and a development engineer in the IT industry. Data science and machine learning excite him.