NVIDIA Announces Platform For Creating AI Avatars

Santa Clara tech giant NVIDIA announced NVIDIA Omniverse Avatar, a technology platform for generating interactive AI avatars.

Omniverse Avatar connects the company’s technologies in speech AI, computer vision, natural language understanding, recommendation engines and simulation technologies. Avatars created in the platform are interactive characters with ray-traced 3D graphics that can see, speak, converse on a wide range of subjects, and understand naturally spoken intent.

Omniverse Avatar opens the door to creating AI assistants that are easily customizable for virtually any industry. These could help with the billions of daily customer service interactions, such as restaurant orders, banking transactions, and personal appointments and reservations, leading to greater business opportunities and improved customer satisfaction.

“The dawn of intelligent virtual assistants has arrived,” said Jensen Huang, founder and CEO of NVIDIA. “Omniverse Avatar combines NVIDIA’s foundational graphics, simulation and AI technologies to make some of the most complex real-time applications ever created. The use cases of collaborative robots and virtual assistants are incredible and far-reaching.”

Omniverse Avatar is part of NVIDIA Omniverse™, a virtual world simulation and collaboration platform for 3D workflows. 

Omniverse Avatar Key Elements

Omniverse Avatar uses elements from speech AI, computer vision, natural language understanding, recommendation engines, facial animation, and graphics delivered through the following technologies: 

  • Its speech recognition is based on NVIDIA Riva, a software development kit that recognizes speech across multiple languages. Riva is also used to generate human-like speech responses using text-to-speech capabilities.
  • Its natural language understanding is based on the Megatron 530B large language model, which can recognize, understand and generate human language. Megatron 530B is a pretrained model that can, with little or no training, complete sentences, answer questions across a large domain of subjects, summarize long, complex stories, translate to other languages, and handle many domains that it was not specifically trained on.
  • Its recommendation engine is provided by NVIDIA Merlin™, a framework that allows businesses to build deep learning recommender systems capable of handling large amounts of data to make smarter suggestions.  
  • Its perception capabilities are enabled by NVIDIA Metropolis, a computer vision framework for video analytics.
  • Its avatar animation is powered by NVIDIA Video2Face and Audio2Face™, 2D and 3D AI-driven facial animation and rendering technologies.

These technologies are composed into an application and processed in real time using the NVIDIA Unified Compute Framework. Packaged as scalable and customizable microservices, the skills can be securely deployed, managed and orchestrated across multiple locations by NVIDIA Fleet Command.
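To make the idea of composing separate skills into one application more concrete, here is a minimal Python sketch of a single conversational turn passing through a chain of stages. It is an illustration only: every name in it (`Turn`, `recognize_speech`, `understand`, `recommend`, `synthesize`, `animate`, `run_pipeline`) is hypothetical, the stage bodies are trivial stand-ins, and nothing here calls or represents the actual Riva, Megatron, Merlin, Audio2Face or Unified Compute Framework APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """State carried through one conversational turn (hypothetical structure)."""
    audio_in: str                      # stand-in for captured microphone audio
    transcript: str = ""
    reply_text: str = ""
    suggestions: List[str] = field(default_factory=list)
    audio_out: str = ""
    animation: str = ""


# Each stage is a placeholder for one skill named in the article.
def recognize_speech(turn: Turn) -> Turn:
    # Placeholder for speech recognition: pretend the audio is already text.
    turn.transcript = turn.audio_in.strip().lower()
    return turn


def understand(turn: Turn) -> Turn:
    # Placeholder for language understanding: echo the request back.
    turn.reply_text = f"You asked about: {turn.transcript}"
    return turn


def recommend(turn: Turn) -> Turn:
    # Placeholder for a recommender: one canned suggestion per request.
    turn.suggestions = [f"something to go with {turn.transcript}"]
    return turn


def synthesize(turn: Turn) -> Turn:
    # Placeholder for text-to-speech.
    turn.audio_out = f"<speech:{turn.reply_text}>"
    return turn


def animate(turn: Turn) -> Turn:
    # Placeholder for audio-driven facial animation.
    turn.animation = f"<face driven by {turn.audio_out}>"
    return turn


Stage = Callable[[Turn], Turn]

PIPELINE: List[Stage] = [recognize_speech, understand, recommend, synthesize, animate]


def run_pipeline(turn: Turn, stages: List[Stage] = PIPELINE) -> Turn:
    """Pass the turn through each stage in order."""
    for stage in stages:
        turn = stage(turn)
    return turn


if __name__ == "__main__":
    result = run_pipeline(Turn(audio_in="  Two Coffees  "))
    print(result.transcript)
    print(result.animation)
```

Because every stage shares the same signature, a stage can be swapped or left out without touching the others, which is roughly the property that packaging skills as separate microservices is meant to provide.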


Kumar Gandharv
Kumar Gandharv, PGD in English Journalism (IIMC, Delhi), is setting out on a journey as a tech journalist at AIM and is a keen observer of national and IR-related news.