NVIDIA Releases Eighth Generation Of Its Popular Conversational AI Software TensorRT

The latest version of TensorRT brings BERT-Large inference latency down to 1.2 milliseconds.
NVIDIA recently released the eighth generation of its popular AI software, TensorRT, which cuts inference time in half for language queries, enabling developers to build the best-performing search engines, ad recommendations and chatbots and deliver them from the cloud to the edge.

TensorRT 8 is now generally available and free of charge to members of the NVIDIA developer programme. The latest versions of plug-ins, samples and parsers are available on the TensorRT GitHub repository.

https://youtu.be/0e5TRStkkLM

What's new?

The latest version of TensorRT brings BERT-Large inference latency down to 1.2 milliseconds with new optimisations. BERT-Large is one of the world's most widely used transformer-based models. Further, it delivers 2x the accuracy for INT8 precision with quantisation-aware training.
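A typical way to try the new release is to export a trained model to ONNX and build an optimised TensorRT engine from it. The minimal sketch below uses the TensorRT 8 Python API; the file names (bert_large.onnx, bert_large.engine) and the FP16 setting are illustrative assumptions rather than part of NVIDIA's announcement, and a model exported with dynamic input shapes would additionally need an optimisation profile.

```python
# Minimal sketch: build a TensorRT 8 engine from an ONNX model.
# "bert_large.onnx" is a placeholder path; any ONNX export with static
# input shapes works the same way.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# Explicit-batch networks are required for ONNX models.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# Parse the ONNX file into a TensorRT network definition.
with open("bert_large.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # enable mixed precision if the GPU supports it

# TensorRT 8 can return a serialized engine directly.
serialized_engine = builder.build_serialized_network(network, config)
with open("bert_large.engine", "wb") as f:
    f.write(serialized_engine)
```

The saved engine can then be loaded with the TensorRT runtime for inference; the same build step can also be done from the command line with the trtexec tool that ships with TensorRT.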