Meta Open Sources An AI Inference Engine That Works On Both NVIDIA & AMD GPUs

The company said that the vision behind the framework is to support high speed while maintaining simplicity.

GPUs play a crucial role in delivering the computational power needed to deploy AI models, especially large-scale pretrained models. Because GPU inference solutions are platform-specific, AI practitioners currently have little choice among high-performance options, and the complex runtime dependencies these solutions carry make their code difficult to maintain.

In order to address these industry challenges, Meta AI has developed AITemplate (AIT), a unified open-source system with separate acceleration back ends for both AMD and NVIDIA GPU hardware technology. 

With AITemplate, it is now possible to run performant inference on hardware from both GPU vendors. AITemplate is a Python framework that converts AI models into high-performance C++ GPU template code for faster inference.
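To illustrate the template-to-code idea at the heart of such a system, the toy sketch below specializes a C++ kernel template with concrete shapes and types from Python. This is an illustrative example of the general technique only, not AITemplate's actual API (the real project lives under facebookincubator/AITemplate on GitHub); the template and function names here are made up.

```python
# Toy sketch of template-based GPU code generation, the idea behind
# AITemplate: a Python front end fills a C++ kernel template with
# concrete shapes/types, and the result is then compiled for the
# target GPU. Illustrative only -- NOT AITemplate's real API.
from string import Template

# A hypothetical C++ GEMM kernel template with size/type placeholders.
GEMM_TEMPLATE = Template("""\
// C = A(${M}x${K}) * B(${K}x${N}), dtype=${dtype}
extern "C" void gemm_${M}x${N}x${K}(const ${dtype}* A,
                                    const ${dtype}* B,
                                    ${dtype}* C) {
    for (int i = 0; i < ${M}; ++i)
        for (int j = 0; j < ${N}; ++j) {
            ${dtype} acc = 0;
            for (int k = 0; k < ${K}; ++k)
                acc += A[i * ${K} + k] * B[k * ${N} + j];
            C[i * ${N} + j] = acc;
        }
}
""")

def emit_gemm(m: int, n: int, k: int, dtype: str = "float") -> str:
    """Specialize the kernel template for one fixed problem size."""
    return GEMM_TEMPLATE.substitute(M=m, N=n, K=k, dtype=dtype)

# Generating code for a fixed 128x256x64 matrix multiply:
src = emit_gemm(128, 256, 64)
print(src.splitlines()[0])  # the generated header comment
```

Because every shape is baked in at generation time, the compiler can fully unroll and vectorize the specialized kernel, which is one reason template-generated code can approach hand-tuned performance.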


As mentioned in the company’s blog post, researchers at Meta AI used AITemplate to improve performance by up to 12x on NVIDIA GPUs and 4x on AMD GPUs compared with eager mode in PyTorch. The AITemplate system consists of a front-end layer that performs various graph transformations and a back-end layer that produces C++ kernel templates for the GPU target. The company stated that the vision behind the framework is to support high speed while maintaining simplicity.
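As a rough picture of what a front-end graph transformation can look like, the sketch below fuses a matmul followed by an elementwise add into a single fused node, so a back end could emit one kernel instead of two. This mimics the kind of rewrite such a front end performs; it is a plain-Python toy, not Meta's implementation, and all names are hypothetical.

```python
# Toy illustration of one front-end graph transformation: fusing a
# matmul followed by an elementwise add into a single "gemm_bias" node,
# so the back end can generate one kernel instead of two. This shows
# the *kind* of rewrite an inference compiler's front end performs;
# the node and op names here are made up, not AITemplate's.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                      # e.g. "matmul", "add", "gemm_bias"
    inputs: list = field(default_factory=list)

def fuse_matmul_add(graph: list) -> list:
    """Replace matmul -> add pairs with one fused gemm_bias node."""
    out = []
    for node in graph:
        if (node.op == "add"
                and node.inputs
                and isinstance(node.inputs[0], Node)
                and node.inputs[0].op == "matmul"):
            mm = node.inputs[0]
            out.pop()  # drop the standalone matmul already emitted
            out.append(Node("gemm_bias", mm.inputs + node.inputs[1:]))
        else:
            out.append(node)
    return out

# A two-node graph: C = A @ B, then C + bias.
mm = Node("matmul", ["A", "B"])
graph = [mm, Node("add", [mm, "bias"])]
fused = fuse_matmul_add(graph)
print([n.op for n in fused])  # one fused node remains
```

Fusions like this reduce kernel launches and round trips through GPU memory, which is where much of the speedup over eager-mode execution typically comes from.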

Moreover, it delivers close to hardware-native Tensor Core (NVIDIA GPU) and Matrix Core (AMD GPU) performance on widely used AI models such as transformers, convolutional neural networks, and diffusers. At present, AITemplate is enabled on NVIDIA’s A100 and AMD’s MI200 GPU systems, both of which are often used in data centers by research facilities, technology companies, and cloud computing service providers.


Source: AITemplate optimizations, Meta AI

The blog reads, “AITemplate offers state-of-the-art performance for current and next-gen NVIDIA and AMD GPUs with less system complexity. However, we are only at the beginning of our journey to build a high-performance AI inference engine. We also plan to extend AITemplate to additional hardware systems, such as Apple M-series GPUs, as well as CPUs from other technology providers.”


Bhuvana Kamath
