
Meta Open Sources An AI Inference Engine That Works On Both NVIDIA & AMD GPUs

The company said that the vision behind the framework is to support high speed while maintaining simplicity

GPUs play a crucial role in delivering the computational power needed to deploy AI models, especially large-scale pretrained models. Because high-performance GPU inference solutions are platform-specific, AI practitioners currently have little choice among them, and the complex runtime dependencies these solutions carry make their code challenging to maintain.

To address these industry challenges, Meta AI has developed AITemplate (AIT), a unified open-source system with separate acceleration back ends for both AMD and NVIDIA GPU hardware.

With the help of AITemplate, it is now possible to run performant inference on hardware from both GPU providers. AITemplate is a Python framework that converts AI models into high-performance C++ GPU template code for faster inference.
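To make the "Python renders C++ template code" idea concrete, here is a minimal, self-contained sketch of template-based kernel generation: Python fills a parameterized C++ kernel template with values fixed at compile time. This illustrates only the general concept; the template text, names, and parameters below are illustrative assumptions, not AITemplate's actual templates, which build on far more sophisticated vendor libraries.

```python
# Toy illustration of template-based GPU codegen: specialize a C++ kernel
# template with shape and dtype parameters known ahead of time, so the
# generated code carries no runtime dispatch overhead.

KERNEL_TEMPLATE = """\
extern "C" __global__ void {name}(const {dtype}* a, const {dtype}* b, {dtype}* out) {{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < {n}) {{
        out[i] = a[i] + b[i];
    }}
}}
"""

def render_kernel(name: str, dtype: str, n: int) -> str:
    """Fill the template for one fixed problem size and element type."""
    return KERNEL_TEMPLATE.format(name=name, dtype=dtype, n=n)

if __name__ == "__main__":
    # Emit an fp16 elementwise-add kernel specialized for 1024 elements.
    print(render_kernel("add_fp16", "half", 1024))
```

Baking the size and dtype into the source is what lets a template-based system emit lean, specialized kernels instead of one generic kernel that branches at runtime.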

As mentioned in the company’s blog post, researchers at Meta AI used AITemplate to improve performance by up to 12x on NVIDIA GPUs and 4x on AMD GPUs compared with eager mode in PyTorch. The AITemplate system consists of a front-end layer that performs various graph transformations and a back-end layer that produces C++ kernel templates for the target GPU. The company stated that the vision behind the framework is to support high speed while maintaining simplicity.
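One common graph transformation in such front ends is operator fusion, where adjacent operations are merged into a single kernel to cut memory traffic. The toy pass below, with a deliberately simplified graph representation of my own (a flat list of op names), sketches the idea; it is not AITemplate's actual intermediate representation or fusion logic.

```python
# Toy graph-transformation pass: fuse each consecutive (matmul, add) pair
# into a single "gemm_bias" op, a simplified stand-in for the kind of
# fusion an inference front end performs before codegen.

def fuse_matmul_add(ops: list) -> list:
    """Scan the op list once, merging matmul+add pairs into gemm_bias."""
    fused, i = [], 0
    while i < len(ops):
        if i + 1 < len(ops) and ops[i] == "matmul" and ops[i + 1] == "add":
            fused.append("gemm_bias")  # one kernel instead of two
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused

if __name__ == "__main__":
    print(fuse_matmul_add(["matmul", "add", "relu"]))
```

Running the pass on `["matmul", "add", "relu"]` yields `["gemm_bias", "relu"]`: the matrix multiply and bias addition now execute as one kernel, avoiding a round trip through GPU memory for the intermediate result.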

Moreover, it delivers close to hardware-native Tensor Core (NVIDIA GPU) and Matrix Core (AMD GPU) performance on widely used AI models such as transformers, convolutional neural networks, and diffusers. At present, AITemplate is enabled on NVIDIA’s A100 and AMD’s MI200 GPU systems, both of which are often used in data centers by research facilities, technology companies, and cloud computing service providers, among others.

Source: AITemplate optimizations, Meta AI

The blog reads, “AITemplate offers state-of-the-art performance for current and next-gen NVIDIA and AMD GPUs with less system complexity. However, we are only at the beginning of our journey to build a high-performance AI inference engine. We also plan to extend AITemplate to additional hardware systems, such as Apple M-series GPUs, as well as CPUs from other technology providers.”


Bhuvana Kamath