IBM has unveiled what it calls the world’s first energy-efficient AI chip. The AI hardware accelerator’s novel design supports a range of model types and achieves superior power efficiency, the team claimed.
IBM revealed the chip’s details in a paper presented at the ongoing 2021 International Solid-State Circuits Virtual Conference (ISSCC). The technology can be scaled and used for several applications such as large-scale model training in the cloud, security, etc. “Such energy-efficient AI hardware accelerators could significantly increase compute horsepower, including in hybrid cloud environments, without requiring huge amounts of energy,” the company blog said.
What Is an AI Accelerator?
An AI accelerator is a high-performance parallel computation machine designed to process AI workloads such as neural networks efficiently. These chips typically rely on low-precision arithmetic, novel dataflow architectures, or in-memory computing.
AI accelerator chips are gaining currency because they significantly reduce the time taken to train and run an AI model. They can also handle specialised tasks that are impractical on general-purpose CPUs.
IBM’s Energy-Efficient AI Chips
As use cases have expanded, AI models have grown in complexity, leading to higher energy consumption and a larger carbon footprint.
“But we want to change this approach and develop an entirely new class of energy-efficient AI hardware accelerators that will significantly increase compute power without requiring exorbitant energy,” IBM said.
Big Blue has been committed to this cause since 2015. Since then, IBM has seen a 2.5X improvement every year in terms of power performance. The steps taken by IBM in this regard involved:
- The creation of algorithmic techniques to enable training and inference without compromising on prediction accuracy.
- Development of chip design and architecture to build highly efficient complex compute engines that execute complex workloads.
- Assembly of a software stack that keeps the hardware transparent to application developers.
The new chip is highly optimised for low-precision training and inference across different AI model types without sacrificing quality at the application level.
According to IBM, the chip is the first to incorporate an ultra-low-precision hybrid FP8 (HFP8) format for training deep learning models in a state-of-the-art 7 nm EUV-based chip. It also displayed better performance and power efficiency than other dedicated training and inference chips.
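To make the idea of an 8-bit floating-point format concrete, here is a minimal Python sketch of rounding a value to a small float format. The article does not spell out the HFP8 bit layout. The sketch assumes the split described in IBM's earlier HFP8 research: 1 sign, 4 exponent, and 3 mantissa bits for the forward pass, and 1 sign, 5 exponent, and 2 mantissa bits for gradients. The function name `quantize_fp`, the IEEE-style exponent bias, and the saturating overflow behaviour are illustrative assumptions, not details of IBM's chip.

```python
import math

def quantize_fp(x: float, exp_bits: int, man_bits: int) -> float:
    """Round x to the nearest value in a toy (1, exp_bits, man_bits) float format.

    Assumptions (not from the article): IEEE-style bias, subnormals supported,
    top exponent code reserved, out-of-range values saturate to the max finite value.
    """
    if x == 0.0 or math.isnan(x):
        return x
    bias = 2 ** (exp_bits - 1) - 1
    min_exp = 1 - bias                       # exponent of the smallest normal number
    max_exp = (2 ** exp_bits - 2) - bias     # exponent of the largest finite number
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** max_exp

    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), max_val)               # saturate instead of overflowing

    _, e = math.frexp(mag)                   # mag = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, min_exp)                # clamp into the subnormal range if needed
    step = 2.0 ** (exp - man_bits)           # spacing of representable values here
    return sign * round(mag / step) * step   # round() gives ties-to-even

# Same inputs in the two assumed HFP8 layouts.
for value in (0.1, 2.0 ** -10, 1000.0):
    forward = quantize_fp(value, exp_bits=4, man_bits=3)
    backward = quantize_fp(value, exp_bits=5, man_bits=2)
    print(f"{value!r}: 1-4-3 -> {forward!r}, 1-5-2 -> {backward!r}")
```

Under these assumptions, the 1-4-3 layout keeps more precision for mid-range values, while the 1-5-2 layout still represents very small magnitudes that the 1-4-3 layout flushes to zero, which is the usual argument for using a wider exponent for gradients.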
IBM also describes it as the first AI hardware accelerator to include power management. IBM researchers showed that the chip’s performance within its total power budget can be maximised by slowing it down during computation phases with high power consumption.
Further, the chip sustains high utilisation: 80 percent for training and 60 percent for inference, well above the 30 percent utilisation observed in typical GPUs.
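The reasoning behind slowing down only the power-hungry phases can be illustrated with a toy model in Python. Everything here is hypothetical: the `Phase` class, the phase names, the wattages, and the assumption that power scales linearly with clock frequency are chosen for illustration and do not describe IBM's actual power-management scheme.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    name: str
    time_at_full_clock_ms: float   # runtime if the clock is not throttled
    power_at_full_clock_w: float   # power draw if the clock is not throttled

def clock_scale(phase: Phase, budget_w: float) -> float:
    """Largest clock fraction (at most 1.0) that keeps this phase within budget.

    Toy assumption: power is proportional to clock frequency.
    """
    return min(1.0, budget_w / phase.power_at_full_clock_w)

def runtime_per_phase_throttle(phases, budget_w: float) -> float:
    """Throttle only the phases that would exceed the budget."""
    return sum(p.time_at_full_clock_ms / clock_scale(p, budget_w) for p in phases)

def runtime_fixed_clock(phases, budget_w: float) -> float:
    """Run every phase at one clock chosen for the worst-case phase."""
    worst = min(clock_scale(p, budget_w) for p in phases)
    return sum(p.time_at_full_clock_ms / worst for p in phases)

# Hypothetical workload: a power-hungry phase followed by a lighter one.
phases = [
    Phase("matrix-multiply", time_at_full_clock_ms=10.0, power_at_full_clock_w=8.0),
    Phase("activation", time_at_full_clock_ms=5.0, power_at_full_clock_w=4.0),
]
budget_w = 6.0

print("per-phase throttling:", runtime_per_phase_throttle(phases, budget_w), "ms")
print("single fixed clock:  ", runtime_fixed_clock(phases, budget_w), "ms")
```

In this toy model, per-phase throttling finishes sooner than a single conservative clock because the low-power phase still runs at full speed while the budget is respected throughout.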
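As a rough illustration of what these utilisation figures imply, sustained throughput can be approximated as peak throughput multiplied by utilisation. The short Python sketch below uses a hypothetical peak of 100 TFLOPS for both devices purely to compare the percentages quoted above; the peak figure and the function name are assumptions, not published specifications.

```python
def sustained_throughput(peak_tflops: float, utilisation: float) -> float:
    """Approximate sustained throughput as peak throughput times utilisation."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be a fraction between 0 and 1")
    return peak_tflops * utilisation

HYPOTHETICAL_PEAK_TFLOPS = 100.0  # same assumed peak for every device, for comparison only

chip_training = sustained_throughput(HYPOTHETICAL_PEAK_TFLOPS, 0.80)
chip_inference = sustained_throughput(HYPOTHETICAL_PEAK_TFLOPS, 0.60)
typical_gpu = sustained_throughput(HYPOTHETICAL_PEAK_TFLOPS, 0.30)

print(f"training vs GPU:  {chip_training / typical_gpu:.2f}x")
print(f"inference vs GPU: {chip_inference / typical_gpu:.2f}x")
```

At equal peak throughput, the quoted figures would correspond to roughly 2.7 times the sustained training throughput and twice the sustained inference throughput of a device running at 30 percent utilisation.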
Wrapping Up
The new AI core and chip can be used for several cloud-to-edge applications, such as cloud training of deep learning models in vision, speech, and natural language processing using 8-bit formats.
Autonomous vehicles, cameras, and mobile phones could also benefit from this technology, along with federated learning at the edge for privacy, security, and compliance.