IBM, together with researchers from the universities of Oxford, Muenster, and Exeter, has developed a new method to process data at unprecedented speeds and deliver AI applications with ultra-low latency. The computations are based entirely on light instead of electricity.
So far, the approach has been tested only on a small scale. However, the team is optimistic that, with further development, it could achieve up to one thousand trillion multiply-accumulate (MAC) operations per second per square millimetre, thrice the speed of AI processors that rely on electricity.
How The System Works
The team used photonic integrated circuits, which exploit the properties of light particles for computing. The method is built around a tensor core that combines photonic processing with non-von Neumann computing. It can carry out the computations behind AI models, such as deep neural networks for computer vision, with excellent energy efficiency.
Convolution is a mathematical operation performed on two functions to produce a third one that shows how the input functions modify each other's shape. In a neural network, each convolution breaks down into simple arithmetic steps such as addition and multiplication.
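To make the arithmetic concrete, here is a minimal Python sketch of a small 2D convolution that counts its multiply-accumulate (MAC) steps. The function name `conv2d_valid` and the sample values are illustrative assumptions, not taken from the IBM work. As in most deep-learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and count the MACs."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    macs = 0
    for r in range(oh):
        for c in range(ow):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # One multiply-accumulate: multiply, then add to the sum
                    acc += image[r + i][c + j] * kernel[i][j]
                    macs += 1
            out[r][c] = acc
    return out, macs


# Hypothetical 4x4 input and 2x2 kernel, chosen only for illustration
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]

result, mac_count = conv2d_valid(image, kernel)
print(result)     # [[-4, -4, 2], [-4, -4, 4], [6, 6, 6]]
print(mac_count)  # 36
```

Even this tiny example needs 36 MACs (nine output values, four MACs each), which hints at how quickly the count grows for real images and many kernels.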
A single neural network requires a vast number of such operations for processing just one piece of information. With conventional methods, this processing takes a lot of time and causes latency in AI systems.
The team demonstrated that a photonic tensor core could perform a convolution operation in a single time step. For this experiment, they used non-volatile photonic memory devices based on phase-change memory.
Phase-change memory, also referred to as Perfect RAM, is a memory technology that combines persistent storage with fast, RAM-like speeds.
These memory devices were used to store the convolution kernels on a chip. Photonic chip-based frequency combs then fed in the input at multiple frequencies. At a modulation speed of 14 GHz, the researchers obtained a processing speed of 2 TOPS, or two trillion MAC operations per second. That is a huge leap from the compute density of state-of-the-art AI systems that run on electricity and carry out fewer than a trillion operations.
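As a rough back-of-the-envelope check, the two figures quoted above imply how much work has to happen in parallel during each modulation cycle. The sketch below uses only the article's numbers. The variable names are illustrative, and the result is an implied average, not a figure reported by the team.

```python
modulation_rate_hz = 14e9    # 14 GHz modulation speed (from the article)
reported_macs_per_s = 2e12   # 2 TOPS, i.e. two trillion MACs per second

# Duration of one modulation cycle, in picoseconds
cycle_time_ps = 1e12 / modulation_rate_hz

# MACs that must complete in parallel within each cycle to reach 2 TOPS
macs_per_cycle = reported_macs_per_s / modulation_rate_hz

print(round(cycle_time_ps, 1))   # 71.4
print(round(macs_per_cycle, 1))  # 142.9
```

In other words, reaching two trillion MACs per second at this modulation speed means completing on the order of 140 MACs at once in every cycle of roughly 71 picoseconds.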
Next up, the team hopes to achieve a thousand trillion MAC operations per second per square millimetre.
“Our work shows the enormous potential for photonic processing to accelerate certain types of computations such as convolutions. The challenge going forward is to string together these computational primitives and still achieve substantial end-to-end system-level performance. This is what we’re focused on now,” the team said.
Photonic In-Memory Computing
In-memory computing is a relatively new approach. It refers to a non-von Neumann compute architecture in which the same devices serve as both memory and processing elements within a single computational memory unit. Because storage and computation sit together, the delay from shuttling data between separate memory and processing units is eliminated.
Combining photonics with in-memory computing (IMC) could reduce latency further, making the approach suitable for latency-critical AI applications, as this research from IBM demonstrates.
A photonic processor runs parallel operations in one physical core using wavelength division multiplexing (WDM). WDM, which is widely used in optical fibre communication, uses different laser wavelengths so that several optical carrier signals can travel through a single optical fibre at once. This parallelism helps reduce latency.
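The idea of several wavelengths sharing one core can be illustrated with a toy Python model. This is a simplified sketch, not the team's implementation. The function `photonic_mvm`, the wavelength values, and the use of a transmission between 0 and 1 as the stored weight are all assumptions made for illustration.

```python
def photonic_mvm(inputs_by_wavelength, weight_rows):
    """Toy model of one time step of a WDM matrix-vector multiply.

    inputs_by_wavelength: dict mapping wavelength (nm) to input amplitude.
    weight_rows: one dict per output channel, mapping wavelength to a
                 transmission between 0.0 and 1.0 (a stand-in for the
                 value stored in a memory cell).
    """
    outputs = []
    for row in weight_rows:
        # A detector sums the contributions of all wavelengths at once,
        # so each output is a complete dot product from a single step.
        total = sum(amp * row[wl] for wl, amp in inputs_by_wavelength.items())
        outputs.append(total)
    return outputs


# Hypothetical values: three wavelengths carrying three input elements
inputs = {1550: 1.0, 1551: 0.5, 1552: 0.25}
weights = [
    {1550: 0.5, 1551: 1.0, 1552: 0.0},
    {1550: 0.25, 1551: 0.5, 1552: 1.0},
]

print(photonic_mvm(inputs, weights))  # [1.0, 0.75]
```

In software the loop still runs sequentially. The point of the model is that in the optical domain every wavelength passes through the core simultaneously, so all the multiplications for an output happen in the same time step.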
The team believes the technology has potential applications in the automobile industry. Citing the example of a self-driving car, the release from IBM said photonics-based systems could prove life-saving when a vehicle moving at high speed has to detect an object within a certain distance. In its blog, IBM stated that there is a real need for low-latency inference in the autonomous driving domain.