
Did GPUs render CPUs obsolete in deep learning?

Task optimisation is much easier on a CPU than on a GPU.



GPUs (graphics processing units) are chips built for parallel processing. Originally developed to accelerate 3D rendering in video games, GPUs have found major applications in deep learning, artificial intelligence and high-performance computing.

GPUs have become more flexible and programmable over time and have emerged as the architecture of choice for deep learning models: hundreds of simple GPU cores work in parallel to cut down the time required for intensive computations.

In 2011, NVIDIA and Google collaborated on a study in which a computer vision model was trained on both CPUs and GPUs to distinguish cats from people. It took 2,000 CPUs to match the performance of just 12 GPUs.

That said, we can’t write off CPUs for deep learning. We find out why.

CPUs for deep learning

GPUs cannot function as standalone chips and can only perform a limited set of operations. Because of their limited cache memory, the bulk of the data has to be stored off-chip, forcing constant back-and-forth data retrieval. The resulting data-transfer bottleneck caps the speed at which GPUs can run deep learning algorithms.

While parallelism makes GPUs a good choice for deep learning, CPUs offer unique advantages. For example, task optimisation is much easier on a CPU than on a GPU. Though CPUs have fewer cores, each core is powerful and the cores can carry out different instructions independently (a MIMD architecture). GPU cores, by contrast, are organised in blocks of 32 and execute the same instruction in parallel. However, parallelising dense networks in this way is highly complicated, so complex optimisation techniques are harder to implement on a GPU than on a CPU. GPUs also consume more power than CPUs.
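
To make the contrast concrete, here is a minimal, hypothetical Python sketch (not taken from any of the systems discussed in this article): the two CPU threads run unrelated routines side by side, MIMD-style, while the vectorised operation at the end mimics the lockstep, same-instruction pattern GPU cores are built for.

```python
# Toy illustration, not benchmark code: a CPU can run *different* instruction
# streams at the same time (MIMD), while GPU-style execution applies the *same*
# operation to many data elements in lockstep (SIMD/SIMT).
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def tokenise(texts):
    # Branchy, irregular work that suits a flexible CPU core.
    return [t.lower().split() for t in texts]

def normalise(batch):
    # Regular numeric work on a small array.
    return (batch - batch.mean()) / (batch.std() + 1e-8)

texts = ["GPUs accelerate training", "CPUs remain flexible"]
batch = np.random.rand(4, 8).astype(np.float32)

# MIMD-style: two unrelated tasks dispatched to separate CPU threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    tokens_job = pool.submit(tokenise, texts)
    norm_job = pool.submit(normalise, batch)
    tokens, normed = tokens_job.result(), norm_job.result()

# SIMD/SIMT-style: one identical operation applied to every element at once,
# the execution pattern GPU cores are built for.
relu = np.maximum(normed, 0.0)
print(tokens, relu.shape)
```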

US-based AI startup Neural Magic’s suite of products allows clients to deploy deep learning models without specialised hardware, making AI more accessible and lowering costs. MIT professor Nir Shavit hit on the idea while working on research to reconstruct a brain map of a mouse. Having no experience with GPUs, Shavit had to rely on CPUs to run the deep learning part of the research. “I realised that a CPU can do what a GPU does—if programmed in the right way,” he said. He parlayed the idea into a business, and Neural Magic was born.

In 2020, a group of researchers proposed the Sub-Linear Deep Learning Engine (SLIDE), which blends randomised algorithms with multi-core parallelism and workload optimisation and runs on a single CPU. The researchers found that SLIDE outperformed an optimised TensorFlow implementation running on the best available GPU. For fully connected architectures, training with SLIDE on a 44-core CPU is more than 3.5 times faster than training the same network with TensorFlow on the GPU. On the same CPU hardware, SLIDE is ten times faster than TensorFlow.
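
The core trick in SLIDE is adaptive sparsity: instead of evaluating every neuron, locality-sensitive hashing picks a small, input-dependent subset that is likely to matter. Below is a hedged toy sketch of that idea using signed random projections; it illustrates the principle only, is not the actual SLIDE code, and all names, sizes and layer shapes are made up.

```python
# Toy sketch of the randomised-hashing idea behind SLIDE-style adaptive sparsity.
# The real engine uses multiple hash tables, OpenMP multi-core parallelism and
# careful workload scheduling; this is a single-table simplification.
from collections import defaultdict
import numpy as np

rng = np.random.default_rng(0)
dim, n_neurons, n_bits = 64, 4096, 8

W = rng.standard_normal((n_neurons, dim))      # hidden-layer weight vectors
planes = rng.standard_normal((n_bits, dim))    # random hyperplanes for the LSH

def srp_hash(v):
    # Signed random projection: one bit per hyperplane.
    bits = (planes @ v) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

# Build the hash table once: bucket id -> neuron ids.
table = defaultdict(list)
for i in range(n_neurons):
    table[srp_hash(W[i])].append(i)

def sparse_forward(x):
    # Evaluate only the neurons that land in the same bucket as the input,
    # i.e. those likely to produce large activations.
    active = table.get(srp_hash(x), [])
    out = np.zeros(n_neurons)
    if active:
        out[active] = W[active] @ x
    return out, len(active)

x = rng.standard_normal(dim)
_, n_active = sparse_forward(x)
print(f"evaluated {n_active} of {n_neurons} neurons")
```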

CPUs offer reasonable speeds on a range of applications. Easy availability, software support and portability make CPUs a good choice for DL applications. Even companies like Amazon, Facebook, Google, Microsoft, and Samsung are benchmarking and optimising deep learning on CPUs.

However, optimising deep learning applications on CPUs requires carefully matching the strengths of the CPU with the architectural characteristics of the application. Systems at different scales (mobile vs data-centre, single-node vs multi-node) have different properties and challenges, as do different deep learning algorithms and applications (such as CNNs and RNNs) and the inference and training phases of deep learning.
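
What that tuning looks like in practice varies, but here is a minimal, hedged sketch (the thread counts below are placeholders, not recommendations): TensorFlow exposes intra-op and inter-op CPU thread-pool settings, and the right values differ between, say, a convolutional model on a many-core server and a small recurrent model running at the edge.

```python
# Minimal sketch with illustrative values: sizing TensorFlow's CPU thread pools.
# The best settings depend on the core count, the model family (CNN vs RNN)
# and whether the workload is training or inference.
import numpy as np
import tensorflow as tf

# Must be set before TensorFlow executes any ops.
tf.config.threading.set_intra_op_parallelism_threads(8)   # threads inside one op (e.g. a matmul)
tf.config.threading.set_inter_op_parallelism_threads(2)   # independent ops run concurrently

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(784,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# A tiny random batch just to confirm the configuration runs end to end on the CPU.
x = np.random.rand(32, 784).astype("float32")
y = np.random.randint(0, 10, size=(32,))
model.fit(x, y, epochs=1, verbose=0)
```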

In 2019, AI company PQ Labs introduced MagicNet, which ran deep learning applications on a CPU 199 times faster than on a GPU. Researchers showed MagicNet running at 718 fps on an Intel i7 CPU while achieving the same accuracy as Tiny YOLO running at 292 fps on an NVIDIA TITAN X or 1080Ti graphics card.

Of late, Israel-based deep learning startup Deci has achieved breakthrough performance on CPUs. The firm’s image classification models, called DeciNets, run on Intel Cascade Lake CPUs.
