DeepSpeed Vs Horovod: A Comparative Analysis

A comparative analysis of open-source deep learning optimization libraries DeepSpeed and Horovod for advancing large-scale model training.

Deep learning represents a new paradigm in artificial intelligence (AI) and machine learning. It has gained enormous traction in scientific computing, and its algorithms are widely employed to address challenging problems. To a certain degree, all deep learning algorithms depend on the capacity of deep neural networks (DNNs) to scale across GPU topologies. However, that same scalability has led to compute-intensive programmes, which pose operational problems for enterprises. Thus, from training to optimisation, the life cycle of a deep learning project demands strong infrastructure building blocks that can scale computational workloads.

Over the years, many open-source deep learning optimisation libraries have been released by tech giants such as Google, Microsoft, Uber, DeepMind and others. In this article, we compare two of these libraries: DeepSpeed and Horovod.

DeepSpeed

In February 2020, Microsoft announced the release of an open-source library called DeepSpeed.

Training a large and advanced deep learning model is complex and involves a number of challenges, from model design to setting up state-of-the-art training techniques such as distributed training, mixed precision and gradient accumulation.

Even then, there is no certainty that the system will perform up to expectations or achieve the desired convergence rate. Large models easily run out of memory with pure data parallelism, and it is hard to apply model parallelism in such cases. This is where DeepSpeed comes into the picture: it addresses these drawbacks and accelerates model development and training.


One of the most important applications of DeepSpeed has been the development of Turing natural language generation (Turing-NLG), one of the largest language models at the time, with 17 billion parameters.

DeepSpeed stands apart in four important areas:

  • Scale: DeepSpeed supports running models with up to 100 billion parameters, a ten-fold improvement over existing training optimisation frameworks. DeepSpeed’s 3D parallelism can efficiently train deep learning models with trillions of parameters on contemporary GPU clusters with hundreds of devices.
  • Speed: DeepSpeed was 4-5 times faster than competing libraries in initial tests.
  • Cost: models could be trained at three times lower cost using DeepSpeed than with the alternatives.
  • Usability: DeepSpeed does not require refactoring of PyTorch models and can be adopted with only a few lines of code changes.
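To make the usability point concrete, DeepSpeed is driven by a JSON configuration file. The fragment below is a minimal sketch using real DeepSpeed config keys; the specific batch size, learning rate and ZeRO stage are illustrative values, not recommendations:

```json
{
  "train_batch_size": 32,
  "gradient_accumulation_steps": 1,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 1
  },
  "optimizer": {
    "type": "Adam",
    "params": {
      "lr": 0.0001
    }
  }
}
```

An existing PyTorch model is then wrapped with DeepSpeed’s `deepspeed.initialize(...)` call and launched with the `deepspeed` command-line launcher, leaving the model code itself largely untouched.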

Horovod 

Horovod is Uber’s free, open-source framework for distributed deep learning training with TensorFlow, PyTorch, Keras and Apache MXNet. Horovod aims to make distributed deep learning fast and easy to use: it was originally built at Uber so that existing single-GPU training scripts could be scaled to run on hundreds of GPUs with just a few lines of Python code, bringing model training time down from days and weeks to hours and minutes. Horovod can be installed on-premise or run out of the box on cloud platforms, including AWS, Azure and Databricks.

Furthermore, Horovod can run on top of Apache Spark, allowing data processing and model training to be unified in a single pipeline. Once Horovod is configured, the same infrastructure may be used to train models with any framework, allowing users to switch between TensorFlow, PyTorch, MXNet and future frameworks. The main principles of Horovod are built on MPI concepts, namely size, rank, local rank, allreduce and allgather.
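To illustrate the allreduce concept those MPI terms refer to, the toy sketch below averages one gradient vector across workers in plain Python. Real Horovod performs this collectively over the network (e.g. via NCCL) with `hvd.allreduce`; the worker count and gradient values here are made up for illustration:

```python
def allreduce_mean(worker_grads):
    """Elementwise average of per-worker gradient vectors (toy allreduce)."""
    n = len(worker_grads)
    length = len(worker_grads[0])
    return [sum(g[i] for g in worker_grads) / n for i in range(length)]

# Four workers (size=4, ranks 0..3) each hold a local gradient.
grads = [
    [1.0, 2.0],   # rank 0
    [3.0, 4.0],   # rank 1
    [5.0, 6.0],   # rank 2
    [7.0, 8.0],   # rank 3
]
print(allreduce_mean(grads))  # → [4.0, 5.0]
```

After the averaged gradient is computed, every worker applies the same update, which is how data-parallel training keeps model replicas in sync.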

DeepSpeed vs Horovod 

Advanced deep learning models are tough to train. Besides model design, model scientists also need modern training approaches such as distributed training, mixed precision, gradient accumulation and monitoring. Even then, the ideal system performance and convergence rate are hard to achieve. Large models give considerable accuracy benefits, but training billions to trillions of parameters often runs into fundamental hardware limits. Existing systems make trade-offs between computation, communication and development efficiency to fit these models into memory. DeepSpeed and Horovod address these difficulties to expedite model development and training.

DeepSpeed brings advanced training techniques, such as ZeRO, distributed training, mixed precision and monitoring, into lightweight, PyTorch-compatible APIs. DeepSpeed addresses the underlying performance difficulties and improves the speed and scale of training with only a few lines of code change to the PyTorch model.

On the other hand, the primary motivation for Horovod is to make it easy to take a single-GPU training script and scale it successfully to train across several GPUs. At Uber, it was found that the MPI model was considerably more straightforward and needed far fewer code modifications than earlier alternatives such as Distributed TensorFlow with parameter servers. Once a training script is built with Horovod, it can run on a single GPU, several GPUs or even numerous hosts without changing the code. Furthermore, Horovod is not only easy to use but also fast.
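The "few lines of code" claim can be sketched as below. This is a hedged outline of the canonical changes Horovod's documentation describes for a PyTorch script, not a complete programme; the model itself is left as a placeholder:

```python
import torch
import horovod.torch as hvd

hvd.init()                                # one process per GPU
torch.cuda.set_device(hvd.local_rank())   # pin this process to its local GPU

model = ...                               # placeholder: your existing model
# Scale the learning rate by the number of workers (a common heuristic)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across workers via allreduce
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# Ensure all workers start from the same initial weights
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
```

The unmodified training loop then follows, and the same script can be launched on, say, four GPUs with `horovodrun -np 4 python train.py`.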

Ritika Sagar
Ritika Sagar is currently pursuing PDG in Journalism from St. Xavier's, Mumbai. She is a journalist in the making who spends her time playing video games and analyzing the developments in the tech world.
