Evaluation Of Major Deep Learning Frameworks




There are many deep learning frameworks available, and it can be hard to tell which one is best suited to your task. In this article, we evaluate the different frameworks with the help of an open-source GitHub repository.



Frameworks are like different programming languages: each has its own way of communicating with the system. This article shows how the frameworks built for deep learning differ on several factors. Just as you may come across code written in Java when you are familiar with Python, you may come across a model written in a framework you do not know. Rather than rewriting the model in the language you know, you can work with it as it is.

The goal of this repo was to compare the different frameworks on common benchmarks, helping data scientists implement their work easily, and to compare GPUs of different generations. Open-source communities have also contributed to the project, making the process easier.

Benchmarking Outcomes

Three benchmark tasks were run on two different GPUs to compare the frameworks.

The first used the CIFAR-10 dataset, with 50,000 training samples and 10,000 test samples uniformly distributed over 10 classes. Each image is 32x32 pixels with 3 colour channels, and its pixel values have been rescaled from 0-255 to 0-1.
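The rescaling step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's code: the function name `rescale_images` and the random stand-in batch are assumptions made for the example.

```python
import numpy as np

def rescale_images(images_uint8):
    """Scale 8-bit pixel values from the 0-255 range into 0-1 as float32."""
    return images_uint8.astype(np.float32) / 255.0

# Hypothetical stand-in for a batch of CIFAR-10 images:
# 4 images, 32x32 pixels, 3 colour channels, values 0-255.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

scaled = rescale_images(batch)
print(scaled.shape, scaled.dtype)
```

Converting to float32 before dividing keeps the result in a type that GPU frameworks generally accept as input.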

The same CNN was trained in each framework with GPU support on two Nvidia GPUs, the K80 and the P100, using CUDA and cuDNN. CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by Nvidia, and a framework needs CUDA support to use the GPU while training and testing models. Similarly, cuDNN is a deep neural network library developed by Nvidia that provides highly tuned implementations of computations such as forward propagation and backpropagation. The table below lists the training time of each framework on both GPUs.

| DL Library | K80/CUDA 8/cuDNN 6 | P100/CUDA 8/cuDNN 6 |
|---|---|---|
| Caffe2 | 148 | 54 |
| Chainer | 162 | 69 |
| CNTK | 163 | 53 |
| Gluon | 152 | 62 |
| Keras(CNTK) | 194 | 76 |
| Keras(TF) | 241 | 76 |
| Keras(Theano) | 269 | 93 |
| TensorFlow | 173 | 57 |
| Lasagne(Theano) | 253 | 65 |
| MXNet | 145 | 51 |
| PyTorch | 169 | 51 |
| Julia – Knet | 159 | * |

* - Not submitted at the time of benchmarking.

Average Time for 1,000 Images: ResNet-50 – Feature Extraction

The second benchmark used a pre-trained ResNet-50, truncated just after the average pooling at the end (7, 7), which outputs a 2048-dimensional vector for each image. Passing this vector to a softmax squashes the values into the range 0 to 1 so that they can be read as probabilities. This was run on the same Nvidia GPUs and CUDA platforms.
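To make the two operations above concrete, here is a small NumPy sketch of average pooling over a (7, 7) grid followed by a softmax. It uses random numbers as a stand-in for real ResNet-50 activations, and the function names are hypothetical. In practice a trained dense layer would normally sit between the features and the softmax; the sketch applies the softmax directly only to show the squashing effect.

```python
import numpy as np

def global_average_pool(feature_map):
    """Average each channel over its spatial grid: (C, H, W) -> (C,)."""
    return feature_map.mean(axis=(1, 2))

def softmax(x):
    """Numerically stable softmax: outputs lie between 0 and 1 and sum to 1."""
    shifted = x - np.max(x)  # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical activations with 2048 channels on a 7x7 grid (random stand-in).
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((2048, 7, 7))

features = global_average_pool(feature_map)  # the 2048-dimensional vector
probs = softmax(features)
print(features.shape, probs.sum())
```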

| DL Library | K80/CUDA 8/cuDNN 6 | P100/CUDA 8/cuDNN 6 |
|---|---|---|
| Caffe2 | 14.1 | 7.9 |
| Chainer | 9.3 | 2.7 |
| CNTK | 8.5 | 1.6 |
| Keras(CNTK) | 21.7 | 5.9 |
| Keras(TF) | 10.2 | 2.9 |
| TensorFlow | 6.5 | 1.8 |
| MXNet | 7.7 | 2.0 |
| PyTorch | 7.7 | 1.9 |
| Julia – Knet | 6.3 | * |

* - Not submitted at the time of benchmarking.

The third benchmark was sentiment analysis on the publicly available IMDB dataset. The training set had 25,000 reviews and the test set had another 25,000, each sampled to contain an equal number of positive and negative reviews. The time taken for training is compared below.

| DL Library | K80/CUDA 8/cuDNN 6 | P100/CUDA 8/cuDNN 6 | Using cuDNN? |
|---|---|---|---|
| CNTK | 32 | 15 | Yes |
| Keras(CNTK) | 86 | 53 | No |
| Keras(TF) | 35 | 26 | Yes |
| MXNet | 29 | 24 | Yes |
| PyTorch | 31 | 16 | Yes |
| TensorFlow | 30 | 22 | Yes |
| Julia – Knet | 29 | * | Yes |

* - Not submitted at the time of benchmarking.

Study Analysis

  1. Most frameworks use cuDNN to run an exhaustive search and optimise the algorithm used for the forward pass of convolutions on fixed-size images. In PyTorch, for example, this is enabled by setting torch.backends.cudnn.benchmark = True.
  2. cuDNN speeds up the computations when training RNNs. The downside is that running inference on a CPU later on may be more challenging.
  3. The timings show that the K80 GPU is less powerful than the P100 GPU, even though both have CUDA and cuDNN support.
  4. This benchmark of the time required for training and feature extraction shows that PyTorch, CNTK and TensorFlow are among the fastest frameworks.
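The PyTorch flag mentioned in the first point can be set as shown below. This is a hedged sketch: `torch.backends.cudnn.benchmark` is the real PyTorch setting, while the wrapper function `enable_cudnn_autotune` is a hypothetical helper added so the snippet also runs on a machine without PyTorch installed.

```python
def enable_cudnn_autotune():
    """Turn on cuDNN's autotuner in PyTorch; return True if the flag was set."""
    try:
        import torch
    except ImportError:
        return False  # PyTorch is not installed, so there is nothing to configure
    # Ask cuDNN to time the available convolution algorithms and keep the fastest.
    torch.backends.cudnn.benchmark = True
    return True

enabled = enable_cudnn_autotune()
print("cuDNN autotune enabled:", enabled)
```

The autotuner tends to help most when input sizes stay fixed, as with the fixed-size images in this benchmark, because the search is repeated whenever a new input shape appears.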

Most of the frameworks use cuDNN to optimise the algorithms used during forward propagation on the images. In comparing these frameworks, the architecture and the data used for each of them were kept similar. Computation time was measured for all the frameworks, but the results are simply meant to show how to create the same networks across different frameworks and how they perform on these specific examples.
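For these specific examples, the gap between the two GPUs can be expressed as a speedup ratio. The sketch below uses the CIFAR-10 timings from the first table (Julia – Knet is left out because it has no P100 result); the dictionary and variable names are assumptions made for the example.

```python
# CIFAR-10 CNN timings from the first table: (K80, P100)
cifar_times = {
    "Caffe2": (148, 54),
    "Chainer": (162, 69),
    "CNTK": (163, 53),
    "Gluon": (152, 62),
    "Keras(CNTK)": (194, 76),
    "Keras(TF)": (241, 76),
    "Keras(Theano)": (269, 93),
    "TensorFlow": (173, 57),
    "Lasagne(Theano)": (253, 65),
    "MXNet": (145, 51),
    "PyTorch": (169, 51),
}

# Speedup = time on the K80 divided by time on the P100.
speedups = {name: k80 / p100 for name, (k80, p100) in cifar_times.items()}

# Print the frameworks from largest to smallest speedup.
for name, ratio in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} {ratio:.2f}x")
```

A ratio like this describes only this one model and dataset, in line with the caveat above that the timings are not a general ranking.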

ONNX (Open Neural Network Exchange Format) was useful not only while developing in a framework, but also for converting a model so that it can be scored in another one. The MMdnn tools likewise convert models between different frameworks and visualise the architecture at the same time.

The study was completed with the help of contributions from the teams working on the different frameworks. For example, Keras with TensorFlow originally used a channels-last configuration, which meant the layout had to be handled at every batch; this has since been developed further, and channels-first is now a native configuration. This is version 1.0 of the repo, and the team is considering further benchmarks to add to the comparison.
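The difference between the two layouts mentioned above is only the position of the channel axis. The NumPy sketch below converts a channels-last batch (N, H, W, C) to channels-first (N, C, H, W); the function name `to_channels_first` and the batch shape are assumptions for illustration, and the frameworks themselves handle this internally.

```python
import numpy as np

def to_channels_first(batch_nhwc):
    """Reorder axes from (N, H, W, C) to (N, C, H, W)."""
    return np.transpose(batch_nhwc, (0, 3, 1, 2))

# Hypothetical batch of 8 images, 32x32 pixels, 3 channels, channels-last.
batch_nhwc = np.arange(8 * 32 * 32 * 3, dtype=np.float32).reshape(8, 32, 32, 3)

batch_nchw = to_channels_first(batch_nhwc)
print(batch_nchw.shape)  # (8, 3, 32, 32)
```

Doing this reordering on every batch costs time, which is why a native channels-first configuration matters for a timing benchmark.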

Frameworks: TensorFlow, Julia, MXNet, Keras, Theano, R, CNTK, PyTorch, Caffe2, Chainer and Gluon.


