MXNet Tutorial: Complete Guide with Hands-On Implementation of Deep Learning Framework

This article explains why MXNet is worth a look, gives an overview of its architecture and walks through an implementation on random data.

As the popularity of and need for deep learning networks increase, a lot of effort has gone into building tools that ease the development of deep learning models. One such tool, which we will discuss today, is MXNet. You might be wondering what makes MXNet better than existing deep learning frameworks like Theano or Caffe. Many existing frameworks are tied to a single programming language. MXNet overcomes this by providing one system that can be used from several languages.

In this article, we will look into:

  • Why MXNet?
  • A complete overview of MXNet
  • Implementation of MXNet on random data

Why MXNet?

MXNet is an open-source deep learning framework used to define, train and deploy neural networks. MXNet is short for mix-net, because the framework was developed by combining various programming approaches into one. It supports Python, R, C++, Julia, Perl and many other languages, which removes the need to learn a new language just to use a different framework.

Another advantage is that models built with MXNet are portable and have a small memory footprint. Once a model is trained and tested, it can be deployed to mobile devices or connected systems. MXNet also scales across multiple machines and GPUs. This is why Amazon has chosen this framework for its deep learning web services.

A Complete Overview of MXNet

Let us look at the architecture of the MXNet framework. I will discuss its most important components below.


The NDArray: The primary data type of the MXNet framework is the NDArray. This is an n-dimensional array that stores data of a single type. If you have worked with Python's NumPy arrays, NDArrays are quite similar. Deep neural networks have thousands of parameters, and all of them are stored in these arrays. By default, an NDArray holds 32-bit floats, but we can customize that.

The Symbolic API: Inside any given layer of a neural network, many computations can happen simultaneously. Independent layers can also run in parallel. So, for good performance, we have to implement parallel processing using multithreading or something similar. MXNet does this using dataflow programming and the symbolic API.

Dataflow programming is a type of parallel programming where the data flows through a graph. It can be thought of as a black box that takes in inputs and gives multiple outputs simultaneously without specifying underlying behaviour. 


In the figure above, (A*B) and (C*D) do not depend on each other, so they can be executed at the same time. A, B, C, D and E are all symbols, and MXNet uses the dependencies between them to decide what can run in parallel and to optimise execution.
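The idea can be sketched in plain Python. This is only an illustration of dataflow-style scheduling, not how MXNet's engine is implemented: it assumes the graph computes E = (A*B) + (C*D), and it uses a standard-library thread pool as a stand-in for MXNet's scheduler. The function names are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply(x, y):
    return x * y


def evaluate(a, b, c, d):
    """Evaluate E = (A*B) + (C*D), running the two products concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two products share no inputs or outputs, so both are submitted at once.
        ab = pool.submit(multiply, a, b)
        cd = pool.submit(multiply, c, d)
        # The sum depends on both products, so it waits for both results.
        return ab.result() + cd.result()


e = evaluate(2, 3, 4, 5)
print(e)  # 26
```

The only ordering the code enforces is the one the graph requires: the addition waits for both multiplications, and the multiplications do not wait for each other.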

Binder: As the name implies, this process binds the data stored in NDArrays to the corresponding symbols for execution. It is necessary to specify the context, that is, whether the execution takes place on the CPU or the GPU. Once our data is bound to the symbols, forward propagation can take place.
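To make the separation between defining a computation and binding data to it concrete, here is a toy sketch in plain Python. The `Symbol` class below is hypothetical and written only for this illustration; it is not `mx.sym`, and it ignores the context (CPU or GPU) entirely.

```python
class Symbol:
    """A named placeholder, or an operation over other symbols (toy version)."""

    def __init__(self, name=None, op=None, inputs=()):
        self.name = name
        self.op = op
        self.inputs = inputs

    def __mul__(self, other):
        return Symbol(op=lambda x, y: x * y, inputs=(self, other))

    def __add__(self, other):
        return Symbol(op=lambda x, y: x + y, inputs=(self, other))

    def bind(self, values):
        """Attach concrete values to the placeholders and return a forward function."""
        def forward():
            return self._eval(values)
        return forward

    def _eval(self, values):
        if self.op is None:
            # A placeholder: look up the value that was bound to its name.
            return values[self.name]
        return self.op(*(s._eval(values) for s in self.inputs))


# Defining the expression computes nothing yet.
a, b = Symbol('a'), Symbol('b')
expr = a * b + a

# Binding supplies the data; calling forward() runs the computation.
forward = expr.bind({'a': 3, 'b': 4})
result = forward()
print(result)  # 15
```

The same expression can be bound again to different values without being redefined, which is the point of keeping the graph and the data separate.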

KV Store: This is a key-value store used to synchronize data across multiple devices. It has two main operations: push sends a key-value pair to the store, and pull retrieves the value stored for a key. This again serves parallel computation and makes the framework more efficient.
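The push and pull pattern can be sketched with a small dictionary-backed class. `ToyKVStore` is a hypothetical name used only here, and summing the pushed values is an assumption made for this sketch; it runs in a single process and does not reproduce MXNet's actual KV store.

```python
class ToyKVStore:
    """A single-process stand-in for a key-value store shared by several devices."""

    def __init__(self):
        self._store = {}

    def init(self, key, value):
        # Register a key with its initial value.
        self._store[key] = value

    def push(self, key, values):
        # Combine the values sent by the devices (here: sum them) and store the result.
        self._store[key] = sum(values)

    def pull(self, key):
        # Return the current value stored for the key.
        return self._store[key]


kv = ToyKVStore()
kv.init("weight", 0.0)
kv.push("weight", [1.0, 2.0, 3.0])  # e.g. one value from each of three devices
value = kv.pull("weight")
print(value)  # 6.0
```

Every device that pulls the key afterwards sees the same combined value, which is what keeps the copies in sync.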

Implementation of MXNet on Random Data

Based on the above description of the framework, let us implement these components to get a better understanding. We will generate random data for this implementation, so do not try to make sense of the values.

The first step is installing the packages. I will use the Python programming language; the MXNet documentation covers the other supported languages. To install MXNet, use this command:

pip install mxnet

Once the installation is done, we will create a dataset and store it in NDArrays.

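import mxnet as mx
import numpy as np

custom_data = 1000                  # total number of samples
trainset = 800                      # samples used for training
testset = custom_data - trainset    # samples used for testing
features_size = 100                 # features per sample
targets_size = 10                   # number of target classes
# Features: uniform random values between 0 and 1
dataset = mx.nd.uniform(low=0, high=1, shape=(custom_data, features_size))
# Targets: one random integer class label (0 to 9) per sample
target = mx.nd.array(np.random.randint(0, targets_size, size=(custom_data,)), dtype='float32')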

We have generated 1000 random data points, each with 100 features. The target contains integers between 0 and 9. This data is stored in the form of NDArrays. Let us split the data into train and test sets: 80% for training and 20% for testing.

# The first 800 rows are for training and the rest for testing; all feature columns are kept
xtrain = mx.nd.crop(dataset, begin=(0, 0), end=(trainset, features_size))
xtest = mx.nd.crop(dataset, begin=(trainset, 0), end=(custom_data, features_size))
ytrain = target[0:trainset]
ytest = target[trainset:custom_data]

The next step is to define symbols for this dataset so that the computation can be parallelised.

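data = mx.sym.Variable('data')

Now that we have assigned a symbol for data, let us build the model.

# Hidden layer with 64 units, followed by a ReLU activation
layer1 = mx.sym.FullyConnected(data, name='layer1', num_hidden=64)
relu1 = mx.sym.Activation(layer1, name='relu1', act_type="relu")
# Output layer with one unit per target class
layer2 = mx.sym.FullyConnected(relu1, name='layer2', num_hidden=targets_size)
output = mx.sym.SoftmaxOutput(layer2, name='softmax')
model = mx.mod.Module(output)
# Iterator that feeds the training NDArrays to the model in batches
batch = 10  # batch size; an assumed value, not defined in the original code
train_iter = mx.io.NDArrayIter(data=xtrain, label=ytrain, batch_size=batch)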

Once we have assigned our symbols to the correct NDArray, we need to bind these two together.

model.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)

Let us now assign optimizers and fit the model on training data. 

# fit() initializes the parameters and the SGD optimizer, then trains for 50 epochs
model.fit(train_iter, optimizer='sgd', optimizer_params=(('learning_rate', 0.1),), num_epoch=50)

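Although the training accuracy looks great, remember that this is not a real dataset. The labels are random, so the model can only memorise them, and the example is meant purely to illustrate the NDArray, binding and symbols used in MXNet.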


MXNet is a machine learning library that combines symbolic expressions with array computation to maximize efficiency and flexibility. Parallel computation with this kind of efficiency can make it practical to build deep learning models even on systems without a built-in GPU. MXNet is an Apache Software Foundation project and is an up-and-coming framework for developers working in many different programming languages.

Bhoomika Madhukar
I am an aspiring data scientist with a passion for teaching. I am a computer science graduate from Dayananda Sagar Institute. I have experience in building models in deep learning and reinforcement learning. My goal is to use AI in the field of education to make learning meaningful for everyone.
