Hands-on Guide To Create Ensemble Of Convolutional Neural Networks

In this article, we create an ensemble of 10 convolutional neural network (CNN) models and apply it to multi-class prediction on the MNIST handwritten digit data set.

Convolutional neural networks (CNNs) have proven their worth as deep learning models in a variety of applications. They are good at extracting features from large data sets and making predictions. In most applications, a single CNN model is used. However, a group of CNN models can also be applied to the same task as an ensemble. In one of our articles, we discussed customizing ensemble learning models and saw that doing so improved their performance. Here, we apply the ensembling approach to CNN models. If it works, it can be applied to tasks where a single CNN gives lower accuracy than expected.

We first define the individual CNN models and train them one after another. On the test data, every individual model gives its prediction, and the ensemble's final prediction is the class that the individual models favour most. This is the same idea as max voting for classification ensembles.
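The max-voting rule described above can be sketched in a few lines. This is only a minimal illustration, not the article's training code: the `majority_vote` helper and the toy votes are assumptions made up for the example.

```python
from collections import Counter

def majority_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical predictions from 5 models for 3 images (one inner list per image)
votes_per_image = [
    [7, 7, 1, 7, 9],
    [3, 3, 3, 3, 3],
    [4, 9, 9, 4, 9],
]

ensemble_labels = [majority_vote(v) for v in votes_per_image]
print(ensemble_labels)  # [7, 3, 9]
```

A single model's mistake (the 1 and the 9 for the first image) is outvoted as long as most models agree on the right class.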


This code was implemented in Google Colab and the .py file was downloaded.

First, we need to import the required libraries. 

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
import itertools
import math
from sklearn.metrics import confusion_matrix, mean_squared_error
from sklearn.model_selection import train_test_split, KFold
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.layers import BatchNormalization
from keras.utils import to_categorical # convert to one-hot-encoding
from keras.optimizers import Adam, RMSprop, Adagrad

After importing the required libraries, we will read the MNIST handwritten digit data set, which Google Colab provides as sample data.

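#Reading the data
#The Colab sample CSVs have no header row, so header=None keeps the first image as data
train = pd.read_csv("sample_data/mnist_train_small.csv", header=None)
test = pd.read_csv("sample_data/mnist_test.csv", header=None)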

Now, we need to specify the training and test sets using the lines below. First, we look at the first few rows, and then we select the required columns: the first column holds the label and the remaining 784 columns hold the pixel values.

#Training data head
train.head()

#Specifying train and test data
train_X = train.iloc[:,1:]
train_y = train.iloc[:,0]
test = test.iloc[:,1:]

#Shape of the specified data
print(train_X.shape, train_y.shape, test.shape)

Now, we will normalize the training and test data by scaling the pixel values from the 0-255 range to the 0-1 range.

#Normalize the data
train_X = train_X / 255.0
test = test / 255.0

For compatibility with the CNN model, we need to reshape the data.

#Reshape image in 3 dimensions (with 1 channel)
train_X = train_X.values.reshape(-1,28,28,1)
test = test.values.reshape(-1,28,28,1)
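As a quick sanity check on the scaling and reshaping steps, the sketch below applies them to a small synthetic array instead of the real CSV data. The random pixel values and the names `flat`, `scaled` and `images` are assumptions for illustration only.

```python
import numpy as np

# Stand-in for the pixel columns: 5 fake images, 784 values each, range 0-255
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=(5, 784)).astype("float64")

scaled = flat / 255.0                      # same scaling as in the article
images = scaled.reshape(-1, 28, 28, 1)     # (samples, height, width, channels)

print(images.shape)                        # (5, 28, 28, 1)

# Row-major reshape: pixel k of a flat row lands at [k // 28, k % 28]
assert images[0, 3, 7, 0] == scaled[0, 3 * 28 + 7]
```

The `-1` lets NumPy infer the number of samples, so the same call works for both the training and the test arrays.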

Since there are 10 output classes, we need to one-hot encode the labels of the data set.

#Encode labels to one hot vectors
train_y = to_categorical(train_y, num_classes = 10)
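To make the encoding concrete, here is an illustrative NumPy sketch that builds the same one-hot layout for a few made-up labels. It is not a replacement for `to_categorical`; the labels and variable names are assumptions.

```python
import numpy as np

labels = np.array([5, 0, 9])          # made-up digit labels
one_hot = np.eye(10)[labels]          # row i of the identity matrix encodes class i

print(one_hot[0])                     # 1.0 at index 5, zeros elsewhere
recovered = one_hot.argmax(axis=1)    # argmax inverts the encoding
print(recovered)                      # [5 0 9]
```

Each row has exactly one 1, which is the target shape that the `categorical_crossentropy` loss expects alongside a 10-way softmax output.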

We will check one random image from the training data set.

#Sample image (index 0 is an arbitrary choice)
plt.imshow(train_X[0][:,:,0])

In the next step, we will define 10 CNN models compatible with our data set.

# Define 10 CNN models
nets = 10
model = [0] *nets

for j in range(nets):
    model[j] = Sequential()
    #First Layer
    model[j].add(Conv2D(32, kernel_size = 3, activation='relu', input_shape = (28, 28, 1)))
    model[j].add(Conv2D(32, kernel_size = 3, activation='relu'))
    model[j].add(Conv2D(32, kernel_size = 5, strides=2, padding='same', activation='relu'))

    #Second Layer
    model[j].add(Conv2D(64, kernel_size = 3, activation='relu'))
    model[j].add(Conv2D(64, kernel_size = 3, activation='relu'))
    model[j].add(Conv2D(64, kernel_size = 5, strides=2, padding='same', activation='relu'))

    #Third layer
    model[j].add(Conv2D(128, kernel_size = 4, activation='relu'))

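    #Output layer
    # Flatten the 1x1x128 feature map so that Dense outputs one 10-way softmax per image
    model[j].add(Flatten())
    model[j].add(Dense(10, activation='softmax'))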

    # Compile each model
    model[j].compile(optimizer='adam', loss="categorical_crossentropy", metrics=["accuracy"])

print('All Models Defined')
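To see how the seven convolutions shrink a 28x28 input, the spatial size can be traced by hand. The sketch below uses the standard output-size formulas for 'valid' padding (the Keras default) and 'same' padding; the `conv_out` helper is a hypothetical name introduced for this illustration.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a square convolution."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

# (kernel, stride, padding) for the seven Conv2D layers defined above
layers = [
    (3, 1, "valid"), (3, 1, "valid"), (5, 2, "same"),
    (3, 1, "valid"), (3, 1, "valid"), (5, 2, "same"),
    (4, 1, "valid"),
]

size = 28
sizes = []
for kernel, stride, padding in layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [26, 24, 12, 10, 8, 4, 1]
```

The strided 5x5 convolutions take the place of pooling layers, and the last convolution reduces each image to a 1x1 map with 128 channels before the output stage.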

We will use a learning rate annealer in this experiment. The annealer lowers the learning rate when the monitored metric stops improving. Here, we monitor the validation accuracy: if it does not improve for 3 consecutive epochs, the learning rate is halved (factor=0.5), down to a minimum of 0.00001.

#LR Reduction Callback
from keras.callbacks import ReduceLROnPlateau
learning_rate_reduction=ReduceLROnPlateau(monitor='val_accuracy', patience=3, verbose=0, factor=0.5, min_lr=0.00001)
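The plateau rule can be sketched in plain Python to show when the halving kicks in. This is a simplified model of the callback, not its actual implementation: it ignores the `min_delta` and `cooldown` options, the `simulate_lr` function and the accuracy values are made up, and the starting rate of 0.001 assumes the Keras default for Adam.

```python
def simulate_lr(val_acc, lr=0.001, factor=0.5, patience=3, min_lr=1e-5):
    """Return the learning rate in effect after each epoch."""
    best = float("-inf")
    wait = 0
    schedule = []
    for acc in val_acc:
        if acc > best:
            best = acc          # improvement: reset the patience counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)   # reduce, but never below min_lr
                wait = 0
        schedule.append(lr)
    return schedule

# Hypothetical validation accuracies over 9 epochs
val_acc = [0.90, 0.95, 0.95, 0.94, 0.95, 0.96, 0.96, 0.96, 0.96]
schedule = simulate_lr(val_acc)
print(schedule)
```

With these values the rate drops to 0.0005 at the fifth epoch, after three epochs without beating 0.95, and to 0.00025 at the ninth.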

In the next step, we will train the models that we have defined above.

# train for 20 epochs
history = [0] * nets
epochs = 20

datagen = ImageDataGenerator(rotation_range=13, zoom_range=0.11, width_shift_range=0.1, height_shift_range=0.1)

for j in range(nets):
    print(f'Individual Net : {j+1}')   
    X_train2, X_val2, Y_train2, Y_val2 = train_test_split(train_X, train_y, test_size = 0.1)
    # fit accepts the generator directly; fit_generator is deprecated and absent from recent Keras releases
    history[j] = model[j].fit(datagen.flow(X_train2,Y_train2, batch_size=64), epochs = epochs, steps_per_epoch = X_train2.shape[0]//64, validation_data = (X_val2,Y_val2), callbacks=[learning_rate_reduction], verbose=0)

    print("CNN Model {0:d}: Epochs={1:d}, Training accuracy={2:.5f}, Validation accuracy={3:.5f}".format(j+1,epochs,max(history[j].history['accuracy']),max(history[j].history['val_accuracy']) ))

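After training, we make predictions with all 10 models. The code below adds up the class probabilities predicted by the 10 models and takes the class with the highest total as the final prediction. This is a soft-voting variant of max voting, and it usually agrees with the most frequent individual prediction.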

results = np.zeros( (test.shape[0],10) ) 
for j in range(nets):
    results = results + model[j].predict(test)
results = np.argmax(results,axis = 1)
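Summing probabilities and counting votes are closely related but not identical. The sketch below contrasts the two on made-up softmax outputs from 3 models for 2 images and 3 classes; the array `probs` and its values are assumptions for illustration.

```python
import numpy as np

# probs[m, i, c]: probability that model m assigns to class c for image i
probs = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.2, 0.7]],
    [[0.4, 0.6, 0.0], [0.0, 0.3, 0.7]],
    [[0.4, 0.6, 0.0], [0.2, 0.2, 0.6]],
])

# Soft voting (what the loop above does): sum probabilities, then argmax
soft = probs.sum(axis=0).argmax(axis=1)

# Hard voting: each model votes for its own argmax, most frequent label wins
labels = probs.argmax(axis=2)                      # shape (models, images)
n_classes = probs.shape[2]
hard = np.array([
    np.bincount(labels[:, i], minlength=n_classes).argmax()
    for i in range(labels.shape[1])
])

print(soft)  # [0 2]
print(hard)  # [1 2]
```

For the first image, one very confident model outweighs two mildly confident ones under soft voting, while hard voting follows the two-to-one majority. When the models broadly agree, as for the second image, both rules give the same label.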

Now we will check the prediction on one sample image.

#Test on result (index 0 is an arbitrary choice)
plt.imshow(test[0][:,:,0])
plt.title(f"Predicted label: {results[0]}")

Finally, we will check the prediction label on a few more test images.

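L = 4
W = 4
fig, axes = plt.subplots(L, W, figsize = (12,12))
axes = axes.ravel()

# Show the first L * W test images with their predicted labels (assumed loop body)
for i in np.arange(0, L * W):
    axes[i].imshow(test[i].reshape(28,28))
    axes[i].set_title(f"Prediction Class = {results[i]}")
    axes[i].axis('off')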

As we can see above, our ensemble model has given correct predictions for the 16 images shown. We can check the model with more images. This ensemble of convolutional neural networks can also be applied to a larger image data set to check its performance. In those cases, we can increase the number of individual CNN models and train for more epochs.

Dr. Vaibhav Kumar
Vaibhav Kumar has experience in the field of Data Science and Machine Learning, including research and development. He holds a PhD degree in which he has worked in the area of Deep Learning for Stock Market Prediction. He has published/presented more than 15 research papers in international journals and conferences. He has an interest in writing articles related to data science, machine learning and artificial intelligence.
