Hands-on Guide To Create Ensemble Of Convolutional Neural Networks

Dr. Vaibhav Kumar

Convolutional neural networks (CNNs) have proven effective as deep learning models in a variety of applications, particularly when extracting features from large data sets and making predictions. Most applications use a single CNN model. However, a group of CNN models can also be applied to the same task as an ensemble. In one of our articles, we discussed the customization of ensemble learning models and saw their increased efficiency. Now, let us apply this ensembling approach to CNN models. If successful, it can be applied to tasks where a single CNN model has given lower accuracy than expected.

In this article, we will create an ensemble of 10 CNN models and apply it to multi-class prediction on the MNIST handwritten digit data set. First, we will define the individual CNN models and train them one after another. On the test data, every individual model gives its own prediction, and the final prediction of the ensemble is the most frequent prediction among the individual CNN models. The same strategy is used when creating max voting ensembles for classification.
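
To make the voting idea concrete, here is a minimal NumPy sketch of max voting over hard labels. The function name `majority_vote` and the toy votes are illustrative assumptions, not part of the original code.

```python
import numpy as np

def majority_vote(predictions):
    """Return the most frequent label per sample.

    predictions: int array of shape (n_models, n_samples).
    Ties go to the smallest label because argmax returns the first maximum.
    """
    predictions = np.asarray(predictions)
    n_classes = predictions.max() + 1
    # counts[c, s] = number of models that predicted class c for sample s
    counts = np.stack([(predictions == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)

# Three hypothetical models voting on four samples
votes = np.array([[7, 2, 1, 0],
                  [7, 3, 1, 0],
                  [9, 2, 1, 4]])
print(majority_vote(votes))  # [7 2 1 0]
```

Each column holds the votes for one sample, so a label wins as soon as more models choose it than any other label.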


This code was implemented in Google Colab and the .py file was downloaded.

First, we need to import the required libraries. 

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
import itertools
import math
from sklearn.metrics import confusion_matrix, mean_squared_error
from sklearn.model_selection import train_test_split, KFold
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.layers import BatchNormalization
from keras.utils import to_categorical # convert to one-hot-encoding
from keras.optimizers import Adam, RMSprop, Adagrad

After importing the required libraries, we will read the MNIST handwritten data set that is provided publicly in Google Colab as sample data.

#Reading the data
train = pd.read_csv("sample_data/mnist_train_small.csv")
test = pd.read_csv("sample_data/mnist_test.csv")

Now, we need to specify the training and test sets using the lines of code below. First, we will check the header, and then we will select the required columns: the first column holds the label and the remaining columns hold the pixel values.

#Training data head
train.head()

#Specifying train and test data
train_X = train.iloc[:,1:]
train_y = train.iloc[:,0]
test = test.iloc[:,1:]

#Shape of the specified data
print(train_X.shape, train_y.shape, test.shape)

Now, we will normalize the training and test data by scaling the pixel values from the range 0 to 255 down to the range 0 to 1.

#Normalize the data
train_X = train_X / 255.0
test = test / 255.0

For compatibility with the CNN model, we need to reshape the data.

#Reshape image in 3 dimensions (with 1 channel)
train_X = train_X.values.reshape(-1,28,28,1)
test = test.values.reshape(-1,28,28,1)
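
The effect of the two preprocessing steps can be checked on a small stand-in array. This is a sketch under assumptions: `flat_pixels` is random data standing in for the pixel columns of the CSV, and the names are illustrative.

```python
import numpy as np

# Stand-in for the pixel columns of the CSV: 5 hypothetical images, 784 values each
rng = np.random.default_rng(0)
flat_pixels = rng.integers(0, 256, size=(5, 784))

scaled = flat_pixels / 255.0            # pixel values now lie in [0, 1]
images = scaled.reshape(-1, 28, 28, 1)  # (samples, height, width, channels)

print(images.shape)  # (5, 28, 28, 1)
# Row-major reshape: pixel k of a flat row lands at position [k // 28, k % 28]
print(images[0, 1, 2, 0] == scaled[0, 1 * 28 + 2])  # True
```

The `-1` lets NumPy infer the number of samples, and the trailing `1` is the single grayscale channel that `Conv2D` expects.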

Since the output is one of 10 classes, we need to one-hot encode the labels of the data set.

#Encode labels to one hot vectors
train_y = to_categorical(train_y, num_classes = 10)
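
For readers unfamiliar with one-hot encoding, the following NumPy sketch reproduces the idea without Keras. The helper `one_hot` is a hypothetical stand-in; its result should match what `to_categorical` returns for integer labels, although that equivalence is stated here as an assumption rather than checked against Keras.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Row i is all zeros except for a 1 at index labels[i]."""
    return np.eye(num_classes, dtype="float32")[np.asarray(labels)]

encoded = one_hot([3, 0, 9])
print(encoded[0])              # a 1 at index 3, zeros elsewhere
print(encoded.argmax(axis=1))  # argmax recovers the original labels
```

Taking `argmax` along the class axis inverts the encoding, which is the same operation later used to turn softmax outputs back into digit labels.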

We will check one random image from the training data set.

#Sample image
[Figure: a sample digit image from the training set]

In the next step, we will define 10 CNN models compatible with our data set.

# Define 10 CNN models
nets = 10
model = [None] * nets

for j in range(nets):
    model[j] = Sequential()
    # First convolutional block
    model[j].add(Conv2D(32, kernel_size = 3, activation='relu', input_shape = (28, 28, 1)))
    model[j].add(Conv2D(32, kernel_size = 3, activation='relu'))
    model[j].add(Conv2D(32, kernel_size = 5, strides=2, padding='same', activation='relu'))

    # Second convolutional block
    model[j].add(Conv2D(64, kernel_size = 3, activation='relu'))
    model[j].add(Conv2D(64, kernel_size = 3, activation='relu'))
    model[j].add(Conv2D(64, kernel_size = 5, strides=2, padding='same', activation='relu'))

    # Third convolutional block: reduces the 4x4 feature map to 1x1
    model[j].add(Conv2D(128, kernel_size = 4, activation='relu'))

    # Flatten (1, 1, 128) to (128,) so the output matches the one-hot labels
    model[j].add(Flatten())

    # Output layer
    model[j].add(Dense(10, activation='softmax'))

    # Compile each model
    model[j].compile(optimizer='adam', loss="categorical_crossentropy", metrics=["accuracy"])

print('All Models Defined')
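
The spatial size of the feature map after each convolution can be traced with a few lines of arithmetic. This sketch assumes the standard output-size rules for `valid` and `same` padding; the helper `conv_out` is illustrative and not part of Keras.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

# (kernel_size, strides, padding) of the seven Conv2D layers, in order
layers = [(3, 1, "valid"), (3, 1, "valid"), (5, 2, "same"),
          (3, 1, "valid"), (3, 1, "valid"), (5, 2, "same"),
          (4, 1, "valid")]

size = 28
sizes = [size]
for kernel, stride, padding in layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [28, 26, 24, 12, 10, 8, 4, 1]
```

The last convolution therefore yields a 1x1 map with 128 channels per image, which is the tensor that reaches the output stage of each network.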

We will use a learning rate annealer in this experiment. The annealer lowers the learning rate when a monitored metric stops improving. Here, we monitor the validation accuracy: if it has not improved for 3 consecutive epochs, the learning rate is halved (factor=0.5), down to a minimum of 0.00001.

#LR Reduction Callback
from keras.callbacks import ReduceLROnPlateau
learning_rate_reduction=ReduceLROnPlateau(monitor='val_accuracy', patience=3, verbose=0, factor=0.5, min_lr=0.00001)
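
The plateau rule can be illustrated without Keras. The sketch below is a simplified re-implementation for intuition only: it ignores the `min_delta` and `cooldown` options of the real callback, and the validation accuracies are made-up numbers. The starting rate of 0.001 is the default of the Adam optimizer.

```python
def anneal(val_accuracies, lr=0.001, patience=3, factor=0.5, min_lr=0.00001):
    """Simplified plateau schedule: learning rate in effect after each epoch."""
    best = float("-inf")
    wait = 0
    schedule = []
    for acc in val_accuracies:
        if acc > best:
            best, wait = acc, 0        # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:       # plateau: cut the rate, never below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

# Hypothetical validation accuracies over nine epochs
val_acc = [0.90, 0.95, 0.95, 0.94, 0.95, 0.96, 0.96, 0.96, 0.96]
schedule = anneal(val_acc)
print(schedule)
```

In this toy run the rate is halved after epoch 5 and again after epoch 9, each time following three epochs without a new best validation accuracy.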

In the next step, we will train the models that we have defined above.

# Train each model for 20 epochs
history = [None] * nets
epochs = 20

datagen = ImageDataGenerator(rotation_range=13, zoom_range=0.11, width_shift_range=0.1, height_shift_range=0.1)

for j in range(nets):
    print(f'Individual Net : {j+1}')
    # Each model gets its own random 90/10 train/validation split
    X_train2, X_val2, Y_train2, Y_val2 = train_test_split(train_X, train_y, test_size = 0.1)
    # fit() accepts generators; fit_generator() is deprecated in recent Keras versions
    history[j] = model[j].fit(datagen.flow(X_train2, Y_train2, batch_size=64), epochs = epochs,
                              steps_per_epoch = X_train2.shape[0]//64, validation_data = (X_val2, Y_val2),
                              callbacks=[learning_rate_reduction], verbose=0)

    print("CNN Model {0:d}: Epochs={1:d}, Training accuracy={2:.5f}, Validation accuracy={3:.5f}".format(
        j+1, epochs, max(history[j].history['accuracy']), max(history[j].history['val_accuracy'])))

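After training, we will make predictions using all 10 trained models. The code below sums the class probabilities predicted by the 10 models and picks the class with the highest total. This is a soft-voting variant of max voting that usually, though not always, agrees with the most frequent individual prediction.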

results = np.zeros( (test.shape[0],10) ) 
for j in range(nets):
    results = results + model[j].predict(test)
results = np.argmax(results,axis = 1)
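
Summing probabilities and counting votes can disagree when one model is much more confident than the others. The toy example below, with made-up softmax outputs for three hypothetical models, shows such a case.

```python
import numpy as np

# Softmax outputs of three hypothetical models for one sample and three classes
probs = np.array([
    [[0.40, 0.60, 0.00]],   # model 1 votes for class 1
    [[0.45, 0.55, 0.00]],   # model 2 votes for class 1
    [[0.90, 0.10, 0.00]],   # model 3 votes for class 0 with high confidence
])                          # shape: (n_models, n_samples, n_classes)

# Soft voting: sum the probabilities over models, then take the argmax
soft = probs.sum(axis=0).argmax(axis=1)

# Hard voting: take each model's own label, then the most frequent one
hard_votes = probs.argmax(axis=2)
hard = np.array([np.bincount(col, minlength=probs.shape[2]).argmax()
                 for col in hard_votes.T])

print(soft, hard)  # [0] [1]
```

Soft voting lets a confident model outweigh two hesitant ones, whereas hard voting treats every vote equally; which behaves better depends on how well calibrated the individual models are.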

Now we will check the prediction on one sample image.

#Test on result
[Figure: one test image with its predicted label]

Finally, we will check the prediction label on a few more test images.

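L = 4
W = 4
fig, axes = plt.subplots(L, W, figsize = (12,12))
axes = axes.ravel()

# Loop body reconstructed (assumption): show the first L*W test images with predicted labels
for i in np.arange(0, L * W):
    axes[i].imshow(test[i].reshape(28, 28), cmap='gray')
    axes[i].set_title(f"Prediction = {results[i]}")
    axes[i].axis('off')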

[Figure: 4 x 4 grid of test images with their predicted labels]

So, as we can see above, our ensemble model has given correct predictions for the 16 images shown. We can check the model on more images. This ensemble of convolutional neural networks can also be applied to a larger image data set to check its performance. In such cases, we can increase the number of individual CNN models and train for more epochs.
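
Checking more images by eye does not scale, so a numeric accuracy is a natural next step. The code in this article drops the label column of the test CSV, so the sketch below assumes the labels were kept in a separate array (called `actual` here, a hypothetical name) before the pixel columns were selected. The numbers are toy values, not results from the experiment.

```python
import numpy as np

def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    return float((predicted == actual).mean())

def misclassified(predicted, actual):
    """Indices of the samples the ensemble got wrong."""
    return np.flatnonzero(np.asarray(predicted) != np.asarray(actual))

# Toy labels standing in for the ensemble output and the true test labels
predicted = [7, 2, 1, 0, 4, 1, 4, 9]
actual    = [7, 2, 1, 0, 4, 1, 9, 9]

print(accuracy(predicted, actual))       # 0.875
print(misclassified(predicted, actual))  # [6]
```

The indices returned by `misclassified` can be fed to the plotting loop to inspect only the digits the ensemble gets wrong.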

Copyright Analytics India Magazine Pvt Ltd
