Last updated July 16, 2022

Everything You Should Know About Dropouts And BatchNormalization In CNN

Through this article, we will be exploring Dropout and BatchNormalization, and after which layer we should add them.

Published on September 14, 2020

by Rohit Dwivedi

In computer vision, when we build convolutional neural networks for image-related problems such as image classification and image segmentation, we often define a network made up of different layers: convolution layers, pooling layers, dense layers, and so on. We also add batch normalization and dropout layers to keep the model from overfitting. But there is a lot of confusion about which layers Dropout and BatchNormalization should follow.

In this article, we look at what Dropout and BatchNormalization do and after which layers they should be added. We use the benchmark MNIST dataset, which consists of images of handwritten digits from 0 to 9. The dataset can be loaded directly through Keras, and it is also publicly available on Kaggle.

What will we learn from this article?

What does a CNN network consist of?
What is BatchNormalization? Where is it used?
What are Dropouts? Where are they used?

What does a CNN network consist of?

A convolutional neural network (CNN) is a deep learning model built from convolution layers, which extract feature maps from the image using different numbers of kernels. These are followed by pooling layers, which reduce the spatial dimensions of the feature maps. Two common types of pooling are max pooling and average pooling. The network also includes other layers such as dropout and dense layers. The image below shows an example of a CNN.
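To see how convolution and pooling layers change the size of an image, here is a small sketch in plain Python. It assumes a 28x28 input, three 3x3 convolutions with no padding and stride 1, one 2x2 max pooling step with stride 2, and 32 feature maps. These values are illustrative assumptions, and the helper name conv_output_size is hypothetical rather than part of any library.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window slides over the input."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28  # assumed input: a 28x28 image
# Three 3x3 convolutions with no padding and stride 1
for _ in range(3):
    size = conv_output_size(size, kernel=3)
conv_size = size

# One 2x2 max pooling step with stride 2
pooled_size = conv_output_size(conv_size, kernel=2, stride=2)

# Flattening 32 feature maps of that size into one vector
flat_features = pooled_size * pooled_size * 32

print(conv_size, pooled_size, flat_features)  # 22 11 3872
```

Each unpadded 3x3 convolution trims one pixel from every border, and the pooling step halves the width and height, so the dense layers that follow would receive a vector of 3,872 values per image under these assumptions.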

What is BatchNormalization? Where is it used?

Batch normalization is a layer that allows each layer of the network to learn more independently of the others. It normalizes the output of the previous layer: the activations are standardized over each mini-batch and then rescaled and shifted by learned parameters. Batch normalization makes learning more efficient, and it can also act as a regularizer that helps avoid overfitting. The layer is added to the sequential model to standardize the inputs or outputs of a layer, and it can be used at several points between the layers of the model. It is often placed right after the sequential model is defined, or after the convolution and pooling layers.

The code below shows how to define the BatchNormalization layer for classifying handwritten digits. We first import the required libraries and load the dataset.

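import tensorflow as tf
from tensorflow import keras

# Load MNIST: 28x28 grayscale images of handwritten digits with integer labels
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
print(X_train.shape)  # (60000, 28, 28)
print(X_test.shape)   # (10000, 28, 28)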

There are 60,000 images in the training set and 10,000 images in the test set. Next, we reshape the training and test images and then define the CNN, with a BatchNormalization layer after the convolution layers. Use the code below.

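# Add a channel axis so each image has shape (28, 28, 1), as Conv2D expects
X_train = X_train.reshape(len(X_train), 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense

model = Sequential()
# Convolution layers extract feature maps from the image
model.add(Conv2D(32, (3, 3), input_shape=X_train[0].shape, activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
# Normalize the convolution outputs, then downsample them
model.add(BatchNormalization())
model.add(MaxPooling2D())
# Flatten the feature maps into one vector per image for the dense layers
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=32, activation='relu'))
# One output per digit class (0-9)
model.add(Dense(units=10, activation='softmax'))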
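To make the normalization step in this section more concrete, here is a minimal NumPy sketch of what a batch normalization layer computes on one mini-batch during training. It is an illustration under stated assumptions, not the Keras implementation: the function name batch_norm_forward, the batch size, the feature count, and the epsilon value are all hypothetical choices, and a real layer also keeps moving averages of the mean and variance for use at inference time.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply a scale and a shift."""
    mean = x.mean(axis=0)                     # per-feature mean over the batch
    var = x.var(axis=0)                       # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)   # roughly zero mean, unit variance
    return gamma * x_hat + beta


rng = np.random.default_rng(0)
# Hypothetical batch: 64 samples, 4 features, deliberately off-center and wide
activations = rng.normal(loc=5.0, scale=3.0, size=(64, 4))

gamma = np.ones(4)   # scale, learned during training in a real layer
beta = np.zeros(4)   # shift, learned during training in a real layer
normalized = batch_norm_forward(activations, gamma, beta)

print(normalized.mean(axis=0).round(3))  # each entry close to 0
print(normalized.std(axis=0).round(3))   # each entry close to 1
```

Whatever the scale of the incoming activations, the next layer sees values with roughly zero mean and unit variance, which is one reason training tends to become more stable.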

What are Dropouts? Where are they used?

Dropout is a regularization technique used to prevent overfitting. A dropout layer randomly switches off some percentage of the neurons in the network during training. When a neuron is switched off, its incoming and outgoing connections are switched off as well. This is done to improve how well the model generalizes. Dropout is usually not advised after the convolution layers; it is mostly used after the dense layers of the network. A common rule of thumb is to switch off no more than 50% of the neurons. If we switch off more than 50%, the model may learn poorly and its predictions may suffer. Let us see how to define dropout layers while building a CNN model. We will use the same MNIST data.
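As a rough illustration of the switching-off described above, here is a minimal NumPy sketch of dropout applied to a batch of activations. It assumes the common "inverted dropout" formulation, in which the surviving units are scaled up by 1 / (1 - rate) during training; the Keras Dropout layer documents the same rescaling. The function name dropout_forward and the array sizes are hypothetical.

```python
import numpy as np


def dropout_forward(x, rate, rng, training=True):
    """Inverted dropout: zero a random fraction of units and rescale the rest."""
    if not training or rate == 0.0:
        return x  # at inference time every unit stays active
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob  # True for units that stay on
    return x * mask / keep_prob


rng = np.random.default_rng(42)
activations = np.ones((1000, 128))  # hypothetical outputs of a 128-unit dense layer

dropped = dropout_forward(activations, rate=0.25, rng=rng)
zero_fraction = float((dropped == 0).mean())

print(round(zero_fraction, 2))          # close to 0.25
print(round(float(dropped.mean()), 2))  # close to 1.0 thanks to the rescaling

# With training=False the input passes through unchanged
inference_output = dropout_forward(activations, rate=0.25, rng=rng, training=False)
```

Because the kept activations are scaled up, the expected sum reaching the next layer stays about the same whether dropout is active or not, so no extra correction is needed at prediction time.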

We first import the libraries and load the dataset, followed by a bit of pre-processing of the images. Use the code below.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense, Dropout

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
# Pre-processing: add a channel axis so each image has shape (28, 28, 1)
X_train = X_train.reshape(len(X_train), 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)

model = Sequential()
model.add(Conv2D(64, (3, 3), input_shape=X_train[0].shape, activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D())
# Flatten the feature maps into one vector per image for the dense layers
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
# Randomly switch off 25% of the 128 dense units at each training step
model.add(Dropout(0.25))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=10, activation='softmax'))

Conclusion

I hope you now have a fair idea of what dropout and batch normalization layers are. We started by exploring what a CNN consists of, followed by what dropout and batch normalization are. We then used the MNIST dataset to build two different models. A batch normalization layer can be used several times in a CNN, and its placement is up to the programmer. Multiple dropout layers can also be placed between different layers, but the most reliable choice is to add them after dense layers.

Rohit Dwivedi

I am currently enrolled in a Post Graduate Program in Artificial Intelligence and Machine Learning. I am a data science enthusiast who likes to draw insights from data, and I am always amazed by the intelligence of AI. It is really fascinating to teach a machine to see and understand images, and the interest doubles when the machine can tell you what it just saw. This is why I am highly interested in Computer Vision and Natural Language Processing. I love exploring different use cases that can be built with the power of AI. I am the person who first develops something and then explains it to the whole community through my writing.