# Everything You Should Know About Dropouts And BatchNormalization In CNN

In this article, we explore Dropout and BatchNormalization and discuss after which layers they should be added.

In computer vision, when we build convolutional neural networks for image-related problems such as image classification and image segmentation, we often define a network made up of different layers: convolution layers, pooling layers, dense layers, and so on. We also add batch normalization and dropout layers to keep the model from overfitting. But there is a lot of confusion about which layers Dropout and BatchNormalization should follow.

For this article, we use the benchmark MNIST dataset, which consists of images of handwritten digits from 0 to 9. The dataset can be loaded directly through Keras, and it is also publicly available on Kaggle.

1. What does a CNN network consist of?
2. What is BatchNormalization? Where is it used?
3. What are Dropouts? Where are they added?

## 1. What does a CNN network consist of?

A convolutional neural network (CNN) is a deep learning model built from convolution layers, which extract feature maps from the image using a number of kernels. Pooling layers then reduce the spatial dimensions of these feature maps; the two common types are max pooling and average pooling. The network also includes other layers such as dropout and dense layers. The image below shows an example of a CNN.
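
To make the difference between the two pooling types concrete, here is a minimal NumPy sketch of non-overlapping 2x2 pooling on a single feature map. The helper name `pool2d` and the example values are illustrative assumptions, not part of Keras or of the models in this article.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling over an (H, W) map; H and W must be divisible by size."""
    h, w = feature_map.shape
    # Split the map into (size x size) tiles: shape (H/size, size, W/size, size)
    tiles = feature_map.reshape(h // size, size, w // size, size)
    if mode == "max":
        return tiles.max(axis=(1, 3))   # keep the strongest response per tile
    if mode == "average":
        return tiles.mean(axis=(1, 3))  # keep the mean response per tile
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical 4x4 feature map, used only for illustration
feature_map = np.array([
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 4, 8],
    [2, 0, 6, 2],
], dtype=float)

max_pooled = pool2d(feature_map, mode="max")      # [[7, 2], [2, 8]]
avg_pooled = pool2d(feature_map, mode="average")  # [[4, 1], [1, 5]]
print(max_pooled)
print(avg_pooled)
```

Both variants halve the height and width of the map; they differ only in how each 2x2 tile is summarized.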

## 2. What is BatchNormalization? Where is it used?

Batch normalization is a layer that lets each layer of the network learn more independently of the others. It normalizes the output of the previous layer: over each mini-batch, the activations are standardized to zero mean and unit variance and then rescaled by a learned scale and shift. This makes training more efficient, and it also has a mild regularizing effect that can help reduce overfitting. The layer is added to the sequential model to standardize the inputs or the outputs of a layer, and it can be used at several points between the layers of the model. It is often placed just after defining the sequential model and after the convolution and pooling layers.

The code below shows how to define a BatchNormalization layer for the classification of handwritten digits. We first import the required libraries and load the dataset.

import tensorflow as tf
from tensorflow import keras
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
print(X_train.shape)
print(X_test.shape)


There are 60,000 images in the training data and 10,000 images in the testing data, each 28x28 pixels. Now we reshape the training and testing images to add a channel dimension and then define the CNN. Use the code below for the same.

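X_train = X_train.reshape(len(X_train), 28, 28, 1)  # add a channel axis: (N, 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, BatchNormalization
model = Sequential()  # the layers up to Flatten are an assumed reconstruction; only the dense layers survive in the original listing
model.add(Conv2D(32, (3, 3), activation = 'relu', input_shape = (28, 28, 1)))
model.add(BatchNormalization())  # normalize the convolution output
model.add(MaxPooling2D(pool_size = (2, 2)))
model.add(Flatten())  # flatten the feature maps for the dense layers
model.add(Dense(units = 64, activation = 'relu'))
model.add(Dense(units = 32, activation = 'relu'))
model.add(Dense(units = 10, activation = 'softmax'))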
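
The normalization step described in this section can be sketched in a few lines of NumPy. This is a simplified illustration of what happens at training time, not the Keras implementation: the function name `batch_norm_train` is hypothetical, the input is random data, and `eps=1e-3` is chosen to mirror the default `epsilon` of the Keras `BatchNormalization` layer.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Training-time batch normalization over the batch axis (axis 0)."""
    mean = x.mean(axis=0)                     # per-feature mean over the mini-batch
    var = x.var(axis=0)                       # per-feature variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)   # standardize each feature
    return gamma * x_hat + beta               # learned scale and shift

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(128, 4))  # hypothetical batch: 128 samples, 4 features
gamma = np.ones(4)   # scale, initialized to 1
beta = np.zeros(4)   # shift, initialized to 0

out = batch_norm_train(x, gamma, beta)
print(out.mean(axis=0))  # close to 0 for every feature
print(out.std(axis=0))   # close to 1 for every feature
```

At inference time the layer uses moving averages of the mean and variance collected during training instead of the statistics of the current batch, which this sketch leaves out.
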

## 3. What are Dropouts? Where are they used?

Dropout is a regularization technique used to prevent overfitting. A dropout layer randomly switches off a given percentage of the network's neurons during training. When a neuron is switched off, its incoming and outgoing connections are switched off as well. This pushes the model to learn more robust features rather than rely on any single neuron. Dropout is usually not recommended directly after convolution layers; it is mostly used after the dense layers of the network. It is generally advisable to switch off no more than 50% of the neurons. Beyond 50%, the model may learn poorly and its predictions may suffer. Let us see how to define dropout layers while building a CNN model, again using the MNIST data.

We first import the libraries and load the dataset, followed by a bit of pre-processing of the images. Use the code below for the same.

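import tensorflow as tf
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
X_train = X_train.reshape(len(X_train), 28, 28, 1)  # add a channel axis: (N, 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout
model = Sequential()  # the layers up to Flatten are an assumed reconstruction; only the dense layers survive in the original listing
model.add(Conv2D(32, (3, 3), activation = 'relu', input_shape = (28, 28, 1)))
model.add(MaxPooling2D(pool_size = (2, 2)))
model.add(Flatten())
model.add(Dense(units = 64, activation = 'relu'))
model.add(Dropout(0.2))  # drop 20% of the units; the rate is an illustrative assumption
model.add(Dense(units = 32, activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(units = 10, activation = 'softmax'))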
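
To see what "switching off" neurons means numerically, here is a minimal NumPy sketch of inverted dropout, the variant in which the surviving activations are scaled up by 1 / (1 - rate) during training. The function name `dropout_train` and the all-ones input are assumptions made for illustration; this is not the Keras implementation.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of units and rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob   # True for the units that stay on
    return x * mask / keep_prob              # rescale so the expected activation is unchanged

rng = np.random.default_rng(42)
activations = np.ones((1000, 64))            # hypothetical dense-layer output
dropped = dropout_train(activations, rate=0.5, rng=rng)

print((dropped == 0).mean())  # fraction of units switched off, roughly 0.5
print(dropped.mean())         # average activation stays roughly 1.0
```

Because of the rescaling, nothing needs to change at inference time: dropout is simply turned off and the activations pass through unchanged.
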

## Conclusion

I hope you now have a fair idea of what dropout and batch normalization layers are. We started by exploring what a CNN consists of, followed by what batch normalization and dropout are. We used the MNIST dataset to build two different models. A BatchNormalization layer can be used several times in a CNN, and where it goes is up to the programmer. Multiple dropout layers can also be placed between different layers, but adding them after dense layers is the more reliable choice.

