In computer vision, when we build convolutional neural networks (CNNs) for image-related problems such as image classification and image segmentation, we define a network made up of different layers: convolution layers, pooling layers, dense layers, and so on. We also add batch normalization and dropout layers to keep the model from overfitting. But there is a lot of confusion about which layers Dropout and BatchNormalization should follow.
Through this article, we will explore Dropout and BatchNormalization and see after which layers we should add them. For this article, we have used the benchmark MNIST dataset, which consists of images of handwritten digits from 0 to 9. The dataset can be loaded through Keras, and it is also publicly available on Kaggle.
What will we learn from this article?
- What does a CNN network consist of?
- What is BatchNormalization? Where is it used?
- What are Dropouts? Where are they added?
- What does a CNN network consist of?
A convolutional neural network (CNN) is a deep learning architecture built from convolution layers, which extract feature maps from the image using a number of kernels (filters). Then come pooling layers, which reduce the spatial dimensions of these feature maps. There are different types of pooling layers, mainly max pooling and average pooling. The network also comprises further layers such as dropout and dense layers. The below image shows an example of a CNN.
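To make the difference between max pooling and average pooling concrete, here is a minimal NumPy sketch. It is not how Keras implements pooling; the helper name `pool2d` and the 4x4 feature map are made up for illustration, and the windows are assumed to be non-overlapping.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Apply non-overlapping size x size pooling to a 2D feature map (illustrative helper)."""
    h, w = feature_map.shape
    # Trim rows/columns that do not fill a complete window.
    h, w = h - h % size, w - w % size
    # Split the map into (h/size, size, w/size, size) blocks, one per window.
    windows = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    if mode == "average":
        return windows.mean(axis=(1, 3))
    raise ValueError(f"unknown pooling mode: {mode}")

# A made-up 4x4 feature map.
feature_map = np.array([
    [1, 3, 2, 0],
    [5, 7, 4, 6],
    [9, 8, 1, 1],
    [2, 0, 3, 5],
], dtype=float)

max_pooled = pool2d(feature_map, mode="max")      # keeps the largest value per window
avg_pooled = pool2d(feature_map, mode="average")  # keeps the mean value per window
print(max_pooled)
print(avg_pooled)
```

Both variants turn the 4x4 map into a 2x2 map. Max pooling keeps only the strongest response in each window, while average pooling smooths over all four values.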
- What is BatchNormalization? Where is it used?
Batch normalization is a layer that allows every layer of the network to learn more independently of the others. It normalizes the output of the previous layer: over each mini-batch, the activations are standardized to roughly zero mean and unit variance and then rescaled and shifted by two learned parameters. With batch normalization, learning becomes more efficient, and it can also act as a regularizer that helps avoid overfitting. The layer is added to the sequential model to standardize the inputs or the outputs of a layer, and it can be used at several points between the layers of the model. It is often placed just after defining the sequential model and after the convolution and pooling layers. The below code shows how to define the BatchNormalization layer for the classification of handwritten digits. We will first import the required libraries and the dataset. Use the below code for the same.
import tensorflow as tf
from tensorflow import keras

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
print(X_train.shape)  # (60000, 28, 28)
print(X_test.shape)   # (10000, 28, 28)
There are a total of 60,000 images in the training data and 10,000 images in the testing data. Now we will reshape the training and testing images and then define the CNN. Use the below code for the same.
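# Add a channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
X_train = X_train.reshape(len(X_train), 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import BatchNormalization

model = Sequential()
# Convolution layers extract feature maps from the image
model.add(Conv2D(32, (3, 3), input_shape=X_train[0].shape, activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
# Normalize the convolution outputs before pooling
model.add(BatchNormalization())
model.add(MaxPooling2D())
# Flatten the feature maps so the dense layers receive one vector per image
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=32, activation='relu'))
# One output unit per digit class (0-9)
model.add(Dense(units=10, activation='softmax'))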
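To see what the normalization step in this section does to the numbers, here is a minimal NumPy sketch of the training-time computation. It is only an illustration under stated assumptions, not the Keras implementation: the function name `batch_norm_train`, the random activations, and the fixed `gamma` and `beta` values are made up for this example. A real BatchNormalization layer learns the scale and shift during training and also keeps moving averages of the mean and variance for use at inference time.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Normalize each feature over the batch axis, then scale and shift (illustrative helper)."""
    mean = x.mean(axis=0)                    # per-feature mean over the mini-batch
    var = x.var(axis=0)                      # per-feature variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardize; eps avoids division by zero
    return gamma * x_hat + beta              # gamma (scale) and beta (shift) are learned in practice

# Made-up activations: a mini-batch of 64 samples with 4 features each,
# deliberately far from zero mean and unit variance.
rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=(64, 4))

gamma = np.ones(4)   # assumed initial scale
beta = np.zeros(4)   # assumed initial shift
normalized = batch_norm_train(activations, gamma, beta)

print(normalized.mean(axis=0))  # close to 0 for every feature
print(normalized.std(axis=0))   # close to 1 for every feature
```

Whatever scale the previous layer produces, the next layer receives inputs in a similar range, which is why each layer can learn more independently of the layers before it.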
- What are Dropouts? Where are they used?
Dropout is a regularization technique used to prevent overfitting in the model. Dropout layers randomly switch off some percentage of the neurons of the network during training. When a neuron is switched off, its incoming and outgoing connections are switched off as well. This keeps the network from relying too heavily on any single neuron, which helps the model generalize. It is usually advised not to use dropout after the convolution layers; it is mostly used after the dense layers of the network. It is a good idea to switch off no more than 50% of the neurons. If we switch off more than 50%, there is a chance that the model's learning will be poor and the predictions will not be good. Let us see how we can make use of dropout and how to define it while building a CNN model. We will use the same MNIST data.
We will first import the libraries and load the dataset, followed by a bit of pre-processing of the images. Use the below code for the same.
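As a rough illustration of what "switching off" neurons means, here is a minimal NumPy sketch of so-called inverted dropout at training time. It is a sketch under assumptions, not the Keras source: the function name `dropout` and the all-ones activations are made up for this example. The surviving activations are scaled up by 1 / (1 - rate) so that the expected sum stays the same, which is also what the Keras Dropout layer documents doing during training.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero out roughly `rate` of the activations and rescale the rest (illustrative helper)."""
    keep_prob = 1.0 - rate
    # True where a neuron stays active, False where it is switched off.
    mask = rng.random(activations.shape) < keep_prob
    # Scale the survivors so the expected value of the output matches the input.
    return activations * mask / keep_prob

rng = np.random.default_rng(42)
activations = np.ones((1000, 128))  # made-up outputs of a 128-unit dense layer
dropped = dropout(activations, rate=0.25, rng=rng)

fraction_zeroed = float((dropped == 0).mean())
print(fraction_zeroed)   # roughly 0.25
print(dropped.mean())    # roughly 1.0, thanks to the rescaling
```

At inference time no neurons are dropped and no rescaling is applied, so the layer simply passes its input through.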
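import tensorflow as tf
from tensorflow import keras

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Pre-processing: add a channel dimension so Conv2D receives (28, 28, 1) images
X_train = X_train.reshape(len(X_train), 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import BatchNormalization

model = Sequential()
model.add(Conv2D(64, (3, 3), input_shape=X_train[0].shape, activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D())
# Flatten the feature maps so the dense layers receive one vector per image
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
# Switch off 25% of the dense layer's outputs during training
model.add(Dropout(0.25))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=10, activation='softmax'))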
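It can help to trace how the image size changes on its way through the convolution and pooling layers of the model above. The sketch below does that arithmetic in plain Python. The helper names `conv_output` and `pool_output` are made up for this example, and it assumes the Keras defaults of unpadded ('valid') convolutions with stride 1 and 2x2 max pooling, plus a flattening step before the first dense layer.

```python
def conv_output(size, kernel, stride=1):
    """Spatial size after an unpadded ('valid') convolution (illustrative helper)."""
    return (size - kernel) // stride + 1

def pool_output(size, pool=2):
    """Spatial size after non-overlapping pooling (illustrative helper)."""
    return size // pool

size = 28  # MNIST images are 28x28 pixels
for layer in range(1, 4):
    size = conv_output(size, kernel=3)   # each 3x3 convolution trims 2 pixels
    print(f"after conv {layer}: {size}x{size}")

size = pool_output(size)                 # 2x2 max pooling halves the size
print(f"after pooling: {size}x{size}")

# The last convolution layer has 32 filters, so flattening yields
# size * size * 32 values per image for the first dense layer.
flattened = size * size * 32
print("flattened length:", flattened)
```

The three convolutions shrink the image from 28 to 26, 24 and then 22 pixels per side, and pooling halves that to 11, so the first dense layer sees 11 * 11 * 32 = 3872 inputs per image.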
Conclusion
I would like to conclude the article by hoping that you now have a fair idea of what the dropout and batch normalization layers are. We started by exploring what a CNN consists of, followed by what dropout and batch normalization are. We used the MNIST dataset and built two different models with it. A BatchNormalization layer can be used several times in a CNN, and its placement is up to the programmer. Multiple dropout layers can likewise be placed between different layers, but it is most reliable to add them after the dense layers.