In computer vision, we often build convolutional neural networks (CNNs) for image problems such as image classification and object detection. For image classification, a CNN is built from a combination of convolution layers, pooling layers, dropout, and finally fully connected layers. While building such networks, we choose kernel sizes for extracting feature maps and the number of neurons in each layer, but there is no fixed rule for choosing the number of layers, neurons, or kernel sizes. Keras Tuner is a library that addresses this problem by searching for the hyperparameters that give the best accuracy within the ranges we define. It is similar to RandomizedSearchCV, which searches for good parameters for classical machine learning models.
In this article, we will explore the Keras Tuner library and see how it helps to find good values for kernel sizes, the optimizer's learning rate, and other hyperparameters. For this experiment, we will use the MNIST dataset of handwritten digits. It has 60,000 images in the training set and 10,000 images in the test set. We will import the dataset directly from Keras. We will work on Google Colab, as it provides a free GPU for faster processing.
What will you learn from this article?
- What is Keras Tuner?
- How to build a model for using Keras Tuner?
- How to check for the optimal hyperparameters?
- What is Keras Tuner?
Keras Tuner is a library that helps us find optimal hyperparameters for our model. Hyperparameters are the settings chosen before training, such as the number of filters, the kernel size, or the learning rate, that directly control the performance of the model. We tune them to get the best performance. In machine learning, techniques like GridSearchCV and RandomizedSearchCV are used for hyperparameter tuning: we define the range of each parameter and the algorithm automatically tries different combinations. Keras Tuner is used similarly. We define the range of neurons (or filters) to try, and likewise the candidate kernel sizes. Let’s start by building our CNN model first.
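Before building the model, the random search idea described above can be sketched in plain Python. This is only an illustration, not how Keras Tuner is implemented: the `search_space` values and the `toy_score` function are hypothetical stand-ins for a real model's hyperparameter ranges and its validation accuracy.

```python
import random

# Hypothetical search space: candidate values for each hyperparameter.
search_space = {
    "filters": list(range(32, 129, 16)),   # 32, 48, ..., 128
    "kernel_size": [3, 6],
    "learning_rate": [1e-2, 1e-3],
}

def toy_score(config):
    """Stand-in for 'train the model and return validation accuracy'.

    Purely illustrative: it prefers more filters and the smaller learning rate.
    """
    bonus = 0.5 if config["learning_rate"] == 1e-3 else 0.0
    return config["filters"] / 128 + bonus

def random_search(space, score_fn, max_trials, seed=0):
    """Try max_trials random combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        # Sample one value per hyperparameter, independently and uniformly.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = random_search(search_space, toy_score, max_trials=5)
print(best_config, best_score)
```

With a real model, the scoring step is the expensive part, since every trial means training a network. That is why the number of trials is capped.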
- How to Build a Model for using Keras Tuner?
We will first install the Keras Tuner package and import the libraries required for the model, using the code below. We also need to change the Colab runtime to GPU.
!pip install keras-tuner
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# keras-tuner 1.1 and later expose the package as keras_tuner (older releases used kerastuner)
from keras_tuner import RandomSearch
import keras_tuner as kt
Now we will import the MNIST dataset from Keras and check the training and testing shapes. Use the code below to do the same.
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
print(X_train.shape)  # (60000, 28, 28)
print(X_test.shape)   # (10000, 28, 28)
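Now we will reshape the data into the format accepted by Keras and then build the model in a way that Keras Tuner can use. The model-building function receives a hyperparameter object, and calls such as Int and Choice on it declare the ranges to search. Use the code below to do the same.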
# Add a channel axis: Keras expects (samples, height, width, channels)
X_train = X_train.reshape(len(X_train), 28, 28, 1)
X_test = X_test.reshape(len(X_test), 28, 28, 1)

def create_model(hyperparam):
    model = keras.models.Sequential()
    # Each hyperparameter needs a unique name, so filters and kernel sizes are named separately
    model.add(layers.Conv2D(
        filters=hyperparam.Int('conv_1_filters', min_value=32, max_value=128, step=16),
        kernel_size=hyperparam.Choice('conv_1_kernel', values=[3, 6]),
        activation='relu', input_shape=(28, 28, 1)))
    model.add(layers.Conv2D(
        filters=hyperparam.Int('conv_2_filters', min_value=64, max_value=128, step=16),
        kernel_size=hyperparam.Choice('conv_2_kernel', values=[3, 6]),
        activation='relu'))
    model.add(layers.Conv2D(
        filters=hyperparam.Int('conv_3_filters', min_value=64, max_value=128, step=16),
        kernel_size=hyperparam.Choice('conv_3_kernel', values=[3, 6]),
        activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(10))
    model.add(layers.Activation('softmax'))
    # Labels are integers 0-9, so use sparse categorical cross-entropy
    model.compile(
        optimizer=keras.optimizers.Adam(hyperparam.Choice('learning_rate', values=[1e-2, 1e-3])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
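As a quick sanity check on the kernel size choices, the sketch below computes the feature map size left after the three convolution layers for every combination of kernel sizes 3 and 6. It assumes the Keras `Conv2D` defaults of stride 1 and 'valid' padding, which the model above does not override. The helper names are illustrative.

```python
from itertools import product

def conv_output_size(input_size, kernel_size):
    # 'valid' padding with stride 1: the kernel must fit entirely inside the input
    return input_size - kernel_size + 1

def final_feature_map_size(input_size, kernel_sizes):
    # Apply the convolution layers one after another
    size = input_size
    for k in kernel_sizes:
        size = conv_output_size(size, k)
    return size

# Every combination of kernel sizes the tuner may pick for the three layers
sizes = {ks: final_feature_map_size(28, ks) for ks in product([3, 6], repeat=3)}
for ks, size in sorted(sizes.items()):
    print(ks, "->", f"{size}x{size}")
```

Under these assumptions even the largest kernels leave a 13x13 feature map, so every combination in the search space produces a valid model.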
We have now defined the choices for kernel sizes and filters. The model will be trained on different combinations of them. Next, we define the objective, which is the metric to be monitored, and max_trials, which sets the number of different combinations the tuner will try.
# Pass the model-building function itself, not a built model
tuner_search = RandomSearch(create_model, objective='val_accuracy', max_trials=5, directory='output', project_name="mnist")
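To put max_trials=5 in perspective, the following sketch counts how many distinct combinations the ranges defined in the model allow. The dictionary keys are illustrative labels rather than names taken from the tuner, and the count assumes that the integer ranges include both endpoints.

```python
from math import prod

# Candidate values per hyperparameter, mirroring the ranges used in the model.
# Keys are illustrative labels only.
space = {
    "filters_1": list(range(32, 129, 16)),   # 32 to 128 in steps of 16
    "kernel_1": [3, 6],
    "filters_2": list(range(64, 129, 16)),   # 64 to 128 in steps of 16
    "kernel_2": [3, 6],
    "filters_3": list(range(64, 129, 16)),
    "kernel_3": [3, 6],
    "learning_rate": [1e-2, 1e-3],
}

# The total is the product of the number of options for each hyperparameter
total_combinations = prod(len(values) for values in space.values())
max_trials = 5

print("combinations:", total_combinations)
print("fraction explored:", max_trials / total_combinations)
```

Random search with 5 trials therefore samples only a small part of the space. Raising max_trials explores more of it at the cost of longer search time.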
After this, we start searching for the optimal hyperparameters. Each trial is trained for 5 epochs, but this number can be changed.
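The search step itself could look like the sketch below. The helper name `run_search` and the `validation_split=0.1` value are assumptions made for illustration. Some validation data is needed so that the `val_accuracy` objective can be computed, and the arguments given to `search` are forwarded to `model.fit` for each trial.

```python
def run_search(tuner, X_train, y_train, epochs=5, validation_split=0.1):
    """Run the hyperparameter search and return the best hyperparameter set.

    `tuner` is expected to be a Keras Tuner object, such as the RandomSearch
    instance created earlier.
    """
    # validation_split=0.1 is an assumption: it holds out 10% of the training
    # data so that the 'val_accuracy' objective can be evaluated.
    tuner.search(X_train, y_train, epochs=epochs, validation_split=validation_split)
    # get_best_hyperparameters returns a list ordered from best to worst
    best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
    return best_hp

# Usage in the notebook (assumes tuner_search, X_train and y_train exist):
# best_hp = run_search(tuner_search, X_train, y_train)
# print(best_hp.values)
```

Wrapping the call in a function is only a convenience. Calling `tuner_search.search(...)` directly in a notebook cell works the same way.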
The search runs 5 trials and prints a trial summary along with the best hyperparameters. Now we can retrieve the best model from these 5 trials and continue training it, using the code below.
# get_best_models returns a list, so take the first (best) model
model = tuner_search.get_best_models(num_models=1)[0]
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
After using the hyperparameters found by Keras Tuner, we achieved 98% accuracy on the validation data. Keras Tuner takes time to search for the best hyperparameters, but it gives high accuracy.
In this article, we discussed the Keras Tuner library for searching for optimal hyperparameters for deep learning models. Keras Tuner is a great way to try different combinations of kernel sizes, filters, and neurons in each layer. It can be used to find the parameters for a deep learning model that give the highest accuracy achievable within the combinations we define. For a related topic, see the article titled “Guide to Hyperparameter Tuning using GridSearchCV and RandomizedSearchCV”.