How To Use Keras Tuner for Hyper-parameter Tuning of Deep Learning Models

In this article, we explore the Keras Tuner library and see how it helps find good values for hyperparameters such as kernel size, number of filters, and the optimizer's learning rate.

In computer vision, we often build convolutional neural networks (CNNs) for image problems such as image classification and object detection. For image classification, a CNN is built from a combination of convolution layers, pooling layers, dropout, and finally fully connected layers. While building such a network, we have to choose the kernel sizes used to extract feature maps and the number of filters or neurons in each layer. There is no fixed rule for choosing the number of layers, neurons, or the kernel size. Keras Tuner is a library that addresses this problem by searching for the hyperparameter values that give the best accuracy among the candidates we define. It is similar to RandomizedSearchCV in scikit-learn, which searches for good parameters for classical machine learning models.

For this experiment, we will use the MNIST dataset of handwritten digits, which has 60,000 images in the training set and 10,000 images in the test set. We will import the dataset directly from Keras and work on Google Colab, which provides a free GPU for faster processing.

What will you learn from this article?

  1. What is Keras Tuner?
  2. How to build a model for use with Keras Tuner?
  3. How to search for the optimal hyperparameters?

  1. What is Keras Tuner?

Keras Tuner is a library that helps us find good hyperparameters for our model. Hyperparameters are settings chosen before training, such as the number of filters, the kernel size, or the learning rate, and they directly affect the performance of the model. We tune them to get the best performance. In classical machine learning, techniques like GridSearchCV and RandomizedSearchCV are used for hyperparameter tuning: we only define the range of each parameter, and the algorithm evaluates different combinations automatically. Keras Tuner is used in a similar way: we define the range for the number of neurons or filters and, likewise, for the kernel size. Let's start by building our CNN model.

  2. How to Build a Model for Use with Keras Tuner?

We will first install the Keras Tuner package and import the libraries required for the model, using the code below. We also need to change the Colab runtime type to GPU.

!pip install keras-tuner
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# The package is imported as keras_tuner in current releases (formerly kerastuner)
import keras_tuner as kt
from keras_tuner import RandomSearch

Now we will load the MNIST dataset from Keras and check the shapes of the training and test sets, using the code below.

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
print(X_train.shape)
print(X_test.shape)

Output:

(60000, 28, 28)
(10000, 28, 28)

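Now we will reshape the data into the (height, width, channels) format that Keras convolution layers expect, and then write the model as a function that takes the tuner's hyperparameter object as its argument, so that Keras Tuner can build a new model for every trial. Use the code below to do this.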

# Add a channel axis and scale pixel values to [0, 1]
X_train = X_train.reshape(len(X_train), 28, 28, 1).astype('float32') / 255.0
X_test = X_test.reshape(len(X_test), 28, 28, 1).astype('float32') / 255.0

def create_model(hyperparam):
    model = keras.models.Sequential()
    # Every hyperparameter needs a unique name, so filters and kernel size are named separately
    model.add(layers.Conv2D(
        filters=hyperparam.Int('conv_1_filters', min_value=32, max_value=128, step=16),
        kernel_size=hyperparam.Choice('conv_1_kernel', values=[3, 6]),
        activation='relu', input_shape=(28, 28, 1)))
    model.add(layers.Conv2D(
        filters=hyperparam.Int('conv_2_filters', min_value=64, max_value=128, step=16),
        kernel_size=hyperparam.Choice('conv_2_kernel', values=[3, 6]),
        activation='relu'))
    model.add(layers.Conv2D(
        filters=hyperparam.Int('conv_3_filters', min_value=64, max_value=128, step=16),
        kernel_size=hyperparam.Choice('conv_3_kernel', values=[3, 6]),
        activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(10, activation='softmax'))
    # Labels are integers 0-9, so the sparse categorical loss is used
    model.compile(optimizer=keras.optimizers.Adam(hyperparam.Choice('learning_rate', values=[1e-2, 1e-3])),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
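The kernel-size choice also changes the shape of the feature maps. Assuming the Keras defaults for a 2D convolution layer (stride 1 and `padding='valid'`), each layer shrinks a side of length n to n - k + 1 for a kernel of size k. The helper below, whose name is purely illustrative, traces the 28x28 input through three layers for the smallest and the largest kernel choice.

```python
def conv_output_side(side, kernel_sizes):
    """Side length after stacked stride-1, unpadded ('valid') convolutions."""
    for k in kernel_sizes:
        side = side - k + 1
    return side

# All three layers pick the 3x3 kernel: 28 -> 26 -> 24 -> 22
smallest_kernels = conv_output_side(28, [3, 3, 3])
# All three layers pick the 6x6 kernel: 28 -> 23 -> 18 -> 13
largest_kernels = conv_output_side(28, [6, 6, 6])

print(smallest_kernels, largest_kernels)  # 22 13
```

Under these assumptions the final feature maps are between 13x13 and 22x22, so the size of the flattened vector fed to the dense layer varies from trial to trial.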

We have now defined the candidate values for the kernel sizes and the number of filters. The tuner will pick different combinations of these values and train a model on each one. Next, we define the objective, which is the metric the tuner monitors to compare trials. The max_trials argument sets how many different combinations the tuner will try.

# Pass the model-building function itself, not a model instance
tuner_search = RandomSearch(create_model,
                            objective='val_accuracy',
                            max_trials=5, directory='output', project_name="mnist")
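To put `max_trials=5` in perspective, the following sketch counts how many distinct combinations the ranges above describe, assuming the filter count and kernel size of each layer are tuned independently. It uses only the standard library and is not part of Keras Tuner; the dictionary keys are illustrative labels.

```python
from math import prod

# Number of candidate values for each hyperparameter in the search space
search_space = {
    "conv_1_filters": len(range(32, 129, 16)),  # 32, 48, ..., 128
    "conv_1_kernel": len([3, 6]),
    "conv_2_filters": len(range(64, 129, 16)),  # 64, 80, ..., 128
    "conv_2_kernel": len([3, 6]),
    "conv_3_filters": len(range(64, 129, 16)),
    "conv_3_kernel": len([3, 6]),
    "learning_rate": len([1e-2, 1e-3]),
}

total_combinations = prod(search_space.values())
max_trials = 5
coverage = max_trials / total_combinations

print(total_combinations)   # 2800
print(f"{coverage:.2%}")    # 0.18%
```

Five trials therefore sample well under 1% of the space, so raising `max_trials` is worth considering when time allows.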

After this, we start the search for the best hyperparameters. Each trial trains for 5 epochs here, but this value can be changed as needed.

tuner_search.search(X_train,y_train,epochs=5,validation_data=(X_test,y_test))

The search output shows a summary for each of the 5 trials along with the best hyperparameters found so far. We can now retrieve the best model from these 5 trials and continue training it, using the code below.

model=tuner_search.get_best_models(num_models=1)[0]
model.fit(X_train,y_train, epochs=10, validation_data=(X_test,y_test))

Using the best hyperparameters found by Keras Tuner, we achieved 98% accuracy on the validation data. The search takes time to run, but it can find configurations that give high accuracy.

Conclusion

In this article, we discussed the Keras Tuner library for finding good hyperparameters for deep learning models. Keras Tuner is a convenient way to try different combinations of kernel size, filters, and neurons in each layer, and it returns the best-performing configuration among the combinations it tried. For a related topic, see the article titled “Guide to Hyperparameter Tuning using GridSearchCV and RandomizedSearchCV”.

Rohit Dwivedi
I am currently enrolled in a Post Graduate Program in Artificial Intelligence and Machine Learning. I am a data science enthusiast who likes to draw insights from data and am always amazed by the intelligence of AI. It's really fascinating to teach a machine to see and understand images, and the interest doubles when the machine can tell you what it just saw. This is why I am highly interested in Computer Vision and Natural Language Processing. I love exploring different use cases that can be built with the power of AI. I am the person who first develops something and then explains it to the whole community through my writing.