
How To Solve The Never-Ending Pursuit Of Perfect Hyperparameters



The goal of hyperparameter exploration is to search across various hyperparameter configurations and find a configuration that results in the best performance. Typically, the hyperparameter exploration process is painstakingly manual, given that the search space is vast and evaluation of each configuration can be expensive.

Hyperparameters help answer questions like:

  • How deep should a decision tree be?
  • How many trees are required in a random forest?
  • How many layers should a neural network have?
  • What learning rate should gradient descent use?

Hyperparameters are adjustable parameters one chooses before training a model; they govern the training process itself. For example, to train a deep neural network, you decide the number of hidden layers in the network and the number of nodes in each layer prior to training the model. These values usually stay constant during the training process.
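For instance, here is a minimal sketch using scikit-learn's MLPClassifier (an illustrative example, not from the original article), where the layer sizes and learning rate are fixed before training begins:

from sklearn.neural_network import MLPClassifier

# Two hidden layers with 64 and 32 nodes - hyperparameters chosen before
# training and held constant while the model's weights are learned
model = MLPClassifier(hidden_layer_sizes=(64, 32), learning_rate_init=0.001, max_iter=200)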

To narrow down these values, there are a few methods for searching the parameter space and finding the values that align with the objective of the model being trained.

While defining the architecture of a machine learning model, an optimal configuration is usually not obvious: there is no one-stop answer to how hyperparameters should be tuned to reduce the loss, so the process is more or less trial-and-error experimentation.

Techniques At Disposal

For particular architectures, such as Long Short-Term Memory (LSTM) networks, the learning rate and the size of the network are the prime hyperparameters.

In reinforcement learning algorithms, measuring sensitivity to the choice of hyperparameters requires a larger number of data points, because performance is so high-variance that it is not adequately captured with only a few.

There are three main methods for this kind of high-dimensional, non-convex optimisation (a short sketch of each follows the list):

  • Grid search is a very common and often advocated approach: lay down a grid over the space of possible hyperparameters, evaluate the objective at each point on the grid, and carry the hyperparameters with the best objective value into production.
  • Random search evaluates n uniformly random points in the hyperparameter space and selects the one producing the best performance. This method has its own disadvantages, such as high variance, so a better, more intelligent alternative is Bayesian optimisation.
  • Bayesian optimisation builds a surrogate for the objective and quantifies the uncertainty in that surrogate using a Bayesian machine learning technique, Gaussian process regression, and then uses an acquisition function defined from this surrogate to decide where to sample next.
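As a concrete illustration, here is a minimal sketch of the first two methods with scikit-learn's GridSearchCV and RandomizedSearchCV, plus a toy Bayesian optimisation run with scikit-optimize's gp_minimize. The model, parameter ranges, and objective below are illustrative assumptions, not prescriptions from the article:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Grid search: evaluate every point on a predefined grid
param_grid = {'n_estimators': [50, 100, 200], 'max_depth': [3, 5, 10]}
grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
grid.fit(X, y)
print('Grid search best:', grid.best_params_)

# Random search: evaluate n_iter uniformly random points in the space
param_dist = {'n_estimators': range(50, 300), 'max_depth': range(2, 12)}
rand = RandomizedSearchCV(RandomForestClassifier(random_state=0), param_dist,
                          n_iter=20, cv=5, random_state=0)
rand.fit(X, y)
print('Random search best:', rand.best_params_)

# Bayesian optimisation: fit a Gaussian-process surrogate to a toy objective
# and let an acquisition function decide where to sample next
from skopt import gp_minimize
result = gp_minimize(lambda p: (p[0] - 0.3) ** 2,  # toy 1-D objective
                     dimensions=[(0.0, 1.0)], n_calls=15, random_state=0)
print('Bayesian best:', result.x)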

Apart from the above conventional methods, one can also make use of the graph-based systems for hyperparameter tuning.

To optimise and automate hyperparameters for graph embeddings, Google introduced Watch Your Step, an approach that formulates a model for the performance of embedding methods. In short, it makes the graph attention concentrate on direct, significant neighbours. Here the “auto” portion corresponds to learning the graph hyperparameters by backpropagation.

Tools At Disposal

In this age of information abundance, especially in AI, where a new tool is added and a new paper is published every other day, it becomes highly impractical for a practising machine learning engineer to keep track of which libraries work and which hyperparameters are best.

It is always great to have a toolbox that can automatically save and learn from experiment results, leading to long-term, persistent optimization that remembers all tests. A toolbox by the name Hyperparameter Hunter was released recently, which does exactly this. The creators call this tool a personal machine learning toolbox/assistant.

Hyperparameter Hunter allows users to run all of their benchmark/one-off experiments through it, and it doesn't start optimization from scratch like other libraries: it takes into account all of the experiments and optimization rounds that have already been run through it. The creators insist that Hyperparameter Hunter gives better results with increased usage.

Key Features Include

  • Stop worrying about keeping track of hyperparameters, scores, or re-running the same Experiments
  • Automatically reads the Experiment files to find the ones that fit, and it learns from them
  • Eliminates boilerplate code for cross-validation loops, predicting, and scoring
  • Have predictions ready to go when it’s time for ensembling, meta-learning, and finalizing the models.

Dependencies: Dill, NumPy, Pandas, SciPy, Scikit-Learn, Scikit-Optimize, SimpleJSON

Here’s a quick guide to get started with hyperparameter_hunter:

Installation

pip install hyperparameter_hunter

Setting Up Environment

from hyperparameter_hunter import Environment, CVExperiment
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold
from xgboost import XGBClassifier
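With the imports in place, a minimal sketch of the Environment and a baseline experiment might look like the following. The results path, metric, and cross-validation settings here are illustrative assumptions, not prescriptions:

# Load the breast-cancer data into a single DataFrame with a target column
data = load_breast_cancer()
train_df = pd.DataFrame(data.data, columns=data.feature_names)
train_df['target'] = data.target

# The Environment defines how all Experiments and optimization rounds are
# conducted, and where their results are saved for later reuse
env = Environment(
    train_dataset=train_df,
    results_path='HyperparameterHunterAssets',  # assumed results directory
    target_column='target',
    metrics=['roc_auc_score'],
    cv_type=StratifiedKFold,
    cv_params=dict(n_splits=5, shuffle=True, random_state=32),
)

# A baseline experiment - its saved results are what later optimization
# rounds learn from, so nothing starts from scratch
experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(objective='reg:linear', subsample=0.5),
)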

Performing Optimization

from hyperparameter_hunter import BayesianOptPro, Real, Integer, Categorical

Sample code using hyperparameter_hunter, defining the OptPro (Optimization Protocol) and choosing which hyperparameters to optimize. The search dimensions below are illustrative, not prescriptions:

optimizer = BayesianOptPro(verbose=1)

optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',  # Set as a constant guideline - not one to optimize
        max_depth=Integer(2, 10),  # Search an integer range (illustrative)
        learning_rate=Real(0.001, 0.5),  # Search a continuous range (illustrative)
        booster=Categorical(['gbtree', 'dart']),  # Search over categories (illustrative)
    ),
)

# Launch the optimization rounds; previously saved experiments are read
# first, so the search does not start from scratch
optimizer.go()
