Hands-On Guide To Algorithm Configuration Using SMAC

Many machine learning algorithms, such as tree-based and ensemble methods, are computationally demanding and expose many hyperparameters that can be tuned to improve performance. However, manually exploring these parameters and setting them for an optimized solution is a tedious task that often leads to unsatisfactory results. Automated approaches to this algorithm configuration problem have therefore recently led to substantial improvements in the state of the art for solving various problems.

Sequential Model-Based Algorithm Configuration, referred to in short as SMAC, is a versatile tool for optimizing the parameters of an algorithm, of some other automated process, or of any user-defined function we can evaluate, such as a simulation. SMAC has proven to be very effective for the hyperparameter optimization of machine learning algorithms. It scales better to high-dimensional and discrete input spaces than many other techniques, and it also helps capture important information about the model domain, such as which input variables matter most.

Today in this article, we will see a use case of SMAC: optimizing the hyperparameters of a support vector machine on a given dataset; the optimized configuration can then be used to train the same model on that data. All of this can be done by leveraging the Python programming language.

Implementing SMAC

To run SMAC, you need to install a few dependencies. Before anything else, make sure you have SciPy version 1.7.0 or greater installed. To install SciPy, use the command below;

! pip install scipy==1.7.0

After installing SciPy, make sure you restart the runtime so that the change takes effect.
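
As a quick sanity check (our addition, not part of the original walkthrough), you can confirm that the upgraded SciPy is active after the restart:

# optional: confirm the upgraded SciPy is the one in use
import scipy
print(scipy.__version__)  # should print 1.7.0 or greater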

Install & import all the dependencies:
! apt-get install swig -y
! pip install pyrfr
# install smac
! pip install smac
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from smac.configspace import ConfigurationSpace
from smac.configspace import CategoricalHyperparameter, UniformFloatHyperparameter, \
                             UniformIntegerHyperparameter, InCondition
from smac.scenario.scenario import Scenario
# main facade used to optimize
from smac.facade.smac_hpo_facade import SMAC4HPO

As explained earlier, SMAC can optimize a user-defined function. In this section, we configure the SVM for solution quality, i.e., we optimize its accuracy; the goal is to achieve high accuracy on the iris dataset;

iris = load_iris()
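
For orientation, here is an optional peek (our addition) at the dataset we are about to configure the SVM for; the iris data has 150 samples, 4 features, and 3 classes:

# optional: inspect the shape of the iris data
print(iris.data.shape, iris.target.shape)  # (150, 4) (150,)
print(list(iris.target_names))  # ['setosa', 'versicolor', 'virginica']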

Optimizing SVM:

First, we need to inform SMAC about the hyperparameters to optimize and their possible values; the ConfigSpace package handles this. Below, we add the hyperparameters that SVM supports, along with their default values.

Let us first build the configuration space, which holds all the parameters and their possible states;

# build configuration space
cs = ConfigurationSpace()

We define a few SVM kernel types and add them as the kernel hyperparameter of our configuration space;

# kernels
kernel = CategoricalHyperparameter('kernel',['linear','rbf','poly','sigmoid'],default_value='sigmoid')
cs.add_hyperparameter(kernel)

There are other common parameters, such as the regularization parameter 'C' and shrinking; the code below adds these two parameters along with their ranges;

# common hyperparameters
C = UniformFloatHyperparameter('C',0.001, 1000.0, default_value=1.0)
shrinking = CategoricalHyperparameter('shrinking',['true','false'],default_value='true')
cs.add_hyperparameters([C,shrinking])
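
If you want to verify what has been registered so far, printing the configuration space lists each hyperparameter with its range and default value (an optional check we add here for convenience):

# optional: list the hyperparameters registered so far
print(cs)  # shows C, kernel and shrinking with ranges and defaults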

Below are the kernel-specific parameters, such as degree, coefficient, and gamma; we add them with specified conditions so that each is only active for the kernels that use it;

# Others
degree = UniformIntegerHyperparameter('degree', 1, 5, default_value=1) # used by poly
coef0 = UniformFloatHyperparameter('coef0', 0.0, 10.0, default_value=5.0) # poly, sigmoid
cs.add_hyperparameters([degree, coef0])
use_degree = InCondition(child=degree, parent=kernel, values=['poly'])
use_coef0 = InCondition(child=coef0, parent=kernel, values=['poly','sigmoid'])
cs.add_conditions([use_degree, use_coef0])
## only for rbf, poly, sigmoid
gamma = CategoricalHyperparameter('gamma',['auto','value'],default_value='auto') 
gamma_value = UniformFloatHyperparameter('gamma_value', 0.0001, 8, default_value=8)
cs.add_hyperparameters([gamma, gamma_value])
# activate gamma_value only if gamma is set to value
cs.add_condition(InCondition(child=gamma_value, parent=gamma, values=['value']))
# restrict the use of gamma in general to choice of kernel 
cs.add_condition(InCondition(child=gamma, parent=kernel, values=['rbf','poly','sigmoid']))
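
To see the conditional structure in action, you can draw a random configuration from the finished space; conditional children such as degree and gamma_value appear only when their parent condition holds (again, an optional illustration on our part):

# optional: sample one random configuration; 'degree' appears only
# when kernel == 'poly', 'gamma_value' only when gamma == 'value'
print(cs.sample_configuration())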

Interface between SVM and the Configurator:

For SMAC to configure the SVM, we need an interface that accepts a configuration as input and outputs a cost value.

def cfg_svm(cfg):
  """ Creates an SVM from the given configuration and
      evaluates it using cross-validation """
  # deactivated parameters are stored as None, which is not
  # accepted by the SVM; below we remove them
  cfg = {k: cfg[k] for k in cfg if cfg[k]}
  # translate the boolean values
  cfg['shrinking'] = True if cfg['shrinking'] == 'true' else False
  # set gamma to a fixed value or to 'auto'
  if 'gamma' in cfg:
    cfg['gamma'] = cfg['gamma_value'] if cfg['gamma'] == 'value' else 'auto'
    cfg.pop('gamma_value', None)
  model = SVC(**cfg, random_state=42)
  scores = cross_val_score(model, iris.data, iris.target, cv=5)
  return 1 - np.mean(scores)  # as SMAC minimizes the cost
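
Before handing this function to SMAC, a quick smoke test (our addition) makes sure the interface runs end to end and returns a cost between 0 and 1:

# optional smoke test: evaluate one random configuration
print(cfg_svm(cs.sample_configuration()))  # a cost in [0, 1]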

Resource limits:

To finalize the setup, we need to tell SMAC about the target function and the configuration space using the Scenario object; here we also cap the run at a fixed number of function evaluations. Conceptually, this step resembles compiling a neural network: we lock in the settings before the optimization run.

# scenario object
scenario = Scenario({'run_obj': 'quality',   # optimize solution quality (here: accuracy)
                     'runcount-limit': 120,  # at most 120 function evaluations
                     'cs': cs,               # our configuration space
                     'deterministic': 'true'})

Run SMAC:

We can now use SMAC to optimize the hyperparameters. SMAC ships with different facades that preset its components to work well in different scenarios; SMAC4HPO is the one intended for hyperparameter optimization.

smac = SMAC4HPO(scenario=scenario, rng=42,
                tae_runner=cfg_svm)
incumbent = smac.optimize()

Now let’s check what the default hyperparameters were at the beginning of the optimization and compare them with the incumbent configuration at the end;

print(cs.get_default_configuration())

Output:

print(incumbent)

Output: 

The above optimization should result in a reduced cost value; let’s validate the result;

def_value = cfg_svm(cs.get_default_configuration())
inc_value = cfg_svm(incumbent)
print('Default Value: {:.4f}'.format(def_value))
print('Optimized Value: {:.4f}'.format(inc_value))

Output:
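
Since the parameters found by SMAC can be used directly for model training, here is a minimal sketch of that final step; it reuses the same translation logic as cfg_svm, and the names inc_cfg and final_model are ours:

# minimal sketch: train a final SVM on the full dataset using the
# incumbent configuration found by SMAC
inc_cfg = {k: incumbent[k] for k in incumbent if incumbent[k]}
inc_cfg['shrinking'] = inc_cfg['shrinking'] == 'true'
if 'gamma' in inc_cfg:
  inc_cfg['gamma'] = inc_cfg['gamma_value'] if inc_cfg['gamma'] == 'value' else 'auto'
  inc_cfg.pop('gamma_value', None)
final_model = SVC(**inc_cfg, random_state=42)
final_model.fit(iris.data, iris.target)
print('Training accuracy: {:.4f}'.format(final_model.score(iris.data, iris.target)))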

Conclusion: 

In this article, we have seen SMAC, a parameter optimization technique that can be applied to various algorithms and simulation tasks to maximise performance. Its user-friendly API offers a lot of flexibility, such as condition-based parameters, and the results are quite impressive. Furthermore, we can directly use the parameters found by SMAC for our model training. More examples for SMAC, including training the same SVM with the newly optimized parameters, are included in the notebook linked in the references.

References: 
