Hyperparameter Tuning using Python

Hyperparameter tuning is the process of choosing the hyperparameter values that get the most out of a Machine Learning model. In this article, we will go over several ways to perform hyperparameter tuning using Python.

What is a Hyperparameter?

Hyperparameters are values that dictate how a Machine Learning model is built.

These values cannot be estimated by the model from the given data; instead, they are set by the Machine Learning engineer or Data Scientist when designing the model. Some examples of hyperparameters are the depth of a Neural Network, the number of trees in a Random Forest, the learning rate, and the activation function.
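To make the distinction concrete, here is a minimal sketch using scikit-learn's LogisticRegression on a small synthetic dataset. The dataset, the variable names, and the value C = 0.5 are assumptions chosen for illustration and are not part of the exercise in this article. C is a hyperparameter that we set by hand, while coef_ holds parameters that the model estimates during fit().

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data, used only for illustration (not the fraud dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

# Hyperparameter: chosen by us before training starts
demo_model = LogisticRegression(C=0.5, max_iter=1000)
hyperparams = demo_model.get_params()

# Parameters: estimated from the data during fit()
demo_model.fit(X_demo, y_demo)
learned_coefficients = demo_model.coef_

print("C (set by us):", hyperparams["C"])
print("Shape of learned coefficients:", learned_coefficients.shape)
```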

Because hyperparameters strongly affect a model's training speed and accuracy, it is worth optimizing them, and there are several methods for doing so.

Hyperparameter Tuning using Python

Hyperparameter tuning methods in Python fall into two broad categories:

Manual hyperparameter tuning: In this method, different combinations of hyperparameters are chosen by hand and the loss/accuracy of the model is measured for each one. This method is not recommended because it is tedious and becomes impractical when there are many hyperparameters to tune.
Automated hyperparameter tuning: In this method, the hyperparameters are chosen by an algorithm that automates and optimizes the process.
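As a rough illustration of the manual approach, the sketch below tries three hand-picked values of max_depth and compares their cross-validated accuracy. The synthetic dataset, the candidate values, and the variable names are assumptions made for this example only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data, used only for illustration
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)

# Try each hand-picked value and record the mean cross-validated accuracy
manual_results = {}
for max_depth in [2, 4, 8]:
    demo_clf = RandomForestClassifier(max_depth=max_depth, n_estimators=20, random_state=0)
    manual_results[max_depth] = cross_val_score(demo_clf, X_demo, y_demo, cv=4).mean()

# Pick the value with the highest mean accuracy
best_depth = max(manual_results, key=manual_results.get)
print(manual_results)
print("Best max_depth:", best_depth)
```

Every additional hyperparameter multiplies the number of combinations to try by hand, which is why this approach does not scale.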

Before learning how to perform hyperparameter tuning using Python, let's set up the dataset. For this exercise, we'll use the credit card fraud detection dataset, and we'll use the sklearn Python library to split it into training and testing datasets.

# Importing libraries
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Reading in the dataset
df = pd.read_csv('../input/creditcardfraud/creditcard.csv', na_values = '#NAME?')

# Defining independent (X) and dependent variables (y)
X = df[['V17', 'V9', 'V6', 'V12']]
Y = df['Class']

# Splitting the dataset into training and testing dataset
X_Train, X_Test, Y_Train, Y_Test = train_test_split(X, Y, test_size = 0.30, random_state = 101)

Now that we have our training and testing dataset ready, let us learn different methods to perform hyperparameter tuning.

1. Hyperparameter Tuning using Random Search

As the name implies, Random Search tries randomly chosen sets of hyperparameters and keeps the set that works best for a given Machine Learning problem.

In this tuning method, we set up a grid of values for the different hyperparameters and then randomly pick combinations on which to build, train, and evaluate a Machine Learning model. The combination of hyperparameters that gives the best evaluation results is selected as the final set of hyperparameters.
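The random picking step can be sketched on its own with scikit-learn's ParameterSampler, which draws combinations from a grid without training anything. The grid below and the name demo_grid are assumptions chosen for illustration.

```python
from sklearn.model_selection import ParameterSampler

# A small illustrative grid (3 * 3 * 2 = 18 possible combinations)
demo_grid = {'max_depth': [2, 4, 8],
             'min_samples_leaf': [4, 6, 8],
             'criterion': ['entropy', 'gini']}

# Draw 5 of the 18 combinations at random
sampled = list(ParameterSampler(demo_grid, n_iter=5, random_state=101))
for params in sampled:
    print(params)
```

Each printed dictionary is one candidate that a random search would train and evaluate; only 5 of the 18 combinations are ever tried.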

We can now start implementing Random Search by first defining a grid of hyperparameters that will be randomly sampled when calling RandomizedSearchCV() from scikit-learn's model_selection module.

# Importing necessary libraries
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import cross_val_score

# Creating a Random Search grid
# Note: 'auto' is no longer accepted for max_features in recent scikit-learn
# releases, so 'sqrt' and 'log2' are used here
random_search = {'criterion': ['entropy', 'gini'],
               'max_depth': [2],
               'max_features': ['sqrt', 'log2'],
               'min_samples_leaf': [4, 6, 8],
               'min_samples_split': [5, 7, 10],
               'n_estimators': [20]}

# Initializing a Machine Learning model
clf = RandomForestClassifier()

# Finding the best hyperparameters using Random Search
model = RandomizedSearchCV(estimator = clf, param_distributions = random_search, n_iter = 10, cv = 4, verbose= 1, random_state= 101, n_jobs = -1)

# Training the model
model.fit(X_Train,Y_Train)

We can now evaluate how the model found by Random Search performs on the test set. In this case, using Random Search led to an increase in accuracy compared to our base model.

from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

# Predicting using the best estimators
predictionforest = model.best_estimator_.predict(X_Test)

# Printing the confusion matrix
print(confusion_matrix(Y_Test, predictionforest))

# Printing the classification report
print(classification_report(Y_Test, predictionforest))
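Besides predicting with the best estimator, it can help to look at which combination was selected. The sketch below runs a small random search on synthetic data and reads the best_params_ and best_score_ attributes of the fitted search object. The dataset, the reduced grid, and the variable names are assumptions made so that the example runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data, used only for illustration
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)

demo_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(n_estimators=20, random_state=0),
    param_distributions={'max_depth': [2, 4, 8], 'min_samples_leaf': [4, 6, 8]},
    n_iter=4, cv=4, random_state=101)
demo_search.fit(X_demo, y_demo)

# The winning combination and its mean cross-validated score
print(demo_search.best_params_)
print(round(demo_search.best_score_, 3))
```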

2. Hyperparameter Tuning using Grid Search

Grid Search is very similar to Random Search in that we form a grid of values for the different hyperparameters. The difference is that Grid Search goes through the combinations in a fixed order, trains and evaluates the model on every one of them, records the results, and returns the combination with the best accuracy or the minimum loss.
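Because Grid Search evaluates every combination, its cost grows quickly with the size of the grid. As a quick sketch, scikit-learn's ParameterGrid can enumerate the combinations of a grid with the same shape as the one used in this section. The name demo_grid and the choice of 4 folds (matching cv = 4) are assumptions for this example.

```python
from sklearn.model_selection import ParameterGrid

demo_grid = {'criterion': ['entropy', 'gini'],
             'max_depth': [2],
             'max_features': ['sqrt', 'log2'],
             'min_samples_leaf': [4, 6, 8],
             'min_samples_split': [5, 7, 10],
             'n_estimators': [20]}

# Every combination of values: 2 * 1 * 2 * 3 * 3 * 1
combinations = list(ParameterGrid(demo_grid))

# With 4-fold cross-validation, each combination is trained 4 times
n_fits = len(combinations) * 4
print(len(combinations), n_fits)  # 36 144
```

A random search with n_iter = 10 on the same grid would train only 10 of these 36 combinations.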

To choose the parameters for Grid Search, we can look at which parameters worked best with Random Search and form a grid based on them to see if we can find a better combination.

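# Importing necessary libraries
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Creating a Grid Search grid
# Note: 'auto' is no longer accepted for max_features in recent scikit-learn
# releases, so 'sqrt' and 'log2' are used here
grid_search = {'criterion': ['entropy', 'gini'],
               'max_depth': [2],
               'max_features': ['sqrt', 'log2'],
               'min_samples_leaf': [4, 6, 8],
               'min_samples_split': [5, 7, 10],
               'n_estimators': [20]}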

# Initializing a Machine Learning model
clf = RandomForestClassifier()

# Finding the best hyperparameters using Grid Search
model = GridSearchCV(estimator = clf, param_grid = grid_search, 
                               cv = 4, verbose= 5, n_jobs = -1)

# Training the model
model.fit(X_Train,Y_Train)

# Predicting using the best estimators
predictionforest = model.best_estimator_.predict(X_Test)

# Printing the confusion matrix
print(confusion_matrix(Y_Test,predictionforest))

# Printing the classification report
print(classification_report(Y_Test,predictionforest))

3. Hyperparameter Tuning using Bayesian Optimization

Tuning and finding the right hyperparameters for your model is an optimization problem: we want to minimize the loss function of our model by changing its hyperparameters. Bayesian optimization uses the results of earlier evaluations to decide which values to try next, with the aim of finding the minimum in as few steps as possible. It can be performed in Python using the Hyperopt library. In Hyperopt, Bayesian Optimization is implemented by passing three main parameters to the function fmin():

Objective Function: defines the loss function to minimize.
Domain Space: defines the range of input values to test (in Bayesian Optimization this space creates a probability distribution for each of the used hyperparameters).
Optimization Algorithm: defines the search algorithm used to select the best input values to try in each new iteration.

A Trials() object is first created so that we can later inspect what happened while the fmin() function was running (e.g. how the loss function and the chosen hyperparameters changed).

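from hyperopt import hp, fmin, tpe, STATUS_OK, Trials
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Defining the domain space
space = {'criterion': hp.choice('criterion', ['entropy', 'gini']),
        # q = 1 makes quniform return whole numbers (as floats) between 10 and 12
        'max_depth': hp.quniform('max_depth', 10, 12, 1),
        # 'auto' is no longer accepted by recent scikit-learn releases, so it is left out
        'max_features': hp.choice('max_features', ['sqrt', 'log2', None]),
        'min_samples_leaf': hp.uniform('min_samples_leaf', 0, 0.5),
        'min_samples_split': hp.uniform('min_samples_split', 0, 1),
        'n_estimators': hp.choice('n_estimators', [10, 50])
    }

# Defining the objective function
def objective(space):
    model = RandomForestClassifier(criterion = space['criterion'],
                                   # quniform returns a float, but max_depth must be an integer
                                   max_depth = int(space['max_depth']),
                                   max_features = space['max_features'],
                                   min_samples_leaf = space['min_samples_leaf'],
                                   min_samples_split = space['min_samples_split'],
                                   n_estimators = space['n_estimators'],
                                   )
    accuracy = cross_val_score(model, X_Train, Y_Train, cv = 4).mean()

    # We aim to maximize accuracy, therefore we return it as a negative value
    return {'loss': -accuracy, 'status': STATUS_OK }

# Running the optimization
trials = Trials()
best = fmin(fn= objective,
            space= space,
            algo= tpe.suggest,
            max_evals = 20,
            trials= trials)
print(best)

For hyperparameters defined with hp.choice(), fmin() returns the index of the chosen option rather than the option itself. Here, 'criterion': 1 means criterion = "gini" (the option at index 1), and the same applies to max_features and n_estimators. The dictionaries below map each index back to its value.

crit = {0: 'entropy', 1: 'gini'}
feat = {0: 'sqrt', 1: 'log2', 2: None}
est = {0: 10, 1: 50}

# Looking up the actual values of the best hyperparameters
print(crit[best['criterion']], feat[best['max_features']], est[best['n_estimators']])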

4. Hyperparameter Tuning using Optuna

Optuna uses the historical record of trial details to determine the promising areas of the search space, and hence aims to find good hyperparameters in a minimum amount of time. It has a pruning feature which automatically stops unpromising trials in the early stages of training. Other key features are:

Eager search spaces: Automated search for optimal hyperparameters using Python conditionals, loops, and syntax.
State-of-the-art algorithms: Efficiently search large spaces and prune unpromising trials for faster results.
Easy parallelization: Parallelize hyperparameter searches over multiple threads or processes without modifying code.

import sklearn
import sklearn.datasets
import sklearn.ensemble
import sklearn.model_selection
import sklearn.svm
import optuna

# 1. Define an objective function to be maximized.
def objective(trial):
    iris = sklearn.datasets.load_iris()
    x, y = iris.data, iris.target
    # 2. Suggest values for the hyperparameters using a trial object.
    classifier_name = trial.suggest_categorical('classifier', ['SVC', 'RandomForest'])
    if classifier_name == 'SVC':
        # suggest_float(..., log=True) replaces the deprecated suggest_loguniform()
        svc_c = trial.suggest_float('svc_c', 1e-10, 1e10, log=True)
        classifier_obj = sklearn.svm.SVC(C=svc_c, gamma='auto')
    else:
        rf_max_depth = trial.suggest_int('rf_max_depth', 2, 32, log=True)
        classifier_obj = sklearn.ensemble.RandomForestClassifier(max_depth=rf_max_depth,
                          n_estimators=10)
    # Score the chosen classifier with 3-fold cross-validation
    score = sklearn.model_selection.cross_val_score(classifier_obj, x, y, n_jobs=-1, cv=3)
    accuracy = score.mean()
    return accuracy


# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

I hope the features and advantages of each method are now clear. The best method depends on the scenario, that is, on the model and the dataset.

Written by

The Click Reader
