
How To Supercharge Your Machine Learning Experiments with Comet.ml


Introduction

Comet.ml is a machine learning experimentation platform which AI researchers and data scientists use to track, compare and explain their ML experiments. It allows ML practitioners to keep track of their datasets, the history of performed experiments, code modifications and production models. It enables model reproduction, easy maintenance of the ML workflow and smooth collaboration throughout the iterative ML lifecycle. It can also optimize models using a Bayesian hyperparameter optimization algorithm, thereby reducing the overhead of tuning your model manually.
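That hyperparameter tuning is exposed through Comet's Optimizer class. The following is a minimal sketch of a Bayesian search; the project name, the parameter range and the logged loss value are placeholders rather than details from this article:

 from comet_ml import Optimizer

 # Search configuration: Bayesian optimization over a single hyperparameter
 config = {
     'algorithm': 'bayes',
     'parameters': {
         'learning_rate': {'type': 'float', 'min': 1e-4, 'max': 1e-1},
     },
     'spec': {'metric': 'loss', 'objective': 'minimize'},
 }

 opt = Optimizer(config)
 for experiment in opt.get_experiments(project_name='my_project'):
     lr = experiment.get_parameter('learning_rate')
     # ... train a model using this learning rate ...
     experiment.log_metric('loss', 0.42)  # placeholder value for the tuned metric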

Comet was introduced by a private organization named Comet ML Inc. (also known as CometML). It was founded by Gideon Mendels in 2017 and is headquartered in New York (USA). Some of the leading companies leveraging the Comet platform are Google, Ancestry, CERN, Uber and Boeing.

Comet provides SDKs and APIs for interacting with the platform programmatically, most notably the Python SDK (comet_ml) used throughout this article, alongside a REST API.

Comet’s web-based UI

Comet organizes your code runs based on three major concepts:

  1. Workspace
  2. Project
  3. Experiment

A workspace contains projects, each of which is a collection of experiments. Let's have an overview of each of them.

Workspaces

On creating an account, Comet provides you with a default workspace which comprises your private and public projects. You can create workspaces of your own beyond the default one. Each workspace has its own set of collaborators (contributors working jointly on its projects). Projects in a workspace are automatically shared with all the collaborators in that workspace. By enabling 'public' sharing, you can even share your projects with the outside world.

Projects

A project is a set of ML experiments. Each project is categorized into one of the following two types:

  • Private project (can be viewed and edited by all the collaborators having appropriate permissions)
  • Public project (can be viewed by anyone but can be edited only by the owner)

A project is divided into 5 sections as follows:

  • Experiments: the Project view (the core working area in Comet)
  • Notes: has markdown notes for your project
  • Files: a list of all of the files in your project and the experiments that use those files
  • Manage: to control your project’s visibility and create shareable links
  • Archive: a list of archived experiments for your project

Experiments

An experiment is a unit of measurable research which represents a single run of your code. Each experiment is given a random ID by default, which you can change to make it human-readable. The fully customizable tabular view of experiments is known as the 'Experiment Table'. An experiment has a set of 'Experiment Tabs' such as the Charts tab, HTML tab, Hyperparameters tab, Metrics tab and so on, each with its own functionality. For instance, the Hyperparameters tab and Metrics tab store the ML model's hyperparameters and evaluation metrics logged during your experiment.
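For example, an experiment can be renamed from code via set_name(); the project and experiment names below are placeholders:

 from comet_ml import Experiment

 experiment = Experiment(project_name='my_project')
 experiment.set_name('baseline-mnist-mlp')  # replaces the random, auto-generated name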

Within a project, the 'Project Visualizations' section enables viewing and comparing performance across different experiments. Besides, the 'Query Builder' section of a project allows you to choose which experiments should be shown in the Project Visualizations and the Experiment Table.

Visit the Comet documentation to learn about workspaces, projects and experiments in detail, along with the ways to handle the functionality each of them provides.

Comet facilitates automatic logging for several extensively used Python ML frameworks, some of which are listed below. The Comet documentation includes a quick tutorial on incorporating Comet with each of these frameworks, and a minimal sketch follows the list.

keras, lightgbm, Uber’s ludwig, matplotlib, mlflow, pyspark, pytorch, pytorch-lightning, scikit-learn, shap, tensorflow, tensorflow model analysis, HuggingFace’s transformers.
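With automatic logging (enabled by default), creating an Experiment before any framework code runs is often all that is needed. A minimal sketch for Keras, assuming the placeholder project name 'my_project':

 # comet_ml must be imported before the ML framework so it can instrument it
 from comet_ml import Experiment
 import keras

 experiment = Experiment(project_name='my_project')  # reads the API key from config/env
 # From here on, Keras calls such as model.fit() are logged automatically:
 # metrics, hyperparameters and the model graph appear in the Comet UI.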

Installation of the comet_ml Python library

comet_ml can be installed using pip as follows:

pip install comet_ml

Practical Implementation using Keras

Import the required libraries

 from comet_ml import Experiment
 import keras
 from keras.datasets import mnist
 from keras.models import Sequential
 from keras.layers import Dense, Dropout
 from keras.callbacks import EarlyStopping 

Create an experiment with your API key 

 experiment = Experiment(
     # the API key can also be supplied via the COMET_API_KEY environment
     # variable or the .comet.config file instead of being hard-coded
     api_key='YOUR_API_KEY',
     project_name='my_project',  # name of your project
     # disable automatic logging of hyperparameters
     auto_param_logging=False,
     # enable automatic histogram logging of weights and biases
     auto_histogram_weight_logging=True,
     # enable automatic histogram logging of gradients
     auto_histogram_gradient_logging=True,
     # enable automatic histogram logging of activations
     auto_histogram_activation_logging=True,
 )

The Comet documentation describes the full set of properties and methods associated with an Experiment object.
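Beyond the logging calls used below, a few other commonly used Experiment methods, sketched here with placeholder values:

 experiment.add_tag('baseline')                 # tag the run for filtering in the UI
 experiment.log_other('dataset_version', 'v1')  # attach arbitrary metadata
 experiment.log_text('free-form notes about this run')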

Initialize the model's parameters

 batch_size = 128
 num_classes = 10  # number of output classes
 epochs = 20
 num_nodes = 64  # number of nodes in the hidden layer
 optimizer = 'adam'
 activation = 'relu'

The model below uses the Adam optimizer, an adaptive learning-rate variant of stochastic gradient descent, and the ReLU activation function, f(x) = max(0, x).

Load the MNIST digit classification dataset using the load_data() function and split it into a training set and a test set

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Flatten the 28×28 images of the training and test sets into 784-dimensional vectors

 x_train = x_train.reshape(60000, 784)
 x_test = x_test.reshape(10000, 784) 

Convert the data type to 32-bit floating point

 x_train = x_train.astype('float32')
 x_test = x_test.astype('float32') 

Normalize the pixel values to the range [0, 1]

 x_train /= 255
 x_test /= 255 

Print the number of samples in the training set and the test set

 print(x_train.shape[0], 'Training set samples')
 print(x_test.shape[0], 'Test set samples') 

Convert the class vectors to binary class matrices (one-hot encoding)

 y_train = keras.utils.to_categorical(y_train, num_classes)
 y_test = keras.utils.to_categorical(y_test, num_classes) 

Define a dictionary for parameters to be logged

 params = {'batch_size': batch_size,
           'epochs': epochs,
           'layer1_type': 'Dense',
           'layer1_num_nodes': num_nodes,
           'layer1_activation': activation,
           'optimizer': optimizer
 }

Instantiate the sequential model

model = Sequential()

Add dense layers to the network

 model.add(Dense(num_nodes, activation=activation, input_shape=(784,)))
 model.add(Dense(num_classes, activation='softmax'))

Print the model summary; Comet captures standard output automatically, so it is preserved in the experiment's Output tab

print(model.summary())

Compile the model

 model.compile(loss='categorical_crossentropy',
               optimizer=optimizer,
               metrics=['accuracy']) 

Log metrics with the prefix ‘train_’

 with experiment.train():
     history = model.fit(x_train, y_train,
                         batch_size=batch_size,
                         epochs=epochs,
                         verbose=1,
                         validation_data=(x_test, y_test),
                         callbacks=[EarlyStopping(monitor='val_loss',
                                                  min_delta=1e-4,
                                                  patience=3,
                                                  verbose=1,
                                                  mode='auto')])

Experiment.train() marks the beginning and end of the train phase. It provides a namespace for the logged metrics: anything logged inside it is prefixed with 'train_', as the sketch below shows.
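A minimal illustration of the namespacing, with a placeholder value:

 with experiment.train():
     experiment.log_metric('accuracy', 0.98)  # recorded in Comet as 'train_accuracy'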

Log metrics with the prefix ‘test_’

 with experiment.test():
     loss, accuracy = model.evaluate(x_test, y_test)
     metrics = {
         'loss':loss,
         'accuracy':accuracy
     } 

Experiment.test() marks the beginning and end of the test phase. It provides a namespace for the logged testing metrics, which are prefixed with 'test_'.

Log the metrics as key:value pairs in a dictionary  

experiment.log_metrics(metrics)

Log a dictionary-like object of parameters

experiment.log_parameters(params)

Create and log a hash of your data

experiment.log_dataset_hash(x_train) 

Source: https://www.comet.ml/docs/python-sdk/keras
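When the code runs as a script, Comet closes the experiment automatically at process exit; in a Jupyter notebook it is advisable to end it explicitly so that all data is flushed:

 experiment.end()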

Practical Implementation using PyTorch

Import the required libraries

 from comet_ml import Experiment
 import torch
 import torch.nn as nn
 import torchvision.datasets as dsets
 import torchvision.transforms as transforms
 from torch.autograd import Variable 

Define the hyperparameters 

 hyper_params = {
     "sequence_length": 28,
     "input_size": 28,  # number of input-layer neurons
     "hidden_size": 128,  # number of hidden-layer neurons
     "num_layers": 2,  # number of stacked LSTM layers
     "num_classes": 10,  # number of output classes
     "batch_size": 100,
     "num_epochs": 2,
     "learning_rate": 0.01
 }

Instantiate an Experiment object

experiment = Experiment(project_name="my_project")

Log the hyperparameters

experiment.log_parameters(hyper_params)

Load the MNIST dataset

 train_dataset = dsets.MNIST(root='./data/',
                             train=True,
                             transform=transforms.ToTensor(),
                             download=True)
 test_dataset = dsets.MNIST(root='./data/',
                            train=False,
                            transform=transforms.ToTensor()) 

Define the training and testing data pipelines using the DataLoader utility

 train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                            batch_size=hyper_params['batch_size'],
                                            shuffle=True)
 test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                           batch_size=hyper_params['batch_size'],
                                           shuffle=False)

Define the many-to-one RNN model (an LSTM whose output at the last time step feeds a fully connected classification layer)

 class RNN(nn.Module):
     def __init__(self, input_size, hidden_size, num_layers, num_classes):
         super(RNN, self).__init__()
         self.hidden_size = hidden_size
         self.num_layers = num_layers
         self.lstm = nn.LSTM(input_size, hidden_size, num_layers,
                             batch_first=True)
         self.fc = nn.Linear(hidden_size, num_classes)

Forward propagation

     def forward(self, x):
         # Set initial hidden and cell states
         h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size))
         c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size))
         # Forward propagate the LSTM
         out, _ = self.lstm(x, (h0, c0))
         # Decode the hidden state of the last time step
         out = self.fc(out[:, -1, :])
         return out

Instantiate the RNN

 rnn = RNN(hyper_params['input_size'], hyper_params['hidden_size'],
           hyper_params['num_layers'], hyper_params['num_classes'])

Define loss function and optimizer

 criterion = nn.CrossEntropyLoss()
 optimizer = torch.optim.Adam(rnn.parameters(),
                              lr=hyper_params['learning_rate'])

Note that nn.CrossEntropyLoss combines a log-softmax with the negative log-likelihood loss, so the network's raw outputs (logits) are passed to it directly.

Train the model

 with experiment.train():
     step = 0
     for epoch in range(hyper_params['num_epochs']):
         correct = 0
         total = 0
         for i, (images, labels) in enumerate(train_loader):
             images = Variable(images.view(-1, hyper_params['sequence_length'],
                                           hyper_params['input_size']))
             labels = Variable(labels)
             # Forward pass, backward pass and optimizer step
             optimizer.zero_grad()
             outputs = rnn(images)
             loss = criterion(outputs, labels)
             loss.backward()
             optimizer.step()
             # Compute train accuracy
             _, predicted = torch.max(outputs.data, 1)
             batch_total = labels.size(0)
             total += batch_total
             batch_correct = (predicted == labels.data).sum()
             correct += batch_correct
             # Log batch accuracy to Comet.ml
             step += 1  # each step represents one batch
             experiment.log_metric("batch_accuracy",
                                   batch_correct / batch_total, step=step)
             if (i + 1) % 100 == 0:
                 print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                       % (epoch + 1, hyper_params['num_epochs'], i + 1,
                          len(train_dataset) // hyper_params['batch_size'],
                          loss.item()))

Still inside the epoch loop, log the accuracy over the whole epoch, with step now being the epoch index (logged under a distinct name so it does not collide with the per-batch metric above)

         experiment.log_metric("epoch_accuracy", correct / total, step=epoch)

Test the model

 with experiment.test():
     correct = 0
     total = 0
     for images, labels in test_loader:
         images = Variable(images.view(-1, hyper_params['sequence_length'],
                                       hyper_params['input_size']))
         outputs = rnn(images)
         _, predicted = torch.max(outputs.data, 1)
         total += labels.size(0)
         correct += (predicted == labels).sum()

Log the accuracy; since this is still inside the experiment.test() context, it is recorded as 'test_accuracy'

     experiment.log_metric("accuracy", correct / total)

Print the logged accuracy

     print('Test Accuracy: %d %%' % (100 * correct / total))

Source: https://www.comet.ml/docs/python-sdk/pytorch

Refer to the official documentation at https://www.comet.ml/docs/ to gain greater insight into Comet.ml.


Nikita Shiledarbaxi
A zealous learner aspiring to advance in the domain of AI/ML. Eager to grasp emerging techniques to get insights from data and hence explore realistic Data Science applications as well.
