Introduction
Comet.ml is a machine learning experimentation platform that AI researchers and data scientists use to track, compare and explain their ML experiments. It lets ML practitioners keep track of their datasets, the history of performed experiments, code modifications and production models. It enables model reproduction, easy maintenance of the ML workflow and smooth collaboration throughout the iterative ML lifecycle. It also performs model optimization using a Bayesian hyperparameter optimization algorithm, thereby reducing the overhead of tuning your model manually.
Comet was introduced by a private organization named Comet ML Inc. (also known as CometML). It was founded by Gideon Mendels in 2017 and is headquartered in New York (USA). Some of the leading companies leveraging the Comet platform are Google, Ancestry, CERN, Uber and Boeing.
Following are the SDKs and APIs featured by Comet (a minimal usage sketch of the Python API follows this list):
- Python SDK to track Python-based experiments (a Python API is also available for reading data through the REST API)
- Java SDK to track Java-based experiments
- JavaScript SDK to create custom visualizations with Comet Panels
- R SDK to track R-based experiments
- REST API to access your experiment’s data from Comet.ml
- Comet Command-Line Utilities
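As a minimal sketch of the Python API mentioned above (which wraps the REST API), experiment data can be read back programmatically. The workspace, project and metric names below are placeholders, not values from this article:

from comet_ml.api import API

api = API(api_key="YOUR_API_KEY")                    # or set the COMET_API_KEY environment variable
experiments = api.get("my-workspace", "my_project")  # list the experiments in a project
for exp in experiments:
    print(exp.id, exp.get_metrics("train_accuracy"))  # read back a logged metric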
Comet’s web-based UI
Comet organizes your code runs based on three major concepts:

- Workspace
- Project
- Experiment
A workspace contains projects, each of which is a collection of experiments. Let's have an overview of each of them.
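In the Python SDK, this hierarchy is reflected when an experiment is created; a quick sketch (the workspace and project names are placeholders):

from comet_ml import Experiment

# The workspace and project are chosen at creation time; the code run itself
# becomes a new experiment inside that project.
experiment = Experiment(
    workspace="my-workspace",
    project_name="my-project",
)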
Workspaces
On creating an account, Comet provides you with a default workspace which comprises your private and public projects. You can create your own workspaces besides the default one. Each workspace has its own set of collaborators (contributors working jointly on a project). Projects in a workspace are automatically shared with all the collaborators in that workspace. By enabling 'public' sharing, you can even share your projects with the outside world.
Projects
A project is a set of ML experiments. Each project is categorized into one of the following two types:
- Private project (can be viewed and edited by all the collaborators having appropriate permissions)
- Public project (can be viewed by anyone but can be edited only by the owner)
A project is divided into 5 sections as follows:
- Experiments: the Project view (the core working area in Comet)
- Notes: markdown notes for your project
- Files: a list of all of the files in your project and the experiments that use those files
- Manage: to control your project’s visibility and create shareable links
- Archive: a list of archived experiments for your project
Experiments
An experiment is a unit of measurable research which represents a single run of your code. Each experiment is given a random ID by default, which you can change to make it human-readable. A fully customizable tabular view of experiments is known as the 'Experiment Table'. An experiment has a set of 'Experiment Tabs' such as the Charts tab, HTML tab, Hyperparameters tab, Metrics tab and so on, each with its own functionality. For instance, the Hyperparameters tab and Metrics tab store the ML model's hyperparameters and evaluation metrics logged during your experiment.
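As a quick sketch of how this maps to the Python SDK (project name and values below are illustrative), an experiment can be given a readable name and fed values that appear in those tabs:

from comet_ml import Experiment

experiment = Experiment(project_name="my_project")
experiment.set_name("baseline-mnist-dense")       # replaces the random ID in the experiment listing
experiment.log_parameter("learning_rate", 0.001)  # appears in the Hyperparameters tab
experiment.log_metric("val_accuracy", 0.97)       # appears in the Metrics tab
experiment.end()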
Within a project, you have a ‘Project Visualizations’ section which enables viewing and comparing performance across different experiments. Besides, ‘Query Builder’ section of a project allows you to choose which experiments should be shown in the Project Visualizations and the Experiment Table.
Visit this page to learn about workspaces, projects and experiments in detail, along with the ways to handle the functionalities provided by each of them.
Comet facilitates Automatic Logging for several widely used Python ML frameworks, some of which are listed below; a minimal auto-logging sketch follows the list. Click on the corresponding link to get a quick tutorial on how to incorporate Comet while implementing that framework.
keras, lightgbm, Uber’s ludwig, matplotlib, mlflow, pyspark, pytorch, pytorch-lightning, scikit-learn, shap, tensorflow, tensorflow model analysis, HuggingFace’s transformers.
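A minimal auto-logging sketch using Keras, under the assumption that with default settings Comet's Keras integration records hyperparameters and per-epoch metrics without explicit log_* calls (see the linked tutorial for the exact behavior); the project name is a placeholder:

from comet_ml import Experiment   # comet_ml must be imported before the ML framework
from tensorflow import keras

experiment = Experiment(project_name="autolog-demo")

(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(784,)),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=128)  # per-epoch loss/accuracy are captured automatically

experiment.end()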
Installation of the comet_ml Python library
comet_ml can be installed using the pip command as follows:
pip install comet_ml
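Before the library can log anything, it needs your Comet API key (available from your account settings; it can also be stored in a .comet.config file). A minimal sketch of two common ways to supply it, with the key string and project name as placeholders:

import os

# Option 1: environment variable, picked up by comet_ml automatically
os.environ["COMET_API_KEY"] = "YOUR_API_KEY"

# Option 2: pass the key explicitly when creating the experiment
from comet_ml import Experiment
experiment = Experiment(api_key="YOUR_API_KEY", project_name="my_project")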
Practical Implementation using Keras
Import the required libraries
from comet_ml import Experiment
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import EarlyStopping
Create an experiment with your API key
experiment = Experiment(
    project_name='my_project',               # name of your project
    auto_param_logging=False,                # disable automatic logging of hyperparameters
    auto_histogram_weight_logging=True,      # enable automatic histogram logging of biases and weights
    auto_histogram_gradient_logging=True,    # enable automatic histogram logging of gradients
    auto_histogram_activation_logging=True,  # enable automatic histogram logging of activations
)
Click here to learn about the properties and methods associated with an Experiment object.
Initialize model’s parameters
batch_size = 128
num_classes = 10    # number of output classes
epochs = 20
num_nodes = 64      # number of nodes in the hidden layer
optimizer = 'adam'
activation = 'relu'
Get to know about the Adam optimizer and ReLU activation function.
Load the MNIST digit classification dataset using the load_data() function and split it into a training set and a test set
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Flatten the 28×28 images in the train and test sets into 784-dimensional vectors
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
Convert the data type to float32
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
Normalize the pixel values to the range [0, 1]
x_train /= 255
x_test /= 255
Print the number of samples in training set and test set
print(x_train.shape[0], 'Training set samples')
print(x_test.shape[0], 'Test set samples')
Convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
Define a dictionary for parameters to be logged
params = {
    'batch_size': batch_size,
    'epochs': epochs,
    'layer1_type': 'Dense',
    'layer1_num_nodes': num_nodes,
    'layer1_activation': activation,
    'optimizer': optimizer,
}
Instantiate the sequential model
model = Sequential()
Add dense layers in the network
model.add(Dense(num_nodes, activation='relu', input_shape=(784,)))
model.add(Dense(num_classes, activation='softmax'))
Print the model summary; the printed output is preserved automatically in the `Output` tab
print(model.summary())
Compile the model
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
Log metrics with the prefix ‘train_’
with experiment.train():
    history = model.fit(x_train, y_train,
                        batch_size=batch_size,
                        epochs=epochs,
                        verbose=1,
                        validation_data=(x_test, y_test),
                        callbacks=[EarlyStopping(monitor='val_loss', min_delta=1e-4,
                                                 patience=3, verbose=1, mode='auto')])
Experiment.train() marks the beginning and end of the train phase. It provides a namespace for logging the training parameters.
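As a small illustration of that namespace (not part of the source example), a metric logged manually inside the context is stored under the 'train_' prefix:

with experiment.train():
    experiment.log_metric("accuracy", 0.93)  # recorded as train_accuracy in the Metrics tab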
Log metrics with the prefix ‘test_’
with experiment.test():
    loss, accuracy = model.evaluate(x_test, y_test)
    metrics = {
        'loss': loss,
        'accuracy': accuracy
    }
Experiment.test() marks the beginning and end of the test phase. It provides a namespace for logging the testing metrics.
Log the metrics as key:value pairs in a dictionary (call this inside the experiment.test() block so the metrics receive the 'test_' prefix)
experiment.log_metrics(metrics)
Log a dictionary-like object of parameters
experiment.log_parameters(params)
Create and log a hash of your data
experiment.log_dataset_hash(x_train)
Source: https://www.comet.ml/docs/python-sdk/keras
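One practical note on the example above (a supplementary detail, not part of the source code): when running as a plain script the experiment is closed and uploaded automatically at interpreter exit, but in a Jupyter notebook it should be ended explicitly:

experiment.end()  # flushes any remaining data to Comet.ml when working in a notebook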
Practical implementation using PyTorch
Import the required libraries
from comet_ml import Experiment
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable
Define the hyperparameters
hyper_params = { "sequence_length": 28, "input_size": 28, //number of input layer neurons "hidden_size": 128, //number of hidden layer neurons "num_layers": 2, //number of hidden layers "num_classes": 10, //number of output classes "batch_size": 100, "num_epochs": 2, "learning_rate": 0.01 }
Instantiate an Experiment object
experiment = Experiment(project_name="my_project")
Log the hyperparameters
experiment.log_parameters(hyper_params)
Load the MNIST dataset
train_dataset = dsets.MNIST(root='./data/',
                            train=True,
                            transform=transforms.ToTensor(),
                            download=True)
test_dataset = dsets.MNIST(root='./data/',
                           train=False,
                           transform=transforms.ToTensor())
Define the training and testing data pipelines using the DataLoader utility
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=hyper_params['batch_size'],
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=hyper_params['batch_size'],
                                          shuffle=False)
Define a many-to-one RNN (LSTM) model
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)
Forward propagation (defined as a method of the RNN class)
    def forward(self, x):
        # Set initial hidden and cell states
        h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size))
        c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size))
        # Forward propagate the LSTM
        out, _ = self.lstm(x, (h0, c0))
        # Decode the hidden state of the last time step
        out = self.fc(out[:, -1, :])
        return out
Instantiate the RNN model
rnn = RNN(hyper_params['input_size'],
          hyper_params['hidden_size'],
          hyper_params['num_layers'],
          hyper_params['num_classes'])
Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(rnn.parameters(), lr=hyper_params['learning_rate'])
Get to know about Cross Entropy Loss.
Train the Model
with experiment.train():
    step = 0
    for epoch in range(hyper_params['num_epochs']):
        correct = 0
        total = 0
        for i, (images, labels) in enumerate(train_loader):
            images = Variable(images.view(-1, hyper_params['sequence_length'],
                                          hyper_params['input_size']))
            labels = Variable(labels)

            # Forward pass, backward pass and optimization step
            optimizer.zero_grad()
            outputs = rnn(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # Compute train accuracy
            _, predicted = torch.max(outputs.data, 1)
            batch_total = labels.size(0)
            total += batch_total
            batch_correct = (predicted == labels.data).sum()
            correct += batch_correct

            # Log batch accuracy to Comet.ml; each step represents one batch
            step += 1
            experiment.log_metric("batch_accuracy", batch_correct / batch_total, step=step)

            if (i + 1) % 100 == 0:
                print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                      % (epoch + 1, hyper_params['num_epochs'], i + 1,
                         len(train_dataset) // hyper_params['batch_size'], loss.item()))
Inside the epoch loop, log the epoch accuracy to Comet.ml (here step is the epoch index)
        experiment.log_metric("epoch_accuracy", correct / total, step=epoch)
Test the model
with experiment.test():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = Variable(images.view(-1, hyper_params['sequence_length'],
                                      hyper_params['input_size']))
        outputs = rnn(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum()
Log the accuracy (inside the experiment.test() block so it is recorded with the 'test_' prefix)
    experiment.log_metric("accuracy", correct / total)
Print the logged accuracy
print('Test Accuracy: %d %%' % (100 * correct / total))
Source: https://www.comet.ml/docs/python-sdk/pytorch
Refer to the following web links to gain deeper insights into Comet.ml: