A Hands-On Guide to IceVision Framework for Object Detection

IceVision is an object detection framework that lets us train and run detectors using a range of pre-trained models. It also offers data curation features and a dashboard for exploratory data analysis. Its main strength is an end-to-end deep learning workflow in which practitioners can train networks with robust, high-performance libraries such as PyTorch Lightning and FastAI. In this article, we discuss the IceVision framework for object detection with a hands-on implementation. The major points covered are listed below.

Table of Contents

  1. What is IceVision?
  2. Installing IceVision
  3. Data Preparation
    1. Importing Libraries
    2. Download and Prepare a Dataset  
    3. Parse the data
    4. Creating Datasets with Augmentations and Transform 
  4. Model Building
    1. Pre-Modelling Procedures 
    2. Training 
      1. Training using FastAI
      2. Training using PyTorch Lightning

Let’s begin the discussion by understanding what IceVision is.

What is IceVision?

IceVision is a framework that allows us to preprocess data for object detection, train a model on that data, and then use the trained model for inference. The framework provides a layered connection between deep learning engines, libraries, and models. It also ships with datasets that can be used to learn the basics of the framework, and its models are built on libraries such as TorchVision and Ultralytics YOLO.

We can select from many models built on the framework and switch between them easily. With IceVision, we can train a model on one dataset and later change the dataset or the model as our requirements change. According to its official GitHub page, some of the features of IceVision are listed below.

  • Using the auto_fix from the framework, we can automate the data curation and cleaning procedure.
  • The framework gives us access to a dashboard that is helpful for exploratory data analysis.
  • In the framework, we have various models which can be used for object detection, segmentation, and classification.
  • The framework is compatible with several libraries that cover different aspects of computer vision programming.
  • The framework has various transformation modules which help in training the model more accurately.

In the next part of the article, we are going to see a basic example of implementing the IceVision framework.

Installing IceVision

Let’s start with the installation. The following command downloads the installation script.

!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh

The script installs Torch, TorchVision, the IceVision framework, IceData, MMDetection, YOLOv5, and EfficientDet. Once it is downloaded, we can run it using the following line of code.

!bash icevision_install.sh cuda11 master

Output 

Since we are using Google Colab, some of the requirements, such as Torch and TorchVision, are already installed in the environment. We can also change the installation target to cuda10 or cpu. After the installation, we need to restart the kernel, either with the restart option in the Runtime menu of the notebook or with the Ctrl + M . keyboard shortcut.
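
After the restart, it can be useful to confirm that the packages are visible to Python before going further. The following is a minimal sketch using only the standard library. The import names in PACKAGES are our assumption about how these packages are exposed and may need adjusting for your environment.

```python
import importlib.util

# Assumed top-level import names for some of the packages listed above.
# These names are an assumption; adjust them if your environment differs.
PACKAGES = ["torch", "torchvision", "icevision", "icedata", "mmdet"]


def check_packages(names):
    """Return {name: True/False} depending on whether each package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

A package reported as MISSING usually means the install script did not finish or the kernel was not restarted.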

Data Preparation 

Before moving on to modelling, we need records from which a model can be built. In this section of the article, we discuss how to prepare data for modelling using the IceVision framework.

Importing Libraries 

We can import all the components of the IceVision framework using the following line of code.

from icevision.all import *

Download and Prepare a Dataset

Now we can move towards the modelling side. Before modelling, we need a dataset. Here we use the Fridge Objects dataset, which has 134 images belonging to four classes:

  • Can
  • Carton
  • Milk bottle 
  • Water bottle

The icedata package provides a helper that downloads the dataset for us.

Import the Data

import icedata
path = icedata.fridge.load_data()

Output:

Parse the Data

Using the parser module of the framework, we can load the annotation files and split the data into training and validation parts. The parser also helps in catching common errors in the annotations.

# Create the parser
parser = parsers.VOCBBoxParser(annotations_dir=path / "odFridgeObjects/annotations", images_dir=path / "odFridgeObjects/images")
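
To get a feel for what such a parser reads, here is a minimal sketch that extracts labels and bounding boxes from a single Pascal VOC style XML annotation using only the standard library. The sample annotation and the parse_voc helper are made up for illustration and are not part of IceVision; the real parser does considerably more.

```python
import xml.etree.ElementTree as ET

# A hand-written annotation in Pascal VOC layout (illustrative values only).
SAMPLE_XML = """
<annotation>
  <filename>1.jpg</filename>
  <size><width>499</width><height>666</height><depth>3</depth></size>
  <object>
    <name>carton</name>
    <bndbox><xmin>100</xmin><ymin>173</ymin><xmax>233</xmax><ymax>521</ymax></bndbox>
  </object>
  <object>
    <name>milk_bottle</name>
    <bndbox><xmin>247</xmin><ymin>192</ymin><xmax>355</xmax><ymax>493</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Extract filename, image size, and labelled boxes from one VOC annotation."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    record = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        record["objects"].append((obj.findtext("name"), coords))
    return record


record = parse_voc(SAMPLE_XML)
print(record["filename"], record["objects"])
```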

The following lines of code parse the annotations and split the resulting records into training and validation sets.

# Parse annotations to create records
train, valid = parser.parse()
parser.class_map

Output:
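
The splitting step can be pictured as shuffling the record ids and cutting the list in two. The sketch below assumes a random 80/20 split, which is our understanding of the default behaviour of parse(). random_split is a hypothetical helper written for this article, not the framework's own splitter.

```python
import random


def random_split(ids, probs=(0.8, 0.2), seed=42):
    """Shuffle ids reproducibly and cut them into consecutive chunks sized by probs."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    splits, start = [], 0
    for i, p in enumerate(probs):
        # The last chunk takes whatever remains so no id is dropped by rounding.
        end = len(ids) if i == len(probs) - 1 else start + round(p * len(ids))
        splits.append(ids[start:end])
        start = end
    return splits


train_ids, valid_ids = random_split(range(134))
print(len(train_ids), len(valid_ids))
```

With the 134 images mentioned above, this sketch gives 107 training ids and 27 validation ids.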

Creating Datasets with Augmentations and Transform 

Data augmentation and transformation help a model train well and perform accurately on the data. IceVision supports this through the Albumentations library, which is used to define and execute the transformations. Various transformations are available in the framework. In this article, we use the aug_tfms function, which applies transformations such as rotation, cropping, and horizontal flips to the images.

Let’s define the transformations for the training and validation data.

train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=384, presize=512), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(384), tfms.A.Normalize()])
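
Whenever an image is resized and padded, its bounding boxes have to move with it. The sketch below shows that geometry in plain Python, assuming an aspect-preserving resize of the longest side to 384 followed by centred padding. This is how we understand resize_and_pad to behave, but resize_and_pad_box is a hypothetical helper and the exact padding placement in the library is an assumption.

```python
def resize_and_pad_box(box, img_w, img_h, size=384):
    """Map an (xmin, ymin, xmax, ymax) box through an aspect-preserving resize
    followed by centred padding to a size x size square."""
    scale = size / max(img_w, img_h)  # the longest side becomes `size`
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    # Centred padding is assumed here; the library may place the padding differently.
    pad_x, pad_y = (size - new_w) // 2, (size - new_h) // 2
    xmin, ymin, xmax, ymax = box
    return (xmin * scale + pad_x, ymin * scale + pad_y,
            xmax * scale + pad_x, ymax * scale + pad_y)


# A box covering a whole 768x384 image ends up in the middle band of the square.
print(resize_and_pad_box((0, 0, 768, 384), img_w=768, img_h=384))
```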

Using the transformations with the data

train_data = Dataset(train, train_tfms)
valid_data = Dataset(valid, valid_tfms)

Let’s visualize the data after augmentation is performed.

vis = [train_data[1] for _ in range(8)]
print("training  data")
show_samples(vis, ncols=4)

Output:

training  data

vis = [valid_data[1] for _ in range(8)]
print("validation data")
show_samples(vis, ncols=4)

Output:

validation data
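
Note that both snippets read the same record eight times. The training samples can still differ from each other because random augmentations are applied every time a record is read, while the validation transform is deterministic. The toy sketch below illustrates the idea with a random horizontal flip of a bounding box. ToyDataset, hflip_box, and random_hflip are hypothetical names and not IceVision classes.

```python
import random


def hflip_box(box, img_w):
    """Mirror an (xmin, ymin, xmax, ymax) box around the vertical centre line."""
    xmin, ymin, xmax, ymax = box
    return (img_w - xmax, ymin, img_w - xmin, ymax)


class ToyDataset:
    """Holds (image_width, box) records and applies a transform on every access."""

    def __init__(self, records, transform=None):
        self.records = records
        self.transform = transform

    def __len__(self):
        return len(self.records)

    def __getitem__(self, i):
        img_w, box = self.records[i]
        return self.transform(box, img_w) if self.transform else box


rng = random.Random(0)


def random_hflip(box, img_w, p=0.5):
    # Flip with probability p, so repeated reads of one record can differ.
    return hflip_box(box, img_w) if rng.random() < p else box


records = [(100, (10, 20, 40, 60)), (200, (50, 10, 120, 90))]
train_toy = ToyDataset(records, transform=random_hflip)
valid_toy = ToyDataset(records)  # no random transform: always the same box

print({train_toy[1] for _ in range(8)})  # may contain both orientations
print({valid_toy[1] for _ in range(8)})  # always a single box
```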

Model Building

Before training a model, we need to instantiate it, prepare the data to suit it, and complete a few other preliminary steps. So let’s start with the pre-modelling procedures.

Pre-Modelling Procedures 

To build a model using the IceVision framework, we need to select a library, a model, and a backbone. All of these must be chosen from the options available in the framework.

Here we are using the RetinaNet model with the resnet50_fpn_1x backbone, which can be specified using the following lines of code.

model_type = models.mmdet.retinanet
backbone = model_type.backbones.resnet50_fpn_1x

Now we can instantiate the model using the following line of code. The number of classes is taken from the class map of the parser.

model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map))

Since many models and backbones are available, the data has to be prepared to suit the chosen model. So far we have seen how to load and transform the data. To batch the data in the form the model expects, the framework provides data loaders that are specific to each model type.

# Data Loaders
train_load = model_type.train_dl(train_data, batch_size=8, num_workers=4, shuffle=True)
valid_load = model_type.valid_dl(valid_data, batch_size=8, num_workers=4, shuffle=False)
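
One reason detection needs its own data loaders is that each image can carry a different number of boxes, so the targets cannot simply be stacked into one array. The sketch below shows a simple way of batching such data in plain Python. detection_collate and batches are hypothetical helpers, and the model-specific loaders in IceVision do more than this.

```python
def detection_collate(samples):
    """Group (image, boxes) pairs into a batch without forcing equal box counts."""
    images = [image for image, _ in samples]
    targets = [boxes for _, boxes in samples]  # one list of boxes per image
    return images, targets


def batches(dataset, batch_size):
    """Yield collated batches in order (no shuffling, like a validation loader)."""
    for start in range(0, len(dataset), batch_size):
        yield detection_collate(dataset[start:start + batch_size])


# Hypothetical toy data: strings stand in for image tensors.
toy = [
    ("img0", [(1, 2, 3, 4)]),
    ("img1", [(5, 6, 7, 8), (9, 10, 11, 12)]),
    ("img2", []),
]

for images, targets in batches(toy, batch_size=2):
    print(images, [len(t) for t in targets])
```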

Let’s visualize the batch for validation in the loader.

model_type.show_batch(first(valid_load), ncols=4)

Output:

Now we can track the progress of training in both FastAI and PyTorch Lightning using the metric classes provided by the framework. We only need to instantiate a list that holds the metrics.

metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
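
The COCO bounding box metric is built on intersection over union (IoU): a prediction counts as a match when its IoU with a ground-truth box reaches a threshold, and the reported mAP averages precision over IoU thresholds from 0.50 to 0.95. The following minimal sketch computes IoU for two boxes. The iou function is written for illustration and is not the metric's actual implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero when the boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap by half: intersection 50, union 150.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```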

Training 

The metrics defined above can now be used for training the model with either FastAI or PyTorch Lightning. Both support the same metrics.

Training using FastAI

training = model_type.fastai.learner(dls=[train_load, valid_load], model=model, metrics=metrics)

Output:

Tuning the Model

training.fine_tune(20, 0.00158, freeze_epochs=1)

 Output:

The output above shows some of the results from tuning the model, with the best result highlighted. The table reports the training and validation losses along with the metric we chose to track during training.
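
To make the arguments of fine_tune more concrete, the sketch below lists the two phases we understand FastAI to run: a frozen phase of freeze_epochs epochs at the base learning rate, then an unfrozen phase at half that rate with smaller rates for the earlier layers. This reflects our reading of the FastAI source, and the defaults (such as lr_mult=100) may differ between versions. fine_tune_plan is a hypothetical helper that only describes the schedule and trains nothing.

```python
def fine_tune_plan(epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100):
    """Describe the two training phases assumed for FastAI's fine_tune."""
    frozen = {"phase": "frozen", "epochs": freeze_epochs, "max_lr": base_lr}
    base_lr = base_lr / 2  # the learning rate is halved before unfreezing
    unfrozen = {
        "phase": "unfrozen",
        "epochs": epochs,
        "min_lr": base_lr / lr_mult,  # rate for the earliest layer group
        "max_lr": base_lr,            # rate for the last layer group
    }
    return [frozen, unfrozen]


# The call used in this article: fine_tune(20, 0.00158, freeze_epochs=1)
for phase in fine_tune_plan(20, 0.00158, freeze_epochs=1):
    print(phase)
```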

Training using PyTorch Lightning

We can also train the model using PyTorch Lightning. The procedure is almost the same, but the code is different. We can use the following lines of code to wrap the model for PyTorch Lightning:

from torch.optim import Adam

# Wrap the model in a Lightning module and choose the optimizer
class LightModel(model_type.lightning.ModelAdapter):
    def configure_optimizers(self):
        return Adam(self.parameters(), lr=1e-4)

light_model = LightModel(model, metrics=metrics)

We can then create a trainer and fit the model using the following lines of code:

trainer = pl.Trainer(max_epochs=5, gpus=1)
trainer.fit(light_model, train_load, valid_load)

Also, we can check the results on the validation dataset using the following line of code:

model_type.show_results(model, valid_data, detection_threshold=.5)

Output:

The output above is the final result of the object detection process we built with the IceVision framework. We can see that it is working well. Since IceVision is an open-source framework, we can use it in our own projects.
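
The detection_threshold argument used with show_results controls which predictions are drawn: boxes with a confidence score below the threshold are discarded. The sketch below shows the idea on made-up predictions. filter_detections is a hypothetical helper, and whether the library keeps scores exactly equal to the threshold is an assumption.

```python
def filter_detections(detections, detection_threshold=0.5):
    """Keep only detections whose confidence score reaches the threshold."""
    return [d for d in detections if d["score"] >= detection_threshold]


# Made-up predictions for illustration, not real model output.
preds = [
    {"label": "can", "score": 0.91, "bbox": (12, 30, 110, 240)},
    {"label": "carton", "score": 0.48, "bbox": (130, 25, 260, 300)},
    {"label": "water_bottle", "score": 0.67, "bbox": (270, 40, 350, 310)},
]

kept = filter_detections(preds, detection_threshold=0.5)
print([d["label"] for d in kept])
```

Raising the threshold shows fewer but more confident boxes, while lowering it shows more boxes at the cost of more false positives.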

Final Words

In this article, we have seen an overview of the IceVision framework for object detection. We have also seen how to use the models and data provided by the framework and how to put together a complete workflow for an object detection task. I encourage readers to explore the framework further and try other computer vision tasks with it.


Yugesh Verma
Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
