How RetinaNet Fixes The Shortcomings Of SSD With Focal Loss

In conventional two-stage object detectors such as R-CNN, a set of candidate object locations is generated first, and a CNN then classifies each location as foreground or background. One-stage detectors such as SSD skip the proposal step and are instead applied over a regular, dense sampling of object locations, scales and aspect ratios.

One-stage detectors evaluate a large set of candidate locations that densely cover the image, while only a few of those locations actually contain objects. This creates a class imbalance: the negatives vastly outnumber the positives and dominate training, so objects present at some locations go undetected.

RetinaNet was introduced by Facebook AI Research to tackle the dense detection problem.

Under The Hood Of RetinaNet

RetinaNet was introduced to address a weakness of single-shot object detectors like YOLO and SSD: their handling of extreme foreground-background class imbalance.

RetinaNet is designed to accommodate Focal Loss, a loss function that keeps the large number of easy negatives from overwhelming the detector during training.

RetinaNet Network Architecture

The classification subnet predicts the probability of an object being present in a particular location.

The subnet is a small fully convolutional network (FCN) attached to each feature pyramid network (FPN) level.

The subnet takes an input feature map from a given pyramid level and applies four 3 x 3 convolutional layers, each followed by a ReLU activation, and then a final 3 x 3 convolutional layer.
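The layer layout described above can be sketched in plain Python, without any deep learning library. The channel width (256), the number of anchors per location (9) and the number of classes (80) are assumptions taken from the commonly cited configuration of the RetinaNet paper, not from this article, and the function names are hypothetical.

```python
# Sketch of the classification subnet layout: four 3x3 conv + ReLU layers,
# then one final 3x3 conv that emits one score per anchor and class.
# Assumed values (not stated in the article): 256 channels, 9 anchors, 80 classes.

def conv_params(in_ch, out_ch, kernel=3):
    """Number of weights plus biases in one kernel x kernel convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch

def classification_subnet(channels=256, num_anchors=9, num_classes=80):
    """Return (name, in_channels, out_channels) for each layer of the head."""
    layers = []
    for i in range(4):
        layers.append((f"conv{i + 1} + ReLU", channels, channels))
    # The final layer has no ReLU; its outputs are the per-anchor class scores.
    layers.append(("conv_out", channels, num_anchors * num_classes))
    return layers

layers = classification_subnet()
total_params = sum(conv_params(i, o) for _, i, o in layers)

for name, in_ch, out_ch in layers:
    print(f"{name:13s} {in_ch:4d} -> {out_ch:4d} channels")
print("parameters in the head:", total_params)
```

With these assumed sizes, the last layer produces 9 x 80 = 720 output channels at every spatial position of the feature map.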

Along with the classification subnet, a box regression subnet is attached to each pyramid level to regress the offset from each anchor box to a nearby ground-truth object, if one exists.

The target for a negative (background) anchor is a vector containing only zeros, whereas the target for a positive (foreground) anchor is a one-hot vector. If the prediction is a vector of all zeros but the target is a one-hot vector (in other words, a false negative), the focal loss evaluates to a large value for that anchor box.
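To make the false negative case concrete, here is a minimal sketch of the per-anchor focal loss, assuming the form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t) with gamma = 2 and alpha = 0.25 from the focal loss paper. The three-class setup and the 0.01 scores are made up for illustration; scores of exactly zero are avoided because the logarithm would be undefined.

```python
import math

def focal_loss(p, target, gamma=2.0, alpha=0.25):
    """Focal loss for one sigmoid score p against a 0/1 target."""
    p_t = p if target == 1 else 1.0 - p            # probability of the true label
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def anchor_loss(prediction, target):
    """Sum the focal loss over all class scores of one anchor."""
    return sum(focal_loss(p, t) for p, t in zip(prediction, target))

# Hypothetical 3-class example.
background_target = [0, 0, 0]      # negative anchor: all zeros
foreground_target = [0, 1, 0]      # positive anchor: one-hot
near_zero_prediction = [0.01, 0.01, 0.01]

false_negative_loss = anchor_loss(near_zero_prediction, foreground_target)
easy_negative_loss = anchor_loss(near_zero_prediction, background_target)

print("false negative:", false_negative_loss)
print("easy negative: ", easy_negative_loss)
```

The same near-zero prediction costs almost nothing when the anchor really is background, but is penalised heavily when the anchor should have been a foreground class.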

Enhancement With Focal Loss

Focal loss is applied to the output of the classification subnet. It is computed over all the anchors in each sampled image.

The total focal loss of an image is the sum of the focal loss over all the anchors. This sum is normalised by the number of anchors assigned to a ground-truth box rather than by the total number of anchors, because the vast majority of anchors are easy negatives that contribute almost nothing to the loss.
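The effect of the normalisation choice can be illustrated with a small sketch. The anchor counts and per-anchor loss values below are invented for illustration, and `image_loss` is a hypothetical helper, not a library function.

```python
def image_loss(anchor_losses, assigned):
    """Sum per-anchor losses and normalise by the number of assigned anchors.

    anchor_losses: focal loss value of every anchor in the image
    assigned:      1 if the anchor is matched to a ground-truth box, else 0
    """
    # Guard against images with no matched anchors (an assumption of this sketch).
    num_assigned = max(1, sum(assigned))
    return sum(anchor_losses) / num_assigned

# Hypothetical image: 100,000 anchors, only 20 matched to an object.
num_anchors = 100_000
num_assigned = 20
anchor_losses = [1.0] * num_assigned + [1e-6] * (num_anchors - num_assigned)
assigned = [1] * num_assigned + [0] * (num_anchors - num_assigned)

by_assigned = image_loss(anchor_losses, assigned)
by_total = sum(anchor_losses) / num_anchors   # the alternative, for comparison

print("normalised by assigned anchors:", by_assigned)
print("normalised by all anchors:     ", by_total)
```

Dividing by all anchors would shrink the loss by the ratio of total to assigned anchors, which varies from image to image; dividing by the assigned anchors keeps the scale tied to the objects actually present.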

In the original paper's comparison, RetinaNet trained with focal loss outperformed the existing one-stage and two-stage detectors of the time, setting aside the low-accuracy regime.

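RetinaNet is initialised with a prior probability (~0.01) for the anchor boxes. This prior is used to set the bias of the last convolutional layer of the classification subnet, so that at the start of training every anchor is predicted as foreground with a probability of about 0.01. The low value reflects how rare positives are relative to negatives, and it keeps the large number of background anchors from producing a large, destabilising loss in the first iterations. Hence this value is very significant.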
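In the focal loss paper, the prior is applied by setting the bias of the final classification layer to b = -log((1 - pi) / pi), so that the sigmoid of the initial output equals pi. A minimal sketch of that arithmetic, with hypothetical function names:

```python
import math

def prior_bias(pi=0.01):
    """Bias that makes sigmoid(bias) equal to the prior probability pi."""
    return -math.log((1.0 - pi) / pi)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

bias = prior_bias(0.01)
initial_probability = sigmoid(bias)

print("bias:", bias)
print("initial foreground probability:", initial_probability)
```

With the weights of that layer initialised close to zero, every anchor starts out with a foreground score near the prior, so the many background anchors contribute only a small loss in the first iterations.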

Using focal loss in RetinaNet reduces the influence of the many easy negatives on training. As a result, the background is more clearly distinguished from the foreground objects.

RetinaNet improved considerably upon single-shot detection with its new training approach. There are now a few variants of RetinaNet in which researchers introduce an adaptive loss function along with instance mask prediction during training.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.