In computer vision, object detection is the task of classifying and localizing objects in an image. It is more challenging than plain image classification because the model must also draw a bounding box around each object. While reading research papers you will come across the terms AP, IoU, and mAP; these are object detection metrics that help in comparing models and selecting a good one. In this article, we will build a clear understanding of each of these metrics.
Basic object detection metrics covered in this article:
- Introduction to precision and recall
- Intersection over Union (IoU)
- Average Precision (AP)
- Mean Average Precision (mAP)
- Variations of mAP
Introduction to Precision and Recall
Precision – the fraction of the model's positive predictions that are actually correct: Precision = TP / (TP + FP).
Recall – the fraction of the actual (ground-truth) objects that the model successfully detects: Recall = TP / (TP + FN).
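As a minimal sketch of how these are computed in practice (using scikit-learn and made-up labels purely for illustration):

from sklearn.metrics import precision_score, recall_score

# Hypothetical example: 1 = object detected / present, 0 = background / missed
y_true = [1, 1, 0, 1, 0, 1]   # ground-truth labels (illustrative only)
y_pred = [1, 0, 0, 1, 1, 1]   # model decisions at some confidence threshold

print(precision_score(y_true, y_pred))  # TP / (TP + FP) -> 0.75
print(recall_score(y_true, y_pred))     # TP / (TP + FN) -> 0.75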
Intersection Over Union (IoU)
IoU is a metric that measures the overlap between a ground-truth annotation and a predicted bounding box. It is used in most state-of-the-art object detection algorithms. In object detection, the model typically predicts multiple bounding boxes for each object; using the confidence score of each box together with an IoU threshold, the unnecessary boxes are removed. We need to choose the threshold value based on our requirements.
IoU = Area of intersection / Area of union
def IOU(box1, box2):
    # Boxes are given as (x, y, w, h): top-left corner plus width and height
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    # Width and height of the overlapping region
    w_intersection = min(x1 + w1, x2 + w2) - max(x1, x2)
    h_intersection = min(y1 + h1, y2 + h2) - max(y1, y2)
    if w_intersection <= 0 or h_intersection <= 0:  # boxes do not overlap
        return 0
    I = w_intersection * h_intersection
    U = w1 * h1 + w2 * h2 - I  # union = area1 + area2 - intersection
    return I / U
# IoU between each ground-truth box and the corresponding predicted box
iou = [IOU(y_test[i], y_pred[i]) for i in range(len(y_test))]
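For instance, with two hypothetical boxes in (x, y, w, h) format, the function above behaves as expected:

print(IOU((0, 0, 10, 10), (0, 0, 10, 10)))   # 1.0 (identical boxes)
print(IOU((0, 0, 10, 10), (5, 5, 10, 10)))   # 25 / 175 = 0.142... (partial overlap)
print(IOU((0, 0, 10, 10), (20, 20, 5, 5)))   # 0 (no overlap)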
Ground-truth image: one annotated bounding box per object.
Model predictions before any filtering: many overlapping boxes per object.
After setting a threshold, the unnecessary boxes are removed based on their confidence scores and their IoU overlap; a minimal sketch of this filtering step is shown below.
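The sketch below is a simple greedy non-maximum suppression built on the IOU function above; the box coordinates, scores, and the 0.5 threshold are illustrative assumptions, not taken from the original article.

def nms(boxes, scores, iou_threshold=0.5):
    # Greedy non-maximum suppression:
    # keep the highest-scoring box, drop boxes that overlap it too much, repeat.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if IOU(boxes[best], boxes[i]) < iou_threshold]
    return keep

# Illustrative boxes in (x, y, w, h) format with confidence scores
boxes = [(10, 10, 50, 50), (12, 12, 48, 48), (100, 100, 40, 40)]
scores = [0.9, 0.75, 0.8]
print(nms(boxes, scores))  # -> [0, 2]: the near-duplicate box 1 is suppressed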
Average Precision (AP)
To evaluate a detector, we commonly use the precision-recall curve, but average precision summarises it as a single number, which makes it easy to compare the performance of different models. Based on the precision-recall curve, AP is the weighted mean of the precisions at each threshold, where the weight is the increase in recall from the previous threshold. Average precision is calculated for each object class.
AP = Σ (R_n − R_{n-1}) × P_n
In this formula, P_n and R_n refer to the precision and recall at the n-th threshold.
import numpy as np
from sklearn.metrics import average_precision_score

ground_truth = np.array([0, 0, 1, 1])                          # true labels
model_predicted_confidences = np.array([0.1, 0.4, 0.35, 0.8])  # predicted scores
print(average_precision_score(ground_truth, model_predicted_confidences))
In the above output, we get an average precision of about 0.8333 based on the confidence scores.
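To connect this back to the formula above, here is a minimal sketch (continuing with the same toy labels and scores) that recomputes AP directly from the precision-recall curve:

from sklearn.metrics import precision_recall_curve

precision, recall, thresholds = precision_recall_curve(
    ground_truth, model_predicted_confidences)
# Weighted mean of precisions, weighted by the increase in recall.
# recall is returned in decreasing order, hence the minus sign.
ap = -np.sum(np.diff(recall) * precision[:-1])
print(ap)  # ~0.8333, matching average_precision_score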
Mean Average Precision (mAP)
Mean average precision is an extension of average precision. Average precision is calculated per class, while mAP averages these per-class values to give a single score for the entire model. We use mAP to measure the overall percentage of correct predictions made by the model.
mAP = (1 / N) × Σ AP_i
Here N denotes the number of classes (object categories) and AP_i is the average precision of the i-th class.
AP_per_class = [0.83, 0.66, 0.99, 0.78, 0.60]  # AP of each class
n_classes = len(AP_per_class)
total = sum(AP_per_class)
mAP = total / n_classes   # mean of the per-class AP values
print(mAP)                # 0.772
Each class has its own average precision value; we average all of these values to obtain the mean average precision.
Variations of mAP
In most research papers, these metrics carry extensions such as mAP IoU=0.5, mAP IoU=0.75, and mAP small/medium/large. Below we explain what each of these means.
- mAP IoU=0.5 means the evaluation uses an IoU threshold of 0.5: a predicted box counts as a correct detection only if it overlaps the ground-truth box with an IoU of at least 0.5. This is the standard threshold for most models.
- mAP IoU=0.75 means the evaluation uses a stricter threshold of 0.75: predicted boxes with less than 75% IoU overlap with the ground truth are counted as incorrect, so only very accurately localised boxes score well (see the sketch after this list).
- mAP small means the mAP score is computed only over the smaller objects in the data (in COCO, objects with area below 32² pixels).
- mAP medium means the mAP score is computed only over medium-sized objects (in COCO, area between 32² and 96² pixels).
- mAP large means the mAP score is computed only over the larger objects in the data (in COCO, area above 96² pixels).
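As a small illustrative sketch (hypothetical boxes, reusing the IOU function from earlier), the same detection can count as correct at IoU=0.5 but incorrect at IoU=0.75:

gt_box   = (0, 0, 100, 100)   # hypothetical ground-truth box (x, y, w, h)
pred_box = (20, 0, 100, 100)  # predicted box, shifted 20 pixels to the right

overlap = IOU(gt_box, pred_box)
print(overlap)                # ~0.667
print(overlap >= 0.50)        # True  -> correct detection under mAP IoU=0.5
print(overlap >= 0.75)        # False -> incorrect detection under mAP IoU=0.75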
Conclusion
In this article, we demonstrated the basic object detection metrics, which compare predicted outputs against ground-truth values, and their common variations. These metrics are used by most state-of-the-art models to evaluate the quality of their predictions.