Object detection is a demanding task, and anyone who has tried to build a custom object detector for research knows how many design decisions are involved. You have to choose a model architecture, such as an FPN (Feature Pyramid Network) combined with a region proposal network. Among region proposal methods there is Faster R-CNN, while single-shot alternatives include SSD (Single Shot Detector) and YOLO (You Only Look Once).
In this ongoing race to make object detection models better and more efficient, the Facebook AI team has released many cutting-edge detectors, models, frameworks, and datasets over the years. Even so, the company has not stayed clear of controversy: tweets and negative commentary about Facebook's AI systems keep circulating on the internet.
Facebook sucks— Elon Musk (@elonmusk) May 14, 2020
In 2018, Facebook AI Research (FAIR) released an object detection platform called Detectron. It was a solid library implementing state-of-the-art object detection algorithms, including Mask R-CNN, and it was written in Python on top of the Caffe2 deep learning framework.
Detectron powered many research projects, such as Feature Pyramid Networks (FPN), Data Distillation: Towards Omni-Supervised Learning, and Mask R-CNN. The Detectron backbone network framework was based on:
The goal of Detectron was simple: to provide a high-performance codebase for object detection. In practice there were difficulties. It was hard to use because it combined Caffe2 and PyTorch, and it was becoming difficult to install.
And that’s why FAIR came up with the new version of Detectron.
“Detectron2 is Facebook AI Research’s next-generation software system that implements state-of-the-art object detection algorithms”– Github Detectron2
Detectron2 is built on PyTorch, which has a very active community and receives continuous upgrades and bug fixes. This time the Facebook AI Research team listened to the reported issues and provided straightforward installation instructions. They also provided a simple API for extracting scoring results. Other frameworks such as YOLO deliver their scoring results as multidimensional arrays in a less obvious format, so parsing the results and mapping them to the right objects takes more effort.
Detectron2 has drawn a lot of attention online since its release.
Detectron2 originates from maskrcnn-benchmark. Some of its new features are:
- It is powered by the PyTorch deep learning framework.
- It supports panoptic segmentation.
- It includes DensePose.
- It provides a wide set of baseline results and trained models for download in the Detectron2 Model Zoo.
- It includes projects such as DeepLab, TensorMask, PointRend, and more.
- It can be used as a library to support other projects built on top of it.
- Models can be exported to easily accessible formats such as Caffe2 and TorchScript.
- It offers flexible and fast training on single or multiple GPU servers.
Alongside Detectron2 there is also Detectron2Go, an additional software layer that makes it easier to deploy advanced new models to production. Some of the features of Detectron2Go are:
- Standard training workflows with in-house datasets
- Network quantization
- Model conversion to optimized formats for deployment to mobile devices and cloud.
- Operating System: Linux or macOS
- Python: 3.6+
- PyTorch: 1.5+ and a torchvision version that matches the PyTorch installation. You can install both together from pytorch.org
- OpenCV for Visualization
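The first two requirements in the list above can be checked before installing anything. The following is a minimal sketch that uses only the Python standard library. The function name `check_requirements` and its parameters are hypothetical and not part of Detectron2, and the sketch deliberately does not check PyTorch, torchvision, or OpenCV.

```python
import platform
import sys


def check_requirements(os_name=None, python_version=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper (not part of Detectron2). It only checks the
    operating system and the Python version from the requirements list.
    """
    if os_name is None:
        os_name = platform.system()
    if python_version is None:
        python_version = sys.version_info[:2]
    python_version = tuple(python_version)

    problems = []
    # platform.system() reports macOS as "Darwin".
    if os_name not in ("Linux", "Darwin"):
        problems.append(f"unsupported operating system: {os_name}")
    if python_version < (3, 6):
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python 3.6 or newer is required, found {found}")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    print("Environment looks fine" if not issues else "\n".join(issues))
```

Passing explicit values, for example `check_requirements("Windows", (3, 5))`, makes it easy to see which messages would be reported on an unsupported setup.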
We are going to use the official Google Colab tutorial from Detectron2.
- Installing dependencies (pyyaml)
!pip install pyyaml==5.1
import torch, torchvision
- Install Detectron2 and restart your runtime after running the command below:
import torch
assert torch.__version__.startswith("1.7")
!pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html
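The wheel index URL in the install command encodes a CUDA tag (`cu101`) and a PyTorch major.minor version (`torch1.7`), which is why the command asserts the PyTorch version first. As a rough sketch, the URL could be derived from a version string such as `1.7.0+cu101`. The helper name `wheel_index_url` is hypothetical, the URL pattern is inferred from the single URL shown above, and falling back to a `cpu` tag when no CUDA suffix is present is an assumption. Pre-built wheels exist only for some combinations, so the resulting URL is not guaranteed to resolve.

```python
# URL pattern inferred from the install command above (an assumption).
WHEEL_INDEX = "https://dl.fbaipublicfiles.com/detectron2/wheels/{cuda}/torch{torch}/index.html"


def wheel_index_url(torch_version):
    """Build a wheel index URL from a string such as '1.7.0+cu101'.

    Hypothetical helper, not part of Detectron2 or PyTorch.
    """
    # Split "1.7.0+cu101" into the base version and the local build tag.
    base, _, local_tag = torch_version.partition("+")
    major_minor = ".".join(base.split(".")[:2])
    # Assumption: a version without a CUDA suffix is treated as a CPU build.
    cuda = local_tag or "cpu"
    return WHEEL_INDEX.format(cuda=cuda, torch=major_minor)


print(wheel_index_url("1.7.0+cu101"))
```

In a notebook, the argument would typically be `torch.__version__`.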
- Setup Detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
- Import additional libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow
- Import detectron2 utilities for easy execution
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
- Run a detectron2 model trained on COCO dataset
!wget http://images.cocodataset.org/test-stuff2017/000000017581.jpg -q -O input.jpg
im = cv2.imread("./input.jpg")
cv2_imshow(im)
- Create a detectron2 configuration and a DefaultPredictor to run inference on input image
cfg = get_cfg()
# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(im)
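The `SCORE_THRESH_TEST = 0.5` setting controls which detections are kept at inference time: predictions with a confidence score below the threshold are dropped. The snippet below is only an illustration of that idea in plain Python. The detections are made-up values, `filter_by_score` is a hypothetical helper, and the strict comparison at the boundary is an assumption rather than a description of Detectron2's internals.

```python
def filter_by_score(detections, score_thresh=0.5):
    """Keep (label, score) pairs whose score exceeds the threshold.

    Illustrative only; Detectron2 applies its threshold inside the model.
    """
    return [(label, score) for label, score in detections if score > score_thresh]


# Made-up detections for illustration.
detections = [("person", 0.98), ("dog", 0.62), ("kite", 0.31)]

print(filter_by_score(detections))       # [('person', 0.98), ('dog', 0.62)]
print(filter_by_score(detections, 0.9))  # [('person', 0.98)]
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the ones that remain are the predictions the model is most confident about.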
- Print predicted output
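The predictor returns a dictionary whose `"instances"` entry holds the predicted classes, scores, and bounding boxes. As a hedged sketch, the helper below turns such an object into plain Python records that are easy to print or serialize. `summarize_instances` is a hypothetical name, and the helper assumes the object exposes `pred_classes`, `scores`, and `pred_boxes.tensor`, each with a `.tolist()` method, which matches the fields of Detectron2's instance predictions.

```python
def summarize_instances(instances, class_names=None):
    """Turn an Instances-like object into a list of plain dicts.

    Hypothetical helper. `class_names` is an optional list that maps
    class ids to readable labels; without it the raw id is kept.
    """
    classes = instances.pred_classes.tolist()
    scores = instances.scores.tolist()
    boxes = instances.pred_boxes.tensor.tolist()  # rows of [x1, y1, x2, y2]

    records = []
    for class_id, score, box in zip(classes, scores, boxes):
        label = class_names[class_id] if class_names else class_id
        records.append({"label": label, "score": round(score, 3), "box": box})
    return records
```

In the Colab session this could be called as `summarize_instances(outputs["instances"].to("cpu"))`, optionally passing the `thing_classes` list from the dataset metadata as `class_names`.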
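- Visualize the predicted output using the Visualizer utility from Detectron2
# OpenCV loads images as BGR, while Visualizer expects RGB, so reverse the channel order
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
# Convert back to BGR before displaying with cv2_imshow
cv2_imshow(out.get_image()[:, :, ::-1])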
For common installation errors, refer here
The FAIR team has taken a straightforward approach this time by open-sourcing everything. This was a good move, as the team believes that state-of-the-art algorithms and techniques cannot be achieved in isolation, so open source is the way to a better AI era. FAIR has run many interesting projects, such as the multimodal Hateful Memes Challenge.
Facebook AI Research has included many projects built with Detectron2, such as:
Some of the external projects that use detectron2:
Mohit is a Data & Technology Enthusiast with good exposure to solving real-world problems in various avenues of IT and Deep learning domain. He believes in solving human's daily problems with the help of technology.