Introduction To YolactEdge For Real-time Object Segmentation On Edge Device

YolactEdge is one of the first competitive instance segmentation techniques that runs in real time on small edge devices. With a ResNet-101 backbone and 550×550 input images, it reaches up to 30 FPS on an Nvidia Jetson AGX Xavier and 172 FPS on an RTX 2080 Ti. The paper, YolactEdge: Real-time Instance Segmentation on the Edge, was written by Haotian Liu, Rafael A. Rivera Soto, Fanyi Xiao, and Yong Jae Lee and released in December 2020. The code and models are open-sourced on GitHub.

The main contributions of the authors are:

  • TensorRT optimization that carefully trades off speed and accuracy.
  • A novel feature warping module that exploits temporal redundancy in videos.
  • Experiments on the YouTube VIS and MS COCO datasets.
  • A 3 to 5x speedup over existing real-time methods, with competitive mask and box detection accuracy.

To run inference at real-time speeds on edge devices, the authors build on YOLACT, a state-of-the-art image-based real-time instance segmentation method, and make two main changes: one at the algorithm level and one at the system level. At the system level, YolactEdge uses the Nvidia TensorRT inference engine to quantize the network parameters to fewer bits while systematically balancing any tradeoff in accuracy. At the algorithm level, it exploits temporal redundancy in video by learning to transform and propagate features over time, so that the deep network's expensive backbone features do not have to be fully computed on every frame.
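
The idea of storing parameters in fewer bits can be illustrated with a minimal sketch. It shows generic symmetric INT8 quantization with a single shared scale, which is an assumption made for illustration and not TensorRT's actual calibration procedure; the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to make the rounding easy to follow.
weights = [0.50, -1.27, 0.03, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight now fits in 8 bits instead of 32, at the cost of a small rounding error. Balancing that error against the speed gain is the tradeoff the authors refer to.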

YolactEdge Backbone

YOLACT can be divided into 4 components: 

  1. A feature backbone.
  2. A feature pyramid network (FPN).
  3. A ProtoNet.
  4. A prediction head.

YolactEdge extends YOLACT to video by transforming a subset of the features from keyframes to non-keyframes, which reduces expensive backbone computation. On non-keyframes, it computes only the features that are cheap yet crucial for mask prediction, which greatly speeds up the method while retaining accuracy. In the architecture figure, keyframes are shown on the left and non-keyframes on the right, and blue, orange, and grey indicate computed, transformed, and skipped blocks respectively.
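
The keyframe idea can be sketched as a simple loop. This is only an illustration: the fixed keyframe interval of 5, the function names, and the stand-in backbone and transform functions are all assumptions, not the authors' implementation.

```python
def process_video(frames, full_backbone, transform_features, keyframe_interval=5):
    """Run the full backbone on keyframes only and reuse its features in between."""
    if keyframe_interval < 1:
        raise ValueError("keyframe_interval must be at least 1")
    keyframe_features = None
    outputs = []
    for index, frame in enumerate(frames):
        if index % keyframe_interval == 0:
            # Keyframe: pay for the expensive backbone and cache the result.
            keyframe_features = full_backbone(frame)
            outputs.append(keyframe_features)
        else:
            # Non-keyframe: cheaply transform the cached keyframe features.
            outputs.append(transform_features(keyframe_features, frame))
    return outputs


# Stand-in functions that only count how often each path is taken.
calls = {"full": 0, "transformed": 0}


def fake_backbone(frame):
    calls["full"] += 1
    return f"features({frame})"


def fake_transform(keyframe_features, frame):
    calls["transformed"] += 1
    return f"warped({keyframe_features} -> {frame})"


outputs = process_video(range(10), fake_backbone, fake_transform, keyframe_interval=5)
print(calls)
```

With 10 frames and an interval of 5, the expensive path runs on only 2 frames, and the other 8 reuse transformed features.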

[Figure: YolactEdge architecture]

Implementation

YolactEdge is trained with a batch size of 32 on 4 GPUs, starting from ImageNet pre-trained weights. First, the authors pre-trained YOLACT with SGD for 500k iterations. Then, they froze the YOLACT weights and trained FeatFlowNet on the FlyingChairs dataset. Finally, they fine-tuned all weights except the ResNet backbone for 200k iterations.
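
The three training stages can be summarized as a small schedule. This is a hedged sketch: the stage names and component names are hypothetical labels, the assumption that the backbone is trainable during YOLACT pre-training is ours, and the iteration count for the FeatFlowNet stage is left unset because the text does not state it.

```python
COMPONENTS = ["backbone", "fpn", "protonet", "prediction_head", "featflownet"]

STAGES = [
    {
        "name": "pretrain_yolact",
        "iterations": 500_000,
        # Assumption: all YOLACT parts, including the backbone, are trained here.
        "trainable": ["backbone", "fpn", "protonet", "prediction_head"],
    },
    {
        "name": "train_featflownet",
        "iterations": None,  # not stated in the text
        "trainable": ["featflownet"],
    },
    {
        "name": "finetune_all_but_backbone",
        "iterations": 200_000,
        "trainable": ["fpn", "protonet", "prediction_head", "featflownet"],
    },
]


def frozen_components(stage):
    """Return the components whose weights stay fixed during a stage."""
    return [c for c in COMPONENTS if c not in stage["trainable"]]


for stage in STAGES:
    print(stage["name"], "frozen:", frozen_components(stage))
```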

Installation

  • The code is written in Python 3.
  • Install PyTorch 1.6.0.
  • Install CUDA 10.2/11.0 and cuDNN 8.0.0.
  • Download the TensorRT 7.1 tar file and install TensorRT by following the official documentation.
  • Install torch2trt:
 git clone https://github.com/NVIDIA-AI-IOT/torch2trt
 cd torch2trt
 sudo python setup.py install --plugins
 cd ..
 # Install the remaining dependencies.
 pip install cython
 pip install opencv-python pillow matplotlib
 pip install git+https://github.com/haotian-liu/cocoapi.git#"egg=pycocotools&subdirectory=PythonAPI"
 pip install GitPython termcolor tensorboard
 # Clone the YolactEdge repository and change into it.
 git clone https://github.com/haotian-liu/yolact_edge.git
 cd yolact_edge
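
After installation, a quick check like the following can confirm that the packages listed above are importable. It is a minimal sketch: the helper name is hypothetical, and the list maps each installed package to its usual import name (for example, opencv-python is imported as cv2 and GitPython as git).

```python
import importlib.util

# Import names for the dependencies listed in the installation steps.
REQUIRED_MODULES = [
    "torch",
    "torch2trt",
    "tensorrt",
    "Cython",
    "cv2",
    "PIL",
    "matplotlib",
    "pycocotools",
    "git",
    "termcolor",
    "tensorboard",
]


def missing_modules(names):
    """Return the module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that each module can be located. It does not verify versions or that CUDA and TensorRT work at runtime.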

YolactEdge Models

The authors provide baseline YOLACT and YolactEdge models trained on the COCO and YouTube VIS datasets. The YouTube VIS models are listed below, and the weights for each can be downloaded from the project repository.

Method                Backbone    mAP    AGX-Xavier FPS   RTX 2080 Ti FPS
YOLACT                R-50-FPN    44.7   8.5              59.8
YolactEdge (w/o TRT)  R-50-FPN    44.2   10.5             67.0
YolactEdge            R-50-FPN    44.0   32.4             177.6
YOLACT                R-101-FPN   47.3   5.9              42.6
YolactEdge (w/o TRT)  R-101-FPN   46.9   9.5              61.2
YolactEdge            R-101-FPN   46.2   30.8             172.7
YouTube VIS models
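
The speedup claim can be checked against the YouTube VIS numbers above with a little arithmetic. The sketch below copies the YOLACT and YolactEdge (with TensorRT) rows from the table; the dictionary layout and function name are hypothetical.

```python
# Values copied from the YouTube VIS table (YOLACT vs. YolactEdge with TensorRT).
RESULTS = {
    "R-50-FPN": {
        "YOLACT": {"map": 44.7, "xavier_fps": 8.5, "rtx_fps": 59.8},
        "YolactEdge": {"map": 44.0, "xavier_fps": 32.4, "rtx_fps": 177.6},
    },
    "R-101-FPN": {
        "YOLACT": {"map": 47.3, "xavier_fps": 5.9, "rtx_fps": 42.6},
        "YolactEdge": {"map": 46.2, "xavier_fps": 30.8, "rtx_fps": 172.7},
    },
}


def compare(backbone):
    """Compute the FPS speedup and mAP drop of YolactEdge relative to YOLACT."""
    base = RESULTS[backbone]["YOLACT"]
    edge = RESULTS[backbone]["YolactEdge"]
    return {
        "xavier_speedup": edge["xavier_fps"] / base["xavier_fps"],
        "rtx_speedup": edge["rtx_fps"] / base["rtx_fps"],
        "map_drop": base["map"] - edge["map"],
    }


for backbone in RESULTS:
    c = compare(backbone)
    print(
        f"{backbone}: {c['xavier_speedup']:.1f}x on AGX Xavier, "
        f"{c['rtx_speedup']:.1f}x on RTX 2080 Ti, mAP drop {c['map_drop']:.1f}"
    )
```

On the Jetson AGX Xavier this works out to roughly 3.8x for R-50-FPN and 5.2x for R-101-FPN, at a cost of about one mAP point or less.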

YolactEdge COCO Models

Method       Backbone       mAP    Titan Xp FPS   AGX-Xavier FPS   RTX 2080 Ti FPS
YOLACT       MobileNet-V2   22.1   -              15.0             35.7
YolactEdge   MobileNet-V2   20.8   -              35.7             161.4
YOLACT       R-50-FPN       28.2   42.5           9.1              45.0
YolactEdge   R-50-FPN       27.0   -              30.7             140.3
YOLACT       R-101-FPN      29.8   33.5           6.6              36.5
YolactEdge   R-101-FPN      29.5   -              27.3             124.8
COCO models

To evaluate the pretrained models, create a ./weights directory, place the corresponding weight file in it, and then run the commands below.
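
Preparing the weights directory can also be scripted. This is a small sketch with a hypothetical helper name; the two file names are the ones used in the evaluation commands in this article, and the files themselves still have to be downloaded manually.

```python
from pathlib import Path

# Weight files referenced by the evaluation commands in this article.
EXPECTED_WEIGHTS = [
    "yolact_edge_vid_847_50000.pth",
    "yolact_edge_54_800000.pth",
]


def find_missing_weights(weights_dir, filenames):
    """Create the weights directory if needed and list files not yet present."""
    weights_dir = Path(weights_dir)
    weights_dir.mkdir(parents=True, exist_ok=True)
    return [name for name in filenames if not (weights_dir / name).is_file()]


if __name__ == "__main__":
    for name in find_missing_weights("./weights", EXPECTED_WEIGHTS):
        print(f"Missing weight file: ./weights/{name}")
```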

Evaluation of YolactEdge

Convert each component of the trained model to TensorRT using the optimal settings, and evaluate on the YouTube VIS validation set.

 python3 eval.py --trained_model=./weights/yolact_edge_vid_847_50000.pth
 # Evaluate on the entire COCO validation set.
 # '--yolact_transfer' is used to convert the models trained with YOLACT to be compatible with YolactEdge.
 python3 eval.py --yolact_transfer --trained_model=./weights/yolact_edge_54_800000.pth
 # Output a COCO file for the COCO test-dev set. The command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively. These files can then be submitted to the website for evaluation.
 python3 eval.py --yolact_transfer --trained_model=./weights/yolact_edge_54_800000.pth --dataset=coco2017_testdev_dataset --output_coco_json

Running on Images

 # Display qualitative results on the specified image.
 python eval.py --yolact_transfer --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png
 # Process an image and save it to another file.
 python eval.py --yolact_transfer --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png
 # Process a whole folder of images.
 python eval.py --yolact_transfer --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder 

Running on Videos

 # Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance.
 # If video_multiframe > 1, then the trt_batch_size should be increased to match it or surpass it. 
 python eval.py --yolact_transfer --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=my_video.mp4
 # Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.
 python eval.py --yolact_transfer --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --trt_batch_size 2 --video=0
 # Process a video and save it to another file. This is unoptimized.
 python eval.py --yolact_transfer --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4 
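
When these commands are launched from a script, the rule that trt_batch_size should match or exceed video_multiframe is easy to enforce up front. The helper below is a hypothetical sketch that only assembles the argument list using the flags shown above; it does not call into YolactEdge itself.

```python
def build_video_command(trained_model, video, video_multiframe=1, trt_batch_size=1,
                        score_threshold=0.3, top_k=100, yolact_transfer=True):
    """Assemble the eval.py argument list for video inference."""
    if video_multiframe > 1 and trt_batch_size < video_multiframe:
        raise ValueError("trt_batch_size must be at least video_multiframe")
    command = ["python", "eval.py"]
    if yolact_transfer:
        command.append("--yolact_transfer")
    command += [
        f"--trained_model={trained_model}",
        f"--score_threshold={score_threshold}",
        f"--top_k={top_k}",
        f"--video_multiframe={video_multiframe}",
        "--trt_batch_size", str(trt_batch_size),
        f"--video={video}",
    ]
    return command


command = build_video_command(
    "weights/yolact_edge_54_800000.pth",
    "my_video.mp4",
    video_multiframe=2,
    trt_batch_size=2,
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run from inside the yolact_edge directory. Passing "0" as the video argument selects the first webcam, as in the command above.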

Conclusion 

YolactEdge is a new way of looking at the object detection problem, bringing real-time instance segmentation to edge devices with less computation power. What remains in many deep learning projects is largely an optimization problem, and approaches like YolactEdge address it. To learn more about the project, see the paper and the GitHub repository.

Mohit Maithani

Mohit is a Data & Technology Enthusiast with good exposure to solving real-world problems in various avenues of IT and Deep learning domain. He believes in solving human's daily problems with the help of technology.
