Last updated April 8, 2024

Guide To PP-YOLO: An Effective And Efficient Implementation Of Object Detector

Published on February 17, 2021

by Aishwarya Verma

PP-YOLO is a deep learning object detector built on the YOLOv3 architecture. It was introduced in the research paper "PP-YOLO: An Effective and Efficient Implementation of Object Detector" by Baidu researchers Xiang Long, Kaipeng Deng, Guanzhong Wang, Yang Zhang, Qingqing Dang, Yuan Gao, Hui Shen, Jianguo Ren, Shumin Han, Errui Ding and Shilei Wen. As the authors put it:

This paper is not intended to introduce a novel object detector. It is more like a recipe, which tells you how to build a better detector step by step.

PP-YOLO stands for PaddlePaddle You Only Look Once. It ships as part of PaddleDetection, PaddlePaddle's object detection toolkit, which aims to make building, training, optimizing and deploying detection models faster and easier. The toolkit implements many mainstream algorithms in a modular way and provides data augmentation methods, loss functions and other components, along with model compression and high-performance deployment support. Key features are listed below:

PaddleDetection provides many pre-trained models for tasks such as object detection, instance segmentation and face detection.
Its modular design helps developers assemble different pipelines quickly.
It provides end-to-end support for data augmentation, model construction, training, optimization, compression and deployment.
It supports distributed training as well.

Algorithms implemented in PP-YOLO:

*Source:https://github.com/PaddlePaddle/PaddleDetection/*

Architecture of PP-YOLO

The architecture of PP-YOLO is based on YOLOv3. Like other one-stage detectors in PaddleDetection, it is divided into 3 parts:

Backbone: A convolutional neural network that generates feature maps. It is a pre-trained classification model, in this case ResNet50-vd (the paper's variant, ResNet50-vd-dcn, adds deformable convolutions in the last stage).
Detection Neck: A Feature Pyramid Network (FPN) builds a pyramid of features by combining the backbone's feature maps from different levels.
Detection Head: This part predicts the class and bounding box of each object.
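
To make the three stages more concrete, here is a minimal shape-only sketch. It is not PaddleDetection code: it assumes a 608x608 input, the usual YOLOv3-style strides of 8, 16 and 32, three anchors per scale and the 80 COCO classes, and the function name head_output_shapes is hypothetical. PP-YOLO's own head may add extra channels that are not modeled here.

```python
def head_output_shapes(input_size=608, strides=(8, 16, 32),
                       num_anchors=3, num_classes=80):
    """Return (channels, height, width) of the raw head output per scale.

    Assumes the plain YOLOv3-style layout: each anchor predicts
    4 box offsets, 1 objectness score and one score per class.
    """
    channels = num_anchors * (5 + num_classes)
    return [(channels, input_size // s, input_size // s) for s in strides]


shapes = head_output_shapes()
for stride, shape in zip((8, 16, 32), shapes):
    print(f"stride {stride}: {shape}")
```

The coarsest map (stride 32) is responsible for large objects and the finest (stride 8) for small ones, which is why the neck mixes features across levels before the head predicts.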

*Source : https://arxiv.org/abs/2007.12099*

Comparison of PP-YOLO with other state-of-the-art Algorithms

According to the paper, PP-YOLO achieves a better balance between effectiveness (45.2% mAP) and efficiency (72.9 FPS) than existing state-of-the-art detectors such as EfficientDet and YOLOv4.
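
A throughput figure is often easier to reason about as per-image latency. The short sketch below converts one into the other; the helper name fps_to_latency_ms is hypothetical, and it assumes images are processed one at a time.

```python
def fps_to_latency_ms(fps):
    """Average time per image in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 72.9 FPS is the figure reported for PP-YOLO above.
print(round(fps_to_latency_ms(72.9), 1))  # 13.7
```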

Requirements

OS: 64-bit operating system
Python 2 >= 2.7.15 or Python 3 (3.5.1+/3.6/3.7), 64-bit version
pip/pip3 (9.0.1+), 64-bit version
CUDA >= 9.0
cuDNN >= 7.6
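
The Python part of these requirements can be checked from the interpreter itself. The sketch below is only an illustration: the function name check_python is hypothetical, and it reads the list above strictly, so Python 3.8 or newer is reported as unsupported. It does not check pip, CUDA or cuDNN.

```python
import struct
import sys


def check_python(version=None, pointer_bits=None):
    """Return True if the interpreter matches the listed Python requirements.

    Both arguments default to the running interpreter; they can be passed
    explicitly to test other combinations.
    """
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    # Size of a C pointer in bits: 64 on a 64-bit Python build.
    bits = struct.calcsize("P") * 8 if pointer_bits is None else pointer_bits
    ok_py2 = version[0] == 2 and version >= (2, 7, 15)
    ok_py3 = (3, 5, 1) <= version < (3, 8)
    return (ok_py2 or ok_py3) and bits == 64


print(check_python())
```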

Install the PaddlePaddle framework with GPU support. The dependency table is given here.

 # If your machine has CUDA 10 installed, run the following command to install
 !python -m pip install paddlepaddle-gpu==1.8.4.post107 -i https://mirror.baidu.com/pypi/simple
 # Confirm that PaddlePaddle is installed successfully in your Python interpreter
 import paddle.fluid as fluid
 fluid.install_check.run_check()
 # Confirm the PaddlePaddle version
 !python -c "import paddle; print(paddle.__version__)"

Installation

Install PaddleDetection via git.

 !git clone https://github.com/PaddlePaddle/PaddleDetection.git
 %cd /content/PaddleDetection

Make sure the test below passes before using the PP-YOLO pre-trained models.

!python ppdet/modeling/tests/test_architectures.py

The output of the above test should be:

Pre-trained model prediction

Use the pre-trained model to run prediction on a sample image and quickly see the results.

!python tools/infer.py -c configs/ppyolo/ppyolo.yml -o use_gpu=true weights=https://paddlemodels.bj.bcebos.com/object_detection/ppyolo.pdparams --infer_img=demo/000000014439.jpg

Quick start on Small Dataset

This demo fine-tunes a pre-trained object detection model on a small dataset, which is a quick way to learn PaddleDetection.

Specify the gpu to be used by:

%env CUDA_VISIBLE_DEVICES=0
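
Outside a notebook, for example in a standalone Python script, the same selection can be made by setting the environment variable before the deep learning framework is imported. This is a minimal sketch; the helper name select_gpus is hypothetical.

```python
import os


def select_gpus(device_ids):
    """Expose only the given GPU ids to CUDA code started from this process."""
    value = ",".join(str(i) for i in device_ids)
    # Must be set before the framework initializes CUDA to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


select_gpus([0])
print(os.environ["CUDA_VISIBLE_DEVICES"])
```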

Download the dataset. Here we are using a fruit dataset for object detection. The dataset is available here, or you can download it with:

!python dataset/fruit/download_fruit.py

Train the model on the dataset. Here we are using yolov3_mobilenet_v1_fruit.yml to fine-tune a model pre-trained on the COCO dataset.

!python -u tools/train.py -c configs/yolov3_mobilenet_v1_fruit.yml --eval

Evaluate the model by running the command below:

!python -u tools/eval.py -c configs/yolov3_mobilenet_v1_fruit.yml
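
Detection metrics such as mAP depend on how well each predicted box overlaps a ground-truth box, measured as intersection over union (IoU). The sketch below shows that building block only. It is not PaddleDetection's evaluation code, and the (x1, y1, x2, y2) corner format is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes that share a 1x1 corner: overlap 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A prediction typically counts as correct only when its IoU with a ground-truth box of the same class exceeds a chosen threshold.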

Run inference with the trained model. Below is the input picture given to the trained model.

Inference is run with this command:

 !python -u tools/infer.py -c configs/yolov3_mobilenet_v1_fruit.yml \
                          -o weights=https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1_fruit.tar \
                          --infer_img=demo/orange_71.jpg

The output is shown as:

You can check the full demo here.

Introducing PP-YOLO, a new implementation of object detector based on PaddlePaddle. Compared with YOLOv4, PP-YOLO improves mAP on COCO to 45.9% and runs at 72.9 fps on Tesla V100. #ComputerVision #AI

Paper: https://t.co/uHWECwzOf5
GitHub: https://t.co/5qXN3lqQh5 pic.twitter.com/pQrxx4VQu5
— Baidu Research (@BaiduResearch) August 18, 2020

EndNotes

In this article, we have covered PP-YOLO, which according to its authors offers a better balance of speed and accuracy than detectors such as EfficientDet and YOLOv4. You can check out the advanced tutorials here.