Guide To PP-YOLO: An Effective And Efficient Implementation Of Object Detector

PP-YOLO is a deep learning object detector based on the YOLOv3 architecture. It was published in the research paper "PP-YOLO: An Effective and Efficient Implementation of Object Detector" by researchers at Baidu: Xiang Long, Kaipeng Deng, Guanzhong Wang, Yang Zhang, Qingqing Dang, Yuan Gao, Hui Shen, Jianguo Ren, Shumin Han, Errui Ding and Shilei Wen. As the authors of PP-YOLO put it:

This paper is not intended to introduce a novel object detector. It is more like a recipe, which tells you how to build a better detector step by step.

PP-YOLO stands for "PaddlePaddle You Only Look Once". The framework aims to make it faster and easier to build, train, optimize and deploy object detection models. It provides many conventional algorithms in modular form, along with data augmentation methods, loss functions, etc., which helps reduce the size of the platform and enables high-performance deployment. Key features are listed below:

  • PP-YOLO provides many pre-trained models for tasks such as object detection, instance segmentation, face detection, etc.
  • PP-YOLO uses a modular design, which helps developers build different pipelines quickly.
  • PP-YOLO provides end-to-end methods for data augmentation, construction, training, optimization, compression and deployment.
  • PP-YOLO supports distributed training as well.

Algorithms implemented in PP-YOLO:


Architecture of PP-YOLO

The architecture of PP-YOLO is largely based on YOLOv3. Its architecture in PaddleDetection is divided into three main parts:

  1. Backbone: A convolutional neural network that generates features. It is a pre-trained classification model, in this case ResNet50-vd.
  2. Detection Neck: A Feature Pyramid Network (FPN) builds a pyramid of features by combining and mixing the ConvNet representations.
  3. Detection Head: This part predicts the object classes and bounding boxes.
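As a rough illustration of what the detection head produces, the sketch below computes output tensor shapes for a YOLOv3-style head. The 608x608 input size, the strides of 8, 16 and 32, the three anchors per grid cell and the 80 COCO classes are assumptions taken from the usual YOLOv3 setup, not from this article, and PP-YOLO's actual head may carry extra channels. The function names are made up for this example.

```python
def head_output_shapes(input_size=608, strides=(8, 16, 32),
                       anchors_per_cell=3, num_classes=80):
    """Return (channels, height, width) for each detection scale.

    Assumption: each anchor predicts 4 box offsets, 1 objectness score
    and one score per class, as in a plain YOLOv3 head.
    """
    shapes = []
    for stride in strides:
        grid = input_size // stride  # feature-map side length at this scale
        channels = anchors_per_cell * (5 + num_classes)
        shapes.append((channels, grid, grid))
    return shapes


def total_boxes(shapes, anchors_per_cell=3):
    """Count the candidate boxes predicted across all scales."""
    return sum(anchors_per_cell * height * width for _, height, width in shapes)


if __name__ == "__main__":
    shapes = head_output_shapes()
    print(shapes)
    print(total_boxes(shapes))
```

Under these assumptions the three scales give 76x76, 38x38 and 19x19 grids, and the candidate boxes from all of them are later filtered down to the final detections.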

Comparison of PP-YOLO with other state-of-the-art Algorithms

PP-YOLO achieves a balance between effectiveness (45.2% mAP) and efficiency (72.9 FPS) that compares favourably with existing state-of-the-art detectors such as EfficientDet and YOLOv4.
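The mAP figure depends on how well predicted boxes overlap the ground-truth boxes, which is measured with intersection over union (IoU). Below is a minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates. It is a generic helper written for this article, not code from PaddleDetection.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 2x2 boxes offset by one unit overlap in a 1x1 square.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is usually counted as correct only when its IoU with a ground-truth box exceeds a chosen threshold.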


Installation requirements:

  • 64-bit operating system
  • Python 2 >= 2.7.15 or Python 3 (3.5.1+/3.6/3.7), 64-bit version
  • pip/pip3 (9.0.1+), 64-bit version
  • CUDA >= 9.0
  • cuDNN >= 7.6
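The Python-related requirements can be checked from the interpreter itself. The sketch below is one way to do it, using only the standard library. The helper names are invented for this example, and treating Python 3.8 and later as unsupported is an assumption read off the version list above.

```python
import struct
import sys


def python_version_ok(version=sys.version_info[:3]):
    """Check a (major, minor, micro) tuple against the listed Python versions."""
    version = tuple(version)
    if version[0] == 2:
        return version >= (2, 7, 15)
    if version[0] == 3:
        # Assumption: the list above stops at 3.7, so newer versions are rejected.
        return (3, 5, 1) <= version < (3, 8)
    return False


def interpreter_is_64bit():
    """A pointer is 8 bytes wide in a 64-bit interpreter."""
    return struct.calcsize("P") * 8 == 64


if __name__ == "__main__":
    print("Python version supported:", python_version_ok())
    print("64-bit interpreter:", interpreter_is_64bit())
```

CUDA and cuDNN versions are not covered here, since they depend on the driver and toolkit installed on the machine.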

Install the GPU version of the PaddlePaddle framework. The dependency table is given here.

 # If your machine has CUDA 10 installed, run the following command to install
 !python -m pip install paddlepaddle-gpu==1.8.4.post107 -i
 # Confirm that PaddlePaddle is installed successfully in your Python interpreter
 import paddle.fluid as fluid
 # Confirm the PaddlePaddle version
 !python -c "import paddle; print(paddle.__version__)"
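The same check can be done without raising an error when the package is missing. The sketch below looks the module up before importing it. The helper names installed_version and parse_version are invented for this example, and it assumes the version string starts with dot-separated numbers, as in 1.8.4.

```python
import importlib
import importlib.util


def installed_version(module_name="paddle"):
    """Return the module's __version__ string, or None if it is unavailable."""
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return getattr(module, "__version__", None)


def parse_version(text):
    """Turn a string such as '1.8.4' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post107' or 'rc1'
        parts.append(int(piece))
    return tuple(parts)


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("PaddlePaddle is not installed in this interpreter")
    else:
        print("PaddlePaddle version:", parse_version(version))
```

Comparing the parsed tuple against (1, 8, 4) is then a plain tuple comparison.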


Install PaddleDetection via git.

 !git clone
 %cd /content/PaddleDetection 

Make sure the test below passes before using the PP-YOLO pre-trained models.

!python ppdet/modeling/tests/

The output of the above test should be:

Pre-trained model prediction

Use the pre-trained model to run prediction on a demo image and get a quick feel for the model's results.

!python tools/ -c configs/ppyolo/ppyolo.yml -o use_gpu=true weights= --infer_img=demo/000000014439.jpg
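In the command above, the -o option overrides individual settings from the YAML file passed with -c. The sketch below shows the idea with a plain dictionary. Both apply_overrides and parse_value are hypothetical helpers written for illustration, not PaddleDetection's own parsing code, and the example config keys and values are placeholders.

```python
def parse_value(raw):
    """Interpret 'true'/'false' as booleans and leave everything else as text."""
    lowered = raw.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    return raw


def apply_overrides(config, overrides):
    """Return a copy of config with each 'key=value' override applied."""
    merged = dict(config)
    for item in overrides:
        key, _, raw = item.partition("=")
        merged[key] = parse_value(raw)
    return merged


if __name__ == "__main__":
    # Placeholder values standing in for what a YAML config might contain.
    base = {"use_gpu": False, "weights": "placeholder_weights"}
    print(apply_overrides(base, ["use_gpu=true"]))
```

Settings that are not mentioned in an override keep the value from the config file.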

Quick start on Small Dataset

This demo fine-tunes a pre-trained object detection model on a small dataset, as a quick way to learn PaddleDetection.

  1. Specify the GPU to be used:
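The command for this step did not survive in the text. A common way to restrict CUDA programs to one device is the CUDA_VISIBLE_DEVICES environment variable, so the sketch below sets it from Python. Using this variable here, and the helper name select_gpu, are assumptions rather than the article's original instruction.

```python
import os


def select_gpu(index=0):
    """Expose only the GPU with the given index to CUDA programs started later.

    Assumption: the variable must be set before the deep learning framework
    initializes CUDA, so call this before importing or launching it.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES =", select_gpu(0))
```

Processes launched from the same session afterwards inherit the variable and see only the selected device.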


  2. Download the dataset. Here we use the fruit dataset for object detection. The dataset is available here, or you can download it with:

!python dataset/fruit/

  3. Train the model on the dataset. Here we use yolov3_mobilenet_v1_fruit.yml to fine-tune a model pre-trained on the COCO dataset.

!python -u tools/ -c configs/yolov3_mobilenet_v1_fruit.yml --eval

  4. Evaluate the model by running the command below:

!python -u tools/ -c configs/yolov3_mobilenet_v1_fruit.yml

  5. Run inference with the trained model. Below is the input picture given to the trained model.

Inference is run with this command:

 !python -u tools/ -c configs/yolov3_mobilenet_v1_fruit.yml \
                          -o weights= \

 The output is shown as:

You can check the full demo, here.


In this article, we covered PP-YOLO, which offers a better balance of speed and accuracy than existing detectors such as EfficientDet and YOLOv4. You can check out the advanced tutorials here.

Official codes, docs & Tutorials are available at:

Aishwarya Verma
A data science enthusiast and a post-graduate in Big Data Analytics. Creative and organized with an analytical bent of mind.
