Guide to YOLOv5 for Real-Time Object Detection

Real-time object detection is the task of detecting objects in video as it streams. Many network architectures have been proposed over the years; in our previous article we discussed EfficientDet, which has since been outperformed by YOLOv4. Today we are going to discuss YOLOv5.

YOLO, short for “You Only Look Once”, is one of the most versatile and well-known object detection models. For real-time object detection work, it is often the first choice of data scientists and machine learning engineers. YOLO divides each input image into an S×S grid, and each grid cell is responsible for detecting objects whose center falls inside it. Each grid cell predicts bounding boxes for the detected objects. Every box has five main attributes: x and y for the center coordinates, w and h for the width and height of the object, and a confidence score for the probability that the box contains an object.
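To make the grid idea concrete, here is a minimal sketch of how a box center maps to the grid cell responsible for it. The `Box` class and `responsible_cell` function are hypothetical names introduced for illustration, and the sketch assumes all coordinates are normalized to the range 0 to 1; it is not taken from any YOLO codebase.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box:
    """One predicted box, with coordinates normalized to [0, 1] (an assumption)."""
    x: float           # box center, horizontal
    y: float           # box center, vertical
    w: float           # box width
    h: float           # box height
    confidence: float  # probability that the box contains an object


def responsible_cell(box: Box, s: int) -> Tuple[int, int]:
    """Return the (row, col) of the S x S grid cell that contains the box center."""
    col = min(int(box.x * s), s - 1)  # clamp so x == 1.0 stays inside the grid
    row = min(int(box.y * s), s - 1)
    return row, col


box = Box(x=0.52, y=0.30, w=0.20, h=0.40, confidence=0.9)
print(responsible_cell(box, s=7))  # (2, 3)
```

With a 7×7 grid, a box centered at (0.52, 0.30) lands in row 2, column 3, so that cell is the one expected to predict it.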



YOLOv1 was introduced in 2016 by Joseph Redmon et al. in a research paper called “You Only Look Once: Unified, Real-Time Object Detection”. This initial paper changed how real-time object detection was approached.


By looking at the image just once, YOLOv1 can detect objects at 45 fps (frames per second). A smaller variant, Fast YOLO, reached 155 fps with somewhat lower accuracy.


Image source: “YOLO v1: Part 1” by Divakar Kapil, Escapades in Machine Learning (Medium)

It used the Darknet framework, with a network pretrained on the ImageNet 1000-class dataset. But YOLOv1 has several limitations:

  • it can’t detect objects properly when they are small, especially when they appear in groups
  • it also struggles to generalize to objects with new or unusual aspect ratios


The second version, YOLOv2, was released in 2017 by Joseph Redmon and Ali Farhadi. This time Joseph collaborated with Ali on major fixes and accuracy improvements. The research paper they published was “YOLO9000: Better, Faster, Stronger”, and the second version is also known as YOLO9000. Its major competitors were Faster R-CNN, which uses a Region Proposal Network, and SSD (Single Shot MultiBox Detector), both of which detect multiple objects in an image.

Some of the features of YOLOv2 are:

  • Batch normalization on the convolutional layers, which normalizes the inputs to each layer and helps training converge.
  • Higher-resolution input: the input size was increased from 224×224 to 448×448.
  • Anchor boxes.
  • Multi-scale training.
  • Darknet-19 architecture with 19 convolutional layers and 5 max-pooling layers.
YOLOv2 performance on the MS COCO dataset
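The anchor-box feature listed above can be sketched as follows. In the YOLO9000 paper, the network predicts offsets relative to a grid cell and an anchor (prior) box rather than raw coordinates. The function name `decode_box` and the sample values are made up for illustration; this is a minimal sketch of the decoding equations, not the Darknet implementation.

```python
import math


def sigmoid(v: float) -> float:
    """Squash a raw value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(tx, ty, tw, th, cell_x, cell_y, anchor_w, anchor_h):
    """Turn raw network outputs into a box, measured in grid-cell units.

    The center is an offset inside the cell (kept in range by the sigmoid),
    and the size is the anchor size scaled by an exponential factor.
    """
    bx = cell_x + sigmoid(tx)
    by = cell_y + sigmoid(ty)
    bw = anchor_w * math.exp(tw)
    bh = anchor_h * math.exp(th)
    return bx, by, bw, bh


# All-zero outputs give a box centered in the cell with exactly the anchor size.
print(decode_box(0.0, 0.0, 0.0, 0.0, cell_x=3, cell_y=2, anchor_w=1.5, anchor_h=2.0))
# (3.5, 2.5, 1.5, 2.0)
```

Because the sigmoid keeps the center inside its own cell, each cell only has to learn small corrections to its anchors, which is part of why anchors make training more stable.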


One year later, on March 25, 2018, Joseph Redmon and Ali Farhadi came up with another version of YOLO and a research paper called “YOLOv3: An Incremental Improvement”.

Real-time animal detection with YOLOv3

At 320×320, YOLOv3 runs in 22 ms at 28.2 mAP, as shown in the video above. That is as accurate as SSD but three times faster, and at similar accuracy it is almost four times faster than RetinaNet.
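As a quick sanity check on these speed figures, the arithmetic can be written out in a few lines. The 51 ms and 198 ms timings used for the RetinaNet comparison come from the YOLOv3 paper's abstract rather than from this article, and the helper names are hypothetical.

```python
def fps_from_latency(ms_per_frame: float) -> float:
    """Convert per-frame latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


print(round(fps_from_latency(22), 1))  # 45.5
print(round(speedup(198, 51), 1))      # 3.9
```

So 22 ms per frame is roughly 45 fps, and 51 ms against 198 ms works out to a speedup of just under four.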

YOLOv3 followed the methodology of the previous version, YOLOv2 (YOLO9000). For this version, Redmon used the Darknet-53 architecture, a significantly improved backbone with 53 convolutional layers.

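Some of the new, improved features in YOLOv3 were:

  • Class predictions using independent logistic classifiers, so a box can carry more than one label
  • Predictions at three scales, using a concept similar to Feature Pyramid Networks (FPN)
  • Darknet-53 architecture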
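The class-prediction change in the list above is easiest to see next to a softmax. YOLOv3 scores each class with its own logistic (sigmoid) unit, so overlapping labels can both be active, whereas a softmax forces the classes to compete. The class names, logits, and 0.5 threshold below are made-up values for illustration only.

```python
import math


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def softmax(logits):
    """Classes compete: the scores always sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multilabel_classes(logits, names, threshold=0.5):
    """Independent logistic classifiers: every class above the threshold is kept."""
    return [name for name, v in zip(names, logits) if sigmoid(v) > threshold]


names = ["person", "woman", "car"]
logits = [2.0, 1.5, -3.0]
print(multilabel_classes(logits, names))  # ['person', 'woman']
```

With a softmax, "person" and "woman" would split the probability mass between them; with independent sigmoids both can score highly for the same box.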


As Redmon had stopped working on computer vision research, a new team of three developers released YOLOv4: Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. Alexey is the one who developed the Windows version of YOLO back in the day.

YOLOv4 runs twice as fast as EfficientDet with comparable performance, as shown in the diagram below, which was published in the YOLOv4 research paper.

MS COCO object detection

Some of the new features of YOLOv4 are:

  • Anyone with a 1080 Ti or 2080 Ti GPU can train and run the YOLOv4 model.
  • YOLOv4 includes modified CBN (Cross-Iteration Batch Normalization) and PAN (Path Aggregation Network) methods.
  • Weighted-Residual-Connections (WRC).
  • Cross-Stage-Partial connections (CSP), used in the backbone to enhance the CNN (convolutional neural network).
  • Self-adversarial training (SAT): a new data augmentation technique.
  • DropBlock regularization.
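Of the features above, DropBlock is simple enough to sketch in plain Python. Instead of dropping individual activations as ordinary dropout does, it zeroes whole square regions of a feature map. The version below is a simplified, single-channel sketch with hypothetical function names; it samples block corners rather than centers and is not the YOLOv4 implementation.

```python
import random


def dropblock_mask(size, block_size, keep_prob, rng):
    """Build a size x size mask of 1s with square blocks of 0s."""
    valid = size - block_size + 1  # positions where a whole block still fits
    # Block sampling rate chosen so that roughly (1 - keep_prob) of units drop.
    gamma = (1.0 - keep_prob) / block_size**2 * size**2 / valid**2
    mask = [[1] * size for _ in range(size)]
    for r in range(valid):
        for c in range(valid):
            if rng.random() < gamma:
                # Zero the whole block whose top-left corner is (r, c).
                for dr in range(block_size):
                    for dc in range(block_size):
                        mask[r + dr][c + dc] = 0
    return mask


def apply_dropblock(features, mask):
    """Zero the masked units and rescale the rest to preserve the mean activation."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    scale = total / kept if kept else 0.0
    return [[f * m * scale for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(features, mask)]


mask = dropblock_mask(size=8, block_size=3, keep_prob=0.9, rng=random.Random(0))
for row in mask:
    print("".join(str(v) for v in row))
```

Dropping contiguous regions matters for convolutional features because neighboring activations are strongly correlated, so removing single units hides very little information from the network.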


Shortly after the release of YOLOv4, on 27 May 2020, YOLOv5 was released by Glenn Jocher (founder and CEO of Ultralytics). It was publicly released on GitHub. Glenn introduced a PyTorch-based approach, and yes, YOLOv5 is written in the PyTorch framework.

It is the state-of-the-art and newest version of the YOLO object detection series, and with continuous effort from 58 open-source contributors, YOLOv5 set a high benchmark for object detection models; as shown below, it already beats EfficientDet and the previous YOLO versions.

YOLO comparison with EfficientDet

No official paper has been released yet, and there is some controversy about its name. Now let’s look at a coding example that was published with its code on GitHub for learning purposes.

Man riding a bicycle past a car in a driveway with an umbrella in the background.

PyTorch inference is fast enough that, even before YOLOv5 was released, many AI practitioners translated YOLOv3 and YOLOv4 weights into Ultralytics PyTorch weights.


We are going to walk through a starter tutorial on YOLOv5 by Ultralytics and detect some objects in a given image. Remember to change your runtime to GPU inside Colab. The full notebook is available here.

  1. First, clone the YOLOv5 repo from GitHub into the Google Colab environment using the command below.
!git clone  # clone repo
  2. Change into the repo directory and install the dependencies using pip.
%cd yolov5
%pip install -qr requirements.txt  # install dependencies
  3. Import torch and the IPython display helpers used to show the output image inside the notebook.
import torch
from IPython.display import Image, clear_output
  4. Download this custom image from here for testing.
Bear coming out from train
  5. Run the detection command below. It runs inference on a variety of sources and automatically downloads the latest model from here. Then display the saved result.
!python --weights --img 640 --conf 0.25 --source data/images/
Image(filename='runs/detect/exp5/1.jpg', width=600)


Object detection on an image using YOLOv5
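The hard-coded `exp5` in the display path is fragile, because the output folder name depends on how many times detection has already been run. The sketch below assumes that each run writes to a new folder named `exp`, `exp2`, `exp3`, and so on under `runs/detect`, which matches the path used in the tutorial; `latest_run_dir` is a hypothetical helper, not part of YOLOv5.

```python
import re
from pathlib import Path
from typing import Optional


def latest_run_dir(base: str = "runs/detect") -> Optional[Path]:
    """Return the highest-numbered exp, exp2, exp3, ... folder under base, if any."""
    def run_index(path: Path) -> int:
        # "exp" counts as run 1, "exp2" as run 2, and so on.
        digits = path.name[len("exp"):]
        return int(digits) if digits else 1

    base_path = Path(base)
    if not base_path.is_dir():
        return None
    runs = [p for p in base_path.iterdir()
            if p.is_dir() and re.fullmatch(r"exp\d*", p.name)]
    return max(runs, key=run_index, default=None)


run_dir = latest_run_dir()
if run_dir is not None:
    print(run_dir / "1.jpg")  # path to pass to Image(filename=...)
```

In the notebook, the resulting path can be passed to `Image(filename=str(run_dir / "1.jpg"), width=600)` in place of the fixed `exp5` path.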

You can also load the YOLOv5 model through PyTorch Hub.
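A minimal sketch of the PyTorch Hub route is shown below. The hub repository string `ultralytics/yolov5`, the model name `yolov5s`, and the sample image path are assumptions that do not appear in this article, and `detect_with_hub` is a hypothetical wrapper. The first call needs network access because it downloads the code and pretrained weights.

```python
def detect_with_hub(image, model_name: str = "yolov5s"):
    """Run YOLOv5 inference on one image via PyTorch Hub and return the results.

    `image` can be a file path, URL, PIL image, or NumPy array.
    """
    import torch  # imported lazily so this module loads even without PyTorch

    # Assumed hub repo and model name; fetched and cached on first use.
    model = torch.hub.load("ultralytics/yolov5", model_name, pretrained=True)
    return model(image)


# Example usage (requires PyTorch and network access):
# results = detect_with_hub("data/images/1.jpg")  # hypothetical image path
# results.print()  # summary of detected classes and timings
```

Compared with the command-line script, this keeps the detections in Python objects, which is convenient when the boxes feed into further processing.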


We have gone through the history of YOLO object detection models and walked through a simple tutorial to try out this architecture. It is fast and performs well, and many other tutorials are available online for going deeper into YOLOv5.

Mohit Maithani
Mohit is a Data & Technology Enthusiast with good exposure to solving real-world problems in various avenues of IT and Deep learning domain. He believes in solving human's daily problems with the help of technology.
