Getting Started With Object Detection Using TensorFlow

Object detection is the task of classifying and locating objects in an image using a deep learning model. It is crucial in autonomous computer vision applications such as robot navigation, self-driving vehicles, sports analytics and virtual reality.

Locating objects is done mostly with bounding boxes. Instance segmentation masks and keypoints are also used to locate objects, either on their own or along with bounding boxes. A bounding box is a simple rectangle that bounds an object. The representation of bounding boxes is standardized for interchangeability and reproducibility of object detection datasets and models. One widely used bounding box format is the COCO format, introduced by the “Common Objects in Context” dataset, a huge collection of annotated images prepared for object detection. This format describes a bounding box with four parameters: [top_left_x, top_left_y, width, height].
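As a standalone illustration (not from the original article), converting a COCO-format box to corner coordinates is a few lines of arithmetic; the helper name below is hypothetical:

```python
def coco_to_corners(box):
    """Convert a COCO-format box [top_left_x, top_left_y, width, height]
    to corner format [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

# A 100x50 box whose top-left corner sits at (20, 30):
print(coco_to_corners([20, 30, 100, 50]))  # [20, 30, 120, 80]
```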



In this article, we discuss how to perform object detection with a pre-trained EfficientDet model using TensorFlow. Google’s EfficientDet is one of the well-known object detection models, trained on the popular COCO 2017 dataset. This dataset has around 160,000 images spanning 80 object classes. The EfficientDet model’s training checkpoints are available open source and can be readily used in custom detection models via transfer learning.

This article assumes that the readers understand the fundamentals of deep learning, computer vision, image segmentation, and transfer learning. Nevertheless, the following articles may instantly fulfill the prerequisites:

  1. Getting Started With Deep Learning Using TensorFlow Keras
  2. Getting Started With Computer Vision Using TensorFlow Keras
  3. Exploring Transfer Learning Using TensorFlow Keras
  4. Getting Started with Semantic Segmentation Using TensorFlow Keras

Let’s dive deeper into hands-on learning. 

Object Detection API

TensorFlow’s Object Detection API is a useful tool for pre-processing and post-processing data in object detection pipelines. Its visualization module is built on top of Matplotlib and can draw images along with their coloured bounding boxes, class labels, keypoints and instance segmentation masks with fine control. Here, we use this API to post-process the inference results and visualize them.

Download the API from its source repository.

!git clone --depth 1 https://github.com/tensorflow/models.git



This clone brings many TensorFlow models at once. Install Object Detection API and its dependencies using the following commands.

 sudo apt install -y protobuf-compiler
 # change directory
 cd models/research/
 protoc object_detection/protos/*.proto --python_out=.
 cp object_detection/packages/tf2/setup.py .
 # install dependencies
 python -m pip install . 


Create the environment 

Import necessary libraries, frameworks and modules.

 import matplotlib
 import matplotlib.pyplot as plt
 import cv2
 import numpy as np
 import tensorflow as tf
 import tensorflow_hub as hub

 from object_detection.utils import label_map_util
 from object_detection.utils import visualization_utils as viz_utils
 from object_detection.utils import ops as utils_ops
 %matplotlib inline 

Prepare EfficientDet Model

Load the pre-trained model with weights from the TensorFlow Hub.

 # EfficientDet-D0 from TensorFlow Hub (512 x 512 input)
 model_url = 'https://tfhub.dev/tensorflow/efficientdet/d0/1'
 efficientdet = hub.load(model_url) 

The model is completely ready for deployment or inference. TensorFlow Hub has a great collection of ready-to-deploy pre-trained models. Models and their checkpoints can be loaded with a single line of code. 

Prepare some Data for Inference

Download some image data to perform inference. The following data source contains open-source images, each containing multiple objects suitable for detection. Clone the source and download the data onto the local (or virtual) machine.

!git clone



Read the images and save them in the required format in a list. EfficientDet receives input images in the shape [1, 512, 512, 3]. It does not support batching; it processes images one at a time. The shape supported by our version is 512 by 512 pixels; other versions of EfficientDet support 640 by 640, 768 by 768, 1024 by 1024, and so on. Each image should have 3 colour channels. A grayscale image must first be converted to this shape.
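A grayscale image can be brought to the required shape by repeating its single channel three times and adding a batch dimension; a minimal NumPy sketch with a synthetic image (not part of the original article):

```python
import numpy as np

# A hypothetical 512x512 grayscale image with pixel values in 0..255
gray = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)

# Repeat the single channel to obtain 3 colour channels ...
rgb = np.stack([gray, gray, gray], axis=-1)   # shape (512, 512, 3)

# ... and add a leading batch dimension of 1
batched = rgb[np.newaxis, ...]                # shape (1, 512, 512, 3)

print(batched.shape)  # (1, 512, 512, 3)
```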

 images = []
 # Read 10 images from the downloaded dataset
 for i in range(1,11):
     path = './dataset/Images/%03d.jpg'%i
     img = cv2.imread(path)
     # cv2 reads images in BGR format
     # convert BGR into RGB
     img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
     # EfficientDet expects 512 by 512
     img = tf.image.resize(img, (512,512))
     # EfficientDet expects uint8
     img = tf.cast(img, tf.uint8)
     # EfficientDet expects [1,512,512,3]
     img = tf.expand_dims(img, axis=0)
     # collect the pre-processed image
     images.append(img)

Check the image shape. Each entry in the images list should now have the shape [1, 512, 512, 3].



Sample an image and visualize it.

 img = images[0].numpy().reshape(512,512,3)
 plt.imshow(img)
 plt.show() 

The pixel values range from 0 to 255. EfficientDet expects images that are not scaled or normalized. Hence, our data is ready for inference.
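A quick sanity check on the value range, sketched here with a mock image rather than the downloaded data:

```python
import numpy as np

# A hypothetical pre-processed image: raw uint8 pixels, no scaling applied
img = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)

# EfficientDet expects raw uint8 pixels: no /255 scaling, no mean subtraction
assert img.dtype == np.uint8
assert img.min() >= 0 and img.max() <= 255
print('ready for inference')
```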

Visualize all our test images. There are 10 images in our test data.

 for i in range(10):
     img = images[i].numpy().reshape(512,512,3)
     plt.imshow(img)
     plt.show() 



Inference – Object Detection

Perform inference with the EfficientDet Model on the pre-processed image data.

 results = []
 # infer and save results in a list
 for i in range(10):
     res = efficientdet(images[i])
     results.append(res)
 # what results do we obtain?
 print(results[0].keys())


Out of these results, we are interested in detection_boxes, detection_classes, and detection_scores, which are required to visualize the detections.
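As a standalone illustration (with a mock result in the same general layout, not the model’s actual output), detections are typically filtered by score before visualization:

```python
import numpy as np

# A mock result: one batch entry, boxes in [y_min, x_min, y_max, x_max] order
result = {
    'detection_boxes':   np.array([[[0.1, 0.1, 0.5, 0.5],
                                    [0.2, 0.3, 0.9, 0.8],
                                    [0.0, 0.0, 0.1, 0.1]]]),
    'detection_classes': np.array([[1.0, 18.0, 3.0]]),
    'detection_scores':  np.array([[0.95, 0.62, 0.08]]),
}

# Keep only detections scoring above a threshold
keep = result['detection_scores'][0] >= 0.4
boxes   = result['detection_boxes'][0][keep]
classes = result['detection_classes'][0][keep].astype(int)
print(classes.tolist())  # [1, 18]
```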

During inference, detected objects are reported as class numbers (integers). The following helper builds a mapping from these class numbers back to the original class names of the dataset the model was trained on.

 label = './models/research/object_detection/data/mscoco_label_map.pbtxt'
 category = label_map_util.create_category_index_from_labelmap(label,
                                                use_display_name=True)
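The category index is a plain dictionary keyed by class id; a mock with a few COCO entries (illustrative values, not read from the label map file):

```python
# Structure mirrors what create_category_index_from_labelmap returns
category = {
    1:  {'id': 1,  'name': 'person'},
    2:  {'id': 2,  'name': 'bicycle'},
    18: {'id': 18, 'name': 'dog'},
}

# Look up the display name for a detected class id
class_id = 18
print(category[class_id]['name'])  # dog
```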

Visualize Object Detection Results

Define a helper function that displays the input images and results on top of them. The bounding boxes, locations, class name, and colour are extracted from the results and displayed as images. 

 def display_detections(image, result):
     result = {key:val.numpy() for key,val in result.items()}
     viz_utils.visualize_boxes_and_labels_on_image_array(
         image,
         result['detection_boxes'][0],
         result['detection_classes'][0].astype(int),
         result['detection_scores'][0],
         category,
         use_normalized_coordinates=True,
         min_score_thresh=0.4)
     plt.imshow(image)
     plt.show()

Display the input images along with inferences made on them.

 for i in range(10):
     img = images[i].numpy().copy()[0]
     res = results[i]
     display_detections(img, res) 



This notebook contains the above code implementation.

Wrapping Up

In this article, we have discussed object detection and its standard data formats. We have walked through an object detection implementation with the well-known EfficientDet model pre-trained on the COCO 2017 dataset. We have learnt to perform object detection by loading a pre-trained model and its checkpoints, inferencing test images, post-processing the results and visualizing the detections with bounding boxes. 

Interested readers can choose a different version of the EfficientDet model or a different model (such as CenterNet, Faster R-CNN, Mask R-CNN and SSD), preprocess the data according to the model’s requirements and perform inference on their own data.



Rajkumar Lakshmanamoorthy
A geek in Machine Learning with a Master's degree in Engineering and a passion for writing and exploring new things. Loves reading novels, cooking, practicing martial arts, and occasionally writing novels and poems.
