What is GSDT: GNNs for Simultaneous Detection and Tracking


Multi-Object Tracking (MOT) is the detection and tracking of multiple moving objects simultaneously in a dynamic environment. It has important applications in autonomous vehicles, robot navigation, security surveillance, medical imaging and sports analysis. Multi-object tracking involves two key tasks: object detection and data association. Object detection is performed by a neural network that locates the objects of interest in a frame, whereas data association is performed by a temporally aware neural network that finds correspondences between instances of the same object in different frames. Traditional multi-object tracking approaches train the object detection network and the data association network separately, optimizing each for its own part of the job. This strategy fails to handle object detection and data association end to end, even though the two tasks depend on each other, and that limits how far performance can improve.

A few recent approaches introduced joint multi-object tracking to tackle this problem. Some tracked each object individually and independently, which simplified data association but introduced a new problem: by tracking objects in isolation, they ignored object-object relationships, which are crucial for identifying relative patterns among objects. Other approaches did model object-object relationships, but they required the object detector to be trained separately.

To this end, Yongxin Wang, Kris Kitani and Xinshuo Weng of the Robotics Institute, Carnegie Mellon University, have developed an end-to-end trainable joint multi-object tracking architecture based on Graph Neural Networks (GNNs), named GSDT, short for GNNs for Simultaneous Detection and Tracking. GSDT models object-object relationships for both data association and object detection. Because it follows the joint multi-object tracking strategy, it can be trained and optimized as a whole. It employs Graph Neural Networks to obtain more discriminative features. At the time of its publication, the model achieved state-of-the-art results on several public multi-object tracking benchmarks, including MOT15, MOT16, MOT17 and MOT20.

A sample multi-object tracking on MOT20 dataset using the GSDT model (source)

How GSDT differs from competing models

The training strategy of GSDT compared to the previous works (source)

In GSDT, two images from successive frames and the tracklets from the previous frame are given to the model as inputs. With these inputs, the model detects the objects in the current frame and associates them with the tracklets of the previous frame. Through this association, the model decides at each frame whether to continue a specific tracklet, discontinue it, or initiate a new tracklet.
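The per-frame decision described above (continue a tracklet, discontinue it, or initiate a new one) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not GSDT's actual code: the `Tracklet` class, the `update_tracklets` function, the greedy cosine-similarity matching, and the `sim_threshold` and `max_age` parameters are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Tracklet:
    # All fields are hypothetical and chosen for illustration only.
    track_id: int
    feature: list      # appearance embedding of the last matched detection
    missed: int = 0    # consecutive frames without a matching detection


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def update_tracklets(tracklets, detections, next_id,
                     sim_threshold=0.5, max_age=2):
    """Greedily associate detection features with existing tracklets.

    Tracklets are updated in place. Returns (surviving tracklets, next unused id).
    """
    # Score every tracklet/detection pair and visit the best pairs first.
    pairs = sorted(
        ((cosine(t.feature, d), ti, di)
         for ti, t in enumerate(tracklets)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets = set(), set()
    for sim, ti, di in pairs:
        if sim < sim_threshold:
            break  # all remaining pairs are too dissimilar
        if ti in used_tracks or di in used_dets:
            continue
        # Continue the tracklet with the matched detection.
        tracklets[ti].feature = detections[di]
        tracklets[ti].missed = 0
        used_tracks.add(ti)
        used_dets.add(di)

    survivors = []
    for ti, t in enumerate(tracklets):
        if ti not in used_tracks:
            t.missed += 1
        # Discontinue tracklets that have gone unmatched for too long.
        if t.missed <= max_age:
            survivors.append(t)

    # Initiate a new tracklet for every unmatched detection.
    for di, d in enumerate(detections):
        if di not in used_dets:
            survivors.append(Tracklet(next_id, d))
            next_id += 1
    return survivors, next_id
```

Greedy matching is used here only to keep the sketch short; a practical tracker would more likely solve the assignment optimally, for example with the Hungarian algorithm.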

An overview of the GSDT Architecture (source)

GSDT uses an object detector and a re-identification module to detect multiple objects and associate them simultaneously. In addition, graph neural networks are used to extract and learn features that improve both object detection and data association performance. In short, the GSDT architecture is composed of four modules: a GNN-based feature extraction module, a node feature aggregation module, an object detection module and a data association module.

Functional overview of node feature aggregation (source)
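As a rough illustration of what node feature aggregation means in a graph neural network, the sketch below updates each node's feature by averaging it with the mean of its neighbours' features. It is a generic, single-round message-passing step in plain Python, not GSDT's actual layer: the function name `aggregate_node_features`, the edge-list format and the equal-weight averaging are all assumptions. In a tracking setting, the nodes might stand for tracklets and candidate detections.

```python
def aggregate_node_features(features, edges):
    """One round of mean-neighbour aggregation on an undirected graph.

    features: list of equal-length feature vectors, one per node
    edges: list of (i, j) node-index pairs, each treated as undirected
    Returns a new list in which each node's feature is the average of its
    own feature and the mean of its neighbours' features. Nodes without
    neighbours keep their feature unchanged.
    """
    neighbours = [[] for _ in features]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    updated = []
    for i, feat in enumerate(features):
        if not neighbours[i]:
            updated.append(list(feat))
            continue
        count = len(neighbours[i])
        # Mean of the neighbours' features, computed per dimension.
        mean = [sum(features[n][k] for n in neighbours[i]) / count
                for k in range(len(feat))]
        # Combine the node's own feature with the aggregated message.
        updated.append([(feat[k] + mean[k]) / 2 for k in range(len(feat))])
    return updated


# Node 0 plays the role of a tracklet; nodes 1 and 2 are candidate detections.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (0, 2)]
new_features = aggregate_node_features(features, edges)
```

After one round, each node's feature carries information about the nodes it is connected to, which is the general idea behind using a graph to obtain more discriminative features.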

Python implementation of GSDT

  1. GSDT requires a PyTorch environment with a CUDA-enabled GPU runtime. Download the source code from the official repository.
!git clone

  2. Change to the downloaded GSDT directory and list its contents.
%cd /content/GSDT/
!ls -p

  3. GSDT works well with the Anaconda 3 distribution. Download and install it if the local machine does not already have conda.

  4. Create a conda environment for GSDT. From conda's base environment, run the following command.
conda create -n dev python=3.6

  5. Activate the development environment with the following command, and run all remaining steps inside it.
conda activate dev

  6. Install the dependencies listed in requirements.txt with pip (the -r flag tells pip to read the package list from a requirements file).
pip install -r requirements.txt

  7. Install PyTorch 1.7.0, which is compatible with CUDA 10.2. Anaconda does not ship the CUDA toolkit itself; the PyTorch package bundles the CUDA runtime libraries it needs, but a compatible NVIDIA driver must already be installed.
pip install torch==1.7.0

  8. Install the PyTorch Geometric package.
CUDA_version=cu102

  9. Build Deformable Convolutional Networks version 2 (DCNv2) from source by running the following commands in succession.
cd ./src/lib/models/networks/DCNv2

  10. Download the datasets from the MOT15 and MOT20 challenges. Once the datasets are ready, the following commands generate the labels corresponding to the objects.
cd src

  11. Download the pre-trained models and weights for the MOT15 and MOT20 datasets and move them to /content/GSDT/experiments. Run a sample evaluation on two frames from each dataset using the following commands in succession.
cd ./experiments model_mot15 model_mot20
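
Before running the evaluation, it can help to confirm that the environment matches the setup steps above. The following script is a convenience sketch and not part of the GSDT repository: the helper name `check_environment` is hypothetical, and it relies only on the standard library plus, when PyTorch is installed, `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib
import sys


def check_environment(packages=("torch", "torch_geometric")):
    """Report the Python version, package versions and CUDA availability.

    A package that cannot be imported is reported as None.
    """
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except Exception:
            # Covers missing packages as well as broken native libraries.
            report[name] = None

    # Only query CUDA if torch was actually imported above.
    torch = sys.modules.get("torch")
    report["cuda_available"] = bool(torch and torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False inside the `dev` environment, the GPU driver or the installed PyTorch build is the first thing to check.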

Performance of GSDT

GSDT has been evaluated on the open challenges MOT15, MOT16, MOT17 and MOT20. Its authors submitted the model to the official MOT Challenge leaderboard, where it is compared with competing models. Models are evaluated on several standard metrics, including MOTA (multi-object tracking accuracy), IDF1 (identification F1 score), MT (mostly tracked trajectories), ML (mostly lost trajectories) and IDS (identity switches).
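
Two of these metrics reduce to simple formulas over error counts. The sketch below follows the standard definitions, MOTA = 1 - (FN + FP + IDS) / GT and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN); the function names are chosen for this example, and the counts in the usage lines are made-up numbers, not GSDT results.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-object tracking accuracy over a whole sequence.

    num_gt is the total number of ground-truth objects across all frames.
    The value can be negative when the errors outnumber the ground truth.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identification F1 score from identity-level TP, FP and FN counts."""
    denominator = 2 * idtp + idfp + idfn
    return 2 * idtp / denominator if denominator else 0.0


# Made-up counts for illustration only.
example_mota = mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100)
example_idf1 = idf1(idtp=80, idfp=10, idfn=30)
```

MOTA mainly reflects detection quality, while IDF1 rewards keeping the same identity on the same object over time, which is why leaderboards usually report both.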

A sample multi-object tracking on MOT20 dataset using the GSDT model (source)
A sample multi-object tracking on MOT17 dataset using the GSDT model (source)

GSDT outperforms most of the well-known competing models, including DMT, LIF_TsimInt, MDP_SubCNN, CDA_DDAL, MPNTrack, EAMTT, AP_HWDPL, NOMTwSDP, RAR15, Tube_TK, CTrackerV1, CTTrack17, SORT20 and POI. At the time of its publication, GSDT was recognized as the state of the art in the MOT Challenge.

A sample multi-object tracking on MOT17 dataset using the GSDT model (source)

Copyright Analytics India Magazine Pvt Ltd
