ByteDance Releases Multi-Object Tracking Library Named “ByteTrack”

The research team hopes that the high accuracy, fast speed and simplicity of ByteTrack can make it attractive and effective in real applications. 

ByteDance, a video and language analytics platform, recently announced the release of ByteTrack, its multi-object tracking library for estimating the bounding boxes and identities of objects in videos.

The library targets a common weakness of existing trackers: detection boxes with low scores, such as those of occluded objects, are simply thrown away, which causes a non-negligible number of true objects to go missing and fragments trajectories. ByteTrack uses a simple, effective and generic association method that tracks by associating every detection box instead of only the high-score ones. For the low-score detection boxes, it uses their similarities with tracklets to recover true objects and filter out background detections.

Image: ByteTrack Research Paper

ByteTrack uses an association method called BYTE. Unlike traditional methods, which keep only the high-score detection boxes, BYTE keeps every detection box and separates them into high-score and low-score ones. It first associates the high-score detection boxes with the tracklets. Some tracklets remain unmatched because they do not match an appropriate high-score detection box, which usually happens when occlusion, motion blur or a change in size occurs. BYTE then associates the low-score detection boxes with these unmatched tracklets to recover the objects in low-score detection boxes and filter out the background at the same time.
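The two-stage association described above can be sketched in a few lines of Python. This is a minimal illustration and not ByteTrack's actual code: the function names, the 0.6 score threshold and the 0.3 IoU threshold are assumptions made for the example, a simple greedy IoU matcher stands in for whatever assignment method the library uses, and the Kalman filter prediction step is left out.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: Dict[int, Box], dets: List[Box], min_iou: float):
    """Greedy IoU matching (a simplification chosen for this sketch).

    Returns (matches {track_id: det_index}, unmatched track ids,
    unmatched detection indices).
    """
    pairs = sorted(
        ((iou(tb, db), tid, j) for tid, tb in tracks.items() for j, db in enumerate(dets)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to count as a match
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    unmatched_tracks = [tid for tid in tracks if tid not in matches]
    unmatched_dets = [j for j in range(len(dets)) if j not in used]
    return matches, unmatched_tracks, unmatched_dets


def byte_associate(tracks, detections, high_thresh=0.6, min_iou=0.3):
    """Two-stage association. detections is a list of (box, score) pairs.

    high_thresh and min_iou are hypothetical values for illustration.
    """
    high = [box for box, score in detections if score >= high_thresh]
    low = [box for box, score in detections if score < high_thresh]

    # Stage 1: match high-score boxes to all tracklets.
    m1, rem_tracks, new_high = greedy_match(tracks, high, min_iou)
    updated = {tid: high[j] for tid, j in m1.items()}

    # Stage 2: match low-score boxes only to tracklets left unmatched.
    remaining = {tid: tracks[tid] for tid in rem_tracks}
    m2, lost, _ = greedy_match(remaining, low, min_iou)
    updated.update({tid: low[j] for tid, j in m2.items()})

    # Unmatched low-score boxes are treated as background and dropped;
    # unmatched high-score boxes become candidates for new tracks.
    new_tracks = [high[j] for j in new_high]
    return updated, lost, new_tracks


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [
    ((1, 1, 11, 11), 0.90),    # clear view of object 1
    ((21, 21, 31, 31), 0.20),  # object 2, low score (e.g. occluded)
    ((50, 50, 60, 60), 0.15),  # low-score background box
    ((80, 80, 90, 90), 0.95),  # newly appearing object
]
updated, lost, new_tracks = byte_associate(tracks, detections)
```

In this toy frame, track 2 is kept alive by its low-score box in the second stage, while the low-score box that overlaps no tracklet is discarded as background.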

Image: ByteTrack Research Paper

The input to BYTE is a video sequence V, along with an object detector and a Kalman filter. ByteTrack pairs the BYTE association method with a high-performance detector named YOLOX. YOLOX makes the YOLO series of detectors anchor-free and applies other advanced detection techniques, including decoupled heads, strong data augmentations such as Mosaic and Mixup, and the label assignment strategy SimOTA, to achieve state-of-the-art performance on object detection.

Image: ByteTrack Research Paper

ByteTrack was evaluated on the half validation set of MOT17 using different combinations of training data. When trained on only the half training set of MOT17, it achieves 75.8 MOTA, outperforming most methods; the researchers attribute this to strong augmentations such as Mosaic and Mixup. When CrowdHuman, CityPersons and ETHZ are added for training, it achieves 76.7 MOTA and 79.7 IDF1.
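For readers unfamiliar with the two metrics, the short sketch below shows how MOTA and IDF1 are commonly defined. The function names and the counts passed in are made up for illustration and are not taken from the ByteTrack evaluation; the functions return fractions, whereas the scores quoted above are on a 0 to 100 scale.

```python
def mota(false_negatives: int, false_positives: int, id_switches: int, gt_boxes: int) -> float:
    """Multi-object tracking accuracy: 1 - (FN + FP + IDSW) / GT."""
    return 1.0 - (false_negatives + false_positives + id_switches) / gt_boxes


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1 score: 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


# Hypothetical counts, chosen only to exercise the formulas.
example_mota = mota(false_negatives=150, false_positives=50, id_switches=10, gt_boxes=1000)
example_idf1 = idf1(idtp=800, idfp=100, idfn=200)
print(f"MOTA: {example_mota:.3f}, IDF1: {example_idf1:.3f}")
```

MOTA penalises missed objects, false alarms and identity switches together, while IDF1 focuses on how consistently each object keeps the same identity over time.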

ByteTrack is very robust to occlusion, owing to its accurate detection performance and its association of low-score detection boxes. The model also sheds light on how to make the best use of detection results to enhance multi-object tracking. The research team hopes that the high accuracy, fast speed and simplicity of ByteTrack can make it attractive and effective in real applications.

Victor Dey
Victor is an aspiring Data Scientist & is a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.
