According to researchers at Google, autonomous driving technology is expected to enable a wide range of applications that have the potential to save many human lives, ranging from robotaxis to self-driving trucks. As data is the fuel behind any intelligent machine, open-sourcing landmark and self-driving datasets not only accelerates the progress of autonomous driving research but also drives progress in machine perception tasks.
In this article, we list the top five autonomous driving datasets that were open-sourced at CVPR 2020, the popular computer vision conference.
(The list is in alphabetical order)
1| BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning
About: BDD100K is a new, diverse, and large-scale dataset of visual driving scenes. Researchers from UC Berkeley and other institutions collected and annotated the largest available driving dataset, consisting of over 100K diverse video clips.
The benchmark comprises ten tasks: image tagging, lane detection, drivable area segmentation, road object detection, semantic segmentation, instance segmentation, multi-object detection tracking, multi-object segmentation tracking, domain adaptation, and imitation learning. BDD100K covers more realistic driving scenarios and captures more of the “long-tail” of appearance variation and pose configuration of the categories of interest across diverse environmental domains.
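To give a feel for how per-frame annotations in a dataset like this are consumed, here is a minimal sketch that parses a simplified JSON label record and tallies objects per category. The field names (`labels`, `category`, `box2d`) are assumptions in the spirit of BDD100K's released label files, not the exact schema, and the record itself is made up for illustration.

```python
import json
from collections import Counter

# Illustrative, simplified label record for one annotated frame.
# Field names are assumptions, not the exact BDD100K schema.
label_record = json.loads("""
{
  "name": "example_frame",
  "labels": [
    {"category": "car", "box2d": {"x1": 100.0, "y1": 200.0, "x2": 180.0, "y2": 260.0}},
    {"category": "car", "box2d": {"x1": 300.0, "y1": 210.0, "x2": 420.0, "y2": 300.0}},
    {"category": "pedestrian", "box2d": {"x1": 50.0, "y1": 150.0, "x2": 70.0, "y2": 220.0}}
  ]
}
""")

def count_categories(record):
    """Count annotated objects per category in one frame."""
    return Counter(obj["category"] for obj in record["labels"])

counts = count_categories(label_record)
print(counts["car"])         # 2
print(counts["pedestrian"])  # 1
```

Such per-category counts are exactly what exposes the “long-tail” the authors emphasise: a handful of classes (cars) dominate, while rarer ones (riders, trains) appear far less often.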
Get the dataset here.
2| Google Landmarks Dataset v2 – A Large-Scale Benchmark for Instance-Level Recognition and Retrieval
About: Researchers from Google Research introduced the Google Landmarks Dataset v2 (GLDv2), which is a new benchmark for large-scale, fine-grained instance recognition and image retrieval in the domain of human-made and natural landmarks.
According to the researchers, GLDv2 is the largest such dataset to date by a large margin, including over 5M images and 200k distinct instance labels. Its test set consists of 118k images with ground-truth annotations for both the retrieval and recognition tasks. The dataset is claimed to have several challenging properties inspired by real-world applications that previous datasets did not consider: an extremely long-tailed class distribution, a large fraction of out-of-domain test photos, and large intra-class variability.
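The long-tailed class distribution can be made concrete with a small sketch. GLDv2's training labels are distributed as CSV files mapping each image to a `landmark_id`; the snippet below computes the per-landmark image counts from a toy stand-in (the rows are invented for illustration, not real GLDv2 data).

```python
import csv
import io
from collections import Counter

# Toy stand-in for a GLDv2-style train CSV mapping images to landmark ids.
# The rows are made up for illustration.
toy_csv = """id,landmark_id
a1,0
a2,0
a3,0
a4,1
a5,2
"""

counts = Counter(row["landmark_id"] for row in csv.DictReader(io.StringIO(toy_csv)))

# In a long-tailed distribution, a few landmarks hold most of the images
# while most landmarks have only a handful.
most_common = counts.most_common(1)[0]
print(most_common)  # ('0', 3)
```

At GLDv2 scale the same computation reveals the imbalance the authors highlight: popular landmarks have thousands of images while many classes have fewer than ten.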
Get the dataset here.
3| Mapillary Street-Level Sequences: A Dataset for Lifelong Place Recognition
About: Mapillary Street-Level Sequences (MSLS) is a large dataset for urban and suburban place recognition from image sequences. According to the researchers, lifelong place recognition is a crucial and challenging task in computer vision, with vast applications in robust localisation and efficient large-scale 3D reconstruction. The dataset is designed to exhibit the diversity of true lifelong learning.
The MSLS dataset contains more than 1.6 million images extracted from the Mapillary collaborative mapping platform. It features images from 30 major cities across six continents, captured by hundreds of distinct cameras from different viewpoints and at different times, spanning all seasons over a period of nine years. All images are geo-located with GPS and compass readings and carry high-level attributes such as road type.
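Because every image is GPS-tagged, place-recognition benchmarks typically define a "same place" match by the great-circle distance between two fixes. The sketch below shows the standard haversine computation; the ~25 m positive threshold is a common convention in place-recognition evaluation, not necessarily the exact MSLS protocol.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS fixes."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Two nearby fixes a fraction of a street apart: well inside a ~25 m
# positive-match threshold (threshold value is an assumption).
d = haversine_m(55.6761, 12.5683, 55.6762, 12.5684)
print(d < 25.0)  # True
```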
Get the dataset here.
4| nuScenes: A Multimodal Dataset for Autonomous Driving
About: nuTonomy scenes, or nuScenes, is a large-scale public dataset for autonomous driving. It enables researchers to study challenging urban driving situations using the full sensor suite of a real self-driving car.
According to the researchers, this is the first dataset to carry the full autonomous vehicle sensor suite: six cameras, five radars, and one lidar, all with a 360-degree field of view. nuScenes comprises 1,000 scenes, each 20 seconds long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes.
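A 3D bounding box of the kind annotated here is typically parameterised by a centre, a size, and a heading (yaw) angle. The sketch below derives the four ground-plane corners of such a box in bird's-eye view; it is a generic geometric illustration, not the nuScenes devkit API.

```python
import math

def bev_corners(cx, cy, length, width, yaw):
    """Four ground-plane corners of a yaw-rotated 3D box, in bird's-eye view.

    Generic centre/size/heading parameterisation, sketched for
    illustration; not taken from the nuScenes devkit.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    half = [( length / 2,  width / 2), ( length / 2, -width / 2),
            (-length / 2, -width / 2), (-length / 2,  width / 2)]
    # Rotate each local corner offset by yaw, then translate to the centre.
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy) for dx, dy in half]

# A 4 m x 2 m car-sized box at the origin, facing along +x.
corners = bev_corners(0.0, 0.0, 4.0, 2.0, 0.0)
print(corners)  # [(2.0, 1.0), (2.0, -1.0), (-2.0, -1.0), (-2.0, 1.0)]
```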
Get the dataset here.
5| Scalability in Perception for Autonomous Driving: Waymo Open Dataset
About: Researchers from Waymo (formerly the Google self-driving car project) introduced a new large-scale, high-quality, and diverse dataset consisting of 1,150 scenes (each 20 seconds long) of well-synchronised and calibrated high-quality LiDAR and camera data, captured across a range of urban and suburban geographies. The dataset is claimed to be 15x more diverse than the largest existing camera+LiDAR dataset, based on the proposed geographical coverage metric.
The dataset is further annotated with 2D (camera image) and 3D (LiDAR) bounding boxes with consistent identifiers across frames, and the researchers provide strong baselines for both 2D and 3D detection and tracking tasks.
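The "consistent identifiers across frames" are what make tracking baselines possible: a detection in frame t+1 inherits the identifier of the box it overlaps in frame t. The sketch below shows the simplest greedy version of this idea using intersection-over-union; real trackers and the Waymo evaluation metrics are considerably more involved, and the 0.5 threshold is an assumption for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned 2D boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Greedy sketch: a new detection keeps an existing track id when it
# overlaps that track's previous box strongly enough.
prev = {7: (0.0, 0.0, 10.0, 10.0)}    # track id -> box in frame t
new_boxes = [(1.0, 1.0, 11.0, 11.0)]  # detections in frame t+1

assigned = {}
for box in new_boxes:
    best = max(prev, key=lambda tid: iou(prev[tid], box))
    if iou(prev[best], box) >= 0.5:   # assumed threshold
        assigned[best] = box

print(sorted(assigned))  # [7]
```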
Get the dataset here.