Top 10 Popular Datasets For Autonomous Driving Projects

Over the past few years, organisations have been investing heavily in the autonomous driving space, as the technology is expected to reshape transport networks for the better. According to reports, the global autonomous vehicle market is expected to grow at a CAGR of 62.86% to reach $41.24 billion by 2024.

In this article, we list ten popular datasets for autonomous driving projects.

The list is in alphabetical order.

1| Astyx Dataset HiRes2019 

The Astyx Dataset HiRes2019 is a popular automotive radar dataset for deep learning-based 3D object detection. The motive behind open-sourcing this dataset is to provide high-resolution radar data to the research community, facilitating and stimulating research on algorithms using radar sensor data. The dataset is a radar-centric automotive dataset based on radar, lidar and camera data for 3D object detection. The size of the dataset is more than 350 MB, and it consists of 546 frames.

Download here.

2| Berkeley DeepDrive

The Berkeley DeepDrive dataset by UC Berkeley comprises over 100K video sequences with diverse kinds of annotations, including image-level tagging, object bounding boxes, drivable areas, lane markings, and full-frame instance segmentation. The dataset offers geographic, environmental, and weather diversity, which is useful for training models that are less likely to be surprised by new conditions.

Download here.

3| Landmarks

Google open-sourced this dataset for recognising human-made and natural landmarks. The dataset was released as part of the Landmark Recognition and Landmark Retrieval Kaggle challenges in 2018. It contains more than 2 million images depicting 30 thousand unique landmarks from across the world (their geographic distribution is presented below), a number of classes that is ~30x larger than what is available in commonly used datasets.

Download here.

4| Landmarks-v2

After the release of the Landmarks dataset in 2018, Google released the Google Landmarks-v2 dataset in 2019. This landmark recognition dataset is larger and considerably more diverse than the original Landmarks dataset. It includes over 5 million images (2x that of the first release) of more than 200 thousand different landmarks (an increase of 7x).
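
The growth multiples quoted above can be checked against the image and landmark counts given for the two releases. The sketch below is only a rough check: it treats "more than 2 million" and "over 5 million" as exact values, which is an assumption, so the ratios are approximate and differ slightly from the rounded "2x" and "7x" in the text. The constant names are illustrative and do not come from any dataset tooling.

```python
# Approximate counts quoted in this article (treated as exact for illustration).
V1_IMAGES, V1_LANDMARKS = 2_000_000, 30_000      # Landmarks (2018)
V2_IMAGES, V2_LANDMARKS = 5_000_000, 200_000     # Landmarks-v2 (2019)

# Growth factor = new count / old count.
image_ratio = V2_IMAGES / V1_IMAGES
landmark_ratio = V2_LANDMARKS / V1_LANDMARKS

print(f"images: {image_ratio:.1f}x, landmarks: {landmark_ratio:.1f}x")
```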

Download here.

5| Level 5

The ride-sharing company Lyft open-sourced the Level 5 dataset. Level 5 is a comprehensive, large-scale dataset featuring the raw camera and LiDAR sensor inputs as perceived by a fleet of multiple high-end autonomous vehicles in a restricted geographic area. The dataset also includes high-quality, human-labelled 3D bounding boxes of traffic agents and an underlying HD spatial semantic map.

Download here.

6| nuScenes Dataset

nuScenes is a large-scale public dataset for autonomous driving. The dataset enables researchers to study urban driving situations using the full sensor suite of a real self-driving car. The dataset features 1,400,000 camera images, 390,000 lidar sweeps, detailed map information, a full sensor suite (1x LIDAR, 5x RADAR, 6x camera, IMU, GPS), manual annotations for 23 object classes, and more.

Download here.

7| Open Images V5

Open Images V5 is a dataset consisting of more than nine million images annotated with labels spanning thousands of object categories. The Open Images V5 dataset features segmentation masks for 2.8 million object instances in 350 categories. The dataset includes 2.68M segmentation masks on the training set and 99k masks on the validation and test sets, as well as 36.5M image-level labels with over 20k categories.
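
The mask counts above are consistent with each other: the training-set masks plus the validation and test masks add up to roughly the 2.8 million total. A minimal sketch of that check, using only the rounded figures quoted in this article (the variable names are illustrative):

```python
# Rounded mask counts quoted in this article.
train_masks = 2_680_000      # training set
val_test_masks = 99_000      # validation and test sets combined

total_masks = train_masks + val_test_masks
total_in_millions = total_masks / 1e6

# Rounded to one decimal place, this matches the "2.8 million" headline figure.
print(f"{total_masks:,} masks, about {total_in_millions:.1f}M")
```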

Download here.

8| Oxford Radar RobotCar Dataset

The Oxford RobotCar dataset comprises over 100 repetitions of a consistent route through Oxford, UK, captured over a period of more than a year. The dataset captures many different combinations of weather, traffic, and pedestrians, along with longer-term changes such as construction and roadworks.

Download here.

9| Pandaset

Pandaset is one of the popular large-scale datasets for autonomous driving. This dataset enables researchers to study self-driving and aims to promote advanced research and development in autonomous driving and machine learning. The dataset features 60k camera images, 20k LiDAR sweeps, 28 annotation classes, 37 segmentation labels and much more.

Download here.

10| Waymo Open Dataset

The Waymo Open dataset is an open-sourced, high-quality multimodal sensor dataset for autonomous driving. The dataset is extracted from Waymo self-driving vehicles and covers a wide variety of environments, from dense urban centres to suburban landscapes. The data was collected under a range of conditions and times of day, including sunshine, rain, day, night, dawn and dusk. It contains 1,000 segments, each capturing 20 seconds of continuous driving, corresponding to 200,000 frames at 10 Hz per sensor.
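
The 200,000-frame figure follows directly from the segment count, segment length and sampling rate given above. The short sketch below works through that arithmetic; the function and constant names are illustrative and are not part of any Waymo tooling.

```python
# Figures quoted in this article for the Waymo Open Dataset.
NUM_SEGMENTS = 1000       # number of driving segments
SEGMENT_SECONDS = 20      # length of each segment in seconds
SAMPLE_RATE_HZ = 10       # frames captured per second, per sensor


def total_frames(num_segments, segment_seconds, sample_rate_hz):
    """Frames per sensor = segments x seconds per segment x frames per second."""
    return num_segments * segment_seconds * sample_rate_hz


frames_per_segment = SEGMENT_SECONDS * SAMPLE_RATE_HZ
frames_total = total_frames(NUM_SEGMENTS, SEGMENT_SECONDS, SAMPLE_RATE_HZ)

print(f"{frames_per_segment} frames per segment, {frames_total:,} frames per sensor")
```

Each segment therefore contributes 200 frames per sensor, and 1,000 segments give the 200,000 total.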

Download here.

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.