The Tech Behind Uber’s Bet On Self-Driving Cars

For the first time, ride-hailing company Uber has opened up about what goes on under the hood of the machine learning infrastructure and version control platform that its Advanced Technologies Group (ATG) uses for autonomous vehicles. ATG researches and develops self-driving vehicles and deploys machine learning models into the cars.

The self-driving division at Uber has more than 450 employees who have been working on autonomous vehicle technology for several years now. Recently, the self-driving team developed VerCD, a set of tools and microservices that supports the ML workflow. The team also discussed the self-driving vehicle components that use machine learning models, as well as the machine learning model life cycle.

Let’s take a deep dive into the VerCD platform along with the life cycle of machine learning models at Uber. 

Behind VerCD

The self-driving researchers at Uber developed VerCD to provide versioning and continuous delivery of all machine learning code and artefacts for Uber ATG’s self-driving vehicle software. VerCD provides daily integration tests for many of the dataset building, model training, and metrics pipelines at Uber ATG.

Throughout the workflow, this platform tracks the dependencies between code, datasets, and models. VerCD also orchestrates the creation of these ML artefacts, making it a critical part of the process. Specifically, the workflows that are covered by VerCD start with the dataset extraction stage, cover model training, and conclude with computing metrics.
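To make the idea of versioning artefacts together with their dependencies concrete, here is a minimal sketch. It is not VerCD's actual scheme, which the article does not describe. The function `artefact_version` and the hashing approach are assumptions made for illustration: a version is derived from an artefact's own content plus the versions of everything it depends on, so a change upstream yields a new version downstream.

```python
import hashlib


def artefact_version(name, content, dependency_versions):
    """Derive a short version ID from an artefact and its dependencies.

    Hypothetical helper, not part of VerCD. `content` is bytes (for
    example source code or a dataset manifest) and `dependency_versions`
    is a list of version IDs of upstream artefacts.
    """
    digest = hashlib.sha256()
    digest.update(name.encode("utf-8") + b"\0")
    digest.update(content + b"\0")
    # Sort so the result does not depend on the order dependencies are listed.
    for dep in sorted(dependency_versions):
        digest.update(dep.encode("utf-8") + b"\0")
    return digest.hexdigest()[:12]


# Hypothetical artefacts: a dataset and a model trained on it.
dataset_v1 = artefact_version("dataset", b"extraction-config-1", [])
model_v1 = artefact_version("model", b"train.py", [dataset_v1])

# Changing only the dataset also changes the model's version.
dataset_v2 = artefact_version("dataset", b"extraction-config-2", [])
model_v2 = artefact_version("model", b"train.py", [dataset_v2])
```

In this sketch the training code is identical in both cases, yet `model_v1` and `model_v2` differ because the dataset they depend on differs.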


Unlike traditional version control and continuous delivery systems, VerCD tracks all dependencies of each ML component, which often include data and model artefacts in addition to code. VerCD's metadata service tracks these dependency graphs, and a continuous integration orchestrator uses it to run entire ML workflow pipelines on a regular basis to produce datasets, trained models, and metrics.
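The dependency-graph idea can be sketched in a few lines of Python. This is an illustrative sketch only, not Uber's implementation: the artefact names in `DEPENDENCIES` and the helper functions are hypothetical, and the graph handling uses Python's standard `graphlib` module (Python 3.9 or later). It shows how an orchestrator could decide what order to build artefacts in and what becomes stale after a change.

```python
from graphlib import TopologicalSorter

# Hypothetical artefacts; each key depends on the artefacts in its set.
DEPENDENCIES = {
    "dataset": {"extraction_code"},
    "model": {"dataset", "training_code"},
    "metrics": {"model", "metrics_code"},
}


def build_order(dependencies):
    """Return an order in which every artefact follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def downstream_of(changed, dependencies):
    """Return every artefact that needs rebuilding when `changed` changes."""
    stale = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for artefact, deps in dependencies.items():
            if current in deps and artefact not in stale:
                stale.add(artefact)
                frontier.append(artefact)
    return stale


if __name__ == "__main__":
    print(build_order(DEPENDENCIES))
    print(sorted(downstream_of("training_code", DEPENDENCIES)))
```

Under these assumptions, a change to the training code marks the model and its metrics as stale but leaves the dataset alone, which is the kind of decision that tracking data and model artefacts alongside code makes possible.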

Machine Learning Model Life Cycle

At Uber ATG, most of the self-driving components use complex ML models, which enable the vehicles to drive more accurately and safely. The components are perception, prediction, motion planning, and control, and each consists of one or more ML models. The ML models that make up these components go through a five-step iteration process intended to ensure that they operate well.

The machine learning models make predictions, forecasts, and estimates after being trained on historical data. The researchers use data collected in a wide variety of traffic conditions from vehicles equipped with sensors, including LiDAR, cameras, and radar.

The ML model life cycle consists of five stages: data ingestion, data validation, model training, model evaluation, and model serving. The ML models are put through each stage of the life cycle to ensure that they exhibit high-quality model, system, and hardware metrics before they are deployed to the self-driving vehicles.
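The five stages can be pictured as a gated pipeline in which a model only moves forward when the current stage passes. The sketch below is an assumption-laden illustration, not Uber's code: the `run_life_cycle` function and the idea of one pass/fail check per stage are hypothetical, and a stage without a check is treated as passing.

```python
# Stage names as listed in the article.
STAGES = [
    "data ingestion",
    "data validation",
    "model training",
    "model evaluation",
    "model serving",
]


def run_life_cycle(checks):
    """Run the stages in order and stop at the first failing check.

    `checks` maps a stage name to a zero-argument callable returning
    True (pass) or False (fail). Hypothetical interface for illustration.
    Returns the list of stages that completed.
    """
    completed = []
    for stage in STAGES:
        check = checks.get(stage, lambda: True)
        if not check():
            break
        completed.append(stage)
    return completed


# Example: a model that fails evaluation never reaches serving.
result = run_life_cycle({"model evaluation": lambda: False})
```

Here `result` contains only the first three stages, reflecting the point that a model must clear evaluation before it is served to vehicles.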

Wrapping Up

The ML model life cycle process and tools like VerCD help the researchers manage various complex ML models and iterate on them faster. After the 2018 incident in Tempe, the multinational ride-hailing company has put a lot of effort into its self-driving work. The company launched several AI platforms last year and continues to grow rapidly. Last year, the company also acquired Seattle-based Mighty AI to boost its autonomous driving capabilities.

A few months ago, Uber created a learning algorithm known as Generative Teaching Networks, or GTN, which can generate synthetic training data for other AI models. Last year, the company also open-sourced its conversational AI platform, the Plato Research Dialogue System.

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
