For the first time, ride-hailing company Uber has opened up about what goes on under the hood of the machine learning infrastructure and version control platform that its Advanced Technologies Group (ATG) uses for autonomous vehicles. ATG is the Uber division that researches and develops self-driving vehicles, and it deploys machine learning models into the cars.
The self-driving division at Uber has more than 450 employees who have been working on autonomous vehicle technology for several years now. Recently, the self-driving team at Uber developed VerCD, a set of tools and microservices that supports the ML workflow. The team also discussed the self-driving vehicle components that use machine learning models, as well as the machine learning model life cycle.
Let’s take a deep dive into the VerCD platform along with the life cycle of machine learning models at Uber.
The self-driving researchers at Uber developed VerCD to provide versioning and continuous delivery of all machine learning code and artefacts for Uber ATG’s self-driving vehicle software. VerCD runs daily integration tests for many of the dataset building, model training, and metrics pipelines at Uber ATG.
Throughout the workflow, this platform tracks the dependencies between code, datasets, and models. VerCD also orchestrates the creation of these ML artefacts, making it a critical part of the process. Specifically, the workflows that are covered by VerCD start with the dataset extraction stage, cover model training, and conclude with computing metrics.
Unlike traditional version control and continuous delivery systems, VerCD tracks all dependencies of each ML component, which often include data and model artefacts in addition to code. The metadata service provided by VerCD records these dependency graphs, and a continuous integration orchestrator uses it to run entire ML workflow pipelines on a regular basis to produce datasets, trained models, and metrics.
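To make the idea of dependency tracking concrete, here is a minimal Python sketch. It is not VerCD's implementation: the artefact names, the content identifiers, and the fingerprinting scheme are all assumptions made for illustration. It orders a small artefact graph so that dependencies are built first, and derives a version fingerprint for each artefact from its own content plus the fingerprints of everything it depends on.

```python
import hashlib
from graphlib import TopologicalSorter

# Hypothetical artefact graph: each key depends on the artefacts in its set.
# The names are illustrative only and are not taken from VerCD.
DEPENDENCIES = {
    "extraction_code": set(),
    "training_code": set(),
    "metrics_code": set(),
    "dataset": {"extraction_code"},
    "model": {"dataset", "training_code"},
    "metrics": {"model", "metrics_code"},
}

# Hypothetical content identifiers, e.g. a commit hash for each code artefact.
CONTENT = {
    "extraction_code": "commit-a1",
    "training_code": "commit-b2",
    "metrics_code": "commit-c3",
    "dataset": "dataset-recipe-v1",
    "model": "model-config-v1",
    "metrics": "metrics-config-v1",
}


def build_order(dependencies):
    """Return artefacts ordered so that every dependency comes first."""
    return list(TopologicalSorter(dependencies).static_order())


def fingerprints(dependencies, content):
    """Hash each artefact's content together with its dependencies' hashes."""
    result = {}
    for name in build_order(dependencies):
        digest = hashlib.sha256(content[name].encode())
        for dep in sorted(dependencies[name]):
            digest.update(result[dep].encode())
        result[name] = digest.hexdigest()
    return result


if __name__ == "__main__":
    order = build_order(DEPENDENCIES)
    before = fingerprints(DEPENDENCIES, CONTENT)
    # Simulate a new commit to the (hypothetical) training code.
    after = fingerprints(DEPENDENCIES, {**CONTENT, "training_code": "commit-b3"})
    stale = [name for name in order if before[name] != after[name]]
    print("Artefacts to rebuild:", stale)
```

In this sketch, changing the training code alters the fingerprints of the model and the metrics but leaves the dataset untouched, which is the kind of information an orchestrator would need to decide what to rebuild.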
Machine Learning Model Life Cycle
At Uber ATG, most of the self-driving components use complex ML models, which enable the vehicles to drive more accurately and safely. The components are perception, prediction, motion planning, and control, and each consists of one or more ML models. The ML models that make up these components go through a five-step iteration process intended to ensure their optimal operation.
The machine learning models make predictions, forecasts, and estimates based on training with historical data. The researchers use data collected in a wide variety of traffic conditions by vehicles equipped with sensors, including LiDAR, cameras, and radar.
The ML model life cycle consists of five stages: data ingestion, data validation, model training, model evaluation, and model serving. The ML models are put through each stage of the life cycle to ensure that they exhibit high-quality model, system, and hardware metrics before they are deployed to the self-driving vehicles.
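The staging described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Uber's pipeline: the sample data, the one-parameter model, and the error threshold are all hypothetical. The point it shows is that the stages run in a fixed order and that serving is gated on an evaluation metric.

```python
from statistics import mean

# Toy stand-ins for the five life cycle stages. Every function, value and
# threshold here is hypothetical and only illustrates the staging idea.


def ingest():
    """Data ingestion: return raw (feature, label) samples."""
    return [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]


def validate(samples):
    """Data validation: reject the batch if any value is missing."""
    if any(x is None or y is None for x, y in samples):
        raise ValueError("missing value in samples")
    return samples


def train(samples):
    """Model training: fit y = slope * x by averaging the ratios y / x."""
    return mean(y / x for x, y in samples)


def evaluate(slope, samples):
    """Model evaluation: mean absolute error of the fitted slope."""
    return mean(abs(y - slope * x) for x, y in samples)


def run_life_cycle(max_error):
    """Run the stages in order; serve only if the metric gate passes."""
    samples = validate(ingest())
    slope = train(samples)
    error = evaluate(slope, samples)
    served = error <= max_error  # model serving is gated on the metric
    return {"slope": slope, "error": error, "served": served}


if __name__ == "__main__":
    print(run_life_cycle(max_error=0.5))
    print(run_life_cycle(max_error=0.01))
```

For brevity the sketch evaluates on the same samples it trains on; a real workflow would use held-out data and, as the article notes, would also check system and hardware metrics before deployment.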
The ML model life cycle process and tools like VerCD help the researchers manage a variety of complex ML models and iterate on them faster. Since the 2018 incident in Tempe, Arizona, the multinational ride-hailing company has put a lot of effort into its self-driving work. The company launched several AI platforms last year and has continued to grow rapidly over the past few years. Last year, the company also acquired Seattle-based Mighty AI to boost its autonomous driving capabilities.
A few months ago, Uber created a learning algorithm known as Generative Teaching Networks, or GTN, which can generate synthetic training data for other AI models. Last year, the company also open-sourced its conversational AI platform, the Plato Research Dialogue System.