In 2017, Uber introduced its ML-as-a-service platform Michelangelo to democratise machine learning and make scaling AI ‘as easy as requesting a ride’.
In Q1 2020, Uber completed 1,658 million trips, or roughly 18 million a day on average. With such a large fleet of vehicles and drivers and an ever-growing customer base, Uber has access to a rich dataset. Uber has always been bullish on AI and machine learning, and Michelangelo is one of its pet projects.
Need For Standardisation
In development from 2015 and put into operation in 2016, Michelangelo was built to enable internal teams to seamlessly build, deploy, and operate ML solutions at Uber’s scale. It is now deployed across several Uber data centres and is used to serve predictions for the company’s most heavily loaded online services.
The motivation to build Michelangelo came when the team started finding it excessively difficult to develop and deploy machine learning models at scale. Before Michelangelo, the engineering teams relied mainly on creating separate predictive models or one-off bespoke systems. But such short-term solutions were limited in many respects.
Michelangelo is an end-to-end system that standardises workflows and tools across teams so that they can easily build and operate machine learning models at scale. It has now emerged as the de facto system for machine learning for Uber engineers and data scientists, with several teams leveraging it to build and deploy models.
Michelangelo is built on open-source components such as HDFS, XGBoost, TensorFlow, Cassandra, MLlib, Samza, and Spark. It sits on top of Uber’s data and compute infrastructure, which provides a data lake that stores Uber’s transactional and logged data; Kafka brokers for aggregating logged messages; a Samza streaming compute engine; managed Cassandra clusters; and in-house service provisioning and deployment tools.
Michelangelo provides scalable, reliable, reproducible, and automated tools to address the six-step workflow:
- Manage data
- Train models
- Evaluate models
- Deploy models
- Make predictions
- Monitor predictions
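To make the six steps concrete, here is a minimal sketch of the same workflow in plain Python. It is not Michelangelo’s API: every function name, the in-memory `registry`, and the toy distance-to-minutes data are assumptions made purely for illustration, and a one-variable least-squares fit stands in for a real model.

```python
from statistics import mean


def manage_data(rows, holdout_every=4):
    """Step 1: split (x, y) rows into training and held-out sets."""
    train = [r for i, r in enumerate(rows) if i % holdout_every != 0]
    test = [r for i, r in enumerate(rows) if i % holdout_every == 0]
    return train, test


def train_model(train):
    """Step 2: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in train)
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def predict(model, x):
    """Step 5: score one input with a trained model."""
    return model["slope"] * x + model["intercept"]


def evaluate_model(model, test):
    """Step 3: mean absolute error on the held-out rows."""
    return mean(abs(predict(model, x) - y) for x, y in test)


def deploy_model(registry, name, model):
    """Step 4: publish the model under a name that callers can look up."""
    registry[name] = model


def monitor(prediction_log):
    """Step 6: mean absolute error of (predicted, observed) pairs."""
    return mean(abs(p - a) for p, a in prediction_log)


# Hypothetical toy data: trip distance in km -> delivery time in minutes.
rows = [(float(x), 3.0 * x + 5.0) for x in range(1, 13)]

train, test = manage_data(rows)
model = train_model(train)
holdout_mae = evaluate_model(model, test)

registry = {}
deploy_model(registry, "delivery_time", model)

served = predict(registry["delivery_time"], 10.0)
live_mae = monitor([(served, 36.0)])  # observed outcome arrives later
```

On a real platform each step would be a separate managed service; the point of the sketch is only the order of the steps and what is handed from one to the next.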
The platform consists of a data lake that is accessed during training and inference. Applications access the centralised data store through batch prediction and online inference. Michelangelo offers standard algorithms for ML model training, and developers can add new algorithms for training their own models. Michelangelo also has end-to-end support for managing model deployment through a UI or an API, for both online and offline predictions. The platform continuously monitors predictions for accuracy and speed, and triggers re-training when required.
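The monitoring step can be sketched as a rolling check over recent predictions. The article does not describe how Michelangelo decides when to re-train, so the class below, its name, its window size, and its thresholds are all hypothetical; it only illustrates the idea of tracking accuracy and latency and raising a flag when accuracy degrades.

```python
from collections import deque


class PredictionMonitor:
    """Hypothetical rolling monitor for prediction error and latency."""

    def __init__(self, window=100, max_mae=5.0, max_latency_ms=50.0):
        self.errors = deque(maxlen=window)      # most recent absolute errors
        self.latencies = deque(maxlen=window)   # most recent latencies in ms
        self.max_mae = max_mae
        self.max_latency_ms = max_latency_ms

    def record(self, predicted, actual, latency_ms):
        """Log one prediction once its true outcome is known."""
        self.errors.append(abs(predicted - actual))
        self.latencies.append(latency_ms)

    def mae(self):
        return sum(self.errors) / len(self.errors)

    def needs_retraining(self):
        """Flag re-training only when the window is full and error is high."""
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and self.mae() > self.max_mae

    def is_too_slow(self):
        """Flag a latency problem based on the rolling mean latency."""
        if not self.latencies:
            return False
        mean_latency = sum(self.latencies) / len(self.latencies)
        return mean_latency > self.max_latency_ms


monitor = PredictionMonitor(window=3, max_mae=2.0)

# Three accurate predictions: the window fills but the error stays low.
for predicted, actual in [(10.0, 11.0), (10.0, 10.0), (10.0, 12.0)]:
    monitor.record(predicted, actual, latency_ms=12.0)
before_drift = monitor.needs_retraining()

# One badly wrong prediction pushes the rolling error over the threshold.
monitor.record(10.0, 20.0, latency_ms=12.0)
after_drift = monitor.needs_retraining()
```

Waiting for a full window before flagging avoids triggering re-training on the first few noisy predictions after a deployment.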
In 2018, Uber extended Michelangelo through PyML to make it easier for Python users to train and deploy their models. These models can contain arbitrary user code and use any Python package or native Linux library. PyML allows data scientists to run an identical copy of a model locally, in real-time experiments, and in large-scale parallelised offline prediction jobs.
The functionalities are accessible through a simple Python SDK and can be leveraged directly from development environments such as Jupyter notebooks, without switching between separate applications. Since this solution leverages several open-source components, it can be easily transferred to other ML platforms and model serving systems.
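The idea of one model definition that runs unchanged in a notebook, in a real-time call, and in a parallel offline job can be sketched as follows. This is not the PyML SDK: the `EtaModel` class, the `predict` contract on lists of dicts, and both helper functions are assumptions for illustration, with threads standing in for a distributed batch engine.

```python
from concurrent.futures import ThreadPoolExecutor


class EtaModel:
    """Hypothetical user-defined model: arbitrary Python behind predict()."""

    def __init__(self, base_minutes, minutes_per_km):
        self.base_minutes = base_minutes
        self.minutes_per_km = minutes_per_km

    def predict(self, rows):
        """Score a list of feature dicts and return a list of result dicts."""
        return [
            {"eta_minutes": self.base_minutes
             + self.minutes_per_km * row["distance_km"]}
            for row in rows
        ]


def predict_online(model, row):
    """Real-time path: score a single request."""
    return model.predict([row])[0]


def predict_batch(model, rows, chunk_size=2, workers=4):
    """Offline path: split rows into chunks and score them in parallel."""
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        scored_chunks = list(pool.map(model.predict, chunks))
    return [result for chunk in scored_chunks for result in chunk]


model = EtaModel(base_minutes=5.0, minutes_per_km=3.0)

single = predict_online(model, {"distance_km": 2.0})
batch = predict_batch(model, [{"distance_km": float(d)} for d in range(1, 6)])
```

Because both paths call the same `predict` method, a result checked locally should match what the batch job produces for the same input.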
In 2017, Michelangelo was launched with a monolithic architecture that managed tightly coupled workflows and Spark jobs for training and serving. Michelangelo had specific pipeline definitions for each supported model type. Offline serving was handled through Spark, and online serving was handled using custom APIs.
In 2019, the team at Uber decided to update Michelangelo’s model representation for flexibility at scale. The original representation supported only a subset of Spark MLlib models, with in-house custom model serialisation and representation. This prevented users from experimenting with complex model pipelines. The team then decided to evolve Michelangelo’s use of Spark MLlib in areas such as model representation, persistence, and online serving.
- Several Uber Eats models run on Michelangelo for meal delivery time predictions, search rankings, restaurant rankings, and search autocomplete. Data scientists use gradient boosted decision tree regression models to predict delivery time from the order request, historical features, and near real-time calculated features. Michelangelo plays a vital role in predicting the total duration of this complex, multi-stage process and in recalculating delivery time predictions at every step.
- Uber uses a Customer Obsession Ticket Assistant (COTA) tool to help agents deliver better customer support. It is based on Michelangelo, placed on top of Uber’s customer support platform. COTA helps in quick and efficient issue resolution for as many as 90 percent of the inbound support tickets.
- Uber leverages various spatiotemporal forecasting models based on Michelangelo to predict rider demand and cab availability at several locations and times in the future. Uber relies on these models to forecast imbalance between demand and supply and encourage driver-partners to reach high demand locations beforehand.
- The one-click chat feature based on Michelangelo streamlines communication between riders and driver-partners by leveraging NLP models to predict and display the most likely replies to in-app chat messages. This lets driver-partners respond with pre-fed templates, reducing distraction.
- Engineers at Uber use Michelangelo’s Horovod for building their self-driving car systems. It uses deep learning models for functions such as object detection and motion planning.
- Feast, a leading open-source feature store for machine learning, was built by Willem Pienaar. He said Feast drew inspiration from Michelangelo. Notably, Tecton, an enterprise feature store company founded by members of the team that built Michelangelo, said it would become a core contributor to Feast.
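Returning to the delivery-time use case in the list above, gradient boosted decision tree regression can be illustrated with a small pure-Python sketch. It is not Uber’s model: the trees are depth-one stumps, the loss is squared error, and the two features (distance and recent average preparation time) and all data values are made up for illustration.

```python
from statistics import mean


def fit_stump(rows, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [e for r, e in zip(rows, residuals) if r[f] <= threshold]
            right = [e for r, e in zip(rows, residuals) if r[f] > threshold]
            left_mean, right_mean = mean(left), mean(right)
            sse = (sum((e - left_mean) ** 2 for e in left)
                   + sum((e - right_mean) ** 2 for e in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left_mean, right_mean)
    if best is None:  # no feature varies: fall back to a constant stump
        constant = mean(residuals)
        return (0, float("inf"), constant, constant)
    return best[1:]


def stump_predict(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_gbdt(rows, targets, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = mean(targets)
    predictions = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(targets, predictions)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, r)
                       for p, r in zip(predictions, rows)]
    return base, learning_rate, stumps


def predict_gbdt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Made-up rows: (distance_km, recent_avg_prep_minutes) -> delivery minutes.
rows = [(1.0, 10.0), (2.0, 12.0), (3.0, 10.0),
        (4.0, 15.0), (5.0, 12.0), (6.0, 15.0)]
targets = [16.0, 21.0, 23.0, 32.0, 33.0, 40.0]

model = fit_gbdt(rows, targets)
baseline_mse = mean((t - mean(targets)) ** 2 for t in targets)
boosted_mse = mean((t - predict_gbdt(model, r)) ** 2
                   for r, t in zip(rows, targets))
```

Each round corrects part of the error left by the previous rounds, which is why the training error of the boosted model falls below that of the constant baseline. Production libraries such as XGBoost, which the article lists among Michelangelo’s components, add deeper trees, regularisation, and far faster split finding.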