How Uber Implements CI/CD Of Machine Learning Models


The ride-hailing giant Uber is currently present in 10K cities across 71 countries, and its platform is used by 93 million customers and 3.5 million drivers globally. 

Every quarter, the ride-hailing platform completes nearly 1.44 billion trips. However, as a result of the global pandemic and travel restrictions, the total number of quarterly Uber trips decreased by 24.21% in 2020. 



“At Uber, we have witnessed a significant increase in ML adoption across various organisations and use-cases over the last few years,” said the company in its latest blog post co-authored by Yi Zhang, Joseph Wang, Jia Li, and Yunfeng Bai. The blog highlighted several pain points and explained how Uber implemented continuous integration (CI) and continuous deployment (CD) of machine learning models to address them.

Showcasing CI/CD for models and service binary at Uber (Source: Uber)

MLOps hurdles 

Here are the four MLOps challenges Uber faces: 

  • The first challenge was to support a large volume of ‘model deployments’ daily while keeping the real-time prediction service highly available.
  • The second challenge was that the memory footprint of a ‘real-time prediction service’ instance grew as newly retrained models got deployed. “We observed a great portion of ‘older models’ received no traffic as ‘newer models’ were being deployed,” said Uber researchers. 
  • The third challenge was associated with ‘model rollout strategies.’ “ML engineers may choose to roll out models through different stages, such as ‘shadow testing’ or experimentation. We observed some common patterns in ‘model rollout strategies’ and decided to incorporate them into the real-time prediction service,” said the researchers.
  • “As we are managing a fleet of ‘real-time prediction services,’ manual service software deployment is not an option,” said researchers.
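The second challenge above, unused older models inflating the memory footprint, suggests automatically unloading models that stop receiving traffic. The article does not describe Uber's actual mechanism; the sketch below is a hypothetical illustration, with `DeployedModel` and `retire_idle_models` as invented names.

```python
import time
from dataclasses import dataclass, field


@dataclass
class DeployedModel:
    """Hypothetical record of a model loaded in a prediction service instance."""
    name: str
    last_request_ts: float = field(default_factory=time.time)


def retire_idle_models(models, now, idle_threshold_s):
    """Partition deployed models into ones to keep serving and ones to
    unload because they have received no traffic within the threshold,
    freeing memory in the real-time prediction service instance."""
    active, retired = [], []
    for model in models:
        if now - model.last_request_ts > idle_threshold_s:
            retired.append(model)
        else:
            active.append(model)
    return active, retired
```

A background sweep could call this periodically and unload the `retired` set, keeping instance memory proportional to models that actually serve traffic.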

The researchers said that when deploying a machine learning model, the ‘model deployment service’ performs validation by making prediction calls to the ‘candidate model’ with ‘sampled data.’ 

However, this validation does not compare results against the ‘existing models deployed’ to the ‘real-time prediction services.’ Therefore, even if a model passes validation, there is no guarantee that the model can be used or exhibits the same behaviour (for feature transformation and model evaluation) when deployed to production. This could be due to dependency changes, service build script changes, or interface changes between two ‘real-time prediction service’ releases.
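The validation step described above can be sketched as calling the candidate model on sampled data and, to close the gap the researchers point out, also comparing its outputs against the currently deployed baseline. This is a minimal illustration, not Uber's implementation; `validate_candidate` and its parameters are assumed names.

```python
def validate_candidate(candidate_predict, baseline_predict, sampled_rows,
                       tolerance=1e-6):
    """Validate a candidate model by replaying sampled data through it.

    Returns True only if the candidate serves every sampled request and
    its predictions agree with the deployed baseline within tolerance.
    """
    for row in sampled_rows:
        try:
            candidate_out = candidate_predict(row)
        except Exception:
            return False  # candidate must at least be able to serve predictions
        if abs(candidate_out - baseline_predict(row)) > tolerance:
            return False  # behaviour diverges from the production model
    return True
```

Comparing against the baseline catches the silent behaviour drift the article warns about, rather than only checking that the candidate responds.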


Uber relies on CI/CD for service release deployment for a fleet of ‘real-time prediction services.’ “Since we are supporting ‘critical business use cases,’ in addition to validation during ‘model deployment,’ we need to ensure high confidence in the automated CI/CD process,” the researchers added. 

Uber addressed MLOps challenges such as behaviour changes with new releases, dependency changes, and service build script changes by employing a three-stage strategy for validating and deploying the latest binary of the real-time prediction service. The three stages are a staging integration test, a canary integration test, and the production rollout. 

The staging and canary integration tests are run against non-production environments; the staging integration tests verify the ‘basic functionalities.’ Once the ‘staging integration tests’ have passed, the team runs canary integration tests to check serving performance across all production models. After ensuring the behaviour of production models remains unchanged, the release is deployed onto all ‘real-time prediction service’ production instances in a rolling deployment fashion. 
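The three-stage gating described above can be sketched as a simple pipeline: a build advances to the next stage only if every test in the current stage passes, and only then is it rolled out instance by instance. This is an assumed illustration of the control flow, not Uber's CI/CD code; all function names here are hypothetical.

```python
def release_binary(build, staging_tests, canary_tests,
                   production_instances, deploy):
    """Gate a new prediction-service binary through staging and canary
    integration tests before a rolling production deployment.

    staging_tests / canary_tests: callables taking the build, returning bool.
    deploy: callable(instance, build) that updates one production instance.
    """
    # Stage 1: staging integration tests verify basic functionality.
    if not all(test(build) for test in staging_tests):
        return "rejected: staging"
    # Stage 2: canary integration tests check serving behaviour
    # across production models before a full rollout.
    if not all(test(build) for test in canary_tests):
        return "rejected: canary"
    # Stage 3: rolling deployment, one instance at a time, so the fleet
    # keeps serving while the release propagates.
    for instance in production_instances:
        deploy(instance, build)
    return "released"
```

A failed stage stops the pipeline before any production instance is touched, which is what gives the automated process the "high confidence" the researchers describe.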

Wrapping up 

Uber is currently working on: near real-time monitoring for inference accuracy, feature quality, and business metrics; deploying and serving multi-task learning and hybrid models; feature validation; a better model fallback mechanism; and model traceability and debuggability.

“As we evolve Uber’s ML infrastructure and platform and support new machine learning use cases, we see new MLOps challenges emerge,” said Uber.

Amit Raja Naik
Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.
