Uber has 93 million active users, with drivers completing 4.98 billion trips in 2020 and bringing in $11.1 billion in revenue. That is an enormous number of transactions happening every second of the day, across countries worldwide. So how does Uber ensure payments are completed and the company isn’t cheated? Through RADAR.
Uber uses RADAR (Real-time Anomaly Detection and Response) to tag potential fraudsters attempting to get away without paying for a ride. The platform detects payment fraud by combining dozens of time-series signals. Recently, RADAR’s accuracy was improved by the latest version of the company’s Orbit library, v1.1. In a recent blog post, Uber explained how RADAR works.
RADAR is an AI fraud detection and mitigation system with humans in the loop. Essentially, the AI system detects fraud in its early stages and generates rules to prevent or stop it. The humans in the loop are fraud analysts who review and approve these rules. The rules are targeted at specific actions and attacks and thus tend to be short-lived.
Humans in the loop: XAI
Explainable AI is one of the major concerns around emerging technologies today. While the black box problem remains unsolved, keeping a human in the automated loop is a valid way of ensuring the system works in a way engineers can understand. This is especially important in fraud detection, given the delicate scenarios it involves. Human-in-the-loop systems work in two ways: providing context to AI models and guiding humans towards further actions. The human-in-the-loop design of RADAR thus allows the system to be humanised, traceable and accurate.
All about RADAR
Uber uses synthetic data for its anomaly detection algorithms. The data pipelines run on Apache Spark and Peloton for two main reasons. First, Spark’s accessible Python and Scala APIs are used to separate data-processing code from data-access code, so that unit and integration tests can be run on the data and AI code. Second, Peloton handles scheduling and scaling the compute needed to process the data. Automatically testing the data pipelines, and designing the data-generation assumptions and attack patterns into the probabilistic programming models, makes it easier for RADAR engineers to respond quickly to changes in the data-generating process.
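The key idea of separating processing from access is that the same logic can be exercised on synthetic data in a unit test. A minimal sketch of that pattern, with invented field names (`user_id`, `amount`, `order_time`) and a toy fraud rate standing in for Uber’s actual schema:

```python
from datetime import datetime, timedelta
import random

def generate_synthetic_events(n, seed=42, fraud_rate=0.05):
    """Produce n synthetic payment events with a known fraud rate.
    Hypothetical schema, for illustration only."""
    rng = random.Random(seed)
    start = datetime(2021, 1, 1)
    events = []
    for _ in range(n):
        events.append({
            "user_id": f"user_{rng.randint(1, 100)}",
            "amount": round(rng.uniform(5, 80), 2),
            "order_time": start + timedelta(minutes=rng.randint(0, 7 * 24 * 60)),
            "is_fraud": rng.random() < fraud_rate,
        })
    return events

def hourly_totals(events):
    """Processing logic kept separate from data access, so tests can
    feed it synthetic events instead of reading production tables."""
    totals = {}
    for e in events:
        bucket = e["order_time"].replace(minute=0, second=0, microsecond=0)
        totals[bucket] = totals.get(bucket, 0.0) + e["amount"]
    return totals
```

In production the access layer would be a Spark read rather than a generator, but the processing function stays identical and testable.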
At the backend, RADAR analyses the activity time-series from Uber’s payment platform, maps the data to human actions and predicts fraud attacks. The system tags the data along two time dimensions: OT, or ‘Order Time’, the time at which an order is fulfilled; and PSMT, or ‘Payment Settlement Maturity Time’, the time frame over which the payment is processed.
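A toy sketch of what tagging a record along those two dimensions could look like; the hour granularity and function shape are assumptions, not Uber’s actual code:

```python
from datetime import datetime

def tag_time_dimensions(order_time, settlement_time):
    """Tag a payment record with its Order Time (OT) bucket and its
    Payment Settlement Maturity Time (PSMT): how many whole hours after
    the order the payment actually settled. Illustrative only."""
    ot_bucket = order_time.replace(minute=0, second=0, microsecond=0)
    maturity_hours = int((settlement_time - order_time).total_seconds() // 3600)
    return {"OT": ot_bucket, "PSMT": maturity_hours}
```

For example, an order at 10:30 on Jan 1 that settles at noon on Jan 3 falls in the 10:00 OT bucket with a PSMT of 49 hours.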
Built on the Apache Spark® SQL API, RADAR’s data pipeline collects the information used to generate the activity time-series. Next, the platform uses AthenaX to aggregate risk signals from live payment streams, alongside data aggregation on Hive from Kafka via Marmaray. RADAR then builds one-dimensional hourly time series in the OT dimension, with selective slices over PSMT. Finally, unit testing of the data processing is done through the PySpark API, so the pipelines can be tested on a local development machine and added to the CI/CD cycle.
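The aggregation step can be pictured as follows: a one-dimensional hourly series over OT, keeping only a chosen PSMT slice. Uber does this with Spark SQL at scale; this plain-Python version, with an assumed record shape, only illustrates the shape of the query:

```python
from collections import defaultdict

def hourly_series(records, psmt_slice):
    """records: dicts with 'ot_hour' (int), 'psmt' (int), 'amount' (float).
    Returns amount totals per OT hour, restricted to the inclusive
    PSMT slice (lo, hi). A sketch, not the production Spark SQL job."""
    series = defaultdict(float)
    lo, hi = psmt_slice
    for r in records:
        if lo <= r["psmt"] <= hi:
            series[r["ot_hour"]] += r["amount"]
    return dict(series)
```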
Orbit is Uber’s general interface for Bayesian time-series modelling. The tool is built on probabilistic programming packages such as PyStan and Uber’s Pyro, and it enables simple model specification and analysis without being restricted to a small number of models. Orbit supports implementations of the Exponential Smoothing (ETS), Local-Global Trend (LGT) and Damped Local Trend (DLT) forecasting models. With the latest update, Orbit v1.1 adds a new class design, a new forecaster and KTR (Kernel-based Time-varying Regression), allowing it to explore and detect dynamic patterns within a time series.
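Orbit wraps full Bayesian implementations of these models; as a toy illustration of the simplest family it supports (exponential smoothing), here is a plain-Python one-step smoother. This is emphatically not Orbit’s API, just a sketch of the underlying idea:

```python
def exponential_smoothing(series, alpha=0.3):
    """Single exponential smoothing: each smoothed level is a weighted
    blend of the newest observation and the previous level. Returns the
    smoothed level after each observation. Illustrative sketch only."""
    level = series[0]
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels
```

A flat series stays flat, while a jump is only partially absorbed at each step, which is what makes the smoothed curve a useful baseline against noisy observations.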
Fraud detection with humans in the loop
Since fraud attacks are detected as time-series anomalies, the team collects various long- and short-term anomaly signals and combines them to determine fraud patterns. First, a baseline for the anomalies is established using time-series forecasting in the OT dimension. “We build the time series decomposition model for each one-dimensional time series in the OT dimension for each aggregate signal and each PSMT slice,” said the team.
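The baseline-and-flag step can be sketched as comparing observations against a forecast and flagging hours whose residual is extreme. Uber uses Bayesian decomposition models via Orbit for the forecast; this stdlib version, with an assumed k-sigma rule, only illustrates the shape of the logic:

```python
import statistics

def flag_anomalies(observed, forecast, k=3.0):
    """Flag points whose residual (observed - forecast) sits more than
    k standard deviations from the mean residual. Hedged sketch, not
    Uber's production detector."""
    residuals = [o - f for o, f in zip(observed, forecast)]
    sd = statistics.pstdev(residuals) or 1.0  # avoid division issues on flat data
    mean = statistics.fmean(residuals)
    return [abs(r - mean) > k * sd for r in residuals]
```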
Once anomaly signs are detected, the PSMT dimension and losses from historical data are used to forecast which attacks may lead to significant losses, and, based on those, the severe attacks are prioritised for human analysts to review. After estimating an attack’s severity, the forecast is applied as a smart threshold to predict whether the severity will still be anomalous at full payment maturity; if so, the loss is prioritised. All the data is backed up in the backend. Once the most severe anomalies are identified, human analysts are brought in, with Jira as the task-tracking tool. The analysts then review the anomalies and take over the fraud investigations.
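The “smart threshold” idea can be illustrated as projecting an attack’s observed loss out to full payment maturity and queueing it for analyst review only if the projection crosses a threshold. The linear maturity scaling and all numbers here are invented for the example:

```python
def project_full_maturity_loss(observed_loss, maturity_fraction):
    """Scale the loss seen so far by the fraction of payments that have
    matured, estimating the loss at full maturity. Toy assumption:
    losses accrue proportionally to maturity."""
    return observed_loss / max(maturity_fraction, 1e-9)

def prioritise(attacks, threshold):
    """attacks: list of (attack_id, observed_loss, maturity_fraction).
    Returns ids whose projected loss exceeds the threshold, worst first."""
    scored = [(a, project_full_maturity_loss(loss, frac))
              for a, loss, frac in attacks]
    flagged = [(a, s) for a, s in scored if s > threshold]
    return [a for a, _ in sorted(flagged, key=lambda t: -t[1])]
```

Note how a young attack with a small observed loss can outrank an older, larger one: a $50 loss at 5% maturity projects to $1,000 at full maturity.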
The patterns are detected by encoding selected feature-value pairs as unique items and analysing them to find common fraud patterns. The dataset consists of key-value pair combinations, and the recurring ones are tagged. An FP-Tree is used to accelerate the identification of these common patterns, followed by a rule-selection process to further eliminate false positives. Additionally, feature selection is fully automated to limit the patterns the algorithm has to consider; feature relevance for the association rule mining starts from the MRMR (minimum redundancy, maximum relevance) algorithm. The team has also added extensions, including handling categorical features and leveraging imbalance in feature-value distributions.
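What “frequent pattern” means here can be shown with a brute-force itemset counter: encode each record’s feature-value pairs as items and count recurring combinations. Uber uses FP-Tree/FP-Growth precisely because this naive enumeration does not scale; the sketch below is only for intuition:

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(records, min_support, max_size=2):
    """Count combinations of 'feature=value' items across records and
    keep those appearing at least min_support times. Brute-force toy
    version of what FP-Growth computes efficiently."""
    counts = Counter()
    for rec in records:
        items = sorted(f"{k}={v}" for k, v in rec.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {p: c for p, c in counts.items() if c >= min_support}
```

A cluster of fraudulent orders sharing, say, the same card BIN and city would surface as a frequent pair like `("card_bin=411", "city=X")`.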
The rule-selection process matches a fixed set of records against candidate queries to narrow down the list of candidate rules. This percolation happens in two steps: the FP-Growth algorithm runs with horizontal Apache Spark® scaling, and sample-based percolation uses vertical scaling on the Apache Spark® driver. Human reviewers then analyse the final rules, and their feedback is incorporated back into the system to improve the automated rule-generation process.
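A hedged sketch of the percolation idea: match each candidate rule (a set of feature=value conditions) against a fixed sample of labelled records, and keep only rules precise enough on that sample. The precision cut-off and record shape are assumptions for illustration:

```python
def rule_matches(rule, record):
    """A rule is a dict of feature=value conditions; it matches a record
    when every condition holds."""
    return all(record.get(k) == v for k, v in rule.items())

def percolate(rules, sample, min_precision=0.9):
    """Keep rules whose hits on the labelled sample are almost all
    fraud, discarding rules that would flag too many legitimate
    records. Illustrative version of the sample-based percolation step."""
    kept = []
    for rule in rules:
        hits = [r for r in sample if rule_matches(rule, r)]
        if not hits:
            continue
        precision = sum(r["is_fraud"] for r in hits) / len(hits)
        if precision >= min_precision:
            kept.append(rule)
    return kept
```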
Finally, Uber’s rule engine, Mastermind, is used to push these rules into production, and to maintain, evaluate and improve them. Human analysts verify the rules are correct before they go into production.