With transaction volumes and spending channels growing, financial institutions and consumers are increasingly falling victim to fraud and scams. According to a report by Nilson, payment card fraud losses alone reached more than $32 billion in 2019. In the same period, total card transactions worldwide were estimated at around $35 trillion, a massive target for financial criminals. Such crimes tend to happen wherever money moves. Analytics and data science, however, are now being explored to catch these fraud trends and stop bad actors from stealing money.
Speaking at DLDC 2020, Satyam Misra, head of the Analytics and Insights group for the Small Business Group at Intuit, shared how anomaly detection and behaviour analytics, in particular, are helping. He has led fraud risk and credit risk management teams at multiple Fortune 500 companies such as American Express, Citicards and Barclays.
Deep Learning In Risk Management
While companies have long relied on traditional risk management methods such as underwriting data, rule-based controls, credit checks, aggregation and more, these methods have flaws that make them inefficient at detecting fraud patterns. Misra pointed out the disadvantages of each as below:
- Underwriting data: This method depends solely on the information provided by the consumer, which makes it prone to errors.
- Rule-based controls: Currently the most widely used method for detecting fraud, it often results in false positives, hurting innocent consumers in the long run.
- Credit check and vendor data: This does not rely on real-time data, which makes it ineffective.
- Aggregation and regular monitoring: This depends on gathering consumer data over the long run, which is a slow process.
This is where deep learning comes into the picture, as it is being explored to score quickly and display results. Most of the data and trends used to create deep learning models come from anomaly detection and behavioural analytics.
Explaining the concept, Misra said that anomalies in data analytics are observations in a dataset that do not conform to an expected pattern. Anomaly detection, also known as outlier detection, is about identifying those observations. There are three types of anomalies:
- Point anomalies: A single instance of data is anomalous.
- Contextual anomalies: The abnormality is context-specific. This is common in time-series data.
- Collective anomalies: A set of data instances is anomalous as a group, even if the individual instances are not.
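To make the first two anomaly types concrete, here is a minimal sketch in Python. It is not from Misra's talk: the modified z-score (median and median absolute deviation) is swapped in as one common way to flag outliers, and the function names, the 3.5 threshold and the sample transactions are all hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of point anomalies using the modified z-score."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:  # no spread at all, so nothing can be ranked as unusual
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


def contextual_outliers(records, threshold=3.5):
    """records is a list of (context, value); flag values unusual for their context."""
    by_context = {}
    for idx, (context, value) in enumerate(records):
        by_context.setdefault(context, []).append((idx, value))
    flagged = []
    for items in by_context.values():
        values = [v for _, v in items]
        flagged.extend(items[i][0] for i in mad_outliers(values, threshold))
    return sorted(flagged)


# Hypothetical card amounts: one value stands out on its own (point anomaly).
amounts = [42.0, 38.5, 51.0, 45.2, 40.1, 47.9, 2500.0, 44.3]
point_anomalies = mad_outliers(amounts)

# Hypothetical (time of day, amount) pairs: 210 is ordinary by day
# but unusual at night (contextual anomaly).
records = [
    ("day", 200), ("day", 215), ("day", 190), ("day", 205), ("day", 198),
    ("night", 20), ("night", 25), ("night", 18), ("night", 22), ("night", 210),
]
contextual_anomalies = contextual_outliers(records)
```

In this sample data, the night-time spend of 210 is flagged only when it is compared with other night-time transactions. Running `mad_outliers` over all ten amounts without the context does not flag it.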
Misra outlined the key steps in using anomaly detection as below:
- Selecting and understanding the right use case: Misra shared that what counts as an anomaly differs with every use case. For instance, fluctuations may not be an anomaly in stock prices, but may signal a problem in card transactions. Therefore, understanding the use case is essential.
- Getting the data: Having as much data as possible from various sources is the key. It will help build accurate models.
- Exploring, cleaning, enriching the data: As with every machine learning model, this is a key step. It is essential to detect and clean human errors to avoid false positives.
- Becoming predictive: This involves using supervised and unsupervised anomaly detection to predict anomalies.
- Deploying and iterating: This is the most important step, which involves deploying the model and continuing to correct the errors from the previous steps to get correct results.
He further shared that choosing the right algorithm is equally important. For instance, supervised methods require a labelled dataset that contains both normal and abnormal data points. They may involve anomaly detection using neural networks, Bayesian networks and K-nearest neighbours, among others. “Supervised methods provide a better rate of anomaly detection, thanks to their ability to encode any interdependency between variables,” said Misra.
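As an illustration of the supervised route, the sketch below implements K-nearest neighbours, one of the methods named above, in plain Python. The labelled transactions, the two features (amount and distance from home, both assumed to be pre-scaled to the 0 to 1 range) and the function name `knn_predict` are hypothetical, not taken from the talk.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest labelled neighbours.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled history: (scaled amount, scaled distance from home).
train = [
    ((0.10, 0.05), "normal"),
    ((0.15, 0.10), "normal"),
    ((0.20, 0.08), "normal"),
    ((0.12, 0.20), "normal"),
    ((0.90, 0.85), "fraud"),
    ((0.95, 0.70), "fraud"),
    ((0.80, 0.95), "fraud"),
]

large_far_away = knn_predict(train, (0.88, 0.80))
small_nearby = knn_predict(train, (0.14, 0.09))
```

The need for labels is the practical constraint here: the classifier can only recognise kinds of fraud that already appear, correctly labelled, in the training data.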
On the other hand, unsupervised methods of anomaly detection do not depend on any manually labelled training data. “These methods are based on the statistical assumption that most of the inflowing data are normal and only a minor percentage would be anomalous data,” he said. Some of the unsupervised methods include the K-means method, autoencoders and hypothesis-based analysis.
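The unsupervised assumption, that most data are normal and only a small share is anomalous, can be sketched with K-means. The code below is a simplified, assumption-laden illustration rather than Misra's method: it clusters unlabelled points and treats any cluster holding less than a chosen share of the data as anomalous. The function names, the farthest-point initialisation, the 15% cut-off and the sample points are all hypothetical choices.

```python
import math


def kmeans_labels(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster empties out
                centroids[i] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


def small_cluster_anomalies(points, k=3, min_share=0.15):
    """Flag points that fall in clusters holding under min_share of the data."""
    labels = kmeans_labels(points, k)
    sizes = {lab: labels.count(lab) for lab in set(labels)}
    return [
        p for p, lab in zip(points, labels)
        if sizes[lab] / len(points) < min_share
    ]


# Hypothetical unlabelled data: two dense groups and one stray point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2), (4.9, 4.8),
    (9.0, 0.5),
]
anomalies = small_cluster_anomalies(points)
```

No labels are needed, but the result depends on the choice of `k` and the cut-off, so the flagged points are candidates for review rather than confirmed fraud.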
Behavioural analytics, the other technique Misra highlighted, involves identifying users and their payment transaction trends based on their past behaviour. It takes into account the type of payments made, devices used and IP addresses, among others, to identify the behaviour of a user. It focuses specifically on a user’s interaction with the device and the site, and on how the data is used. “It also takes into account the response patterns, typical time to insert the password, the way a user creates a password, among other trends to detect fraud,” added Misra.
He elaborated on some of the attributes as below:
- Physical attributes: Gauges hand-eye coordination and how users hold the device, scroll, swipe and press.
- Cognitive attributes: Measures interaction preferences and input habits such as the way the user toggles between fields.
- Response pattern: Inserts subtle tests such as small mouse deviations to track responses.
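A rough sketch of how such behavioural signals might be combined is shown below. It compares a login event against a stored user profile covering devices, IP addresses and typical password entry time, which are the signals mentioned above. The profile layout, the field names, the function `behaviour_risk` and the three-standard-deviation limit are assumptions made for illustration only.

```python
from statistics import mean, stdev


def behaviour_risk(profile, event, z_limit=3.0):
    """Return the behavioural signals in an event that deviate from the user's history."""
    signals = []
    if event["device"] not in profile["devices"]:
        signals.append("new_device")
    if event["ip"] not in profile["ips"]:
        signals.append("new_ip")
    # Compare password entry time with the user's own typical timing.
    times = profile["password_entry_secs"]
    spread = stdev(times)
    deviation = abs(event["password_entry_secs"] - mean(times))
    if spread > 0 and deviation / spread > z_limit:
        signals.append("unusual_typing_time")
    return signals


# Hypothetical profile built from a user's past sessions.
profile = {
    "devices": {"phone-a1", "laptop-b2"},
    "ips": {"192.0.2.10"},
    "password_entry_secs": [4.1, 3.8, 4.5, 4.0, 4.2],
}

usual_login = {"device": "phone-a1", "ip": "192.0.2.10", "password_entry_secs": 4.3}
odd_login = {"device": "tablet-z9", "ip": "203.0.113.7", "password_entry_secs": 0.6}

usual_signals = behaviour_risk(profile, usual_login)
odd_signals = behaviour_risk(profile, odd_login)
```

Each signal is weak on its own, since users do buy new devices and travel. A real system would more likely weigh the signals together than block on any single one.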
These trends and patterns are then used extensively to design deep learning models. “Having said that, the fraud in payments is ever-evolving and to keep a check on these, there is a need to constantly evolve on deep learning models with time,” said Misra in conclusion.