Gojek’s Sruthi Sekar on ML-driven fraud prevention systems

Supervised models have better predictive power than unsupervised models in fraud detection.

Sruthi Sekar, a data scientist at Gojek, spoke about ML-driven fraud prevention systems at Analytics India Magazine’s Women in AI conference, The Rising 2022. She has worked with the GoPay Data Science team for four years and specialises in the fraud and risk domain.

Data science models play an important role in building a sense of security around e-wallet transactions. In her session, Sruthi talked about the lessons learned from building an ML model to predict fraudulent logins aimed at hijacking wallets. She also touched upon handling insufficient labels, dealing with data drift and selecting a sustainable model for monitoring metrics.

From creating an account to the final transaction, every stage in the user’s e-wallet journey is susceptible to fraud. To solve this challenge, ML models need to proactively monitor different stages of a customer journey.

Supervised vs unsupervised learning

Simple rule-based systems don’t work because too many factors are involved in the transaction process. A diverse and complex system is needed to tackle different types of problems in real time. Meanwhile, fraudsters are upping their game, and it has become increasingly difficult to tell a genuine account from a fake one. In her experience, Sruthi said, supervised models have better predictive power than unsupervised models in fraud detection. Supervised models rely heavily on data labelling, which is time-consuming and costly, while the performance of unsupervised models is hampered by data imbalance and the overlapping nature of the data, which makes it difficult for the model to differentiate between genuine and fraudulent cases. She said active learning and semi-supervised learning are the best approaches for fraud detection.
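One common form of active learning is uncertainty sampling, where the events the current model is least sure about are sent to human reviewers first. The snippet below is a minimal sketch of that idea and not a description of Gojek’s system: the function name `select_for_labelling`, the `budget` parameter and the sample scores are all hypothetical.

```python
def select_for_labelling(scores, budget):
    """Return indices of the `budget` events whose fraud score is closest to 0.5.

    `scores` are model-estimated fraud probabilities in [0, 1]. A score near
    0.5 means the model cannot tell genuine from fraudulent, so a human label
    for that event is assumed to be the most informative.
    """
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i] - 0.5))
    return ranked[:budget]


# Hypothetical scores for six unlabelled login events.
scores = [0.02, 0.48, 0.97, 0.55, 0.10, 0.61]

# With a labelling budget of three, the most ambiguous events are queued first.
review_queue = select_for_labelling(scores, budget=3)
print(review_queue)
```

The labels returned by reviewers would then be added to the training set before the next training round, which keeps labelling cost focused on the cases where the model is weakest.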

Modelling

According to Sruthi, data scientists should put a lot of emphasis on feature generation. To generate good features, the data scientist must understand the fraudster’s modus operandi. She also suggested using graph-based features, location-based features and sequence information, along with insights from human validators. For model selection, she said tree-based models work best.
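As an illustration of how location and sequence information can be combined into a single feature, the sketch below computes the travel speed implied by two consecutive logins. This particular feature is an assumption chosen for illustration and was not named in the talk; the `(timestamp, latitude, longitude)` login format and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kmh(prev_login, curr_login):
    """Speed needed to travel between two consecutive logins of one account.

    Each login is a (timestamp_seconds, latitude, longitude) tuple. An
    implausibly high value hints that the two logins came from different people.
    """
    t0, lat0, lon0 = prev_login
    t1, lat1, lon1 = curr_login
    hours = (t1 - t0) / 3600.0
    if hours <= 0:
        raise ValueError("logins must be in strictly increasing time order")
    return haversine_km(lat0, lon0, lat1, lon1) / hours


# Two logins 30 minutes apart from cities that are far from each other.
speed = implied_speed_kmh((0, -6.2, 106.8), (1800, 12.97, 77.59))
print(f"implied speed: {speed:.0f} km/h")
```

A value like this would be fed to the model as one column among many; a tree-based model can then learn its own threshold for what counts as implausible.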

Model decay

Model decay happens when the relationship between the input features and the class variable changes gradually. There are two types of model decay:

Concept drift: The relationship between the independent and target variables shifts.

Covariate shift: The distribution of the independent variables changes.
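Covariate shift can be monitored without any labels by comparing a feature’s distribution at training time with its distribution in recent traffic. The sketch below uses the Population Stability Index (PSI) for this; PSI is one common choice and is an assumption here, not a method attributed to the talk. The function name, bin count and sample data are hypothetical.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index of `current` relative to `reference`.

    Bins are equal-width over the range of `reference`; values of `current`
    outside that range are clipped into the first or last bin. Empty bins
    are floored at `eps` so the logarithm stays defined.
    """
    lo, hi = min(reference), max(reference)
    width = ((hi - lo) / n_bins) or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Hypothetical feature values: training-time sample vs. a shifted recent sample.
training_values = [i / 100 for i in range(100)]
recent_values = [0.5 + i / 200 for i in range(100)]

print(psi(training_values, training_values))  # identical samples give 0.0
print(psi(training_values, recent_values))    # shifted sample gives a large value
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, though the right threshold depends on the feature and should be validated on historical data. Concept drift, by contrast, cannot be seen from inputs alone and needs labelled outcomes to detect.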

She suggested periodic retraining, incremental learning and feature dropping to counteract model decay.

Sruthi said that building a robust, error-free fraud detection solution is akin to finding a needle in a haystack. However, continuous testing, experimentation, diagnostics and data validation setups can lead to high-performing ML models.

Sruthi took questions from the audience after her session. One attendee asked: “I’m a data scientist and I have worked on a model for a period of time. Later, when I leave the project and someone replaces me to work on the same model, does GO-JEK have a record of the variations introduced to the model during my time that the new data scientist can work with?”

To this, Sruthi responded: “Documentation is the way to go! The immediate point is that one must ensure that each modification of the model is documented. Usually, some of the problems that develop in the model don’t go away in an allotted period of time. Tackling some problems might need constant and continuous modifications. To prevent confusion, one must document the types of tests done and the features tried during one’s tenure. Documentation is the only way one can properly transition the role to the next data scientist.”

Kartik Wali
A writer by passion, Kartik strives to get a deep understanding of AI, Data analytics and its implementation on all walks of life. As a Senior Technology Journalist, Kartik looks forward to writing about the latest technological trends that transform the way of life!
