Gojek’s Sruthi Sekar on ML-driven fraud prevention systems

Supervised models have better predictive power than unsupervised models in fraud detection.

Sruthi Sekar, data scientist at Gojek, spoke about ML-driven fraud prevention systems at Analytics India Magazine’s Women In AI Conference: The Rising 2022. She has been working with the GoPay Data Science team for four years and specialises in the fraud and risk domain.

Data science models play an important role in instilling a sense of security in e-wallet transactions. In her session, Sruthi talked about the lessons learned from building an ML model to predict fraudulent logins aimed at hijacking wallets. She also touched upon handling insufficient labels, data drift and selecting a sustainable model for monitoring metrics.

From creating an account to the final transaction, every stage in the user’s e-wallet journey is susceptible to fraud. To solve this challenge, ML models need to proactively monitor different stages of a customer journey.




Supervised vs unsupervised learning

Simple rule-based systems fall short because too many factors are involved in a transaction. Tackling different types of fraud in real time calls for a more diverse and complex system. Meanwhile, fraudsters are upping their game, and it has become increasingly difficult to tell a genuine account from a fake one. In her experience, Sruthi said, supervised models have better predictive power than unsupervised models in fraud detection. However, supervised models rely heavily on data labelling, which is time-consuming and costly, while unsupervised models are hampered by data imbalance and by the overlap between genuine and fraudulent cases, which makes the two hard to tell apart. She said active learning and semi-supervised learning are the best approaches for fraud detection.
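To make the active learning idea concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. It is not taken from the talk: the function name, the login IDs and the scores are hypothetical, and the sketch assumes a model that already outputs a fraud probability for each unlabelled login. The cases the model is least sure about (scores closest to 0.5) are sent to human reviewers first, so that a limited labelling budget goes where a new label is likely to be most informative.

```python
from typing import List, Tuple


def select_for_review(scores: List[Tuple[str, float]], budget: int) -> List[str]:
    """Return the IDs of the `budget` cases whose fraud score is closest to 0.5.

    `scores` holds (case_id, fraud_probability) pairs from some existing model.
    """
    # Smaller distance from 0.5 means the model is less certain about the case.
    ranked = sorted(scores, key=lambda item: abs(item[1] - 0.5))
    return [case_id for case_id, _ in ranked[:budget]]


# Hypothetical unlabelled logins with model-estimated fraud probabilities.
scored_logins = [
    ("login_a", 0.02),
    ("login_b", 0.48),
    ("login_c", 0.97),
    ("login_d", 0.61),
    ("login_e", 0.35),
]

to_label = select_for_review(scored_logins, budget=2)
print(to_label)  # ['login_b', 'login_d']
```

Once reviewers label the selected cases, they would be added to the training set and the model refitted, and the loop repeats.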

Modelling

According to Sruthi, data scientists should put a lot of emphasis on feature generation. To generate useful features, the data scientist must understand the fraudster’s modus operandi. She also suggested using graph-based features, location-based features, sequence information and insights from human validators. For model selection, she said tree-based models work best.
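As an illustration of what a location-based feature might look like, the sketch below computes the implied travel speed between consecutive logins of one account. This is an assumed example, not a feature described in the talk: the function names and the coordinates are hypothetical. An implausibly high speed between two logins is the kind of signal that could then be fed to a tree-based model alongside other features.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def travel_speeds_kmh(logins: List[Tuple[float, float, float]]) -> List[float]:
    """Implied speed between consecutive logins.

    `logins` holds (timestamp_seconds, lat, lon) tuples sorted by timestamp.
    """
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(logins, logins[1:]):
        hours = (t1 - t0) / 3600.0
        distance = haversine_km(lat0, lon0, lat1, lon1)
        if hours > 0:
            speeds.append(distance / hours)
        else:
            # Same timestamp: infinite speed if the location changed, else zero.
            speeds.append(float("inf") if distance > 0 else 0.0)
    return speeds


# Hypothetical account: two logins 30 minutes apart, several hundred km apart.
logins = [
    (0.0, -6.20, 106.80),
    (1800.0, -7.25, 112.75),
]
speeds = travel_speeds_kmh(logins)
print(speeds)
```

The maximum or most recent speed per account could serve as a single numeric feature, which tree-based models can split on without any scaling.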

Model decay

Model decay happens when the relationship between the input features and the class variable changes gradually over time. There are two types of model decay:

Concept drift: The relationship between the independent and the target variables shifts.

Covariate shift: The distribution of the independent variables changes.
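One way to watch for covariate shift is to compare the distribution of a feature at training time with its distribution in live traffic. The sketch below uses the Population Stability Index (PSI), a widely used drift statistic. The talk does not say which measure Gojek uses, so PSI is an assumption here, and the function names, bin edges and sample values are hypothetical. The code assumes both samples are non-empty and that the bin edges are sorted in ascending order.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def bin_shares(values: Sequence[float], edges: Sequence[float], eps: float) -> List[float]:
    """Share of `values` falling in each bin defined by ascending `edges`."""
    counts = [0] * (len(edges) + 1)
    for value in values:
        counts[bisect_right(edges, value)] += 1
    # Floor each share at eps so the logarithm below is always defined.
    return [max(count / len(values), eps) for count in counts]


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    edges: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a live sample of one feature."""
    ref = bin_shares(reference, edges, eps)
    cur = bin_shares(live, edges, eps)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical transaction amounts at training time and in production.
training_amounts = [10, 20, 30, 40, 50, 60, 70, 80]
live_amounts = [60, 70, 80, 90, 100, 110, 120, 130]
edges = [25, 50, 75]

stable = population_stability_index(training_amounts, training_amounts, edges)
drifted = population_stability_index(training_amounts, live_amounts, edges)
print(stable, drifted)
```

A PSI of zero means the binned distributions are identical, and larger values indicate a larger shift. Note that a check like this only covers covariate shift. Detecting concept drift requires labelled outcomes, for example by tracking model precision or recall over time.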


She suggested periodic retraining, incremental learning and feature dropping to counteract model decay.

Sruthi said building a robust, error-free fraud detection solution is like finding a needle in a haystack. However, continuous testing, experimentation, diagnostics and data validation setups can result in high-performing ML models.

Sruthi took questions from the audience after her session. “I’m a data scientist and I have worked on a model for a period of time. Later, when I leave the project and someone replaces me to work on the same model, does Gojek have a record of the variations introduced to the model during my time that the new data scientist can work with?”
To this, Sruthi responded: “Documentation is the way to go! The immediate point is that one must ensure that each modification of the model is documented. Usually, some of the problems that develop in a model don’t go away within an allotted period of time. Tackling some problems might need constant and continuous modifications. To prevent confusion, one must document the types of tests done and the features tried during one’s tenure. Documentation is the only way one can properly hand over the role to the next data scientist.”

Kartik Wali
A writer by passion, Kartik strives to get a deep understanding of AI, Data analytics and its implementation on all walks of life. As a Senior Technology Journalist, Kartik looks forward to writing about the latest technological trends that transform the way of life!
