Using Knowledge Distillation On Augmented Graph Convolutional Networks To Detect Money Laundering In Bitcoin Transactions

This is one of the top voted thesis papers from upGrad's online working professional programs in partnership with one of the UK's leading universities.


Cryptocurrency is more than just digital money. It is decentralized digital money built on blockchain technology, i.e., no central authority governs it. Instead, the users themselves take on the tasks involved in maintaining a cryptocurrency and its value through digital means.

Bitcoin, the most popular cryptocurrency today, was also the first. It was introduced in 2008 by an anonymous person or group using the name Satoshi Nakamoto. As a decentralized and digitized form of currency, Bitcoin has many positives as an investment, but it also has its fair share of drawbacks.



Anonymity for all users in a decentralized system also means anonymity for criminals, which encourages money laundering, theft, and other malicious activities. Governments and institutions quickly realized this and put forth Anti-Money Laundering (AML) regulations that apply to everyone. However, these policies have shortcomings, too.

So, can technology be the ultimate solution to detect and mitigate money laundering for a currency it birthed? Let’s find out in this research project.

Gaps posed real challenges

To find solutions, the project needed to overcome some key challenges that the field of fraud detection has faced so far:

  1. Machine Learning (ML) could be an answer to money laundering problems: by studying patterns in data, it can provide adaptable techniques to detect fraud. A good amount of work has already gone into building such ML models. However, almost all of these models reflect real-world scenarios poorly because they use synthetic datasets. Moreover, the current literature says little about the execution load of the models.
  2. As researched by Weber et al. (2019), one model made significant progress by combining Random Forest with EvolveGCN. But there are practical limitations to applying it in the real world: it demands a lot of storage and computation, which makes it hard to deploy widely.
  3. An extensive examination of previous work showed that existing research and models assume incoming data will always be organized and static. This is far from the truth in a real-world scenario, where data is in a constant state of flux because of the many transactions and interactions in cryptocurrency.
  4. Fraud detection in cryptocurrency is at a nascent stage because the industry itself is only about ten years old. Researchers and data scientists worldwide are only just beginning to understand the potential it holds and the problems it creates. Therefore, technology-aided fraud detection is not yet advanced. Investigative agencies can take months to trace one transaction and determine its authenticity.
  5. A shortcoming of AML regulations is the cost they impose on users. This cost falls hardest on the poor and low-income groups, as well as refugees and immigrants, who have limited means of earning money. Add the cost of proving that they are honest transactors, and they are further discouraged from taking part in economic activities.

Building the model for Bitcoin fraud detection

The above challenges pushed this project to draw inspiration from multiple ML techniques and models. Eventually, it arrived at a new model that builds on the abilities of the previous models and blends them for better overall performance. The final model is compact and aims to match or exceed the benchmarks set.


A freely available dataset, courtesy of the collaboration between Elliptic Co. and Weber et al. (2019), was used. It is one of the most extensively labelled cryptocurrency datasets in the industry. The dataset represents Bitcoin transaction data as a graph and stores transactional entities as nodes. Every node has 166 features and one of three labels: licit, illicit, or unknown. 21% of the data is labelled licit and 2% illicit, while the rest is unknown.
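The label shares above imply a strong class imbalance, which a little arithmetic makes concrete. The sketch below uses only the percentages quoted in the text; because those figures are rounded, the derived values are approximate, and the variable names are illustrative.

```python
# Label shares as quoted in the text (percent of all nodes, rounded).
licit_pct = 21
illicit_pct = 2

# Everything that is neither licit nor illicit is labelled "unknown".
unknown_pct = 100 - licit_pct - illicit_pct

# Share of illicit nodes among the labelled nodes only.
labelled_pct = licit_pct + illicit_pct
illicit_share_of_labelled = illicit_pct / labelled_pct

# Number of licit nodes for every illicit node.
licit_per_illicit = licit_pct / illicit_pct

print(f"unknown: {unknown_pct}%")
print(f"illicit share of labelled nodes: {illicit_share_of_labelled:.1%}")
print(f"licit nodes per illicit node: {licit_per_illicit:.1f}")
```

With roughly ten licit nodes for every illicit one, plain accuracy would be misleading, which is one reason per-class precision and recall matter for this dataset.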


The project set out to identify the latest AML techniques for cryptocurrency.

First, it examined the outcomes of applying the node attributes in the dataset. Then, it implemented a Graph Neural Network with classifier algorithms to improve learning and classification capabilities. It then used Knowledge Distillation to compress the model and reduce its memory footprint. Finally, the project captured the performance of the proposed model and compared it with the benchmarks.
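To make the Knowledge Distillation step more concrete, here is a minimal sketch of a commonly used soft-target distillation loss, in which a small student model is trained to match the softened outputs of a larger teacher. The article does not state the exact loss, temperature, or weighting used in the thesis, so the function names and the `temperature` and `alpha` values below are assumptions for illustration only.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term with ordinary hard-label cross-entropy.

    temperature and alpha are hypothetical defaults, not values from the thesis.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student_soft) if t > 0)
    # Standard cross-entropy against the true label at temperature 1.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[true_label])
    # The temperature**2 factor keeps the soft-target term on a comparable scale.
    return alpha * (temperature ** 2) * kl + (1 - alpha) * ce

# Toy logits for two classes: index 0 = licit, index 1 = illicit.
teacher = [2.0, 0.0]
agreeing_student = [2.0, 0.0]
disagreeing_student = [0.0, 2.0]

print(distillation_loss(agreeing_student, teacher, true_label=0))
print(distillation_loss(disagreeing_student, teacher, true_label=0))
```

A student that agrees with the teacher gets a lower loss than one that disagrees, which is the signal that lets a smaller model inherit the behaviour of a larger one.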

The process went as follows:

  1. Literature Review: 

The literature review provided the background needed to understand cryptocurrency and fraud detection. It revealed gaps in technology as well as room for improvement. It also formed the basis of the research, from describing the problem statement to setting down objectives. 

  2. Proposed Implementation: 

The algorithms found during the literature review were implemented in a specific order: achieving a baseline, applying Random Forest and Knowledge Distillation, and then assessing the model against the baseline. 

  3. Precision, Recall, and Weighted F score: 

Precision and recall were essential metrics for checking the efficacy of the output. Most existing models work with synthetic datasets that don’t reflect real-world scenarios, so they don’t provide accurate results. The weighted F score also helped provide the most unbiased estimate of the new model’s performance.

  4. Model Execution Time: 

This was the time taken to label a given data point. The metric was also checked with the proposed Knowledge Distillation technique applied.

  5. Storage Space: 

To understand the computational intensity of the model, the performance benchmarks used were Model Storage Space and CPU usage. Knowledge Distillation was applied to reduce both. 
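The evaluation metrics listed above can be sketched in a few lines. The example below computes per-class precision, recall, and F score, a support-weighted F score, and the mean time taken to label one data point. The toy labels, the function names, and the stand-in `predict` callable are assumptions for illustration; they are not the thesis data or code.

```python
import time

def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def weighted_f1(y_true, y_pred):
    """Average the per-class F scores, weighted by each class's share of y_true."""
    total = 0.0
    for label in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == label)
        total += (support / len(y_true)) * per_class_scores(y_true, y_pred, label)[2]
    return total

def mean_seconds_per_item(predict, items):
    """Mean wall-clock time that predict() takes to label one item."""
    start = time.perf_counter()
    for item in items:
        predict(item)
    return (time.perf_counter() - start) / len(items)

# Toy example: 8 licit and 2 illicit transactions (made-up labels).
y_true = ["licit"] * 8 + ["illicit"] * 2
y_pred = ["licit"] * 7 + ["illicit"] + ["illicit", "licit"]

precision, recall, f1 = per_class_scores(y_true, y_pred, "illicit")
print(f"illicit precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"weighted F score={weighted_f1(y_true, y_pred):.2f}")

# Stand-in classifier used only to demonstrate the timing helper.
print(mean_seconds_per_item(lambda item: "licit", list(range(1000))))
```

Weighting by support keeps the majority class from being ignored, while the separate illicit-class scores show how well the rare class is actually detected.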


The idea was to implement the future work as proposed by Weber et al. (2019).

Node features were built using EvolveGCN and Random Forest to create embeddings. Then, a variant of decision forest – Logistic Regression – was used as the output layer. This method was considered the best way to integrate both EvolveGCN and Random Forest algorithms.
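For readers unfamiliar with graph convolutions, the sketch below shows a single, static graph convolution step that turns node features into embeddings by mixing each node's features with those of its neighbours. It is a simplified illustration, not the thesis implementation: EvolveGCN additionally evolves the layer weights over time with a recurrent network, which is omitted here, and the toy graph, features, and weights are made up. The graph is assumed to be undirected.

```python
import math

def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in transformed]

# Toy graph: three transactions in a chain, 0 - 1 - 2 (made-up data).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],   # two features per node (the real dataset has 166)
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[0.5, -0.5],   # hypothetical 2x2 weight matrix
           [0.25, 0.75]]

embeddings = gcn_layer(adj, features, weights)
for node, row in enumerate(embeddings):
    print(node, row)
```

The resulting rows are the kind of node embeddings that a downstream classifier can consume as its input.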

Outcomes of the Fraud Detection Model

The outcomes of the proposed models were:

| Model | Type | Outcomes |
|---|---|---|
| EvolveGCN-O | Baseline | Good classification performance; decent performance time and use of disk space; fairly complex |
| d-EvolveGCN | Distilled | Increased classification performance; 10% increase in illicit metrics and 8% increase in MicroAvg metrics performance; slightly lower execution time; less complex and lower disk space |
| dNDF | Baseline | Similar performance to EvolveGCN-O; better precision score; worse performance time and 337% increased execution time; 23% higher disk space |
| d-dNDF | Distilled | Much better classification performance than dNDF; at par performance with d-EvolveGCN; peak CPU memory was the best of all |

Where can this Fraud Detection model be used?

The project successfully used existing models to create a robust and accurate algorithm and compressed it for widespread application.

The proposed model is designed to unmask fraudulent transactions in the ever-growing cryptocurrency industry. In addition, it will help discourage individuals with malicious intent from bringing a decentralized currency system into disrepute.

Everyone deserves equal opportunities. For people to leverage the benefits of cryptocurrency, they need to be included in new financial systems without being held back by rules and regulations or by the fear of being cheated. This model brings governments one step closer to that goal, using technology to minimize physical and monetary restrictions.

The model could also be used for claim and default prediction, churn and conversion prediction, spam and anomaly detection, intrusion detection, and more.


Suraj Krishnamoorthy is an upGrad learner. As a part of his program, he developed the thesis report titled "Using Knowledge Distillation On Augmented Graph Convolutional Networks To Detect Money Laundering In Bitcoin Transactions".

