
Why Should Graph Neural Networks Be Calibrated?


Graph Neural Networks (GNNs) are an effective framework for representation learning of graphs.

Large-scale knowledge graphs are known for their ability to support NLP applications like semantic search and dialogue generation, while companies like Pinterest have already opted for a graph-based system (Pixie) for real-time, high-performance recommendation tasks.

GNNs follow a neighborhood aggregation scheme, where the representation vector of a node is computed by recursively aggregating and transforming the representation vectors of its neighboring nodes.

Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations.

Given these limitations, it is important to investigate how well calibrated graph neural networks actually are, since they are increasingly relied upon for classification tasks.

So, in order to test the efficacy of GNNs in this respect, a few researchers experimented with existing calibration techniques and reported their findings in a recently published paper.

A machine learning model with good calibration produces confidence scores that match its actual accuracy. For instance, if the softmax of a convolutional neural network assigns 90% confidence to its image-class predictions, then about 90% of those predictions should turn out to be correct.

Avoiding Misfires And Mishits

The evaluation was done using two tools: the calibration (or reliability) diagram and the expected calibration error (ECE) metric.

In the calibration diagram, the model's predictions are grouped into bins according to their confidence values. Then, for each bin, a point is drawn whose x-coordinate is the average confidence of the predictions in the bin and whose y-coordinate is their average accuracy; a perfectly calibrated model traces the diagonal.

The expected calibration error (ECE), meanwhile, summarizes the calibration error in a single number: it is the average of the gaps between confidence and accuracy in the reliability diagram, weighted by the number of predictions in each bin.
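To make these two tools concrete, here is a minimal NumPy sketch (not from the paper) that performs the reliability-diagram binning and computes ECE; the ten-bin setting is an assumed, common default.

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    """ECE: gaps between confidence and accuracy, averaged over bins
    and weighted by the number of predictions in each bin."""
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    n, ece = len(confidences), 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.sum() == 0:
            continue
        avg_conf = confidences[in_bin].mean()                     # x-axis of the diagram
        avg_acc = (predictions[in_bin] == labels[in_bin]).mean()  # y-axis of the diagram
        ece += (in_bin.sum() / n) * abs(avg_acc - avg_conf)
    return ece
```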

First, the graph neural networks were trained on data obtained from social networking platforms like Friendster and Facebook, as well as on the Amazon and Pubmed datasets.

Popular GNNs like Graph Convolutional Networks, Graph Attention Networks and Graph Isomorphism Networks were trained using the PyTorch Geometric library.
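The paper itself is not a code walkthrough, but as a rough illustration, training a two-layer GCN on the Pubmed citation dataset with PyTorch Geometric looks like the sketch below; the architecture and hyperparameters here are illustrative assumptions, not the researchers' exact setup.

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

dataset = Planetoid(root="data/Pubmed", name="PubMed")
data = dataset[0]  # a single graph with node features, edges and masks

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_node_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)
        return self.conv2(x, edge_index)  # per-node class logits

model = GCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
```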

Illustration of isotonic regression, via the scikit-learn docs

Since evaluating the existing methodologies is the prime concern here, the researchers considered calibration improvement techniques like MC Dropout, histogram binning, isotonic regression and temperature scaling.
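Of these, temperature scaling is the simplest: a single scalar T is fitted on held-out validation logits by minimizing the negative log-likelihood, and the calibrated probabilities are softmax(logits / T). Below is a minimal PyTorch sketch, an assumed implementation rather than the paper's code.

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels):
    """Fit a scalar temperature T > 0 on validation logits by minimizing NLL."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log(T) so T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# Dividing logits by T preserves the argmax, so accuracy is unchanged;
# only the confidence of the predictions is rescaled.
```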

In many real-world datasets, it is common to face an imbalanced class distribution, which can make it challenging to learn meaningful models. In the FRIENDSTER dataset, the researchers observed a severe class imbalance: more than 60% of the labeled nodes belong to the most prevalent class, while less than 1% belong to the least prevalent one.

This led to predictions collapsing towards a single (most prevalent) class, with all GNNs assigning at least 95% of examples to the most prevalent class (and some hyperparameter configurations assigning 100% of examples to a single class).

Key Findings

To address the imbalance, the less prevalent classes were upweighted so that every class contributes equally to the training loss, as sketched below.
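A common way to implement this (an assumed scheme; the paper may weight differently) is to pass inverse-frequency class weights to the cross-entropy loss:

```python
import torch
import torch.nn.functional as F

def class_balanced_weights(labels, num_classes):
    """Inverse-frequency weights so every class contributes equally to the loss."""
    counts = torch.bincount(labels, minlength=num_classes).float().clamp(min=1)
    return counts.sum() / (num_classes * counts)

# Usage in a training step, given node logits and training labels:
# weights = class_balanced_weights(train_labels, num_classes)
# loss = F.cross_entropy(logits, train_labels, weight=weights)
```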

When the test and train distributions are dissimilar, the conclusions drawn from the evaluation can be misleading.

The results show that for easier tasks all GNNs are reasonably calibrated, while for harder tasks, such as on the FRIENDSTER dataset, GNNs can be miscalibrated and existing calibration techniques are unable to fix them. Moreover, using the proper test distribution during evaluation has an impact on both accuracy and calibration.

Along with finding new metrics to calibrate GNNs, adopting reinforcement learning for graph-based reasoning and model search could open up other interesting avenues for graph-based machine learning systems.

Read more about the miscalibration of GNNs here

PS: The story was written using a keyboard.
Ram Sagar

I have a master's degree in Robotics and I write about machine learning advancements.