Introduction To Keras Graph Convolutional Neural Network (KGCNN) & Ragged Tensor

Graph neural networks are a class of neural network architectures that has recently become more common in research publications and real-world applications. Since graph neural networks require modified convolution and pooling operators, many Python packages such as PyTorch Geometric, StellarGraph, and DGL have emerged for working with graphs. Keras Graph Convolutional Neural Network (kgcnn) achieves a straightforward and flexible integration of graph operations into the TensorFlow-Keras framework using RaggedTensors. It contains a set of TensorFlow-Keras layer classes that can be used to build graph convolution models, and it also includes standard benchmark graph datasets such as Cora, MUTAG, and QM9.

The main problem with handling graphs is their variable size, which makes graph data hard to arrange in tensors. For example, placing small graphs of different sizes in mini-batches poses a problem for fixed-size tensors. One way to solve this is zero-padding with masking, or composite tensors. Another is a disjoint representation, which joins the small graphs into a single large graph without connecting the individual subgraphs.
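To see the cost of the padding approach, here is a minimal sketch in plain NumPy (purely illustrative, not kgcnn code): two adjacency matrices of different sizes are forced into one fixed-size tensor, which wastes entries and requires an extra mask.

 import numpy as np

 # Two small graphs as adjacency matrices with 2 and 3 nodes.
 A1 = np.array([[0, 1],
                [1, 0]])
 A2 = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])

 # Zero-pad both graphs to the largest node count N.
 N = max(len(A1), len(A2))
 batch = np.zeros((2, N, N), dtype=int)
 batch[0, :len(A1), :len(A1)] = A1
 batch[1, :len(A2), :len(A2)] = A2

 # A node mask marks which entries are real and which are padding.
 mask = np.array([[1, 1, 0],
                  [1, 1, 1]], dtype=bool)
 print(batch.shape)  # (2, 3, 3): the smaller graph carries an unused row and column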


Graphs are usually represented by an adjacency matrix A of shape ([batch], N, N), with A_ij = 1 if the graph has an edge between nodes i and j and A_ij = 0 otherwise. When represented using tensors, graphs are stored using:

  • A node feature list n of shape ([batch], N, F)
  • An edge index list m of shape ([batch], M, 2), giving the incoming and outgoing node of each edge
  • A corresponding edge feature list e of shape ([batch], M, F)

Here, N denotes the number of nodes, M the number of edges, and F the feature dimension of the nodes or edges, respectively.
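As a concrete illustration (a hypothetical three-node triangle graph, not a kgcnn dataset), the three lists for a single graph could look like this:

 import numpy as np

 # One undirected triangle: N = 3 nodes, M = 6 directed edges.
 n = np.array([[1.0], [6.0], [8.0]])      # node features, shape (N, F) = (3, 1)
 m = np.array([[0, 1], [1, 0], [1, 2],
               [2, 1], [0, 2], [2, 0]])   # edge indices, shape (M, 2) = (6, 2)
 e = np.ones((6, 1))                      # edge features, shape (M, F) = (6, 1)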

RaggedTensors are the TensorFlow equivalent of nested variable-length lists. With RaggedTensors, graphs can be represented using just the node feature and edge index lists, with a flexible tensor dimension that accommodates different numbers of nodes and edges. For example, a ragged node tensor of shape ([batch], None, F) can hold a flexible graph size in the second dimension.
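For example, node features for two graphs with two and three nodes can share one ragged tensor (a minimal sketch using TensorFlow's public API):

 import tensorflow as tf

 # Node features for two graphs with 2 and 3 nodes (F = 1).
 nodes = tf.ragged.constant([[[1.0], [6.0]],
                             [[1.0], [6.0], [8.0]]], ragged_rank=1)
 print(nodes.shape)  # (2, None, 1): the node dimension stays flexible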


A ragged tensor should not be confused with a sparse tensor; it is a dense tensor with an irregular shape. The key difference is that a ragged tensor keeps track of where each row begins and ends, whereas a sparse tensor tracks each item's coordinates. This difference can be illustrated using the concatenation operation:
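The sketch below, adapted from the behaviour documented for TensorFlow's ragged and sparse tensors, shows the contrast: concatenating ragged tensors joins the variable-length rows themselves, while concatenating the equivalent sparse tensors behaves like concatenating the padded dense tensors.

 import tensorflow as tf

 x = tf.ragged.constant([["John"], ["a", "big", "dog"], ["my", "cat"]])
 y = tf.ragged.constant([["fell", "asleep"], ["barked"], ["is", "fuzzy"]])

 # Ragged concat appends row to row, keeping variable row lengths.
 print(tf.concat([x, y], axis=1))
 # [['John', 'fell', 'asleep'], ['a', 'big', 'dog', 'barked'], ['my', 'cat', 'is', 'fuzzy']]

 # Sparse concat glues the fixed-width coordinate grids together instead.
 sp = tf.sparse.concat(sp_inputs=[x.to_sparse(), y.to_sparse()], axis=1)
 print(tf.sparse.to_dense(sp, ''))
 # [['John' '' '' 'fell' 'asleep']
 #  ['a' 'big' 'dog' 'barked' '']
 #  ['my' 'cat' '' 'is' 'fuzzy']]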

Implementing MEGNet using KGCNN

  1. Install KGCNN

Install from source by cloning the repository:

git clone https://github.com/aimat-lab/gcnn_keras

pip install -e ./gcnn_keras

or install using pip:

pip install kgcnn

  2. Import the necessary libraries and classes
 import math
 import numpy as np
 import tensorflow as tf
 import tensorflow.keras as ks
 import matplotlib.pyplot as plt
 %matplotlib inline
 from sklearn.utils import shuffle
 from kgcnn.data.qm.qm9 import qm9_graph
 from kgcnn.literature.Megnet import getmodelMegnet, softplus2
 from kgcnn.utils.learning import lr_lin_reduction 
  3. Download and prepare the data
 # Download dataset
 qm9_data = qm9_graph()
 y_data = qm9_data[0][:, 7] * 27.2114  # select the LUMO target, converted from Hartree to eV
 x_data = qm9_data[1:]

 # Center the labels around their mean
 y_mean = np.mean(y_data)
 y_data = np.expand_dims(y_data, axis=-1) - y_mean
 data_unit = 'eV'

 # Make train/validation split
 VALSIZE = 100
 TRAINSIZE = 2000
 print("Training Size:", TRAINSIZE, " Validation Size:", VALSIZE)
 inds = np.arange(len(y_data))
 inds = shuffle(inds)
 ind_val = inds[:VALSIZE]
 ind_train = inds[VALSIZE:(VALSIZE + TRAINSIZE)]

 # Select train/validation data
 xtrain = [[x[i] for i in ind_train] for x in x_data]
 ytrain = y_data[ind_train]
 xval = [[x[i] for i in ind_val] for x in x_data]
 yval = y_data[ind_val]
  4. Convert the feature lists into RaggedTensors
 def make_ragged(inlist):
     # Flatten the per-graph arrays; the row lengths recover each graph.
     return tf.RaggedTensor.from_row_lengths(np.concatenate(inlist, axis=0),
                                             np.array([len(x) for x in inlist], dtype="int64"))

 # Make ragged graph tensors plus a normal tensor for the graph state
 xval = [make_ragged(x) for x in xval[:3]] + [tf.constant(xval[3])]
 xtrain = [make_ragged(x) for x in xtrain[:3]] + [tf.constant(xtrain[3])]
  5. Create and train the model
 model = getmodelMegnet(
     # Input
     input_node_shape=[None],
     input_edge_shape=[None, 20],
     input_state_shape=[1],
     input_node_vocab=10,
     input_node_embedd=16,
     input_edge_embedd=16,
     input_type='ragged',
     # Output
     output_embedd='graph',  # only 'graph' is possible for MEGNet
     output_use_bias=[True, True, True],
     output_dim=[32, 16, 1],
     output_activation=['softplus2', 'softplus2', 'linear'],
     output_type='padded',
     # Model specs
     is_sorted=True,
     has_unconnected=False,
     nblocks=3,
     n1=64,
     n2=32,
     n3=16,
     set2set_dim=16,
     use_bias=True,
     act='softplus2',
     l2_coef=None,
     has_ff=True,
     dropout=None,
     dropout_on_predict=False,
     use_set2set=True,
     npass=3,
     set2set_init='0',
     set2set_pool="sum")

 learning_rate_start = 0.5e-3
 learning_rate_stop = 1e-5
 epo = 500
 epomin = 400
 optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate_start)
 # Linearly reduce the learning rate towards learning_rate_stop between epoch epomin and epo
 cbks = tf.keras.callbacks.LearningRateScheduler(lr_lin_reduction(learning_rate_start, learning_rate_stop, epomin, epo))

 model.compile(loss='mean_squared_error',
               optimizer=optimizer,
               metrics=['mean_absolute_error', 'mean_squared_error'])
 model.summary()
MEGNet model created using KGCNN
 epostep = 10

 hist = model.fit(xtrain, ytrain, 
           epochs=epo,
           batch_size=64,
           callbacks=[cbks],
           validation_freq=epostep,
           validation_data=(xval,yval),
           verbose=2
           )

 trainlossall = np.array(hist.history['mean_absolute_error'])
 testlossall = np.array(hist.history['val_mean_absolute_error'])
 mae_valid = np.mean(np.abs(yval - model.predict(xval)))
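The training curves and the actual-vs-predicted scatter shown below can be reproduced with matplotlib along these lines (a sketch; the article's original plotting code is not shown):

 # Mean absolute error over training; validation is logged every epostep epochs.
 plt.figure()
 plt.plot(np.arange(1, len(trainlossall) + 1), trainlossall, label='train')
 plt.plot(np.arange(epostep, epo + 1, epostep), testlossall, label='validation')
 plt.xlabel('Epoch')
 plt.ylabel('Mean absolute error [' + data_unit + ']')
 plt.legend()
 plt.show()

 # Actual vs predicted LUMO values on the validation set (undo the mean shift).
 preds = model.predict(xval)
 plt.figure()
 plt.scatter(yval + y_mean, preds + y_mean, alpha=0.3)
 plt.xlabel('Actual [' + data_unit + ']')
 plt.ylabel('Predicted [' + data_unit + ']')
 plt.show()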
Training results of MEGNet model created using KGCNN
Actual vs predicted value plot of the MEGNet model created using KGCNN