# Introduction to Keras Graph Convolutional Neural Network (KGCNN) & Ragged Tensor

KGCNN offers a straightforward and flexible integration of graph operations into the TensorFlow-Keras framework using RaggedTensors.

Graph neural networks (GNNs) are a neural network architecture that has recently become more common in research publications and real-world applications. Because graph neural networks require modified convolution and pooling operators, many Python packages such as PyTorch Geometric, StellarGraph, and DGL have emerged for working with graphs. Keras Graph Convolutional Neural Network (kgcnn) achieves a straightforward and flexible integration of graph operations into the TensorFlow-Keras framework using RaggedTensors. It contains a set of TensorFlow-Keras layer classes that can be used to build graph convolution models. The package also includes standard benchmark graph datasets such as Cora, MUTAG, and QM9.

The main problem with handling graphs is their variable size, which makes graph data hard to arrange in tensors. For example, placing small graphs of different sizes in mini-batches poses a problem with fixed-size tensors. One way to solve this problem is to use zero-padding with masking or composite tensors. Another is a disjoint representation, which joins the small graphs into a single large graph without connecting the individual subgraphs.
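
The disjoint representation can be sketched with plain NumPy. This is a minimal illustration, not kgcnn's implementation: the helper name `to_disjoint` and the toy graphs are assumptions made for the example. Each graph's edge indices are shifted by the number of nodes that precede it, so the merged graph contains no edges between the original graphs.

```python
import numpy as np

def to_disjoint(node_lists, edge_index_lists):
    """Join small graphs into one large graph with no edges between them."""
    # Number of nodes per graph and the offset of each graph's first node.
    sizes = np.array([len(n) for n in node_lists])
    offsets = np.concatenate([[0], np.cumsum(sizes)[:-1]])
    nodes = np.concatenate(node_lists, axis=0)
    # Shift edge indices so they point into the merged node array.
    edges = np.concatenate(
        [e + off for e, off in zip(edge_index_lists, offsets)], axis=0)
    # graph_id[k] records which original graph node k came from.
    graph_id = np.repeat(np.arange(len(node_lists)), sizes)
    return nodes, edges, graph_id

# Two toy graphs: a 3-node path and a 2-node pair, one feature per node.
nodes_a = np.array([[1.0], [2.0], [3.0]])
edges_a = np.array([[0, 1], [1, 2]])
nodes_b = np.array([[4.0], [5.0]])
edges_b = np.array([[0, 1]])

nodes, edges, graph_id = to_disjoint([nodes_a, nodes_b], [edges_a, edges_b])
print(edges)     # the edge of the second graph now refers to nodes 3 and 4
print(graph_id)  # graph membership of every node in the merged graph
```

The `graph_id` vector is what a pooling step would need to aggregate node features back into one value per graph.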

Graphs are usually represented by an adjacency matrix $A$ of shape ([batch], N, N), which has $A_{ij} = 1$ if the graph has an edge between nodes i and j and 0 otherwise. When represented using tensors, graphs are stored using:

• A node list n of shape ([batch], N, F)
• A connection table m of edge indices (incoming and outgoing node) of shape ([batch], M, 2)
• A corresponding edge feature list e of shape ([batch], M, F)

Here, N denotes the number of nodes, F denotes the node representation dimension, and M the number of edges.
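
As a minimal sketch of how the two representations relate, the snippet below converts a small adjacency matrix into a node list, a connection table, and an edge feature list. The toy graph and the feature values are assumptions chosen for illustration; the batch dimension is omitted.

```python
import numpy as np

# Toy undirected graph with N = 3 nodes and edges 0-1 and 1-2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Node list n of shape (N, F), here with F = 2 made-up features per node.
n = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

# Connection table m of shape (M, 2): one (i, j) row per non-zero entry of A.
m = np.argwhere(A == 1)

# Edge feature list e with one dummy feature per edge, shape (M, 1).
e = np.ones((len(m), 1))

print(m.shape, e.shape)
```

Because the graph is undirected, each edge appears twice in the connection table, once per direction, so M = 4 here.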

RaggedTensors are the TensorFlow equivalent of nested variable-length lists. With RaggedTensors, graphs can be represented using just the node features and edge index lists, with a flexible tensor dimension that accommodates different numbers of nodes and edges. For example, a ragged node tensor of shape ([batch], None, F) can accommodate a flexible graph size in the second dimension.
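
A small sketch of such a ragged node tensor, assuming a batch of two toy graphs with three and two nodes and F = 2 features per node (the values are made up):

```python
import tensorflow as tf

# ragged_rank=1 makes only the node dimension ragged; the feature
# dimension stays uniform with size F = 2.
nodes = tf.ragged.constant(
    [[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
     [[0.7, 0.8], [0.9, 1.0]]],
    ragged_rank=1)

print(nodes.shape)          # the second dimension is reported as None
print(nodes.row_lengths())  # number of nodes in each graph
```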

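A ragged tensor should not be confused with a sparse tensor: it is a dense tensor with an irregular shape. The key difference is that a ragged tensor keeps track of where each row begins and ends, whereas a sparse tensor tracks each item's coordinates. The concatenation operation illustrates this. Concatenating ragged tensors joins the rows end to end, whereas concatenating sparse tensors gives the same result as concatenating the corresponding dense tensors, so the missing values stay in place.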
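
The following sketch contrasts the two behaviours on a pair of small integer tensors. The values are arbitrary; zeros in the final dense output mark positions that the sparse tensors treat as missing.

```python
import tensorflow as tf

ragged_x = tf.ragged.constant([[1], [2, 3, 4], [5, 6]])
ragged_y = tf.ragged.constant([[7, 8], [9], [10, 11]])

# Ragged concatenation joins each pair of rows end to end.
ragged_result = tf.concat([ragged_x, ragged_y], axis=1)
print(ragged_result)

# Sparse concatenation behaves like concatenating the padded dense tensors,
# so the gaps at the end of the shorter rows are preserved.
sparse_x = ragged_x.to_sparse()
sparse_y = ragged_y.to_sparse()
sparse_result = tf.sparse.concat(axis=1, sp_inputs=[sparse_x, sparse_y])
dense_result = tf.sparse.to_dense(sparse_result)
print(dense_result)
```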

### Implementing MEGNet using KGCNN

1. Install KGCNN

Install from source by cloning the repository:

`git clone https://github.com/aimat-lab/gcnn_keras`

`pip install -e ./gcnn_keras`

or install using pip:

`pip install kgcnn`

2. Import the necessary libraries and classes
```
import math
import numpy as np
import tensorflow as tf
import tensorflow.keras as ks
import matplotlib.pyplot as plt
# Jupyter magic; remove this line when running as a plain script
%matplotlib inline
from sklearn.utils import shuffle
from kgcnn.data.qm.qm9 import qm9_graph
from kgcnn.literature.Megnet import getmodelMegnet, softplus2
from kgcnn.utils.learning import lr_lin_reduction
```
```
# Download dataset
qm9_data = qm9_graph()
y_data = qm9_data[0][:, 7] * 27.2114  # select LUMO in eV
x_data = qm9_data[1:]

# Scale output by subtracting the mean
y_mean = np.mean(y_data)
y_data = np.expand_dims(y_data, axis=-1) - y_mean
data_unit = 'eV'

# Make validation/train split on shuffled indices
VALSIZE = 100
TRAINSIZE = 2000
print("Training Size:", TRAINSIZE, " Validation Size:", VALSIZE)
inds = np.arange(len(y_data))
inds = shuffle(inds)
ind_val = inds[:VALSIZE]
ind_train = inds[VALSIZE:(VALSIZE + TRAINSIZE)]

# Select train/validation data
xtrain = [[x[i] for i in ind_train] for x in x_data]
ytrain = y_data[ind_train]
xval = [[x[i] for i in ind_val] for x in x_data]
yval = y_data[ind_val]
```
3. Convert the feature lists into RaggedTensors
```
def make_ragged(inlist):
    # Flatten the per-graph arrays and record how many rows each graph owns.
    # np.int64 replaces the np.int alias, which recent NumPy versions removed.
    return tf.RaggedTensor.from_row_lengths(
        np.concatenate(inlist, axis=0),
        np.array([len(x) for x in inlist], dtype=np.int64))

# Make ragged graph tensors plus a normal tensor for the graph state
xval = [make_ragged(x) for x in xval[:3]] + [tf.constant(xval[3])]
xtrain = [make_ragged(x) for x in xtrain[:3]] + [tf.constant(xtrain[3])]
```
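
To see what this conversion does, here is a small sketch on made-up data, with two per-graph arrays of three and two rows. `from_row_lengths` takes the flattened values plus the number of rows each graph owns and rebuilds the batch structure.

```python
import numpy as np
import tensorflow as tf

# Hypothetical per-graph node arrays: 3 nodes and 2 nodes, 1 feature each.
inlist = [np.array([[1.0], [2.0], [3.0]]), np.array([[4.0], [5.0]])]

values = np.concatenate(inlist, axis=0)                           # shape (5, 1)
row_lengths = np.array([len(x) for x in inlist], dtype=np.int64)  # [3, 2]

ragged = tf.RaggedTensor.from_row_lengths(values, row_lengths)
print(ragged.shape)       # batch of 2 graphs, ragged node dimension
print(ragged[1].numpy())  # indexing recovers the second graph's rows
```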
4. Create and train the model
```
model = getmodelMegnet(
    # Input
    input_node_shape=[None],
    input_edge_shape=[None, 20],
    input_state_shape=[1],
    input_node_vocab=10,
    input_node_embedd=16,
    input_edge_embedd=16,
    input_type='ragged',
    # Output
    output_embedd='graph',  # Only graph possible for megnet
    output_use_bias=[True, True, True],
    output_dim=[32, 16, 1],
    output_activation=['softplus2', 'softplus2', 'linear'],
    # Model specs
    is_sorted=True,
    has_unconnected=False,
    nblocks=3,
    n1=64,
    n2=32,
    n3=16,
    set2set_dim=16,
    use_bias=True,
    act='softplus2',
    l2_coef=None,
    has_ff=True,
    dropout=None,
    dropout_on_predict=False,
    use_set2set=True,
    npass=3,
    set2set_init='0',
    set2set_pool="sum"
)

learning_rate_start = 0.5e-3
learning_rate_stop = 1e-5
epo = 500
epomin = 400
# Assumption: the original listing never defines `optimizer`;
# Adam with the starting learning rate is used here as a common default.
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate_start)
cbks = tf.keras.callbacks.LearningRateScheduler(
    lr_lin_reduction(learning_rate_start, learning_rate_stop, epomin, epo))

model.compile(loss='mean_squared_error',
              optimizer=optimizer,
              metrics=['mean_absolute_error', 'mean_squared_error'])
model.summary()  # prints the summary itself; wrapping it in print() adds "None"
```
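
The learning-rate callback above relies on kgcnn's `lr_lin_reduction`. The sketch below shows one way such a linear schedule could work: hold the starting rate until a given epoch, then decay linearly to the final rate. It is a hypothetical stand-in written for illustration; the function name `linear_lr_schedule`, its keyword arguments, and the clipping behaviour are assumptions, not the library's code or signature.

```python
def linear_lr_schedule(lr_start, lr_stop, epo_min, epo_max):
    """Hypothetical schedule: constant lr_start, then linear decay to lr_stop."""
    def schedule(epoch, lr=None):
        # `lr` is accepted because Keras may pass the current rate as well.
        if epoch < epo_min:
            return lr_start
        # Fraction of the decay phase completed, clipped at 1.
        frac = min(1.0, (epoch - epo_min) / (epo_max - epo_min))
        return lr_start - (lr_start - lr_stop) * frac
    return schedule

schedule = linear_lr_schedule(0.5e-3, 1e-5, epo_min=400, epo_max=500)
for epoch in (0, 400, 450, 500):
    print(epoch, schedule(epoch))
```

A function of this shape can be passed to `tf.keras.callbacks.LearningRateScheduler`, which calls it at the start of every epoch.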
```
epostep = 10  # run validation every 10 epochs

hist = model.fit(xtrain, ytrain,
                 epochs=epo,
                 batch_size=64,
                 callbacks=[cbks],
                 validation_freq=epostep,
                 validation_data=(xval, yval),
                 verbose=2)

# Collect the mean absolute error curves for training and validation
trainlossall = np.array(hist.history['mean_absolute_error'])
testlossall = np.array(hist.history['val_mean_absolute_error'])

# Final mean absolute error on the validation set
mae_valid = np.mean(np.abs(yval - model.predict(xval)))
```
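
Because validation runs only every `epostep` epochs, the validation history is shorter than the training history. A small sketch of how the two x-axes could be aligned for plotting, assuming the `epo` and `epostep` values used above:

```python
import numpy as np

epo, epostep = 500, 10

# Training metrics are recorded once per epoch (1 .. epo).
train_epochs = np.arange(1, epo + 1)
# With validation_freq=epostep, validation runs at epochs epostep, 2*epostep, ...
val_epochs = np.arange(epostep, epo + 1, epostep)

print(len(train_epochs), len(val_epochs))
# The curves could then be drawn with, for example:
#   plt.plot(train_epochs, trainlossall, label='train MAE')
#   plt.plot(val_epochs, testlossall, label='validation MAE')
```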
