# How Is TensorFlow Probability Used in Neural Networks?

Neural networks often need probabilistic models and techniques. These needs can be met by adding the probability layers that TensorFlow provides to the network.

TensorFlow provides many models, layers, and modelling techniques that make building neural networks easy and efficient. In this article, we discuss TensorFlow Probability layers and how to use them in a neural network. The major points to be discussed are listed below.

1. What are TensorFlow Probability (TFP) Layers?
2. Installation of TensorFlow Probability
3. What are Variational Autoencoders?
4. Implementing the TensorFlow Probability

Let’s begin with understanding the TensorFlow probability layers.

What are TensorFlow Probability (TFP) Layers?

As discussed in the introduction, TensorFlow provides various layers for building neural networks. TensorFlow Probability is a library from the TensorFlow project for probabilistic reasoning and statistical analysis, both inside and outside neural networks. In other words, we can use it with deep learning models as well as with other machine learning models.

In addition, the library offers low-level building blocks such as distributions and bijectors, and higher-level constructs such as Markov Chain Monte Carlo, probabilistic layers, structural time series, and generalized linear models. The `tfp.distributions` module contains a large collection of distributions, for example `Normal`, `Bernoulli`, `Independent`, and `MultivariateNormalTriL`.

Bijectors are invertible transformations of random variables and live in the `tfp.bijectors` module, for example `Exp`, `Softplus`, `Sigmoid`, and `Chain`.

The library therefore has a large number of distributions and bijectors that can be used to integrate probabilistic methods with neural networks. Beyond these, it also provides further tools aimed at model building.

Also, we have various probabilistic layers and inferences such as Markov Chain Monte Carlo, Variational inferences, and optimizers in the library.
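To make the terms "distribution" and "bijector" more concrete, here is a minimal NumPy sketch of the underlying idea. It is not TFP's API: the function names are hypothetical, and it only illustrates that a bijector is an invertible map whose log-determinant of the Jacobian lets us compute the density of the transformed variable. Pushing a standard normal through `exp` gives a log-normal variable.

```python
import math
import numpy as np

def normal_log_prob(x, loc=0.0, scale=1.0):
    """Log density of a Normal(loc, scale) distribution."""
    z = (x - loc) / scale
    return -0.5 * z**2 - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

def exp_inverse(y):
    """Inverse of the forward map x -> exp(x)."""
    return np.log(y)

def exp_inverse_log_det_jacobian(y):
    """d/dy log(y) = 1/y, so log|det J| of the inverse map is -log(y)."""
    return -np.log(y)

def transformed_log_prob(y):
    """Log density of Y = exp(X) with X ~ Normal(0, 1), by change of variables."""
    return normal_log_prob(exp_inverse(y)) + exp_inverse_log_det_jacobian(y)

y = np.array([0.5, 1.0, 2.0])
log_p = transformed_log_prob(y)
print(log_p)
```

In TFP the same pattern is packaged as a base distribution plus a bijector, so the bookkeeping above does not have to be written by hand.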

Installation of TensorFlow Probability

We can install the library using the following commands.

`!python -m pip install --upgrade --user pip`

`!python -m pip install --upgrade --user tensorflow tensorflow_probability`
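A quick way to confirm that the installation worked is to look up the installed package versions. The sketch below uses only the standard library; the helper name `installed_versions` is hypothetical, and it assumes the packages were installed under the distribution names `tensorflow` and `tensorflow-probability` (a variant build of TensorFlow installed under another name would show up as `None`).

```python
from importlib import metadata

def installed_versions(packages=("tensorflow", "tensorflow-probability")):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed under this name
    return versions

versions = installed_versions()
print(versions)
```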

After installation, we are ready to use the library to bring probabilistic methods into neural networks. We can start with variational autoencoders (VAEs), which are used in tasks such as collaborative filtering, image compression, and reinforcement learning.

Variational Autoencoders

Besides the domains mentioned above, VAE models can also be used to generate data. Here we will try to generate digits like those in the MNIST data. Generation takes two steps:

• Sample a latent representation from a prior distribution.
• Based on that sample, draw the actual data representation.
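The two steps can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the "decoder" is a hypothetical, untrained linear map followed by a sigmoid, not the convolutional decoder built later in the article, and the function name `generate` is made up for the example.

```python
import numpy as np

def generate(num_samples, latent_dim, decoder_weights, rng):
    """Two-step generation: z ~ N(0, I), then x ~ Bernoulli(sigmoid(z @ W))."""
    # Step 1: sample a latent representation from the prior.
    z = rng.standard_normal((num_samples, latent_dim))
    # Toy linear "decoder" that maps z to per-pixel probabilities.
    probs = 1.0 / (1.0 + np.exp(-(z @ decoder_weights)))
    # Step 2: draw the actual (binary) pixels given the latent sample.
    x = rng.uniform(size=probs.shape) < probs
    return z, x

rng = np.random.default_rng(0)
weights = rng.standard_normal((16, 784))  # hypothetical, untrained weights
z, x = generate(10, 16, weights, rng)
print(z.shape, x.shape)
```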

In digit generation, we can think of the latent representation as capturing the signal, such as the class identity of a digit in the MNIST dataset, while the remaining variation between digits is noise. Using the VAE model, we will try to separate this noise from the signal.

To achieve this objective, we maximize the evidence lower bound (ELBO):

$$\mathrm{ELBO}(x) = \int dz\, q(z \mid x) \log p(x \mid z) \;-\; \int dz\, q(z \mid x) \log \frac{q(z \mid x)}{p(z)}$$

The ELBO is a lower bound on $\log p(x)$, the log probability of an observed data point. The first integral is the reconstruction term, and the second integral is the Kullback–Leibler (KL) divergence between the encoder $q(z \mid x)$ and the prior $p(z)$, which measures how close the two are. Keeping the encoder close to the prior can be thought of as keeping the encoder network honest. Let's move on to the implementation, which will make the process clearer.
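To get a feel for the KL term, the sketch below compares its closed form with a Monte Carlo estimate in NumPy. It assumes a simplified setting that is not the article's model: the encoder distribution is a diagonal Gaussian (the model below uses a full-covariance `MultivariateNormalTriL`) and the prior is a standard normal. The function names are hypothetical.

```python
import numpy as np

def kl_diag_gaussian_to_standard_normal(mu, sigma):
    """KL(N(mu, diag(sigma^2)) || N(0, I)) in closed form."""
    mu = np.asarray(mu, dtype=np.float64)
    sigma = np.asarray(sigma, dtype=np.float64)
    return 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))

def kl_monte_carlo(mu, sigma, num_samples=200_000, seed=0):
    """Estimate the same KL as E_q[log q(z) - log p(z)] with samples z ~ q."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=np.float64)
    sigma = np.asarray(sigma, dtype=np.float64)
    z = mu + sigma * rng.standard_normal((num_samples, mu.size))
    # The -0.5*log(2*pi) constants cancel in the difference, so they are omitted.
    log_q = np.sum(-0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma), axis=1)
    log_p = np.sum(-0.5 * z**2, axis=1)
    return float(np.mean(log_q - log_p))

mu, sigma = [0.5, -1.0], [0.8, 1.5]
exact = kl_diag_gaussian_to_standard_normal(mu, sigma)
estimate = kl_monte_carlo(mu, sigma)
print(exact, estimate)
```

The KL term is zero only when the encoder distribution equals the prior, which is why it acts as a penalty that pulls the encoder towards the prior.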

```python
import numpy as np
import tensorflow.compat.v2 as tf
tf.enable_v2_behavior()
import tensorflow_datasets as tfds
import tensorflow_probability as tfp

# Short aliases used throughout the rest of the code
tfk = tf.keras
tfkl = tf.keras.layers
tfpl = tfp.layers
tfd = tfp.distributions
```

To make training faster, we can use a GPU. Since this code is run in Google Colab, the GPU can be enabled from the runtime menu as follows.

`"Runtime" -> "Change runtime type" -> "Hardware accelerator" -> "GPU".`

Now we can import the dataset as:

```python
datasets, datasets_info = tfds.load(name='mnist',
                                    with_info=True,
                                    as_supervised=False)

def _preprocess(sample):
    image = tf.cast(sample['image'], tf.float32) / 255.  # scale to [0, 1]
    image = image < tf.random.uniform(tf.shape(image))   # randomly binarize
    return image, image  # the VAE reconstructs its own input

train_dataset = (datasets['train'].map(_preprocess).batch(256)
                 .prefetch(tf.data.AUTOTUNE).shuffle(int(10e3)))
# Named eval_dataset to match the training and plotting code further down
eval_dataset = (datasets['test'].map(_preprocess).batch(256)
                .prefetch(tf.data.AUTOTUNE))
```
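The binarization step in `_preprocess` compares each pixel with uniform noise. The NumPy sketch below mimics that comparison (the helper name `random_binarize` is hypothetical) and shows what it does: with `image < noise`, a pixel becomes `True` with probability one minus its intensity.

```python
import numpy as np

def random_binarize(image, rng):
    """Mimic `image < uniform noise`: a pixel is True with probability 1 - intensity."""
    image = np.asarray(image, dtype=np.float32)
    return image < rng.uniform(size=image.shape)

rng = np.random.default_rng(0)
image = np.full((1000, 1000), 0.2, dtype=np.float32)  # constant intensity 0.2
binary = random_binarize(image, rng)
fraction_on = binary.mean()
print(fraction_on)  # close to 0.8 for intensity 0.2
```

Because the noise is redrawn every time the dataset is iterated, each epoch sees a slightly different binary version of every image.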

Building the Model

Let’s specify the model.

```python
input_shape = datasets_info.features['image'].shape
encoded_size = 16
base_depth = 32
```

We can use an isotropic Gaussian prior for the VAE model, defined as:

`prior = tfd.Independent(tfd.Normal(loc=tf.zeros(encoded_size), scale=1),reinterpreted_batch_ndims=1)`
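Wrapping `Normal` in `Independent` with `reinterpreted_batch_ndims=1` treats the 16 per-dimension normals as one 16-dimensional distribution, so the log probability is summed over the last axis. A NumPy sketch of that computation, with a hypothetical function name, looks like this:

```python
import math
import numpy as np

def isotropic_gaussian_log_prob(z):
    """Log density of N(0, I), summing the per-dimension terms over the last axis."""
    z = np.asarray(z, dtype=np.float64)
    per_dim = -0.5 * z**2 - 0.5 * math.log(2.0 * math.pi)
    return per_dim.sum(axis=-1)

encoded_size = 16
z = np.zeros((10, encoded_size))  # a batch of 10 latent vectors
log_p = isotropic_gaussian_log_prob(z)
print(log_p.shape)  # one log probability per latent vector
```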

Now we can make a model.

Encoder

First, we are making an encoder network as,

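```python
# The padding/activation arguments were cut off in the original listing;
# the values below are assumptions chosen so that the shapes work out.
encoder = tfk.Sequential([
    tfkl.InputLayer(input_shape=input_shape),
    tfkl.Lambda(lambda x: tf.cast(x, tf.float32) - 0.5),
    tfkl.Conv2D(base_depth, 5, strides=1,
                padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2D(base_depth, 5, strides=2,
                padding='same', activation=tf.nn.leaky_relu),
    # strides=1 (the listing had 2) keeps a 7x7 map for the final 7x7 'valid' convolution
    tfkl.Conv2D(2 * base_depth, 5, strides=1,
                padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2D(2 * base_depth, 5, strides=2,
                padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2D(4 * encoded_size, 7, strides=1,
                padding='valid', activation=tf.nn.leaky_relu),
    tfkl.Flatten(),
    tfkl.Dense(tfpl.MultivariateNormalTriL.params_size(encoded_size),
               activation=None),
    tfpl.MultivariateNormalTriL(
        encoded_size,
        activity_regularizer=tfpl.KLDivergenceRegularizer(prior)),
])
```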

This is a simple sequential model: convolutional and dense layers followed by a `MultivariateNormalTriL` layer, which is a TFP layer that outputs a distribution rather than a plain tensor. The helper `MultivariateNormalTriL.params_size(encoded_size)` tells the preceding `Dense` layer how many activations to output so that they can parameterize that distribution. The `activity_regularizer` makes the distribution contribute a regularization term to the final loss; here `KLDivergenceRegularizer` uses the KL divergence to measure the closeness between the encoder and the prior.
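The number of activations the `Dense` layer needs follows from what a multivariate normal with a lower-triangular scale matrix requires: one mean per dimension plus the entries of the triangle. The small sketch below reproduces that count with a hypothetical helper; it should agree with `params_size`, but treat it as an illustration rather than the library's implementation.

```python
def multivariate_normal_tril_params_size(event_size):
    """Parameter count: a mean vector plus a lower-triangular scale matrix."""
    loc_params = event_size
    tril_params = event_size * (event_size + 1) // 2
    return loc_params + tril_params

n_params = multivariate_normal_tril_params_size(16)
print(n_params)  # 16 means + 136 triangular entries
```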

Decoder

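```python
# As with the encoder, the padding/activation arguments are assumed (truncated in the original).
decoder = tfk.Sequential([
    tfkl.InputLayer(input_shape=[encoded_size]),
    tfkl.Reshape([1, 1, encoded_size]),
    tfkl.Conv2DTranspose(2 * base_depth, 7, strides=1,
                         padding='valid', activation=tf.nn.leaky_relu),
    tfkl.Conv2DTranspose(2 * base_depth, 5, strides=1,
                         padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2DTranspose(2 * base_depth, 5, strides=2,
                         padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2DTranspose(base_depth, 5, strides=1,
                         padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2DTranspose(base_depth, 5, strides=2,
                         padding='same', activation=tf.nn.leaky_relu),
    tfkl.Conv2D(filters=1, kernel_size=5, strides=1,
                padding='same', activation=None),
    tfkl.Flatten(),
    tfpl.IndependentBernoulli(input_shape, tfd.Bernoulli.logits),
])
```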

The decoder is also a sequential model. It takes a latent representation from the encoder, upsamples it with transposed convolutional layers, and ends in an `IndependentBernoulli` layer, so its output is a distribution over images.
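It can help to trace how the spatial size grows from the 1x1 latent input to a 28x28 image. The sketch below applies the usual Keras output-size rules for `'same'` and `'valid'` padding. The padding choice per layer is an assumption, because the listing above does not show it in the original article, and the helper names are hypothetical.

```python
import math

def conv_out(size, kernel, stride, padding):
    """Spatial output size of a Keras-style Conv2D layer."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid"

def conv_transpose_out(size, kernel, stride, padding):
    """Spatial output size of a Keras-style Conv2DTranspose layer."""
    if padding == "same":
        return size * stride
    return size * stride + max(kernel - stride, 0)  # "valid"

# (kernel, stride, padding) per transposed-conv layer; padding values are assumed.
decoder_layers = [(7, 1, "valid"), (5, 1, "same"), (5, 2, "same"),
                  (5, 1, "same"), (5, 2, "same")]
size = 1
sizes = [size]
for kernel, stride, padding in decoder_layers:
    size = conv_transpose_out(size, kernel, stride, padding)
    sizes.append(size)
final = conv_out(size, 5, 1, "same")  # last Conv2D with a single filter
print(sizes, final)
```

Under these assumptions the final feature map has 28 x 28 x 1 = 784 values, which is the number of logits `IndependentBernoulli` needs for an MNIST-shaped image.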

Now we can combine the encoder and decoder into a single model:

```python
vae = tfk.Model(inputs=encoder.inputs,
                outputs=decoder(encoder.outputs[0]))
```

Now we can fit the model on the data and train it.

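```python
negloglik = lambda x, rv_x: -rv_x.log_prob(x)  # reconstruction loss

# The optimizer line is missing from the original listing; Adam with lr=1e-3 is assumed
vae.compile(optimizer=tfk.optimizers.Adam(learning_rate=1e-3),
            loss=negloglik)

_ = vae.fit(train_dataset,
            epochs=15,
            validation_data=eval_dataset)
```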
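The loss passed to the model is the negative log-likelihood of the input image under the decoder's Bernoulli distribution. The NumPy sketch below shows that quantity for pixels parameterized by logits; the function name is hypothetical, and it illustrates the formula rather than TFP's implementation.

```python
import numpy as np

def bernoulli_neg_log_likelihood(x, logits):
    """-log p(x) for independent Bernoulli pixels parameterized by logits."""
    x = np.asarray(x, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    # log p = x * logits - softplus(logits), using a numerically stable softplus
    softplus = np.maximum(logits, 0.0) + np.log1p(np.exp(-np.abs(logits)))
    return -np.sum(x * logits - softplus, axis=-1)

x = np.array([[1.0, 0.0, 1.0]])
logits = np.array([[0.0, 0.0, 0.0]])  # every pixel has probability 0.5
nll = bernoulli_neg_log_likelihood(x, logits)
print(nll)  # 3 * log(2) for three pixels at probability 0.5
```

During training, the KL term from the encoder's `activity_regularizer` is added to this reconstruction loss, so minimizing the total corresponds to maximizing the ELBO.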

Plotting the Results

Now we can examine a few samples from the evaluation set:

```python
# Take the first 10 images of one evaluation batch and run them through the VAE
x = next(iter(eval_dataset))[0][:10]
xhat = vae(x)
assert isinstance(xhat, tfd.Distribution)
```

Plotting samples from the data:

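```python
import matplotlib.pyplot as plt

def display_imgs(x):
    # Hypothetical helper (not defined in the original listing): plot a batch of images in one row
    x = np.asarray(x).astype(np.float32)
    fig, axs = plt.subplots(1, x.shape[0], figsize=(x.shape[0], 1), squeeze=False)
    for ax, img in zip(axs.flat, x):
        ax.imshow(img.squeeze(), cmap='gray')
        ax.axis('off')
    plt.show()

print('Originals:')
display_imgs(x)

print('Decoded Random Samples:')
display_imgs(xhat.sample())

print('Decoded Modes:')
display_imgs(xhat.mode())

print('Decoded Means:')
display_imgs(xhat.mean())
```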

Let’s generate random samples using the model:

```python
# Sample 10 latent vectors from the prior and decode them into image distributions
z = prior.sample(10)
xtilde = decoder(z)
assert isinstance(xtilde, tfd.Distribution)
```

Plotting the generated samples:

```python
print('Randomly Generated Samples:')
display_imgs(xtilde.sample())

print('Randomly Generated Modes:')
display_imgs(xtilde.mode())

print('Randomly Generated Means:')
display_imgs(xtilde.mean())
```

The output shows randomly generated images from the VAE model trained on the MNIST dataset, built with functions and layers from the TensorFlow Probability library.

Final Words

In this article, we have seen how to combine neural networks with the TensorFlow Probability library. We used it to build a VAE that generates new images resembling the data it was trained on, here the MNIST dataset loaded through TensorFlow Datasets.

Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
