Hands-On Guide to Generating Artificial Faces Using Progressive GAN

Generative adversarial networks (GANs) are now widely used in computer-vision applications, and they can also be used to create synthetic datasets when real data does not meet the requirements. A typical GAN architecture trains two models: a generator, which outputs a synthetic image, and a discriminator, which judges whether an image is real or fake. The generator is trained on the discriminator's output, and the two models are trained in an adversarial manner until they reach a proper equilibrium.
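
To make the adversarial setup concrete, here is a minimal sketch of the two losses, assuming the common binary cross-entropy formulation with a non-saturating generator loss. The function names and the example scores are illustrative only and are not taken from the Progressive GAN code used later in this article.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-7):
    """Binary cross-entropy: real images should score 1, fake images 0."""
    return -math.log(d_real + eps) - math.log(1.0 - d_fake + eps)

def generator_loss(d_fake, eps=1e-7):
    """Non-saturating loss: the generator wants its fakes to score 1."""
    return -math.log(d_fake + eps)

# d_real and d_fake are the discriminator's probabilities that an image is real.
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
fooled_d = discriminator_loss(d_real=0.5, d_fake=0.5)
print(f"discriminator loss when confident: {confident_d:.3f}, when fooled: {fooled_d:.3f}")
print(f"generator loss when caught: {generator_loss(0.1):.3f}, "
      f"when convincing: {generator_loss(0.9):.3f}")
```

The discriminator's loss rises as the generator's images become more convincing, while the generator's loss falls, which is the tug-of-war the two models play during training.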

If you are new to GANs, I recommend this article for a proper understanding of them.

On large datasets of high-resolution images, a GAN is expected to generate sharp, crisp images, but working at high resolution makes training unstable. Generating high-resolution images is challenging because the generator must learn both the structures and the fine details present in the images. At high resolution, any flaw in a generated image is easy for the discriminator to spot, and as a result the whole training process can fail.

The solution to this behaviour is to train the network by progressively adding layers. Progressive GAN is an extension of the standard GAN that keeps the generator stable while dealing with large images. Training starts with very small images, such as 4 x 4 pixels, and successively adds blocks of layers that increase the image size to 8 x 8, 16 x 16, and so on until the desired size is reached, as shown in the picture below. This makes Progressive GAN capable of generating high-resolution images, such as 1024 x 1024.
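
The growth schedule can be sketched in a few lines, assuming that each stage doubles the side length of the image, as in the original Progressive GAN paper. The helper name resolution_schedule is hypothetical.

```python
def resolution_schedule(start=4, target=1024):
    """List the image side lengths visited while growing from start to target."""
    if start <= 0 or target < start:
        raise ValueError("expected 0 < start <= target")
    sizes = [start]
    while sizes[-1] < target:
        sizes.append(sizes[-1] * 2)  # each new block of layers doubles the size
    if sizes[-1] != target:
        raise ValueError("target must be start times a power of two")
    return sizes

schedule = resolution_schedule()
print(schedule)       # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(len(schedule))  # 9 training stages
```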

As we can see from the above image, new convolutional layers are added to both the generator and the discriminator during training; this lets the whole model learn both the large-scale structure and the finer details of the images effectively.
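
In the original Progressive GAN paper, a new block is not switched on abruptly; its output is blended with the upsampled output of the existing layers while a weight alpha grows from 0 to 1. The NumPy sketch below illustrates only that blending step. The constant-valued arrays are stand-ins for real layer outputs, and the helper names are hypothetical.

```python
import numpy as np

def upsample_nearest(img, factor=2):
    """Repeat each pixel `factor` times along height and width."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def fade_in(old_low_res, new_high_res, alpha):
    """Blend the upsampled old output with the output of the new block."""
    return (1.0 - alpha) * upsample_nearest(old_low_res) + alpha * new_high_res

old = np.full((4, 4, 3), 0.2)  # stand-in for the 4 x 4 output of the trained layers
new = np.full((8, 8, 3), 0.8)  # stand-in for the 8 x 8 output of the new block

for alpha in (0.0, 0.5, 1.0):
    blended = fade_in(old, new, alpha)
    print(alpha, blended.shape, round(float(blended.mean()), 2))
```

At alpha = 0 the network behaves exactly like the smaller one it grew from, and at alpha = 1 the new block has fully taken over.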

In this article, we will use a pretrained Progressive GAN model with TensorFlow and see how this model can be used to generate artificial faces.

Implementation of Progressive GAN:

The demonstration below uses a TensorFlow model based on GAN that maps an N-dimensional latent space vector to an RGB image.

The code below shows two things: mapping a latent space vector to an image, and, given a target image, using gradient descent to find a latent vector that generates it.

Here the latent space is simply a compressed representation of the data in which similar data points lie closer to each other. This is important because the GAN model takes a point from the latent space and generates an image based on it.
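
The code later in this article samples latent vectors from a standard normal distribution with latent_dim = 512 and regularizes their norm towards sqrt(latent_dim). The short NumPy sketch below, which is only an illustration and not part of the model code, shows why that value is a natural choice: the norms of such vectors cluster tightly around it.

```python
import numpy as np

latent_dim = 512
rng = np.random.default_rng(0)

# Draw 1000 latent vectors, one per row, from a standard normal distribution
z = rng.standard_normal((1000, latent_dim))
norms = np.linalg.norm(z, axis=1)

print(f"sqrt(latent_dim) = {np.sqrt(latent_dim):.2f}")
print(f"mean norm = {norms.mean():.2f}, std of norms = {norms.std():.2f}")
```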

The code below is part of the official implementation of Progressive GAN in TensorFlow.

Install & Import all dependencies:

# imageio is used to create the animation, scikit-image to resize images
!pip -q install imageio
!pip -q install scikit-image
!pip install git+https://github.com/tensorflow/docs

import time

from absl import logging
import imageio
import PIL.Image
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow_docs.vis import embed

# Fix the random seed so that the results are reproducible
tf.random.set_seed(42)

try:
  from google.colab import files
except ImportError:
  pass

from IPython import display
from skimage import transform

Helper functions:

The model uses a latent dimension of 512. In this section, we create helper functions for displaying an image, for animating a sequence of images to see the changes, and for interpolating between two latent vectors.

latent_dim = 512

# Interpolate between two vectors while keeping every step at the norm of v1
def interpolating_vectors(v1, v2, num_steps):
  v1_n = tf.norm(v1)
  v2_n = tf.norm(v2)
  # Rescale v2 so that it has the same norm as v1
  v2_normal = v2 * (v1_n / v2_n)
  vect = []
  for step in range(num_steps):
    # Linear interpolation, then rescaling back to the norm of v1
    interpol = v1 + (v2_normal - v1) * step / (num_steps - 1)
    interpol_norm = tf.norm(interpol)
    interpol_normal = interpol * (v1_n / interpol_norm)
    vect.append(interpol_normal)
  return tf.stack(vect)

# Convert a float image with values in [0, 1] to a PIL image
def image_display(img):
  img = tf.constant(img)
  img = tf.image.convert_image_dtype(img, dtype=tf.uint8)
  return PIL.Image.fromarray(img.numpy())

# Save a sequence of float images as a GIF and embed it in the notebook
def animation(img):
  img = np.array(img)
  converted_images = np.clip(img * 255, 0, 255).astype(np.uint8)
  imageio.mimsave('./animation.gif', converted_images)
  return embed.embed_file('./animation.gif')
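
To check what the interpolation helper above does, here is the same logic rewritten in plain NumPy so it can run without TensorFlow. The name interpolate_on_hypersphere and the small num_steps value are choices made for this sketch only.

```python
import numpy as np

def interpolate_on_hypersphere(v1, v2, num_steps):
    """NumPy version of interpolating_vectors: every step keeps the norm of v1."""
    v1_norm = np.linalg.norm(v1)
    v2_scaled = v2 * (v1_norm / np.linalg.norm(v2))
    steps = []
    for step in range(num_steps):
        point = v1 + (v2_scaled - v1) * step / (num_steps - 1)
        steps.append(point * (v1_norm / np.linalg.norm(point)))
    return np.stack(steps)

rng = np.random.default_rng(1)
v1, v2 = rng.standard_normal(512), rng.standard_normal(512)
path = interpolate_on_hypersphere(v1, v2, num_steps=5)

print(path.shape)                                  # (5, 512)
print(np.round(np.linalg.norm(path, axis=1), 3))   # five identical norms
```

Plain linear interpolation would pull the intermediate vectors towards the origin, where the generator has seen few samples; rescaling each step keeps the whole path at the norm of the starting vector.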

Latent space interpolation:

Here we interpolate between two random vectors; for that, we use a TF-Hub module that contains a pretrained Progressive GAN model.

progan = hub.load('https://tfhub.dev/google/progan-128/1').signatures['default']
def interpolating_between_vect():
  v1 = tf.random.normal([latent_dim])
  v2 = tf.random.normal([latent_dim])
  vect = interpolating_vectors(v1, v2, 150)
  interpolated_images = progan(vect)['default']
  return interpolated_images

interpolated_images = interpolating_between_vect()
animation(interpolated_images)

See the images generated by Progressive GAN from the randomly generated vectors:

Finding the closest vector in latent space:

Here we try to reproduce a target image from a latent space vector. By default the target image is generated by the module itself; we can also upload our own image by setting inside_image to False.

inside_image = True  
def from_module_space():
  vector = tf.random.normal([1, 512])
  images = progan(vector)['default'][0]
  return images

def upload_image():
  uploaded = files.upload()
  img = imageio.imread(uploaded[list(uploaded.keys())[0]])
  return transform.resize(img, [128, 128])

if inside_image:
  target_image = from_module_space()
else:
  target_image = upload_image()

image_display(target_image)

As I have chosen to use the image from the module, here is our target image:

Below we generate the starting image, from which the model tries to converge to the target image. We need to define a loss function between the target image and the image generated from the latent vector. Then we can use gradient descent to find the latent vector that minimizes this loss.
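
Before running this search on the real model, the idea can be tried on a toy problem. The sketch below makes several simplifying assumptions that differ from the article's code: the "generator" is a hypothetical random linear map, the loss is mean squared error, and the update is plain gradient descent instead of Adam. It only illustrates how descending the loss moves a latent vector towards one that reproduces the target.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_size = 8, 32

# Hypothetical linear "generator": maps a latent vector to a flat toy image
W = rng.standard_normal((image_size, latent_dim))

def toy_generator(z):
    return W @ z

def loss_fn(z, target):
    return float(np.mean((toy_generator(z) - target) ** 2))

target = toy_generator(rng.standard_normal(latent_dim))  # image we want to match
z = rng.standard_normal(latent_dim)                      # random starting point

initial_loss = loss_fn(z, target)
learning_rate = 0.1
for _ in range(500):
    residual = toy_generator(z) - target
    grad = 2.0 * W.T @ residual / image_size  # gradient of the mean squared error
    z = z - learning_rate * grad
final_loss = loss_fn(z, target)

print(f"initial loss: {initial_loss:.4f}, final loss: {final_loss:.2e}")
```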

tf.random.set_seed(4)
initial_vector = tf.random.normal([1, latent_dim])
# image_display is the helper defined in the helper functions section
image_display(progan(initial_vector)['default'][0])

Below is our starting image:

Now find the closest latent point for the target image:

def closest_vector(initial_vector, num_optimization_steps, steps_per_img):
  images = []
  losses = []
  # The latent vector is the only variable being optimized
  vector = tf.Variable(initial_vector)
  optimizer = tf.optimizers.Adam(learning_rate=0.01)
  loss_fn = tf.losses.MeanAbsoluteError(reduction="sum")
  for step in range(num_optimization_steps):
    # Print one progress dot per step, with a new line every 100 steps
    if (step % 100) == 0:
      print()
    print('.', end='')

    with tf.GradientTape() as tape:
      image = progan(vector.read_value())['default'][0]
      # Keep a snapshot every steps_per_img steps for the animation
      if (step % steps_per_img) == 0:
        images.append(image.numpy())

      # Drop a possible alpha channel from the target before comparing
      target_image_difference = loss_fn(image, target_image[:,:,:3])
      # Keep the norm of the vector close to sqrt(latent_dim)
      regularizer = tf.abs(tf.norm(vector) - np.sqrt(latent_dim))
      loss = target_image_difference + regularizer
      losses.append(loss.numpy())

    grads = tape.gradient(loss, [vector])
    optimizer.apply_gradients(zip(grads, [vector]))
  return images, losses

num_optimization_steps = 200
steps_per_img = 5
images, losses = closest_vector(initial_vector, num_optimization_steps, steps_per_img)
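
The loss used in the optimization loop above has two parts, and it can help to compute them by hand on a tiny example. The NumPy sketch below assumes that MeanAbsoluteError with reduction="sum" averages the absolute error over the last (channel) axis and then sums over the pixels, which is my understanding of the Keras behaviour. The function name latent_search_loss and the 2 x 2 images are made up for illustration.

```python
import numpy as np

def latent_search_loss(image, target, vector, latent_dim=512):
    """Image difference plus a penalty on the latent norm, as in the loop above."""
    # Mean absolute error over the channel axis, summed over all pixels
    per_pixel = np.mean(np.abs(image - target), axis=-1)
    image_term = float(per_pixel.sum())
    # Penalize latent vectors whose norm drifts away from sqrt(latent_dim)
    regularizer = float(abs(np.linalg.norm(vector) - np.sqrt(latent_dim)))
    return image_term + regularizer, image_term, regularizer

image = np.full((2, 2, 3), 0.75)   # toy generated image
target = np.full((2, 2, 3), 0.25)  # toy target image
vector = np.ones((1, 512))         # its norm is sqrt(512), so no penalty

total, image_term, regularizer = latent_search_loss(image, target, vector)
print(total, image_term, regularizer)
```

Every pixel here differs by 0.5 in each channel, so the image term is 4 pixels times 0.5, and the regularizer contributes nothing because the vector already has the expected norm.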

Results:

Check out the animation, which shows how the generated image converges to our target image, and then compare the result side by side: the generated image is on the left and the target image is on the right.

# animation and image_display are the helpers defined earlier
animation(np.stack(images))

image_display(np.concatenate([images[-1], target_image], axis=1))

Conclusion:

In this article, we have seen how standard GAN architectures struggle when handling high-resolution images, and how Progressive GAN addresses the problem: progressively adding layers helps the generator remain stable throughout training and generate reasonable images.

Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.
