Hands-On Guide To Generate Car Models Using Deep Convolutional GAN

Generative Adversarial Networks (GANs for short) are unsupervised learning models that belong to a set of algorithms called generative models. These models are typically used to generate data, such as images. GANs learn to mimic a given data distribution; that is, they can be taught to create worlds similar to our own in any domain: images, music, speech, prose. They are thus extensively used in image, video and voice generation.

Through this article, we will demonstrate how a Deep Convolutional GAN (DCGAN) can be used to generate new car models when trained on a dataset of car images. In this article we cover:

  • The architecture and working of a GAN
  • How a DCGAN is different from Vanilla GAN
  • Code to build car models using DCGAN

So let’s get started!



The architecture and working of a GAN

The GAN architecture contains two neural networks that compete with one another to generate data that is closely related to the real data. The two models used in GANs are the generative model and the discriminative model. 

Think of the discriminative model as your regular classification model. It takes in the input data, learns its features and assigns the data a label or category. These models are concerned with the correlation between input and target, and map features to labels.

Contrary to this, the generative model attempts to generate data given a label or category; it models the distribution of the individual classes. So, how do these two models work together?

Let us understand the working of these models with an example. Suppose you want to generate handwritten digits like the MNIST dataset. The job of the generative model is to create new data that is as close as possible to the real data and pass it on to the discriminator. The goal of the generator is to lie and not get caught! It hopes that the synthetic images it has generated will be deemed authentic by the discriminator, though they are fake.

The discriminator, on the other hand, has to analyse the real data and the fake data and classify each image as authentic (1) or fake (0).

Steps taken by a GAN:

  • The generator takes a set of random numbers and generates images. 
  • These images along with the ground truth dataset are fed into the discriminator.
  • The discriminator is in a feedback loop with the ground truth of the images. For a vanilla GAN, the discriminator is a standard neural network that classifies the images. The generator is an inverse convolution network. It takes a vector of random noise and up-samples it to generate the image. 
  • The discriminator analyses these images and outputs a value close to 1 if they are authentic and close to 0 if they are fake.
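To make the data flow in these steps concrete, here is a toy numpy sketch (not part of the article's code): the generator and discriminator are stand-in fixed linear maps rather than trained networks, but the shapes and value ranges mirror the real pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, noise_dim, img_dim = 4, 100, 64 * 64 * 3

# Stand-in "generator": a fixed linear map from noise to image-sized
# vectors, squashed into [-1, 1] like the tanh output of the real model.
G = rng.normal(0, 0.01, size=(noise_dim, img_dim))
def generator(z):
    return np.tanh(z @ G)

# Stand-in "discriminator": maps an image vector to a probability in (0, 1).
w = rng.normal(0, 0.01, size=(img_dim,))
def discriminator(x):
    return 1.0 / (1.0 + np.exp(-(x @ w)))

z = rng.normal(0, 1, size=(batch, noise_dim))       # step 1: random noise
fake = generator(z)                                 # step 1: generated images
real = rng.uniform(-1, 1, size=(batch, img_dim))    # step 2: ground-truth batch
scores = discriminator(np.concatenate([fake, real]))  # steps 3-4: classify all
print(scores.shape)  # (8,)
```

In the real model the linear maps are deep networks and their weights are updated by backpropagation, but the flow of tensors is the same.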

How a Deep Convolutional GAN is different from Vanilla GAN

We saw the architecture and working of a GAN. There are various types of GANs depending on the architecture; within the scope of this article, we will discuss the DCGAN, which is very similar to the vanilla GAN.

Deep Convolutional GANs use convolutional neural networks instead of vanilla neural networks for both the discriminator and the generator: the fully connected layers are replaced by deep convolutional ones. ConvNets, in general, find areas of correlation within an image, that is, they look for spatial correlations. This makes a DCGAN a better fit for image/video data, whereas the general idea of a GAN can be applied to wider domains.
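As a quick illustration of what "spatial correlation" means (a hand-rolled numpy example, not from the article's code): a small edge-detecting kernel responds only where neighbouring pixels change in a particular pattern, which is exactly the kind of local structure convolutional layers exploit and fully connected layers ignore.

```python
import numpy as np

# A 5x5 image: left half dark (0), right half bright (1).
img = np.zeros((5, 5))
img[:, 2:] = 1.0

# A 3x3 vertical-edge kernel: responds where columns differ left-to-right.
k = np.array([[-1, 0, 1]] * 3, dtype=float)

# Valid cross-correlation (what a Conv2D layer computes, up to flips).
out = np.array([[np.sum(img[i:i + 3, j:j + 3] * k)
                 for j in range(3)] for i in range(3)])
print(out)  # strong response (3.0) at the edge, zero in the flat region
```

The response map peaks where the dark-to-bright transition sits and is zero over uniform regions, showing that the kernel encodes a purely spatial relationship between neighbouring pixels.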

Code to build car models using Deep Convolutional GAN

We have seen the architecture and working of GANs. Let us implement a project to try and understand this better. 

For this project, we will build a DCGAN to generate new images of Indian cars.

The Dataset

You can collect different images of Indian cars. I suggest you collect a specific portion of the car: for example, if you are collecting the side view, be sure to collect only the side view of different models. If you want to use the same dataset as I have, click here to download it. I downloaded these customized images from Google as per the requirements of this implementation.

Once the dataset has been collected, let us start with our code. First, mount Google Drive to Google Colab and import the necessary libraries. 

from google.colab import drive
drive.mount('/content/gdrive')

import os
import time
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from keras.models import Sequential
from keras.layers import Conv2D, Conv2DTranspose, Reshape
from keras.layers import Flatten, BatchNormalization, Dense, Activation
from keras.layers.advanced_activations import LeakyReLU
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator

Next, set the path where the dataset images are stored.

path=os.path.join('gdrive','My Drive','indian_car')
path

We will write a function that loads our dataset from the given path in batches of the chosen size.

def dataset(dataset_path, batch_size, image_shape):
    # Stream images from the directory in batches, resized to image_shape.
    data_generator = ImageDataGenerator()
    dataset_generator = data_generator.flow_from_directory(
        dataset_path, target_size=(image_shape[0], image_shape[1]),
        batch_size=batch_size,
        class_mode=None)
    return dataset_generator

For better accuracy and faster convergence, I have used a custom loss function that is applied to the generator model.

from tensorflow.keras.losses import categorical_crossentropy
import tensorflow as tf
from tensorflow.keras import backend as K

class custom_loss:

    @staticmethod
    def get_L2_enhanced_loss(model):
        # Collect the trainable Conv2DTranspose layers, excluding the output layer.
        conv_layers = [layer for layer in model.layers
                       if type(layer) == Conv2DTranspose
                       and layer.trainable_weights[0].shape.as_list()[0] > 0]
        conv_layers = conv_layers[:-1]

        def L2_enhanced():
            total_loss = K.variable(0)
            for layer in conv_layers:
                weights = layer.trainable_weights[0]
                total_loss = total_loss + tf.nn.l2_loss(weights)
            return total_loss
        return L2_enhanced

    @staticmethod
    def get_combined_L2_cross_entropy_loss(model, alpha, batch_size,
                                           total_data_size, epochs,
                                           end_percentage=0.1, scale=100):
        l2_loss = custom_loss.get_L2_enhanced_loss(model)

        def combined_loss(y_true, y_pred):
            # Cross-entropy plus an L2 penalty on the generator kernels.
            return categorical_crossentropy(y_true, y_pred) + alpha * l2_loss()
        return combined_loss

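To see what the penalty term contributes, here is a plain numpy sketch of the same arithmetic (the real code computes it over Keras layer weights): `tf.nn.l2_loss(w)` equals `sum(w ** 2) / 2`, and the combined loss adds `alpha` times that penalty, summed over the kernels, to the cross-entropy term.

```python
import numpy as np

def l2_loss(w):
    # Same formula as tf.nn.l2_loss: half the sum of squared elements.
    return np.sum(w ** 2) / 2.0

def combined_loss(cross_entropy, weights, alpha=5e-4):
    # Cross-entropy plus a scaled L2 penalty over all given weight arrays.
    return cross_entropy + alpha * sum(l2_loss(w) for w in weights)

weights = [np.ones((2, 2)), 2 * np.ones((3,))]
print(l2_loss(weights[0]))          # 2.0
print(combined_loss(0.7, weights))  # 0.7 + 5e-4 * (2.0 + 6.0) = 0.704
```

The `alpha` of 5e-4 used later when compiling the generator keeps the penalty small relative to the cross-entropy, nudging the kernels toward small weights without dominating the loss.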
Defining Discriminator and Generator

Now, let us build our discriminator. We will use Batch Normalization, with Leaky ReLU as the activation function.


def construct_discriminator(image_shape):
   discriminator = Sequential()
   discriminator.add(Conv2D(filters=64, kernel_size=(5, 5),
                            strides=(2, 2), padding='same',
                            data_format='channels_last',
                            kernel_initializer='glorot_uniform',
                            input_shape=(image_shape)))
   discriminator.add(LeakyReLU(0.2))

   discriminator.add(Conv2D(filters=128, kernel_size=(5, 5),
                            strides=(2, 2), padding='same',
                            data_format='channels_last',
                            kernel_initializer='glorot_uniform'))
   discriminator.add(BatchNormalization(momentum=0.5))
   discriminator.add(LeakyReLU(0.2))

   discriminator.add(Conv2D(filters=256, kernel_size=(5, 5),
                            strides=(2, 2), padding='same',
                            data_format='channels_last',
                            kernel_initializer='glorot_uniform'))
   discriminator.add(BatchNormalization(momentum=0.5))
   discriminator.add(LeakyReLU(0.2))

   discriminator.add(Conv2D(filters=512, kernel_size=(5, 5),
                            strides=(2, 2), padding='same',
                            data_format='channels_last',
                            kernel_initializer='glorot_uniform'))
   discriminator.add(BatchNormalization(momentum=0.5))
   discriminator.add(LeakyReLU(0.2))

   discriminator.add(Flatten())
   discriminator.add(Dense(1))
   discriminator.add(Activation('sigmoid'))

   optimizer = Adam(lr=0.0002, beta_1=0.5)
   discriminator.compile(loss='binary_crossentropy',
                         optimizer=optimizer,
                         metrics=None)

   return discriminator

Next, the generator. 

def construct_generator():
   generator = Sequential()
   generator.add(Dense(units=4 * 4 * 512,
                       kernel_initializer='glorot_uniform',
                       input_shape=(1, 1, 100)))
   generator.add(Reshape(target_shape=(4, 4, 512)))
   generator.add(BatchNormalization(momentum=0.5))
   generator.add(Activation('relu'))
   generator.add(Conv2DTranspose(filters=256, kernel_size=(5, 5),
                                 strides=(2, 2), padding='same',
                                 data_format='channels_last',
                                 kernel_initializer='glorot_uniform'))
   generator.add(BatchNormalization(momentum=0.5))
   generator.add(Activation('relu'))
   generator.add(Conv2DTranspose(filters=128, kernel_size=(5, 5),
                                 strides=(2, 2), padding='same',
                                 data_format='channels_last',
                                 kernel_initializer='glorot_uniform'))
   generator.add(BatchNormalization(momentum=0.5))
   generator.add(Activation('relu'))
   generator.add(Conv2DTranspose(filters=64, kernel_size=(5, 5),
                                 strides=(2, 2), padding='same',
                                 data_format='channels_last',
                                 kernel_initializer='glorot_uniform'))
   generator.add(BatchNormalization(momentum=0.5))
   generator.add(Activation('relu'))
   generator.add(Conv2DTranspose(filters=3, kernel_size=(5, 5),
                                 strides=(2, 2), padding='same',
                                 data_format='channels_last',
                                 kernel_initializer='glorot_uniform'))
   generator.add(Activation('tanh'))
   optimizer = Adam(lr=0.00015, beta_1=0.5)

   generator.compile(
       loss=custom_loss.get_combined_L2_cross_entropy_loss(
           generator, 5e-4, 16, 128, epochs=650, end_percentage=0.05),
       optimizer=optimizer, metrics=['accuracy'])

   return generator
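The shape progression of this generator can be checked with a few lines of arithmetic: with `padding='same'` and `strides=(2, 2)`, each `Conv2DTranspose` doubles the spatial size, taking the reshaped (4, 4, 512) tensor up to a 64x64 RGB image.

```python
# Filter counts per stage, as in the generator above: the Dense/Reshape
# produces 4x4x512, then each stride-2 transposed conv doubles the size.
size, channels = 4, [512, 256, 128, 64, 3]
shapes = [(size, size, channels[0])]
for c in channels[1:]:
    size *= 2
    shapes.append((size, size, c))
print(shapes)
# [(4, 4, 512), (8, 8, 256), (16, 16, 128), (32, 32, 64), (64, 64, 3)]
```

The final (64, 64, 3) output matches the `image_shape` we pass to the discriminator and to the training function later on.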

Before we load the dataset and start training, let us write a helper that saves the generated images to the drive.

def save_generated_images(generated_images, epoch, batch_number):
   plt.figure(figsize=(8, 8), num=2)
   gs1 = gridspec.GridSpec(8, 8)
   gs1.update(wspace=0, hspace=0)

   for i in range(32):
       ax1 = plt.subplot(gs1[i])
       ax1.set_aspect('equal')
       image = generated_images[i, :, :, :]
       image += 1
       image *= 127.5
       fig = plt.imshow(image.astype(np.uint8))
       plt.axis('off')
       fig.axes.get_xaxis().set_visible(False)
       fig.axes.get_yaxis().set_visible(False)
   plt.tight_layout()
   save_name = 'gdrive/My Drive/indian_car/generated_images/generatedSamples_epoch' + str(epoch + 1) + '_batch' + str(batch_number + 1) + '.png'
   plt.savefig(save_name, bbox_inches='tight', pad_inches=0)
   plt.pause(0.0000000001)
   plt.show()
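The two in-place operations above (`image += 1`, then `image *= 127.5`) undo the normalisation used during training. A quick numpy check of the mapping from the generator's tanh range [-1, 1] back to displayable pixel values [0, 255]:

```python
import numpy as np

# Generator outputs lie in [-1, 1]; (x + 1) * 127.5 maps them to [0, 255].
x = np.array([-1.0, 0.0, 1.0])
pixels = ((x + 1) * 127.5).astype(np.uint8)
print(pixels)  # [  0 127 255]
```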

Training of Deep Convolutional GAN

In the training phase, we will add a little noise to the labels to confuse the discriminator and make the generator model perform better.
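The "noise" here means soft labels. A small numpy sketch, mirroring the `real_y`/`fake_y` lines in the training loop below: real images get targets in (0.8, 1.0] and fakes get targets in [0.0, 0.2), instead of hard 1s and 0s.

```python
import numpy as np

rng = np.random.default_rng(42)
batch = 32

# Soft labels: subtract/add up to 0.2 of uniform noise from the hard targets.
real_y = np.ones(batch) - rng.random(batch) * 0.2   # in (0.8, 1.0]
fake_y = rng.random(batch) * 0.2                    # in [0.0, 0.2)
print(real_y.min(), real_y.max())
print(fake_y.min(), fake_y.max())
```

Keeping the discriminator's targets away from exact 0 and 1 prevents it from becoming overconfident too quickly, which keeps its gradients useful to the generator.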

def train_dcgan(batch_size, epochs, image_shape, dataset_path):
   generator = construct_generator()
   discriminator = construct_discriminator(image_shape)
   gan = Sequential()
   discriminator.trainable = False
   gan.add(generator)
   gan.add(discriminator)
   optimizer = Adam(lr=0.00015, beta_1=0.5)
   gan.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=None)
   dataset_generator = dataset(dataset_path, batch_size, image_shape)
    number_of_batches = int(394 / batch_size)  # total number of images (394) divided by the batch size
   adversarial_loss = np.empty(shape=1)
   discriminator_loss = np.empty(shape=1)
   batches = np.empty(shape=1)
   plt.ion()
   current_batch = 0
   for epoch in range(epochs):
       print("Epoch " + str(epoch+1) + "/" + str(epochs) + " :")
       for batch_number in range(number_of_batches):
           start_time = time.time()
           real_images = dataset_generator.next()
           real_images /= 127.5
           real_images -= 1
           current_batch_size = real_images.shape[0]
           noise = np.random.normal(0, 1, size=(current_batch_size,) + (1, 1, 100))
           generated_images = generator.predict(noise)
           real_y = (np.ones(current_batch_size) -
                     np.random.random_sample(current_batch_size) * 0.2)
           fake_y = np.random.random_sample(current_batch_size) * 0.2
           discriminator.trainable = True
           d_loss = discriminator.train_on_batch(real_images, real_y)
           d_loss += discriminator.train_on_batch(generated_images, fake_y)
           discriminator_loss = np.append(discriminator_loss, d_loss)
           discriminator.trainable = False
            noise = np.random.normal(0, 1,
                                     size=(current_batch_size * 2,) +
                                     (1, 1, 100))
            # Label the generated images as (noisy) real so the generator
            # is rewarded for fooling the frozen discriminator.
            fake_y = (np.ones(current_batch_size * 2) -
                      np.random.random_sample(current_batch_size * 2) * 0.2)
            g_loss = gan.train_on_batch(noise, fake_y)
           adversarial_loss = np.append(adversarial_loss, g_loss)
           batches = np.append(batches, current_batch)

            if ((batch_number + 1) % 12 == 0 and
                    current_batch_size == batch_size):
                save_generated_images(generated_images, epoch, batch_number)
           time_elapsed = time.time() - start_time
           print("     Batch " + str(batch_number + 1) + "/" +
                 str(number_of_batches) +
                 " generator loss | discriminator loss : " +
                 str(g_loss) + " | " + str(d_loss) + ' - batch took ' +
                 str(time_elapsed) + ' s.')
           current_batch += 1
       if (epoch + 1) % 20 == 0:
           discriminator.trainable = True
           generator.save('gdrive/My Drive/indian_car/models/generator_epoch' + str(epoch) + '.hdf5')
           discriminator.save('gdrive/My Drive/indian_car/models/discriminator_epoch' +
                              str(epoch) + '.hdf5')
       plt.figure(1)
       plt.plot(batches, adversarial_loss, color='green',
                label='Generator Loss')
       plt.plot(batches, discriminator_loss, color='blue',
                label='Discriminator Loss')
       plt.title("DCGAN Train")
       plt.xlabel("Batch Iteration")
       plt.ylabel("Loss")
       if epoch == 0:
           plt.legend()
       plt.pause(0.0000000001)
       plt.show()
       plt.savefig('trainingLossPlot.png')
print(path)
dataset_path = path
print(dataset_path)
batch_size = 32
image_shape = (64, 64, 3)
epochs = 500

#Calling the training function
train_dcgan(batch_size, epochs, image_shape, dataset_path)

The model is trained for 500 epochs with the Adam optimizer, and here are the results.

Initially, the generated images are noisy and unclear, but as the number of epochs increases, the model learns and produces higher-quality images.


Here are the generated car images at batch 11/12, with a generator loss of 1.9832957 and a discriminator loss of 0.76329595.


New (Fake) Image Generated by DCGAN

These fake car model images, generated after 500 epochs of training, look reasonably good. For clearer images, we could train for more epochs on a desktop GPU, since in my case the free GPU limit of Google Colab was exhausted.

Conclusion

Through this model, we demonstrated how new (fake) car models can be generated using a DCGAN. GANs are very powerful tools that can be put to great use in many industries; generating car models is one application we saw here. They are robot artists in a sense, and their output is impressive. A lot of research is ongoing on developing different types of GANs.

Copyright Analytics India Magazine Pvt Ltd