A Beginner’s Guide To Generative Adversarial Networks (GANs): The Concept Behind DeepFakes

Oil painting by Urs Schmid illustrating Roger Penrose’s theory of 5-fold symmetry

In the early 1960s, AI pioneer Herbert Simon predicted that, within two decades, machines would match the cognitive abilities of humankind. Predictions like these motivated theorists, sceptics and thinkers from a cross-section of domains to find ways to use computers to perform routine tasks. From Heron’s automatons in the first century to Google’s DeepMind in the 21st century, there has been an undying pursuit to create human-like intelligence external to humans.

AI is now able to make art that fetches hefty prices at auctions. It can help the e-commerce industry with recommendations (pixel-level domain transfer), medical anomaly detection, music generation and, most popularly, the generation of faces of people who never existed.

One thing that underlies the successful commercialisation of AI in all the above cases is the use of Generative Adversarial Networks (GANs).


With the publication of the original GAN paper in 2014, applications of GANs have witnessed tremendous growth. Generative Adversarial Networks (GANs) have been successfully used for high-fidelity natural image synthesis, improving learned image compression, and data augmentation tasks.

GANs have advanced to a point where they can pick up subtle expressions denoting significant human emotions. They have become the powerhouses of unsupervised machine learning.

The latest developments in AI, especially in the applications of Generative Adversarial Networks (GANs), can help researchers tackle the final frontier for replicating human intelligence. With a new paper being released every week, GANs are proving to be a front-runner for achieving the ultimate — AGI.

This article is an attempt to familiarise the reader with the jargon surrounding GANs and to give a high-level view of how they function.

The Working Principle

GANs are generative models devised by Goodfellow et al. in 2014. They work on the principle of generating and discriminating inputs: the two networks, the Generator and the Discriminator, go toe to toe like arch-nemeses, and this rivalry eventually benefits the overall model.

Source: Thalles Silva

The Generator (G) is responsible for producing a rich, high-dimensional vector that attempts to replicate a given data-generating process; the Discriminator (D) acts to separate the inputs created by the Generator from those of the real, observed data-generating process. They are trained jointly: G benefits from D’s inability to tell true data from generated data, while D’s loss is minimised when it correctly classifies inputs coming from G as fake and inputs from the dataset as real.
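Formally, the original paper frames this contest as a two-player minimax game over a value function V(D, G), where p_data is the distribution of the real data and p_z is the prior on the Generator’s input noise:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
$$

D is trained to push D(x) towards 1 for real samples and D(G(z)) towards 0, while G is trained to do the opposite for its own outputs.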

Generator Architecture

Source: researchgate

The Generator network has 4 convolutional layers, all followed by BatchNorm (except for the output layer) and Rectified Linear unit (ReLU) activations.

Rectified Linear Unit or ReLU is now one of the most widely used activation functions. The function computes max(0, x), which means that anything less than zero is returned as 0, while positive values pass through linearly with a slope of 1.
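As a quick illustration (a minimal sketch, not code from the article), ReLU is a one-liner in NumPy:

```python
import numpy as np

def relu(x):
    # max(0, x) element-wise: negatives become 0, positives pass through with slope 1.
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # negatives clipped to 0, positives unchanged
```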

The network takes as input a vector drawn from a normal distribution and feeds it through the consecutive layers. Each of these layers represents a convolution operation.
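Below is a minimal sketch of such a generator, assuming a DCGAN-style setup in PyTorch where the convolutions are transposed (fractionally-strided) convolutions; the class name, layer sizes and image resolution are illustrative choices, not taken from the figure:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator: maps a noise vector z ~ N(0, 1) to a 64x64 image."""
    def __init__(self, z_dim=100, channels=3, feat=64):
        super().__init__()
        self.net = nn.Sequential(
            # Each block is a transposed convolution that grows the spatial resolution.
            nn.ConvTranspose2d(z_dim, feat * 8, 4, 1, 0, bias=False),    # 1x1  -> 4x4
            nn.BatchNorm2d(feat * 8),
            nn.ReLU(True),
            nn.ConvTranspose2d(feat * 8, feat * 4, 4, 2, 1, bias=False), # 4x4  -> 8x8
            nn.BatchNorm2d(feat * 4),
            nn.ReLU(True),
            nn.ConvTranspose2d(feat * 4, feat * 2, 4, 2, 1, bias=False), # 8x8  -> 16x16
            nn.BatchNorm2d(feat * 2),
            nn.ReLU(True),
            # Output layer: no BatchNorm; tanh squashes pixels to [-1, 1] (a common DCGAN choice).
            nn.ConvTranspose2d(feat * 2, channels, 4, 4, 0, bias=False), # 16x16 -> 64x64
            nn.Tanh(),
        )

    def forward(self, z):
        # z has shape (batch, z_dim); reshape to (batch, z_dim, 1, 1) for the conv stack.
        return self.net(z.view(z.size(0), -1, 1, 1))

# Usage: sample noise from a normal distribution and generate a batch of fake images.
z = torch.randn(16, 100)
fake_images = Generator()(z)   # shape: (16, 3, 64, 64)
```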

Discriminator Architecture

Source: researchgate

The Discriminator is also a 4-layer CNN with BatchNorm (except for its input layer) and Leaky ReLU activations.

The drawback of the ReLU function is its fragility: when a large gradient flows through a ReLU neuron, it can render the neuron useless and unable to fire on any other datapoint for the rest of training. To address this problem, the Leaky ReLU was introduced. Unlike ReLU, where anything less than zero is returned as zero, the leaky version instead allows a small, non-zero slope for negative inputs, which is crucial for the functioning of a Discriminator network.
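Putting the two ideas together, here is a minimal sketch of such a discriminator, again assuming a DCGAN-style PyTorch setup with a Leaky ReLU slope of 0.2; the class name and layer sizes are illustrative, not taken from the figure:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """DCGAN-style discriminator: maps a 64x64 image to a single 'real' probability."""
    def __init__(self, channels=3, feat=64):
        super().__init__()
        self.net = nn.Sequential(
            # Input layer: no BatchNorm here, just a strided convolution + Leaky ReLU.
            nn.Conv2d(channels, feat, 4, 2, 1, bias=False),        # 64x64 -> 32x32
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(feat, feat * 2, 4, 2, 1, bias=False),        # 32x32 -> 16x16
            nn.BatchNorm2d(feat * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(feat * 2, feat * 4, 4, 2, 1, bias=False),    # 16x16 -> 8x8
            nn.BatchNorm2d(feat * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # Final layer collapses the feature map to one logit; sigmoid gives a probability.
            nn.Conv2d(feat * 4, 1, 8, 1, 0, bias=False),           # 8x8 -> 1x1
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).view(-1)   # one probability per image in the batch

# Usage: score a batch of images; values near 1 mean "real", near 0 mean "fake".
scores = Discriminator()(torch.randn(16, 3, 64, 64))
```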

Half of the time the Discriminator network receives images from the training set and the other half from the generator.

The Discriminator has to output probabilities close to 1 for real images and close to 0 for fake images. To do that, the Discriminator’s loss is the sum of two partial losses: one for maximising the probability assigned to real images and another for minimising the probability assigned to fake images.
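In practice this sum is usually implemented as binary cross-entropy over the real and fake halves of a batch. A minimal sketch, assuming the hypothetical Generator and Discriminator classes above (the helper name discriminator_loss is just for illustration):

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def discriminator_loss(D, G, real_images, z_dim=100):
    """Sum of two partial losses: push D(real) towards 1 and D(G(z)) towards 0."""
    batch_size = real_images.size(0)
    z = torch.randn(batch_size, z_dim)
    fake_images = G(z).detach()          # detach: do not backprop into the generator here

    loss_real = bce(D(real_images), torch.ones(batch_size))    # real images -> label 1
    loss_fake = bce(D(fake_images), torch.zeros(batch_size))   # fake images -> label 0
    return loss_real + loss_fake

# Usage: d_loss = discriminator_loss(D, G, real_batch)
```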

As training progresses, the Generator starts to output images that look closer and closer to the images from the training set. That happens because the Generator is trained to learn the data distribution underlying the training set images.

At the same time, the Discriminator gets increasingly good at classifying samples as real or fake.

Summary

The whole concept of GANs can be summarised as: “the generator is trying to fool the discriminator while the discriminator is trying not to get fooled by the generator. As the models train through alternating optimisation, both improve until the counterfeits become indistinguishable from the genuine ones.”
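For completeness, the alternating optimisation described above can be written as a short training loop. This is an illustrative sketch built on the hypothetical Generator and Discriminator classes from earlier, not the exact recipe from any specific paper; `dataloader` is assumed to yield batches of real images scaled to [-1, 1]:

```python
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

for real_images in dataloader:   # 'dataloader' assumed: batches shaped (B, 3, 64, 64)
    b = real_images.size(0)
    z = torch.randn(b, 100)

    # 1) Discriminator step: real images -> 1, generated images -> 0.
    opt_d.zero_grad()
    d_loss = bce(D(real_images), torch.ones(b)) + bce(D(G(z).detach()), torch.zeros(b))
    d_loss.backward()
    opt_d.step()

    # 2) Generator step: try to fool D into labelling generated images as real (1).
    opt_g.zero_grad()
    g_loss = bce(D(G(z)), torch.ones(b))
    g_loss.backward()
    opt_g.step()
```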

Further reading:

https://adeshpande3.github.io/Deep-Learning-Research-Review-Week-1-Generative-Adversarial-Nets

https://skymind.ai/wiki/generative-adversarial-network-gan

https://www.freecodecamp.org/news/an-intuitive-introduction-to-generative-adversarial-networks-gans-7a2264a81394/

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
