How Generative Adversarial Network Works

Aditya Mohanty · Published in DataDrivenInvestor · 4 min read · Jan 1, 2020


Generative adversarial networks, or GANs, are regarded as one of the biggest inventions in the field of deep learning. Face-aging apps, which have recently become immensely popular, use a variant of GAN as their underlying algorithm. GANs have also become extremely useful for converting low-resolution images into their higher-resolution counterparts. In this article, we shall look at the building blocks of a basic generative adversarial network.

Architecture Of GANs:

Each GAN has two basic elements: a generator and a discriminator. Each of these can be any deep learning network, such as an artificial neural network, a convolutional neural network, or a long short-term memory (LSTM) network. The discriminator ends in a classifier that outputs the probability that its input is real.

(Schematic Diagram Of A Basic GAN Architecture)

The above diagram summarizes the working of a basic GAN. The generator takes random values as input and produces an output that it hopes will look real to the discriminator. The discriminator, in turn, takes input both from the real image set and from the generator's output, and tries to classify each correctly, i.e., it should be able to distinguish real images from fake ones.
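To make the two networks concrete, here is a minimal sketch in PyTorch. The layer sizes, `noise_dim`, and `img_dim` are illustrative assumptions, not values from the article:

```python
import torch.nn as nn

noise_dim, img_dim = 64, 784  # assumed sizes, e.g. flattened 28x28 images

# Generator: maps a random noise vector to a fake "image" vector.
generator = nn.Sequential(
    nn.Linear(noise_dim, 128),
    nn.ReLU(),
    nn.Linear(128, img_dim),
    nn.Tanh(),  # outputs scaled to [-1, 1]
)

# Discriminator: ends in a sigmoid classifier giving the probability
# that its input came from the real data set.
discriminator = nn.Sequential(
    nn.Linear(img_dim, 128),
    nn.LeakyReLU(0.2),
    nn.Linear(128, 1),
    nn.Sigmoid(),
)
```

Any architecture with these two roles would do; fully connected layers are used here only to keep the sketch short.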

Layman’s Way Of Understanding GAN:

The generator of the GAN is expected to create images that look like real images. However, the generator has no idea what the real images look like. So it takes feedback from the discriminator (which knows, or claims to know, what real images look like) about how to tweak its parameters so that its output looks more real. Meanwhile, the discriminator tries to identify the images produced by the generator, driving down the probability it assigns to them being real, while also increasing the probability of classifying real images correctly. This competitive learning, inspired by game theory, makes both networks stronger.

Mathematical Implications:

Let us consider z to be a noise vector that is given as input to the generator. Then G(z) is the generator's output. Similarly, if x is a training sample, D(x) is the discriminator's probability that x is real, and D(G(z)) is the discriminator's output for a generated image, i.e., for a fake image.
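In code, these quantities line up directly with the notation. Continuing the sketch above (the batch size and the stand-in `real_images` tensor are assumptions for illustration):

```python
import torch

z = torch.randn(32, noise_dim)         # a batch of noise vectors z
fake_images = generator(z)             # G(z)
real_images = torch.rand(32, img_dim)  # stand-in for a batch of real samples x
d_real = discriminator(real_images)    # D(x): probability that x is real
d_fake = discriminator(fake_images)    # D(G(z)): probability that a fake is real
```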

Maximizing D(G(z)) is the same as minimizing 1 − D(G(z)), and minimizing D(G(z)) is the same as maximizing 1 − D(G(z)). Note also that the discriminator has two losses, since it receives two kinds of input: the output of the generator and the real data samples. So the loss has to be calculated twice, and the total loss at the discriminator is the sum of the losses on real and fake data. After calculating the losses we can backpropagate and adjust the parameters.
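A sketch of how these losses might be computed with binary cross-entropy, building on the tensors above (a real training loop would also need optimizers and `zero_grad()`/`step()` calls, omitted here):

```python
import torch
import torch.nn.functional as F

real_labels = torch.ones(32, 1)   # real samples are labelled 1
fake_labels = torch.zeros(32, 1)  # generated samples are labelled 0

# Discriminator loss: sum of the loss on real data and the loss on fake data.
d_loss_real = F.binary_cross_entropy(d_real, real_labels)
d_loss_fake = F.binary_cross_entropy(d_fake.detach(), fake_labels)  # detach so G gets no gradient here
d_loss = d_loss_real + d_loss_fake
d_loss.backward()  # backpropagate through the discriminator only

# Generator loss: push D(G(z)) toward 1, i.e. minimize 1 - D(G(z)).
g_loss = F.binary_cross_entropy(d_fake, real_labels)
g_loss.backward()  # flows through D into G; D's gradients from this pass are normally discarded
```

In practice each network is updated with its own optimizer in alternating steps; this snippet only shows where the two discriminator losses and the generator loss come from.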

These objectives combine into a single value function, the minimax objective from the original GAN paper:

V(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]

where E_x is the expectation over real training samples and E_z is the expectation over noise vectors. The above equation shows V as a value function that lets us adjust the parameters of both D and G.

Applying the value function to both D and G, the discriminator is trained to maximize V (classifying real and fake inputs correctly), while the generator is trained to minimize V, which amounts to minimizing log(1 − D(G(z))).

With the operations described above, both the discriminator and the generator become stronger at their respective tasks: fooling the discriminator, for the generator, and distinguishing real images from fake ones, for the discriminator.

In the next article, we shall look at some cool applications of GANs and build a GAN from scratch.
