GAN architecture explained

28 Jan 2018

First, an overview of GANs in more technical detail (I'm pretty sure you already know how things work at a high level):

Code version of a standard GAN, in Keras:
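The embedded code isn't reproduced here, but a minimal sketch of what a standard GAN looks like in Keras is below. The layer sizes, latent dimension, and MNIST-style 28×28 image shape are my assumptions, not the original post's exact code:

```python
# Minimal GAN sketch in Keras (tf.keras). Layer sizes and the image
# shape are illustrative assumptions, not the original post's code.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 100
img_shape = (28, 28, 1)  # MNIST-style images (assumed)

# Generator: latent noise vector -> image; tanh puts pixels in [-1, 1]
generator = keras.Sequential([
    keras.Input(shape=(latent_dim,)),
    layers.Dense(256, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(int(np.prod(img_shape)), activation="tanh"),
    layers.Reshape(img_shape),
])

# Discriminator: image -> probability that it is real
discriminator = keras.Sequential([
    keras.Input(shape=img_shape),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Combined model: trains the generator to fool a frozen discriminator
discriminator.trainable = False
gan = keras.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

# One training step per batch would then be:
# 1. train the discriminator on real images (label 1) and fakes (label 0)
# 2. train `gan` on noise with label 1, updating only the generator
```

The training loop alternates those two steps so neither network runs away from the other.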

then, the different GANs we explore:

Vanilla GAN to DCGAN (Deep Convolutional GAN)

DCGAN to WGAN (Wasserstein GAN)

And then, let's dig into each of the individual parts in more detail below:

Batch Normalisation

For a summary, see the extract from the paper by Sergey Ioffe and Christian Szegedy.

Next, for a simpler breakdown, see this awesome explanation by Karl N. here:
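As a concrete illustration of the transform those write-ups describe, here is batch normalisation on a toy batch in plain NumPy. The ε value and the identity γ/β are my assumptions; in a real layer γ and β are learned:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise each feature over the batch to zero mean / unit
    variance, then scale by gamma and shift by beta.

    gamma and beta are learned parameters in a real network; they are
    fixed to identity values here for illustration.
    """
    mu = x.mean(axis=0)    # per-feature batch mean
    var = x.var(axis=0)    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.array([[1.0, 200.0],
                  [2.0, 400.0],
                  [3.0, 600.0]])
out = batch_norm(batch)
# Each column now has (approximately) zero mean and unit variance,
# regardless of its original scale.
```

Note that both columns end up on the same scale even though the raw features differed by two orders of magnitude, which is exactly why it stabilises training.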

Cross Entropy

Amazing and simple explanation by Rob DiPietro here:

It's very helpful to first understand entropy, then cross entropy, then KL divergence.
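Those three quantities fit together as H(p, q) = H(p) + KL(p‖q), i.e. cross entropy is the entropy of the true distribution plus the extra cost of using the wrong one. A quick NumPy check (the two example distributions are my own):

```python
import numpy as np

def entropy(p):
    """H(p) = -sum p * log(p)"""
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    """H(p, q) = -sum p * log(q)"""
    return -np.sum(p * np.log(q))

def kl_divergence(p, q):
    """KL(p || q) = sum p * log(p / q)"""
    return np.sum(p * np.log(p / q))

p = np.array([0.7, 0.2, 0.1])  # "true" distribution (assumed example)
q = np.array([0.5, 0.3, 0.2])  # model's predicted distribution

# Cross entropy decomposes into entropy plus KL divergence:
# H(p, q) = H(p) + KL(p || q)
assert np.isclose(cross_entropy(p, q), entropy(p) + kl_divergence(p, q))
```

Since KL(p‖q) ≥ 0, cross entropy is always at least the entropy of p, and minimising cross entropy in a classifier is the same as minimising the KL divergence to the true labels.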


tanh function

The hyperbolic tangent function essentially rescales the sigmoid function from its range of (0, 1) to (-1, 1); there is horizontal scaling as well.
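In symbols, tanh(x) = 2·σ(2x) − 1, where σ is the sigmoid: the factor of 2 and the −1 stretch and shift the output vertically, and the 2x inside is the horizontal compression. A quick check in pure Python (nothing assumed beyond the standard definitions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    # Vertically rescale sigmoid's (0, 1) range to (-1, 1),
    # and horizontally compress by a factor of 2.
    return 2.0 * sigmoid(2.0 * x) - 1.0

# Matches math.tanh at a handful of sample points
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert math.isclose(tanh_via_sigmoid(x), math.tanh(x))
```

This is why GAN generators often end with tanh: outputs land in [-1, 1], matching images normalised to that range.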

kernel initializer

Initializations define the way to set the initial random weights of Keras layers.

he_normal: Gaussian initialization scaled by fan_in (i.e. its number of inputs)
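Concretely, he_normal draws weights from a normal distribution with standard deviation sqrt(2 / fan_in) (Keras additionally truncates samples beyond two standard deviations). A rough NumPy sketch, omitting that truncation:

```python
import numpy as np

def he_normal(fan_in, fan_out, rng=None):
    """Sample a weight matrix with stddev sqrt(2 / fan_in).

    Keras's he_normal additionally truncates samples beyond two
    standard deviations; this sketch omits that detail.
    """
    rng = rng or np.random.default_rng(0)
    stddev = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, stddev, size=(fan_in, fan_out))

w = he_normal(fan_in=512, fan_out=256)
# Empirical stddev should be close to sqrt(2 / 512) = 0.0625
```

Scaling by fan_in keeps the variance of each layer's pre-activations roughly constant as signals flow forward, which is what makes deep ReLU networks trainable from the start.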