
Digit Creation with GANs

Generative Adversarial Networks (GANs) were introduced by Ian J. Goodfellow in 2014. They can be used to generate complex colour images, such as faces. Images created by GANs can look so real that it is practically impossible to distinguish between real and fake images. However, generating complex images at this level of realism requires a large amount of resources to train the network.

GAN - Generative Adversarial Network

In a GAN, two neural networks, a generator and a discriminator, are trained simultaneously through an adversarial process. The generator (the artist) learns to create images that look real, while the discriminator (the critic) learns to detect fake images. The two competing models try to beat each other; the goal is to train the generator to outperform the discriminator.

How does a GAN work?

Training a GAN consists of two parts:

  1. While the generator remains idle, the discriminator is trained on real images for a number of epochs to check that it correctly classifies them as real. In the same phase, the discriminator is also trained on fake images (generated by the generator) to check that it classifies them as fake.
  2. While the discriminator is idle, the generator is trained, using the discriminator's output to improve the generated images. These two steps are repeated for a large number of epochs. The resulting fake images are examined manually to judge whether they look real. If they do, training is stopped; if not, the two steps are repeated until the fake images appear real. This process is depicted in Figure 1.

Figure 1 - GAN working schematic

Source: Sarang (2021)

Model Architecture

The Generator

The generator schematic is shown in Figure 2. The generator takes a random noise vector of a given dimension; in this example a dimension of 100 is used. From this vector an image of 64x64x3 is generated. The image is upscaled through a series of convolutional layers, each followed by batch normalisation and a leaky ReLU activation. The leaky ReLU suffers from neither the dying-ReLU nor the vanishing-gradient problem. Strides are used in each convolutional layer to help keep training stable.

Figure 2 - Generator architecture

Source: Sarang (2021)

The Discriminator

The discriminator schematic is shown in Figure 3. The discriminator downsamples the given image through a series of convolutional layers and evaluates whether it is real or fake.

Figure 3 - Discriminator schematic

Source: Sarang (2021)

Model Implementation

Defining the Generator

The purpose of the generator is to create images containing the digit 5 which look similar to the images in the training dataset. A Keras sequential model is used to create the generator, shown in Figure 4. A summary of the generator model is shown in Figure 5.

Figure 4 - Generator architecture


Figure 5 - Generator model summary

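As a rough guide, the sketch below shows how a DCGAN-style generator of this kind can be built as a Keras Sequential model. The layer sizes are illustrative assumptions, not the repo's exact code: it takes a 100-dimensional noise vector and produces a 28x28x1 MNIST-style image, whereas the architecture described above produces 64x64x3 images.

```python
import tensorflow as tf
from tensorflow.keras import layers

NOISE_DIM = 100  # dimension of the random noise vector (assumption)

def build_generator():
    """DCGAN-style generator: noise vector -> 28x28x1 image (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(NOISE_DIM,)),

        # Project the noise vector and reshape it into a small feature map.
        layers.Dense(7 * 7 * 256, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 256)),

        # Upsample with strided transposed convolutions: 7x7 -> 14x14 -> 28x28.
        # Each layer is followed by batch normalisation and a leaky ReLU.
        layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),

        layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),

        # tanh keeps the output pixel values in [-1, 1].
        layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same',
                               use_bias=False, activation='tanh'),
    ])
    return model

generator = build_generator()
generator.summary()  # compare against the model summary in Figure 5
```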

Testing the Generator

Test output from the generator is shown in Figure 6.

Figure 6 - Test Generator output

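A minimal sketch of how the (as yet untrained) generator can be exercised to produce a test image like the one in Figure 6. The names `generator` and `NOISE_DIM` come from the sketch above and are assumptions, not the repo's actual identifiers.

```python
import matplotlib.pyplot as plt

# Feed a single random noise vector through the untrained generator.
noise = tf.random.normal([1, NOISE_DIM])
generated_image = generator(noise, training=False)

# Before training, the output is essentially noise.
plt.imshow(generated_image[0, :, :, 0], cmap='gray')
plt.axis('off')
plt.show()
```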

Defining the Discriminator

The discriminator model uses just two convolutional layers. The output of the last convolutional layer has shape (batch size, height, width, filters). The Flatten layer flattens this output so that it can be fed into the final Dense layer of the network.

Figure 7 - Discriminator model architecture


Figure 8 - Discriminator model summary

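A possible Keras Sequential definition matching this description (two strided convolutional layers, a Flatten layer and a final Dense layer). The filter counts, kernel sizes and the 28x28x1 input shape are illustrative assumptions and may differ from the model summarised in Figure 8.

```python
def build_discriminator():
    """Discriminator: 28x28x1 image -> single real/fake score (a raw logit)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),

        # Two strided convolutional layers progressively downsample the image.
        layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(),

        layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(),

        # Flatten the (batch, height, width, filters) output and score it.
        layers.Flatten(),
        layers.Dense(1),
    ])
    return model

discriminator = build_discriminator()
discriminator.summary()  # compare against the model summary in Figure 8
```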

Testing the Discriminator

The discriminator can be tested by feeding it the image generated earlier. The discriminator gives a negative value if the image is fake and a positive value if it is real.

Figure 9 - Discriminator test output


The decision value is -0.0014415; the negative value indicates that the image is classified as fake.
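
Continuing the sketch above (and assuming the `generator`, `discriminator` and `generated_image` names defined there), the test looks roughly like this:

```python
# Score the image produced by the untrained generator.
decision = discriminator(generated_image, training=False)
print(decision.numpy())  # negative -> judged fake, positive -> judged real
```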

Defining Loss Functions

Keras’s binary cross-entropy is used as the loss function, since there are two classes: 1 for a real image and 0 for a fake one. The generator loss measures how well the generator tricks the discriminator: if the generator performs well, the discriminator classifies the fake image as real, returning a decision of 1. The generator loss therefore compares the discriminator's decisions on the generated images with an array of ones. The discriminator loss has two parts: the decisions on real images are compared with an array of ones, and the decisions on fake images are compared with an array of zeros. The total discriminator loss is the sum of these two losses.
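
A sketch of these two loss functions using `tf.keras.losses.BinaryCrossentropy`. `from_logits=True` assumes, as in the discriminator sketch above, that the final Dense layer outputs a raw score rather than a sigmoid probability; the function names are assumptions.

```python
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def generator_loss(fake_output):
    # The generator wants the discriminator to label its images as real (1).
    return cross_entropy(tf.ones_like(fake_output), fake_output)

def discriminator_loss(real_output, fake_output):
    # Real images are compared against ones, fake images against zeros.
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss
```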

Model Training

The generator and discriminator models are trained together in a series of steps. Gradient tape is used for automatic differentiation of both the generator and the discriminator.

At each step, a batch of images is passed to the training function. The discriminator produces outputs for both the training images and the generated images; the output for the training images is treated as the real output and the output for the generated images as the fake output. The generator loss is calculated from the fake output, and the discriminator loss from both the real and fake outputs. Gradient tape is then used to compute the gradients of both losses, and the resulting gradients are applied to the corresponding models.
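
A sketch of a single training step along these lines, using `tf.GradientTape`. The optimizers, batch size and function names are assumptions rather than the repo's exact code; the models and loss functions are the ones sketched earlier.

```python
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

BATCH_SIZE = 256  # assumed batch size

@tf.function
def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, NOISE_DIM])

    # Record operations on both models so gradients can be taken for each.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)            # real batch
        fake_output = discriminator(generated_images, training=True)  # fake batch

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # Compute and apply the gradients for each model separately.
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
```

An outer loop then calls `train_step` on every batch of real images for the desired number of epochs, periodically generating sample images so their quality can be inspected manually.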

Figure 10 - Images for the digit 5 generated by the GAN

The GAN produces acceptable output after just 20-30 epochs, with better quality above 70 epochs.

Summary

Generative Adversarial Networks (GANs) provide a method for imitating given images. GANs consist of two networks, a generator and a discriminator, which are trained simultaneously in an adversarial process. In this repo a GAN was constructed and trained on handwritten digits, alphabets and anime characters. Training a GAN can require huge resources, but the results can be impressive. GANs have been applied successfully in many areas, from generating images for large datasets to creating celebrity faces and generating emojis from photos.

Acknowledgements

Sarang, Poornachandra (2021). Artificial Neural Networks with TensorFlow 2: ANN Architecture Machine Learning Projects.
