An Autoencoder is an Artificial Neural Network that is trained to "encode" a given input into a specified latent space, and conversely reconstruct the input from this latent space by "decoding" it. The applications of such neural networks range all the way from data compression to denoising.
An Autoencoder for images works in two phases. An encoding phase, where the image is "encoded" into a latent space of a specified size, as shown for a 32x32x3 image below:
And a decoding phase, where the Neural Network "decodes" the entire image back using the latent code it was initially crunched into, as seen below:
As evidenced above, the entire image is fed into the Autoencoder in one go, encoded into and decoded from a single latent space.
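A traditional autoencoder of this kind can be sketched as follows. This is a minimal illustrative example in PyTorch, not the exact model used here: the hidden layer size (512) is an assumption, while the input size (32x32x3 = 3072 values) and the latent size of 96 match the figures and the note at the end of this post.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Minimal fully-connected autoencoder for a flattened 32x32x3 image."""

    def __init__(self, input_dim=32 * 32 * 3, latent_dim=96):
        super().__init__()
        # Encoder: compress 3072 pixel values down to a 96-dim latent code
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )
        # Decoder: reconstruct all 3072 pixel values from the latent code
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, input_dim), nn.Sigmoid(),  # pixels in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)        # (batch, 96) latent code
        return self.decoder(z)     # (batch, 3072) reconstruction

model = Autoencoder()
batch = torch.rand(4, 32 * 32 * 3)   # four random flattened "images"
recon = model(batch)
print(recon.shape)                   # torch.Size([4, 3072])
```

The whole image passes through one bottleneck, so all three colour channels compete for the same 96 latent dimensions.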
My design of a Channel-Specific Autoencoder involves splitting the image into its separate colour channels of R, G, and B, feeding them to three separate encoders that encode into separate latent spaces, decoding the individual channels separately, and stacking the results together to reconstruct the entire image. A depiction of it can be seen below:
The primary motivation for such an architecture was to "divide labour". As opposed to having a single autoencoder reconstruct the entire image, the proposed model "outsources" reconstructing individual colour channels to 3 separate autoencoders, thereby working in unison to generate the entire image.
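This "division of labour" can be sketched as below. Again, this is an illustrative PyTorch sketch rather than the exact model: the hidden layer size (256) is an assumption, but the structure follows the description above, with each channel getting its own 32-dim latent space (3 x 32 = 96 total, matching the traditional model).

```python
import torch
import torch.nn as nn

class ChannelAE(nn.Module):
    """One small autoencoder responsible for a single colour channel."""

    def __init__(self, pixels=32 * 32, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(pixels, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, pixels), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

class CSAutoencoder(nn.Module):
    """Three independent autoencoders, one per R/G/B channel."""

    def __init__(self):
        super().__init__()
        self.channel_aes = nn.ModuleList(ChannelAE() for _ in range(3))

    def forward(self, img):                  # img: (batch, 3, 32, 32)
        b = img.shape[0]
        outs = []
        for c, ae in enumerate(self.channel_aes):
            flat = img[:, c].reshape(b, -1)          # split off one channel
            outs.append(ae(flat).reshape(b, 32, 32))  # reconstruct it
        return torch.stack(outs, dim=1)      # stack channels: (b, 3, 32, 32)

model = CSAutoencoder()
recon = model(torch.rand(2, 3, 32, 32))
print(recon.shape)                           # torch.Size([2, 3, 32, 32])
```

Because each channel has a dedicated bottleneck, no single latent space has to trade off fidelity between colours.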
A few of the results can be seen below:
As seen in many of the results, the colours produced by the CS-Autoencoder appear more "true" than those from a traditional Autoencoder.
Note: Both Autoencoders encode the images into the same net latent space size of 96 (3 x 32 in the case of the CS-Autoencoder).