Computer-Vision

A collection of computer vision implementations using PyTorch and OpenCV. More will be added over time.

Image Classification (MNIST)

Autoencoder Generated Images Vs Ground Truth Images (Vanilla_Autoencoder)

Autoencoder Generated Images Vs Ground Truth Images (CNN_Autoencoder)

After being fed through the autoencoder, the reconstructed images are blurrier than the originals. Image quality seems a little better than with the linear autoencoder.

Variational Autoencoder Generated Images Vs Ground Truth Images (Vanilla VAE)

Variational Autoencoder Generated Images Vs Ground Truth Images (CNN VAE)

VAE with a CNN encoder and decoder. Unlike the vanilla VAE above, the bottleneck is rather small (Batch_size * 2 * 2). The resulting images clearly show the model struggling to generate a clear image through such a narrow bottleneck.
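To make the size of that bottleneck concrete, here is a minimal sketch of the sampling step for a single image with a 2 * 2 latent. It uses plain Python rather than PyTorch, and the names `reparameterize` and `kl_to_standard_normal` are illustrative assumptions, not functions from this repository.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over the latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


# Assumed bottleneck: 2 * 2 = 4 latent values per image.
latent_dim = 2 * 2
mu = [0.0] * latent_dim
logvar = [0.0] * latent_dim

# A seeded generator keeps the sample reproducible.
z = reparameterize(mu, logvar, random.Random(0))
kl = kl_to_standard_normal(mu, logvar)
```

With only four numbers per image to decode from, the decoder has very little information to work with, which is consistent with the blurry samples described above.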

Vanilla GANs (Linear Layers)

GANs training over time on MNIST data

LSGANs (Linear Layers)

Same network architecture as Vanilla GANs but with a least-squares loss
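As a rough illustration of what changes, the sketch below compares a binary cross-entropy discriminator loss with a least-squares one. It is written in plain Python rather than PyTorch. The function names and the real = 1 / fake = 0 targets are assumptions made for the example and may differ from the code in this repository.

```python
import math


def bce_d_loss(d_real, d_fake, eps=1e-12):
    """Vanilla GAN discriminator loss; inputs are probabilities in (0, 1)."""
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def ls_d_loss(d_real, d_fake):
    """Least-squares discriminator loss; inputs are raw scores.

    Real samples are pushed toward 1 and fake samples toward 0.
    """
    real_term = sum((s - 1.0) ** 2 for s in d_real) / len(d_real)
    fake_term = sum(s ** 2 for s in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def ls_g_loss(d_fake):
    """Least-squares generator loss: push scores of fake samples toward 1."""
    return 0.5 * sum((s - 1.0) ** 2 for s in d_fake) / len(d_fake)
```

The least-squares version penalizes a score by its squared distance from the target, so samples that are classified correctly but sit far from the target still contribute a gradient.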

DCGANs

Same cost function as Vanilla GANs but with deep convolutional layers. Produces clearer images than the Vanilla GANs.

DCGANs with LS Loss

Same architecture as DCGANs but with a least-squares loss

Auxiliary GAN

AuxGAN

Training process

Generating images with specific labels

CGAN

CGAN with a LS loss

InfoGAN

Paper

CVAE

For this reconstruction task, MNIST images were cropped to keep only the middle 4 columns of pixel values, and the CVAE model was asked to reconstruct the original image using the cropped image as input.
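The cropping step can be sketched as follows, assuming the standard 28 x 28 MNIST image size and assuming the crop returns a 28 x 4 strip (the repository may instead zero out the other columns). `crop_middle_columns` is a hypothetical helper written in plain Python, not a function from this repository.

```python
def crop_middle_columns(image, keep=4):
    """Return only the `keep` central columns of a 2-D image (list of rows)."""
    width = len(image[0])
    start = (width - keep) // 2
    return [row[start:start + keep] for row in image]


# A dummy 28 x 28 "image" whose pixel value equals its column index,
# so the selected columns are easy to read off.
image = [[col for col in range(28)] for _ in range(28)]
cropped = crop_middle_columns(image)
```

For a 28-column image this keeps columns 12 through 15, so the model sees only a thin vertical strip of each digit.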

Cropped Images / Reconstructed Images / Original Images

AE-GAN

Paper

An adversarial autoencoder that combines an AE with GANs. This PyTorch implementation uses a VAE instead of a vanilla AE.

WGAN

Paper

WGAN-GP

Paper

RecycleGan

Paper

UNIT

Paper

The model was trained on the edgeToShoes dataset. Training takes about 6 hours per epoch and uses a little more than 5 GB of GPU memory on my 1080 Ti. The model was trained for only 5 epochs in total, so the results are not great. The gif is here only to illustrate that training does improve the model over time. Shoes->Edge seems much easier for the model to learn than Edge->Shoes.

Training

Edge / Shoes->Edge / Shoe / Edge->Shoe

Testing

Edge / Shoes->Edge / Shoe / Edge->Shoe

AdaIN

Paper

Training

Content / Style / Transformed Image

Testing

Content / Style / Transformed Image

Different Alpha

How changing alpha changes how much style is transferred.
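In the AdaIN formulation, content features are normalized and then rescaled to the style features' mean and standard deviation, and alpha interpolates between the original content features and the stylized ones before decoding. The sketch below shows this on a single feature channel, flattened to a list, in plain Python. The names `adain_channel` and `blend` are illustrative assumptions and are not taken from this repository.

```python
from statistics import fmean, pstdev


def adain_channel(content, style, eps=1e-5):
    """Match one content channel to the style channel's mean and std."""
    mu_c, sigma_c = fmean(content), pstdev(content)
    mu_s, sigma_s = fmean(style), pstdev(style)
    return [sigma_s * (x - mu_c) / (sigma_c + eps) + mu_s for x in content]


def blend(content, style, alpha=1.0):
    """Interpolate between content features (alpha=0) and stylized ones (alpha=1)."""
    stylized = adain_channel(content, style)
    return [alpha * t + (1.0 - alpha) * c for t, c in zip(stylized, content)]


content = [0.0, 1.0, 2.0, 3.0]
style = [10.0, 20.0, 30.0, 40.0]

unchanged = blend(content, style, alpha=0.0)   # content features only
half = blend(content, style, alpha=0.5)        # partial style transfer
full = blend(content, style, alpha=1.0)        # full style statistics
```

With alpha at 0 the decoder receives the content features untouched, and with alpha at 1 it receives features whose mean matches the style channel, so intermediate values give a partial transfer.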

StarGAN

Training progress

Turning MNIST digits into an 8.

Paper

Contributors

yk287
