basic-image-reconstruction-with-autoencoder's Introduction

Workflow

Every training run trains two models: one with a fixed bottleneck of size 2 and one with the custom latent size given as input. To make sure neither model has seen the test data, I chose to save the test data from within the training script so that it can be reused later in the latent space analysis. The test data file is overwritten on every training run.
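The workflow above can be sketched in plain Python. This is only an illustration of the idea, not the code in train.py: the helper names, the 20% test fraction, and the fixed seed are assumptions made for the example.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices once and hold out a test portion.

    test_fraction and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def latent_sizes_to_train(custom_latent_size):
    """The fixed 2-sized bottleneck is always trained next to the custom size."""
    return [2, custom_latent_size]


train_idx, test_idx = split_indices(26640)
for size in latent_sizes_to_train(256):
    # A real script would build and fit an autoencoder here, using train_idx only.
    print(f"train model with latent_size={size} on {len(train_idx)} samples")

# test_idx would be saved once per run, so both models are later
# evaluated on the same samples that neither of them saw in training.
print(f"held-out test samples: {len(test_idx)}")
```

Because the split is made once and shared, the comparison between the two bottleneck sizes is not skewed by different test samples.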

Structure of the project

  • data/ : contains the data files (downloaded by the training script for torchvision datasets).
  • model/autoencoder.py : the structure of the autoencoder.
  • train.py : the training script for both models (the fixed 2-sized bottleneck and the custom latent size); it also saves the test data.
  • viz_reconstruction.py : script to visualize the reconstruction of the test data and compare the performance of the two models.
  • encode_latent_space.py : script that uses the encoder to generate the latent space from a number of input images (e.g. 1000); the results are saved in a .npy NumPy file.
  • viz_latent_space.py : script to visualize and compare the latent spaces of both models; an argument lets you choose between UMAP and t-SNE.
  • viz_kmeans_latent_space.py : script to visualize the latent space clustered with k-means.
  • requirements.txt : the requirements file for the project.
  • notebook_alternative.ipynb : a notebook version of the project, as an alternative to running the scripts in a terminal.
  • model/ : the folder containing the saved models, test data, and latent spaces.

Dataset description

GTSRB (German Traffic Sign Recognition Benchmark)

  • Size : 26640 images
  • 43 classes

Usage guide

Training the models:

python train.py --num_epochs 10 --latent_size 256 --batch_size 128 --lr 0.0001

Note : the only required argument is latent_size; the rest are optional (defaults as displayed).
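A command-line interface matching this note could look like the following sketch. It is an assumption about how train.py might parse its arguments, built with the standard argparse module; only the flag names and the displayed defaults come from the command above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Train the autoencoders (sketch).")
    # The only required argument, as stated in the note.
    parser.add_argument("--latent_size", type=int, required=True,
                        help="bottleneck size of the custom model")
    # Optional arguments, with the defaults displayed in the example command.
    parser.add_argument("--num_epochs", type=int, default=10)
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--lr", type=float, default=0.0001)
    return parser


# Parsing an explicit list keeps the example runnable outside a terminal.
args = build_parser().parse_args(["--latent_size", "256"])
print(args.latent_size, args.num_epochs, args.batch_size, args.lr)
```

With a parser like this, running the script without --latent_size exits with a usage error, while omitting the other flags falls back to the defaults.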

Visualizing the reconstruction:

python viz_reconstruction.py --model_path output/model_latent_size_256.pth --test_data output/test_data.pth

Note : the only required argument is latent_size; the rest are optional (defaults as displayed).

Encoding the latent space:

python encode_latent_space.py --model_path output/model_latent_size_256.pth --test_data output/test_data.pth --num_samples 1000

Note : the only required argument is latent_size; the rest are optional (defaults as displayed).

Visualizing the latent space:

python viz_latent_space.py --latent_space output/latent_space_size_256.npy --vis_type umap
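The file names used in the commands of this guide appear to follow a pattern keyed on the latent size. The sketch below makes that pattern explicit; the helper names are hypothetical, and the assumption that the 2-sized model uses the same pattern is not confirmed by the scripts.

```python
from pathlib import PurePosixPath

# Folder used in the example commands of this guide.
OUTPUT_DIR = PurePosixPath("output")


def model_path(latent_size):
    """Path of the saved model weights for a given latent size (assumed pattern)."""
    return OUTPUT_DIR / f"model_latent_size_{latent_size}.pth"


def latent_space_path(latent_size):
    """Path of the encoded latent space for a given latent size (assumed pattern)."""
    return OUTPUT_DIR / f"latent_space_size_{latent_size}.npy"


print(model_path(256))
print(latent_space_path(256))
```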

Visualizing the UMAP-reduced latent space clustered with k-means:

python viz_kmeans_latent_space.py --latent_space output/latent_space_size_256.npy --num_clusters 43
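For readers unfamiliar with k-means, the following is a minimal pure-Python sketch of the algorithm on 2D points, such as a latent space after dimensionality reduction. It is a stand-in for illustration only: the actual script most likely relies on a library implementation, and the function name, iteration count, and toy data are assumptions.

```python
import random


def kmeans(points, k, n_iters=20, seed=0):
    """Plain Lloyd's algorithm on a list of (x, y) tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


# Toy data: two well-separated groups of 10 points each.
blob_a = [(0.1 * i, 0.1 * (i % 3)) for i in range(10)]
blob_b = [(10 + 0.1 * i, 10 + 0.1 * (i % 3)) for i in range(10)]
labels, centroids = kmeans(blob_a + blob_b, k=2)
print(labels)
```

In the command above, --num_clusters 43 matches the number of GTSRB classes, so each cluster can be compared against one traffic sign class.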

Reconstruction results

Figure: reconstructing images with both models.

Contributors

tekayanidham
