
License: CC BY-NC-SA 4.0 | Python 3.6

DeepSEE: Deep Disentangled Semantic Explorative Extreme Super-Resolution

This is the official repository for the paper:

DeepSEE: Deep Disentangled Semantic Explorative Extreme Super-Resolution
Marcel Bühler, Andrés Romero, and Radu Timofte.
Computer Vision Lab, ETH Zurich, Switzerland
Abstract: Super-resolution (SR) is by definition ill-posed. There are infinitely many plausible high-resolution variants for a given low-resolution natural image. Most of the current literature aims at a single deterministic solution of either high reconstruction fidelity or photo-realistic perceptual quality. In this work, we propose an explorative facial super-resolution framework, DeepSEE, for Deep disentangled Semantic Explorative Extreme super-resolution. To the best of our knowledge, DeepSEE is the first method to leverage semantic maps for explorative super-resolution. In particular, it provides control of the semantic regions, their disentangled appearance and it allows a broad range of image manipulations. We validate DeepSEE on faces, for up to 32x magnification and exploration of the space of super-resolution.

Updates

25. Nov 2020: Updated Demo notebook. You can now run our demo in Google Colab.

19. Nov 2020: Demo notebook and checkpoints released.

8. June 2020: Training code released.

Installation

git clone https://github.com/mcbuehler/DeepSEE
cd DeepSEE/
pip install -r requirements.txt

Downloads

Training New Models

We provide example training scripts for CelebA and CelebAMask-HQ in scripts/train. If you want to train on your own dataset, you can adjust one of these scripts.

  1. Make sure you have a dataset consisting of images and segmentation masks, as described below.
  2. Set the correct paths (and other options, if applicable) in the training script. Each script lists the training commands for both the independent and the guided model; uncomment the one you want to train. Example training command:

    sh ./scripts/train/train_8x_256x256.sh

If you interrupt training and want to resume, make sure to use the flag --continue_train. Otherwise, your previous checkpoints might be overwritten.

Note when Training Models for 32x Upscaling

Models for extreme upscaling require significant GPU memory. You will need two GPUs with 16 GB of memory each, e.g. two V100s. You can enable model parallelism by setting --model_parallel_mode 1. This computes the first part of the forward pass on one GPU and the second part on the other. This is pre-set in scripts/train/train_32x_512x512.sh.
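
For reference, here is a minimal PyTorch sketch of this pattern. The two module halves are hypothetical placeholders, not the actual DeepSEE architecture:

    import torch.nn as nn

    class TwoGPUModel(nn.Module):
        """Minimal model-parallelism sketch: the first half of the network
        runs on cuda:0, the second half on cuda:1 (hypothetical modules)."""

        def __init__(self, first_half: nn.Module, second_half: nn.Module):
            super().__init__()
            self.first_half = first_half.to("cuda:0")
            self.second_half = second_half.to("cuda:1")

        def forward(self, x):
            x = self.first_half(x.to("cuda:0"))
            # Transfer the intermediate activations to the second GPU.
            return self.second_half(x.to("cuda:1"))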

Dataset Preparation

CelebAMask-HQ

  1. Obtain images: Download CelebAMask-HQ.zip from the authors' GitHub repository and extract the contents.

For the guided model, you also need the identities. You can download the pre-computed annotations here. Alternatively, you can recompute them via

    python data/celebamaskhq_compute_identities_file.py

The required mapping files are included in CelebAMask-HQ.zip.

  2. Obtain the semantic masks predicted from downscaled images. You have two options: a) download them from here or b) create them yourself (a minimal sketch follows this list).

  3. Split the dataset into train / val / test. Update the input and output paths in data/celebamaskhq_partition.py, then run

    python data/celebamaskhq_partition.py
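
If you create the masks yourself, the general idea is to downscale each high-resolution image to the low-resolution input size and predict a semantic mask from it. A minimal sketch, assuming a hypothetical pretrained face parser (parse_face is not part of this repository):

    from PIL import Image
    import numpy as np

    # Hypothetical pretrained face parser (NOT shipped with DeepSEE); it
    # should return an HxW array of integer class ids for a face image.
    from my_face_parser import parse_face

    def predict_mask_from_downscaled(image_path, mask_path, lr_size=(32, 32)):
        """Simulate the low-resolution input, then predict a semantic mask."""
        hr = Image.open(image_path).convert("RGB")
        lr = hr.resize(lr_size, Image.BICUBIC)  # simulate the LR input
        mask = parse_face(lr)                   # HxW integer class ids
        Image.fromarray(mask.astype(np.uint8)).save(mask_path)  # e.g. 0001.png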

CelebA

  1. Obtain images: On the authors' website, click the link Align&Cropped Images. It opens a Google Drive folder, where you can download the images under "CelebA" -> "Img" -> "img_align_celeba.zip". For the guided model, you also need the identity annotations, available in the same Google Drive under "CelebA" -> "Anno" -> "identity_CelebA.txt".
  2. Obtain the semantic masks predicted from downscaled images. You have two options: a) download them from here or b) predict them yourself (see the sketch in the CelebAMask-HQ section above).
  3. Create dataset splits: On the authors' website, follow the link Train/Val/Test Partitions and download the partition file CelebA_list_eval_partition.txt from the Eval folder. Update the paths in data/celeba_split.py and run python data/celeba_split.py. A sketch of this kind of split follows below.
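
The partition file assigns each image to a split (0 = train, 1 = val, 2 = test), with lines like "000001.jpg 0". A minimal sketch of this kind of split, not necessarily identical to what data/celeba_split.py does internally:

    import shutil
    from pathlib import Path

    SPLITS = {"0": "train", "1": "val", "2": "test"}

    def split_celeba(partition_file, image_dir, out_dir):
        """Copy each image into a train/val/test subfolder according to
        the official CelebA partition file (lines like '000001.jpg 0')."""
        image_dir, out_dir = Path(image_dir), Path(out_dir)
        for line in Path(partition_file).read_text().splitlines():
            name, split_id = line.split()
            dest = out_dir / SPLITS[split_id]
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy(image_dir / name, dest / name)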

Custom Dataset

You need one folder containing the images and another folder with the semantic masks. An image and its mask must share the same filename (ignoring extensions); for example, the image 0001.jpg and the semantic mask 0001.png belong to the same sample, as in the sketch below. You can then copy and adapt one of the *_dataset.py classes to your new dataset.
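
A minimal sketch of this filename convention (the repository's *_dataset.py classes implement their own loading logic):

    from pathlib import Path

    def pair_images_and_masks(image_dir, mask_dir):
        """Pair images with semantic masks by filename stem,
        so that 0001.jpg matches 0001.png."""
        masks = {p.stem: p for p in Path(mask_dir).iterdir()}
        pairs = []
        for image in sorted(Path(image_dir).iterdir()):
            if image.stem not in masks:
                raise FileNotFoundError(f"No mask found for {image.name}")
            pairs.append((image, masks[image.stem]))
        return pairs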

Code Structure

  • train.py: Training script. Run this via the bash scripts in scripts/train/.
  • demo.py: This script provides the interface for loading data and running inference for the demo notebook.
  • data/: Dataset classes, pre-processing scripts and dataset preparation scripts.
  • deepsee_models/sr_model.py: Interface to encoder, generator and discriminator.
  • deepsee_models/networks/: Actual model code (residual block, normalization block, encoder, discriminator, etc.)
  • evaluator/: Evaluates during training and testing. evaluator/evaluate_folder.py computes scores for two folders, one containing the upscaled images and the other the ground truth (see the scoring sketch after this list).
  • managers/: We have different manager classes for training, inference and demo.
  • options/: Contains the configurations and command line parameters for training and testing.
  • scripts/: Contains scripts with pre-defined configurations.
  • util/: Code for logging, visualizations and more.
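
As an illustration of folder-to-folder scoring in the spirit of evaluator/evaluate_folder.py, here is a minimal PSNR sketch; the repository's evaluator computes its own set of metrics:

    from pathlib import Path
    import numpy as np
    from PIL import Image

    def psnr(a, b, max_val=255.0):
        """Peak signal-to-noise ratio between two uint8 images."""
        mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

    def score_folders(sr_dir, gt_dir):
        """Average PSNR over same-named images in two folders."""
        scores = []
        for sr_path in sorted(Path(sr_dir).iterdir()):
            sr = np.asarray(Image.open(sr_path).convert("RGB"))
            gt = np.asarray(Image.open(Path(gt_dir) / sr_path.name).convert("RGB"))
            scores.append(psnr(sr, gt))
        return sum(scores) / len(scores)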

Citation

Marcel Christoph Bühler, Andrés Romero, and Radu Timofte.
Deepsee: Deep disentangled semantic explorative extreme super-resolution.
In The 15th Asian Conference on Computer Vision (ACCV), 2020.

Bibtex
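
An entry constructed from the citation above (the key and formatting are illustrative):

    @inproceedings{buehler2020deepsee,
      title     = {DeepSEE: Deep Disentangled Semantic Explorative Extreme Super-Resolution},
      author    = {B{\"u}hler, Marcel Christoph and Romero, Andr{\'e}s and Timofte, Radu},
      booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
      year      = {2020}
    }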

License

Copyright belongs to the authors. All rights reserved. Licensed under CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).

For the SPADE part: Copyright (C) 2019 NVIDIA Corporation.

Acknowledgments

We built on top of the code from SPADE. Thanks to Jiayuan Mao for the Synchronized Batch Normalization code, to mseitzer for the PyTorch FID implementation, and to the authors of PerceptualSimilarity.
