BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer

Project Page | Paper | Video

State-of-the-art image-to-image translation methods tend to struggle in an imbalanced domain setting, where one image domain lacks richness and diversity. We introduce a new unsupervised translation network, BalaGAN, specifically designed to tackle the domain imbalance problem. We leverage the latent modalities of the richer domain to turn the image-to-image translation problem, between two imbalanced domains, into a balanced, multi-class, and conditional translation problem, more resembling the style transfer setting. Specifically, we analyze the source domain and learn a decomposition of it into a set of latent modes or classes, without any supervision. This leaves us with a multitude of balanced cross-domain translation tasks, between all pairs of classes, including the target domain. During inference, the trained network takes as input a source image, as well as a reference or style image from one of the modes as a condition, and produces an image which resembles the source on the pixel-wise level, but shares the same mode as the reference. We show that employing modalities within the dataset improves the quality of the translated images, and that BalaGAN outperforms strong baselines of both unconditioned and style-transfer-based image-to-image translation methods, in terms of image quality and diversity.

Prerequisites

  • Linux (may work on Windows and macOS, but this was not tested)
  • CUDA 10.1
  • Anaconda3
  • PyTorch (tested on >=1.5.0)
  • tensorboardX
  • faiss-gpu
  • opencv-python

Training

Data Preparation

A dataset directory should have the following structure:

dataset
├── train
│   ├── A
│   └── B
└── test
    ├── A
    └── B

where A is the source domain, and B is the target domain.
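
As a quick sanity check before training, the layout above can be verified with a few lines of Python. This is only a sketch and not part of the repository: the helper name `missing_dataset_dirs` and the constant `EXPECTED_SUBDIRS` are hypothetical, and the only thing taken from this README is the `train`/`test` and `A`/`B` directory structure.

```python
import tempfile
from pathlib import Path

# The four sub-directories described above (A = source domain, B = target domain).
EXPECTED_SUBDIRS = [("train", "A"), ("train", "B"), ("test", "A"), ("test", "B")]


def missing_dataset_dirs(root):
    """Return the expected sub-directories that do not exist under `root`.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    root = Path(root)
    return [
        str(Path(split, domain))
        for split, domain in EXPECTED_SUBDIRS
        if not (root / split / domain).is_dir()
    ]


if __name__ == "__main__":
    # Demo on a throwaway directory: build the skeleton, then check it.
    with tempfile.TemporaryDirectory() as tmp:
        for split, domain in EXPECTED_SUBDIRS:
            (Path(tmp) / split / domain).mkdir(parents=True)
        print("missing:", missing_dataset_dirs(tmp))
```

An empty list means all four directories are present. The check does not look at the images themselves.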

Train

The main training script is train.py. It accepts several command-line arguments; for details, please see the file itself. The most important argument is the path to a config file. An example of such a file is provided in configs/dog2wolf.yaml.

Tracking The Training

For each experiment, a dedicated directory is created, and all the outputs are saved there. An experiment directory contains the following:

  • logs directory with a tensorboard file which contains the losses along the training, and images produced by the model.
  • images directory, in which the images are saved as files.
  • checkpoints directory in which checkpoints are saved along the training.
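
To inspect an experiment directory from a script, something like the following may be useful. It is a minimal sketch under stated assumptions: the helper `latest_checkpoint` is hypothetical, and it assumes checkpoints are stored as plain files directly inside the `checkpoints` directory, which this README does not confirm. It makes no assumption about checkpoint file names and simply picks the most recently modified file.

```python
import os
import tempfile
from pathlib import Path


def latest_checkpoint(exp_dir):
    """Return the most recently modified file in `<exp_dir>/checkpoints`, or None.

    Hypothetical helper for illustration. Assumes checkpoints are regular files
    stored directly in the `checkpoints` directory.
    """
    ckpt_dir = Path(exp_dir) / "checkpoints"
    if not ckpt_dir.is_dir():
        return None
    files = [p for p in ckpt_dir.iterdir() if p.is_file()]
    # `default=None` covers the case of an empty checkpoints directory.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    # Demo on a throwaway directory with two placeholder checkpoint files.
    with tempfile.TemporaryDirectory() as tmp:
        ckpt_dir = Path(tmp) / "checkpoints"
        ckpt_dir.mkdir()
        for name, mtime in [("first.pt", 1000), ("second.pt", 2000)]:
            path = ckpt_dir / name
            path.write_bytes(b"")
            os.utime(path, (mtime, mtime))  # set access and modification times
        print(latest_checkpoint(tmp).name)
```

Selecting by modification time avoids guessing a naming scheme, but it can give the wrong answer if files were copied or touched after training.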

We highly recommend using trains to track experiments!

Resume An Experiment

To resume an experiment, provide the --resume flag to the main training script. When providing this flag, the state of the latest experiment with the same --exp_name is loaded.

Pretrained Models

Coming soon...

Citation

If you use this code for your research, please cite our paper BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer

@article{patashnik2020balagan,
      title={BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer}, 
      author={Or Patashnik and Dov Danon and Hao Zhang and Daniel Cohen-Or},
      journal={arXiv preprint arXiv:2010.02036},
      year={2020}
}
