
pix2latent: framework for inverting images into generative models

Framework for inverting images. Codebase used in:

Transforming and Projecting Images into Class-conditional Generative Networks
project page | paper
Minyoung Huh   Richard Zhang   Jun-Yan Zhu   Sylvain Paris   Aaron Hertzmann
MIT CSAIL   Adobe Research
ECCV 2020 (oral)

@inproceedings{huh2020ganprojection,
    title = {Transforming and Projecting Images into Class-conditional Generative Networks},
    author = {Minyoung Huh and Richard Zhang and Jun-Yan Zhu and Sylvain Paris and Aaron Hertzmann},
    booktitle = {ECCV},
    year = {2020}
}

NOTE [8/25/20] The codebase has been renamed from GAN-Transform-and-Project to pix2latent, and also refactored to make it easier to use and extend to any generative model beyond BigGAN. To access the original codebase refer to the legacy branch.

Example results

All results below are without fine-tuning.

BigGAN (z-space) - ImageNet (256x256)

StyleGAN2 (z-space) - LSUN Cars (384x512)

StyleGAN2 (z-space) - FFHQ (1024x1024)

Prerequisites

The code was developed on

  • Ubuntu 18.04
  • Python 3.7
  • PyTorch 1.4.0

Getting Started

  • Install PyTorch
    Install the correct PyTorch version for your machine

  • Install the python dependencies
    Install the remaining dependencies via

    pip install -r requirements.txt
  • Install pix2latent

    git clone https://github.com/minyoungg/pix2latent
    cd pix2latent
    pip install .

Examples

We provide several demo scripts in ./examples/ for both BigGAN and StyleGAN2. Note that the codebase has been tuned and developed primarily on BigGAN.

> cd examples
> python invert_biggan_adam.py --num_samples 4

Using the --make_video flag will save the optimization trajectory as a video.

> python invert_biggan_adam.py --make_video --num_samples 4

(slow) To optimize with CMA-ES or BasinCMA, we use PyCMA. Note that the PyCMA version of CMA-ES uses a predefined number of samples that are evaluated jointly (18 for BigGAN, 22 for StyleGAN2).

> python invert_biggan_cma.py 
> python invert_biggan_basincma.py 

(fast) Alternatively, CMA-ES in Nevergrad supports sample parallelization, so you can set your own number of samples. Although this runs faster, we have observed slightly worse performance (warning: performance depends on num_samples).

> python invert_biggan_nevergrad.py --ng_method CMA --num_samples 4
> python invert_biggan_hybrid_nevergrad.py --ng_method CMA --num_samples 4

The same applies to StyleGAN2. See ./examples/ for an extensive list of examples.

Template pseudocode

import torch
import torch.nn as nn

from pix2latent import VariableManager
from pix2latent.optimizer import GradientOptimizer

# load your favorite model
class Generator(nn.Module):
    ...
    
    def forward(self, z):
        ...
        return im

model = Generator() 

# define your loss objective, or use one of the predefined loss functions in pix2latent.loss_functions
loss_fn = lambda out, target: (target - out).abs().mean()

# tell the optimizer what the input-output relationship is
vm = VariableManager()
vm.register(variable_name='z', shape=(128,), var_type='input')
vm.register(variable_name='target', shape=(3, 256, 256), var_type='output')

# setup optimizer
opt = GradientOptimizer(model, vm, loss_fn)

# optimize
vars, out, loss = opt.optimize(num_samples=1, grad_steps=500)
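
As an alternative to the hand-written loss above, one of the predefined losses can be dropped in. A minimal sketch, assuming ProjectionLoss (the loss constructed in the example scripts) works with its default arguments:

from pix2latent import loss_functions as LF

# swap the lambda above for a predefined loss
loss_fn = LF.ProjectionLoss()
opt = GradientOptimizer(model, vm, loss_fn)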

Detailed usage

pix2latent

| Command | Description |
| --- | --- |
| pix2latent.loss_functions | predefined loss functions |
| pix2latent.distribution | distribution functions used to initialize variables |

pix2latent.VariableManager

Class for creating and managing variables. A variable manager instance is initialized by

var_man = VariableManager()

| Method | Description |
| --- | --- |
| var_man.register(...) | registers a variable. the variable is created when initialize is called |
| var_man.unregister(...) | removes a variable that is already registered |
| var_man.edit_variable(...) | edits an existing variable |
| var_man.initialize(...) | initializes variables from the registered specifications |
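
For illustration, a minimal sketch of the registration flow, mirroring the template pseudocode above (the argument-free initialize() call is an assumption):

from pix2latent import VariableManager

var_man = VariableManager()

# declare the optimization input and the reconstruction target
var_man.register(variable_name='z', shape=(128,), var_type='input')
var_man.register(variable_name='target', shape=(3, 256, 256), var_type='output')

# variables are only created once initialize() is called
variables = var_man.initialize()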

pix2latent.optimizer

| Command | Description |
| --- | --- |
| pix2latent.optimizer.GradientOptimizer | gradient-based optimizer. defaults to the optimizer defined in pix2latent.VariableManager |
| pix2latent.optimizer.CMAOptimizer | uses CMA optimization to search over the latent variable z |
| pix2latent.optimizer.BasinCMAOptimizer | uses BasinCMA optimization, a combination of CMA and gradient-based optimization |
| pix2latent.optimizer.NevergradOptimizer | uses the Nevergrad library for optimization. supports most gradient-free optimization methods implemented in Nevergrad |
| pix2latent.optimizer.HybridNevergradOptimizer | hybrid optimization that alternates between gradient-based and gradient-free optimization provided by Nevergrad |
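
The gradient-free optimizers are meant to be swapped in for GradientOptimizer in the template above. A sketch, assuming they share the same constructor signature (see ./examples/ for the exact arguments each optimize() call expects):

from pix2latent.optimizer import BasinCMAOptimizer

# assumed drop-in replacement for GradientOptimizer
opt = BasinCMAOptimizer(model, vm, loss_fn)
variables, out, loss = opt.optimize()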

pix2latent.transform

| Command | Description |
| --- | --- |
| pix2latent.SpatialTransform | spatial transformation function, used to optimize for image scale and position |
| pix2latent.TransformBasinCMAOptimizer | BasinCMA-like optimization method used to search for the image transformation |

pix2latent.util

| Command | Description |
| --- | --- |
| pix2latent.util.image | utilities for image pre- and post-processing |
| pix2latent.util.video | utilities for video (e.g. saving videos) |
| pix2latent.util.misc | miscellaneous functions |
| pix2latent.util.function_hooks | function hooks that can be attached to variables in the optimization loop (e.g. Clamp, Perturb) |

pix2latent.model

| Command | Description |
| --- | --- |
| pix2latent.model.BigGAN | BigGAN model wrapper. uses the implementation by huggingface with the official weights |
| pix2latent.model.StyleGAN2 | StyleGAN2 model wrapper. uses the PyTorch implementation by rosinality with the official weights |
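
A minimal sketch of plugging a provided wrapper into the template pseudocode above (the no-argument constructor is an assumption; see the example scripts for the actual setup):

from pix2latent.model import BigGAN

# assumed to load the pretrained ImageNet generator
model = BigGAN()
model.eval()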

pix2latent.edit

| Command | Description |
| --- | --- |
| pix2latent.edit.BigGANLatentEditor | BigGAN editor. simple interface to edit class and latent variables using an oversimplified version of GANSpace |


pix2latent's Issues

Batch optimization

Hello, sorry for asking too often.

I am curious about doing optimization in batch input in the code.

I think that in actual use the batch size should be one, and that this code just shows the results of diverse initializations, is that right?
Or is it better to optimize in a batch when using BasinCMA or Adam?

How to apply to 256x256 images / StyleGAN2

I want to invert 256x256 images into .pt files with StyleGAN2, but it reports "RuntimeError: mat1 dim 1 must match mat2 dim 0". I know this is because the original model is for 512x512 images, but how should I change it?

PerceptualLoss broken

https://github.com/richzhang/PerceptualSimilarity was recently updated,
which broke the import models call used by the examples.

Traceback (most recent call last):
  File "invert_biggan_adam.py", line 44, in <module>
    loss_fn = LF.ProjectionLoss()
  File "/home/r/.local/share/virtualenvs/single_view_mpi-aYhVwZ1J/lib/python3.6/site-packages/pix2latent/loss_functions.py", line 90, in __init__
    self.ploss_fn = PerceptualLoss(net=lpips_net)
  File "/home/r/.local/share/virtualenvs/single_view_mpi-aYhVwZ1J/lib/python3.6/site-packages/pix2latent/loss_functions.py", line 134, in __init__
    import models
ModuleNotFoundError: No module named 'models'

I believe you should now call something like lpips.LPIPS(params) to instantiate a loss object (model='net-lin' is also no longer available).
I'm going to try and clone an older version of the repo.

Update: git checkout 6abcdd1077b090cb9f0892103b45d56531d50689 seems to do the trick.
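
For reference, a minimal sketch of instantiating the perceptual loss with the current lpips package (the choice of backbone network is an assumption):

import torch
import lpips

# replaces the old PerceptualSimilarity 'import models' path
loss_fn = lpips.LPIPS(net='vgg')

# lpips expects (N, 3, H, W) images scaled to [-1, 1]
img0 = torch.rand(1, 3, 256, 256) * 2 - 1
img1 = torch.rand(1, 3, 256, 256) * 2 - 1
distance = loss_fn(img0, img1)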

NameError: name 'mask' is not defined

Hello,
I have a question: what is the mask in line 107 and line 108 of the file invert_biggan_with_transform.py?

target_transform_fn = SpatialTransform(pre_align=mask)
weight_transform_fn = SpatialTransform(pre_align=mask)

when I run this program, the following error occurs:

Traceback (most recent call last):
  File "invert_biggan_with_transform.py", line 107, in <module>
    target_transform_fn = SpatialTransform(pre_align=mask)
NameError: name 'mask' is not defined

How can I solve it? Thank you.

Fine-tuning the weights of generative model?

Hi there, excellent job! Thanks for providing the code!

I have a question about fine-tuning. In Section 3.5 of the paper, I think there is a regularization term that can also optimize the weights of the original GAN. Where can I find this part in your code? Currently, I only see the optimization of z, c, and φ. Did I miss something? Please correct me if I am wrong.

Thank you again!

Nevergrad vs. HybridNevergrad

Hello,

I want to try pix2latent on the FFHQ dataset on Google Colab. Due to RAM constraints, Colab won't run the optimization process with CMA or BasinCMA (unless I use the cars dataset), so I have to go with the faster (yet worse) option relying on Nevergrad.

I see that:

  • Nevergrad is gradient-free optimization (CMA by default), followed by ADAM fine-tuning, so that would be similar to:

ADAM + CMA

  • HybridNevergrad alternates gradient-free optimization (CMA by default) and SGD optimization. That would be akin to the following, albeit with SGD instead of ADAM:

ADAM + BasinCMA

Between the two options (Nevergrad vs. HybridNevergrad), which one would you recommend?

Edit: Below are results obtained with Nevergrad.

Target image Results with Nevergrad

Edit: Below are results obtained with HybridNevergrad.

Target image Results with HybridNevergrad

I guess I would have to try another portrait, tweak parameters, or forget Colab and stick to CMA/BasinCMA on a local machine.
