yuval-alaluf / restyle-encoder

Official Implementation for "ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement" (ICCV 2021) https://arxiv.org/abs/2104.02699

Home Page: https://yuval-alaluf.github.io/restyle-encoder/

License: MIT License

Languages: Python 79.11%, C++ 0.63%, Cuda 4.13%, Jupyter Notebook 16.13%
Topics: generative-adversarial-networks, stylegan, stylegan-encoder, iterative-refinement, iccv2021

restyle-encoder's Introduction

ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement (ICCV 2021)

Recently, the power of unconditional image synthesis has significantly advanced through the use of Generative Adversarial Networks (GANs). The task of inverting an image into its corresponding latent code of the trained GAN is of utmost importance as it allows for the manipulation of real images, leveraging the rich semantics learned by the network. Recognizing the limitations of current inversion approaches, in this work we present a novel inversion scheme that extends current encoder-based inversion methods by introducing an iterative refinement mechanism. Instead of directly predicting the latent code of a given image using a single pass, the encoder is tasked with predicting a residual with respect to the current estimate of the inverted latent code in a self-correcting manner. Our residual-based encoder, named ReStyle, attains improved accuracy compared to current state-of-the-art encoder-based methods with a negligible increase in inference time. We analyze the behavior of ReStyle to gain valuable insights into its iterative nature. We then evaluate the performance of our residual encoder and analyze its robustness compared to optimization-based inversion and state-of-the-art encoders.



Inference Notebook:
Animation Notebook:


Different from conventional encoder-based inversion techniques, our residual-based ReStyle scheme incorporates an iterative refinement mechanism to progressively converge to an accurate inversion of real images. For each domain, we show the input image on the left followed by intermediate inversion outputs.

Description

Official Implementation of our ReStyle paper for both training and evaluation. ReStyle introduces an iterative refinement mechanism which can be applied over different StyleGAN encoders for solving the StyleGAN inversion task.

Getting Started

Prerequisites

  • Linux or macOS
  • NVIDIA GPU + CUDA CuDNN (CPU may be possible with some modifications, but is not inherently supported)
  • Python 3

Installation

  • Dependencies:
    We recommend running this repository using Anaconda. All dependencies for defining the environment are provided in environment/restyle_env.yaml.
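    For example, assuming Anaconda is installed, the environment can typically be created and activated with:

    conda env create -f environment/restyle_env.yaml
    conda activate restyle_env  # adjust the name if the "name" field in the yaml differs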

Pretrained Models

In this repository, we provide pretrained ReStyle encoders applied over the pSp and e4e encoders across various domains.

Please download the pretrained models from the following links.

ReStyle-pSp

  • FFHQ - ReStyle + pSp: ReStyle applied over pSp trained on the FFHQ dataset.
  • Stanford Cars - ReStyle + pSp: ReStyle applied over pSp trained on the Stanford Cars dataset.
  • LSUN Church - ReStyle + pSp: ReStyle applied over pSp trained on the LSUN Church dataset.
  • AFHQ Wild - ReStyle + pSp: ReStyle applied over pSp trained on the AFHQ Wild dataset.

ReStyle-e4e

  • FFHQ - ReStyle + e4e: ReStyle applied over e4e trained on the FFHQ dataset.
  • Stanford Cars - ReStyle + e4e: ReStyle applied over e4e trained on the Stanford Cars dataset.
  • LSUN Church - ReStyle + e4e: ReStyle applied over e4e trained on the LSUN Church dataset.
  • AFHQ Wild - ReStyle + e4e: ReStyle applied over e4e trained on the AFHQ Wild dataset.
  • LSUN Horse - ReStyle + e4e: ReStyle applied over e4e trained on the LSUN Horse dataset.

Auxiliary Models

In addition, we provide various auxiliary models needed for training your own ReStyle models from scratch.
This includes the StyleGAN generators and pre-trained models used for loss computation.

  • FFHQ StyleGAN: StyleGAN2 model trained on FFHQ with 1024x1024 output resolution.
  • LSUN Car StyleGAN: StyleGAN2 model trained on LSUN Car with 512x384 output resolution.
  • LSUN Church StyleGAN: StyleGAN2 model trained on LSUN Church with 256x256 output resolution.
  • LSUN Horse StyleGAN: StyleGAN2 model trained on LSUN Horse with 256x256 output resolution.
  • AFHQ Wild StyleGAN: StyleGAN-ADA model trained on AFHQ Wild with 512x512 output resolution.
  • IR-SE50 Model: Pretrained IR-SE50 model taken from TreB1eN for use in our ID loss and encoder backbone on the human facial domain.
  • ResNet-34 Model: ResNet-34 model trained on ImageNet, taken from torchvision, for initializing our encoder backbone.
  • MoCov2 Model: Pretrained ResNet-50 model trained using MoCo v2 for computing the MoCo-based loss on non-facial domains. The model is taken from the official implementation.
  • CurricularFace Backbone: Pretrained CurricularFace model taken from HuangYG123 for use in ID similarity metric computation.
  • MTCNN: Weights for the MTCNN model taken from TreB1eN for use in ID similarity metric computation. (Unpack the tar.gz to extract the 3 model weights.)

Note: all StyleGAN models are converted from the official TensorFlow models to PyTorch using the conversion script from rosinality.

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models. However, you may use your own paths by changing the necessary values in configs/paths_config.py.
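For reference, these paths live in the model_paths dictionary defined in configs/paths_config.py. A minimal, illustrative sketch (the exact keys and file names should be taken from the file itself) might look like:

model_paths = {
    'stylegan_ffhq': 'pretrained_models/stylegan2-ffhq-config-f.pt',  # StyleGAN2 generator used for FFHQ
    'ir_se50': 'pretrained_models/model_ir_se50.pth',                 # IR-SE50 backbone used for the ID loss
    # ... remaining entries follow the same pattern
}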

Training

Preparing your Data

In order to train ReStyle on your own data, you should perform the following steps:

  1. Update configs/paths_config.py with the necessary data paths and model paths for training and inference.
dataset_paths = {
    'train_data': '/path/to/train/data',
    'test_data': '/path/to/test/data',
}
  2. Configure a new dataset under the DATASETS variable defined in configs/data_configs.py. There, you should define the source/target data paths for the train and test sets as well as the transforms to be used for training and inference (a sketch of a custom transforms class is shown after these steps).
DATASETS = {
	'my_data_encode': {
		'transforms': transforms_config.EncodeTransforms,   # can define a custom transform, if desired
		'train_source_root': dataset_paths['train_data'],
		'train_target_root': dataset_paths['train_data'],
		'test_source_root': dataset_paths['test_data'],
		'test_target_root': dataset_paths['test_data'],
	}
}
  3. To train with your newly defined dataset, simply use the flag --dataset_type my_data_encode.
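As referenced in step 2, a custom transforms class can be defined in configs/transforms_config.py. A rough, illustrative sketch (the base class name and the dictionary keys mirror the existing EncodeTransforms and should be verified against the repository) might look like:

from torchvision import transforms
from configs import transforms_config


class MyDataTransforms(transforms_config.TransformsConfig):
	def __init__(self, opts):
		super(MyDataTransforms, self).__init__(opts)

	def get_transforms(self):
		# a single transform reused for all splits, for simplicity
		image_transform = transforms.Compose([
			transforms.Resize((256, 256)),
			transforms.ToTensor(),
			transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
		return {
			'transform_gt_train': image_transform,
			'transform_source': None,
			'transform_test': image_transform,
			'transform_inference': image_transform,
		}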

Preparing your Generator

In this work, we use rosinality's StyleGAN2 implementation. If you wish to use your own generator trained using NVIDIA's implementation, there are a few options we recommend:

  1. Using NVIDIA's StyleGAN2 / StyleGAN-ADA TensorFlow implementation.
    You can then convert the TensorFlow .pkl checkpoints to the supported format using the conversion script found in rosinality's implementation.
  2. Using NVIDIA's StyleGAN-ADA PyTorch implementation.
    You can then convert the PyTorch .pkl checkpoints to the supported format using the conversion script created by Justin Pinkney found in dvschultz's fork.

Once you have the converted .pt files, you should be ready to use them in this repository.
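As a quick optional sanity check, you can verify that a converted checkpoint contains the generator weights under the 'g_ema' key, which is the key this repository loads the decoder from (the file name below is just a placeholder):

import torch

ckpt = torch.load('pretrained_models/my_converted_generator.pt', map_location='cpu')
print(list(ckpt.keys()))  # expect at least 'g_ema' holding the generator's state dict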

Training ReStyle

The main training scripts can be found in scripts/train_restyle_psp.py and scripts/train_restyle_e4e.py. Each of the two scripts will run ReStyle applied over the corresponding base inversion method.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs.
Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs.

We currently support applying ReStyle on the pSp encoder from Richardson et al. [2020] and the e4e encoder from Tov et al. [2021].

Training ReStyle with the settings used in the paper can be done by running the following commands.

  • ReStyle applied over pSp:
python scripts/train_restyle_psp.py \
--dataset_type=ffhq_encode \
--encoder_type=BackboneEncoder \
--exp_dir=experiment/restyle_psp_ffhq_encode \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=5000 \
--save_interval=10000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--w_norm_lambda=0 \
--id_lambda=0.1 \
--input_nc=6 \
--n_iters_per_batch=5 \
--output_size=1024 \
--stylegan_weights=pretrained_models/stylegan2-ffhq-config-f.pt
  • ReStyle applied over e4e:
python scripts/train_restyle_e4e.py \
--dataset_type ffhq_encode \
--encoder_type ProgressiveBackboneEncoder \
--exp_dir=experiment/restyle_e4e_ffhq_encode \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--delta_norm_lambda 0.0002 \
--id_lambda 0.1 \
--use_w_pool \
--w_discriminator_lambda 0.1 \
--progressive_start 20000 \
--progressive_step_every 2000 \
--input_nc 6 \
--n_iters_per_batch=5 \
--output_size 1024 \
--stylegan_weights=pretrained_models/stylegan2-ffhq-config-f.pt

Additional Notes

  • Encoder backbones:
    • For the human facial domain (ffhq_encode), we use an IR-SE50 backbone using the flags:
      • --encoder_type=BackboneEncoder for pSp
      • --encoder_type=ProgressiveBackboneEncoder for e4e
    • For all other domains, we use a ResNet34 encoder backbone using the flags:
      • --encoder_type=ResNetBackboneEncoder for pSp
      • --encoder_type=ResNetProgressiveBackboneEncoder for e4e
  • ID/similarity losses:
    • For the human facial domain we also use a specialized ID loss which is set using the flag --id_lambda=0.1.
    • For all other domains, please set --id_lambda=0 and --moco_lambda=0.5 to use the MoCo-based similarity loss from Tov et al.
      • Note, you cannot set both id_lambda and moco_lambda to be active simultaneously.
  • You should also adjust the --output_size and --stylegan_weights flags according to your StyleGAN generator.
  • See options/train_options.py and options/e4e_train_options.py for all training-specific flags.

Inference Notebooks

To help visualize the results of ReStyle we provide a Jupyter notebook found in notebooks/inference_playground.ipynb.
The notebook will download the pretrained models and run inference on the images found in notebooks/images or on images of your choosing. It is recommended to run this in Google Colab.

We have also provided a notebook for generating interpolation videos such as those found in the project page. This notebook can be run using Google Colab here.

Testing

Inference

You can use scripts/inference_iterative.py to apply a trained model on a set of images:

python scripts/inference_iterative.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=experiment/checkpoints/best_model.pt \
--data_path=/path/to/test_data \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=5

This script will save each step's outputs in a separate sub-directory (e.g., the outputs of step i will be saved in /path/to/experiment/inference_results/i).

Notes:

  • By default, the images will be saved at their original output resolutions (e.g., 1024x1024 for faces, 512x384 for cars). If you wish to save outputs resized to resolutions of 256x256 (or 256x192 for cars), you can do so by adding the flag --resize_outputs.
  • This script will also save all the latents as an .npy file in a dictionary format as follows:
{
    "0.jpg": [latent_step_1, latent_step_2, ..., latent_step_N],
    "1.jpg": [latent_step_1, latent_step_2, ..., latent_step_N],
    ...
}

That is, the keys of the dictionary are the image file names and the values are lists of length N containing the output latent of each step where N is the number of inference steps. Each element in the list is of shape (Kx512) where K is the number of style inputs of the generator.

You can use the saved latents to perform latent space manipulations, for example.
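For instance, a minimal sketch of loading and using the saved latents (the file name latents.npy and its location are assumptions; check the inference script's output directory for the exact path):

import numpy as np

# the saved object is a dictionary mapping image file names to lists of per-step latents
latents = np.load('/path/to/experiment/latents.npy', allow_pickle=True).item()
for image_name, per_step_latents in latents.items():
    final_latent = per_step_latents[-1]  # latent from the last ReStyle step, shape (K, 512)
    print(image_name, final_latent.shape)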

Step-by-Step Inference


Visualizing the intermediate outputs. Here, the intermediate outputs are saved from left to right with the input image shown on the right-hand side.

Sometimes, you may wish to save each step's outputs side-by-side instead of in separate sub-folders. This would allow one to easily see the progression in the reconstruction with each step. To save the step-by-step outputs as a single image, you can run the following:

python scripts/inference_iterative_save_coupled.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=experiment/checkpoints/best_model.pt \
--data_path=/path/to/test_data \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=5

Computing Metrics

Given a trained model and generated outputs, we can compute the loss metrics on a given dataset.
These scripts receive the inference output directory and ground truth directory.

  • Calculating LPIPS loss:
python scripts/calc_losses_on_images.py \
--mode lpips \
--output_path=/path/to/experiment/inference_results \
--gt_path=/path/to/test_images
  • Calculating L2 loss:
python scripts/calc_losses_on_images.py \
--mode l2 \
--output_path=/path/to/experiment/inference_results \
--gt_path=/path/to/test_images
  • Calculating the identity loss for the human facial domain:
python scripts/calc_id_loss_parallel.py \
--output_path=/path/to/experiment/inference_results \
--gt_path=/path/to/test_images

These scripts will traverse through each sub-directory of output_path to compute the metrics on each step's output images.

Editing


Editing results using InterFaceGAN on inversions obtained using ReStyle-e4e.

For performing edits using ReStyle-e4e, you can run the script found in `editing/inference_editing.py`, as follows:
python editing/inference_editing.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/e4e_ffhq_encoder.pt \
--data_path=/path/to/test_data \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=5 \
--edit_directions=age,pose,smile \
--factor_ranges=5,5,5

This script will perform the inversion immediately followed by the latent space edit.
The results for each edit will be saved to different sub-directories in the specified experiment directory. For each image, we save the original image followed by the inversion and the resulting edits.
We support running inference with ReStyle-e4e models on the faces domain using several editing directions obtained via InterFaceGAN (age, pose, and smile).

Encoder Bootstrapping


Image toonification results using our proposed encoder bootstrapping technique.

In the paper, we introduce an encoder bootstrapping technique that can be used to solve the image toonification task by pairing an FFHQ-based encoder with a Toon-based encoder.
Below we provide the models used to generate the results in the paper:

  • FFHQ - ReStyle + pSp: Same FFHQ encoder as linked above.
  • Toonify - ReStyle + pSp: ReStyle applied over pSp trained for the image toonification task.
  • Toonify Generator: Toonify generator from Doron Adler and Justin Pinkney, converted to PyTorch using rosinality's conversion script.

Note that the ReStyle toonify model is trained using only real images with no paired data. More details regarding the training parameters and settings of the toonify encoder can be found here.

If you wish to run inference using these two models and the bootstrapping technique you may run the following:

python scripts/encoder_bootstrapping_inference.py \
--exp_dir=/path/to/experiment \
--model_1_checkpoint_path=/path/to/restyle_psp_ffhq_encode.pt \
--model_2_checkpoint_path=/path/to/restyle_psp_toonify.pt \
--data_path=/path/to/test_data \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=1  # one step for each encoder is typically good

Here, we output the per-step outputs side-by-side, with the inverted real-image initialization on the left and the original input image on the right.

Repository structure

Path Description
restyle-encoder Repository root folder
├  configs Folder containing configs defining model/data paths and data transforms
├  criteria Folder containing various loss criteria for training
├  datasets Folder with various dataset objects
├  docs Folder containing images displayed in the README
├  environment Folder containing Anaconda environment used in our experiments
├  licenses Folder containing licenses of the open source projects used in this repository
├  models Folder containing all the models and training objects
│  ├  e4e_modules Folder containing the latent discriminator implementation from encoder4editing
│  ├  encoders Folder containing various architecture implementations including our simplified encoder architectures
│  ├  mtcnn MTCNN implementation from TreB1eN
│  ├  stylegan2 StyleGAN2 model from rosinality
│  ├  psp.py Implementation of pSp encoder extended to work with ReStyle
│  └  e4e.py Implementation of e4e encoder extended to work with ReStyle
├  notebooks Folder with the Jupyter notebook containing the ReStyle inference playground
├  options Folder with training and test command-line options
├  scripts Folder with running scripts for training, inference, and metric computations
├  training Folder with main training logic and Ranger implementation from lessw2020
├  utils Folder with various utility functions

Credits

StyleGAN2 model and implementation:
https://github.com/rosinality/stylegan2-pytorch
Copyright (c) 2019 Kim Seonghyeon
License (MIT) https://github.com/rosinality/stylegan2-pytorch/blob/master/LICENSE

IR-SE50 model and implementations:
https://github.com/TreB1eN/InsightFace_Pytorch
Copyright (c) 2018 TreB1eN
License (MIT) https://github.com/TreB1eN/InsightFace_Pytorch/blob/master/LICENSE

Ranger optimizer implementation:
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
License (Apache License 2.0) https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer/blob/master/LICENSE

LPIPS model and implementation:
https://github.com/S-aiueo32/lpips-pytorch
Copyright (c) 2020, Sou Uchida
License (BSD 2-Clause) https://github.com/S-aiueo32/lpips-pytorch/blob/master/LICENSE

pSp model and implementation:
https://github.com/eladrich/pixel2style2pixel
Copyright (c) 2020 Elad Richardson, Yuval Alaluf
License (MIT) https://github.com/eladrich/pixel2style2pixel/blob/master/LICENSE

e4e model and implementation:
https://github.com/omertov/encoder4editing Copyright (c) 2021 omertov
License (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE

Please Note: The CUDA files under the StyleGAN2 ops directory are made available under the Nvidia Source Code License-NC

Acknowledgments

This code borrows heavily from pixel2style2pixel and encoder4editing.

Citation

If you use this code for your research, please cite the following works:

@InProceedings{alaluf2021restyle,
      author = {Alaluf, Yuval and Patashnik, Or and Cohen-Or, Daniel},
      title = {ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement}, 
      month = {October},
      booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},  
      year = {2021}
}
@InProceedings{richardson2021encoding,
      author = {Richardson, Elad and Alaluf, Yuval and Patashnik, Or and Nitzan, Yotam and Azar, Yaniv and Shapiro, Stav and Cohen-Or, Daniel},
      title = {Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation},
      booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      month = {June},
      year = {2021}
}
@article{tov2021designing,
      title={Designing an Encoder for StyleGAN Image Manipulation},
      author={Tov, Omer and Alaluf, Yuval and Nitzan, Yotam and Patashnik, Or and Cohen-Or, Daniel},
      journal={arXiv preprint arXiv:2102.02766},
      year={2021}
}

restyle-encoder's People

Contributors

brandozhang, chenxwh, donno2048, johnsel, pizzaz93, yuval-alaluf

restyle-encoder's Issues

Dataset Horses and Churches for training and testing.

Hi,

I see that in the paper you prepare the training and testing sets of the Horses dataset as follows:

  • Horses: From the LSUN Horse dataset, we randomly select 10,000 images to be used for training and 2,215 images used for evaluations.

So, could you release the image IDs or filenames of these sampled images for reproducibility purposes?

Thanks!

Encoder Bootstrapping

Hi @yuval-alaluf,

Thanks for your awesome project! I have a small question about the encoder bootstrapping.

I have a pre-trained StyleGAN2 model trained on FFHQ with 256x256 output resolution and fine-tuned on an anime dataset. Now, I am trying to train a ReStyle model from scratch using this model, and I also used an FFHQ dataset with 256x256 resolution images for training and testing. Am I doing everything right: is no anime dataset needed for training, only the pre-trained model and a dataset of real images?

Looking forward to your soonest answer!

Training Encoder On Xray Dataset

Hello,

First of all, thanks for your great work. I am trying to train the encoder on a chest X-ray dataset. Although the results seem good, some details are missing, which matters especially in a medical context. As can be seen from the example below, important details such as cables are not recovered, which is undesirable. By the way, the results may seem pretty good to you, but medical experts totally disagree :)

[example X-ray reconstruction image]

I know that you recommend the MoCo loss for non-facial domains. At the beginning of my experiment, I used the MoCo loss with your recommended parameters and got results similar to the above. Then, I used a DenseNet model trained on an X-ray dataset for extracting features while calculating the ID loss. I thought it would be suitable to use the ID loss under these conditions. I gave different lambda values to the ID loss, namely 0.1, 0.2 and finally 0.5. However, the results did not change significantly. The cables and artifacts were not reconstructed, just like in the example above. My question is: how can I get better results? Why did my results not improve after using a feature extractor trained on an X-ray dataset for calculating the ID loss? Can you give me any feedback on my method or suggest other methods to improve my results?

The parameters are:

{
"batch_size": 8,
"board_interval": 50,
"checkpoint_path": null,
"d_reg_every": 16,
"dataset_type": "xray_encode",
"delta_norm": 2,
"delta_norm_lambda": 0.0002,
"encoder_type": "ResNetProgressiveBackboneEncoder",
"exp_dir": "/path/to/outdir",
"id_lambda": 0.5,
"image_interval": 100,
"input_nc": 6,
"l2_lambda": 1.0,
"learning_rate": 0.0001,
"lpips_lambda": 0.8,
"max_steps": 500000,
"moco_lambda": 0,
"n_iters_per_batch": 5,
"optim_name": "ranger",
"output_size": 256,
"progressive_start": null,
"progressive_step_every": 2000,
"progressive_steps": null,
"r1": 10,
"resume_training_from_ckpt": null,
"save_interval": 10000,
"save_training_data": false,
"start_from_latent_avg": true,
"stylegan_weights": "/path/to/stylegan2_model",
"sub_exp_dir": null,
"test_batch_size": 8,
"test_workers": 8,
"train_decoder": false,
"update_param_list": null,
"use_w_pool": false,
"val_interval": 1000,
"w_discriminator_lambda": 0,
"w_discriminator_lr": 2e-05,
"w_norm_lambda": 0.0,
"w_pool_size": 50,
"workers": 8
}

Thanks in advance...

Target path?

Hello. I wonder what the y variable means in the Coach, as well as what the target path for the Dataloader is.

Options do not change anything

Hello!
There is a set of options available to change the encoder behavior. They get printed out when the Colab notebook is run:
{'batch_size': 8,
'board_interval': 50,
'checkpoint_path': '',
'dataset_type': 'ffhq_encode',
'device': 'cuda:0',
'encoder_type': 'BackboneEncoder',
'exp_dir': '',
'id_lambda': 0,
'image_interval': 100,
'input_nc': 6,
'l2_lambda': 1.0,
'learning_rate': 0.0001,
'lpips_lambda': 0,
'max_steps': 500000,
'moco_lambda': 0,
'n_iters_per_batch': 5,
'optim_name': 'ranger',
'output_size': 1024,
'save_interval': 10000,
'start_from_latent_avg': True,
'stylegan_weights': '',
'test_batch_size': 8,
'test_workers': 8,
'train_decoder': False,
'val_interval': 10000,
'w_norm_lambda': 0.0,
'workers': 8}

Apparently, the ones that are supposed to change the encoder behavior do not change anything. I tried
'encoder_type', 'id_lambda', 'l2_lambda', 'lpips_lambda', 'moco_lambda'.
The encoder just provides the same output even with the LPIPS, ID and MoCo lambdas set to 0 at the same time (using the FFHQ .pt).

I tried to walk through the source code and only found these options mentioned in the encoder training section.
Does that mean we can't control the encoder behavior? (I personally want to disable or lower the LPIPS lambda.)
What does "training an encoder" mean? I just want to embed an aligned image into the StyleGAN2 latent space; I don't need to "train" anything for that, or do I?

How can we get these options working? I am using the Colab notebook provided with the code.

Problem with importing dlib

I have a problem installing dlib. It seems that dlib is incompatible with other packages of this repository (PyTorch and CUDA).
When I try to install dlib with pip install dlib, I get this error:

ERROR: Failed building wheel for dlib

I have also tried different conda installation for dlib, such as: conda install -c conda-forge dlib , conda install -c menpo dlib , conda install -c conda-forge/label/cf201901 dlib , conda install -c conda-forge/label/cf202003 dlib , but I get the following error:

Output in format: Requested package -> Available versions The following specifications were found to be incompatible with your system: feature:/linux-64::__glibc==2.17=0 feature:|@/linux-64::__glibc==2.17=0 Your installed version is: 2.17

I created a clean conda environment and, right after installing Python, I could easily install dlib with conda and import it correctly, but then I couldn't install PyTorch; there were some incompatible packages. If I first install PyTorch, then I cannot install dlib, and if I first install dlib, then I cannot install PyTorch. The command I use for installing PyTorch is:

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch

The other packages that I use are: python=3.6, GCC 7.3.0, libgcc-ng=7.3, regex-2018.11.22, tqdm-4.62.2, ftfy-5.3.0, matplotlib-2.2.2, numpy-1.15.2. The conflict just happens between pytorch and dlib installations.

dlib is not included in the conda environment file of this repository. In this code you explained the dlib installation separately. I'm wondering whether you used a different environment for that code, or whether you have experienced the same error when installing dlib and PyTorch in the same conda environment.

Thanks for sharing your fantastic work.

Gender Editing

Hello Yuval,

Thank you for all of your work; it is amazing and fascinating to discover. I'm brand new to the GAN universe (and even the ML universe) and I'm trying to apply gender editing with the ReStyle encoder.
Thanks to your notebooks I have already figured out how to get an inversion from a real picture, apply a transition between two inversions, and lastly apply an age edit.
For my research I now need to apply a gender edit. As I understand it, your project works with "ReStyle-e4e models [...] obtained via InterFaceGAN".
Can you give me some direction to achieve this? Do I have to train the .pt model myself? Are there other models already usable with ReStyle somewhere? Sorry if my question seems stupid, I'm a bit lost.

Thank you in advance !

It doesn't work.

Hi!
I entered the training command, but nothing happens.
When I stop it, the following message appears:
[screenshot of the message]

Generate the image according to the latent code

Hello, after performing the latent code editing, I want to get the edited image. I would like to ask whether this can be achieved using this code, and what my latent code format should be. My current latent code is a numpy.ndarray of shape (1, 18, 512). Looking forward to your reply.
My code generates the image from the latent code; I do the latent code editing based on StyleGAN2.

Can't import fuse on Mac M1

How do I run this on a Mac M1?

When I have !export CUDA_VISIBLE_DEVICES="", I get:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-12-0ce4d6089834> in <module>
     13 
     14 from utils.common import tensor2im
---> 15 from models.psp import pSp
     16 from models.e4e import e4e
     17 

~/Downloads/Cartoon/restyle-encoder/models/psp.py in <module>
      6 from torch import nn
      7 
----> 8 from models.stylegan2.model import Generator
      9 from configs.paths_config import model_paths
     10 from models.encoders import fpn_encoders, restyle_psp_encoders

~/Downloads/Cartoon/restyle-encoder/models/stylegan2/model.py in <module>
      5 from torch.nn import functional as F
      6 
----> 7 from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
      8 
      9 

~/Downloads/Cartoon/restyle-encoder/models/stylegan2/op/__init__.py in <module>
----> 1 from .fused_act import FusedLeakyReLU, fused_leaky_relu
      2 from .upfirdn2d import upfirdn2d

~/Downloads/Cartoon/restyle-encoder/models/stylegan2/op/fused_act.py in <module>
      7 
      8 module_path = os.path.dirname(__file__)
----> 9 fused = load(
     10     'fused',
     11     sources=[

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1077                 verbose=True)
   1078     '''
-> 1079     return _jit_compile(
   1080         name,
   1081         [sources] if isinstance(sources, str) else sources,

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1315         return _get_exec_path(name, build_directory)
   1316 
-> 1317     return _import_module_from_library(name, build_directory, is_python_module)
   1318 
   1319 

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _import_module_from_library(module_name, path, is_python_module)
   1697 def _import_module_from_library(module_name, path, is_python_module):
   1698     # https://stackoverflow.com/questions/67631/how-to-import-a-module-given-the-full-path
-> 1699     file, path, description = imp.find_module(module_name, [path])
   1700     # Close the .so file after load.
   1701     with file:

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/imp.py in find_module(name, path)
    294         break  # Break out of outer loop when breaking out of inner loop.
    295     else:
--> 296         raise ImportError(_ERR_MSG.format(name), name=name)
    297 
    298     encoding = None

ImportError: No module named 'fused'

When that variable is not exported, I get:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-8-0ce4d6089834> in <module>
     13 
     14 from utils.common import tensor2im
---> 15 from models.psp import pSp
     16 from models.e4e import e4e
     17 

~/Downloads/Cartoon/restyle-encoder/models/psp.py in <module>
      6 from torch import nn
      7 
----> 8 from models.stylegan2.model import Generator
      9 from configs.paths_config import model_paths
     10 from models.encoders import fpn_encoders, restyle_psp_encoders

~/Downloads/Cartoon/restyle-encoder/models/stylegan2/model.py in <module>
      5 from torch.nn import functional as F
      6 
----> 7 from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
      8 
      9 

~/Downloads/Cartoon/restyle-encoder/models/stylegan2/op/__init__.py in <module>
----> 1 from .fused_act import FusedLeakyReLU, fused_leaky_relu
      2 from .upfirdn2d import upfirdn2d

~/Downloads/Cartoon/restyle-encoder/models/stylegan2/op/fused_act.py in <module>
      7 
      8 module_path = os.path.dirname(__file__)
----> 9 fused = load(
     10     'fused',
     11     sources=[

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1077                 verbose=True)
   1078     '''
-> 1079     return _jit_compile(
   1080         name,
   1081         [sources] if isinstance(sources, str) else sources,

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1290                             clean_ctx=clean_ctx
   1291                         )
-> 1292                     _write_ninja_file_and_build_library(
   1293                         name=name,
   1294                         sources=sources,

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _write_ninja_file_and_build_library(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_standalone)
   1379     if with_cuda is None:
   1380         with_cuda = any(map(_is_cuda_file, sources))
-> 1381     extra_ldflags = _prepare_ldflags(
   1382         extra_ldflags or [],
   1383         with_cuda,

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _prepare_ldflags(extra_ldflags, with_cuda, verbose, is_standalone)
   1487                 extra_ldflags.append(os.path.join(CUDNN_HOME, 'lib/x64'))
   1488         elif not IS_HIP_EXTENSION:
-> 1489             extra_ldflags.append(f'-L{_join_cuda_home("lib64")}')
   1490             extra_ldflags.append('-lcudart')
   1491             if CUDNN_HOME is not None:

/opt/homebrew/Caskroom/miniforge/base/envs/cartoon/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _join_cuda_home(*paths)
   1980     '''
   1981     if CUDA_HOME is None:
-> 1982         raise EnvironmentError('CUDA_HOME environment variable is not set. '
   1983                                'Please set it to your CUDA install root.')
   1984     return os.path.join(CUDA_HOME, *paths)

OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

When I tried to train ReStyle from a converted StyleGAN2 generator, I got:

Traceback (most recent call last):
  File "/restyle_encoder/scripts/train_restyle_psp.py", line 34, in <module>
    main()
  File "/restyle_encoder/scripts/train_restyle_psp.py", line 30, in main
    coach.train()
  File "/restyle_encoder/scripts/../training/coach_restyle_psp.py", line 140, in train
    y_hats, loss_dict, id_logs = self.perform_train_iteration_on_batch(x, y)
  File "/restyle_encoder/scripts/../training/coach_restyle_psp.py", line 125, in perform_train_iteration_on_batch
    loss.backward()
  File "/usr/local/lib/python3.7/dist-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 127, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

Issue with loading moco pretrained model

Hello!

I faced an issue when training on my custom dataset with the MoCo pretrained model. In your documentation, you posted the moco_v2_800ep_pretrain.pt file, but the code tries to load moco_v2_800ep_pretrain.pth.tar.

Not a big issue, but just in case!

GradualStyleEncoder

Hi!
Theoretically, the GradualStyleEncoder in pSp is better than the BackboneEncoder. Have you tried the GradualStyleEncoder?

The model collapses when using the MoCo loss

Hi Yuval,

When I train on my datasets with the configuration you recommended, the model seems to collapse after just a few iterations (as shown below). However, when I set the MoCo lambda to 0.0001, there has been no problem so far. Do you have any clue why that happens? Is there a way to disable the MoCo loss entirely? (When I set the MoCo lambda to 0, it seems the id_logs are bound to the moco_loss calculation.)

[example of collapsed outputs]

how to train Encoder Bootstrapping?

In encoder bootstrapping, you get the latent vector from the FFHQ encoder instead of from the average toon image.
Could you explain how to train with encoder bootstrapping?
I can only find the encoder bootstrapping inference script.
Looking forward to your reply.

failed to run colab notebook

Downloading ReStyle model for toonify...

ValueError Traceback (most recent call last)
in ()
4 # if google drive receives too many requests, we'll reach the quota limit and be unable to download the model
5 if os.path.getsize(EXPERIMENT_ARGS['model_path']) < 1000000:
----> 6 raise ValueError("Pretrained model was unable to be downloaded correctly!")
7 else:
8 print('Done.')

ValueError: Pretrained model was unable to be downloaded correctly!

Is current operation correct?

Hello again. I'm trying your code (except that I've chosen lucidrains' GAN) to invert fingerprints (a toy project, my first working GAN, publicly available data). The GAN works nicely, but when trying your code I only get the shape of the fingerprint right, whereas the pattern is completely unnatural. The maximum number of iterations I've tried is 25k. I use single-channel images (with slight modifications to your code).
Example attached.
What can you suggest to improve the quality? I use L2, the discriminator from the GAN, and LPIPS.

I have a code question about restyle-encoder/training/coach_restyle_psp.py

Hello. Thank you for releasing awesome code and a paper for a StyleGAN encoder. I read the ReStyle paper with great interest.

I have a question about your code, at line 107 of restyle-encoder/training/coach_restyle_psp.py:

for iter in range(self.opts.n_iters_per_batch):
	if iter == 0:
		avg_image_for_batch = self.avg_image.unsqueeze(0).repeat(x.shape[0], 1, 1, 1)
		x_input = torch.cat([x, avg_image_for_batch], dim=1)
		y_hat, latent = self.net.forward(x_input, latent=None, return_latents=True)
	else:
		y_hat_clone = y_hat.clone().detach().requires_grad_(True)
		latent_clone = latent.clone().detach().requires_grad_(True)
		x_input = torch.cat([x, y_hat_clone], dim=1)
		y_hat, latent = self.net.forward(x_input, latent=latent_clone, return_latents=True)

	if self.opts.dataset_type == "cars_encode":
		y_hat = y_hat[:, :, 32:224, :]

	loss, loss_dict, id_logs = self.calc_loss(x, y, y_hat, latent)
	loss.backward()

Here, I am curious why we call loss.backward() at every iteration step. Isn't it enough to only call backward at the end of the last iteration (with the given args, at the 5th iteration)?

Thank you for your attention, and thanks again for the interesting work!

How long for training? (time and batch size)

Hi there,
Great work and thank you for sharing this great repository.
I've read issue #20, but I am still unclear about how long training takes in terms of time for a specific batch size. Could you please describe this a little more? I notice that the ReStyle-pSp training time is significantly longer than pSp's.
For instance, with batch size 32 on a single RTX 8000, pSp takes 3.5 hours for 4k iterations, but ReStyle-pSp takes almost 9.5 hours. I'm not sure if this is because of the recurrent architecture and MoCo loss, or some error on my end.

Thanks.

Error when loading converted ada-pytorch model

Hello and thank you for sharing your fascinating work!

I'm trying to use ReStyle with a pretrained stylegan-ada-pytorch model. I followed the conversion script (thanks btw!) and have my .pt model file ready. Unfortunately, when I try to run training using the following command

python scripts/train_restyle_psp.py --dataset_type=buildings --encoder_type=BackboneEncoder --exp_dir=experiment/restyle_psp_ffhq_encode --workers=8 --batch_size=8 --test_batch_size=8 --test_workers=8 --val_interval=5000 --save_interval=10000 --start_from_latent_avg --lpips_lambda=0.8 --l2_lambda=1 --w_norm_lambda=0 --id_lambda=0.1 --input_nc=6 --n_iters_per_batch=5 --output_size=512 --stylegan_weights=F:\Experimentation\Generative_models\GANs\StyleGAN2\pretrained_models\rosalinity\buildings_5kimg_upsampled.pt

I get the following error when loading the pretrained model

  File "scripts/train_restyle_psp.py", line 30, in <module>
    main()
  File "scripts/train_restyle_psp.py", line 25, in main
    coach = Coach(opts)
  File "F:\Experimentation\Generative_models\GANs\StyleGAN2\GAN_editing\restyle-encoder\training\coach_restyle_psp.py", line 31, in __init__
    self.net = pSp(self.opts).to(self.device)
  File "F:\Experimentation\Generative_models\GANs\StyleGAN2\GAN_editing\restyle-encoder\models\psp.py", line 25, in __init__
    self.load_weights()
  File "F:\Experimentation\Generative_models\GANs\StyleGAN2\GAN_editing\restyle-encoder\models\psp.py", line 52, in load_weights
    self.decoder.load_state_dict(ckpt['g_ema'], strict=True)
  File "C:\Users\user\miniconda3\envs\archelites\lib\site-packages\torch\nn\modules\module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Generator:
        Missing key(s) in state_dict: "style.3.weight", "style.3.bias", "style.4.weight", "style.4.bias", "style.5.weight", "style.5.bias", "style.6.weight", "style.6.bias", "style.7.weight", "style.7.bias", "style.8.weight", "style.8.bias".

Any idea why I get a mismatch here? Thanks!

Question about Appendix D "Analyzing the Toonify Latent Space"

Would you be able to provide a few additional details about how the toonify StyleGAN2 generator was fine-tuned to get the output in Figure 17, e.g., did the latent spaces of both generators seem "aligned" for all fine-tuning checkpoints you checked? Would you be able to share the code used for creating Figure 17? Thanks!

How to apply a single encoding iteration

Hi, thanks for the very diligent work all you guys are doing in the lab

I'm a little bit confused by all the various wrapping functions and the many flags in the forward pass of the psp and e4e models ("input_code", "latent", etc.).

Suppose my goal is to simply apply a single iteration of the encoder by sending it a target image and a current guess image (and nothing else) and to receive as output the w offset needed to make the next guess slightly better (and nothing else),
i.e. something like:

delta_w = encoder_single_iteration(I_target, I_curr_estimate)

What should that function look like?
For simplicity, let's assume both images have already undergone the appropriate "img_transforms" and are torch tensors of shape (3, 256, 256).

I believe this simpler interface to the encoder/refiner might also be useful for other researchers, so perhaps worth having an example in the notebook?

Thanks again for all the good work!
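
For readers landing on this issue: a rough, unofficial sketch of such a function, based only on the training loop shown in the coach_restyle_psp.py question above (the batch dimension, the model handle net, and the interpretation of the returned latent are assumptions; this is not an API provided by the repository):

import torch

def encoder_single_iteration(net, target, current_estimate, current_latent):
    # target / current_estimate: (1, 3, 256, 256) tensors (batch dimension assumed);
    # current_latent: the latent returned by the previous call (or the average latent for step 0)
    x_input = torch.cat([target, current_estimate], dim=1)  # 6-channel input, as with --input_nc=6
    new_estimate, new_latent = net.forward(x_input, latent=current_latent, return_latents=True)
    # assuming the returned latent already includes the previous estimate (as in the training loop),
    # the residual predicted at this step is simply the difference
    delta_w = new_latent - current_latent
    return delta_w, new_estimate, new_latent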

Edit ReStyle’s inversion by InterFaceGAN

In Figure 8 of the ReStyle paper, you show edits of ReStyle's inversions using InterFaceGAN.

Did you edit the latent in the W+ latent space using the age boundary from InterFaceGAN's W latent space?

for example :

latent_codes = np.load('boundaries/stylegan_ffhq_age_w_boundary.npy')
ws = torch.from_numpy(latent_codes).type(torch.FloatTensor)
ws = ws.to(self.run_device)
wps = self.model.truncation(ws)
results['wp'] = self.get_value(wps)

Or did you train a new age boundary in the W+ latent space for InterFaceGAN?

I want to edit real images with ReStyle and InterFaceGAN. I think your inversion method is better than theirs!

Invalid Syntax

I think I have everything set up correctly, but I get an invalid syntax error when I run:

python scripts/train_restyle_psp.py
--dataset_type=fashion
--encoder_type=ResNetBackboneEncoder
--exp_dir=experiment/restyle_psp_ffhq_encode
--workers=8
--batch_size=8
--test_batch_size=8
--test_workers=8
--val_interval=5000
--save_interval=10000
--start_from_latent_avg
--lpips_lambda=0.8
--l2_lambda=1
--w_norm_lambda=0
--id_lambda=0.1
--input_nc=6
--n_iters_per_batch=5
--output_size=256
--stylegan_weights=pretrained_models/stylegan2-ffhq-config-f.pt

Error when launching inference

Hello.

Thanks for your code! I'm currently experiencing some trouble running inference. I'm using a conda environment and CUDA 10.1 is installed on the system (Ubuntu 20). The video card is a mobile RTX 2070. So, when launching your inference_iterative.py, I'm getting the following error:

"Traceback (most recent call last):
File "/home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1030, in _build_extension_module
check=True)
File "/home/daddywesker/anaconda3/envs/torch/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/daddywesker/SingularityNet/MixingImages/restyle-encoder/scripts/inference_iterative.py", line 16, in
from models.psp import pSp
File "../models/psp.py", line 8, in
from models.stylegan2.model import Generator
File "../models/stylegan2/model.py", line 7, in
from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
File "../models/stylegan2/op/init.py", line 1, in
from .fused_act import FusedLeakyReLU, fused_leaky_relu
File "../models/stylegan2/op/fused_act.py", line 13, in
os.path.join(module_path, 'fused_bias_act_kernel.cu'),
File "/home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 661, in load
is_python_module)
File "/home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 830, in jit_compile
with_cuda=with_cuda)
File "/home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 883, in write_ninja_file_and_build
build_extension_module(name, build_directory, verbose)
File "/home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1043, in build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'fused': [1/3] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/TH -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/THC -isystem /home/daddywesker/anaconda3/envs/torch/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS
-D__CUDA_NO_HALF_CONVERSIONS
-D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /home/daddywesker/SingularityNet/MixingImages/restyle-encoder/models/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/TH -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/THC -isystem /home/daddywesker/anaconda3/envs/torch/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /home/daddywesker/SingularityNet/MixingImages/restyle-encoder/models/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
ERROR: No supported gcc/g++ host compiler found, but clang-8 is available.
Use 'nvcc -ccbin clang-8' to use that instead.
[2/3] c++ -MMD -MF fused_bias_act.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/TH -isystem /home/daddywesker/anaconda3/envs/torch/lib/python3.7/site-packages/torch/include/THC -isystem /home/daddywesker/anaconda3/envs/torch/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++11 -c /home/daddywesker/SingularityNet/MixingImages/restyle-encoder/models/stylegan2/op/fused_bias_act.cpp -o fused_bias_act.o
ninja: build stopped: subcommand failed."Process finished with exit code 1"

Any suggestions? I've already tried reinstalling the conda env and CUDA... I've been searching the internet to understand where I'm wrong, but have found nothing yet. Thanks in advance.

Image-to-Image Translation using ReStyle

Hey!
First of all, great work!
I just wanted to ask whether there is any documentation for trying the ReStyle Encoder for image-to-image translation?
I am working on generating real images from sketches in a non-facial domain and have already tried the vanilla pSp image-to-image translation pipeline.
I saw your comment here saying that the ReStyle Encoder is better suited for non-facial domains and thus wanted to try it.
I have already tried setting the source and the target in the data config to the sketch and real image folders, respectively. It generates images as attached below. Shouldn't the input here be a sketch image?
[generated image]
Thanks in advance!

StyleGAN 3 support?

Hello!
Any chance of official NVLabs StyleGAN3 support? Specifically, it generates new config-T and config-R models, which would be awesome to train ReStyle on!
Also, there is a BSRGAN approach that seems to make everything better: it degrades the quality of input images in different realistic ways, which has already helped improve a few GANs. My assumption is that if ReStyle were trained to encode not only clean images but also degraded variations of them, the trained ReStyle would be less sensitive to image quality itself and more focused on the actual image contents, or am I totally wrong?

How long for training?

Hi,

This work is interesting!

I wonder how long it takes to train your models to reproduce the results in the paper?

Thanks!

Head cropping

Hi! Is it possible to increase the size of the image in order to avoid head cropping?
[example image showing head cropping]

ReStyle doesn't work anymore after running StyleGAN-NADA

After getting ReStyle to work, I had much fun using it for a few months.

Then I heard about StyleGAN-NADA and wanted to try it out... so I downloaded the repo and tried it out... the first time I ran it, it worked, but after that it didn't anymore... I did not install anything new and there aren't any error messages.

The problem is that both ReStyle and NADA (which depends on ReStyle) don't work anymore because they get stuck loading both the pSp and the e4e encoder inside the models folder. After a bit of research I found out that they are stuck loading the StyleGAN2 Generator, which is stuck importing FusedLeakyReLU, fused_leaky_relu and upfirdn2d. It doesn't crash or throw any errors and seems to load forever.

I am pretty frustrated right now because I can't use ReStyle anymore...

Does anyone know why this happens and how I can fix it?

Error

Why am I getting this error?
subprocess.CalledProcessError: Command '['where', 'cl']' returned non-zero exit status 1.

how to interpolate latents?

If I have two latents, how do I interpolate them to obtain an intermediate face?
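
For reference, a minimal sketch of simple linear interpolation between two saved latents (illustrative only; each latent is assumed to be a numpy array of shape (K, 512) as saved by the inference script, and decoding the interpolated latent back to an image would then go through the StyleGAN generator as in the inference/animation notebooks):

import numpy as np

def interpolate_latents(w1, w2, num_steps=10):
    # linear interpolation between two latent codes; alpha = 0 gives w1, alpha = 1 gives w2
    return [np.float32((1 - alpha) * w1 + alpha * w2)
            for alpha in np.linspace(0.0, 1.0, num_steps)]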

Improving toonification result

Hi,
I was wondering what we can do to improve the toonification result. I tested the encoder bootstrapping method using the following command:
python scripts/encoder_bootstrapping_inference.py --exp_dir=./toonify --model_1_checkpoint_path=./pretrained/restyle_psp_ffhq_encode.pt --model_2_checkpoint_path=./pretrained/restyle_psp_toonify.pt --data_path=./test/test_A --test_batch_size=1 --test_workers=1 --n_iters_per_batch=1

I get decent results, but I would like the output to look more like the input image.

A sample of the results I am getting:

[sample toonification result]

Question about training restyle on a different dataset

Hello again.

Recently, I decided to train your ReStyle on another dataset. The question is: do I need to train StyleGAN, or any other model ReStyle uses, on the same dataset?

Thanks in advance for the answer.
