
stylefeatureeditor's Introduction

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing (CVPR 2024)


The task of manipulating real image attributes through StyleGAN inversion has been extensively researched. This process involves searching latent variables from a well-trained StyleGAN generator that can synthesize a real image, modifying these latent variables, and then synthesizing an image with the desired edits. A balance must be struck between the quality of the reconstruction and the ability to edit. Earlier studies utilized the low-dimensional W-space for latent search, which facilitated effective editing but struggled with reconstructing intricate details. More recent research has turned to the high-dimensional feature space F, which successfully inverts the input image but loses much of the detail during editing. In this paper, we introduce StyleFeatureEditor -- a novel method that enables editing in both w-latents and F-latents. This technique not only allows for the reconstruction of finer image details but also ensures their preservation during editing. We also present a new training pipeline specifically designed to train our model to accurately edit F-latents. Our method is compared with state-of-the-art encoding approaches, demonstrating that our model excels in terms of reconstruction quality and is capable of editing even challenging out-of-domain examples.


SFE is able to edit a real face image with the desired editing. It first reconstructs (inverts) the original image and then edits it according to the chosen direction. On the left are examples of how our method works for several directions with different editing powers p. On the right we display a comparison with previous approaches. LPIPS (lower is better) indicates inversion quality, while FID (lower is better) indicates editing ability. The size of the markers indicates the inference time of the method, with larger markers indicating longer inference time.

Updates

18.06.2024: StyleFeatureEditor release
15.07.2024: Add gradio demo
20.07.2024: Add DeltaEdit editings
02.08.2024: Add image unalignment

Getting Started

Prerequisites

  • Linux or macOS
  • NVIDIA GPU + CUDA CuDNN
  • CMAKE
  • Python 3.10

Installation

  • Clone this repo:
git clone https://github.com/AIRI-Institute/StyleFeatureEditor
cd StyleFeatureEditor
  • Install the environment:

Step 1, create new conda environment:

conda create -n sfe python=3.10 -y
source deactivate
conda activate sfe

Step 2, install all necessary libraries via script:

bash env_install.sh
  • Download pretrained models:
git clone https://huggingface.co/AIRI-Institute/StyleFeatureEditor
cd StyleFeatureEditor && git lfs pull && cd ..
mv StyleFeatureEditor/pretrained_models pretrained_models
rm -rf StyleFeatureEditor

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models. However, you can use your own paths by changing the necessary values in configs/paths.py.

  • Download full weights [optional]:

The weights of the Inverter (the result of phase 1) and the Feature Editor (the result of phase 2) are stored in pretrained_models/sfe_inverter_light.pt and pretrained_models/sfe_editor_light.pt respectively. If you need full checkpoints, including the weights of all parts of our pipeline (discriminator, optimisers, etc.), you can download them manually from Google Drive:

Path            Description
SFE Editor      SFE trained on the FFHQ dataset (both phases).
SFE Inverter    SFE Inverter trained on the FFHQ dataset (first phase only).

Inference


Examples of how our method works on several real images. You can find inference of these examples in our Google Colab notebook below.

Inference Notebook

We provide a Jupyter Notebook that demonstrates how our method works. It includes downloading all the necessary components, running our method on several examples, and creating a gif.

Inference single sample

If you need to edit a single image or a few images, you can use SimpleRunner from runners/simple_runner.py. Initialize it with the path to the SFE checkpoint. To edit an image, use the .edit() method and pass the path to the input image, the name of the desired editing, the power of the desired editing, and the path where the edited image should be saved:

from runners.simple_runner import SimpleRunner

runner = SimpleRunner(
    editor_ckpt_pth="pretrained_models/sfe_editor_light.pt"
)

# Inference
result = runner.edit(
    orig_img_pth="path/to/original/image.jpg",
    editing_name="editing_name",
    edited_power=1.0,
    save_pth="path/to/save/edited/image.jpg",
    align=False
)

You can find all available directions in available_directions.txt or by running:

print(runner.available_editings())
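
For instance, assuming available_editings() returns an iterable of direction names (the paths below are placeholders, not files from the repository), you could apply every available direction to a single image:

# Illustrative loop, not code from the repository: apply each direction at a fixed power.
for name in runner.available_editings():
    runner.edit(
        orig_img_pth="path/to/original/image.jpg",
        editing_name=name,
        edited_power=1.0,
        save_pth=f"results/{name}.jpg",
    )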
  • Alignment

If you want to edit a raw image, do not forget to align it and resize it to 1024 x 1024 by passing align=True to runner.edit(...). Alignment means that the face is cropped from the original image. If you are using SimpleRunner, the edited face is automatically inserted back into the original image, and the result can be found in the parent directory of save_pth with the suffix "_unaligned".
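
For example, a minimal call for a raw photo might look like this (the paths are placeholders; the exact name of the "_unaligned" output file is determined by SimpleRunner):

# Editing a raw, not-yet-aligned photo: align=True crops and aligns the face first.
result = runner.edit(
    orig_img_pth="path/to/raw/photo.jpg",
    editing_name="editing_name",
    edited_power=1.0,
    save_pth="path/to/save/edited/image.jpg",
    align=True
)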

  • Masking

If artefacts appear in the background during editing, or the wrong parts of the image are being edited, you can use an image mask to choose which regions of the image should be edited -- just pass use_mask=True to runner.edit(...). By default, we use FARL to separate the face zone from the background and leave the background unedited. You can control which part of the image counts as background by passing mask_trashold=0.35 to runner.edit(...) -- the higher mask_trashold is, the larger the part treated as background.

result = runner.edit(
    orig_img_pth="path/to/original/image.jpg",
    editing_name="editing_name",
    edited_power=1.0,
    save_pth="path/to/save/edited/image.jpg",
    use_mask=True,
    mask_trashold=0.995
)

When the default masker is used, it saves the obtained mask, the cropped face, and the cropped background in the same directory as save_pth. If you need specific regions to remain unedited, you can pass your own mask (see the sketch after the example below). Just specify its path by passing mask_path="path/to/mask.jpg" to runner.edit(...):

result = runner.edit(
    orig_img_pth="path/to/original/image.jpg",
    editing_name="editing_name",
    edited_power=1.0,
    save_pth="path/to/save/edited/image.jpg",
    use_mask=True,
    mask_path="path/to/mask.jpg"
)
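
As a rough sketch of producing such a mask (PIL and NumPy are not part of the SFE API, they are just convenient here; we also assume white marks the region to be edited and black the region to keep -- check the masks saved by the default masker to confirm the convention):

import numpy as np
from PIL import Image

# Build a 1024 x 1024 single-channel mask: white = editable, black = untouched (assumed convention).
mask = np.zeros((1024, 1024), dtype=np.uint8)
mask[:512, :] = 255   # e.g. only allow edits in the top half of the aligned image
Image.fromarray(mask).save("path/to/mask.jpg")

The saved file can then be passed via mask_path exactly as in the example above.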
  • Script

You can also use a script with the same syntax:

python scripts/simple_inference.py \
    --orig_img_pth=path/to/original/image.jpg \
    --editing_name=editing_name \
    --edited_power=1.0 \
    --save_pth=path/to/save/edited/image.jpg \
    --align \
    --use_mask \
    --mask_trashold=0.995 \
    --mask_path=path/to/mask.jpg

Inference dataset

If you need to run inference on a large set of images rather than a single image, you can use scripts/inference.py.

First, select the directions and powers you want to apply and pass them to configs/fse_inference.yaml as the editings_data argument (a JSON dict-like format, as in the original config; see the illustration after the command below). Then run the script:

python scripts/inference.py \
    exp.config_dir=configs \
    exp.config=fse_inference.yaml \
    model.checkpoint_path="path/to/sfe/checkpoint" \
    data.inference_dir="path/to/input/dir" \
    exp.output_dir="path/where/to/save/results"
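
As a rough illustration of setting editings_data programmatically (the direction names and the name-to-powers mapping below are assumptions based on the description above, not the verified schema; we also assume editings_data sits at the top level of the config -- check the original configs/fse_inference.yaml before use):

from omegaconf import OmegaConf

# Hypothetical example: two powers for one direction, one power for another.
cfg = OmegaConf.load("configs/fse_inference.yaml")
cfg.editings_data = {"editing_name_1": [-3.0, 3.0], "editing_name_2": [5.0]}
OmegaConf.save(cfg, "configs/fse_inference.yaml")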

Remember that the input data should be aligned. If you are using a custom dataset (not FFHQ or CelebaHQ), do not forget to align it first.

Metrics calculation

  • Inversion metrics

To calculate inversion metrics, you could use the script scripts/calculate_metrics.py:

python scripts/calculate_metrics.py \
    --orig_path="path/to/original/aligned/images/dir" \
    --reconstr_path="path/to/reconstructed/images/dir" \
    --metrics fid l2 lpips

Available metrics are l2, lpips, fid, id, id_vit and msssim; more details can be found in metrics/metrcis.py. Metric names in --metrics should be separated by spaces. If you need to save per-image metric values, you can add --metrics_dir "path/where/to/save/metrics" to the arguments; this information will be saved in JSON format.

  • Editing metric

To calculate the editing metric (described in the paper), we assume that you have a dataset/subset of original CelebaHQ images and its edited version (e.g. obtained by running scripts/inference.py of our method). You will need to use the following script:

python scripts/fid_calculation.py \
    --orig_path="path/to/original/celeba/images/dir" \
    --synt_path="path/to/edited/celeba/images/dir" \
    --attr_name=Eyeglasses 

The attribute name should be one of the names listed in CelebAMask-HQ-attribute-anno.txt. If the selected attribute was removed rather than added during editing, pass the --attr_is_reversed flag.
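
For reference, a small helper for listing the images that carry a given attribute; the file layout assumed here (a count line, a header line of attribute names, then one row per image of the form "filename -1 1 ...") follows the standard CelebA annotation format, so verify it against your copy of CelebAMask-HQ-attribute-anno.txt:

def images_with_attribute(anno_path, attr_name):
    """Return the file names annotated with +1 for attr_name (e.g. "Eyeglasses")."""
    with open(anno_path) as f:
        lines = f.read().splitlines()
    attr_names = lines[1].split()          # header line with all attribute names
    col = attr_names.index(attr_name)
    selected = []
    for row in lines[2:]:
        parts = row.split()
        if int(parts[1 + col]) == 1:       # parts[0] is the image file name
            selected.append(parts[0])
    return selected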

Training

Configs

We use the OmegaConf package to manage our configs. All configs can be found in the configs/ directory. You can change them according to the lists of all arguments stored in the arguments/ directory. In addition, if you are using a script, you can change arguments directly on the command line (see the examples in the Scripts section below).
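
As a standalone sketch of these config mechanics (this is not code from the repository; the override keys are simply ones that appear in the commands below):

from omegaconf import OmegaConf

# Load a base config and merge command-line style "key=value" overrides on top of it.
base = OmegaConf.load("configs/fse_inference.yaml")
overrides = OmegaConf.from_dotlist(["exp.name=my_run", "data.inference_dir=path/to/input/dir"])
cfg = OmegaConf.merge(base, overrides)
print(OmegaConf.to_yaml(cfg))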

Experiments start

For each experiment you need to pass the path to the config directory exp.config_dir, the name of the .yaml config exp.config and the name of the experiment exp.name. The directory associated with exp.name will be created in exp.exp_dir and all necessary results will be stored in it.

You will also need to pass the paths to the datasets: the path to the training dataset via data.input_train_dir and the path to the validation images via data.input_val_dir. All inversion metrics will be calculated on the validation dataset. When using custom datasets, remember that all data should be aligned.

To track our experiments, we use Weights & Biases (option exp.wandb, which is True by default). It will log the repository code (at the start of training), the passed config, metrics, losses, and the inversion of several selected aligned images (you need to pass the path to them in data.special_dir). If you are using W&B, do not forget to put your W&B API key into the WANDB_KEY environment variable.

To reproduce the results of our paper, you can use the default configs from configs/.

Scripts

  • Stage 1. This stage trains the Inverter. To start it, use:
python3 scripts/train.py \
    exp.config_dir=configs \
    exp.config=fse_inverter_train.yaml \
    exp.name=fse_inverter_train \
    data.input_train_dir=path/to/train/images \
    data.input_val_dir=path/to/validation/images \
    data.special_dir=path/to/several/special/images
  • Stage 2. This stage trains the Feature Editor. To start it, use:
python3 scripts/train.py \
    exp.config_dir=configs \
    exp.config=fse_editor_train.yaml \
    exp.name=fse_editor_train \
    methods_args.fse_full.inverter_pth=path/to/trained/inverter.pt \
    data.input_train_dir=path/to/train/images \
    data.input_val_dir=path/to/validation/images \
    data.special_dir=path/to/several/special/images \
    train.start_step=300001

If you are using W&B, it is better to pass train.start_step according to the last training step of Inverter to get a better visualisation of the inversion metrics.

Method diagrams

  • Training stage 1


The Inverter training pipeline. The input image $X$ is passed to a Feature-Style-like backbone that predicts $w \in W^+$ and $F_{pred} \in \mathcal{F}_k$. Then $F_w = G(w_{0:k})$ is synthesized and passed together with $F_{pred}$ to the Fuser, which predicts $F_k$. The inversion $\widehat{X} = G(F_k, w_{k+1:N})$ is generated. An additional reconstruction $\widehat{X}_w = G(w_{0:N})$ is synthesized from the w-latents only. The loss is calculated for the pairs $(X, \widehat{X})$ and $(X, \widehat{X}_w)$.
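
Purely as a reading aid, the same data flow can be sketched in code (this is not code from models/methods.py; every callable and name below is a placeholder for the corresponding sub-module):

# Schematic Stage-1 forward pass; all arguments are placeholder callables.
def stage1_forward(x, backbone, fuser, generator_features, generator_from_features, generator_from_w):
    w, f_pred = backbone(x)                  # w in W+, F_pred in F_k
    f_w = generator_features(w)              # F_w = G(w_{0:k})
    f_k = fuser(f_pred, f_w)                 # fused features F_k
    x_hat = generator_from_features(f_k, w)  # inversion X_hat = G(F_k, w_{k+1:N})
    x_hat_w = generator_from_w(w)            # w-only reconstruction X_hat_w = G(w_{0:N})
    return x_hat, x_hat_w                    # losses use the pairs (X, X_hat) and (X, X_hat_w)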

  • Training stage 2 and Inference


The Feature Editor training pipeline and inference. To obtain the editing loss, one needs to synthesize training samples: $X_{E}$ -- the training input, and $X_{E}'$ -- the training target. The pre-trained encoder $E$ takes the real image $X$ and predicts $w_{E} \in W^+$. An editing direction $d \in \mathcal{D}$ is randomly sampled, after which $w_E$ is edited to $w_E' = w_E + d$. The image $X_{E}$ and intermediate features $F_{w_{E}}$ are synthesized from $w_{E}$, while $X_{E}'$ and $F_{w_{E}'}$ are synthesized from $w_{E}'$ via the generator $G$. $X_{E}$ is used as input and passed to the frozen Inverter $I$, which predicts $F_k$ and $w$; $w$ is then edited to $w'$ according to the sampled $d$. Then $\Delta$ is calculated, and the Feature Editor $H$ edits $F_k$ according to $\Delta$. The edited reconstruction $\widehat{X}_{E}'$ is synthesized from $F_k'$ and $w_{k+1:N}'$. The editing loss is calculated between $X_{E}'$ and $\widehat{X}_{E}'$. To obtain the inversion loss, the real image $X$ is passed to $I$, which predicts $w$ and $F_k$; $F_k$ is edited to $F_k'$ by $H$ with $\Delta = 0$. The inversion $\widehat{X}$ is synthesized from $F_k'$ and $w_{k+1:N}$. The inversion loss is calculated between $X$ and $\widehat{X}$. The inference pipeline is the same as the path that synthesizes $\widehat{X}_{E}'$, except that $I$ takes the real image $X$ instead of $X_E$.

Hierarchy of our training class

πŸ›οΈTraining Runner                        # Training Runner responsible  for ...
  β”œβ”€β”€ πŸ”§ _setup_device(...)                # Setting pipeline device
  β”œβ”€β”€ πŸ”§ _setup_experiment_dir(...)        # Setting directory to save checkpoints
  β”œβ”€β”€ πŸ”§ _setup_datasets(...)              # Setting train\val\special datasets
  β”œβ”€β”€ πŸ”§ _setup_dataloaders(...)           # Setting train\val\special loaders
  β”œβ”€β”€ πŸ”§ run(...)                          # Training loop, responsible  for ...
  β”‚   β”œβ”€β”€ πŸ”§ train_step(...)                  # Model forward, loss calulation, optimizer step and etc.
  β”‚   β”œβ”€β”€ πŸ”§ validate(...)                    # Metrics calculation, inference special images
  β”‚   β”œβ”€β”€ πŸ”§ save_..._logs(...)               # Saving training\validation logs
  β”‚   └── πŸ”§ save_checkpoint(...)             # Saving models, optimizers chekpoints
  β”‚
  β”œβ”€β”€ πŸ›οΈ Logger                            # Gather all training logs
  β”œβ”€β”€ πŸ›οΈ Metrics                           # Inversion metrics to validate 
  β”œβ”€β”€ πŸ›οΈ Optimizers                        # Encoder and Discriminator optimizers
  β”œβ”€β”€ πŸ›οΈ LossBuilder                       # Contain all losses used for training
  β”œβ”€β”€ πŸ›οΈ LatentEditor                      # Latent Editor
  β”‚   β”œβ”€β”€ πŸ›οΈ Editing models                   # Contain all models for editing
  β”‚   └── πŸ”§ get_[...]_editings(...)          # Editing particular directions for [...] editing method
  β”‚
  └── πŸ›οΈ Method                            # Method
      β”œβ”€β”€ πŸ”§ load_weights(...)                # Responsible for loading checkpoints
      β”œβ”€β”€ πŸ”§ forward(...)                     # Responsible for batch inversion via Inverter
      β”œβ”€β”€ πŸ›οΈ Discriminator                    # StyleGAN 2 Discriminator, trainable for adv loss
      β”œβ”€β”€ πŸ›οΈ Decoder                          # StyleGAN 2 Generator, not trainable
      β”œβ”€β”€ πŸ›οΈ Encoder                          # Trainable part, either Inverter or Feture Editor
      β “β ’β ’πŸ›οΈ Inverter                         # Pretrained module used only in second stage

Repository structure

  .
  ├── 📂 arguments                  # Contains all arguments used in training and inference
  ├── 📂 assets                     # Folder with method preview and example images
  ├── 📂 configs                    # Includes configs (associated with arguments) for training and inference
  ├── 📂 criteria                   # Contains original code for used losses and metrics
  ├── 📂 datasets
  │   ├── 📄 datasets.py                # A branch of custom datasets
  │   ├── 📄 loaders.py                 # Custom infinite loader
  │   └── 📄 transforms.py              # Transforms used in SFE
  │
  ├── 📂 editings                   # Includes original code for various editing methods and an editor that applies them
  │   ├── ...
  │   └── 📄 latent_editor.py           # Implementation of the module that edits w or stylespace latents
  │
  ├── 📂 metrics                    # Contains wrappers over original code for all used inversion metrics
  ├── 📂 models                     # Includes original code from several previous inversion methods
  │   ├── ...
  │   ├── 📂 farl                       # Modified FARL module, used to search for the face mask
  │   ├── 📂 psp
  │   │   ├── 📂 encoders                   # Contains all the Inverter, Feature Editor and E4E parts
  │   │   └── 📂 stylegan2                  # Includes the modified StyleGAN 2 generator
  │   │
  │   └── 📄 methods.py                  # Contains code for the Inverter and Feature Editor modules
  │
  ├── 📂 notebook                   # Folder for the Jupyter Notebook and raw images
  ├── 📂 runners                    # Includes main code for training and inference pipelines
  ├── 📂 scripts                    # Scripts to ...
  │   ├── 📄 align_all_parallel.py       # Align raw images
  │   ├── 📄 calculate_metrics.py        # Inversion metrics calculation
  │   ├── 📄 fid_calculation.py          # Editing metric calculation
  │   ├── 📄 inference.py                # Inference on a large set of data with several directions
  │   ├── 📄 simple_inference.py         # Inference on a single image with one direction and mask
  │   └── 📄 train.py                    # Start the training process
  │
  ├── 📂 training
  │   ├── 📄 loggers.py                  # Code for loggers used in training
  │   ├── 📄 losses.py                   # Wrappers over used losses
  │   └── 📄 optimizers.py               # Wrappers over used optimizers
  │
  ├── 📂 utils                                # Folder with utility functions
  ├── 📜 CelebAMask-HQ-attribute-anno.txt     # Matches between CelebA HQ images and attributes
  ├── 📜 available_directions.txt             # Info about available editing directions
  ├── 📜 requirements.txt                     # Lists required Python packages
  └── 📜 env_install.sh                       # Script to install the necessary environment

References & Acknowledgments

The code structure of this repository is heavily based on pSp and e4e.

The project has also been inspired by a number of existing inversion techniques, using the source code of several prominent examples. These include HyperInverter, FeatureStyleEncoder and StyleRes.

Citation

If you use this code for your research, please cite our paper:

@InProceedings{Bobkov_2024_CVPR,
    author    = {Bobkov, Denis and Titov, Vadim and Alanov, Aibek and Vetrov, Dmitry},
    title     = {The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {9337-9346}
}

stylefeatureeditor's People

Contributors

dmitryzykin, mikhailkuz, retir


stylefeatureeditor's Issues

Question about 'sfe_editor_light.pt'

Thanks for releasing code!
I want to run the code to edit my images, but I ran into some errors.
First, I created a simple_demo.py in the runners folder with the following content:

import sys
sys.path = ['.'] + sys.path
from simple_runner import SimpleRunner

runner = SimpleRunner( editor_ckpt_pth="pretrained_models/sfe_editor_light.pt" )
print(runner.available_editings())

Then I run the file, and it shows:
File "/home/ubuntu/miniconda3/envs/recon/lib/python3.11/dataclasses.py", line 815, in _get_field raise ValueError(f'mutable default {type(f.default)} for field ' ValueError: mutable default <class 'configs.paths.DefaultPathsClass'> for field paths is not allowed: use default_factory

I modify dataclasses.field(default=v.default) to dataclasses.field(default_factory=v.default) in class_registry.py, line 29.

And another error reported as:
Loading default Discriminator from pretrained_models/stylegan2-ffhq-config-f.pkl
Loading from checkpoint: pretrained_models/sfe_editor_light.pt
Traceback (most recent call last):
File "/home/ubuntu/Documents/StyleFeatureEditor-main/runners/simple_demo.py", line 3, in <module> from simple_runner import SimpleRunner
File "/home/ubuntu/Documents/StyleFeatureEditor-main/runners/simple_runner.py", line 188, in <module> runner = SimpleRunner(
^^^^^^^^^^^^^
File "/home/ubuntu/Documents/StyleFeatureEditor-main/runners/simple_runner.py", line 86, in __init__ self.inference_runner.setup()
File "/home/ubuntu/Documents/StyleFeatureEditor-main/runners/base_runner.py", line 25, in setup self._setup_method()
File "/home/ubuntu/Documents/StyleFeatureEditor-main/runners/base_runner.py", line 93, in _setup_method self.method = methods_registry[method_name]( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/Documents/StyleFeatureEditor-main/models/methods.py", line 52, in __init__ self.load_weights()
File "/home/ubuntu/Documents/StyleFeatureEditor-main/models/methods.py", line 73, in load_weights self.discriminator.load_state_dict(get_keys(ckpt, "discriminator"), strict=True)
File "/home/ubuntu/miniconda3/envs/recon/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Discriminator: Missing key(s) in state_dict: "b1024.resample_filter", "b1024.fromrgb.weight", "b1024.fromrgb.bias", "b1024.fromrgb.resample_filter", "b1024.conv0.weight", "b1024.conv0.bias", "b1024.conv0.resample_filter", "b1024.conv1.weight", "b1024.conv1.bias", "b1024.conv1.resample_filter", "b1024.skip.weight", "b1024.skip.resample_filter", "b512.resample_filter", "b512.conv0.weight", "b512.conv0.bias", "b512.conv0.resample_filter", "b512.conv1.weight", "b512.conv1.bias", "b512.conv1.resample_filter", "b512.skip.weight", "b512.skip.resample_filter", "b256.resample_filter", "b256.conv0.weight", "b256.conv0.bias", "b256.conv0.resample_filter", "b256.conv1.weight", "b256.conv1.bias", "b256.conv1.resample_filter", "b256.skip.weight", "b256.skip.resample_filter", "b128.resample_filter", "b128.conv0.weight", "b128.conv0.bias", "b128.conv0.resample_filter", "b128.conv1.weight", "b128.conv1.bias", "b128.conv1.resample_filter", "b128.skip.weight", "b128.skip.resample_filter", "b64.resample_filter", "b64.conv0.weight", "b64.conv0.bias", "b64.conv0.resample_filter", "b64.conv1.weight", "b64.conv1.bias", "b64.conv1.resample_filter", "b64.skip.weight", "b64.skip.resample_filter", "b32.resample_filter", "b32.conv0.weight", "b32.conv0.bias", "b32.conv0.resample_filter", "b32.conv1.weight", "b32.conv1.bias", "b32.conv1.resample_filter", "b32.skip.weight", "b32.skip.resample_filter", "b16.resample_filter", "b16.conv0.weight", "b16.conv0.bias", "b16.conv0.resample_filter", "b16.conv1.weight", "b16.conv1.bias", "b16.conv1.resample_filter", "b16.skip.weight", "b16.skip.resample_filter", "b8.resample_filter", "b8.conv0.weight", "b8.conv0.bias", "b8.conv0.resample_filter", "b8.conv1.weight", "b8.conv1.bias", "b8.conv1.resample_filter", "b8.skip.weight", "b8.skip.resample_filter", "b4.conv.weight", "b4.conv.bias", "b4.conv.resample_filter", "b4.fc.weight", "b4.fc.bias", "b4.out.weight", "b4.out.bias".

I'm trying to print some information in methods.py, line 72, it shows:
ipdb> self.encoder.load_state_dict(get_keys(ckpt, "encoder"), strict=True)
<All keys matched successfully>
ipdb> self.inverter.load_state_dict(get_keys(ckpt, "inverter"), strict=True)
<All keys matched successfully>
ipdb> self.discriminator.load_state_dict(get_keys(ckpt, "discriminator"), strict=True)
*** RuntimeError: Error(s) in loading state_dict for Discriminator: Missing key(s) in state_dict: "b1024.resample_filter", "b1024.fromrgb.weight", "b1024.fromrgb.bias", "b1024.fromrgb.resample_filter", "b1024.conv0.weight", "b1024.conv0.bias", "b1024.conv0.resample_filter", "b1024.conv1.weight", "b1024.conv1.bias", "b1024.conv1.resample_filter", "b1024.skip.weight", "b1024.skip.resample_filter", "b512.resample_filter", "b512.conv0.weight", "b512.conv0.bias", "b512.conv0.resample_filter", "b512.conv1.weight", "b512.conv1.bias", "b512.conv1.resample_filter", "b512.skip.weight", "b512.skip.resample_filter", "b256.resample_filter", "b256.conv0.weight", "b256.conv0.bias", "b256.conv0.resample_filter", "b256.conv1.weight", "b256.conv1.bias", "b256.conv1.resample_filter", "b256.skip.weight", "b256.skip.resample_filter", "b128.resample_filter", "b128.conv0.weight", "b128.conv0.bias", "b128.conv0.resample_filter", "b128.conv1.weight", "b128.conv1.bias", "b128.conv1.resample_filter", "b128.skip.weight", "b128.skip.resample_filter", "b64.resample_filter", "b64.conv0.weight", "b64.conv0.bias", "b64.conv0.resample_filter", "b64.conv1.weight", "b64.conv1.bias", "b64.conv1.resample_filter", "b64.skip.weight", "b64.skip.resample_filter", "b32.resample_filter", "b32.conv0.weight", "b32.conv0.bias", "b32.conv0.resample_filter", "b32.conv1.weight", "b32.conv1.bias", "b32.conv1.resample_filter", "b32.skip.weight", "b32.skip.resample_filter", "b16.resample_filter", "b16.conv0.weight", "b16.conv0.bias", "b16.conv0.resample_filter", "b16.conv1.weight", "b16.conv1.bias", "b16.conv1.resample_filter", "b16.skip.weight", "b16.skip.resample_filter", "b8.resample_filter", "b8.conv0.weight", "b8.conv0.bias", "b8.conv0.resample_filter", "b8.conv1.weight", "b8.conv1.bias", "b8.conv1.resample_filter", "b8.skip.weight", "b8.skip.resample_filter", "b4.conv.weight", "b4.conv.bias", "b4.conv.resample_filter", "b4.fc.weight", "b4.fc.bias", "b4.out.weight", "b4.out.bias".

It seems something is wrong with get_keys(ckpt, "discriminator"). However, get_keys(ckpt, "encoder") and get_keys(ckpt, "inverter") work well.

May I know the md5 of sfe_editor_light.pt? I want to check whether the model downloaded completely.

Celeba HQ Test set

Hi there,

I'm working on a project that requires the test set of the CelebA-HQ dataset. Could you please provide information on how to obtain the list of filenames for this set?

Ideally, a text file containing the filenames would be helpful. If this isn't possible, could you share a link to where this information can be found?

Thanks in advance.

Replicating Metrics

Hi, Thanks for releasing the code!

I'm encountering difficulties replicating the metrics reported in the main document. I'm wondering if the mask=True flag is used during the inference step when calculating these metrics.

Specifically, are the metrics calculated using Metric(Mask * I, Mask * I'), where Mask is applied to both input images before computing the Metric?

Thanks,

Time to train Inverter and Feature Editor

Thanks for releasing code!
And I have some questions about training the network.
Both sfe_editor.pt and sfe_inverter.pt are very large. I wonder about the time and equipment required to train these two models. I didn't find related information in the paper.

`e4e_encoder` in models/methods.py seems useless?

I debugged the FSEInferenceRunner code and found that the e4e_encoder seems totally useless:

 print("Loading E4E from", self.opts.e4e_path)
 ckpt = torch.load(self.opts.e4e_path, map_location="cpu")
 self.e4e_encoder.load_state_dict(get_keys(ckpt, "encoder"), strict=True)
 self.e4e_encoder = self.e4e_encoder.eval().to(self.device)
 toogle_grad(self.e4e_encoder, False)

How to get more attributes?

The project provides a variety of attribute editing directions, but with current attribute editing methods such as GANSpace and InterfaceGAN, only a small number of attributes yield good editing results, due to the natural entanglement of the W+ space. Since this project is based on the new FS space, is it possible to train new attribute editing models in FS space, and are there any useful suggestions? Thanks!

Training the inverter

How many steps does it take to train an inverter with full FFHQ data to achieve model convergence? Do you really need 300000 steps? Can you give a loss curve for reference? Thank you!
