stylegan3-fun's Introduction

StyleGAN3-Fun
Let's have fun with StyleGAN2/ADA/3!

SOTA GANs are hard to train and to explore, and StyleGAN2/ADA/3 are no different. The point of this repository is to allow the user to both easily train and explore the trained models without unnecessary headaches.

As before, we will build upon the official repository, which has the advantage of being backwards-compatible. As such, we can use our previously-trained models from StyleGAN2 and StyleGAN2-ADA. Please get acquainted with the official repository and its codebase, as we will be building upon it and, in doing so, extending its capabilities (but hopefully not its complexity!).

Additions

This repository makes the following additions and changes (not yet a complete list):

  • Dataset Setup (dataset_tool.py)

    • RGBA support, so revert saving images to .png (Issue #156 by @1378dm). Training can use RGBA and images can be generated.
      • TODO: Check that training code is correct for normalizing the alpha channel, as well as making the interpolation code work with this new format (look into moviepy.editor.VideoClip)
      • For now, interpolation videos will only be saved in RGB format, i.e., discarding the alpha channel.
    • --center-crop-tall: add vertical black bars to the sides of each image in the dataset. Use this when your images are rectangular (with height > width) and you wish to train a square model, in the same vein as the horizontal bars added when using --center-crop-wide (where width > height).
      • This is useful when you don't want to lose information from the left and right side of the image by only using the center crop (likewise for --center-crop-wide, but for the top and bottom of the image).
      • Note that each image doesn't have to be of the same size, and the added bars will only ensure you get a square image, which will then be resized to the model's desired resolution (set by --resolution).
    • Grayscale images in the dataset are converted to RGB
      • If you want to turn this off, remove the respective line in dataset_tool.py, e.g., if your dataset is made of images in a folder, then the function to be used is open_image_folder in dataset_tool.py, and the line to be removed is img = img.convert('RGB') in the iterate_images inner function.
    • The dataset can be forced to be of a specific number of channels, that is, grayscale, RGB or RGBA.
      • To use this, set --force-channels=1 for grayscale, --force-channels=3 for RGB, and --force-channels=4 for RGBA.
    • If the dataset tool encounters an error, print it along with the offending image, but continue with the rest of the dataset (PR #39 from Andreas Jansson).
    • For conditional models, we can use the subdirectories as the classes by adding --subfolders-as-labels. This will generate the dataset.json file automatically as done by @pbaylies here
      • Additionally, in the --source folder, we will save a class_labels.txt file, to further know which classes correspond to each subdirectory.
  • Training

    • Add --cfg=stylegan2-ext, which uses @aydao's extended modifications for handling large and diverse datasets.
      • A good explanation is found in Gwern's blog here
      • If you wish to fine-tune from @aydao's Anime model, use --cfg=stylegan2-ext --resume=anime512 when running train.py
      • Note: This is an extremely experimental configuration! The .pkl files will be ~1.1 GB each and training will slow down significantly. Use at your own risk!
    • --blur-percent: Blur both real and generated images before passing them to the Discriminator.
      • The blur (blur_init_sigma=10.0) will completely fade after the selected percentage of the training is completed (using a linear ramp).
      • Another experimental feature: it should help with datasets that have a lot of variation, when you wish the model to first learn to generate the objects and then their details.
    • --mirrory: Added vertical mirroring for doubling the dataset size (quadrupling if --mirror is used; make sure your dataset has either or both of these symmetries in order for it to make sense to use them)
    • --gamma: If no R1 regularization is provided, the heuristic formula from StyleGAN will be used.
      • Specifically, we will set gamma=0.0002 * resolution ** 2 / batch_size
    • --aug: TODO: add Deceive-D/APA as an option.
    • --augpipe: StyleGAN2-ADA's full list of augmentation pipelines is now available, i.e., individual augmentations (blit, geom, color, filter, noise, cutout) or their combinations (bg, bgc, bgcf, bgcfn, bgcfnc).
    • --img-snap: Set when to save snapshot images, so now it's independent of when the model is saved (e.g., save image snapshots more often to know how the model is training without saving the model itself, to save space).
    • --snap-res: The resolution of the snapshots, depending on how many images you wish to see per snapshot. Available resolutions: 1080p, 4k, and 8k.
    • --resume-kimg: Starting number of kimg, useful when continuing training a previous run
    • --outdir: Automatically set as training-runs, so no need to set beforehand (in general this is true throughout the repository)
    • --metrics: Now set by default to None, so there's no need to worry about this one
    • --freezeD: Renamed from --freezed for better readability
    • --freezeM: Freeze the first layers of the Mapping Network Gm (G.mapping)
    • --freezeE: Freeze the embedding layer of the Generator (for class-conditional models)
    • --freezeG: TODO: Freeze the first layers of the Synthesis Network (G.synthesis; less cost to transfer learn, focus on high layers?)
    • --resume: All available pre-trained models from NVIDIA (and more) can be used with a simple dictionary, depending on the --cfg used. For example, if you wish to use StyleGAN3's config-r, then set --cfg=stylegan3-r. In addition, if you wish to transfer learn from FFHQU at 1024 resolution, set --resume=ffhqu1024.
      • The full list of currently available models to transfer learn from (or synthesize new images with) is the following (TODO: add small description of each model, so the user can better know which to use for their particular use-case; proper citation to original authors as well):

        StyleGAN2 models
        1. Majority, if not all, are config-f: set --cfg=stylegan2
          • ffhq256
          • ffhqu256
          • ffhq512
          • ffhq1024
          • ffhqu1024
          • celebahq256
          • lsundog256
          • afhqcat512
          • afhqdog512
          • afhqwild512
          • afhq512
          • brecahad512
          • cifar10 (conditional, 10 classes)
          • metfaces1024
          • metfacesu1024
          • lsuncar512 (config-f)
          • lsuncat256 (config-f)
          • lsunchurch256 (config-f)
          • lsunhorse256 (config-f)
          • minecraft1024 (thanks to @jeffheaton)
          • imagenet512 (thanks to @shawwn)
          • wikiart1024-C (conditional, 167 classes; thanks to @pbaylies)
          • wikiart1024-U (thanks to @pbaylies)
          • maps1024 (thanks to @tjukanov)
          • fursona512 (thanks to @arfafax)
          • mlpony512 (thanks to @arfafax)
          • lhq1024 (thanks to @justinpinkney)
          • afhqcat256 (Deceive-D/APA models)
          • anime256 (Deceive-D/APA models)
          • cub256 (Deceive-D/APA models)
          • sddogs1024 (Self-Distilled StyleGAN models)
          • sdelephant512 (Self-Distilled StyleGAN models)
          • sdhorses512 (Self-Distilled StyleGAN models)
          • sdbicycles256 (Self-Distilled StyleGAN models)
          • sdlions512 (Self-Distilled StyleGAN models)
          • sdgiraffes512 (Self-Distilled StyleGAN models)
          • sdparrots512 (Self-Distilled StyleGAN models)
        2. Extended StyleGAN2 config from @aydao: set --cfg=stylegan2-ext
        StyleGAN3 models
        1. config-t: set --cfg=stylegan3-t
          • afhq512
          • ffhqu256
          • ffhq1024
          • ffhqu1024
          • metfaces1024
          • metfacesu1024
          • landscapes256 (thanks to @justinpinkney)
          • wikiart1024 (thanks to @justinpinkney)
          • mechfuture256 (thanks to @edstoica; 29 kimg tick)
          • vivflowers256 (thanks to @edstoica; 68 kimg tick)
          • alienglass256 (thanks to @edstoica; 38 kimg tick)
          • scificity256 (thanks to @edstoica; 210 kimg tick)
          • scifiship256 (thanks to @edstoica; 168 kimg tick)
        2. config-r: set --cfg=stylegan3-r
          • afhq512
          • ffhq1024
          • ffhqu1024
          • ffhqu256
          • metfaces1024
          • metfacesu1024
      • The main sources of these pretrained models are the official NVIDIA repository, as well as other community repositories, such as Justin Pinkney's Awesome Pretrained StyleGAN2 and Awesome Pretrained StyleGAN3, Deceive-D/APA, Self-Distilled StyleGAN/Internet Photos, and edstoica's Wombo Dream-based models. Others can be found around the net and are properly credited in this repository, so long as they can be easily downloaded with dnnlib.util.open_url.

  • Interpolation videos

    • Random interpolation
      • Generate images/interpolations with the internal representations of the model
        • Usage: Add --layer=<layer_name> to specify which layer to use for interpolation.
        • If you don't know the names of the layers available for your model, add the flag --available-layers and the layers will be printed to the console, along with their names, number of channels, and sizes.
        • Use one of --grayscale or --rgb to specify whether to save the images as grayscale or RGB during the interpolation.
        • For --rgb, three consecutive channels (starting at --starting-channel=0) will be used to create the RGB image. For --grayscale, only the first channel will be used.
    • Style-mixing
    • Sightseeding (jumpiness has been fixed)
    • Circular interpolation
    • Visual-reactive interpolation (Beta)
    • Audiovisual-reactive interpolation (TODO)
    • TODO: Give support to RGBA models!
  • Projection into the latent space

  • Discriminator Synthesis (official code)

    • Generate a static image (python discriminator_synthesis.py dream --help) or a video with a feedback loop (python discriminator_synthesis.py dream-zoom --help, python discriminator_synthesis.py channel-zoom --help, or python discriminator_synthesis.py interp --help)
    • Start from a random image (random for noise or perlin for 2D fractal Perlin noise, using Mathieu Duchesneau's implementation) or from an existing one
  • Expansion on GUI/visualizer.py

    • Added the rest of the affine transformations
    • Added widget for class-conditional models (TODO: mix classes with continuous values for cls!)
  • General model and code additions

    • Multi-modal truncation trick: find the different clusters in your model and use the closest one to your dlatent, in order to increase the fidelity
      • Usage: Run python multimodal_truncation.py get-centroids --network=<path_to_model> to use default values; for extra options, run python multimodal_truncation.py get-centroids --help
    • StyleGAN3: anchor the latent space for easier-to-follow interpolations (thanks to Rivers Have Wings and nshepperd).
    • Use CPU instead of GPU if desired (not recommended, but perfectly fine for generating images, whenever the custom CUDA kernels fail to compile).
    • Add missing dependencies and channels so that the conda environment is correctly set up in Windows (PRs #111/#125 and #80/#143 from the base, respectively)
    • Use StyleGAN-NADA models with any part of the code (Issue #9)
      • The StyleGAN-NADA models must first be converted via Vadim Epstein's conversion code found here.
    • Add PR #173 for adding the last remaining unknown kwarg for using StyleGAN2 models using TF 1.15.
  • TODO list (this is a long one with more to come, so any help is appreciated):
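
Two of the training options listed above reduce to small formulas: the --gamma heuristic and the linear ramp used by --blur-percent. The sketch below is only an illustration of those two descriptions, not the code in train.py. The function names are hypothetical, and it assumes that --blur-percent is given as a percentage (0 to 100) of the total training length in kimg.

```python
def heuristic_gamma(resolution: int, batch_size: int) -> float:
    """R1 weight from the heuristic quoted above: 0.0002 * resolution^2 / batch_size."""
    return 0.0002 * resolution ** 2 / batch_size


def blur_sigma(cur_kimg: float, total_kimg: float, blur_percent: float,
               blur_init_sigma: float = 10.0) -> float:
    """Linearly ramp the blur sigma from blur_init_sigma down to 0.

    Assumption: blur_percent is a percentage (0-100) of total_kimg.
    """
    fade_kimg = total_kimg * blur_percent / 100.0
    if fade_kimg <= 0:
        return 0.0  # blur disabled
    # Clamp at 0 so the blur stays off once the ramp has finished.
    return blur_init_sigma * max(1.0 - cur_kimg / fade_kimg, 0.0)


if __name__ == "__main__":
    print(heuristic_gamma(resolution=1024, batch_size=32))
    print(blur_sigma(cur_kimg=50, total_kimg=1000, blur_percent=10))
```

Under these assumptions, a 1024x1024 model trained with a batch size of 32 gets a gamma of roughly 6.55, which is in the same range as the hand-picked values used in the official training examples.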

Notebooks (Coming soon!)

Sponsors

This repository has been sponsored by:

isosceles

Thank you so much!

If you wish to sponsor me, click here:


Alias-Free Generative Adversarial Networks (StyleGAN3)
Official PyTorch implementation of the NeurIPS 2021 paper

Teaser image

Alias-Free Generative Adversarial Networks
Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, Timo Aila
https://nvlabs.github.io/stylegan3

Abstract: We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing

Release notes

This repository is an updated version of stylegan2-ada-pytorch, with several new features:

  • Alias-free generator architecture and training configurations (stylegan3-t, stylegan3-r).
  • Tools for interactive visualization (visualizer.py), spectral analysis (avg_spectra.py), and video generation (gen_video.py).
  • Equivariance metrics (eqt50k_int, eqt50k_frac, eqr50k).
  • General improvements: reduced memory usage, slightly faster training, bug fixes.

Compatibility:

  • Compatible with old network pickles created using stylegan2-ada and stylegan2-ada-pytorch. (Note: running old StyleGAN2 models on StyleGAN3 code will produce the same results as running them on stylegan2-ada/stylegan2-ada-pytorch. To benefit from the StyleGAN3 architecture, you need to retrain.)
  • Supports old StyleGAN2 training configurations, including ADA and transfer learning. See Training configurations for details.
  • Improved compatibility with Ampere GPUs and newer versions of PyTorch, CuDNN, etc.

Synthetic image detection

While new generator approaches enable new media synthesis capabilities, they may also present a new challenge for AI forensics algorithms for detection and attribution of synthetic media. In collaboration with digital forensic researchers participating in DARPA's SemaFor program, we curated a synthetic image dataset that allowed the researchers to test and validate the performance of their image detectors in advance of the public release. Please see here for more details.

Additional material

  • Result videos
  • Curated example images
  • StyleGAN3 pre-trained models for config T (translation equiv.) and config R (translation and rotation equiv.)

    Access individual networks via https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/<MODEL>, where <MODEL> is one of:
    stylegan3-t-ffhq-1024x1024.pkl, stylegan3-t-ffhqu-1024x1024.pkl, stylegan3-t-ffhqu-256x256.pkl
    stylegan3-r-ffhq-1024x1024.pkl, stylegan3-r-ffhqu-1024x1024.pkl, stylegan3-r-ffhqu-256x256.pkl
    stylegan3-t-metfaces-1024x1024.pkl, stylegan3-t-metfacesu-1024x1024.pkl
    stylegan3-r-metfaces-1024x1024.pkl, stylegan3-r-metfacesu-1024x1024.pkl
    stylegan3-t-afhqv2-512x512.pkl
    stylegan3-r-afhqv2-512x512.pkl

  • StyleGAN2 pre-trained models compatible with this codebase

    Access individual networks via https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan2/versions/1/files/<MODEL>, where <MODEL> is one of:
    stylegan2-ffhq-1024x1024.pkl, stylegan2-ffhq-512x512.pkl, stylegan2-ffhq-256x256.pkl
    stylegan2-ffhqu-1024x1024.pkl, stylegan2-ffhqu-256x256.pkl
    stylegan2-metfaces-1024x1024.pkl, stylegan2-metfacesu-1024x1024.pkl
    stylegan2-afhqv2-512x512.pkl
    stylegan2-afhqcat-512x512.pkl, stylegan2-afhqdog-512x512.pkl, stylegan2-afhqwild-512x512.pkl
    stylegan2-brecahad-512x512.pkl, stylegan2-cifar10-32x32.pkl
    stylegan2-celebahq-256x256.pkl, stylegan2-lsundog-256x256.pkl
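
Since every pickle listed above follows the same URL pattern, the full address can be assembled from the file name alone. The helper below is a minimal sketch; ngc_model_url is a hypothetical name and is not part of the repository. It assumes the model family can be read from the file name prefix.

```python
NGC_URL = ("https://api.ngc.nvidia.com/v2/models/nvidia/research/"
           "{family}/versions/1/files/{filename}")


def ngc_model_url(filename: str) -> str:
    """Build the download URL for one of the pickles listed above."""
    # The family segment of the URL matches the prefix of the file name.
    if filename.startswith("stylegan3-"):
        family = "stylegan3"
    elif filename.startswith("stylegan2-"):
        family = "stylegan2"
    else:
        raise ValueError(f"Unrecognized model file name: {filename!r}")
    return NGC_URL.format(family=family, filename=filename)


if __name__ == "__main__":
    print(ngc_model_url("stylegan3-r-afhqv2-512x512.pkl"))
```

The resulting string can be passed directly to --network or --resume.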

Requirements

  • Linux and Windows are supported, but we recommend Linux for performance and compatibility reasons.
  • 1–8 high-end NVIDIA GPUs with at least 12 GB of memory. We have done all testing and development using Tesla V100 and A100 GPUs.
  • 64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • CUDA toolkit 11.1 or later. (Why is a separate CUDA toolkit installation required? See Troubleshooting).
  • GCC 7 or later (Linux) or Visual Studio (Windows) compilers. Recommended GCC version depends on CUDA version, see for example CUDA 11.4 system requirements.
  • Python libraries: see environment.yml for exact library dependencies. You can use the following commands with Miniconda3 to create and activate your StyleGAN3 Python environment:
    • conda env create -f environment.yml
    • conda activate stylegan3
  • Docker users:

The code relies heavily on custom PyTorch extensions that are compiled on the fly using NVCC. On Windows, the compilation requires Microsoft Visual Studio. We recommend installing Visual Studio Community Edition and adding it into PATH using "C:\Program Files (x86)\Microsoft Visual Studio\<VERSION>\Community\VC\Auxiliary\Build\vcvars64.bat".

See Troubleshooting for help on common installation and run-time problems.

Getting started

Pre-trained networks are stored as *.pkl files that can be referenced using local filenames or URLs:

# Generate an image using pre-trained AFHQv2 model ("Ours" in Figure 1, left).
python gen_images.py --outdir=out --trunc=1 --seeds=2 \
    --network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl

# Render a 4x2 grid of interpolations for seeds 0 through 31.
python gen_video.py --output=lerp.mp4 --trunc=1 --seeds=0-31 --grid=4x2 \
    --network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl

Outputs from the above commands are placed under out/*.png, controlled by --outdir. Downloaded network pickles are cached under $HOME/.cache/dnnlib, which can be overridden by setting the DNNLIB_CACHE_DIR environment variable. The default PyTorch extension build directory is $HOME/.cache/torch_extensions, which can be overridden by setting TORCH_EXTENSIONS_DIR.
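
To check where the caches will end up on a given machine, the lookup described above can be mirrored in a few lines. This is a sketch of the documented behavior only, not the actual logic inside dnnlib or PyTorch; cache_dirs is a hypothetical helper.

```python
import os
from pathlib import Path


def cache_dirs(env=None):
    """Return the cache locations described above for an environment mapping."""
    env = os.environ if env is None else env
    home = Path(env["HOME"]) if env.get("HOME") else Path.home()
    default_root = home / ".cache"

    def pick(var_name, default_subdir):
        # An environment variable, when set, overrides the default location.
        override = env.get(var_name)
        return Path(override) if override else default_root / default_subdir

    return {
        "dnnlib": pick("DNNLIB_CACHE_DIR", "dnnlib"),
        "torch_extensions": pick("TORCH_EXTENSIONS_DIR", "torch_extensions"),
    }


if __name__ == "__main__":
    for name, path in cache_dirs().items():
        print(f"{name}: {path}")
```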

Docker: You can run the above curated image example using Docker as follows:

# Build the stylegan3:latest image
docker build --tag stylegan3 .

# Run the gen_images.py script using Docker:
docker run --gpus all -it --rm --user $(id -u):$(id -g) \
    -v `pwd`:/scratch --workdir /scratch -e HOME=/scratch \
    stylegan3 \
    python gen_images.py --outdir=out --trunc=1 --seeds=2 \
         --network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-afhqv2-512x512.pkl

Note: The Docker image requires NVIDIA driver release r470 or later.

The docker run invocation may look daunting, so let's unpack its contents here:

  • --gpus all -it --rm --user $(id -u):$(id -g): with all GPUs enabled, run an interactive session with current user's UID/GID to avoid Docker writing files as root.
  • -v `pwd`:/scratch --workdir /scratch: mount current running dir (e.g., the top of this git repo on your host machine) to /scratch in the container and use that as the current working dir.
  • -e HOME=/scratch: let PyTorch and StyleGAN3 code know where to cache temporary files such as pre-trained models and custom PyTorch extension build results. Note: if you want more fine-grained control, you can instead set TORCH_EXTENSIONS_DIR (for custom extensions build dir) and DNNLIB_CACHE_DIR (for pre-trained model download cache). You want these cache dirs to reside on persistent volumes so that their contents are retained across multiple docker run invocations.

Interactive visualization

This release contains an interactive model visualization tool that can be used to explore various characteristics of a trained model. To start it, run:

python visualizer.py

Visualizer screenshot

Using networks from Python

You can use pre-trained networks in your own Python code as follows:

import pickle
import torch

with open('ffhq.pkl', 'rb') as f:
    G = pickle.load(f)['G_ema'].cuda()  # torch.nn.Module
z = torch.randn([1, G.z_dim]).cuda()    # latent codes
c = None                                # class labels (not used in this example)
img = G(z, c)                           # NCHW, float32, dynamic range [-1, +1], no truncation

The above code requires torch_utils and dnnlib to be accessible via PYTHONPATH. It does not need source code for the networks themselves — their class definitions are loaded from the pickle via torch_utils.persistence.

The pickle contains three networks. 'G' and 'D' are instantaneous snapshots taken during training, and 'G_ema' represents a moving average of the generator weights over several training steps. The networks are regular instances of torch.nn.Module, with all of their parameters and buffers placed on the CPU at import and gradient computation disabled by default.

The generator consists of two submodules, G.mapping and G.synthesis, that can be executed separately. They also support various additional options:

w = G.mapping(z, c, truncation_psi=0.5, truncation_cutoff=8)
img = G.synthesis(w, noise_mode='const', force_fp32=True)

Please refer to gen_images.py for complete code example.

Preparing datasets

Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels. Custom datasets can be created from a folder containing images; see python dataset_tool.py --help for more information. Alternatively, the folder can also be used directly as a dataset, without running it through dataset_tool.py first, but doing so may lead to suboptimal performance.
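
For intuition, the archive layout described above can be reproduced with the standard library alone. The sketch below is not a replacement for dataset_tool.py: it does no resizing or validation, and the helper names are hypothetical. It assumes that dataset.json holds a "labels" list of [archive_path, class_index] pairs and that class indices follow the sorted subfolder names; check both against the output of dataset_tool.py before relying on them.

```python
import io
import json
import zipfile


def build_labels(image_paths):
    """Map each image's top-level subfolder to an integer class index."""
    classes = sorted({p.split("/")[0] for p in image_paths})
    class_to_idx = {name: i for i, name in enumerate(classes)}
    labels = [[p, class_to_idx[p.split("/")[0]]] for p in sorted(image_paths)]
    return class_to_idx, labels


def write_dataset_zip(dest, images, labels):
    """Write images (archive path -> PNG bytes) plus dataset.json, uncompressed."""
    # ZIP_STORED keeps every member uncompressed, as the format above requires.
    with zipfile.ZipFile(dest, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in images.items():
            zf.writestr(name, data)
        zf.writestr("dataset.json", json.dumps({"labels": labels}))


if __name__ == "__main__":
    # Placeholder bytes stand in for real PNG data in this illustration.
    images = {"cat/0.png": b"...", "dog/0.png": b"..."}
    class_to_idx, labels = build_labels(list(images))
    buffer = io.BytesIO()
    write_dataset_zip(buffer, images, labels)
    print(class_to_idx, labels)
```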

FFHQ: Download the Flickr-Faces-HQ dataset as 1024x1024 images and create a zip archive using dataset_tool.py:

# Original 1024x1024 resolution.
python dataset_tool.py --source=/tmp/images1024x1024 --dest=~/datasets/ffhq-1024x1024.zip

# Scaled down 256x256 resolution.
python dataset_tool.py --source=/tmp/images1024x1024 --dest=~/datasets/ffhq-256x256.zip \
    --resolution=256x256

See the FFHQ README for information on how to obtain the unaligned FFHQ dataset images. Use the same steps as above to create a ZIP archive for training and validation.

MetFaces: Download the MetFaces dataset and create a ZIP archive:

python dataset_tool.py --source=~/downloads/metfaces/images --dest=~/datasets/metfaces-1024x1024.zip

See the MetFaces README for information on how to obtain the unaligned MetFaces dataset images. Use the same steps as above to create a ZIP archive for training and validation.

AFHQv2: Download the AFHQv2 dataset and create a ZIP archive:

python dataset_tool.py --source=~/downloads/afhqv2 --dest=~/datasets/afhqv2-512x512.zip

Note that the above command creates a single combined dataset using all images of all three classes (cats, dogs, and wild animals), matching the setup used in the StyleGAN3 paper. Alternatively, you can also create a separate dataset for each class:

python dataset_tool.py --source=~/downloads/afhqv2/train/cat --dest=~/datasets/afhqv2cat-512x512.zip
python dataset_tool.py --source=~/downloads/afhqv2/train/dog --dest=~/datasets/afhqv2dog-512x512.zip
python dataset_tool.py --source=~/downloads/afhqv2/train/wild --dest=~/datasets/afhqv2wild-512x512.zip

Training

You can train new networks using train.py. For example:

# Train StyleGAN3-T for AFHQv2 using 8 GPUs.
python train.py --outdir=~/training-runs --cfg=stylegan3-t --data=~/datasets/afhqv2-512x512.zip \
    --gpus=8 --batch=32 --gamma=8.2 --mirror=1

# Fine-tune StyleGAN3-R for MetFaces-U using 8 GPUs, starting from the pre-trained FFHQ-U pickle.
python train.py --outdir=~/training-runs --cfg=stylegan3-r --data=~/datasets/metfacesu-1024x1024.zip \
    --gpus=8 --batch=32 --gamma=6.6 --mirror=1 --kimg=5000 --snap=5 \
    --resume=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-1024x1024.pkl

# Train StyleGAN2 for FFHQ at 1024x1024 resolution using 8 GPUs.
python train.py --outdir=~/training-runs --cfg=stylegan2 --data=~/datasets/ffhq-1024x1024.zip \
    --gpus=8 --batch=32 --gamma=10 --mirror=1 --aug=noaug

Note that the result quality and training time depend heavily on the exact set of options. The most important ones (--gpus, --batch, and --gamma) must be specified explicitly, and they should be selected with care. See python train.py --help for the full list of options and Training configurations for general guidelines & recommendations, along with the expected training speed & memory usage in different scenarios.

The results of each training run are saved to a newly created directory, for example ~/training-runs/00000-stylegan3-t-afhqv2-512x512-gpus8-batch32-gamma8.2. The training loop exports network pickles (network-snapshot-<KIMG>.pkl) and random image grids (fakes<KIMG>.png) at regular intervals (controlled by --snap). For each exported pickle, it evaluates FID (controlled by --metrics) and logs the result in metric-fid50k_full.jsonl. It also records various statistics in training_stats.jsonl, as well as *.tfevents if TensorBoard is installed.
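
The naming scheme in the example above can be sketched as a pair of format strings, which is handy when locating runs and snapshots from a script. The helper names are hypothetical, and the exact formatting (five-digit run index, six-digit kimg counter, compact gamma) is inferred from the example names in this README rather than taken from train.py.

```python
def run_dir_name(run_id: int, cfg: str, dataset_name: str,
                 gpus: int, batch: int, gamma: float) -> str:
    """Compose a run directory name in the style of the example above."""
    # ':g' drops trailing zeros, so 8.2 stays '8.2' and 10.0 becomes '10'.
    return (f"{run_id:05d}-{cfg}-{dataset_name}"
            f"-gpus{gpus:d}-batch{batch:d}-gamma{gamma:g}")


def snapshot_name(kimg: int) -> str:
    """Compose a network pickle name for a given kimg counter."""
    return f"network-snapshot-{kimg:06d}.pkl"


if __name__ == "__main__":
    print(run_dir_name(0, "stylegan3-t", "afhqv2-512x512", 8, 32, 8.2))
    print(snapshot_name(200))
```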

Quality metrics

By default, train.py automatically computes FID for each network pickle exported during training. We recommend inspecting metric-fid50k_full.jsonl (or TensorBoard) at regular intervals to monitor the training progress. When desired, the automatic computation can be disabled with --metrics=none to speed up the training slightly.
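
Inspecting the JSONL file by hand gets tedious for long runs, so a small script can pick out the best snapshot instead. This is a minimal sketch: best_snapshot is a hypothetical helper, and it assumes each line is a JSON object with a "results" mapping keyed by metric name and a "snapshot_pkl" field. Verify the layout against your own metric-fid50k_full.jsonl first. The sample values below are made up for illustration only.

```python
import json


def best_snapshot(jsonl_lines, metric="fid50k_full"):
    """Return (snapshot_pkl, value) for the lowest metric value, or None."""
    best = None
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        value = record["results"][metric]
        if best is None or value < best[1]:
            best = (record.get("snapshot_pkl"), value)
    return best


if __name__ == "__main__":
    # Illustrative lines only; these are not real training results.
    sample = [
        '{"results": {"fid50k_full": 30.0}, "snapshot_pkl": "network-snapshot-000000.pkl"}',
        '{"results": {"fid50k_full": 12.5}, "snapshot_pkl": "network-snapshot-000200.pkl"}',
    ]
    print(best_snapshot(sample))
```

An open file handle can be passed in place of the sample list, since the function only iterates over lines.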

Additional quality metrics can also be computed after the training:

# Previous training run: look up options automatically, save result to JSONL file.
python calc_metrics.py --metrics=eqt50k_int,eqr50k \
    --network=~/training-runs/00000-stylegan3-r-mydataset/network-snapshot-000000.pkl

# Pre-trained network pickle: specify dataset explicitly, print result to stdout.
python calc_metrics.py --metrics=fid50k_full --data=~/datasets/ffhq-1024x1024.zip --mirror=1 \
    --network=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-t-ffhq-1024x1024.pkl

The first example looks up the training configuration and performs the same operation as if --metrics=eqt50k_int,eqr50k had been specified during training. The second example downloads a pre-trained network pickle, in which case the values of --data and --mirror must be specified explicitly.

Note that the metrics can be quite expensive to compute (up to 1h), and many of them have an additional one-off cost for each new dataset (up to 30min). Also note that the evaluation is done using a different random seed each time, so the results will vary if the same metric is computed multiple times.

Recommended metrics:

  • fid50k_full: Fréchet inception distance[1] against the full dataset.
  • kid50k_full: Kernel inception distance[2] against the full dataset.
  • pr50k3_full: Precision and recall[3] against the full dataset.
  • ppl2_wend: Perceptual path length[4] in W, endpoints, full image.
  • eqt50k_int: Equivariance[5] w.r.t. integer translation (EQ-T).
  • eqt50k_frac: Equivariance w.r.t. fractional translation (EQ-Tfrac).
  • eqr50k: Equivariance w.r.t. rotation (EQ-R).

Legacy metrics:

  • fid50k: Fréchet inception distance against 50k real images.
  • kid50k: Kernel inception distance against 50k real images.
  • pr50k3: Precision and recall against 50k real images.
  • is50k: Inception score[6] for CIFAR-10.

References:

  1. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, Heusel et al. 2017
  2. Demystifying MMD GANs, Bińkowski et al. 2018
  3. Improved Precision and Recall Metric for Assessing Generative Models, Kynkäänniemi et al. 2019
  4. A Style-Based Generator Architecture for Generative Adversarial Networks, Karras et al. 2018
  5. Alias-Free Generative Adversarial Networks, Karras et al. 2021
  6. Improved Techniques for Training GANs, Salimans et al. 2016

Spectral analysis

The easiest way to inspect the spectral properties of a given generator is to use the built-in FFT mode in visualizer.py. In addition, you can visualize average 2D power spectra (Appendix A, Figure 15) as follows:

# Calculate dataset mean and std, needed in subsequent steps.
python avg_spectra.py stats --source=~/datasets/ffhq-1024x1024.zip

# Calculate average spectrum for the training data.
python avg_spectra.py calc --source=~/datasets/ffhq-1024x1024.zip \
    --dest=tmp/training-data.npz --mean=112.684 --std=69.509

# Calculate average spectrum for a pre-trained generator.
python avg_spectra.py calc \
    --source=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhq-1024x1024.pkl \
    --dest=tmp/stylegan3-r.npz --mean=112.684 --std=69.509 --num=70000

# Display results.
python avg_spectra.py heatmap tmp/training-data.npz
python avg_spectra.py heatmap tmp/stylegan3-r.npz
python avg_spectra.py slices tmp/training-data.npz tmp/stylegan3-r.npz

Average spectra screenshot

License

Copyright © 2021, NVIDIA Corporation & affiliates. All rights reserved.

This work is made available under the Nvidia Source Code License.

Citation

@inproceedings{Karras2021,
  author = {Tero Karras and Miika Aittala and Samuli Laine and Erik H\"ark\"onen and Janne Hellsten and Jaakko Lehtinen and Timo Aila},
  title = {Alias-Free Generative Adversarial Networks},
  booktitle = {Proc. NeurIPS},
  year = {2021}
}

Development

This is a research reference implementation and is treated as a one-time code drop. As such, we do not accept outside code contributions in the form of pull requests.

Acknowledgements

We thank David Luebke, Ming-Yu Liu, Koki Nagano, Tuomas Kynkäänniemi, and Timo Viitanen for reviewing early drafts and helpful suggestions. Frédo Durand for early discussions. Tero Kuosmanen for maintaining our compute infrastructure. AFHQ authors for an updated version of their dataset. Getty Images for the training images in the Beaches dataset. We did not receive external funding or additional revenues for this project.

stylegan3-fun's People

Contributors

ainaroca, jannehellsten, kgonia, nurpax, pdillis, tkarras, zibbezabbe

stylegan3-fun's Issues

error: XDG_RUNTIME_DIR not set in the environment. when using generate.py in Colab

I get the error: XDG_RUNTIME_DIR not set in the environment. when I attempt to run generate.py, even just the help function

The full error output is below. It doesn't seem to affect the results.

error: XDG_RUNTIME_DIR not set in the environment.
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib confmisc.c:855:(parse_card) cannot find card '0'
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_card_inum returned error: No such file or directory
ALSA lib confmisc.c:422:(snd_func_concat) error evaluating strings
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_concat returned error: No such file or directory
ALSA lib confmisc.c:1334:(snd_func_refer) error evaluating name
ALSA lib conf.c:5178:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5701:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2664:(snd_pcm_open_noupdate) Unknown PCM default
Usage: generate.py [OPTIONS] COMMAND [ARGS]...

RuntimeError: aten::grid_sampler_2d_backward()

Running train.py results in the following runtime error:
RuntimeError: aten::grid_sampler_2d_backward() is missing value for argument 'output_mask'. Declaration: aten::grid_sampler_2d_backward(Tensor grad_output, Tensor input, Tensor grid, int interpolation_mode, int padding_mode, bool align_corners, bool[2] output_mask) -> (Tensor, Tensor)

  • OS: Windows 11
  • PyTorch version 1.11
  • CUDA toolkit version 11.6
  • GPU: RTX 3080 Ti
  • Docker: No

Applying the changes from the commit below to conv2d_gradfix.py and grid_sample_gradfix.py fixes the issue:
NVlabs@407db86

convert_to_grayscale

I lost hours to this "trolling to Pillow" since I'm actually working with a grayscale SG network 😢
Is there a better way to name this or at least call attention to the fact this is for 3-channel models?
Thanks!

if convert_to_grayscale:
    image = image.convert('L').convert('RGB')  # We do a little trolling to Pillow (so we have a 3-channel image)

visualizer music synchronisation idea

Hello,
I've been thinking about how to animate StyleGAN to music in the visualizer, and it should be fairly easy.
I found this BPM detector on GitHub; it is quite fast and calculates the BPM in about 2 seconds:
https://github.com/scaperot/the-BPM-detector-python/tree/4156ea7ba0f0883ff8ff3fa52fd386aa93ff9478
The command for running it is:
python bpm_detection.py --filename song_name.wav

The native animation speed of the visualizer is 0.25 (4 seconds per fully new image). Converted to beats per minute, this is
60 (seconds) * 0.25 (anim speed) = 15 BPM
so an anim speed of 1 means one new image per second, i.e. 60 BPM.

To get the anim speed for a detected BPM (for example, BPM = 101.626):
101.626 (detected BPM) / 60 (seconds) = 1.6938 (anim speed)

So the calculation is
detected BPM / 60 seconds = anim speed
The 60 seconds stay constant because the unit is beats per minute, but the result can be multiplied or divided by powers of 2 to stay in sync while making the animation faster or slower (i.e. multiply the BPM-derived anim speed by 0.125, 0.25, 0.5, 1, 2, 4, or 8).

In the visualizer, a button could set the anim speed from the detected BPM, with a couple more buttons to multiply or divide it by 2, making the animation faster or slower while still matching the beat.

Let me know what you think. I saw that you were planning something like this in your TODO list, so I thought I'd drop it here :)
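The conversion described in this issue can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the post; the function names are made up here and are not part of the visualizer.

```python
def bpm_to_anim_speed(bpm: float, multiplier: float = 1.0) -> float:
    """Convert a detected tempo (beats per minute) into an animation speed.

    Uses the relationship from the post: an anim speed of 1.0 equals 60 BPM.
    `multiplier` is meant to be a power of two (0.125 ... 8), which changes
    the speed while keeping the animation aligned with the beat.
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return bpm / 60.0 * multiplier


def anim_speed_to_bpm(anim_speed: float) -> float:
    """Inverse mapping: the default speed of 0.25 corresponds to 15 BPM."""
    return anim_speed * 60.0


print(f"{bpm_to_anim_speed(101.626):.4f}")        # 1.6938, as in the post
print(f"{bpm_to_anim_speed(101.626, 0.5):.4f}")   # half speed, still on the beat
```

Keeping the multiplier as a separate argument mirrors the suggestion of extra buttons that halve or double the speed without re-running the BPM detection.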

Are rotated generated images a normal part of the training progression?

I have been training a GAN with the stylegan3-r configuration for about 7kimgs. I am using your fork and have followed advice from some of the issues on the main repo that you have commented on in the past.
I have been using these arguments: --gpus=8 --batch=32 --gamma=32 --aug=ada --augpipe=bg --target=0.8 --initstrength=[last training round augment score] --snap=10 --img-snap=10 --mirror=1 --metrics=none --resume-kimg=[foobar k imgs]

At a little over 5kimgs, the generated images began to be rotated 90 degrees to the right; after some time they transitioned to being rotated 90 degrees in the opposite direction. They are currently upside down.

Is this a normal part of the training progression, after which the generated images will return to the proper orientation, or is it some type of mode collapse or augmentation leak?

In case it's useful: the images were non-square, so white bars were added to each side to make them square.
The objects in the images are not perfectly symmetrical, especially since there are various high and low perspectives.

The images themselves look OK, but definitely not converged.

As StyleGAN3 training is significantly slower than StyleGAN2 training, I have put the training on hiatus until I know whether I can continue with this model or should start over from scratch.

When resuming an interrupted training run, the TensorBoard visualization seems to have a bug

The previous training run ended at round 140. I used the parameter "--resume-kimg=140" to continue training up to round 260, but the two TensorBoard log files do not match where they meet at round 140. What is the reason?
Or should I use "--resume-kimg=264" instead of 260 for my next follow-up run?

These are the parameters I used for training:
QQ screenshot 20230322214921

This is the TensorBoard display panel:
QQ screenshot 20230322215006

Unidentified AssertionError When Using 'projector.py'

Describe the bug
Unidentified AssertionError when I run the projector.py.

To Reproduce
Steps to reproduce the behavior:

  1. In the root directory of this project, execute this command: "python projector.py --network=my-pretrained-models/StyleGAN2-Ada-DEM1024-CLAHE.pkl --cfg=stylegan2 --target=targets/RiverValley.png"
  2. See error

Expected behavior
I don't know what I should expect to happen, but I definitely know there's something wrong.

Error Information
Setting up PyTorch plugin "bias_act_plugin"... /home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.3
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
projector.py:447: DeprecationWarning: LANCZOS is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.LANCZOS instead.
  target_pil = target_pil.resize((G.img_resolution, G.img_resolution), PIL.Image.LANCZOS)
Done.
Projecting in W latent space...
Starting from W midpoint using 10000 samples...
Setting up PyTorch plugin "upfirdn2d_plugin"... Done.
Traceback (most recent call last):
  File "projector.py", line 549, in <module>
    run_projection() # pylint: disable=no-value-for-parameter
  File "/home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "projector.py", line 456, in run_projection
    projected_w_steps, run_config = project(
  File "projector.py", line 178, in project
    synth_features = vgg16(synth_images, resize_images=False, return_lpips=True)
  File "/home/MYUSERID/anaconda3/envs/pytorch180-A100/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "", line 71, in forward
AssertionError

Environment

  • OS: CentOS7
  • PyTorch 1.7.1
  • CUDA 11.0
  • NVIDIA driver version - 470.82.01
  • NVIDIA driver version(CUDA) - 11.4
  • GPU NVIDIA A100

BTW, I use Slurm to submit my work to the lab's server. I have successfully trained on my own dataset. The dataset is not of human faces: the images are grayscale digital elevation maps (DEM) with a resolution of 1024x1024. The log does not identify the cause of this error. Any help in solving it is appreciated.

train.py failed to run

I tried to train on images with transparency using train.py from the argb branch, but it still fails. Here are the error messages; can you help me with this?

Setting up PyTorch plugin "bias_act_plugin"... Done.
Setting up PyTorch plugin "filtered_lrelu_plugin"... Done.

Generator                    Parameters  Buffers  Output shape       Datatype
---                          ---         ---      ---                ---
mapping.fc0                  262656      -        [32, 512]          float32
mapping.fc1                  262656      -        [32, 512]          float32
mapping                      -           512      [32, 16, 512]      float32
synthesis.input.affine       2052        -        [32, 4]            float32
synthesis.input              262144      1545     [32, 512, 36, 36]  float32
synthesis.L0_36_512.affine   262656      -        [32, 512]          float32
synthesis.L0_36_512          2359808     25       [32, 512, 36, 36]  float16
synthesis.L1_36_512.affine   262656      -        [32, 512]          float32
synthesis.L1_36_512          2359808     25       [32, 512, 36, 36]  float16
synthesis.L2_36_512.affine   262656      -        [32, 512]          float32
synthesis.L2_36_512          2359808     25       [32, 512, 36, 36]  float16
synthesis.L3_36_512.affine   262656      -        [32, 512]          float32
synthesis.L3_36_512          2359808     25       [32, 512, 36, 36]  float16
synthesis.L4_52_512.affine   262656      -        [32, 512]          float32
synthesis.L4_52_512          2359808     37       [32, 512, 52, 52]  float16
synthesis.L5_52_512.affine   262656      -        [32, 512]          float32
synthesis.L5_52_512          2359808     25       [32, 512, 52, 52]  float16
synthesis.L6_52_512.affine   262656      -        [32, 512]          float32
synthesis.L6_52_512          2359808     25       [32, 512, 52, 52]  float16
synthesis.L7_52_512.affine   262656      -        [32, 512]          float32
synthesis.L7_52_512          2359808     25       [32, 512, 52, 52]  float16
synthesis.L8_84_512.affine   262656      -        [32, 512]          float32
synthesis.L8_84_512          2359808     37       [32, 512, 84, 84]  float16
synthesis.L9_84_512.affine   262656      -        [32, 512]          float32
synthesis.L9_84_512          2359808     25       [32, 512, 84, 84]  float16
synthesis.L10_84_512.affine  262656      -        [32, 512]          float32
synthesis.L10_84_512         2359808     25       [32, 512, 84, 84]  float16
synthesis.L11_84_512.affine  262656      -        [32, 512]          float32
synthesis.L11_84_512         2359808     25       [32, 512, 84, 84]  float16
synthesis.L12_84_512.affine  262656      -        [32, 512]          float32
synthesis.L12_84_512         2359808     25       [32, 512, 84, 84]  float16
synthesis.L13_64_512.affine  262656      -        [32, 512]          float32
synthesis.L13_64_512         2359808     25       [32, 512, 64, 64]  float16
synthesis.L14_64_4.affine    262656      -        [32, 512]          float32
synthesis.L14_64_4           2052        1        [32, 4, 64, 64]    float16
synthesis                    -           -        [32, 4, 64, 64]    float32
---                          ---         ---      ---                ---
Total                        37768712    2432     -                  -

Setting up PyTorch plugin "upfirdn2d_plugin"... Done.

Discriminator  Parameters  Buffers  Output shape       Datatype
---            ---         ---      ---                ---
b64.fromrgb    2560        16       [32, 512, 64, 64]  float16
b64.skip       262144      16       [32, 512, 32, 32]  float16
b64.conv0      2359808     16       [32, 512, 64, 64]  float16
b64.conv1      2359808     16       [32, 512, 32, 32]  float16
b64            -           16       [32, 512, 32, 32]  float16
b32.skip       262144      16       [32, 512, 16, 16]  float16
b32.conv0      2359808     16       [32, 512, 32, 32]  float16
b32.conv1      2359808     16       [32, 512, 16, 16]  float16
b32            -           16       [32, 512, 16, 16]  float16
b16.skip       262144      16       [32, 512, 8, 8]    float16
b16.conv0      2359808     16       [32, 512, 16, 16]  float16
b16.conv1      2359808     16       [32, 512, 8, 8]    float16
b16            -           16       [32, 512, 8, 8]    float16
b8.skip        262144      16       [32, 512, 4, 4]    float16
b8.conv0       2359808     16       [32, 512, 8, 8]    float16
b8.conv1       2359808     16       [32, 512, 4, 4]    float16
b8             -           16       [32, 512, 4, 4]    float16
b4.mbstd       -           -        [32, 513, 4, 4]    float32
b4.conv        2364416     16       [32, 512, 4, 4]    float32
b4.fc          4194816     -        [32, 512]          float32
b4.out         513         -        [32, 1]            float32
---            ---         ---      ---                ---
Total          26489345    288      -                  -

Setting up augmentation...
Distributing across 1 GPUs...
Setting up training phases...
Exporting sample images...
Initializing logs...
Skipping tfevents export: No module named 'tensorboard'
Training for 10000 kimg...

C:\Users\John\Desktop\New\stylegan3-fun-rgba\training\augment.py:231: UserWarning: Specified kernel cache directory could not be created! This disables kernel caching. Specified directory is C:\Users\John\AppData\Local\Temp/torch/kernels. This warning will appear only once per process. (Triggered internally at  C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\jit_utils.cpp:860.)
  s = torch.exp2(torch.randn([batch_size], device=device) * self.scale_std)
Traceback (most recent call last):
  File "train.py", line 330, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "C:\Users\John\AppData\Local\Programs\Python\Python38\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\John\AppData\Local\Programs\Python\Python38\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\John\AppData\Local\Programs\Python\Python38\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\John\AppData\Local\Programs\Python\Python38\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "train.py", line 323, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train.py", line 92, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train.py", line 50, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "C:\Users\John\Desktop\New\stylegan3-fun-rgba\training\training_loop.py", line 279, in training_loop
    loss.accumulate_gradients(phase=phase.name, real_img=real_img, real_c=real_c, gen_z=gen_z, gen_c=gen_c, gain=phase.interval, cur_nimg=cur_nimg)
  File "C:\Users\John\Desktop\New\stylegan3-fun-rgba\training\loss.py", line 75, in accumulate_gradients
    gen_logits = self.run_D(gen_img, gen_c, blur_sigma=blur_sigma)
  File "C:\Users\John\Desktop\New\stylegan3-fun-rgba\training\loss.py", line 59, in run_D
    img = self.augment_pipe(img)
  File "C:\Users\John\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\John\Desktop\New\stylegan3-fun-rgba\training\augment.py", line 370, in forward
    raise ValueError('Image must be RGB (3 channels) or L (1 channel)')
ValueError: Image must be RGB (3 channels) or L (1 channel)

Training stalls when using multiple GPUs

I have been struggling to use 2 GPUs when training. After executing the command below, everything loads as usual, but then it stalls when reaching the training step. When I run the same command with --gpus=1, it runs perfectly.
python train.py --outdir=results --cfg=stylegan2 --metrics=None --data=escher-512.zip --kimg=5000 --gamma=10 --gpus=2 --batch=32 --batch-gpu=8 --resume=stylegan2-ffhq-512x512.pkl

I'm not running out of VRAM (2x Quadro RTX 5000 16GB) or RAM (32GB). Here is a screenshot where you can see both GPUs at 0% load for an extended time:
2023-04-04 16_04_10-Greenshot

I believe that both GPUs are set up correctly and StyleGAN2 should be able to use them both. Here is a screenshot after running:
nvidia-smi
2023-04-04 16_07_56-Window

I was doing some googling to see if anyone else has had a similar issue... And interestingly this recent issue over on the original repository seems to describe my problem precisely. Yet when I tried out the suggested fix then I still experienced the same problem as before with it stalling upon reaching the training step.

Am I missing some detail or is this a bug? Thanks!

PyTorch MPS Mac M1 Support

Is your feature request related to a problem? Please describe.
I'd like to be able to generate images using the Metal Performance Shaders (MPS) pytorch acceleration for M1 macs

Describe the solution you'd like
Run the generate.py script with device=mps

Describe alternatives you've considered
I've modified the code and am able to run the function, but the resulting image is completely gray. I'm curious if anyone else has tried and succeeded.

The issue seems to be in w_to_img -> G.synthesis(). The output of that function does not match the output when I run with device=cpu. Up until that point everything matched (for example, the output of get_w_from_seed was correct with mps).

The code changes were:

  1. Set device to mps if selected:
if torch.cuda.is_available() and device == 'cuda':
    device = torch.device('cuda')
elif torch.backends.mps.is_available() and device == 'mps':
    device = torch.device('mps')
else:
    device = torch.device('cpu')
  2. Ensure float32 conversion, for example:
z = z.astype(np.float32)
w = G.mapping(torch.from_numpy(z).to(device), None)

Render Video of the Internal Representations using gen_video.py

When using the StyleGAN3 interactive visualization tool, you can checkmark specific nodes to visualize the internal representations of the model. Here is an example -
https://github.com/NVlabs/stylegan3/blob/main/docs/stylegan3-teaser-1920x1006.png
https://nvlabs-fi-cdn.nvidia.com/_web/stylegan3/videos/video_8_internal_activations.mp4

But is it possible to specify and visualize these nodes using the gen_video.py script? I'm using StyleGAN3 within Google Colab and would like to render out video of the internal representations of a specific sequence of seeds.

Also, thank you for releasing this amazing fork! I've been using it to train on very small datasets (500 to 1500 images), so the added mirrorY attribute has been useful, along with the "stabilize-video" attribute. Here are some of my projects if you're curious.

I loaded pretrained weights for training, and their resolution matches my training set, but train.py reports an error. Training works fine without the pretrained weights; which file do I need to change?

Traceback (most recent call last):
File "train.py", line 369, in <module>
main() # pylint: disable=no-value-for-parameter
File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/miniconda3/lib/python3.8/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "train.py", line 362, in main
launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
File "train.py", line 94, in launch_training
torch.multiprocessing.spawn(fn=subprocess_fn, args=(c, temp_dir), nprocs=c.num_gpus)
File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in wrap
fn(i, *args)
File "/root/autodl-tmp/stylegan3-fun-main/train.py", line 50, in subprocess_fn
training_loop.training_loop(rank=rank, **c)
File "/root/autodl-tmp/stylegan3-fun-main/training/training_loop.py", line 163, in training_loop
misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
File "/root/autodl-tmp/stylegan3-fun-main/torch_utils/misc.py", line 162, in copy_params_and_buffers
tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 1
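
The final RuntimeError reports that dimension 1 has size 4 on one tensor and 3 on the other. One possible reading, which is an assumption since the report does not state the channel count, is that an RGB (3-channel) checkpoint is being copied into an RGBA (4-channel) model. A quick way to check would be to compare parameter shapes by name before resuming. The sketch below works on plain dictionaries of shapes with made-up entries; with PyTorch, such a dictionary could be built from the tensors in `module.state_dict()`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(src: Dict[str, Shape], dst: Dict[str, Shape]) -> List[str]:
    """List parameter names present in both dicts whose shapes differ."""
    return [name for name in dst if name in src and src[name] != dst[name]]


# Hypothetical shapes: an RGB checkpoint versus an RGBA model.
checkpoint_shapes = {
    "b64.fromrgb.weight": (512, 3, 1, 1),
    "b64.conv0.weight": (512, 512, 3, 3),
}
model_shapes = {
    "b64.fromrgb.weight": (512, 4, 1, 1),
    "b64.conv0.weight": (512, 512, 3, 3),
}

mismatched = find_shape_mismatches(checkpoint_shapes, model_shapes)
print(mismatched)  # names whose shapes disagree between checkpoint and model
```

If only the image input and output layers show up in such a list, the channel count of the dataset and of the checkpoint are the first things to compare.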

Would it be possible to render a video from projected W?

I would like to generate a video from a projected w vector and specify the number of frames for the interpolation. The current image generator offers the --projected-w option; however, there seems to be no equivalent for video. Is this currently possible?

Describe the solution you'd like
Project images as npz file (vs npy) -> combine multiple vectors into a single npz file -> generate interpolation video between projected images.

Describe alternatives you've considered
Resembles this generator - https://github.com/dvschultz/stylegan2-ada-pytorch/blob/main/generate.py, or the colab : https://colab.research.google.com/github/dvschultz/stylegan2-ada-pytorch/blob/main/SG2_ADA_PyTorch.ipynb#scrollTo=4cgezYN8Dsyh

@ !python generate.py --process=interpolation --interpolation=linear --easing=easeInOutQuad --space=w --network=/content/ladiesblack.pkl --outdir=/content/combined-proj/ --projected-w=/content/npz/combined.npz --frames=120

Amazing set of features! Thank you @PDillis

Specify latent space variables in generate.py

It was not clear to me how I could specify the latent space variables to generate.py, because it seems you can only provide a seed. Can I provide arguments or ranges to generate.py corresponding to the two latent space variables that one can control in visualizer.py?
Thank you & sorry if this is an obvious question.

"FileExistsError: [WinError 183] Cannot create a file when that file already exists" when resuming training

Describe the bug
My guess is that it tries to create the resume outdir twice: when I run the code, it does create the outdir in the folder, but then gives the error below, probably because it can't overwrite it or something like that.
I also tried leaving outdir empty, as your code creates the outdir automatically, but it gives the same error.

Expected behavior
Only create the outdir once, so the error does not occur.

Screenshots

Output directory:    C:\deepdream-test\stylegan3-fun\training-runs\00027-stylegan3-t-datasets-gpus1-batch8-gamma6.6-resume_custom
Number of GPUs:      1
Batch size:          8 images
Training duration:   25000 kimg
Dataset path:        C:\deepdream-test\stylegan3-fun\dataset22\datasets.zip
Dataset size:        5953 images
Dataset resolution:  512
Dataset labels:      False
Dataset x-flips:     True
Dataset y-flips:     False

Creating output directory...
Traceback (most recent call last):
  File "C:\deepdream-test\stylegan3-fun\train.py", line 324, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "C:\python\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\python\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\python\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\python\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "C:\deepdream-test\stylegan3-fun\train.py", line 317, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "C:\deepdream-test\stylegan3-fun\train.py", line 86, in launch_training
    os.makedirs(c.run_dir)
  File "C:\python\lib\os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\deepdream-test\\stylegan3-fun\\training-runs\\00027-stylegan3-t-datasets-gpus1-batch8-gamma6.6-resume_custom'
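
The traceback ends in `os.makedirs(c.run_dir)`, which raises `FileExistsError` when the directory is already there. As a minimal sketch of one way around that (not the repository's actual fix, and whether reusing an existing run directory is desirable is a separate decision), the standard-library `exist_ok=True` flag makes the call tolerate an existing directory. The helper name and the directory name below are made up for illustration.

```python
import os
import tempfile


def ensure_run_dir(run_dir: str) -> str:
    """Create run_dir if needed; do nothing if it already exists."""
    os.makedirs(run_dir, exist_ok=True)
    return run_dir


with tempfile.TemporaryDirectory() as base:
    run_dir = os.path.join(base, "training-runs", "00027-example")  # hypothetical name
    ensure_run_dir(run_dir)
    ensure_run_dir(run_dir)  # the second call does not raise FileExistsError
    created_ok = os.path.isdir(run_dir)

print(created_ok)
```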

thanks for fixing the keyerror: none and activating issues on your repo :)

some files in the zip get corrupted when trying to zip it

Describe the bug
some files in the zip get corrupted when trying to zip it
how can i use it without zip files?

tick 0     kimg 11252.0  time 1m 39s       sec/tick 22.9    sec/kimg 2864.63 maintenance 75.8   cpumem 4.66   gpumem 17.19  reserved 19.70  augment 11.202
tick 1     kimg 11256.0  time 17m 46s      sec/tick 953.9   sec/kimg 238.48  maintenance 13.3   cpumem 4.71   gpumem 14.83  reserved 18.82  augment 11.186
tick 2     kimg 11260.0  time 33m 31s      sec/tick 931.1   sec/kimg 232.77  maintenance 13.6   cpumem 4.71   gpumem 14.88  reserved 18.82  augment 11.170
tick 3     kimg 11264.0  time 49m 06s      sec/tick 922.5   sec/kimg 230.63  maintenance 13.2   cpumem 4.72   gpumem 14.85  reserved 18.82  augment 11.155
tick 4     kimg 11268.0  time 1h 04m 52s   sec/tick 932.2   sec/kimg 233.06  maintenance 13.8   cpumem 4.72   gpumem 15.01  reserved 18.82  augment 11.140
tick 5     kimg 11272.0  time 1h 20m 43s   sec/tick 936.9   sec/kimg 234.21  maintenance 13.8   cpumem 4.72   gpumem 14.96  reserved 18.82  augment 11.132
tick 6     kimg 11276.0  time 1h 36m 30s   sec/tick 933.1   sec/kimg 233.28  maintenance 13.6   cpumem 4.72   gpumem 14.76  reserved 18.82  augment 11.117
Traceback (most recent call last):
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\train.py", line 330, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\train.py", line 323, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\train.py", line 92, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\train.py", line 50, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\training\training_loop.py", line 260, in training_loop
    phase_real_img, phase_real_c = next(training_set_iterator)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\_utils.py", line 425, in reraise
    raise self.exc_type(msg)
zipfile.BadZipFile: Caught BadZipFile in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\training\dataset.py", line 97, in __getitem__
    image = self._load_raw_image(self._raw_idx[idx])
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\StyleGAN3\training\dataset.py", line 227, in _load_raw_image
    image = np.array(PIL.Image.open(f))
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\PIL\Image.py", line 719, in __array__
    new["data"] = self.tobytes()
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\PIL\Image.py", line 762, in tobytes
    self.load()
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\PIL\ImageFile.py", line 239, in load
    s = read(self.decodermaxblock)
  File "C:\Users\Gebruiker\AppData\Roaming\Visions of Chaos\Examples\MachineLearning\venv\voc_base\lib\site-packages\PIL\PngImagePlugin.py", line 921, in load_read
    self.fp.read(4)  # CRC
  File "C:\Python\lib\zipfile.py", line 922, in read
    data = self._read1(n)
  File "C:\Python\lib\zipfile.py", line 1012, in _read1
    self._update_crc(data)
  File "C:\Python\lib\zipfile.py", line 940, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '00006/img00006513.png'
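
Since the failure only shows up hours into training, it may help to verify the archive before starting a run. The sketch below uses the standard-library `zipfile.ZipFile.testzip()`, which reads every member and returns the name of the first one that fails its check. The helper name and the demo file names are made up; this does not explain why the archive was corrupted in the first place.

```python
import os
import tempfile
import zipfile
from typing import Optional


def first_corrupt_member(zip_path: str) -> Optional[str]:
    """Return the name of the first member that fails its CRC-32 check, or None."""
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() reads each member in full and verifies its checksum.
        return archive.testzip()


# Small self-check on a freshly written archive (contents are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "dataset.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("00000/img00000000.png", b"placeholder bytes, not a real PNG")
    healthy_result = first_corrupt_member(demo_zip)

print(healthy_result)  # None means no corrupt member was found
```

Running the helper on the dataset archive would be expected to return a name such as the one in the error above if that member is damaged on disk.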

Creating environment results in not being able to train

Describe the bug
Creating environment results in pytorch CPU being downloaded
Clip by openAI addition results in torch 1.7.1 being downloaded, unsure if that was cause for pytorch CPU version

To Reproduce
I ran "conda clean -a" and "pip cache purge",
then attempted to build the environment. After doing so, I could not train using
"python train.py --outdir=C:\AI\output\stylegan --cfg=stylegan3-r --data=C:\AI\data\data-512x512.zip --gpus=1 --batch=12 --gamma=8.2 --mirror=1"
or similar commands

Expected behavior
Running train.py should not error out.

Desktop (please complete the following information):

  • OS: Win 11
  • PyTorch version: 1.7.1
  • CUDA toolkit version 11.3
  • NVIDIA driver version 511.79
  • GPU RTX 3090
  • Docker: no
  • Anaconda: miniconda

Error when training Stylegan2-ext

When I try to start training using --cfg=stylegan2-ext then it errors out with the following message:

"TypeError: __init__() got an unexpected keyword argument 'extended_sgan2'"

is it possible to resume tick and augment too?

Describe the bug
The storage on the remote PC filled up, so training stopped, but it was already pretty far along.
I'd like to resume without the new generations coming out blurry.
Maybe it's possible using the log file?

tick 1587 kimg 6348.0 time 7d 08h 07m sec/tick 396.5 sec/kimg 99.12 maintenance 0.3 cpumem 6.35 gpumem 30.74 reserved 41.21 augment 34.266
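
A minimal sketch of pulling the tick, kimg and augment values out of a progress line like the one above, so they can be looked up from the log when resuming. The function name `parse_progress_line` is illustrative and not part of this repository; how the values are then passed back to `train.py` depends on the options your checkout exposes.

```python
import re

# One pattern per field; the lookbehind skips "sec/tick" and "sec/kimg".
_FIELDS = {
    "tick": (r"(?<!/)tick\s+(\d+)", int),
    "kimg": (r"(?<!/)kimg\s+([\d.]+)", float),
    "augment": (r"augment\s+([\d.]+)", float),
}


def parse_progress_line(line):
    """Return a dict with whichever of tick, kimg and augment appear in the line."""
    values = {}
    for name, (pattern, cast) in _FIELDS.items():
        match = re.search(pattern, line)
        if match:
            values[name] = cast(match.group(1))
    return values
```

Applied to the last progress line in the log, this gives the kimg count and augmentation strength the run had reached when it stopped.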

Multi-Modal Based Truncation

Hi! Thank you for your excellent work and summary!

I would like to know how to use multi-modal based truncation.

--data flag is telling me it's an invalid value because it's a directory?

Describe the bug
When using my run command: python train.py --outdir C:\Users\User\Documents\machinelearning\6\styleganfunresults --cfg=stylegan2 --data C:\Users\User\Documents\machinelearning\6\styleganfunganimages --gamma=1 --snap=3 --metrics=none --mbstd-group=20 --gpus=1 --batch=20

I get this error:

Usage: train.py [OPTIONS]
Try 'train.py --help' for help.

Error: Invalid value for '--data': File 'C:\Users\User\Documents\machinelearning\6\styleganfunganimages' is a directory.

(styleganfun) C:\Users\User\Documents\machinelearning\stylegan3-fun>

To Reproduce

I have 2k png images with transparent backgrounds and used the dataset_tool.py first with the below command.

python dataset_tool.py --source C:\Users\User\Documents\machinelearning\5\512croppedCopy --dest C:\Users\User\Documents\machinelearning\6\styleganfunganimages
Then I tried to train on that data with

python train.py --outdir C:\Users\User\Documents\machinelearning\6\styleganfunresults --cfg=stylegan2 --data C:\Users\User\Documents\machinelearning\6\styleganfunganimages --gamma=1 --snap=3 --metrics=none --mbstd-group=20 --gpus=1 --batch=20

and received the error above.

Expected behavior
It should accept a directory for --data. I'm not sure why it rejects one; even the option's help text in train.py says it can be a directory.

Desktop (please complete the following information):

  • OS: Windows 10
  • Python 3.8
  • CUDA toolkit 11.1
  • NVIDIA graphics driver 551.23
  • GPU: ASUS ROG Strix RTX 3090
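
One possible workaround, assuming the `--data` option in this checkout only accepts a file path (which is what the error message suggests), is to pack the prepared image folder into a zip archive and pass that instead. Below is a minimal sketch using only the standard library. The function name `pack_folder` is illustrative and not part of this repository, and I have not confirmed that this is the intended fix.

```python
import os
import zipfile


def pack_folder(src_dir, dest_zip):
    """Store every file under src_dir in dest_zip (uncompressed); return the file count.

    dest_zip should live outside src_dir so the archive does not include itself.
    """
    count = 0
    with zipfile.ZipFile(dest_zip, "w", zipfile.ZIP_STORED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for fname in sorted(files):
                full = os.path.join(root, fname)
                # Archive names are relative to src_dir and use forward slashes.
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                zf.write(full, arcname)
                count += 1
    return count
```

The resulting archive path would then be given to `--data` in place of the folder.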

Error running Circular Interpolation

I'm seeing an error when running the Circular Interpolation portion of the generate.py script. Check out the error pasted below.

I don't believe that I'm passing any illegal values or characters into the attributes. Am I missing something simple, or is it a bug?

Here is a Google Colab notebook showcasing the issue (logs included). Included at the bottom of the notebook is a proof-of-concept test using just the required attributes, and then also a more realistic use case scenario. Both return the same error.
https://colab.research.google.com/drive/1VADM8w2b9fSnO_25axSJB44hMY8jnCB2?usp=sharing

Traceback (most recent call last):
  File "generate.py", line 743, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "generate.py", line 704, in circular_video
    videoclip = moviepy.editor.VideoClip(make_frame, duration=duration_sec)
  File "/usr/local/lib/python3.7/dist-packages/moviepy/video/VideoClip.py", line 86, in __init__
    self.size = self.get_frame(0).shape[:2][::-1]
  File "<decorator-gen-10>", line 2, in get_frame
  File "/usr/local/lib/python3.7/dist-packages/moviepy/decorators.py", line 89, in wrapper
    return f(*new_a, **new_kw)
  File "/usr/local/lib/python3.7/dist-packages/moviepy/Clip.py", line 94, in get_frame
    return self.make_frame(t)
  File "generate.py", line 689, in make_frame
    dlatents = gen_utils.z_to_dlatent(G, latents, label, truncation_psi=1.0)  # Get the pure dlatent
TypeError: z_to_dlatent() got an unexpected keyword argument 'truncation_psi'
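
The `TypeError` above says the call passes a keyword that the function's signature does not declare. Below is a minimal sketch, using the standard `inspect` module, for checking which keywords a function accepts. The stub `z_to_dlatent_stub` is a hypothetical stand-in, since the real `gen_utils.z_to_dlatent` signature is not shown in this report.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    named = params.get(name)
    if named is not None and named.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for the helper named in the traceback.
def z_to_dlatent_stub(G, latents, label):
    return latents
```

Running `accepts_keyword(gen_utils.z_to_dlatent, "truncation_psi")` in the failing environment would show whether the helper and its caller in `generate.py` are out of sync.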

Freeze Mapping Network and affine transformation layer

As we know, freezing the mapping network and the affine transformation layers during the fine-tuning phase helps to better preserve semantics. The official repository only supports freezeD. I think adding freeze M and A would be useful for exploring the trained models without unnecessary headaches.

Start from pretrained at different resolution

Is your feature request related to a problem? Please describe.
Is it possible to load a pretrained model at a different resolution? I have a model pretrained at 512x512 and would like to start from it to train a new one at 256x256.

Describe the solution you'd like
Automatic recovery of previously trained layers, when they match

Describe alternatives you've considered
Resizing the images, but training at 512 requires too much time.
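
To make the request concrete, here is a rough sketch of "recover the layers that match": compare parameter names and shapes between the pretrained and the new network, and reuse only those that agree. It works on plain name-to-shape dictionaries. The function name `split_by_shape` and the layer names in the usage comment are illustrative assumptions, not this repository's API.

```python
def split_by_shape(src_shapes, dst_shapes):
    """Split the new network's parameter names into (reusable, fresh).

    A parameter is reusable when the pretrained network has the same name
    with exactly the same shape; everything else keeps its fresh initialization.
    """
    reusable = [name for name, shape in dst_shapes.items()
                if src_shapes.get(name) == shape]
    reusable_set = set(reusable)
    fresh = [name for name in dst_shapes if name not in reusable_set]
    return reusable, fresh


# Illustrative usage with made-up layer names and shapes:
# pretrained = {"b4.conv1.weight": (512, 512, 3, 3), "b512.torgb.weight": (3, 32, 1, 1)}
# new_model  = {"b4.conv1.weight": (512, 512, 3, 3), "b256.torgb.weight": (3, 64, 1, 1)}
# split_by_shape(pretrained, new_model)
```

With a real model the two dictionaries could be built from each network's named parameters, and only the reusable names would be copied across.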

dataset_tool.py has no output

I'm trying to use dataset_tool.py to pack an ARGB image into a dataset, but when the progress bar completes, I don't see any output for the path defined by --dest, can you help me?

Bug in conditioning of discriminator?

I'm pretty sure I wouldn't get any support in the official SG3 repo, because it looks abandoned; the issues in that repo mostly remain unanswered.
I noticed that you have provided some community support by answering some of the issues there, which I think is an important contribution to the StyleGAN community, so props to you.
I think this is the only place where this problem might be unraveled, and I thought you could shed some light on it.

Recently I tried to train a conditional model, and I'm super hyped about it: I've been playing with SG for quite a while already, and this is the first time I have tried a conditional model. The power that conditioning provides is just super cool to me.
Also, it turned out SG supports multiple labels out of the box, which was quite unexpected for me, and I'm even more hyped to try that out.

The SG/SG2/SG3 papers don't seem to have even a single word about conditioning, but the code has it.
I tried to find something related to it in the papers, but no luck.

Describe the bug
Everything related to the bug is already described here:
NVlabs#209

Thanks a lot in advance.

Is it possible to resume training a .pkl file on the same kimg with a new dataset of pictures?

Describe the bug
I tried doing this, but it gives me an error (see below).
When I resume at the same kimg with the original dataset of images, it doesn't give me this error.
I have checked that all the images are 1024px, and they are.
It seems to start training but fails after the first tick.

input code
python train.py --cfg=stylegan3-t --data=C:\deepdream-test\stylegan3-fun\dataset22\images\1024.zip --aug=ada --augpipe=bg --target=0.7 --gpus=1 --batch=8 --batch-gpu=8 --mbstd-group=8 --gamma=6.6 --mirror=1 --kimg=25000 --snap=1 --metrics=none --resume=C:\deepdream-test\stylegan3-fun\training-runs\network-snapshot-005832.pkl --resume-kimg=5832

error code

Setting up augmentation...
Distributing across 1 GPUs...
Setting up training phases...
Exporting sample images...
Initializing logs...
Training for 25000 kimg...

tick 0     kimg 5832.0   time 1m 34s       sec/tick 20.5    sec/kimg 2557.87 maintenance 73.5   cpumem 4.52   gpumem 16.10  reserved 19.92  augment 0.000
Traceback (most recent call last):
  File "c:\deepdream-test\stylegan3-fun\train.py", line 324, in <module>
    main()  # pylint: disable=no-value-for-parameter
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "c:\deepdream-test\stylegan3-fun\train.py", line 317, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "c:\deepdream-test\stylegan3-fun\train.py", line 95, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "c:\deepdream-test\stylegan3-fun\train.py", line 50, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "c:\deepdream-test\stylegan3-fun\training\training_loop.py", line 260, in training_loop
    phase_real_img, phase_real_c = next(training_set_iterator)
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\_utils.py", line 425, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Gebruiker\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "c:\deepdream-test\stylegan3-fun\training\dataset.py", line 99, in __getitem__
    assert list(image.shape) == self.image_shape
AssertionError
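
The failing assertion compares the full image shape, which includes the channel count as well as height and width. One possible cause (an assumption, not confirmed in this report) is that some images in the new dataset are RGBA or grayscale while the rest are RGB, even though all are 1024px. Below is a minimal standard-library sketch that reads only each PNG's header inside the zip and reports files whose shape differs from the most common one. The names `png_shape` and `find_odd_shapes` are illustrative.

```python
import struct
import zipfile
from collections import Counter

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG colour type -> channels (palette images counted as one index channel)
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_shape(header):
    """Return (channels, height, width) from the first 26 bytes of a PNG file."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return (CHANNELS[color_type], height, width)


def find_odd_shapes(archive):
    """Return (most_common_shape, {name: shape}) for PNGs that differ from it."""
    shapes = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".png"):
                with zf.open(name) as f:
                    shapes[name] = png_shape(f.read(26))
    if not shapes:
        return None, {}
    common = Counter(shapes.values()).most_common(1)[0][0]
    return common, {n: s for n, s in shapes.items() if s != common}
```

Any file listed in the second return value would trip the shape assertion once the loader reaches it, which could explain why training starts and then fails a little later.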

any way to get the visualizer to support importing stylegan3-nada .pkl files?

(https://github.com/rinongal/StyleGAN-nada/blob/StyleGAN3-NADA/stylegan3_nada.ipynb)
I'm making some tweaks to my StyleGAN3 network with this, but it seems I'm not able to import it into the visualizer.
Is there any way to get the visualizer to support these?
Image generation in this repo is a little wacky, but it's such a good tweak to StyleGAN; it would be great if it worked :)
There is a fix for converting stylegan2-nada files to stylegan2, but not for 3 yet :/

https://github.com/eps696/stylegan2ada (this is the one for stylegan2 that converts the .pt files to .pkl files again)
