nvlabs / denoising-diffusion-gan


Tackling the Generative Learning Trilemma with Denoising Diffusion GANs https://arxiv.org/abs/2112.07804

License: Other

Python 90.27% C++ 1.42% Cuda 8.31%

denoising-diffusion-gan's Introduction

Official PyTorch implementation of "Tackling the Generative Learning Trilemma with Denoising Diffusion GANs" (ICLR 2022 Spotlight Paper)

Zhisheng Xiao · Karsten Kreis · Arash Vahdat

Project Page

Generative denoising diffusion models typically assume that the denoising distribution can be modeled by a Gaussian distribution. This assumption holds only for small denoising steps, which in practice translates to thousands of denoising steps in the synthesis process. In our denoising diffusion GANs, we represent the denoising model using multimodal and complex conditional GANs, enabling us to efficiently generate data in as few as two steps.

Set up datasets

We trained on several datasets, including CIFAR10, LSUN Church Outdoor 256 and CelebA HQ 256. For large datasets, we store the data in LMDB format for I/O efficiency. Check here for information regarding dataset preparation.

Training Denoising Diffusion GANs

We use the following commands on each dataset for training denoising diffusion GANs.

CIFAR-10

We train Denoising Diffusion GANs on CIFAR-10 using four 32-GB V100 GPUs.

python3 train_ddgan.py --dataset cifar10 --exp ddgan_cifar10_exp1 --num_channels 3 --num_channels_dae 128 --num_timesteps 4 \
--num_res_blocks 2 --batch_size 64 --num_epoch 1800 --ngf 64 --nz 100 --z_emb_dim 256 --n_mlp 4 --embedding_type positional \
--use_ema --ema_decay 0.9999 --r1_gamma 0.02 --lr_d 1.25e-4 --lr_g 1.6e-4 --lazy_reg 15 --num_process_per_node 4 \
--ch_mult 1 2 2 2 --save_content

LSUN Church Outdoor 256

We train Denoising Diffusion GANs on LSUN Church Outdoor 256 using eight 32-GB V100 GPUs.

python3 train_ddgan.py --dataset lsun --image_size 256 --exp ddgan_lsun_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 4 \
--num_res_blocks 2 --batch_size 8 --num_epoch 500 --ngf 64 --embedding_type positional --use_ema --ema_decay 0.999 --r1_gamma 1. \
--z_emb_dim 256 --lr_d 1e-4 --lr_g 1.6e-4 --lazy_reg 10 --num_process_per_node 8 --save_content

CelebA HQ 256

We train Denoising Diffusion GANs on CelebA HQ 256 using eight 32-GB V100 GPUs.

python3 train_ddgan.py --dataset celeba_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 \
--num_res_blocks 2 --batch_size 4 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. \
--z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10  --num_process_per_node 8 --save_content

Pretrained Checkpoints

We have released pretrained checkpoints on CIFAR-10 and CelebA HQ 256 at this Google Drive directory. Download the saved_info directory into the code directory. Use --epoch_id 1200 for CIFAR-10 and --epoch_id 550 for CelebA HQ 256 in the commands below.

Evaluation

After training, samples can be generated by calling test_ddgan.py. We evaluate the models on a single V100 GPU. Below, --epoch_id specifies the checkpoint saved at a particular epoch. For models trained with the commands above, the command for generating samples on CIFAR-10 is

python3 test_ddgan.py --dataset cifar10 --exp ddgan_cifar10_exp1 --num_channels 3 --num_channels_dae 128 --num_timesteps 4 \
--num_res_blocks 2 --nz 100 --z_emb_dim 256 --n_mlp 4 --ch_mult 1 2 2 2 --epoch_id $EPOCH

The command for generating samples on CelebA HQ is

python3 test_ddgan.py --dataset celeba_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 \
--ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2  --epoch_id $EPOCH

The command for generating samples on LSUN Church Outdoor is

python3 test_ddgan.py --dataset lsun --image_size 256 --exp ddgan_lsun_exp1 --num_channels 3 --num_channels_dae 64 \
--ch_mult 1 1 2 2 4 4  --num_timesteps 4 --num_res_blocks 2  --epoch_id $EPOCH

We use the PyTorch implementation to compute FID scores; in particular, the code for computing FID is adapted from FastDPM.

To compute FID, run the same sampling commands as above with the additional arguments --compute_fid and --real_img_dir /path/to/real/images.
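
For reference, the Fréchet Inception Distance compares Gaussian fits to the Inception features of real and generated images. The standard definition is written out below. The symbols are notation introduced here, not taken from this repository: $\mu_r, \Sigma_r$ are the mean and covariance of the real-image features, and $\mu_g, \Sigma_g$ are the same statistics for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer.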

For Inception Score, save the samples in a single NumPy array with pixel values in the range [0, 255] and run

python ./pytorch_fid/inception_score.py --sample_dir /path/to/sampled_images

where the code for computing Inception Score is adapted from here.
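
A minimal sketch of packing generated samples into that format, assuming the generator outputs floats in [-1, 1] (consistent with the Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) transform quoted elsewhere on this page, but still an assumption). The names to_uint8_images and save_samples are illustrative, and the array layout expected by inception_score.py should be checked in that script before relying on this.

```python
import numpy as np

def to_uint8_images(samples):
    """Map float samples in [-1, 1] to uint8 pixel values in [0, 255]."""
    samples = np.asarray(samples, dtype=np.float32)
    scaled = (samples + 1.0) * 127.5              # [-1, 1] -> [0, 255]
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

def save_samples(samples, path):
    """Store all samples in one array file (path is chosen by the caller)."""
    np.save(path, to_uint8_images(samples))
```

Values outside [-1, 1] are clipped rather than wrapped, so an occasional overshoot from the generator does not produce corrupted pixels.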

For Improved Precision and Recall, follow the instructions here.

License

Please check the LICENSE file. Denoising diffusion GAN may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Bibtex

Cite our paper using the following BibTeX entry:

@inproceedings{
xiao2022tackling,
title={Tackling the Generative Learning Trilemma with Denoising Diffusion GANs},
author={Zhisheng Xiao and Karsten Kreis and Arash Vahdat},
booktitle={International Conference on Learning Representations},
year={2022}
}

Contributors

Denoising Diffusion GAN was built primarily by Zhisheng Xiao during a summer internship at NVIDIA research.


denoising-diffusion-gan's Issues

When will you release the code

Thanks for the interesting work!

Inconsistency of the initial channel number of the generator for LSUN Church.

Thanks for this wonderful work. What is the initial channel number of the generator for LSUN Church? In Table 6 of the paper it is 128, while in the README it is set to 64.

Single GPU Training

I wanted to plug and play the model for sampling, but I can only use a single GPU for the task. How can I modify the code for single-GPU training?
The following output is shown if I run the training file directly on, say, Google Colab.

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
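
The logs quoted in other issues on this page print "starting in debug mode" and call init_processes(0, size, train, args) directly when the script is started with --num_process_per_node 1, which suggests that a single-process run does not create subprocesses at all. The sketch below illustrates that dispatch pattern with the standard library only. launch and worker are hypothetical names, not the repository's functions; the multi-process branch uses the 'spawn' start method that the error message asks for.

```python
import multiprocessing as mp

def worker(rank, world_size, results=None):
    """Stand-in for a per-GPU training function."""
    message = f"rank {rank} of {world_size}"
    if results is not None:
        results.put(message)
    return message

def launch(fn, world_size):
    """Run fn in-process for one worker, otherwise in spawned subprocesses."""
    if world_size == 1:
        # Single GPU: no subprocess, so CUDA is never touched after a fork.
        return [fn(0, 1)]
    ctx = mp.get_context("spawn")  # start method required for CUDA in children
    results = ctx.Queue()
    procs = [ctx.Process(target=fn, args=(rank, world_size, results))
             for rank in range(world_size)]
    for proc in procs:
        proc.start()
    out = [results.get() for _ in procs]  # drain the queue before joining
    for proc in procs:
        proc.join()
    return sorted(out)

if __name__ == "__main__":
    print(launch(worker, 1))
```

If this reading of the logs is right, passing --num_process_per_node 1 together with a batch size that fits on one GPU may be enough; that has not been verified here.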

The code about 25Gaussian dataset?

Dear author,
It is quite an interesting work. However, I find that the MLP-based structure is far worse than the U-Net-based structure when I experiment with diffusion models. Could you share the training and testing code for your 25Gaussian dataset so that I can verify my idea?

Best regards!

Recall score of StyleGAN 2 w/ ADA

Thanks for the great work.

When using the provided script in https://github.com/NVlabs/stylegan2-ada-pytorch to calculate the recall score, StyleGAN 2 w/ ADA obtains 0.57. But your reported result in Table is only 0.49. Are there some differences?

Generating samples from different classes in LSUN

The current sample_from_model function does not include arguments for generating from specific classes. Is there a way to do that? Which parameters have to be changed? Thank you for your time.

Query: CelebA HQ 256

I am trying to implement the DDGAN paper. The authors ask users to refer to this repository for dataset preparation.

The following commands are suggested for downloading the dataset from OpenAI's Glow project:
mkdir -p $DATA_DIR/celeba
cd $DATA_DIR/celeba

wget https://storage.googleapis.com/glow-demo/data/celeba-tfr.tar
tar -xvf celeba-tfr.tar
cd $CODE_DIR/scripts

python convert_tfrecord_to_lmdb.py --dataset=celeba --tfr_path=$DATA_DIR/celeba/celeba-tfr --lmdb_path=$DATA_DIR/celeba/celeba-lmdb --split=train

python convert_tfrecord_to_lmdb.py --dataset=celeba --tfr_path=$DATA_DIR/celeba/celeba-tfr --lmdb_path=$DATA_DIR/celeba/celeba-lmdb --split=validation

However, when I run the commands, I get the following errors:
--2022-07-27 09:03:47-- https://storage.googleapis.com/glow-demo/data/celeba-tfr.tar
Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.4.128, 172.217.194.128, 142.251.10.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.4.128|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-07-27 09:03:48 ERROR 404: Not Found.

tar: celeba-tfr.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now

Prafulla Dhariwal has addressed some related errors in openai/glow/issues/1, for reference.

train question

I always get this error during training:
free(): invalid pointer
Aborted (core dumped)
I would like to ask if there is any solution

Training set

Hello,

Did you use the whole CIFAR-10 or CelebA dataset for training, or the predefined train/test split?

The NCCL error

$ python train_ddgan.py --dataset cifar10 --exp ddgan_cifar10_exp1 --num_channels 3 --num_channels_dae 128 --num_timesteps 4 --num_res_blocks 2 --batch_size 64 --num_epoch 1800 --ngf 64 --nz 100 --z_emb_dim 256 --n_mlp 4 --embedding_type positional --use_ema --ema_decay 0.9999 --r1_gamma 0.02 --lr_d 1.25e-4 --lr_g 1.6e-4 --lazy_reg 15 --num_process_per_node 1 --ch_mult 1 2 2 2 --save_content
starting in debug mode
Files already downloaded and verified
Traceback (most recent call last):
File "train_ddgan.py", line 564, in <module>
init_processes(0, size, train, args)
File "train_ddgan.py", line 470, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 265, in train
broadcast_params(netG.parameters())
File "train_ddgan.py", line 36, in broadcast_params
dist.broadcast(param.data, src=0)
File "/home/mapengsen/anaconda3/envs/37/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 1039, in broadcast
work = default_pg.broadcast([tensor], opts)
RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:825, unhandled system error, NCCL version 2.7.8
ncclSystemError: System call (socket, malloc, munmap, etc) failed.

When I run the CIFAR-10 training command, I get the NCCL error above. How can I solve it?
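
"unhandled system error" is a generic NCCL message, so a reasonable first step is to ask NCCL for more detail. The sketch below assumes the training command is launched from Python. NCCL_DEBUG is an NCCL environment variable that controls log verbosity; the helper name nccl_debug_env is illustrative and not part of this repository.

```python
import os

def nccl_debug_env(base=None, level="INFO"):
    """Return a copy of the environment with NCCL logging enabled."""
    env = dict(os.environ if base is None else base)
    env["NCCL_DEBUG"] = level  # e.g. "WARN" or "INFO"
    return env

# Usage sketch: pass the result as env= to subprocess.run(...) when
# starting train_ddgan.py, then read the extra NCCL lines in the output.
```

The additional log lines usually name the system call or network interface that failed, which narrows down whether the problem is in the machine setup rather than in the training code.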

Can this model's generation be controlled?

https://youtu.be/XCUlnHP1TNM?t=3305

This talks about how diffusion models can be used to do class conditional generation, inpainting and colorization. Can the diffusion GAN be used for that as well? Personally, I am interested in style transfer, could the model do that as well?

In the code, I think "t -> t+1" is correct. Can you explain it?

Hello.

I read your code and your paper (Figure 3), and I got confused.

I think t+1 is correct, because x_tp1 is obtained at t+1 (see the following code from train_ddgan.py).

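# extract(...) is a helper defined elsewhere in the repository; it is assumed to
# pick the coefficient for each timestep in t and reshape it to broadcast over x_start.
def q_sample(coeff, x_start, t, *, noise=None):
    # Sample x_t from x_0 in closed form using the cumulative coefficients at index t.
    if noise is None:
        noise = torch.randn_like(x_start)

    x_t = extract(coeff.a_s_cum, t, x_start.shape) * x_start + \
          extract(coeff.sigmas_cum, t, x_start.shape) * noise

    return x_t

def q_sample_pairs(coeff, x_start, t):
    # Return the pair (x_t, x_{t+1}); the one-step coefficients are read at index t+1.
    noise = torch.randn_like(x_start)
    x_t = q_sample(coeff, x_start, t)
    x_t_plus_one = extract(coeff.a_s, t+1, x_start.shape) * x_t + \
                   extract(coeff.sigmas, t+1, x_start.shape) * noise

    return x_t, x_t_plus_one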

Line 378: D_real = netD(x_t, t, x_tp1.detach()).view(-1) -> D_real = netD(x_t, t+1, x_tp1.detach()).view(-1)
Line 342: x_0_predict = netG(x_tp1.detach(), t, latent_z) -> x_0_predict = netG(x_tp1.detach(), t+1, latent_z)
Line 381: output = netD(x_pos_sample, t, x_tp1.detach()).view(-1) -> output = netD(x_pos_sample, t+1, x_tp1.detach()).view(-1)
Line 378: x_0_predict = netG(x_tp1.detach(), t, latent_z) -> x_0_predict = netG(x_tp1.detach(), t+1, latent_z)
Line 379: x_pos_sample = sample_posterior(pos_coeff, x_0_predict, x_tp1, t) -> x_pos_sample = sample_posterior(pos_coeff, x_0_predict, x_tp1, t+1)
Line 381: output = netD(x_pos_sample, t, x_tp1.detach()).view(-1) -> output = netD(x_pos_sample, t+1, x_tp1.detach()).view(-1)
Line 411~414:

x_0_predict = netG(x_tp1.detach(), t, latent_z)
x_pos_sample = sample_posterior(pos_coeff, x_0_predict, x_tp1, t)
output = netD(x_pos_sample, t, x_tp1.detach()).view(-1)

->
x_0_predict = netG(x_tp1.detach(), t+1, latent_z)
x_pos_sample = sample_posterior(pos_coeff, x_0_predict, x_tp1, t+1)
output = netD(x_pos_sample, t+1, x_tp1.detach()).view(-1)

I could be wrong. Can you explain which one is correct?
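
Written out, the two functions quoted above compute the following. The notation is introduced here for illustration: $\bar{a}_t$ stands for coeff.a_s_cum[t], $\bar{\sigma}_t$ for coeff.sigmas_cum[t], $a_{t+1}$ for coeff.a_s[t+1], $\sigma_{t+1}$ for coeff.sigmas[t+1], and $\epsilon, \epsilon'$ are independent standard Gaussian draws. This is a transcription of the snippet, not a statement about which indexing the authors intended.

```latex
x_t = \bar{a}_t \, x_0 + \bar{\sigma}_t \, \epsilon,
\qquad
x_{t+1} = a_{t+1} \, x_t + \sigma_{t+1} \, \epsilon'
```

So the coefficients that produce $x_{t+1}$ are read at index t+1, while the networks in the quoted training lines receive t as the time argument. Whether t there is meant to label the input $x_{t+1}$ or the target $x_t$ is exactly the convention the question asks about.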

ninja: build stopped: subcommand failed.

Is this something to do with the cuda version? The cuda version I am using is 10.2.
starting in debug mode
Traceback (most recent call last):
File "/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
env=env)
File "/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "train_ddgan.py", line 601, in <module>
init_processes(0, size, train, args)
File "train_ddgan.py", line 471, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 192, in train
from score_sde.models.discriminator import Discriminator_small, Discriminator_large
File "/data/songwei/YING/denoising-diffusion-gan/score_sde/models/discriminator.py", line 11, in <module>
from . import up_or_down_sampling
File "/data/songwei/YING/denoising-diffusion-gan/score_sde/models/up_or_down_sampling.py", line 15, in <module>
from score_sde.op import upfirdn2d
File "/data/songwei/YING/denoising-diffusion-gan/score_sde/op/__init__.py", line 1, in <module>
from .fused_act import FusedLeakyReLU, fused_leaky_relu
File "/data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_act.py", line 23, in <module>
os.path.join(module_path, "fused_bias_act_kernel.cu"),
File "/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1091, in load
keep_intermediates=keep_intermediates)
File "/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
is_standalone=is_standalone)
File "/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1407, in _write_ninja_file_and_build_library
error_prefix=f"Error building extension '{name}'")
File "/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused': [1/3] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output fused_bias_act_kernel.cuda.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/TH -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/THC -isystem /data/songwei/anacondaV100/envs/ddpmgan/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
/usr/bin/nvcc --generate-dependencies-with-compile --dependency-output fused_bias_act_kernel.cuda.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/TH -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/THC -isystem /data/songwei/anacondaV100/envs/ddpmgan/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
nvcc fatal : Unknown option '-generate-dependencies-with-compile'
[2/3] c++ -MMD -MF fused_bias_act.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/TH -isystem /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/THC -isystem /data/songwei/anacondaV100/envs/ddpmgan/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp -o fused_bias_act.o
In file included from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/c10/core/DeviceType.h:8:0,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/c10/core/Device.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/c10/core/Allocator.h:6,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/ATen.h:7,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/extension.h:4,
from /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:8:
/data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp: In function ‘at::Tensor fused_bias_act(const at::Tensor&, const at::Tensor&, const at::Tensor&, int, int, float, float)’:
/data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:14:42: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
#define CHECK_CUDA(x) TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
^
/data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:20:5: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(input);
^
In file included from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/Tensor.h:3:0,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/Context.h:4,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/ATen.h:9,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/extension.h:4,
from /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:8:
/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/core/TensorBody.h:303:30: note: declared here
DeprecatedTypeProperties & type() const {
^~~~
In file included from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/c10/core/DeviceType.h:8:0,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/c10/core/Device.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/c10/core/Allocator.h:6,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/ATen.h:7,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/extension.h:4,
from /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:8:
/data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:14:42: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
#define CHECK_CUDA(x) TORCH_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor")
^
/data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:21:5: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(bias);
^
In file included from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/Tensor.h:3:0,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/Context.h:4,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/ATen.h:9,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/torch/extension.h:4,
from /data/songwei/YING/denoising-diffusion-gan/score_sde/op/fused_bias_act.cpp:8:
/data/songwei/anacondaV100/envs/ddpmgan/lib/python3.6/site-packages/torch/include/ATen/core/TensorBody.h:303:30: note: declared here
DeprecatedTypeProperties & type() const {
^~~~
ninja: build stopped: subcommand failed.
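
The decisive line in the log above is "nvcc fatal : Unknown option '-generate-dependencies-with-compile'": the nvcc that was invoked (/usr/bin/nvcc) does not recognize a flag that PyTorch's extension builder passed to it. That usually points to an nvcc older than the toolkit PyTorch expects, although whether this is the cause on the reporter's machine is an assumption. The sketch below checks which nvcc is first on PATH and which release it reports; the helper names are illustrative.

```python
import re
import shutil
import subprocess

def parse_nvcc_release(version_text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None

def find_nvcc():
    """Return (path, release) for the nvcc found on PATH, or (None, None)."""
    path = shutil.which("nvcc")
    if path is None:
        return None, None
    result = subprocess.run([path, "--version"],
                            stdout=subprocess.PIPE, universal_newlines=True)
    return path, parse_nvcc_release(result.stdout)

if __name__ == "__main__":
    print(find_nvcc())
```

If the reported release differs from the CUDA version the installed PyTorch was built for, pointing the CUDA_HOME environment variable at the matching toolkit directory before launching is a common remedy, since torch.utils.cpp_extension consults it when locating nvcc.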

How should I put the location of dataset?

(th1.71) zzj@zzj:~/disk1/zzj/denoising-diffusion-gan-main$ python train_ddgan.py --dataset lsun
starting in debug mode
Traceback (most recent call last):
File "train_ddgan.py", line 600, in <module>
init_processes(0, size, train, args)
File "train_ddgan.py", line 470, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 226, in train
train_data = LSUN(root='/datasets/LSUN/', classes=['church_outdoor_train'], transform=train_transform)
File "/home/zzj/disk1/zzj/denoising-diffusion-gan-main/datasets_prep/lsun.py", line 95, in __init__
transform=transform))
File "/home/zzj/disk1/zzj/denoising-diffusion-gan-main/datasets_prep/lsun.py", line 33, in __init__
readahead=False, meminit=False)
lmdb.Error: /datasets/LSUN//church_outdoor_train_lmdb: No such file or directory
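
The traceback shows the dataset root passed as '/datasets/LSUN/' in train_ddgan.py and the loader looking for a directory named church_outdoor_train_lmdb beneath it. So either the LMDB directory has to be placed (or symlinked) there, or the root argument in that line has to be edited. A small sketch for checking the expected location before launching; expected_lsun_dir and check_lsun are hypothetical helpers, and the "<class>_lmdb" naming is inferred from the error message.

```python
import os

def expected_lsun_dir(root, class_name):
    """Directory the LSUN loader appears to open for a given class."""
    return os.path.join(root, class_name + "_lmdb")

def check_lsun(root="/datasets/LSUN/", class_name="church_outdoor_train"):
    """Return the expected path and whether it exists as a directory."""
    path = expected_lsun_dir(root, class_name)
    return path, os.path.isdir(path)

if __name__ == "__main__":
    path, found = check_lsun()
    print(path, "found" if found else "missing")
```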

AttributeError: 'EMA' object has no attribute '_optimizer_state_dict_pre_hooks'

Hello there,

First, thanks for the awesome work. I am trying to reproduce it on CIFAR-10, but I got the error below. I noticed that the parent class Optimizer has the field '_optimizer_state_dict_pre_hooks', which is not accessible in EMA.

Any tips will be greatly appreciated.

Epoch 001/1200 [3101/3125] -- errD: 1.4794 | errG: 0.9537 | errD_real: 0.7405 | errD_fake: 0.7389 -- ETA: 13 days, 23:38:04.064760
epoch 0 iteration3100, G Loss: 1.1456470489501953, D Loss: 1.4277989864349365
Epoch 001/1200 [3125/3125] -- errD: 1.4790 | errG: 0.9532 | errD_real: 0.7403 | errD_fake: 0.7387 -- ETA: 13 days, 23:31:45.533348
Saving content.
python-BaseException
Traceback (most recent call last):
File "/home/cruk/anaconda3/envs/dev_py310_pt211/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/home/cruk/anaconda3/envs/dev_py310_pt211/lib/python3.10/site-packages/torch/optim/optimizer.py", line 568, in state_dict
for pre_hook in self._optimizer_state_dict_pre_hooks.values():
AttributeError: 'EMA' object has no attribute '_optimizer_state_dict_pre_hooks'. Did you mean: 'register_state_dict_pre_hook'?
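
This kind of AttributeError typically appears when a subclass never runs its parent's constructor and a newer library version makes the parent rely on attributes set there. The traceback is consistent with EMA subclassing torch.optim.Optimizer without calling Optimizer.__init__, though that is an assumption not verified here. The toy classes below (all names hypothetical, no PyTorch involved) reproduce the pattern and show one possible workaround, which is to delegate state_dict to the wrapped object.

```python
class Base:
    """Stand-in for a library base class that gained a new attribute."""
    def __init__(self):
        self.state = {}
        self._pre_hooks = {}  # only exists if __init__ actually ran

    def state_dict(self):
        for hook in self._pre_hooks.values():
            hook(self)
        return {"state": self.state}

class BrokenWrapper(Base):
    def __init__(self, inner):
        # Base.__init__ is never called, so _pre_hooks does not exist.
        self.inner = inner
        self.state = inner.state

class DelegatingWrapper(BrokenWrapper):
    def state_dict(self):
        # Workaround: let the fully initialized inner object build the dict.
        return self.inner.state_dict()

inner = Base()
try:
    BrokenWrapper(inner).state_dict()
except AttributeError as err:
    print("broken:", err)
print("fixed:", DelegatingWrapper(inner).state_dict())
```

Whether delegation preserves everything the checkpoint needs in the real EMA class would have to be checked against the repository; pinning an older PyTorch release is the other obvious route.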

ImportError: DLL load failed while importing fused: The specified module could not be found.

When I run the command: python train_ddgan.py --dataset celeba_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2 --batch_size 4 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. --z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10 --num_process_per_node 1 --save_content to perform training in an Anaconda virtual environment, I get the error shown in the picture below:

(The Chinese in the last line is "The specified module could not be found.")

I don't know what causes this error and how to solve it, could anyone give me some suggestions to solve this error?

I have tried some solutions proposed for very similar problems reported in other repositories. Some of the reported causes are version mismatches between torch (PyTorch) and CUDA. Another reported cause involves 'ninja', as described in NVlabs/stylegan3#88, but I don't know which version of 'ninja' I should install (I currently have the latest version), and I'm not sure whether my problem is related to 'ninja' at all.
The libraries installed in my Anaconda virtual environment and their versions are shown below:

By the way, I only use a single GPU, a 12 GB GeForce RTX 4070.

I have tried upgrading PyTorch and cudatoolkit. However, whenever the PyTorch version is newer than 1.10.1, running the same training command gives me a different error, shown in the picture below. The new error is the same as the one in how-to-resolve-the-error-message-return-tcpstore-runtimeerror-unmatched, and I don't know how to solve it either.

In conclusion, to solve the error "DLL load failed while importing fused: The specified module could not be found", it seems I have to upgrade PyTorch, but to solve the error "return TCPStore( ) RuntimeError: unmatched '}' in format string", I have to downgrade it. The two fixes conflict with each other. I have tried many combinations of PyTorch and cudatoolkit versions, but none of them resolves both errors at once.

I mainly want to solve the error: ImportError: DLL load failed while importing fused: The specified module could not be found.
If anyone knows how to solve it, please give me some suggestions so that I can run training successfully.

Thanks a lot for anyone's help !!!
If you need, I will provide more details about my problem, thanks !

Training setting on Cifar

Training setting on repo is different from paper: nz=100, z_emb_dim=256.

Could you please tell me what is the correct setting? Thank you very much!

Why can't I run "from score_sde.models.discriminator import Discriminator_small, Discriminator_large"?

Hi everyone! My Python is 3.8 and torch is 1.8.
When "train_ddgan.py" reaches "from score_sde.models.discriminator import Discriminator_small, Discriminator_large", nothing happens and the program does not continue. Why can't I import successfully?
Please give me some help!

How to fine-tune the model on my own dataset?

Hello.
First of all, I want to thank you for your great work.
I'm really interested in your work and I want to fine-tune your pre-trained weights on my own dataset.
As far as I know, you have only published the weights for the NetG (generator) model. However, fine-tuning with your code requires content.pth, which holds the discriminator weights and some other state.
So could you please share content.pth with me?

Image to Image

Hi, can we use this code for image-to-image translation?

Use model for inpainting

Hi. Thanks for your great code. How can we use the trained model for inpainting? Is there any code available? The SDE model, for example, has code for inpainting, but I couldn't find anything like that in your repo.

Thanks

How can I train it on my own datasets?

If I use my own dataset, how can I train on it? Do I need to change the dataloader code?
And how can I train from an image folder rather than LMDB datasets?

I am very interested in your research and want to run some verification experiments with your pre-trained model. Could you please provide your pre-trained content.pth weights for CIFAR-10 and CelebA? Thank you for your excellent work.

FFHQ Training

Thank you for sharing the implementation of the DDGAN model. I am trying to train the model on FFHQ 256x256 dataset. I used the NVLabs/NVAE repository for the dataset preparation. I have the file structure as follows:

To use another dataset similar to CelebA-HQ 256x256, I modified the train function at line 190 of train_ddgan.py.

    elif args.dataset == 'ffhq_256':
        train_transform = transforms.Compose([
                transforms.Resize(args.image_size),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize((0.5,0.5,0.5), (0.5,0.5,0.5))
            ])
        dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)

My implementation for the DDGAN uses 4 NVIDIA GTX 1080ti GPUs with a total batch size of 32 for training the CelebA-HQ 256x256 dataset

(--batch_size 8 and --num_process_per_node 4)

I use the following command for training:
!python3 train_ddgan.py --dataset ffhq_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2 --batch_size 8 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. --z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10 --num_process_per_node 4 --save_content

I am getting the following output message:

Node rank 0, local proc 0, global proc 0
Node rank 0, local proc 1, global proc 1
Node rank 0, local proc 2, global proc 2
Node rank 0, local proc 3, global proc 3
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-3:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
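
A quick path check before launching training can narrow this kind of failure down. The traceback shows LMDBDataset opening train.lmdb under the given root, so the sketch below only verifies that this path exists. The helper name check_lmdb_root is hypothetical and not part of the repository, and the "train.lmdb" suffix is inferred from the error message rather than from the dataset code.

```python
import os


def check_lmdb_root(root):
    """Return the LMDB path that training would open and whether it exists.

    Assumption: the dataset class joins `root` with 'train.lmdb', as the
    path in the error message suggests.
    """
    lmdb_path = os.path.join(root, "train.lmdb")
    return lmdb_path, os.path.exists(lmdb_path)


if __name__ == "__main__":
    # Root taken from the modified dataset branch quoted above.
    path, found = check_lmdb_root("/datasets/ffhq-lmdb/")
    print(path, "found" if found else "missing")
```

If the path is reported missing, the root passed to LMDBDataset probably does not point at the directory that the dataset preparation script wrote to.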

Can fitting the DDIM data distribution with the conditional GANs from DDGAN further improve generation speed?

LSUN Church pre-trained models.

Thanks for this wonderful work. I wonder if you would release the LSUN Church pre-trained models? When would you release them?

Query: Evaluation Error

I am facing the following error while trying to evaluate using the model checkpoint for the CelebA-HQ 256x256 dataset.

Traceback (most recent call last):
  File "test_ddgan.py", line 272, in <module>
    sample_and_test(args)
  File "test_ddgan.py", line 186, in sample_and_test
    fake_sample = sample_from_model(pos_coeff, netG, args.num_timesteps, x_t_1,T,  args)
  File "test_ddgan.py", line 123, in sample_from_model
    x_0 = generator(x, t_time, latent_z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 282, in forward
    zemb = self.z_transform(z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

I have tried downgrading to torch==1.7.1, as suggested here and on the PyTorch discussion forums. However, the error is still there.

Unable to resolve a runtime error

[W socket.cpp:697] [c10d] The client socket has failed to connect to [DESKTOP-H11RS21]:6020 (system error: 10049 - The requested address is not valid in its context.).
Traceback (most recent call last):
File "D:\conda\envs\py39\lib\site-packages\torch\utils\cpp_extension.py", line 2096, in _run_ninja_build
subprocess.run(
File "D:\conda\envs\py39\lib\subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "E:\Azhe\denoising-diffusion-gan-main\train_ddgan.py", line 566, in <module>
init_processes(0, size, train, args)
File "E:\Azhe\denoising-diffusion-gan-main\train_ddgan.py", line 472, in init_processes
fn(rank, gpu, args)
File "E:\Azhe\denoising-diffusion-gan-main\train_ddgan.py", line 193, in train
from score_sde.models.discriminator import Discriminator_small, Discriminator_large
File "E:\Azhe\denoising-diffusion-gan-main\score_sde\models\discriminator.py", line 11, in <module>
from . import up_or_down_sampling
File "E:\Azhe\denoising-diffusion-gan-main\score_sde\models\up_or_down_sampling.py", line 15, in <module>
from score_sde.op import upfirdn2d
File "E:\Azhe\denoising-diffusion-gan-main\score_sde\op\__init__.py", line 1, in <module>
from .fused_act import FusedLeakyReLU, fused_leaky_relu
File "E:\Azhe\denoising-diffusion-gan-main\score_sde\op\fused_act.py", line 20, in <module>
fused = load(
File "D:\conda\envs\py39\lib\site-packages\torch\utils\cpp_extension.py", line 1306, in load
return _jit_compile(
File "D:\conda\envs\py39\lib\site-packages\torch\utils\cpp_extension.py", line 1710, in _jit_compile
_write_ninja_file_and_build_library(
File "D:\conda\envs\py39\lib\site-packages\torch\utils\cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
_run_ninja_build(
File "D:\conda\envs\py39\lib\site-packages\torch\utils\cpp_extension.py", line 2112, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused': [1/2] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc --generate-dependencies-with-compile --dependency-output fused_bias_act_kernel.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4068 -Xcompiler /wd4067 -Xcompiler /wd4624 -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -ID:\conda\envs\py39\lib\site-packages\torch\include -ID:\conda\envs\py39\lib\site-packages\torch\include\torch\csrc\api\include -ID:\conda\envs\py39\lib\site-packages\torch\include\TH -ID:\conda\envs\py39\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -ID:\conda\envs\py39\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 -std=c++17 -c E:\Azhe\denoising-diffusion-gan-main\score_sde\op\fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc --generate-dependencies-with-compile --dependency-output fused_bias_act_kernel.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4068 -Xcompiler /wd4067 -Xcompiler /wd4624 -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -ID:\conda\envs\py39\lib\site-packages\torch\include -ID:\conda\envs\py39\lib\site-packages\torch\include\torch\csrc\api\include -ID:\conda\envs\py39\lib\site-packages\torch\include\TH -ID:\conda\envs\py39\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -ID:\conda\envs\py39\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_89,code=compute_89 -gencode=arch=compute_89,code=sm_89 -std=c++17 -c E:\Azhe\denoising-diffusion-gan-main\score_sde\op\fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o

ninja: build stopped: subcommand failed.
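
One detail in the log above is that nvcc is taken from CUDA v10.1 while the build targets compute_89. As far as I know, sm_89 is only supported from CUDA 11.8 onward, so this pairing may be what makes the build fail, although the log does not show the compiler's own error message. The sketch below extracts both values from an nvcc command line. The function name and the small minimum-version table are assumptions made for illustration, not part of PyTorch or this repository.

```python
import re

# Minimum CUDA toolkit release able to target each compute capability.
# Illustrative subset only; consult NVIDIA's documentation for other GPUs.
MIN_CUDA_FOR_ARCH = {75: (10, 0), 80: (11, 0), 86: (11, 1), 89: (11, 8)}


def check_nvcc_command(cmd):
    """Parse an nvcc command line into (toolkit_version, [(arch, supported)]).

    `supported` is None when the toolkit version or the architecture's
    minimum version is unknown to this sketch.
    """
    match = re.search(r"CUDA[\\/]v(\d+)\.(\d+)", cmd)
    toolkit = (int(match.group(1)), int(match.group(2))) if match else None
    archs = sorted({int(a) for a in re.findall(r"code=sm_(\d+)", cmd)})
    report = []
    for arch in archs:
        needed = MIN_CUDA_FOR_ARCH.get(arch)
        if toolkit is None or needed is None:
            report.append((arch, None))
        else:
            report.append((arch, toolkit >= needed))
    return toolkit, report


if __name__ == "__main__":
    # Shortened version of the failing command from the log.
    cmd = (r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc "
           "-gencode=arch=compute_89,code=sm_89 -c fused_bias_act_kernel.cu")
    print(check_nvcc_command(cmd))
```

If the check flags the architecture as unsupported, installing a CUDA toolkit that matches both the GPU and the installed PyTorch build would be the first thing to try.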

Query: FFHQ 256x256

Respected sir
Thank you for sharing the implementation and weights for the DDGAN model. I am comparing the DDGAN model with other generative models for image generation. I wanted to train the model on the FFHQ 256x256 dataset. To get the 256x256 version of the dataset, one has to download the 1024x1024 version (the dataset preparation method is given in the NVIDIA NVAE repository). However, the FFHQ 1024x1024 dataset is almost 90 GB in size, which exceeds the limits of my current resources.

I thought of downloading the resized FFHQ 256x256 version from Kaggle, but I am not sure the pre-processing scripts will work with it. I humbly request you to guide me.

PS: I would be grateful if you could share the pre-trained DDGAN model on the FFHQ 256x256 dataset.

Continue Train option support

Hi! Thank you for your work! I would like to know whether the code supports a "continue training" option. When training stops unexpectedly, I want to continue training the model from the last checkpoint.

CelebA-HQ 256x256 Training

The training is done on CelebA-HQ 256x256 pre-processed as per the instructions given in the NVAE repository.

This implementation for the DDGAN uses 4 NVIDIA GTX 1080 TI GPUs with a total batch size of 32 for training the CelebA-HQ 256x256 dataset

(--batch_size 8 and --num_process_per_node 4)

I am using the following command for training:
!python3 train_ddgan.py --dataset celeba_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2 --batch_size 8 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. --z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10 --num_process_per_node 4 --save_content

I am getting the following output. I humbly request you to guide me.

Node rank 0, local proc 0, global proc 0
Node rank 0, local proc 1, global proc 1
Node rank 0, local proc 2, global proc 2
Node rank 0, local proc 3, global proc 3
<memory at 0x7f5d253c7040>
[... several hundred similar <memory at 0x...> lines omitted ...]
Process Process-3:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 390, in train
    x_0_predict = netG(x_tp1.detach(), t, latent_z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 367, in forward
    h = modules[m_idx](torch.cat([h, hs.pop()], dim=1), temb, zemb)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 279, in forward
    h = self.act(self.GroupNorm_0(x, zemb))
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 60, in forward
    out = self.norm(input)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 245, in forward
    return F.group_norm(
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/functional.py", line 2111, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps,
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 2; 10.92 GiB total capacity; 9.71 GiB already allocated; 151.50 MiB free; 10.06 GiB reserved in total by PyTorch)
Process Process-4:
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 390, in train
    x_0_predict = netG(x_tp1.detach(), t, latent_z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 367, in forward
    h = modules[m_idx](torch.cat([h, hs.pop()], dim=1), temb, zemb)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 279, in forward
    h = self.act(self.GroupNorm_0(x, zemb))
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 60, in forward
    out = self.norm(input)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 245, in forward
    return F.group_norm(
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/functional.py", line 2111, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps,
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 3; 10.92 GiB total capacity; 9.71 GiB already allocated; 151.50 MiB free; 10.06 GiB reserved in total by PyTorch)
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 390, in train
    x_0_predict = netG(x_tp1.detach(), t, latent_z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 367, in forward
    h = modules[m_idx](torch.cat([h, hs.pop()], dim=1), temb, zemb)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 279, in forward
    h = self.act(self.GroupNorm_0(x, zemb))
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 60, in forward
    out = self.norm(input)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 245, in forward
    return F.group_norm(
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/functional.py", line 2111, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps,
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 10.92 GiB total capacity; 9.71 GiB already allocated; 151.50 MiB free; 10.06 GiB reserved in total by PyTorch)
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 390, in train
    x_0_predict = netG(x_tp1.detach(), t, latent_z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 367, in forward
    h = modules[m_idx](torch.cat([h, hs.pop()], dim=1), temb, zemb)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 279, in forward
    h = self.act(self.GroupNorm_0(x, zemb))
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 60, in forward
    out = self.norm(input)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 245, in forward
    return F.group_norm(
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/functional.py", line 2111, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps,
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 1; 10.92 GiB total capacity; 9.71 GiB already allocated; 151.50 MiB free; 10.06 GiB reserved in total by PyTorch)

How to reproduce the results in Figure 8 from the paper?

Thank you for your work! I would like to know how to reproduce the results of adding noise and then removing it in Figure 8. I have attempted to add noise and remove it myself, but the results were not satisfactory. I am curious about the parameters you used. Thank you!

No module named 'fused'

Hello,
I am trying to run this repository, but I get the following error when I train the model.

Traceback (most recent call last):
  File "train_ddgan.py", line 609, in <module>
    init_processes(0, size, train, args)
  File "train_ddgan.py", line 478, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 192, in train
    from score_sde.models.discriminator import Discriminator_small, Discriminator_large
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/score_sde/models/discriminator.py", line 11, in <module>
    from . import up_or_down_sampling
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/score_sde/models/up_or_down_sampling.py", line 15, in <module>
    from score_sde.op import upfirdn2d
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/score_sde/op/__init__.py", line 1, in <module>
    from .fused_act import FusedLeakyReLU, fused_leaky_relu
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/score_sde/op/fused_act.py", line 23, in <module>
    os.path.join(module_path, "fused_bias_act_kernel.cu"),
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/env/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
    keep_intermediates=keep_intermediates)
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/env/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1317, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/home/ibakkaya/Desktop/denoising-diffusion-gan/env/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1699, in _import_module_from_library
    file, path, description = imp.find_module(module_name, [path])
  File "/home/ibakkaya/miniconda3/lib/python3.7/imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'fused'

Could you help me to solve this problem?

Thanks.
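
The traceback shows the ImportError being raised inside torch.utils.cpp_extension.load while it JIT-compiles fused_bias_act_kernel.cu. One possible cause (an assumption, not something confirmed in this thread) is that the build never produced an importable module, for example because part of the compiler toolchain is missing. A minimal sketch of a pre-flight check follows; the tool names nvcc, ninja and g++ are assumptions about a typical Linux setup, and the helper name is hypothetical.

```python
import shutil


def missing_build_tools(tools=("nvcc", "ninja", "g++")):
    """Return the tools from `tools` that cannot be found on PATH.

    The default list is an assumption about what a CUDA extension
    build usually needs on Linux; adjust it for your environment.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed build tools were found on PATH.")
```

If everything is present, removing PyTorch's cached extension build directory and rerunning may also be worth trying, since a partially built extension can leave the cache in a state where the module cannot be imported.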

Question about update netG

Hi, I have a question about the generator update step. From the code, I see that you disable gradients for the discriminator. I think this is neither correct nor necessary, because netD is not updated in this step anyway, is it? Other GAN implementations do not disable gradients like this.
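
To make the question concrete, here is a minimal PyTorch sketch of a generator step with the discriminator's parameters frozen. It is not the repository's training loop: the two nn.Linear modules are toy stand-ins for netG and netD, and the non-saturating loss is an assumption. It only shows that setting requires_grad = False on the discriminator's parameters does not stop gradients from flowing through the discriminator back to the generator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins (hypothetical); the real networks are far larger.
netG = nn.Linear(4, 4)
netD = nn.Linear(4, 1)

# Freeze the discriminator's parameters for the generator step.
for p in netD.parameters():
    p.requires_grad = False

z = torch.randn(8, 4)
fake = netG(z)

# Non-saturating generator loss (an assumption for this sketch).
loss = F.softplus(-netD(fake)).mean()
loss.backward()

# Gradients reach the generator even though netD is frozen.
g_has_grad = all(p.grad is not None for p in netG.parameters())
# No gradients are computed or stored for the frozen discriminator.
d_has_grad = any(p.grad is not None for p in netD.parameters())
print(g_has_grad, d_has_grad)
```

Freezing is therefore an optimization rather than a correctness requirement: autograd skips computing parameter gradients for netD, and no stale gradients accumulate in it. Leaving requires_grad enabled and zeroing netD's gradients before its own update would produce the same generator update.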

CelebA-HQ 256x256 Data Pre-processing

Thank you, team, for sharing the project resources. I am trying to process the CelebA-HQ 256x256 dataset for the DDGAN model. The DDGAN repository recommends following the dataset preparation instructions in the NVAE repository.

The following commands will download tfrecord files from GLOW and convert them to store them in an LMDB dataset.

Use the link provided by openai/glow to download the CelebA-HQ 256x256 dataset (4 GB).
Converting the CelebA-HQ 256x256 dataset to an LMDB dataset requires a module called "tfrecord".
The missing-module error can be fixed by running pip install tfrecord.

!mkdir -p $DATA_DIR/celeba
%cd $DATA_DIR/celeba
!wget https://openaipublic.azureedge.net/glow-demo/data/celeba-tfr.tar
!tar -xvf celeba-tfr.tar
%cd $CODE_DIR/scripts
!pip install tfrecord
!python convert_tfrecord_to_lmdb.py --dataset=celeba --tfr_path=$DATA_DIR/celeba/celeba-tfr --lmdb_path=$DATA_DIR/celeba/celeba-lmdb --split=train

The final command !python convert_tfrecord_to_lmdb.py --dataset=celeba --tfr_path=$DATA_DIR/celeba/celeba-tfr --lmdb_path=$DATA_DIR/celeba/celeba-lmdb --split=train gives the following output:

.
.
.
26300
26400
26500
26600
26700
26800
26900
27000
added 27000 items to the LMDB dataset.
Traceback (most recent call last):
  File "convert_tfrecord_to_lmdb.py", line 73, in <module>
    main(args.dataset, args.split, args.tfr_path, args.lmdb_path)
  File "convert_tfrecord_to_lmdb.py", line 58, in main
    print('added %d items to the LMDB dataset.' % count)
lmdb.Error: mdb_txn_commit: Disk quota exceeded

I am not sure whether the LMDB dataset was created properly. Could you please guide me?
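
A rough size estimate may help judge whether the quota is simply too small. The sketch below assumes each of the 27000 images is stored as raw 8-bit RGB at 256x256 (an assumption about the conversion script, which is not shown here) and ignores LMDB's own overhead; the function name is hypothetical.

```python
def raw_dataset_bytes(num_images, height, width, channels=3, bytes_per_value=1):
    """Bytes needed to store the images uncompressed, excluding LMDB overhead."""
    return num_images * height * width * channels * bytes_per_value


total_bytes = raw_dataset_bytes(num_images=27000, height=256, width=256)
total_gib = total_bytes / 1024**3
print(f"{total_bytes} bytes, about {total_gib:.2f} GiB")
```

Under these assumptions the training split alone needs roughly 5 GiB, on top of the downloaded tar file and the extracted tfrecord files. Because the error is raised by mdb_txn_commit, the write transaction probably did not complete, so it seems safer to free up space (or point --lmdb_path at a location with a larger quota) and rerun the conversion rather than rely on the existing LMDB file.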

Question about training

Hi,
Thanks for your work! I am wondering what the "training set" is in the experiments.
For example, CIFAR-10 has 50k training images and 10k test images, but it seems that all of them could be used for generative tasks.
On the other hand, the paper mentions that FID is computed with the 50k training set as the reference batch.
So which set should I use for training? Is there a standard convention? I am a little confused.
Thanks!

Invert an image to latent space and recover it

How can I invert an image into the latent space and record the latent, and then reconstruct the same image from that latent vector?

Can anyone help me with how to do this?

Thanks.

Best wishes.

How to test a model trained on StackedMNIST?

I trained a model on StackedMNIST dataset with train_ddgan.py until I got satisfying samples. Now I'd like to test that model with test_ddgan.py but it seems that stackedMNIST is not supported for testing, so I don't know how to do it.

Can fitting the denoising data distribution of DDIM using conditional GAN networks in DDGANs further improve the generation speed?

Hello! I have a question that I hope to discuss with you.
DDGANs abandon the assumption that the denoising distribution is Gaussian and use a conditional GAN to model this denoising distribution.
Accelerated samplers for DDPM, such as DDIM (which only modifies the sampling algorithm), also define a denoising distribution, based on a non-Markovian chain. Can the conditional GANs in DDGANs fit the denoising distribution of DDIM, and would this further improve the generation speed?
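
For readers following this question, the two distributions can be written side by side. The first line is the implicit denoising distribution DDGAN defines through its generator, as described in the paper; the second is the non-Markovian posterior from the DDIM paper, written here with the cumulative-product notation \bar\alpha_t, which is a notational assumption of this sketch. It is meant as a reference for the discussion and does not settle whether combining the two would speed up sampling.

```latex
% DDGAN: denoising distribution defined implicitly by the generator G_\theta
p_\theta(x_{t-1} \mid x_t)
  = \int p(z)\, q\bigl(x_{t-1} \mid x_t,\; x_0 = G_\theta(x_t, z, t)\bigr)\, dz

% DDIM: non-Markovian posterior with free variance parameter \sigma_t
q_\sigma(x_{t-1} \mid x_t, x_0)
  = \mathcal{N}\!\Bigl(
      \sqrt{\bar\alpha_{t-1}}\, x_0
      + \sqrt{1 - \bar\alpha_{t-1} - \sigma_t^2}\,
        \frac{x_t - \sqrt{\bar\alpha_t}\, x_0}{\sqrt{1 - \bar\alpha_t}},\;
      \sigma_t^2 I
    \Bigr)
```

In both cases the network's job is to predict x_0 from x_t, and a fixed Gaussian posterior maps that prediction to x_{t-1}. The question above amounts to asking whether the DDIM posterior could take the place of q inside the DDGAN integral.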

RuntimeError: CUDA out of memory.

I have tried sampling/evaluating/testing the model on Google Colab as well as on a local GPU node, but I get a CUDA out of memory error in both cases.
Error on Google Colab:

Traceback (most recent call last):
  File "test_ddgan.py", line 272, in <module>
    sample_and_test(args)
  File "test_ddgan.py", line 186, in sample_and_test
    fake_sample = sample_from_model(pos_coeff, netG, args.num_timesteps, x_t_1,T,  args)
  File "test_ddgan.py", line 123, in sample_from_model
    x_0 = generator(x, t_time, latent_z)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/Repositories/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 322, in forward
    h = modules[m_idx](hs[-1], temb, zemb)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/Repositories/denoising-diffusion-gan/score_sde/models/layerspp.py", line 300, in forward
    h = self.act(self.GroupNorm_1(h, zemb))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/MyDrive/Repositories/denoising-diffusion-gan/score_sde/models/layerspp.py", line 61, in forward
    out = gamma * out + beta
RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB (GPU 0; 14.76 GiB total capacity; 12.95 GiB already allocated; 887.75 MiB free; 12.95 GiB reserved in total by PyTorch)

Error on GPU node:

Traceback (most recent call last):
  File "test_ddgan.py", line 272, in <module>
    sample_and_test(args)
  File "test_ddgan.py", line 186, in sample_and_test
    fake_sample = sample_from_model(pos_coeff, netG, args.num_timesteps, x_t_1,T,  args)
  File "test_ddgan.py", line 123, in sample_from_model
    x_0 = generator(x, t_time, latent_z)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 322, in forward
    h = modules[m_idx](hs[-1], temb, zemb)
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 279, in forward
    h = self.act(self.GroupNorm_0(x, zemb))
  File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 61, in forward
    out = gamma * out + beta
RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB (GPU 0; 10.76 GiB total capacity; 6.70 GiB already allocated; 3.06 GiB free; 6.70 GiB reserved in total by PyTorch)

In both cases, the system somehow couldn't allocate 3.12 GiB.
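
The two error messages already contain the numbers needed to see why the allocation fails. The sketch below parses them and compares the requested size with the free memory reported; the helper names are hypothetical, and the regular expressions assume the message format shown above.

```python
import re

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def to_mib(value, unit):
    """Convert a size string with a MiB/GiB unit to MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def oom_shortfall_mib(message):
    """Return (requested, free, shortfall) in MiB from a CUDA OOM message."""
    requested = re.search(r"Tried to allocate ([\d.]+) (MiB|GiB)", message)
    free = re.search(r"([\d.]+) (MiB|GiB) free", message)
    if requested is None or free is None:
        raise ValueError("message does not match the expected OOM format")
    req_mib = to_mib(*requested.groups())
    free_mib = to_mib(*free.groups())
    return req_mib, free_mib, req_mib - free_mib


colab_msg = (
    "RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB "
    "(GPU 0; 14.76 GiB total capacity; 12.95 GiB already allocated; "
    "887.75 MiB free; 12.95 GiB reserved in total by PyTorch)"
)
node_msg = (
    "RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB "
    "(GPU 0; 10.76 GiB total capacity; 6.70 GiB already allocated; "
    "3.06 GiB free; 6.70 GiB reserved in total by PyTorch)"
)

for name, msg in (("colab", colab_msg), ("node", node_msg)):
    req, free, short = oom_shortfall_mib(msg)
    print(f"{name}: requested {req:.2f} MiB, free {free:.2f} MiB, short by {short:.2f} MiB")
```

On both machines the single 3.12 GiB request is larger than the free memory reported, so the allocation cannot succeed even though the node is close. A usual first step is to generate fewer samples per forward pass, since the size of this activation tensor likely scales with the sampling batch size.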