
Comments

forever208 commented on August 23, 2024

q(x_0) stands for the whole data distribution, i.e., your training dataset.
We cannot express q(x_0) explicitly (we do not know whether q(x_0) is Gaussian or some other distribution); we can only draw samples from it.

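To make this concrete, here is a minimal sketch (my own illustration, not DDPM-IP code) of what "sampling x_0 ~ q(x_0)" means in practice: the only access we have to q(x_0) is drawing minibatches from the training set.

```python
import numpy as np

def sample_x0(dataset: np.ndarray, batch_size: int) -> np.ndarray:
    """Draw a minibatch of x_0 from the empirical data distribution q(x_0)."""
    idx = np.random.randint(0, len(dataset), size=batch_size)
    return dataset[idx]

# Toy stand-in for a training set (e.g. CIFAR-10-sized images scaled to [-1, 1]).
dataset = np.random.uniform(-1, 1, size=(50000, 32, 32, 3)).astype(np.float32)
x0 = sample_x0(dataset, batch_size=128)  # x0.shape == (128, 32, 32, 3)
```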

forever208 commented on August 23, 2024

> I see, thank you! And in $\textbf{x}_t \sim q(\textbf{x}_t|\textbf{x}_{t-1})=\mathcal{N}(\textbf{x}_t;\sqrt{1-\beta_t}\textbf{x}_{t-1},\beta_t \textbf{I})$, the function $q(\textbf{x}_t|\textbf{x}_{t-1})$ is the conditional distribution of $\textbf{x}_t$ given $\textbf{x}_{t-1}$. Am I right? Thanks.

yes


forever208 commented on August 23, 2024

> Thank you. Also, for example, I'm working on the CIFAR-10 dataset. Then the dimension of $\textbf{x}_0, \cdots, \textbf{x}_t$ is 32×32×3, right?

yes

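Putting the last two answers together: each forward step samples from the Gaussian $q(\textbf{x}_t|\textbf{x}_{t-1})$ by the reparameterization mean + std × noise, and every $\textbf{x}_t$ keeps the data shape (32×32×3 for CIFAR-10). A minimal numpy sketch, my own illustration rather than the repo's implementation:

```python
import numpy as np

def forward_step(x_prev: np.ndarray, beta_t: float) -> np.ndarray:
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I) by reparameterization."""
    noise = np.random.randn(*x_prev.shape)  # eps ~ N(0, I)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

x_prev = np.random.randn(32, 32, 3)      # a CIFAR-10-shaped x_{t-1}
x_t = forward_step(x_prev, beta_t=1e-2)  # x_t.shape == (32, 32, 3), same as x_{t-1}
```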

forever208 commented on August 23, 2024

> Thank you! I wonder what you mean by "each FID value is computed using T = 1000 sampling steps". Does it imply diffusion_steps 1000 in the code below? Thanks.

> mpirun python scripts/image_sample.py \
> --image_size 32 --timestep_respacing 100 \
> --model_path PATH_TO_CHECKPOINT \
> --num_channels 128 --num_head_channels 32 --num_res_blocks 3 --attention_resolutions 16,8 \
> --resblock_updown True --use_new_attention_order True --learn_sigma True --dropout 0.3 \
> --diffusion_steps 1000 --noise_schedule cosine --use_scale_shift_norm True --batch_size 256 --num_samples 50000

> In Figure 3 of your paper, you calculated FID scores using T = 1000 sampling steps.

The command above uses 100 sampling steps; Figure 3 will be updated in the paper later.


forever208 commented on August 23, 2024

> Got it. Could you tell me which parameter determines the number of sampling steps in the code below? Thank you.

> mpirun python scripts/image_sample.py \
> --image_size 32 --timestep_respacing 100 \
> --model_path PATH_TO_CHECKPOINT \
> --num_channels 128 --num_head_channels 32 --num_res_blocks 3 --attention_resolutions 16,8 \
> --resblock_updown True --use_new_attention_order True --learn_sigma True --dropout 0.3 \
> --diffusion_steps 1000 --noise_schedule cosine --use_scale_shift_norm True --batch_size 256 --num_samples 50000

--timestep_respacing 100


forever208 commented on August 23, 2024

> Thank you. I'm still a little confused about the notations in the paper. You mentioned in the paper "When training, we always use T=1000 steps for all the models. At inference time, the results reported with T′<T sampling steps have been obtained using the respacing technique." So here T=1000 refers to diffusion_steps 1000, and T′ refers to the parameter timestep_respacing. Am I right? Thanks.

yes

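So T is fixed by --diffusion_steps at training time, while --timestep_respacing selects a shorter subsequence of T′ steps at sampling time (setting --timestep_respacing 1000 would sample with all steps). As a rough sketch of the idea, here is a stripped-down even-spacing rule; the codebase inherits the full space_timesteps logic from OpenAI's diffusion repos, and this simplified version is only for intuition:

```python
def respace_timesteps(T: int, T_prime: int) -> list:
    """Pick roughly T' evenly spaced timesteps out of the original T."""
    stride = T / T_prime
    return sorted({round(i * stride) for i in range(T_prime)})

# Trained with T = 1000; --timestep_respacing 100 samples on T' = 100 of those steps.
steps = respace_timesteps(1000, 100)
print(steps[:5], len(steps))  # [0, 10, 20, 30, 40] 100
```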



KevinWang676 commented on August 23, 2024

Thanks! By the way, I'm trying to train a new model on the MNIST dataset using your code, and I've created a notebook that downloads MNIST and converts the training set to an .npz file. The only issue is that the images in my .npz file have dimension 28×28×3, whereas the default dimension of MNIST images is 28×28×1.

So I wonder whether this discrepancy will affect the training of DDPM-IP. Here is the Colab notebook. Thank you.

[Screenshots: the dimension of my npz file vs. the default dimension.]

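One way to avoid the mismatch at the source is to keep MNIST's native single channel when building the .npz. A hedged sketch using torchvision (not the notebook's exact code; whether the DDPM-IP loader reads this .npz directly depends on your data pipeline):

```python
import numpy as np
from torchvision.datasets import MNIST

# Download MNIST and keep the raw grayscale images: shape (60000, 28, 28), uint8.
mnist = MNIST(root="./data", train=True, download=True)
images = mnist.data.numpy()

# Add a channel axis instead of replicating to RGB: (60000, 28, 28, 1).
images = images[..., None]
np.savez("mnist_train.npz", images)  # stored under the default key "arr_0"
print(images.shape)                  # (60000, 28, 28, 1)
```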


forever208 commented on August 23, 2024

> Hi, I also wonder what the number of total_batch_size is when you were training on the CelebA dataset. I guess total_batch_size is 8*16=128 since there are two nodes. And how long does the training process take? Thank you.

> The code for CelebA 64x64 training:

> mpiexec -n 16  python scripts/image_train.py --input_pertub 0.1 \
> --data_dir PATH_TO_DATASET \
> --image_size 64 --use_fp16 True --num_channels 192 --num_head_channels 64 --num_res_blocks 3 \
> --attention_resolutions 32,16,8 --resblock_updown True --use_new_attention_order True \
> --learn_sigma True --dropout 0.1 --diffusion_steps 1000 --noise_schedule cosine --use_scale_shift_norm True \
> --rescale_learned_sigmas True --schedule_sampler loss-second-moment --lr 1e-4 --batch_size 16

total_batch_size = 8*16 = 128 is correct. Training CelebA takes 4-5 days using 16 V100 GPUs.

