Comments (4)

Kim-Dongjun avatar Kim-Dongjun commented on July 19, 2024

It seems that I forgot to include the entropy term when computing the "lossless" bpd.
The entropy for eps=1e-5 with VPSDE is -7.92 and the offset term is 7, so we should add 0.92 to the bpds reported in my paper to obtain the lossless bpd with the entropy term, which is unacceptably bad. Please check the newly committed "evaluation.py" and "likelihood.py" for the update. We removed the reported bpd value from paperswithcode to avoid giving colleagues a misleading bpd report. I also recognize that my paper writes the formula misleadingly (in Eq. (4)), and I will modify it to include the entropy term. Thank you for your contribution.
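For reference, the correction arithmetic described above can be sketched as below; the values are taken from this comment and the function name is hypothetical:

```python
# Correction for the missing entropy term, using the values quoted above.
entropy_bpd = -7.92  # entropy term for eps=1e-5 under VPSDE
offset_bpd = 7.0     # offset term used in the original evaluation

# The reported bpds omitted the entropy term, so the lossless bpd is
# reported - entropy - offset = reported + 0.92.
def lossless_bpd(reported_bpd):
    return reported_bpd - entropy_bpd - offset_bpd

print(lossless_bpd(3.0))  # 3.92, i.e. 0.92 bpd worse than the reported value
```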

The original purpose of the lossless bpd was to compute the true bpd value, because the uniformly dequantized bpd is too weak. I believe that variational dequantization is not a good approach to develop further because it is highly unstable to train (I spent almost 3 months training the flow network for variational dequantization, and with almost all of the released flow networks I failed due to unstable and suboptimal training; I believe many other people also fail to obtain the variational bpd). Therefore, a method to compute the true bpd of a checkpoint is strongly needed.

Any help with this? I think my computation is wrong because the bpd is too high, and I would appreciate it if anyone could help me compute the lossless bpd correctly.

from soft-truncation.

baofff avatar baofff commented on July 19, 2024

I will try to figure it out.


LuChengTHU avatar LuChengTHU commented on July 19, 2024

How's it going?


Kim-Dongjun avatar Kim-Dongjun commented on July 19, 2024

@LuChengTHU
In contrast to the discrete case, the continuous diffusion model seems to perform worse when we apply lossless compression. Thus, up to now, variational dequantization is the best way to improve the NLL. We used the variational network suggested by DenseFlow (implemented in PyTorch), and we trained this variational network "after" training the score network, using the training loss of ScoreFlow. See #3 for details.

