
soft-truncation's People

Contributors

kim-dongjun


soft-truncation's Issues

Issue on computing log-likelihood (bpd)

There are three ways to compute bpd for a trained model:

  1. Uniform dequantization: the most popular, but yields the worst bpd.
  2. Variational dequantization: quite popular, yields a great bpd.
  3. Lossless: how bpd is computed for VAEs and discrete diffusion models; yields a great bpd.
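For concreteness, here is a minimal sketch of the uniform-dequantization estimate (option 1). The function name and the `log_p_fn` interface are hypothetical, not from this repo: `log_p_fn` stands in for the trained model's continuous log-density on [0, 1)^D in nats, summed over dimensions.

```python
import numpy as np

def bpd_uniform_dequantization(log_p_fn, x_int, n_samples=1):
    """Estimate bits/dim of integer data via uniform dequantization.

    log_p_fn : callable mapping a batch of points in [0, 1)^D to their
               log-density in nats (summed over dims), shape (batch,).
    x_int    : integer pixels in {0, ..., 255}, shape (batch, ...).
    """
    d = np.prod(x_int.shape[1:])  # dimensions per example
    estimates = []
    for _ in range(n_samples):
        u = np.random.uniform(size=x_int.shape)  # u ~ U[0, 1)
        y = (x_int + u) / 256.0                  # dequantize, rescale to [0, 1)
        # Jensen lower bound plus change of variables for the /256 rescaling:
        # log P(x) >= E_u[log p(y)] - D * log(256)
        estimates.append(log_p_fn(y) - d * np.log(256.0))
    log_p_lb = np.mean(estimates, axis=0)
    return -log_p_lb / (d * np.log(2.0))  # nats -> bits per dimension
```

A handy sanity check: with a perfectly uniform model density on [0, 1)^D (log-density zero everywhere), this evaluates to exactly 8 bits/dim.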

I have spent a few months training the flow network for variational dequantization, but with almost all the released and well-established codebases, I failed to train it successfully. Here, "successful" means that variational dequantization decreases the bpd relative to uniform dequantization by around 0.10–0.14, as Song reported in his paper (https://arxiv.org/pdf/2101.09258.pdf). I only get a ~0.02 gain with Song's original flow network, and a ~0.05 gain with the best-performing implementation.

After this experience, I decided to stop delving into training the unstable flow network. Instead, I focused on the lossless computation following Ho's original DDPM paper. However, it turned out that the lossless bpd is significantly worse than the bpd with uniform dequantization, and we suspect our code is wrong.
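For reference, the lossless decoder term in Ho's DDPM integrates a Gaussian over each quantization bin, with data scaled to [-1, 1] so each bin has half-width 1/255 and the edge bins absorb the tails. A minimal NumPy sketch of that discretized Gaussian log-likelihood (the function name and shapes are mine, not from any particular codebase):

```python
import numpy as np
from scipy.special import erf

def discretized_gaussian_log_likelihood(x, mean, log_scale):
    """Log-likelihood of pixel values x (integers rescaled to [-1, 1])
    under a Gaussian whose mass is integrated over each quantization bin,
    as in Ho et al.'s DDPM decoder. All arguments share one shape."""
    centered = x - mean
    inv_std = np.exp(-log_scale)
    # Gaussian CDF at the upper and lower bin edges (bin half-width 1/255)
    cdf_plus = 0.5 * (1.0 + erf((centered + 1.0 / 255) * inv_std / np.sqrt(2.0)))
    cdf_minus = 0.5 * (1.0 + erf((centered - 1.0 / 255) * inv_std / np.sqrt(2.0)))
    # edge bins (x = -1 and x = +1) absorb the remaining tail mass
    cdf_delta = np.where(x < -0.999, cdf_plus,
                np.where(x > 0.999, 1.0 - cdf_minus, cdf_plus - cdf_minus))
    return np.log(np.maximum(cdf_delta, 1e-12))
```

A quick correctness check for this kind of code: for any fixed mean and scale, the probabilities of all 256 bins must sum to one, because the edge bins soak up the tails. If a lossless bpd implementation fails that check, the decoder term (rather than the ELBO terms) is the likely culprit.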

The problem is that we cannot find anything wrong in our code. If anyone has succeeded with the lossless computation, please let me know your know-how.

As always, reviewers do not consider this effort, but it is extremely unfair to compare a bpd computed with uniform dequantization against prior works that computed theirs with variational dequantization. This unfair comparison could be a potential cause of paper rejection, so we are left investing our precious time in training the variational flow network until it succeeds. This is a huge waste of time for fellow researchers. If the lossless bpd computation were successful and stable, it would allow fair comparison with prior works without any flow training. So please, let's find a way to compute the lossless bpd that is as cheap as the bpd with uniform dequantization.
