
Comments (5)

yang-song commented on June 14, 2024

Very good observation! The convergence of DSM is plagued by large variance and becomes very slow for small sigma. This is a known issue, but it can be alleviated by control variates (see https://arxiv.org/abs/2101.03288 for an example). In our experiments we do DSM across multiple noise scales and didn't observe slowed convergence, since many of the noise scales are large sigmas.
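For concreteness, a minimal sketch of DSM across multiple noise scales might look like the following. The `score_net(x_tilde, idx)` interface and all names here are illustrative assumptions, not the exact code we use:

```python
import torch

def multiscale_dsm_loss(score_net, x, sigmas):
    """Denoising score matching averaged over random noise scales.

    sigmas: 1-D tensor of noise levels on the same device as x.
    score_net(x_tilde, idx) is an assumed interface for a
    noise-conditional score model, not the repo's exact API.
    """
    # pick a random noise level per sample
    idx = torch.randint(len(sigmas), (x.shape[0],), device=x.device)
    sigma = sigmas[idx]                               # (B,)
    sigma_b = sigma.view(-1, *([1] * (x.dim() - 1)))  # broadcastable over x
    z = torch.randn_like(x)
    x_tilde = x + sigma_b * z
    # score of the Gaussian perturbation kernel q(x_tilde | x): -z / sigma
    target = -z / sigma_b
    s = score_net(x_tilde, idx)
    sq_err = ((s - target) ** 2).flatten(1).sum(dim=1)
    # weight by sigma^2 so every scale contributes comparably (as in NCSN)
    return (sq_err * sigma ** 2).mean()
```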


cheind commented on June 14, 2024

Ah ok, I was already planning for variance reduction methods :) For larger sigmas everything seems much smoother - I observed that as well. I wonder whether the runtime advantage of DSM over ISM isn't eaten up again by slower convergence? After all, for ISM we only need the trace of the Jacobian, which should be faster to compute than the entire Jacobian (if frameworks like PyTorch supported such an operation). I already have a quite fast version (limited to specific NN architectures) here:

https://github.com/cheind/diffusion-models/blob/189fbf545f07be0f8f9c42bc803016b846602f3c/diffusion/jacobians.py#L5
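For comparison, a generic exact version needs one backward pass per input dimension, which is exactly why it gets expensive in high dimensions. A naive sketch for flat `(B, D)` inputs (not the optimized, architecture-specific code linked above):

```python
import torch

def jacobian_trace(score_net, x):
    """Exact trace of d score / d x via autograd: one backward pass
    per input dimension, so cost grows linearly with D."""
    x = x.requires_grad_(True)
    s = score_net(x)  # (B, D)
    trace = torch.zeros(x.shape[0], device=x.device)
    for i in range(x.shape[1]):
        # gradient of the i-th score output w.r.t. all inputs
        grad_i = torch.autograd.grad(s[:, i].sum(), x, create_graph=True)[0]
        trace = trace + grad_i[:, i]  # accumulate the diagonal entry
    return trace
```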


yang-song commented on June 14, 2024

The trace of the Jacobian is still very expensive to compute. That said, there are methods like sliced score matching that do not add noise and are not affected by these variance issues. I tried them for training score-based models before; they gave decent performance, but didn't seem to outperform DSM.
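For reference, a minimal sketch of the (variance-reduced) sliced score matching objective, assuming flat `(B, D)` inputs and an illustrative `score_net` interface - each random projection trades the exact Jacobian trace for a single backward pass:

```python
import torch

def ssm_loss(score_net, x, n_projections=1):
    """Sliced score matching (variance-reduced form): estimate the
    Jacobian trace term with random projections v^T (dscore/dx) v."""
    x = x.requires_grad_(True)
    s = score_net(x)  # (B, D)
    sliced = 0.0
    for _ in range(n_projections):
        v = torch.randn_like(s)
        # scalar so one backward pass yields v^T J for every sample
        sv = (s * v).sum()
        gv = torch.autograd.grad(sv, x, create_graph=True)[0]
        sliced = sliced + (gv * v).flatten(1).sum(dim=1)  # v^T J v
    loss = sliced / n_projections
    loss = loss + 0.5 * (s ** 2).flatten(1).sum(dim=1)  # VR second term
    return loss.mean()
```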


cheind commented on June 14, 2024

Yes, very true when the data dimension becomes large. I was thinking about (low-rank) approximations to the Jacobian and came across this paper:

Abdel-Khalik, Hany S., et al. "A low rank approach to automatic differentiation." Advances in Automatic Differentiation. Springer, Berlin, Heidelberg, 2008. 55-65.

which is quite dated. But after skimming it, the idea seems connected to your sliced SM approach: it is as if sliced score matching computed a low-rank approximation of the Jacobian.
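That intuition can be made precise: for projection vectors with zero mean and identity covariance, the sliced term is Hutchinson's trace estimator, with each slice probing the Jacobian along one random direction (a rank-one view):

```latex
\mathbb{E}_{v}\left[v^{\top} J v\right]
  = \mathbb{E}_{v}\left[\operatorname{tr}\!\left(J\,v v^{\top}\right)\right]
  = \operatorname{tr}\!\left(J\,\mathbb{E}_{v}[v v^{\top}]\right)
  = \operatorname{tr}(J)
  \quad\text{when}\quad \mathbb{E}[v v^{\top}] = I.
```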

Ok, thanks for your valuable time and have a nice Saturday.


cheind commented on June 14, 2024

I've recreated your toy example to compare Langevin and annealed Langevin sampling. In particular, I did not use the exact scores but trained a toy model to perform score prediction. The results are below. In the first figure, the right plot shows default Langevin sampling (model trained unconditionally), with the expected issues. The next figure (again the right plot) shows annealed Langevin sampling as proposed in your paper (model trained conditioned on the noise level). The results are as expected, but I had to change one particular thing to make it work:

  • The noise levels range over [2..0.01] instead of the [20..1] mentioned in the paper. I tried the original settings, but a sigma of 20 basically gives a flat space, and this led to particles flying off in all kinds of directions.

I believe the difference is due to the inexactness of the model's predictions and, of course, to potential hidden errors in my code. Would you agree?

[figure: default_langevin]
[figure: annealed_langevin]
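The sampler I used follows Algorithm 1 of the paper; roughly (with an assumed `score_net(x, i)` interface for the noise-conditional model):

```python
import torch

@torch.no_grad()
def annealed_langevin(score_net, x, sigmas, steps_per_level=100, eps=2e-5):
    """Annealed Langevin dynamics, sketching Algorithm 1 of the NCSN paper.

    sigmas: 1-D tensor of noise levels sorted from largest to smallest.
    score_net(x, i): assumed interface returning the score at level i.
    """
    for i, sigma in enumerate(sigmas):
        # step size shrinks with the noise level: alpha_i = eps * sigma_i^2 / sigma_L^2
        alpha = eps * (sigma / sigmas[-1]) ** 2
        for _ in range(steps_per_level):
            z = torch.randn_like(x)
            x = x + 0.5 * alpha * score_net(x, i) + alpha.sqrt() * z
    return x
```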

