Comments (4)

angeloskath commented on August 16, 2024

Hi,

Thanks for taking the time and for your good words! I will address your questions one by one:

  1. Regarding Figure 3, you are correct: the comment should read "almost all methods provide some speedup".
  2. In Figure 6, for clarity (eight lines in one graph are hard to read), we compare only with SVRG-type methods. Although they are designed to improve upon the uniform baseline, a nice takeaway from the paper is that, for highly non-convex training of large models, they actually perform worse than plain SGD with momentum.

Regarding further explanation of the results in Figure 6: the SVRG-based methods require two gradient evaluations per parameter update, plus a certain number of additional gradient evaluations every m iterations. The benefits are reaped mostly in the final stages of training, where almost zero variance is needed. But for large models such as the WRN-28-2 used in the paper, the overhead is too high and the constant factors overwhelm the asymptotic improvement. In the literature there is barely any comparison of SVRG methods with momentum SGD, precisely because momentum SGD sets a very competitive baseline.
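To make the overhead concrete, here is a minimal sketch that counts per-sample gradient evaluations for textbook SVRG and for plain SGD on a toy 1-D least-squares problem. It is not the code or the setup used in the paper: the functions `svrg` and `sgd`, the toy data, the batch size of 1, and the choice of a full pass over the data at every snapshot are all assumptions made for illustration.

```python
import random


def svrg(xs, ys, lr=0.1, epochs=5, m=10, seed=0):
    """Textbook SVRG on f(w) = mean(0.5 * (w*x_i - y_i)^2).

    Returns (w, number of per-sample gradient evaluations).
    """
    rng = random.Random(seed)
    n = len(xs)
    evals = 0

    def grad(w, i):
        # Gradient of the i-th sample loss; every call is counted.
        nonlocal evals
        evals += 1
        return (w * xs[i] - ys[i]) * xs[i]

    w = 0.0
    for _ in range(epochs):
        # Snapshot: one pass over the data, i.e. n evaluations (assumed here).
        w_snap = w
        full_grad = sum(grad(w_snap, i) for i in range(n)) / n
        for _ in range(m):
            i = rng.randrange(n)
            # Two evaluations per update: current point and snapshot.
            g = grad(w, i) - grad(w_snap, i) + full_grad
            w -= lr * g
    return w, evals


def sgd(xs, ys, lr=0.1, updates=50, seed=0):
    """Plain SGD on the same problem: one evaluation per update."""
    rng = random.Random(seed)
    n = len(xs)
    evals = 0
    w = 0.0
    for _ in range(updates):
        i = rng.randrange(n)
        w -= lr * (w * xs[i] - ys[i]) * xs[i]
        evals += 1
    return w, evals


# Hypothetical toy data: y = 2 * x, no noise.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x for x in xs]

# Both runs perform 50 parameter updates.
_, svrg_evals = svrg(xs, ys, epochs=5, m=10)
_, sgd_evals = sgd(xs, ys, updates=50)
print("SVRG gradient evaluations:", svrg_evals)
print("SGD gradient evaluations:", sgd_evals)
```

In this sketch SVRG spends `epochs * (n + 2 * m)` evaluations against one per update for SGD, so for the same number of parameter updates it pays more than twice the gradient cost. With a large network every one of those evaluations is a forward and backward pass, which is where the constant factors come from.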

I hope I covered your concerns. Feel free to ask more either here or by email.

Cheers,
Angelos

from importance-sampling.

zhiqiangdon commented on August 16, 2024

Thanks for your patient explanation! @angeloskath

angeloskath commented on August 16, 2024

You're welcome. I am closing the issue; feel free to reopen it if you have further questions, or shoot me an email.

borre47 commented on August 16, 2024

Hi, what is your email, please?
