
Comments (3)

jrevels commented on August 16, 2024

Note that the problem here isn't the size of the input per se, but really the size of the generated instruction tape. If we simply define the norm instruction, then ReverseDiff wouldn't have to unroll the function into a large number of intermediate instructions - we'd only need the single norm instruction no matter the size of the input. The paper linked in #42 covers the implementation ReverseDiff would use for norm (as well as some other derivatives we still need primitives for).

More generally, for very large tapes that are impractical to compile, one can use the non-compiled API functions along with ReverseDiff's taping API so that you don't have to re-record the tape for every execution.

I just realized the gradient example doesn't cover this; I should definitely add that in.
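For readers following along, the tape-reuse workflow described above can be sketched roughly as follows. This is a minimal sketch, not the missing example from the package: the objective `f` and the input sizes are hypothetical stand-ins, and it assumes the function has no input-dependent control flow, since a pre-recorded tape replays exactly the operations seen during recording.

```julia
using ReverseDiff

# Hypothetical objective standing in for the real one.
f(x) = sum(abs2, x)

x0 = rand(1000)

# Record the tape once. It is deliberately NOT passed to ReverseDiff.compile,
# which is the step that becomes impractical for very large tapes.
tape = ReverseDiff.GradientTape(f, x0)

# Preallocate the output and reuse the same tape for new inputs.
result = similar(x0)
x1 = rand(length(x0))
ReverseDiff.gradient!(result, tape, x1)
```

Each later call to `gradient!` with the same `tape` skips re-recording and only pays for the forward and reverse passes.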

from reversediff.jl.

marius311 commented on August 16, 2024

Thanks, indeed the non-compiled version worked up to about 1,000,000 before my 16GB of RAM ran out. Any ideas to squeeze just another factor of 10 without having to go to a bigger RAM machine?

Re: norm, I should have mentioned that it was just a simple example, not my actual function. It does raise the question, though: if I know the derivatives of some functions involved in my calculation, is there a way I can input them in by hand so that ReverseDiff uses them?


jrevels commented on August 16, 2024

> Any ideas to squeeze just another factor of 10 without having to go to a bigger RAM machine?

Depending on what your actual objective function looks like, you may be able to sprinkle @forward in some clever places to cut down on the overall number of instructions. It's hard to say without seeing the code though.
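As a rough illustration of the `@forward` suggestion: the sketch below assumes the definition-site form of the macro (`ReverseDiff.@forward f(x::Real) = ...`) as described in the ReverseDiff documentation. The `kernel` and `objective` functions are hypothetical and only stand in for a real objective that applies a multi-operation scalar function many times.

```julia
using ReverseDiff
using ReverseDiff: @forward

# Hypothetical scalar kernel made of several elementary operations.
# With @forward, each call is differentiated in forward mode and should be
# recorded as a single scalar instruction rather than one per operation.
@forward kernel(x::Real) = exp(-x^2) * sin(x)

# Hypothetical objective that applies the kernel to every element.
function objective(x)
    total = kernel(x[1])
    for i in 2:length(x)
        total += kernel(x[i])
    end
    return total
end

x = rand(100)
g = ReverseDiff.gradient(objective, x)
```

Whether this helps in practice depends on how much of the tape comes from small scalar functions like this one.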

> is there a way I can input them in by hand so that ReverseDiff uses them?

Yup, but I haven't exposed a user-facing API for defining new instructions yet, so it'd probably be pretty cumbersome for anybody who doesn't already know ReverseDiff's internals (#15 exists to track that feature). If you wanted to play around with it, though, I'd be willing to help out.

Basically, to support a new function instruction, you have to define three things: how to "record" the function to a tape instruction, how to execute it during the forward pass, and how to execute it during the backwards pass. The inv definition here is probably the most straightforward example at the moment.
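To make those three pieces concrete, here is a toy sketch for `norm`. It does not use ReverseDiff's internals: `Tracked`, `NormInstruction`, `record_norm!`, `forward_exec!` and `reverse_exec!` are all hypothetical names invented for illustration. The only thing it is meant to show is that a single instruction can cover the whole `norm` call regardless of input size.

```julia
using LinearAlgebra: norm

# Toy stand-ins, NOT ReverseDiff types: a tracked vector with its adjoint.
struct Tracked
    value::Vector{Float64}
    adjoint::Vector{Float64}
end

# One instruction for the whole norm call, whatever the input length.
struct NormInstruction
    input::Tracked
    output::Base.RefValue{Float64}          # forward value of norm(x)
    output_adjoint::Base.RefValue{Float64}  # adjoint flowing into the output
end

# 1. Record: push a single instruction onto the tape.
function record_norm!(tape::Vector{NormInstruction}, x::Tracked)
    instr = NormInstruction(x, Ref(norm(x.value)), Ref(0.0))
    push!(tape, instr)
    return instr
end

# 2. Forward pass: recompute the output from the current input value.
function forward_exec!(instr::NormInstruction)
    instr.output[] = norm(instr.input.value)
    return nothing
end

# 3. Reverse pass: accumulate d(norm(x))/dx = x / norm(x) into the input adjoint.
function reverse_exec!(instr::NormInstruction)
    instr.input.adjoint .+= instr.output_adjoint[] .* instr.input.value ./ instr.output[]
    return nothing
end

x = Tracked([3.0, 4.0], zeros(2))
tape = NormInstruction[]
instr = record_norm!(tape, x)
forward_exec!(instr)
instr.output_adjoint[] = 1.0   # seed the output adjoint
reverse_exec!(instr)
# x.adjoint is now [0.6, 0.8], i.e. x / norm(x)
```

The tape here holds one entry for a 2-element input and would still hold one entry for a million-element input, which is the point made earlier in the thread about instruction count versus input size.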

