Comments (7)

MattX commented on September 4, 2024

Yes, this should be easy enough to fix by passing the model around and adding a renew_cg(). I'll try to do it tonight.

from xnmt.

MattX commented on September 4, 2024

I am able to reproduce on OS X, but it goes away if I disable evaluating (which only occurs between epochs, so it's consistent with what you observe).

There's probably a memory leak in the decoder (I don't see how the evaluator could have one). Looking around, it seems never to call renew_cg(), so that could be the problem.

pmichel31415 commented on September 4, 2024

Aah OK that's definitely it then

MattX commented on September 4, 2024

Actually the way it currently works is that xnmt_train periodically writes the model to a file, and every time xnmt_decode needs to do decoding, it creates a new model and loads the latest parameters from that file. The issue could be that xnmt_decode's models are not properly garbage-collected.

renew_cg() isn't called by the decoder, but it's called all the time by the trainer, so that shouldn't be an issue, right?

pmichel31415 commented on September 4, 2024

Ah yes. No, renew_cg() should only be called once per training/testing batch, so it should be fine.

So the way to go should be to keep the same model at all times, which should be doable if there is never a need for a duplicate model.

Or at least create just one training model and one testing model and keep them alive for the whole execution. You can't rely on garbage collection for models and parameters; this is a low-level DyNet bug.
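
To make the "keep the same model" suggestion concrete, here is a minimal sketch in plain Python. It does not use DyNet or xnmt: `ToyModel`, `Decoder`, and the dict-based checkpoint are hypothetical stand-ins invented for illustration. The point is only that the decoder builds its model once and overwrites the parameters in place each time it decodes, rather than constructing a fresh model and relying on garbage collection to free the old one.

```python
class ToyModel:
    """Hypothetical stand-in for a parameter collection; not a DyNet class."""

    created = 0  # how many models were ever constructed

    def __init__(self):
        ToyModel.created += 1
        self.weights = {}

    def load(self, checkpoint):
        # Overwrite parameters in place instead of allocating a new model.
        self.weights.clear()
        self.weights.update(checkpoint)


class Decoder:
    """Hypothetical decoder that owns one model for the whole run."""

    def __init__(self):
        self.model = ToyModel()  # built once, reused for every evaluation

    def decode(self, checkpoint, x):
        # Refresh the parameters from the latest checkpoint, then "decode".
        self.model.load(checkpoint)
        return self.model.weights["scale"] * x


decoder = Decoder()
# Three evaluation rounds, each with a newer (toy) checkpoint.
results = [decoder.decode({"scale": epoch}, 10) for epoch in (1, 2, 3)]
print(results, ToyModel.created)
```

However many evaluation rounds run, only one model object is ever constructed, so nothing depends on old models being collected.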

neubig commented on September 4, 2024

Hmm, could be related to this: clab/dynet#418
We should fix this...

But regardless, the decoder should call renew_cg().

psjanani commented on September 4, 2024

Hi!
I mem-topped the program in two configurations: with renew_cg() called every time the decoder is called, and with renew_cg() called for every translation. Ideally we might call renew_cg() once per batch of translations, but the decoder translates sentence by sentence, so there is no concept of a batch there. As a result, the graph was only cleared after all the translations were done, which led to the memory leak. Sorry about that.
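
The difference between the two configurations can be sketched with a toy model of the computation graph. This is plain Python, not DyNet: `ToyGraph`, `translate`, and `decode_all` are hypothetical names made up for illustration, and `ToyGraph.renew()` only plays the role that `renew_cg()` plays in DyNet. It shows why clearing the graph once per decoder call lets it grow with the whole test set, while clearing it per sentence bounds it by the longest sentence.

```python
class ToyGraph:
    """Hypothetical stand-in for a global computation graph (not DyNet's API)."""

    def __init__(self):
        self.nodes = []
        self.peak = 0  # largest number of live nodes seen so far

    def add(self, node):
        self.nodes.append(node)
        self.peak = max(self.peak, len(self.nodes))

    def renew(self):
        # Plays the role of renew_cg(): drop every node built so far.
        self.nodes.clear()


def translate(graph, sentence):
    # Pretend each token adds two nodes (an embedding and a decoder step).
    for token in sentence.split():
        graph.add(("embed", token))
        graph.add(("decode", token))
    return sentence.upper()


def decode_all(sentences, renew_per_sentence):
    graph = ToyGraph()
    graph.renew()  # one renew per decoder call, as in the leaky version
    outputs = []
    for sentence in sentences:
        if renew_per_sentence:
            graph.renew()  # the fix: start a fresh graph for each sentence
        outputs.append(translate(graph, sentence))
    return outputs, graph.peak


corpus = ["a b c", "d e", "f g h i"]
_, peak_leaky = decode_all(corpus, renew_per_sentence=False)
_, peak_fixed = decode_all(corpus, renew_per_sentence=True)
print(peak_leaky, peak_fixed)
```

With this toy corpus the leaky variant peaks at 18 live nodes (all 9 tokens) and the per-sentence variant at 8 (the 4-token sentence), so the gap widens as the test set grows.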
