
Comments (5)

david-gpu commented on September 17, 2024

That is interesting. Some googling showed a potentially related issue: tensorflow/tensorflow#2654

Have you tried any of the ideas in there?

from deep-makeover.

david-gpu commented on September 17, 2024

OK, I think I know what is happening. Function _save_image() in dm_train.py is repeatedly adding new nodes to the graph.

While I fix that problem you could increase the value of summary_period so that this operation doesn't happen so frequently.
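To illustrate why this happens, here is a toy model in plain Python. It is only a sketch and not the code in dm_train.py: `ToyGraph`, `save_image_leaky` and `train` are hypothetical names, and the node count stands in for checkpoint size. It assumes the image-saving step builds two new ops each time it runs.

```python
class ToyGraph:
    """Stand-in for a computation graph: nodes are only ever appended."""

    def __init__(self):
        self.nodes = []

    def add_node(self, name):
        self.nodes.append(name)


def save_image_leaky(graph, batch):
    # Hypothetical leaky behaviour: fresh ops are built on every call
    # instead of being created once and reused.
    graph.add_node(f"concat_{batch}")
    graph.add_node(f"encode_{batch}")


def train(num_batches, summary_period):
    """Run a fake training loop and return the final node count."""
    graph = ToyGraph()
    graph.add_node("model")  # the part of the graph that is built once
    for batch in range(1, num_batches + 1):
        # The image is only saved every `summary_period` batches.
        if batch % summary_period == 0:
            save_image_leaky(graph, batch)
    return len(graph.nodes)


print(train(1000, 10))   # 201 nodes
print(train(1000, 100))  # 21 nodes
```

In this model a larger summary_period slows the growth but does not stop it, which is why it is only a workaround until the function stops adding nodes.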

Let me know if that helps.


david-gpu commented on September 17, 2024

OK, so I switched _save_image() to use numpy instead of tensorflow. I think that should largely take care of the problem of ever-growing checkpoints.
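For reference, the general idea looks roughly like the sketch below. This is not the code from the commit: `tile_images_numpy` is a hypothetical helper, and it assumes the images arrive as a float array of shape (N, H, W, C) with values in [0, 1]. Because everything happens in numpy on arrays already fetched from the session, no ops are added to the graph.

```python
import numpy as np


def tile_images_numpy(images, columns):
    """Arrange a batch of images into one uint8 grid using numpy only."""
    images = np.asarray(images, dtype=np.float32)
    n, h, w, c = images.shape
    rows = -(-n // columns)  # ceiling division

    # Pad with black images so the batch fills the grid exactly.
    padded = np.zeros((rows * columns, h, w, c), dtype=np.float32)
    padded[:n] = images

    # (rows, columns, h, w, c) -> (rows, h, columns, w, c) -> 2-D grid.
    grid = padded.reshape(rows, columns, h, w, c)
    grid = grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, columns * w, c)

    # Clip to the valid range and convert to 8-bit pixel values.
    return (np.clip(grid, 0.0, 1.0) * 255.0).round().astype(np.uint8)


batch = np.ones((5, 2, 2, 3))          # five white 2x2 RGB images
grid = tile_images_numpy(batch, columns=3)
print(grid.shape)                      # (4, 6, 3)
```

The resulting array can then be written to disk with whatever image library is at hand.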

Could you please try the changes in c2ae55c..779100c and let me know whether they help?


david-gpu commented on September 17, 2024

Based on my testing, the checkpoint size went down to ~20 MB after 40K batches, so I'll be closing this issue.

Thanks for reporting it!


ddvladimir commented on September 17, 2024

I have also retested it after your update and it works fine now.

Thank you for your help!

