
Comments (6)

aosokin commented on September 13, 2024

Hi, this can probably be done, but it will likely require adjusting the code. We've never tried training on multiple GPUs.

from os2d.

saswat0 commented on September 13, 2024

Okay. In my case, I have a 48GB GPU, but the training occupies only 8GB, thereby taking three days to train the detector. How were you able to do this in a shorter time?


aosokin commented on September 13, 2024

But what is the processing load of your GPU? If it is low, you can try increasing the batch size.
Another idea: it might be fine to train for significantly fewer iterations. You just need to monitor the behaviour of the validation loss and stop the process early once it stops improving.
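
As a rough illustration of the early-stopping idea, here is a minimal, framework-agnostic sketch. It is not part of the os2d code: the `EarlyStopper` class, the `patience` and `min_delta` parameters, and the validation losses are all made up for the example.

```python
class EarlyStopper:
    """Tracks validation loss and signals when it has stopped improving.

    Hypothetical helper for illustration; not part of os2d.
    """

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # evaluations to wait without improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_evals = 0

    def should_stop(self, val_loss):
        """Record one validation loss; return True once patience is exhausted."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


# Made-up validation losses, one per evaluation round.
val_losses = [1.00, 0.80, 0.70, 0.71, 0.72, 0.70, 0.69]

stopper = EarlyStopper(patience=3)
stopped_at = None
for step, loss in enumerate(val_losses):
    if stopper.should_stop(loss):
        stopped_at = step  # index of the evaluation that triggered the stop
        break

print("stopped at evaluation", stopped_at, "with best loss", stopper.best_loss)
```

In a real run, `should_stop` would be called after each validation pass, and the checkpoint with the best loss would be the one to keep.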


saswat0 commented on September 13, 2024

GPU runs at 100% capacity, but most memory is left idle. Increasing the batch size in the config file isn't reflected in the final parameters. Did you face this issue while experimenting?
For the second approach, the current code doesn't have TensorBoard support. Should I monitor the logs instead?


aosokin commented on September 13, 2024

> GPU runs at 100% capacity, but most memory is left idle. Increasing the batch size in the config file isn't reflected in the final parameters. Did you face this issue while experimenting?

That sounds odd: changing train.batch_size and train.class_batch_size should definitely change the training process. At the very least, GPU memory usage should go up.
However, if the GPU is already at 100%, simply changing the batch size is not likely to increase training speed.
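
One way to check whether an override actually reached the final parameters is to diff the default configuration against the merged one. The sketch below uses plain nested dicts and does not reproduce how os2d loads its config. The helpers `apply_overrides` and `changed_keys` are hypothetical, and the numeric values are placeholders rather than os2d defaults.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of a nested dict with dotted-key overrides applied.

    Raises KeyError for unknown keys, so a misspelled option is not
    silently ignored.
    """
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node[name]  # KeyError if a parent section is missing
        if leaf not in node:
            raise KeyError(f"unknown config key: {dotted_key}")
        node[leaf] = value
    return result


def changed_keys(before, after, prefix=""):
    """List dotted keys whose values differ between two configs of the same shape."""
    changed = []
    for key in before:
        path = f"{prefix}{key}"
        if isinstance(before[key], dict):
            changed.extend(changed_keys(before[key], after[key], path + "."))
        elif before[key] != after[key]:
            changed.append(path)
    return changed


# Placeholder values, not the real os2d defaults.
defaults = {
    "train": {"batch_size": 4, "class_batch_size": 15},
    "eval": {"batch_size": 1},
}

final = apply_overrides(
    defaults,
    {"train.batch_size": 8, "train.class_batch_size": 30},
)
diff = changed_keys(defaults, final)
print("overridden keys:", diff)
```

If the expected keys are missing from the diff, the override is being dropped somewhere between the config file and the final parameters.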

> For the second approach, the current code doesn't have TensorBoard support. Should I monitor the logs instead?

Yes, the code does not have TensorBoard support, but it includes another visualization tool: os2d/utils/plot_visdom.py


saswat0 commented on September 13, 2024

Got it, thanks!

