
Comments (6)

VainF commented on July 3, 2024

> Actually, when I train on 2-GPU and 4-GPU machines, the performance does vary, with a 2 percent drop on the 4-GPU machine. From my point of view, since it doesn't use global BN, the per-GPU batch size matters a lot.

@MaureenZOU, yes, batch size is an important hyperparameter for BN. It is recommended to use a large batch size (e.g. >8). As far as I know, there is no SyncBN in PyTorch. Please try third-party implementations if SyncBN is required.
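
For readers on newer PyTorch: torch.nn.SyncBatchNorm has shipped since PyTorch 1.1, so a third-party implementation may no longer be needed. Below is a minimal sketch of enabling it, assuming a DistributedDataParallel setup (SyncBatchNorm does not synchronize under plain DataParallel); `build_model` and `local_rank` are placeholders, not names from this repo:

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumes the process group is already initialized, e.g. by
# torch.distributed.init_process_group(backend="nccl").
model = build_model()  # hypothetical constructor for your network

# Swap every BatchNorm*d layer for SyncBatchNorm so that BN statistics
# are reduced across all GPUs instead of per-GPU mini-batches.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# SyncBatchNorm only takes effect under DistributedDataParallel.
local_rank = 0  # hypothetical; normally read from the launcher's env
model = DDP(model.to(local_rank), device_ids=[local_rank])
```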

> > Hello @shipra25jain, it works with any number of GPUs.
>
> Thanks @VainF for the reply. It seems to be working now after adding 'device_ids' to DataParallel(), as my default gpu_ids are not 0 and 1 but 5 and 7. However, there seems to be a bug in the PolyLR scheduler. Shouldn't it be (1 - last_epoch/max_epochs)**power? I mean, instead of max_iters in the formula, it should be max_epochs?

@shipra25jain, thank you for pointing out this issue. In this repo, the learning rate is scheduled at each iteration, so last_epoch actually means last_iter. I will rename it to make the code more straightforward.
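
For concreteness, here is a minimal sketch of such a per-iteration poly schedule; it is illustrative, not the repo's exact code. Because scheduler.step() is called once per iteration, the base class's last_epoch counter really counts iterations against max_iters:

```python
from torch.optim.lr_scheduler import _LRScheduler

class PolyLR(_LRScheduler):
    """lr = base_lr * (1 - iter / max_iters) ** power, stepped per iteration."""

    def __init__(self, optimizer, max_iters, power=0.9, last_epoch=-1):
        self.max_iters = max_iters
        self.power = power
        super().__init__(optimizer, last_epoch)

    def get_lr(self):
        # `last_epoch` is incremented by every scheduler.step() call;
        # since step() runs once per iteration, it counts iterations here.
        factor = (1 - self.last_epoch / self.max_iters) ** self.power
        return [base_lr * factor for base_lr in self.base_lrs]
```

Calling scheduler.step() after every optimizer.step(), rather than once per epoch, is what makes last_epoch behave as last_iter.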


shipra25jain commented on July 3, 2024

Did this repo work when you gave 2 GPU ids in the argument? Did you have to make any changes in the code?


VainF commented on July 3, 2024

Hello @shipra25jain, it works with any number of GPUs.


MaureenZOU commented on July 3, 2024

Actually, when I train on 2-GPU and 4-GPU machines, the performance does vary, with a 2 percent drop on the 4-GPU machine. From my point of view, since it doesn't use global BN, the per-GPU batch size matters a lot.


MaureenZOU commented on July 3, 2024

@VainF, if my experiment has any problem, please point it out!


shipra25jain commented on July 3, 2024

> Hello @shipra25jain, it works with any number of GPUs.

Thanks @VainF for the reply. It seems to be working now after adding 'device_ids' to DataParallel(), as my default gpu_ids are not 0 and 1 but 5 and 7. However, there seems to be a bug in the PolyLR scheduler. Shouldn't it be (1 - last_epoch/max_epochs)**power? I mean, instead of max_iters in the formula, it should be max_epochs?
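
For anyone hitting the same issue, a minimal sketch of pinning DataParallel to specific GPUs; ids 5 and 7 here just mirror the setup above, and `build_model` is a placeholder:

```python
import torch.nn as nn

device_ids = [5, 7]  # the physical GPUs to use

model = build_model()  # hypothetical constructor for your network

# Parameters must live on the first listed device; DataParallel then
# scatters each input batch across all of device_ids.
model = model.to(f"cuda:{device_ids[0]}")
model = nn.DataParallel(model, device_ids=device_ids)
```

Alternatively, launching with CUDA_VISIBLE_DEVICES=5,7 remaps those GPUs to cuda:0 and cuda:1, so no code change is needed.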

