Comments (6)
Actually, I trained on both a 2-GPU machine and a 4-GPU machine, and the performance did vary, with a 2 percent drop on the 4-GPU machine. From my point of view, since the repo doesn't use global (synchronized) BN, the per-GPU batch size matters a lot.
@MaureenZOU, yes, batch size is an important hyperparameter for BN. It is recommended to use a large batch size (e.g. >8). As far as I know, there is no SyncBN in PyTorch. Please try third-party implementations if SyncBN is required.
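To make the per-GPU batch size point concrete, here is a minimal sketch of how a batch gets split when it is chunked along the batch dimension across devices, which is how `DataParallel` scatters its input. The function name `per_gpu_batch_sizes` and the total batch size of 16 are assumptions for illustration only, not values taken from this repo.

```python
from typing import List


def per_gpu_batch_sizes(total_batch_size: int, num_gpus: int) -> List[int]:
    """Split a batch into chunks of ceil(total / num_gpus) samples.

    Hypothetical helper: it mirrors chunking along the batch dimension,
    so the last chunk may be smaller when the batch is not divisible.
    """
    chunk = -(-total_batch_size // num_gpus)  # ceiling division
    sizes = []
    remaining = total_batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


# Assumed total batch size of 16 (illustrative only).
print(per_gpu_batch_sizes(16, 2))  # [8, 8]
print(per_gpu_batch_sizes(16, 4))  # [4, 4, 4, 4]
```

Without synchronized BN, each replica computes its batch statistics over its own chunk only, so in this sketch going from 2 to 4 GPUs halves the number of samples behind every BN estimate unless the total batch size is raised accordingly.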
Hello @shipra25jain, it works with any number of GPUs.
Thanks @VainF for the reply. It seems to work now after adding 'device_ids' to DataParallel(), since my default gpu_ids are not 0 and 1 but 5 and 7. However, there seems to be a bug in the polyLR scheduler. Shouldn't it be (1 - last_epoch/max_epochs)**power? I mean, shouldn't the formula use max_epochs instead of max_iters?
@shipra25jain, thank you for pointing out this issue. In this repo, the learning rate is scheduled at each iteration, so last_epoch actually means last_iter. I will rename it to make the code more straightforward.
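For readers following the scheduler discussion, here is a minimal sketch of the polynomial decay quoted in the question, stepped once per iteration. The function `poly_lr`, the default `power=0.9`, and the numbers used below are assumptions for illustration; this is not the repo's actual scheduler class.

```python
def poly_lr(base_lr: float, cur_iter: int, max_iters: int, power: float = 0.9) -> float:
    """Polynomial decay: base_lr * (1 - cur_iter / max_iters) ** power.

    Hypothetical helper: cur_iter plays the role the thread calls
    last_epoch, which counts iterations when the scheduler is stepped
    after every batch.
    """
    return base_lr * (1 - cur_iter / max_iters) ** power


# Assumed values: base LR 0.01 and 100 training iterations in total.
base_lr, max_iters = 0.01, 100
schedule = [poly_lr(base_lr, it, max_iters) for it in range(max_iters + 1)]

print(schedule[0])   # 0.01 at the first iteration
print(schedule[-1])  # 0.0 at the last iteration
```

Because the counter advances every iteration, dividing by `max_iters` is what lets the rate reach zero exactly at the end of training. Dividing by a number of epochs would only be consistent if the scheduler were stepped once per epoch.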
from deeplabv3plus-pytorch.
Did this repo work when you passed 2 GPU ids in the argument? Did you have to make any changes to the code?
@VainF, if there is any problem with my experiment, please point it out!