Thank you for your amazing work. I was trying to reproduce your results on cityscapes


can't reproduce your results about deeplabv3plus-pytorch HOT 13 CLOSED

vainf commented on July 3, 2024

can't reproduce your results

from deeplabv3plus-pytorch.

Comments (13)

bb846 commented on July 3, 2024

> > Try using larger crop size, for example 768.
>
> It gives CUDA out of memory.
>
> @bb846 Thanks, for sharing the number of workers link. I also found the ImageNet pre-trained weights being loaded. But still can't reproduce the result. I think the issue is that BatchNorm is not synced in this repository and I'll have to use DistributedDataParallel to use the Pytorch SyncBatchNorm: https://pytorch.org/docs/master/generated/torch.nn.SyncBatchNorm.html#torch.nn.SyncBatchNorm.
>
> So could you please share if you were able to reproduce the 77% result or get >75%; the exact command (or hyperparameters you used); and any other changes you made to this code? I think some of the hyperparameters like the learning rate are different from the original paper.

Yes, it requires large GPUs. Regarding the result, I only got 75.8%. I used crop size 768 and RandomScale with range [0.5, 2.0]. I also used DistributedDataParallel and SyncBatchNorm. The other hyperparameters are similar to this repo's. Hope this helps.

Best,
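As a rough illustration of the SyncBatchNorm step mentioned above, the sketch below converts the BatchNorm layers of a small stand-in model using PyTorch's `torch.nn.SyncBatchNorm.convert_sync_batchnorm`. The toy model and the helper name `to_sync_batchnorm` are hypothetical and are not part of this repository. Wrapping the result in DistributedDataParallel additionally needs an initialized process group, which is not shown here.

```python
import torch.nn as nn


def to_sync_batchnorm(model: nn.Module) -> nn.Module:
    """Replace every BatchNorm layer in `model` with SyncBatchNorm.

    Hypothetical helper for illustration; it only wraps the PyTorch call.
    """
    return nn.SyncBatchNorm.convert_sync_batchnorm(model)


# Toy stand-in for a segmentation network (an assumption, not the repo's model).
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

converted = to_sync_batchnorm(toy)

# In real multi-GPU training the converted model would then be moved to the
# local device and wrapped in torch.nn.parallel.DistributedDataParallel,
# after torch.distributed.init_process_group(...) has been called.
```

The conversion itself runs on CPU and only swaps the layer type; the statistics are synchronized across processes only once training runs under a distributed process group.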


VainF commented on July 3, 2024

My hyper-parameters:

    python main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012_aug --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16

Tip:
please use a larger batch size (>8 data points per GPU instance) in parallel training, otherwise, the BN statistics may be incorrect.
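The tip concerns the batch size seen by each GPU, not the total. A minimal sketch of that arithmetic follows, assuming the total batch is split evenly across devices (as `torch.nn.DataParallel` does along the batch dimension). The function names are hypothetical, and the threshold of 8 is taken from the tip itself.

```python
def per_gpu_batch_size(total_batch_size: int, num_gpus: int) -> int:
    """Samples each GPU sees per step, assuming an even split."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return total_batch_size // num_gpus


def satisfies_bn_tip(total_batch_size: int, num_gpus: int, threshold: int = 8) -> bool:
    """True when each GPU gets more than `threshold` samples per step."""
    return per_gpu_batch_size(total_batch_size, num_gpus) > threshold


single_gpu = per_gpu_batch_size(16, 1)  # 16 samples on the one GPU
two_gpus = per_gpu_batch_size(16, 2)    # 8 samples on each GPU
```

Under this assumption, `--batch_size 16` on one GPU satisfies the tip, while the same total batch on two GPUs gives exactly 8 per device, which does not exceed the suggested threshold.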


bb846 commented on July 3, 2024

Thank you for the information. I was using crop size 768, batch size 16, and running inference on whole images. Now I am able to reach 70.4% mIoU for deeplabv3+_mobilenet.
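For readers comparing the mIoU figures in this thread, here is a minimal sketch of the standard definition: the per-class intersection over union, TP / (TP + FP + FN), averaged over classes. It works on a plain confusion matrix. Skipping classes that never occur is an assumption of this sketch and may differ from how this repository's evaluation code treats them.

```python
from typing import List


def mean_iou(confusion: List[List[int]]) -> float:
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Classes absent from both labels and predictions are skipped (an assumption).
    """
    ious = []
    for c in range(len(confusion)):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(row[c] for row in confusion) - true_pos
        union = true_pos + false_neg + false_pos
        if union > 0:
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example with made-up pixel counts.
toy_confusion = [
    [8, 2],
    [1, 9],
]
toy_miou = mean_iou(toy_confusion)  # mean of 8/11 and 9/12
```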


zzc-ai commented on July 3, 2024

Hello, I would like to know whether you trained from scratch without loading any weights, and how many epochs you trained for.


bb846 commented on July 3, 2024

I only tried Cityscapes and it took about 200 epochs. Good luck.


prachigarg23 commented on July 3, 2024

Hi, I'm having trouble reproducing the DeeplabV3 and DeeplabV3+ (ResNet101) results on Cityscapes. I'm using the following command:

    python main.py --model deeplabv3plus_resnet101 --dataset cityscapes --gpu_id 0,1 --lr 0.1 --val_interval 300 --crop_size 513 --batch_size 16 --output_stride 16 --data_root path_to_cs

    python main.py --model deeplabv3_resnet101 --dataset cityscapes --gpu_id 0,1 --lr 0.1 --val_interval 300 --crop_size 513 --batch_size 16 --output_stride 16 --data_root path_to_cs

Using 2 Nvidia 1080 Ti GPUs.

Getting 67.35% mIoU after 30k iterations on DeeplabV3 as compared to the 77.23% in the original paper.
Getting 72.1% mIoU after 30k iterations on DeeplabV3+ as compared to ~77% as reported in this repo.

I was confused about the correct learning rate and number of iterations. In the DeeplabV3 paper, they mention they use 0.007 as the initial learning rate and train for 90k training iterations for Cityscapes. I saw 0.1 in the readme and 0.01 here. Can someone please confirm reproducible hyperparameters for Cityscapes?

@VainF Could I be doing something else wrong? Has anyone been able to reproduce the results? @bb846
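On the learning-rate question: the DeepLabV3 paper describes a "poly" schedule, where the initial rate is multiplied by (1 - iter / max_iter) ^ 0.9. The sketch below evaluates that formula with the paper's Cityscapes values quoted above (0.007 and 90k iterations). Whether this repository's training script follows exactly this schedule is an assumption to check against its code, and `poly_lr` is a hypothetical name.

```python
def poly_lr(base_lr: float, iteration: int, max_iterations: int, power: float = 0.9) -> float:
    """Learning rate at `iteration` under the 'poly' decay policy."""
    if not 0 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in [0, max_iterations]")
    return base_lr * (1.0 - iteration / max_iterations) ** power


# Values quoted from the paper in this thread: lr 0.007, 90k iterations.
lr_start = poly_lr(0.007, 0, 90_000)
lr_at_30k = poly_lr(0.007, 30_000, 90_000)
lr_end = poly_lr(0.007, 90_000, 90_000)
```

With a schedule like this, stopping at 30k iterations of a run configured for 90k leaves the learning rate well above zero, so a result at 30k is not directly comparable to one from a completed schedule.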


prachigarg23 commented on July 3, 2024

Is the reported result on cityscapes after initialising from an ImageNet or COCO pretrained model? @VainF


bb846 commented on July 3, 2024

> Hi, I'm having trouble reproducing the DeeplabV3 and DeeplabV3+ (ResNet101) results on Cityscapes. I'm using the following command:
>
>     python main.py --model deeplabv3plus_resnet101 --dataset cityscapes --gpu_id 0,1 --lr 0.1 --val_interval 300 --crop_size 513 --batch_size 16 --output_stride 16 --data_root path_to_cs
>
>     python main.py --model deeplabv3_resnet101 --dataset cityscapes --gpu_id 0,1 --lr 0.1 --val_interval 300 --crop_size 513 --batch_size 16 --output_stride 16 --data_root path_to_cs
>
> Using 2 Nvidia 1080 Ti GPUs.
>
> Getting 67.35% mIoU after 30k iterations on DeeplabV3 as compared to the 77.23% in the original paper. Getting 72.1% mIoU after 30k iterations on DeeplabV3+ as compared to ~77% as reported in this repo.
>
> I was confused about the correct learning rate and number of iterations. In the DeeplabV3 paper, they mention they use 0.007 as the initial learning rate and train for 90k training iterations for Cityscapes. I saw 0.1 in the readme and 0.01 here. Can someone please confirm reproducible hyperparameters for Cityscapes?
>
> @VainF Could I be doing something else wrong? Has anyone been able to reproduce the results? @bb846

Try using larger crop size, for example 768.


bb846 commented on July 3, 2024

> Is the reported result on cityscapes after initialising from an ImageNet or COCO pretrained model? @VainF

I think so. From the following code, you can see that the ResNet backbone loads weights pretrained on ImageNet.

DeepLabV3Plus-Pytorch/network/backbone/resnet.py

Lines 216 to 222 in c8cc7b2

    
    def _resnet(arch, block, layers, pretrained, progress, **kwargs):
        model = ResNet(block, layers, **kwargs)
        if pretrained:
            state_dict = load_state_dict_from_url(model_urls[arch],
                                                  progress=progress)
            model.load_state_dict(state_dict)
        return model

DeepLabV3Plus-Pytorch/network/backbone/resnet.py

Lines 14 to 24 in c8cc7b2

    
    model_urls = {
        'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
        'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
        'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
        'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
        'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
        'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
        'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
        'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
        'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
    }


bb846 commented on July 3, 2024

> Should we continue to use the same number of workers (default=2 in main.py) when using multiple GPUs? Does using number_of_workers = 4*Number_of_GPUs help?

By the way, you seemed to mention the number of workers. I think this parameter will only affect the speed of data loading and has nothing to do with the model performance. For reference, you may read the following links:

https://stackoverflow.com/questions/53998282/how-does-the-number-of-workers-parameter-in-pytorch-dataloader-actually-work

https://discuss.pytorch.org/t/guidelines-for-assigning-num-workers-to-dataloader/813


prachigarg23 commented on July 3, 2024

> Try using larger crop size, for example 768.

It gives CUDA out of memory.

@bb846 Thanks, for sharing the number of workers link. I also found the ImageNet pre-trained weights being loaded. But still can't reproduce the result. I think the issue is that BatchNorm is not synced in this repository and I'll have to use DistributedDataParallel to use the Pytorch SyncBatchNorm: https://pytorch.org/docs/master/generated/torch.nn.SyncBatchNorm.html#torch.nn.SyncBatchNorm.

So could you please share if you were able to reproduce the 77% result or get >75%; the exact command (or hyperparameters you used); and any other changes you made to this code? I think some of the hyperparameters like the learning rate are different from the original paper.


newzealandpaul commented on July 3, 2024

@bb846 are you able to share the command you used to run the training?


kona419 commented on July 3, 2024

> > Try using larger crop size, for example 768.
>
> It gives CUDA out of memory.
>
> @bb846 Thanks, for sharing the number of workers link. I also found the ImageNet pre-trained weights being loaded. But still can't reproduce the result. I think the issue is that BatchNorm is not synced in this repository and I'll have to use DistributedDataParallel to use the Pytorch SyncBatchNorm: https://pytorch.org/docs/master/generated/torch.nn.SyncBatchNorm.html#torch.nn.SyncBatchNorm.
>
> So could you please share if you were able to reproduce the 77% result or get >75%; the exact command (or hyperparameters you used); and any other changes you made to this code? I think some of the hyperparameters like the learning rate are different from the original paper.

Hello, I am wondering whether you got >75%, because my result is only around 63-64%.
