deeplabv3plus-pytorch's People

Contributors

aoxu2000, danielzhangau, dawars, horseee, m-just, timothylimyl, vainf

deeplabv3plus-pytorch's Issues

Question about --continue_training

Hello, thanks for your nice work.
I ran into a bug when resuming with --continue_training.

python main.py --model deeplabv3plus_mobilenet --dataset cityscapes --gpu_id 6 --lr 0.1 --crop_size 768 --batch_size 12 --output_stride 16 --data_root ./datasets/data/cityscapes --ckpt checkpoints/best_deeplabv3plus_mobilenet_cityscapes_os16.pth --continue_training

[screenshot of the error]

Can you fix it?

[!] Retrain

Why does the output show "[!] Retrain"?

Model performance index

Hi @VainF ,

I used THOP, adding two lines of code in modeling.py, to calculate the model's parameters and FLOPs, but the result does not match what I expected. How does your code calculate the FLOPs and parameters shown in your chart?
Looking forward to your answer! Thanks!

Reproducing your code

Hello, I would like to know whether you started training from scratch without loading any weights, and how many epochs you trained for.

Reproduce issue.

With the default training settings of this code, I trained the "deeplabv3plus_resnet101" model on VOC12.
The best mIoU I can get is 0.763, whereas the corresponding provided model scores 0.783.

Deeplab code not implemented

Why does everything end up at the line that calls DeepLabV3, when DeepLabV3 itself does not seem to be implemented, as shown in the figure below?

if name=='deeplabv3plus':
    return_layers = {'high_level_features': 'out', 'low_level_features': 'low_level'}
    classifier = DeepLabHeadV3Plus(inplanes, low_level_planes, num_classes, aspp_dilate)
elif name=='deeplabv3':
    return_layers = {'high_level_features': 'out'}
    classifier = DeepLabHead(inplanes , num_classes, aspp_dilate)
backbone = IntermediateLayerGetter(backbone, return_layers=return_layers)

model = DeepLabV3(backbone, classifier)
return model

Question about padding in Mobilenetv2

Dear VainF,

self.input_padding = fixed_padding( 3, dilation )

x_pad = F.pad(x, self.input_padding)

Note that these two lines differ from the original MobileNetV2. Could you please share why you implemented padding in these two lines and what the consequence of removing them would be?

Thank you very much.

With kind regards.

AdaptiveAvgPool2d

Why is this nn.AdaptiveAvgPool2d(1) done here?

class ASPPPooling(nn.Sequential):
    def __init__(self, in_channels, out_channels):
        super(ASPPPooling, self).__init__(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True))

    def forward(self, x):
        size = x.shape[-2:]
        x = super(ASPPPooling, self).forward(x)
        return F.interpolate(x, size=size, mode='bilinear', align_corners=False)

I am working on a segmentation task, and the pooling above changes my tensor from torch.Size([1, 256, 16, 16]) to torch.Size([1, 256, 1, 1]), which gives the error:
"Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])"

What could have gone wrong?
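One way to read the error message: batch normalization in training mode needs more than one value per channel to estimate a variance. The sketch below is a plain-Python approximation of that counting, not PyTorch's own code; the helper name `values_per_channel` is made up for illustration.

```python
from math import prod


def values_per_channel(shape):
    """Count the values BatchNorm would average per channel for an (N, C, *spatial) tensor."""
    n, _channels, *spatial = shape
    return n * prod(spatial)


# Before global pooling: one image with a 16x16 feature map.
before = values_per_channel((1, 256, 16, 16))
# After nn.AdaptiveAvgPool2d(1): one image with a 1x1 feature map.
after = values_per_channel((1, 256, 1, 1))
# Same pooling, but with two images in the batch.
after_batch2 = values_per_channel((2, 256, 1, 1))
print(before, after, after_batch2)
```

Under this reading the pooling itself is expected behavior, and the failure only appears when a training batch holds a single image. A batch of at least two images, or calling `model.eval()` for inference, avoids the check.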

can't reproduce your results

Thank you for your amazing work. I was trying to reproduce your results on cityscapes dataset. However, I couldn't reach mIoU > 70 % for both mobilenet and resnet based model. Could you share your training hyperparameters? Also, do you have any training tips that could help to reach your results?

With kind regards.

pth -> onnx

How can I convert a trained .pth segmentation model to ONNX? The network I am using is deeplabv3plus_resnet101.

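How can I get your score?

I would like to reproduce your results for a pancreatic cancer task. When you trained ResNet-101, did you train without loading any weights, or did you load some pretrained weights? And how many epochs did you train for?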

How to modify the structure to fit more than three channels of input pictures?

For my current work, I need to add a mask as a fourth input channel on top of the three-channel image, and I don't know how to change the network structure. For example, when using resnet101 as the backbone, how should the network be modified to accept four-channel input?
I hope you can help. Thanks.

Only support single GPU?

I find that there is no 'Parallel' or 'parallel' in the codes, so I think it only supports single GPU, right?
Then how did you fit 16 images on one GPU when training on Cityscapes?

Thanks for your effort!

resnet50 training problem

Hello. I'm trying to reproduce the result. However, when training with deeplabv3plus_resnet50, the mIoU can't reach 0.772; the best performance I get is 0.714. Did you modify any hyperparameters when you trained it yourself? Thank you very much.

--year 2012_aug

Hi VainF,

I am able to train with --year 2012 using the following command:

python main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2012 --crop_val --lr 0.01 --crop_size 513 --batch_size 14 --output_stride 16 --continue_training

But when I try to train with --year 2012_aug, I encounter the following error:


Setting up a new session...
Device: cuda
Dataset: voc, Train set: 10582, Val set: 1449
[!] Retrain
Traceback (most recent call last):
  File "main.py", line 390, in <module>
    main()
  File "main.py", line 335, in main
    for (images, labels) in train_loader:
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/paul/segmentation/DeepLabV3Plus-Pytorch/datasets/voc.py", line 145, in __getitem__
    target = Image.open(self.masks[index])
  File "/home/paul/segmentation/lib/python3.6/site-packages/PIL/Image.py", line 2912, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './datasets/data/VOCdevkit/VOC2012/SegmentationClassAug/2008_002913.png'

My ./datasets/data/VOCdevkit/VOC2012/SegmentationClassAug directory contains the train_aug.txt file. What am I missing? Please help. Thanks a lot.

P.S. I checked that 2008_002913.png exists under ./datasets/data/VOCdevkit/VOC2012/JPEGImages.
So do I need to copy all the .png files to ./datasets/data/VOCdevkit/VOC2012/SegmentationClassAug, or what else should I do to fix this problem? Thanks for your help.

Edit: after following the instructions to download the labels from Dropbox and extracting them to ./datasets/data/VOCdevkit/VOC2012/SegmentationClassAug, everything works as expected.
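For anyone hitting the same FileNotFoundError, a quick check before training can list which label files are absent. This is a minimal sketch using only the standard library; the helper name `missing_masks` is hypothetical, and it assumes each mask is stored as `<image id>.png` in the mask directory, as the path in the traceback suggests.

```python
from pathlib import Path


def missing_masks(image_ids, mask_dir):
    """Return the IDs from image_ids that have no '<id>.png' file in mask_dir."""
    mask_dir = Path(mask_dir)
    return [i for i in image_ids if not (mask_dir / f"{i}.png").is_file()]


if __name__ == "__main__":
    # Hypothetical usage: the ID comes from the traceback above; a real check
    # would read all IDs from the split file.
    ids = ["2008_002913"]
    mask_dir = "./datasets/data/VOCdevkit/VOC2012/SegmentationClassAug"
    print(missing_masks(ids, mask_dir))
```

An empty list means every listed ID has a mask; any ID that is printed still needs its label file extracted into the directory.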

FocalLoss params alpha and gamma

@VainF
I use deeplabv3plus_resnet101 to train on my own dataset, with loss='Focal_Loss'.
But I found that the parameters in the focal loss are set to α=1, γ=0, which makes it the same as the cross-entropy loss.
Is this something you did on purpose, or is it a code error?
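The equivalence mentioned here can be checked numerically. The sketch below uses the standard per-sample focal loss formula, FL(p) = -α (1 - p)^γ log(p), where p is the predicted probability of the true class. It is a scalar illustration under that assumption, not the repository's implementation, which may differ in details such as reduction or ignored labels.

```python
import math


def cross_entropy(p_true):
    """Cross-entropy for one sample, given the predicted probability of the true class."""
    return -math.log(p_true)


def focal_loss(p_true, alpha=1.0, gamma=0.0):
    """Focal loss for one sample: -alpha * (1 - p)^gamma * log(p)."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


p = 0.3
# With alpha=1 and gamma=0 the modulating factor is 1, so both losses agree.
print(focal_loss(p, alpha=1.0, gamma=0.0), cross_entropy(p))
# With gamma > 0 and alpha < 1 the loss for this sample is scaled down.
print(focal_loss(p, alpha=0.25, gamma=2.0))
```

So with α=1 and γ=0 the focal term has no effect; any down-weighting of easy examples requires γ > 0.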

train --year 2007 failed

Hi, while waiting for PascalVOC2012.zip to download, I tried training on the 2007 dataset, which I had already downloaded.

When I ran it, I got the following error message:

(segmentation) paul@tensor:~/segmentation/DeepLabV3Plus-Pytorch$ python main.py --model deeplabv3plus_mobilenet --enable_vis --vis_port 28333 --gpu_id 0 --year 2007 --crop_val --lr 0.01 --crop_size 513 --batch_size 16 --output_stride 16
Setting up a new session...
Device: cuda
Dataset: voc, Train set: 209, Val set: 213
[!] Retrain
/home/paul/segmentation/lib/python3.6/site-packages/torchvision/transforms/functional.py:387: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
"Argument interpolation should be of type InterpolationMode instead of int. "
/home/paul/segmentation/lib/python3.6/site-packages/torchvision/transforms/functional.py:387: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
"Argument interpolation should be of type InterpolationMode instead of int. "
Epoch 1, Itrs 10/30000, Loss=1.980302
Traceback (most recent call last):
  File "main.py", line 390, in <module>
    main()
  File "main.py", line 342, in main
    outputs = model(images)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/paul/segmentation/DeepLabV3Plus-Pytorch/network/utils.py", line 16, in forward
    x = self.classifier(features)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/paul/segmentation/DeepLabV3Plus-Pytorch/network/_deeplab.py", line 49, in forward
    output_feature = self.aspp(feature['out'])
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/paul/segmentation/DeepLabV3Plus-Pytorch/network/_deeplab.py", line 160, in forward
    res.append(conv(x))
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/paul/segmentation/DeepLabV3Plus-Pytorch/network/_deeplab.py", line 130, in forward
    x = super(ASPPPooling, self).forward(x)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 178, in forward
    self.eps,
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/functional.py", line 2279, in batch_norm
    _verify_batch_size(input.size())
  File "/home/paul/segmentation/lib/python3.6/site-packages/torch/nn/functional.py", line 2247, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

What should I do to fix this error? Thank you for your help.
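A plausible cause, given the numbers in the log: 209 training images with --batch_size 16 leaves a final batch of one image (209 = 13 × 16 + 1), and a single image is too little for the BatchNorm layer that follows the global pooling. The sketch below only does that arithmetic; `batch_sizes` is a hypothetical helper, and whether main.py drops the last batch is not shown in this thread.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Sizes of the batches a loader would yield in one epoch."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


# Numbers from the log above: 209 training images, --batch_size 16.
last_kept = batch_sizes(209, 16)[-1]
last_dropped = batch_sizes(209, 16, drop_last=True)[-1]
print(last_kept, last_dropped)
```

If this is indeed the cause, choosing a batch size that does not leave a remainder of 1, or passing `drop_last=True` to the training `torch.utils.data.DataLoader`, should avoid the single-image batch.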

Training and Version

Hi, could you please give clear instructions on what to change in main.py and cityscapes.py if I want to train on my own dataset? Please also list the versions of the libraries you used.

question about test

@VainF, hi. I can train normally on the Cityscapes dataset, but the test results are obviously wrong. What could be the problem?

Error while loading DeepLabV3Plus-ResNet50 model from checkpoint with --separable_conv flag

It seems that the provided DeepLabV3Plus-ResNet50 model was not trained with the --separable_conv flag, because loading its weights with this flag active causes an error at the checkpoint loading stage.

Command:

python main.py --model deeplabv3plus_resnet50 --separable_conv --ckpt checkpoints/best_deeplabv3plus_resnet50_voc_os16.pth --test_only --save_val_results

Error:

RuntimeError: Error(s) in loading state_dict for DeepLabV3:
Missing key(s) in state_dict: "classifier.aspp.convs.1.0.body.0.weight", "classifier.aspp.convs.1.0.body.1.weight", "classifier.aspp.convs.2.0.body.0.weight", "classifier.aspp.convs.2.0.body.1.weight", "classifier.aspp.convs.3.0.body.0.weight", "classifier.aspp.convs.3.0.body.1.weight", "classifier.classifier.0.body.0.weight", "classifier.classifier.0.body.1.weight".
Unexpected key(s) in state_dict: "classifier.aspp.convs.1.0.weight", "classifier.aspp.convs.2.0.weight", "classifier.aspp.convs.3.0.weight", "classifier.classifier.0.weight".

If you still have the commands you used to train the models whose weights you shared, publishing them would make the pre-trained models easier to use.

I just wanted to note this point in case somebody else also experiences the same issue. Overall, the repo is really helpful. Thank you.
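To see such a mismatch at a glance, the two key sets can be compared directly. The sketch below is plain Python; `diff_state_dict_keys` is a hypothetical helper, and the example keys are copied from the error message above. With a real model, the inputs would presumably come from `model.state_dict().keys()` and the keys of the checkpoint's stored state dict.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, in the sense used by load_state_dict errors."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model expects, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)  # checkpoint has, model does not use
    return missing, unexpected


# Key names taken from the error message above.
model_keys = [
    "classifier.aspp.convs.1.0.body.0.weight",
    "classifier.aspp.convs.1.0.body.1.weight",
]
checkpoint_keys = ["classifier.aspp.convs.1.0.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

Here the model built with --separable_conv expects two weights per block (`body.0` and `body.1`), while the checkpoint stores one, which is consistent with the checkpoint having been saved from a model built without the flag.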

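About the training results

I trained on the Cityscapes dataset and the PR values look fine, but when I looked at the pre_img files generated in the results folder, the semantic segmentation was drawn all over the place. I then downloaded your model and checked its predictions on images, and those did not look right either. I don't know whether the way I draw the segmentation from the network output onto the original image is wrong, or whether I have overlooked something else. Could you provide sample code for drawing the network output back onto the original image? Thanks!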

import cv2
import numpy as np

from network import *
from PIL import Image
from torchvision.transforms.transforms import *
import torch

val_transform = Compose([
    # et.ExtResize(512),
    ToTensor(),
    Normalize(mean=[0.485, 0.456, 0.406],
              std=[0.229, 0.224, 0.225]),
])
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model_path = 'models_res/best_deeplabv3plus_mobilenet_cityscapes_os16.pth'
model = deeplabv3plus_mobilenet(num_classes=19, output_stride=16)

model.load_state_dict(torch.load(model_path)['model_state'])
model.to(device)
model.eval()

img_path = 'results/4_image.png'
image = Image.open(img_path).convert('RGB')
input = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
if __name__ == '__main__':
    cv2.namedWindow('img_draw', 0)

    model_dict = torch.load(model_path)
    test_input = val_transform(image).unsqueeze(dim=0)
    test_input = test_input.to(device)
    print('Input image:', test_input.size())
    output = model(test_input).cpu().detach().clone()
    print('Output:', output.size())

    # [1] selects the argmax index, i.e. one of the 19 output classes per pixel
    preds = output.max(dim=1)[1].cpu().numpy()
    print(preds)
    # nonzero() yields (batch, row, col) indices of the pixels predicted as class 5
    mask = (output.detach().max(dim=1)[1].cpu() == 5).nonzero()
    # drop the batch column, leaving (row, col) pairs
    mask = mask[..., 1:].numpy()
    print(mask)
    cv2.drawContours(input, [mask], -1, (0, 0, 255), -1)
    cv2.imshow('img_draw', input)
    cv2.waitKey(0)

how can i know miou score

Hello, I would like to know whether you started training from scratch without loading any weight and how many epochs you have trained

Low GPU utilization

| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:01:00.0 Off |                  N/A |
| 36%   64C    P2    93W / 250W |   5556MiB / 11018MiB |     38%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:02:00.0 Off |                  N/A |
| 37%   64C    P2   118W / 250W |   5486MiB / 11019MiB |     32%      Default |
I'm using two RTX 2080 Ti GPUs, and the average utilization is around 35%. I also tried the implementation in SMP, and the utilization is low there as well.
Has anyone else experienced this problem, and what might be the cause? Thanks.
I'm sure it's not caused by the data loader, because when I use U-Net or my own model, the utilization is always over 90%.

Cityscapes training on Full Res image

Hi,

Thanks for this wonderful repo!

I would like to ask you whether you have trained Cityscapes images on full resolution images using DeeplabV3 + Mobilenet architecture model you have provided in this repo?

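Input dimensions for the pretrained model

Hi, when using the ResNet-101 model pretrained on Cityscapes, what size should the images be preprocessed to before being fed into the network?
This is for my graduation project. Many thanks.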

Train DeeplabV3Plus-MobileNetV3 for Road Only Segmentation

Hi there,

I am trying to find a model that segments only the road, using this backbone:

'tf_mobilenetv3_small_075': {
        'imagenet': 'https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/tf_mobilenetv3_small_075-da427f52.pth'
    },

I need to run it on a Raspberry Pi with an OAK-D camera, so I'd like it to be reasonably fast on these edge devices.

Could you please provide this trained model, or help to show me how to do it?

Thanks,

Winston

Question about train

Hi

Thanks for your repo! I successfully trained the DeeplabV3Plus-Mobilenetv2 model on the Pascal VOC 2012 dataset, but my mIoU is only 69.41% (command: python3 main.py --model deeplabv3plus_mobilenet --separable_conv --gpu_id 0 --year 2012_aug --crop_val --lr 0.007 --crop_size 513 --batch_size 10 --output_stride 16). How can I improve it?
Another question: why does the experimental section of the MobileNetV2 paper report up to 75.70% mIoU? Was it because their model had been pretrained on COCO? I'm so confused...

Look forward to your answers!

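Testing question

What should the output of the newly added test script look like? When I run prediction with a model I trained myself, I get completely black images. Did something go wrong in my pipeline?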

Nice Repo!

This repo is really nice. The performance on Pascal VOC can be reproduced using 2 GPUs with a batch size of 16.

Cannot unzip the file DeepLabV3Plus-ResNet101

Hello,

I cannot unzip the file best_deeplabv3plus_resnet101_cityscapes_os16.pth.tar.

Is the file damaged, or which tool should I use to unzip it?

Thank you!

Best regards,
Yiru

IntermediateLayerGetter parameters

Hi,
thanks for this implementation.

Do you think it is safe to grab the parameters of the backbone after it has been passed through IntermediateLayerGetter, as done here?

It seems that calling backbone.parameters() retrieves only a few parameters rather than all of the backbone's parameters, as one would expect.
See here for an example using ResNet.
Thanks.

TypeError: the JSON object must be str, bytes or bytearray, not bool

I am working with the Cityscapes dataset, but training fails with an error like this:
Traceback (most recent call last):
  File "main.py", line 388, in <module>
    main()
  File "main.py", line 217, in main
    vis = Visualizer(port=opts.vis_port,
  File "I:\DeepLabV3Plus-Pytorch-master\utils\visualizer.py", line 14, in __init__
    ori_win = json.loads(ori_win)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 341, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not bool
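The traceback shows `json.loads` receiving a bool instead of JSON text. One possible reading, which is an assumption and not confirmed in this thread, is that the visualization server returned no window data. A defensive parse could look like the sketch below; `parse_window_data` is a hypothetical name, not a function from the repository.

```python
import json


def parse_window_data(raw):
    """Parse JSON text, returning None when raw is not str/bytes (for example a bool)."""
    if not isinstance(raw, (str, bytes, bytearray)):
        return None
    return json.loads(raw)


print(parse_window_data(False))
print(parse_window_data('{"title": "loss"}'))
```

A caller would then treat `None` as "no existing window" rather than crashing. Checking that the visualization server is running on the configured port before training, or leaving visualization disabled, sidesteps the call altogether.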

Questions about evaluating cityscapes dataset

Thanks for your great work. I am wondering how you evaluate on the Cityscapes dataset. After reading your code, it seems that you trained the model with an input size of 512x512 and evaluate directly at the original image size (1024 x 2048):

if opts.crop_val:
    val_transform = et.ExtCompose([
        et.ExtResize(opts.crop_size),      # resize first (this is not a random crop)
        et.ExtCenterCrop(opts.crop_size),  # then center-crop to crop_size
        et.ExtToTensor(),
        et.ExtNormalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225]),
    ])
else:
    val_transform = et.ExtCompose([
        et.ExtToTensor(),
        et.ExtNormalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225]),
    ])

Why is the same model evaluated on a different input image size? Thanks.

Is separable_conv better than standard conv?

Hello!
I trained MobileNet-DeepLabV3+ both without and with --separable_conv enabled,
and found that the latter is better than the former by 1.8% mIoU.
Could you share your results? My guess is that the improvement comes from reduced overfitting.
Thanks a lot!

Question about the training data

Hi all, I recently ran into a problem that really confuses me. I'm using DeepLabV3+ to train a 5-class segmentation model covering forest, ground, sky, runway asphalt, and runway lane, with 3100 images and their corresponding labels. By mistake, I swapped label indices 1 and 2 from the 1706th label onward and trained the network on that data. Unexpectedly, the segmentation turned out better than before. After I fixed the mistake and restored the correct label indices, the results were bad. Do you know what could cause this? Thank you in advance.

Failed to reproduce the results on VOC 2012 dataset

Hi, VainF. Thanks for sharing this nice repo, where the code has great readability and practicability. However, I failed to reproduce the results on the voc dataset.

I trained deeplabv3plus-resnet101 (os 16, with the provided pre-trained weights for ResNet101) on the VOC 2012_aug dataset with all default settings, except that I changed gpu_id to '0,1,2,3' because I couldn't train the model on a single 2080 Ti GPU with a batch size of 16. I also applied SyncBN (https://github.com/vacancy/Synchronized-BatchNorm-PyTorch) to avoid the performance drop caused by multi-GPU training. The best mIoU was 0.7539. I then asked a friend to train the model on a TITAN RTX GPU without SyncBN, and his best mIoU was 0.7535. Therefore I think multi-GPU training is fine with SyncBN.

Did you further apply multi-scale inference for the validation? Do I need to change some settings to achieve 0.783 on VOC 2012_aug dataset?

Looking forward to your reply and suggestions. Thank you again for your effort.

Pre-training model

Hello author, could you provide a download link for the pre-trained models that is accessible in China, such as Baidu Cloud?
