openseg-group / openseg.pytorch
The official PyTorch implementation of OCNet series and SegFix.
License: MIT License
Hi,
I cannot download this paper. Could you share it in PDF format, or provide a link? Thanks a lot.
Dear Authors,
Could you clarify the difference between the "HRNet-OCR" at (1. https://github.com/openseg-group/openseg.pytorch)
and the "HRNet-OCR" at (2. https://github.com/HRNet/HRNet-Semantic-Segmentation/tree/HRNet-OCR)?
Many thanks!
In loss_helper.py
During the loss computation, the input consists of two tensors of shape [1, 8, 128, 128] and [1, 2, 128, 128], while the corresponding label for a single sample is a list of three tensors, each of shape [1, 512, 512]. The following line then fails:
targets=targets_.clone().unsqueeze(1).float()
AttributeError:'list' object has no attribute 'clone'
Could you please improve the documentation on how to use the SegFix method? I'm a little confused about how to generate the offset files.
Is the released model the same as the one that achieved 84.5 mIoU?
I was excited to try SegFix training on my own data.
I was able to produce the .mat files for the train and val data.
Training works with run_h_48_d_4_segfix.sh and the loss converges, but on validation the IoU is more or less random (I have 2 classes):
2020-08-20 10:47:41,932 INFO [base.py, 32] Result for mask
2020-08-20 10:47:41,932 INFO [base.py, 48] Mean IOU: 0.7853758111568029
2020-08-20 10:47:41,933 INFO [base.py, 49] Pixel ACC: 0.9692584678389714
2020-08-20 10:47:41,933 INFO [base.py, 54] F1 Score: 0.7523384841507573 Precision: 0.7928424176432377 Recall: 0.7157718538603068
2020-08-20 10:47:41,933 INFO [base.py, 32] Result for dir (mask)
2020-08-20 10:47:41,933 INFO [base.py, 48] Mean IOU: 0.5390945167184129
2020-08-20 10:47:41,933 INFO [base.py, 49] Pixel ACC: 0.7248566725097775
2020-08-20 10:47:41,933 INFO [base.py, 32] Result for dir (GT)
2020-08-20 10:47:41,934 INFO [base.py, 48] Mean IOU: 0.41990305666871003
2020-08-20 10:47:41,934 INFO [base.py, 49] Pixel ACC: 0.6007717101395131
To investigate the issue further, I tried to analyse the predicted .mat files with
bash scripts/cityscapes/segfix/run_h_48_d_4_segfix.sh segfix_pred_val 1
With "input_size": [640, 480], this exception occurs:
File "/home/rsa-key-20190908/openseg.pytorch/lib/datasets/tools/collate.py", line 108, in collate
assert pad_height >= 0 and pad_width >= 0
After more or less fixing it, I got results similar to those from validation during training.
The predicted .mat files were around 3 KB instead of ~70 KB.
By the way, it took the "input_size": [640, 480] setting from the "test" block instead of the "val" block.
Is it possible that validation only works with "input_size": [2048, 1024]?
Can you give me any hints on how to manually verify the .mat files for correctness? I am currently working through 2007.04269.pdf and the code of dt_offset_generator.py to get an understanding.
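The assertion in collate.py suggests that the collate step pads each image up to a target size and fails when an image is already larger than that target. Below is a minimal sketch of such a check, assuming pad-only alignment and (width, height) ordering for "input_size"; `pad_amounts` is a hypothetical helper for illustration, not the repository's function.

```python
def pad_amounts(image_size, target_size):
    """Return (pad_width, pad_height) needed to grow an image to target_size.

    Both sizes are (width, height) tuples. Hypothetical helper for illustration.
    """
    img_w, img_h = image_size
    tgt_w, tgt_h = target_size
    pad_w = tgt_w - img_w
    pad_h = tgt_h - img_h
    if pad_w < 0 or pad_h < 0:
        # Padding can only enlarge an image, never shrink it.
        raise ValueError(
            f"image {image_size} is larger than target {target_size}; "
            "pad-only alignment cannot shrink it"
        )
    return pad_w, pad_h


# An image that already matches the target needs no padding.
print(pad_amounts((2048, 1024), (2048, 1024)))

# The same image cannot be padded "up" to a smaller 640x480 target.
try:
    pad_amounts((2048, 1024), (640, 480))
except ValueError as err:
    print("would trip the assertion:", err)
```

If this is what happens here, either the images need to be resized or cropped before collation, or "input_size" has to be at least as large as the largest image in the batch.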
Thanks for sharing this wonderful work with us!
I have a problem with the computing of similarity map in the OCR module.
At line 131 of lib/models/seg_hrnet_ocr.py:
sim_map = (self.key_channels**-.5) * sim_map
Why multiply sim_map by a small value (self.key_channels**-.5) before the softmax?
During validation, I printed the final sim_map and found that all values in it are very close to 0.0526 (which equals 1/19), meaning that the probabilities of a pixel i belonging to the different classes k are almost equal.
Doesn't this contradict the assumption that the similarity map should represent the relation between the i-th pixel and the k-th object region?
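To see how the scaling factor interacts with the softmax, here is a small pure-Python sketch. The logits are made-up numbers and `key_channels = 256` is an assumed value; only the 19 classes come from the question above. When the raw similarities are small, multiplying them by `key_channels ** -0.5` pushes the softmax output toward the uniform value 1/19 ≈ 0.0526.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


num_classes = 19
key_channels = 256               # assumed value, for illustration only
scale = key_channels ** -0.5     # 1/16 for 256 channels

# Made-up similarities between one pixel and each of the 19 regions.
logits = [0.1 * k for k in range(num_classes)]

unscaled = softmax(logits)
scaled = softmax([scale * v for v in logits])

print("uniform value:  ", 1 / num_classes)
print("spread unscaled:", max(unscaled) - min(unscaled))
print("spread scaled:  ", max(scaled) - min(scaled))
```

The scaled distribution is much flatter than the unscaled one, so near-uniform rows in sim_map may simply reflect small raw similarities rather than a bug.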
#######################
Your former answer:
Multiplying the small value is following the original self-attention scheme. Please refer to the last paragraph of 3.2.1 in the paper "Attention Is All You Need". However, we find this small factor does not influence the segmentation performance.
As for the final result of the sim_map, we do not understand why all the values are almost the same in your case. Which checkpoint are you testing? How does that checkpoint perform? Please provide more information so that we can help you.
#########################
Thanks a lot for your reply!
I used the checkpoint posted on HRNet-OCR. The segmentation performance is good, and the mIoU is 81.6 as well.
During inference, I printed 10 random rows of the sim_map, like below:
All values in this map are very close to 0.0526 (equals to 1/19).
Why not release SegFix weights pretrained on the ADE20K dataset?
I can't find them on the MODEL_ZOO page.
Hi, thanks for releasing the code. The first thing I did was try it myself!
Everything worked very well: I successfully trained and validated with my own images and label files.
What I don't understand is how to generate my own *.mat files to run SegFix. You provided .mat files only for Cityscapes, so how can they be generated for a custom dataset?
When starting validation with SegFix (scripts/cityscapes/hrnet/run_h_48_d_4_ocr.sh segfix 3 val), I receive:
FileNotFoundError: [Errno 2] No such file or directory: openseg.pytorch/data/cityscapes/val/offset_pred/semantic/offset_hrnext/is-03-08-2019-normal-98-001089.mat'
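The error indicates that the SegFix step expects one offset .mat file per validation image. Below is a minimal sketch for listing which offsets are missing before launching the run. The `offset_pred/semantic/offset_hrnext` sub-path is taken from the error message, while the `image` folder name and the `missing_offsets` helper are assumptions for illustration.

```python
from pathlib import Path


def missing_offsets(split_dir, offset_subdir="offset_pred/semantic/offset_hrnext"):
    """List the offset .mat files that have no counterpart on disk.

    Assumes images live in <split_dir>/image (an assumed layout) and that each
    image <name>.<ext> needs <split_dir>/<offset_subdir>/<name>.mat.
    """
    split_dir = Path(split_dir)
    image_dir = split_dir / "image"
    offset_dir = split_dir / offset_subdir
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if not image_path.is_file():
            continue
        expected = offset_dir / (image_path.stem + ".mat")
        if not expected.exists():
            missing.append(expected.name)
    return missing


if __name__ == "__main__":
    split = Path("data/cityscapes/val")
    if (split / "image").is_dir():
        print(missing_offsets(split))
```

An empty list would mean every image has an offset file; any names returned are the ones SegFix would fail on.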
Hi, I noticed that the implementation in spatial_ocr_block.py differs from that of your HRNet+OCR in the submodules f_object, f_pixel, f_down, f_up and so on. In this implementation, all convolution layers in those submodules have a bias, while in your HRNet+OCR implementation the 'bias' option is set to 'False' for all of them. What is the motivation for, or the effect of, this difference?
Dear Author,
I am trying to use the pretrained models (ResNet-101 or HRNet-W48 backbones) in my work, but similar errors are reported for both backbones.
checkpoint names:
checkpoints/cityscapes/hrnet_w48_ocr_1_latest.pth
checkpoints/cityscapes/spatial_ocrnet_deepbase_resnet101_dilated8_1_latest.pth
commands:
(for HRNet-W48:)
python -u main.py --configs configs/cityscapes/H_48_D_4.json --drop_last y --backbone hrnet48 --model_name hrnet_w48_ocr --checkpoints_name hrnet_w48_ocr_1 --phase test --gpu 0 --resume ./checkpoints/cityscapes/hrnet_w48_ocr_1_latest.pth --loss_type fs_auxce_loss --test_dir input_images --out_dir output_images
(for ResNet101:)
python -u main.py --configs configs/cityscapes/R_101_D_8.json --drop_last y --backbone deepbase_resnet101_dilated8 --model_name spatial_ocrnet --checkpoints_name spatial_ocrnet_deepbase_resnet101_dilated8_1 --phase test --gpu 0 --resume ./checkpoints/cityscapes/spatial_ocrnet_deepbase_resnet101_dilated8_1_latest.pth --loss_type fs_auxce_loss --test_dir input_images --out_dir output_images
environments:
python 3.7.3 h33d41f4_1 conda-forge
pytorch 1.1.0 py3.7_cuda10.0.130_cudnn7.5.1_0 PyTorch
torchcontrib 0.0.2
torchvision 0.3.0 py37_cu10.0.130_1 pytorch
gcc (GCC) 7.2.0
cuda 10.0
Error messages:
RuntimeError: unexpected key in source state_dict: conv_3x3.1.weight, conv_3x3.1.bias, conv_3x3.1.running_mean, conv_3x3.1.running_var, spatial_ocr_head.object_context_block.f_pixel.1.weight, spatial_ocr_head.object_context_block.f_pixel.1.bias, spatial_ocr_head.object_context_block.f_pixel.1.running_mean, spatial_ocr_head.object_context_block.f_pixel.1.running_var, spatial_ocr_head.object_context_block.f_pixel.3.weight, spatial_ocr_head.object_context_block.f_pixel.3.bias, spatial_ocr_head.object_context_block.f_pixel.3.running_mean, spatial_ocr_head.object_context_block.f_pixel.3.running_var, spatial_ocr_head.object_context_block.f_object.1.weight, spatial_ocr_head.object_context_block.f_object.1.bias, spatial_ocr_head.object_context_block.f_object.1.running_mean, spatial_ocr_head.object_context_block.f_object.1.running_var, spatial_ocr_head.object_context_block.f_object.3.weight, spatial_ocr_head.object_context_block.f_object.3.bias, spatial_ocr_head.object_context_block.f_object.3.running_mean, spatial_ocr_head.object_context_block.f_object.3.running_var, spatial_ocr_head.object_context_block.f_down.1.weight, spatial_ocr_head.object_context_block.f_down.1.bias, spatial_ocr_head.object_context_block.f_down.1.running_mean, spatial_ocr_head.object_context_block.f_down.1.running_var, spatial_ocr_head.object_context_block.f_up.1.weight, spatial_ocr_head.object_context_block.f_up.1.bias, spatial_ocr_head.object_context_block.f_up.1.running_mean, spatial_ocr_head.object_context_block.f_up.1.running_var, spatial_ocr_head.conv_bn_dropout.1.weight, spatial_ocr_head.conv_bn_dropout.1.bias, spatial_ocr_head.conv_bn_dropout.1.running_mean, spatial_ocr_head.conv_bn_dropout.1.running_var, dsn_head.1.weight, dsn_head.1.bias, dsn_head.1.running_mean, dsn_head.1.running_var
It looks like the checkpoint and the running model do not match at some layers. Could you take a look, please? Thank you very much!
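A quick way to narrow down this kind of mismatch is to diff the parameter names stored in the checkpoint against those the model expects. The sketch below works on plain lists of key names so that it runs without PyTorch; in practice the names would come from the loaded checkpoint's state dict and from `model.state_dict().keys()`. The `module.` prefix handling is an assumption about checkpoints saved from a `DataParallel` wrapper, and the model-side key names in the example are made up.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys, prefix="module."):
    """Return (unexpected, missing) key names after stripping an optional prefix.

    unexpected: present in the checkpoint but not in the model.
    missing:    expected by the model but absent from the checkpoint.
    """
    def strip(name):
        return name[len(prefix):] if name.startswith(prefix) else name

    ckpt = {strip(k) for k in checkpoint_keys}
    model = {strip(k) for k in model_keys}
    return sorted(ckpt - model), sorted(model - ckpt)


# Checkpoint names follow the error message; model names are hypothetical.
checkpoint_keys = [
    "module.conv_3x3.0.weight",
    "module.conv_3x3.1.weight",
    "module.conv_3x3.1.running_mean",
]
model_keys = [
    "conv_3x3.0.weight",
    "conv_3x3.1.0.weight",        # hypothetical name from a different BN wrapper
    "conv_3x3.1.0.running_mean",  # hypothetical
]

unexpected, missing = diff_state_dict_keys(checkpoint_keys, model_keys)
print("unexpected in checkpoint:", unexpected)
print("missing from checkpoint: ", missing)
```

If every unexpected key has a near-identical missing counterpart, the two sides probably differ only in how the normalization layers are wrapped, which points to a configuration mismatch (for example the batch-norm type) rather than a wrong checkpoint.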
Hello,
requirements.txt recommends the following versions:
torch==0.4.1
torchvision==0.2.1
Are newer versions of PyTorch with CUDA 10 support also supported?
The .pth files don't match the .sh scripts and raise a RuntimeError in load_state_dict. For example,
ocr/Cityscapes/hrnet_w48_ocr_1_latest.pth does not match checkpoints/cityscapes,
and raises a RuntimeError in load_state_dict at segmentor/tools/module_runner.py#L156:
https://github.com/openseg-group/openseg.pytorch/blob/master/segmentor/tools/module_runner.py#L156
Which is required for LIP/R_101_D_16.json
SegFix can only be used with Cityscapes, is that right? My own dataset does not have the *.mat offset files.
As the title.
And any config details or suggestions about pretraining on Mapillary will be appreciated!
Thanks!
Could you please improve the documentation on how to use the library with a pre-trained model?
I would like to use it on my own dataset if possible.
Thanks
I get this RuntimeError while the program is running: "Ninja is required to load C++ extensions". Could you please help me solve it?
Thanks a lot!
P.S. I have installed ninja, and its version is 1.10.1.
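Since Ninja is reported as installed, one thing worth checking is whether the `ninja` executable is visible to the Python process that launches training (for example, when a different conda environment or a different `PATH` is active). Below is a minimal standard-library check; `find_ninja` is a hypothetical helper, not part of the repository.

```python
import shutil
import subprocess


def find_ninja():
    """Return (path, version) of the ninja executable seen by this process.

    Returns (None, None) if ninja is not on PATH, and (path, None) if it is
    found but cannot be executed.
    """
    path = shutil.which("ninja")
    if path is None:
        return None, None
    try:
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return path, None
    return path, result.stdout.strip()


path, version = find_ninja()
if path is None:
    print("ninja is not on PATH for this Python process")
else:
    print("ninja found at", path, "version", version)
```

Running this with the same interpreter and shell used for training shows whether the installed Ninja is actually reachable from that environment.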
Hi!
Thanks for your nice work. It is really impressive. I'm interested in the SegFix algorithm.
Could you send a copy of the paper "SegFix: Model-Agnostic Boundary Refinement for Segmentation"? I cannot find it on arXiv.
Best,
David
I ran into an OOM problem during validation in the training phase.
Here is the log:
World size: 4
['--configs', 'configs/cityscapes/R_101_D_8.json', '--drop_last', 'y', '--phase', 'train', '--gathered', 'n', '--loss_balance', 'y', '--log_to_file', 'n', '--backbone', 'deepbase_resnet101_dilated8', '--model_name', 'base_ocnet', '--gpu', '0', '1', '2', '3', '--distributed', '--data_dir', './dataset/cityscapes', '--loss_type', 'fs_auxce_loss', '--max_iters', '40000', '--checkpoints_name', 'base_ocnet_deepbase_resnet101_dilated8_20201029', '--pretrained', './pretrained_model/resnet101-imagenet.pth']
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
2020-10-30 23:23:59,797 INFO [offset_helper.py, 54] engery/max-distance: 5 engery/min-distance: 0
2020-10-30 23:23:59,797 INFO [offset_helper.py, 61] direction/num_classes: 8 scale: 1
2020-10-30 23:23:59,797 INFO [offset_helper.py, 66] c4 align axis: False
2020-10-30 23:23:59,800 INFO [offset_helper.py, 54] engery/max-distance: 5 engery/min-distance: 0
2020-10-30 23:23:59,800 INFO [offset_helper.py, 61] direction/num_classes: 8 scale: 1
2020-10-30 23:23:59,800 INFO [offset_helper.py, 66] c4 align axis: False
2020-10-30 23:23:59,808 INFO [module_runner.py, 44] BN Type is inplace_abn.
2020-10-30 23:23:59,808 INFO [init.py, 17] Using evaluator: StandardEvaluator
2020-10-30 23:23:59,811 INFO [module_runner.py, 44] BN Type is inplace_abn.
2020-10-30 23:23:59,811 INFO [init.py, 17] Using evaluator: StandardEvaluator
2020-10-30 23:24:00,360 INFO [module_helper.py, 136] Loading pretrained model:./pretrained_model/resnet101-imagenet.pth
2020-10-30 23:24:00,361 INFO [module_helper.py, 136] Loading pretrained model:./pretrained_model/resnet101-imagenet.pth
2020-10-30 23:24:00,752 INFO [offset_helper.py, 54] engery/max-distance: 5 engery/min-distance: 0
2020-10-30 23:24:00,752 INFO [offset_helper.py, 61] direction/num_classes: 8 scale: 1
2020-10-30 23:24:00,752 INFO [offset_helper.py, 66] c4 align axis: False
2020-10-30 23:24:00,759 INFO [offset_helper.py, 54] engery/max-distance: 5 engery/min-distance: 0
2020-10-30 23:24:00,760 INFO [offset_helper.py, 61] direction/num_classes: 8 scale: 1
2020-10-30 23:24:00,760 INFO [offset_helper.py, 66] c4 align axis: False
2020-10-30 23:24:00,763 INFO [module_runner.py, 44] BN Type is inplace_abn.
2020-10-30 23:24:00,763 INFO [init.py, 17] Using evaluator: StandardEvaluator
2020-10-30 23:24:00,771 INFO [module_runner.py, 44] BN Type is inplace_abn.
2020-10-30 23:24:00,771 INFO [init.py, 17] Using evaluator: StandardEvaluator
2020-10-30 23:24:01,344 INFO [module_helper.py, 136] Loading pretrained model:./pretrained_model/resnet101-imagenet.pth
2020-10-30 23:24:01,349 INFO [module_helper.py, 136] Loading pretrained model:./pretrained_model/resnet101-imagenet.pth
2020-10-30 23:24:06,815 INFO [trainer.py, 78] Params Group Method: None
2020-10-30 23:24:06,816 INFO [trainer.py, 78] Params Group Method: None
2020-10-30 23:24:06,816 INFO [trainer.py, 78] Params Group Method: None
2020-10-30 23:24:06,816 INFO [trainer.py, 78] Params Group Method: None
2020-10-30 23:24:06,817 INFO [optim_scheduler.py, 66] Use lambda_poly policy with default power 0.9
2020-10-30 23:24:06,817 INFO [data_loader.py, 131] use the DefaultLoader for train...
2020-10-30 23:24:06,818 INFO [optim_scheduler.py, 66] Use lambda_poly policy with default power 0.9
2020-10-30 23:24:06,818 INFO [optim_scheduler.py, 66] Use lambda_poly policy with default power 0.9
2020-10-30 23:24:06,818 INFO [data_loader.py, 131] use the DefaultLoader for train...
2020-10-30 23:24:06,818 INFO [optim_scheduler.py, 66] Use lambda_poly policy with default power 0.9
2020-10-30 23:24:06,818 INFO [data_loader.py, 131] use the DefaultLoader for train...
2020-10-30 23:24:06,818 INFO [data_loader.py, 131] use the DefaultLoader for train...
2020-10-30 23:24:06,855 INFO [data_loader.py, 164] use DefaultLoader for val ...
2020-10-30 23:24:06,855 INFO [data_loader.py, 164] use DefaultLoader for val ...
2020-10-30 23:24:06,856 INFO [data_loader.py, 164] use DefaultLoader for val ...
2020-10-30 23:24:06,857 INFO [data_loader.py, 164] use DefaultLoader for val ...
2020-10-30 23:24:06,861 INFO [loss_manager.py, 54] use loss: fs_auxce_loss.
2020-10-30 23:24:06,861 INFO [loss_manager.py, 54] use loss: fs_auxce_loss.
2020-10-30 23:24:06,861 INFO [loss_manager.py, 39] use distributed loss
2020-10-30 23:24:06,861 INFO [loss_manager.py, 39] use distributed loss
2020-10-30 23:24:06,863 INFO [loss_manager.py, 54] use loss: fs_auxce_loss.
2020-10-30 23:24:06,863 INFO [loss_manager.py, 54] use loss: fs_auxce_loss.
2020-10-30 23:24:06,863 INFO [loss_manager.py, 39] use distributed loss
2020-10-30 23:24:06,863 INFO [loss_manager.py, 39] use distributed loss
2020-10-30 23:24:07,060 INFO [data_helper.py, 119] Input keys: ['img']
2020-10-30 23:24:07,061 INFO [data_helper.py, 120] Target keys: ['labelmap']
2020-10-30 23:24:07,115 INFO [data_helper.py, 119] Input keys: ['img']
2020-10-30 23:24:07,115 INFO [data_helper.py, 120] Target keys: ['labelmap']
2020-10-30 23:24:07,117 INFO [data_helper.py, 119] Input keys: ['img']
2020-10-30 23:24:07,117 INFO [data_helper.py, 120] Target keys: ['labelmap']
2020-10-30 23:24:07,126 INFO [data_helper.py, 119] Input keys: ['img']
2020-10-30 23:24:07,126 INFO [data_helper.py, 120] Target keys: ['labelmap']
2020-10-30 23:24:18,010 INFO [trainer.py, 219] Train Epoch: 0 Train Iteration: 10 Time 11.147s / 10iters, (1.115) Forward Time 3.996s / 10iters, (0.400) Backward Time 6.918s / 10iters, (0.692) Loss Time 0.029s / 10iters, (0.003) Data load 0.203s / 10iters, (0.020318)
Learning rate = [0.00999797497721687, 0.00999797497721687] Loss = 3.54437590 (ave = 3.62284539)
2020-10-30 23:24:18,808 INFO [trainer.py, 259] 0 images processed
2020-10-30 23:24:19,588 INFO [trainer.py, 259] 0 images processed
2020-10-30 23:24:19,825 INFO [trainer.py, 259] 0 images processed
2020-10-30 23:24:19,840 INFO [trainer.py, 259] 0 images processed
Traceback (most recent call last):
File "/home/kururu/github/openseg.pytorch/main.py", line 227, in
model.train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 365, in train
self.__train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 240, in __train
self.__val()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 308, in __val
outputs = self.seg_net(*inputs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/nets/ocnet.py", line 58, in forward
x = self.oc_module(x)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in forward
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 89, in forward
sim_map = (self.key_channels**-.5) * sim_map
RuntimeError: CUDA out of memory. Tried to allocate 4.10 GiB (GPU 0; 10.91 GiB total capacity; 5.16 GiB already allocated; 3.28 GiB free; 229.69 MiB cached)
Traceback (most recent call last):
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/distributed/launch.py", line 253, in
main()
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/distributed/launch.py", line 249, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/kururu/anaconda3/envs/kururudev-torch-1-0/bin/python', '-u', '/home/kururu/github/openseg.pytorch/main.py', '--local_rank=3', '--configs', 'configs/cityscapes/R_101_D_8.json', '--drop_last', 'y', '--phase', 'train', '--gathered', 'n', '--loss_balance', 'y', '--log_to_file', 'n', '--backbone', 'deepbase_resnet101_dilated8', '--model_name', 'base_ocnet', '--gpu', '0', '1', '2', '3', '--distributed', '--data_dir', './dataset/cityscapes', '--loss_type', 'fs_auxce_loss', '--max_iters', '40000', '--checkpoints_name', 'base_ocnet_deepbase_resnet101_dilated8_20201029', '--pretrained', './pretrained_model/resnet101-imagenet.pth']' returned non-zero exit status 1.
Traceback (most recent call last):
File "main.py", line 178, in
handle_distributed(args_parser, os.path.expanduser(os.path.abspath(__file__)))
File "/home/kururu/github/openseg.pytorch/lib/utils/distributed.py", line 56, in handle_distributed
cmd=command_args)
subprocess.CalledProcessError: Command '['/home/kururu/anaconda3/envs/kururudev-torch-1-0/bin/python', '-u', '-m', 'torch.distributed.launch', '--nproc_per_node', '4', '/home/kururu/github/openseg.pytorch/main.py', '--configs', 'configs/cityscapes/R_101_D_8.json', '--drop_last', 'y', '--phase', 'train', '--gathered', 'n', '--loss_balance', 'y', '--log_to_file', 'n', '--backbone', 'deepbase_resnet101_dilated8', '--model_name', 'base_ocnet', '--gpu', '0', '1', '2', '3', '--distributed', '--data_dir', './dataset/cityscapes', '--loss_type', 'fs_auxce_loss', '--max_iters', '40000', '--checkpoints_name', 'base_ocnet_deepbase_resnet101_dilated8_20201029', '--pretrained', './pretrained_model/resnet101-imagenet.pth']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/home/kururu/github/openseg.pytorch/main.py", line 227, in
model.train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 365, in train
self.__train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 240, in __train
self.__val()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 308, in __val
outputs = self.seg_net(*inputs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/nets/ocnet.py", line 58, in forward
x = self.oc_module(x)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in forward
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 89, in forward
sim_map = (self.key_channels**-.5) * sim_map
RuntimeError: CUDA out of memory. Tried to allocate 4.10 GiB (GPU 3; 10.92 GiB total capacity; 5.46 GiB already allocated; 1019.50 MiB free; 3.89 GiB cached)
Traceback (most recent call last):
File "/home/kururu/github/openseg.pytorch/main.py", line 227, in
model.train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 365, in train
self.__train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 240, in __train
self.__val()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 308, in __val
outputs = self.seg_net(*inputs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/nets/ocnet.py", line 58, in forward
x = self.oc_module(x)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in forward
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 89, in forward
sim_map = (self.key_channels**-.5) * sim_map
RuntimeError: CUDA out of memory. Tried to allocate 4.10 GiB (GPU 2; 10.92 GiB total capacity; 5.46 GiB already allocated; 1023.50 MiB free; 3.89 GiB cached)
Traceback (most recent call last):
File "/home/kururu/github/openseg.pytorch/main.py", line 227, in
model.train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 365, in train
self.__train()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 240, in __train
self.__val()
File "/home/kururu/github/openseg.pytorch/segmentor/trainer.py", line 308, in __val
outputs = self.seg_net(*inputs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/nets/ocnet.py", line 58, in forward
x = self.oc_module(x)
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in forward
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 153, in
priors = [stage(feats) for stage in self.stages]
File "/home/kururu/anaconda3/envs/kururudev-torch-1-0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/kururu/github/openseg.pytorch/lib/models/modules/base_oc_block.py", line 89, in forward
sim_map = (self.key_channels**-.5) * sim_map
RuntimeError: CUDA out of memory. Tried to allocate 4.10 GiB (GPU 1; 10.92 GiB total capacity; 5.46 GiB already allocated; 523.50 MiB free; 4.38 GiB cached)
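The failing line operates on the similarity map of a dense self-attention block, whose size grows with the square of the number of spatial positions. Below is a back-of-the-envelope sketch, assuming full-resolution 1024×2048 validation images, an output stride of 8 and float32 storage; none of these values are stated in the log, and `attention_map_gib` is a hypothetical helper.

```python
def attention_map_gib(height, width, stride=8, batch=1, bytes_per_value=4):
    """Memory in GiB of one dense (positions x positions) similarity map."""
    positions = (height // stride) * (width // stride)
    return batch * positions * positions * bytes_per_value / 2**30


# Full-resolution image: 128 * 256 = 32768 positions.
print(attention_map_gib(1024, 2048))

# A 512x1024 crop: 64 * 128 = 8192 positions.
print(attention_map_gib(512, 1024))
```

Under these assumptions a single map needs about 4 GiB for the full image and 0.25 GiB for the crop. The first figure is of the same order as the 4.10 GiB allocation in the log, which would explain why validation at full resolution fails on an 11 GiB card while training on smaller crops does not.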
probs = F.softmax(self.scale * probs, dim=2)# batch x k x hw
Quick question: should dim be 1 or 2? In my understanding, k represents the number of object classes. Maybe I have misunderstood some details of the proposed method.
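For a tensor laid out as batch x k x hw, `dim=2` normalizes over the hw spatial positions separately for each of the k classes, whereas `dim=1` would normalize over the k classes at each position. A small pure-Python sketch on a single batch element with made-up scores (k = 2, hw = 3) shows the difference:

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# One batch element, shape k x hw = 2 x 3 (made-up scores).
scores = [
    [1.0, 2.0, 3.0],   # class 0 scores at the 3 positions
    [1.0, 1.0, 1.0],   # class 1 scores at the 3 positions
]

# Equivalent of dim=2: normalize across positions, one row per class.
over_hw = [softmax(row) for row in scores]

# Equivalent of dim=1: normalize across classes, one column per position.
columns = [softmax(list(col)) for col in zip(*scores)]
over_k = [list(row) for row in zip(*columns)]

print("rows sum to 1 (dim=2):   ", [round(sum(r), 6) for r in over_hw])
print("columns sum to 1 (dim=1):", [round(sum(c), 6) for c in zip(*over_k)])
```

With `dim=2`, each class gets a spatial distribution that sums to one, which is the form needed to pool pixel features into a single weighted average per class.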
Hello,
Thanks for making the code and the pre-trained models available!
I would like to know how to reproduce your results on the Cityscapes test set (mIoU 84.5/84.2 with/without SegFix) from the pre-trained model you provided.
Should I take the 80000-iteration OCR HRNet-W48 weights that you listed in the model zoo?
Thank you in advance for your response.
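Hello, judging from the code, going through the long-range and short-range steps makes the image size in each step smaller, but the batch size becomes larger. So how does this reduce the amount of computation and the number of parameters? I really don't understand this point; could you please explain?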
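The cost of self-attention grows quadratically with the number of positions attended over but only linearly with the batch size, so trading one long sequence for a larger batch of short ones reduces the total work. Below is a rough count of pairwise similarity computations, assuming N positions are split into P groups of Q positions each (N = P * Q) and ignoring channel dimensions. This is an illustrative estimate with hypothetical helper names, not the authors' analysis.

```python
def dense_pairs(n):
    """Pairwise similarities for one attention pass over n positions."""
    return n * n


def two_step_pairs(p, q):
    """Pairwise similarities for two sparse passes over p * q positions.

    Step 1: q independent attention problems, each over p positions.
    Step 2: p independent attention problems, each over q positions.
    The q (or p) problems are what shows up as a larger batch dimension.
    """
    return q * p * p + p * q * q


# Example: a 128 x 128 feature map split into 128 groups of 128 positions.
p, q = 128, 128
n = p * q

dense = dense_pairs(n)
sparse = two_step_pairs(p, q)
print("dense:   ", dense)
print("two-step:", sparse)
print("ratio:   ", dense // sparse)
```

In this example the two-step scheme computes 64 times fewer similarities, even though the effective batch size grew by a factor of 128 in each step. The number of learned parameters does not depend on the spatial size in either case.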
For comparison in our paper, we are looking for the detailed test set results (class IoUs) of these prediction files that you shared: https://drive.google.com/drive/folders/156vMABydr7btdPDBU6b9J-e0jJHuPI73
Do you happen to have a snapshot of the submission results obtained with these predictions?
Thank you for your consideration.
Hello, I would like to call SegFix directly to refine my results. Which program should I run?
Looking forward to your reply. Thank you!
Dear Author,
Thank you for your excellent work, but some errors are reported for backbones.
checkpoint names:
checkpoints/cityscapes/hrnet_w48_ocr_1_latest.pth
commands:
(for HRNet-W48:)
python -u main.py --configs configs/cityscapes/H_48_D_4.json --drop_last y --backbone hrnet48 --model_name hrnet_w48_ocr --checkpoints_name hrnet_w48_ocr_1 --phase test --gpu 0 --resume ./checkpoints/cityscapes/hrnet_w48_ocr_1_latest.pth --loss_type fs_auxce_loss --test_dir input_images --out_dir output_images
Error messages:
2020-07-15 21:00:10,470 INFO [module_runner.py, 44] BN Type is inplace_abn.
Traceback (most recent call last):
File "main.py", line 214, in
model = Tester(configer)
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/segmentor/tester.py", line 69, in init
self._init_model()
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/segmentor/tester.py", line 72, in _init_model
self.seg_net = self.model_manager.semantic_segmentor()
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/lib/models/model_manager.py", line 81, in semantic_segmentor
model = SEG_MODEL_DICT[model_name](self.configer)
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/lib/models/nets/hrnet.py", line 105, in init
self.backbone = BackboneSelector(configer).get_backbone()
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/lib/models/backbones/backbone_selector.py", line 34, in get_backbone
model = HRNetBackbone(self.configer)(**params)
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/lib/models/backbones/hrnet/hrnet_backbone.py", line 598, in call
bn_momentum=0.1)
File "/home/dai/code/semantic_segmentation/9/openseg.pytorch-master/lib/models/backbones/hrnet/hrnet_backbone.py", line 307, in init
self.bn1 = ModuleHelper.BatchNorm2d(bn_type=bn_type)(64, momentum=bn_momentum)
TypeError: 'NoneType' object is not callable
Could you please tell me what is wrong? Thank you.
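The traceback shows the result of `ModuleHelper.BatchNorm2d(bn_type=bn_type)` being called while it is `None`. One way this can happen is a factory that dispatches on a configuration value and silently falls through when no branch matches. The sketch below reproduces that pattern with hypothetical names and branch values; it is not the repository's code.

```python
def norm_layer_factory(bn_type):
    """Hypothetical factory: returns a constructor for a known bn_type."""
    if bn_type == "torchbn":
        return lambda num_features: ("BatchNorm2d", num_features)
    if bn_type == "torchsyncbn":
        return lambda num_features: ("SyncBatchNorm", num_features)
    # No branch matched: the function implicitly returns None.


def build_norm(bn_type, num_features):
    """Safer variant that fails with a clear message instead of a TypeError."""
    factory = norm_layer_factory(bn_type)
    if factory is None:
        raise ValueError(f"unsupported bn_type: {bn_type!r}")
    return factory(num_features)


# Calling the missing factory reproduces the error from the traceback.
try:
    norm_layer_factory("unknown_bn")(64)
except TypeError as err:
    print(err)  # 'NoneType' object is not callable

print(build_norm("torchbn", 64))
```

If something similar happens here, checking which `bn_type` the config requests (the log reports inplace_abn) against what the installed PyTorch version supports would be a reasonable first step.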
Hello. I'm trying to reproduce your Cityscapes results for our BMVC paper.
After following the data directory format in the config.profile file and running bash ./scripts/cityscapes/hrnet/run_h_48_d_4_ocr.sh val 1,
I get this error:
ERROR: Found no prediction for ground truth /home/arash/openseg.pytorch/dataset/cityscapes/val/label/munster_000027_000019_gtFine_labelIds.png
Could you explain how you prepared the data?
Thanks
The paper explains how you combine ResNet with OCR: the output stride of the (dilated) ResNet is 8, and you use the last two stages as inputs for OCR.
However, HRNetV2 has outputs at 4 different scales (output stride = [4, 8, 16, 32]). Can you explain how you combine them?
In addition, Section 3.2 of the paper states that the output size of stages 3 and 4 is H × W. Is this 1/8 of the original input image size, since the output stride is 8?
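In the HRNetV2 design, the four branch outputs are upsampled to the resolution of the highest-resolution branch (output stride 4) and concatenated along the channel axis before the segmentation head. Below is a shape-only sketch for HRNet-W48, whose branch widths are 48, 96, 192 and 384 channels. The 512×1024 input is an assumed example, `hrnet_fused_shape` is a hypothetical helper, and how this repository feeds the result into OCR should be checked against its model code.

```python
def hrnet_fused_shape(input_hw, branch_channels=(48, 96, 192, 384),
                      strides=(4, 8, 16, 32)):
    """Return per-branch (C, H, W) shapes and the fused (C, H, W) shape.

    Fusion here means: upsample every branch to the first (highest-resolution)
    branch and concatenate along the channel axis.
    """
    height, width = input_hw
    branch_shapes = [
        (channels, height // stride, width // stride)
        for channels, stride in zip(branch_channels, strides)
    ]
    _, target_h, target_w = branch_shapes[0]
    fused = (sum(branch_channels), target_h, target_w)
    return branch_shapes, fused


branches, fused = hrnet_fused_shape((512, 1024))
for shape in branches:
    print("branch:", shape)
print("fused: ", fused)
```

For this input the fused feature map has 720 channels at 1/4 of the input resolution.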
Hello, would you be willing to share your source code?
When will the OCR module be released?
Dear Author
Hello. Thank you for sharing the code about it.
I can get some insight into your code to solve my problem.
I have some questions about your code.
First, in ocrnet.py, you apply the feature network, then the OCR block, and then F.interpolate twice. I am wondering why you return both the first and the second interpolation results. A typical segmentation network uses only the last output as the final segmentation result.
Second, I want to apply your SegFix algorithm to my problem. However, I cannot find a standalone implementation of it, nor the paper. The README only mentions that it is similar to the PointRend scheme. Is it a CNN-based approach, or does it use something else?
Thank you.
Thanks for your work!
elif torch_ver == '1.2':
from inplace_abn import InPlaceABNSync
return InPlaceABNSync(num_features, **kwargs)
I can't find the inplace_abn file for torch_ver == '1.2',
and I would like to know whether I need to install it first in order to use it.
PyTorch 0.4 is a very old version and inconvenient to use on a new machine. Is there any plan to port SegFix to PyTorch 1.x?
Hi,
Would you consider releasing the resnet50 pretrained model?
Thank you for your excellent algorithm.
Could you please provide the script that converts the original COCO-Stuff dataset into the training format (train/image, train/label, val/image, val/label)? I only found scripts for other datasets (e.g. Cityscapes/LIP).
Hi, have you released any checkpoints for OCNet?
Under the Cityscapes Semantic Segmentation section in the model zoo, the following is written:
To apply SegFix, you should first down the offset files offset_instance.zip to $DATA_ROOT/cityscapes, and then extract the archive.
where offset_instance.zip is linked to offset_semantic.zip.
I was wondering whether you have released the instance offset and the link is wrong or it's just a typo?
In the case of a typo, can you provide the link for instance offsets?
My dataset's image size is 256*256, and I don't know how to modify the JSON file.
{
  "dataset": "BDCI",
  "method": "fcn_segmentor",
  "data": {
    "image_tool": "cv2",
    "input_mode": "BGR",
    "num_classes": 7,
    "label_list": [0, 1, 2, 3, 4, 5, 6, 255],
    "data_dir": "~/DataSet/BDCI",
    "workers": 8
  },
  "train": {
    "batch_size": 16,
    "data_transformer": {
      "size_mode": "fix_size",
      "input_size": [256, 256],
      "align_method": "only_pad",
      "pad_mode": "random"
    }
  },
  "val": {
    "batch_size": 4,
    "mode": "ss_test",
    "data_transformer": {
      "size_mode": "fix_size",
      "input_size": [256, 256],
      "align_method": "only_pad"
    }
  },
  "test": {
    "batch_size": 4,
    "mode": "ss_test",
    "out_dir": "~/DataSet/BDCI/seg_result/BDCI",
    "data_transformer": {
      "size_mode": "fix_size",
      "input_size": [256, 256],
      "align_method": "only_pad"
    }
  },
  "train_trans": {
    "trans_seq": ["random_resize", "random_crop", "random_hflip", "random_brightness"],
    "random_brightness": {
      "ratio": 1.0,
      "shift_value": 10
    },
    "random_hflip": {
      "ratio": 0.5,
      "swap_pair": []
    },
    "random_resize": {
      "ratio": 1.0,
      "method": "random",
      "scale_range": [0.5, 2.0],
      "aspect_range": [0.9, 1.1]
    },
    "random_crop": {
      "ratio": 1.0,
      "crop_size": [256, 256],
      "method": "random",
      "allow_outside_center": false
    }
  },
  "val_trans": {
    "trans_seq": []
  },
  "normalize": {
    "div_value": 255.0,
    "mean_value": [0.485, 0.456, 0.406],
    "mean": [0.485, 0.456, 0.406],
    "std": [0.229, 0.224, 0.225]
  },
  "checkpoints": {
    "checkpoints_name": "fs_baseocnet_BDCI_seg",
    "checkpoints_dir": "./checkpoints/BDCI",
    "save_iters": 500
  },
  "network": {
    "backbone": "deepbase_resnet101_dilated8",
    "multi_grid": [1, 1, 1],
    "model_name": "base_ocnet",
    "bn_type": "inplace_abn",
    "stride": 8,
    "factors": [[8, 8]],
    "loss_weights": {
      "corr_loss": 0.01,
      "aux_loss": 0.4,
      "seg_loss": 1.0
    }
  },
  "logging": {
    "logfile_level": "info",
    "stdout_level": "info",
    "log_file": "./log/BDCI/fs_baseocnet_BDCI_seg.log",
    "log_format": "%(asctime)s %(levelname)-7s %(message)s",
    "rewrite": true
  },
  "lr": {
    "base_lr": 0.01,
    "metric": "iters",
    "lr_policy": "lambda_poly",
    "step": {
      "gamma": 0.5,
      "step_size": 100
    }
  },
  "solver": {
    "display_iter": 10,
    "test_interval": 1000,
    "max_iters": 40000
  },
  "optim": {
    "optim_method": "sgd",
    "adam": {
      "betas": [0.9, 0.999],
      "eps": 1e-08,
      "weight_decay": 0.0001
    },
    "sgd": {
      "weight_decay": 0.0005,
      "momentum": 0.9,
      "nesterov": false
    }
  },
  "loss": {
    "loss_type": "fs_auxce_loss",
    "params": {
      "ce_weight": [0.8373, 0.9180, 0.8660, 1.0345, 1.0166, 0.9969, 0.9754,
                    1.0489, 0.8786, 1.0023, 0.9539, 0.9843, 1.1116, 0.9037,
                    1.0865, 1.0955, 1.0865, 1.1529, 1.0507],
      "ce_reduction": "elementwise_mean",
      "ce_ignore_index": -1,
      "ohem_minkeep": 100000,
      "ohem_thresh": 0.9
    }
  }
}
Here is my JSON file. When I try to train on my dataset, I get a size mismatch error...like:
and so on.
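One thing that stands out in the posted config is that "num_classes" is 7 while "ce_weight" lists 19 values. A class-weighted cross-entropy loss generally expects one weight per class, so this is a plausible, though unconfirmed, source of a size mismatch. The sketch below is a small sanity check on a config dictionary; the function name `check_ce_weight` is hypothetical and the example dictionary is a trimmed stand-in for the posted file.

```python
def check_ce_weight(config):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    num_classes = config["data"]["num_classes"]
    weights = config.get("loss", {}).get("params", {}).get("ce_weight")
    # Only compare when class weights are actually configured.
    if weights is not None and len(weights) != num_classes:
        problems.append(
            f"ce_weight has {len(weights)} entries but num_classes is {num_classes}"
        )
    return problems


# Trimmed stand-in for the posted config: 7 classes, 19 class weights.
example = {
    "data": {"num_classes": 7},
    "loss": {"params": {"ce_weight": [1.0] * 19}},
}
print(check_ce_weight(example))
```

Running a check like this on the full file (for example after `json.load`) before training would surface the mismatch without starting a job.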
environment should be satisfied:
This is my validation error:
And the config.profile:
This is a screenshot of my log file:
Running "bash scripts/cityscapes/hrnet/run_h_48_d_4_ocr.sh val 1" produces the following error
in segmentor/tester.py#L243 (https://github.com/openseg-group/openseg.pytorch/blob/master/segmentor/tester.py#L243):
"Default process group has not been initialized"
AssertionError: Default process group is not initialized
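This assertion is typically raised by torch.distributed when a function that needs a process group is called before one has been created. What the referenced line in tester.py actually calls is not shown here, so the following is only a sketch of a defensive guard. `torch.distributed.is_available`, `is_initialized`, and `get_world_size` are real PyTorch functions; the helper names `is_distributed` and `world_size` are made up for this example.

```python
def is_distributed():
    """True only when torch.distributed is usable and a process group exists."""
    try:
        import torch.distributed as dist
    except ImportError:
        # PyTorch is not installed at all, so there is no process group.
        return False
    return dist.is_available() and dist.is_initialized()


def world_size():
    """Number of processes in the default group, or 1 for a single-process run."""
    if not is_distributed():
        return 1
    import torch.distributed as dist
    return dist.get_world_size()


print(world_size())
```

With a guard like this, single-GPU validation falls back to a world size of 1 instead of failing on the uninitialized default group.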
How were the Coarse Label Map, Offset Map, Refined Label Map, Distance Map, Direction Map, and the last figure drawn? Which of them were made with drawing software (and which software), and which were generated by a program? Could that program be open-sourced? I would like to apply Figure 2 and Figure 3 to my own grayscale images. If the code can be open-sourced, could it be released in the near future?
Thank you very much.
Hi, your work is awesome! Congratulations!
I want to post-process my predictions. Could you provide a simple tutorial on applying SegFix directly to my own model?
Thanks for your help!
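As a rough illustration of the idea, and not the repository's actual pipeline: SegFix is described as replacing unreliable boundary predictions with the predictions of interior pixels that an offset map points to. The sketch below applies a per-pixel (dy, dx) offset to a coarse label map in pure Python. The function name, the list-of-lists data layout, and the clamping at image borders are all assumptions made for this example; the released offset files and scripts may work differently.

```python
def refine_with_offsets(coarse, offsets):
    """coarse: 2D list of labels; offsets: 2D list of (dy, dx) tuples per pixel."""
    h, w = len(coarse), len(coarse[0])
    refined = [row[:] for row in coarse]
    for y in range(h):
        for x in range(w):
            dy, dx = offsets[y][x]
            # Clamp so the source location stays inside the image.
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            # Copy the label from the pixel the offset points to.
            refined[y][x] = coarse[sy][sx]
    return refined


coarse = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],  # pixel (1, 1) is treated as a suspect boundary prediction
    [0, 0, 1, 1],
]
# Zero offsets everywhere except the suspect pixel, which points one step left.
offsets = [[(0, 0)] * 4 for _ in range(3)]
offsets[1][1] = (0, -1)
refined = refine_with_offsets(coarse, offsets)
print(refined)
```

Pixels with a zero offset keep their original label, so only locations flagged by the offset map change.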