fudan-zvg / setr
[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
License: MIT License
Thank you for your great work. Auxiliary segmentation loss helps the model training and may lead to significant performance improvement. Can you provide some related results w/o auxiliary loss?
Hello, thanks for your code.
How much GPU memory is needed for training SETR?
I have 2 P40 GPUs but I can't start training because of OOM.
Looking forward to your reply.
Could you please release the ResNet101 weights pretrained on ImageNet21K, or tell us where we can find them?
Thank you!
Since the patches come from 2D images, the position information has two directions, in other words x-axis and y-axis indexes. This is different from the case of a 1D sequence. How do you implement the position embedding? Can you share the details, since the code is not released?
Hi authors,
Where can I download the pretrained models? I'd like to test your model but I cannot download them, for instance SETR_PUP_768x768_40k_cityscapes_bs_8/iter_40000.pth.
Hi, a "t" is missing in Dockerfile. It should be "mmsegmentation", or git clone won't work.
The demo in your repo is written for pspnet, so could you share the demo for SETR?
Hello, it's my first time using MMSeg. Can you help me train on my own dataset?
Hi,
Thanks for sharing the code. I want to use your model on my own dataset, how can I do that? As I see you provide a separate python file for each dataset.
Hello, thank you for your contribution. When I run tools/train.py with the Cityscapes dataset, I get the following error:
Traceback (most recent call last):
File "/home/caiweixin/Downloads/SETR-main/tools/train.py", line 161, in <module>
main()
File "/home/caiweixin/Downloads/SETR-main/tools/train.py", line 150, in main
train_segmentor(
File "/home/caiweixin/Downloads/SETR-main/mmseg/apis/train.py", line 105, in train_segmentor
runner.run(data_loaders, cfg.workflow, cfg.total_iters)
File "/home/caiweixin/anaconda3/envs/SETR-main/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 130, in run
iter_runner(iter_loaders[i], **kwargs)
File "/home/caiweixin/anaconda3/envs/SETR-main/lib/python3.9/site-packages/mmcv/runner/iter_based_runner.py", line 60, in train
outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
File "/home/caiweixin/Downloads/SETR-main/mmseg/models/segmentors/base.py", line 152, in train_step
losses = self(**data_batch)
File "/home/caiweixin/anaconda3/envs/SETR-main/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/caiweixin/anaconda3/envs/SETR-main/lib/python3.9/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/home/caiweixin/Downloads/SETR-main/mmseg/models/segmentors/base.py", line 122, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/home/caiweixin/Downloads/SETR-main/mmseg/models/segmentors/encoder_decoder.py", line 153, in forward_train
x = self.extract_feat(img)
File "/home/caiweixin/Downloads/SETR-main/mmseg/models/segmentors/encoder_decoder.py", line 79, in extract_feat
x = self.backbone(img)
File "/home/caiweixin/anaconda3/envs/SETR-main/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/caiweixin/Downloads/SETR-main/mmseg/models/backbones/vit.py", line 393, in forward
B = x.shape[0]
AttributeError: 'DataContainer' object has no attribute 'shape'
I don't know how to solve it. I hope you can give me some suggestions. Thank you very much.
How do I select which GPUs to use when training with multiple GPUs? Thanks a lot.
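One common way to choose which GPUs a PyTorch job sees is the `CUDA_VISIBLE_DEVICES` environment variable, set before CUDA is initialised. A minimal sketch; the helper name `select_gpus` is hypothetical and the device ids are only examples.

```python
import os

def select_gpus(device_ids):
    """Restrict this process to the given physical GPU ids (hypothetical helper)."""
    # Must run before torch initialises CUDA; afterwards the setting is ignored.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return os.environ["CUDA_VISIBLE_DEVICES"]

visible = select_gpus([0, 2])
```

The same variable can be set on the command line, for example `CUDA_VISIBLE_DEVICES=0,2 ./tools/dist_train.sh ${CONFIG_FILE} 2`. The visible devices are then renumbered from 0 inside the process.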
Hi, authors,
I got the following error after executing command: python tools/train.py configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py
2021-04-08 08:03:22,265 - mmseg - INFO - Loaded 2975 images
2021-04-08 08:03:24,275 - mmseg - INFO - Loaded 500 images
2021-04-08 08:03:24,276 - mmseg - INFO - Start running, host: root@milton-LabPC, work_dir: /media/root/mdata/data/code13/SETR/work_dirs/SETR_PUP_768x768_40k_cityscapes_bs_8
2021-04-08 08:03:24,276 - mmseg - INFO - workflow: [('train', 1)], max: 40000 iters
Traceback (most recent call last):
File "tools/train.py", line 161, in <module>
main()
File "tools/train.py", line 150, in main
train_segmentor(
File "/media/root/mdata/data/code13/SETR/mmseg/apis/train.py", line 106, in train_segmentor
runner.run(data_loaders, cfg.workflow, cfg.total_iters)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/mmcv/runner/iter_based_runner.py", line 130, in run
iter_runner(iter_loaders[i], **kwargs)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/mmcv/runner/iter_based_runner.py", line 60, in train
outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/media/root/mdata/data/code13/SETR/mmseg/models/segmentors/base.py", line 152, in train_step
losses = self(**data_batch)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/media/root/mdata/data/code13/SETR/mmseg/models/segmentors/base.py", line 122, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/media/root/mdata/data/code13/SETR/mmseg/models/segmentors/encoder_decoder.py", line 157, in forward_train
loss_decode = self._decode_head_forward_train(x, img_metas,
File "/media/root/mdata/data/code13/SETR/mmseg/models/segmentors/encoder_decoder.py", line 100, in _decode_head_forward_train
loss_decode = self.decode_head.forward_train(x, img_metas,
File "/media/root/mdata/data/code13/SETR/mmseg/models/decode_heads/decode_head.py", line 185, in forward_train
seg_logits = self.forward(inputs)
File "/media/root/mdata/data/code13/SETR/mmseg/models/decode_heads/vit_up_head.py", line 93, in forward
x = self.syncbn_fc_0(x)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 519, in forward
world_size = torch.distributed.get_world_size(process_group)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 625, in get_world_size
return _get_group_size(group)
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 220, in _get_group_size
_check_default_pg()
File "/root/anaconda3/envs/pytorch1.7.0/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 210, in _check_default_pg
assert _default_pg is not None, \
AssertionError: Default process group is not initialized
(pytorch1.7.0) root@milton-LabPC:/data/code13/SETR
As I use a single GPU device to perform the training, it seems the error is related to distributed training. Any hints to solve this issue?
THX!
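The traceback ends in `torch/nn/modules/batchnorm.py` at `get_world_size`, which is the synchronized batch norm path, and that layer expects an initialised process group. A workaround often used with MMSegmentation-style configs is to swap `SyncBN` for plain `BN` when running non-distributed, or to launch through the distributed script with one GPU. A sketch of the first option on a plain dict, assuming the config stores normalisation settings as `norm_cfg=dict(type='SyncBN', ...)`; the helper name `replace_syncbn` and the toy config are hypothetical.

```python
def replace_syncbn(cfg):
    """Recursively replace type='SyncBN' with type='BN' in a nested config."""
    if isinstance(cfg, dict):
        return {
            key: ("BN" if key == "type" and value == "SyncBN" else replace_syncbn(value))
            for key, value in cfg.items()
        }
    if isinstance(cfg, (list, tuple)):
        return type(cfg)(replace_syncbn(item) for item in cfg)
    return cfg

# Toy config for illustration only; not copied from the repository.
model = dict(
    backbone=dict(type="VisionTransformer"),
    decode_head=dict(type="ExampleHead", norm_cfg=dict(type="SyncBN", requires_grad=True)),
)
single_gpu_model = replace_syncbn(model)
```

The function returns a new structure and leaves the original config untouched, so the distributed settings can still be used elsewhere.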
According to another post, this code needs a lot of GPU resources, which makes it really difficult for me to reproduce your results. Could you please release your best checkpoints?
Hi, do you have a google drive link for the models with T-Base referenced in the paper (such as SETR-Naive-Base) as well as the corresponding configuration files?
Alternatively, what configuration can I use to train the model if it is not readily available? I tried changing the depth in SETR/configs/base/models/setr_naive_pup.py to 12, but that errors out with "RuntimeError: shape '[2, 1025, 3, 12, 85]' is invalid for input of size 6297600" when using the ADE20K configuration file (https://github.com/fudan-zvg/SETR/blob/main/configs/SETR/SETR_Naive_512x512_160k_ade20k_bs_16.py) for training. Changing the embedding dimension in this file from 1024 results in a lot of shape mismatches with the pretrained imagenet21k model as well. The default training with the T-large depth and embedding dimension work for me with the same file.
Thanks for your help.
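The numbers in the error message can be checked by hand. The input size 6297600 equals 2 x 1025 x 3 x 1024, which looks like a fused query/key/value tensor for batch size 2, 1025 tokens and embedding dimension 1024, while the requested shape uses 12 heads of size 1024 // 12 = 85, which only accounts for 1020 channels. This suggests that the number of attention heads ended up as 12 while the embedding dimension stayed at 1024; that reading is an inference from the message, not something confirmed by the authors. A short check:

```python
batch, tokens, embed_dim, num_heads = 2, 1025, 1024, 12

qkv_elements = batch * tokens * 3 * embed_dim   # size of the fused qkv output
head_dim = embed_dim // num_heads               # integer division drops the remainder
reshape_elements = batch * tokens * 3 * num_heads * head_dim

print(qkv_elements)      # 6297600, the "input of size" in the error
print(head_dim)          # 85, the last entry of the requested shape
print(reshape_elements)  # 6273000, so the reshape cannot succeed

# The embedding dimension must be divisible by the number of heads.
for heads in (12, 16):
    print(heads, embed_dim % heads == 0)
```

ViT-Base normally pairs 12 heads with an embedding dimension of 768 (head size 64), so a Base-sized model would presumably also need Base-sized pretrained weights rather than the Large checkpoint.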
Hi, thanks for providing the code. When I use SETR_MLA_480x480_80k_pascal_context_bs_8.py and SETR_MLA_pascal_context_b8_80k.pth, I get this error: "AttributeError: module 'torch.distributed' has no attribute 'group'". How can I solve this on Windows?
Below is my environment:
pytorch 1.6.0
mmcv 1.2.6
mmcv-full 1.1.5
Speed test on those models?
"paramwise_cfg=dict(custom_keys={'head': dict(lr_mult=10.)}"
Hi, first of all thank you for open-sourcing your code. I have a question about the configuration of the optimizer.
I found that your model contains "decode_head", not the "head" used in 'custom_keys'. Will 'lr_mult=10' take effect when we train the model?
Thanks~
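In MMCV's default optimizer constructor, as far as its documentation describes, a `custom_keys` entry applies to every parameter whose name contains the key as a substring, so `'head'` would also match names that begin with `decode_head.`. The sketch below imitates only that matching rule in plain Python; `lr_for` and the parameter names are hypothetical, and this is not MMCV's code.

```python
def lr_for(param_name, base_lr, custom_keys):
    """Return the learning rate for a parameter under substring key matching."""
    # Longer keys are tried first so the most specific key wins.
    for key in sorted(custom_keys, key=len, reverse=True):
        if key in param_name:
            return base_lr * custom_keys[key].get("lr_mult", 1.0)
    return base_lr

custom_keys = {"head": dict(lr_mult=10.0)}
base_lr = 0.01

head_lr = lr_for("decode_head.conv_seg.weight", base_lr, custom_keys)
backbone_lr = lr_for("backbone.blocks.0.attn.qkv.weight", base_lr, custom_keys)
```

Under this rule the decode head gets ten times the base learning rate and the backbone keeps the base value. Any other parameter whose name happens to contain "head" would be scaled as well, which is worth checking against the actual parameter names of the model.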
I ran into this error. How can I deal with it? Thanks a lot.
RuntimeError: CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 24.00 GiB total capacity; 18.99 GiB already allocated; 18.86 MiB free; 19.31 GiB reserved in total by PyTorch)
I think the batch size is too big, or it is the GPU setting in the code, but I could not find where the batch size is set.
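In MMSegmentation-style configs the per-GPU batch size is normally the `samples_per_gpu` entry of the `data` dict, so the total batch size is that value times the number of GPUs. A hedged sketch of the relevant fragment; the values are illustrative and not taken from the repository's configs.

```python
# Illustrative config fragment; values are assumptions, not the repo defaults.
data = dict(
    samples_per_gpu=1,  # images per GPU per iteration; lower this to reduce memory
    workers_per_gpu=2,  # dataloader worker processes per GPU
)

num_gpus = 1
total_batch_size = data["samples_per_gpu"] * num_gpus
```

Reducing the crop size in the training pipeline is another lever on memory, at the cost of deviating further from the published settings.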
Can this code only run under multiple GPUs?
Hi, thanks for sharing the code. I ran into a problem, please help me.
I have only one GPU.
My mmcv and PyTorch versions are the same as in readme.md.
(base) root@Pub:/mnt/c/Users/Pub/SETR# ./tools/dist_test.sh configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py work_dirs/SETR_PUP_768x768_40k_cityscapes_bs_8/iter_40000.pth 1 --eval mIoU
Traceback (most recent call last):
File "./tools/test.py", line 144, in <module>
main()
File "./tools/test.py", line 100, in main
init_dist(args.launcher, **cfg.dist_params)
File "/root/miniconda3/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 20, in init_dist
_init_dist_pytorch(backend, **kwargs)
File "/root/miniconda3/lib/python3.7/site-packages/mmcv/runner/dist_utils.py", line 33, in _init_dist_pytorch
torch.cuda.set_device(rank % num_gpus)
ZeroDivisionError: integer division or modulo by zero
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/root/miniconda3/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/miniconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
main()
File "/root/miniconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', '-u', './tools/test.py', '--local_rank=0', 'configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py', 'work_dirs/SETR_PUP_768x768_40k_cityscapes_bs_8/iter_40000.pth', '--launcher', 'pytorch', '--eval', 'mIoU']' returned non-zero exit status 1.
After reading your paper, I am confused about how you handle the multi-patch (256) inputs in the encoder. It seems that the encoder fuses the 256 patches and learns one feature map (of size (H/16, W/16, D)) for the whole original image rather than patch-wise features, and then decodes this feature map to generate the segmentation map. How are the 256 patches processed and fused in the encoder?
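As the question describes, a ViT-style encoder treats all patches of one image as a single token sequence, self-attention mixes information across the tokens, and the output sequence can be reshaped back to an H/16 x W/16 grid. The bookkeeping can be sketched as follows, assuming 16 x 16 patches and a row-major flattening order; the function names are hypothetical.

```python
PATCH = 16  # patch side length in pixels (assumed)

def grid_size(height, width, patch=PATCH):
    """Number of patch rows and columns for an image of the given size."""
    assert height % patch == 0 and width % patch == 0, "image must be divisible by patch size"
    return height // patch, width // patch

def token_to_cell(index, cols):
    """Map a position in the flattened sequence back to (row, col) in the grid."""
    return divmod(index, cols)

rows, cols = grid_size(256, 256)
num_tokens = rows * cols
last_cell = token_to_cell(num_tokens - 1, cols)
```

With a 256 x 256 input this gives a 16 x 16 grid, that is 256 tokens. A 480 x 480 input gives a 30 x 30 grid in the same way.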
There is no definition of 'dataset' in class Resize.
According to the results of "mmsegmentation" and "HRNet-segmentation", evaluating with 60 classes (including the background) leads to a large decrease in mIoU. Can you provide the evaluation results for 59 classes?
Thank you for your sharing.
I only have one 1080 GPU, and I cannot train your model on it (CUDA error: an illegal memory access was encountered). When I try to download your model, the page shows a 404 error. Could you please share the trained model to my email: [email protected]
thank you very much.
Hello, thank you for sharing.
Since I only have one GPU, I cannot train your model. Could you share your trained model to my email? Thanks.
@lzrobots The paper seems promising, but some questions about efficiency are unanswered:
Given the encoder of Vit-Large-Patches16 and input size of 3x480x480, the output feature maps of any layer Z should be 1024x30x30 (reshaped). How to map these 1024 features to an RGB space for visualization? Are the feature maps directly upsampled to the original image size during the visualization process?
I didn't find any related code in this repo.
Hi, authors,
I got this error when running the command python tools/train.py configs/SETR/SETR_PUP_768x768_40k_cityscapes_bs_8.py:
KeyError: "EncoderDecoder: 'VisionTransformer is not in the backbone registry'"
The full error is listed here: https://gist.github.com/amiltonwong/476e04e81e33b3cf8c4cb3f28ee01ddd
Any hints to solve this issue?
THX!
When are you going to release the code?
Hi, congratulations on the acceptance to CVPR 2021.
I am a member of OpenMMLab. Our vision is to provide abundant models in our codebase so that researchers can easily make fair and effective comparisons in the computer vision field, which could in turn bring more citations to the original works because the baselines are already built.
As the first transformer model for semantic segmentation, SETR has received a lot of attention in the field.
That is why my colleagues re-implemented SETR in MMSeg over the last several months. Please check our link: https://github.com/open-mmlab/mmsegmentation/tree/master/configs/setr.
So could you please add our link to your original GitHub repository? We hope more people can use this well-known model.
Looking forward to your reply! We wish you more great work in the future.
Best,
torch.hub has no attribute 'get_dir'
When I open the pretrained model page, there are no pretrained models.
Excuse me,
I'm training with multiple GPUs, for example: ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments], and I have 2 GPUs that I am trying to use.
Traceback (most recent call last):
File "/home/anaconda3/envs/py37_torch1.6/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/anaconda3/envs/py37_torch1.6/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/anaconda3/envs/py37_torch1.6/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
main()
File "/home/anaconda3/envs/py37_torch1.6/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/anaconda3/envs/py37_torch1.6/bin/python', '-u', './tools/train.py', '--local_rank=1', 'configs/SETR/SETR_Naive_768x768_40k_cityscapes_bs_8.py', '--launcher', 'pytorch', '--load-from=./pth/jx_vit_large_p16_384-b3be5167.pth']' returned non-zero exit status 1.
Thanks for your answer!
Hello, I followed the documentation and the installation went fine, but training reported an error. Is it incompatible with MMSeg? Thanks for any answer.
Here are some details about the error I met:
2021-07-13 11:57:07,774 - mmseg - INFO - Loaded 2975 images
2021-07-13 11:57:12,063 - mmseg - INFO - Loaded 500 images
2021-07-13 11:57:12,064 - mmseg - INFO - Start running, host: well@admin01, work_dir: /home/well/SETR/work_dirs/SETR_Naive_768x768_40k_cityscapes_bs_8
2021-07-13 11:57:12,064 - mmseg - INFO - workflow: [('train', 1)], max: 40000 iters
...
FileNotFoundError: [Errno 2] No such file or directory: '/home/well/SETR/data/cityscapes/gtFine/train/cologne/cologne_000019_000019_gtFine_labelTrainIds.png'
I have made sure the file path is correct, but the training process still cannot start.
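The `*_gtFine_labelTrainIds.png` files are not part of the raw Cityscapes download; they are usually generated from the annotations in a separate conversion step, for example with the `cityscapesscripts` package. Before training it may help to list which label files are missing. A sketch using only the standard library, assuming the usual `leftImg8bit/<split>/<city>/` and `gtFine/<split>/<city>/` layout; `missing_labels` is a hypothetical helper.

```python
from pathlib import Path

def missing_labels(root, split="train"):
    """Return label paths expected by the loader but absent on disk."""
    root = Path(root)
    missing = []
    for img in sorted((root / "leftImg8bit" / split).glob("*/*_leftImg8bit.png")):
        # Derive the label file name from the image file name.
        label_name = img.name.replace("_leftImg8bit.png", "_gtFine_labelTrainIds.png")
        label = root / "gtFine" / split / img.parent.name / label_name
        if not label.exists():
            missing.append(label)
    return missing
```

Calling `missing_labels("data/cityscapes")` returns an empty list when every training image has its label file. A non-empty result suggests the conversion step has not been run yet.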
Thank you for your great work. The size of my pictures is (256, 832). How should I deal with that? Please tell me more details. Thanks.
How do you do the 2D interpolation of the pre-trained position embeddings? Thanks!
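One approach commonly used when fine-tuning ViT models at a new resolution is to reshape the patch position embeddings into their 2D grid, resize the grid bilinearly, and flatten it again, keeping any class-token embedding aside. With PyTorch this is typically done with `torch.nn.functional.interpolate`; the dependency-free sketch below only illustrates the arithmetic and is not the authors' implementation. `resize_pos_embed` is a hypothetical name, and corner-aligned sampling is an assumption.

```python
def resize_pos_embed(grid, new_h, new_w):
    """Bilinearly resize a 2D grid of embedding vectors (corner-aligned)."""
    old_h, old_w = len(grid), len(grid[0])
    dim = len(grid[0][0])
    out = []
    for i in range(new_h):
        # Map output row i to a fractional source row.
        y = i * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        wy = y - y0
        row = []
        for j in range(new_w):
            # Map output column j to a fractional source column.
            x = j * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            wx = x - x0
            # Blend the four neighbouring embeddings channel by channel.
            row.append([
                (1 - wy) * ((1 - wx) * grid[y0][x0][d] + wx * grid[y0][x1][d])
                + wy * ((1 - wx) * grid[y1][x0][d] + wx * grid[y1][x1][d])
                for d in range(dim)
            ])
        out.append(row)
    return out

# Toy 2 x 2 grid of 1-dimensional embeddings, resized to 3 x 3.
small = [[[0.0], [1.0]], [[2.0], [3.0]]]
large = resize_pos_embed(small, 3, 3)
```

In the toy example the four corners keep their original values and the centre becomes the average of all four.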
Hello:
When testing on the pascal_context dataset, the output is:
AssertionError: Input image size (389*480) doesn't match model (480*480).
@lzrobots
@VictorLlu
@sixiaozheng
Hi, thank you for your sharing.
However, when I run "./tools/dist_test.sh configs/SETR/SETR_PUP_512x512_160k_ade20k_bs_16_MS.py", I get the error: CUDA out of memory. I have 2 NVIDIA Tesla P100 GPUs with about 16GB each.
Could you please tell me what is wrong?
Thank you.
Hi @sixiaozheng,
Could you please release the R101 pretrained-model with ImageNet-21K.
Hi! Thanks for opensourcing the code. I wonder what are the results of SETR using DeiT on ADE.
I cannot reach the best mIoU reported in the original paper, although I did not change any hyperparameter except the batch size, which I reduced from 8 to 4. How can I achieve the best mIoU? Is it related to the batch size?
Hi, thank you for your awesome work.
I trained the SETR-MLA model on my own dataset, and there are many checkpoint files from different iterations. How can I know which one is the best for evaluating on the test data?
Thanks in advance.
Can you help me solve this problem? I use the dataset in VOC format.
Traceback (most recent call last):
File "tools/train.py", line 163, in <module>
main()
File "tools/train.py", line 159, in main
meta=meta)
File "/home/ubuntu/disk1/user/SETR-main/mmseg/apis/train.py", line 91, in train_segmentor
val_dataset = build_dataset(cfg.data.val, dict(test_mode=True))
File "/home/ubuntu/disk1/user/SETR-main/mmseg/datasets/builder.py", line 73, in build_dataset
dataset = build_from_cfg(cfg, DATASETS, default_args)
File "/home/ubuntu/anaconda3/envs/setr/lib/python3.7/site-packages/mmcv/utils/registry.py", line 171, in build_from_cfg
return obj_cls(**args)
File "/home/ubuntu/disk1/user/SETR-main/mmseg/datasets/pascal_context.py", line 53, in __init__
**kwargs)
File "/home/ubuntu/disk1/user/SETR-main/mmseg/datasets/custom.py", line 86, in __init__
self.pipeline = Compose(pipeline)
File "/home/ubuntu/disk1/user/SETR-main/mmseg/datasets/pipelines/compose.py", line 22, in __init__
transform = build_from_cfg(transform, PIPELINES)
File "/home/ubuntu/anaconda3/envs/setr/lib/python3.7/site-packages/mmcv/utils/registry.py", line 171, in build_from_cfg
return obj_cls(**args)
File "/home/ubuntu/disk1/user/SETR-main/mmseg/datasets/pipelines/test_time_aug.py", line 59, in __init__
self.transforms = Compose(transforms)
File "/home/ubuntu/disk1/user/SETR-main/mmseg/datasets/pipelines/compose.py", line 22, in __init__
transform = build_from_cfg(transform, PIPELINES)
File "/home/ubuntu/anaconda3/envs/setr/lib/python3.7/site-packages/mmcv/utils/registry.py", line 171, in build_from_cfg
return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'dataset'
pip install mmcv
python setup.py develop
and run train.py
ImportError: DLL load failed: The specified module could not be found.
Can you give me some advice?
thanks