
SegFormer's Introduction

NVIDIA Source Code License · Python 3.8

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Figure 1: Performance of SegFormer-B0 to SegFormer-B5.

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers.
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo.
NeurIPS 2021.

This repository contains the official PyTorch implementation of the training & evaluation code and the pretrained models for SegFormer.

SegFormer is a simple, efficient and powerful semantic segmentation method, as shown in Figure 1.

We use MMSegmentation v0.13.0 as the codebase.

🔥🔥 SegFormer is on MMSegmentation. 🔥🔥

Installation

For install and data preparation, please refer to the guidelines in MMSegmentation v0.13.0.

Other requirements: pip install timm==0.3.2

An example (works for me): CUDA 10.1 and PyTorch 1.7.1

pip install torchvision==0.8.2
pip install timm==0.3.2
pip install mmcv-full==1.2.7
pip install opencv-python==4.5.1.48
cd SegFormer && pip install -e . --user

Evaluation

Download trained weights. ( google drive | onedrive )

Example: evaluate SegFormer-B1 on ADE20K:

# Single-gpu testing
python tools/test.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file

# Multi-gpu testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM>

# Multi-gpu, multi-scale testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM> --aug-test

Training

Download weights ( google drive | onedrive ) pretrained on ImageNet-1K, and put them in a folder pretrained/.

Example: train SegFormer-B1 on ADE20K:

# Single-gpu training
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py 

# Multi-gpu training
./tools/dist_train.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py <GPU_NUM>

Visualize

Here is a demo script to test a single image. For more details, refer to MMSegmentation's documentation.

python demo/image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${DEVICE_NAME}] [--palette ${PALETTE}]

Example: visualize SegFormer-B1 on Cityscapes:

python demo/image_demo.py demo/demo.png local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py \
/path/to/checkpoint_file --device cuda:0 --palette cityscapes

License

Please check the LICENSE file. SegFormer may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing.

Citation

@inproceedings{xie2021segformer,
  title={SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  booktitle={Neural Information Processing Systems (NeurIPS)},
  year={2021}
}

SegFormer's People

Contributors

chrisding, cyrilzakka, ddonatien, xieenze


SegFormer's Issues

Unable to train from the pretrained model

Pretrained model path:
./SegFormer/checkpoints/pretrained/mit_b1.pth
Command:
cd SegFormer

python ./tools/train.py ./local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py
Error message:
Traceback (most recent call last):
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 51, in build_from_cfg
return obj_cls(**args)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\models\segmentors\encoder_decoder.py", line 30, in init
self.backbone = builder.build_backbone(backbone)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\models\builder.py", line 17, in build_backbone
return BACKBONES.build(cfg)
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 210, in build
return self.build_func(*args, **kwargs, registry=self)
File "c:\xxxx\github\mmcv\mmcv\cnn\builder.py", line 26, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 44, in build_from_cfg
f'{obj_type} is not in the {registry.name} registry')
KeyError: 'mit_b1 is not in the models registry'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./tools/train.py", line 166, in
main()
File "./tools/train.py", line 135, in main
test_cfg=cfg.get('test_cfg'))
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\models\builder.py", line 46, in build_segmentor
cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 210, in build
return self.build_func(*args, **kwargs, registry=self)
File "c:\xxxx\github\mmcv\mmcv\cnn\builder.py", line 26, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "EncoderDecoder: 'mit_b1 is not in the models registry'"

How can this problem be solved?
Do mmcv-full, mmsegmentation, and your SegFormer code need to be pinned to specific, mutually compatible versions?

Pretraining segformer on ImageNet-22K

The Swin Transformer release includes a large model pretrained on ImageNet-22K for semantic segmentation, and it achieves good results. I wonder if you are interested in improving SegFormer in a similar way? Thanks!

Question on Mapillary pretraining when evaluating on the Cityscapes (val) dataset

I ran into a problem when training on Mapillary and evaluating on Cityscapes: the "wall" class gets IoU = 0.0.
Could you please provide the training log for the Mapillary pretraining and evaluation (preferably for model B2)? Thanks a lot!

+---------------+-------+-------+
|     Class     |  IoU  |  Acc  |
+---------------+-------+-------+
|      road     | 96.93 | 98.09 |
|    sidewalk   | 76.57 | 90.46 |
|    building   | 89.02 | 95.67 |
|      wall     |  0.0  |  0.0  |
|     fence     | 35.52 | 59.93 |
|      pole     | 52.85 | 63.03 |
| traffic light | 59.63 | 71.81 |
|  traffic sign | 68.11 | 77.09 |
|   vegetation  | 89.89 | 96.67 |
|    terrain    |  26.0 |  26.5 |
|      sky      | 90.97 | 93.58 |
|     person    | 72.78 | 87.27 |
|     rider     | 33.21 |  41.0 |
|      car      | 91.25 | 97.25 |
|     truck     |  61.8 | 64.37 |
|      bus      | 66.93 | 71.56 |
|     train     | 62.85 | 65.31 |
|   motorcycle  | 47.68 | 65.62 |
|    bicycle    | 67.57 | 74.03 |
+---------------+-------+-------+
2021-06-21 16:06:43,150 - mmseg - INFO - Summary:
2021-06-21 16:06:43,150 - mmseg - INFO - 
+-------+-------+-------+
|  aAcc |  mIoU |  mAcc |
+-------+-------+-------+
| 93.66 | 62.61 | 70.49 |
+-------+-------+-------+

Why not directly use an MLP to predict the mask on the concatenated features?

Thanks for your work! I have a question. In your decoder, you first use MLP layers on the multi-level features to unify the channel dimension, resize them to the same spatial size, and concatenate them; then you apply another MLP layer to reduce the channels from 4C to C, and a final MLP to predict the segmentation mask. My question is: why not apply the prediction MLP directly to the concatenated features? Have you tested the performance of this two-MLP decoder?
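
For readers following along, here is a minimal sketch of the two-step decoder being described; the class name, channel sizes, and layer choices are illustrative assumptions, not the repository's exact implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStageMLPDecoder(nn.Module):
    """Illustrative fuse-then-predict decoder; names and sizes are assumptions."""

    def __init__(self, in_channels=(32, 64, 160, 256), embed_dim=256, num_classes=150):
        super().__init__()
        # Step 1: per-level linear layers unify every feature map to embed_dim channels.
        self.linear_layers = nn.ModuleList(nn.Linear(c, embed_dim) for c in in_channels)
        # Step 2: a 1x1 conv ("MLP" over channels) reduces 4*embed_dim -> embed_dim after concatenation.
        self.fuse = nn.Conv2d(4 * embed_dim, embed_dim, kernel_size=1)
        # Final prediction layer producing per-pixel class logits.
        self.classifier = nn.Conv2d(embed_dim, num_classes, kernel_size=1)

    def forward(self, feats):
        # feats: list of 4 multi-scale maps [B, C_i, H_i, W_i], highest resolution first.
        target_size = feats[0].shape[2:]
        unified = []
        for f, linear in zip(feats, self.linear_layers):
            b, c, h, w = f.shape
            f = linear(f.flatten(2).transpose(1, 2))            # [B, H*W, embed_dim]
            f = f.transpose(1, 2).reshape(b, -1, h, w)           # back to [B, embed_dim, H, W]
            f = F.interpolate(f, size=target_size, mode='bilinear', align_corners=False)
            unified.append(f)
        fused = self.fuse(torch.cat(unified, dim=1))             # the 4C -> C step the question asks about
        return self.classifier(fused)                            # C -> num_classes

The alternative raised in the question would simply drop self.fuse and apply self.classifier (with 4 * embed_dim input channels) directly to the concatenated features.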

SegFormer on edge devices

Hi there,
Have you considered:

  1. running inference with SegFormer-B0 on an edge device, such as a Raspberry Pi
  2. pruning the B0 model to reduce its FLOPs and size

Are these functions used in training?

Hi, I'm not familiar with the mmsegmentation training pipeline. I want to know whether the functions reset_drop_path, freeze_patch_emb, and no_weight_decay in mix_transformer.py are used during training? Thanks for the nice project.

KeyError: 'AlignedResize is not in the pipeline registry'

Hi,

I have a similar error to #2. I've just forked the repo to add a print statement, so fix #1 is included. When running python tools/test.py, I'm getting the following:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/test_time_aug.py", line 59, in __init__
    self.transforms = Compose(transforms)
  File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/compose.py", line 22, in __init__
    transform = build_from_cfg(transform, PIPELINES)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 44, in build_from_cfg
    f'{obj_type} is not in the {registry.name} registry')
KeyError: 'AlignedResize is not in the pipeline registry'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/ade.py", line 91, in __init__
    **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/custom.py", line 88, in __init__
    self.pipeline = Compose(pipeline)
  File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/compose.py", line 22, in __init__
    transform = build_from_cfg(transform, PIPELINES)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "MultiScaleFlipAug: 'AlignedResize is not in the pipeline registry'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/test.py", line 170, in <module>
    main()
  File "tools/test.py", line 122, in main
    dataset = build_dataset(cfg.data.test)
  File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/builder.py", line 73, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: 'ADE20KDataset: "MultiScaleFlipAug: \'AlignedResize is not in the pipeline registry\'"'

I've made a Google Colab to reproduce: https://colab.research.google.com/drive/1-t_lj5K2ZEFemxn88DSfcy9W7RTvklsz?usp=sharing

Inference speed of the model

Hello
How are you?
Thanks for contributing to this project.
Which device did you test your models on?

[attached image]

You did NOT explain the device specification in the paper.

About the efficient attention module

Hi,

I would like to ask a question about the efficient attention module, please:
I see that you use a reduction ratio R to decrease the spatial size of the input sequence; normally this operation would produce an output sequence of spatial size N/R. But according to your Table 6 it doesn't: the output spatial size is still N. Where do you upsample the sequence from N/R back to N in the attention module after the reduced QKV multiplication?

Thank you!
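
For reference, a sketch of why no upsampling is needed: the reduction ratio is applied only to the keys and values, while the queries keep all N tokens, so the attention output already has length N. Shapes and names below are illustrative, not the repository's exact code:

import torch
import torch.nn as nn

class ReducedSelfAttention(nn.Module):
    """Illustrative efficient self-attention: keys/values are spatially reduced, queries are not."""

    def __init__(self, dim=64, sr_ratio=4):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # Spatial reduction: a strided conv shrinks the token grid by sr_ratio along each axis,
        # i.e. the sequence length goes from N to N / sr_ratio**2 (the paper folds this into one ratio R).
        self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
        self.scale = dim ** -0.5

    def forward(self, x, h, w):
        # x: [B, N, dim] with N = h * w
        b, n, c = x.shape
        q = self.q(x)                                           # queries keep all N tokens: [B, N, dim]
        x_ = x.transpose(1, 2).reshape(b, c, h, w)
        x_ = self.sr(x_).reshape(b, c, -1).transpose(1, 2)      # reduced tokens: [B, N/R, dim]
        k, v = self.kv(x_).chunk(2, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale           # [B, N, N/R]
        attn = attn.softmax(dim=-1)
        return self.proj(attn @ v)                              # [B, N, dim]: length N is set by Q, not K/V

out = ReducedSelfAttention()(torch.randn(2, 16 * 16, 64), h=16, w=16)   # out.shape == (2, 256, 64)

So no explicit upsampling is required; the row dimension of the attention map comes from the un-reduced queries.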

Error: AlignedResize is not in the pipeline registry

Currently trying to run the evaluation script:
python tools/test.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file

results in the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/content/mmsegmentation/mmseg/datasets/pipelines/test_time_aug.py", line 59, in __init__
    self.transforms = Compose(transforms)
  File "/content/mmsegmentation/mmseg/datasets/pipelines/compose.py", line 22, in __init__
    transform = build_from_cfg(transform, PIPELINES)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 44, in build_from_cfg
    f'{obj_type} is not in the {registry.name} registry')
KeyError: 'AlignedResize is not in the pipeline registry'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/content/mmsegmentation/mmseg/datasets/ade.py", line 91, in __init__
    **kwargs)
  File "/content/mmsegmentation/mmseg/datasets/custom.py", line 88, in __init__
    self.pipeline = Compose(pipeline)
  File "/content/mmsegmentation/mmseg/datasets/pipelines/compose.py", line 22, in __init__
    transform = build_from_cfg(transform, PIPELINES)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "MultiScaleFlipAug: 'AlignedResize is not in the pipeline registry'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/test.py", line 166, in <module>
    main()
  File "tools/test.py", line 122, in main
    dataset = build_dataset(cfg.data.test)
  File "/content/mmsegmentation/mmseg/datasets/builder.py", line 73, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: 'ADE20KDataset: "MultiScaleFlipAug: \'AlignedResize is not in the pipeline registry\'"'

The cause seems to be that AlignedResize is missing an import in SegFormer/mmseg/datasets/pipelines/__init__.py. This pull request adds the import accordingly.
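
A sketch of the kind of one-line change the pull request describes, assuming AlignedResize is defined in transforms.py next to the other pipeline transforms (the surrounding imports and __all__ entries are abbreviated):

# SegFormer/mmseg/datasets/pipelines/__init__.py (abbreviated, illustrative excerpt)
from .transforms import Normalize, Pad, RandomCrop, RandomFlip, Resize, AlignedResize  # add AlignedResize

__all__ = [
    'Normalize', 'Pad', 'RandomCrop', 'RandomFlip', 'Resize',
    'AlignedResize',  # exported so the transform registered in transforms.py is importable by name
    # ... the remaining pipeline entries are unchanged ...
]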

Is it possible to use SegFormer for salient object detection (SOD)?

Hi @xieenze, thanks for your great work. I am wondering whether you have tested ViT-based methods on SOD tasks, which normally have a higher standard for output mask quality. For example, when segmenting a human portrait, the hair (a very thin structure) should also be segmented out precisely. When I check the SegFormer demo, I notice that the mask quality, especially along edges, is not that good. Do you think this is caused by the labelled data or by the network resolution itself?

Question about normalization (mean/std) values differing from the Swin pretrained backbone

Thanks for your great work on making transformer models work so well on semantic segmentation. I have a question regarding the normalization values for mean and std (I have also observed this in MaskFormer, which is why I feel confused).

For training Swin Transformer, the original repo imports the std and mean from timm with the following values:
[screenshot of the timm mean/std values]

In your work, mean and std have been set to the following values:
[screenshot of the mean/std values in this repo's config]

I would really appreciate any information you could give on this! Thanks!
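
For context, a hedged comparison of the two conventions in question; the numbers below are the widely used ImageNet statistics and are stated here as an assumption rather than a dump of either repo's config:

# timm-style statistics on 0-1 scaled images, as imported in the Swin code per the question above.
IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
IMAGENET_DEFAULT_STD = (0.229, 0.224, 0.225)

# mmseg-style img_norm_cfg on 0-255 images, as typically found in segmentation configs.
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],   # = 255 * IMAGENET_DEFAULT_MEAN
    std=[58.395, 57.12, 57.375],      # = 255 * IMAGENET_DEFAULT_STD
    to_rgb=True,
)

# Sanity check: the two parameterizations describe the same normalization, just on different pixel scales.
assert all(abs(m * 255 - m255) < 1e-6 for m, m255 in zip(IMAGENET_DEFAULT_MEAN, img_norm_cfg['mean']))
assert all(abs(s * 255 - s255) < 1e-6 for s, s255 in zip(IMAGENET_DEFAULT_STD, img_norm_cfg['std']))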

Performance on ImageNet?

I see that every segmentation config needs a backbone pretrained on ImageNet, so can you provide the ImageNet performance of B0-B5?

Porting SegFormer to HuggingFace Transformers

Hi guys,

First of all thanks for this impressive (and simple) model!

I'd like to port this model to HuggingFace Transformers, which, as you might know, is a library that includes a lot of Transformer-based models (mostly NLP models like BERT and RoBERTa, but recently I've added the Vision Transformer (ViT), DeiT and DETR to the library, so I think SegFormer definitely deserves its place there too!).

The API I had in mind could look something like this (very similar to ViT):

from transformers import SegFormerFeatureExtractor, SegFormerForImageSegmentation
from PIL import Image
import requests

feature_extractor = SegFormerFeatureExtractor.from_pretrained("nvidia/segformer-b0-fine-tuned-ade-512-512")
model = SegFormerForImageSegmentation.from_pretrained("nvidia/segformer-b0-fine-tuned-ade-512-512")

url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits # shape (batch_size, num_labels, height/4, width/4)

The main advantage would be that people could train the SegFormer model within a Colab notebook with ease, using a native PyTorch training loop or frameworks like PyTorch Lightning, HuggingFace Accelerate, etc., and also perform inference very easily as shown above. No scripts required!

The feature extractor should not be a fully-fledged preprocessor; it would probably just need to resize + normalize images so that they can be fed to the model. I guess resizing to 512x512 is a good default option. I would perhaps include a post_process method that converts the model's logits into an actual semantic-segmentation image.

All model checkpoints can be hosted for free on the hub, under the NVIDIA namespace (which currently includes models like Megatron-GPT-2).

Are you interested in helping me finish up this model? My main questions would be:

  • what are the most basic image + mask transformations needed to perform inference and fine-tuning on a custom dataset? What should the image size be for each of the checkpoints? It seems that for the 512x512 ADE model, the shortest side is 512?
  • I guess that if the feature extractor resizes (rescales) images to 512x512, the corresponding masks also need to be resized. But since the model predicts masks at resolution 128x128, does the feature extractor need to resize them to that resolution?
  • how is the loss defined? Is it just the cross-entropy loss between the predicted mask and the ground-truth mask? (See the sketch below.)
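
On the loss question, a minimal sketch of a common semantic-segmentation setup (upsample the coarse logits to the label size and use cross-entropy with an ignore index); this is an illustration of the usual convention, not a statement of how the port was eventually implemented:

import torch
import torch.nn.functional as F

def segmentation_loss(logits, labels, ignore_index=255):
    """Illustrative loss: cross-entropy between predicted masks and ground truth.

    logits: [B, num_classes, H/4, W/4] as produced by the decoder.
    labels: [B, H, W] integer class indices, with 255 marking ignored pixels.
    """
    # Upsample the coarse logits to the label resolution rather than shrinking the labels,
    # so thin structures in the ground truth are not destroyed by resizing.
    logits = F.interpolate(logits, size=labels.shape[-2:], mode='bilinear', align_corners=False)
    return F.cross_entropy(logits, labels, ignore_index=ignore_index)

# Example shapes matching the 512x512 ADE checkpoints discussed above.
loss = segmentation_loss(torch.randn(2, 150, 128, 128), torch.randint(0, 150, (2, 512, 512)))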

Training details

Hi, I'm trying to reproduce SegFormer on the PASCAL VOC dataset. When using the code in this repo, I get ~77% mIoU (without multi-scale testing). However, I only get ~75% mIoU with my reproduced code. Here are my training details.

I have reproduced the training and validation data pipelines, including random scaling, random horizontal flipping, random cropping, etc. For the model, I used the code in this repo and the pre-trained weights. I also used an AdamW optimizer with a warmup scheduler. The other optimizer settings are the same as in this repo.

Therefore, I'm wondering if there are any extra training details in SegFormer or in mmseg itself. I would really appreciate your reply.
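
For reference, a hedged sketch of what a typical mmseg-style training pipeline of the kind described above looks like; the type names follow the mmseg pipeline registry, but the exact values are assumptions rather than a copy of this repo's config:

# Illustrative mmseg-style training pipeline; values are common defaults, not copied from this repo.
crop_size = (512, 512)
img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),  # random scaling
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),    # random cropping
    dict(type='RandomFlip', prob=0.5),                                   # random horizontal flip
    dict(type='PhotoMetricDistortion'),                                  # colour jitter
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]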

MiT-B1 Decoder Parameter Count Question

Hello!

In the paper, the decoder parameter count of the B1 model was reported to be 0.6 million parameters in Table 1.

I may be misunderstanding, but from what I can see the decoder parameter count is a function of the feature-map channels. Looking at the code and Table 6, the B1 model has exactly the same feature channel sizes as the B2-B5 models, all of which have a decoder size of 3.3 million parameters.

Am I missing something as to why the parameter count of the B1 decoder is 0.6 million parameters and not 3.3 million?

How to reduce the dataset?

In your paper, the maximum number of iterations is 160,000, and the ADE20K dataset contains 20,010 images. I chose a subset of the ADE20K images and trained with your model, but the training time is the same as with the whole dataset. Can you kindly tell me how to reduce the training time when using only part of the dataset?
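
The schedule in these configs is iteration-based rather than epoch-based, so using fewer images does not by itself shorten training; the total work is fixed by the iteration count. A sketch of the fields one would typically lower, assuming the standard mmcv/mmseg iteration-based runner:

# Illustrative schedule override; field names follow the usual mmcv/mmseg runner and hook APIs.
runner = dict(type='IterBasedRunner', max_iters=40000)    # fewer total iterations => shorter training
checkpoint_config = dict(by_epoch=False, interval=4000)
evaluation = dict(interval=4000, metric='mIoU')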

Training error: mit_b1.pth is not a checkpoint file

It's a great honor for me to study your research. When I downloaded the pretrained model into the pretrained directory, I got the error shown below. I hope you can give me some advice. Thanks for your time and kindness.
[attached screenshot of the error]

Default process group is not initialized

I'm trying to run the training code as follows:
python tools/train.py local_configs/segformer/B0/segformer.b0.512x512.ade.160k.py

I've changed the dataset to a custom dataset (the one given in the OpenMMLab Tutorial)

Getting the below error. The tutorial notebook works fine in the original OpenMMLab repo. Do you have any insight into why this might be happening and what I have to change to make it run?

Traceback (most recent call last):
  File "tools/train.py", line 181, in <module>
    main()
  File "tools/train.py", line 177, in main
    meta=meta)
  File "/content/mmsegmentation/mmseg/apis/train.py", line 115, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/iter_based_runner.py", line 131, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/iter_based_runner.py", line 60, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/content/mmsegmentation/mmseg/models/segmentors/base.py", line 152, in train_step
    losses = self(**data_batch)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/content/mmsegmentation/mmseg/models/segmentors/base.py", line 122, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/content/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 158, in forward_train
    gt_semantic_seg)
  File "/content/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 102, in _decode_head_forward_train
    self.train_cfg)
  File "/content/mmsegmentation/mmseg/models/decode_heads/decode_head.py", line 188, in forward_train
    seg_logits = self.forward(inputs)
  File "/content/mmsegmentation/mmseg/models/decode_heads/segformer_head.py", line 82, in forward
    _c = self.linear_fuse(torch.cat([_c4, _c3, _c2, _c1], dim=1))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/bricks/conv_module.py", line 195, in forward
    x = self.norm(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/batchnorm.py", line 519, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 638, in get_world_size
    return _get_group_size(group)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 220, in _get_group_size
    _check_default_pg()
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 211, in _check_default_pg
    "Default process group is not initialized"
AssertionError: Default process group is not initialized
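
The trace ends inside a synchronized BatchNorm layer asking torch.distributed for the world size, which fails because tools/train.py was launched without a distributed process group. A common workaround for single-GPU runs, assuming the usual mmseg norm_cfg convention (this is a sketch, not an official fix from this repo), is to switch the affected norm layers to plain BN:

# Illustrative single-GPU override: use ordinary BN instead of synchronized BN so that
# no torch.distributed process group is required.
norm_cfg = dict(type='BN', requires_grad=True)   # instead of dict(type='SyncBN', ...)
model = dict(
    decode_head=dict(norm_cfg=norm_cfg),
)

Alternatively, launching through ./tools/dist_train.sh with a single GPU initializes the process group and avoids the assertion.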

Evaluation fails with an error

Command:
python ./tools/test.py ./local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py ./checkpoints/trained/segformer.b1.512x512.ade.160k.pth
Error message:
Traceback (most recent call last):
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 51, in build_from_cfg
return obj_cls(**args)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\datasets\pipelines\test_time_aug.py", line 59, in init
self.transforms = Compose(transforms)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\datasets\pipelines\compose.py", line 22, in init
transform = build_from_cfg(transform, PIPELINES)
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 44, in build_from_cfg
f'{obj_type} is not in the {registry.name} registry')
KeyError: 'AlignedResize is not in the pipeline registry'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 51, in build_from_cfg
return obj_cls(**args)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\datasets\ade.py", line 84, in init
**kwargs)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\datasets\custom.py", line 88, in init
self.pipeline = Compose(pipeline)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\datasets\pipelines\compose.py", line 22, in init
transform = build_from_cfg(transform, PIPELINES)
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.name}: {e}')
KeyError: "MultiScaleFlipAug: 'AlignedResize is not in the pipeline registry'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./tools/test.py", line 166, in
main()
File "./tools/test.py", line 122, in main
dataset = build_dataset(cfg.data.test)
File "C:\xxxx\github\torch_env\lib\site-packages\mmseg\datasets\builder.py", line 73, in build_dataset
dataset = build_from_cfg(cfg, DATASETS, default_args)
File "c:\xxxx\github\mmcv\mmcv\utils\registry.py", line 54, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: 'ADE20KDataset: "MultiScaleFlipAug: \'AlignedResize is not in the pipeline registry\'"'

How to pre-train on Imagenet?

Hello,

First, thank you for your excellent work and code!

May I know how SegFormer is pre-trained on ImageNet?

Maybe I missed some details in the paper, but I didn't find the process of generating a classification token or pooling the feature maps, as other works do when training for the classification task.

Thank you!

Issue in the beginning

Hi, guys. I ran into an issue at the very beginning.
You can see it below.
Any advice?

Traceback (most recent call last):
File "tools/test.py", line 10, in <module>
from mmseg.apis import multi_gpu_test, single_gpu_test
File "D:\Conda\envs\torch\lib\site-packages\mmseg\apis\__init__.py", line 1, in <module>
from .inference import inference_segmentor, init_segmentor, show_result_pyplot
File "D:\Conda\envs\torch\lib\site-packages\mmseg\apis\inference.py", line 8, in <module>
from mmseg.models import build_segmentor
File "D:\Conda\envs\torch\lib\site-packages\mmseg\models\__init__.py", line 1, in <module>
from .backbones import * # noqa: F401,F403
File "D:\Conda\envs\torch\lib\site-packages\mmseg\models\backbones\__init__.py", line 2, in <module>
from .fast_scnn import FastSCNN
File "D:\Conda\envs\torch\lib\site-packages\mmseg\models\backbones\fast_scnn.py", line 6, in <module>
from mmseg.models.decode_heads.psp_head import PPM
File "D:\Conda\envs\torch\lib\site-packages\mmseg\models\decode_heads\__init__.py", line 4, in <module>
from .cc_head import CCHead
File "D:\Conda\envs\torch\lib\site-packages\mmseg\models\decode_heads\cc_head.py", line 7, in <module>
from mmcv.ops import CrissCrossAttention
File "D:\Conda\envs\torch\lib\site-packages\mmcv\ops\__init__.py", line 1, in <module>
from .bbox import bbox_overlaps
File "D:\Conda\envs\torch\lib\site-packages\mmcv\ops\bbox.py", line 3, in <module>
ext_module = ext_loader.load_ext('ext', ['bbox_overlaps'])
File "D:\Conda\envs\torch\lib\site-packages\mmcv\utils\ext_loader.py", line 12, in load_ext
ext = importlib.import_module('mmcv.' + name)
File "D:\Conda\envs\torch\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed

Share Google Colab notebook

Hello, I am wondering if you, or anyone else here, has this implemented in a Google Colab notebook that you can share. I am very new to this and would appreciate any assistance setting it up in Google Colab so that I can fine-tune on a small custom dataset (for school purposes). Thank you very much.

How to change checkpoint saving frequency

Hi, first of all, thank you for your research and code.

I see that during training, the model is saved every 4000 iterations. Where can I change this setting, so that my model is saved every, let's say, 1000 iterations?

Thank you
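
A sketch of the config field that typically controls this, assuming the standard mmcv checkpoint hook:

# Illustrative override: save a checkpoint every 1000 iterations instead of every 4000.
checkpoint_config = dict(by_epoch=False, interval=1000)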

Fp16OptimizerHook breaks training

First off, thanks for your awesome work.

I have no trouble getting SegFormer to train normally, but if I configure it with Fp16OptimizerHook, then a short way into training (10-20k iterations) suddenly no classes are predicted:

optimizer_config=dict(type="Fp16OptimizerHook"),
fp16=dict(),

Is this expected? Has anyone got Fp16OptimizerHook to work with this repo?

About the pretrained model for B5

Hi, thanks for the great work! May I know if the pre-trained weights for B5 are trained with Mapillary Vistas or only with ImageNet-1K?

Training error: unrecognized arguments

When I use your script command 'python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py', it reports unrecognized arguments, as shown below. Can you tell me how to solve it?
[attached screenshot of the error]

license

I was looking to port this to a website but saw the license restriction, so I decided not to:
"3.3 Use Limitation. The Work and any derivative works thereof only may be used or intended for use
non-commercially. Notwithstanding the foregoing, NVIDIA and its affiliates may use the Work and any derivative
works commercially. As used herein, “non-commercially” means for research or evaluation purposes only."
But I see that the authors are OK with it being ported to other websites (#20),
so what does "for research or evaluation purposes only" cover under the license?

Simple SegFormer network class

Hello
How are you?
Thanks for contributing to this project.
It is difficult for us to use this project because it contains many other scripts.
Did you check https://github.com/lucidrains/segformer-pytorch, which is a third-party implementation of SegFormer?
That project contains ONLY a simple SegFormer network class, so it is easy to use.
But the number of parameters of the MiT-B0 network in that implementation is 7M.
I know that the number of parameters of MiT-B0 is 3.6M in the paper.
Could you briefly check https://github.com/lucidrains/segformer-pytorch?
If that is difficult, could you provide a SegFormer network class like the above implementation?
Thanks

Is this memory usage normal?

I trained with batch size = 2, image size (1216, 1216), and mit_b1. It uses almost 22 GB of GPU memory; is this normal?
