Head-Free Lightweight Semantic Segmentation with Linear Transformer

This repository contains the official PyTorch implementation of AFFormer, including the training and evaluation code and the pretrained models. 🔥🔥

Figure 1: Performance of AFFormer.

AFFormer is a head-free, lightweight and powerful semantic segmentation method, as shown in Figure 1.

We use MMSegmentation v0.21.1 as the codebase.

Installation

For installation and data preparation, please refer to the guidelines in MMSegmentation v0.21.1.

An example setup (works for me): CUDA 11.3 and PyTorch 1.10.1

pip install mmcv-full==1.5.0
pip install torchvision
pip install timm
pip install opencv-python
pip install einops
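
After installing, it can help to confirm what actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper names (`installed_version`, `check_environment`) and the `EXPECTED` table are illustrative and not part of this repository. The pinned versions are simply the ones from the example setup above.

```python
from importlib import metadata

# Versions from the example setup above; None means "any version".
EXPECTED = {
    "torch": "1.10.1",
    "mmcv-full": "1.5.0",
    "torchvision": None,
    "timm": None,
    "opencv-python": None,
    "einops": None,
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Return a list of readable problems; an empty list means nothing to report."""
    problems = []
    for package, wanted in expected.items():
        found = installed_version(package)
        if found is None:
            problems.append(f"{package}: not installed")
        # startswith() tolerates local build tags such as "1.10.1+cu113".
        elif wanted is not None and not found.startswith(wanted):
            problems.append(f"{package}: found {found}, example setup used {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the example setup"]:
        print(line)
```

A version mismatch is not necessarily an error; the check only reports differences from the one configuration listed here.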

Evaluation

Download weights (google drive | alidrive).

Example: evaluate AFFormer-base on ADE20K:

# Single-gpu testing
bash tools/dist_test.sh ./configs/AFFormer/AFFormer_base_ade20k.py /path/to/checkpoint_file.pth 1 --eval mIoU

# Multi-gpu testing
bash tools/dist_test.sh ./configs/AFFormer/AFFormer_base_ade20k.py /path/to/checkpoint_file.pth <GPU_NUM> --eval mIoU

# Multi-gpu, multi-scale testing
bash tools/dist_test.sh ./configs/AFFormer/AFFormer_base_ade20k.py /path/to/checkpoint_file.pth <GPU_NUM> --eval mIoU --aug-test
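
The `--eval mIoU` option reports mean intersection-over-union. As a rough illustration of what that metric measures, here is a pure-Python sketch. It is not MMSegmentation's implementation; the function name `mean_iou` and the ignore label `255` are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return (mIoU, per-class IoU dict) for two flat sequences of labels.

    Pixels whose ground-truth label equals ignore_index are skipped, and
    classes that appear in neither prediction nor ground truth are left
    out of the mean.
    """
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersect[p] += 1

    ious = {}
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersect[c]
        if union > 0:
            ious[c] = intersect[c] / union

    miou = sum(ious.values()) / len(ious) if ious else float("nan")
    return miou, ious


if __name__ == "__main__":
    # Six "pixels", three classes; the last pixel is ignored.
    prediction = [0, 0, 1, 1, 2, 2]
    ground_truth = [0, 1, 1, 1, 2, 255]
    miou, per_class = mean_iou(prediction, ground_truth, num_classes=3)
    print(per_class, miou)
```

In the toy example the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18.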

Training

Download weights (google drive | alidrive) pretrained on ImageNet-1K (refer to DeiT), and put them in a folder pretrained/.

Example: train AFFormer-base on ADE20K:

# Single-gpu training
bash tools/dist_train.sh ./configs/AFFormer/AFFormer_base_ade20k.py 1

# Multi-gpu training
bash tools/dist_train.sh ./configs/AFFormer/AFFormer_base_ade20k.py <GPU_NUM>
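
When launching training from a script, the same command can be assembled programmatically. This is a small sketch, assuming that `tools/dist_train.sh` takes the config path followed by the number of GPUs, as it does in MMSegmentation v0.21.1; `build_train_command` is a hypothetical helper, not something this repository provides.

```python
import shlex


def build_train_command(config, gpus=1, script="tools/dist_train.sh"):
    """Return the argv list for a training launch (assumed order: config, gpus)."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return ["bash", script, config, str(gpus)]


if __name__ == "__main__":
    cmd = build_train_command("./configs/AFFormer/AFFormer_base_ade20k.py", gpus=4)
    print(shlex.join(cmd))
    # To actually launch from the repository root, one could pass cmd to
    # subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids quoting problems when a path contains spaces.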

Visualize

Here is a demo script to test a single image. For more details, refer to the MMSegmentation documentation.

python demo/image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--device ${DEVICE_NAME}] [--palette ${PALETTE}]

Example: visualize SegFormer-B1 on Cityscapes:

python demo/image_demo.py demo/demo.png local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py \
/path/to/checkpoint_file --device cuda:0 --palette cityscapes
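
The palette decides which color each predicted class is drawn in. Conceptually, a visualization of this kind maps every class index to an RGB color and blends the result over the input image. The sketch below shows that idea in plain Python on a tiny two-pixel "image". It is an illustration only, not the demo script's code; `colorize`, `blend`, and the 0.5 opacity are assumptions for this example. The two colors are the first two entries of the standard Cityscapes palette (road, sidewalk).

```python
# First two Cityscapes palette entries: road and sidewalk.
PALETTE = [(128, 64, 128), (244, 35, 232)]


def colorize(seg, palette):
    """Map a 2-D grid of class indices to a grid of RGB tuples."""
    return [[palette[label] for label in row] for row in seg]


def blend(image, color_map, opacity=0.5):
    """Alpha-blend an RGB image with a same-sized color map."""
    return [
        [
            tuple(round(p * (1 - opacity) + c * opacity) for p, c in zip(pixel, color))
            for pixel, color in zip(image_row, color_row)
        ]
        for image_row, color_row in zip(image, color_map)
    ]


if __name__ == "__main__":
    image = [[(0, 0, 0), (100, 101, 100)]]  # one row, two pixels
    seg = [[0, 1]]                          # predicted class per pixel
    overlay = blend(image, colorize(seg, PALETTE))
    print(overlay)
```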

License

The code is released under the MIT license.

Copyright

Copyright (C) 2010-2021 Alibaba Group Holding Limited.

Citation

If you find this work helpful to your research, please consider citing the paper:

@inproceedings{dong2023afformer,
  title={AFFormer: Head-Free Lightweight Semantic Segmentation with Linear Transformer},
  author={Dong, Bo and Wang, Pichao and Wang, Fan},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  pages={},
  year={2023}
}
