VmambaIR: Visual State Space Model for Image Restoration

Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, and Wenming Yang, "VmambaIR: Visual State Space Model for Image Restoration", arXiv, 2024

[arXiv] [supplementary material] [visual results] [pretrained models]

🔥🔥🔥 News

  • 2024-03-18: This repo is released.

Abstract: Image restoration is a critical task in low-level computer vision, aiming to restore high-quality images from degraded inputs. Various models, such as convolutional neural networks (CNNs), generative adversarial networks (GANs), transformers, and diffusion models (DMs), have been employed to address this problem with significant impact. However, CNNs have limitations in capturing long-range dependencies. DMs require large prior models and computationally intensive denoising steps. Transformers have powerful modeling capabilities but face challenges due to quadratic complexity with input image size. To address these challenges, we propose VmambaIR, which introduces State Space Models (SSMs) with linear complexity into comprehensive image restoration tasks. We utilize a Unet architecture to stack our proposed Omni Selective Scan (OSS) blocks, consisting of an OSS module and an Efficient Feed-Forward Network (EFFN). Our proposed omni selective scan mechanism overcomes the unidirectional modeling limitation of SSMs by efficiently modeling image information flows in all six directions. Furthermore, we conducted a comprehensive evaluation of our VmambaIR across multiple image restoration tasks, including image deraining, single image super-resolution, and real-world image super-resolution. Extensive experimental results demonstrate that our proposed VmambaIR achieves state-of-the-art (SOTA) performance with much fewer computational resources and parameters. Our research highlights the potential of state space models as promising alternatives to the transformer and CNN architectures in serving as foundational frameworks for next-generation low-level visual tasks.
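The linear-complexity and directionality points in the abstract can be illustrated with a toy example. The sketch below is not the VmambaIR implementation: it uses a scalar state space recurrence with fixed, hypothetical coefficients `a`, `b`, and `c` (selective SSMs make such parameters input-dependent), and the function names `linear_scan` and `bidirectional_scan` are invented for illustration. It shows that one scan is a single pass over the sequence, that a forward scan only carries information to later positions, and that adding a reversed scan lets every position see the whole sequence. Scanning along additional orderings of an image follows the same idea.

```python
from typing import List


def linear_scan(xs: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Toy scalar state space recurrence (illustrative, not the paper's model).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    The loop visits each element once, so the cost is linear in len(xs).
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def bidirectional_scan(xs: List[float], **coeffs: float) -> List[float]:
    """Sum a forward scan and a backward scan so each output sees both sides."""
    forward = linear_scan(xs, **coeffs)
    # Scan the reversed sequence, then flip the result back to the original order.
    backward = linear_scan(xs[::-1], **coeffs)[::-1]
    return [f + g for f, g in zip(forward, backward)]


impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
print(linear_scan(impulse))         # [0.0, 0.0, 1.0, 0.5, 0.25]
print(bidirectional_scan(impulse))  # [0.25, 0.5, 2.0, 0.5, 0.25]
```

In the forward-only output the impulse at index 2 never reaches indices 0 and 1, whereas the bidirectional output is symmetric around it.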


Single Image Super-Resolution

Real-World Image Super-Resolution

Image Deraining


⚒️ TODO

  • Release code and pretrained models

🔗 Contents

  1. Datasets
  2. Models
  3. Training
  4. Testing
  5. Results
  6. Citation
  7. Acknowledgements

🔎 Results

We achieved state-of-the-art performance on multiple image restoration tasks. Detailed results can be found in the paper.

Evaluation on Single Image Super-Resolution
  • quantitative comparisons in Table 1 of the main paper

  • visual comparison in Figure 5 of the main paper

Evaluation on Real-World Image Super-Resolution
  • quantitative comparisons in Table 2 of the main paper

  • visual comparison in Figure 6 of the main paper

Evaluation on Image Deraining
  • quantitative comparisons in Table 2 of the main paper

  • visual comparison in Figure 6 of the main paper

📎 Citation

If you find the code helpful in your research or work, please cite the following paper.

@article{chen2023image,
  title={VmambaIR: Visual State Space Model for Image Restoration},
  author={Shi, Yuan and Xia, Bin and Jin, Xiaoyu and Wang, Xing and Zhao, Tianyu and Xia, Xin and Xiao, Xuefeng and Yang, Wenming},
  journal={arXiv preprint arXiv:2403.11423},
  year={2024}
}

💡 Acknowledgements

This code is built on BasicSR and Vmamba.
