
gsdd's Introduction

πŸ“– GSDD: Generative Space Dataset Distillation for Image Super-Resolution (AAAI2024) πŸŽ‰

πŸ’» This is the official implementation of GSDD, accepted by AAAI 2024. πŸ˜„ Thanks for your interest! ❀️

πŸ“„ Paper πŸ“œ Supp πŸ“ˆ Poster πŸŽ₯ Video


πŸ”­ Overview


🧰 Dependencies

Windows 10 or Ubuntu 18.04
NVIDIA RTX 3090 Ti (24 GB)
CUDA          11.1
torch         1.8.1+cu111
torchvision   0.9.1+cu111

πŸ› οΈ Environment

# git clone repository

git clone https://github.com/eric930711/GSDD.git
cd GSDD

# create conda environment and activate

conda create -n GSDD python=3.9
conda activate GSDD

# install requirements
 
pip install -r requirements.txt
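
After installation, the quick check below (a minimal sketch, not part of the repository) can confirm that the installed torch/torchvision builds and the CUDA runtime match the versions listed above:

# check_env.py -- illustrative sanity check, not part of the official repo
import torch
import torchvision

print("torch:", torch.__version__)              # expected: 1.8.1+cu111
print("torchvision:", torchvision.__version__)  # expected: 0.9.1+cu111
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)        # expected: 11.1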

πŸ”§ Usage

Pretrained models: Google Drive    Distilled samples: Google Drive

Training & Inference

COMING SOON...

πŸ‘€ Quick View

Abstract

Single image super-resolution (SISR), especially in the real world, usually builds a large number of LR-HR image pairs to learn representations that contain rich textural and structural information. However, relying on massive data for model training not only reduces training efficiency, but also causes heavy data storage burdens. In this paper, we attempt a pioneering study on dataset distillation (DD) for SISR problems to explore how data could be slimmed and compressed for the task. Unlike previous coreset selection methods, which select a few typical examples directly from the original data, we remove the limitation that the selected data cannot be further edited, and propose to synthesize and optimize samples to preserve more task-useful representations. Concretely, by utilizing pre-trained GANs as a suitable approximation of the realistic data distribution, we propose GSDD, which distills data in a latent generative space based on GAN-inversion techniques. By optimizing the samples to match the practical data distribution in an informative feature space, the distilled data can then be synthesized. Experimental results demonstrate that when trained with our distilled data, GSDD achieves performance comparable to state-of-the-art (SOTA) SISR algorithms, while realizing a nearly ×8 increase in training efficiency and a saving of almost 93.2% of data storage space. Further experiments on challenging real-world data also demonstrate the promising generalization ability of GSDD.
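
For intuition, the distribution-matching step described in the abstract can be pictured with the minimal sketch below. The names feature_extractor, generator, latent_codes, and real_batch are illustrative stand-ins, not the authors' actual API: a frozen pre-trained GAN decodes optimizable latent codes into synthetic images, and their mean features are pulled toward those of real data.

# Hedged sketch of distribution matching in a generative latent space.
# All names are illustrative assumptions, not the official GSDD code.
import torch
import torch.nn.functional as F

def distribution_matching_loss(feature_extractor, generator, latent_codes, real_batch):
    # Decode the distilled samples from their optimizable latent codes
    # through a frozen, pre-trained GAN generator.
    synthetic = generator(latent_codes)
    # Embed synthetic and real images with the same feature extractor and
    # match their mean embeddings (first-moment matching) in feature space.
    f_syn = feature_extractor(synthetic).mean(dim=0)
    f_real = feature_extractor(real_batch).mean(dim=0)
    return F.mse_loss(f_syn, f_real)

Optimizing this loss over the latent codes (rather than over raw pixels) is what keeps the distilled samples on the generator's realistic image manifold.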

Method

GSDD

Datasets

Our experimental datasets include OST and DPED. The distilled OST image data (i.e., ICP=50, 100, 200, and 500) will be published soon.

Quantitative Comparison
Qualitative Presentation
Distilled Samples Showcase
Influence of Varying ICP Numbers
Ablation Study
Conclusion

This paper focuses on exploring the possibility of applying DD to benefit low-level CV tasks (especially SISR). Overall, we propose the GSDD framework, which optimizes and synthesizes training samples via GAN-inversion manipulation in a latent generative space. Our optimization process is based on a distribution-matching data condensation strategy, so that the SISR network can achieve performance comparable to models trained on the original dataset. We further improve the approach by proposing a distillation loss with the regularization term R. Finally, we demonstrate its effectiveness via extensive experiments: we achieve competitive performance compared to SOTA SISR solutions while realizing a nearly ×8 increase in training efficiency and saving almost 93.2% of data storage space. We hope that this study opens a new perspective in SR research and facilitates more practical applications in the future.
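
As a rough illustration of how such a distillation objective could be optimized end to end, the sketch below updates only the latent codes with Adam and adds a simple L2 penalty as a placeholder for the regularization term R. The tiny generator and feature extractor are dummies, not the pre-trained GAN or the ResNet18 embedder used in the paper, and the constants are arbitrary.

# Hedged sketch of the outer optimization loop; every module and constant
# here is a placeholder, not the official GSDD implementation.
import torch
from torch import nn
import torch.nn.functional as F

latent_dim, icp, num_steps = 128, 50, 1000
generator = nn.Sequential(nn.Linear(latent_dim, 3 * 32 * 32), nn.Tanh())   # dummy GAN decoder
feature_extractor = nn.Sequential(nn.Linear(3 * 32 * 32, 256), nn.ReLU())  # dummy embedder
for p in list(generator.parameters()) + list(feature_extractor.parameters()):
    p.requires_grad_(False)  # both networks stay frozen; only the latent codes are optimized

latent_codes = torch.randn(icp, latent_dim, requires_grad=True)  # one code per distilled sample
optimizer = torch.optim.Adam([latent_codes], lr=1e-2)

for step in range(num_steps):
    real_batch = torch.rand(icp, 3 * 32 * 32)       # stand-in for a batch of real training images
    synthetic = generator(latent_codes)
    match_loss = F.mse_loss(feature_extractor(synthetic).mean(dim=0),
                            feature_extractor(real_batch).mean(dim=0))
    reg = latent_codes.pow(2).mean()                 # simple L2 stand-in for the term R
    loss = match_loss + 1e-4 * reg
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()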


πŸ“– Citation

Please cite us if our work is useful for your research. Thanks a lot! ❀️



⭐ License

This project is released under the Apache 2.0 license.


🀝 Acknowledgement

This work received strong support from SSL and Tom Cai. We would like to thank them!


gsdd's Issues

pre-trained feature extractor

Great work! I have some questions about the pre-trained feature extractor (ResNet18 in the paper):

  1. What dataset is used to pre-train the feature extractor: your task dataset or ImageNet?

  2. Why does the "Distribution Matching Optimization" part say the feature extractor is pre-trained, while the "Training Data Distillation" part describes it as a randomly sampled embedding space?

Looking forward to your answer.
