dropseg's Introduction

DropSeg

The official fine-tuning implementation of our VOS approach (DropSeg) for the CVPR 2023 paper DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks.

☀️ Highlights

* Thanks to the great STCN library, which helped us quickly implement the DropMAE VOS fine-tuning. This repository mainly follows the STCN repository.

* The proposed DropSeg uses pairs of frames for offline VOS training and achieves state-of-the-art results on existing VOS benchmarks with one-shot evaluation.

Install the environment

Anaconda is used to create the Python environment. The setup mainly follows the installation in DropMAE and partially that in STCN. The full list of packages can be found in environment.yaml.

Training

Data preparation

We follow the same data preparation steps used in STCN. Download both the DAVIS and YouTube-19 (YouTube-VOS 2019) datasets and organize them as follows:

├── DAVIS
│   ├── 2016
│   │   ├── Annotations
│   │   └── ...
│   └── 2017
│       ├── test-dev
│       │   ├── Annotations
│       │   └── ...
│       └── trainval
│           ├── Annotations
│           └── ...
├── YouTube
│   ├── all_frames
│   │   └── valid_all_frames
│   ├── train
│   ├── train_480p
│   └── valid
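
Before training, it can help to confirm that the dataset root matches the tree above. The following is a minimal sketch and not part of the repository: the function name `missing_dataset_dirs` and its `root` argument are hypothetical, and the folder list is simply copied from the layout above (entries shown as `...` are not checked).

```python
from pathlib import Path

# Sub-directories copied from the layout above, relative to the dataset root.
EXPECTED_DIRS = [
    "DAVIS/2016/Annotations",
    "DAVIS/2017/test-dev/Annotations",
    "DAVIS/2017/trainval/Annotations",
    "YouTube/all_frames/valid_all_frames",
    "YouTube/train",
    "YouTube/train_480p",
    "YouTube/valid",
]


def missing_dataset_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dataset_dirs("path_to_dataset_root")` returns an empty list when every folder is in place; otherwise it lists the folders that still need to be downloaded or moved.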

Pre-trained model download

Download a pre-trained DropMAE model from the DropMAE repository (e.g., K700-800E).

Training command

python -m torch.distributed.launch --master_port 9842 --nproc_per_node=8 train_dropseg.py --pretrained_net_path pretrained_model_path --id retrain_s03 --stage 3

--pretrained_net_path is the path to your downloaded pre-trained model.
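
If the command needs to be launched from a script (for example with a different GPU count or port), the argument list can be assembled programmatically. This is only a sketch: `build_train_command` is a hypothetical helper, the flags and default values are taken from the command above, and whether the training script needs other changes for a different `--nproc_per_node` is not covered here.

```python
import shlex


def build_train_command(pretrained_net_path, run_id="retrain_s03", stage=3,
                        nproc_per_node=8, master_port=9842):
    """Assemble the launch command shown above as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--master_port", str(master_port),
        f"--nproc_per_node={nproc_per_node}",
        "train_dropseg.py",
        "--pretrained_net_path", str(pretrained_net_path),
        "--id", run_id,
        "--stage", str(stage),
    ]


# Example: same command as above, but for 4 GPUs (an illustrative value).
cmd = build_train_command("pretrained_model_path", nproc_per_node=4)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell-quoting problems when the checkpoint path contains spaces.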

Inference command

Download the DropSeg model here, and run the evaluation with the following commands. All evaluations are done at 480p resolution.

python submit_eval_davis17.py --davis_path path_to_davis17_dataset
python submit_eval_davis16.py --davis_path path_to_davis16_dataset

After running the above evaluation, you can find the qualitative results saved in the project root directory. You can use the offline evaluation toolkit (https://github.com/davisvideochallenge/davis2017-evaluation) to get the validation performance on DAVIS-16/17. For test-dev on DAVIS-17, use the online evaluation server instead.
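
Among the measures reported for DAVIS is the region similarity J, which is the intersection-over-union between a predicted mask and the ground-truth mask. The sketch below illustrates that measure for one object in one frame. It is not the toolkit's implementation: `region_similarity` is a hypothetical name, masks are plain nested lists of 0/1 for simplicity, and scoring two empty masks as 1.0 is an assumption made here.

```python
def region_similarity(pred_mask, gt_mask):
    """Jaccard index (IoU) between two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred_mask, gt_mask):
        for p, g in zip(pred_row, gt_row):
            intersection += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


pred = [[0, 1, 1],
        [0, 1, 0]]
gt = [[0, 1, 0],
      [0, 1, 1]]
print(region_similarity(pred, gt))  # 2 shared pixels / 4 pixels in the union = 0.5
```

A benchmark score averages this value over all objects and frames of a sequence, so a quick per-frame check like this is mainly useful for spotting badly segmented frames before running the full evaluation.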

Acknowledgments

  • Thanks to the STCN library for its convenient implementation.

Citation

If our work is useful for your research, please consider citing:

@inproceedings{dropmae2023,
  title={DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks},
  author={Qiangqiang Wu and Tianyu Yang and Ziquan Liu and Baoyuan Wu and Ying Shan and Antoni B. Chan},
  booktitle={CVPR},
  year={2023}
}

dropseg's People

Contributors

jimmy-dq

dropseg's Issues

Problems when inference

Hi, I want to use the inference code to generate DAVIS results, but I ran into a problem:
AssertionError: Input image height (480) doesn't match model (224), which is raised in self.patch_embed of models_vit.
Can you help me?
