DropTrack

The official fine-tuning implementation of DropTrack for the CVPR 2023 paper DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks.

☀️ Highlights

* Thanks to the great OSTrack library, which helped us quickly implement the DropMAE VOT fine-tuning. This repository mainly follows the OSTrack repository.

* OSTrack with our DropMAE pre-trained models achieves state-of-the-art performance on popular tracking benchmarks.

| Tracker | GOT-10K (AO) | LaSOT (AUC) | LaSOT (AUC) | TrackingNet (AUC) | TNL2K (AUC) |
|---|---|---|---|---|---|
| DropTrack-K700-ViTBase | 75.9 | 71.8 | 52.7 | 84.1 | 56.9 |

🌟 Training Speed

DropTrack has the same training procedure and nearly the same model parameters as OSTrack (the only difference is the use of two frame identity embeddings), so its training speed is consistent with OSTrack's. We use 4 A100 GPUs with a total batch size of 128; training on GOT-10k for 100 epochs takes about 6 hours.

Install the environment

Option 1: Use Anaconda to create the Python environment, mainly following the installation in OSTrack. The required packages are listed in requirements.txt and can be installed as follows:

conda create -n droptrack python=3.8
conda activate droptrack
pip install -r requirements.txt

Set project paths

Run the following command to set paths for this project

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir ./output

After running this command, you can also modify the paths by editing these two files:

lib/train/admin/local.py  # paths about training
lib/test/evaluation/local.py  # paths about testing
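
As a quick sanity check after running the path-setup command, the sketch below reports which of the two local.py files named above are still missing. The helper name `missing_local_files` is hypothetical and not part of the repository; it only assumes that the script is run from, or pointed at, the project root.

```python
from pathlib import Path

# The two path files named in this README.
LOCAL_FILES = [
    "lib/train/admin/local.py",
    "lib/test/evaluation/local.py",
]


def missing_local_files(project_root="."):
    """Return the local.py files that do not exist under project_root (hypothetical helper)."""
    root = Path(project_root)
    return [rel for rel in LOCAL_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_local_files(".")
    if missing:
        print("Not found (re-run create_default_local_file.py?):", missing)
    else:
        print("Both local.py files are present.")
```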

Data Preparation

Put the tracking datasets in ./data. It should look like:

${PROJECT_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST
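
Before starting a long training run, it may help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: `EXPECTED_LAYOUT` and `find_missing_dirs` are hypothetical names, and the check covers only the directories listed explicitly above (the LaSOT category folders are elided in the tree, so they are not enumerated here).

```python
from pathlib import Path

# Sub-directories taken from the layout shown above.
# LaSOT has one folder per category (airplane, basketball, ...); the full
# list is elided in the README, so only the dataset folder itself is checked.
EXPECTED_LAYOUT = {
    "lasot": [],
    "got10k": ["test", "train", "val"],
    "coco": ["annotations", "images"],
    "trackingnet": [f"TRAIN_{i}" for i in range(12)] + ["TEST"],
}


def find_missing_dirs(data_root):
    """Return the expected directories that are absent under data_root (hypothetical helper)."""
    root = Path(data_root)
    missing = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        dataset_dir = root / dataset
        if not dataset_dir.is_dir():
            missing.append(dataset)
            continue  # skip the children of a missing dataset folder
        for sub in subdirs:
            if not (dataset_dir / sub).is_dir():
                missing.append(f"{dataset}/{sub}")
    return missing


if __name__ == "__main__":
    missing = find_missing_dirs("./data")
    if missing:
        print("Missing directories:")
        for name in missing:
            print("  " + name)
    else:
        print("Dataset layout looks complete.")
```

Datasets you do not plan to train on can simply be removed from `EXPECTED_LAYOUT`.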

Training

  • Download the pre-trained DropMAE models from DropMAE and put them under ${PROJECT_ROOT}/pretrained_models.
  • Set the PRETRAIN_FILE tag in vitb_384_mae_ce_32x4_ep300.yaml or vitb_384_mae_ce_32x4_got10k_ep100.yaml to the file name of your downloaded DropMAE pre-trained model.
  • Training Command on GOT-10K:
cd path_to_your_project
python tracking/train.py --script ostrack --config vitb_384_mae_ce_32x4_got10k_ep100 --save_dir save_path --mode multiple --nproc_per_node 4 --use_lmdb 0 --use_wandb 0
  • Training Command on the other tracking datasets:
cd path_to_your_project
python tracking/train.py --script ostrack --config vitb_384_mae_ce_32x4_ep300 --save_dir save_path --mode multiple --nproc_per_node 4 --use_lmdb 0 --use_wandb 0
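
The two training commands above differ only in the config name, so they can be assembled from a single helper. This is a minimal sketch assuming it is launched from the project root; `build_train_command` and `run_training` are hypothetical names, and only the flags shown above are used.

```python
import shlex
import subprocess


def build_train_command(config, save_dir, nproc_per_node=4):
    """Assemble the tracking/train.py call using only the flags shown in this README."""
    return [
        "python", "tracking/train.py",
        "--script", "ostrack",
        "--config", config,
        "--save_dir", save_dir,
        "--mode", "multiple",
        "--nproc_per_node", str(nproc_per_node),
        "--use_lmdb", "0",
        "--use_wandb", "0",
    ]


def run_training(config, save_dir, nproc_per_node=4, dry_run=True):
    """Print the command; launch it only when dry_run is False (hypothetical helper)."""
    cmd = build_train_command(config, save_dir, nproc_per_node)
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    # GOT-10k only (100 epochs), printed without launching.
    run_training("vitb_384_mae_ce_32x4_got10k_ep100", "save_path")
    # The other tracking datasets (300 epochs), printed without launching.
    run_training("vitb_384_mae_ce_32x4_ep300", "save_path")
```

Keeping `dry_run=True` by default lets you inspect the exact command line before committing GPU time.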

Training logs

The training log of DropTrack-Got10k-100E is available here.

Evaluation

Download the tracking model weights

| | K400-1600E-GOT10k | K700-800E-GOT10k | K700-800E-AllData |
|---|---|---|---|
| Tracking Models | download | download | download |

In lib/test/evaluation/local.py, change the corresponding values to the paths where your benchmarks are actually stored. Note that the save_dir tag should be set to the path of the downloaded tracking model; you can also modify the tracking model name in lib/test/parameter/ostrack.py.

Some testing examples:

  • LaSOT or other off-line evaluated benchmarks (modify --dataset correspondingly)
python tracking/test.py ostrack vitb_384_mae_ce_32x4_ep300 --dataset lasot --threads 16 --num_gpus 4
python tracking/analysis_results.py # need to modify tracker configs and names
  • GOT10K-test
python tracking/test.py ostrack vitb_384_mae_ce_32x4_got10k_ep100 --dataset got10k_test --threads 16 --num_gpus 4
python lib/test/utils/transform_got10k.py --tracker_name ostrack --cfg_name vitb_384_mae_ce_32x4_got10k_ep100

Acknowledgments

  • Thanks to the OSTrack library for making the implementation convenient.

Citation

If our work is useful for your research, please consider citing:

@inproceedings{dropmae2023,
  title={DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks},
  author={Qiangqiang Wu and Tianyu Yang and Ziquan Liu and Baoyuan Wu and Ying Shan and Antoni B. Chan},
  booktitle={CVPR},
  year={2023}
}

droptrack's Issues

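Equation (4) in the paper

Hello,
Question 1: Why does Equation (4) in the paper multiply a temporal matching probability by a normalized spatial importance? Is there any connection between these two variables?
Question 2: The paper also states that "when W_{i,j} is large, the i-th query token has a good between-frame match". The between-frame matching is directly determined by the f_tem function, so why is it still the case that, after multiplying by the normalized spatial importance, a larger W_{i,j} means a better between-frame correlation?
Looking forward to your reply, thank you!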

Inference Speed

Why is DropTrack so much slower than OSTrack?

When testing with OSTrack-256, it reaches 110+ FPS on GOT-10K, but DropTrack-256 only reaches 55+ FPS, which is even slower than OSTrack-384 at 75+ FPS. I am using an RTX 3080 (10G).

ONNX conversion

Thanks for your great work. I have been working on this topic for a long time and I am really impressed with it. Are you planning to convert your model to ONNX?

install issue

Thanks for your great work. When I follow your instructions to install the packages in requirement.txt, they do not install properly. Did you install the packages with the command "pip install -r requirement.txt"? Hoping for your reply.

about log

Thanks for your pre-trained models. Could you share your training logs for DropTrack?
