researchmm / stark

[ICCV'21] Learning Spatio-Temporal Transformer for Visual Tracking

License: MIT License

Shell 0.86% Python 99.14%

transformer

stark's Issues

the HZ of the GOT-10K evaluation result

Hi, thanks for your work! Could you please provide the Hz value on GOT-10K when evaluating STARK?

Are all the models here working on single object tracking not MOT?

Hello,
Thank you for your contribution ^^
I need to know whether all the pretrained models here work on single object tracking.
I also wonder whether any of the models works for MOT.

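Some questions about Section 4.4 "Comparison with Other Frameworks" in the paper

Hello, I would like to ask whether the network structure meant by "Template images serve as the queries" is the following: in a standard transformer, the template and the search region are both fed into the encoder, and the template is also fed into the decoder, as in the figure below?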

Some confusion about the speed.

I would like to know whether you tested the speed on a Tesla V100 GPU.

How to analyze the model on the GOT10k-val dataset?

Thanks for your work! I trained the model and want to evaluate it on the GOT10k-Val dataset to see its performance, but I only see the 'LaSOT', 'otb', 'nfs', 'uav', and 'tc128ce' datasets. How can I evaluate on GOT10k-Val? By the way, what is the difference between the analysis_results and analysis_results_ITP files?

Out of memory when testing on the GOT-10k dataset

Hello, when I use
python tracking/test.py stark_st baseline_got10k_only --dataset got10k_test --threads 32
python lib/test/utils/transform_got10k.py --tracker_name stark_st --cfg_name baseline_got10k_only
to test on the GOT-10k dataset, an OOM error occurs after fewer than 20 sequences have produced results.
My hardware platform is an RTX 3090.

About swin_base_patch4_window12_384_22k.pth

First of all, thank you for your excellent work.
Where can I download swin_base_patch4_window12_384_22k.pth? I could not find any link.

Alpha-Refine Problems

Hello! The archive you uploaded is broken and cannot be extracted!
https://drive.google.com/file/d/1qOQRfaRMbQ2nmgX1NFjoQHfXOAn609QM/view

Effect of template choice on transformer

Thanks for sharing! I have some questions about the choice of template. According to the paper, you crop a region 2^2 times the area of the ground-truth bounding box, rather than just the target bounding box resized to a square image. My questions are:

Is the purpose here to include more surrounding information? If so, what would be the optimal template size? Also, a factor of 2 would not always include the whole tracked object if the aspect ratio is high.
Since the bounding box is not specified exactly, I assume the transformer has to learn some segmentation capability? For instance, I noticed that if you change the template crop size slightly at inference time (with the output size unchanged), the model performs very poorly. So it seems that some information sensitive to absolute positions is learned in this setting. Would passing the exact coordinates into the transformer help in any way?
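
The aspect-ratio concern above can be checked with a little arithmetic. The sketch below assumes the crop is a square whose side is factor * sqrt(w * h), so that a factor of 2 gives 2^2 times the box area; that formula and the function names are illustrative assumptions, not taken from the STARK code.

```python
import math


def crop_side(w: float, h: float, factor: float = 2.0) -> float:
    """Side of a square crop whose area is factor**2 times the box area (assumed convention)."""
    return factor * math.sqrt(w * h)


def target_fits(w: float, h: float, factor: float = 2.0) -> bool:
    """True if the whole w x h box fits inside the square crop centred on it."""
    return max(w, h) <= crop_side(w, h, factor)


# A 100x50 box (aspect ratio 2) versus a 500x20 box (aspect ratio 25).
print(target_fits(100, 50))
print(target_fits(500, 20))
```

Under this assumption the box fits exactly when its aspect ratio is at most factor^2, which is 4 for a factor of 2.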

What is the effect of replacing transformer with swin transformer?

Hi,

Thanks for your good work! What is the effect of replacing the transformer with a Swin Transformer?

Looking forward to your reply~

tutorials for lmdb

Hi, can you provide a tutorial on using the lmdb format?

How long does it take to train the STARK baseline on a 1080Ti?

Reproducing the work on a recorded video

Hi,

Thanks for your great work! I am trying to reproduce the work on a recorded video, but I have made no progress. Could you provide a tutorial on how to run your tracker on a video? Thanks a lot.

cuda10.2 and 3060 do not match

run: python tracking/video_demo.py stark_s baseline test_video/demo.mp4

cuda10.2:

NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the NVIDIA GeForce RTX 3060 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

cuda11.0:

WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
Traceback (most recent call last):
  File "tracking/../lib/train/admin/tensorboard.py", line 4, in <module>
    from torch.utils.tensorboard import SummaryWriter
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tracking/video_demo.py", line 9, in <module>
    from lib.test.evaluation import Tracker
  File "tracking/../lib/test/evaluation/__init__.py", line 1, in <module>
    from .data import Sequence
  File "tracking/../lib/test/evaluation/data.py", line 3, in <module>
    from lib.train.data.image_loader import imread_indexed
  File "tracking/../lib/train/__init__.py", line 1, in <module>
    from .admin.multigpu import MultiGPU
  File "tracking/../lib/train/admin/__init__.py", line 3, in <module>
    from .tensorboard import TensorboardWriter
  File "tracking/../lib/train/admin/tensorboard.py", line 7, in <module>
    from tensorboardX import SummaryWriter
ModuleNotFoundError: No module named 'tensorboardX'

But when I installed tensorboardX:

WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
Traceback (most recent call last):
  File "tracking/../lib/train/admin/tensorboard.py", line 4, in <module>
    from torch.utils.tensorboard import SummaryWriter
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tracking/video_demo.py", line 9, in <module>
    from lib.test.evaluation import Tracker
  File "tracking/../lib/test/evaluation/__init__.py", line 1, in <module>
    from .data import Sequence
  File "tracking/../lib/test/evaluation/data.py", line 3, in <module>
    from lib.train.data.image_loader import imread_indexed
  File "tracking/../lib/train/__init__.py", line 1, in <module>
    from .admin.multigpu import MultiGPU
  File "tracking/../lib/train/admin/__init__.py", line 3, in <module>
    from .tensorboard import TensorboardWriter
  File "tracking/../lib/train/admin/tensorboard.py", line 7, in <module>
    from tensorboardX import SummaryWriter
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/tensorboardX/__init__.py", line 5, in <module>
    from .torchvis import TorchVis
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/tensorboardX/torchvis.py", line 11, in <module>
    from .writer import SummaryWriter
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/tensorboardX/writer.py", line 34, in <module>
    import torch
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/richard/miniconda3/envs/torch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/richard/miniconda3/envs/torch1.7/lib/python3.6/site-packages/torch/lib/../../../../libcublas.so.11: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
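
The first warning in this report says that the installed PyTorch build contains no kernels for the GPU's compute capability. A rough sketch of that comparison follows, using the architecture list printed in the warning. The helper name and the matching rule (a binary built for sm_XY runs on devices of the same major version with an equal or higher minor version; PTX entries such as compute_37 are ignored) are simplifying assumptions, not PyTorch's actual check.

```python
def has_native_kernels(capability, arch_list):
    """Return True if any sm_XY entry can run natively on the device.

    capability: (major, minor) tuple, e.g. (8, 6) for sm_86.
    arch_list:  strings such as "sm_75"; entries not starting with "sm_" are skipped.
    """
    major, minor = capability
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue  # PTX targets (compute_XY) are ignored in this sketch
        digits = arch[len("sm_"):]
        arch_major, arch_minor = int(digits[:-1]), int(digits[-1])
        if arch_major == major and arch_minor <= minor:
            return True
    return False


# Architecture list copied from the warning above.
build_archs = "sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37".split()
print(has_native_kernels((8, 6), build_archs))  # RTX 3060 Laptop GPU (sm_86)
print(has_native_kernels((7, 5), build_archs))
```

With a working PyTorch installation, the real inputs would come from torch.cuda.get_device_capability() and torch.cuda.get_arch_list().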

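NFS dataset evaluation problem

Hello, I have recently been evaluating the model on the NFS dataset. What format should the NFS dataset be in? Do I need to write my own script to extract the original zip files? After running the command, I get a path error saying there is no anno folder.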

tensorrt

Hello,
I followed your steps and successfully converted the baseline stark_st2 model to ONNX.
Now I want to convert the ONNX model to a TensorRT model. Can you give me some advice?
If possible, please list your relevant environment (TensorRT, CUDA, etc.).
Thank you sincerely!

Run video_demo.py

Thanks for your excellent work. Could you please show me how to run video_demo.py with a pretrained model? Thanks a lot!

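About downloading the pretrained models

Hello, could you share your trained models with me by email? I cannot download them from the links on your GitHub page. My email is: [email protected]
The models I need are: stark_S50/baseline_got10k_only/STARKS_ep0500.pth.tar, stark_ST50/baseline_got10k_only/STARKST_ep0050.pth.tar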

FLOPS calculation

Hi,
I notice you are calculating MACs in your profile_model.py. Did you report the MACs value as GFLOPs?
The GFLOPs reported in your paper do not match my calculation.
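
The distinction matters because one multiply-accumulate (MAC) is usually counted as two floating-point operations. A minimal sketch for a single fully connected layer follows; the layer sizes are made up for illustration, and the factor of two is the common convention, not something confirmed for the paper's numbers.

```python
def linear_macs(in_features: int, out_features: int, tokens: int = 1) -> int:
    """Multiply-accumulates for y = Wx applied to `tokens` input vectors (bias ignored)."""
    return tokens * in_features * out_features


def macs_to_flops(macs: int) -> int:
    """Count each MAC as one multiply plus one add."""
    return 2 * macs


# Hypothetical layer: 256 -> 256 features applied to 400 tokens.
macs = linear_macs(in_features=256, out_features=256, tokens=400)
print(f"{macs / 1e9:.4f} GMACs = {macs_to_flops(macs) / 1e9:.4f} GFLOPs")
```

If a profiler reports MACs and a paper reports FLOPs, a mismatch of roughly a factor of two would be consistent with this convention.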

Dataloader crashes randomly

Hi.

I found that the training process crashes randomly with RuntimeError: DataLoader worker (pid(s) 36469) exited unexpectedly. Is that normal?

I use the following training command.

python tracking/train.py --script stark_s --config baseline_got10k_only --save_dir . --mode multiple --nproc_per_node 8

thanks!

BatchNorm problem

I tried to replace the backbone or bottleneck with other convolutional layers that use BN and ReLU. The training results are good, but the validation and test results are very poor.
I have already called network.eval() for testing.
Do BN and ReLU affect the results of model training and testing?
Should I change BN to LN, even after convolutional layers?
Or is it possible to use only the convolutional layers, without BN and ReLU?

the meaning of "lmdb" in "self.lasot_lmdb_dir"

Hi! Could you please tell me what "lmdb" means in "self.lasot_lmdb_dir" of class EnvironmentSettings? I guess it is the directory of the LaSOT validation set?

How many cores/threads do you use per task/PID/nproc?

Hi, I use eight 32 GB Tesla V100 GPUs and want to train your model, but the FPS is lower than yours. How many cores/threads do you use per task/PID/nproc? Thanks!

A problem about loading checkpoint

When I train the 'st' model, I found that 'net_type' is 'STARKS', but checkpoint_dict['net_type'] is 'LittleBoy_clean_corner',
so
assert net_type == checkpoint_dict['net_type'], 'Network is not of correct type.'
always fails.

How can I solve this problem? Thanks!

RawResults

Nice work! Could you provide your raw results for STARK on these datasets: UAV, GOT, LaSOT, OTB?

the specific role of the decoder in transformer structure

Hi!
You said that "In the encoder-decoder attention module, the target query can attend to all positions on the template and the search region features, thus learning robust representations for the final bounding box prediction." in your paper. How should I understand that? It is quite abstract to me.
Thanks for your reply!

Thanks for your amazing work. How can I test your model on my own video?

Why not use sequential input for the data?

Hi, thanks for your work. I see in your code that "shuffle = True" is set when creating the dataloaders. If the input data is not sequential, how is the template updated every 200 frames? Thanks!
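
One way to reconcile the two: shuffling applies to training samples, while a periodic template update only needs frames to arrive in order at inference time. The loop below is a hypothetical sketch of such an interval-based update; `track_frame`, the confidence threshold of 0.5, and all other names are assumptions for illustration, not the repository's code.

```python
def run_sequence(frames, track_frame, update_interval=200, conf_threshold=0.5):
    """Track frames in order, refreshing the dynamic template periodically.

    track_frame(frame, templates) is a hypothetical callable that returns
    (box, confidence) for one frame.
    """
    templates = {"initial": frames[0], "dynamic": frames[0]}
    update_frames = []
    for frame_id, frame in enumerate(frames[1:], start=1):
        box, conf = track_frame(frame, templates)
        # Only consider an update every `update_interval` frames, and only
        # when the tracker is confident about the current prediction.
        if frame_id % update_interval == 0 and conf > conf_threshold:
            templates["dynamic"] = frame
            update_frames.append(frame_id)
    return update_frames


# Dummy tracker: confident on every frame except frame 400.
dummy = lambda frame, templates: ((0, 0, 1, 1), 0.1 if frame == 400 else 0.9)
print(run_sequence(list(range(601)), dummy))
```

In this sketch the frame index, not the dataloader order, drives the update, so nothing here depends on how training batches are sampled.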

train on 4 11GB RTX 2080Ti gpus

Hi! Is it possible to train Stark on 4 11GB RTX 2080Ti gpus?

ModuleNotFoundError: No module named 'lib'

Hi,

I have run the VOT evaluation (Python version) several times, but I still get this error:
from lib.test.vot20.stark_vot20 import run_vot_exp
ModuleNotFoundError: No module named 'lib'

It seems the STARK project path is not found, although I exported it with:
export PYTHONPATH=/home/xxxx/projects/transformer/Stark-main:$PYTHONPATH

I hope one of your answers can help me solve it.

Thanks!
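
If the environment variable does not reach the Python process that the VOT toolkit launches, a common workaround is to add the project root to sys.path inside the script before the lib import. A minimal sketch, reusing the path from the post; the helper name is hypothetical, and whether this resolves the setup above is an assumption.

```python
import os
import sys


def add_project_root(root: str) -> str:
    """Put `root` at the front of sys.path so `import lib...` can resolve."""
    root = os.path.abspath(os.path.expanduser(root))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Path taken from the post above; adjust it to the local checkout.
add_project_root("/home/xxxx/projects/transformer/Stark-main")
# from lib.test.vot20.stark_vot20 import run_vot_exp  # importable once the path points at the checkout
```

Printing sys.path at the top of the failing script is also a quick way to see whether the exported PYTHONPATH was picked up at all.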

about lmdb

Hi, I haven't used lmdb to read datasets before. Could you tell me how to generate the lmdb files? Thanks.

train error ValueError: Caught ValueError in DataLoader worker process 0

Thank you very much for releasing your work; it has been a great help to my studies. I have run into a problem. When I run
python tracking/train.py --script stark_s --config baseline --save_dir . --mode multiple --nproc_per_node 4
an error occurs:
ValueError : Caught ValueError in DataLoader worker process 0
Because of limited equipment, I did not include TrackingNet in the training set. Is the error caused by a configuration that I did not modify? I hope you have time to help me. Thank you very much again.

The raw results of other models or papers

Hi! I have one question. To produce result plots like "Comparisons on LaSOT test set" against other papers' models, we need their raw results. How can we get them?

No matching checkpoint file found

Hello, during training I keep getting the message "No matching checkpoint file found". Is there a solution?

need list.txt when debugging training with a single GPU

Hi! When I debug training with a single GPU, I get an error: no such file or directory: 'got10k/list.txt'. What is this file?
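
In the GOT-10k layout, each split folder normally carries a list.txt naming its sequence folders, one per line. If that file is genuinely missing (rather than the dataset path being wrong), a sketch like the following could rebuild it; the helper name and the assumption that every subdirectory is a sequence are hypothetical.

```python
import os


def write_sequence_list(split_dir: str) -> list:
    """Write <split_dir>/list.txt with one sequence directory name per line."""
    names = sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )
    with open(os.path.join(split_dir, "list.txt"), "w") as f:
        f.write("\n".join(names) + "\n")
    return names
```

It would be called on a split folder, for example write_sequence_list("got10k/train"), where the path is a placeholder for the local dataset location.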

torchvision version detection

Hello, I noticed some problems with the implementation of the code that detects the version of torchvision in utils.misc.py

import torchvision
if float(torchvision.__version__[:3]) < 0.7:
    from torchvision.ops import _new_empty_tensor
    from torchvision.ops.misc import _output_size

This is a fatal issue for torchvision>=0.10, because the three-character slice truncates the version:

python -c "import torchvision; print(torchvision.__version__[:3])"
# output equals 0.1
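
The slice `__version__[:3]` turns 0.10.0 into 0.1, which compares as less than 0.7. Comparing integer tuples avoids this. A minimal sketch, with a hypothetical helper name, that parses only the leading major and minor numbers so that suffixes such as +cu111 are ignored:

```python
import re


def major_minor(version: str) -> tuple:
    """Parse '0.10.0+cu111' into (0, 10); raise ValueError if it does not match."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


# Show which versions would take the "older than 0.7" branch.
for v in ("0.6.1", "0.7.0", "0.10.0+cu111"):
    print(v, major_minor(v) < (0, 7))
```

The condition in the quoted snippet could then read `if major_minor(torchvision.__version__) < (0, 7):`.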

How long did the training take?

Hi! How long did training take on eight 16 GB V100s? I am using four 11 GB 2080Ti GPUs to train the model for 500 epochs, and it will take 20 days.

Hi, what hardware did you use for training?

About GOT-10k test set results

Hi, thanks for your wonderful work. I notice that Transformer Tracking uses a model trained on all datasets (LaSOT, GOT10K, COCO, TrackingNet) to get its evaluation result on the GOT-10k test set, and that result is much better than the one from the model trained on GOT10K only.

However, when I use the STARK-S50 pre-trained model (trained on all datasets) from your model zoo to evaluate on the GOT-10k test set, the AO is 0.688, which is only a small improvement over 0.672.

I am confused by this. Have you ever evaluated the model trained on all datasets on the GOT-10k test set? Or could you kindly explain why using the model trained on all datasets gives so little performance gain?

Weight loading problem

Hello, every set of weights I downloaded gives the error "no module named lib.train.admin.local".

problems with Docker run

I pulled the Docker image from Docker Hub, but when I run a container it starts and exits after one second.
Please include some documentation about running the container, for example whether I should pass arguments when running it.
In docker ps -a the container appears as an old container.
It would also be great if you could include full documentation about dockerized STARK training and testing.

Implementation of the AO evaluation metric for the GOT-10k val dataset

Hello! In which code file is the AO evaluation metric for the GOT-10k val dataset implemented?
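
For orientation, average overlap (AO) is the mean intersection-over-union between predicted and ground-truth boxes over the frames of a sequence. The sketch below assumes [x, y, w, h] boxes and handles a single sequence; it is not the repository's or the GOT-10k toolkit's implementation, and the function names are illustrative.

```python
def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def average_overlap(preds, gts):
    """Mean IoU over paired per-frame boxes of one sequence."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(overlaps) / len(overlaps)


# Two frames: a perfect prediction and one shifted by half the box width.
preds = [[0, 0, 10, 10], [5, 0, 10, 10]]
gts = [[0, 0, 10, 10], [0, 0, 10, 10]]
print(average_overlap(preds, gts))
```

How scores are aggregated across sequences, and which frames are excluded, is left out here and should be checked against the evaluation code actually used.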

Raw results

Hi, thanks for sharing the work. I saw that in the model zoo readme file you are mentioning raw results but I couldn't find a link referring to them. Could you give me a hand? Thanks in advance.

Training process not utilizing a dynamically updated template

It seems that STARK doesn't mention anything about a dynamically updated template (DUT for short) during training procedure, is it a deliberate design or am I missing something?

I reckon that the DUT is actually something like a short-term memory, and it should not be treated equally as a normal template from the first frame by the transformer, so the DUT should be explicitly included in training. However, this is not how STARK has been implemented.

So I'm curious what's the intuition or reasoning behind STARK's current training protocol of dismissing the DUT?

Very poor performance on vot2019

I am very interested in your work and thank you for your contribution. I tested STARK on VOT2019 using the baseline_R101 configuration file, but the result was very poor, with an EAO of 0.2. Is this result normal? Have you tested on VOT2019?

Training results of the first stage

Hi,

Thanks for your good work! I have tried to reproduce your final results, but failed. Could you send me the checkpoint file from the first training stage (my email address: [email protected])? This would help me find my problem.

Thanks.

Is there a reason to use FrozenBatchNorm2d instead of turning the BN layers in eval mode?

Thanks in advance and nice work.

License

Hi,
Which license does this repository use?

Thanks,
Omri.