masterbin-iiau / unicorn
[ECCV'22 Oral] Towards Grand Unification of Object Tracking
License: MIT License
Hi, thanks for your awesome work.
According to the code in this repo, you perform 3-stage training: detection, tracking, and tracking-plus-mask, right? Could you please provide the scripts (with specific hyper-parameters, e.g., batch size) to reproduce the results shown in your paper?
I found that tools/demo.py supports TensorRT inference, so I tried to convert the model to TensorRT, but where is the tools/trt.py mentioned in tools/demo.py?
if args.trt:
    assert not args.fuse, "TensorRT model is not support model fusing!"
    trt_file = os.path.join(file_name, "model_trt.pth")
    assert os.path.exists(
        trt_file
    ), "TensorRT model is not found!\n Run python3 tools/trt.py first!"
    model.head.decode_in_inference = False
    decoder = model.head.decode_outputs
    logger.info("Using TensorRT to inference")
else:
    trt_file = None
    decoder = None
Hi Authors,
Thanks and congratulations on the wonderful work. I wonder if you could share the checkpoint for the second group of models, unicorn_track_large_mot_challenge_mask. The link is missing from the readme and from Huggingface https://huggingface.co/NimaBoscarino/unicorn_track_large_mot_challenge_mask/tree/main
Regards
Harkirat
As the title says, the link you provided keeps failing when I click it. Could you send the model to [email protected]? If that is not convenient, could you provide an alternative download address? Thanks a lot!
Wonderful work! As stated in the paper, the model is trained under the supervision of corresponding loss and detection loss using the data from SOT and MOT.
For the part of classification loss in detection, I want to know how to handle the instances from SOT datasets whose classes are outside MOT datasets. Also, do you think the SOT performance of this model is related to whether the tracked object class is in the MOT dataset?
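One common way to train a detector on such mixed-vocabulary data is to zero out the classification loss for instances whose class lies outside the detector's label space while still supervising their boxes. This is a general technique, not necessarily what Unicorn does; all names below are illustrative:

```python
import math

def masked_cls_loss(logits, labels, known_mask):
    """Cross-entropy averaged over instances whose class is in-vocabulary.

    logits: per-instance score lists; labels: class indices;
    known_mask: True where the instance's class exists in the MOT label space.
    (Hypothetical helper, not from the Unicorn codebase.)
    """
    total, count = 0.0, 0
    for scores, label, known in zip(logits, labels, known_mask):
        if not known:  # e.g. an SOT instance with an out-of-vocabulary class
            continue   # contributes nothing to the classification loss
        log_z = math.log(sum(math.exp(s) for s in scores))
        total += log_z - scores[label]
        count += 1
    return total / max(count, 1)

loss = masked_cls_loss(
    logits=[[2.0, 0.0], [0.0, 2.0]],
    labels=[0, 1],
    known_mask=[True, False],  # second instance comes from an SOT-only class
)
```

Under this scheme the SOT instances still shape the box regression and correspondence learning, so SOT performance need not depend on the class being in the MOT vocabulary, though only the authors can confirm their exact handling.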
Due to a certain well-known reason...
Hello, thanks for your work on tracking unification, which is really promising! I've seen that you are currently working on a demo script, but could you provide simple example commands for SOT and MOT inference?
Some explanations of the arguments would also be appreciated; I'm not sure I understand the purpose of the experiment description file.
Thanks for your time,
I'd be glad to help, for example by working on the documentation.
Hi, in Sec. 4.1 the paper says that Unicorn uses the same model parameters for all four tasks. Does this mean that the Unicorn model is trained in a multi-task manner? Meanwhile, I noticed that in the ablation study in Sec. 4.6, the four "Unification" results in the single-task part of Table 7 differ from the results in Tables 1, 3, 4, and 6. What is the difference between the Unification setting in Table 7 and those in Tables 1, 3, 4, and 6?
Hi, author. I have read your paper; it is a fascinating piece of work. Is there any demo code for running inference on videos or images?
Some requirements of bdd100k
Scikit-learn requires Python 3.8 or later.
How do you solve this?
Is bdd100k used for converting labels? If the answer is yes, I can set up a separate environment for bdd100k.
Hello!
I want to compare Unicorn with our method on VOT2020. My tracker configuration is:
[unicorn] #
label = unicorn
protocol = traxpython
command = import tools.run_vot as run_vot; run_vot.run_vot2020('unicorn_vos', 'unicorn_track_r50_mask') # Set the tracker name and the parameter name
paths = /media/wuhan/disk1/wh_code_backup/Unicorn
env_PATH = /home/wuhan/anaconda3/envs/unicorn/bin/python;${PATH}
And I modified Unicorn/external/lib/test/tracker/unicorn_vos.py as follows:
def initialize(self, image, info: dict):
    self.frame_id = 0
    # process init_info
    self.init_object_ids = info["init_object_ids"]
    self.sequence_object_ids = info['sequence_object_ids']
    # assert self.init_object_ids == self.sequence_object_ids
    # forward the reference frame once
    """resize the original image and transform the coordinates"""
    self.H, self.W, _ = image.shape
    ref_frame_t, r = self.preprocessor.process(image, self.input_size)
    """forward the network"""
    with torch.no_grad():
        _, self.out_dict_pre = self.model(imgs=ref_frame_t, mode="backbone")  # backbone output (previous frame) (b, 3, H, W)
    self.dh, self.dw = self.out_dict_pre["h"] * 2, self.out_dict_pre["w"] * 2  # STRIDE = 8
    """get initial label mask (K, H/8*W/8)"""
    self.lbs_pre_dict = {}
    self.state_pre_dict = {}
    for obj_id in self.init_object_ids:
        self.state_pre_dict[obj_id] = info["init_bbox"]
        init_box = torch.tensor(info["init_bbox"]).view(-1)
        init_box[2:] += init_box[:2]  # (x1, y1, x2, y2)
        init_box_rsz = init_box * r  # coordinates on the resized image
        self.lbs_pre_dict[obj_id] = F.interpolate(
            get_label_map(init_box_rsz, self.input_size[0], self.input_size[1]),
            scale_factor=1/8, mode="bilinear", align_corners=False)[0].flatten(-2).to(self.device)  # (1, H/8*W/8)
    """deal with new-incoming instances"""
    self.out_dict_pre_new = []  # a list containing out_dict for new in-coming instances
    self.obj_ids_new = []

def track(self, image, info: dict = None, bboxes=None, scores=None, gt_box=None):
    self.frame_id += 1
    """resize the original image and transform the coordinates"""
    cur_frame_t, r = self.preprocessor.process(image, self.input_size)
    with torch.no_grad():
        with torch.cuda.amp.autocast(enabled=False):
            fpn_outs_cur, out_dict_cur = self.model(imgs=cur_frame_t, mode="backbone")  # backbone output (current frame)
            # deal with instances from the first frame
            final_mask_dict, inst_scores = self.get_mask_results(fpn_outs_cur, out_dict_cur, self.out_dict_pre, r, self.init_object_ids)
            # deal with instances from the intermediate frames
            for (out_dict_pre, init_object_ids) in zip(self.out_dict_pre_new, self.obj_ids_new):
                final_mask_dict_tmp, inst_scores_tmp = self.get_mask_results(fpn_outs_cur, out_dict_cur, out_dict_pre, r, init_object_ids)
                final_mask_dict.update(final_mask_dict_tmp)
                inst_scores = np.concatenate([inst_scores, inst_scores_tmp])
            # deal with instances from the current frame
            if "init_object_ids" in info.keys():
                self.out_dict_pre_new.append(out_dict_cur)
                self.obj_ids_new.append(info["init_object_ids"])
                inst_scores_tmp = np.ones((len(info["init_object_ids"]),))
                inst_scores = np.concatenate([inst_scores, inst_scores_tmp])
                for obj_id in info["init_object_ids"]:
                    self.state_pre_dict[obj_id] = info["init_bbox"]
                    init_box = torch.tensor(info["init_bbox"]).view(-1)
                    init_box[2:] += init_box[:2]  # (x1, y1, x2, y2)
                    init_box_rsz = init_box * r  # coordinates on the resized image
                    self.lbs_pre_dict[obj_id] = F.interpolate(
                        get_label_map(init_box_rsz, self.input_size[0], self.input_size[1]),
                        scale_factor=1/8, mode="bilinear", align_corners=False)[0].flatten(-2).to(self.device)  # (1, H/8*W/8)
                    final_mask_dict[obj_id] = info["init_mask"]
    # Deal with overlapped masks
    cur_obj_ids = copy.deepcopy(self.init_object_ids)
    for obj_ids_inter in self.obj_ids_new:
        cur_obj_ids += obj_ids_inter
    if "init_object_ids" in info.keys():
        cur_obj_ids += info["init_object_ids"]
    # soft aggregation
    cur_obj_ids_int = [int(x) for x in cur_obj_ids]
    mask_merge = np.zeros((self.H, self.W, max(cur_obj_ids_int) + 1))  # (H, W, N+1)
    tmp_list = []
    for cur_id in cur_obj_ids:
        mask_merge[:, :, int(cur_id)] = final_mask_dict[cur_id]
        tmp_list.append(final_mask_dict[cur_id])
    back_prob = np.prod(1 - np.stack(tmp_list, axis=-1), axis=-1, keepdims=False)
    mask_merge[:, :, 0] = back_prob
    mask_merge_final = np.argmax(mask_merge, axis=-1)  # (H, W)
    for cur_id in cur_obj_ids:
        final_mask_dict[cur_id] = (mask_merge_final == int(cur_id))
    """get the final result"""
    final_mask = np.zeros((self.H, self.W), dtype=np.uint8)
    # for obj_id in cur_obj_ids:
    #     final_mask[final_mask_dict[obj_id] == 1] = int(obj_id)
    final_mask = mask_merge_final
    return {"segmentation": final_mask}
But the tracking and segmentation results are all "0, 0, 0, 0".
Can you help me?
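For what it's worth, the soft-aggregation step in the code above can be exercised in isolation to check that the merge itself behaves as intended: the background probability is the product of (1 - foreground probability) over all objects, and each pixel is assigned to the argmax channel. A standalone numpy sketch with toy 2x2 masks (not Unicorn's actual data):

```python
import numpy as np

# two per-object probability masks on a tiny 2x2 frame
masks = {
    1: np.array([[0.9, 0.1], [0.2, 0.1]]),
    2: np.array([[0.1, 0.8], [0.1, 0.1]]),
}
H, W = 2, 2
merged = np.zeros((H, W, max(masks) + 1))  # channel 0 is background
for obj_id, m in masks.items():
    merged[:, :, obj_id] = m
# background prob: product of (1 - p_obj) over all objects
merged[:, :, 0] = np.prod(1 - np.stack(list(masks.values()), axis=-1), axis=-1)
label_map = np.argmax(merged, axis=-1)  # (H, W) hard assignment per pixel
```

If every pixel lands on background (an all-zero label map), the per-object probabilities feeding the merge are near zero, which points at the matching/propagation step rather than the aggregation itself.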
Hello, thanks for this interesting project. I wanted to ask how I can apply the tracker to a custom-trained YOLOX model of my own.
I have the model and I have already integrated it with ByteTrack; is there any script or readme that can help me with this?
It is very nice work. When I tried to reproduce it, I ran into the problem below and I am quite stuck; it is very frustrating! I hope the authors can give me a hand, thanks a lot!
Background: testing SOT on LaSOT.
So I was trying to train for tracking using QDTrack association, which requires a lot of computational power. I can get it, but first I wanted to test how efficient the method would be.
I have a custom detector that I trained; can I use this detector for detections and your pretrained model for embeddings and ID association, or would that tear up the association accuracy?
Thanks in advance.
Hi there, congrats on the release and on the acceptance to ECCV 2022! I got SOT working on my local machine, but getting the other video-level tasks to work has been a bit difficult, so I wondered if you'd find it useful to have a demo available. To make it easier for people to tinker with your work, would you be interested in adding the models and a web demo to Hugging Face? The Hugging Face hub offers free hosting, and I'd be more than happy to help out if it's something you're interested in.
Hi,
How can the Unicorn framework be split into a detection module, an appearance model (embedding), and an association module?
Then we could easily replace the different modules.
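Unicorn's code is not organized around such plug-in boundaries out of the box, but one way to carve the seams yourself is to define narrow interfaces and wrap the existing components behind them. A sketch (all names here are hypothetical, not from the repo):

```python
from typing import List, Protocol, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

class Detector(Protocol):
    def detect(self, frame) -> List[Box]: ...

class AppearanceModel(Protocol):
    def embed(self, frame, boxes: Sequence[Box]) -> List[List[float]]: ...

class Associator(Protocol):
    def associate(self, embeddings: Sequence[Sequence[float]]) -> List[int]: ...

class ModularTracker:
    """Glue class: each of the three stages can be swapped independently."""

    def __init__(self, det: Detector, app: AppearanceModel, assoc: Associator):
        self.det, self.app, self.assoc = det, app, assoc

    def step(self, frame) -> List[Tuple[int, Box]]:
        boxes = self.det.detect(frame)          # detection stage
        embs = self.app.embed(frame, boxes)     # appearance/embedding stage
        ids = self.assoc.associate(embs)        # association stage
        return list(zip(ids, boxes))
```

One caveat: Unicorn's selling point is that detection and embedding share one backbone, so splitting them behind separate interfaces may cost the efficiency that sharing provides.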
When running the code I faced a ModuleNotFoundError: No module named 'MultiScaleDeformableAttention', so in unicorn/models/ops/ I tried python setup.py build
and python setup.py install,
but I am getting the error below:
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda)) RuntimeError: The detected CUDA version (10.1) mismatches the version that was used to compile PyTorch (11.7). Please make sure to use the same CUDA versions.
I see that the Deformable-DETR ops are only available for CUDA 10.1 or below; how can I fix this?
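For context, that RuntimeError comes from PyTorch's C++ extension builder, which compares the CUDA toolkit found via nvcc against the CUDA version PyTorch was compiled with. The check is roughly the following (a simplified sketch, not PyTorch's exact code):

```python
def cuda_versions_compatible(nvcc_cuda: str, torch_cuda: str) -> bool:
    """Rough version of torch.utils.cpp_extension's mismatch check:
    the major CUDA versions must agree (PyTorch is somewhat lenient
    about minor-version differences within a major release)."""
    nvcc_major = int(nvcc_cuda.split(".")[0])
    torch_major = int(torch_cuda.split(".")[0])
    return nvcc_major == torch_major

# the situation in the error above: system nvcc is 10.1, PyTorch built for 11.7
```

So the fix is to make the two agree: either install a PyTorch build compiled for CUDA 10.1 (older PyTorch releases ship such wheels) or upgrade the system CUDA toolkit to match the installed PyTorch, and then rebuild the op.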
Under the assumption that I can try out Unicorn with some pre-trained models, I tried to install it on a Mac and a PC, both without an NVIDIA GPU. In step 3, when executing bash make.sh, setup.py is called, which tests for torch.cuda.is_available() and raises an exception if the function returns False. So running in CPU mode is not foreseen?
Hi,
While following along the inference.py code to see how the model works, I noticed something in the preprocessing code. PreprocessorX at external/lib/test/tracker/unicorn_sot.py:111 turns the RGB format back to BGR, and the normalization step is missing; self.normalize is not referenced at all. So the input seems to be a raw image in BGR format with values in the range [0, 255]. I wasn't able to find any code that performs normalization in the forward functions of the inner models either.
Was the model simply trained on raw pixel values, or am I missing something?
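To double-check that observation empirically, one can push a dummy frame through the preprocessing and inspect the value range: if no mean/std normalization is applied, values stay in [0, 255]. A toy numpy sketch of the channel-flip-only behaviour described above (not the repo's actual code):

```python
import numpy as np

# stand-in for an RGB video frame, values in [0, 255]
frame_rgb = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)

# what the observation above describes: RGB -> BGR channel flip,
# float cast, and no mean subtraction or std division
frame_bgr = frame_rgb[:, :, ::-1].astype(np.float32)
```

Many detection-style models (YOLOX among them, whose later versions dropped mean/std normalization) are indeed trained directly on un-normalized BGR pixels, so this is plausible rather than a bug; only the training config can confirm it.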
I ran the demo but got an error:
box_corner = prediction.new(prediction.shape)
AttributeError: 'tuple' object has no attribute 'new'
What is this error?
Hello @MasterBin-IIAU, when I use the exp file unicorn_track_large.py along with the weights unicorn_det_convnext_large_800x1280, the model loads well as long as the number of classes is 8 or 80, but when I try to adapt this to my dataset (num_classes = 11) to retrain on my custom database, the model raises the error Invalid num_classes.
Is this the expected behaviour? I want to train a detector that can be used in QDTrack association (the track_omni.py script), trained on my dataset (11 classes). Is that possible with the current scripts?
Thanks in advance.
Thank you very much for your great contribution to the tracking field. Since the datasets are so large, how much storage is needed exactly? Would 4 TB be enough?
What needs to be changed if I'd like to train the network with only a single GPU? Is this possible?
Thank you, and great work :)
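Besides launching with a single process, note that YOLOX-style experiment files, which Unicorn's exps resemble, usually derive the learning rate from the total batch size, so moving from multi-GPU to one GPU shrinks the LR automatically. A sketch of that convention (the 0.01/64 value is the YOLOX default, used here as an assumption, not a value taken from Unicorn's configs):

```python
# YOLOX convention: lr = basic_lr_per_img * total_batch_size
basic_lr_per_img = 0.01 / 64.0  # assumed default, check the exp file

def scaled_lr(total_batch_size: int) -> float:
    """Learning rate after linear scaling with the global batch size."""
    return basic_lr_per_img * total_batch_size
```

So with one GPU the main things to watch are that the global batch size (and hence LR) drops proportionally and that training time grows accordingly; gradient accumulation can recover the larger effective batch if memory allows.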
Do you have any plans to create a Dockerfile for all the environments?
Thanks
Hi,
This is very awesome work. I want to integrate it into the mmtracking framework.
Would you agree to that? I would like to refactor the code and open a pull request to the mmtracking repo.
This may be addressed somewhere in the code, but I have not found it yet. Any help would be amazing!
Thank you.
ConvNeXt does not seem to perform especially well on downstream tasks, yet you achieve excellent results; do you have any tips?
Second, have you tried RepLKNet? What considerations went into this choice? If you did try it, it would be great if you could share the experimental results, if convenient.
Finally, I noticed that you trained with 16x A100 GPUs. Would 8 cards with only 11 GB of memory each be able to hold the task?
Thank you again for your great unification work, and sorry to bother you again!
Do the models work on edge devices, and what specs/qualifications are required?
Thanks for your work. As mentioned in the title, did you do ablations for the learnable broadcast?
Why does the GPU memory usage stay the same during single-machine multi-GPU training, no matter how I adjust the batch size?
Is there an operation similar to feature propagation during MOT/MOTS inference? How are the reference targets in the paper's framework figure reflected at inference time?
Hello @MasterBin-IIAU, thank you for your work and publishing it.
I am currently trying to setup an environment to benchmark Unicorn vs. another algorithm from someone in my company during a project to proof my expertise in an internal AI degree. So bear with me when I am not 100% sure about wording and what I am doing :-).
In a first step, I intend to run a MOT-only test on MOT Challenge 17 data, as in the beginning the BDD data was not completely loaded. I installed the Python environment, although I use Python 3.8 and CUDA 11.6 as well as PyTorch 1.12 (just to let you know).
Then I was able to run
python launch_uni.py --name unicorn_track_tiny_mot_only.py --nproc_per_node=2 --batch 16 --mode multiple
but as training would take 10 days, I want to use your provided model zoo. Therefore I created a directory called Unicorn_outputs/unicorn_track_tiny_mot_only and placed the pre-trained latest_ckpt.pth from the model zoo in it. I also changed mot_test_name to motchallenge in exp/unicorn_track.py, although it makes no difference whether I change it or not (after I now loaded all BDD data as well).
When I call
python tools/track.py -f exps/default/unicorn_track_tiny_mot_only.py -c Unicorn_outputs/unicorn_track_tiny_mot_only/latest_ckpt.pth -b 1 -d 1
it throws a list index out of range error in the function convert_to_coco_format where it retrieves the label, as the data loader's dataset.class_ids has only one entry. I think this is expected, as MOT17 only knows one class. But the actual output it is working on contains label numbers 0, 1, 2, 3, 6 and 7. The variable cls is a tensor starting with ten '0' values that work, but the '7' in the next position throws the error.
My assumption is that something is still wrong with the number of classes, but I don't know how to proceed. I did some debugging but have not found the solution yet.
Thanks for an answer, Carsten.
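The list-index error described above can be reproduced and guarded in isolation: convert_to_coco_format maps each predicted class index through the dataset's class_ids list, which for MOT17 has length 1, so any predicted index >= 1 (left over from the multi-class head) overruns the list. A hypothetical guard, with illustrative names rather than an actual patch to the repo:

```python
def to_coco_label(cls_idx: int, class_ids):
    """Map a predicted class index to a benchmark class id,
    dropping predictions outside the benchmark's vocabulary."""
    if cls_idx >= len(class_ids):
        return None  # no mapping for this class on this benchmark
    return class_ids[cls_idx]

# MOT17 exposes a single class id; predicted indices 1, 2, 3, 6, 7 have no mapping
preds = [0, 0, 7, 1, 0]
kept = [lab for lab in (to_coco_label(c, [1]) for c in preds) if lab is not None]
```

Whether dropping or remapping such detections is correct depends on how the checkpoint was trained; the cleaner fix is probably to evaluate with the class vocabulary the checkpoint was actually trained for.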
The MOT17 pre-trained model still cannot be found at the new address; there is no latest_ckpt.pth file in it.
Hi, awesome work!
I want to know which GPU you used for the inference speed reported in the paper.
Also, what are the minimum GPU requirements to run the model?
My versions are:
torch 1.10.0
torchaudio 0.10.0
torchvision 0.11.0
but I get this error:
Traceback (most recent call last):
  File "tools/test_omni.py", line 11, in <module>
    from mmdet.datasets import build_dataset
  File "/home/yj/.local/lib/python3.7/site-packages/mmdet/__init__.py", line 18, in <module>
    f'MMCV=={mmcv.__version__} is used but incompatible. '
AssertionError: MMCV==1.4.6 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.1.0.