
unicorn's People

Contributors

ifighting, masterbin-iiau, nimaboscarino


unicorn's Issues

scripts to reproduce the results in paper

Hi, thanks for your awesome work.

According to the code in this repo, you perform three-stage training: detection, tracking, and tracking plus mask, right? Could you please provide the scripts (with the specific hyper-parameters, e.g., batch size) to reproduce the results shown in your paper?

where is the tools/trt.py?

I found that tools/demo.py supports TensorRT inference, so I tried to convert the model to a TensorRT model, but where is the tools/trt.py mentioned in tools/demo.py?
if args.trt:
    assert not args.fuse, "TensorRT model is not support model fusing!"
    trt_file = os.path.join(file_name, "model_trt.pth")
    assert os.path.exists(
        trt_file
    ), "TensorRT model is not found!\n Run python3 tools/trt.py first!"
    model.head.decode_in_inference = False
    decoder = model.head.decode_outputs
    logger.info("Using TensorRT to inference")
else:
    trt_file = None
    decoder = None
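The missing file is not in the repo, but for reference here is a minimal sketch of what such a conversion script typically looks like in YOLOX-style codebases, assuming the torch2trt package (names, sizes and paths below are illustrative, not the actual tools/trt.py):

# Illustrative only: convert a loaded model with torch2trt and save the weights
# where the snippet above expects them ("model_trt.pth" in the experiment folder).
import os
import torch
from torch2trt import torch2trt

def export_trt(model, file_name, input_size=(800, 1280)):
    model.eval().cuda()
    model.head.decode_in_inference = False  # decode outside TensorRT, as in the snippet above
    x = torch.ones(1, 3, input_size[0], input_size[1]).cuda()  # dummy input for tracing
    model_trt = torch2trt(model, [x], fp16_mode=True, max_workspace_size=(1 << 32))
    torch.save(model_trt.state_dict(), os.path.join(file_name, "model_trt.pth"))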

The number of classes in the detection loss.

Wonderful work! As stated in the paper, the model is trained under the supervision of the correspondence loss and the detection loss, using data from SOT and MOT.
For the classification part of the detection loss, I want to know how you handle instances from SOT datasets whose classes fall outside the MOT datasets. Also, do you think the SOT performance of this model depends on whether the tracked object's class is in the MOT dataset?

Demo command for SOT and MOT

Hello, thanks for your work on Tracking Unification, which is really promising! I've seen that you are currently working on a demo script, but could you provide simple example commands for SOT and MOT inference?

Some explanations of the arguments would also be appreciated; I'm not sure I understand the purpose of the experiment description file.

Thanks for your time. I'd be glad to help, for example by working on the documentation.

What is the experimental setting in the paper?

Hi, in Sec 4.1 the paper says that Unicorn uses the same model parameters for all four tasks; does this mean that the Unicorn model is trained in a multi-task manner? Meanwhile, I found that in the ablation study in Sec 4.6, the four results for Unification in the single-task part of Table 7 differ from the results in Tables 1, 3, 4, and 6. What is the difference between the Unification setting in Table 7 and the settings in Tables 1, 3, 4, and 6?

How to evaluate on vot2020?

Hello!
I want to compare unicorn with our method on vot2020.
[unicorn]
label = unicorn
protocol = traxpython
command = import tools.run_vot as run_vot; run_vot.run_vot2020('unicorn_vos', 'unicorn_track_r50_mask') # Set the tracker name and the parameter name

# Specify a path to trax python wrapper if it is not visible (separate by ; if using multiple paths)
paths = /media/wuhan/disk1/wh_code_backup/Unicorn

# Additional environment paths
env_PATH = /home/wuhan/anaconda3/envs/unicorn/bin/python;${PATH}

And I modified Unicorn/external/lib/test/tracker/unicorn_vos.py as follows:

def initialize(self, image, info: dict):
    self.frame_id = 0
    # process init_info
    self.init_object_ids = info["init_object_ids"]
    self.sequence_object_ids = info['sequence_object_ids']
    # assert self.init_object_ids == self.sequence_object_ids
    # forward the reference frame once
    """resize the original image and transform the coordinates"""
    self.H, self.W, _ = image.shape
    ref_frame_t, r = self.preprocessor.process(image, self.input_size)
    """forward the network"""
    with torch.no_grad():
        _, self.out_dict_pre = self.model(imgs=ref_frame_t, mode="backbone")  # backbone output (previous frame) (b, 3, H, W)
    self.dh, self.dw = self.out_dict_pre["h"] * 2, self.out_dict_pre["w"] * 2  # STRIDE = 8
    """get initial label mask (K, H/8*W/8)"""
    self.lbs_pre_dict = {}
    self.state_pre_dict = {}
    for obj_id in self.init_object_ids:
        self.state_pre_dict[obj_id] = info["init_bbox"]
        init_box = torch.tensor(info["init_bbox"]).view(-1)
        init_box[2:] += init_box[:2] # (x1, y1, x2, y2)
        init_box_rsz = init_box * r # coordinates on the resized image
        self.lbs_pre_dict[obj_id] = F.interpolate(get_label_map(init_box_rsz, self.input_size[0], self.input_size[1]) \
            , scale_factor=1/8, mode="bilinear", align_corners=False)[0].flatten(-2).to(self.device) # (1, H/8*W/8)
    """deal with new-incoming instances"""
    self.out_dict_pre_new = [] # a list containing out_dict for new in-coming instances
    self.obj_ids_new = []

def track(self, image, info: dict = None, bboxes=None, scores=None, gt_box=None):
    self.frame_id += 1
    """resize the original image and transform the coordinates"""
    cur_frame_t, r = self.preprocessor.process(image, self.input_size)
    with torch.no_grad():
        with torch.cuda.amp.autocast(enabled=False):
            fpn_outs_cur, out_dict_cur = self.model(imgs=cur_frame_t, mode="backbone")  # backbone output (current frame)
    # deal with instances from the first frame
    final_mask_dict, inst_scores = self.get_mask_results(fpn_outs_cur, out_dict_cur, self.out_dict_pre, r, self.init_object_ids)
    # deal with instances from the intermediate frames
    for (out_dict_pre, init_object_ids) in zip(self.out_dict_pre_new, self.obj_ids_new):
        final_mask_dict_tmp, inst_scores_tmp = self.get_mask_results(fpn_outs_cur, out_dict_cur, out_dict_pre, r, init_object_ids)
        final_mask_dict.update(final_mask_dict_tmp)
        inst_scores = np.concatenate([inst_scores, inst_scores_tmp])
    # deal with instances from the current frame
    if "init_object_ids" in info.keys():
        self.out_dict_pre_new.append(out_dict_cur)
        self.obj_ids_new.append(info["init_object_ids"])
        inst_scores_tmp = np.ones((len(info["init_object_ids"]),))
        inst_scores = np.concatenate([inst_scores, inst_scores_tmp])
        for obj_id in info["init_object_ids"]:
            self.state_pre_dict[obj_id] = info["init_bbox"]
            init_box = torch.tensor(info["init_bbox"]).view(-1)
            init_box[2:] += init_box[:2] # (x1, y1, x2, y2)
            init_box_rsz = init_box * r # coordinates on the resized image
            self.lbs_pre_dict[obj_id] = F.interpolate(get_label_map(init_box_rsz, self.input_size[0], self.input_size[1]) \
                , scale_factor=1/8, mode="bilinear", align_corners=False)[0].flatten(-2).to(self.device) # (1, H/8*W/8)
            final_mask_dict[obj_id] = info["init_mask"]
    # Deal with overlapped masks
    cur_obj_ids = copy.deepcopy(self.init_object_ids)
    for obj_ids_inter in self.obj_ids_new:
        cur_obj_ids += obj_ids_inter
    if "init_object_ids" in info.keys():
        cur_obj_ids += info["init_object_ids"]
    # soft aggregation
    cur_obj_ids_int = [int(x) for x in cur_obj_ids]
    mask_merge = np.zeros((self.H, self.W, max(cur_obj_ids_int)+1)) # (H, W, N+1)
    tmp_list = []
    for cur_id in cur_obj_ids:
        mask_merge[:, :, int(cur_id)] = final_mask_dict[cur_id]
        tmp_list.append(final_mask_dict[cur_id])
    back_prob = np.prod(1 - np.stack(tmp_list, axis=-1), axis=-1, keepdims=False)
    mask_merge[:, :, 0] = back_prob
    mask_merge_final = np.argmax(mask_merge, axis=-1) # (H, W)
    for cur_id in cur_obj_ids:
        final_mask_dict[cur_id] = (mask_merge_final == int(cur_id))
    """get the final result"""
    final_mask = np.zeros((self.H, self.W), dtype=np.uint8)
    # for obj_id in cur_obj_ids:
    #     final_mask[final_mask_dict[obj_id]==1] = int(obj_id)
    final_mask = mask_merge_final
    return {"segmentation": final_mask}

But the tracking and segmentation results are all "0, 0, 0, 0".

Can you help me?
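For reference, the soft-aggregation step in the track() code above can be summarised with a tiny standalone example (toy numbers, purely illustrative):

# Per-object mask probabilities are merged by writing each object's mask into its
# own channel, setting the background channel to prod(1 - p_i), and assigning
# every pixel to the argmax channel.
import numpy as np

probs = {1: np.array([[0.9, 0.2]]), 2: np.array([[0.1, 0.7]])}  # two toy 1x2 masks
merge = np.zeros((1, 2, 3))                                     # (H, W, N+1)
for obj_id, p in probs.items():
    merge[:, :, obj_id] = p
merge[:, :, 0] = np.prod(1 - np.stack(list(probs.values()), axis=-1), axis=-1)
labels = np.argmax(merge, axis=-1)  # pixel 0 -> object 1, pixel 1 -> object 2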

Custom dataset

Hello, thanks for this interesting project. I wanted to ask how I can apply the tracker to a custom-trained YOLOX model of my own.
I have the model and I have already integrated it with ByteTrack; is there any script or readme that can help me with this?

Using Pretrained-embeddings along with custom trained detections

So I was trying to train for tracking using QDTrack association, and this requires a lot of computational power, which I can get, but first I wanted to test how efficient the method will be.

I have a custom detector that I trained. Can I use this detector for the detections and your pretrained model for the embeddings and ID association, or would that hurt the association accuracy?

Thanks in advance.

Web demo and models on Hugging Face

Hi there, congrats on the release and on the acceptance to ECCV 2022! I got SOT working on my local machine, but getting the other video-level tasks to work has been a bit difficult, so I wondered if you'd find it useful to have a demo available. To make it easier for people to tinker with your work, would you be interested in adding the models and a web demo to Hugging Face? The Hugging Face hub offers free hosting, and I'd be more than happy to help out if it's something you're interested in.

Installation error

(screenshot of the error attached)

I am going through the installation steps and can't understand the error shown in the screenshot. Can anyone help me resolve this?

CUDA mismatch error when building the module for ModuleNotFoundError: No module named 'MultiScaleDeformableAttention'

When running the code I got ModuleNotFoundError: No module named 'MultiScaleDeformableAttention', so under unicorn/models/ops/ I tried python setup.py build and python setup.py install, but I am getting the error below:

raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError: The detected CUDA version (10.1) mismatches the version that was used to compile PyTorch (11.7). Please make sure to use the same CUDA versions.

I see that Deformable-DETR seems to only support CUDA 10.1 or below; how can I fix this?
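As a quick check, the CUDA toolkit version the installed PyTorch was built with can be printed like this; the extension in unicorn/models/ops has to be compiled with the same toolkit version:

# Print the CUDA version PyTorch was compiled against; compiling the extension
# with a different system toolkit (here 10.1 vs. 11.7) triggers the mismatch error.
import torch
print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built with, e.g. "11.7"
print(torch.cuda.is_available())  # whether a usable GPU is visible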

CPU-mode not supported?

Under the assumption that I can try out Unicorn with some pre-trained models, I tried to install it on a Mac and a PC, both without an NVIDIA GPU. In step 3, when executing bash make.sh, setup.py is called, which tests for torch.cuda.is_available() and raises an exception if the function returns False. So running in CPU mode is not foreseen?
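For reference, a minimal sketch of the kind of guard described above (illustrative, not the repo's exact setup.py):

# Illustrative guard: the CUDA extension build is aborted on CPU-only machines,
# which is why the installation fails on hardware without an NVIDIA GPU.
import torch

if not torch.cuda.is_available():
    raise NotImplementedError("CUDA is not available: the deformable attention op cannot be built")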

Question regarding input tensor preprocessing

Hi,

While following along the inference.py code to see how the model works, I noticed something in the preprocessing code.

PreprocessorX at external/lib/test/tracker/unicorn_sot.py:111 converts the RGB image back to BGR, and the normalization step is missing; self.normalize is never referenced. So the input seems to be the raw image in BGR format with values in the range [0, 255].

I wasn't able to find any code that performs normalization in the forward functions of the inner models either.
Is it just that the model was trained on raw pixel values, or am I missing something?
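For context, a rough sketch of the behaviour described above (an interpretation with hypothetical names, not the repo's exact PreprocessorX):

# Sketch: resize the frame and return a BGR tensor with raw [0, 255] values,
# without subtracting a mean or dividing by a std.
import numpy as np
import torch
import torch.nn.functional as F

def process(img_rgb: np.ndarray, input_size):
    img_bgr = img_rgb[:, :, ::-1].copy()                              # RGB -> BGR
    img_t = torch.from_numpy(img_bgr).float().permute(2, 0, 1)[None]  # (1, 3, H, W)
    r = min(input_size[0] / img_t.shape[-2], input_size[1] / img_t.shape[-1])
    img_t = F.interpolate(img_t, scale_factor=r, mode="bilinear", align_corners=False)
    return img_t, r                                                   # no normalization applied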

ValueError: Invalid num_classes

Hello @MasterBin-IIAU, when I use the exp file unicorn_track_large.py along with the weights unicorn_det_convnext_large_800x1280, the model loads fine as long as the number of classes is 8 or 80, but when I try to adapt this to my dataset (num_classes = 11) to retrain on my custom database, the model raises the error Invalid num_classes.
Is this the expected behaviour? I want to train a detector that can be used in QDTrack association (the track_omni.py script), trained on my dataset (11 classes); is that possible with the current scripts?
Thanks in advance.

Single GPU?

What needs to be changed if I’d like to train the network with only a single GPU? Is this possible?

thank you! And great work :)

Dockerfile anywhere?

Do you have any plans to create a Dockerfile for all the environments?

Thanks

mmtracking framework support?

Hi,

This is very awesome work. I want to integrate it into the mmtracking framework.
Do you agree? I would like to refactor the code and submit a pull request to the mmtracking repo.

Using custom Yolo model and video inference

  • Is there a way to upload a custom-trained model and then use it to track and perform inference on a custom video?
  • Does Unicorn only accept YOLOX, or would it be able to accept a generic trained model (e.g., YOLOv5/6/7)?

This may be addressed somewhere in the code, but I have not found it yet. Any help would be amazing!

SOT parameter question

(screenshot attached)

Hi, I'm a complete beginner. I'd like to ask what tracker_name and tracker_param in test.py should be set to in order to run the SOT test. I noticed this is a bit different from pytracking, and I'd like to know what to put in the defaults to run the SOT model.

Why not try RepLKNet as the backbone?

ConvNeXt does not seem to perform particularly well on downstream tasks, yet you achieve excellent results; do you have any good experience to share on how you did that?
Second, have you tried RepLKNet? What considerations went into this choice of backbone? If you have tried it, and it is convenient, I would appreciate it if you could share the experimental results.
Finally, I noticed that you trained with 16 A100 GPUs; can 8 cards with only 11 GB of memory each fit the task?

Thank you again for your excellent work on grand unification! Sorry to bother you again, and thanks!

on edge devices

Do the models work on edge devices, and what specs/qualifications are needed?

MOT/MOTS inference question

Is there an operation similar to feature propagation in the MOT/MOTS inference stage? How are the reference targets shown in the framework figure of the paper reflected during inference?

List index out of range in convert_to_coco_format

Hello @MasterBin-IIAU, thank you for your work and publishing it.

I am currently trying to set up an environment to benchmark Unicorn against another algorithm from someone in my company, as part of a project to prove my expertise for an internal AI degree. So bear with me if I am not 100% sure about wording and what I am doing :-).

As a first step, I intend to run a MOT-only test on MOT Challenge 17 data, since in the beginning the BDD data was not completely downloaded. I installed the Python environment, although I use Python 3.8 and CUDA 11.6 as well as PyTorch 1.12 (just to let you know).

Then I was able to run

python launch_uni.py --name unicorn_track_tiny_mot_only.py --nproc_per_node=2 --batch 16 --mode multiple

but as training would take 10 days, I want to use your provided model zoo. Therefore I created a directory called Unicorn_outputs/unicorn_track_tiny_mot_only and placed the pre-trained latest_ckpt.pth from the model zoo in it. I also changed mot_test_name to motchallenge in exp/unicorn_track.py, but there is no difference anyway when I don't change it (after I have now loaded all the BDD data as well).

When I call

python tools/track.py -f exps/default/unicorn_track_tiny_mot_only.py -c Unicorn_outputs/unicorn_track_tiny_mot_only/latest_ckpt.pth -b 1 -d 1

it throws a list index out of range error in the function convert_to_coco_format where it retrieves the label, because dataloader.dataset.class_ids only has one entry; I think this is expected, as MOT17 only has one class. But the actual output it is working on contains label numbers 0, 1, 2, 3, 6, and 7. The variable cls is a tensor starting with ten '0' values that work, but the '7' in the next position throws the error.
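For reference, a minimal standalone illustration of the failure mode described above (hypothetical values, not the repo's exact code):

# The predicted class index is used to look up dataset.class_ids, which for
# MOT17 only has a single entry, so any predicted class > 0 goes out of range.
class_ids = [1]                  # MOT17: one class ("pedestrian")
pred_classes = [0, 0, 7, 6]      # class labels produced by a multi-class head
labels = [class_ids[int(c)] for c in pred_classes]  # IndexError at c == 7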

My assumption is that something is still wrong with the number of classes, but I don't know how to proceed. I did some debugging, but so far I haven't found the solution.

Thanks for an answer, Carsten.

Which device for test?

Hi, awesome work!
I want to know which GPU you used for the inference speed reported in the paper.
Or what is the minimum GPU requirement to run the model?

install failed

my versions are:
torch 1.10.0
torchaudio 0.10.0
torchvision 0.11.0

but I get this error:

Traceback (most recent call last):
  File "tools/test_omni.py", line 11, in <module>
    from mmdet.datasets import build_dataset
  File "/home/yj/.local/lib/python3.7/site-packages/mmdet/__init__.py", line 18, in <module>
    f'MMCV=={mmcv.__version__} is used but incompatible. '
AssertionError: MMCV==1.4.6 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.1.0.
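As a quick sanity check, the installed mmcv version can be printed like this (which mmcv range is compatible depends on the mmdet release that ends up installed):

# Print the installed mmcv version; the assertion above comes from mmdet's own
# import-time compatibility check against this value.
import mmcv
print("mmcv:", mmcv.__version__)   # here 1.4.6, outside the required >=2.0.0rc4, <2.1.0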
