
eagermot's Issues

Query on results for nuScenes data with CenterPoint and mmdetection_cascade_x101 detections

Dear Kim,
Thanks for sharing your code. I am trying to run tracking on the v1.0-mini version of the nuScenes dataset downloaded from here, with the detections you provided on the drive (CenterPoint and MMDetection). I set SPLIT to mini_val, renamed the val/test folders in the detections directory to min_val/min_test, and ran the code to see the results.
I got the result below, which I couldn't understand. Please help me understand the tracking results and whether the dataset and detections are set up properly.

Console output after I ran run_tracking.py:
[screenshot of console output]

Thanks,
Gayathri.

Visualization code: utils_viz issue & render_option.json

Thank you for fixing the missing function; it can now output some visualization results. However, after the first frame, some warnings come up:

~/EagerMOT/utils/utils_viz.py:132: RuntimeWarning: invalid value encountered in true_divide
axis_ = axis_ / np.linalg.norm(axis_)
[Open3D WARNING] [ViewControl] ConvertFromPinholeCameraParameters() failed because window height and width do not match.
[Open3D WARNING] Read JSON failed: unable to open file: render_option.json

Also, when I look at the saved output figures, they are all dark, with no content.

Thanks and looking forward to the fix!
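
A possible guard for the true_divide warning above, sketched here as an assumption rather than the repository's actual fix: skip the normalization when the rotation axis has (near-)zero norm.

import numpy as np

def safe_normalize(axis_, eps=1e-12):
    # Avoid "invalid value encountered in true_divide" when the axis
    # degenerates to the zero vector.
    norm = np.linalg.norm(axis_)
    if norm < eps:
        return np.zeros_like(axis_)  # caller can treat this as "no rotation"
    return axis_ / norm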

KITTI MOTS dataset

Hi,

I would like to test this code on the KITTI MOTS dataset, but it is not clear to me how to do so. Is it possible? In the paper I read that you evaluated on KITTI MOTS, but the configuration seems to be set up for KITTI only.

Thank you very much

"No detections for ..." error on nuScenes mini

Hello, thank you for your work!
I tried to run EagerMOT with CenterPoint and MMDetection detections, but I got the error below. Could you give some help?

Parsing NuScenes v1.0-mini ...
======
Loading NuScenes tables for version v1.0-mini...
23 category,
8 attribute,
4 visibility,
911 instance,
12 sensor,
120 calibrated_sensor,
31206 ego_pose,
8 log,
10 scene,
404 sample,
31206 sample_data,
18538 sample_annotation,
4 map,
Done loading in 0.360 seconds.
======
Reverse indexing ...
Done reverse indexing in 0.1 seconds.
======
Done parsing
Starting sequence: scene-0061
Processing frame ca9a282c9e77460f8360f564131a8af5
Parsing /home/zxl/wyd_EagerMOT/EagerMOT-open_main/centerpoint/mini_train/detections1.json
Traceback (most recent call last):
  File "run_tracking.py", line 176, in <module>
    run_on_nuscenes()
  File "run_tracking.py", line 151, in run_on_nuscenes
    mot_dataset, NUSCENES_BEST_PARAMS, target_sequences, sequences_to_exclude)
  File "run_tracking.py", line 131, in perform_tracking_with_params
    sequences_to_exclude=sequences_to_exclude)
  File "run_tracking.py", line 40, in perform_tracking_full
    run_info = sequence.perform_tracking_for_eval(params)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/mot_sequence.py", line 73, in perform_tracking_for_eval
    predicted_instances = frame.perform_tracking(params, run_info)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/mot_frame.py", line 166, in perform_tracking
    self.fuse_instances_and_save(params, run_info, load=False, save=False)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/mot_frame.py", line 66, in fuse_instances_and_save
    self.bboxes_3d, self.dets_2d_multicam, params["fusion_iou_threshold"])
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/mot_frame.py", line 247, in dets_2d_multicam
    self.load_segmentations_2d_if_needed()
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/mot_frame.py", line 224, in load_segmentations_2d_if_needed
    dets_2d_multicam = self.sequence.get_segmentations_for_frame(self.name)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/mot_sequence.py", line 93, in get_segmentations_for_frame
    self.dets_2d_multicam_per_frame = self.load_detections_2d()
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/dataset_classes/nuscenes/sequence.py", line 82, in load_detections_2d
    frames_cam_tokens_detections = loading.load_detections_2d_nuscenes(self.seg_source, self.token)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/inputs/loading.py", line 100, in load_detections_2d_nuscenes
    return detections_2d.load_detections_2d_mmdetection_nuscenes(seq_name)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/inputs/detections_2d.py", line 101, in load_detections_2d_mmdetection_nuscenes
    all_dets = utils.load_json_for_sequence(utils.DETECTIONS_MMDETECTION_CASCADE_NUIMAGES_NUSCENES, seq_name)
  File "/home/zxl/wyd_EagerMOT/EagerMOT-open_main/inputs/utils.py", line 151, in load_json_for_sequence
    raise NotADirectoryError(f"No detections for {target_seq_name}")
NotADirectoryError: No detections for cc8c0bf57f984915a77078b10eb33198


Visualization of the results

Hello!

In the README you mention that you will create another repository for visualizing the results. Has it been created yet?

Thank you!

Tracking on nuScenes data with CenterPoint 3D detections (nuScenes) and MMDetection Cascade 2D detections (nuImages)

Hello @aleksandrkim61,
Thanks for sharing your code.
I am trying to run tracking on nuScenes data with the existing detections provided in the model zoo, but I am stuck on understanding how the nuImages 2D detections and the nuScenes 3D detections are correlated, since the nuScenes data is not the same as nuImages.
MMDetection is run on nuImages and CenterPoint is run on nuScenes, yet in the code we only provide the path to nuScenes, and the only correlation I could see is the class mapping from nuImages to nuScenes. It would be really great if you could help me understand this.
Thanks,
Gayathri.

some clarifications regarding kitti dataset

Hi,
Thanks for open-sourcing this work. I am interested in running some experiments with EagerMOT on the KITTI dataset to get an understanding of how multi-object tracking works.
Since KITTI 2D tracking has multiple files, I just wanted to confirm that these are the necessary files to download:
1. left color images
2. camera calibration matrices of the tracking set
3. L-SVM reference detections
4. Regionlet reference detections
I am assuming that files 3 and 4 are not required, since the detection results are already provided, right?
Apart from these files, is there anything else I need to look into before evaluating the tracker?

Can you provide the 2D TrackR-CNN detections for the KITTI MOT testing split? They are not available on the linked home pages.

Hi,
Thanks for sharing your code. I am trying to improve on this work for KITTI MOT using detections from 3D PointGNN + 2D TrackR-CNN.
I saw that in the paper you use 2D detections from RRC for cars and TrackR-CNN for pedestrians on the 2D MOT KITTI benchmark, but on the linked pages I can only find TrackR-CNN detections (train+val) for KITTI MOTS, and no detections for the testing split.

I have no GPU to train the model to obtain detections for the original KITTI MOT testing data. Could you provide TrackR-CNN 2D detections (both cars and pedestrians) for the KITTI MOT testing data?

Thanks a lot!


'class' in MOTSFusion detection/segmentation

HI,
How do you get the KEY 'class' in MOTSFusion detection/segmentation.txt/.json? I can not run the code due to program error: File ".../EagerMOT/inputs/detections_2d.py", line 77, in parse_motsfusion_seg
return (int(seg_json['class']), float(seg_json['score']),
KeyError: 'class'

About using AB3DMOT

Hi, thanks for your great work!

May I ask how I should change the code to use AB3DMOT?

Thanks

Visualization code

Hello, thank you for your work. I would like to know how to visualize the results; where is the code?

'MOTFrameNuScenes' object has no attribute 'bbox_3d_from_nu'

Thank you for updating the visualization script!
When I use it to visualize the results on nuScenes, this error comes up:

Traceback (most recent call last):
  File "visualize.py", line 398, in <module>
    target_sequences=target_sequences, target_frame=20, result_json=json_to_parse, radius=0.06, save_img=True)
  File "visualize.py", line 114, in visualize_3d
    geometries_to_add = vis_mot_predictions_submitted(frame, params, all_colors, tracking_results, radius, world)
  File "visualize.py", line 229, in vis_mot_predictions_submitted
    bbox_internal = frame.bbox_3d_from_nu(bbox, bbox_label, world=True)
AttributeError: 'MOTFrameNuScenes' object has no attribute 'bbox_3d_from_nu'

Is there some solution for that?

NuScenes Evaluation

Hello and thanks for the release of this code!

I would like to evaluate the tracking results using the NuScenes evaluation. I noticed that this code currently only evaluates one sequence at a time, and produces a single JSON file for each sequence. Would you recommend I just evaluate all sequences and combine them in a single JSON file for evaluation?
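
One way to combine per-sequence outputs, sketched under assumptions about the file layout (the directory name, the per-sequence JSON structure, and the meta flags below are illustrative, not taken from the repository): merge every per-sequence "results" dict into a single nuScenes-style submission.

import glob
import json

# Hypothetical layout: one JSON per sequence under results/, each holding a
# "results" dict keyed by sample token (adjust paths/keys to the actual output).
combined = {"meta": {"use_camera": True, "use_lidar": True, "use_radar": False,
                     "use_map": False, "use_external": False},
            "results": {}}
for path in sorted(glob.glob("results/*.json")):
    with open(path) as f:
        per_sequence = json.load(f)
    # Tolerate files that are either a full submission or just the results dict.
    combined["results"].update(per_sequence.get("results", per_sequence))

with open("combined_tracking_results.json", "w") as f:
    json.dump(combined, f)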

Question about the Track creation and confirmation

Hi, Thanks for sharing your code!

I have a question about the creation of new tracks from unmatched detections. In the code, you seem to use the condition if instance.bbox3d is not None to create new tracks only for unmatched 3D detections. What about the unmatched 2D detections? After reading your paper, I think EagerMOT can also track distant objects that only have a 2D box in the image.

# create and initialise new tracks for all unmatched detections
for cam, indices in unmatched_det_indices_final.items():
    for instance_i in indices:
        instance = leftover_det_instance_multicam[cam][instance_i]
        if instance.bbox3d is not None:
            self.trackers.append(Track(instance, self.is_angular))
self.trackers.extend([Track(instance, self.is_angular)
                      for instance in leftover_det_instances_no_2d if instance.bbox3d is not None])
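
For reference, a hypothetical variant of the loop above (not the repository's code, and it assumes Track can be built from a 2D-only instance) that would also spawn tracks from unmatched 2D-only detections, which is what the question is about:

# Hypothetical: drop the bbox3d check so 2D-only leftovers also start tracks.
for cam, indices in unmatched_det_indices_final.items():
    for instance_i in indices:
        instance = leftover_det_instance_multicam[cam][instance_i]
        self.trackers.append(Track(instance, self.is_angular))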

The second question is about track confirmation. In the paper, you confirm a track if it was associated with an instance in the current frame and has been updated with a 2D box in the last Age_2d frames, but the code logic seems to be inconsistent with that. Please correct me if I'm wrong.

max_age_2d_for_class = self.max_age_2d[track.class_id - 1]
if track.time_since_2d_update < max_age_2d_for_class:
    instance.bbox3d.confidence = track.confidence
else:
    frames_since_allowed_no_2d_update = track.time_since_2d_update + 1 - max_age_2d_for_class
    instance.bbox3d.confidence = track.confidence / \

The third question is about the concept of track confirmation. In the code, you seem to use only hits to confirm a track, whereas in the paper a track is considered confirmed if it was associated with an instance in the current frame and has been updated with 2D information in the last Age_2d frames. So I'm a little confused about the concept.

def is_confirmed_track(self, class_id, hits, age_total):
    required_hits = self.min_hits[class_id - 1]
    if self.frame_count < required_hits:
        return hits >= age_total
    else:
        return hits >= required_hits
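
To make the third question concrete, here is one possible reading of the paper's rule, sketched for discussion only (the attribute names are assumptions, and this is not the repository's implementation):

def is_confirmed_per_paper(track, max_age_2d_for_class):
    # Confirmed if the track was matched to an instance in the current frame
    # and received a 2D update within the last Age_2d frames.
    return (track.time_since_update == 0
            and track.time_since_2d_update <= max_age_2d_for_class)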

Looking forward to your reply!

Required detections and segmentations

Hi, I'm trying to run the code as a state-of-the-art reference for my master's thesis. I want to run it with 3D PointGNN + (2D MOTSFusion+RRC) on KITTI and no other detections/segmentations, so I thought I should comment out the lines referring to the other options in the inputs/utils.py file. However, when I run the code it stops because the TRACKRCNN data is missing. How should I handle this? Do I need those files as well?

ego_motion

Hi, thanks for your work. Could you explain a little more about ego_motion? Also, how should I prepare it for my own custom dataset?
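
For context, ego motion here generally means the vehicle pose for every frame. A minimal sketch of how a custom dataset could provide it, assuming a rotation matrix and translation from GPS/IMU or SLAM (an illustration, not EagerMOT's exact interface):

import numpy as np

def ego_to_world_transform(rotation, translation):
    # 4x4 rigid transform taking points from the ego/vehicle frame to a fixed
    # world frame; keeping tracks in world coordinates keeps them consistent
    # across frames while the ego vehicle moves.
    transform = np.eye(4)
    transform[:3, :3] = rotation      # 3x3 rotation of the ego frame in world coordinates
    transform[:3, 3] = translation    # ego position in world coordinates
    return transform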

IDS is higher

Hello, it's very nice work!
I have a question: why does the number of ID switches (IDS) seem to be higher than that of other methods at a similar level? Can you give an explanation?
[screenshot: eagermot benchmark results]

About testing

Thank you for your awesome work. I am trying to test your code on my computer but I have failed. My questions are as follows; I would appreciate your reply.
1. I modified SPLIT = 'testing' in local_variables.py to test the code. Is that right?
2. I want to use PointGNN and TrackRCNN. How should I modify the code?

Which files to download, and the format of the TrackR-CNN results

Hi! I am reproducing EagerMOT on KITTI with PointGNN as the 3D detections and tracking_best (MOTSFusion + TrackR-CNN) as the 2D detections for my bachelor's thesis. But I am quite confused about the format of the TrackR-CNN detection results. On the MOTS website there are 3 subtitles under Downloads; I don't know which dataset I should use or what its format is. Could you please help me understand the format, or point me to where it is documented?

I downloaded the files from the Detection for Tracking Only Challenge and unzipped MOTS20_detections.zip. There is a folder named KITTI_MOTS in it, which I use as input, but I cannot understand the format. The txt format described at https://www.vision.rwth-aachen.de/page/mots consists of 6 fields per line, but each line in the txt I downloaded has more than 10 items, as follows:

0 723.8499755859375 173.62062072753906 185.27008056640625 134.31040954589844 0.9960479140281677 1 375 1242 YSZ8k0h:8J4K5L3K5M2M3M300N2O1N2O1N2O0O2N2O1N2N2O1O1N2O2M2O101M3N2N1O5L3M2L9I3L3N3M0O2O2N0O2O1O00001O0O100000000000O100O001N200M3O0M4O1O1O100000000000000000000000000000000O1000O100000000000000000000000000000000000000000000000O10001O0000000000000000001N10001O0000000000001O00001O0O2O001O00001O1O00100O1O1O10001OO01011M10010OO200N002O0O1O3M2N1O2N1O2N5K1O2N7I2N6I3N1N4K4L5KWRj3

I tried to compare the format with the parse_trackrcnn_seg() function in inputs/detections_2d.py:

def parse_trackrcnn_seg(seg_values):
    """ Returns class, score, mask, bbox and reid parsed from input """
    # seg_values[7] and seg_values[8] are the mask height/width,
    # seg_values[9] is the compressed RLE string
    mask = {'size': [int(seg_values[7]), int(seg_values[8])],
            'counts': seg_values[9].strip().encode(encoding='UTF-8')}
    # seg_values[1:5] is the 2D bounding box
    box = (int(float(seg_values[1])), int(float(seg_values[2])),
           int(float(seg_values[3])), int(float(seg_values[4])))
    # class id, score, mask, box, and the remaining values as the ReID embedding
    return (int(seg_values[6]), float(seg_values[5]), mask, box, [float(e) for e in seg_values[10:]])

I guess some items in the file correspond to classes, scores, masks, boxes and ReIDs. The MOTS website mentions run-length encoding for the annotations, and in this function the RLE (the 10th item, a very long string) is assigned to mask['counts']. I don't understand what this variable means. The website says RLE is related to cocotools, but I couldn't find anything connecting mask['counts'] to cocotools in the repo.
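
For what it's worth, the long string is a COCO-style compressed RLE; a small sketch of decoding it with pycocotools (not part of the EagerMOT repo; the helper name and the assumption that line is one whitespace-separated detection line are illustrative):

from pycocotools import mask as rletools

def decode_trackrcnn_mask(line):
    values = line.split(' ')
    rle = {'size': [int(values[7]), int(values[8])],      # image height, width
           'counts': values[9].strip().encode('utf-8')}   # compressed RLE string
    return rletools.decode(rle)                           # binary H x W uint8 mask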

The txt downloaded under MOTSChallenge looks like this:

0 1109.5554 179.36575 1197.3547 314.45007 0.9999083 2 375 1242 Ygf<5b;0000O2O1N_;0`D5L2N1N2N2cFFZ7=eHFT7?kHDP7?oHCn6`0PIAn6c0nH_On6g0nH[On6l0lHWOP7o0kHUOR7n0kHTOS7P1iHROR7V1hHmNo6b1eHdN[7l20O10000N2M3H8K5N2O1N2N2M3N2O1O1001O001O0015K2O2N1N1O1O001O4eNUGAU9MWG1Z:N0O0O101O1N3N000001N2O1N2N2O4LO1O2N2N2N1O001O2N2N2N2N000000kU`0 -0.3751167 0.48021537 0.032476578 -0.28660417 -0.70799315 -0.52072155 0.08180365 -0.013868877 0.036078844 -0.23432338 0.10030153 0.2857126 -0.53020716 0.12753601 0.40149367 0.7348276 0.043223802 -0.13538602 -0.14182042 -0.6249713 0.30748948 0.26873767 0.025597623 0.31074926 0.32362318 0.08508656 0.3480975 0.020496124 -0.1315603 -0.060836367 -0.39438733 -0.60612524 -0.15734667 0.08845482 0.075994976 0.21069686 0.06765656 -0.3943655 -0.050879166 0.26497495 -0.56978315 -0.5910222 0.0981341 -0.5647276 0.5951754 -0.10315818 -0.23011783 -0.8937163 0.36296442 0.23472416 0.2052533 0.17285214 -0.08307746 -0.26530197 -0.43209535 -0.13557851 0.25855196 -0.4168136 -0.2923897 0.2938376 0.7098037 0.39629406 -0.033923443 -0.17291501 -0.38073516 -0.07897187 0.37062654 -0.12985493 0.1492367 -0.45166814 -0.64741623 -0.5740453 -0.23283233 -0.14643145 -0.27898163 0.014514893 -0.1434794 -0.6008462 -0.09011394 -0.41281822 0.2717996 -0.96931094 -0.24767381 0.14481777 -0.23039247 -0.46699083 -0.07223604 0.04203764 0.26910537 0.24745579 -0.57074845 -0.078286625 -0.53346604 -0.29033712 -0.09410042 -0.27020353 -0.22399586 0.561881 -0.6308956 -0.006530372 -0.13324912 -0.33152327 -0.31110197 -0.2549216 -0.2163514 -0.34898254 0.21159562 0.29987532 -0.40363675 0.24261205 -0.33671173 0.81703144 0.46958938 -0.69749266 0.1615237 -0.50936264 -0.16553718 -0.1437751 -0.03610575 -0.030241877 0.27487156 0.75182754 -0.17875957 -0.520232 -0.029418062 0.15701526 0.051346615 -0.11979125

What do the numbers after the RLE (run-length encoding) string mean? Are they the ReID embedding?

I am new to the object tracking field, so the questions above are probably basic. I would appreciate any help.

About running the testing split

Hi, I've tried running the code on the KITTI testing split, but when I do so the program returns the error:
No such file or directory: 'data_kitti/testing\label_02\0001.txt'
I changed the split variable in the local_variables.py file and also ran the adapt_kitti_motsfusion_input.py script, since I'm using PointGNN and MOTSFusion for detections.
Also, is it possible to run the algorithm using 3D detections only, without specifying the seg_source variable in the run_on_kitti function?
Thanks in advance

Question about AMOTP in ablation

Hi @aleksandrkim61, regarding the results in the paper, I noticed that the ablation study in Table IV using nuScenes dataset reports an AMOTP where higher is better. However, for the nuScenes benchmark and in their provided code, AMOTP is defined using center distance where lower is better. I'm wondering how the AMOTP in Table IV was computed. Was it using 3D IoU?
Thanks in advance!

Question about the result

[screenshot: 2021-11-03 13-53-47]

Dear Kim,
Thanks for your brilliant work. I ran EagerMOT on the KITTI dataset and got the result shown above; is that right? Why are there no matched instances?
Best wishes!

Tentative Date for Release

Any tentative dates for the release of the code?

The concept seems like a natural successor to previous methods and I am very 'Eager' to try it :D

Does the code require any training?

I am just catching up on this work. I don't see a training script anywhere. Can this code be modified to run in real time instead of on a dataset, without any training beforehand?

Detections for nuScenes mini dataset

Thanks for sharing your code!
I'm wondering if the CenterPoint and MMDetection detections are provided for the v1.0-mini version of the nuScenes dataset. As mentioned in the README, I changed SPLIT to mini-train / mini-val, but the CenterPoint and MMDetection folders from the google drive link contain detections for the (full) validation and test sets only. Am I missing something here? Thanks in advance!

EagerMOT result format

Hello, I am trying to visualize my EagerMOT results. I used the provided PointGNN detections as the 3D input and RRC segmentations as the 2D input, and I obtained the following result for an object tracked in a sequence:

0 1 Pedestrian 0 0 -1.000000 -1 -1 -1 -1 1.666817 0.743471 0.980450 6.358918 1.613495 8.489952 1.077119 4.558822

What is the exact format of this output? I used the KITTI dataset, so I assume it is in the KITTI tracking format, but the last value, which should then be the confidence score, is not in the expected score range (between 0 and 1). Can you give a brief explanation of this output format? For example, are the 3D object dimensions and locations (values 11 to 16) the 3D bounding box info in image or LiDAR space?
Thanks!
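
For reference, the line above has the 18 fields of the standard KITTI tracking result format; a small parsing sketch (the helper name is illustrative, and the interpretation assumes the usual KITTI convention where the 3D box is given in camera coordinates):

def parse_kitti_tracking_line(line):
    v = line.split()
    return {
        'frame': int(v[0]), 'track_id': int(v[1]), 'type': v[2],
        'truncated': float(v[3]), 'occluded': int(v[4]), 'alpha': float(v[5]),
        'bbox_2d': [float(x) for x in v[6:10]],      # left, top, right, bottom (pixels)
        'dimensions': [float(x) for x in v[10:13]],  # height, width, length (meters)
        'location': [float(x) for x in v[13:16]],    # x, y, z in the camera frame (meters)
        'rotation_y': float(v[16]),                  # yaw around the camera Y axis
        'score': float(v[17]),                       # detection/track confidence
    }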
