
JRMOT ROS package

This repository contains the code for the work "JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset".

Note that due to the global pandemic, this repository is still a work in progress. Updates will be made as soon as possible.

Introduction

JRMOT is a 3D multi-object tracking system that:

  • Is real-time
  • Is online
  • Fuses 2D and 3D information
  • Achieves state-of-the-art performance on KITTI

We also release JRDB:

  • A dataset with over 2 million annotated boxes and 3500 time-consistent trajectories in 2D and 3D
  • Captured in social, human-centric settings
  • Captured by our social mobile-manipulator JackRabbot
  • Contains 360-degree cylindrical images, stereo camera images, 3D point clouds and more sensing modalities

All information, including download links for JRDB, can be found here.

JRMOT

[System overview figure]

  • Our system is built on top of state-of-the-art 2D and 3D detectors (Mask R-CNN and F-PointNet, respectively). These detections are associated with predicted track locations at every time step.
  • Association is done via a novel feature fusion, as well as a cost selection procedure, followed by Kalman state gating and JPDA (a simplified sketch of this association step follows below).
  • Given the JPDA output, we use both 2D and 3D detections in a novel multi-modal Kalman filter to update the track locations.
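
Below is a simplified, illustrative sketch of this association step. The elementwise-minimum fusion, the fixed gate value, and the softmax-style weighting are stand-ins for the actual procedure (the real implementation lives in tracker_3d_node.py and differs); the appearance and geometry cost matrices are assumed to be precomputed.

import numpy as np

def fused_association_probs(appearance_cost, geometry_cost, gate=9.5):
    """Fuse 2D appearance and 3D geometry costs, gate implausible pairs,
    and return soft per-track association probabilities (JPDA-like).

    appearance_cost, geometry_cost: (num_tracks, num_detections) arrays.
    """
    fused = np.minimum(appearance_cost, geometry_cost)      # simplistic cost selection
    fused = np.where(geometry_cost > gate, np.inf, fused)   # Kalman-style gating
    scores = np.exp(-fused)                                 # exp(-inf) == 0 for gated pairs
    totals = scores.sum(axis=1, keepdims=True)
    return np.divide(scores, totals, out=np.zeros_like(scores), where=totals > 0)

# Example with 2 tracks and 3 detections (made-up costs):
appearance = np.array([[0.2, 1.5, 3.0], [2.0, 0.4, 2.5]])
geometry = np.array([[0.5, 8.0, 20.0], [15.0, 0.3, 7.0]])
print(fused_association_probs(appearance, geometry))

In the actual tracker, the resulting soft assignment weights the associated 2D and 3D measurements inside the multi-modal Kalman update.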

Using the code

The ROS package consists of 3 nodes:

  • 3d_detector.py: Runs F-PointNet, which performs 3D detection and 3D feature extraction
  • template.py: Runs Aligned Re-ID, which performs 2D feature extraction
  • tracker_3d_node.py: Performs tracking while taking both 2D detections + features and 3D detections + features as input

The launch file in the folder "launch" launches all 3 nodes.

Dependencies

The following are dependencies of the code:

  • 2D detector: The 2D detector is not included in this package. To interface with your own 2D detector, please modify the file template.py to subscribe to the correct topic and to handle the conversion from the ROS message to a numpy array (a minimal subscriber sketch follows this list).
  • Spencer People Tracking messages: The final tracker output is a Spencer People Tracking message. Please install this package and include these message types.
  • Various Python packages: These can be found in requirements.txt. Please install all dependencies prior to running the code (including CUDA and cuDNN). Additionally, this code requires a solver called Gurobi. Instructions to install gurobipy can be found here.
  • Weight files: The trained weights (trained on JRDB) for F-PointNet and Aligned Re-ID can be found here.
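
As a rough illustration of the template.py modification described above, the sketch below subscribes to an image topic and converts the incoming sensor_msgs/Image to a numpy array with cv_bridge. The topic name, class name and run_detector placeholder are hypothetical and not part of this package.

#!/usr/bin/env python
import numpy as np
import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

class Detector2DInterface(object):
    def __init__(self):
        self.bridge = CvBridge()
        # Replace "/camera/image_raw" with the topic your camera actually publishes.
        self.sub = rospy.Subscriber("/camera/image_raw", Image, self.image_callback, queue_size=1)

    def image_callback(self, msg):
        # Convert the ROS Image message to an H x W x 3 numpy array (BGR).
        image = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        boxes, features = self.run_detector(image)
        # ...publish boxes/features in whatever message type the tracker expects.

    def run_detector(self, image):
        # Placeholder: plug in your own 2D detector and Re-ID feature extractor here.
        return np.empty((0, 4)), np.empty((0, 128))

if __name__ == "__main__":
    rospy.init_node("detector_2d_interface")
    Detector2DInterface()
    rospy.spin()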

Citation

If you find this work useful, please cite:

@INPROCEEDINGS{shenoi2020jrmot,
  author={A. {Shenoi} and M. {Patel} and J. {Gwak} and P. {Goebel} and A. {Sadeghian} and H. {Rezatofighi} and R. {Mart\'in-Mart\'in} and S. {Savarese}},
  booktitle={2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, 
  title={JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset}, 
  year={2020},
  volume={},
  number={},
  pages={10335-10342},
  doi={10.1109/IROS45743.2020.9341635}}

If you utilise our dataset, please also cite:

@article{martin2019jrdb,
  title={JRDB: A dataset and benchmark of egocentric visual perception for navigation in human environments},
  author={Mart{\'i}n-Mart{\'i}n, Roberto and Patel, Mihir and Rezatofighi, Hamid and Shenoi, Abhijeet and Gwak, JunYoung and Frankel, Eric and Sadeghian, Amir and Savarese, Silvio},
  journal={arXiv preprint arXiv:1910.11792},
  year={2019}
}


jrmot_ros's Issues

labels in Range image

Hi all, in order to get the range image, I define the following transformation for each point p of the point cloud with coordinates (x, y, z):
[screenshot of the projection equations]
where α and β are the zenith and azimuth angles respectively, ∆α and ∆β are fixed-size steps generated by the resolution grid of the range image, and x̄ and ȳ are indices which define the 2D pixel coordinates of the spherical image.
Here is also my code:


import numpy as np

angular = (0.5236, 2 * np.pi)     # approximation of 30º and 360º
shape = (16, 384)                 # image shape (height, width)
height, width = shape
y_angular, x_angular = angular
x_delta, y_delta = x_angular / width, y_angular / height

x, y, z = np.asarray(self.pcd.points).T
r = np.sqrt(x**2 + y**2 + z**2)

azimuth_angle = np.arctan2(-y, x) % (2 * np.pi)
elevation_angle = np.arcsin(z / r)

x_img = np.floor(azimuth_angle / x_delta).astype(int)
y_img = np.floor(elevation_angle / y_delta).astype(int)
y_img -= y_img.min()

On the other hand, I have labels as cx, cy, cz, l, w, h in label_list, obtained from the 3D labels for each point cloud: (cx, cy, cz) are the coordinates of the center of the cuboid and (l, w, h) are its length, width and height, as mentioned here. Considering these 3D labels, I take all points located inside a given cuboid and assign them 1; the rest are assigned 0. The way I do this is:

for cx, cy, cz, l, w, h in label_list:
    self.pcd_labels[np.all([cx - l/2 <= x,
                            cy - w/2 <= y,
                            cz - h/2 <= z,
                            x <= cx + l/2,
                            y <= cy + w/2,
                            z <= cz + h/2], axis=0)] = 1

The result is as below; as you can see, there are some false labels.
[screenshot of the resulting range-image labels]

What can the problem be?
[screenshot]

Thanks in advance for your support :)

License

Hi, what license are you planning to associate with the codebase?

Run on KITTI dataset

Hi, thanks for your great work. I wonder how you ran your code on the KITTI tracking dataset?

Is there any pretrained model for the KITTI tracking dataset?

Bugs with 3d_detector.py

Hi! Thank you very much for sharing the code for your paper!

I deployed your code in my own environment and also wrote a program that reads the JRDB dataset and publishes it via ROS. When I run 3d_detector.py, I can get into the get_3d_feature() callback function.

However, my program crashes and aborts immediately after running. When I use breakpoints for debugging, the place where I find the program aborts is here:

# jpda_rospack/src/featurepointnet_model.py
try:
    batch_centers, \
    batch_heading_scores, batch_heading_residuals, \
    batch_size_scores, batch_size_residuals, batch_features = \
    self.sess.run([self.ops['center'],
                   ep['heading_scores'], ep['heading_residuals'],
                   ep['size_scores'], ep['size_residuals'], self.ops['depth_feature']],
                   feed_dict=feed_dict)
except Exception as e:
     print(e)

and the error message is as follows:

[INFO] [1600131005.769569]: 3D detector ready.
2020-09-15 08:50:08.476800: E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 6021 (compatibility version 6000).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2020-09-15 08:50:08.477254: W ./tensorflow/stream_executor/stream.h:1939] attempting to perform DNN operation using StreamExecutor without DNN support
cuDNN launch failure : input shape ([4,1024,1,64])
	 [[Node: conv1/bn/cond/FusedBatchNorm_1 = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=0.001, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv1/bn/cond/FusedBatchNorm_1/Switch, conv1/bn/cond/FusedBatchNorm_1/Switch_1, conv1/bn/cond/FusedBatchNorm_1/Switch_2, conv1/bn/cond_1/AssignMovingAvg/sub/Switch, conv1/bn/cond_1/AssignMovingAvg_1/sub/Switch)]]
Caused by op 'conv1/bn/cond/FusedBatchNorm_1', defined at:
  File "/home/zlin/software/pycharm-2020-2/plugins/python/helpers/pydev/pydevd.py", line 2141, in <module>
    main()
  File "/home/zlin/software/pycharm-2020-2/plugins/python/helpers/pydev/pydevd.py", line 2132, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/zlin/software/pycharm-2020-2/plugins/python/helpers/pydev/pydevd.py", line 1441, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/zlin/software/pycharm-2020-2/plugins/python/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/zlin/software/pycharm-2020-2/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/3d_detector.py", line 203, in <module>
    main(sys.argv)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/3d_detector.py", line 197, in main
    Detector_3d()
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/3d_detector.py", line 51, in __init__
    self.depth_model = create_depth_model('FPointNet', fpointnet_config)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_model.py", line 339, in create_depth_model
    return FPointNet(config_path)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_model.py", line 28, in __init__
    end_points, depth_feature = self.get_model(pointclouds_pl, one_hot_vec_pl, is_training_pl)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_model.py", line 271, in get_model
    is_training, bn_decay, end_points)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_model.py", line 154, in get_instance_seg_v1_net
    scope='conv1', bn_decay=bn_decay)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_tf_util.py", line 181, in conv2d
    data_format=data_format)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_tf_util.py", line 577, in batch_norm_for_conv2d
    return batch_norm_template(inputs, is_training, scope, [0,1,2], bn_decay, data_format)
  File "/home/zlin/PycharmProjects/MOT/catkin_ws_jrmot/src/jpda_rospack/src/featurepointnet_tf_util.py", line 531, in batch_norm_template
    data_format=data_format)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 592, in batch_norm
    scope=scope)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 401, in _fused_batch_norm
    _fused_batch_norm_inference)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/utils.py", line 217, in smart_cond
    return control_flow_ops.cond(pred, fn1, fn2, name)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 316, in new_func
    return func(*args, **kwargs)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1864, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1725, in BuildCondBranch
    original_result = fn()
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 398, in _fused_batch_norm_inference
    data_format=data_format)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/nn_impl.py", line 831, in fused_batch_norm
    name=name)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 2034, in _fused_batch_norm
    is_training=is_training, name=name)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/zlin/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
InternalError (see above for traceback): cuDNN launch failure : input shape ([4,1024,1,64])
	 [[Node: conv1/bn/cond/FusedBatchNorm_1 = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=0.001, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv1/bn/cond/FusedBatchNorm_1/Switch, conv1/bn/cond/FusedBatchNorm_1/Switch_1, conv1/bn/cond/FusedBatchNorm_1/Switch_2, conv1/bn/cond_1/AssignMovingAvg/sub/Switch, conv1/bn/cond_1/AssignMovingAvg_1/sub/Switch)]]

Training settings

Hi Abhijeet!
Congratulations on your work, and thank you for sharing your code with the whole community. I want to ask how I can train your model, as I didn't see the related instructions in the README you released.

Best regards!

Velodyne points to camera image Result

Hi, thank you for this repository.
I'm really new to this area and have some questions. I want to project JRDB point clouds onto the RGB image and am having some problems. Considering the stitched image and its point cloud, I want to have something like this; how can I do such a projection? I tried project_ref_to_image_torch, project_velo_to_ref, and move_lidar_to_camera_frame, and did:

import cv2
import numpy as np
import open3d as o3d
import torch

def print_projection_plt(points, image):
    """Project converted velodyne points onto the camera image."""
    hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

    for i in range(points.shape[1]):
        cv2.circle(hsv_image, (np.int64(points[0][i]), np.int64(points[1][i])), 5, (0, 0, 255), -1)

    return cv2.cvtColor(hsv_image, cv2.COLOR_HSV2RGB)


pcd = o3d.io.read_point_cloud(pcd_path)
pointcloud = torch.tensor(np.asarray(pcd.points))

pnt = OmniCalibration(calib_folder='calib').project_ref_to_image_torch(pointcloud)
pnt = np.asarray(pnt).T[:, 1::5]  # keep one point in five

image = cv2.imread(img_path)
image = print_projection_plt(points=pnt, image=image)

But the results are not correct.
Thank you

Transformation lidar to rgb

Thank you for your great work and this impressive dataset. I was having a closer look at this repo in order to understand how you are using the new JRDB for your purpose of tracking. I want to use it in the context of 3D people detection and I am currently wondering which transformations to apply.

I am training my detection network in the Velodyne coordinate system (i.e. x: forward, y: left, z: up), but something still seems odd when visualizing the merged point clouds (upper and lower lidar) and the 3D bounding box labels. I first tried transforming both point clouds to the base coordinate system, as well as the center coordinates (transformations taken from here). Alternatively, I tried to translate only the point cloud from the lower velodyne coordinate system into the upper velodyne coordinate system, since the bounding boxes have been annotated with reference to the upper velodyne coordinate system. Then I undo the centering of the center annotations on the RGB camera by translating them back up into the upper velodyne coordinate system. In my understanding, both approaches should lead to the same result? But in the visualization the point cloud also appears to be a bit spherical.
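
For reference, the kind of rigid frame change I mean boils down to one rotation plus one translation; a generic sketch with placeholder values (not the actual JRDB calibration):

import numpy as np

def transform_points(points, R, t):
    """Map an Nx3 array of points from a source frame to a target frame: p' = R @ p + t."""
    points = np.asarray(points, dtype=float)
    return points @ np.asarray(R, dtype=float).T + np.asarray(t, dtype=float)

# Placeholder calibration: no rotation, pure vertical offset between the two lidars.
R_lower_to_upper = np.eye(3)
t_lower_to_upper = np.array([0.0, 0.0, 0.6])   # illustrative value, not from calibration.yaml

lower_points = np.random.rand(100, 3)
upper_points = transform_points(lower_points, R_lower_to_upper, t_lower_to_upper)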

Do you have any recommendation on which transformations to apply and to which coordinate system to transform all the point clouds and annotations? I also upload my code for the transformations here:
jrdb_transform.py.zip

Moreover, I am not sure if there might be a bug in the function move_lidar_to_camera_frame in the calibration source code (see pointcloud[:,:3] - torch.Tensor(self.global_config_dict['calibrated'])). You subtract the translation vector (loaded from calibration.yaml) from the points, but this reverses the effect of the translation vectors as defined in calibration.yaml (lidar_upper_to_rgb: translation: [0, 0, -0.33529]). I think the minus should be inverted either in calibration.yaml or in the function, since the camera sits in between the upper and lower lidar sensors, as far as I understood?

Thanks in advance for your advice and help! I would highly appreciate your input.

About JPDA

Hello, I read your code, but I cannot find the JPDAF module. Could you please tell me which file contains the implementation of the JPDAF?

System setup

Hello again,

I'm trying to set up the computer (NVIDIA Xavier) to use your package with ROS.

From the code, I guess the package depends on the following ML frameworks:

  1. Pytorch - for Re-ID of Image Features
  2. Tensorflow - for 3D detection

I'm a bit confused as to how to set up the computer, as ROS depends on Python 2, the current PyTorch codebase only supports Python 3, and only a few older versions of TensorFlow can be set up with Python 2. It would be nice if you could shed some light on the system setup so that I can test the package on my computer as well.

Also, if you could state in your readme which versions of TensorFlow and PyTorch this package depends on, that would be great.

Cheers to the great work !!

Gurobi installation for ARM

Hello,
Thanks a lot for the great work.

I'm trying to implement JRMOT in a robot that we are building. We make extensive use of Nvidia Xavier computers which are based on ARM architectures.

Unfortunately, one of the dependencies of this package, the Gurobi solver, is not available for ARM-based architectures. Is there a way around this solver?

Also, I'm unable to find in which source file Gurobi is being used. If I could have a look at it, I might try to replace it with solvers available for the ARM architecture.

Thanks a lot!
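
For reference, if the step being solved is a plain linear assignment between tracks and detections (an assumption on my part; the tracker may instead solve a more general integer program for the JPDA hypotheses), SciPy's Hungarian solver runs on ARM and could serve as a starting point:

import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix: rows are tracks, columns are detections.
cost = np.array([[0.2, 1.5, 3.0],
                 [2.0, 0.4, 2.5]])

track_idx, det_idx = linear_sum_assignment(cost)
for t, d in zip(track_idx, det_idx):
    print("track %d -> detection %d (cost %.2f)" % (t, d, cost[t, d]))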

Reduction in framerate

The 3d_detector.py node significantly reduces the framerate while running in real time. Is the class single-threaded?
