LiDAR R-CNN: An Efficient and Universal 3D Object Detector
Hi, thank you for the great work. I am a bit confused about the embedding channels in PointNet. The paper says: "We shrink the number of embedding channels to [64, 64, 512] in PointNet to achieve fast inference speed while maintaining accuracy."
However, in the code they seem to be [64, 128, 512]. Could you kindly clarify which setting was used for the experimental results in the paper? And how large is the effect of this channel difference (64 -> 128) on the results? Thank you so much!
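For a rough sense of how much the two settings differ in size, one can count the weights of the shared per-point MLP. This is only a sketch: it assumes plain linear (1x1 convolution) layers with bias, ignores normalization layers and the prediction heads, and uses an input width of 9 purely as an example value. `mlp_params` is a hypothetical helper, not a function from the repository.

```python
def mlp_params(in_channels, widths):
    """Count weights and biases of a stack of per-point linear layers."""
    total = 0
    prev = in_channels
    for width in widths:
        total += prev * width + width  # weight matrix plus bias vector
        prev = width
    return total


# Assumed input width of 9; the real value depends on the point features used.
paper_setting = mlp_params(9, [64, 64, 512])
code_setting = mlp_params(9, [64, 128, 512])
print("paper [64, 64, 512]:", paper_setting)
print("code  [64, 128, 512]:", code_setting)
```

Under these assumptions the wider middle layer roughly doubles the embedding parameters; its effect on accuracy and speed can only be answered by the authors or by an ablation.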
Hello, I am very interested in your paper, and I am reproducing it. Could you please provide the pretrained model of Table 5. It's about 3D AP results on WOD with three classes trained in one model. My email is [email protected]. Thank you very much.
Hi guys,
It's really a nice work!
According to the paper and the code, the PointNet is trained on proposals generated by pre-trained detectors (such as PointPillars).
So the base detector and the PointNet are trained separately.
I wonder whether you have tried training the base detector and the PointNet jointly from scratch, instead of using a pre-trained base detector to generate the proposals.
If the network is trained this way, would the results be better, worse, or almost the same?
Hello, really impressive work! I am currently using your model to refine CenterPoint-PP results on another (non-public) dataset. I notice that when transforming the point cloud into the proposal coordinate system, you use pose information like:
However, my dataset has no precise pose information. I believe the proposals and the point cloud in my setup are already in the same coordinate system, so I wonder whether I can still use LiDAR R-CNN in this case or should resort to another second-stage model. Could you give me some advice? Thanks.
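If the points and the proposals really do share one frame, the only transform left is the per-proposal one: shift the points to the box center and rotate them by the negative heading. A minimal sketch follows, assuming a box layout of [cx, cy, cz, l, w, h, heading] with the heading measured about the z axis. `to_proposal_frame` is a hypothetical name and this is not the repository's code.

```python
import numpy as np


def to_proposal_frame(points, proposal):
    """Express (N, 3) xyz points in the canonical frame of one proposal.

    proposal is assumed to be (cx, cy, cz, l, w, h, heading).
    """
    cx, cy, cz, _, _, _, yaw = proposal
    shifted = points - np.array([cx, cy, cz])
    c, s = np.cos(yaw), np.sin(yaw)
    # Rotation by -yaw about z, so the box heading lines up with +x.
    rot = np.array([[c, s, 0.0],
                    [-s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    return shifted @ rot.T


points = np.array([[1.0, 3.0, 0.0]])
proposal = (1.0, 2.0, 0.0, 4.0, 2.0, 1.5, np.pi / 2)
print(to_proposal_frame(points, proposal))
```

A point one meter ahead of the box center along its heading ends up on the +x axis of the canonical frame, which is a quick sanity check for a new dataset.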
Hi, I'm mostly interested in multi-frame training, but I don't know how to combine the frames.
Hi~Sorry to bother you again!
Is that right?
The predicted boxes of all frames are extracted at once and then shuffled globally, which means that a single LiDAR R-CNN training batch contains boxes from different frames. With a batch size of 256, a batch may in the extreme case draw from up to 256 frames, with one box taken from each.
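The behavior described above can be sketched as follows. This is an illustration of global shuffling only, with hypothetical names (`make_batches`, `proposals_per_frame`), and it makes no claim about how the repository's data loader is actually written.

```python
import random


def make_batches(proposals_per_frame, batch_size, seed=0):
    """Flatten all (frame_id, box) pairs, shuffle them globally, then batch."""
    samples = [(frame_id, box)
               for frame_id, boxes in proposals_per_frame.items()
               for box in boxes]
    rng = random.Random(seed)
    rng.shuffle(samples)
    return [samples[i:i + batch_size]
            for i in range(0, len(samples), batch_size)]


# Eight toy frames with four proposals each.
frames = {f: ["box{}_{}".format(f, k) for k in range(4)] for f in range(8)}
for batch in make_batches(frames, batch_size=4)[:2]:
    print(sorted({frame_id for frame_id, _ in batch}))
```

Because the shuffle is over all proposals, the number of distinct frames in a batch is bounded only by the batch size.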
Below is my idea!
If I train on two frames at a time, extract proposals through the frozen one-stage network, and then train LiDAR R-CNN end to end, is that OK? Do you have any idea how to design the ROI sampler ratio?
Hi,
The tutorial link referenced in https://github.com/tusen-ai/LiDAR_RCNN/blob/master/tools/data_processer/README.md is inactive.
Could you please give a document here for generating the dataset for testing?
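import numpy as np


def from_prediction_to_label_format(centers, sizes, headings,
                                    pred_bbox):
    # Size outputs are log-ratios: exponentiate and scale by the proposal size.
    l, w, h = (np.exp(sizes) * pred_bbox[:, [3, 4, 5]]).T
    ry = headings.reshape(-1)
    # Center outputs are scaled by the proposal size (columns 3, 4, 5).
    tx, ty, tz = (centers * pred_bbox[:, [3, 4, 5]]).T
    return l, w, h, tx, ty, tz, ry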
Hi, when preparing the training data following README.md, I get an error and cannot generate kitti_results_train.bin. The error message is as follows:
Message waymo.open_dataset.Objects exceeds maximum protobuf size of 2GB: 2375722334
I tried to debug this error and found that it is caused by the combine operation. The training split has so many *.bin files that the combined result exceeds 2 GB. How should I deal with this? I tried splitting the training data into two .bin files, but I could not concatenate them in the right format.
Thanks
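One generic workaround for the 2 GB protobuf limit is to group the per-frame files into several shards that each stay under a byte budget, and to combine each shard separately. The sketch below shows only the grouping step, on plain (name, size) pairs. `shard_by_size` and the file names are hypothetical, and whether the downstream scripts accept more than one combined file is an assumption that would need checking.

```python
def shard_by_size(items, max_bytes):
    """Group (name, size_in_bytes) pairs into shards of at most max_bytes."""
    shards, current, used = [], [], 0
    for name, size in items:
        if size > max_bytes:
            raise ValueError("{} alone exceeds the budget".format(name))
        if current and used + size > max_bytes:
            shards.append(current)  # close the shard before it overflows
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        shards.append(current)
    return shards


# Toy sizes; real ones would come from os.path.getsize on each .bin file.
files = [("a.bin", 3), ("b.bin", 3), ("c.bin", 3), ("d.bin", 1)]
print(shard_by_size(files, max_bytes=6))
```

In practice the budget would be set comfortably below 2 GB, since the size of a combined message is not exactly the sum of the file sizes.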
The model file checkpoints/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car-9fa20624.pth mentioned in the docs cannot be found in the official mmdet3d repo (they only have the interval-5 pretrained models). Are the proposals extracted with the interval-1 models, 3d-car and 3d-3class? If I want to reproduce your results, do I need to train with these two configs first? Thanks.
I greatly appreciate your work, but I am confused by the proposal threshold used during training, for example
IOU_THRESHOLD: [1, 0.7, 0.5, 1, 0.5]
in https://github.com/tusen-ai/LiDAR_RCNN/blob/master/config/lidar_rcnn_all_cls.yaml.
Won't this filter out proposals that do belong to a ground-truth object but whose IoU is lower than 0.5?
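One common way to use such a per-class list is to keep the matched class only when the IoU reaches that class's threshold, and to label the proposal as background otherwise. The sketch below assumes exactly that, plus two further assumptions: the list is indexed by class id, and class 0 is background. The thread does not confirm that the repository works this way, and `assign_label` is a hypothetical helper.

```python
IOU_THRESHOLD = [1, 0.7, 0.5, 1, 0.5]  # copied from the config quoted above


def assign_label(gt_class, iou, thresholds=IOU_THRESHOLD, background=0):
    """Keep the matched class if the IoU clears its threshold, else background."""
    return gt_class if iou >= thresholds[gt_class] else background


print(assign_label(1, 0.75))  # class 1 uses the 0.7 threshold
print(assign_label(1, 0.60))
print(assign_label(4, 0.49))
```

Under this reading a low-IoU proposal is not dropped from training; it becomes a negative sample for the classification branch.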
The paper says "we enlarge its width and length to contain more contextual points around it". By how much do you enlarge the box?
Hi, thank you for your great work. I tried to generate the proposal data for the training split of the Waymo dataset using the command here. I modified the code as instructed in the doc, but I got lots of messages like this:
79193 not found.
80014 not found.
79194 not found.
80015 not found.
79195 not found.
80016 not found.
79196 not found.
80017 not found.
Is this normal? Or might it be due to code changes in the latest upstream versions of mmdet3d? Could you provide the commit hash of the mmdet3d code base you are using? Thanks!
Hello, sorry to come back with another question.
Recently I have been using LiDAR R-CNN to refine the results of a CenterPoint-PP model on my own dataset. While processing the data, I noticed that my CenterPoint-PP model detects more bboxes than there are ground-truth ones (false detections). When the get_matching_by_iou function in LiDAR R-CNN is run, the resulting matching_gt_bbox has the same number of bboxes as the model predictions, not as the ground truth. I'm a bit confused by this. Since we are doing refinement, shouldn't we remove the false detections from the results and keep to the ground truth? If so, why is the matching indexed by the predictions instead of the ground truth?
Maybe I have misunderstood something here; it would be a great help if you could give me some hints. Thanks in advance.
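A per-prediction matching can be sketched as below. Every prediction gets one entry: either the index of its best-overlapping ground-truth box or -1 when nothing overlaps enough. That explains why the output has as many rows as there are predictions. The second stage needs unmatched proposals as negatives, because at test time it has to reject false detections on its own. `match_predictions` and the 0.5 cutoff are illustrative assumptions, not the repository's get_matching_by_iou.

```python
import numpy as np


def match_predictions(iou, min_iou=0.5):
    """iou: (num_pred, num_gt) matrix. Returns (gt index per pred, positive mask)."""
    num_pred = iou.shape[0]
    if iou.shape[1] == 0:  # frame without ground truth: all negatives
        return np.full(num_pred, -1), np.zeros(num_pred, dtype=bool)
    best_gt = iou.argmax(axis=1)
    positive = iou.max(axis=1) >= min_iou
    return np.where(positive, best_gt, -1), positive


# Three predictions, two ground-truth boxes; the third is a false detection.
iou = np.array([[0.80, 0.00],
                [0.10, 0.60],
                [0.05, 0.00]])
gt_index, positive = match_predictions(iou)
print(gt_index, positive)
```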
Hi there, I noticed that the width and length of each 3D proposal are enlarged to contain more contextual points around it. Is this controlled by the "expand_proposal_meter" parameter?
If so, I'm wondering why it is set to 3 meters, which seems large. Wouldn't it include points from a neighboring car when cars are parked close together, and confuse the network? And did you run ablation studies on this parameter?
Thanks a lot!
Hi~
Great work! I'd like to ask:
When extracting the points inside a proposal, why is only the BEV footprint considered?
Or am I wrong?
Could I instead consider the points inside the 3D box? Would that improve the results?
MatrixXb extract_points(const py::EigenDRef<Eigen::MatrixXf> pc,
                        const py::EigenDRef<Eigen::VectorXf> bbox,
                        float expand, bool canonic) {
  int pc_num = pc.rows();
  float yaw = bbox(4);
  float cos_yaw = std::cos(yaw);
  float sin_yaw = std::sin(yaw);
  MatrixXb valid_mask(pc_num, 1);
  for (int i = 0; i < pc_num; i++) {
    float r_x = (pc(i, 0) - bbox(0)) * cos_yaw + (pc(i, 1) - bbox(1)) * sin_yaw;
    float r_y = (pc(i, 0) - bbox(0)) * (-sin_yaw) + (pc(i, 1) - bbox(1)) * cos_yaw;
    //&& (pc(i, 2) < 2.0f)
    if ((std::abs(r_x) < bbox(2) / 2 + expand / 2) && (std::abs(r_y) < bbox(3) / 2 + expand / 2)) {
      valid_mask(i, 0) = true;
    }
    else {
      valid_mask(i, 0) = false;
    }
  }
  return valid_mask;
}
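For experimenting with the question above, the same BEV test can be written in NumPy, with an optional height check added on top. The BEV part mirrors the C++ snippet, using a bbox of [cx, cy, length, width, yaw]. The 3D variant and its `cz` and `height` arguments are hypothetical additions for comparison, not something the repository is known to provide.

```python
import numpy as np


def extract_points_bev(pc, bbox, expand):
    """Mask of points inside the enlarged BEV footprint of bbox."""
    cx, cy, length, width, yaw = bbox
    dx, dy = pc[:, 0] - cx, pc[:, 1] - cy
    c, s = np.cos(yaw), np.sin(yaw)
    r_x = dx * c + dy * s   # offset along the box heading
    r_y = -dx * s + dy * c  # offset across the box heading
    return (np.abs(r_x) < length / 2 + expand / 2) & \
           (np.abs(r_y) < width / 2 + expand / 2)


def extract_points_3d(pc, bbox, cz, height, expand):
    """Hypothetical variant that also requires the point to lie within the box height."""
    in_height = np.abs(pc[:, 2] - cz) < height / 2
    return extract_points_bev(pc, bbox, expand) & in_height


pc = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 5.0]])
bbox = (0.0, 0.0, 4.0, 2.0, 0.0)
print(extract_points_bev(pc, bbox, expand=0.0))
print(extract_points_3d(pc, bbox, cz=0.0, height=2.0, expand=0.0))
```

One trade-off to keep in mind: a strict height cut relies on the proposal's z and height being accurate, which is exactly what the second stage is supposed to correct.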
Hello, I am very interested in your paper, and I am now reproducing it. Could you please provide your pretrained model for the car class? My email is [email protected]. Thank you very much.
What is pc_url_ri2? Is it different from pc_url?
I see pcd = np.vstack([pcd, pcd_ri2]) in the load_data function.
import pickle as pkl
from collections import defaultdict

import numpy as np


def get_proposal_dict(data, pc_path):
    outputs_dict = {}
    for o in data.objects:
        output = [
            o.object.box.center_x, o.object.box.center_y, o.object.box.center_z,
            o.object.box.length, o.object.box.width, o.object.box.height,
            o.object.box.heading, o.score, o.object.type
        ]
        # One entry per frame, keyed by segment name and timestamp.
        key = "{}/{}".format(o.context_name, o.frame_timestamp_micros)
        if key not in outputs_dict:
            outputs_dict[key] = defaultdict(list)
        outputs_dict[key]['pred_lst'].append(output)
        # Two point cloud files per frame, with suffixes _1 and _2.
        outputs_dict[key]['pc_url'] = '{}/segment-{}_with_camera_labels/{}_1.npz'.format(
            pc_path, o.context_name, o.frame_timestamp_micros)
        outputs_dict[key]['pc_url_ri2'] = '{}/segment-{}_with_camera_labels/{}_2.npz'.format(
            pc_path, o.context_name, o.frame_timestamp_micros)
    return outputs_dict


def load_data(it):
    pcd, pcd_ri2, proposal, gt_box, gt_cls = pkl.loads(it['data'])
    # Keep only xyz from both point sets, then stack them into one cloud.
    pcd = pcd.astype(np.single)[:, [0, 1, 2]]
    pcd_ri2 = pcd_ri2.astype(np.single)[:, [0, 1, 2]]
    pcd = np.vstack([pcd, pcd_ri2])
    proposal = proposal.astype(np.single)[:7]
    gt_box = gt_box.astype(np.single)[:7]
    return pcd, proposal, gt_box, gt_cls
I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.
Here are the OpenMMLab 2.0 repos branches:
| | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch |
|---|---|---|
| MMEngine | | 0.x |
| MMCV | 1.x | 2.x |
| MMDetection | 0.x, 1.x, 2.x | 3.x |
| MMAction2 | 0.x | 1.x |
| MMClassification | 0.x | 1.x |
| MMSegmentation | 0.x | 1.x |
| MMDetection3D | 0.x | 1.x |
| MMEditing | 0.x | 1.x |
| MMPose | 0.x | 1.x |
| MMDeploy | 0.x | 1.x |
| MMTracking | 0.x | 1.x |
| MMOCR | 0.x | 1.x |
| MMRazor | 0.x | 1.x |
| MMSelfSup | 0.x | 1.x |
| MMRotate | 1.x | 1.x |
| MMYOLO | | 0.x |
Attention: please create a new virtual environment for OpenMMLab 2.0.
Happy Dragon Boat Festival!
I see that the PointPillars in the mmdet3d version you use is a two-stage model. If I want to add your module to my original one-stage PointPillars, should the process be divided into three steps?
By the way, it would be great if you could share the relevant code implementation.
hi~ Zhichao Li /Feng Wang/ Naiyan Wang~
I am very interested in your research and am reproducing your work, but I do not have the hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car-9fa20624.pth model needed to extract the proposals. Could you provide me with that pretrained model and the LiDAR R-CNN model?
hi~ Zhichao Li /Feng Wang/ Naiyan Wang~
I am very interested in your work LiDAR R-CNN, but when I use the pretrained model you gave me, checkpoint_lidar_rcnn_59.pth.tar (MD5: 6416c502af3cb73f0c39dd0cabdee2cb), I find that the weights of the pretrained model are 9-dimensional, while your input data is 12-dimensional.
Can you provide a pretrained model whose dimensions match?
I found that one of your commits increased the dimension from 9 to 12, but the latest pretrained model is still 9-dimensional.
Thanks for your work!
I applied your work to the KITTI dataset. However, I found that it performs badly on the "aos" metric of KITTI.
According to the paper, when the predicted orientation is flipped by 180°, the orientation target is set to the smaller of the residuals computed with the original and the flipped orientation. This doesn't seem to fix the 180° flip error.
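The target described in the paper can be sketched as follows, assuming headings in radians and taking the residual with the smaller magnitude. `wrap` and `heading_target` are hypothetical helpers written for illustration, not the repository's functions.

```python
import math


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def heading_target(gt_heading, proposal_heading):
    """Smaller-magnitude residual w.r.t. the proposal or its 180-degree flip."""
    direct = wrap(gt_heading - proposal_heading)
    flipped = wrap(gt_heading - proposal_heading - math.pi)
    return direct if abs(direct) <= abs(flipped) else flipped


print(heading_target(0.1, 0.0))  # small error: regress it directly
print(heading_target(3.0, 0.0))  # nearly reversed proposal: small residual to the flip
```

In this sketch the target never exceeds 90° in magnitude, so a proposal that points backwards is refined towards the reversed heading rather than turned around. That would be consistent with an orientation-sensitive metric such as AOS staying low whenever the first stage makes flip errors.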
In this function
First, thanks a lot for sending me the checkpoints previously.
While following the instructions for your work, I would like to double-check the following points about the paths in the config files.
Do pc_path and gt_path in config/mmdet3d_pp_train.yaml and config/mmdet3d_pp_val.yaml both point to the decoded points (pc) and ground truth (gt) generated in the first step?
The data_path in mmdet3d_pp_train.yaml is kitti_results_train.bin, but the one in mmdet3d_pp_val.yaml (kitti_results.bin) was not generated when producing the PointPillars proposals.
Thank you so much.
Hello, I wonder what 'num_lidar_points_in_box' is. Does the Waymo dataset have this key?
Hi,
I am able to do all setup as per instructions given in README
In the evaluation step,
python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar
python tools/create_results.py --cfg config/lidar_rcnn.yaml
I am facing the following questions while running the evaluation.
Hello, as the title says, I wonder which processing steps are specific to WOD, that is, what I have to do differently to use LiDAR R-CNN on my own dataset. I have already changed the data_processor and everything I could think of in the loader and create_results that is tied to the Waymo dataset, and then used the refined results to run the evaluation on my own dataset. However, I get NaN for the rotation error, and the mAP is pretty low.
So I am confused about which subtle steps are performed only for Waymo and not for other datasets. For example, is computing the heading residual necessary for LiDAR R-CNN? Do you use the rotation in some subtle way? (In my dataset the rotation is defined relative to the y axis, while in your code it is the x axis; the way rotZ is computed is the same, and I have already changed it.)
This bug has been driving me crazy, which is why my description above is a bit messy; please forgive me. I would be grateful for any hints. Thank you a lot. Save this almost desperate kid, please.🥺
Hi guys, this is good work. I'm curious about the hyperparameters: which of them are sensitive to the dataset? When I apply this method to my own data, the results are not very good. Should I adjust some hyperparameters to get better performance?
I cannot find the pretrained model for extracting proposals or the checkpoints for testing LiDAR R-CNN.
Thank you for your great work!
I am quite interested in your work. I want to know how to run the code on the KITTI dataset; it seems that the released code doesn't support KITTI. Looking forward to your reply.
Hi, I am very interested in your paper, and I am reproducing it. Could you please provide the pretrained model of pointpillar in mmdet3d? Thank you very much!
hi~ Zhichao Li /Feng Wang/ Naiyan Wang~
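I am very interested in your work LiDAR R-CNN, but when using the pretrained model you gave me, checkpoint_lidar_rcnn_59.pth.tar (MD5: 6416c502af3cb73f0c39dd0cabdee2cb), I found that the pretrained weights are 9-dimensional, while your input data is 12-dimensional.
Could you provide me with a pretrained model whose dimensions match?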
When will the code be released?
Hi developers of LiDAR R-CNN,
Congrats on the acceptance of the paper!
LiDAR R-CNN achieves new state-of-the-art results through simple yet effective improvement, which is very insightful to the community. We also found that the baseline is based on the implementations in MMDetection3D.
Therefore, I am coming to ask, as we believe LiDAR R-CNN might have a great impact on the community, would you like to also contribute an implementation of LiDAR R-CNN to MMDetection3D?
If so, maybe we could have a more detailed discussion about that? MMDetection3D welcomes any kind of contribution. Please feel free to ask if there is anything from the MMDet3D team that could help.
On behalf of the MMDet3D Development Team
BR,
Wenwei
Thanks for your awesome work. When I use LiDAR R-CNN on my own dataset, the refined score is useless: most objects are classified as background. In addition, the average center error is reduced by only 1 cm after refinement. Is this normal?
Thank you for your great work! I have a question about the loss function. In the to_pcdet function, the bbox format of the dataset seems to be [xyz_lidar, l, w, h, rz]. This looks the same as the OpenPCDet format. Why does it need to be converted to the pcdet format again?
When I transferred it to CenterPoint and the nuScenes dataset and then evaluated on nuScenes, it didn't seem to work. I don't know what went wrong. Looking forward to your suggestions and comments.
Nice work! Could you please tell me how fast your model runs on Waymo data, and on which device?
Thanks!