
lidar_rcnn's Issues

Clarification of the PointNet embedding channels

Hi, thank you for the great work. I am a bit confused about the embedding channels in PointNet. The paper says: "We shrink the number of embedding channels to [64, 64, 512] in PointNet to achieve fast inference speed while maintaining accuracy."

However, in the code, it seems to be [64, 128, 512]. Could you kindly clarify which setting the experiments in the paper were conducted under, and how much this channel difference (64 -> 128) affects the results? Thank you so much!
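
For reference, a minimal sketch of a shared PointNet embedding MLP with the [64, 64, 512] widths quoted from the paper; swapping the middle 64 for 128 gives the [64, 128, 512] variant seen in the code (module and argument names here are illustrative, not the repo's):

import torch.nn as nn

class PointNetEmbedding(nn.Module):
    """Shared per-point MLP followed by global max pooling."""

    def __init__(self, in_dim=12, channels=(64, 64, 512)):
        super().__init__()
        layers, prev = [], in_dim
        for c in channels:
            # A 1x1 Conv1d acts as a per-point fully connected layer.
            layers += [nn.Conv1d(prev, c, 1), nn.BatchNorm1d(c), nn.ReLU(inplace=True)]
            prev = c
        self.mlp = nn.Sequential(*layers)

    def forward(self, points):
        # points: (B, in_dim, N) -> per-point features (B, 512, N)
        feat = self.mlp(points)
        # Max over the N points gives a global (B, 512) descriptor.
        return feat.max(dim=2).values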

The pretrained model

Hello, I am very interested in your paper, and I am reproducing it. Could you please provide the pretrained model from Table 5, which reports 3D AP results on WOD with three classes trained in one model? My email is [email protected]. Thank you very much.

Training from scratch

Hi guys,
It's really nice work!
According to the paper and the code, the PointNet is trained using proposals generated by pre-trained detectors (such as PointPillars), so the training of the base detector and the PointNet is separate.
I wonder whether you have tried to train the base detector and the PointNet from scratch, that is, training the detector and the PointNet jointly instead of using a pre-trained base detector to generate proposals.
If we train the network this way, would the results be better, worse, or about the same?

Is ego pose information essential for using LiDAR R-CNN?

Hello, really impressive work! Currently, I've been working on using your model to refine CenterPoint-PP results on another dataset (not a public one). I notice that when transforming point cloud points into the proposal coordinate system, you use pose information (screenshot of the pose-transformation code in the original issue).
However, the dataset I use has no precise pose information, but I think the proposals and the point cloud of my model are already in the same coordinate system. In this case, I wonder whether I can still use LiDAR R-CNN or should resort to other second-stage models. Could you give me some advice? Thanks.
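
For context, a minimal sketch of the kind of global-to-proposal-frame transformation in question (names and conventions are illustrative, not the repo's code); if the points and proposals already share one frame, the inverse-pose step would simply be skipped:

import numpy as np

def global_to_proposal(points, pose, proposal):
    """Map points from the world frame into a proposal-centric frame.

    points:   (N, 3) xyz in the world frame
    pose:     (4, 4) ego pose, vehicle frame -> world frame
    proposal: (7,) array [cx, cy, cz, l, w, h, yaw] in the vehicle frame
    """
    # World -> vehicle frame via the inverse ego pose.
    homog = np.c_[points, np.ones(len(points))]
    pts = (np.linalg.inv(pose) @ homog.T).T[:, :3]
    # Vehicle -> proposal frame: recenter on the box, then undo its yaw.
    pts = pts - proposal[:3]
    c, s = np.cos(-proposal[6]), np.sin(-proposal[6])
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return pts @ rot.T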

Question about virtual points

Thanks for your nice work!
The paper shows different pooling methods; adding virtual points in the proposal box improves the model's results, but I cannot find the corresponding code in your project.
(figure from the paper comparing the pooling methods)
Can you tell me where the code for the 'virtual points' is? Waiting for your reply.
Thanks.
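
If it helps, here is one way to read the paper's "virtual points" idea, sketched as a regular grid of extra points spanning the proposal box and appended to the real points (purely illustrative, not the authors' implementation):

import numpy as np

def add_virtual_points(real_points, proposal, grid=4):
    """Append a regular grid of virtual points covering the proposal box.

    real_points: (N, 3) points already in the proposal-centric frame
    proposal:    (7,) [cx, cy, cz, l, w, h, yaw]; only l, w, h are used
    """
    l, w, h = proposal[3], proposal[4], proposal[5]
    xs = np.linspace(-l / 2, l / 2, grid)
    ys = np.linspace(-w / 2, w / 2, grid)
    zs = np.linspace(-h / 2, h / 2, grid)
    # grid**3 points evenly filling the box volume, centered at the origin.
    virtual = np.stack(np.meshgrid(xs, ys, zs), axis=-1).reshape(-1, 3)
    return np.vstack([real_points, virtual])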

About the data in one training iteration

Hi! Sorry to bother you again!
Is the following right? The proposals of all frames are extracted in one pass and then shuffled globally, which means that when LiDAR R-CNN trains one batch, the batch contains different boxes from different frames. With a batch size of 256, the extreme case may span up to 256 frames, with one box taken from each frame.
Below is my idea: if I train two frames at a time, extract proposals through the frozen one-stage network, and then train LiDAR R-CNN end-to-end, is that okay? Do you have any idea about how to design the ROI sampler ratio?
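
If it helps, here is a minimal sketch of the globally shuffled per-proposal sampling described above (class and variable names are made up for illustration):

import random

class ProposalDataset:
    """Flatten (frame, proposal) pairs so one batch can mix many frames."""

    def __init__(self, frames):
        # frames: mapping frame_id -> list of proposals from the one-stage model
        self.items = [(fid, p) for fid, props in frames.items() for p in props]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]

frames = {"frame_a": ["p0", "p1"], "frame_b": ["p2"], "frame_c": ["p3", "p4"]}
dataset = ProposalDataset(frames)
batch = [dataset[i] for i in random.sample(range(len(dataset)), k=3)]
# With a batch size of 256 the same scheme can span up to 256 frames.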

Exceeds maximum protobuf size when preparing training data

Hi, when I prepare the training data following README.md, I get an error and cannot generate the kitti_results_train.bin data. The error is as follows:
Message waymo.open_dataset.Objects exceeds maximum protobuf size of 2GB: 2375722334

I tried to debug this error and found it is caused by the combine operation: the training split has so many *.bin files that combining all of them exceeds the 2 GB limit. I want to know how to deal with this problem; I tried splitting the training data into two .bin files, but could not concatenate them in the right format.
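
One possible workaround, sketched under the assumption that the combined file is a waymo_open_dataset metrics_pb2.Objects proto (shard size and paths are illustrative): write several smaller shards instead of one combined file, and have the consumer parse each shard separately, since parsing any single file over 2 GB hits the same protobuf limit.

from waymo_open_dataset.protos import metrics_pb2

def write_sharded(all_objects, prefix, per_shard=500000):
    """Split a large list of Object protos across several .bin shards."""
    for start in range(0, len(all_objects), per_shard):
        shard = metrics_pb2.Objects()
        shard.objects.extend(all_objects[start:start + per_shard])
        # Each shard stays well under the 2 GB protobuf message limit.
        with open(f"{prefix}_{start // per_shard}.bin", "wb") as f:
            f.write(shard.SerializeToString())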

Thanks

How is the PP model trained

The model file checkpoints/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car-9fa20624.pth in the docs cannot be found in the mmdet3d official repo (they only have the interval-5 pretrained models). Are the proposals extracted with interval-1 models (3d-car and 3d-3class)? If I want to reproduce your results, do I need to first train with these two configs? Thanks.

Version of mmdet3d used in the project

Hi, thank you for your great work. I tried to generate proposal data for the Waymo dataset using the command here for the training split. I modified the code as instructed in the doc, but there were lots of prompts like this:

79193 not found.
80014 not found.
79194 not found.
80015 not found.
79195 not found.
80016 not found.
79196 not found.
80017 not found.

Is this normal? Or might this be due to some code changes in upstream mmdet3d in the latest versions? Could you provide the commit hash of the mmdet3d code base you are using? Thanks!

The number of boxes in matching_gt_bbox is more than that of valid_gt?

Hello, sorry to come back with another question...
Recently, I've been working on using LiDAR R-CNN to refine the results of a CenterPoint-PP model on my own dataset. During data processing, I noticed that my CenterPoint-PP model detects more bboxes than the ground truth contains (false-detection cases). When the get_matching_by_iou function in LiDAR R-CNN runs, the resulting matching_gt_bbox has the same number of bboxes as the model predictions rather than the ground-truth data. I'm a bit confused by this. Since we are trying to do refinement, shouldn't we remove the falsely detected bboxes from the results and keep only those matching the ground truth? If not, why does the matching follow the predictions instead of the ground truth?
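
For context, a minimal sketch of prediction-driven matching as I understand it (names are illustrative, not the repo's code): every prediction keeps a row, with its best-overlapping GT as the regression target and a background label when the overlap is low, which is why matching_gt_bbox has as many boxes as there are predictions.

import numpy as np

def match_predictions_to_gt(ious, bg_thresh=0.5):
    """ious: (num_pred, num_gt) pairwise IoU matrix.

    Returns the best GT index for every prediction and a label:
    1 = foreground (refine toward the matched GT), 0 = background.
    False positives are kept and supervised as background rather
    than being dropped from the training set.
    """
    best_gt = ious.argmax(axis=1)
    best_iou = ious[np.arange(ious.shape[0]), best_gt]
    labels = (best_iou >= bg_thresh).astype(np.int64)
    return best_gt, labels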


Maybe I have some misunderstandings here; it would be a great help if you could give me some hints. Thanks in advance.

expand_proposal_meter

Hi there, I noticed that each 3D proposal's width and length are enlarged to contain more contextual points around it. Is this controlled by the "expand_proposal_meter" parameter?

If so, I'm wondering why it's set to 3 meters (so large). Wouldn't it include nearby points from another car if cars are parked close together, and confuse the network? And did you do ablation studies on this parameter?

Thanks a lot!

About extract points

Hi,
Great work! I'd like to ask: for getting the points in proposals, why do you only consider BEV (or am I wrong)? Could I consider the points in the 3D box instead, and would that improve results?

// Excerpt from the repo's pybind11 extension. MatrixXb is a boolean Eigen
// matrix; the mask keeps points whose BEV footprint lies inside the proposal
// box, expanded by `expand` meters. bbox appears to be (cx, cy, l, w, yaw);
// note that z is never checked -- the height test is commented out below.
MatrixXb extract_points(const py::EigenDRef<Eigen::MatrixXf> pc,
                        const py::EigenDRef<Eigen::VectorXf> bbox,
                        float expand, bool canonic) {
  int pc_num = pc.rows();
  float yaw = bbox(4);
  float cos_yaw = std::cos(yaw);
  float sin_yaw = std::sin(yaw);

  MatrixXb valid_mask(pc_num, 1);
  for (int i = 0; i < pc_num; i++) {
    // Rotate the point into the box frame (inverse yaw about the box center).
    float r_x = (pc(i, 0) - bbox(0)) * cos_yaw + (pc(i, 1) - bbox(1)) * sin_yaw;
    float r_y = (pc(i, 0) - bbox(0)) * (-sin_yaw) + (pc(i, 1) - bbox(1)) * cos_yaw;
    // && (pc(i, 2) < 2.0f)
    if ((std::abs(r_x) < bbox(2) / 2 + expand / 2) &&
        (std::abs(r_y) < bbox(3) / 2 + expand / 2)) {
      valid_mask(i, 0) = true;
    } else {
      valid_mask(i, 0) = false;
    }
  }
  return valid_mask;
}

The pretrained model

Hi, I am very interested in your paper, and I am reproducing it. The pretrained PointPillars model provided in mmdetection3d does not reach the performance shown in Table 2, so could you please provide the pretrained PointPillars model used in Table 2? Thank you very much!


Link for tf records

Hi,
I am trying to generate data as described in the data processor README. Could you please share the link to get the tf records mentioned here? Is it referring to perception dataset v1.2 or to tf records shared via mail?

pc_url_ri2

What is pc_url_ri2, and is it different from pc_url?
I see pcd = np.vstack([pcd, pcd_ri2]) in the load_data function:

from collections import defaultdict

import numpy as np
import pickle as pkl


def get_proposal_dict(data, pc_path):
    """Group predicted objects by (context, timestamp) and record the paths
    of the point clouds for the first and second lidar returns."""
    outputs_dict = {}
    for o in data.objects:
        output = [
            o.object.box.center_x, o.object.box.center_y, o.object.box.center_z,
            o.object.box.length, o.object.box.width, o.object.box.height,
            o.object.box.heading, o.score, o.object.type
        ]
        key = "{}/{}".format(o.context_name, o.frame_timestamp_micros)
        if key not in outputs_dict:
            outputs_dict[key] = defaultdict(list)
        outputs_dict[key]['pred_lst'].append(output)
        outputs_dict[key]['pc_url'] = '{}/segment-{}_with_camera_labels/{}_1.npz'.format(
            pc_path, o.context_name, o.frame_timestamp_micros)
        outputs_dict[key]['pc_url_ri2'] = '{}/segment-{}_with_camera_labels/{}_2.npz'.format(
            pc_path, o.context_name, o.frame_timestamp_micros)
    return outputs_dict


def load_data(it):
    """Unpack one pickled sample and stack both lidar returns into one cloud."""
    pcd, pcd_ri2, proposal, gt_box, gt_cls = pkl.loads(it['data'])
    pcd = pcd.astype(np.single)[:, [0, 1, 2]]
    pcd_ri2 = pcd_ri2.astype(np.single)[:, [0, 1, 2]]
    pcd = np.vstack([pcd, pcd_ri2])
    proposal = proposal.astype(np.single)[:7]
    gt_box = gt_box.astype(np.single)[:7]
    return pcd, proposal, gt_box, gt_cls

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

Project           OpenMMLab 1.0 branch   OpenMMLab 2.0 branch
MMEngine          -                      0.x
MMCV              1.x                    2.x
MMDetection       0.x, 1.x, 2.x          3.x
MMAction2         0.x                    1.x
MMClassification  0.x                    1.x
MMSegmentation    0.x                    1.x
MMDetection3D     0.x                    1.x
MMEditing         0.x                    1.x
MMPose            0.x                    1.x
MMDeploy          0.x                    1.x
MMTracking        0.x                    1.x
MMOCR             0.x                    1.x
MMRazor           0.x                    1.x
MMSelfSup         0.x                    1.x
MMRotate          1.x                    1.x
MMYOLO            -                      0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

How to add boundary offset to the original one-stage pointpillars?

Happy Dragon Boat Festival!
I see that the PointPillars in the mm3d you use is part of a two-stage model, so if I want to add your module to my original one-stage PointPillars, should it be divided into the three steps below (see the sketch after the list)?

  1. First, take the top n boxes (for example, 500) with the highest predicted probability from PointPillars as proposals.
  2. Second, use the boundary offset you proposed to augment the feature dimensions of the points in each proposal.
  3. Finally, refine the predicted box (or the original anchor?).

By the way, it would be better if you have the relevant code implementation.
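
For what it's worth, a minimal sketch of the boundary-offset augmentation from step 2, following my reading of the paper (not the authors' code): each point, already in the proposal's canonical frame, is extended with its signed offsets to the six faces of the proposal box, which removes the size ambiguity of a plain point cloud:

import numpy as np

def boundary_offset_features(points, proposal):
    """points: (N, 3) in the proposal frame; proposal: [cx, cy, cz, l, w, h, yaw].

    Returns (N, 9): xyz plus offsets to the +x/+y/+z and -x/-y/-z box faces.
    """
    half = np.array(proposal[3:6]) / 2.0
    to_max = half - points   # distance to the far faces
    to_min = points + half   # distance to the near faces
    return np.hstack([points, to_max, to_min])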

The pretrained model

Hi Zhichao Li / Feng Wang / Naiyan Wang,

I am very interested in your research work and am reproducing it, but I do not have the hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car-9fa20624.pth model to extract the proposals. Could you provide me with that pre-trained model as well as the LiDAR R-CNN model?

checkpoint shape error

Hi Zhichao Li / Feng Wang / Naiyan Wang,

I am very interested in your work LiDAR R-CNN, but when I use the LiDAR R-CNN pretrained model you gave me, checkpoint_lidar_rcnn_59.pth.tar (MD5: 6416c502af3cb73f0c39dd0cabdee2cb), I found that the weights of the pretrained model are 9-dimensional, while your input data is 12-dimensional.

Can you provide me a pretrained model whose dimensions match correctly?

I found that in one of your commits the input dimension was increased from 9 to 12, but the latest pre-trained model is still 9-dimensional.

About the AOS metric

Thanks for your work!
I applied your work to the KITTI dataset. However, I found that the "AOS" metric on KITTI performs badly.
In the paper, when the predicted orientation is flipped by 180°, the orientation target is set to the minimum residual between the original orientation and the flipped orientation. This doesn't seem to fix the 180° flipped-orientation error.
(The original issue links to the relevant function here.)
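
For context, a minimal sketch of the minimum-residual heading target as I read the paper (not the repo's actual code). A proposal flipped by 180° is refined toward the flipped GT heading rather than rotated all the way around, which optimizes box overlap but leaves the facing direction, and hence AOS, unresolved:

import math

def heading_residual(proposal_yaw, gt_yaw):
    """Return the smaller of the direct and 180-degree-flipped residuals."""
    direct = (gt_yaw - proposal_yaw + math.pi) % (2 * math.pi) - math.pi
    flipped = (direct + math.pi) % (2 * math.pi) - math.pi
    return direct if abs(direct) < abs(flipped) else flipped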

path of data_processer config files

First, thanks a lot for sending me the checkpoints previously.

Following the instructions of your work, I would like to double-check the following points related to the paths in the config files.

  1. Do pc_path and gt_path in config/mmdet3d_pp_train.yaml and config/mmdet3d_pp_val.yaml share the same paths as the decoded points (pc) and ground truth (gt) generated in the first step?

  2. The data_path in mmdet3d_pp_train.yaml is kitti_results_train.bin, but the one in mmdet3d_pp_val.yaml (kitti_results.bin) was not generated in the proposal-generation step.

Thank you so much.

Run inference on single GPU

Hi,
I was able to complete the setup as per the instructions given in the README.
In the evaluation step, I run:

python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar
python tools/create_results.py --cfg config/lidar_rcnn.yaml

I am facing the following questions while running the evaluation (see the note after this list).

  1. How do I change the command to run on a single GPU? Does nproc_per_node need to be 1?
  2. What should MODEL.Frame be for checkpoint_lidar_rcnn_59.pth.tar?

Since I am trying to understand the evaluation, kindly help me fix this.
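
On item 1, for what it's worth: torch.distributed.launch accepts nproc_per_node=1 to spawn a single process, so the single-GPU variant of the quoted command would presumably be (a guess from the launcher's general semantics, not verified against this repo):

python -m torch.distributed.launch --nproc_per_node=1 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar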

What processes in LiDAR R-CNN are specific to the Waymo dataset?

Hello, just like the title says: I wonder what the processes specific to WOD are, i.e. what I would have to do differently to use LiDAR R-CNN on my own dataset. I already changed the data_processor and everything I could think of in the loader and create_results that is specific to the Waymo dataset, then used the refined results to run evaluation on my own dataset. However, I got NaN on the rotation error, and the mAP is pretty low.

Therefore, I'm confused about subtle processes that are performed just for Waymo and not for other datasets. For example, is computing the heading residual necessary for using LiDAR R-CNN? Did you use rotation in some subtle way? (In my dataset, the rotation is about the y axis, while in your code it's the z axis, but the way of computing rotZ is the same; I already changed it.)

This bug has been driving me crazy, which is why my issue description above is a bit messy; forgive me please. I would be grateful if you could provide me some hints. Thank you a lot. Save this almost desperate kid, please.🥺

About hyperparameter

Hi guys, this is good work. I'm curious about the hyperparameters in this work: which hyperparameters are sensitive to the dataset? When I apply this method to my own data, the results are not very good; should I adjust some hyperparameters to get better performance?

How to test in KITTI dataset?

Thank you for your great work!
I am quite interested in it. I want to know how to run the code on the KITTI dataset; it seems that the code you released doesn't support KITTI. Looking forward to your reply.

Pretrained model

Hi, I am very interested in your paper, and I am reproducing it. Could you please provide the pretrained PointPillars model from mmdet3d? Thank you very much!

checkpoint shape error

Hi Zhichao Li / Feng Wang / Naiyan Wang,
I am very interested in your work LiDAR R-CNN, but when I used the LiDAR R-CNN pretrained model you gave me, checkpoint_lidar_rcnn_59.pth.tar (MD5: 6416c502af3cb73f0c39dd0cabdee2cb), I found that the weights of the pretrained model are 9-dimensional, while the input data is 12-dimensional.
Could you provide me with a pretrained model whose dimensions match correctly?

Collaboration with MMDetection3D

Hi developers of LiDAR R-CNN,

Congrats on the acceptance of the paper!

LiDAR R-CNN achieves new state-of-the-art results through simple yet effective improvement, which is very insightful to the community. We also found that the baseline is based on the implementations in MMDetection3D.

Therefore, I am coming to ask: as we believe LiDAR R-CNN might have a great impact on the community, would you like to contribute an implementation of LiDAR R-CNN to MMDetection3D?
If so, maybe we could have a more detailed discussion about it? MMDetection3D welcomes any kind of contribution; please feel free to ask if there is anything the MMDet3D team could do to help.

On behalf of the MMDet3D Development Team

BR,

Wenwei

The cls scores are useless on my own dataset

Thanks for your awesome work. When I use LiDAR R-CNN on my own dataset, the refined score is useless: most objects are classified as background. In addition, the average refined center error is only reduced by 1 cm. Is this normal?

to_pcdet function in loss function

Thank you for your great work! I have a question about the loss function. In this to_pcdet function, it seems that the dataset's bbox format is [xyz_lidar, l, w, h, rz]. This format looks the same as OpenPCDet's, so why does it need to be converted to the pcdet format again?
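
For reference, a hedged sketch of what such a layout conversion usually does; if the incoming boxes are already [x, y, z, l, w, h, rz] as the issue suggests, the reorder below becomes an identity (the helper is illustrative, not the repo's to_pcdet):

import numpy as np

def to_pcdet_layout(boxes, order=(0, 1, 2, 3, 4, 5, 6)):
    """Reorder box columns into OpenPCDet's [x, y, z, l, w, h, rz] layout.

    `order` maps target columns to source columns. The identity order above
    is a no-op; a source stored as [x, y, z, w, l, h, rz] would instead use
    order=(0, 1, 2, 4, 3, 5, 6) to swap width and length.
    """
    return np.asarray(boxes)[:, list(order)]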

Inference time

Nice work! Could you please tell us how fast your model runs on the Waymo data, and on which device?
Thanks!
