sshaoshuai / pointrcnn

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, CVPR 2019.

License: MIT License

Shell 0.10% Python 88.39% C++ 4.26% Cuda 7.25%

pointrcnn's Issues

About the results

Hi Shaoshuai,

Thanks for your great paper and implementation ~ I have tried your provided checkpoint and it reproduced exactly the same results as what is reported in README.md.

However, when I went through the training process provided in README.md according to your instructions, I got the final results as follows:

bbox AP:97.4753, 89.2702, 88.8074
bev  AP:89.1184, 87.2523, 86.3265
3d   AP:86.1782, 77.1745, 76.6546
aos  AP:97.46, 89.10, 88.54

I got slightly better results (in terms of 3D AP and BEV) if I used RCNN trained at epoch 23 instead of 30 and RPN trained at epoch 200:

bbox AP:97.3265, 89.3636, 88.8258
bev  AP:89.6408, 87.2576, 85.9677
3d   AP:87.3695, 77.6297, 76.8561
aos  AP:97.31, 89.22, 88.59

Neither result matches the pretrained model.

I noticed that the major difference between the model in the provided training pipeline and the official testing pipeline is that the official testing pipeline uses RPN.LOC_XZ_FINE=False. Should I do so if I want to train my model from scratch? Besides, I noticed in the paper that the RCNN stage is trained for 50 epochs. However, in the provided instructions, the epoch number is 30.

Thanks a lot for your reply~

Best,
Ken

fatal error: cuda.h: No such file or directory

Thank you for your reply. I have one question: when I run 'sh build_and_install.sh' to build and install the pointnet2_lib, iou3d, and roipool3d libraries, I always get "fatal error: cuda.h: No such file or directory", and I do not know why.

This is the build output:

building 'iou3d_cuda' extension
gcc -pthread -B /home/user/anaconda/envs/qq/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrrototypes -fPIC -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include -I/home/user/anaconda/envs/qq/lib/n3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lilude/TH -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include/THC -I/usr/local/cuda:/usr/local/cuda-8.0/de -I/home/user/anaconda/envs/qq/include/python3.7m -c src/iou3d.cpp -o build/temp.linux-x86_64-3.7/src/iou3d.o -g -DTORCH_APIUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=iou3d_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/iou3d.cpp:4:18: fatal error: cuda.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
running install
running bdist_egg
running egg_info
writing roipool3d.egg-info/PKG-INFO
writing dependency_links to roipool3d.egg-info/dependency_links.txt
writing top-level names to roipool3d.egg-info/top_level.txt
reading manifest file 'roipool3d.egg-info/SOURCES.txt'
writing manifest file 'roipool3d.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'roipool3d_cuda' extension
gcc -pthread -B /home/user/anaconda/envs/qq/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrrototypes -fPIC -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include -I/home/user/anaconda/envs/qq/lib/n3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lilude/TH -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include/THC -I/usr/local/cuda:/usr/local/cuda-8.0/de -I/home/user/anaconda/envs/qq/include/python3.7m -c src/roipool3d.cpp -o build/temp.linux-x86_64-3.7/src/roipool3d.o -g -DTAPI_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=roipool3d_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda:/usr/local/cuda-8.0/bin/nvcc -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include -I/hoer/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/home/user/anaconda/envs/qq/lib/pyt7/site-packages/torch/lib/include/TH -I/home/user/anaconda/envs/qq/lib/python3.7/site-packages/torch/lib/include/THC -I/usr/louda:/usr/local/cuda-8.0/include -I/home/user/anaconda/envs/qq/include/python3.7m -c src/roipool3d_kernel.cu -o build/temp.linu_64-3.7/src/roipool3d_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --com-options '-fPIC' -O2 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=roipool3d_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++
unable to execute '/usr/local/cuda:/usr/local/cuda-8.0/bin/nvcc': No such file or directory
error: command '/usr/local/cuda:/usr/local/cuda-8.0/bin/nvcc' failed with exit status 1
Could you help me? Thank you very much.
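A note on the log above: the include flag and the nvcc path both contain `/usr/local/cuda:/usr/local/cuda-8.0`, which suggests that the CUDA root variable (probably CUDA_HOME) holds two directories joined by a colon instead of one directory. The following is a minimal diagnostic sketch under that assumption; `check_cuda_home` is a hypothetical helper, not part of PointRCNN or PyTorch.

```python
import os
from pathlib import Path


def check_cuda_home(cuda_home):
    """Return a list of human-readable problems with a CUDA_HOME value."""
    if not cuda_home:
        return ["CUDA_HOME is empty or unset"]
    if os.pathsep in cuda_home:
        # A PATH-style list is not a valid install root.
        return ["CUDA_HOME holds several paths joined by %r; it must be one directory" % os.pathsep]
    root = Path(cuda_home)
    problems = []
    if not (root / "include" / "cuda.h").is_file():
        problems.append("missing header: %s" % (root / "include" / "cuda.h"))
    if not (root / "bin" / "nvcc").is_file():
        problems.append("missing compiler: %s" % (root / "bin" / "nvcc"))
    return problems


if __name__ == "__main__":
    found = check_cuda_home(os.environ.get("CUDA_HOME", ""))
    for line in found or ["CUDA_HOME looks usable"]:
        print(line)
```

If the check reports several joined paths, exporting a single directory (for example only the cuda-8.0 one) before rerunning the build script may resolve both errors.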

RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:441

Please provide a solution for this issue.

(pt) divya@divya-pc:~/virtual/pt/PointRCNN/tools$ python3 eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt PointRCNN.pth --batch_size 1 --eval_mode rcnn --set RPN.LOC_XZ_FINE False
/home/divya/virtual/pt/PointRCNN/tools/../lib/config.py:187: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
yaml_cfg = edict(yaml.load(f))
2019-05-28 15:28:03,034 INFO Start logging
2019-05-28 15:28:03,034 INFO cfg_file cfgs/default.yaml
2019-05-28 15:28:03,034 INFO eval_mode rcnn
2019-05-28 15:28:03,034 INFO eval_all False
2019-05-28 15:28:03,035 INFO test False
2019-05-28 15:28:03,035 INFO ckpt PointRCNN.pth
2019-05-28 15:28:03,035 INFO rpn_ckpt None
2019-05-28 15:28:03,035 INFO rcnn_ckpt None
2019-05-28 15:28:03,035 INFO batch_size 1
2019-05-28 15:28:03,035 INFO workers 4
2019-05-28 15:28:03,035 INFO extra_tag default
2019-05-28 15:28:03,035 INFO output_dir None
2019-05-28 15:28:03,035 INFO ckpt_dir None
2019-05-28 15:28:03,035 INFO save_result False
2019-05-28 15:28:03,035 INFO save_rpn_feature False
2019-05-28 15:28:03,035 INFO random_select True
2019-05-28 15:28:03,035 INFO start_epoch 0
2019-05-28 15:28:03,035 INFO rcnn_eval_roi_dir None
2019-05-28 15:28:03,035 INFO rcnn_eval_feature_dir None
2019-05-28 15:28:03,035 INFO set_cfgs ['RPN.LOC_XZ_FINE', 'False']
2019-05-28 15:28:03,035 INFO cfg.TAG: default
2019-05-28 15:28:03,035 INFO cfg.CLASSES: Car
2019-05-28 15:28:03,035 INFO cfg.INCLUDE_SIMILAR_TYPE: True
2019-05-28 15:28:03,035 INFO cfg.AUG_DATA: True
2019-05-28 15:28:03,035 INFO cfg.AUG_METHOD_LIST: ['rotation', 'scaling', 'flip']
2019-05-28 15:28:03,035 INFO cfg.AUG_METHOD_PROB: [1.0, 1.0, 0.5]
2019-05-28 15:28:03,035 INFO cfg.AUG_ROT_RANGE: 18
2019-05-28 15:28:03,035 INFO cfg.GT_AUG_ENABLED: True
2019-05-28 15:28:03,035 INFO cfg.GT_EXTRA_NUM: 15
2019-05-28 15:28:03,036 INFO cfg.GT_AUG_RAND_NUM: True
2019-05-28 15:28:03,036 INFO cfg.GT_AUG_APPLY_PROB: 1.0
2019-05-28 15:28:03,036 INFO cfg.GT_AUG_HARD_RATIO: 0.6
2019-05-28 15:28:03,036 INFO cfg.PC_REDUCE_BY_RANGE: True
2019-05-28 15:28:03,036 INFO cfg.PC_AREA_SCOPE: [[-40. 40. ]
[ -1. 3. ]
[ 0. 70.4]]
2019-05-28 15:28:03,036 INFO cfg.CLS_MEAN_SIZE: [[1.5256319 1.6285675 3.8831165]]
2019-05-28 15:28:03,036 INFO
cfg.RPN = edict()
2019-05-28 15:28:03,036 INFO cfg.RPN.ENABLED: True
2019-05-28 15:28:03,036 INFO cfg.RPN.FIXED: True
2019-05-28 15:28:03,036 INFO cfg.RPN.USE_INTENSITY: False
2019-05-28 15:28:03,036 INFO cfg.RPN.LOC_XZ_FINE: False
2019-05-28 15:28:03,036 INFO cfg.RPN.LOC_SCOPE: 3.0
2019-05-28 15:28:03,036 INFO cfg.RPN.LOC_BIN_SIZE: 0.5
2019-05-28 15:28:03,036 INFO cfg.RPN.NUM_HEAD_BIN: 12
2019-05-28 15:28:03,036 INFO cfg.RPN.BACKBONE: pointnet2_msg
2019-05-28 15:28:03,036 INFO cfg.RPN.USE_BN: True
2019-05-28 15:28:03,036 INFO cfg.RPN.NUM_POINTS: 16384
2019-05-28 15:28:03,036 INFO
cfg.RPN.SA_CONFIG = edict()
2019-05-28 15:28:03,036 INFO cfg.RPN.SA_CONFIG.NPOINTS: [4096, 1024, 256, 64]
2019-05-28 15:28:03,037 INFO cfg.RPN.SA_CONFIG.RADIUS: [[0.1, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 4.0]]
2019-05-28 15:28:03,037 INFO cfg.RPN.SA_CONFIG.NSAMPLE: [[16, 32], [16, 32], [16, 32], [16, 32]]
2019-05-28 15:28:03,037 INFO cfg.RPN.SA_CONFIG.MLPS: [[[16, 16, 32], [32, 32, 64]], [[64, 64, 128], [64, 96, 128]], [[128, 196, 256], [128, 196, 256]], [[256, 256, 512], [256, 384, 512]]]
2019-05-28 15:28:03,037 INFO cfg.RPN.FP_MLPS: [[128, 128], [256, 256], [512, 512], [512, 512]]
2019-05-28 15:28:03,037 INFO cfg.RPN.CLS_FC: [128]
2019-05-28 15:28:03,037 INFO cfg.RPN.REG_FC: [128]
2019-05-28 15:28:03,037 INFO cfg.RPN.DP_RATIO: 0.5
2019-05-28 15:28:03,037 INFO cfg.RPN.LOSS_CLS: SigmoidFocalLoss
2019-05-28 15:28:03,037 INFO cfg.RPN.FG_WEIGHT: 15
2019-05-28 15:28:03,037 INFO cfg.RPN.FOCAL_ALPHA: [0.25, 0.75]
2019-05-28 15:28:03,037 INFO cfg.RPN.FOCAL_GAMMA: 2.0
2019-05-28 15:28:03,037 INFO cfg.RPN.REG_LOSS_WEIGHT: [1.0, 1.0, 1.0, 1.0]
2019-05-28 15:28:03,037 INFO cfg.RPN.LOSS_WEIGHT: [1.0, 1.0]
2019-05-28 15:28:03,037 INFO cfg.RPN.NMS_TYPE: normal
2019-05-28 15:28:03,037 INFO cfg.RPN.SCORE_THRESH: 0.3
2019-05-28 15:28:03,037 INFO
cfg.RCNN = edict()
2019-05-28 15:28:03,037 INFO cfg.RCNN.ENABLED: True
2019-05-28 15:28:03,037 INFO cfg.RCNN.USE_RPN_FEATURES: True
2019-05-28 15:28:03,037 INFO cfg.RCNN.USE_MASK: True
2019-05-28 15:28:03,037 INFO cfg.RCNN.MASK_TYPE: seg
2019-05-28 15:28:03,037 INFO cfg.RCNN.USE_INTENSITY: False
2019-05-28 15:28:03,037 INFO cfg.RCNN.USE_DEPTH: True
2019-05-28 15:28:03,037 INFO cfg.RCNN.USE_SEG_SCORE: False
2019-05-28 15:28:03,037 INFO cfg.RCNN.ROI_SAMPLE_JIT: True
2019-05-28 15:28:03,037 INFO cfg.RCNN.ROI_FG_AUG_TIMES: 10
2019-05-28 15:28:03,037 INFO cfg.RCNN.REG_AUG_METHOD: multiple
2019-05-28 15:28:03,037 INFO cfg.RCNN.POOL_EXTRA_WIDTH: 1.0
2019-05-28 15:28:03,037 INFO cfg.RCNN.LOC_SCOPE: 1.5
2019-05-28 15:28:03,037 INFO cfg.RCNN.LOC_BIN_SIZE: 0.5
2019-05-28 15:28:03,038 INFO cfg.RCNN.NUM_HEAD_BIN: 9
2019-05-28 15:28:03,038 INFO cfg.RCNN.LOC_Y_BY_BIN: False
2019-05-28 15:28:03,038 INFO cfg.RCNN.LOC_Y_SCOPE: 0.5
2019-05-28 15:28:03,038 INFO cfg.RCNN.LOC_Y_BIN_SIZE: 0.25
2019-05-28 15:28:03,038 INFO cfg.RCNN.SIZE_RES_ON_ROI: False
2019-05-28 15:28:03,038 INFO cfg.RCNN.USE_BN: False
2019-05-28 15:28:03,038 INFO cfg.RCNN.DP_RATIO: 0.0
2019-05-28 15:28:03,038 INFO cfg.RCNN.BACKBONE: pointnet
2019-05-28 15:28:03,038 INFO cfg.RCNN.XYZ_UP_LAYER: [128, 128]
2019-05-28 15:28:03,038 INFO cfg.RCNN.NUM_POINTS: 512
2019-05-28 15:28:03,038 INFO
cfg.RCNN.SA_CONFIG = edict()
2019-05-28 15:28:03,038 INFO cfg.RCNN.SA_CONFIG.NPOINTS: [128, 32, -1]
2019-05-28 15:28:03,038 INFO cfg.RCNN.SA_CONFIG.RADIUS: [0.2, 0.4, 100]
2019-05-28 15:28:03,038 INFO cfg.RCNN.SA_CONFIG.NSAMPLE: [64, 64, 64]
2019-05-28 15:28:03,038 INFO cfg.RCNN.SA_CONFIG.MLPS: [[128, 128, 128], [128, 128, 256], [256, 256, 512]]
2019-05-28 15:28:03,038 INFO cfg.RCNN.CLS_FC: [256, 256]
2019-05-28 15:28:03,038 INFO cfg.RCNN.REG_FC: [256, 256]
2019-05-28 15:28:03,038 INFO cfg.RCNN.LOSS_CLS: BinaryCrossEntropy
2019-05-28 15:28:03,038 INFO cfg.RCNN.FOCAL_ALPHA: [0.25, 0.75]
2019-05-28 15:28:03,038 INFO cfg.RCNN.FOCAL_GAMMA: 2.0
2019-05-28 15:28:03,038 INFO cfg.RCNN.CLS_WEIGHT: [1. 1. 1.]
2019-05-28 15:28:03,038 INFO cfg.RCNN.CLS_FG_THRESH: 0.6
2019-05-28 15:28:03,038 INFO cfg.RCNN.CLS_BG_THRESH: 0.45
2019-05-28 15:28:03,038 INFO cfg.RCNN.CLS_BG_THRESH_LO: 0.05
2019-05-28 15:28:03,039 INFO cfg.RCNN.REG_FG_THRESH: 0.55
2019-05-28 15:28:03,039 INFO cfg.RCNN.FG_RATIO: 0.5
2019-05-28 15:28:03,039 INFO cfg.RCNN.ROI_PER_IMAGE: 64
2019-05-28 15:28:03,039 INFO cfg.RCNN.HARD_BG_RATIO: 0.8
2019-05-28 15:28:03,039 INFO cfg.RCNN.SCORE_THRESH: 0.3
2019-05-28 15:28:03,039 INFO cfg.RCNN.NMS_THRESH: 0.1
2019-05-28 15:28:03,039 INFO
cfg.TRAIN = edict()
2019-05-28 15:28:03,039 INFO cfg.TRAIN.SPLIT: train
2019-05-28 15:28:03,039 INFO cfg.TRAIN.VAL_SPLIT: smallval
2019-05-28 15:28:03,039 INFO cfg.TRAIN.LR: 0.002
2019-05-28 15:28:03,039 INFO cfg.TRAIN.LR_CLIP: 1e-05
2019-05-28 15:28:03,039 INFO cfg.TRAIN.LR_DECAY: 0.5
2019-05-28 15:28:03,039 INFO cfg.TRAIN.DECAY_STEP_LIST: [100, 150, 180, 200]
2019-05-28 15:28:03,039 INFO cfg.TRAIN.LR_WARMUP: True
2019-05-28 15:28:03,039 INFO cfg.TRAIN.WARMUP_MIN: 0.0002
2019-05-28 15:28:03,039 INFO cfg.TRAIN.WARMUP_EPOCH: 1
2019-05-28 15:28:03,039 INFO cfg.TRAIN.BN_MOMENTUM: 0.1
2019-05-28 15:28:03,039 INFO cfg.TRAIN.BN_DECAY: 0.5
2019-05-28 15:28:03,039 INFO cfg.TRAIN.BNM_CLIP: 0.01
2019-05-28 15:28:03,039 INFO cfg.TRAIN.BN_DECAY_STEP_LIST: [1000]
2019-05-28 15:28:03,039 INFO cfg.TRAIN.OPTIMIZER: adam_onecycle
2019-05-28 15:28:03,039 INFO cfg.TRAIN.WEIGHT_DECAY: 0.001
2019-05-28 15:28:03,039 INFO cfg.TRAIN.MOMENTUM: 0.9
2019-05-28 15:28:03,039 INFO cfg.TRAIN.MOMS: [0.95, 0.85]
2019-05-28 15:28:03,039 INFO cfg.TRAIN.DIV_FACTOR: 10.0
2019-05-28 15:28:03,039 INFO cfg.TRAIN.PCT_START: 0.4
2019-05-28 15:28:03,039 INFO cfg.TRAIN.GRAD_NORM_CLIP: 1.0
2019-05-28 15:28:03,039 INFO cfg.TRAIN.RPN_PRE_NMS_TOP_N: 9000
2019-05-28 15:28:03,039 INFO cfg.TRAIN.RPN_POST_NMS_TOP_N: 512
2019-05-28 15:28:03,040 INFO cfg.TRAIN.RPN_NMS_THRESH: 0.85
2019-05-28 15:28:03,040 INFO cfg.TRAIN.RPN_DISTANCE_BASED_PROPOSE: True
2019-05-28 15:28:03,040 INFO
cfg.TEST = edict()
2019-05-28 15:28:03,040 INFO cfg.TEST.SPLIT: val
2019-05-28 15:28:03,040 INFO cfg.TEST.RPN_PRE_NMS_TOP_N: 9000
2019-05-28 15:28:03,040 INFO cfg.TEST.RPN_POST_NMS_TOP_N: 100
2019-05-28 15:28:03,040 INFO cfg.TEST.RPN_NMS_THRESH: 0.8
2019-05-28 15:28:03,040 INFO cfg.TEST.RPN_DISTANCE_BASED_PROPOSE: True
2019-05-28 15:28:03,041 INFO Load testing samples from ../data/KITTI/object/training
2019-05-28 15:28:03,041 INFO Done: total test samples 3769
2019-05-28 15:28:05,774 INFO ==> Loading from checkpoint 'PointRCNN.pth'
2019-05-28 15:28:05,802 INFO ==> Done
2019-05-28 15:28:05,803 INFO ---- EPOCH no_number JOINT EVALUATION ----
2019-05-28 15:28:05,803 INFO ==> Output file: ../output/rcnn/default/eval/epoch_no_number/val
eval: 0%|▏ | 11/3769 [00:12<1:05:31, 1.05s/it, mode=EVAL, recall=20/25]Traceback (most recent call last):
File "eval_rcnn.py", line 902, in <module>
eval_single_ckpt(root_result_dir)
File "eval_rcnn.py", line 765, in eval_single_ckpt
eval_one_epoch(model, test_loader, epoch_id, root_result_dir, logger)
File "eval_rcnn.py", line 692, in eval_one_epoch
ret_dict = eval_one_epoch_joint(model, dataloader, epoch_id, result_dir, logger)
File "eval_rcnn.py", line 486, in eval_one_epoch_joint
for data in dataloader:
File "/home/divya/virtual/pt/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
return self._process_next_batch(batch)
File "/home/divya/virtual/pt/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
AssertionError: Traceback (most recent call last):
File "/home/divya/virtual/pt/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/divya/virtual/pt/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/divya/virtual/pt/PointRCNN/tools/../lib/datasets/kitti_rcnn_dataset.py", line 234, in __getitem__
return self.get_rpn_sample(index)
File "/home/divya/virtual/pt/PointRCNN/tools/../lib/datasets/kitti_rcnn_dataset.py", line 251, in get_rpn_sample
img_shape = self.get_image_shape(sample_id)
File "/home/divya/virtual/pt/PointRCNN/tools/../lib/datasets/kitti_rcnn_dataset.py", line 130, in get_image_shape
return super().get_image_shape(idx % 10000)
File "/home/divya/virtual/pt/PointRCNN/tools/../lib/datasets/kitti_dataset.py", line 35, in get_image_shape
assert os.path.exists(img_file)
AssertionError
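The traceback ends in `assert os.path.exists(img_file)`, so at least one image file expected by the dataset loader is missing. A small sketch for listing missing per-sample files is shown below. The sub-directory names and extensions are assumptions based on the standard KITTI object layout, and `missing_kitti_files` is a hypothetical helper.

```python
from pathlib import Path

# Sub-directories and extensions assumed from the standard KITTI object layout.
KITTI_PARTS = {"image_2": ".png", "velodyne": ".bin", "calib": ".txt", "label_2": ".txt"}


def missing_kitti_files(split_dir, sample_ids):
    """Return the expected per-sample files that do not exist under split_dir."""
    split_dir = Path(split_dir)
    missing = []
    for sample_id in sample_ids:
        for sub_dir, ext in KITTI_PARTS.items():
            # KITTI names samples with six zero-padded digits, e.g. 000123.png.
            path = split_dir / sub_dir / ("%06d%s" % (sample_id, ext))
            if not path.is_file():
                missing.append(path)
    return missing
```

For example, `missing_kitti_files("../data/KITTI/object/training", range(10))` would list what is absent for the first ten samples of the directory named in the log.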

How to show the 3D bounding boxes as shown in the image?

The figure you provided showing the network and the 3D bboxes is nice, but there is no corresponding code to generate result images with 3D bboxes in the 2D image or 2D bboxes in the bird's-eye-view image. Could you please provide it?
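Until official visualization code is available, the geometry can be sketched independently. The example below is not the authors' code. It assumes the usual KITTI box convention: (x, y, z) is the centre of the bottom face in camera coordinates, y points down, and ry is the rotation around the camera Y axis. It computes the eight box corners, and `project_to_image` applies a 3x4 projection matrix such as P2 from the calibration file.

```python
import math


def box3d_corners(x, y, z, h, w, l, ry):
    """Return the 8 corners (camera coordinates) of a KITTI-style 3D box."""
    x_c = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
    y_c = [0.0, 0.0, 0.0, 0.0, -h, -h, -h, -h]
    z_c = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
    cos_r, sin_r = math.cos(ry), math.sin(ry)
    corners = []
    for cx, cy, cz in zip(x_c, y_c, z_c):
        # Rotate around the Y axis, then translate to the box location.
        corners.append((x + cos_r * cx + sin_r * cz,
                        y + cy,
                        z - sin_r * cx + cos_r * cz))
    return corners


def project_to_image(points, P):
    """Project camera-frame 3D points with a 3x4 matrix P given as nested lists."""
    pixels = []
    for px, py, pz in points:
        u, v, d = (row[0] * px + row[1] * py + row[2] * pz + row[3] for row in P)
        pixels.append((u / d, v / d))
    return pixels
```

The projected corners can then be connected with line segments using any drawing library. For a bird's-eye view, the (x, z) coordinates of the four bottom corners are enough.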

OSError: CUDA_HOME environment variable is not set.

When I run build_and_install.sh, it shows:

OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

I know the error means that I should export my CUDA location in ~/.bashrc. However, I installed CUDA and cuDNN through Anaconda: when I installed PyTorch, Anaconda automatically installed CUDA and cuDNN as dependencies in the virtual environment. In other words, there is no CUDA in /usr/local.

I tried to find the CUDA root in the Anaconda directory, but I failed.

It would be very kind of you to help me fix this problem.
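One way to look for a usable CUDA root is to search for the nvcc compiler, since the extension build needs it. The sketch below is a hypothetical helper, not part of PointRCNN. As far as I know, the CUDA runtime that conda installs alongside PyTorch ships shared libraries only, without nvcc or headers, so a full toolkit install may still be needed if nothing is found.

```python
import os
import shutil
from pathlib import Path


def guess_cuda_home(extra_candidates=()):
    """Return a directory that contains bin/nvcc, or None if none is found."""
    candidates = []
    if os.environ.get("CUDA_HOME"):
        candidates.append(Path(os.environ["CUDA_HOME"]))
    nvcc = shutil.which("nvcc")
    if nvcc:
        # nvcc normally lives in <root>/bin/nvcc, so the root is two levels up.
        candidates.append(Path(nvcc).resolve().parent.parent)
    candidates.extend(Path(p) for p in extra_candidates)
    for root in candidates:
        if (root / "bin" / "nvcc").is_file():
            return root
    return None
```

The returned directory is what `export CUDA_HOME=...` should point to before running build_and_install.sh.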

Dear ゴテンクス, we are looking forward to your code, please

FileNotFoundError:/tools/train_utils/train_utils.py

Hi, Shaoshuai,
I got a problem when I tried to run the command in quick demo part to evaluate the pretrained model:

Traceback (most recent call last):
File "eval_rcnn.py", line 902, in <module>
eval_single_ckpt(root_result_dir)
File "eval_rcnn.py", line 762, in eval_single_ckpt
load_ckpt_based_on_args(model, logger)
File "eval_rcnn.py", line 719, in load_ckpt_based_on_args
train_utils.load_checkpoint(model, filename=args.ckpt, logger=logger)
File "/home/b70379409/PointRCNN/tools/../tools/train_utils/train_utils.py", line 90, in load_checkpoint
raise FileNotFoundError
FileNotFoundError

I ran the code in the folder tools.
Can you give me any clues? Thank you very much.

CUDNN_STATUS_NOT_INITIALIZED with mgpu

Hi,

I really like PointRCNN and wanted to try some things out with it.

When I try to train the RPN with multiple GPUs as described in the README, I get the following error.

For some more information:
Cuda Version: 10
Cudnn Version: 10
GPUs: 2x 980 TI (non SLI)
Pop! OS 19.04

I tried running PointRCNN on each GPU separately and both worked, so the GPUs themselves should be fine.

It also works if I use a batch-size of just 1. Probably because it then only uses 1 GPU. I also tried various other batch sizes.

This is the error I get when running with --mgpu

Traceback (most recent call last):
  File "train_rcnn.py", line 250, in <module>
    lr_scheduler_each_iter=(cfg.TRAIN.OPTIMIZER == 'adam_onecycle')
  File "/-redacted-/PointRCNN/tools/../tools/train_utils/train_utils.py", line 199, in train
    loss, tb_dict, disp_dict = self._train_it(batch)
  File "/-redacted-/PointRCNN/tools/../tools/train_utils/train_utils.py", line 132, in _train_it
    loss, tb_dict, disp_dict = self.model_fn(self.model, batch)
  File "/-redacted-/PointRCNN/tools/../lib/net/train_functions.py", line 35, in model_fn
    ret_dict = model(input_data)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/PointRCNN/tools/../lib/net/point_rcnn.py", line 33, in forward
    rpn_output = self.rpn(input_data)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/PointRCNN/tools/../lib/net/rpn.py", line 74, in forward
    backbone_xyz, backbone_features = self.backbone_net(pts_input)  # (B, N, 3), (B, C, N)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/PointRCNN/tools/../lib/net/pointnet2_msg.py", line 61, in forward
    li_xyz, li_features = self.SA_modules[i](l_xyz[i], l_features[i])
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/PointRCNN/tools/../pointnet2_lib/pointnet2/pointnet2_modules.py", line 40, in forward
    new_features = self.mlps[i](new_features)  # (B, mlp[-1], npoint, nsample)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/-redacted-/.local/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

Dear pretty boy, we are looking forward to your code, please

Performance of model for all categories

First, thank you for sharing your great work.
On to the topic: I found that the code currently trains a separate model for each category. I wonder what the performance is when training a single model for all object categories (car, pedestrian, cyclist).

Why is the angular difference between proposal and ground truth within range of [-pi/4, pi] in orientation refinement

I think this paper is excellent, but I cannot understand why the angular difference between proposal and ground truth is within the range [-pi/4, pi] in orientation refinement. I do not know how to infer the range from the assumption that the 3D IoU is at least 0.55.

iou for foreground segmentation with pointnet++

Hello Shaoshuai @sshaoshuai ,

Thanks for sharing this excellent work.
May I know the IoU value you can achieve for the foreground segmentation with PointNet++, which is one of the key tasks in the entire PointRCNN framework (for car detection with focal loss)?

Thanks in advance.

How to select rpn checkpoints

Hi, this is a great project!
I tried your code and got a 3D AP of 87.51% with the offline training pipeline. This is worse than reported in the paper. For RCNN training, I just used the 200th RPN checkpoint to generate proposals, as shown in the README file. However, this may not be the best choice. I wonder how you select the best RPN checkpoint file.

Can't open build_and_install.sh

Thank you for your answer. I am following the steps on GitHub. However, when I build and install the pointnet2_lib, iou3d, and roipool3d libraries by executing the following command:
sh build_and_install.sh
the result is: Can't open build_and_install.sh
I tried several methods, but none of them helped.
Could you offer me some advice?

Dear handsome guy, we are looking forward to your code, please

Problem in RCNN.ROI_SAMPLE_JIT=False

Thanks a lot for your fabulous work!
Your code works well when ROI_SAMPLE_JIT=True.
But a problem occurs in /lib/net/rcnn_net.py when ROI_SAMPLE_JIT=False, which may be related to the tensor size:

  xyz_input = pts_input[..., 0:self.rcnn_input_channel].transpose(1, 2).unsqueeze(dim=3)
  xyz_feature = self.xyz_up_layer(xyz_input)
  rpn_feature = pts_input[..., self.rcnn_input_channel:].transpose(1, 2).unsqueeze(dim=3)
  merged_feature = torch.cat((xyz_feature, rpn_feature), dim=1)
  merged_feature = self.merge_down_layer(merged_feature)
  l_xyz, l_features = [xyz], [merged_feature.squeeze(dim=3)]

In my case (the default settings you recommend), the processed xyz_input is a [4, 512, 64, 1, 5] tensor, which cannot be processed by the SharedMLP (in effect a 128x5x1x1 convolution):
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [128, 5, 1, 1], but got 5-dimensional input of size [4, 512, 64, 1, 5] instead
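The reported size is consistent with pts_input arriving as a 4-D tensor (batch, RoIs, points, channels) rather than the 3-D (RoIs, points, channels) layout that the transpose and unsqueeze calls expect. That is an assumption inferred from the error message, not something confirmed by the authors. The sketch below traces only the shapes in plain Python, using ROI_PER_IMAGE = 64 and NUM_POINTS = 512 from the default config; the helper names are hypothetical.

```python
def transpose(shape, a, b):
    """Shape after swapping dimensions a and b (like Tensor.transpose)."""
    shape = list(shape)
    shape[a], shape[b] = shape[b], shape[a]
    return tuple(shape)


def unsqueeze(shape, dim):
    """Shape after inserting a size-1 dimension at index dim (like Tensor.unsqueeze)."""
    shape = list(shape)
    shape.insert(dim, 1)
    return tuple(shape)


def xyz_input_shape(pts_shape):
    """Shape produced by pts_input[..., 0:C].transpose(1, 2).unsqueeze(dim=3)."""
    return unsqueeze(transpose(pts_shape, 1, 2), 3)


B, M, N, C = 4, 64, 512, 5               # batch, RoIs per scene, points per RoI, channels
batched = xyz_input_shape((B, M, N, C))  # 5-D result, rejected by a 2-D convolution
flat = xyz_input_shape((B * M, N, C))    # 4-D result with the 5 channels in dim 1
print(batched, flat)
```

Under this assumption, merging the first two dimensions before the transpose (for example with `pts_input.view(-1, N, C)`) would give the convolution the 4-D input it expects. Whether that matches the intended data flow needs to be checked against the dataset code.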

Segmentation fault when using roipool3d_cuda

Hi, thank you for sharing. I am trying to run generate_gt_database.py. But I got Segmentation fault when roipool3d_cuda.pts_in_boxes3d_cpu is called. Is it related to the CUDA version? May I ask which version of CUDA you are using?

FileNotFoundError: /tools/train_utils/train_utils.py

python eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt PointRCNN.pth --batch_size 1 --eval_mode rcnn --set RPN.LOC_XZ_FINE False

Speed benchmark

Is there any speed benchmark for this method?

Dear sir, we are looking forward to your code. Thanks

tensorflow

Which version of TensorFlow do you use?

undefined symbol when try to run eval_rcnn.py

Hi,
I installed all the required packages inside a conda env and run build_and_install.sh successfully without any error. However, when I try to run the eval_rcnn.py script, it has the following error:

$ python tools/eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt PointRCNN.pth --batch_size 1 --eval_mode rcnn --set RPN.LOC_XZ_FINE False
Traceback (most recent call last):
  File "tools/eval_rcnn.py", line 7, in <module>
    from lib.net.point_rcnn import PointRCNN
  File "/home/xxx/projects/pointrcnn_pytorch/tools/../lib/net/point_rcnn.py", line 3, in <module>
    from lib.net.rpn import RPN
  File "/home/xxx/projects/pointrcnn_pytorch/tools/../lib/net/rpn.py", line 4, in <module>
    from lib.rpn.proposal_layer import ProposalLayer
  File "/home/xxx/projects/pointrcnn_pytorch/tools/../lib/rpn/proposal_layer.py", line 6, in <module>
    import lib.utils.iou3d.iou3d_utils as iou3d_utils
  File "/home/xxx/projects/pointrcnn_pytorch/tools/../lib/utils/iou3d/iou3d_utils.py", line 2, in <module>
    import iou3d_cuda
ImportError: /home/xxx/anaconda2/envs/pointrcnn_pytorch/lib/python3.6/site-packages/iou3d-0.0.0-py3.6-linux-x86_64.egg/iou3d_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at19UndefinedTensorImpl10_singletonE

Do you have any idea how to deal with it?
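The missing symbol is a mangled C++ name. Demangling it shows which library it should come from, which narrows down the cause. The sketch below handles only the simple nested-name form used here (no templates or arguments). It is a toy demangler for illustration; the `c++filt` tool does the same job more completely.

```python
def demangle_nested_name(symbol):
    """Demangle a simple Itanium-ABI nested name such as _ZN3foo3barE."""
    if not symbol.startswith("_ZN"):
        raise ValueError("not a nested Itanium-ABI name: %r" % symbol)
    pos, parts = 3, []
    while pos < len(symbol) and symbol[pos] != "E":
        start = pos
        # Each component is encoded as <decimal length><identifier>.
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported mangling at offset %d" % pos)
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    return "::".join(parts)


print(demangle_nested_name("_ZN2at19UndefinedTensorImpl10_singletonE"))
```

The result, `at::UndefinedTensorImpl::_singleton`, lives in PyTorch's ATen namespace. A common cause of this kind of error is that the extension was compiled against a different PyTorch version than the one being imported, so removing the old build and egg directories and rerunning build_and_install.sh inside the same conda environment is worth trying.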

How to convert original pcap files to binary files

Hi, shaoshuai,
Thank you for your impressive work. I am now trying to process my own lidar data using the model you provide. The original data format is pcap, while the lidar files in KITTI are bin files. How can I convert pcap to bin? Or is there another way to use my own data with your open-source framework?
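Decoding a pcap capture depends on the lidar vendor and is not covered here. Once the points are available as (x, y, z, intensity) values, the KITTI velodyne .bin layout is simply consecutive float32 values, four per point. The sketch below writes and reads that layout; the function names are hypothetical. The data may also need to be transformed into the KITTI coordinate convention, with intensity scaled to [0, 1], and PointRCNN additionally expects calibration files.

```python
import struct


def write_kitti_bin(path, points):
    """Write (x, y, z, intensity) tuples as consecutive little-endian float32 values."""
    with open(path, "wb") as f:
        for x, y, z, intensity in points:
            f.write(struct.pack("<4f", x, y, z, intensity))


def read_kitti_bin(path):
    """Read a file written by write_kitti_bin back into a list of 4-tuples."""
    with open(path, "rb") as f:
        data = f.read()
    # Each point occupies 4 floats * 4 bytes = 16 bytes.
    return [struct.unpack_from("<4f", data, offset) for offset in range(0, len(data), 16)]
```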

run eval_rcnn.py have a problem

Hello,
My environment is CUDA 9.0 with a 2080 GPU. Do you know what causes this problem?
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1549628766161/work/aten/src/THC/THCBlas.cu:441
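One likely cause, assuming "GPU2080" means an RTX 2080 (Turing, compute capability 7.5): Turing GPUs are only supported from CUDA 10.0 onward, so a PyTorch build for CUDA 9.0 cannot generate or run code for them. The table below is deliberately small and is an illustration, not an exhaustive compatibility list.

```python
# Earliest CUDA toolkit release that can target each GPU architecture
# (only three entries, for illustration; not an exhaustive table).
MIN_CUDA_FOR_ARCH = {
    (6, 1): (8, 0),   # Pascal, e.g. GTX 1080
    (7, 0): (9, 0),   # Volta, e.g. V100
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080
}


def cuda_supports_arch(cuda_version, compute_capability):
    """Return True if the CUDA (major, minor) version can target the architecture."""
    return cuda_version >= MIN_CUDA_FOR_ARCH[compute_capability]


print(cuda_supports_arch((9, 0), (7, 5)))   # CUDA 9.0 on an RTX 2080
print(cuda_supports_arch((10, 0), (7, 5)))  # CUDA 10.0 on an RTX 2080
```

If this is the cause, installing a PyTorch build for CUDA 10 and rebuilding the extensions should help.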

Recall of RPN stage

Do you have code to do validation during the RPN training? Such as computing recall for proposals or IOU/accuracy for point-wise FG/BG classification every 5 epochs or so? How do you select the checkpoint of RPN training? Do you always take epoch 200?

Thanks
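For reference, proposal recall itself is simple to compute once the IoU between each ground-truth box and each proposal is known. The sketch below is not the repository's validation code. It assumes a precomputed IoU matrix and counts a ground-truth box as recalled when any proposal reaches the threshold.

```python
def proposal_recall(iou_matrix, iou_thresh=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal.

    iou_matrix[i][j] is the IoU between ground-truth box i and proposal j.
    """
    if not iou_matrix:
        return 0.0
    hits = sum(1 for row in iou_matrix if row and max(row) >= iou_thresh)
    return hits / len(iou_matrix)
```

Evaluating this on the validation split for every saved RPN checkpoint would give one curve to choose from, rather than always taking epoch 200.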

Number of input channels doesn't match net input

First of all thanks for sharing the code.

I tried to run the described "quick demo", but unfortunately I ran into an error.

"RuntimeError: Given groups=1, weight of size 16 3 1 1, expected input[1, 4, 4096, 16] to have 3 channels, but got 4 channels instead".

It seems that in "pointnet2_modules.py", in the "_PointnetSAModuleBase" class (line 40), "self.mlps[i]" expects a three-channel tensor (self.mlps also shows that the first conv2d uses 3 channels), but the generated new_features has dimension [1, 4, 4096, 16]. I have also noticed that the "xyz" tensor passed to the forward function has shape (1, 126456, 3), as it should according to the method description. Since I haven't changed your code, I wonder whether I made a mistake with the dataset?

Best regards
Bjoern

Problem in computing the observation angle "alpha" and rotation angle "ry" ???

Thanks! solved!

Question about 'decode_bbox_target'

In the file bbox_transform.py, the inputs of decode_bbox_target are:

def decode_bbox_target(roi_box3d, pred_reg, loc_scope, loc_bin_size, num_head_bin, anchor_size, get_xz_fine=True, get_y_by_bin=False, loc_y_scope=0.5, loc_y_bin_size=0.25, get_ry_fine=False):
    """
    :param roi_box3d: (N, 7)
    :param pred_reg: (N, C)
    :param loc_scope:
    :param loc_bin_size:
    :param num_head_bin:
    :param anchor_size:
    :param get_xz_fine:
    :param get_y_by_bin:
    :param loc_y_scope:
    :param loc_y_bin_size:
    :param get_ry_fine:
    :return:
    """

However, in ProposalLayer, the layer uses decode_bbox_target to get the proposals:

proposals = decode_bbox_target(xyz.view(-1, 3), rpn_reg.view(-1, rpn_reg.shape[-1]),
                                       anchor_size=self.MEAN_SIZE,
                                       loc_scope=cfg.RPN.LOC_SCOPE, 
                                       loc_bin_size=cfg.RPN.LOC_BIN_SIZE, 
                                       num_head_bin=cfg.RPN.NUM_HEAD_BIN, 
                                       get_xz_fine=cfg.RPN.LOC_XZ_FINE, 
                                       get_y_by_bin=False,
                                       get_ry_fine=False)

My question is: why is the shape of roi_box3d (N, 7) in the definition, while the shape of the actual input is (B*N, 3)?

Question in decode bbox target

Hi, thanks for your nice codebase. I'm wondering why it has to rotate the point cloud and double the ry prediction here: https://github.com/sshaoshuai/PointRCNN/blob/master/lib/utils/bbox_transform.py#L115. Could you please briefly explain it? Thanks!

How to generate plane files?

Hi, Thanks for sharing!

I encountered the following problem when running the first stage training without downloading your plane files.
FileNotFoundError: [Errno 2] No such file or directory: '../data/KITTI/object/training/planes/005957.txt'

May I know how you generated these plane files? By the way, how can we turn off the augmentation option so that we can run the code without downloading the plane files?

Thanks!

Visualization

Hi, shaoshuai,
I am wondering how to show the detection results with 3D bboxes and display the detected vehicles in a distinct color. Did you use any library? Thanks.

CUDA_ERROR_OUT_OF_MEMORY

When I run the following command to evaluate the pretrained model: python eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt PointRCNN.pth --batch_size 1 --eval_mode rcnn --set RPN.LOC_XZ_FINE False
The error: Call to cuDevicePrimaryCtxRetain results in CUDA_ERROR_OUT_OF_MEMORY
Can you help me?

How much GPU memory will be cost?

I read your paper and noticed that you use 4 set-abstraction layers with multi-scale grouping (with sizes 4096, 1024, 256, 64) and subsample 16,384 points from each scene as the input. I think this is very memory-intensive, and I would like to know how much GPU memory is used. Thanks!

RCNN is skipped when getitem?

In your default.yaml, both RPN.ENABLED and RCNN.ENABLED are True.
Also, in train_rcnn.py at line 160:

elif args.train_mode == 'rcnn':

    cfg.RCNN.ENABLED = True
    cfg.RPN.ENABLED = cfg.RPN.FIXED = True
    root_result_dir = os.path.join('../', 'output', 'rcnn', cfg.TAG)

But in kitti_rcnn_dataset.py, line 231, in __getitem__, RPN.ENABLED and RCNN.ENABLED are organized in an if-elif chain, so the RCNN branch is skipped whenever RPN.ENABLED is True. Is that a bug?

def __getitem__(self, index):

    if cfg.RPN.ENABLED:
        return self.get_rpn_sample(index)
    elif cfg.RCNN.ENABLED:
        if self.mode == 'TRAIN':
            if cfg.RCNN.ROI_SAMPLE_JIT:
                return self.get_rcnn_sample_jit(index)
            else:
                return self.get_rcnn_training_sample_batch(index)
        else:
            return self.get_proposal_from_file(index)
    else:
        raise NotImplementedError
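A reduced sketch of the control flow in question, with plain booleans standing in for the config flags. The function `pick_sampler` is hypothetical and only mirrors the if/elif ordering quoted above; it says nothing about whether the behavior is intended in the repository.

```python
def pick_sampler(rpn_enabled, rcnn_enabled):
    """Mirror the if/elif ordering quoted above with plain booleans."""
    if rpn_enabled:
        return 'rpn'
    elif rcnn_enabled:
        return 'rcnn'
    raise NotImplementedError

# Both flags set, as in the quoted train_rcnn.py branch
print(pick_sampler(True, True))    # rpn
print(pick_sampler(False, True))   # rcnn
```

With both flags set, the first branch always wins, which is the behavior the question describes.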

gcc cuda version

May I post a question?
I set up CUDA 9.0 and cuDNN 7.2 with gcc 5.4, but I searched on the Internet and found that gcc 6 is said to correspond to CUDA 9.0. Which versions of CUDA, cuDNN, and gcc do you use?
Thanks

Training log polluted

My training log is polluted with the following:

2019-04-19 15:56:56,610  DEBUG  STREAM b'IDAT' 41 8192
2019-04-19 15:56:56,647  DEBUG  STREAM b'IHDR' 16 13

This results in a log file of about 70 MB.
Does somebody else have this problem?
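The `STREAM b'IDAT'` lines resemble the debug output that Pillow's PNG reader emits when the root logger is set to DEBUG. Assuming that is the source here (the post does not confirm it), one way to keep the training log clean is to raise the level of that library's logger. The helper name `quiet_noisy_loggers` is hypothetical.

```python
import logging

def quiet_noisy_loggers(names=('PIL',), level=logging.WARNING):
    """Raise the level of selected third-party loggers so their DEBUG records are dropped."""
    for name in names:
        logging.getLogger(name).setLevel(level)

# Call once, after the training script has configured logging
quiet_noisy_loggers()
```

Child loggers such as `PIL.PngImagePlugin` inherit the level from `PIL`, so a single call covers the whole package while leaving the project's own DEBUG messages untouched.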

corner cases

Hello @sshaoshuai ,

Thanks for releasing this work.
PointNet++ as the backbone (in my opinion, it actually acts as a foreground extractor) could be replaced by any other point-based method, e.g., pointsift, pointcnn, kpconv, rscnn, etc.
I am quite interested in the performance of PointNet++, as it largely determines the performance of the second stage: in my opinion, if it cannot extract as many foreground objects as possible (namely, achieve high recall), the second stage obviously cannot produce good results.

I implemented PointNet++ based on your code (basically removing all KITTI-related code and applying it to my own dataset and data augmentation), and found some interesting corner cases, which I visualized with ROS.

I doubled the number of FPS points ([8192, 2048, 512, 128]) in each SA layer, as my data covers a 360-degree view and a range of [-60, 60] meters. The foreground threshold is 0.3, as originally set by you.
Red points are predicted points and green points are the missed labels.

Some wrongly classified walls
Some missed large trucks

Any suggestions for these cases?
Thanks in advance.
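To put a number on cases like these, per-point foreground recall can be computed from the segmentation scores and the ground-truth mask. This is a minimal sketch with NumPy; the function name `foreground_recall` is hypothetical, and the 0.3 default simply reuses the threshold mentioned in the post.

```python
import numpy as np

def foreground_recall(scores, labels, thresh=0.3):
    """Fraction of labeled foreground points whose score exceeds thresh."""
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels).astype(bool)
    if labels.sum() == 0:
        # No foreground points in this sample, so recall is undefined.
        return float('nan')
    predicted = scores > thresh
    return float((predicted & labels).sum() / labels.sum())

recall = foreground_recall([0.9, 0.2, 0.5, 0.1], [1, 1, 0, 0])
```

Computing this separately for each object class (for example, only points inside truck boxes) would show whether the misses are concentrated in the large vehicles.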

Running trained PointRCNN on dataset

Hi, I've been running through some tests on this code and it seems to be running quite well so far, but I was having some difficulty running a trained model on the dataset and printing/saving the results rather than evaluating against them. Does the existing code have this functionality built in? Or would I have to write my own script for this? Also if this is the case, would it be possible for you to outline what I'd have to do to get this working? Thank you!

How to visualize the output result in point cloud and image?

I find your work is really interesting. I am wondering how to visualize the output result.

Pretrained model performance is not consistent with claim

After downloading the pretrained model, I ran the following command:
python eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt PointRCNN.pth --batch_size 1 --eval_mode rcnn --set RPN.LOC_XZ_FINE False
but the result is not as good as claimed.

Did I miss something?

Output Format

Hi sshaoshuai,
Thank you for releasing your code and the pretrained model. I'm a bit confused when trying to plot the predicted 3D bounding boxes. I'm guessing that the output format (the first line of the prediction for 000001.txt) encodes the bounding box as follows:

Car -1 -1 -1.7723 753.6551 163.8756 814.0421 204.0525 1.5462 (height) 1.6426 (width) 3.9845 (length) 7.0502 (x) 1.2010 (y) 29.7814 (z) -1.5398 (theta) 1.6346

with y representing the elevation and the XZ plane representing the ground plane.

Am I right?
Regards,
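Assuming the prediction files follow the standard KITTI object format (type, truncated, occluded, alpha, 2D box as left/top/right/bottom, height, width, length, x, y, z, rotation_y, score), a line can be parsed as sketched below. The `Detection` tuple and `parse_result_line` are hypothetical names. In that format the location is given in camera coordinates with y pointing down, and the last value is a confidence score.

```python
from collections import namedtuple

Detection = namedtuple(
    'Detection',
    'cls truncated occluded alpha bbox h w l x y z ry score')

def parse_result_line(line):
    """Parse one line of a KITTI-format detection file into a Detection."""
    fields = line.split()
    values = [float(token) for token in fields[1:]]
    if len(values) != 15:
        raise ValueError('expected 16 fields, got %d' % len(fields))
    return Detection(fields[0], values[0], values[1], values[2],
                     tuple(values[3:7]), values[7], values[8], values[9],
                     values[10], values[11], values[12], values[13], values[14])

det = parse_result_line(
    'Car -1 -1 -1.7723 753.6551 163.8756 814.0421 204.0525 '
    '1.5462 1.6426 3.9845 7.0502 1.2010 29.7814 -1.5398 1.6346')
```

Here `det.h`, `det.w`, `det.l` and `det.x`, `det.y`, `det.z` line up with the labels in the post, and `det.ry` is the heading angle.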

modify the codes to train all categories

Firstly, thanks for your great work.

I don't think it is reasonable to need a separate model for each category. If I want to modify the code to train all the categories at the same time, do you have any suggestions? Which parts of the code need to be modified?

path of your pretrained model

Hello author, what is the path of your pretrained model?

Segment error: with 'python generate_gt_database.py --class_name 'Car' --split train'

When running generate_gt_database.py in the directory ../PointRCNN/tools, a segmentation fault occurs. By adding print statements, I found that the error is raised at line 72:

boxes_pts_mask_list = roipool3d_utils.pts_in_boxes3d_cpu(torch.from_numpy(pts_rect), torch.from_numpy(gt_boxes3d))

I have no idea why Python does not handle the memory problem. Very weird.