WXinlong / ASIS
Associatively Segmenting Instances and Semantics in Point Clouds, CVPR 2019
License: MIT License
There is a bug in the data collection script (collect_indoor3d_data.py) on Windows systems that results in wrongly placed npy files.
Line 18 does not split the path into all of its components, because the path separator on Windows is \ (printed as \\ in the list below).
As a result, the processed npy files end up in the Stanford3dDataset folder instead of the stanford_indoor3d_ins.sem folder.
>>> elements = anno_path.split('/')
elements = ['C:\\ASIS\\data\\Stanford3dDataset_v1.2_Aligned_Version\\Area_2', 'conferenceRoom_1', 'Annotations']
Minimal fix, tested on Windows:
>>> elements = os.path.normpath(anno_path).split(os.path.sep)
elements = ['C:', 'ASIS', 'data', 'Stanford3dDataset_v1.2_Aligned_Version', 'Area_2', 'conferenceRoom_1', 'Annotations']
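The difference between the two splits can be reproduced on any platform with Python's standard ntpath module, which implements Windows path rules. This is only a minimal sketch: the helper name split_path_components is hypothetical, and the example path is assembled from the listing above rather than taken from the repository.

```python
import ntpath  # Windows path semantics, importable on every platform


def split_path_components(path):
    """Split a Windows-style path into all of its components.

    Hypothetical helper: it mirrors os.path.normpath(path).split(os.path.sep)
    as that expression behaves on Windows.
    """
    return ntpath.normpath(path).split(ntpath.sep)


# Example path built from the listing above (mixed '\\' and '/' separators).
anno_path = ("C:\\ASIS\\data\\Stanford3dDataset_v1.2_Aligned_Version"
             "\\Area_2/conferenceRoom_1/Annotations")

naive = anno_path.split('/')              # splits on forward slashes only
fixed = split_path_components(anno_path)  # splits on every separator

print(len(naive), len(fixed))
```

With the naive split, everything up to Area_2 stays in one element, which would explain why output paths derived from the components end up in the wrong folder.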
Hi!
In test.py, how do I match your pretrained model with the argument --model_path log1/epoch_99.ckpt?
I mean, which log does the pretrained model belong to?
Thanks for your help!
Hello, thank you very much for providing the code; it runs smoothly. However, when I used CloudCompare to visualize the results, they looked quite different from those shown in your paper. Which visualization tool did you use to view the results? Thank you.
Hi Xinlong,
I want to run "eval_iou_accuracy.py", but the process gets killed. I found that the list "pts_in_pred" has shape (number_of_instances, number_of_points), which can be extremely large, and I think this is why my program is killed. Have you ever run into this problem? How can I solve it? Thank you so much.
Binbin
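One way to avoid a dense (number_of_instances, number_of_points) structure is to count the overlap between predicted and ground-truth instances directly from the two per-point label arrays. The following is a minimal sketch under that assumption; the function names are hypothetical and this is not the logic of eval_iou_accuracy.py.

```python
import numpy as np


def instance_overlap_counts(pred_ins, gt_ins):
    """Count the points shared by each (predicted, ground-truth) instance pair.

    Memory use is O(num_pred * num_gt + num_points) instead of
    O(num_instances * num_points).
    """
    pred_ids, pred_inv = np.unique(np.asarray(pred_ins).ravel(), return_inverse=True)
    gt_ids, gt_inv = np.unique(np.asarray(gt_ins).ravel(), return_inverse=True)
    counts = np.zeros((len(pred_ids), len(gt_ids)), dtype=np.int64)
    # Unbuffered accumulation: every point adds 1 to its (pred, gt) cell.
    np.add.at(counts, (pred_inv, gt_inv), 1)
    return pred_ids, gt_ids, counts


def pairwise_iou(pred_ins, gt_ins):
    """IoU between every predicted and every ground-truth instance."""
    pred_ids, gt_ids, inter = instance_overlap_counts(pred_ins, gt_ins)
    pred_sizes = inter.sum(axis=1, keepdims=True)  # points per predicted instance
    gt_sizes = inter.sum(axis=0, keepdims=True)    # points per ground-truth instance
    union = pred_sizes + gt_sizes - inter
    return pred_ids, gt_ids, inter / union.astype(np.float64)


pred_ids, gt_ids, iou = pairwise_iou([0, 0, 1, 1, 1], [5, 5, 5, 7, 7])
print(iou)
```

Whether this matches the coverage and precision/recall definitions used by the evaluation script would still need to be checked against that script.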
This is the first time I have run open source code, and I can't tell from the README in which order to run the scripts in the different folders. Is there a required running sequence?
I used your trained model and downloaded a complete, error-free copy of the S3DIS data for testing.
However, BlockMerging fails with: index 129 is out of bounds for axis 0 with size 100.
Could you tell me what causes this problem?
Hello everyone, I'm wondering how the authors of the paper tested their network on the ShapeNet dataset. Any ideas?
When I train the ASIS network on my own data, the test results show that the instance network is not trained at all. The instance label predicted by the instance network is the same for all points, and the results of the semantic network are terrible too. I don't know why the network can't be trained; it worked well when I ran the code with your data, and even with another set of my own data. Do you know what could prevent the instance network from being trained?
Thank you @WXinlong for your excellent work.
I noticed that you used try...except... in the function BlockMerging() when assigning values to the ndarray overlapgroupcounts. I also found this function in the SGPN repo. When I test Area 5 Hallway 1 of S3DIS with SGPN, the line "overlapgroupcounts[grouplabel[i],volume[xx,yy,zz]] += 1" raises an error reporting that the index exceeds the array index range (300).
Did you hit this error as well, and is that why you added the try...except... here?
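An alternative to silencing the IndexError would be to size the counter from the labels that are actually present instead of using a fixed shape. The sketch below assumes that grouplabel holds per-point group ids of the current block and that volume holds merged group ids with -1 for empty cells, as the quoted line suggests; the helper name is hypothetical and the approach has not been tested against either repository.

```python
import numpy as np


def make_overlap_counts(grouplabel, volume):
    """Allocate overlap counters large enough for every label that occurs.

    Rows index the group labels of the current block, columns index the
    group ids already stored in the volume (-1 marks an empty cell).
    """
    num_block_groups = int(grouplabel.max()) + 1 if grouplabel.size else 0
    num_merged_groups = int(volume.max()) + 1 if volume.size else 0
    # Keep at least one row and one column so the array is never empty.
    return np.zeros((max(num_block_groups, 1), max(num_merged_groups, 1)))


grouplabel = np.array([0, 129, -1])
volume = -1 * np.ones([4, 4, 4], dtype=np.int32)
volume[1, 2, 3] = 310

overlapgroupcounts = make_overlap_counts(grouplabel, volume)
overlapgroupcounts[129, 310] += 1  # would overflow a fixed (100, 300) array
print(overlapgroupcounts.shape)
```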
@WXinlong Thanks for sharing the code. I have a few questions:
Q1. Which dataset is the image below from?
Q2. Have you trained your code on the SemanticKITTI dataset? If so, can you share the pre-trained model? If not, can you let us know what kind of conversion is required to prepare the data for training?
Q3. What performance did you obtain on the datasets you tested?
Thanks in advance.
Hello, @WXinlong. For the ShapeNet experiments, I want to run the DBSCAN clustering algorithm to generate instance labels, but the DBSCAN parameters are hard to choose.
Could you please provide the values of eps and min_samples that you used for DBSCAN?
I would also be grateful if you could provide the instance labels for ShapeNet.
Thank you!
Thank you for your great code! I can get good results on S3DIS, but I cannot reproduce the results on the ShapeNet dataset. If possible, could you please share some code for the ShapeNet dataset? Thanks a lot.
Thanks for your great code. Could you please provide the value of δv and δd for outdoor dataset? (https://github.com/WXinlong/ASIS/blob/master/misc/vkitti_asis.png) Thanks a lot.
I would like to download the prepared HDF5 data for training but the link doesn't work. Could you fix it?
Do you have a link to download raw S3DIS data?
Thank you
In Python 3, the + operator cannot be applied to a range object and a list.
This causes an error in indoor3d_util.py (sample_data(), line 129) when running the gen_h5.py script.
The failing line:
>>> return np.concatenate([data, dup_data], 0), range(N)+list(sample)
TypeError: unsupported operand type(s) for +: 'range' and 'list'
Minimal fix for Python 3 compatibility, which works:
>>> return np.concatenate([data, dup_data], 0), list(range(N))+list(sample)
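For context, the fixed return statement can be placed in a small self-contained function. Only the last return line comes from the report above; the rest of the function body, the rng parameter and the sampling with replacement are assumptions made to obtain a runnable sketch, not the contents of indoor3d_util.py.

```python
import numpy as np


def sample_data(data, num_sample, rng=None):
    """Return exactly num_sample rows of data plus the chosen row indices.

    Sketch only: short inputs are padded with randomly duplicated rows,
    long inputs are sampled with replacement.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = data.shape[0]
    if n == num_sample:
        return data, list(range(n))
    if n > num_sample:
        sample = rng.choice(n, num_sample)
        return data[sample, ...], list(sample)
    sample = rng.choice(n, num_sample - n)
    dup_data = data[sample, ...]
    # list(range(n)) works on Python 2 and 3; range(n) + list(...) only on Python 2.
    return np.concatenate([data, dup_data], 0), list(range(n)) + list(sample)


data = np.arange(6).reshape(3, 2)
padded, indices = sample_data(data, 5, np.random.default_rng(0))
print(padded.shape, len(indices))
```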
Thanks for sharing your work. I want to know how to visualize the results. Looking forward to your reply.
Hello, when I run train.sh, I get the error:
TypeError: Failed to convert object of type <type 'list'> to Tensor.
Any advice will be deeply appreciated.
Here is the traceback:
Traceback (most recent call last):
File "train.py", line 256, in <module>
train()
File "train.py", line 132, in train
loss, sem_loss, disc_loss, l_var, l_dist, l_reg = get_loss(pred_ins, labels_pl, pred_sem_label, pred_sem, sem_labels_pl)
File "/home/juzhi/temp/ASIS/models/ASIS/model.py", line 89, in get_loss
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 150, in discriminative_loss
0])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 121, in body
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 26, in discriminative_loss_single
reshaped_pred = tf.reshape(prediction, [-1, feature_dim])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2451, in reshape
name=name)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 493, in apply_op
raise err
TypeError: Failed to convert object of type <type 'list'> to Tensor. Contents: [-1, Dimension(5)]. Consider casting elements to a supported type.
[56409. 74039. 9895. 10639. 5923. 6255. 7588. 7744. 3566. 7615.
6183. 5024. 2581.]
cp: cannot stat 'inference_merge.py': No such file or directory
Traceback (most recent call last):
File "test.py", line 260, in <module>
test()
File "test.py", line 83, in test
loss, sem_loss, disc_loss, l_var, l_dist, l_reg = get_loss(pred_ins, labels_pl, pred_sem_label, pred_sem, sem_labels_pl)
File "/home/juzhi/temp/ASIS/models/ASIS/model.py", line 89, in get_loss
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 150, in discriminative_loss
0])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 121, in body
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 26, in discriminative_loss_single
reshaped_pred = tf.reshape(prediction, [-1, feature_dim])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2451, in reshape
name=name)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 493, in apply_op
raise err
TypeError: Failed to convert object of type <type 'list'> to Tensor. Contents: [-1, Dimension(5)]. Consider casting elements to a supported type.
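The message shows that the shape passed to tf.reshape is [-1, Dimension(5)], a list that mixes a Python int with a Dimension object. A workaround commonly suggested for TensorFlow 1.x, not verified against this repository, is to turn the dimension into a plain int before building the shape. The sketch below uses a stand-in class so that it runs without TensorFlow; as_int_dim and FakeDimension are hypothetical names, and the assumption is that the real object exposes its size through a .value attribute, as TF 1.x Dimension objects do.

```python
def as_int_dim(dim):
    """Return a plain int for a Dimension-like object or an int.

    Assumption: Dimension-like objects expose their static size as .value.
    """
    value = getattr(dim, "value", dim)
    if value is None:
        raise ValueError("dimension is not statically known")
    return int(value)


class FakeDimension(object):
    """Stand-in for a TF 1.x Dimension so the sketch runs without TensorFlow."""

    def __init__(self, value):
        self.value = value


feature_dim = as_int_dim(FakeDimension(5))
new_shape = [-1, feature_dim]  # a list of plain ints, safe to pass to a reshape
print(new_shape)
```

In the reported traceback, the conversion would apply to feature_dim before the call tf.reshape(prediction, [-1, feature_dim]) in utils/loss.py.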
Thank you for your great code!
I used the command "conda install --channel https://conda.anaconda.org/anaconda tensorflow-gpu=1.3.0" to install TF 1.3, but there is no "external/nsync/public" under "anaconda3/envs/py2.7/lib/python2.7/site-packages/tensorflow/include/external/".
As a result, I can't compile tf_op successfully.
Any advice will be appreciated.
I was wondering whether you have tested ASIS on an outdoor dataset such as KITTI or Semantic3D.
Hi, may I ask which drawing tool you used for clearly illustrated figures like Figure 3? I would like to draw similar diagrams but do not have any clue where to start.
When I try to run the training code, I get the following errors:
Current batch/total batch num: 0/697
2019-10-05 23:19:12.279100: E tensorflow/stream_executor/cuda/cuda_blas.cc:636] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "train.py", line 256, in <module>
train()
File "train.py", line 211, in train
train_one_epoch(sess, ops, train_writer)
File "train.py", line 243, in train_one_epoch
feed_dict=feed_dict)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=786432, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_911)]]
Caused by op u'layer1/conv0/Conv2D', defined at:
File "train.py", line 256, in <module>
train()
File "train.py", line 128, in train
pred_sem, pred_ins = get_model(pointclouds_pl, is_training_pl, NUM_CLASSES, bn_decay=bn_decay)
File "/home/ASIS/models/ASIS/model.py", line 29, in get_model
l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=1024, radius=0.1, nsample=32, mlp=[32,32,64], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1')
File "/home/ASIS/utils/pointnet_util.py", line 187, in pointnet_sa_module
data_format=data_format)
File "/home/ASIS/utils/tf_util.py", line 165, in conv2d
data_format=data_format)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InternalError (see above for traceback): Blas SGEMM launch failed : m=786432, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_911)]]
I don't know how to solve it, and I really hope that you can help me. Waiting for your reply, thanks!
Thank you for your great work! I have some problems reproducing the results on Area5. Has anyone reproduced the results on the instance segmentation task?
Can you provide some advice and guidance?
//=======================================
Area5 results reported in the paper:
Table 1: Instance segmentation results on S3DIS dataset.
Backbone  Method  mCov  mWCov  mPrec  mRec
PN++      ASIS    44.6  47.8   55.3   42.4
Table 2: Semantic segmentation results on S3DIS dataset.
Backbone  Method  mAcc  mIoU  oAcc
PN++      ASIS    60.9  53.4  86.9
//=======================================
Evaluating the trained model you provided on Area5:
"(optional) Trained model can be downloaded from here."
I can reproduce the semantic segmentation results with the trained model you provided,
but I cannot reproduce the instance segmentation results. (I only tested the trained model you provided, without training it myself.)
Because the evaluation in test.py samples 4096 points in each block, the results differ from run to run. The small differences in mMUCov, mMWCov, mAcc, mIoU and oAcc are acceptable, but the differences in mPrecision and mRecall are rather large.
I evaluated the provided model on Area5 three times using "test.py" and "eval_iou_accuracy.py", without changing any code. The results are as follows:
//--------------------------------------
test on Area5 for the 1st time:
Instance segmentation:
mMUCov: 0.42489331999047575 < 44.6
mMWCov: 0.4535493664127393 < 47.8
mPrecision: 0.5021224775661305 < 55.3
mRecall: 0.4004512705562707 < 42.4
Semantic segmentation:
mAcc: 0.6092345653223031 = 60.9
mIoU: 0.5331283341313834 = 53.4
oAcc: 0.8693586484215375 = 86.9
//--------------------------------------
test on Area5 for the 2nd time:
Instance segmentation:
mMUCov: 0.4291705590335591 < 44.6
mMWCov: 0.45548170503790186 < 47.8
mPrecision: 0.5474015599454721 = 55.3
mRecall: 0.4274631185866652 = 42.4
Semantic segmentation:
mAcc: 0.6084982131786074 = 60.9
mIoU: 0.5328714125575762 = 53.4
oAcc: 0.8693728293860369 = 86.9
//--------------------------------------
test on Area5 for the 3rd time:
Instance segmentation:
mMUCov: 0.4270858894049287 < 44.6
mMWCov: 0.45475902917080196 < 47.8
mPrecision: 0.5194688592978431 < 55.3
mRecall: 0.40944191619875714 < 42.4
Semantic segmentation:
mAcc: 0.6079589620032484 = 60.9
mIoU: 0.5315288320965799 = 53.4
oAcc: 0.8692114229308049 = 86.9
//--------------------------------------
//=======================================
So I trained the model myself and tested it on Area5,
using your training and testing code, with random initialization.
//--------------------------------------
test on Area5 for the 1st time:
mMUCov: 0.4192711216731248 < 44.6
mMWCov: 0.4444165547538725 < 47.8
mPrecision: 0.5096157303879506 < 55.3
mRecall: 0.40007411245273317 < 42.4
mAcc: 0.6049548823385776 = 60.9
mIoU: 0.528582705433821 = 53.4
oAcc: 0.8679205275945892 = 86.9
//--------------------------------------
test on Area5 for the 2nd time:
mMUCov: 0.4174190863147767 < 44.6
mMWCov: 0.44349561917787444 < 47.8
mPrecision: 0.5229537176620869 < 55.3
mRecall: 0.40793882667136827 < 42.4
mAcc: 0.6044562112313121 = 60.9
mIoU: 0.5287068091885877 = 53.4
oAcc: 0.8676705435570818 = 86.9
//=======================================
I also trained the model by initializing the network using the provided model weights.
//--------------------------------------
test on Area5 for the 1st time:
mMUCov: 0.4184383436403125 < 44.6
mMWCov: 0.4453902032528212 < 47.8
mPrecision: 0.5073059155435999 < 55.3
mRecall: 0.3895353315843023 < 42.4
mAcc: 0.6104495511204024 = 60.9
mIoU: 0.5355729541022203 = 53.4
oAcc: 0.868981285117484 = 86.9
Thank you again for your code.
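The run-to-run spread described above can be summarized with a mean and a sample standard deviation. This sketch uses only the three mPrecision and mRecall values reported for the provided model and the paper figures quoted in the same post; the variable and function names are illustrative.

```python
import statistics

# Paper values (percent) and the three reported runs (fractions) from the post above.
paper = {"mPrecision": 55.3, "mRecall": 42.4}
runs = {
    "mPrecision": [0.5021224775661305, 0.5474015599454721, 0.5194688592978431],
    "mRecall": [0.4004512705562707, 0.4274631185866652, 0.40944191619875714],
}


def summarize(values):
    """Mean and sample standard deviation, both expressed in percent."""
    percent = [100.0 * v for v in values]
    return statistics.mean(percent), statistics.stdev(percent)


summary = {name: summarize(values) for name, values in runs.items()}
for name, (mean, std) in summary.items():
    print("%s: %.1f +/- %.1f (paper: %.1f)" % (name, mean, std, paper[name]))
```

Three runs give only a rough estimate of the spread, so the comparison with the paper values should be read with that in mind.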
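Could you please give me the link to vkitti2? I need it for my work. Thank you.
Hello, I ran into the "Unknow file type!" error when running test.py.
Why does the data preprocessing generate .h5 files, while the final test can only open txt and npy files?
I am eagerly looking forward to your reply. Thank you again for your contribution!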
Thank you very much for your great work, but I have a question about evaluating the results on S3DIS with the test.py you provided.
According to the original PointNet paper (http://openaccess.thecvf.com/content_cvpr_2017/papers/Qi_PointNet_Deep_Learning_CVPR_2017_paper.pdf), all points in each block are tested when evaluating on the S3DIS dataset. In your test.py, however, I find that you sample 4096 points in each block for testing. Does this mean your evaluation protocol is different from the one used in PointNet?
Looking forward to your reply.
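To illustrate the difference between the two protocols, the sketch below splits the point indices of one block into fixed-size batches so that every point is evaluated at least once, padding the last batch with randomly repeated points. It is an assumption about how an "all points" evaluation could be organized, not the procedure used in test.py or in PointNet; the function name and the seed parameter are hypothetical.

```python
import numpy as np


def batches_covering_all_points(num_points_in_block, num_point=4096, seed=0):
    """Return index batches of size num_point that cover every point of a block.

    The indices are shuffled, then padded with randomly repeated points so
    that the total is a multiple of num_point.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_points_in_block)
    pad = (-num_points_in_block) % num_point
    if pad:
        extra = rng.choice(num_points_in_block, size=pad, replace=True)
        order = np.concatenate([order, extra])
    return order.reshape(-1, num_point)


batches = batches_covering_all_points(10000)
print(batches.shape)
```

Predictions for points that appear more than once would then have to be merged, for example by keeping the first prediction per index.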
Thank you for your great work! Could you provide a link to the ShapeNet dataset with instance segmentation annotations? It would be really nice of you if you could help.
Hi Xinlong,
Thanks for your work. When I run 6-fold CV on the S3DIS dataset, the performance is irregular. Could you please provide the performance of each fold? Thanks.
Hi, thank you very much for your contribution to open source. I'd like to ask you a question. I collected my own point cloud files with a lidar and intend to build an H5 dataset for testing, but the generation process failed. Trying to analyze the reason: is it because I did not collect per-component point cloud data as in the original dataset (as shown in the file above)? I'm looking forward to your reply.
Thank you for your outstanding work.
However, I can't find estimate_mean_ins_size.py and mean_ins_size.txt.
When I run test.py, it raises an error, and I don't know exactly what the problem is. Can you please help me?
Model restored.
0 / 68 ...
Loading train file /home/ASIS/data/stanford_indoor3d_ins.sem/Area_5_conferenceRoom_1.npy
Processsing: Shape [0] Block[0]
2019-09-14 18:39:28.811094: E tensorflow/stream_executor/cuda/cuda_blas.cc:636] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "test.py", line 262, in <module>
test()
File "test.py", line 183, in test
feed_dict=feed_dict)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=32768, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_293)]]
[[Node: ins_fa_layer3/Squeeze/_585 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2283_ins_fa_layer3/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op u'layer1/conv0/Conv2D', defined at:
File "test.py", line 262, in <module>
test()
File "test.py", line 79, in test
pred_sem, pred_ins = get_model(pointclouds_pl, is_training_pl, NUM_CLASSES)
File "/home/ASIS/models/ASIS/model.py", line 29, in get_model
l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=1024, radius=0.1, nsample=32, mlp=[32,32,64], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1')
File "/home/ASIS/utils/pointnet_util.py", line 187, in pointnet_sa_module
data_format=data_format)
File "/home/ASIS/utils/tf_util.py", line 165, in conv2d
data_format=data_format)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InternalError (see above for traceback): Blas SGEMM launch failed : m=32768, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_293)]]
[[Node: ins_fa_layer3/Squeeze/_585 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2283_ins_fa_layer3/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Another question: I can run train.py and estimate_mean_ins_size.py successfully, but I was wondering why train.py takes only a few minutes to train.
Waiting for your kind response! Thanks very much.
Hi,
Thanks for your impressive work, but I just wonder why you set cluster number to 5 for instance segmentation, and what does this value mean? Thanks a lot.
net_ins = tf_util.conv1d(net_ins, 5, 1, padding='VALID', activation_fn=None, scope='ins_fc4')
Best Regards
Frank
Hello, I'm very grateful to you for providing this code. However, a train.py that uses just one GPU may not exploit the full potential of multiple GPUs, and I ran into errors when trying to modify it into a multi-GPU version. Would you consider releasing an official one? Thanks very much!
Thank you for your open source code! However, I cannot reproduce the 85.0 mIoU on the ShapeNet dataset. I would appreciate it very much if you could share some code for the ShapeNet dataset. Thank you very much.
I need authorization to download S3DIS from Google cloud. Is there any other way to download S3DIS?
Hi, could you tell me what the predicted instance label '-1' means?
Hi!
I found that there are many versions of the data. Could you tell me which one was used in your work?
I used the latest one and preprocessed it as described in the README:
python collect_indoor3d_data.py
python gen_h5.py
cd data && python generate_input_list.py
but it does not seem to work, because when I run sh +x train.sh 5, it shows:
Fail to load modelfile: None
**** EPOCH 000 ****
----
Current batch/total batch num: 0/824
2019-04-29 03:33:14.057889: E tensorflow/core/common_runtime/executor.cc:660] Executor failed to create kernel. Not found: No registered '_CopyFromGpuToHost' OpKernel for CPU devices compatible with node swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)
. Registered: device='GPU'
[[Node: swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)]]
Traceback (most recent call last):
File "train.py", line 262, in <module>
train()
File "train.py", line 217, in train
train_one_epoch(sess, ops, train_writer)
File "train.py", line 249, in train_one_epoch
feed_dict=feed_dict)
File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: No registered '_CopyFromGpuToHost' OpKernel for CPU devices compatible with node swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)
. Registered: device='GPU'
[[Node: swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)]]
Traceback (most recent call last):
File "estimate_mean_ins_size.py", line 49, in <module>
estimate(FLAGS.test_area)
File "estimate_mean_ins_size.py", line 30, in estimate
cur_data, cur_group, _, cur_sem = provider.loadDataFile_with_groupseglabel_stanfordindoor(h5_filename)
File "/data/lirong/ASIS/models/ASIS/provider.py", line 213, in loadDataFile_with_groupseglabel_stanfordindoor
seg = f['seglabel'][:].astype(np.int32)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/h5py/_hl/dataset.py", line 573, in __getitem__
self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
KeyboardInterrupt
I don't know whether this is caused by the dataset. I use the same virtual environment as for PointNet++, where it works.
Thanks for your help!
I'd like to cite your paper. How was the ShapeNet dataset prepared? Could you post it?
I used "python train.py --gpu 2 --batch_size 24 --max_epoch 100 --log_dir log5 --learning_rate 0.001 --decay_step 300000 --restore_model None --input_list /home/ASIS/data/train_hdf5_file_list_woArea5.txt" to train, but it reported
2019-09-30 22:36:43.451109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:02:00.0
totalMemory: 10.73GiB freeMemory: 64.56MiB
2019-09-30 22:36:43.557029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:03:00.0
totalMemory: 10.73GiB freeMemory: 54.56MiB
2019-09-30 22:36:43.696038: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 2 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:82:00.0
totalMemory: 10.73GiB freeMemory: 62.56MiB
2019-09-30 22:36:43.817509: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 3 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:83:00.0
totalMemory: 10.73GiB freeMemory: 64.56MiB
2019-09-30 22:36:43.817819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2019-09-30 22:36:43.818075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2 3
2019-09-30 22:36:43.818086: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0: Y N N N
2019-09-30 22:36:43.818093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1: N Y N N
2019-09-30 22:36:43.818098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2: N N Y N
2019-09-30 22:36:43.818104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 3: N N N Y
2019-09-30 22:36:43.818116: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:02:00.0, compute capability: 7.5)
2019-09-30 22:36:43.818124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: GeForce RTX 2080 Ti, pci bus id: 0000:03:00.0, compute capability: 7.5)
2019-09-30 22:36:43.818131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:82:00.0, compute capability: 7.5)
2019-09-30 22:36:43.818138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: GeForce RTX 2080 Ti, pci bus id: 0000:83:00.0, compute capability: 7.5)
2019-09-30 22:36:44.557000: E tensorflow/core/common_runtime/direct_session.cc:168] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory
Traceback (most recent call last):
File "train.py", line 256, in <module>
train()
File "train.py", line 165, in train
sess = tf.Session(config=config)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1509, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 628, in __init__
self._session = tf_session.TF_NewDeprecatedSession(opts, status)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
I don't understand why it tells me that GPU 0 is out of memory when I asked to train on GPU 2. Can you please tell me the solution?
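The log shows TensorFlow creating a device on all four GPUs, each with less than 65 MiB of free memory, so initialization on GPU:0 fails before the requested GPU is ever used. A common way to keep a process off the other cards is to set CUDA_VISIBLE_DEVICES before TensorFlow is imported. The sketch below shows only that step; the helper name is hypothetical, and it cannot help if GPU 2 itself is already fully occupied by another process, as the reported free memory suggests.

```python
import os


def restrict_visible_gpus(gpu_index):
    """Expose a single physical GPU to CUDA.

    Must be called before tensorflow (or any other CUDA library) is imported.
    """
    # Make CUDA numbering follow the PCI bus order reported by nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)


restrict_visible_gpus(2)
# import tensorflow as tf  # import only after the environment is set
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

With only one card visible, TensorFlow addresses it as GPU 0, so a script that builds its device string from the --gpu argument would presumably need --gpu 0 in that case.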
I noticed that the ASIS repository shows qualitative results on the vKITTI test set. However, the dataset I found (https://github.com/VisualComputingInstitute/vkitti3D-dataset) contains only semantic labels, and the original vKITTI dataset contains only RGB images (along with depth ground truth).
Was ASIS itself trained on a dataset derived from the original vKITTI, or was it trained on S3DIS and tested on vKITTI?
@WXinlong Thanks for sharing your work. I have a question about the function BlockMerging. When I ran the test code with my data, I encountered an IndexError: the index generated by the code "x,y,z = (pts[:,0]/gap).astype(np.int32)" was out of bounds for axis 1 of the volume. I did not change any code apart from training and testing with my own data. I am curious about the purpose of BlockMerging and the meaning of the following code:
gap = 5e-3
volume_num = int(1. / gap)+1+200
volume = -1* np.ones([volume_num,volume_num,volume_num]).astype(np.int32)
volume_seg = -1* np.ones([volume_num,volume_num,volume_num]).astype(np.int32)
Looking forward to your reply. Thank you!
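Read literally, the quoted lines build a cubic grid with volume_num cells per axis and a cell size of gap, and the indexing expression maps each coordinate to a cell by dividing by gap. A point therefore falls outside the grid as soon as one of its coordinates is negative enough or reaches volume_num * gap. The sketch below checks this for a set of points; the helper name is hypothetical, and the assumption that coordinates are expected to stay within that range is inferred from the snippet, not confirmed by the authors.

```python
import numpy as np

gap = 5e-3
volume_num = int(1. / gap) + 1 + 200  # grid cells per axis, as in the quoted code


def check_voxel_bounds(pts, gap, volume_num):
    """Return the voxel index of every point and whether it lies inside the grid."""
    idx = (pts[:, :3] / gap).astype(np.int32)  # same conversion as the quoted line
    in_range = np.all((idx >= 0) & (idx < volume_num), axis=1)
    return idx, in_range


pts = np.array([[0.5, 0.5, 0.5],   # inside the grid
                [2.5, 0.1, 0.1]])  # x is beyond volume_num * gap
idx, in_range = check_voxel_bounds(pts, gap, volume_num)
print(in_range)
```

If points of a custom dataset fail this check, shifting and scaling the room coordinates before merging, or enlarging the grid, would be the two obvious directions to try.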
Thank you for your great work! I ran the collect_indoor3d_data.py of ASIS-master, but it didn't generate the Area_5_hallway_6.npy file.
Could you please tell me how to manually fix the Area_5/hallway_6 data?
Thanks for your help.
Hi, thanks for your excellent work. When I use 6-fold CV, I get irregular performance, and I think there is a bug in my code. Could you please share your 6-fold CV code? If possible, could you also share the code for the ShapeNet dataset? Thanks a lot.
Hi,
I retrained your model without any changes and tested it on Area5, but the performance is quite low compared with yours. I wonder whether there is some information I am missing.
My environment is TensorFlow 1.6 and Python 3.5, and my results are as follows:
Instance Segmentation mMUCov: 0.3599536949433257
Instance Segmentation mMWCov: 0.38854278223099514
Instance Segmentation mPrecision: 0.40252245593357444
Instance Segmentation mRecall: 0.31376601611130606
Semantic Segmentation oAcc: 0.8630127024386128
Semantic Segmentation mAcc: 0.5824638078371082
Semantic Segmentation mIoU: 0.5037184939579153
Thanks a lot for your help.