asis's Issues

bug in collect_indoor3d_data.py

There is a bug in the data collection script (collect_indoor3d_data.py) on Windows systems, resulting in wrongly placed .npy files.

Line 18 does not split the path into all its components, because the path separator on Windows is \\.
As a result, the processed .npy files end up in the Stanford3dDataset folder instead of the stanford_indoor3d_ins.sem folder.

>>> elements = anno_path.split('/')
elements = ['C:\\ASIS\\data\\Stanford3dDataset_v1.2_Aligned_Version\\Area_2', 'conferenceRoom_1', 'Annotations']

Minimal fix, tested on Windows:

>>> elements = os.path.normpath(anno_path).split(os.path.sep)
elements = ['C:', 'ASIS', 'data', 'Stanford3dDataset_v1.2_Aligned_Version', 'Area_2', 'conferenceRoom_1', 'Annotations']
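
For reference, a cross-platform sketch (not from the repository; pathlib is Python 3 only, so the os.path variant above is the one that fits the Python 2.7 code base) that avoids splitting on a hard-coded separator at all:

from pathlib import PureWindowsPath

# PureWindowsPath parses backslash-separated paths on any platform; on a
# Windows machine plain PurePath behaves the same way.
anno_path = r"C:\ASIS\data\Stanford3dDataset_v1.2_Aligned_Version\Area_2\conferenceRoom_1\Annotations"
elements = list(PureWindowsPath(anno_path).parts)
# ['C:\\', 'ASIS', 'data', 'Stanford3dDataset_v1.2_Aligned_Version',
#  'Area_2', 'conferenceRoom_1', 'Annotations']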

Which log does the pretrained model belong to?

Hi!
In test.py, how should I match your pretrained model with the argument:

--model_path log1/epoch_99.ckpt

I mean, which log does the pretrained model belong to?

Thanks for the help!

Problems with visualization of results

Hello, thank you very much for providing the code; it runs smoothly. However, when I used CloudCompare to visualize the results, they looked quite different from those shown in your article. I was wondering which visualization tool you used to view the results? Thank you.
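
For reference, a minimal sketch (not from the repository) of exporting a predicted room as an ASCII "x y z r g b" file with one random color per instance, which CloudCompare can open directly; points and ins_pred are assumed to be NumPy arrays produced by the test script:

import numpy as np

def export_instances_for_cloudcompare(points, ins_pred, out_path):
    # points: (N, 3) xyz coordinates; ins_pred: (N,) predicted instance ids.
    # Assign one random RGB color to every instance and write an ASCII file.
    rng = np.random.RandomState(0)
    colors = np.zeros((points.shape[0], 3), dtype=np.int32)
    for ins_id in np.unique(ins_pred):
        colors[ins_pred == ins_id] = rng.randint(0, 256, size=3)
    np.savetxt(out_path, np.hstack([points, colors]),
               fmt="%.4f %.4f %.4f %d %d %d")

# Hypothetical usage:
# export_instances_for_cloudcompare(points, ins_pred, "Area_5_office_1_ins.txt")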

Run eval_iou_accuracy.py killed

Hi Xinlong,

I want to run eval_iou_accuracy.py, but it gets killed. I found that the list pts_in_pred has a shape of (number_of_instances, number_of_points), which can be extremely large; I think that is why my program gets killed. Have you ever met this problem? How can I solve it? Thank you so much.

Binbin
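
For reference, one possible workaround, sketched under the assumption that pts_in_pred holds one dense boolean membership mask per predicted instance: keep only the point indices of each instance and compute overlaps with np.intersect1d, which avoids materializing a (number_of_instances, number_of_points) array.

import numpy as np

def instance_point_indices(ins_pred):
    # Map each predicted instance id to the indices of its points,
    # instead of storing a dense boolean mask per instance.
    return {g: np.flatnonzero(ins_pred == g) for g in np.unique(ins_pred)}

def overlap_size(idx_a, idx_b):
    # Number of points shared by two instances (intersection of index sets).
    return np.intersect1d(idx_a, idx_b, assume_unique=True).size

The IoU between a predicted and a ground-truth instance then follows from overlap_size(...) and the sizes of the two index arrays.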

help!

This is the first time I've run open-source code, and I can't tell from the README the order in which to run the scripts in the different folders. Is there a recommended running sequence?

Error in BlockMerging!!!

I used your trained model and downloaded complete, error-free S3DIS data for testing.
However, in BlockMerging I get: index 129 is out of bounds for axis 0 with size 100.
Could you tell me where this problem comes from?

ASIS on ShapeNet

Hello everyone, I'm wondering how the authors of the paper tested their network on the ShapeNet dataset. Any ideas?

Some question about the instance network

When I train the ASIS network with my data, I find that the instance branch does not get trained at all: in the test results, the instance label predicted by the instance branch is the same for all points, and the semantic segmentation results are terrible too. I don't know why the network cannot be trained; it worked well when I ran the code with your data, and even with another group of my data. Do you know what could prevent the instance branch from being trained?

An issue on block merging

Thank you @WXinlong for your excellent work.

I noticed that you use try...except... in the function BlockMerging() when assigning values to the ndarray overlapgroupcounts. I also found this function in the SGPN repo. When I test Area 5 Hallway 1 of S3DIS with SGPN, there is an error on the line "overlapgroupcounts[grouplabel[i],volume[xx,yy,zz]] += 1" reporting that the index exceeds the array bound (300).
Did you also meet this error, and is that why you added the try...except... here?

Inference and Training

@WXinlong Thanks for sharing the code; I have a few queries:

Q1. Which dataset is shown in the image below?
[image]
Q2. Have you trained your code on the SemanticKITTI dataset? If so, could you share the pretrained model? If not, could you let us know what kind of conversion is required to prepare the data for training?
Q3. What performance did you obtain on the dataset you tested?

Thanks in advance

The parameters of DBSCAN

Hello, @WXinlong. In the ShapeNet experiments, I want to run the DBSCAN clustering algorithm to generate instance labels, but the parameters of DBSCAN are hard to choose.
Could you please provide the values of eps and min_samples you used for DBSCAN?
I would also be grateful if you could provide the instance labels for ShapeNet.
Thank you!
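
For reference, a minimal sketch (not from the repository, with purely illustrative parameter values) of generating per-part instance labels from ShapeNet points with scikit-learn's DBSCAN:

import numpy as np
from sklearn.cluster import DBSCAN

def dbscan_instance_labels(points, sem_labels, eps=0.05, min_samples=10):
    # Cluster the points of each semantic part separately; eps and
    # min_samples here are placeholders, not the authors' settings.
    ins_labels = -np.ones(points.shape[0], dtype=np.int32)
    next_id = 0
    for sem in np.unique(sem_labels):
        mask = sem_labels == sem
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit(points[mask]).labels_.copy()
        labels[labels >= 0] += next_id  # keep instance ids globally unique
        ins_labels[mask] = labels
        next_id = max(next_id, ins_labels.max() + 1)
    return ins_labels  # -1 marks DBSCAN noise points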

Code on ShapeNet

Thank you for your great code! I can get good results on S3DIS, but I cannot reproduce the results on the ShapeNet dataset. If possible, could you please share the code for the ShapeNet experiments? Thanks a lot. My email is [email protected]

python 3 fix for indoor3d_util.py

In Python 3 it is not allowed to use the + operator on a range object and a list.
This causes an error in indoor3d_util.py (sample_data(), line 129) when running the gen_h5.py script.

Minimal fix for Python 3 compatibility:

>>> return np.concatenate([data, dup_data], 0), range(N)+list(sample)
TypeError: unsupported operand type(s) for +: 'range' and 'list'
>>> return np.concatenate([data, dup_data], 0), list(range(N))+list(sample)
this works

TypeError: Failed to convert object of type <type 'list'> to Tensor.

Hello, when I run train.sh, I get the error:
TypeError: Failed to convert object of type <type 'list'> to Tensor.
Any advice would be deeply appreciated.

Here is the traceback:
Traceback (most recent call last):
File "train.py", line 256, in
train()
File "train.py", line 132, in train
loss, sem_loss, disc_loss, l_var, l_dist, l_reg = get_loss(pred_ins, labels_pl, pred_sem_label, pred_sem, sem_labels_pl)
File "/home/juzhi/temp/ASIS/models/ASIS/model.py", line 89, in get_loss
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 150, in discriminative_loss
0])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 121, in body
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 26, in discriminative_loss_single
reshaped_pred = tf.reshape(prediction, [-1, feature_dim])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2451, in reshape
name=name)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 493, in apply_op
raise err
TypeError: Failed to convert object of type <type 'list'> to Tensor. Contents: [-1, Dimension(5)]. Consider casting elements to a supported type.
[56409. 74039. 9895. 10639. 5923. 6255. 7588. 7744. 3566. 7615.
6183. 5024. 2581.]
cp: cannot stat 'inference_merge.py': No such file or directory
Traceback (most recent call last):
File "test.py", line 260, in
test()
File "test.py", line 83, in test
loss, sem_loss, disc_loss, l_var, l_dist, l_reg = get_loss(pred_ins, labels_pl, pred_sem_label, pred_sem, sem_labels_pl)
File "/home/juzhi/temp/ASIS/models/ASIS/model.py", line 89, in get_loss
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 150, in discriminative_loss
0])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 121, in body
delta_v, delta_d, param_var, param_dist, param_reg)
File "/home/juzhi/temp/ASIS/utils/loss.py", line 26, in discriminative_loss_single
reshaped_pred = tf.reshape(prediction, [-1, feature_dim])
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2451, in reshape
name=name)
File "/home/juzhi/anaconda3/envs/python27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 493, in apply_op
raise err
TypeError: Failed to convert object of type <type 'list'> to Tensor. Contents: [-1, Dimension(5)]. Consider casting elements to a supported type.
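
For reference, this error is typically raised by TensorFlow 1.x when a shape list contains a Dimension object (here [-1, Dimension(5)]). A minimal sketch of one common workaround, assuming feature_dim comes from a get_shape() call in utils/loss.py: convert the Dimension to a plain Python int before it reaches tf.reshape.

import tensorflow as tf

def reshape_embedding(prediction):
    # get_shape() returns Dimension objects in TF 1.x; .value turns them
    # into plain ints so the shape list passed to tf.reshape is valid.
    feature_dim = prediction.get_shape()[-1].value
    return tf.reshape(prediction, [-1, feature_dim])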

tf_op compile failed on tf1.3

Thank you for your great code!
I used the "conda install --channel https://conda.anaconda.org/anaconda tensorflow-gpu=1.3.0" command to install tf1.3, but there is no " external/nsync/public" behind "anaconda3/envs/py2.7/lib/python2.7/site-packages/tensorflow/include/external/" in the directory.
Thus I can't compile successfully tf_op.
Any advise will be appreciated.

About drawing the figure

Hi, may I ask which drawing tool you use for clear illustrated figures like Figure 3? I would like to draw graphs like yours but do not have any clue.

running problem

When I try to run the training code, I get the following errors:
Current batch/total batch num: 0/697
2019-10-05 23:19:12.279100: E tensorflow/stream_executor/cuda/cuda_blas.cc:636] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "train.py", line 256, in
train()
File "train.py", line 211, in train
train_one_epoch(sess, ops, train_writer)
File "train.py", line 243, in train_one_epoch
feed_dict=feed_dict)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=786432, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_911)]]

Caused by op u'layer1/conv0/Conv2D', defined at:
File "train.py", line 256, in
train()
File "train.py", line 128, in train
pred_sem, pred_ins = get_model(pointclouds_pl, is_training_pl, NUM_CLASSES, bn_decay=bn_decay)
File "/home/ASIS/models/ASIS/model.py", line 29, in get_model
l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=1024, radius=0.1, nsample=32, mlp=[32,32,64], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1')
File "/home/ASIS/utils/pointnet_util.py", line 187, in pointnet_sa_module
data_format=data_format)
File "/home/ASIS/utils/tf_util.py", line 165, in conv2d
data_format=data_format)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InternalError (see above for traceback): Blas SGEMM launch failed : m=786432, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_911)]]

I don't know how to solve it; I really hope you can help me! Waiting for your reply, thanks!
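
For reference, "Blas SGEMM launch failed" in TF 1.x is usually a GPU memory problem (another process holding the GPU, or TensorFlow pre-allocating the whole card), or a CUDA/driver mismatch. A minimal sketch, under that assumption, of enabling on-demand GPU memory allocation on the session config used in train.py:

import tensorflow as tf

# Sketch only: let TensorFlow grow its GPU memory allocation as needed
# instead of grabbing all memory up front, which often avoids cuBLAS
# launch failures when the card is shared or nearly full.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.allow_soft_placement = True
sess = tf.Session(config=config)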

Instance Segmentation Results on Area5

Thank you for your great work! But I have some problems when trying to reproduce the results on Area 5. Has anyone reproduced the results for the instance segmentation task?
Could you provide some advice and guidance?

//=======================================
Area 5 results reported in the paper:

Table 1: Instance segmentation results on S3DIS dataset.
Backbone Method mCov mWCov mPrec mRec
PN++ ASIS 44.6 47.8 55.3 42.4

Table 2: Semantic segmentation results on S3DIS dataset
Backbone Method mAcc mIoU oAcc
PN++ ASIS 60.9 53.4 86.9

//=======================================
Evaluation of the trained model you provided on Area 5:
"(optional) Trained model can be downloaded from here."

I can reproduce the semantic segmentation results with the trained model you provided,
but I cannot reproduce the instance segmentation results (just testing with the provided model, without training by myself).

Because the evaluation in test.py samples 4096 points from each block, the results differ slightly each run. The variation in mMUCov, mMWCov, mAcc, mIoU and oAcc is small, which is acceptable, but the differences in mPrecision and mRecall are rather large.

I evaluated the provided model on Area 5 using test.py and eval_iou_accuracy.py three times, without changing any code. Results are as follows:
//--------------------------------------
test on Area5 for the 1st time:

Instance segmentation:
mMUCov: 0.42489331999047575 < 44.6
mMWCov: 0.4535493664127393 < 47.8
mPrecision: 0.5021224775661305 < 55.3
mRecall: 0.4004512705562707 < 42.4

Semantic segmentation:
mAcc: 0.6092345653223031 = 60.9
mIoU: 0.5331283341313834 = 53.4
oAcc: 0.8693586484215375 = 86.9
//--------------------------------------
test on Area5 for the 2nd time:

Instance segmentation:
mMUCov: 0.4291705590335591 < 44.6
mMWCov: 0.45548170503790186 < 47.8
mPrecision: 0.5474015599454721 = 55.3
mRecall: 0.4274631185866652 = 42.4

Semantic segmentation:
mAcc: 0.6084982131786074 = 60.9
mIoU: 0.5328714125575762 = 53.4
oAcc: 0.8693728293860369 = 86.9
//--------------------------------------
test on Area5 for the 3rd time:

Instance segmentation:
mMUCov: 0.4270858894049287 < 44.6
mMWCov: 0.45475902917080196 < 47.8
mPrecision: 0.5194688592978431 < 55.3
mRecall: 0.40944191619875714 < 42.4

Semantic segmentation:
mAcc: 0.6079589620032484 = 60.9
mIoU: 0.5315288320965799 = 53.4
oAcc: 0.8692114229308049 = 86.9
//--------------------------------------

//=======================================
So I trained the model by myself and tested it on Area 5,
using your training and testing code, with random initialization.

//--------------------------------------
test on Area5 for the 1st time:

mMUCov: 0.4192711216731248 < 44.6
mMWCov: 0.4444165547538725 < 47.8
mPrecision: 0.5096157303879506 < 55.3
mRecall: 0.40007411245273317 < 42.4

mAcc: 0.6049548823385776 = 60.9
mIoU: 0.528582705433821 = 53.4
oAcc: 0.8679205275945892 = 86.9

//--------------------------------------
test on Area5 for the 2nd time:

mMUCov: 0.4174190863147767 < 44.6
mMWCov: 0.44349561917787444 < 47.8
mPrecision: 0.5229537176620869 < 55.3
mRecall: 0.40793882667136827 < 42.4

mAcc: 0.6044562112313121 = 60.9
mIoU: 0.5287068091885877 = 53.4
oAcc: 0.8676705435570818 = 86.9

//=======================================
I also trained the model by initializing the network using the provided model weights.

//--------------------------------------
test on Area5 for the 1st time:

mMUCov: 0.4184383436403125 < 44.6
mMWCov: 0.4453902032528212 < 47.8
mPrecision: 0.5073059155435999 < 55.3
mRecall: 0.3895353315843023 < 42.4

mAcc: 0.6104495511204024 = 60.9
mIoU: 0.5355729541022203 = 53.4
oAcc: 0.868981285117484 = 86.9

Thank you again for your code.
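
For reference, part of the run-to-run variance described above comes from the random 4096-point sampling per block in test.py. A minimal sketch (not the authors' protocol) of fixing the random seeds before evaluation so that repeated runs of the same checkpoint are directly comparable:

import numpy as np
import tensorflow as tf

# Sketch only: seed NumPy and TensorFlow before building the graph and
# sampling blocks, so the 4096-point subsets are identical across runs.
SEED = 0
np.random.seed(SEED)
tf.set_random_seed(SEED)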

test.py: "Unknown file type!"

Hello, when running test.py I ran into the "Unknown file type!" problem.
Why does the data preprocessing generate .h5 files, while the final test script can only open .txt and .npy files?
I am eagerly looking forward to your reply, and thank you again for your contribution!

error in tf_op compiling

[screenshot]
I get the error "cannot find -ltensorflow_framework" when I try to compile on my OS.

If I try to use your tf_sampling_so.so, I get the error "tf_sampling_so.so: undefined symbol: _ZN10tensorflow15OpKernelContext10CtxFailureEPKciRKNS_6StatusE".

And my environment is TF 1.4, Python 2.7, CUDA 9.0.
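
For reference, a minimal sketch (not from the repository) of querying the TensorFlow include and library paths so the compile scripts under tf_ops can point at the right locations; a missing -ltensorflow_framework usually means the linker was not given the TensorFlow library directory via -L:

import tensorflow as tf

# Print the paths the tf_ops compile scripts need: pass the include
# directory with -I, and the library directory with
# -L<dir> -ltensorflow_framework.
print("TF include dir:", tf.sysconfig.get_include())
print("TF library dir:", tf.sysconfig.get_lib())

The undefined-symbol error when reusing a prebuilt .so usually indicates it was compiled against a different TensorFlow version or C++ ABI, so recompiling the ops locally is generally needed.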

Evaluate method in test.py

Thank you very much for your great work ,but I have some problem when trying to evaluate the results on S3DIS with test.py you provided.
According to the original paper of PointNet(http://openaccess.thecvf.com/content_cvpr_2017/papers/Qi_PointNet_Deep_Learning_CVPR_2017_paper.pdf),when testing on the S3DIS dataset, all points in each block are tested.But in your test.py,I find you sampled 4096 points in each block to test on.Dose it mean your evaluate protocol is different from that in PointNet?
Looking forward to your reply.

irregular performance with 6-fold CV

hi Xinlong,
Thanks for your work. When I use 6-fold CV on the S3DIS dataset, the performance is irregular. Could you please provide the performance of each fold? Thanks.

Questions about making h5 datasets with my own point cloud files

[two screenshots]

Hi, thank you very much for your contribution to open source. I'd like to ask you a question: I collected my own point cloud files with LiDAR and intend to build an H5 dataset for testing, but the conversion failed. Analyzing the reason, could it be that I did not collect per-component annotation files for my point clouds the way the original dataset does, as shown in the files above? I'm looking forward to your reply.

about test.py

When I run test.py, it raises an error and I don't know exactly what the problem is. Can you please help me?
Model restored.
0 / 68 ...
Loading train file /home/ASIS/data/stanford_indoor3d_ins.sem/Area_5_conferenceRoom_1.npy
Processsing: Shape [0] Block[0]
2019-09-14 18:39:28.811094: E tensorflow/stream_executor/cuda/cuda_blas.cc:636] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "test.py", line 262, in
test()
File "test.py", line 183, in test
feed_dict=feed_dict)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=32768, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_293)]]
[[Node: ins_fa_layer3/Squeeze/_585 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2283_ins_fa_layer3/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op u'layer1/conv0/Conv2D', defined at:
File "test.py", line 262, in
test()
File "test.py", line 79, in test
pred_sem, pred_ins = get_model(pointclouds_pl, is_training_pl, NUM_CLASSES)
File "/home/ASIS/models/ASIS/model.py", line 29, in get_model
l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=1024, radius=0.1, nsample=32, mlp=[32,32,64], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1')
File "/home/ASIS/utils/pointnet_util.py", line 187, in pointnet_sa_module
data_format=data_format)
File "/home/ASIS/utils/tf_util.py", line 165, in conv2d
data_format=data_format)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 639, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InternalError (see above for traceback): Blas SGEMM launch failed : m=32768, n=32, k=9
[[Node: layer1/conv0/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](layer1/concat, layer1/conv0/weights/read/_293)]]
[[Node: ins_fa_layer3/Squeeze/_585 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2283_ins_fa_layer3/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Another question: I can run train.py and estimate_mean_ins_size.py successfully, but I was wondering why train.py takes only a few minutes to train.
Waiting for your kind response! Thanks very much.

cluster

Hi,
Thanks for your impressive work, but I just wonder why you set the cluster number to 5 for instance segmentation, and what does this value mean? Thanks a lot.

net_ins = tf_util.conv1d(net_ins, 5, 1, padding='VALID', activation_fn=None, scope='ins_fc4')

Best Regards
Frank
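
For reference, a sketch related to the question above, stated as an assumption rather than an authoritative answer: the 5 appears to be the dimensionality of the per-point instance embedding produced by ins_fc4 (not a number of clusters), and instances are obtained at test time by clustering those embeddings, e.g. with mean-shift. A minimal illustration with scikit-learn (the bandwidth value is purely illustrative):

import numpy as np
from sklearn.cluster import MeanShift

def cluster_instance_embeddings(embeddings, bandwidth=0.6):
    # embeddings: (num_points, 5) array from the instance branch.
    # Each resulting cluster id is treated as one predicted instance.
    ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
    ms.fit(embeddings)
    return ms.labels_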

A training code that can be deployed on multiple GPUs

Hello, I'm very grateful to you for offering this code. However, train.py uses only one GPU and cannot exploit the potential of multiple GPUs, and I ran into problems when trying to modify it into a multi-GPU version. Would you consider providing an official one? Thanks very much!

code on ShapeNet datasets

Thank you for your open-source code! But I cannot reproduce the 85.0 mIoU on the ShapeNet dataset. I would very much appreciate it if you could share the code for the ShapeNet experiments. Thank you very much. My email is [email protected]

About dataset

Hi!
I found there are many data versions; could you tell me which one was used in your work?
[image]

I used the last one, and preprocessed it as described in the README:

python collect_indoor3d_data.py
python gen_h5.py
cd data && python generate_input_list.py

but it seems not to work, because when I run sh +x train.sh 5, it shows:

Fail to load modelfile: None
**** EPOCH 000 ****
----
Current batch/total batch num: 0/824
2019-04-29 03:33:14.057889: E tensorflow/core/common_runtime/executor.cc:660] Executor failed to create kernel. Not found: No registered '_CopyFromGpuToHost' OpKernel for CPU devices compatible with node swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)
        .  Registered:  device='GPU'
         [[Node: swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)]]
Traceback (most recent call last):
  File "train.py", line 262, in <module>
    train()
  File "train.py", line 217, in train
    train_one_epoch(sess, ops, train_writer)
  File "train.py", line 249, in train_one_epoch
    feed_dict=feed_dict)
  File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: No registered '_CopyFromGpuToHost' OpKernel for CPU devices compatible with node swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)
        .  Registered:  device='GPU'
         [[Node: swap_out_gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/sem_fa_layer4/ThreeInterpolate_grad/ThreeInterpolateGrad_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](sem_fa_layer3/Squeeze/_301)]]
Traceback (most recent call last):
  File "estimate_mean_ins_size.py", line 49, in <module>
    estimate(FLAGS.test_area)
  File "estimate_mean_ins_size.py", line 30, in estimate
    cur_data, cur_group, _, cur_sem = provider.loadDataFile_with_groupseglabel_stanfordindoor(h5_filename)
  File "/data/lirong/ASIS/models/ASIS/provider.py", line 213, in loadDataFile_with_groupseglabel_stanfordindoor
    seg = f['seglabel'][:].astype(np.int32)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/data/lirong/py2/venv_python2.7/local/lib/python2.7/site-packages/h5py/_hl/dataset.py", line 573, in __getitem__
    self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)
KeyboardInterrupt

I don't know if this is because of the dataset. I use the same virtual environment as for PointNet++, and that one works fine.

Thanks for your help !

training problem

I used "python train.py --gpu 2 --batch_size 24 --max_epoch 100 --log_dir log5 --learning_rate 0.001 --decay_step 300000 --restore_model None --input_list /home/ASIS/data/train_hdf5_file_list_woArea5.txt" to train, but it reported
2019-09-30 22:36:43.451109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:02:00.0
totalMemory: 10.73GiB freeMemory: 64.56MiB
2019-09-30 22:36:43.557029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:03:00.0
totalMemory: 10.73GiB freeMemory: 54.56MiB
2019-09-30 22:36:43.696038: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 2 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:82:00.0
totalMemory: 10.73GiB freeMemory: 62.56MiB
2019-09-30 22:36:43.817509: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 3 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:83:00.0
totalMemory: 10.73GiB freeMemory: 64.56MiB
2019-09-30 22:36:43.817819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2019-09-30 22:36:43.818075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2 3
2019-09-30 22:36:43.818086: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0: Y N N N
2019-09-30 22:36:43.818093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1: N Y N N
2019-09-30 22:36:43.818098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2: N N Y N
2019-09-30 22:36:43.818104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 3: N N N Y
2019-09-30 22:36:43.818116: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:02:00.0, compute capability: 7.5)
2019-09-30 22:36:43.818124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: GeForce RTX 2080 Ti, pci bus id: 0000:03:00.0, compute capability: 7.5)
2019-09-30 22:36:43.818131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:82:00.0, compute capability: 7.5)
2019-09-30 22:36:43.818138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: GeForce RTX 2080 Ti, pci bus id: 0000:83:00.0, compute capability: 7.5)
2019-09-30 22:36:44.557000: E tensorflow/core/common_runtime/direct_session.cc:168] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory
Traceback (most recent call last):
File "train.py", line 256, in
train()
File "train.py", line 165, in train
sess = tf.Session(config=config)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1509, in init
super(Session, self).init(target, graph, config=config)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 628, in init
self._session = tf_session.TF_NewDeprecatedSession(opts, status)
File "/home/anaconda3/envs/asis/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

I don't know why: I specified GPU 2 to train the code, but it tells me GPU 0 is out of memory. Can you please tell me the solution?
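
For reference, one common cause, sketched below as an assumption: if the --gpu flag only controls op placement, TensorFlow still initializes GPU 0 (which is already nearly full here) when the session is created. Restricting which GPUs are visible before TensorFlow starts usually avoids this:

import os

# Sketch only: make only physical GPU 2 visible to the process; inside
# TensorFlow it will then appear as /device:GPU:0. This must be set before
# importing tensorflow or creating any session. The shell equivalent is
#   CUDA_VISIBLE_DEVICES=2 python train.py ...
os.environ["CUDA_VISIBLE_DEVICES"] = "2"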

training/testing on vkitti

I noticed that the ASIS repository shows qualitative results on the vKITTI test set. The dataset I found (https://github.com/VisualComputingInstitute/vkitti3D-dataset) contains only semantic labels, while the original vKITTI dataset contains only RGB images (plus depth ground truth).
Is ASIS itself trained on a dataset derived from the original vKITTI, or trained on S3DIS and only tested on vKITTI?

Some question about BlockMerging.

@WXinlong Thanks for sharing your work; I have a question about the BlockMerging function. When I run the test code with my data, I encounter an IndexError: the index generated by the code "x,y,z = (pts[:,0]/gap).astype(np.int32)" is out of bounds for axis 1 of volume. I did not change any code except for training and testing with my data. I am curious about the purpose of BlockMerging and the meaning of the following code:
gap = 5e-3
volume_num = int(1. / gap)+1+200
volume = -1* np.ones([volume_num,volume_num,volume_num]).astype(np.int32)
volume_seg = -1* np.ones([volume_num,volume_num,volume_num]).astype(np.int32)

Looking forward to your reply. Thank you!
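
For reference, a sketch of the likely cause and a possible guard, stated as an assumption rather than a confirmed fix: BlockMerging discretizes the room, whose coordinates are expected to lie roughly in [0, 1] after normalization, into a (volume_num)^3 voxel grid; points outside that range (for example under a different normalization of custom data) produce indices beyond the grid and raise the IndexError. Clipping the indices is one possible guard:

import numpy as np

gap = 5e-3
volume_num = int(1. / gap) + 1 + 200

def voxel_indices(pts):
    # pts: (N, 3) block coordinates, expected roughly in [0, 1].
    idx = (pts / gap).astype(np.int32)
    # Guard against points outside the expected range so that indexing the
    # (volume_num, volume_num, volume_num) grid never goes out of bounds.
    return np.clip(idx, 0, volume_num - 1)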

Help me, please.

Thank you for your great work! I ran collect_indoor3d_data.py from ASIS-master, but it did not generate the Area_5_hallway_6.npy file.
Could you please tell me how to manually fix the Area_5/hallway_6 data?
Thanks for your help.

6-fold CV

Hi, thanks for your excellent work. I have a problem when I use 6-fold CV: I get irregular performance, and I think there is a bug in my code. Could you please share your 6-fold CV code? And if possible, could you also share the code for the ShapeNet dataset? Thanks a lot.

reproduce results on Area5 for instance segmentation

Hi,
I retrained your model without any change and tested it on Area 5, but the performance is quite low compared with yours. I just wonder if there is some information I am missing.

My environment is TensorFlow 1.6 and Python 3.5, and my results are as follows:
Instance Segmentation mMUCov: 0.3599536949433257
Instance Segmentation mMWCov: 0.38854278223099514
Instance Segmentation mPrecision: 0.40252245593357444
Instance Segmentation mRecall: 0.31376601611130606

Semantic Segmentation oAcc: 0.8630127024386128
Semantic Segmentation mAcc: 0.5824638078371082
Semantic Segmentation mIoU: 0.5037184939579153

Thanks a lot for your help.
