hancyran / repsurf
[CVPR 2022 Oral] Official implementation for "Surface Representation for Point Clouds"
License: Apache License 2.0
Thanks for your great work!
When I run scripts/s3dis/train_repsurf_umb.sh, the program suddenly stops running.
This is the output:
[2022-11-18 14:58:19,302 INFO train.py line 326 16493] Epoch: [1/100][250/765] Batch 0.760 (1.679) Remain 35:34:19 Loss 0.9464 Accuracy 74.02
[2022-11-18 15:05:20,747 INFO train.py line 326 16493] Epoch: [1/100][500/765] Batch 0.590 (1.683) Remain 35:31:19 Loss 0.6425 Accuracy 81.22
[2022-11-18 15:11:53,662 INFO train.py line 326 16493] Epoch: [1/100][750/765] Batch 0.483 (1.646) Remain 34:37:36 Loss 0.4475 Accuracy 82.48
[2022-11-18 15:12:03,557 INFO train.py line 345 16493] Train result at epoch [1/100]: mIoU / mAcc / OA 51.48 / 71.95 / 75.43
[2022-11-18 15:19:06,043 INFO train.py line 326 16493] Epoch: [2/100][250/765] Batch 0.869 (1.690) Remain 35:26:02 Loss 0.4624 Accuracy 85.42
[2022-11-18 15:25:38,102 INFO train.py line 326 16493] Epoch: [2/100][500/765] Batch 0.814 (1.629) Remain 34:02:41 Loss 0.4129 Accuracy 83.37
[2022-11-18 15:32:28,147 INFO train.py line 326 16493] Epoch: [2/100][750/765] Batch 0.511 (1.633) Remain 34:00:33 Loss 0.2694 Accuracy 88.11
[2022-11-18 15:32:36,285 INFO train.py line 345 16493] Train result at epoch [2/100]: mIoU / mAcc / OA 69.06 / 87.74 / 85.77
[2022-11-18 15:39:29,681 INFO train.py line 326 16493] Epoch: [3/100][250/765] Batch 1.076 (1.654) Remain 34:19:14 Loss 0.2946 Accuracy 87.22
[2022-11-18 15:46:07,134 INFO train.py line 326 16493] Epoch: [3/100][500/765] Batch 0.494 (1.622) Remain 33:32:46 Loss 0.1570 Accuracy 93.33
[2022-11-18 15:52:44,975 INFO train.py line 326 16493] Epoch: [3/100][750/765] Batch 0.558 (1.612) Remain 33:13:30 Loss 0.2258 Accuracy 90.11
[2022-11-18 15:53:01,495 INFO train.py line 345 16493] Train result at epoch [3/100]: mIoU / mAcc / OA 76.53 / 91.84 / 89.52
[2022-11-18 16:00:07,540 INFO train.py line 326 16493] Epoch: [4/100][250/765] Batch 0.739 (1.704) Remain 35:00:31 Loss 0.1830 Accuracy 92.54
[2022-11-18 16:06:41,970 INFO train.py line 326 16493] Epoch: [4/100][500/765] Batch 0.582 (1.641) Remain 33:35:45 Loss 0.2703 Accuracy 90.42
[2022-11-18 16:13:25,494 INFO train.py line 326 16493] Epoch: [4/100][750/765] Batch 0.394 (1.632) Remain 33:17:57 Loss 0.1216 Accuracy 94.29
[2022-11-18 16:13:33,778 INFO train.py line 345 16493] Train result at epoch [4/100]: mIoU / mAcc / OA 82.32 / 94.37 / 92.11
[2022-11-18 16:20:25,855 INFO train.py line 326 16493] Epoch: [5/100][250/765] Batch 0.859 (1.648) Remain 33:30:38 Loss 0.2021 Accuracy 92.22
Traceback (most recent call last):
  File "tool/train.py", line 494, in <module>
    mp.spawn(main_worker, nprocs=args.ngpus_per_node, args=(args.ngpus_per_node, args))
  File "/home/user1/miniconda3/envs/repsurf-seg/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/user1/miniconda3/envs/repsurf-seg/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/user1/miniconda3/envs/repsurf-seg/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 136, in join
    signal_name=name
torch.multiprocessing.spawn.ProcessExitedException: process 1 terminated with signal SIGKILL
Could you help me?
Thank you very much.
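A SIGKILL with no Python-level error is often, though not always, what the Linux out-of-memory killer produces when host RAM runs out. The following is a minimal sketch for checking that hypothesis by logging the peak resident memory of the current process. `peak_rss_mib` is a hypothetical helper, not part of the repository, and it relies on the Unix-only `resource` module.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kibibytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    # Hypothetical usage: call this next to the per-iteration training log line.
    print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

If the logged value keeps growing until the process is killed, host memory (for example, data loader workers) would be a plausible place to look; if it stays flat, the cause is probably elsewhere.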
When I place UmbrellaSurfaceConstructor in front of other networks, training of the whole network becomes very slow, especially the loss stage. Do you know the reason?
The data size is the same as in RepSurf, but the speed is much slower.
If I remove UmbrellaSurfaceConstructor, the speed recovers immediately.
Thank you for the great code! I am running your code on my own dataset. The validation step takes significantly longer than training and never returns; no error is shown, but the process gets terminated. For simplicity I am using 1 GPU, with multiprocessing_distributed off and batch_number = 1. Any ideas?
Thanks
Dear author,
I'm trying to use my own dataset now. When I use the precalculated class weights of S3DIS, the training results are normal, but the test results are very abnormal: the mIoU is only 6%. What are the precalculated class weights, and how are they calculated?
Could you help me?
Thank you very much.
import torch


def get_class_weights(dataset_name):
    # pre-calculate the class weight
    if dataset_name == 'S3DIS_A1':
        num_per_class = [0.27362621, 0.3134626, 0.18798782, 1.38965602, 1.44210271, 0.86639497, 1.07227331,
                         1., 1.05912352, 1.92726327, 0.52329938, 2.04783419, 0.5104427]
    elif dataset_name == 'S3DIS_A2':
        num_per_class = [0.29036634, 0.34709631, 0.19514767, 1.20129272, 1.39663689, 0.87889087, 1.11586938,
                         1., 1.54599972, 1.87057415, 0.56458097, 1.87316536, 0.51576885]
    elif dataset_name == 'S3DIS_A3':
        num_per_class = [0.27578885, 0.32039725, 0.19055443, 1.14914046, 1.46885687, 0.85450877, 1.05414776,
                         1., 1.09680025, 2.09280004, 0.59355243, 1.95746691, 0.50429199]
    elif dataset_name == 'S3DIS_A4':
        num_per_class = [0.27667177, 0.32612854, 0.19886974, 1.18282174, 1.52145143, 0.8793782, 1.14202999,
                         1., 1.0857859, 1.89738584, 0.5964717, 1.95820557, 0.52113351]
    elif dataset_name == 'S3DIS_A5':
        num_per_class = [0.28459923, 0.32990557, 0.1999722, 1.20798185, 1.33784535, 1., 0.93323316, 1.0753585,
                         1.00199521, 1.53657772, 0.7987055, 1.82384844, 0.48565471]
    elif dataset_name == 'S3DIS_A6':
        num_per_class = [0.29442441, 0.37941846, 0.21360804, 0.9812721, 1.40968965, 0.88577139, 1.,
                         1.09387107, 1.53238009, 1.61365643, 1.15693894, 1.57821041, 0.47342451]
    elif dataset_name == 'ScanNet_train':
        num_per_class = [0.32051547, 0.1980627, 0.2621471, 0.74563083, 0.52141879, 0.65918949, 0.73560561, 1.03624985,
                         1.00063147, 0.90604468, 0.43435155, 3.91494446, 1.94558718, 1., 0.54871637, 2.13587716,
                         1.13931665, 2.06423695, 5.59103054, 1.08557339, 1.35027497]
    elif dataset_name == 'ScanNet_trainval':
        num_per_class = [0.32051547, 0.1980627, 0.2621471, 0.74563083, 0.52141879, 0.65918949, 0.73560561, 1.03624985,
                         1.00063147, 0.90604468, 0.43435155, 3.91494446, 1.94558718, 1., 0.54871637, 2.13587716,
                         1.13931665, 2.06423695, 5.59103054, 1.08557339, 1.35027497]
    else:
        raise Exception('No Prepared Class Weights of Dataset')
    return torch.FloatTensor(num_per_class)
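The thread does not say how these weights were derived. One common scheme that would be consistent with exactly one entry per list being 1.0 is median-frequency balancing, where each class weight is the median class frequency divided by that class's frequency. The sketch below assumes that scheme; `median_frequency_weights` is a hypothetical helper, not a function from the repository.

```python
from collections import Counter
from statistics import median


def median_frequency_weights(labels, num_classes):
    """Weight each class by median(class frequency) / (class frequency)."""
    counts = Counter(labels)
    total = len(labels)
    freqs = [counts.get(c, 0) / total for c in range(num_classes)]
    # Only classes that actually occur contribute to the median.
    med = median(f for f in freqs if f > 0)
    # Absent classes get weight 0.0 here; that choice is an assumption too.
    return [med / f if f > 0 else 0.0 for f in freqs]


if __name__ == "__main__":
    # Toy per-point labels: class 0 is common, class 2 is rare.
    labels = [0] * 6 + [1] * 3 + [2] * 1
    print(median_frequency_weights(labels, num_classes=3))
```

Whatever the exact formula, weights computed from S3DIS label statistics describe the S3DIS classes, so a custom dataset with different classes or class balance would likely need its own weights computed from its own training labels.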
Hi @hancyran,
Thanks for open-sourcing your excellent work!
I notice that you reported the mIoU, mAcc, and OA results for 6-fold S3DIS in Table 2. Could you please share the IoU for each category (e.g., ceiling, floor, ...), just like you did in Tables 6-9 of the supplementary material?
Much appreciated!
Hi, thank you for your great work. I have read your paper and I am curious about the backbone used for the segmentation task. Do you use the same decoder as PointNet++, or did you modify it?
Hello, I tried to train the model on two GPUs, but an error occurred while loading the S3DIS dataset: "ValueError: batch_size should be a positive integer value, but got batch_size=0". How can I solve this problem? Thanks for your help.
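One possible cause, assuming the training script divides the configured batch size by the number of GPUs with integer division (as many distributed PyTorch scripts do; this is not confirmed here for tool/train.py), is a configured batch size smaller than the GPU count. The sketch below shows the arithmetic with a hypothetical helper that fails early with a clearer message.

```python
def per_process_batch_size(total_batch_size: int, ngpus_per_node: int) -> int:
    """Split a global batch size across GPUs with integer division."""
    per_gpu = total_batch_size // ngpus_per_node
    if per_gpu < 1:
        raise ValueError(
            f"batch_size={total_batch_size} is smaller than the number of "
            f"GPUs ({ngpus_per_node}); each process would get batch_size=0"
        )
    return per_gpu


if __name__ == "__main__":
    print(per_process_batch_size(8, 2))  # each of the 2 processes gets 4
```

Under that assumption, setting the configured batch size to at least the number of GPUs would avoid the zero.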
Sorry, I only see the S3DIS Area 5 result. How can I train this code for S3DIS 6-fold?
Another question: the code seems to feed a very large number of points into the network. If the input point cloud were reduced to speed up training,
would this cause a drop in accuracy?
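Hello, my results differ considerably from the paper's when I try to reproduce them on ScanNet. Could you share your dataloader file for the ScanNet dataset, so I can determine whether the gap comes from a difference in GPU memory or from data preprocessing?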
Hello, thank you for your work; it's good work. When will you publish the code for object detection? Thank you!
Dear author
Why get the centroids by concatenating edges with edges.roll(-1, 2) ?
# Algorithm 2 Pytorch-Style Pseudocode of Umbrella RepSurf
pairs = concat([edges, edges.roll(-1, 2)], dim=-2) # [B,N,K,2,3]
centroids = mean(pairs, dim=3) # [B,N,K,3]
Why not get centroids by mean(edges, dim=-2) directly? # edges: [B, N, K, 3]
I'm confused.
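The two expressions differ in shape: pairing each edge with its rolled neighbor keeps the K axis and yields one value per pair of adjacent edges (one per umbrella triangle), whereas averaging over the K axis collapses it to a single value per point. The following is a minimal pure-Python sketch for the neighbors of one point; `mean_vec` and `pairwise_centroids` are hypothetical helpers for illustration, not the repository's implementation.

```python
def mean_vec(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def pairwise_centroids(edges):
    """One value per umbrella triangle: mean of edge k and its next neighbor."""
    rolled = edges[1:] + edges[:1]  # same as rolling by -1 along the K axis
    return [mean_vec([a, b]) for a, b in zip(edges, rolled)]


if __name__ == "__main__":
    # Four neighbor offsets of one point, sorted counter-clockwise (toy data).
    edges = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    print(pairwise_centroids(edges))  # K values, one per triangle
    print(mean_vec(edges))            # a single value for the whole neighborhood
```

In this toy case the single mean over all edges is the zero vector and carries no per-triangle information, while the pairwise version keeps a distinct position for each of the K triangles.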
Hello, have you tried config hyperparameters for a single 3090? I don't have four GPUs, so I wanted to ask you.
When will you publish the code for ScanNetV2?
Hello, I tried to reproduce the experimental results in your paper. My results on Area 5 are similar to those in your paper, but I would also like to run the 6-fold experiments.
If you could provide the code, I would be grateful!
Thanks.
Hello everyone!
I managed to train RepSurf on a different dataset (my own dataset class) and tested it; everything worked just fine. I want to know whether I can run it on a machine without a GPU. Can I get rid of the CUDA extensions (rewrite the pointops class for CPU and retrain), or simply convert it to ONNX Runtime?
Thanks!
Hi,
thank you for the great work and code!
I want to deploy a RepSurf model and to do that I need the model in ONNX format. It seems that the conversion to ONNX fails due to custom pointops kernels. Have you ever encountered this and perhaps have any advice?
Thanks,
Jelena
Hi,
we are having some problems reproducing the results of semantic segmentation on S3DIS from the article.
For the repsurf model (all hyperparameters set as instructed in article), we got mIoU of 70% while your article reports mIoU of 74%. We would like to inspect the training logs so could you possibly provide training logs of repsurf for 6-fold validation?
Another problem is related to the PointNet++ model. You reported mIoU of 58% in your paper. Did you measure it with the PointNet++ implementation from this repo or some other implementation? We ran the 6-fold validation of the PointNet++ implementation from this repo and got mIoU of 69%. Do you know what can be the possible cause of this?
Dear @hancyran, thank you for publishing this work.
I am curious how to use this network on my own data. For example, say I have a PCD for a room, I do not have the ground truth for this. I just want to run inference script to give me semantic segmented PCD.
How would be the best way to progress using your network? Any help on this would be very much appreciated.
Thank you for the contribution !!
Thanks for your excellent work!
I had a bit of a problem when reproducing the project. Could you explain the loss function (SmoothClsLoss) used in the classification task? I didn't find an introduction to it in the paper.
File "/home/RepSurf/classification/modules/pointnet2_utils.py", line 110, in query_knn_point
return knnquery(k, xyz, new_xyz)
File "/home/RepSurf/classification/modules/pointops/functions/pointops.py", line 315, in forward
pointops_cuda.knnquery_cuda(b, n, m, nsample, xyz, new_xyz, idx, dist2)
TypeError: knnquery_cuda(): incompatible function arguments. The following argument types are supported:
1. (arg0: int, arg1: int, arg2: at::Tensor, arg3: at::Tensor, arg4: at::Tensor, arg5: at::Tensor, arg6: at::Tensor, arg7: at::Tensor) -> None
Hello,
I'm trying to reproduce the reported results on ModelNet40.
I have written a data loader for ModelNet40 and I'm training it with all the implementation details in the paper (Appendix G).
The maximum accuracy I'm getting is 92.8 which is way less than the reported numbers.
Can you upload the code for ModelNet40?
RepSurf/modules/repsurface_utils.py
Line 129 in ef873f8
Thanks to the author for the excellent work. I have a question about group_centriod = torch.zeros_like(sorted_group_xyz): why is the value of group_centriod all 0?
When I try to configure the environment, I can't find the folder 'lib/pointops'. Where can I download the relevant files and set them up?
☆⌒(*^-゜)v THX!!
Your work is excellent. When will you release the code for the indoor datasets S3DIS and ScanNetV2?
For the segmentation modules, I didn't manage to install the pointops module on Windows. Any help? On Ubuntu it worked just fine.
Has anyone managed to install this on Windows and make it work?
Thanks a lot.
Could you please offer the segmentation code for ScanNet?
What is the meaning of the variable offset [line306] in the segmentation model?
And if I only have data in the form of [B, 3, N] can I even get the offset variable?
Also, if I can't get the offset easily, how can I extract the point cloud features in the data of [B, 3, N]?
Really appreciate it if you can reply!
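A minimal sketch of one way to derive offset from fixed-size batches, assuming the segmentation code follows the packed-batch convention used by Point Transformer-style pointops, where all points are stacked into one [sum(N_i), 3] array and offset[i] is the cumulative number of points up to and including sample i. That convention is an assumption here, and `flatten_batch` is a hypothetical helper written with plain lists.

```python
def flatten_batch(points):
    """Convert [B][3][N] nested lists to packed coords plus cumulative offsets."""
    coord, offset, seen = [], [], 0
    for sample in points:              # sample has shape [3][N]
        xs, ys, zs = sample
        coord.extend(zip(xs, ys, zs))  # N rows of (x, y, z)
        seen += len(xs)
        offset.append(seen)            # index one past this sample's last point
    return coord, offset


if __name__ == "__main__":
    # Two samples with two points each (toy data).
    batch = [
        [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
        [[2.0, 3.0], [2.0, 3.0], [2.0, 3.0]],
    ]
    coord, offset = flatten_batch(batch)
    print(len(coord), offset)
```

With a PyTorch tensor x of shape [B, 3, N], the equivalent under the same assumption would be x.transpose(1, 2).reshape(-1, 3) for the coordinates and torch.arange(1, B + 1) * N for the offsets, cast to whatever integer type the extension expects.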
Dear authors,
Thank you so much for your awesome work.
When I follow your source code to compile Pointops in Classification and then compile the other Pointops in Segmentation, the imported Pointops comes from Cls.'s.
I found this problem through the following error:
File './RepSurf/segmentation/modules/pointops/functions/pointops.py', line 126, in forward
pointops_cuda.knnquery_cuda(m, nsample, xyz, new_xyz, offset, new_offset, idx, dist2)
TypeError: knnquery_cuda(): incompatible function arguments. The following argument types are supported:
1. (arg0: int, arg1: int, arg2: int, arg3: int, arg4: at::Tensor, arg5: at::Tensor, arg6: at::Tensor, arg7: at::Tensor) -> None
However, knnquery_cuda() in Seg.'s Pointops supports (arg0: int, arg1: int, arg2: at::Tensor, arg3: at::Tensor, arg4: at::Tensor, arg5: at::Tensor, arg6: at::Tensor, arg7: at::Tensor).
Then I found
setup( name='pointops', ext_modules=[ CUDAExtension('pointops_cuda',
in Cls's Pointops and
setup( name='pointops_cuda', ext_modules=[ CUDAExtension('pointops_cuda',
in Seg's Pointops. They have different Python site-packages names but the same CUDAExtension name, which may cause the problem above during import pointops_cuda.
Because I am not familiar with C++, would you help me solve this problem?
(My workaround for now is to delete the other site-packages entry while using one.)
Thanks again for your great work.
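To confirm which build is being picked up, a small check of where Python would load the extension from can help. This is a sketch using only the standard library; `extension_origin` is a hypothetical helper, and the module name pointops_cuda is taken from the setup() calls quoted above.

```python
import importlib.util
from typing import Optional


def extension_origin(module_name: str = "pointops_cuda") -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Prints the path of the compiled extension, or None if it is not installed.
    print(extension_origin())
```

If the printed path belongs to the classification build while running the segmentation code (or the reverse), the name collision described above is the likely cause of the argument mismatch.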
Hi @hancyran
Thanks for your good work. I have one question. Have you tested RepSurf on other semantic segmentation backbones or SOTA backbones?
Xiaobing Han
My training settings are consistent with the code you provided, with a batch size of 8, but the FLOPs I measured differ significantly from the numbers in your article. Can you share your measurement method?
This is my method:
flops, params = profile(model.module, ([coord, feat, offset],))
print('flops: %.2fG' % (flops / 1e9), 'params: %.2fM' % (params / 1e6))
result :
flops: 96.04G params: 0.98M
Module : repsurf.repsurf_umb_ssg