spatiotemporalsegmentation's Introduction

Spatio-Temporal Segmentation

This repository contains the accompanying code for 4D-SpatioTemporal ConvNets: Minkowski Convolutional Neural Networks, CVPR'19.

Change Log

  • 2020-05-19: Since commit be5c3, the latest Minkowski Engine no longer requires explicit cache clearing and uses memory more efficiently.
  • 2020-05-04: As Thomas Chaton pointed out in Issue #30, the training script contains bugs that prevent models from reaching the target performance listed in the Model Zoo with the latest MinkowskiEngine. I am still debugging, but the bugs have been difficult to locate. In the meantime, I created a separate git repo, SpatioTemporalSegmentation-ScanNet, from another private repo of mine that does reach the target performance. Please use SpatioTemporalSegmentation-ScanNet for ScanNet training. I will update this repo once I find the bugs and merge SpatioTemporalSegmentation-ScanNet into it. Sorry for the trouble.

Requirements

  • Ubuntu 14.04 or higher
  • CUDA 10.1 or higher
  • pytorch 1.3 or higher
  • python 3.6 or higher
  • GCC 6 or higher

Installation

You need to install pytorch and Minkowski Engine either with pip or with anaconda.

Pip

MinkowskiEngine is distributed via PyPI (package name MinkowskiEngine) and can be installed with pip. First, install PyTorch following its installation instructions. Next, install OpenBLAS.

sudo apt install libopenblas-dev

pip install torch torchvision

pip install -U git+https://github.com/StanfordVL/MinkowskiEngine

Next, clone the repository and install the rest of the requirements

git clone https://github.com/chrischoy/SpatioTemporalSegmentation/

cd SpatioTemporalSegmentation

pip install -r requirements.txt

Troubleshooting

Please visit the MinkowskiEngine issue pages if you have difficulties installing Minkowski Engine.

ScanNet Training

  1. Download the ScanNet dataset from the official website. You need to sign the terms of use.

  2. Next, set the paths correctly and then preprocess all raw ScanNet point clouds with the following command.

python -m lib.datasets.preprocessing.scannet
  3. Train the network with
export BATCH_SIZE=N;
./scripts/train_scannet.sh 0 \
	-default \
	"--scannet_path /path/to/preprocessed/scannet"

Modify the BATCH_SIZE accordingly.

The first argument is the GPU id, the second is the path postfix, and the last is a quoted string of miscellaneous arguments.
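To make the three positional arguments concrete, the sketch below assembles the same invocation from Python. The helper name `build_train_command` and the batch size value are illustrative assumptions and not part of the repository; only the script path, the `BATCH_SIZE` variable, and the `--scannet_path` flag come from the command above.

```python
import os
import shlex


def build_train_command(gpu_id, postfix, misc_args, batch_size):
    """Return (env, argv) mirroring the shell invocation shown above.

    Hypothetical helper for illustration; the repository only ships the
    shell script.
    """
    env = dict(os.environ, BATCH_SIZE=str(batch_size))
    argv = [
        "./scripts/train_scannet.sh",
        str(gpu_id),  # 1st argument: GPU id
        postfix,      # 2nd argument: path postfix
        misc_args,    # 3rd argument: all extra flags as ONE string
    ]
    return env, argv


env, argv = build_train_command(
    gpu_id=0,
    postfix="-default",
    misc_args="--scannet_path /path/to/preprocessed/scannet",
    batch_size=8,  # assumed value; pick what fits your GPU
)
print("BATCH_SIZE=" + env["BATCH_SIZE"], " ".join(shlex.quote(a) for a in argv))
```

The pair could then be handed to `subprocess.run(argv, env=env, check=True)` from the repository root.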

mIoU vs. Overall Accuracy

The official evaluation metric for ScanNet is mIoU. Overall Accuracy (OA) is not the official metric because it is not discriminative. This follows the convention of 2D semantic segmentation, where pixelwise overall accuracy does not capture the fidelity of the semantic segmentation. On 3D ScanNet semantic segmentation, an OA of 89.087 corresponds to mIoU 71.496, mAP 76.127, and mAcc 79.660 on the ScanNet v2 validation set.

Then why is overall accuracy the least discriminative metric? Most scenes consist largely of big structures such as walls, floors, or background, and the scores on these dominate the statistics when you use Overall Accuracy.
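A small worked example makes the gap visible. The confusion matrix below is made up for illustration (three classes, 1000 points, rows are ground truth and columns are predictions); it is not taken from ScanNet. The rare class is mostly misclassified, yet OA stays high because the two large classes dominate the count.

```python
def overall_accuracy(conf):
    """Fraction of all points whose prediction matches the ground truth."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / total


def per_class_iou(conf):
    """IoU per class: TP / (TP + FP + FN), rows = ground truth."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(n)) - tp
        ious.append(tp / (tp + fp + fn))
    return ious


# Made-up counts: wall (600 points), floor (350 points), chair (50 points).
conf = [
    [590, 5, 5],    # wall
    [10, 340, 0],   # floor
    [30, 10, 10],   # chair: only 10 of 50 points are correct
]

oa = overall_accuracy(conf)
ious = per_class_iou(conf)
miou = sum(ious) / len(ious)
print(f"OA={oa:.3f} mIoU={miou:.3f} chair IoU={ious[2]:.3f}")
```

Here OA is 0.94 while the chair IoU is 10/55, which pulls the mIoU far below the OA.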

Synthia 4D Experiment

  1. Download the dataset from the download link

  2. Extract

cd /path/to/extract/synthia4d
wget http://cvgl.stanford.edu/data2/Synthia4D.tar
tar -xf Synthia4D.tar
tar -xvjf *.tar.bz2
  3. Training
export BATCH_SIZE=N; \
./scripts/train_synthia4d.sh 0 \
	"-default" \
	"--synthia_path /path/to/extract/synthia4d"

The above script trains a network; change the arguments to match your setup. The first argument to the script is the GPU id. The second argument is the log directory postfix; change it to mark your experimental setup. The final argument is a series of miscellaneous arguments, which is where you specify the Synthia directory. You also have to wrap all of these arguments in " ".

Stanford 3D Dataset

  1. Download the Stanford 3D dataset from the website

  2. Preprocess

Modify the input and output directory accordingly in

lib/datasets/preprocessing/stanford.py

And run

python -m lib.datasets.preprocessing.stanford
  3. Train
./scripts/train_stanford.sh 0 \
	"-default" \
	"--stanford3d_path /PATH/TO/PREPROCESSED/STANFORD"

Model Zoo

Model | Dataset | Voxel Size | Conv1 Kernel Size | Performance | Link
Mink16UNet34C | ScanNet train + val | 2cm | 3 | Test set 73.6% mIoU, no sliding window | download
Mink16UNet34C | ScanNet train | 2cm | 5 | Val 72.219% mIoU, no rotation average, no sliding window (per class performance) | download
Mink16UNet18 | Stanford Area5 train | 5cm | 5 | Area 5 test 65.828% mIoU, no rotation average, no sliding window (per class performance) | download
Mink16UNet34 | Stanford Area5 train | 5cm | 5 | Area 5 test 66.348% mIoU, no rotation average, no sliding window (per class performance) | download
3D Mink16UNet14A | Synthia CVPR19 train | 15cm | 3 | CVPR19 test 81.903% mIoU, no rotation average, no sliding window (per class performance) | download
3D Mink16UNet18 | Synthia CVPR19 train | 15cm | 3 | CVPR19 test 82.762% mIoU, no rotation average, no sliding window (per class performance) | download

Note that sliding-window-style evaluation (cropping and stitching results), used in many related works, effectively works as an ensemble (rotation averaging), which boosts performance.

Demo

The demo code will download weights and an example scene first and then visualize prediction results.

Dataset | ScanNet | Stanford
Command | python -m demo.scannet | python -m demo.stanford

Citing this work

If you use the Minkowski Engine, please cite:

@inproceedings{choy20194d,
  title={4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks},
  author={Choy, Christopher and Gwak, JunYoung and Savarese, Silvio},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3075--3084},
  year={2019}
}

spatiotemporalsegmentation's People

Contributors

chrischoy, faultaddr, fengziyue

spatiotemporalsegmentation's Issues

Could you please provide processed S3DIS file structure

Sorry to bother you. S3DIS is a new dataset to me. I want to feed it into processing.stanford.py, but I wonder whether I need to change the original file structure I downloaded from the Internet. If so, could you please provide the processed file structure, preferably with the script? Thank you!

Stanford data preprocessing

Hello, I highly appreciate you for sharing the code.

It seems that in SpatioTemporalSegmentation/lib/datasets/preprocessing/stanford.py, the below code returns an error.

inds, collabels = ME.utils.sparse_quantize(
    coords,
    feats,
    labels,
    return_index=True,
    ignore_label=255,
    quantization_size=0.01  # 1cm
)
image

My MinkowskiEngine version is 0.5.0 so I changed the code like this :

coords, feats, collabels = ME.utils.sparse_quantize(
    coords,
    feats,
    labels,
    return_index=False,
    ignore_label=255,
    quantization_size=0.01  # 1cm
)
pointcloud = np.concatenate((coords, feats, collabels), axis=1)

Preprocessing now works, but when I run "./scripts/train_stanford.sh", it also returns errors like this:
image

Please let me know if I have some mistakes. Thank you.

Invalid label type

Hi @chrischoy,

I was training on a subset of scannet & I got the following error;

Traceback (most recent call last):
  File "main.py", line 154, in <module>
    main()
  File "main.py", line 147, in main
    train(model, train_data_loader, val_data_loader, config)
  File "/home/x23/workspace_pcs/SpatioTemporalSegmentation/lib/train.py", line 78, in train
    coords, input, target = data_iter.next()
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/x23/workspace_pcs/SpatioTemporalSegmentation/lib/dataset.py", line 266, in __getitem__
    coords, feats, labels, center=center)
  File "/home/x23/workspace_pcs/SpatioTemporalSegmentation/lib/voxelizer.py", line 130, in voxelize
    coords_aug, feats, labels = ME.utils.sparse_quantize(coords_aug, feats, labels=labels, ignore_label=self.ignore_label)
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/MinkowskiEngine-0.3.3-py3.7-linux-x86_64.egg/MinkowskiEngine/utils/quantization.py", line 194, in sparse_quantize
    ignore_label)
  File "/home/x23/miniconda3/envs/mink/lib/python3.7/site-packages/MinkowskiEngine-0.3.3-py3.7-linux-x86_64.egg/MinkowskiEngine/utils/quantization.py", line 82, in quantize_label
    assert labels.dtype == np.int32, f"Invalid label type {labels.dtype} != np.int32"
AssertionError: Invalid label type float32 != np.int32
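The assertion at the bottom of the traceback requires `labels.dtype == np.int32`. A minimal sketch of a cast that satisfies it is shown below; the `labels` array is toy data, and where exactly the cast belongs in the data loader is an assumption that would need checking against the code.

```python
import numpy as np

# Toy labels stored as float32, matching the dtype in the error message.
labels = np.array([0.0, 3.0, 255.0, 7.0], dtype=np.float32)

# Guard against silently truncating non-integer values before casting.
assert np.all(labels == np.round(labels)), "labels contain non-integer values"
labels_int = labels.astype(np.int32)

print(labels_int.dtype)
```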

Out of memory, even only batch size = 4

Hi Chrischoy,
Thank you for your great work. However, when I train on S3DIS with a batch size of only 4, I run out of memory. My GPU: titan v100(12G)
outofmemory

As the figure shows, there appears to be a memory leak around epoch 15 of training. Can you fix that error?
Thank you!

Inconsistency when batch size varies

hi Chris,

I'm performing a classification task using MinkowskiEngine. I train the neural network with batch_size=8.

I set the batch_size to 1 in the test phase, and the result is quite bad: there is a big gap between the validation set and the test set. When I increase the batch_size, the result improves.

Can you help explain this? I know the batch normalization layer might be the reason, but I have never encountered this phenomenon with other frameworks.

command for running test stage

Hi,

May I know which Linux command you use for testing on ScanNet? I did not find any test scripts in the scripts folder.
Thanks.

Any pretrained models for train set only?

Do you have any models trained only on the training set (without the validation set)?
I cannot reproduce your result due to a lack of hardware, but I want to test something on the validation set.
It would be much appreciated if you released a model trained only on the training set.

Thanks,

Training time

Hi,

Thanks for sharing the training code. Does the code work with multiple GPUs? If not, how long does it take to train a model to the reported SOTA performance?

Thanks very much

Error while Loading the ScanNet Dataset?

Following the README.md, I downloaded the _vh_clean_2.ply files using the scripts provided by Scannet.org, then ran python -m lib.datasets.preprocessing.scannet to generate processed .ply files in the "train/test" folders under processed_dir. The scannetv2_train.txt file was missing, so I downloaded it from https://github.com/ScanNet/ScanNet/tree/master/Tasks/Benchmark. However, the entries in it look like scene0191_00, while the files in the processed folder are named like scene0191_00.ply, which causes an error when the data loader loads the data. Could you tell me how to properly prepare the ScanNet dataset for this codebase? Thanks a lot!
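One possible workaround for the mismatch described above is to append the `.ply` extension to each entry of the split file. This is a sketch only: the helper name is hypothetical, and whether the loader expects the extension in the list is an assumption taken from this report rather than from the code.

```python
def add_ply_extension(scene_ids):
    """Turn scene ids such as 'scene0191_00' into 'scene0191_00.ply'."""
    names = []
    for line in scene_ids:
        scene = line.strip()
        if not scene:
            continue  # skip blank lines
        names.append(scene if scene.endswith(".ply") else scene + ".ply")
    return names


# Lines as they would be read from a split file such as scannetv2_train.txt.
print(add_ply_extension(["scene0191_00\n", "scene0000_00.ply\n", "\n"]))
```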

Multi GPU training issue

I encountered this issue after modifying the S3DIS training code for multi-GPU training:

  File "/home/zfeng/.conda/envs/mink/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zfeng/.conda/envs/mink/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 107, in forward
    exponential_average_factor, self.eps)
  File "/home/zfeng/.conda/envs/mink/lib/python3.7/site-packages/torch/nn/functional.py", line 1670, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

Could you please tell me what's wrong with it?

MinkowskiNet42

hi Christopher,

Thank you for your great work and also open source the code!

In your paper you use MinkowskiNet42 to denote the network architecture used for the ScanNet semantic segmentation task. Is this the same as Res16UNet34C? I am trying to reproduce the result, but I am still a long way from 0.73. Can you tell me your batch size?

a problem with scannet about training

Thanks for sharing. When I tried to train on ScanNet, I got the following error:

  File "/home/zou/anaconda3/envs/sparse/lib/python3.8/site-packages/MinkowskiEngine-0.4.3-py3.8-linux-x86_64.egg/MinkowskiEngine/Common.py", line 272, in convert_region_type
    region_offset = torch.IntTensor(region_offset)
TypeError: only integer tensors of a single element can be converted to an index

alternate dataset!!!

@chrischoy @panyunyi97 @fengziyue Thanks for open-sourcing this wonderful work. I have a few questions:
Q1: Have you trained the architecture on other available datasets, such as SemanticKITTI or other 3D datasets?
Q2: If not, can we follow the same training pipeline? If so, can you please share the pre-trained model?
Q3: Can we use the current pre-trained model to test on a custom dataset with lower point cloud density?

Thanks in advance

General Training Questions (train / val curve, batch size, iter size)

Hi Chris,

With the same optimizer and learning rate schedule, I got training curves like this. Compared to your implementation, I simply removed the data augmentation part. Is the chromatic data augmentation that important in this case? By the way, I only have one 1080Ti, so I set the maximum number of points to 0.6M, which means the batch size varies from iteration to iteration. Have you compared with SparseConv in terms of training time on this semantic segmentation task? I played with it once, and I remember it being much faster.

image

Identify new classes

Hey @chrischoy,

Can you suggest a way to identify new classes? I am looking for some form of transfer learning where I can take any of the pre-trained models and build on top of it, since I can only afford a small number of these new classes.

Any help would be much appreciated.

module 'lib' has no attribute 'datasets'

Traceback (most recent call last):
  File "/content/SpatioTemporalSegmentation/main.py", line 26, in <module>
    from lib.datasets import load_dataset
  File "/content/SpatioTemporalSegmentation/lib/datasets/__init__.py", line 1, in <module>
    import lib.datasets.synthia as synthia
AttributeError: module 'lib' has no attribute 'datasets'

Exception has occurred: TypeError only integer tensors of a single element can be converted to an index

Hi @chrischoy ,

Thanks for the great work! When I am trying training semantic segmentation on ScanNet, I encounter the following exception:

Traceback (most recent call last):
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/chronbird/SpatioTemporalSegmentation/main.py", line 156, in <module>
    main()
  File "/home/chronbird/SpatioTemporalSegmentation/main.py", line 149, in main
    train(model, train_data_loader, val_data_loader, config)
  File "/home/chronbird/SpatioTemporalSegmentation/lib/train.py", line 91, in train
    soutput = model(*inputs)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chronbird/SpatioTemporalSegmentation/models/res16unet.py", line 204, in forward
    out_b1p2 = self.block1(out)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chronbird/SpatioTemporalSegmentation/models/modules/resnet_block.py", line 42, in forward
    out = self.conv1(x)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiConvolution.py", line 263, in forward
    self.kernel_generator.get_kernel(input.tensor_stride, self.is_transpose)
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/MinkowskiEngine/Common.py", line 347, in get_kernel
    self.cache[tuple(tensor_stride)] = convert_region_type(
  File "/home/chronbird/anaconda3/envs/pointcloud/lib/python3.8/site-packages/MinkowskiEngine/Common.py", line 272, in convert_region_type
    region_offset = torch.IntTensor(region_offset)
TypeError: only integer tensors of a single element can be converted to an index

Then I check about the input of self.block1, it is
SparseTensor(
  Coords=tensor([[ 1, 74, 8, 150],
        [ 1, 70, 0, 192],
        [ 8, -190, 350, 82],
        ...,
        [ 1, 240, 6, 192],
        [ 5, 380, -82, 136],
        [ 6, 70, -68, 110]], dtype=torch.int32)
  Feats=tensor([[2.4673, 1.2257, 2.8942, ..., 0.0000, 2.3988, 0.0000],
        [0.5576, 0.0000, 0.0000, ..., 0.9546, 0.0000, 0.1357],
        [0.2138, 0.1608, 0.0000, ..., 0.0000, 0.3920, 0.0000],
        ...,
        [0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.4807],
        [0.0000, 0.0366, 0.0000, ..., 0.5666, 0.0000, 1.0899],
        [0.0000, 1.6132, 0.0000, ..., 0.1976, 0.0000, 1.4563]], device='cuda:0', grad_fn=<ReluBackward1>)
  coords_key=< CoordsKey, key: 16908437251554604741, tensor_stride: [2, 2, 2, ] in dimension: 3 >
  tensor_stride=[2, 2, 2]
  coords_man=< CoordsManager
    Number of Coordinate Maps: 2
      Coordinate Map Key: 16908437251554604741, Size: 323293
      Coordinate Map Key: 15034981587763204738, Size: 821840
    Number of Kernel Maps: 2
      Kernel In-Out Map Key: 1762388836632698508, Size: 821840
      Kernel In-Out Map Key: 13453830109827794797, Size: 7072854 >
  spatial dimension=3)

and self.block1
Sequential(
  (0): BasicBlock(
    (conv1): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
    (norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv2): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
    (norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): MinkowskiReLU()
  )
  (1): BasicBlock(
    (conv1): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
    (norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv2): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
    (norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): MinkowskiReLU()
  )
)

I am using Pytorch 1.5.1 and the latest MinkowskiEngine built from source. Could you please help me check what might possibly cause the error? Thanks in advance.

ValueError: not enough values to unpack (expected 4, got 2)

@chrischoy
Got the following error while trying to train on scannet dataset with your latest commit,

Traceback (most recent call last):
  File "main.py", line 156, in <module>
    main()
  File "main.py", line 149, in main
    train(model, train_data_loader, val_data_loader, config)
  File "/home/ubuntu/workspace_pcs/code/SpatioTemporalSegmentation/lib/train.py", line 78, in train
    coords, input, target = data_iter.next()
  File "/home/ubuntu/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/home/ubuntu/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/ubuntu/miniconda3/envs/mink/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ubuntu/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/miniconda3/envs/mink/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/workspace_pcs/code/SpatioTemporalSegmentation/lib/dataset.py", line 265, in __getitem__
    coords, feats, labels, center = self.load_ply(index)
ValueError: not enough values to unpack (expected 4, got 2)

Potential error in training loop

At train.py:100, loss.backward() is called inside the sub_iter loop, but at train.py:68, optimizer.zero_grad() is called outside the loop.

I am not sure whether this is intended. To my knowledge, zero_grad should be called between backward operations, which means that either zero_grad should move inside the sub_iter loop or loss.backward() should be replaced with a batch_loss.backward() outside the loop.
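For context, calling backward several times between one zero_grad and one optimizer step is the usual gradient-accumulation pattern: the gradients of the sub-iterations add up, so the update matches a single backward pass over the summed loss. Whether that is what train.py intends is not confirmed here. The sketch below shows only the arithmetic, with a hand-written gradient for a one-parameter model and no PyTorch; all names and numbers are illustrative.

```python
def grad_squared_error(w, batch):
    """d/dw of sum((w * x - y) ** 2) over the batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)


w = 0.5
sub_batches = [[(1.0, 2.0), (2.0, 3.0)], [(3.0, 5.0)], [(4.0, 9.0), (0.5, 1.0)]]

# Pattern from the report: zero once, then accumulate per sub-iteration.
accumulated = 0.0          # plays the role of optimizer.zero_grad()
for batch in sub_batches:  # plays the role of the sub_iter loop
    accumulated += grad_squared_error(w, batch)  # backward() adds to .grad

# Alternative from the report: one backward pass on the summed loss.
all_points = [p for batch in sub_batches for p in batch]
single_pass = grad_squared_error(w, all_points)

print(accumulated, single_pass)
```

Both values agree up to floating-point rounding, so under this reading the two options differ in memory use rather than in the resulting gradient.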

TS-CRF

I trained ResUNetIN14 with Adam (lr=0.1, weight_decay=1e-4) on the S3DIS fold1 split,
but it only achieves 30% mIoU.
I then tested the trained model and found that points on the same object have multiple labels.
Could the reason be the missing CRF?

lib import issues

Hi,

I have one problem with the lib import issues:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/private/home/zaiweizhang/ssl_scaling/third-party/SpatioTemporalSegmentation/lib/datasets/__init__.py", line 1, in <module>
    import lib.datasets.synthia as synthia
ModuleNotFoundError: No module named 'lib'

It keeps giving me this error, and using python -m main doesn't work either.
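A `ModuleNotFoundError` for `lib` typically means the directory that contains the `lib/` package is not on `sys.path`, for example because Python was started from a different working directory. One workaround is sketched below; `REPO_ROOT` is a placeholder and `ensure_on_path` is a hypothetical helper, and another package named `lib` earlier on the path could still shadow it.

```python
import os
import sys

# Placeholder: replace with the directory that contains main.py and lib/.
REPO_ROOT = "/path/to/SpatioTemporalSegmentation"


def ensure_on_path(root):
    """Put the repository root at the front of sys.path exactly once."""
    root = os.path.abspath(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


repo_root = ensure_on_path(REPO_ROOT)
# After this, `import lib.datasets` can resolve, provided REPO_ROOT is correct.
```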

average_precision for binary classification

Hello,

The function test.py/average_precision uses label_binarize from sklearn.preprocessing, and if you read the documentation you can see that it explicitly says :
Shape will be [n_samples, 1] for binary problems.

So you get this error:

image

Since the output of this function is used to calculate the average precision, the target and predictions must have the same shape (Nx2).

Here is a "quick" solution that I don't like a lot (basically creating a dummy class), so if anyone could give a better one that would be good:

def average_precision(prob_np, target_np):
    num_class = prob_np.shape[1]
    if num_class == 2:
        num_class += 1
    label = label_binarize(target_np, classes=list(range(num_class)))
    with np.errstate(divide='ignore', invalid='ignore'):
        return average_precision_score(label, prob_np, None)

And also, if you think that I am mistaken, please let me know, feedback is always welcomed.
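An alternative to the dummy class is to build the one-hot label matrix directly, so that it has the same Nx2 shape as the probability array in the binary case. The sketch below is plain Python with a hypothetical helper name; it assumes every target is an integer in `range(num_class)` (no ignore labels), and passing its output to `average_precision_score` is an untested assumption.

```python
def one_hot_labels(targets, num_class):
    """Return an N x num_class list of 0/1 rows, also when num_class == 2."""
    rows = []
    for t in targets:
        row = [0] * num_class
        row[t] = 1  # mark the column of the true class
        rows.append(row)
    return rows


print(one_hot_labels([0, 1, 1, 0], num_class=2))
```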

Model Zoo conv1_kernel_size mismatch

Hello,

I noticed that you have uploaded a pretrained Mink16UNet34C model.
However, I don't think this is the correct pretrained model.
If you load the Model Zoo weights with model.load_state_dict(torch.load(file_path)['state_dict']), you can see that conv0p1s1 has a kernel size of 3, while your implementation of MinkUNet34 in the Minkowski Engine has a kernel size of 5.

If you look at indoor.py and get the pretrained model via the link that you've uploaded there, I think you can get the correct pretrained model.

Please let me know if I'm mistaken.
Thanks,

Issues with the Stanford demo

Hello,
I tried to run the Stanford demo, and I got multiple issues:
1- When you run the python -m demo.stanford command for the first time (you don't have the .ply file yet) it should run the download function since the file_name doesn't match with any local file.
image
But then, you get this error:
image
Because the defined file_name in the start of the demo file doesn't match the name of the file on the server-side if you follow the static part of the link (conferenceRoom_1.ply vs 1.ply).
image

image

2- When you change the name in the code to 1.ply, it downloads the file, but from what I understood, it is not the right file.
Because then I got this error:
image

Which is caused by this line:
image

And when you see the structure of the data here:
image

It really has no "label" property, and I think the file is from ScanNet because when you visualize it you get this:
image

Sorry if this is not the proper way to write an issue, it is my first one, and I tried to be the most explicit I could.

[MinkowskiEngine 0.5.4][ME.RegionType.HYBRID vs ME.RegionType.CUSTOM] ME.RegionType.CUSTOM giving AttributeError: 'NoneType' object has no attribute 'numel'

Hi @chrischoy
I am trying to run this code, which was presumably written for MinkowskiEngine 0.4.3, but my version is 0.5.4.

ConvType.SPATIAL_HYPERCUBE_TEMPORAL_HYPERCROSS: ME.RegionType.HYBRID

Which uses ME.RegionType.HYBRID.
I replaced HYPERCUBE and HYPERCROSS to HYPER_CUBE and HYPER_CROSS respectively. Then I replaced ME.RegionType.HYBRID with ME.RegionType.CUSTOM in the above mentioned line.

I am getting the following error

Traceback (most recent call last):
  File "/home/anaconda3/envs/py3-mink/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/anaconda3/envs/py3-mink/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/SandeepMenon/SpatioTemporalSegmentation/main.py", line 156, in <module>
    main()
  File "/home/SandeepMenon/SpatioTemporalSegmentation/main.py", line 117, in main
    model = NetClass(num_in_channel, num_labels, config)
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/res16unet.py", line 339, in __init__
    super(STRes16UNetBase, self).__init__(in_channels, out_channels, config, D, **kwargs)
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/res16unet.py", line 23, in __init__
    super(Res16UNetBase, self).__init__(in_channels, out_channels, config, D)
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/resnet.py", line 25, in __init__
    self.network_initialization(in_channels, out_channels, config, D)
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/res16unet.py", line 58, in network_initialization
    self.block1 = self._make_layer(
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/resnet.py", line 106, in _make_layer
    block(
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/modules/resnet_block.py", line 23, in __init__
    self.conv1 = conv(
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/modules/common.py", line 128, in conv
    kernel_generator = ME.KernelGenerator(
  File "/home/anaconda3/envs/py3-mink/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiKernelGenerator.py", line 303, in __init__
    self.kernel_volume = get_kernel_volume(
  File "/home/anaconda3/envs/py3-mink/lib/python3.8/site-packages/MinkowskiEngine/MinkowskiKernelGenerator.py", line 91, in get_kernel_volume
    region_offset.numel() > 0
AttributeError: 'NoneType' object has no attribute 'numel'


The error occurs while creating an ME.KernelGenerator with the following parameters:

ME.KernelGenerator(
      3, 1, 1, region_type=ME.RegionType.CUSTOM, axis_types=[<RegionType.HYPER_CUBE: 0>, <RegionType.HYPER_CUBE: 0>, <RegionType.HYPER_CUBE: 0>, <RegionType.HYPER_CROSS: 1>], dimension=4)

This is because I am not providing region_offsets. Can you help me understand which offsets to give in this case?
How can ME.RegionType.CUSTOM be used as a replacement here?
Otherwise, how can I install MinkowskiEngine 0.4.3? I don't see an archive.

Thank you

Possible bugs in train.py

Thank you for a very nice project.
When scanning through lib/train.py, I found two possible bugs:

  1. L#78 coords[:, :3] += (torch.rand(3) * 100).type_as(coords) should be coords[:, 1:] += (torch.rand(3) * 100).type_as(coords) since first coords is the batch index
  2. L#83 input[:, 1:] = input[:, 1:] / 255. - 0.5 should be input[:, :3] = input[:, :3] / 255. - 0.5
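Which slice is right depends on which column holds the batch index. The sketch below uses plain lists and a hypothetical `batch_first` flag to show both variants from the report; that the batch index may sit either in the first or in the last column, depending on the library version, is an assumption to check against the installed MinkowskiEngine.

```python
def shift_spatial(coords, offset, batch_first=True):
    """Add `offset` (length 3) to the spatial columns of each 4-column row."""
    shifted = []
    for row in coords:
        if batch_first:
            # Layout [batch, x, y, z]: corresponds to coords[:, 1:]
            batch, xyz = row[0], row[1:]
            shifted.append([batch] + [c + o for c, o in zip(xyz, offset)])
        else:
            # Layout [x, y, z, batch]: corresponds to coords[:, :3]
            xyz, batch = row[:3], row[3]
            shifted.append([c + o for c, o in zip(xyz, offset)] + [batch])
    return shifted


print(shift_spatial([[1, 10, 20, 30]], [100, 200, 300]))
print(shift_spatial([[10, 20, 30, 1]], [100, 200, 300], batch_first=False))
```

In both layouts the batch index stays untouched, which is the property the report is concerned with.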

[bug]Can not save predictions for testset

Hello,

I noticed that "--save_prediction" fails because it cannot import OnlineVoxelizationDatasetBase from lib.dataset, and there is no code for OnlineVoxelizationDatasetBase in dataset.py.

Could you check it?

Thanks

TypeError: 'pybind11_type' object is not iterable

The following line raises a TypeError:

int_to_region_type = {m.value: m for m in ME.RegionType}

  File "/home/SandeepMenon/SpatioTemporalSegmentation/main.py", line 28, in <module>
    from models import load_model, load_wrapper
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/__init__.py", line 1, in <module>
    import models.resunet as resunet
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/resunet.py", line 1, in <module>
    from models.resnet import ResNetBase, get_norm
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/resnet.py", line 6, in <module>
    from models.modules.common import ConvType, NormType, get_norm, conv, sum_pool
  File "/home/SandeepMenon/SpatioTemporalSegmentation/models/modules/common.py", line 61, in <module>
    int_to_region_type = {m.value: m for m in ME.RegionType}
TypeError: 'pybind11_type' object is not iterable

Environment details
Minkowski Engine : 0.5.4
PyTorch : 1.7.1
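Enums generated by pybind11 cannot be iterated like Python `enum.Enum` classes, but they expose a `__members__` mapping from names to values. The sketch below builds the same integer lookup table from `__members__`. `FakeRegionType` and `FakeMember` are stand-ins defined here so the snippet runs without MinkowskiEngine, and applying `int_to_member` to `ME.RegionType` is an untested assumption.

```python
class FakeMember:
    """Stand-in for a pybind11 enum value (hypothetical, for illustration)."""

    def __init__(self, name, value):
        self.name = name
        self._value = value

    def __int__(self):
        return self._value


class FakeRegionType:
    """Non-iterable class exposing __members__, like a pybind11 enum."""

    HYPER_CUBE = FakeMember("HYPER_CUBE", 0)
    HYPER_CROSS = FakeMember("HYPER_CROSS", 1)
    __members__ = {"HYPER_CUBE": HYPER_CUBE, "HYPER_CROSS": HYPER_CROSS}


def int_to_member(enum_cls):
    """Map integer values to members without iterating over the class itself."""
    return {int(member): member for member in enum_cls.__members__.values()}


int_to_region_type = int_to_member(FakeRegionType)
print(sorted(int_to_region_type))
```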

4D semantic segmentation for very large point cloud sequences

Hi @chrischoy

Really enjoyed going through the Synthia Temporal semantic segmentation pipeline.
I had a few doubts as to the data processing part for this.
I see for one particular sequence, all the point clouds are loaded(stacked) and passed to the model at one go. Yes there is a temporal voxelization, but what if you have really long sequences of dense point clouds.
For example, I am dealing with point cloud sequences with a total of approx. 170M points spread over 300 frames. Can this pipeline and model handle that kind of data?

My gpu is 24 GB Tesla m40

Thank you

How do I actually test the pretrained model?

Hi Chris,

This might be a relatively trivial question, but for some reason I am not able to test any custom .ply file (or any file from ScanNet).

If I run

python indoor.py --weights ./Mink16UNet34C_ScanNet.pth --conv1_kernel_size 3

I get the following error:

Traceback (most recent call last):
  File "indoor.py", line 106, in <module>
    'scene0635_00.ply', voxel_size=voxel_size)
  File "indoor.py", line 86, in generate_input_sparse_tensor
    coordinates, features = ME.utils.sparse_collate(coordinates_, featrues_)
  File "/home/x23/miniconda3/envs/sts2/lib/python3.6/site-packages/MinkowskiEngine-0.3.2-py3.6-linux-x86_64.egg/MinkowskiEngine/utils/collation.py", line 137, in sparse_collate
    bcoords[s:s + cn, :D] = coord
RuntimeError: expand(torch.IntTensor{[130652, 3, 3]}, size=[130652, 3]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (3)

To work around the above issue, I added return_index=True to ME.utils.sparse_quantize, and then I get this:

Traceback (most recent call last):
  File "indoor.py", line 109, in <module>
    soutput = model(sinput)
  File "/home/x23/miniconda3/envs/sts2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/x23/workspace_pcs/SpatioTemporalSegmentation/models/res16unet.py", line 197, in forward
    out = self.conv0p1s1(x)
  File "/home/x23/miniconda3/envs/sts2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/x23/miniconda3/envs/sts2/lib/python3.6/site-packages/MinkowskiEngine-0.3.2-py3.6-linux-x86_64.egg/MinkowskiEngine/MinkowskiConvolution.py", line 272, in forward
    out_coords_key, input.coords_man)
  File "/home/x23/miniconda3/envs/sts2/lib/python3.6/site-packages/MinkowskiEngine-0.3.2-py3.6-linux-x86_64.egg/MinkowskiEngine/MinkowskiConvolution.py", line 65, in forward
    f"Type mismatch input: {input_features.type()} != kernel: {kernel.type()}"
AssertionError: Type mismatch input: torch.cuda.DoubleTensor != kernel: torch.cuda.FloatTensor

What could be the possible cause?
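
The final assertion says that the input features are double precision (float64) while the convolution kernel is single precision (float32). A likely cause, stated here as an assumption, is that the features were built from a NumPy array, whose default floating-point type is float64. A minimal sketch of the cast follows; the array contents and the name colors are illustrative only.

```python
import numpy as np

# Illustrative per-point RGB features; dividing integer colors yields float64.
colors = np.array([[255, 0, 0], [0, 128, 255]]) / 255.0
print(colors.dtype)  # float64

# Cast to float32 before converting to a tensor, so the features match
# the float32 weights of the network.
features = colors.astype(np.float32)
print(features.dtype)  # float32
```

If the features are already a torch tensor, calling .float() on them has the same effect.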

number of labels decreased after applying quantize_label

Hi Chris,

I would like to know what happens when the quantize_label function in utils/quantization.py is applied, because I find that the number of labels decreases after applying it. quantize_label calls MEB.quantize_label_np, but that is a compiled binary, so I cannot trace into it.

Thanks in advance.
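
A drop in the number of labels is expected whenever several points fall into the same voxel, since each voxel keeps only one label. The pure-Python sketch below illustrates the idea; it is not the MEB.quantize_label_np implementation, and the rule of assigning an ignore label to voxels with conflicting labels is an assumption about its behavior. The names quantize_labels and IGNORE_LABEL are hypothetical.

```python
import math

IGNORE_LABEL = 255  # hypothetical value for "no reliable label"


def quantize_labels(points, labels, voxel_size):
    """Merge points into voxels and keep one label per voxel.

    A voxel whose points disagree on the label receives IGNORE_LABEL.
    """
    voxel_to_label = {}
    for (x, y, z), label in zip(points, labels):
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        if key not in voxel_to_label:
            voxel_to_label[key] = label
        elif voxel_to_label[key] != label:
            voxel_to_label[key] = IGNORE_LABEL
    return voxel_to_label


points = [(0.01, 0.02, 0.0), (0.03, 0.01, 0.0), (0.04, 0.04, 0.0), (0.26, 0.0, 0.0)]
labels = [1, 1, 2, 3]
voxels = quantize_labels(points, labels, voxel_size=0.05)
print(len(labels), "point labels ->", len(voxels), "voxel labels")
```

Here four labeled points collapse into two voxels, and the voxel containing both label 1 and label 2 ends up with the ignore label.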

Point-level predictions from voxels

Hello Chris,

In the paper you mention that point-level predictions are obtained by propagating them from the voxels.

From what I understood, this requires the --evaluate_original_pointcloud argument, but the "pointcloud" is not implemented.

So I thought maybe I should use the other flag, --test_original_pointcloud, which uses the save_predictions function.

But that function tries to import OnlineVoxelizationDatasetBase, which doesn't exist anywhere in the code (except where it is being called).

My question is: am I doing things the right way, or is there something that I am missing?

Thank you
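
The propagation step mentioned in the paper can be pictured as a lookup: each original point is mapped to the voxel that contains it and inherits that voxel's prediction. The sketch below is an illustration under that assumption, not the repository's missing implementation; voxel_key, propagate_to_points and the default value -1 are hypothetical.

```python
import math


def voxel_key(point, voxel_size):
    """Integer grid coordinates of the voxel containing `point`."""
    return tuple(math.floor(c / voxel_size) for c in point)


def propagate_to_points(points, voxel_predictions, voxel_size, default=-1):
    """Give every point the predicted class of its voxel.

    Points whose voxel has no prediction get `default`.
    """
    return [
        voxel_predictions.get(voxel_key(p, voxel_size), default) for p in points
    ]


# Predictions for two voxels at 5 cm resolution (illustrative values).
voxel_predictions = {(0, 0, 0): 4, (1, 0, 0): 7}
points = [(0.01, 0.02, 0.03), (0.04, 0.0, 0.0), (0.06, 0.01, 0.0), (0.5, 0.5, 0.5)]
point_predictions = propagate_to_points(points, voxel_predictions, voxel_size=0.05)
print(point_predictions)
```

The voxel size used for the lookup has to be the same one used when the point cloud was voxelized for the network, otherwise points land in the wrong cells.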

about ConvType

Could anyone tell me the difference between these ConvTypes,
i.e. ME.RegionType.HYPERCUBE, ...?

If I want to build a Spatio module, which ConvType should I use?

I also don't know why there are two kinds of ConvType in MinkUNet32:

Res16UNet34(
  (conv0p1s1): MinkowskiConvolution(in=6, out=32, region_type=RegionType.HYPERCUBE, kernel_size=[5, 5, 5], stride=[1, 1, 1], dilation=[1, 1, 1])
  (bn0): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (conv1p1s2): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bn1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block1): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (conv2p2s2): MinkowskiConvolution(in=32, out=32, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bn2): MinkowskiBatchNorm(32, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block2): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=32, out=64, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=64, out=64, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=32, out=64, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=64, out=64, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=64, out=64, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (2): BasicBlock(
      (conv1): MinkowskiConvolution(in=64, out=64, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=64, out=64, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (conv3p4s2): MinkowskiConvolution(in=64, out=64, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bn3): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block3): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=64, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=64, out=128, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (2): BasicBlock(
      (conv1): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (3): BasicBlock(
      (conv1): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (conv4p8s2): MinkowskiConvolution(in=128, out=128, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bn4): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block4): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=128, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=128, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (2): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (3): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (4): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
    (5): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (convtr4p16s2): MinkowskiConvolutionTranspose(in=256, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bntr4): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block5): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=384, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=384, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (convtr5p8s2): MinkowskiConvolutionTranspose(in=256, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bntr5): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block6): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=320, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=320, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (convtr6p4s2): MinkowskiConvolutionTranspose(in=256, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bntr6): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block7): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=288, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=288, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (convtr7p2s2): MinkowskiConvolutionTranspose(in=256, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
  (bntr7): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
  (block8): Sequential(
    (0): BasicBlock(
      (conv1): MinkowskiConvolution(in=288, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
      (downsample): Sequential(
        (0): MinkowskiConvolution(in=288, out=256, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
        (1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.02, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): MinkowskiConvolution(in=256, out=256, region_type=RegionType.HYBRID, kernel_volume=27, stride=[1, 1, 1], dilation=[1, 1, 1])
      (norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): MinkowskiReLU()
    )
  )
  (final): MinkowskiConvolution(in=256, out=13, region_type=RegionType.HYPERCUBE, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
  (relu): MinkowskiReLU()
)
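
As described in the paper, a hypercube kernel covers every offset in a k-wide cube, while a hypercross kernel covers only the center and the offsets along each axis; a hybrid kernel mixes the two per axis. The sketch below enumerates the offsets for an odd kernel size. The function names are hypothetical, and treating the hybrid kernel as a spatial cube plus temporal cross offsets is my reading of the paper, not a statement about Minkowski Engine internals.

```python
from itertools import product


def hypercube_offsets(kernel_size, dims):
    """All offsets in a kernel_size^dims cube centered at the origin."""
    r = kernel_size // 2
    return list(product(range(-r, r + 1), repeat=dims))


def hypercross_offsets(kernel_size, dims):
    """The origin plus offsets along each single axis (cross shape)."""
    r = kernel_size // 2
    offsets = [(0,) * dims]
    for axis in range(dims):
        for step in range(-r, r + 1):
            if step != 0:
                offsets.append(tuple(step if a == axis else 0 for a in range(dims)))
    return offsets


def hybrid_offsets(kernel_size, spatial_dims):
    """Cube over the spatial axes, cross along one extra (temporal) axis."""
    r = kernel_size // 2
    offsets = [o + (0,) for o in hypercube_offsets(kernel_size, spatial_dims)]
    offsets += [(0,) * spatial_dims + (t,) for t in range(-r, r + 1) if t != 0]
    return offsets


print(len(hypercube_offsets(3, 3)))   # 27
print(len(hypercross_offsets(3, 3)))  # 7
print(len(hybrid_offsets(3, 3)))      # 29
```

The count of 27 for a size-3 cube in three dimensions agrees with the kernel_volume=27 reported for the block convolutions in the printout.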

CUDA out of memory in training ScanNet

I am trying to train ScanNet scene segmentation but run into a CUDA out of memory error. My environment:

  • Ubuntu 18.04
  • Python 3.7
  • PyTorch 1.4
  • CUDA toolkit 10.1
  • Tesla K80 with 12 GB of memory per GPU (also tried on a GeForce RTX 2070 with 8 GB)

The training was started with:
export BATCH_SIZE=8; ./scripts/train_scannet.sh 2 -default "--scannet_path ./data/scannet/train"
(I've also tried BATCH_SIZE=32 and BATCH_SIZE=16; both failed)

Here is the error dump:

...

microway-gpu-ubuntu 03/11 17:50:03 ===> Start training
/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:224: UserWarning: To get the last learning rate computed by the scheduler, please use 'get_last_lr()'.
  warnings.warn("To get the last learning rate computed by the scheduler, "
microway-gpu-ubuntu 03/11 17:50:32 ===> Epoch[1](1/151): Loss 3.1041    LR: 1.000e-01   Score 4.961     Data time: 5.2196, Total iter time: 29.8803
Traceback (most recent call last):
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/yang/projects/SpatioTemporalSegmentation/main.py", line 156, in <module>
    main()
  File "/home/yang/projects/SpatioTemporalSegmentation/main.py", line 149, in main
    train(model, train_data_loader, val_data_loader, config)
  File "/home/yang/projects/SpatioTemporalSegmentation/lib/train.py", line 91, in train
    soutput = model(*inputs)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yang/projects/SpatioTemporalSegmentation/models/res16unet.py", line 252, in forward
    out = self.block8(out)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yang/projects/SpatioTemporalSegmentation/models/modules/resnet_block.py", line 47, in forward
    out = self.norm2(out)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/MinkowskiEngine/MinkowskiNormalization.py", line 58, in forward
    output = self.bn(input.F)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 107, in forward
    exponential_average_factor, self.eps)
  File "/home/yang/.conda/envs/st-segmentation/lib/python3.7/site-packages/torch/nn/functional.py", line 1670, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 364.00 MiB (GPU 0; 11.17 GiB total capacity; 9.93 GiB already allocated; 247.81 MiB free; 10.26 GiB reserved in total by PyTorch)

Any help is appreciated.

-- Yang

ScanNet update?

Hi @chrischoy! I was wondering if you managed to find the bug in the ScanNet training? I am also trying to figure out what's wrong, but so far no luck...
