Code Monkey home page Code Monkey logo

Comments (5)

thangvubk avatar thangvubk commented on May 24, 2024

It seems the dataloader cannot load the data. Could you please check whether your data is in correct path. The train, val and test should be in SoftGroup/dataset/scannetv2/

from softgroup.

Yustarzzz avatar Yustarzzz commented on May 24, 2024

Hi thangvubk !
Thanks for your help. However, they are in the correct path. . .

from softgroup.

thangvubk avatar thangvubk commented on May 24, 2024

The problem is your train data has only 2 scans. The default batch size is 4 and drop_last=True, it will ignore your data. See below.

self.train_data_loader = DataLoader(train_set, batch_size=self.batch_size, collate_fn=self.trainMerge, num_workers=self.train_workers,
shuffle=True, sampler=None, drop_last=True, pin_memory=True)

The solution is (1) set drop_last to False, or (2) reduce batch_size to 2.

from softgroup.

Yustarzzz avatar Yustarzzz commented on May 24, 2024

Thanks thangvubk!
I think maybe it works. I uploaded more 2 scenes for train dataset, so there is 4 scenes.

However, there is an other error occur like below ...
Do you know about this?


/content/drive/MyDrive/AIA/softgroup/SoftGroup/util/config.py:22: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
config = yaml.load(f)
[2022-03-22 12:37:56,288 INFO log.py line 39 7052] ************************ Start Logging ************************
[2022-03-22 12:37:58,464 INFO train.py line 22 7052] Namespace(TEST_NMS_THRESH=0.3, TEST_NPOINT_THRESH=100, TEST_SCORE_THRESH=-1, batch_size=4, bg_thresh=0.0, block_reps=2, block_residual=True, class_numpoint_mean=[-1.0, -1.0, 3917.0, 12056.0, 2303.0, 8331.0, 3948.0, 3166.0, 5629.0, 11719.0, 1003.0, 3317.0, 4912.0, 10221.0, 3889.0, 4136.0, 2120.0, 945.0, 3967.0, 2589.0], classes=18, cluster_shift_meanActive=300, config='config/softgroup_default_scannet.yaml', data_root='dataset', dataset='scannetv2', dataset_dir='data/scannetv2_inst.py', dist=False, epochs=500, eval=True, exp_path='exp/scannetv2/softgroup/softgroup_default_scannet', fg_thresh=1.0, filename_suffix='_inst_nostuff.pth', fix_module=['input_conv', 'unet', 'output_layer', 'semantic_linear', 'offset_linear'], full_scale=[128, 512], ignore_label=-100, input_channel=3, iou_thr=0.5, local_rank=0, loss_weight=[1.0, 1.0, 1.0, 1.0, 1.0], lr=0.001, manual_seed=123, max_npoint=250000, max_proposal_num=200, mode=4, model_dir='model/softgroup/softgroup.py', model_name='softgroup', momentum=0.9, multiplier=0.5, optim='Adam', point_aggr_radius=0.04, prepare_epochs=-1, pretrain=None, pretrain_module=['input_conv', 'unet', 'output_layer', 'semantic_linear', 'offset_linear', 'intra_ins_unet', 'intra_ins_outputlayer'], pretrain_path='hais_ckpt.pth', save_dir='exp', save_freq=16, save_instance=True, save_pt_offsets=False, save_semantic=False, scale=50, score_fullscale=20, score_mode=4, score_scale=50, score_thr=0.2, semantic_classes=20, semantic_only=False, split='val', step_epoch=200, task='train', test_epoch=500, test_mask_score_thre=-0.5, test_seed=567, test_workers=16, train_workers=4, use_coords=True, using_NMS=False, weight_decay=0.0001, width=32)
[2022-03-22 12:37:58,473 INFO train.py line 153 7052] => creating model ...
Load pretrained input_conv: 1/1
Load pretrained unet: 390/390
Load pretrained output_layer: 5/5
Load pretrained semantic_linear: 9/9
Load pretrained offset_linear: 9/9
Load pretrained intra_ins_unet: 85/85
Load pretrained intra_ins_outputlayer: 5/5
[2022-03-22 12:38:05,721 INFO train.py line 164 7052] cuda available: True
[2022-03-22 12:38:05,788 INFO train.py line 168 7052] #classifier parameters: 30839600
[2022-03-22 12:38:06,724 INFO scannetv2_inst.py line 50 7052] Training samples: 4
[2022-03-22 12:38:06,792 INFO scannetv2_inst.py line 84 7052] Validation samples: 1
Traceback (most recent call last):
File "train.py", line 221, in
train_epoch(dataset.train_data_loader, model, model_fn, optimizer, epoch)
File "train.py", line 61, in train_epoch
loss, , visual_dict, meter_dict = model_fn(batch, model, epoch, semantic_only=cfg.semantic_only)
File "/content/drive/MyDrive/AIA/softgroup/SoftGroup/model/softgroup/softgroup.py", line 527, in model_fn
ret = model(input
, p2v_map, coords_float, coords[:, 0].int(), batch_offsets, epoch, 'train', semantic_only=semantic_only)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/content/drive/MyDrive/AIA/softgroup/SoftGroup/model/softgroup/softgroup.py", line 316, in forward
output = self.input_conv(input)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/spconv/modules.py", line 123, in forward
input = module(input)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/spconv/conv.py", line 151, in forward
self.stride, self.padding, self.dilation, self.output_padding, self.subm, self.transposed, grid=input.grid)
File "/usr/local/lib/python3.7/dist-packages/spconv/ops.py", line 89, in get_indice_pairs
stride, padding, dilation, out_padding, int(subm), int(transpose))
RuntimeError: /content/drive/MyDrive/AIA/softgroup/SoftGroup/lib/spconv/src/spconv/indice.cu 120
cuda execution failed with error 98

from softgroup.

thangvubk avatar thangvubk commented on May 24, 2024

What is the GPU model are you using. It is related to spconv. I found a related issue here open-mmlab/OpenPCDet#442

from softgroup.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.