/home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/distributed/launch.py:163: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(
The module torch.distributed.launch is deprecated and going to be removed in future.Migrate to torch.distributed.run
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
WARNING:torch.distributed.run:--use_env is deprecated and will be removed in future releases.
Please read local_rank from `os.environ('LOCAL_RANK')` instead.
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
entrypoint : tools/train.py
min_nodes : 1
max_nodes : 1
nproc_per_node : 2
run_id : none
rdzv_backend : static
rdzv_endpoint : 127.0.0.1:29500
rdzv_configs : {'rank': 0, 'timeout': 900}
max_restarts : 3
monitor_interval : 5
log_dir : None
metrics_cfg : {}
INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_poy3g6dh/none_du2bhj2l
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python3
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/distributed/elastic/utils/store.py:52: FutureWarning: This is an experimental API and will be changed in future.
warnings.warn(
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=0
master_addr=127.0.0.1
master_port=29500
group_rank=0
group_world_size=1
local_ranks=[0, 1]
role_ranks=[0, 1]
global_ranks=[0, 1]
role_world_sizes=[2, 2]
global_world_sizes=[2, 2]
INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_poy3g6dh/none_du2bhj2l/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_poy3g6dh/none_du2bhj2l/attempt_0/1/error.json
fatal: not a git repository (or any parent up to mount point /opt/data)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2022-12-23 16:16:27,421 - mmdet - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.8.13 (default, Oct 21 2022, 23:50:54) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda-11.2
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
- CuDNN 8.0.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.10.0+cu111
OpenCV: 4.6.0
MMCV: 1.4.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.14.0
MMSegmentation: 0.14.1
MMDetection3D: 0.15.0+
------------------------------------------------------------
2022-12-23 16:16:28,108 - mmdet - INFO - Distributed training: True
2022-12-23 16:16:28,735 - mmdet - INFO - Config:
voxel_size = 0.01
model = dict(
type='SingleStageSparse3DDetector',
voxel_size=0.01,
backbone=dict(type='MEResNet3D', in_channels=3, depth=34),
neck_with_head=dict(
type='Fcaf3DNeckWithHead',
in_channels=(64, 128, 256, 512),
out_channels=128,
pts_threshold=100000,
n_classes=18,
n_reg_outs=6,
voxel_size=0.01,
assigner=dict(type='Fcaf3DAssigner', limit=27, topk=18, n_scales=4),
loss_bbox=dict(type='IoU3DLoss', loss_weight=1.0, with_yaw=False)),
train_cfg=dict(),
test_cfg=dict(nms_pre=1000, iou_thr=0.5, score_thr=0.01))
optimizer = dict(type='AdamW', lr=0.001, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=10, norm_type=2))
lr_config = dict(policy='step', warmup=None, step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
custom_hooks = [dict(type='EmptyCacheHook', after_iter=True)]
checkpoint_config = dict(interval=1, max_keep_ckpts=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/fcaf3d_scannet-3d-18class'
load_from = None
resume_from = None
workflow = [('train', 1)]
n_points = 100000
dataset_type = 'ScanNetDataset'
data_root = '../all_data/scannet_v2/'
class_names = ('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin')
train_pipeline = [
dict(
type='LoadPointsFromFile',
coord_type='DEPTH',
shift_height=False,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]),
dict(type='LoadAnnotations3D'),
dict(type='GlobalAlignment', rotation_axis=2),
dict(type='IndoorPointSample', num_points=100000),
dict(
type='RandomFlip3D',
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(
type='GlobalRotScaleTrans',
rot_range=[-0.087266, 0.087266],
scale_ratio_range=[0.9, 1.1],
translation_std=[0.1, 0.1, 0.1],
shift_height=False),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin')),
dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type='LoadPointsFromFile',
coord_type='DEPTH',
shift_height=False,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]),
dict(type='GlobalAlignment', rotation_axis=2),
dict(
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
translation_std=[0, 0, 0]),
dict(
type='RandomFlip3D',
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(type='IndoorPointSample', num_points=100000),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table',
'door', 'window', 'bookshelf', 'picture',
'counter', 'desk', 'curtain', 'refrigerator',
'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin'),
with_label=False),
dict(type='Collect3D', keys=['points'])
])
]
data = dict(
samples_per_gpu=8,
workers_per_gpu=4,
train=dict(
type='RepeatDataset',
times=10,
dataset=dict(
type='ScanNetDataset',
data_root='../all_data/scannet_v2/',
ann_file='../all_data/scannet_v2/scannet_infos_train.pkl',
pipeline=[
dict(
type='LoadPointsFromFile',
coord_type='DEPTH',
shift_height=False,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]),
dict(type='LoadAnnotations3D'),
dict(type='GlobalAlignment', rotation_axis=2),
dict(type='IndoorPointSample', num_points=100000),
dict(
type='RandomFlip3D',
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(
type='GlobalRotScaleTrans',
rot_range=[-0.087266, 0.087266],
scale_ratio_range=[0.9, 1.1],
translation_std=[0.1, 0.1, 0.1],
shift_height=False),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa', 'table',
'door', 'window', 'bookshelf', 'picture',
'counter', 'desk', 'curtain', 'refrigerator',
'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin')),
dict(
type='Collect3D',
keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
],
filter_empty_gt=True,
classes=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door',
'window', 'bookshelf', 'picture', 'counter', 'desk',
'curtain', 'refrigerator', 'showercurtrain', 'toilet',
'sink', 'bathtub', 'garbagebin'),
box_type_3d='Depth')),
val=dict(
type='ScanNetDataset',
data_root='../all_data/scannet_v2/',
ann_file='../all_data/scannet_v2/scannet_infos_val.pkl',
pipeline=[
dict(
type='LoadPointsFromFile',
coord_type='DEPTH',
shift_height=False,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]),
dict(type='GlobalAlignment', rotation_axis=2),
dict(
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
translation_std=[0, 0, 0]),
dict(
type='RandomFlip3D',
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(type='IndoorPointSample', num_points=100000),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa',
'table', 'door', 'window', 'bookshelf',
'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain',
'toilet', 'sink', 'bathtub',
'garbagebin'),
with_label=False),
dict(type='Collect3D', keys=['points'])
])
],
classes=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin'),
test_mode=True,
box_type_3d='Depth'),
test=dict(
type='ScanNetDataset',
data_root='../all_data/scannet_v2/',
ann_file='../all_data/scannet_v2/scannet_infos_val.pkl',
pipeline=[
dict(
type='LoadPointsFromFile',
coord_type='DEPTH',
shift_height=False,
load_dim=6,
use_dim=[0, 1, 2, 3, 4, 5]),
dict(type='GlobalAlignment', rotation_axis=2),
dict(
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1.0, 1.0],
translation_std=[0, 0, 0]),
dict(
type='RandomFlip3D',
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(type='IndoorPointSample', num_points=100000),
dict(
type='DefaultFormatBundle3D',
class_names=('cabinet', 'bed', 'chair', 'sofa',
'table', 'door', 'window', 'bookshelf',
'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain',
'toilet', 'sink', 'bathtub',
'garbagebin'),
with_label=False),
dict(type='Collect3D', keys=['points'])
])
],
classes=('cabinet', 'bed', 'chair', 'sofa', 'table', 'door', 'window',
'bookshelf', 'picture', 'counter', 'desk', 'curtain',
'refrigerator', 'showercurtrain', 'toilet', 'sink', 'bathtub',
'garbagebin'),
test_mode=True,
box_type_3d='Depth'))
gpu_ids = range(0, 2)
2022-12-23 16:16:28,735 - mmdet - INFO - Set random seed to 0, deterministic: False
2022-12-23 16:16:29,843 - mmdet - INFO - Model:
SingleStageSparse3DDetector(
(backbone): MEResNet3D(
(conv1): Sequential(
(0): MinkowskiConvolution(in=3, out=64, kernel_size=[3, 3, 3], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiInstanceNorm(nchannels=64)
(2): MinkowskiReLU()
(3): MinkowskiMaxPooling(kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
)
(layer1): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[2, 2, 2], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=64, out=64, kernel_size=[1, 1, 1], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(2): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(layer2): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=64, out=128, kernel_size=[3, 3, 3], stride=[2, 2, 2], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=64, out=128, kernel_size=[1, 1, 1], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(2): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(3): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(layer3): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=128, out=256, kernel_size=[3, 3, 3], stride=[2, 2, 2], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=128, out=256, kernel_size=[1, 1, 1], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(2): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(3): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(4): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(5): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
(layer4): Sequential(
(0): BasicBlock(
(conv1): MinkowskiConvolution(in=256, out=512, kernel_size=[3, 3, 3], stride=[2, 2, 2], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=512, out=512, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
(downsample): Sequential(
(0): MinkowskiConvolution(in=256, out=512, kernel_size=[1, 1, 1], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): MinkowskiConvolution(in=512, out=512, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=512, out=512, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
(2): BasicBlock(
(conv1): MinkowskiConvolution(in=512, out=512, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm1): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): MinkowskiConvolution(in=512, out=512, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(norm2): MinkowskiBatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): MinkowskiReLU()
)
)
)
(neck_with_head): Fcaf3DNeckWithHead(
(loss_centerness): CrossEntropyLoss()
(loss_bbox): IoU3DLoss()
(loss_cls): FocalLoss()
(pruning): MinkowskiPruning()
(out_block_0): Sequential(
(0): MinkowskiConvolution(in=64, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
)
(up_block_1): Sequential(
(0): MinkowskiGenerativeConvolutionTranspose(in=128, out=64, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
(3): MinkowskiConvolution(in=64, out=64, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(4): MinkowskiBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): MinkowskiELU()
)
(out_block_1): Sequential(
(0): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
)
(up_block_2): Sequential(
(0): MinkowskiGenerativeConvolutionTranspose(in=256, out=128, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
(3): MinkowskiConvolution(in=128, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(4): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): MinkowskiELU()
)
(out_block_2): Sequential(
(0): MinkowskiConvolution(in=256, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
)
(up_block_3): Sequential(
(0): MinkowskiGenerativeConvolutionTranspose(in=512, out=256, kernel_size=[2, 2, 2], stride=[2, 2, 2], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
(3): MinkowskiConvolution(in=256, out=256, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(4): MinkowskiBatchNorm(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): MinkowskiELU()
)
(out_block_3): Sequential(
(0): MinkowskiConvolution(in=512, out=128, kernel_size=[3, 3, 3], stride=[1, 1, 1], dilation=[1, 1, 1])
(1): MinkowskiBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): MinkowskiELU()
)
(centerness_conv): MinkowskiConvolution(in=128, out=1, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(reg_conv): MinkowskiConvolution(in=128, out=6, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(cls_conv): MinkowskiConvolution(in=128, out=18, kernel_size=[1, 1, 1], stride=[1, 1, 1], dilation=[1, 1, 1])
(scales): ModuleList(
(0): Scale()
(1): Scale()
(2): Scale()
(3): Scale()
)
)
)
2022-12-23 16:16:42,207 - mmdet - INFO - Start running, host: root@zy20221217, work_dir: /opt/data/private/indoor/work_dirs/fcaf3d_scannet-3d-18class
2022-12-23 16:16:42,208 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(NORMAL ) DistEvalHook
(VERY_LOW ) TextLoggerHook
--------------------
before_train_epoch:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) DistSamplerSeedHook
(NORMAL ) DistEvalHook
(NORMAL ) EmptyCacheHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
before_train_iter:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) DistEvalHook
(LOW ) IterTimerHook
--------------------
after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) DistEvalHook
(NORMAL ) EmptyCacheHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) DistEvalHook
(NORMAL ) EmptyCacheHook
(VERY_LOW ) TextLoggerHook
--------------------
before_val_epoch:
(NORMAL ) DistSamplerSeedHook
(NORMAL ) EmptyCacheHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
--------------------
before_val_iter:
(LOW ) IterTimerHook
--------------------
after_val_iter:
(NORMAL ) EmptyCacheHook
(LOW ) IterTimerHook
--------------------
after_val_epoch:
(NORMAL ) EmptyCacheHook
(VERY_LOW ) TextLoggerHook
--------------------
after_run:
(VERY_LOW ) TextLoggerHook
--------------------
2022-12-23 16:16:42,209 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2022-12-23 16:16:42,209 - mmdet - INFO - Checkpoints will be saved to /opt/data/private/indoor/work_dirs/fcaf3d_scannet-3d-18class by HardDiskBackend.
Traceback (most recent call last):
File "tools/train.py", line 223, in <module>
main()
File "tools/train.py", line 212, in main
train_model(
File "/opt/data/private/indoor/mmdet3d/apis/train.py", line 27, in train_model
train_detector(
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/mmdet/apis/train.py", line 170, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 51, in train
self.call_hook('after_train_iter')
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
getattr(hook, fn_name)(self)
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/mmcv/runner/hooks/optimizer.py", line 35, in after_train_iter
runner.outputs['loss'].backward()
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/_tensor.py", line 255, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/autograd/__init__.py", line 147, in backward
Variable._execution_engine.run_backward(
RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f1a89cdba22 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x10aa3 (0x7f1a89f3caa3 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7f1a89f3e147 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7f1a89cc55a4 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #4: <unknown function> + 0xe568e9 (0x7f1a8afb08e9 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x2f45656 (0x7f1a8d09f656 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x355bcc2 (0x7f1a8d6b5cc2 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #7: torch::autograd::deleteNode(torch::autograd::Node*) + 0x7f (0x7f1a8d6b5d6f in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x3545ad8 (0x7f1a8d69fad8 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #9: c10::TensorImpl::release_resources() + 0x20 (0x7f1a89cc5570 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #10: <unknown function> + 0xa2822a (0x7f1b2eae322a in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #11: <unknown function> + 0xa282c1 (0x7f1b2eae32c1 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #12: /home/miniconda3/envs/indoor/bin/python3() [0x4f0e66]
frame #13: /home/miniconda3/envs/indoor/bin/python3() [0x4c179f]
frame #14: /home/miniconda3/envs/indoor/bin/python3() [0x4c15f3]
frame #15: /home/miniconda3/envs/indoor/bin/python3() [0x4c16da]
frame #16: /home/miniconda3/envs/indoor/bin/python3() [0x4f0d27]
frame #17: /home/miniconda3/envs/indoor/bin/python3() [0x4d07e8]
frame #18: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c38]
frame #19: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c4b]
frame #20: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c4b]
frame #21: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c4b]
frame #22: /home/miniconda3/envs/indoor/bin/python3() [0x4b4c97]
frame #23: PyDict_SetItemString + 0x99 (0x4bc2c9 in /home/miniconda3/envs/indoor/bin/python3)
frame #24: PyImport_Cleanup + 0x93 (0x58c5e3 in /home/miniconda3/envs/indoor/bin/python3)
frame #25: Py_FinalizeEx + 0x71 (0x5881e1 in /home/miniconda3/envs/indoor/bin/python3)
frame #26: Py_RunMain + 0x1b6 (0x57f406 in /home/miniconda3/envs/indoor/bin/python3)
frame #27: Py_BytesMain + 0x39 (0x55cbe9 in /home/miniconda3/envs/indoor/bin/python3)
frame #28: __libc_start_main + 0xe7 (0x7f1b362ccc87 in /lib/x86_64-linux-gnu/libc.so.6)
frame #29: /home/miniconda3/envs/indoor/bin/python3() [0x55ca9e]
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f38aa109a22 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x10aa3 (0x7f38aa36aaa3 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7f38aa36c147 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7f38aa0f35a4 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #4: <unknown function> + 0xe568e9 (0x7f38ab3de8e9 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x2f45656 (0x7f38ad4cd656 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0x355bcc2 (0x7f38adae3cc2 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #7: torch::autograd::deleteNode(torch::autograd::Node*) + 0x7f (0x7f38adae3d6f in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x3545ad8 (0x7f38adacdad8 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #9: c10::TensorImpl::release_resources() + 0x20 (0x7f38aa0f3570 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #10: <unknown function> + 0xa2822a (0x7f394ef1122a in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #11: <unknown function> + 0xa282c1 (0x7f394ef112c1 in /home/miniconda3/envs/indoor/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #12: /home/miniconda3/envs/indoor/bin/python3() [0x4f0e66]
frame #13: /home/miniconda3/envs/indoor/bin/python3() [0x4c179f]
frame #14: /home/miniconda3/envs/indoor/bin/python3() [0x4c15f3]
frame #15: /home/miniconda3/envs/indoor/bin/python3() [0x4c16da]
frame #16: /home/miniconda3/envs/indoor/bin/python3() [0x4f0d27]
frame #17: /home/miniconda3/envs/indoor/bin/python3() [0x4d07e8]
frame #18: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c38]
frame #19: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c4b]
frame #20: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c4b]
frame #21: /home/miniconda3/envs/indoor/bin/python3() [0x4e3c4b]
frame #22: /home/miniconda3/envs/indoor/bin/python3() [0x4b4c97]
frame #23: PyDict_SetItemString + 0x99 (0x4bc2c9 in /home/miniconda3/envs/indoor/bin/python3)
frame #24: PyImport_Cleanup + 0x93 (0x58c5e3 in /home/miniconda3/envs/indoor/bin/python3)
frame #25: Py_FinalizeEx + 0x71 (0x5881e1 in /home/miniconda3/envs/indoor/bin/python3)
frame #26: Py_RunMain + 0x1b6 (0x57f406 in /home/miniconda3/envs/indoor/bin/python3)
frame #27: Py_BytesMain + 0x39 (0x55cbe9 in /home/miniconda3/envs/indoor/bin/python3)
frame #28: __libc_start_main + 0xe7 (0x7f39566fac87 in /lib/x86_64-linux-gnu/libc.so.6)
frame #29: /home/miniconda3/envs/indoor/bin/python3() [0x55ca9e]
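Hello, can you help me solve this problem? Training crashes in `runner.outputs['loss'].backward()` with `RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered`.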