
lsknet's Issues

[Feature] COCO-based pre-trained model?

What's the feature?

In the RTMDet paper, it is mentioned that a COCO pre-trained model gives better results than an ImageNet pre-trained one, which I have also verified in practice. Is the same true for LSKNet?
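For concreteness, the usual way to try this in an mmrotate-style config is to initialise the whole detector from a COCO-pretrained checkpoint via load_from, instead of (or on top of) the ImageNet backbone init_cfg. A minimal sketch, assuming a hypothetical checkpoint path (checkpoints/coco_pretrained.pth is a placeholder, not a released LSKNet file):

# Hypothetical config fragment: initialise the full detector from a
# COCO-pretrained checkpoint instead of an ImageNet-pretrained backbone.
# 'checkpoints/coco_pretrained.pth' is a placeholder path, not a release.
_base_ = ['./lsk_t_fpn_1x_dota_le90.py']

model = dict(
    backbone=dict(init_cfg=None))  # drop the ImageNet backbone init

# `load_from` loads full-detector weights before training starts.
load_from = 'checkpoints/coco_pretrained.pth'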

Any other context?

No response

Gradient explosion problem on a single RTX 2080 Ti with lr0001 and bs 1

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]
CUDA available: True
GPU 0,1: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.8
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.1
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2023.0-Product Build 20221128 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.1
OpenCV: 4.7.0
MMCV: 1.7.1
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMRotate: 0.3.4+7755aa5

Reproduces the problem - code sample

I only changed 'workers_per_gpu' in the model config file "lsk_t_fpn_1x_dota_le90.py":

data = dict(
    samples_per_gpu=1,
    workers_per_gpu=1,  # originally 2
    train=dict(pipeline=train_pipeline, version=angle_version),
    val=dict(version=angle_version),
    test=dict(version=angle_version))

Reproduces the problem - command or script

python tools/train.py configs/lsknet/lsk_t_fpn_1x_dota_le90.py

Reproduces the problem - error message

2023-07-27 17:07:28,876 - mmrotate - INFO - Epoch [4][1350/12799] lr: 2.000e-04, eta: 7:29:24, time: 0.236, data_time: 0.003, memory: 7814, loss_rpn_cls: 0.0550, loss_rpn_bbox: 0.1282, loss_cls: 0.2275, acc: 91.9297, loss_bbox: 0.2049, loss: 0.6157, grad_norm: 6.9244
/home/dl-1/anaconda3/envs/mmrotate/lib/python3.8/site-packages/mmcv/runner/hooks/optimizer.py:59: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
return clip_grad.clip_grad_norm_(params, **self.grad_clip)
2023-07-27 17:07:40,816 - mmrotate - INFO - Epoch [4][1400/12799] lr: 2.000e-04, eta: 7:29:12, time: 0.239, data_time: 0.003, memory: 7814, loss_rpn_cls: 0.1545, loss_rpn_bbox: 0.1710, loss_cls: nan, acc: 74.9375, loss_bbox: nan, loss: nan, grad_norm: nan

Additional information

Hi author 'zcablii',
I trained the model 'lsk_t_fpn_1x_dota_le90.py' on the DOTA dataset.
I noticed that the bottom of the model config says the learning rate should be adapted to the number of GPUs, but I did not change the learning rate.
Eventually, training ran into a gradient explosion.
Should I adjust the learning rate?
Thank you~
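For reference, config notes like this usually refer to the linear scaling rule: scale the learning rate with the effective batch size. A minimal sketch, assuming (this is an assumption, not a value stated in this thread) that the base lr of 2e-4 was tuned for 8 GPUs x 2 samples_per_gpu, i.e. batch size 16:

# Hedged sketch of the linear LR scaling rule. BASE_BATCH = 16 is an
# assumption (e.g. 8 GPUs x 2 samples_per_gpu); check the config comment.
BASE_LR = 2e-4        # lr in lsk_t_fpn_1x_dota_le90.py
BASE_BATCH = 16       # assumed batch size the base lr was tuned for

def scaled_lr(num_gpus: int, samples_per_gpu: int) -> float:
    """Scale the learning rate linearly with the effective batch size."""
    return BASE_LR * (num_gpus * samples_per_gpu) / BASE_BATCH

print(scaled_lr(1, 1))  # 1.25e-05 for a single GPU with batch size 1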

[Bug]

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

Python: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.0, V11.0.194
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.1
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.2
OpenCV: 4.7.0
MMCV: 1.6.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
MMRotate: 0.3.4+c9b6c66

Reproduces the problem - code sample

python tools/train.py /data/pys/model/Large-Selective-Kernel-Network/configs/lsknet/lsk_t_fpn_1x_dota_le90.py \
    --work-dir /data/pys/out/lsknet

Reproduces the problem - command or script

python tools/train.py /data/pys/model/Large-Selective-Kernel-Network/configs/lsknet/lsk_t_fpn_1x_dota_le90.py \
    --work-dir /data/pys/out/lsknet

Reproduces the problem - error message

/data/pys/env/openmmlab/bin/python /data/pys/model/Large-Selective-Kernel-Network/tools/train.py
/data/pys/model/Large-Selective-Kernel-Network/mmrotate/utils/setup_env.py:38: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
/data/pys/model/Large-Selective-Kernel-Network/mmrotate/utils/setup_env.py:48: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
2023-04-29 11:27:08,979 - mmrotate - INFO - Environment info:

sys.platform: linux
Python: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.0, V11.0.194
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.1
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.2
OpenCV: 4.7.0
MMCV: 1.6.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
MMRotate: 0.3.4+c9b6c66

2023-04-29 11:27:09,350 - mmrotate - INFO - Distributed training: False
2023-04-29 11:27:09,594 - mmrotate - INFO - Config:
dataset_type = 'DOTADataset'
data_root = '/data/pys/Data/DOTA/dota_1024_ms/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version='le90'),
    dict(
        type='PolyRandomRotate',
        rotate_ratio=0.5,
        angles_range=180,
        auto_bound=False,
        rect_classes=[9, 11],
        version='le90'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='DOTADataset',
        ann_file='/data/pys/Data/DOTA/dota_1024_ms/trainval/annfiles/',
        img_prefix='/data/pys/Data/DOTA/dota_1024_ms/trainval/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RResize', img_scale=(1024, 1024)),
            dict(
                type='RRandomFlip',
                flip_ratio=[0.25, 0.25, 0.25],
                direction=['horizontal', 'vertical', 'diagonal'],
                version='le90'),
            dict(
                type='PolyRandomRotate',
                rotate_ratio=0.5,
                angles_range=180,
                auto_bound=False,
                rect_classes=[9, 11],
                version='le90'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        version='le90'),
    val=dict(
        type='DOTADataset',
        ann_file='/data/pys/Data/DOTA/dota_1024_ms/val/annfiles/',
        img_prefix='/data/pys/Data/DOTA/dota_1024_ms/val/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1024, 1024),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'),
    test=dict(
        type='DOTADataset',
        ann_file='/data/pys/Data/DOTA/dota_1024_ms/test/images/',
        img_prefix='/data/pys/Data/DOTA/dota_1024_ms/test/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1024, 1024),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'))
evaluation = dict(interval=1, metric='mAP')
optimizer = dict(
    type='AdamW', lr=0.0002, betas=(0.9, 0.999), weight_decay=0.05)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
opencv_num_threads = 0
mp_start_method = 'fork'
angle_version = 'le90'
gpu_number = 1
model = dict(
    type='OrientedRCNN',
    backbone=dict(
        type='LSKNet',
        embed_dims=[32, 64, 160, 256],
        drop_rate=0.1,
        drop_path_rate=0.1,
        depths=[3, 3, 5, 2],
        init_cfg=dict(
            type='Pretrained',
            checkpoint='/data/pys/pretrain/lsk_t_backbone-2ef8a593.pth'),
        norm_cfg=dict(type='SyncBN', requires_grad=True)),
    neck=dict(
        type='FPN',
        in_channels=[32, 64, 160, 256],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='OrientedRPNHead',
        in_channels=256,
        feat_channels=256,
        version='le90',
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='MidpointOffsetCoder',
            angle_range='le90',
            target_means=[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0, 0.5, 0.5]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='OrientedStandardRoIHead',
        bbox_roi_extractor=dict(
            type='RotatedSingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlignRotated',
                out_size=7,
                sample_num=2,
                clockwise=True),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='RotatedShared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=15,
            bbox_coder=dict(
                type='DeltaXYWHAOBBoxCoder',
                angle_range='le90',
                norm_factor=None,
                edge_swap=True,
                proj_xy=True,
                target_means=(0.0, 0.0, 0.0, 0.0, 0.0),
                target_stds=(0.1, 0.1, 0.2, 0.2, 0.1)),
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                gpu_assign_thr=800,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.8),
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                iou_calculator=dict(type='RBboxOverlaps2D'),
                gpu_assign_thr=800,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RRandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)),
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.8),
            min_bbox_size=0),
        rcnn=dict(
            nms_pre=2000,
            min_bbox_size=0,
            score_thr=0.05,
            nms=dict(iou_thr=0.1),
            max_per_img=2000)))
work_dir = '/data/pys/out/lsknet'
auto_resume = False
gpu_ids = range(0, 1)

2023-04-29 11:27:09,595 - mmrotate - INFO - Set random seed to 39666445, deterministic: False
/data/pys/env/openmmlab/lib/python3.8/site-packages/mmdet/models/dense_heads/anchor_head.py:116: UserWarning: DeprecationWarning: num_anchors is deprecated, for consistency or also use num_base_priors instead
warnings.warn('DeprecationWarning: num_anchors is deprecated, '
init cfg {'type': 'Pretrained', 'checkpoint': '/data/pys/pretrain/lsk_t_backbone-2ef8a593.pth'}
2023-04-29 11:27:10,134 - mmrotate - INFO - initialize LSKNet with init_cfg {'type': 'Pretrained', 'checkpoint': '/data/pys/pretrain/lsk_t_backbone-2ef8a593.pth'}
2023-04-29 11:27:10,135 - mmcv - INFO - load model from: /data/pys/pretrain/lsk_t_backbone-2ef8a593.pth
2023-04-29 11:27:10,136 - mmcv - INFO - load checkpoint from local path: /data/pys/pretrain/lsk_t_backbone-2ef8a593.pth
2023-04-29 11:27:10,219 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: head.weight, head.bias

2023-04-29 11:27:10,265 - mmrotate - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-04-29 11:27:10,331 - mmrotate - INFO - initialize OrientedRPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01}
2023-04-29 11:27:10,345 - mmrotate - INFO - initialize RotatedShared2FCBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'layer': 'Linear', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}]
2023-04-29 11:27:20,360 - mmrotate - INFO - Start running, host: user@ubuntu, work_dir: /data/pys/out/lsknet
2023-04-29 11:27:20,361 - mmrotate - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook

before_train_epoch:
(VERY_HIGH ) StepLrUpdaterHook
(LOW ) IterTimerHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook

before_train_iter:
(VERY_HIGH ) StepLrUpdaterHook
(LOW ) IterTimerHook
(LOW ) EvalHook

after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(LOW ) IterTimerHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook

after_train_epoch:
(NORMAL ) CheckpointHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook

before_val_epoch:
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_val_iter:
(LOW ) IterTimerHook

after_val_iter:
(LOW ) IterTimerHook

after_val_epoch:
(VERY_LOW ) TextLoggerHook

after_run:
(VERY_LOW ) TextLoggerHook

2023-04-29 11:27:20,361 - mmrotate - INFO - workflow: [('train', 1)], max: 12 epochs
2023-04-29 11:27:20,361 - mmrotate - INFO - Checkpoints will be saved to /data/pys/out/lsknet by HardDiskBackend.
Traceback (most recent call last):
  File "/data/pys/model/Large-Selective-Kernel-Network/tools/train.py", line 193, in <module>
    main()
  File "/data/pys/model/Large-Selective-Kernel-Network/tools/train.py", line 182, in main
    train_detector(
  File "/data/pys/model/Large-Selective-Kernel-Network/mmrotate/apis/train.py", line 141, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 49, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 827, in __init__
    self._reset(loader, first_iter=True)
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 857, in _reset
    self._try_put_index()
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1091, in _try_put_index
    index = self._next_index()
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 427, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 227, in __iter__
    for idx in self.sampler:
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/mmdet/datasets/samplers/group_sampler.py", line 36, in __iter__
    indices = np.concatenate(indices)
  File "<__array_function__ internals>", line 200, in concatenate
ValueError: need at least one array to concatenate
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/data/pys/env/openmmlab/lib/python3.8/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
  File "/data/pys/env/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 38068) is killed by signal: Terminated.

Additional information

  1. Training on the DOTA v1.0 dataset with multiscale training fails. From the error message, the dataset does not appear to be loaded correctly, yet running RVSA and Oriented R-CNN in the same way raised no error, and I have not found a solution so far. Could the author share the versions of the Python packages in your current running environment, to make reproduction easier?
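As a debugging aside: "ValueError: need at least one array to concatenate" is typically raised by GroupSampler when the dataset it receives has length zero, i.e. no annotations were loaded from ann_file. A minimal sanity check of the paths from the config above (a sketch, not part of the repository):

# Hedged sanity check: the ValueError above means the dataset is empty.
# Verify the annotation/image folders from the config actually contain files.
import os

ann_dir = '/data/pys/Data/DOTA/dota_1024_ms/trainval/annfiles/'
img_dir = '/data/pys/Data/DOTA/dota_1024_ms/trainval/images/'

for path in (ann_dir, img_dir):
    n = len(os.listdir(path)) if os.path.isdir(path) else 0
    print(f'{path}: {n} files')  # 0 files -> empty dataset and this error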

Confusion about DOTA V1.0 dataset

Model/Dataset/Scheduler description

Hi,

I found your LSKNet paper very interesting. I want to fine-tune your ImageNet pre-trained model on DOTA-v1.0.

I noticed that the dataset offers two different label sets at this path:
https://drive.google.com/drive/folders/1gmeE3D7R62UAtuIFOB9j2M5cUPTwtsxK
I selected "labelTxt-v1.0". Inside "labelTxt-v1.0" there are two more archives, "labelTxt.zip" and "Train_Task2_gt.zip", and their annotations differ slightly.

I would like to know which one you used for your training.

Regards
Mustansar

Inference on the DOTA dataset with the provided settings does not work (NameError: name 'true'/'false'/'null' is not defined, or AttributeError: 'ConfigDict' object has no attribute 'data')

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
GPU 0,1: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.2, V11.2.67
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 2.0.1+cu117
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.7
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.5
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.15.2+cu117
OpenCV: 4.8.0
MMCV: 1.7.1
MMCV Compiler: GCC 7.5
MMCV CUDA Compiler: 11.2
MMRotate: 0.3.4+293d79c

Reproduces the problem - code sample

I am using the config and checkpoint from here: https://github.com/zcablii/LSKNet/blob/main/configs/lsknet/lsk_s_fpn_1x_dota_le90.py and https://download.openmmlab.com/mmrotate/v1.0/lsknet/lsk_s_fpn_1x_dota_le90/lsk_s_fpn_1x_dota_le90_20230116-99749191.pth

The DOTA dataset was processed following: https://github.com/zcablii/LSKNet/blob/main/docs/en/get_started.md

I used the command from the same guide: https://github.com/zcablii/LSKNet/blob/main/docs/en/get_started.md

Reproduces the problem - command or script

/mnt/data/Large-Selective-Kernel-Network$ CUDA_VISIBLE_DEVICES=1 python tools/test.py /mnt/data/Large-Selective-Kernel-Network/checkpoints/lsk_s_fpn_1x_dota_le90.py /mnt/data/Large-Selective-Kernel-Network/checkpoints/lsk_s_fpn_1x_dota_le90_20230116-99749191.pth --eval mAP

Reproduces the problem - error message

Traceback (most recent call last):
  File "tools/test.py", line 263, in <module>
    main()
  File "tools/test.py", line 116, in main
    cfg = Config.fromfile(args.config)
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/site-packages/mmcv/utils/config.py", line 340, in fromfile
    cfg_dict, cfg_text = Config._file2dict(filename,
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/site-packages/mmcv/utils/config.py", line 208, in _file2dict
    mod = import_module(temp_module_name)                        
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)                                                     
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked                                                 
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/tmp/tmpmrlo1xus/tmp7glnglw1.py", line 1, in <module>    
NameError: name 'true' is not defined                            
Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x7fe4528d1550>
Traceback (most recent call last):                               
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/tempfile.py", line 440, in __del__
    self.close()                                                 
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/tempfile.py", line 436, in close                       
    unlink(self.name)                                                                                               
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpmrlo1xus/tmp7glnglw1.py'

Additional information

When I change all true to True, false to False, and null to None and run again, the following error occurs:

Traceback (most recent call last):
  File "tools/test.py", line 263, in <module>
    main()
  File "tools/test.py", line 120, in main
    cfg = compat_cfg(cfg)
  File "/mnt/data/Large-Selective-Kernel-Network/mmrotate/utils/compat_config.py", line 16, in compat_cfg
    cfg = compat_imgs_per_gpu(cfg)
  File "/mnt/data/Large-Selective-Kernel-Network/mmrotate/utils/compat_config.py", line 39, in compat_imgs_per_gpu
    if 'imgs_per_gpu' in cfg.data:
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/site-packages/mmcv/utils/config.py", line 519, in __getattr__
    return getattr(self._cfg_dict, name)
  File "/mnt/data/anaconda3/envs/lsknet/lib/python3.8/site-packages/mmcv/utils/config.py", line 50, in __getattr__
    raise ex
AttributeError: 'ConfigDict' object has no attribute 'data'
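A hedged guess at what both errors have in common (not confirmed in this thread): the saved config appears to be a JSON-style dump rather than the original Python config, so it contains true/false/null and no longer defines top-level variables such as data. Re-downloading the original .py config from the repository is the safer fix; purely as an illustration, a sketch that rewrites the JSON literals with word boundaries so identifiers are not corrupted:

# Hypothetical cleanup sketch: rewrite JSON-style literals (true/false/null)
# into Python literals in a config that was saved as a JSON-like dump.
# Word boundaries avoid corrupting identifiers that merely contain 'true'.
import re

path = '/mnt/data/Large-Selective-Kernel-Network/checkpoints/lsk_s_fpn_1x_dota_le90.py'
text = open(path).read()
for src, dst in (('true', 'True'), ('false', 'False'), ('null', 'None')):
    text = re.sub(rf'\b{src}\b', dst, text)
open(path, 'w').write(text)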

RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.8 (default, Feb 24 2021, 21:46:12) [GCC 7.3.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.1, V11.1.105
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.1
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.1
OpenCV: 4.8.0
MMCV: 1.7.1
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMRotate: 0.3.4+12961ad

Reproduces the problem - code sample

CUDA_VISIBLE_DEVICES=0 python tools/train.py /root/workspace/Large-Selective-Kernel-Network/configs/lsknet/lsk_t_fpn_1x_dota_le90.py

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES=0 python tools/train.py /root/workspace/Large-Selective-Kernel-Network/configs/lsknet/lsk_t_fpn_1x_dota_le90.py

Reproduces the problem - error message

RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

Additional information

Hello, after downloading the pre-trained model I tried to run training with your config file, but it failed with: RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

In the same environment, running Oriented R-CNN does not raise this error. What is causing it, and how can it be resolved?
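A common cause of this error, offered here as a hedged guess rather than a confirmed diagnosis: the LSKNet configs set the backbone's norm_cfg to SyncBN (see the config dump earlier on this page), and SyncBN requires an initialised distributed process group, while the default Oriented R-CNN configs use plain BN. For a single-GPU, non-distributed run, one sketch is to override the norm layer in a derived config; alternatively, launching through tools/dist_train.sh with one GPU initialises the process group and keeps SyncBN usable:

# Hedged workaround sketch (not an official fix): SyncBN needs
# torch.distributed to be initialised; for a non-distributed single-GPU
# run, switch the backbone norm to plain BN in a derived config.
_base_ = ['./lsk_t_fpn_1x_dota_le90.py']

model = dict(
    backbone=dict(norm_cfg=dict(type='BN', requires_grad=True)))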

"Default process group is not initialized" AssertionError: Default process group is not initialized[Bug]

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

(The environment information was posted as a screenshot and is not reproduced here.)
How do I solve this error when training on my own datasets?

Reproduces the problem - code sample

1

Reproduces the problem - command or script

1

Reproduces the problem - error message

1

Additional information

No response

Some questions about reproducing the FAIR1M-v1.0 results in the LSKNet paper

What's the feature?

Hello author, I have been following the ISPRS (FAIR1M) competition. Your paper reports that LSKNet reaches 47.87 mAP on FAIR1M-v1.0, and on the ISPRS leaderboard I also found a score of 47.3128 under the account yuxuanli (presumably yours). That should be a very good result for a single model.

I had previously been training on FAIR1M-v2.0 and also tried LSKNet. Probably because of my parameter choices (I converted the whole dataset to jpg and split it with a 1024 patch size and 200 overlap; trained on a single A40, changed SyncBN to BN, and kept lr at 0.001), the best result after 12 epochs was 0.36 mAP, which scores only 14.9618 on ISPRS.

I do not think this result reflects LSKNet's real capability, so I am now preparing to reproduce your paper results on FAIR1M-v1.0, including the competition results. Since quite a few details may be involved, I would like to clarify a few things first:

  1. Dataset: do you also use a 1024 patch size with 200 overlap? Was there any format conversion? (I also looked into the geospatial information carried by the tif files in FAIR1M; it did not seem to improve the results.) In particular, FAIR1M-v1.0 does not distinguish between a training set and a validation set, so how did you split it?

  2. Model training: did you use lsk_s_backbone-e9d2e551.pth as the pre-trained model, or a different one? How many epochs are needed to reach 47.87 mAP?

  3. Result submission: I found that reasonable use of NMS when merging results helps the ISPRS score. Did you consider anything along these lines?

Finally, I noticed that lsk_s_backbone-e9d2e551.pth is hosted on the MMLab website, but I could not find an LSKNet config in mmrotate v1.0. Could you clarify the situation?

Thank you very much. My main goal is reproduction; if I succeed, I will write up the detailed process and then continue experimenting on FAIR1M-v2.0.

Any other context?

No response

load checkpoint from local path: /home/xyn02/LSKNet/configs/xynModel/lsk_s_ema_fpn_1x_dota_le90_20230212-30ed4041.pth, The model and loaded state dict do not match exactly [Bug]

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 4090
GPU 1: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda-11.8
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 11.1.0-1ubuntu1~18.04.1) 11.1.0
PyTorch: 2.0.1+cu118
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.8
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.9.5
    • Built with CuDNN 8.7
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.15.2+cu118
OpenCV: 4.9.0
MMCV: 1.7.2
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.8
MMRotate: 0.3.4+

Reproduces the problem - code sample

# Copyright (c) OpenMMLab. All rights reserved.
import argparse
import os
import os.path as osp
import time
import warnings

import mmcv
import torch
from mmcv import Config, DictAction
from mmcv.cnn import fuse_conv_bn
from mmcv.parallel import MMDataParallel, MMDistributedDataParallel
from mmcv.runner import (get_dist_info, init_dist, load_checkpoint,
                         wrap_fp16_model)
from mmdet.apis import multi_gpu_test, single_gpu_test
from mmdet.datasets import build_dataloader, replace_ImageToTensor

from mmrotate.datasets import build_dataset
from mmrotate.models import build_detector
from mmrotate.utils import compat_cfg, setup_multi_processes


def parse_args():
    """Parse parameters."""
    parser = argparse.ArgumentParser(
        description='MMDet test (and eval) a model')
    parser.add_argument(
        '--config',
        default='/home/xyn02/LSKNet/configs/xynModel/lsk_s_ema_fpn_1x_dota_le90.py',
        help='test config file path')
    parser.add_argument(
        '--checkpoint',
        default='/home/xyn02/LSKNet/configs/xynModel/lsk_s_ema_fpn_1x_dota_le90_20230212-30ed4041.pth',
        help='checkpoint file')
    parser.add_argument(
        '--work-dir',
        default='/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/work_dir',
        help='the directory to save the file containing evaluation metrics')
    parser.add_argument('--out', help='output result file in pickle format')
    parser.add_argument(
        '--fuse-conv-bn',
        action='store_true',
        help='Whether to fuse conv and bn, this will slightly increase'
        'the inference speed')
    parser.add_argument(
        '--gpu-ids',
        type=int,
        nargs='+',
        help='ids of gpus to use '
        '(only applicable to non-distributed testing)')
    parser.add_argument(
        '--format-only',
        action='store_true',
        help='Format the output results without perform evaluation. It is'
        'useful when you want to format the result to a specific format and '
        'submit it to the test server')
    parser.add_argument(
        '--eval',
        type=str,
        nargs='+',
        help='evaluation metrics, which depends on the dataset, e.g., "bbox",'
        ' "segm", "proposal" for COCO, and "mAP", "recall" for PASCAL VOC')
    parser.add_argument('--show', action='store_true', help='show results')
    parser.add_argument(
        '--show-dir',
        default='/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/show_dir',
        help='directory where painted images will be saved')
    parser.add_argument(
        '--show-score-thr',
        type=float,
        default=0.3,
        help='score threshold (default: 0.3)')
    parser.add_argument(
        '--gpu-collect',
        action='store_true',
        help='whether to use gpu to collect results.')
    parser.add_argument(
        '--tmpdir',
        help='tmp directory used for collecting results from multiple '
        'workers, available when gpu-collect is not specified')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    parser.add_argument(
        '--eval-options',
        nargs='+',
        action=DictAction,
        help='custom options for evaluation, the key-value pair in xxx=yyy '
        'format will be kwargs for dataset.evaluate() function')
    parser.add_argument(
        '--launcher',
        choices=['none', 'pytorch', 'slurm', 'mpi'],
        default='none',
        help='job launcher')
    parser.add_argument('--local_rank', type=int, default=0)
    args = parser.parse_args()
    if 'LOCAL_RANK' not in os.environ:
        os.environ['LOCAL_RANK'] = str(args.local_rank)

    return args


def main():
    args = parse_args()

    assert args.out or args.eval or args.format_only or args.show \
        or args.show_dir, \
        ('Please specify at least one operation (save/eval/format/show the '
         'results / save the results) with the argument "--out", "--eval"'
         ', "--format-only", "--show" or "--show-dir"')

    if args.eval and args.format_only:
        raise ValueError('--eval and --format_only cannot be both specified')

    if args.out is not None and not args.out.endswith(('.pkl', '.pickle')):
        raise ValueError('The output file must be a pkl file.')

    cfg = Config.fromfile(args.config)
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)

    cfg = compat_cfg(cfg)

    if args.format_only and cfg.mp_start_method != 'spawn':
        warnings.warn(
            '`mp_start_method` in `cfg` is set to `spawn` to use CUDA '
            'with multiprocessing when formatting output result.')
        cfg.mp_start_method = 'spawn'

    # set multi-process settings
    setup_multi_processes(cfg)

    # set cudnn_benchmark
    if cfg.get('cudnn_benchmark', False):
        torch.backends.cudnn.benchmark = True

    cfg.model.pretrained = None
    if cfg.model.get('neck'):
        if isinstance(cfg.model.neck, list):
            for neck_cfg in cfg.model.neck:
                if neck_cfg.get('rfp_backbone'):
                    if neck_cfg.rfp_backbone.get('pretrained'):
                        neck_cfg.rfp_backbone.pretrained = None
        elif cfg.model.neck.get('rfp_backbone'):
            if cfg.model.neck.rfp_backbone.get('pretrained'):
                cfg.model.neck.rfp_backbone.pretrained = None

    if args.gpu_ids is not None:
        cfg.gpu_ids = args.gpu_ids
    else:
        cfg.gpu_ids = range(1)

    # init distributed env first, since logger depends on the dist info.
    if args.launcher == 'none':
        distributed = False
        if len(cfg.gpu_ids) > 1:
            warnings.warn(
                f'We treat {cfg.gpu_ids} as gpu-ids, and reset to '
                f'{cfg.gpu_ids[0:1]} as gpu-ids to avoid potential error in '
                'non-distribute testing time.')
            cfg.gpu_ids = cfg.gpu_ids[0:1]
    else:
        distributed = True
        init_dist(args.launcher, **cfg.dist_params)

    test_dataloader_default_args = dict(
        samples_per_gpu=1, workers_per_gpu=2, dist=distributed, shuffle=False)

    # in case the test dataset is concatenated
    if isinstance(cfg.data.test, dict):
        cfg.data.test.test_mode = True
        if 'samples_per_gpu' in cfg.data.test:
            warnings.warn('`samples_per_gpu` in `test` field of '
                          'data will be deprecated, you should'
                          ' move it to `test_dataloader` field')
            test_dataloader_default_args['samples_per_gpu'] = \
                cfg.data.test.pop('samples_per_gpu')
        if test_dataloader_default_args['samples_per_gpu'] > 1:
            # Replace 'ImageToTensor' to 'DefaultFormatBundle'
            cfg.data.test.pipeline = replace_ImageToTensor(
                cfg.data.test.pipeline)
    elif isinstance(cfg.data.test, list):
        for ds_cfg in cfg.data.test:
            ds_cfg.test_mode = True
            if 'samples_per_gpu' in ds_cfg:
                warnings.warn('`samples_per_gpu` in `test` field of '
                              'data will be deprecated, you should'
                              ' move it to `test_dataloader` field')
        samples_per_gpu = max(
            [ds_cfg.pop('samples_per_gpu', 1) for ds_cfg in cfg.data.test])
        test_dataloader_default_args['samples_per_gpu'] = samples_per_gpu
        if samples_per_gpu > 1:
            for ds_cfg in cfg.data.test:
                ds_cfg.pipeline = replace_ImageToTensor(ds_cfg.pipeline)

    test_loader_cfg = {
        **test_dataloader_default_args,
        **cfg.data.get('test_dataloader', {})
    }

    rank, _ = get_dist_info()
    # allows not to create
    if args.work_dir is not None and rank == 0:
        mmcv.mkdir_or_exist(osp.abspath(args.work_dir))
        timestamp = time.strftime('%Y%m%d_%H%M%S', time.localtime())
        json_file = osp.join(args.work_dir, f'eval_{timestamp}.json')

    # build the dataloader
    dataset = build_dataset(cfg.data.test)
    data_loader = build_dataloader(dataset, **test_loader_cfg)

    # build the model and load checkpoint
    cfg.model.train_cfg = None
    model = build_detector(cfg.model, test_cfg=cfg.get('test_cfg'))
    fp16_cfg = cfg.get('fp16', None)
    if fp16_cfg is not None:
        wrap_fp16_model(model)
    checkpoint = load_checkpoint(model, args.checkpoint, map_location='cpu')
    if args.fuse_conv_bn:
        model = fuse_conv_bn(model)
    # old versions did not save class info in checkpoints, this walkaround is
    # for backward compatibility
    if 'CLASSES' in checkpoint.get('meta', {}):
        model.CLASSES = checkpoint['meta']['CLASSES']
    else:
        model.CLASSES = dataset.CLASSES

    if not distributed:
        model = MMDataParallel(model, device_ids=cfg.gpu_ids)
        outputs = single_gpu_test(model, data_loader, args.show, args.show_dir,
                                  args.show_score_thr)
    else:
        model = MMDistributedDataParallel(
            model.cuda(),
            device_ids=[torch.cuda.current_device()],
            broadcast_buffers=False)
        outputs = multi_gpu_test(model, data_loader, args.tmpdir,
                                 args.gpu_collect)

    rank, _ = get_dist_info()
    if rank == 0:
        if args.out:
            print(f'\nwriting results to {args.out}')
            mmcv.dump(outputs, args.out)
        kwargs = {} if args.eval_options is None else args.eval_options
        if args.format_only:
            dataset.format_results(outputs, **kwargs)
        if args.eval:
            eval_kwargs = cfg.get('evaluation', {}).copy()
            # hard-code way to remove EvalHook args
            for key in [
                    'interval', 'tmpdir', 'start', 'gpu_collect', 'save_best',
                    'rule', 'dynamic_intervals'
            ]:
                eval_kwargs.pop(key, None)
            eval_kwargs.update(dict(metric=args.eval, **kwargs))
            metric = dataset.evaluate(outputs, **eval_kwargs)
            print(metric)
            metric_dict = dict(config=args.config, metric=metric)
            if args.work_dir is not None and rank == 0:
                mmcv.dump(metric_dict, json_file)


if __name__ == '__main__':
    main()

Reproduces the problem - command or script

python ./tools/test.py --format-only --eval-options submission_dir=/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/work_dir/Task1_results

Reproduces the problem - error message

(LSKNet) xyn02@server-R740:~/LSKNet$ python ./tools/test.py --format-only --eval-options submission_dir=/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/work_dir/Task1_results
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
./tools/test.py:124: UserWarning: `mp_start_method` in `cfg` is set to `spawn` to use CUDA with multiprocessing when formatting output result.
warnings.warn(
/home/xyn02/LSKNet/mmrotate/utils/setup_env.py:38: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
/home/xyn02/LSKNet/mmrotate/utils/setup_env.py:48: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmdet/models/dense_heads/anchor_head.py:116: UserWarning: DeprecationWarning: num_anchors is deprecated, for consistency or also use num_base_priors instead
warnings.warn('DeprecationWarning: num_anchors is deprecated, '
load checkpoint from local path: /home/xyn02/LSKNet/configs/xynModel/lsk_s_ema_fpn_1x_dota_le90_20230212-30ed4041.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: ema_backbone_patch_embed1_proj_weight, ema_backbone_patch_embed1_proj_bias, ema_backbone_patch_embed1_norm_weight, ema_backbone_patch_embed1_norm_bias, ema_backbone_patch_embed1_norm_running_mean, ema_backbone_patch_embed1_norm_running_var, ema_backbone_patch_embed1_norm_num_batches_tracked, ema_backbone_block1_0_layer_scale_1, ema_backbone_block1_0_layer_scale_2, ema_backbone_block1_0_norm1_weight, ema_backbone_block1_0_norm1_bias, ... (several hundred further `ema_`-prefixed keys, one EMA copy of every backbone, neck, RPN-head and RoI-head parameter, omitted here for brevity) ..., ema_roi_head_bbox_head_shared_fcs_0_weight, ema_roi_head_bbox_head_shared_fcs_0_bias, ema_roi_head_bbox_head_shared_fcs_1_weight, ema_roi_head_bbox_head_shared_fcs_1_bias

completed: 0, elapsed: 0s/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(

Merging patch bboxes into full image!!!
Multiple processing
completed: 0, elapsed: 0s
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
/home/xyn02/anaconda3/envs/LSKNet/lib/python3.8/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
Traceback (most recent call last):
  File "./tools/test.py", line 264, in <module>
    main()
  File "./tools/test.py", line 246, in main
    dataset.format_results(outputs, **kwargs)
  File "/home/xyn02/LSKNet/mmrotate/datasets/dota.py", line 327, in format_results
    id_list, dets_list = self.merge_det(results, nproc)
ValueError: not enough values to unpack (expected 2, got 0)
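
As an aside, the `ema_*` keys in the warning further up are the exponential-moving-average weight copies that an EMA hook saves alongside the regular `state_dict` entries (with `.` replaced by `_`); a plain model reports them as unexpected and simply ignores them. If the warning is unwanted, a minimal sketch for stripping them (the checkpoint path is a placeholder):

import torch

# Hedged sketch: drop the EMA copies so a plain (non-EMA) model loads cleanly.
ckpt = torch.load('your_ema_checkpoint.pth', map_location='cpu')  # placeholder path
ckpt['state_dict'] = {k: v for k, v in ckpt['state_dict'].items()
                      if not k.startswith('ema_')}
torch.save(ckpt, 'your_checkpoint_noema.pth')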

Additional information

No response

[Docs]

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

Hello, really amazing work and repository! I have a question: I'm training on a Tesla T4 (15 GB) on a relatively small dataset with samples_per_gpu=1 and workers_per_gpu=2, yet I'm still going out of memory. To be precise, the model trains one epoch perfectly, but during validation after that epoch it launches 4 python processes (as seen in nvidia-smi). The images are 1024x1024, and I have even tried removing the PolyRandomRotate augmentation. How come the model trains one epoch perfectly and errors only during validation? Sorry if this is posted in the wrong place.
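
For reference, a sketch of how the evaluation loaders can be tuned separately in mmdet 2.x-style configs like this repo's (the train/val/test dataset entries are omitted here, and whether this alone avoids the OOM depends on the model; the `val_dataloader`/`test_dataloader` override keys and the values below are illustrative):

data = dict(
    samples_per_gpu=1,
    workers_per_gpu=2,  # each dataloader worker shows up as an extra python process
    val_dataloader=dict(samples_per_gpu=1, workers_per_gpu=0),
    test_dataloader=dict(samples_per_gpu=1, workers_per_gpu=0))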

Suggest a potential alternative/fix

[screenshot: out-of-memory error during validation]

No response

[Bug]

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

Name Version Build Channel

addict 2.4.0 pypi_0 pypi
aliyun-python-sdk-core 2.14.0 pypi_0 pypi
aliyun-python-sdk-kms 2.16.2 pypi_0 pypi
blas 1.0 mkl
brotlipy 0.7.0 py38h2bbff1b_1003
ca-certificates 2023.08.22 haa95532_0
certifi 2023.7.22 py38haa95532_0
cffi 1.16.0 pypi_0 pypi
chardet 5.2.0 pypi_0 pypi
charset-normalizer 3.3.0 pypi_0 pypi
click 8.1.7 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
contourpy 1.1.1 pypi_0 pypi
crcmod 1.7 pypi_0 pypi
cryptography 41.0.4 pypi_0 pypi
cudatoolkit 11.8.0 hd77b12b_0
cycler 0.12.1 pypi_0 pypi
filelock 3.9.0 py38haa95532_0
fonttools 4.43.1 pypi_0 pypi
freetype 2.12.1 ha860e81_0
fsspec 2023.9.2 pypi_0 pypi
giflib 5.2.1 h8cc25b3_3
huggingface-hub 0.17.3 pypi_0 pypi
idna 3.4 py38haa95532_0
importlib-metadata 6.8.0 pypi_0 pypi
importlib-resources 6.1.0 pypi_0 pypi
intel-openmp 2023.1.0 h59b6b97_46319
jinja2 3.1.2 py38haa95532_0
jmespath 0.10.0 pypi_0 pypi
jpeg 9e h2bbff1b_1
kiwisolver 1.4.5 pypi_0 pypi
lerc 3.0 hd77b12b_0
libdeflate 1.17 h2bbff1b_1
libffi 3.4.4 hd77b12b_0
libjpeg-turbo 2.0.0 h196d8e1_0
libpng 1.6.39 h8cc25b3_0
libtiff 4.5.1 hd77b12b_0
libuv 1.44.2 h2bbff1b_0
libwebp 1.3.2 hbc33d0d_0
libwebp-base 1.3.2 h2bbff1b_0
lz4-c 1.9.4 h2bbff1b_0
markdown 3.5 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.1 py38h2bbff1b_0
matplotlib 3.7.3 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mkl 2023.1.0 h6b88ed4_46357
mkl-service 2.4.0 py38h2bbff1b_1
mkl_fft 1.3.8 py38h2bbff1b_0
mkl_random 1.2.4 py38h59b6b97_0
mmcv-full 1.7.1 pypi_0 pypi
mmdet 2.28.1 pypi_0 pypi
mmrotate 0.3.4 pypi_0 pypi
model-index 0.1.11 pypi_0 pypi
mpmath 1.3.0 py38haa95532_0
networkx 3.1 py38haa95532_0
numpy 1.24.4 pypi_0 pypi
numpy-base 1.24.3 py38h8a87ada_1
opencv-python 4.8.1.78 pypi_0 pypi
opendatalab 0.0.10 pypi_0 pypi
openjpeg 2.4.0 h4fc8c34_0
openmim 0.3.9 pypi_0 pypi
openssl 3.0.11 h2bbff1b_2
openxlab 0.0.26 pypi_0 pypi
ordered-set 4.1.0 pypi_0 pypi
oss2 2.17.0 pypi_0 pypi
packaging 23.2 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
pillow 10.0.1 py38h045eedc_0
pip 23.2.1 py38haa95532_0
platformdirs 3.11.0 pypi_0 pypi
pycocotools 2.0.7 pypi_0 pypi
pycparser 2.21 pyhd3eb1b0_0
pycryptodome 3.19.0 pypi_0 pypi
pygments 2.16.1 pypi_0 pypi
pyopenssl 23.2.0 py38haa95532_0
pyparsing 3.1.1 pypi_0 pypi
pysocks 1.7.1 py38haa95532_0
python 3.8.18 h1aa4202_0
python-dateutil 2.8.2 pypi_0 pypi
pytorch 2.1.0 py3.8_cpu_0 pytorch
pytorch-mutex 1.0 cpu pytorch
pytz 2023.3.post1 pypi_0 pypi
pywin32 306 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
regex 2023.10.3 pypi_0 pypi
requests 2.28.2 pypi_0 pypi
rich 13.4.2 pypi_0 pypi
safetensors 0.4.0 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setuptools 60.2.0 pypi_0 pypi
shapely 2.0.1 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sqlite 3.41.2 h2bbff1b_0
sympy 1.11.1 py38haa95532_0
tabulate 0.9.0 pypi_0 pypi
tbb 2021.8.0 h59b6b97_0
terminaltables 3.1.10 pypi_0 pypi
timm 0.9.7 pypi_0 pypi
tk 8.6.12 h2bbff1b_0
tomli 2.0.1 pypi_0 pypi
torchvision 0.16.0 py38_cpu pytorch
tqdm 4.65.2 pypi_0 pypi
typing-extensions 4.8.0 pypi_0 pypi
typing_extensions 4.7.1 py38haa95532_0
tzdata 2023.3 pypi_0 pypi
urllib3 1.26.17 pypi_0 pypi
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
wheel 0.41.2 py38haa95532_0
win_inet_pton 1.1.0 py38haa95532_0
xz 5.4.2 h8cc25b3_0
yaml 0.2.5 he774522_0
yapf 0.40.2 pypi_0 pypi
zipp 3.17.0 pypi_0 pypi
zlib 1.2.13 h8cc25b3_0
zstd 1.5.5 hd43e919_0

Reproduces the problem - code sample

from mmdet.apis import init_detector, inference_detector
import mmrotate

config_file = 'oriented_rcnn_r50_fpn_1x_dota_le90.py'
checkpoint_file = 'oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
inference_detector(model, 'demo/demo.jpg')

Reproduces the problem - command or script

python test.py

Reproduces the problem - error message

No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1'
F:\Anaconda\Anaconda\envs\openmmlab\lib\site-packages\mmcv\__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-
full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    model = init_detector(config_file, checkpoint_file, device='cuda:0')
  File "F:\Anaconda\Anaconda\envs\openmmlab\lib\site-packages\mmdet\apis\inference.py", line 39, in init_detector
    if 'pretrained' in config.model:
  File "F:\Anaconda\Anaconda\envs\openmmlab\lib\site-packages\mmcv\utils\config.py", line 519, in __getattr__
    return getattr(self._cfg_dict, name)
  File "F:\Anaconda\Anaconda\envs\openmmlab\lib\site-packages\mmcv\utils\config.py", line 50, in __getattr__
    raise ex
AttributeError: 'ConfigDict' object has no attribute 'model'

Additional information

AttributeError: 'ConfigDict' object has no attribute 'model'

[Docs] Checkpoints

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

Hello author, may I ask how the checkpoints of the model you provided were selected? Do you train on trainval, then perform validation on val every 3 epochs, and take the epoch with the highest validation mAP? Or should we take the epoch with the lowest training loss?

Suggest a potential alternative/fix

No response

Cannot reproduce lsknet_s_ss result on DOTA1.0

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.10
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0
OpenCV: 4.8.1
MMCV: 1.5.3
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMRotate: 0.3.4+

Reproduces the problem - code sample

dataset_type = 'DOTADataset'
data_root = 'data/split_ss_dota/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version='le90'),
    dict(
        type='PolyRandomRotate',
        rotate_ratio=0.5,
        angles_range=180,
        auto_bound=False,
        rect_classes=[9, 11],
        version='le90'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type='DOTADataset',
        ann_file='data/split_ss_dota/trainval/annfiles/',
        img_prefix='data/split_ss_dota/trainval/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RResize', img_scale=(1024, 1024)),
            dict(
                type='RRandomFlip',
                flip_ratio=[0.25, 0.25, 0.25],
                direction=['horizontal', 'vertical', 'diagonal'],
                version='le90'),
            dict(
                type='PolyRandomRotate',
                rotate_ratio=0.5,
                angles_range=180,
                auto_bound=False,
                rect_classes=[9, 11],
                version='le90'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        version='le90'),
    val=dict(
        type='DOTADataset',
        ann_file='data/split_ss_dota/val/annfiles/',
        img_prefix='data/split_ss_dota/val/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1024, 1024),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'),
    test=dict(
        type='DOTADataset',
        ann_file='data/split_ss_dota/test/images/',
        img_prefix='data/split_ss_dota/test/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1024, 1024),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'))
evaluation = dict(interval=1, metric='mAP')
optimizer = dict(type='AdamW', lr=5e-05, betas=(0.9, 0.999), weight_decay=0.05)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
opencv_num_threads = 0
mp_start_method = 'fork'
angle_version = 'le90'
gpu_number = 6
model = dict(
    type='OrientedRCNN',
    backbone=dict(
        type='LSKNet',
        embed_dims=[64, 128, 320, 512],
        drop_rate=0.1,
        drop_path_rate=0.1,
        depths=[2, 2, 4, 2],
        init_cfg=dict(
            type='Pretrained',
            checkpoint='/data/yiliu/lsknet/lsk_s_backbone-e9d2e551.pth'),
        norm_cfg=dict(type='SyncBN', requires_grad=True)),
    neck=dict(
        type='FPN',
        in_channels=[64, 128, 320, 512],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='OrientedRPNHead',
        in_channels=256,
        feat_channels=256,
        version='le90',
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='MidpointOffsetCoder',
            angle_range='le90',
            target_means=[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0, 0.5, 0.5]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='OrientedStandardRoIHead',
        bbox_roi_extractor=dict(
            type='RotatedSingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlignRotated',
                out_size=7,
                sample_num=2,
                clockwise=True),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='RotatedShared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=15,
            bbox_coder=dict(
                type='DeltaXYWHAOBBoxCoder',
                angle_range='le90',
                norm_factor=None,
                edge_swap=True,
                proj_xy=True,
                target_means=(0.0, 0.0, 0.0, 0.0, 0.0),
                target_stds=(0.1, 0.1, 0.2, 0.2, 0.1)),
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                gpu_assign_thr=800,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.8),
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                iou_calculator=dict(type='RBboxOverlaps2D'),
                gpu_assign_thr=800,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RRandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)),
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.8),
            min_bbox_size=0),
        rcnn=dict(
            nms_pre=2000,
            min_bbox_size=0,
            score_thr=0.05,
            nms=dict(iou_thr=0.1),
            max_per_img=2000)))
work_dir = 'lrs5'
auto_resume = False
gpu_ids = range(0, 6)

Reproduces the problem - command or script

tools/dist_train.sh configs/lsknet/lsk_s_fpn_1x_dota_le90.py 6

tools/dist_test.sh configs/lsknet/lsk_s_fpn_1x_dota_le90.py lys5 6 --format-only --eval-options submission_dir=work_dirss/Task1_results

Reproduces the problem - error message

20240328_012920.log

Additional information

I used the officially provided single-scale image split tool to get the ss dataset. But I cannot reproduce the result of lsknet_s with single-scale random rotate train and test.
[screenshot: evaluation results]
Can anyone help me check my configuration? Thanks a lot!

LSKNet is not in the models registry

When I try to run the code here, I get the following error. What should I do? Thanks!

KeyError: "OrientedRCNN: 'LSKNet is not in the models registry'"

[Docs] How to set the learning rate with a different number of GPUs?

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

First of all, this is impressive work; thank you for your repository.
I'm trying to reproduce your results on DOTA v1.0, but I get NaN loss values during training.
I'm using the DOTA dataset with multi-scale split and the lsk_s_ema_fpn_1x_dota_le90.py config file with 4 GPUs.
Based on your comment on line 167, lr=0.0002, #/8*gpu_number, I understood that I need to match the learning rate to my number of GPUs.
So I set lr = 0.0004 to match the effective value of 0.0002 on 8 GPUs, but I got NaN loss in the first epoch.
Maybe I can avoid the problem by setting a lower lr, but I worry that it could then no longer be called a reproduction of the paper.

May I ask how to set an appropriate lr in this situation?
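
For what it's worth, the config comment reads as a linear-scaling rule; a sketch of the arithmetic, assuming the released lr=0.0002 was tuned for 8 GPUs:

gpus_released, lr_released = 8, 0.0002
my_gpus = 4
my_lr = lr_released / gpus_released * my_gpus  # 0.0001: scale *down* with fewer GPUs

Under that reading, 4 GPUs would call for 0.0001 rather than 0.0004, and an overly large learning rate is a common cause of NaN losses early in training.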

Suggest a potential alternative/fix

No response

How can I train and test on my own data?

Model/Dataset/Scheduler description

How do I prepare my own data for training and testing?
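
For DOTA-format training with this repo, each image needs a .txt annotation file with one object per line: the four corner points of the oriented box, the class name, and a difficulty flag. A made-up example (the coordinates and class are illustrative):

x1 y1 x2 y2 x3 y3 x4 y4 category difficulty
937.0 913.0 1024.0 913.0 1024.0 1010.0 937.0 1010.0 plane 0

Large images are then split into 1024x1024 patches with the provided image split tool, as the DOTA configs here expect.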

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

Pre-training setup

Model/Dataset/Scheduler description

Hello, building on your LSKNet, I have extended the LSK module to 4 branches. May I ask how the pre-training was carried out, and which dataset was used for pre-training?

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

[New Models] Accuracy of ImageNet pretraining goes to 0 when changing the network

Model/Dataset/Scheduler description

Hi there,
thank you for your exceptional work! I'm trying to reproduce your results and to improve them.
In particular, I've tried to make a new network with the structure of the S variant but double the embedding dims, and after a few epochs the accuracy goes to zero.

Since there might be multiple factors, I'd like to have a chat to clarify which direction I should take to make the network bigger.

Thank you in advance,
Giovanni

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

[Question] Where exactly is the `Large Kernel Decomposition` in the code?

What's the feature?

Hi, I managed to find the source file models/backbones/lsknet.py but I can't seem to find where the decomposition is happening in the code.

If I understood correctly, class DWConv is the depthwise conv and class Block is the LSK Module. According to Fig. 4 in the paper, there should be at least 2 DWConv in each Block but I could only find one in Block->Mlp.

Help is very appreciated!
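
For orientation: the decomposition sits in the LSK module itself rather than in `DWConv` (which only serves the MLP). Below is a minimal sketch consistent with Fig. 3 of the paper and with the `attn.spatial_gating_unit.*` parameter names visible in the checkpoints; treat the exact kernel sizes, dilation and padding as assumptions taken from the paper:

import torch
import torch.nn as nn

class LSKblockSketch(nn.Module):
    """Hedged reconstruction of the large-kernel decomposition."""

    def __init__(self, dim):
        super().__init__()
        # Decomposition: 5x5 depthwise conv followed by a 7x7 depthwise conv
        # with dilation 3 gives a 23x23 receptive field at far lower cost.
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        self.conv_spatial = nn.Conv2d(
            dim, dim, 7, padding=9, groups=dim, dilation=3)
        self.conv1 = nn.Conv2d(dim, dim // 2, 1)
        self.conv2 = nn.Conv2d(dim, dim // 2, 1)
        self.conv_squeeze = nn.Conv2d(2, 2, 7, padding=3)  # spatial selection
        self.conv = nn.Conv2d(dim // 2, dim, 1)

    def forward(self, x):
        attn1 = self.conv0(x)                 # small-kernel branch
        attn2 = self.conv_spatial(attn1)      # sequential large-kernel branch
        attn1, attn2 = self.conv1(attn1), self.conv2(attn2)
        attn = torch.cat([attn1, attn2], dim=1)
        avg_attn = attn.mean(dim=1, keepdim=True)
        max_attn = attn.max(dim=1, keepdim=True)[0]
        sig = self.conv_squeeze(
            torch.cat([avg_attn, max_attn], dim=1)).sigmoid()
        attn = attn1 * sig[:, 0:1] + attn2 * sig[:, 1:2]  # select between branches
        return x * self.conv(attn)

Here `conv0` followed by `conv_spatial` is the decomposition of the large kernel into two depthwise convolutions, and `conv_squeeze` implements the spatial selection between the two branch outputs.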

Any other context?

No response

Questions about image split

Model/Dataset/Scheduler description

Hello author, thank you very much for your research. After splitting the DOTA dataset with the code you provided, the train split contains 103,421 images, but the number of images in your log file is 8,541.

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

[Feature] dockerfile

What's the feature?

Hi
I tried to use the Dockerfile to build a Docker image for CPU-only testing on my Mac, but it does not seem to work.
Is the Dockerfile up to date?

thanks!

Any other context?

No response

[Feature] Adding LSKblock to YOLOv5

What's the feature?

I want to add LSKblock to YOLOv5-v7. The LSKblock code and model config file I used are shown below:
[screenshot: LSKblock code]
[screenshot: model config]

However, gradients explode. When I use SKblock, or the original YOLO model, training proceeds normally. SKblock is shown below:
[screenshot: SKblock code]
The `dim` argument passed to LSKblock should be correct, as shown below:
[screenshot: dim argument]
I don't know what went wrong. Is it related to the activation function?
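
For what it's worth, the detection configs in this repo train the LSK modules with global gradient clipping (optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)), as in the config dump earlier in this document). A minimal PyTorch sketch of the same safeguard inside a custom YOLO-style training loop, where `model`, `loss` and `optimizer` are placeholders:

loss.backward()
# Clip the global gradient norm before the optimizer step, mirroring the
# grad_clip=dict(max_norm=35, norm_type=2) setting used in this repo's configs.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=35, norm_type=2)
optimizer.step()
optimizer.zero_grad()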

Any other context?

No response

[Feature] Backbone pretrain config

What's the feature?

Hi, thank you for your amazing work!

I would like to know if you can share the LSKNet backbone pre-training config for ImageNet: batch size, optimizer, LR scheduler, learning rate, etc. I want to fine-tune the backbone and modify the architecture for downstream applications.

Thank you in advance!

Any other context?

No response

[Bug] TypeError: FormatCode() got an unexpected keyword argument 'verify'

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

cuda==11.1
python==3.8.0
torch==1.8.1+cu111
mmcv-full==1.6.0
mmdet==2.25.1
mmrotate==0.3.4

Reproduces the problem - code sample

TypeError: FormatCode() got an unexpected keyword argument 'verify'
subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', '-u', './LSKNet/tools/train.py', '--local_rank=1', 'LSKNet/configs/lsknet/lsk_s_fpn_1x_dota_le90.py', '--seed', '0', '--launcher', 'pytorch']' returned non-zero exit status 1.

Reproduces the problem - command or script

./LSKNet/tools/dist_train.sh LSKNet/configs/lsknet/lsk_s_fpn_1x_dota_le90.py 2

Reproduces the problem - error message

More details are shown in the figure below:
[screenshot: full error output]

Additional information

1. I'm training the model on a cloud server with two RTX 4090 cards.
2. The dataset I'm using is DOTA v1.0.
This is my first time training a model, and I'd love to hear from you to help me solve the problem and point out my mistakes.
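
For reference: this error is commonly reported as an mmcv/yapf incompatibility. Assuming the cause is the same here (yapf 0.40.2 removed the `verify` keyword argument of `FormatCode()`, which mmcv 1.x still passes when pretty-printing the config), pinning yapf is a known workaround:

pip install yapf==0.40.1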

Why, when I use lsk_s_ema_fpn_1x_dota_le90.py as the config and lsk_s_ema_fpn_1x_dota_le90_20230212-30ed4041.pth as the checkpoint to test on DOTA v1.0, is the mAP only 81.3% rather than the 81.85% reported on the LSKNet official website?

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 4090
GPU 1: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda-11.8
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 11.1.0-1ubuntu1~18.04.1) 11.1.0
PyTorch: 2.0.1+cu118
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.8
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.9.5
    • Built with CuDNN 8.7
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.15.2+cu118
OpenCV: 4.9.0
MMCV: 1.7.2
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.8
MMRotate: 0.3.4+

Reproduces the problem - code sample

# Copyright (c) OpenMMLab. All rights reserved.

import argparse
import os
import os.path as osp
import time
import warnings

import mmcv
import torch
from mmcv import Config, DictAction
from mmcv.cnn import fuse_conv_bn
from mmcv.parallel import MMDataParallel, MMDistributedDataParallel
from mmcv.runner import (get_dist_info, init_dist, load_checkpoint,
wrap_fp16_model)
from mmdet.apis import multi_gpu_test, single_gpu_test
from mmdet.datasets import build_dataloader, replace_ImageToTensor

from mmrotate.datasets import build_dataset
from mmrotate.models import build_detector
from mmrotate.utils import compat_cfg, setup_multi_processes

def parse_args():
    """Parse parameters."""
    parser = argparse.ArgumentParser(
        description='MMDet test (and eval) a model')
    parser.add_argument('--config', default='/home/xyn02/LSKNet/configs/xynModel/lsk_s_ema_fpn_1x_dota_le90.py', help='test config file path')
    parser.add_argument('--checkpoint', default='/home/xyn02/LSKNet/configs/xynModel/lsk_s_ema_fpn_1x_dota_le90_20230212-30ed4041.pth', help='checkpoint file')
    parser.add_argument(
        '--work-dir',
        default='/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/work_dir',
        help='the directory to save the file containing evaluation metrics')
    parser.add_argument('--out', help='output result file in pickle format')
    parser.add_argument(
        '--fuse-conv-bn',
        action='store_true',
        help='Whether to fuse conv and bn, this will slightly increase'
        'the inference speed')
    parser.add_argument(
        '--gpu-ids',
        type=int,
        nargs='+',
        help='ids of gpus to use '
        '(only applicable to non-distributed testing)')
    parser.add_argument(
        '--format-only',
        action='store_true',
        help='Format the output results without perform evaluation. It is'
        'useful when you want to format the result to a specific format and '
        'submit it to the test server')
    parser.add_argument(
        '--eval',
        type=str,
        nargs='+',
        help='evaluation metrics, which depends on the dataset, e.g., "bbox",'
        ' "segm", "proposal" for COCO, and "mAP", "recall" for PASCAL VOC')
    parser.add_argument('--show', action='store_true', help='show results')
    parser.add_argument(
        '--show-dir',
        default='/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/show_dir',
        help='directory where painted images will be saved')
    parser.add_argument(
        '--show-score-thr',
        type=float,
        default=0.3,
        help='score threshold (default: 0.3)')
    parser.add_argument(
        '--gpu-collect',
        action='store_true',
        help='whether to use gpu to collect results.')
    parser.add_argument(
        '--tmpdir',
        help='tmp directory used for collecting results from multiple '
        'workers, available when gpu-collect is not specified')
    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    parser.add_argument(
        '--eval-options',
        nargs='+',
        action=DictAction,
        help='custom options for evaluation, the key-value pair in xxx=yyy '
        'format will be kwargs for dataset.evaluate() function')
    parser.add_argument(
        '--launcher',
        choices=['none', 'pytorch', 'slurm', 'mpi'],
        default='none',
        help='job launcher')
    parser.add_argument('--local_rank', type=int, default=0)
    args = parser.parse_args()
    if 'LOCAL_RANK' not in os.environ:
        os.environ['LOCAL_RANK'] = str(args.local_rank)

    return args

def main():
    args = parse_args()

    assert args.out or args.eval or args.format_only or args.show \
        or args.show_dir, \
        ('Please specify at least one operation (save/eval/format/show the '
         'results / save the results) with the argument "--out", "--eval"'
         ', "--format-only", "--show" or "--show-dir"')

    if args.eval and args.format_only:
        raise ValueError('--eval and --format_only cannot be both specified')

    if args.out is not None and not args.out.endswith(('.pkl', '.pickle')):
        raise ValueError('The output file must be a pkl file.')

    cfg = Config.fromfile(args.config)
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)

    cfg = compat_cfg(cfg)

    if args.format_only and cfg.mp_start_method != 'spawn':
        warnings.warn(
            '`mp_start_method` in `cfg` is set to `spawn` to use CUDA '
            'with multiprocessing when formatting output result.')
        cfg.mp_start_method = 'spawn'

    setup_multi_processes(cfg)

    if cfg.get('cudnn_benchmark', False):
        torch.backends.cudnn.benchmark = True

    cfg.model.pretrained = None
    if cfg.model.get('neck'):
        if isinstance(cfg.model.neck, list):
            for neck_cfg in cfg.model.neck:
                if neck_cfg.get('rfp_backbone'):
                    if neck_cfg.rfp_backbone.get('pretrained'):
                        neck_cfg.rfp_backbone.pretrained = None
        elif cfg.model.neck.get('rfp_backbone'):
            if cfg.model.neck.rfp_backbone.get('pretrained'):
                cfg.model.neck.rfp_backbone.pretrained = None

    if args.gpu_ids is not None:
        cfg.gpu_ids = args.gpu_ids
    else:
        cfg.gpu_ids = range(1)

    if args.launcher == 'none':
        distributed = False
        if len(cfg.gpu_ids) > 1:
            warnings.warn(
                f'We treat {cfg.gpu_ids} as gpu-ids, and reset to '
                f'{cfg.gpu_ids[0:1]} as gpu-ids to avoid potential error in '
                'non-distribute testing time.')
            cfg.gpu_ids = cfg.gpu_ids[0:1]
    else:
        distributed = True
        init_dist(args.launcher, **cfg.dist_params)

    test_dataloader_default_args = dict(
        samples_per_gpu=1, workers_per_gpu=2, dist=distributed, shuffle=False)

    if isinstance(cfg.data.test, dict):
        cfg.data.test.test_mode = True
        if 'samples_per_gpu' in cfg.data.test:
            warnings.warn('`samples_per_gpu` in `test` field of '
                          'data will be deprecated, you should'
                          ' move it to `test_dataloader` field')
            test_dataloader_default_args['samples_per_gpu'] = \
                cfg.data.test.pop('samples_per_gpu')
        if test_dataloader_default_args['samples_per_gpu'] > 1:
            # Replace 'ImageToTensor' to 'DefaultFormatBundle'
            cfg.data.test.pipeline = replace_ImageToTensor(
                cfg.data.test.pipeline)
    elif isinstance(cfg.data.test, list):
        for ds_cfg in cfg.data.test:
            ds_cfg.test_mode = True
            if 'samples_per_gpu' in ds_cfg:
                warnings.warn('`samples_per_gpu` in `test` field of '
                              'data will be deprecated, you should'
                              ' move it to `test_dataloader` field')
        samples_per_gpu = max(
            [ds_cfg.pop('samples_per_gpu', 1) for ds_cfg in cfg.data.test])
        test_dataloader_default_args['samples_per_gpu'] = samples_per_gpu
        if samples_per_gpu > 1:
            for ds_cfg in cfg.data.test:
                ds_cfg.pipeline = replace_ImageToTensor(ds_cfg.pipeline)

    test_loader_cfg = {
        **test_dataloader_default_args,
        **cfg.data.get('test_dataloader', {})
    }

    rank, _ = get_dist_info()
    if args.work_dir is not None and rank == 0:
        mmcv.mkdir_or_exist(osp.abspath(args.work_dir))
        timestamp = time.strftime('%Y%m%d_%H%M%S', time.localtime())
        json_file = osp.join(args.work_dir, f'eval_{timestamp}.json')

    dataset = build_dataset(cfg.data.test)
    data_loader = build_dataloader(dataset, **test_loader_cfg)

    cfg.model.train_cfg = None
    model = build_detector(cfg.model, test_cfg=cfg.get('test_cfg'))
    fp16_cfg = cfg.get('fp16', None)
    if fp16_cfg is not None:
        wrap_fp16_model(model)
    checkpoint = load_checkpoint(model, args.checkpoint, map_location='cpu')
    if args.fuse_conv_bn:
        model = fuse_conv_bn(model)
    if 'CLASSES' in checkpoint.get('meta', {}):
        model.CLASSES = checkpoint['meta']['CLASSES']
    else:
        model.CLASSES = dataset.CLASSES

    if not distributed:
        model = MMDataParallel(model, device_ids=cfg.gpu_ids)
        outputs = single_gpu_test(model, data_loader, args.show, args.show_dir,
                                  args.show_score_thr)
    else:
        model = MMDistributedDataParallel(
            model.cuda(),
            device_ids=[torch.cuda.current_device()],
            broadcast_buffers=False)
        outputs = multi_gpu_test(model, data_loader, args.tmpdir,
                                 args.gpu_collect)

    rank, _ = get_dist_info()
    if rank == 0:
        if args.out:
            print(f'\nwriting results to {args.out}')
            mmcv.dump(outputs, args.out)
        kwargs = {} if args.eval_options is None else args.eval_options
        if args.format_only:
            dataset.format_results(outputs, **kwargs)
        if args.eval:
            eval_kwargs = cfg.get('evaluation', {}).copy()
            for key in [
                    'interval', 'tmpdir', 'start', 'gpu_collect', 'save_best',
                    'rule', 'dynamic_intervals'
            ]:
                eval_kwargs.pop(key, None)
            eval_kwargs.update(dict(metric=args.eval, **kwargs))
            metric = dataset.evaluate(outputs, **eval_kwargs)
            print(metric)
            metric_dict = dict(config=args.config, metric=metric)
            if args.work_dir is not None and rank == 0:
                mmcv.dump(metric_dict, json_file)

if __name__ == '__main__':
    main()

Reproduces the problem - command or script

python ./tools/test.py --format-only --eval-options submission_dir=/home/xyn02/LSKNet/runs/202403/20240326/lsk_s_ema_fpn_1x_dota_le90_run/work_dir/Task1_results

Reproduces the problem - error message

[screenshots: error output]

Additional information

No response

FAIR1M pretrained weights

Hi, the link on the README to download the pretrained weights for FAIR1M (LSKNet_S) redirects me to Baidu and I can't download it. Do you have another link? It looks like the WeTransfer link that was here #2 has expired. Many thanks! @zcablii

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

/

Reproduces the problem - code sample

/

Reproduces the problem - command or script

/

Reproduces the problem - error message

/

Additional information

/

[Docs] What version of mmcv is required?

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

"Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
root@csip-108:/mnt/csip-107/mmrotate/tools#

In lsknet.py, the import `from mmcv.cnn.utils.weight_init import (constant_init, normal_init, trunc_normal_init)` fails.

My versions: mmcv 2.0.0, mmrotate 0.3.4
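
For reference, mmrotate 0.3.4 targets mmcv-full 1.x, where that import path exists; under mmcv 2.x the helpers moved out of mmcv (the mmengine path below is my assumption from the 2.x reorganisation, and the rest of this repo still requires mmcv-full 1.x in any case):

# mmcv-full 1.x (what mmrotate 0.3.4 expects):
from mmcv.cnn.utils.weight_init import constant_init, normal_init, trunc_normal_init

# mmcv >= 2.0 split these helpers out; they now live in mmengine:
# from mmengine.model import constant_init, normal_init, trunc_normal_init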

Suggest a potential alternative/fix

No response

About LSKmodule

Model/Dataset/Scheduler description

Thank you for such good open-source code! Can the LSK module be plugged into other network architectures? If so, where should the module usually be placed, for example after a convolution layer? Looking forward to your reply, thank you!
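
Not an official recommendation, but as a sketch: the block is residual and shape-preserving, so a common placement is after a convolution stage whose channel width matches `dim`, reusing the LSKblockSketch defined earlier in this document:

import torch.nn as nn

# Hypothetical placement: LSK attention after a 3x3 conv stage with 256 channels.
stage = nn.Sequential(
    nn.Conv2d(256, 256, 3, padding=1),
    nn.BatchNorm2d(256),
    nn.ReLU(inplace=True),
    LSKblockSketch(256),  # from the sketch earlier in this document
)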

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

[Feature]

What's the feature?

Hello author, Figure 5 in the paper shows that LSKNet's feature map visualizations are better than ResNet-50's. Could you please provide your feature map visualization code?
[figure: Fig. 5 feature map visualizations from the paper]
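
The repository does not seem to include the authors' visualization script; below is a generic forward-hook sketch that produces comparable channel-mean heat maps. The config/checkpoint paths, the hooked stage (`backbone.block3[-1]`) and `img_tensor` are all assumptions:

import torch
import matplotlib.pyplot as plt
from mmdet.apis import init_detector

model = init_detector('lsk_s_fpn_1x_dota_le90.py',   # placeholder config
                      'lsk_s_fpn_1x_dota_le90.pth',  # placeholder weights
                      device='cuda:0')
feats = {}
handle = model.backbone.block3[-1].register_forward_hook(
    lambda mod, inp, out: feats.update(stage3=out.detach()))
with torch.no_grad():
    model.extract_feat(img_tensor)  # img_tensor: a normalised (1, 3, H, W) batch
handle.remove()

heat = feats['stage3'][0].mean(dim=0)  # average over channels -> (H, W)
plt.imshow(heat.cpu(), cmap='jet')
plt.axis('off')
plt.savefig('featmap_stage3.png')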

Any other context?

No response

[scheduler]

Model/Dataset/Scheduler description

Your paper is very well-written, and I appreciate you open-sourcing the code. Your experiments utilized 8 GPUs, and the norm_cfg during training was set to SyncBN. In my experimental setup, I don't have as many GPUs. When I used two GPUs and reduced the learning rate accordingly (1/4 of the original), the mAP of the model trained with the backbone you provided was approximately 1% lower than that reported in the paper. Is the number of GPUs the main reason for this issue? Additionally, although there was some improvement in mAP when I increased the learning rate, it still didn't reach the level reported in the paper. I noticed in your log files that the training set was labeled as trainval_2 instead of the usual trainval. Did you add any operations during the data processing phase?

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

[Feature] Does LSKNet support being connected to a ReFPN neck?

What's the feature?

Dear LSKNet authors and community maintainers,

I would like to ask whether LSKNet supports being connected to a ReFPN neck. If it does, my config and the error message are below; I hope you can help me find what went wrong. Thanks!

[screenshot: config]

[screenshot: error message]

Any other context?

No response

[Bug] Hello, I installed the mmrotate framework and ran image_demo.py to test the demo. It only works when the device is CPU; with any of my GPUs the final output is just the original image. I followed your configuration and can't see anything wrong. Could you please help me find the problem? Thanks.

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

sys.platform: linux
Python: 3.8.18 | packaged by conda-forge | (default, Dec 23 2023, 17:21:28) [GCC 12.3.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 3090
CUDA_HOME: /home/ZhangShiwei/cuda-11.3
NVCC: Cuda compilation tools, release 11.3, V11.3.58
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.1+cu110
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.0
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.2+cu110
OpenCV: 4.9.0
MMCV: 1.5.3
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
MMRotate: 0.3.4+

Reproduces the problem - code sample

My conda list:
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
_openmp_mutex 4.5 2_gnu https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
addict 2.4.0 pypi_0 pypi
aliyun-python-sdk-core 2.15.0 pypi_0 pypi
aliyun-python-sdk-kms 2.16.2 pypi_0 pypi
bzip2 1.0.8 hd590300_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
ca-certificates 2024.2.2 hbcca054_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
certifi 2024.2.2 pypi_0 pypi
cffi 1.16.0 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
contourpy 1.1.1 pypi_0 pypi
crcmod 1.7 pypi_0 pypi
cryptography 42.0.5 pypi_0 pypi
cudatoolkit 11.0.3 h7761cd4_13 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
cycler 0.12.1 pypi_0 pypi
e2cnn 0.2.3 pypi_0 pypi
filelock 3.13.1 pypi_0 pypi
fonttools 4.50.0 pypi_0 pypi
fsspec 2024.3.0 pypi_0 pypi
huggingface-hub 0.21.4 pypi_0 pypi
idna 3.6 pypi_0 pypi
importlib-metadata 7.0.2 pypi_0 pypi
importlib-resources 6.3.1 pypi_0 pypi
jmespath 0.10.0 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
ld_impl_linux-64 2.40 h41732ed_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libffi 3.4.2 h7f98852_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libgcc-ng 13.2.0 h807b86a_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libgomp 13.2.0 h807b86a_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libnsl 2.0.1 hd590300_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libsqlite 3.45.2 h2797004_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libstdcxx-ng 13.2.0 h7e041cc_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libuuid 2.38.1 h0b41bf4_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libxcrypt 4.4.36 hd590300_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
libzlib 1.2.13 hd590300_5 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
markdown 3.6 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
matplotlib 3.7.5 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mmcv-full 1.5.3 pypi_0 pypi
mmdet 2.25.1 pypi_0 pypi
mmrotate 0.3.4 dev_0 <develop>
model-index 0.1.11 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 h59595ed_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
numpy 1.24.4 pypi_0 pypi
opencv-python 4.9.0.80 pypi_0 pypi
opendatalab 0.0.10 pypi_0 pypi
openmim 0.3.9 pypi_0 pypi
openssl 3.2.1 hd590300_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
openxlab 0.0.36 pypi_0 pypi
ordered-set 4.1.0 pypi_0 pypi
oss2 2.17.0 pypi_0 pypi
packaging 24.0 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
pillow 10.2.0 pypi_0 pypi
pip 24.0 pyhd8ed1ab_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
platformdirs 4.2.0 pypi_0 pypi
pycocotools 2.0.7 pypi_0 pypi
pycparser 2.21 pypi_0 pypi
pycryptodome 3.20.0 pypi_0 pypi
pygments 2.17.2 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
python 3.8.18 hd12c33a_1_cpython https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2023.4 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h8228510_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
requests 2.28.2 pypi_0 pypi
rich 13.4.2 pypi_0 pypi
safetensors 0.4.2 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setuptools 60.2.0 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sympy 1.12 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
terminaltables 3.1.10 pypi_0 pypi
timm 0.9.16 pypi_0 pypi
tk 8.6.13 noxft_h4845f30_101 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
tomli 2.0.1 pypi_0 pypi
torch 1.7.1+cu110 pypi_0 pypi
torchaudio 0.7.2 pypi_0 pypi
torchvision 0.8.2+cu110 pypi_0 pypi
tqdm 4.65.2 pypi_0 pypi
typing-extensions 4.10.0 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
urllib3 1.26.18 pypi_0 pypi
wheel 0.42.0 pyhd8ed1ab_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
xz 5.2.6 h166bdaf_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge
yapf 0.40.1 pypi_0 pypi
zipp 3.18.1 pypi_0 pypi

Reproduces the problem - command or script

1

Reproduces the problem - error message

When running python demo/image_demo.py demo/demo.jpg oriented_rcnn_r50_fpn_1x_dota_le90.py oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth --out-file result.jpg --device 'cuda:2', the output result.jpg is just the original image with no boxes drawn.
Running the same command with --device 'cpu' produces an image with the predicted boxes.
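One way to narrow this down is to bypass the demo script and count detections directly on each device. A minimal sketch, assuming mmrotate 0.3.x (whose demo wraps mmdet.apis) and the file names from the command above:

from mmdet.apis import inference_detector, init_detector

config = 'oriented_rcnn_r50_fpn_1x_dota_le90.py'
checkpoint = 'oriented_rcnn_r50_fpn_1x_dota_le90-6d2b2ce0.pth'

for device in ('cpu', 'cuda:2'):
    model = init_detector(config, checkpoint, device=device)
    # result is a list with one (n, 6) array per class: cx, cy, w, h, angle, score
    result = inference_detector(model, 'demo/demo.jpg')
    print(device, 'detections:', sum(len(boxes) for boxes in result))

If CUDA consistently returns zero boxes while CPU does not, the demo script is likely fine and the problem sits in the CUDA build of the compiled ops (for example an mmcv-full / PyTorch CUDA version mismatch).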

Additional information

1

weight file [Docs]

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

Hello author, I am a CV novice. Where can I find the code for the model network in your project? Also, when I run the project, the terminal warns that some keys in the weight file do not correspond to the model's keys. What causes this?
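As a general pointer for the first question: the registry can tell you which file defines the network. A minimal sketch, assuming mmrotate 0.3.x (whose mmrotate.models.builder mirrors mmdet's); the 'LSKNet' type name and embed_dims values are taken from the files in configs/lsknet/:

from mmrotate.models import build_backbone

# build the backbone the same way the config does, then ask where its class lives
backbone = build_backbone(dict(type='LSKNet', embed_dims=[64, 128, 320, 512]))
print(type(backbone).__module__)  # e.g. mmrotate.models.backbones.<file>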

Suggest a potential alternative/fix

No response

pre-trained weights

What's the feature?

Hi,
I'm trying to download the weights, but the link asks me (in Chinese) to install the Baidu Netdisk client instead. Could you please put the weights on a hassle-free host such as GitHub or Google Drive?
Thank you!

Any other context?

No response

WARNING - The model and loaded state dict do not match exactly

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

Package Version Source


mmcv 1.6.0 https://github.com/open-mmlab/mmcv
mmcv-full 1.6.0 https://github.com/open-mmlab/mmcv
mmdet 2.28.2 https://github.com/open-mmlab/mmdetection
mmengine 0.7.4 https://github.com/open-mmlab/mmengine
mmrotate 0.3.4 /mnt/workspace/Large-Selective-Kernel-Network

Reproduces the problem - code sample

(openmmlab) /mnt/workspace> cd Large-Selective-Kernel-Network/
(openmmlab) /mnt/workspace/Large-Selective-Kernel-Network> python tools/train.py configs/lsknet/
.ipynb_checkpoints/ lsk_s_fpn_1x_fair_le90.py README.md
lsk_s_ema_fpn_1x_dota_le90.py lsk_s_fpn_3x_hrsc_le90.py
lsk_s_fpn_1x_dota_le90.py lsk_t_fpn_1x_dota_le90.py

Reproduces the problem - command or script

(openmmlab) /mnt/workspace/Large-Selective-Kernel-Network> python tools/train.py configs/lsknet/lsk_t_fpn_1x_dota_le90.py

Reproduces the problem - error message

2023-06-23 16:52:23,196 - mmrotate - INFO - Set random seed to 276630770, deterministic: False
/home/pai/envs/openmmlab/lib/python3.8/site-packages/mmdet/models/dense_heads/anchor_head.py:116: UserWarning: DeprecationWarning: num_anchors is deprecated, for consistency or also use num_base_priors instead
warnings.warn('DeprecationWarning: num_anchors is deprecated, '
init cfg {'type': 'Pretrained', 'checkpoint': '/mnt/workspace/Large-Selective-Kernel-Network/lsk_s_backbone-e9d2e551.pth'}
2023-06-23 16:52:23,414 - mmrotate - INFO - initialize LSKNet with init_cfg {'type': 'Pretrained', 'checkpoint': '/mnt/workspace/Large-Selective-Kernel-Network/lsk_s_backbone-e9d2e551.pth'}
2023-06-23 16:52:23,414 - mmcv - INFO - load model from: /mnt/workspace/Large-Selective-Kernel-Network/lsk_s_backbone-e9d2e551.pth
2023-06-23 16:52:23,415 - mmcv - INFO - load checkpoint from local path: /mnt/workspace/Large-Selective-Kernel-Network/lsk_s_backbone-e9d2e551.pth
2023-06-23 16:52:23,471 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: head.weight, head.bias

2023-06-23 16:52:23,487 - mmrotate - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-06-23 16:52:23,510 - mmrotate - INFO - initialize OrientedRPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01}
2023-06-23 16:52:23,518 - mmrotate - INFO - initialize RotatedShared2FCBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'layer': 'Linear', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}]

Additional information

I downloaded the pre-trained model and ran the command following the instructions in https://github.com/zcablii/Large-Selective-Kernel-Network/issues/4, and got the output above:
it seems the model and the loaded state dict do not match exactly. How can I solve this? Thanks!
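For what it's worth, this particular warning is expected: head.weight and head.bias are the ImageNet classification head saved with the backbone checkpoint, and the detector never uses them, so they are simply skipped. A minimal sketch (assuming the checkpoint is a plain state dict, as the mmcv log above suggests) to confirm this and optionally strip the extra keys:

import torch

ckpt = torch.load('lsk_s_backbone-e9d2e551.pth', map_location='cpu')
state = ckpt.get('state_dict', ckpt)  # handle both wrapped and plain formats
extra = [k for k in state if k.startswith('head.')]
print(extra)  # expected: ['head.weight', 'head.bias']

# optional: save a copy without the classifier head so the warning goes away
torch.save({k: v for k, v in state.items() if not k.startswith('head.')},
           'lsk_s_backbone_nohead.pth')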

[Bug] ValueError: need at least one array to concatenate

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

master branch https://github.com/open-mmlab/mmrotate

Environment

TorchVision: 0.14.1+cu116
OpenCV: 4.9.0
MMCV: 1.7.1
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.6
MMRotate: 0.3.4+

Reproduces the problem - code sample

When I run train.py on my own dataset, I get the error ValueError: need at least one array to concatenate (a sanity check is sketched at the end of this issue).

Reproduces the problem - command or script

python tools/train.py configs/lsknet/lsk_s_fpn_1x_dota_le90.py --gpu-ids 2

Reproduces the problem - error message

Traceback (most recent call last):
File "/data/hxj/LSKNet-main/tools/train.py", line 193, in
main()
File "/data/hxj/LSKNet-main/tools/train.py", line 182, in main
train_detector(
File "/data/hxj/LSKNet-main/mmrotate/apis/train.py", line 141, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/mmcv/runner/epoch_based_runner.py", line 49, in train
for i, data_batch in enumerate(self.data_loader):
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 435, in iter
return self._get_iterator()
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 381, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1072, in init
self._reset(loader, first_iter=True)
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1105, in _reset
self._try_put_index()
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1339, in _try_put_index
index = self._next_index()
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 618, in _next_index
return next(self._sampler_iter) # may raise StopIteration
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 254, in iter
for idx in self.sampler:
File "/home/hxj/anaconda3/envs/openmmlab/lib/python3.9/site-packages/mmdet/datasets/samplers/group_sampler.py", line 36, in iter
indices = np.concatenate(indices)
ValueError: need at least one array to concatenate

Additional information

No response
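As a general note: in mmdet-based code this error almost always means the training dataset resolved to zero samples, typically because data_root or ann_file points to the wrong place, the annotation files are empty, or CLASSES does not match the labels in your own dataset. A minimal check, assuming the mmrotate 0.3.x API used by this repo; adjust the config path to your own:

from mmcv import Config
from mmrotate.datasets import build_dataset  # also registers the rotated datasets

cfg = Config.fromfile('configs/lsknet/lsk_s_fpn_1x_dota_le90.py')
dataset = build_dataset(cfg.data.train)
print('samples:', len(dataset))   # 0 here reproduces the concatenate error
print('classes:', dataset.CLASSES)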

[Feature] Use LSKNet to do segmentation on LoveDA

What's the feature?

I have seen your results on the LoveDA dataset for the semantic segmentation task on paperswithcode, and I would like to know how to reproduce that process (see the sketch below this issue).

Any other context?

No response
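On the LoveDA question above: one conceptual route is to plug the LSKNet backbone into an mmsegmentation-style encoder-decoder. The fragment below is purely hypothetical and not from this repo — it assumes LSKNet has been registered as an mmseg backbone, that its four stages output [64, 128, 320, 512] channels as in the detection configs, and that LoveDA's 7 classes are used:

# hypothetical mmsegmentation config fragment
model = dict(
    type='EncoderDecoder',
    backbone=dict(
        type='LSKNet',  # assumes LSKNet is registered with mmseg's BACKBONES
        embed_dims=[64, 128, 320, 512],
        depths=[2, 2, 4, 2],
        init_cfg=dict(type='Pretrained',
                      checkpoint='lsk_s_backbone-e9d2e551.pth')),
    decode_head=dict(
        type='UPerHead',
        in_channels=[64, 128, 320, 512],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=512,
        dropout_ratio=0.1,
        num_classes=7,  # LoveDA has 7 semantic classes
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)))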

[Docs]

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

I've searched your docs for the FPS of this code to make sure I'm on the right track, but found nothing.
Could you add more information about the FPS?

Suggest a potential alternative/fix

Add information about system specs and the FPS measured on them.
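Until official numbers are documented, a rough measurement is easy to take yourself. A minimal sketch (not an official benchmark — FPS depends heavily on GPU, input size and batch size; the checkpoint path is a placeholder):

import time
from mmdet.apis import inference_detector, init_detector

model = init_detector('configs/lsknet/lsk_s_fpn_1x_dota_le90.py',
                      'lsk_s_fpn_1x_dota_le90.pth',  # placeholder checkpoint
                      device='cuda:0')
inference_detector(model, 'demo/demo.jpg')  # warm-up run

n = 50
t0 = time.perf_counter()
for _ in range(n):
    inference_detector(model, 'demo/demo.jpg')
print('FPS: %.1f' % (n / (time.perf_counter() - t0)))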

Thank you for your work. Could you explain how your model reaches the 81.8 result?

Model/Dataset/Scheduler description

I see that your training config uses the train and val sets for training and the val set for validation, trained for 12 epochs; in your README, the LSKNet_S* log reports mAP: 0.8691 on the val set, while the result uploaded to the DOTA server is 81.85. My question is: is this val result correlated with the test-set result? And how was the test set split and processed for your submitted results — was there any special handling?

Open source status

  • The model implementation is available
  • The model weights are available.

Provide useful links for the implementation

No response

[Docs] Some questions about reproducing the LSKNet_T results on the DOTA-1.0 dataset

Branch

master branch https://mmrotate.readthedocs.io/en/latest/

📚 The doc issue

Hello author, I am reproducing your code on a single GPU. Following the instructions in your document, I use multi-scale training and load only the pre-trained backbone. The config file is LSKNet_T, with syncBN changed to BN and the learning rate changed from 0.0002 to 0.0002/8. Attached is the log file from the training process. The experimental results show a significant accuracy gap and do not reach the 0.852 in your log. Is there an issue with my hyperparameter settings? Looking forward to your reply.
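For reference, the overrides described above could be written as a small config that inherits the original. A hypothetical sketch, assuming mmrotate 0.3.x config inheritance and that SyncBN sits in the backbone's norm_cfg as in the shipped configs:

# lsk_t_fpn_1x_dota_le90_1gpu.py (hypothetical file name)
_base_ = ['./lsk_t_fpn_1x_dota_le90.py']

# linear scaling rule: 8-GPU lr 0.0002 -> single-GPU lr 0.0002 / 8
optimizer = dict(lr=0.0002 / 8)

# SyncBN requires distributed training; fall back to plain BN on one GPU
model = dict(backbone=dict(norm_cfg=dict(type='BN', requires_grad=True)))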

Suggest a potential alternative/fix

No response
