jessemelpolio / faster_rcnn_for_dota

Code used for training Faster R-CNN on DOTA

Home Page: https://arxiv.org/abs/1711.10398

License: Apache License 2.0

Python 67.01% C++ 4.62% Cuda 27.53% Batchfile 0.03% Shell 0.02% Makefile 0.03% C 0.76%

aerial detection faster-rcnn dota

faster_rcnn_for_dota's Issues

Assistance with Inference on Custom Dataset

Can anyone help with running inference with this model on a custom dataset?

Test the pretrained model on my aerial images

Hi,

Can anyone tell me how I can test the pretrained model on my aerial images before fine-tuning? Which script should I run?

I am also confused by the train and test folders used for training here. I think they should be train and val folders instead, because the test images do not have corresponding labels.

Can anyone suggest something?

Thanks in advance

Data preparation problem

I don't know how to produce train.txt, or what should go in images and labelTxt. Can you provide more detail? Thanks for your help.

how to use the nms method

import numpy as np
from nms2.nms import py_nms_wrapper, cpu_nms_wrapper, gpu_nms_wrapper

# Each row is presumably x1, y1, x2, y2, x3, y3, x4, y4, score
boxes2 = np.array([
    [0, 100, 0, 0, 100, 0, 100, 100, 0.99],
    [10, 110, 10, 10, 130, 10, 150, 110, 0.88],  # keep 0.68
    [150, 250, 150, 150, 250, 150, 250, 250, 0.77],  # keep 0.0
    [20, 50, 50, 20, 120, 50, 50, 120, 0.66],  # discard 0.70
], dtype=np.float32)
nms = gpu_nms_wrapper(0.6, 0)
keep = nms(boxes2)
print(keep)

This is a simple test of the GPU NMS. The output of this code is [2, 1, 3, 0].
Is this result correct?
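One way to cross-check a GPU NMS result is a slow pure-Python reference. The sketch below is not the repository's implementation. It assumes each row is x1, y1, x2, y2, x3, y3, x4, y4, score, that every quadrilateral is convex, and that a box is suppressed when its IoU with an already kept box exceeds the threshold. The names `quad_iou` and `quad_nms` are hypothetical.

```python
def _signed_area(pts):
    """Shoelace formula; positive when the vertices run counter-clockwise."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def _clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex, CCW `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        if not output:
            break
        ax, ay = clipper[i]
        bx, by = clipper[(i + 1) % len(clipper)]
        current, output = output, []
        # Cross product >= 0 means the point lies on the inner side of edge a->b.
        sides = [(bx - ax) * (py - ay) - (by - ay) * (px - ax)
                 for px, py in current]
        for j in range(len(current)):
            k = (j + 1) % len(current)
            p, q, sp, sq = current[j], current[k], sides[j], sides[k]
            if sp >= 0:
                output.append(p)
            if (sp >= 0) != (sq >= 0):
                # The segment p->q crosses the clip edge: add the crossing point.
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]),
                               p[1] + t * (q[1] - p[1])))
    return output


def _as_ccw(row):
    """Turn the first 8 values of a row into 4 counter-clockwise points."""
    pts = [(float(row[i]), float(row[i + 1])) for i in range(0, 8, 2)]
    return pts if _signed_area(pts) >= 0 else pts[::-1]


def quad_iou(row_a, row_b):
    """IoU of two convex quadrilaterals (assumption: both are convex)."""
    a, b = _as_ccw(row_a), _as_ccw(row_b)
    inter_pts = _clip(a, b)
    inter = abs(_signed_area(inter_pts)) if len(inter_pts) >= 3 else 0.0
    union = abs(_signed_area(a)) + abs(_signed_area(b)) - inter
    return inter / union if union > 0 else 0.0


def quad_nms(rows, thresh):
    """Greedy NMS: visit boxes by descending score, drop those with IoU > thresh."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][8], reverse=True)
    keep = []
    for i in order:
        if all(quad_iou(rows[i], rows[k]) <= thresh for k in keep):
            keep.append(i)
    return keep
```

Calling `quad_nms(boxes2.tolist(), 0.6)` on the array from the question gives a list to compare against. A greedy reference like this always keeps the highest-scoring box first (index 0 for that array), so an output beginning with 2 would at least suggest a different ordering convention in the wrapper.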

I want to train the DOTA in specific image scale

Hello, thanks for your work!
I want to train on the DOTA dataset at a specific image scale, such as 768*768, and on only some classes, such as plane or storage tank, using data split by the DOTA-devkit. But I hit the training problem below:
('Called with argument:', Namespace(cfg='experiments/faster_rcnn/cfgs/DOTA_quadrangle.yaml', frequent=100))
{'CLASS_AGNOSTIC': False,
'MXNET_VERSION': 'mxnet',
'RESIZE_TO_FIX_SIZE': True,
'SCALES': [(768, 768)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': False,
'DO_MULTISCALE_TEST': False,
'HAS_RPN': True,
'MULTISCALE': [1.0, 1.2, 1.4, 1.6],
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 0,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 0,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'max_per_image': 300,
'save_img_path': '/home/zxr/houweining/test_frcnn/Faster_RCNN_for_DOTA/data/hwn/model_01_vis',
'test_epoch': 59},
'TRAIN': {'ALTERNATE': {'RCNN_BATCH_IMAGES': 0,
'RPN_BATCH_IMAGES': 0,
'rfcn1_epoch': 0,
'rfcn1_lr': 0,
'rfcn1_lr_step': '',
'rfcn2_epoch': 0,
'rfcn2_lr': 0,
'rfcn2_lr_step': '',
'rpn1_epoch': 0,
'rpn1_lr': 0,
'rpn1_lr_step': '',
'rpn2_epoch': 0,
'rpn2_lr': 0,
'rpn2_lr_step': '',
'rpn3_epoch': 0,
'rpn3_lr': 0,
'rpn3_lr_step': ''},
'ASPECT_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': 128,
'BATCH_ROIS_OHEM': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': False,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'BBOX_WEIGHTS': array([1., 1., 1., 1., 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.1,
'CXX_PROPOSAL': False,
'ENABLE_OHEM': True,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FLIP': True,
'RESUME': False,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 0,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'SHUFFLE': True,
'begin_epoch': 0,
'end_epoch': 60,
'lr': 0.0005,
'lr_factor': 0.1,
'lr_step': '45,52',
'model_prefix': 'rcnn_DOTA_quadrangle',
'momentum': 0.9,
'warmup': True,
'warmup_lr': 5e-05,
'warmup_step': 1000,
'wd': 0.0005},
'dataset': {'NUM_CLASSES': 2,
'dataset': 'DOTA_oriented',
'dataset_path': '/home/zxr/houweining/test_frcnn/Faster_RCNN_for_DOTA/data/dota_for_obb_768',
'image_set': 'train',
'proposal': 'rpn',
'root_path': '/home/zxr/houweining/test_frcnn/Faster_RCNN_for_DOTA/data/hwn/model_01',
'test_image_set': 'test'},
'default': {'frequent': 100, 'kvstore': 'device'},
'gpus': '0',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'FIXED_PARAMS': ['conv1',
'bn_conv1',
'res2',
'bn2',
'gamma',
'beta'],
'FIXED_PARAMS_SHARED': ['conv1',
'bn_conv1',
'res2',
'bn2',
'res3',
'bn3',
'res4',
'bn4',
'gamma',
'beta'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 9,
'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'pretrained': './model/pretrained_model/resnet_v1_101',
'pretrained_epoch': 0},
'output_path': './output/rcnn/DOTA_quadrangle',
'symbol': 'resnet_v1_101_rcnn_quadrangle'}
num_images 2026
wrote gt roidb to /home/zxr/houweining/test_frcnn/Faster_RCNN_for_DOTA/data/hwn/model_01/cache/DOTA_oriented_train_gt_roidb.pkl
append flipped images to roidb
filtered 150 roidb entries: 4052 -> 3902
providing maximum shape [('data', (1, 3, 768, 768)), ('gt_boxes', (1, 100, 9))] [('label', (1, 20736)), ('bbox_target', (1, 36, 48, 48)), ('bbox_weight', (1, 36, 48, 48))]
{'bbox_target': (1L, 36L, 48L, 48L),
'bbox_weight': (1L, 36L, 48L, 48L),
'data': (1L, 3L, 768L, 768L),
'gt_boxes': (1L, 3L, 9L),
'im_info': (1L, 3L),
'label': (1L, 20736L)}
('lr', 0.0005, 'lr_epoch_diff', [45.0, 52.0], 'lr_iters', [175590, 202904])
[11:55:59] src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
[11:55:59] src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
[11:55:59] src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
[11:55:59] src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
[11:55:59] src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
[11:55:59] src/operator/convolution.cu:87: This convolution is not supported by cudnn, MXNET convolution is applied.
Error in CustomOp.forward: Traceback (most recent call last):
File "/home/zxr/anaconda3/envs/hwn_frcnn/lib/python2.7/site-packages/mxnet/operator.py", line 758, in forward_entry
aux=tensors[4])
File "experiments/faster_rcnn/../../faster_rcnn/operator_py/proposal_target_quadrangle.py", line 87, in forward
sample_rois_quadrangle(all_rois, fg_rois_per_image, rois_per_image, self._num_classes, self._cfg, gt_boxes=gt_boxes)
File "experiments/faster_rcnn/../../faster_rcnn/core/rcnn.py", line 265, in sample_rois_quadrangle
expand_bbox_regression_targets_quadrangle(bbox_target_data, num_classes, cfg)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/bbox/bbox_regression.py", line 157, in expand_bbox_regression_targets_quadrangle
bbox_targets[index, start:end] = bbox_targets_data[index, 1:]
ValueError: could not broadcast input array from shape (8) into shape (0)

[11:55:59] /home/ubuntu/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:304: [11:55:59] src/operator/custom/custom.cc:77: Check failed: reinterpret_cast(op_info_->callbacks[kCustomOpForward])( ptrs.size(), ptrs.data(), tags.data(), reqs.data(), static_cast(ctx.is_train), op_info_->contexts[kCustomOpForward])

Stack trace returned 6 entries:
[bt] (0) /home/zxr/anaconda3/envs/hwn_frcnn/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x184dfc) [0x7fd31db3fdfc]
[bt] (1) /home/zxr/anaconda3/envs/hwn_frcnn/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x248ec8) [0x7fd31dc03ec8]
[bt] (2) /home/zxr/anaconda3/envs/hwn_frcnn/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x2424ce) [0x7fd31dbfd4ce]
[bt] (3) /home/zxr/anaconda3/envs/hwn_frcnn/bin/../lib/libstdc++.so.6(+0xb7260) [0x7fd346407260]
[bt] (4) /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fd34d0576ba]
[bt] (5) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fd34c67d41d]

terminate called after throwing an instance of 'dmlc::Error'
what(): [11:55:59] src/operator/custom/custom.cc:77: Check failed: reinterpret_cast(op_info_->callbacks[kCustomOpForward])( ptrs.size(), ptrs.data(), tags.data(), reqs.data(), static_cast(ctx.is_train), op_info_->contexts[kCustomOpForward])

Aborted (core dumped)

I have checked my dataset, which was generated by DOTA-devkit, but can't find the reason. Can you give me some help or tell me which configuration settings should be modified?
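One possible cause, offered as a guess rather than a diagnosis: the failing line copies 8 regression targets into `bbox_targets[index, start:end]`, and the destination is empty. That would happen if a class index is at least as large as NUM_CLASSES (2 in this config). The sketch below only illustrates the slice arithmetic. It assumes the target array is `8 * num_classes` wide and that `start = 8 * class_index`, which mirrors the usual Faster R-CNN layout but is not confirmed against this repository. `target_slice_width` is a hypothetical helper.

```python
def target_slice_width(class_index, num_classes, coords_per_box=8):
    """Return how many columns the slice start:end actually selects.

    Assumed layout: one block of `coords_per_box` columns per class.
    Slicing past the end of a sequence does not raise; it is clamped.
    """
    width = coords_per_box * num_classes
    start = coords_per_box * class_index
    end = start + coords_per_box
    return len(range(width)[start:end])


if __name__ == "__main__":
    for cls in (0, 1, 2):
        print(cls, target_slice_width(cls, num_classes=2))
```

Under these assumptions a class index of 2 with NUM_CLASSES 2 selects zero columns, which would match "could not broadcast input array from shape (8) into shape (0)". Other configs on this page use NUM_CLASSES 16, which may mean the count includes background. If so, training on two object classes would need NUM_CLASSES 3, and the loader's class list would have to match.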

used a basic quadrilateral for annotation...? getting improper results...

Sir, I used a general quadrilateral (not a special one such as a rectangle) to annotate the aerial images, in the same format the DOTA images were annotated in, and trained the model with transfer learning.
The confidence scores of the predictions are very low (on the order of 0.001), and the predicted bounding boxes do not cover the entire object.
There is one major problem: my images are aerial views of the seashore, harbor, sea, and land near the shore. When the model predicts, it shows harbor everywhere in the sea.
My image size is around 20000*20000 pixels and the objects are roughly 30*30 pixels.
So, what can be done to improve my results?
Any help will be much appreciated.

TypeError: zip argument #2 must support iteration

/Projection/DOTA_python/Faster_RCNN_for_DOTA-master/faster_rcnn/../lib/utils/symbol.py:38: UserWarning: Cannot decide shape for the following arguments (0s in shape means unknown dimensions). Consider providing them as input:
gt_boxes: ()
arg_shape, out_shape, aux_shape = self.sym.infer_shape(**data_shape_dict)
Traceback (most recent call last):
File "/Projection/DOTA_python/Faster_RCNN_for_DOTA-master/experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py", line 25, in <module>
train_quadrangle_end2end.main()
File "/Projection/DOTA_python/Faster_RCNN_for_DOTA-master/faster_rcnn/train_quadrangle_end2end.py", line 166, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
File "/Projection/DOTA_python/Faster_RCNN_for_DOTA-master/faster_rcnn/train_quadrangle_end2end.py", line 93, in train_net
sym_instance.infer_shape(data_shape_dict)
File "/Projection/DOTA_python/Faster_RCNN_for_DOTA-master/faster_rcnn/../lib/utils/symbol.py", line 41, in infer_shape
self.arg_shape_dict = dict(zip(self.sym.list_arguments(), arg_shape))
TypeError: zip argument #2 must support iteration

problem

I get errors on import train_quadrangle_end2end and from symbols import *; neither of these two modules can be found.

Cannot find the bbox_overlaps_cython

When I run the code, it says: cannot import name bbox_overlaps_cython. Where is bbox_overlaps_cython?

Improve build methodology

Would it be possible to make this follow a more standard build/setup process? This would make it easier to install and could facilitate things such as making a conda recipe.

What's the meaning of the test.txt and train.txt?

/home/xjd/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
('Called with argument:', Namespace(cfg='experiments/faster_rcnn/cfgs/DOTA_quadrangle.yaml', frequent=100))
{'CLASS_AGNOSTIC': False,
'MXNET_VERSION': 'mxnet',
'RESIZE_TO_FIX_SIZE': True,
'SCALES': [(1024, 1024)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': False,
'DO_MULTISCALE_TEST': False,
'HAS_RPN': True,
'MULTISCALE': [1.0, 1.2, 1.4, 1.6],
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 0,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 0,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'max_per_image': 300,
'save_img_path': '/home/dj/data/vis',
'test_epoch': 59},
'TRAIN': {'ALTERNATE': {'RCNN_BATCH_IMAGES': 0,
'RPN_BATCH_IMAGES': 0,
'rfcn1_epoch': 0,
'rfcn1_lr': 0,
'rfcn1_lr_step': '',
'rfcn2_epoch': 0,
'rfcn2_lr': 0,
'rfcn2_lr_step': '',
'rpn1_epoch': 0,
'rpn1_lr': 0,
'rpn1_lr_step': '',
'rpn2_epoch': 0,
'rpn2_lr': 0,
'rpn2_lr_step': '',
'rpn3_epoch': 0,
'rpn3_lr': 0,
'rpn3_lr_step': ''},
'ASPECT_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': 128,
'BATCH_ROIS_OHEM': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': False,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'BBOX_WEIGHTS': array([1., 1., 1., 1., 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.1,
'CXX_PROPOSAL': False,
'ENABLE_OHEM': True,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FLIP': True,
'RESUME': False,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 0,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'SHUFFLE': True,
'begin_epoch': 0,
'end_epoch': 60,
'lr': 0.0005,
'lr_factor': 0.1,
'lr_step': '45,52',
'model_prefix': 'rcnn_DOTA_quadrangle',
'momentum': 0.9,
'warmup': True,
'warmup_lr': 5e-05,
'warmup_step': 1000,
'wd': 0.0005},
'dataset': {'NUM_CLASSES': 16,
'dataset': 'DOTA_oriented',
'dataset_path': './path-to-dota-split',
'image_set': 'train',
'proposal': 'rpn',
'root_path': './result',
'test_image_set': 'test'},
'default': {'frequent': 100, 'kvstore': 'device'},
'gpus': '0',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'FIXED_PARAMS': ['conv1',
'bn_conv1',
'res2',
'bn2',
'gamma',
'beta'],
'FIXED_PARAMS_SHARED': ['conv1',
'bn_conv1',
'res2',
'bn2',
'res3',
'bn3',
'res4',
'bn4',
'gamma',
'beta'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 9,
'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'pretrained': './model/pretrained_model/resnet_v1_101',
'pretrained_epoch': 0},
'output_path': './output/rcnn/DOTA_quadrangle',
'symbol': 'resnet_v1_101_rcnn_quadrangle'}
num_images 0
DOTA_oriented_train gt roidb loaded from ./result/cache/DOTA_oriented_train_gt_roidb.pkl
append flipped images to roidb
filtered 0 roidb entries: 0 -> 0
Traceback (most recent call last):
File "experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py", line 23, in <module>
train_quadrangle_end2end.main()
File "experiments/faster_rcnn/../../faster_rcnn/train_quadrangle_end2end.py", line 165, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
File "experiments/faster_rcnn/../../faster_rcnn/train_quadrangle_end2end.py", line 81, in train_net
anchor_ratios=config.network.ANCHOR_RATIOS, aspect_grouping=config.TRAIN.ASPECT_GROUPING)
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 659, in __init__
self.get_batch_individual()
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 807, in get_batch_individual
iroidb = [roidb[i] for i in range(islice.start, islice.stop)]
IndexError: list index out of range

I have already downloaded DOTA v1.0 and tried to run the code following the usage you described, but it does not run successfully. I think the failure is caused by dataset processing. I can't understand your description of test.txt and train.txt. Are the two txt files used to store the names of the train or test pictures, such as P000001?
I'm looking forward to hearing from you.
Thank you.
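The log above reports "num_images 0", which suggests the image-set file was empty or not found. Below is a minimal sketch for generating such a file. It assumes, as the question guesses, that train.txt holds one image name per line without the extension, and that the images sit in a single folder. The folder layout and the helper names `image_set_names` and `write_image_set` are assumptions for illustration, not the repository's documented format.

```python
import os


def image_set_names(filenames, extension=".png"):
    """Return sorted file stems for the files that carry `extension`."""
    return sorted(
        os.path.splitext(name)[0]
        for name in filenames
        if name.lower().endswith(extension)
    )


def write_image_set(images_dir, out_path, extension=".png"):
    """Write one image name per line (no extension) to `out_path`."""
    names = image_set_names(os.listdir(images_dir), extension)
    with open(out_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names
```

For example, `write_image_set("path-to-dota-split/images", "path-to-dota-split/train.txt")`, where both paths are placeholders. The log also shows the roidb being loaded from ./result/cache, so a cache file written while the list was empty probably needs to be deleted before retrying.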

I can't run the code successfully following the readme.md

Hi, I hit the following issue when executing "python experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py --cfg experiments/faster_rcnn/cfgs/DOTA_quadrangle.yaml"

Traceback (most recent call last):
File "experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py", line 23, in <module>
train_quadrangle_end2end.main()
File "experiments/faster_rcnn/../../faster_rcnn/train_quadrangle_end2end.py", line 165, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
File "experiments/faster_rcnn/../../faster_rcnn/train_quadrangle_end2end.py", line 74, in train_net
for image_set in image_sets]
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/load_data.py", line 18, in load_quadrangle_gt_roidb
roidb = imdb.gt_roidb()
File "experiments/faster_rcnn/../../faster_rcnn/../lib/dataset/DOTA.py", line 291, in gt_roidb
gt_roidb = [self.load_annotation(index) for index in self.image_set_index]
File "experiments/faster_rcnn/../../faster_rcnn/../lib/dataset/DOTA.py", line 350, in load_annotation
cls = class_to_index[obj[8].lower().strip()]
KeyError: 'turntable'

I have not found a solution. Could someone help me with it?

A question about the resnet_v1_101_rcnn_quadrangle

In resnet_v1_101_rcnn_quadrangle.py, you define bbox_weight with shape=(-1, 8 * num_classes) at line 996, but at line 1023 the output dimension of the fully connected layer is num_reg_classes * 4. Why is there a mismatch between these two dimensions?

An error while loading dataset

I got an error while training, here it is:
Traceback (most recent call last):
File "experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py", line 23, in <module>
train_quadrangle_end2end.main()
File "experiments/faster_rcnn/../../faster_rcnn/train_quadrangle_end2end.py", line 165, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch, config.TRAIN.lr, config.TRAIN.lr_step)
File "experiments/faster_rcnn/../../faster_rcnn/train_quadrangle_end2end.py", line 74, in train_net
for image_set in image_sets]
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/load_data.py", line 18, in load_quadrangle_gt_roidb
roidb = imdb.gt_roidb()
File "experiments/faster_rcnn/../../faster_rcnn/../lib/dataset/DOTA.py", line 291, in gt_roidb
gt_roidb = [self.load_annotation(index) for index in self.image_set_index]
File "experiments/faster_rcnn/../../faster_rcnn/../lib/dataset/DOTA.py", line 350, in load_annotation
cls = class_to_index[obj[8].lower().strip()]
KeyError: 'container-crane'
I guess it is a data problem, but I do not know what's going on. Any help would be appreciated.
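The KeyError comes from the lookup `class_to_index[obj[8].lower().strip()]`, so the label files contain a category name that the loader's class list lacks. A minimal sketch for listing such names before training follows. It assumes each annotation line is whitespace-separated with the category in the ninth field, as the traceback suggests, and that lines with fewer fields are headers to skip. `unknown_categories` and `scan_label_dir` are hypothetical helpers, and the list of known classes must come from your own copy of DOTA.py.

```python
import os
from collections import Counter


def unknown_categories(label_lines, known_classes):
    """Count category names (field 9 of each line) not in `known_classes`."""
    known = set(name.lower() for name in known_classes)
    counts = Counter()
    for line in label_lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # header or malformed line: not an annotation row
        category = fields[8].lower().strip()
        if category not in known:
            counts[category] += 1
    return dict(counts)


def scan_label_dir(label_dir, known_classes):
    """Aggregate unknown categories over every .txt file in `label_dir`."""
    total = Counter()
    for name in sorted(os.listdir(label_dir)):
        if name.endswith(".txt"):
            with open(os.path.join(label_dir, name)) as handle:
                total.update(unknown_categories(handle, known_classes))
    return dict(total)
```

Any name this reports either has to be added to the class list (with NUM_CLASSES adjusted to match) or its annotation lines have to be filtered out. Which of the two is right depends on the dataset version in use.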

How to evaluate the result?

I ran the test code and got the prediction results, but there is no evaluation function. I want to get the AP and mAP of the results. How can I do that?
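Another issue on this page mentions the DOTA devkit evaluation scripts, which are probably the intended route. For orientation only, here is a sketch of how per-class average precision is commonly computed once each detection has been marked true or false positive. It uses the area under the monotone precision envelope, in the style of later PASCAL VOC evaluations. Whether the DOTA scripts use exactly this variant is an assumption, and the matching of detections to ground truth is left out.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class.

    scores: confidence of each detection.
    is_true_positive: parallel list of booleans from IoU matching (done elsewhere).
    num_ground_truth: number of ground-truth objects of this class.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / float(num_ground_truth))
        precisions.append(tp / float(tp + fp))

    # Add sentinels, then make precision non-increasing from right to left.
    mrec = [0.0] + recalls + [1.0]
    mpre = [0.0] + precisions + [0.0]
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])

    # Sum the rectangle areas wherever recall changes.
    return sum((mrec[i] - mrec[i - 1]) * mpre[i] for i in range(1, len(mrec)))


def mean_average_precision(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / float(len(per_class_ap))
```

The true/false positive flags depend on the IoU threshold and on how duplicate detections are handled, so numbers from a sketch like this are only comparable to published results if those rules match.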

ImportError: cannot import name bbox_overlaps_cpython

When I run the code, I hit the problem above. How can I solve it?

Train a model: RCNN accuracy decreases

Hi,
Could you please provide some insight?

In training I observe the following:
RPN accuracy improves but RCNN accuracy decreases.
Also note what happens to the L1 and log losses of the RCNN part: the first increases and the second decreases.

Do you have any idea which parameter may need to be tuned?

Thanks,

Epoch[0] Batch [100] Speed: 0.95 samples/sec Train-RPNAcc=0.808787, RPNLogLoss=0.439269, RPNL1Loss=0.095613, RCNNAcc=0.874149, RCNNLogLoss=0.446150, RCNNL1Loss=0.086455,
Epoch[0] Batch [2000] Speed: 0.93 samples/sec Train-RPNAcc=0.915812, RPNLogLoss=0.214620, RPNL1Loss=0.061609, RCNNAcc=0.875118, RCNNLogLoss=0.347196, RCNNL1Loss=0.104136,
Epoch[0] Batch [4000] Speed: 0.96 samples/sec Train-RPNAcc=0.933159, RPNLogLoss=0.173305, RPNL1Loss=0.057984, RCNNAcc=0.871077, RCNNLogLoss=0.339611, RCNNL1Loss=0.107748,
Epoch[0] Batch [5000] Speed: 0.98 samples/sec Train-RPNAcc=0.938542, RPNLogLoss=0.160328, RPNL1Loss=0.054555, RCNNAcc=0.869417, RCNNLogLoss=0.333381, RCNNL1Loss=0.109924,
Epoch[0] Batch [5500] Speed: 0.98 samples/sec Train-RPNAcc=0.940943, RPNLogLoss=0.154480, RPNL1Loss=0.053179, RCNNAcc=0.868047, RCNNLogLoss=0.332574, RCNNL1Loss=0.111403,

Ask for more details in training FRCNN OBB

Hi, I tried to reproduce the work using the config file and pre-trained model you provided, and trained for 60 epochs, using a 0.1 threshold for NMS during testing.
But the final mAP is just 40.7, far lower than the baseline. Did I miss something or do something wrong?

RESUME:false in yaml file

I notice that after training stopped, I can't resume training from my half-trained model.
Is it because the RESUME parameter is set to false in the DOTA_quadrangle.yaml file?
If I set it to true, will it automatically resume training from the params and states files I got in the last training?

mAP becomes low after merging files.

I checked the result merge and DOTA evaluation task 2 Python files, and both seem legitimate. Is this normal? I do not think so; shouldn't the mAP increase after merging? On cropped images I get an mAP of approximately 40, but after merging I get approximately 15.

In result merge, I set the source to the 15 result files and the destination to the path where the results should be saved after merge and NMS.

For dota evaluation task 2,
detpath = r'PATH_TO_BE_CONFIGURED/Task1_{:s}.txt' --> destination file, where it should be saved after merge and nms
annopath = r'PATH_TO_BE_CONFIGURED/{:s}.txt' --> val/labelTxt
imagesetfile = r'PATH_TO_BE_CONFIGURED/valset.txt' --> text file path for original large img size validation set

Please see my settings and comment. Thanks
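One thing worth checking in a setup like this is whether merged detections are expressed in the coordinates and image names of the original large images, since valset.txt lists original images rather than crops. The sketch below shows that mapping. It assumes crop names follow a `<image>__<scale>__<x>___<y>` pattern and that crops were cut from the rescaled image. Both are assumptions to verify against your own split files. `parse_patch_name` and `to_original` are hypothetical helpers, not devkit functions.

```python
import re

# Assumed crop naming: <image>__<scale>__<x offset>___<y offset>
PATCH_RE = re.compile(
    r"^(?P<image>.+)__(?P<scale>[\d.]+)__(?P<x>\d+)___(?P<y>\d+)$"
)


def parse_patch_name(name):
    """Split a crop name into (image name, scale, x offset, y offset)."""
    match = PATCH_RE.match(name)
    if match is None:
        raise ValueError("unexpected patch name: %r" % name)
    return (
        match.group("image"),
        float(match.group("scale")),
        int(match.group("x")),
        int(match.group("y")),
    )


def to_original(coords, x_off, y_off, scale=1.0):
    """Map x1, y1, ..., x4, y4 from crop coordinates to the original image."""
    mapped = []
    for i, value in enumerate(coords):
        offset = x_off if i % 2 == 0 else y_off
        # Undo the crop offset first, then the rescaling applied before cropping.
        mapped.append((value + offset) / scale)
    return mapped
```

If the detection file still carries crop names, or the evaluation image list contains crops while the annotations describe full images, most detections would be counted as false positives. That would be consistent with a sharp mAP drop, though it is only one candidate explanation.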

@jessemelpolio @dingjiansw101

Check failed: exec_ctx.dev_id < device_count_ (1 vs. 1)

While running the test file I got this error. I know it's a GPU error, but I don't know how to resolve it. Can you help me out?
mxnet.base.MXNetError: [13:48:30] src/engine/threaded_engine.cc:320: Check failed: exec_ctx.dev_id < device_count_ (1 vs. 1) Invalid GPU Id: 1, Valid device id should be less than device_count: 1
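The message says GPU id 1 was requested while only one device is visible, so the only valid id is 0. The config dumps elsewhere on this page carry a 'gpus': '0' entry. Where the test run picked up id 1 is not visible here, so the place to change it is a guess. The sketch below shows the kind of check that catches this early. `check_gpu_ids` is a hypothetical helper, and the device count must be supplied by the caller.

```python
def check_gpu_ids(gpus, device_count):
    """Parse a comma-separated id string such as '0,1' and validate each id."""
    ids = [int(part) for part in gpus.split(",") if part.strip()]
    invalid = [i for i in ids if i < 0 or i >= device_count]
    if invalid:
        raise ValueError(
            "GPU ids %s are invalid with %d visible device(s); "
            "valid ids are 0..%d" % (invalid, device_count, device_count - 1)
        )
    return ids
```

With one visible GPU, `check_gpu_ids("0", 1)` passes and `check_gpu_ids("1", 1)` raises, mirroring the "Valid device id should be less than device_count: 1" message.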

Is it possible to set BATCH_IMAGES > 1?

It seems that you can only accelerate training/testing by using multiple GPUs, which doesn't make sense now, since mainstream GPUs have more than 10 GB of memory, while training with BATCH_IMAGES 1 uses only about 4 GB per card.

Why nms is commented in faster_rcnn/core/tester.py

I am running the faster_rcnn pretrained model on the test set. A large number of the detections I get have very low confidence scores. Then I noticed that the authors have commented out NMS (link) in the code. Why is that? Am I missing something?

What are the differences between rcnn_dota_e2e.py and rcnn_dota_quadrangle_e2e.py?

Which file trains Faster R-CNN on oriented bounding boxes, and which trains Faster R-CNN on horizontal bounding boxes? Thank you.

Questions on the test mAP

Hello,
Thanks for your baseline code.
I recently tried to evaluate the mAP metric on the validation set using your code with your trained model (rcnn_DOTA_quadrangle-0059.params). However, the output mAP is only ~0.1498, which is much lower than the numbers claimed on the DOTA website. Here are my output details. From the 'check fp' output, it seems there are lots of false positives...
Anyway, this mAP seems abnormal. But when I visualize the prediction results, they do not look so bad. Could you please help?

Readme error

I am afraid there is a typo in your readme. It's said train and test on quadrangle in an end-to-end manner by running python experiments/faster_rcnn/rcnn_dota_e2e.py --cfg experiments/faster_rcnn/cfgs/DOTA.yaml, which I think should be python experiments/faster_rcnn/rcnn_dota_quadrangle_e2e.py --cfg experiments/faster_rcnn/cfgs/DOTA_quadrangle.yaml
:)

Has anyone been able to successfully run or train this?

I'm getting an index out of range error when it reaches the load_annotation function in DOTA.py.

What's the type of your dataset?

Is it Pascal VOC? If not, what is the type of your dataset?

libpng error: IDAT: CRC error Exception in thread Thread-1: TypeError: 'NoneType' object has no attribute '__getitem__' AttributeError: 'NoneType' object has no attribute 'shape'

Firstly, my terminal outputs are like this:
libpng error: IDAT: CRC error
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/wh/anaconda2/envs/dy/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/home/wh/anaconda2/envs/dy/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/PrefetchingIter.py", line 60, in prefetch_func
self.next_batch[i] = self.iters[i].next()
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 701, in next
self.get_batch_individual()
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 808, in get_batch_individual
rst.append(self.parfetch(iroidb))
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 816, in parfetch
data, label = get_rpn_batch_quadrangle(iroidb, self.cfg)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/rpn/rpn.py", line 91, in get_rpn_batch_quadrangle
imgs, roidb = get_image_quadrangle_bboxes(roidb, cfg)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/image.py", line 61, in get_image_quadrangle_bboxes
im = im[:, ::-1, :]
TypeError: 'NoneType' object has no attribute '__getitem__'

and sometimes like this:

Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/wh/anaconda2/envs/dy/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/home/wh/anaconda2/envs/dy/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/PrefetchingIter.py", line 61, in prefetch_func
self.next_batch[i] = self.iters[i].next()
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 701, in next
self.get_batch_individual()
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 808, in get_batch_individual
rst.append(self.parfetch(iroidb))
File "experiments/faster_rcnn/../../faster_rcnn/core/loader.py", line 816, in parfetch
data, label = get_rpn_batch_quadrangle(iroidb, self.cfg)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/rpn/rpn.py", line 91, in get_rpn_batch_quadrangle
imgs, roidb = get_image_quadrangle_bboxes(roidb, cfg)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/image.py", line 66, in get_image_quadrangle_bboxes
im, im_scale = resize(im, target_size, max_size, stride=config.network.IMAGE_STRIDE)
File "experiments/faster_rcnn/../../faster_rcnn/../lib/utils/image.py", line 219, in resize
im_shape = im.shape
AttributeError: 'NoneType' object has no attribute 'shape'
I'm sure that my images are in .png format. My log is as follows:
2018-10-12 20:21:08,725 training config:{'CLASS_AGNOSTIC': False,
'MXNET_VERSION': 'mxnet',
'RESIZE_TO_FIX_SIZE': True,
'SCALES': [(1024, 1024)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': False,
'DO_MULTISCALE_TEST': False,
'HAS_RPN': True,
'MULTISCALE': [1.0, 1.2, 1.4, 1.6],
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 0,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 0,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'max_per_image': 300,
'save_img_path': '/home/wh/Faster_RCNN_for_DOTA/data/vis',
'test_epoch': 59},
'TRAIN': {'ALTERNATE': {'RCNN_BATCH_IMAGES': 0,
'RPN_BATCH_IMAGES': 0,
'rfcn1_epoch': 0,
'rfcn1_lr': 0,
'rfcn1_lr_step': '',
'rfcn2_epoch': 0,
'rfcn2_lr': 0,
'rfcn2_lr_step': '',
'rpn1_epoch': 0,
'rpn1_lr': 0,
'rpn1_lr_step': '',
'rpn2_epoch': 0,
'rpn2_lr': 0,
'rpn2_lr_step': '',
'rpn3_epoch': 0,
'rpn3_lr': 0,
'rpn3_lr_step': ''},
'ASPECT_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': 128,
'BATCH_ROIS_OHEM': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': False,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'BBOX_WEIGHTS': array([1., 1., 1., 1., 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.1,
'CXX_PROPOSAL': False,
'ENABLE_OHEM': True,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FLIP': True,
'RESUME': False,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 0,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'SHUFFLE': True,
'begin_epoch': 0,
'end_epoch': 60,
'lr': 0.0005,
'lr_factor': 0.1,
'lr_step': '45,52',
'model_prefix': 'rcnn_DOTA_quadrangle',
'momentum': 0.9,
'warmup': True,
'warmup_lr': 5e-05,
'warmup_step': 1000,
'wd': 0.0005},
'dataset': {'NUM_CLASSES': 16,
'dataset': 'DOTA_oriented',
'dataset_path': '/home/wh/Faster_RCNN_for_DOTA/data',
'image_set': 'train',
'proposal': 'rpn',
'root_path': '/home/wh/Faster_RCNN_for_DOTA/data',
'test_image_set': 'test'},
'default': {'frequent': 100, 'kvstore': 'device'},
'gpus': '0',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'FIXED_PARAMS': ['conv1',
'bn_conv1',
'res2',
'bn2',
'gamma',
'beta'],
'FIXED_PARAMS_SHARED': ['conv1',
'bn_conv1',
'res2',
'bn2',
'res3',
'bn3',
'res4',
'bn4',
'gamma',
'beta'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 9,
'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'pretrained': './model/pretrained_model/resnet_v1_101',
'pretrained_epoch': 0},
'output_path': './output/rcnn/DOTA_quadrangle',
'symbol': 'resnet_v1_101_rcnn_quadrangle'}

2018-10-12 20:21:11,505 bucketing: data "gt_boxes" has a shape (1L, 386L, 9L), which is larger than already allocated shape (1L, 100L, 9L). Need to re-allocate. Consider putting default_bucket_key to be the bucket taking the largest input for better memory sharing.
2018-10-12 20:21:19,930 bucketing: data "gt_boxes" has a shape (1L, 597L, 9L), which is larger than already allocated shape (1L, 386L, 9L). Need to re-allocate. Consider putting default_bucket_key to be the bucket taking the largest input for better memory sharing.
2018-10-12 20:21:50,888 training config:{'CLASS_AGNOSTIC': False,
'MXNET_VERSION': 'mxnet',
'RESIZE_TO_FIX_SIZE': True,
'SCALES': [(1024, 1024)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': False,
'DO_MULTISCALE_TEST': False,
'HAS_RPN': True,
'MULTISCALE': [1.0, 1.2, 1.4, 1.6],
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 0,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 0,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'max_per_image': 300,
'save_img_path': '/home/wh/Faster_RCNN_for_DOTA/data/vis',
'test_epoch': 59},
'TRAIN': {'ALTERNATE': {'RCNN_BATCH_IMAGES': 0,
'RPN_BATCH_IMAGES': 0,
'rfcn1_epoch': 0,
'rfcn1_lr': 0,
'rfcn1_lr_step': '',
'rfcn2_epoch': 0,
'rfcn2_lr': 0,
'rfcn2_lr_step': '',
'rpn1_epoch': 0,
'rpn1_lr': 0,
'rpn1_lr_step': '',
'rpn2_epoch': 0,
'rpn2_lr': 0,
'rpn2_lr_step': '',
'rpn3_epoch': 0,
'rpn3_lr': 0,
'rpn3_lr_step': ''},
'ASPECT_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': 128,
'BATCH_ROIS_OHEM': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': False,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
'BBOX_WEIGHTS': array([1., 1., 1., 1., 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.1,
'CXX_PROPOSAL': False,
'ENABLE_OHEM': True,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FLIP': True,
'RESUME': False,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 0,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'SHUFFLE': True,
'begin_epoch': 0,
'end_epoch': 60,
'lr': 0.0005,
'lr_factor': 0.1,
'lr_step': '45,52',
'model_prefix': 'rcnn_DOTA_quadrangle',
'momentum': 0.9,
'warmup': True,
'warmup_lr': 5e-05,
'warmup_step': 1000,
'wd': 0.0005},
'dataset': {'NUM_CLASSES': 16,
'dataset': 'DOTA_oriented',
'dataset_path': '/home/wh/Faster_RCNN_for_DOTA/data',
'image_set': 'train',
'proposal': 'rpn',
'root_path': '/home/wh/Faster_RCNN_for_DOTA/data',
'test_image_set': 'test'},
'default': {'frequent': 100, 'kvstore': 'device'},
'gpus': '0',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'FIXED_PARAMS': ['conv1',
'bn_conv1',
'res2',
'bn2',
'gamma',
'beta'],
'FIXED_PARAMS_SHARED': ['conv1',
'bn_conv1',
'res2',
'bn2',
'res3',
'bn3',
'res4',
'bn4',
'gamma',
'beta'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 9,
'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'pretrained': './model/pretrained_model/resnet_v1_101',
'pretrained_epoch': 0},
'output_path': './output/rcnn/DOTA_quadrangle',
'symbol': 'resnet_v1_101_rcnn_quadrangle'}

2018-10-12 20:21:55,379 bucketing: data "gt_boxes" has a shape (1L, 263L, 9L), which is larger than already allocated shape (1L, 100L, 9L). Need to re-allocate. Consider putting default_bucket_key to be the bucket taking the largest input for better memory sharing.
2018-10-12 20:21:56,718 bucketing: data "gt_boxes" has a shape (1L, 295L, 9L), which is larger than already allocated shape (1L, 263L, 9L). Need to re-allocate. Consider putting default_bucket_key to be the bucket taking the largest input for better memory sharing.
2018-10-12 20:22:08,772 bucketing: data "gt_boxes" has a shape (1L, 1028L, 9L), which is larger than already allocated shape (1L, 295L, 9L). Need to re-allocate. Consider putting default_bucket_key to be the bucket taking the largest input for better memory sharing.
2018-10-12 20:22:19,004 Epoch[0] Batch [100] Speed: 3.89 samples/sec Train-RPNAcc=0.838219, RPNLogLoss=0.442614, RPNL1Loss=0.823888, RCNNAcc=0.750155, RCNNLogLoss=2.288598, RCNNL1Loss=0.208822,

So, what's wrong?

about the epoch

Hi, I want to know how to resume training after it has been interrupted.
I tried setting begin_epoch to a value other than 0, but it doesn't seem to work.
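
One possible way to pick up an interrupted run is sketched below. It assumes that checkpoints are saved as `<model_prefix>-<epoch>.params` (the naming seen in `rcnn_DOTA_quadrangle-0059.params`) and that the training script honours the `RESUME` and `begin_epoch` keys that appear in the logged config. The helper names `latest_checkpoint_epoch` and `resume_overrides` are hypothetical and not part of the repository.

```python
import os
import re


def latest_checkpoint_epoch(output_dir, model_prefix):
    """Return the highest epoch among '<prefix>-NNNN.params' files, or None."""
    pattern = re.compile(r'^' + re.escape(model_prefix) + r'-(\d{4})\.params$')
    epochs = []
    for name in os.listdir(output_dir):
        match = pattern.match(name)
        if match:
            epochs.append(int(match.group(1)))
    return max(epochs) if epochs else None


def resume_overrides(output_dir, model_prefix):
    """Suggest TRAIN overrides for resuming; an empty dict means start fresh."""
    epoch = latest_checkpoint_epoch(output_dir, model_prefix)
    if epoch is None:
        return {}
    # Assumption: the trainer reloads '<prefix>-<epoch>.params' only when
    # RESUME is True and begin_epoch equals that epoch.
    return {'RESUME': True, 'begin_epoch': epoch}
```

If `RESUME` is left at `False`, as in the logged config, changing `begin_epoch` alone may not reload the saved weights.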

Compatibility issue

Just after cloning, while initializing, I got multiple warnings/errors. Below is a detailed explanation of what I did for each particular warning/error:

Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /home/rajeshjat Desktop/Dota 2/Faster_RCNN_for_DOTA/lib/dataset/pycocotools/_mask.pyx

To resolve this, I set language_level = 3 (it was previously language_level = None) in the

There were multiple errors similar to maskApi.c: In function ‘rlesFree’: maskApi.c:29:3: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation] 29 | for(siz i=0; i<n; i++) rleFree((*R)+i); free(*R); *R=0;

To resolve this, I rewrote the whole of maskApi.c so that the indentation is used properly.

In setup_linux.py, I replaced iteritems() with items().

Is it okay to make these changes, or should I clone the repository again?
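
Regarding the `iteritems()` change: for a plain loop, `dict.items()` gives the same key/value pairs on both Python 2 and Python 3, so that replacement should not change behaviour. A minimal sketch, in which `missing_paths` and its input dictionary are hypothetical stand-ins rather than code taken from `setup_linux.py`:

```python
import os


def missing_paths(path_config):
    """Return the keys whose configured path does not exist on disk."""
    missing = []
    # items() works on Python 2 and 3; iteritems() exists only on Python 2.
    for key, path in path_config.items():
        if not os.path.exists(path):
            missing.append(key)
    return sorted(missing)
```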

can't import bbox_overlap_cpython

Resume training

Hi,
I would like to resume training from a pre-trained model with rotations (not the ordinary resnet101 model).
I would like to use rcnn_DOTA_quadrangle-0059.params, but I need the corresponding .state file.

Could you provide the .state file or if there is another way could you point it out?

Thanks,

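Were the paper's results obtained using train+val data?

Hello, were the results in the DOTA paper obtained using train+val data?

How do I use a trained model to view detection results on images?

I trained a model with this framework myself and verified its accuracy, but I don't know how to view the actual detection results on images. Please let me know.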

What tool did you use to label data?

Dear jessemelpolio,
Thank you for your awesome source code and paper.

I wonder what tool you used to label the data. For example, if I have other images, how can I label them and add them to the training dataset?

Thank you again.

What algorithm are you using?

Is it R2CNN?
https://github.com/beacandler/R2CNN
https://github.com/yangxue0827/R2CNN_FPN_Tensorflow

test.py: error: the following arguments are required: --cfg

I was testing satellite images on your trained model. While running test.py in Spyder (Python 3.6), I ran into the following error and couldn't find a solution anywhere.

usage: test.py [-h] --cfg CFG
test.py: error: the following arguments are required: --cfg
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2
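
The message comes from `argparse`: `--cfg` is declared as a required option, so launching the script without any arguments ends in `SystemExit: 2`. A minimal sketch that reproduces this behaviour is shown below. The parser is a stand-in for the one in `test.py`, and `path/to/config.yaml` is a placeholder, not a real file in the repository.

```python
import argparse


def build_parser():
    """Stand-in parser with a single required --cfg option."""
    parser = argparse.ArgumentParser(prog='test.py')
    parser.add_argument('--cfg', required=True,
                        help='path to the experiment config file')
    return parser


def parse(argv):
    """Parse an explicit argument list instead of relying on sys.argv."""
    return build_parser().parse_args(argv)


if __name__ == '__main__':
    # Supplying the option explicitly avoids the "arguments are required" error.
    args = parse(['--cfg', 'path/to/config.yaml'])
    print(args.cfg)
```

From a terminal this corresponds to `python test.py --cfg <config file>`; in an IDE the same arguments have to be supplied through the run configuration's command-line options.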

Webcam

Is it possible to use a webcam to detect objects in real time?

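How do I run prediction with the provided pretrained model?

Hello, after downloading the weight files you saved on the network drive, how do I run prediction with them?
I would appreciate your guidance if it is convenient.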

How to run demo?

Can anyone provide guidance about how to run the demo? I can't find any file in the repo that indicates how to do this.

Very very strange phenomenon, divergence problem when training RPN RCNN jointly?

Hello, I have spent weeks struggling with a divergence problem in your code. Simply put, the loss always diverges after some iterations (usually near the end of the first epoch). Please see the following snippet of my training log (training on the whole DOTA dataset):

training on this roidb: /data/dota/DOTA_split/Images/P1054__1__0___19.png
Epoch[0] Batch [18150]  Speed: 4.22 samples/sec Train-RPNAcc=0.968750,  RPNLogLoss=0.061721,    RPNL1Loss=0.159727,    RCNNAcc=0.781250, RCNNLogLoss=0.386619,     RCNNL1Loss=0.023027,

training on this roidb: /data/dota/DOTA_split/Images/P0144__1__0___0.png
Epoch[0] Batch [18151]  Speed: 5.03 samples/sec Train-RPNAcc=0.972656,  RPNLogLoss=0.072231,    RPNL1Loss=0.023806,    RCNNAcc=0.867188, RCNNLogLoss=0.316344,     RCNNL1Loss=0.020055,

training on this roidb: /data/dota/DOTA_split/Images/P0605__1__0___61.png
Epoch[0] Batch [18152]  Speed: 4.65 samples/sec Train-RPNAcc=0.785156,  RPNLogLoss=6.829865,    RPNL1Loss=3.960626,    RCNNAcc=0.960938, RCNNLogLoss=1.082687,     RCNNL1Loss=0.018287,

training on this roidb: /data/dota/DOTA_split/Images/P1994__1__0___0.png
Epoch[0] Batch [18153]  Speed: 4.55 samples/sec Train-RPNAcc=0.738281,  RPNLogLoss=4.315770,    RPNL1Loss=16.474699,   RCNNAcc=0.968750, RCNNLogLoss=1.007381,     RCNNL1Loss=0.105764,

training on this roidb: /data/dota/DOTA_split/Images/P0562__1__0___0.png
Epoch[0] Batch [18154]  Speed: 4.46 samples/sec Train-RPNAcc=0.500000,  RPNLogLoss=15.975729,   RPNL1Loss=4.111494,    RCNNAcc=0.750000, RCNNLogLoss=1.387988,     RCNNL1Loss=0.069026,

training on this roidb: /data/dota/DOTA_split/Images/P0555__1__0___0.png
Epoch[0] Batch [18155]  Speed: 4.75 samples/sec Train-RPNAcc=0.394531,  RPNLogLoss=19.518009,   RPNL1Loss=29.258448,   RCNNAcc=0.945312, RCNNLogLoss=2.378677,     RCNNL1Loss=1.344344,

training on this roidb: /data/dota/DOTA_split/Images/P2257__1__0___57.png
Epoch[0] Batch [18156]  Speed: 4.66 samples/sec Train-RPNAcc=0.890625,  RPNLogLoss=3.525833,    RPNL1Loss=45.849377,   RCNNAcc=0.968750, RCNNLogLoss=0.585869,     RCNNL1Loss=0.166008,

training on this roidb: /data/dota/DOTA_split/Images/P0477__1__0___191.png
Epoch[0] Batch [18157]  Speed: 4.58 samples/sec Train-RPNAcc=0.414062,  RPNLogLoss=18.888393,   RPNL1Loss=173.241898,  RCNNAcc=0.898438, RCNNLogLoss=3.158766,     RCNNL1Loss=0.109512,

You can see that when training on P0605__1__0___61.png the loss suddenly increases; after some more iterations, the loss becomes nan. I inspected the annotations of P0605__1__0___61 but found nothing strange. I tried training only on these problematic-looking images by adding the block below to DOTA.py:

        # for debug only: restrict training to the suspicious images
        print('debug the input images')
        self.image_set_index = ['P1451__1__1175___2772', 'P1994__1__0___0', 'P0562__1__0___0',
                                'P0555__1__0___0', 'P2257__1__0___57', 'P0477__1__0___191',
                                'P0522__1__0___837', 'P0605__1__0___61']
        # end of debug-only block

However, the loss is normal.

I also tried training the RPN and the RCNN separately, and there was no divergence problem.

This has troubled me for many days. Could you please help me with it?
Many thanks.
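
To locate where the losses first blow up, the log can be parsed mechanically. The rough sketch below assumes the `Epoch[..] Batch [..] ... name=value,` layout shown in the log excerpt above. The function names and the default spike factor of 10 are arbitrary choices made for illustration.

```python
import re

HEADER_RE = re.compile(r'Epoch\[(\d+)\]\s+Batch\s+\[(\d+)\]')
METRIC_RE = re.compile(
    r'([A-Za-z0-9]+)=([-+]?(?:\d+\.?\d*(?:[eE][-+]?\d+)?|nan|inf))')


def parse_log_line(line):
    """Return (epoch, batch, metrics dict) for a metrics line, else None."""
    header = HEADER_RE.search(line)
    if header is None:
        return None
    metrics = {}
    for name, value in METRIC_RE.findall(line):
        metrics[name] = float(value)
    return int(header.group(1)), int(header.group(2)), metrics


def first_spike(lines, metric, factor=10.0):
    """Return (epoch, batch) of the first line where `metric` is nan or
    exceeds `factor` times its previous value; None if that never happens."""
    previous = None
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None or metric not in parsed[2]:
            continue
        epoch, batch, metrics = parsed
        value = metrics[metric]
        if value != value:  # nan is the only float not equal to itself
            return epoch, batch
        if previous is not None and previous > 0 and value > factor * previous:
            return epoch, batch
        previous = value
    return None
```

Applied to the excerpt above with `metric='RPNLogLoss'`, this flags batch 18152, the step that processed P0605__1__0___61.png. Saving the image names that surround such a step makes it easier to replay the same sequence rather than the same images in isolation.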

Trying to reproduce pre-trained model of author + problem in training

Hello,

Using the provided code, I am trying to reproduce the Faster R-CNN model for DOTA so that I can then train it on my own data.
After training for 60 epochs, the mAP of my trained model on the validation set should be similar to that of the provided pre-trained model, but it is not (mine is much worse). I tried preprocessing the data with the get_best_begin_point function from the DOTA Devkit, as the authors mention in the disclaimer, but that did not help either. Does anyone know what the problem could be, or what I might be doing wrong?

During training, between epoch 23 and epoch 43, the RCNNLogLoss decreases from 0.18 to 0.17. Is this normal? (The authors decrease the learning rate as training progresses.)
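
For reference, the step schedule implied by the training config logged earlier on this page (`lr: 0.0005`, `lr_factor: 0.1`, `lr_step: '45,52'`) can be worked out as below. This is a sketch which assumes that the rate is multiplied by `lr_factor` once at each listed epoch and which ignores warmup; `lr_at_epoch` is a hypothetical helper, not a function from the repository.

```python
def lr_at_epoch(epoch, base_lr=0.0005, lr_factor=0.1, lr_step='45,52'):
    """Learning rate in effect during `epoch` under a plain step schedule."""
    steps = [int(s) for s in lr_step.split(',') if s.strip()]
    # Count how many decay points have already been reached.
    decays = sum(1 for step in steps if epoch >= step)
    return base_lr * (lr_factor ** decays)
```

Under these assumptions the rate stays at 0.0005 for epochs 0 to 44, so the slow decrease between epochs 23 and 43 would be happening at a constant learning rate, before the first decay step.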

Scripts for preparing the dataset for training

Hi,

Could you publish the scripts used to prepare the dataset for training the rcnn_dota_quadrangle model?
It seems that ImgSplit.py (https://github.com/CAPTAIN-WHU/DOTA_devkit/blob/master/ImgSplit.py) only handles one example image. Could you describe in detail how to prepare the whole dataset for training?
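
The general idea of cutting a large image into overlapping patches can be sketched as follows. This is not the DOTA_devkit implementation. The patch size of 1024 matches `SCALES` in the logged config, while the 200-pixel overlap and the function names are assumptions made for illustration.

```python
def patch_starts(length, patch_size=1024, overlap=200):
    """Start offsets of patches covering [0, length) along one axis."""
    if patch_size <= overlap:
        raise ValueError('patch_size must be larger than overlap')
    stride = patch_size - overlap
    starts = []
    pos = 0
    while True:
        if pos + patch_size >= length:
            # Shift the last patch back so that it ends at the image border.
            starts.append(max(length - patch_size, 0))
            break
        starts.append(pos)
        pos += stride
    return starts


def patch_boxes(width, height, patch_size=1024, overlap=200):
    """List of (left, top, right, bottom) patch rectangles for one image."""
    boxes = []
    for top in patch_starts(height, patch_size, overlap):
        for left in patch_starts(width, patch_size, overlap):
            boxes.append((left, top,
                          min(left + patch_size, width),
                          min(top + patch_size, height)))
    return boxes
```

Running `patch_boxes` over every image in a folder gives the crop rectangles for the whole dataset. The annotations would also need to be shifted and clipped to each patch, which this sketch does not do.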