fcos's People

Contributors

103yiran, apacha, ausk, belowmit, bernhardschaefer, botcs, chhshen, climbsrocks, coincheung, fmassa, godricly, henrywang1, isameer, jario-jin, jiayuan-gu, keineahnung2345, killthekitten, leviviana, newstzpz, renebidart, rodrigoberriel, soumith, stan-haochen, stanstarks, tianzhi0549, wat3rbro, xudangliatiger, yelantf, zhangliliang, zimenglan-sysu-512

fcos's Issues

When I train with a ResNet-101 backbone I get this error, although ResNet-50 trains successfully with the same COCO files. What should I do?

2019-05-04 00:17:28,682 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
File "tools/train_net.py", line 189, in
main()
File "tools/train_net.py", line 182, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 87, in train
arguments,
File "/home/abc/code/FCOS/maskrcnn_benchmark/engine/trainer.py", line 56, in do_train
for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
File "/home/abc/anaconda3/envs/FCOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in next
return self._process_next_batch(batch)
File "/home/abc/anaconda3/envs/FCOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
File "/home/abc/anaconda3/envs/FCOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/abc/anaconda3/envs/FCOS/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/abc/code/FCOS/maskrcnn_benchmark/data/datasets/coco.py", line 94, in getitem
img, target = self.transforms(img, target)
File "/home/abc/code/FCOS/maskrcnn_benchmark/data/transforms/transforms.py", line 15, in call
image, target = t(image, target)
File "/home/abc/code/FCOS/maskrcnn_benchmark/data/transforms/transforms.py", line 58, in call
size = self.get_size(image.size)
File "/home/abc/code/FCOS/maskrcnn_benchmark/data/transforms/transforms.py", line 42, in get_size
if max_original_size / min_original_size * size > max_size:
TypeError: unsupported operand type(s) for *: 'float' and 'range'
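
The `TypeError` says that `size` inside `get_size` is a `range` object rather than a number. One way this can happen (an assumption, not confirmed in this report) is that the configured short side is a `range` that ends up wrapped as a single element, so the value drawn from it is the whole `range`. The sketch below is a standalone stand-in for that computation; `normalize_min_size` and `pick_short_side` are hypothetical names, not functions from the repository.

```python
import random


def normalize_min_size(min_size):
    """Return a tuple of candidate short-side lengths (ints).

    Hypothetical helper: accepts an int, a list/tuple of ints, or a range.
    """
    if isinstance(min_size, int):
        return (min_size,)
    # range, list and tuple are all iterable, so expand them element by element
    return tuple(int(s) for s in min_size)


def pick_short_side(image_size, min_size, max_size):
    """Pick a target short side so the long side does not exceed max_size."""
    w, h = image_size
    size = random.choice(normalize_min_size(min_size))
    min_original = float(min(w, h))
    max_original = float(max(w, h))
    # Same comparison as in the traceback; it needs `size` to be a number.
    if max_original / min_original * size > max_size:
        size = int(round(max_size * min_original / max_original))
    return size


# Reproduces the reported failure mode: a float cannot be multiplied by a range.
try:
    1.5 * range(640, 801)
except TypeError as exc:
    print(exc)

print(pick_short_side((1280, 720), range(640, 801), max_size=1333))
```

If this is the cause, comparing the size-related `INPUT` settings of the ResNet-101 config with those of the ResNet-50 config would be a reasonable first check.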

what is the final loss?

I am training the model on my own dataset, and I have also modified parts of the code.

Could you tell me the final value of each loss for your model (cls_loss, reg_loss, and centerness_loss)? That would help me check my training procedure and the related code.

Thanks.

Some questions about loss.py in the fcos directory

Thanks for your work. I am reading your code and have some questions.
In loss.py in the FCOS directory, 'level' appears several times, for example in 'points_per_level', 'labels_level_first', and 'reg_targets_level_first'. What does 'level' mean? And what do 'labels_level_first' and 'reg_targets_level_first' mean?

training error

Hi @tianzhi0549, thanks for your project.
I am trying to run this project on my own dataset. I changed the corresponding settings in the config file and started training. It runs for several iterations and then this error appears:

File "tools/train_net.py", line 174, in
main()
File "tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 73, in train
arguments,
File "/home/detection/FCOS/maskrcnn_benchmark/engine/trainer.py", line 56, in do_train
for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
File "/home/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in next
return self._process_next_batch(batch)
File "/home/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 656, in _process_next_batch
self._put_indices()
File "/home/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 646, in _put_indices
indices = next(self.sample_iter, None)
File "/home/detection/FCOS/maskrcnn_benchmark/data/samplers/iteration_based_batch_sampler.py", line 24, in iter
for batch in self.batch_sampler:
File "/home/detection/FCOS/maskrcnn_benchmark/data/samplers/grouped_batch_sampler.py", line 107, in iter
batches = self._prepare_batches()
File "/home/detection/FCOS/maskrcnn_benchmark/data/samplers/grouped_batch_sampler.py", line 79, in _prepare_batches
first_element_of_batch = [t[0].item() for t in merged]
File "/home/detection/FCOS/maskrcnn_benchmark/data/samplers/grouped_batch_sampler.py", line 79, in
first_element_of_batch = [t[0].item() for t in merged]
IndexError: index 0 is out of bounds for dimension 0 with size 0

I have checked the format of my dataset, could you give me some suggestions about this error?
Thanks a lot

loss nan

I am trying to train on COCO, but the loss becomes NaN.
This is my training script:

CUDA_VISIBLE_DEVICES=1,3,4,5 python -m torch.distributed.launch \
    --nproc_per_node=4 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --skip-test \
    --config-file configs/fcos/fcos_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_R_50_FPN_1x

This is my result:

2019-04-16 09:19:17,383 maskrcnn_benchmark.trainer INFO: Start training
2019-04-16 09:19:33,719 maskrcnn_benchmark.trainer INFO: eta: 20:24:50  iter: 20  loss: 4.2079 (4.8923)  loss_centerness: 0.6670 (0.6685)  loss_cls: 0.9797 (0.9730)  loss_reg: 2.5527 (3.2508)  time: 0.6882 (0.8167)  data: 0.0219 (0.0651)  lr: 0.003333  max mem: 7051
2019-04-16 09:19:48,380 maskrcnn_benchmark.trainer INFO: eta: 19:21:49  iter: 40  loss: 3.2185 (4.0965)  loss_centerness: 0.6607 (0.6652)  loss_cls: 0.8450 (0.9074)  loss_reg: 1.6475 (2.5240)  time: 0.6947 (0.7749)  data: 0.0265 (0.0462)  lr: 0.003333  max mem: 7051
2019-04-16 09:20:02,270 maskrcnn_benchmark.trainer INFO: eta: 18:41:23  iter: 60  loss: 2.9554 (3.7219)  loss_centerness: 0.6592 (0.6634)  loss_cls: 0.7685 (0.8647)  loss_reg: 1.5265 (2.1938)  time: 0.6972 (0.7481)  data: 0.0283 (0.0399)  lr: 0.003333  max mem: 7051
2019-04-16 09:20:15,608 maskrcnn_benchmark.trainer INFO: eta: 18:10:43  iter: 80  loss: 2.8321 (nan)  loss_centerness: 0.6582 (nan)  loss_cls: 0.7013 (nan)  loss_reg: 1.4726 (nan)  time: 0.6690 (0.7278)  data: 0.0277 (0.0374)  lr: 0.003333  max mem: 7051
2019-04-16 09:20:28,939 maskrcnn_benchmark.trainer INFO: eta: 17:52:08  iter: 100  loss: nan (nan)  loss_centerness: nan (nan)  loss_cls: nan (nan)  loss_reg: nan (nan)  time: 0.6653 (0.7156)  data: 0.0262 (0.0353)  lr: 0.003333  max mem: 7051

I have tried three times and always get NaN.
What am I doing wrong?

Training takes no time

During training there was no error, but the total training time is 0 s. I am sure my environment is correct. Here is some related information:

MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "pre_train/mymodel.pth"
  RPN_ONLY: True
  FCOS_ON: True
  BACKBONE:
    CONV_BODY: "R-50-FPN-RETINANET"
  RESNETS:
    BACKBONE_OUT_CHANNELS: 256
  RETINANET:
    USE_C5: False # FCOS uses P5 instead of C5
DATASETS:
  TRAIN: ("coco_2019_train","coco_2019_val")
  TEST: ("coco_2019_test","coco_2019_val")
INPUT:
  MIN_SIZE_TRAIN: (400,)
  MAX_SIZE_TRAIN: 1200
  MIN_SIZE_TEST: 400
  MAX_SIZE_TEST: 1200
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  BASE_LR: 0.001
  WEIGHT_DECAY: 0.0001
  STEPS: (2400, 6000)
  MAX_ITER: 12000
  IMS_PER_BATCH: 1
  WARMUP_METHOD: "constant"

DATALOADER:
ASPECT_RATIO_GROUPING: True
NUM_WORKERS: 4
SIZE_DIVISIBILITY: 32
DATASETS:
TEST: ('coco_2019_test', 'coco_2019_val')
TRAIN: ('coco_2019_train', 'coco_2019_val')
INPUT:
MAX_SIZE_TEST: 1200
MAX_SIZE_TRAIN: 1200
MIN_SIZE_RANGE_TRAIN: (-1, -1)
MIN_SIZE_TEST: 400
MIN_SIZE_TRAIN: (400,)
PIXEL_MEAN: [102.9801, 115.9465, 122.7717]
PIXEL_STD: [1.0, 1.0, 1.0]
TO_BGR255: True

FCOS:
FPN_STRIDES: [8, 16, 32, 64, 128]
INFERENCE_TH: 0.05
LOSS_ALPHA: 0.25
LOSS_GAMMA: 2.0
NMS_TH: 0.6
NUM_CLASSES: 3
NUM_CONVS: 4
PRE_NMS_TOP_N: 1000
PRIOR_PROB: 0.01
FCOS_ON: True

WEIGHT: pre_train/mymodel.pth
OUTPUT_DIR: ./experiments/result
PATHS_CATALOG: /home/user/cocoapi/PythonAPI/maskrcnn_FCOS/FCOS/maskrcnn_benchmark/config/paths_catalog.py
SOLVER:
BASE_LR: 0.001
BIAS_LR_FACTOR: 2
CHECKPOINT_PERIOD: 2000
GAMMA: 0.1
IMS_PER_BATCH: 1
MAX_ITER: 12000
MOMENTUM: 0.9
STEPS: (2400, 6000)
WARMUP_FACTOR: 0.3333333333333333
WARMUP_ITERS: 500
WARMUP_METHOD: constant
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: 0
TEST:
DETECTIONS_PER_IMG: 100
EXPECTED_RESULTS: []
EXPECTED_RESULTS_SIGMA_TOL: 4
IMS_PER_BATCH: 1

2019-05-05 18:30:27,756 maskrcnn_benchmark.trainer INFO: Start training
2019-05-05 18:30:27,852 maskrcnn_benchmark.trainer INFO: Total training time: 0:00:00.095197 (0.0000 s / it)
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
2019-05-05 18:30:27,883 maskrcnn_benchmark.inference INFO: Start evaluation on coco_2019_test dataset(127 images)

training error about GC?

Hello author, thank you very much for releasing the FCOS code; this is really great work! However, I ran into a problem while training the model. Sometimes training runs normally, but sometimes I get this error:

Fatal Python error: GC object already tracked

Thread 0x00007f58161ba700 (most recent call first):

Thread 0x00007f58159b9700 (most recent call first):

Thread 0x00007f58151b8700 (most recent call first):

Thread 0x00007f58169bb700 (most recent call first):

Thread 0x00007f5825cbc700 (most recent call first):
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 296 in wait
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/multiprocessing/queues.py", line 224 in _feed
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 865 in run
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 917 in _bootstrap_inner
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f58264bd700 (most recent call first):
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 296 in wait
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/multiprocessing/queues.py", line 224 in _feed
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 865 in run
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 917 in _bootstrap_inner
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f5826cbe700 (most recent call first):
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 296 in wait
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/multiprocessing/queues.py", line 224 in _feed
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 865 in run
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 917 in _bootstrap_inner
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 885 in _bootstrap

Thread 0x00007f582813e700 (most recent call first):
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 296 in wait
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/multiprocessing/queues.py", line 224 in _feed
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 865 in run
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 917 in _bootstrap_inner
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/threading.py", line 885 in _bootstrap

Current thread 0x00007f5903479740 (most recent call first):
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489 in call
File "/home/zzy/fcos/FCOS/maskrcnn_benchmark/modeling/backbone/resnet.py", line 334 in forward
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 494 in call
File "/home/zzy/fcos/FCOS/maskrcnn_benchmark/modeling/backbone/resnet.py", line 140 in forward
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 494 in call
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/container.py", line 97 in forward
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 494 in call
File "/home/zzy/fcos/FCOS/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 49 in forward
File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 494 in call
File "/home/zzy/fcos/FCOS/maskrcnn_benchmark/engine/trainer.py", line 66 in do_train
File "/home/zzy/fcos/FCOS/tools/train_net.py", line 73 in train
File "/home/zzy/fcos/FCOS/tools/train_net.py", line 167 in main
File "/home/zzy/fcos/FCOS/tools/train_net.py", line 174 in

How can I solve this problem? I am using this line of code to start training:

python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1

Data preparation error

When I train the network on my own data, I get this error:

TypeError: Traceback (most recent call last):
File "/home/wh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/wh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/wh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 232, in
return [default_collate(samples) for samples in transposed]
File "/home/wh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 234, in default_collate
raise TypeError((error_msg.format(type(batch[0]))))
TypeError: batch must contain tensors, numbers, dicts or lists; found <class 'maskrcnn_benchmark.structures.bounding_box.BoxList'>

I followed the same configuration and data preparation process. I found that at line 71 of maskrcnn_benchmark/data/dataset/voc.py the returned target is a BoxList. Is there a modification I am missing? Thanks a lot!

Can not download the pre-trained model

When I run train.py to train FCOS_X_101_64x4d_FPN_2x, the download fails:

Downloading: "https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/20171220/X-101-64x4d.pkl" to /home/zhengchenbin/.torch/models/X-101-64x4d.pkl
Traceback (most recent call last):
  File "tools/train_net.py", line 175, in <module>
    main()
  File "tools/train_net.py", line 168, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 54, in train
    extra_checkpoint_data = checkpointer.load(cfg.MODEL.WEIGHT)
  File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/utils/checkpoint.py", line 65, in load
    checkpoint = self._load_file(f)
  File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/utils/checkpoint.py", line 133, in _load_file
    cached_f = cache_url(f)
  File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/utils/model_zoo.py", line 54, in cache_url
    _download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/utils/model_zoo.py", line 76, in _download_url_to_file
    u = urlopen(url)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Can you provide the pre-trained model X-101-64x4d.pkl, or give another download link such as Baidu Yun or Google Drive? Thank you very much!

Could FCOS overfit one single train image

Hi, I am trying to overfit a single image with FCOS (training and testing on that one image, without transforms such as horizontal flip) to check that I am using your code correctly. The result is very strange: the model does not seem able to fully overfit a single training image.

Inference:
prediction|center|200x0

Ground-truth:
bounding_box

I borrowed code from the Mask R-CNN repo and used the FCOS code from rpn/fcos, but I did not check for other differences between the two repos. Am I missing something that could cause this problem?

Or is it that FCOS simply cannot fully overfit a single image the way Mask R-CNN can, because it uses multiple binary classifiers (sigmoid focal loss) instead of a softmax focal loss, so that the classifiers for classes that do not appear in the training image are never trained on a single-image dataset?
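
To make the question concrete, here is a plain-Python sketch of a per-location sigmoid focal loss, written from the standard focal-loss formula rather than taken from the repository. The defaults `alpha=0.25` and `gamma=2.0` match the `LOSS_ALPHA` and `LOSS_GAMMA` values in a config quoted elsewhere on this page; the function names are illustrative. In this formulation every class contributes its own binary term, so a class that never appears in the image still receives the negative term at each location. Whether that explains the observed result is not established here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_focal_loss(logits, target_class, alpha=0.25, gamma=2.0):
    """Sum of per-class binary focal losses for one location.

    logits: one raw score per foreground class.
    target_class: 1-based class index, or 0 for background.
    """
    total = 0.0
    for class_index, logit in enumerate(logits, start=1):
        p = sigmoid(logit)
        if class_index == target_class:
            # Positive term: penalizes a low probability for the true class.
            total += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            # Negative term: applied to every other class, present or not.
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return total


print(sigmoid_focal_loss([2.0, -3.0, -3.0], target_class=1))
```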

Could you give me some hints on debugging this problem?

Many thanks.

ValueError: num_samples should be a positive integeral value, but got num_samples=0

Traceback (most recent call last):
File "tools/train_net.py", line 175, in
main()
File "tools/train_net.py", line 168, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 61, in train
start_iter=arguments["iteration"],
File "/home/administrator/USPIntern/zq/FCOS/maskrcnn_benchmark/data/build.py", line 158, in make_data_loader
sampler = make_data_sampler(dataset, shuffle, is_distributed)
File "/home/administrator/USPIntern/zq/FCOS/maskrcnn_benchmark/data/build.py", line 63, in make_data_sampler
sampler = torch.utils.data.sampler.RandomSampler(dataset)
File "/home/administrator/.local/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 64, in init
"value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integeral value, but got num_samples=0

I converted my data to COCO format, but this error occurred.
Can you help me? Thanks a lot!

What's the inference speed?

What is the inference speed of FCOS with the ResNet-101-FPN and ResNeXt-32x8d-101-FPN backbones, with the input image size consistent with the training setting?

When testing an image, the scores of some objects are low

I have a question: why are the scores so low? For example, the bike scores 0.248 in your picture in #5.

This also occurs with my trained model. I trained on COCO 2017 with this code and the scores are always low, so I cannot choose a threshold. In my tests a person sometimes scores only 0.2-0.4, and when I set the threshold to 0.2 some false positives appear.

Could you tell me how to solve this problem?

Hello, I ran into a problem when training on my own dataset

File "/data/ubuntu/github/maskrcnn-benchmark/maskrcnn_benchmark/modeling/backbone/fpn.py", line 62, in forward
last_inner = inner_lateral + inner_top_down
RuntimeError: The size of tensor a (51) must match the size of tensor b (52) at non-singleton dimension 2

facebookresearch/maskrcnn-benchmark#142
I have already applied the method from that issue and modified _C.DATALOADER.SIZE_DIVISIBILITY in the config, but the same error is still raised. How can I solve this?
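
A mismatch of 51 against 52 is what one would expect when a feature-map side is odd: halving 51 with rounding up gives 26, and upsampling 26 by 2 gives 52. The arithmetic below is a sketch under the assumption that every stride-2 stage rounds up; `halve`, `level_sizes` and `round_up` are illustrative names, and 808 is a made-up input height chosen only to reproduce the numbers in the error.

```python
def halve(n):
    """Output length of a stride-2 stage, assuming it rounds up."""
    return (n + 1) // 2


def level_sizes(length, num_stages=5):
    """Side length after each of `num_stages` successive stride-2 stages."""
    sizes = []
    for _ in range(num_stages):
        length = halve(length)
        sizes.append(length)
    return sizes


def round_up(length, divisor=32):
    """Smallest multiple of `divisor` that is >= length."""
    return ((length + divisor - 1) // divisor) * divisor


unpadded = level_sizes(808)            # strides 2, 4, 8, 16, 32
padded = level_sizes(round_up(808))    # 808 is padded up to 832

# FPN adds the stride-16 map to the 2x-upsampled stride-32 map.
print(unpadded[3], 2 * unpadded[4])
print(padded[3], 2 * padded[4])
```

Under these assumptions the unpadded input gives 51 against 52, while the padded input gives 52 against 52. If the error persists after changing the setting, it may be worth confirming that the modified config is the one actually loaded by the running code.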

inference error

Hi,
I followed the instructions to create the FCOS environment, and every package installed successfully.

After running

python tools/test_net.py \
    --config-file configs/fcos/fcos_R_50_FPN_1x.yaml \
    MODEL.WEIGHT models/FCOS_R_50_FPN_1x.pth \
    TEST.IMS_PER_BATCH 4 

this error occurs:

Traceback (most recent call last):
  File "tools/test_net.py", line 97, in <module>
    main()
  File "tools/test_net.py", line 49, in main
    cfg.merge_from_file(args.config_file)
  File "/home/peng/anaconda3/envs/FCOS/lib/python3.6/site-packages/yacs/config.py", line 213, in merge_from_file
    self.merge_from_other_cfg(cfg)
  File "/home/peng/anaconda3/envs/FCOS/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/peng/anaconda3/envs/FCOS/lib/python3.6/site-packages/yacs/config.py", line 460, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/peng/anaconda3/envs/FCOS/lib/python3.6/site-packages/yacs/config.py", line 473, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.FCOS_ON'

thank you!

Can you provide a Dockerfile?

When I build a Docker image from the Dockerfile, I always get time-out errors, especially with PyTorch 1.1.
Can you provide the Dockerfile?

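Pitfalls encountered while setting up the environment (now running successfully)

Currently maskrcnn-benchmark only supports pytorch==1.0.0 and does not support the latest PyTorch 1.0.1, so it cannot be installed with conda; use pip install torch==1.0.0 instead.
In addition, the full path of DATA_DIR must be specified in path_category.py; otherwise an error is raised because the json files cannot be found.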

How to run inference on a single image

Is there a script for testing a single image? The code is so tightly coupled with maskrcnn-benchmark that I cannot even find where the whole FCOS pipeline lives.

compile error with pytorch1.0.0 nightly

anaconda3/envs/py35torch1/lib/python3.5/site-packages/torch/include/ATen/Dispatch.h:15:17: error: switch quantity not an integer
switch (TYPE) {
^
Project/FCOS/maskrcnn_benchmark/csrc/cpu/nms_cpu.cpp:71:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES’
AT_DISPATCH_FLOATING_TYPES(dets.type(), "nms", [&] {
^
anaconda3/envs/py35torch1/lib/python3.5/site-packages/torch/include/ATen/Dispatch.h:16:44: error: could not convert ‘Double’ from ‘c10::ScalarType’ to ‘’
AT_PRIVATE_CASE_TYPE(at::ScalarType::Double, double, __VA_ARGS__)
^
anaconda3/envs/py35torch1/lib/python3.5/site-packages/torch/include/ATen/Dispatch.h:8:8: note: in definition of macro ‘AT_PRIVATE_CASE_TYPE’
case enum_type: {
^
Project/FCOS/maskrcnn_benchmark/csrc/cpu/nms_cpu.cpp:71:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES’
AT_DISPATCH_FLOATING_TYPES(dets.type(), "nms", [&] {
^
anaconda3/envs/py35torch1/lib/python3.5/site-packages/torch/include/ATen/Dispatch.h:17:44: error: could not convert ‘Float’ from ‘c10::ScalarType’ to ‘’
AT_PRIVATE_CASE_TYPE(at::ScalarType::Float, float, __VA_ARGS__)
^
anaconda3/envs/py35torch1/lib/python3.5/site-packages/torch/include/ATen/Dispatch.h:8:8: note: in definition of macro ‘AT_PRIVATE_CASE_TYPE’
case enum_type: {
^
Project/FCOS/maskrcnn_benchmark/csrc/cpu/nms_cpu.cpp:71:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES’
AT_DISPATCH_FLOATING_TYPES(dets.type(), "nms", [&] {

But when I copy the code into maskrcnn-benchmark, this error never happens,
so I think there may be a version problem.

How are negative values from the bbox prediction convolution layer handled when calculating the IoU loss?

The outputs of the convolution layer for bbox prediction are not always positive, so a negative value makes the IoU loss NaN (because of the log). I see that the bbox prediction output is post-processed with a 'torch.exp' operation. Does this operation harm detection performance? Are there other ways to deal with negative values in the bbox prediction feature map?
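
The point about the log can be illustrated with a small plain-Python sketch. It assumes boxes are described by four distances (left, top, right, bottom) from a location, maps raw outputs through `exp` so they are strictly positive, and uses a negative-log IoU loss. The `decode` and `iou_loss` names and the `eps` term are assumptions of this sketch, not the repository's implementation.

```python
import math


def decode(raw):
    """Map raw head outputs (any real numbers) to strictly positive distances."""
    return tuple(math.exp(v) for v in raw)


def iou_loss(pred, target, eps=1e-9):
    """-log(IoU) for two boxes given as (left, top, right, bottom) distances
    measured from the same location."""
    pl, pt, pr, pb = pred
    tl, tt, tr, tb = target
    pred_area = (pl + pr) * (pt + pb)
    target_area = (tl + tr) * (tt + tb)
    # Both boxes contain the location, so the overlap is set by the smaller
    # distance on each side.
    inter_w = min(pl, tl) + min(pr, tr)
    inter_h = min(pt, tt) + min(pb, tb)
    inter = inter_w * inter_h
    union = pred_area + target_area - inter
    return -math.log((inter + eps) / (union + eps))


raw_output = (-1.0, 0.0, 2.0, -3.0)   # negative raw values are fine
print(iou_loss(decode(raw_output), (1.0, 1.0, 1.0, 1.0)))
```

Because `exp` is monotonic and covers every positive value, it only reparameterizes the distances; any effect on accuracy would have to be measured rather than assumed.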

What is scale in class FCOSHead in fcos.py?

Why is init_value in scale 1.0, and why does the list need 5 entries?
See this line in fcos.py: self.scales = nn.ModuleList([Scale(init_value=1.0) for _ in range(5)])
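
One reading of that line, offered as an assumption rather than the author's answer: the regression head is shared by all feature levels, and `range(5)` creates one scalar per level so that each level can rescale the shared output before the exponential. The `FPN_STRIDES` list quoted elsewhere on this page has five entries, which matches the count, and `init_value=1.0` makes each scale start as the identity. The framework-free sketch below mirrors that idea; in a real model the scalar would be a trainable parameter, and `regress` is a hypothetical helper.

```python
import math


class Scale:
    """A single multiplicative scalar (trainable in a real framework)."""

    def __init__(self, init_value=1.0):
        self.scale = init_value

    def __call__(self, x):
        return x * self.scale


fpn_strides = [8, 16, 32, 64, 128]                 # five feature levels
scales = [Scale(init_value=1.0) for _ in fpn_strides]


def regress(raw_output, level):
    """Rescale the shared head's raw output for one level, then exponentiate."""
    return math.exp(scales[level](raw_output))


print(regress(0.5, level=0))
```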

How to resume training?

How can I resume training? I cannot find any code for resuming; every run seems to start from scratch. If the program is interrupted unexpectedly, how can I resume?

/FCOS/maskrcnn_benchmark/layers

#from maskrcnn_benchmark import _C
from ._utils import _C

I use "from ._utils import _C" instead of "from maskrcnn_benchmark import _C" in roi_align.py and roi_pool.py.

It works now with this change, but I do not know whether it is correct. Is it?

About the input size

Thanks for your work, it is excellent. I have some questions. Have you run experiments with smaller input sizes, such as 300×300 or 224×224? How does the input size influence the final results? Also, in post-processing the locations are filtered by the classification × center-ness score, and the remaining locations are decoded back to bboxes, followed by NMS. On average, how many locations are decoded back to bboxes? Does this step, together with NMS, cost much time? Looking forward to your reply.
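
The post-processing described in this question can be sketched in plain Python as follows. The defaults are taken from a config quoted elsewhere on this page (`INFERENCE_TH: 0.05`, `PRE_NMS_TOP_N: 1000`, `NMS_TH: 0.6`); the function names, and the exact order of thresholding and multiplication, are assumptions of this sketch and may differ from the repository.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def postprocess(candidates, score_th=0.05, top_n=1000, nms_th=0.6):
    """candidates: list of (box, cls_score, centerness) for one class."""
    # Final score is classification score times center-ness.
    scored = [(box, cls * ctr) for box, cls, ctr in candidates]
    scored = [item for item in scored if item[1] > score_th]
    scored.sort(key=lambda item: item[1], reverse=True)
    scored = scored[:top_n]            # cap the number of boxes entering NMS
    kept = []
    for box, score in scored:
        # Greedy NMS: keep a box only if it overlaps no kept box too much.
        if all(iou(box, k[0]) <= nms_th for k in kept):
            kept.append((box, score))
    return kept
```

In this form the NMS loop compares each surviving box with the boxes already kept, so its cost is bounded by the top-N cap rather than by the number of locations in the feature maps.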

a problem about ground-truth center-ness generation for inference

I tried using the ground-truth center-ness for inference, but got only AP 38.8 rather than the 42.1 mentioned in the paper. I wonder whether there are errors in my code for generating the ground-truth center-ness. Can you help me find the problem or share your code? Thanks.
My code is as follows:

        labels, reg_targets = self.prepare_targets(locations, targets)
        centerness_target = []
        for l in range(len(labels)):
            reg_targets_flatten = reg_targets[l].reshape(-1, 4)
            reg_targets_flatten = (labels[l]!=0)[:,None].float()*reg_targets_flatten
            reg_targets_flatten = self.compute_centerness_targets(reg_targets_flatten)
            centerness_target.append(reg_targets_flatten.reshape(centerness[l].shape))

        return centerness_target
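
For reference, the center-ness target defined in the FCOS paper is sqrt(min(l, r) / max(l, r) × min(t, b) / max(t, b)), where l, t, r, b are the distances from a location to the four sides of its box. The snippet above zeroes the regression targets of background locations before computing center-ness; if the helper divides min by max without a guard, all-zero rows evaluate 0/0. Whether that accounts for the AP gap is not established here. The plain-Python sketch below shows the formula with an explicit guard, which is an assumption of this sketch.

```python
import math


def centerness(l, t, r, b):
    """Center-ness target from distances to the left, top, right, bottom sides."""
    if min(l, t, r, b) <= 0:
        # Guard (an assumption of this sketch): background or out-of-box
        # locations get 0 instead of evaluating 0/0 or a negative square root.
        return 0.0
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


print(centerness(5, 5, 5, 5))   # location at the box center
print(centerness(1, 2, 4, 2))   # location off-center horizontally
```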

AttributeError: 'list' object has no attribute 'resize'

I am training on the COCO dataset with a single GPU. My data path is FCOS-master/datasets/coco; the coco folder contains the annotations, train2014, and val2014 folders, and the annotations folder contains instances_train2014.json and instances_valminusminival2014.json. I ran "python -m torch.distributed.launch --nproc_per_node=1 --master_port=$((RANDOM + 10000)) tools/train_net.py --skip-test --config-file configs/fcos/fcos_R_50_FPN_1x.yaml DATALOADER.NUM_WORKERS 0 OUTPUT_DIR training_dir/fcos_R_50_FPN_1x" and got the following AttributeError:
Traceback (most recent call last):
File "tools/train_net.py", line 174, in
main()
File "tools/train_net.py", line 167, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 73, in train
arguments,
File "/cj/maskrcnn_benchmark/engine/trainer.py", line 56, in do_train
for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
File "/miniconda/envs/py36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 560, in next
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/miniconda/envs/py36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 560, in
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/miniconda/envs/py36/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 85, in getitem
return self.datasets[dataset_idx][sample_idx]
File "/cj/maskrcnn_benchmark/data/datasets/coco.py", line 67, in getitem
img, anno = super(COCODataset, self).getitem(idx)
File "/miniconda/envs/py36/lib/python3.6/site-packages/torchvision-0.2.3a0+9077164-py3.6-linux-x86_64.egg/torchvision/datasets/coco.py", line 114, in getitem
File "/cj/maskrcnn_benchmark/data/transforms/transforms.py", line 15, in call
image, target = t(image, target)
File "/cj/maskrcnn_benchmark/data/transforms/transforms.py", line 60, in call
target = target.resize(image.size)
AttributeError: 'list' object has no attribute 'resize'
Traceback (most recent call last):
File "/miniconda/envs/py36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/miniconda/envs/py36/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/miniconda/envs/py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in
main()
File "/miniconda/envs/py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
cmd=process.args)
subprocess.CalledProcessError: Command '['/miniconda/envs/py36/bin/python', '-u', 'tools/train_net.py', '--local_rank=0', '--skip-test', '--config-file', 'configs/fcos/fcos_R_50_FPN_1x.yaml', 'DATALOADER.NUM_WORKERS', '0', 'OUTPUT_DIR', 'training_dir/fcos_R_50_FPN_1x']' returned non-zero exit status 1.

What should I do?

RuntimeError?

When I was training on my own dataset, I encountered the following problem:
Traceback (most recent call last):
File "tools/train_net.py", line 198, in
main()
File "tools/train_net.py", line 190, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 77, in train
arguments,
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/engine/trainer.py", line 66, in do_train
loss_dict = model(images, targets)
File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 357, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 50, in forward
proposals, proposal_losses = self.rpn(images, features, targets)
File "/home/zhengchenbin/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/fcos.py", line 134, in forward
centerness, targets
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/fcos.py", line 144, in _forward_train
locations, box_cls, box_regression, centerness, targets
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/loss.py", line 146, in __call__
labels, reg_targets = self.prepare_targets(locations, targets)
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/loss.py", line 57, in prepare_targets
points_all_level, targets, expanded_object_sizes_of_interest
File "/home/zhengchenbin/FcosNet/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/loss.py", line 98, in compute_targets_for_locations
is_in_boxes = reg_targets_per_im.min(dim=2)[0] > 0
RuntimeError: cannot perform reduction function min on tensor with no elements because the operation does not have an identity.
The error suggests that an image without any ground-truth box is being read. However, I checked my dataset, and every image has at least one ground-truth box. I also found that the data-loading class COCODataset automatically ignores images without any box.
Any ideas? How can I solve it?
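
One thing that may be worth ruling out, offered as a sketch rather than a diagnosis: an image can have annotations in the JSON file and still end up with zero usable boxes if every annotation is marked iscrowd or has a degenerate width or height. The helper below is hypothetical (the name images_without_usable_boxes and the min_size threshold are assumptions, not part of FCOS or maskrcnn_benchmark). It only reads a COCO-style dict and lists the ids of such images.

```python
from collections import defaultdict


def images_without_usable_boxes(coco, min_size=1.0):
    """Return ids of images whose annotations are all crowd or degenerate.

    `coco` is a dict in COCO annotation format. `min_size` is an assumed
    threshold on box width and height, not a value taken from FCOS.
    """
    usable = defaultdict(int)
    for ann in coco.get("annotations", []):
        _, _, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if ann.get("iscrowd", 0) == 0 and w > min_size and h > min_size:
            usable[ann["image_id"]] += 1
    return sorted(
        img["id"] for img in coco.get("images", []) if usable[img["id"]] == 0
    )


if __name__ == "__main__":
    # Tiny hand-made example; for a real file, load the dict with json.load.
    example = {
        "images": [{"id": 1}, {"id": 2}],
        "annotations": [
            {"image_id": 1, "bbox": [0, 0, 10, 10], "iscrowd": 0},
            {"image_id": 2, "bbox": [0, 0, 10, 10], "iscrowd": 1},
        ],
    }
    print(images_without_usable_boxes(example))
```

If the list is non-empty for the training annotations, those images are candidates for the empty-tensor error; if it is empty, the cause lies elsewhere.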

When I use test_net.py to evaluate the model, an error occurs

Traceback (most recent call last):
  File "/root/models/FCOS/tools/test_net.py", line 97, in <module>
    main()
  File "/root/models/FCOS/tools/test_net.py", line 91, in main
    output_folder=output_folder,
  File "/root/models/FCOS/maskrcnn_benchmark/engine/inference.py", line 115, in inference
    **extra_args)
  File "/root/models/FCOS/maskrcnn_benchmark/data/datasets/evaluation/__init__.py", line 22, in evaluate
    return coco_evaluation(**args)
  File "/root/models/FCOS/maskrcnn_benchmark/data/datasets/evaluation/coco/__init__.py", line 20, in coco_evaluation
    expected_results_sigma_tol=expected_results_sigma_tol,
  File "/root/models/FCOS/maskrcnn_benchmark/data/datasets/evaluation/coco/coco_eval.py", line 31, in do_coco_evaluation
    predictions, dataset, area=area, limit=limit
  File "/root/models/FCOS/maskrcnn_benchmark/data/datasets/evaluation/coco/coco_eval.py", line 233, in evaluate_box_proposals
    inds = prediction.get_field("objectness").sort(descending=True)[1]
  File "/root/models/FCOS/maskrcnn_benchmark/structures/bounding_box.py", line 43, in get_field
    return self.extra_fields[field]
KeyError: 'objectness'
2019-05-02 14:26:22,496 maskrcnn_benchmark.inference INFO: Evaluating bbox proposals

The BoxList doesn't have the "objectness" field, but coco_eval.py uses it to produce results. How can I solve the problem?
If I just change "objectness" to "scores", is the evaluation result still correct?
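
For concreteness, the substitution being asked about can be sketched as a fallback lookup. This uses a plain dict as a hypothetical stand-in for the BoxList extra fields, and the helper names ranking_key and sort_descending are made up for illustration. It shows only the mechanics of ranking by "scores" when "objectness" is absent; it does not settle whether the resulting proposal metrics are meaningful.

```python
def ranking_key(extra_fields, preferred="objectness", fallback="scores"):
    """Pick the name of the field to rank detections by."""
    if preferred in extra_fields:
        return preferred
    if fallback in extra_fields:
        return fallback
    raise KeyError(f"neither {preferred!r} nor {fallback!r} is present")


def sort_descending(values):
    """Return the indices that order `values` from highest to lowest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical stand-in for the extra fields of one prediction.
    fields = {"scores": [0.2, 0.9, 0.5]}
    key = ranking_key(fields)
    print(key, sort_descending(fields[key]))
```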

【Two situations】When compiling FCOS: (1) collect2: fatal error: cannot find 'ld'. (2) unable to execute 'x86_64-conda_cos6-linux-gnu-gcc': No such file or directory.

I tried to build the code in two different environments, but both attempts failed.
Note that my Linux machine is offline.

【1】The environment is:
(0) Ubuntu 16.04
(1) Anaconda python 3.6.2 (GCC 7.2.0)
(2) PyTorch 1.0.0
(3) gcc -v --> 5.4.0
(4) CUDA 9.0.176
【ERROR】:
running build
running build_py
running build_ext
/home/qinhaonan/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py:118: UserWarning:

                           !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (x86_64-conda_cos6-linux-gnu-c++) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 4.9 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.

See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 4.9 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                          !! WARNING !!

x86_64-conda_cos6-linux-gnu-c++ -pthread -shared -Wl,-O2,--sort-common,--as-needed,-z,relro,-z,now -Wl,-rpath,/opt/anaconda3/lib -L/opt/anaconda3/lib -Wl,-O2,--sort-common,--as-needed,-z,relro,-z,now -Wl,-rpath,/opt/anaconda3/lib -L/opt/anaconda3/lib build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/vision.o build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/cpu/ROIAlign_cpu.o build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/cpu/nms_cpu.o build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/cuda/nms.o build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/cuda/ROIPool_cuda.o build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/cuda/ROIAlign_cuda.o build/temp.linux-x86_64-3.6/home/qinhaonan/Algorithm/FCOS/FCOS_01/FCOS-master/maskrcnn_benchmark/csrc/cuda/SigmoidFocalLoss_cuda.o -L/usr/local/cuda/lib64 -L/opt/anaconda3/lib -lcudart -lpython3.6m -o build/lib.linux-x86_64-3.6/maskrcnn_benchmark/_C.cpython-36m-x86_64-linux-gnu.so
collect2: fatal error: cannot find 'ld'
compilation terminated.
error: command 'x86_64-conda_cos6-linux-gnu-c++' failed with exit status 1

【2】The environment is:
(0) Ubuntu 16.04
(1) Anaconda python 3.6.2 (GCC 7.2.0)
(2) PyTorch 0.4.0
(3) gcc -v --> 4.8.5
(4) CUDA 8.0.61
【ERROR】:
running build_ext
/apps/jhinno/users/.../.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py:80: UserWarning: Error checking compiler version: [Errno 2] No such file or directory: 'x86_64-conda_cos6-linux-gnu-c++'
warnings.warn('Error checking compiler version: {}'.format(error))
/apps/jhinno/users/.../.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py:106: UserWarning:

                           !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (x86_64-conda_cos6-linux-gnu-c++) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 4.9 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.

See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 4.9 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                          !! WARNING !!

warnings.warn(ABI_INCOMPATIBILITY_WARNING.format(compiler))
building 'maskrcnn_benchmark._C' extension
creating build
creating build/temp.linux-x86_64-3.6
...
creating build/temp.linux-x86_64-3.6/.../FCOS/FCOS-master/maskrcnn_benchmark/csrc/cpu
creating build/temp.linux-x86_64-3.6/.../FCOS/FCOS-master/maskrcnn_benchmark/csrc/cuda
x86_64-conda_cos6-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -Wstrict-prototypes -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -fPIC -DWITH_CUDA -I/apps/jhinno/users/IMGLAB/4003/HeroNet/FCOS/FCOS-master/maskrcnn_benchmark/csrc -I/apps/jhinno/users/IMGLAB/4003/.local/lib/python3.6/site-packages/torch/lib/include -I/apps/jhinno/users/IMGLAB/4003/.local/lib/python3.6/site-packages/torch/lib/include/TH -I/apps/jhinno/users/IMGLAB/4003/.local/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/opt/anaconda3/include/python3.6m -c /apps/jhinno/users/IMGLAB/4003/HeroNet/FCOS/FCOS-master/maskrcnn_benchmark/csrc/vision.cpp -o build/temp.linux-x86_64-3.6/apps/jhinno/users/IMGLAB/4003/HeroNet/FCOS/FCOS-master/maskrcnn_benchmark/csrc/vision.o -DTORCH_EXTENSION_NAME=maskrcnn_benchmark._C -std=c++11
unable to execute 'x86_64-conda_cos6-linux-gnu-gcc': No such file or directory
error: command 'x86_64-conda_cos6-linux-gnu-gcc' failed with exit status 1
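
Both logs end with a tool that could not be run ('ld' in the first, 'x86_64-conda_cos6-linux-gnu-gcc' in the second). A minimal sketch for checking which of these names resolve on PATH before building is shown below. The function name missing_build_tools is hypothetical, and the check only looks at PATH, so it may not mirror every place the compiler driver searches for 'ld'.

```python
import os
import shutil


def missing_build_tools(names, which=shutil.which):
    """Return the subset of `names` that cannot be resolved on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # CC and CXX are the conventional environment variables for the compilers.
    # The fallbacks are the compiler names that appear in the two logs above.
    cc = (os.environ.get("CC") or "x86_64-conda_cos6-linux-gnu-gcc").split()[0]
    cxx = (os.environ.get("CXX") or "x86_64-conda_cos6-linux-gnu-c++").split()[0]
    print("not found on PATH:", missing_build_tools([cc, cxx, "ld"]))
```

If a name is reported as missing, the build is likely asking for a toolchain that is not installed in that environment.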

training error, cannot start

Hi! I tried to train on the coco_train2017 data following the steps you describe, but I get the following error:
2019-05-15 10:23:24,814 maskrcnn_benchmark.trainer INFO: Start training
Traceback (most recent call last):
File "/home/work/songping/anaconda3/envs/FCOS/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/work/songping/anaconda3/envs/FCOS/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/work/songping/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/distributed/launch.py", line 235, in <module>
main()
File "/home/work/songping/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/distributed/launch.py", line 231, in main
cmd=process.args)
subprocess.CalledProcessError: Command '['/home/work/songping/anaconda3/envs/FCOS/bin/python', '-u', 'tools/train_net.py', '--local_rank=0', '--skip-test', '--config-file', 'configs/fcos/fcos_R_50_FPN_1x.yaml', 'DATALOADER.NUM_WORKERS', '2', 'OUTPUT_DIR', 'training_dir/fcos_R_50_FPN_1x']' died with <Signals.SIGSEGV: 11>.
Could you help me solve this problem? Thank you.

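Hello, why is centerness not used in inference_single_cv?

Hello, and thank you very much for the open-source code. While using inference_single_cvimage.py, I found that the code does not use centerness: the centerness branch is just a conv2d(256,1), and the compute centerness function is never used. I also set a breakpoint in the compute centerness function and the program did not pause there, which confirms that the function is not executed:
def compute_centerness_targets(self, reg_targets):
    left_right = reg_targets[:, [0, 2]]
    top_bottom = reg_targets[:, [1, 3]]
    centerness = (left_right.min(dim=-1)[0] / left_right.max(dim=-1)[0]) * \
                 (top_bottom.min(dim=-1)[0] / top_bottom.max(dim=-1)[0])
    return torch.sqrt(centerness)
Is there a parameter that controls this? Why is centerness not computed at test time? Thank you very much!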
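
To make the quoted formula concrete, here is a scalar version in plain Python. The function name centerness_target and the argument order (l, t, r, b) are assumptions read off the column indices in the snippet above. The name and argument of the original function suggest it builds training targets from ground-truth distances, which would be consistent with the breakpoint not firing during inference, but that is an inference from the snippet and not something confirmed here.

```python
import math


def centerness_target(l, t, r, b):
    """Centerness of a location from its distances to the four box sides.

    Mirrors the quoted formula:
    sqrt(min(l, r) / max(l, r) * min(t, b) / max(t, b)).
    All four distances are assumed positive (the location is inside the box).
    """
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


if __name__ == "__main__":
    print(centerness_target(5, 5, 5, 5))  # location at the box center
    print(centerness_target(1, 5, 9, 5))  # location close to the left edge
```

The value is 1 at the box center and falls toward 0 as the location approaches any side.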

Improvement for small object detection

Thank you for the great work! I ran the code on my custom dataset of medical images. The results are listed below:
2019-05-06 03:04:17,399 maskrcnn_benchmark.inference INFO:
OrderedDict([('bbox', OrderedDict([('AP', 0.5055817771868518),
('AP50', 0.8926599742997058), ('AP75', 0.4691991724725123),
('APs', 0.0007072135785007071), ('APm', 0.10701570554699777),
('APl', 0.5285697565878185)]))])

Compared with YOLOv3 on the same dataset, I found that the proposed algorithm does not work well on small objects ('APs', 0.0007072135785007071). As mentioned in the paper, this is an anchor-free method. Are there any ideas for improving detection performance on small objects, or any hyperparameters to fine-tune?
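
Before tuning anything, it may help to check how many ground-truth objects actually fall into the size range that APs measures. The sketch below buckets boxes with the COCO area thresholds (32^2 and 96^2 pixels). It is only an approximation: COCO evaluation uses each annotation's area field, whereas this uses box width times height, and the function names are hypothetical.

```python
from collections import Counter

SMALL_MAX_AREA = 32 ** 2   # COCO threshold between small and medium
MEDIUM_MAX_AREA = 96 ** 2  # COCO threshold between medium and large


def coco_size_bucket(width, height):
    """Classify a box as 'small', 'medium' or 'large' by its pixel area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def count_by_size(boxes):
    """Count (width, height) boxes per COCO size bucket."""
    counts = Counter(coco_size_bucket(w, h) for w, h in boxes)
    return {name: counts.get(name, 0) for name in ("small", "medium", "large")}


if __name__ == "__main__":
    # Hand-made widths and heights in pixels, for illustration only.
    print(count_by_size([(10, 10), (50, 50), (200, 200), (5, 5)]))
```

A very low count in the small bucket would mean the APs figure rests on few examples, which is worth knowing when comparing detectors.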

About the head feature sharing

Hi, are the head features for cls and box (e.g. the 4 conv layers) shared in your implementation? Have you run experiments to compare the effect on final performance?

box target?

Does the network output the box offsets (l*,r*,t*,b*) normalized by image size? I tried to train FCOS on another task and found that the output values of the regression branch are unstable.
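
For reference, the four distances as defined in the FCOS paper can be computed from a location and a box as in the sketch below. It works in raw pixel units with an (l, t, r, b) ordering and says nothing about whether this implementation rescales the values, which is exactly the open question. The function names are hypothetical.

```python
def ltrb_offsets(x, y, box):
    """Distances from location (x, y) to the sides of box (x0, y0, x1, y1).

    Returns (l, t, r, b) in the same units as the inputs, with no
    normalization applied.
    """
    x0, y0, x1, y1 = box
    return (x - x0, y - y0, x1 - x, y1 - y)


def is_inside(offsets):
    """A location lies strictly inside the box when all four distances are positive."""
    return min(offsets) > 0


if __name__ == "__main__":
    inside = ltrb_offsets(5, 5, (0, 0, 10, 20))
    outside = ltrb_offsets(12, 5, (0, 0, 10, 20))
    print(inside, is_inside(inside))
    print(outside, is_inside(outside))
```

Because the raw distances scale with object size in pixels, large boxes produce large targets unless some rescaling is applied.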

RuntimeError: cannot perform reduction function min on tensor with no elements because the operation does not have an identity

Traceback (most recent call last):
File "tools/train_net.py", line 176, in <module>
main()
File "tools/train_net.py", line 169, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_net.py", line 75, in train
arguments,
File "/home/administrator/FCOS/maskrcnn_benchmark/engine/trainer.py", line 66, in do_train
loss_dict = model(images, targets)
File "/home/administrator/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/FCOS/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 50, in forward
proposals, proposal_losses = self.rpn(images, features, targets)
File "/home/administrator/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/administrator/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/fcos.py", line 134, in forward
centerness, targets
File "/home/administrator/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/fcos.py", line 144, in _forward_train
locations, box_cls, box_regression, centerness, targets
File "/home/administrator/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/loss.py", line 142, in __call__
labels, reg_targets = self.prepare_targets(locations, targets)
File "/home/administrator/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/loss.py", line 57, in prepare_targets
points_all_level, targets, expanded_object_sizes_of_interest
File "/home/administrator/FCOS/maskrcnn_benchmark/modeling/rpn/fcos/loss.py", line 94, in compute_targets_for_locations
is_in_boxes = reg_targets_per_im.min(dim=2)[0] > 0
RuntimeError: cannot perform reduction function min on tensor with no elements because the operation does not have an identity

How can I solve it?

Also, I am training on my own data, which has only 20 classes.
Should I change 80 (COCO) to 20 in the network?

centerness loss is larger than other loss

I trained FCOS on my own dataset, but the centerness loss is much larger than loss_cls and loss_reg. Could you please give me some advice on how to solve this?

2019-04-29 07:05:16,317 maskrcnn_benchmark.trainer INFO: eta: 11:21:07  iter: 13000  loss: 0.7219 (0.7213)  loss_centerness: 0.5492 (0.5514)  loss_cls: 0.0237 (0.0238)  loss_reg: 0.1435 (0.1462)  time: 1.1004 (1.1045)  data: 0.6328 (0.6421)  lr: 0.010000  max mem: 7608
2019-04-29 07:05:39,170 maskrcnn_benchmark.trainer INFO: eta: 11:21:39  iter: 13020  loss: 0.7145 (0.7212)  loss_centerness: 0.5515 (0.5515)  loss_cls: 0.0237 (0.0238)  loss_reg: 0.1377 (0.1459)  time: 1.1148 (1.1060)  data: 0.5771 (0.6409)  lr: 0.010000  max mem: 7608
2019-04-29 07:06:03,733 maskrcnn_benchmark.trainer INFO: eta: 11:24:04  iter: 13040  loss: 0.7170 (0.7211)  loss_centerness: 0.5514 (0.5515)  loss_cls: 0.0239 (0.0239)  loss_reg: 0.1373 (0.1457)  time: 1.1881 (1.1105)  data: 0.5283 (0.6373)  lr: 0.010000  max mem: 7608
2019-04-29 07:06:28,364 maskrcnn_benchmark.trainer INFO: eta: 11:26:21  iter: 13060  loss: 0.7417 (0.7218)  loss_centerness: 0.5507 (0.5514)  loss_cls: 0.0235 (0.0239)  loss_reg: 0.1642 (0.1464)  time: 1.1700 (1.1148)  data: 0.6051 (0.6388)  lr: 0.010000  max mem: 7608
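
One point that may explain part of the gap, under the assumption that the centerness branch is trained with binary cross-entropy against soft targets in [0, 1] (as the FCOS paper describes; treat it as an assumption for this code base): with a soft target t, the loss is minimized at prediction p = t, and its value there is the binary entropy of t, which is greater than zero unless t is 0 or 1. The sketch below computes that floor; the function names are hypothetical.

```python
import math


def bce(p, t):
    """Binary cross-entropy of prediction p in (0, 1) against soft target t in [0, 1]."""
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


def bce_floor(t):
    """Smallest achievable BCE for target t, reached at p == t (binary entropy)."""
    if t in (0.0, 1.0):
        return 0.0
    return bce(t, t)


if __name__ == "__main__":
    for target in (0.1, 0.5, 0.9):
        print(target, round(bce_floor(target), 4))
```

For a target of 0.5 the floor is ln 2, about 0.693, so a centerness loss that plateaus well above loss_cls is not by itself evidence of a training problem.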

About the dockerfile

The Dockerfile in the docker folder doesn't work well. Is there something wrong with it?

Something goes wrong during installation

  1. How can I download the PyTorch 1.0.0 nightly? I can only download the PyTorch 1.1.0 nightly version.
  2. Is CUDA 9.0 not OK for this repo? I used CUDA 9.0, and running python setup.py build develop fails with the error /usr/local/cuda/bin/nvcc: no such file or directory.
