ethanhe42 / kl-loss

Bounding Box Regression with Uncertainty for Accurate Object Detection (CVPR'19)

Home Page: https://yihui.dev/bounding-box-regression-with-uncertainty-for-accurate-object-detection

License: Apache License 2.0

CMake 4.55% Makefile 0.07% Python 93.20% MATLAB 0.24% C++ 0.37% Cuda 0.22% Dockerfile 0.10% Cython 1.25%

object-detection pytorch detection-algorithm detection-model detection detection-network

kl-loss's Introduction

Bounding Box Regression with Uncertainty for Accurate Object Detection

GitHub - yihui-he/KL-Loss: Bounding Box Regression with Uncertainty for Accurate Object Detection (CVPR'19)

CVPR 2019 Open Access Repository

CVPR 2019 [presentation (youtube)]

Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, Xiangyu Zhang, Carnegie Mellon University & Megvii Inc.

https://www.youtube.com/embed/bcGtNdTzdkc

Citation
Installation
Testing
Training
PyTorch re-implementations
FAQ

Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clearly as possible. However, we observe that ambiguities are still introduced when labeling the bounding boxes. In this paper, we propose a novel bounding box regression loss for learning bounding box transformation and localization variance together. Our loss greatly improves the localization accuracies of various architectures with nearly no additional computation. The learned localization variance allows us to merge neighboring bounding boxes during non-maximum suppression (NMS), which further improves the localization performance. On MS-COCO, we boost the Average Precision (AP) of VGG-16 Faster R-CNN from 23.6% to 29.1%. More importantly, for ResNet-50-FPN Mask R-CNN, our method improves the AP and AP90 by 1.8% and 6.2% respectively, which significantly outperforms previous state-of-the-art bounding box refinement methods.

Citation

If you find the code useful in your research, please consider citing:

@InProceedings{klloss,
  author = {He, Yihui and Zhu, Chenchen and Wang, Jianren and Savvides, Marios and Zhang, Xiangyu},
  title = {Bounding Box Regression With Uncertainty for Accurate Object Detection},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2019}
}

Installation

Please find installation instructions for Caffe2 and Detectron in [INSTALL.md](INSTALL.md).

When installing cocoapi, please use my fork to get AP80 and AP90 scores.

Testing

Inference without Var Voting (8 GPUs):

python2 tools/test_net.py -c configs/e2e_faster_rcnn_R-50-FPN_2x.yaml

You will get:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.385
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.578
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.412
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.209
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.412
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.515
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.323
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.499
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.522
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.321
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.553
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.680
 Average Precision  (AP) @[ IoU=0.60      | area=   all | maxDets=100 ] = 0.533
 Average Precision  (AP) @[ IoU=0.70      | area=   all | maxDets=100 ] = 0.461
 Average Precision  (AP) @[ IoU=0.80      | area=   all | maxDets=100 ] = 0.350
 Average Precision  (AP) @[ IoU=0.85      | area=   all | maxDets=100 ] = 0.269
 Average Precision  (AP) @[ IoU=0.90      | area=   all | maxDets=100 ] = 0.154
 Average Precision  (AP) @[ IoU=0.95      | area=   all | maxDets=100 ] = 0.032

Inference with Var Voting:

python2 tools/test_net.py -c configs/e2e_faster_rcnn_R-50-FPN_2x.yaml STD_NMS True

You will get:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.392
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.576
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.425
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.212
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.417
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.526
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.324
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.528
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.564
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.346
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.594
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.736
 Average Precision  (AP) @[ IoU=0.60      | area=   all | maxDets=100 ] = 0.536
 Average Precision  (AP) @[ IoU=0.70      | area=   all | maxDets=100 ] = 0.472
 Average Precision  (AP) @[ IoU=0.80      | area=   all | maxDets=100 ] = 0.363
 Average Precision  (AP) @[ IoU=0.85      | area=   all | maxDets=100 ] = 0.281
 Average Precision  (AP) @[ IoU=0.90      | area=   all | maxDets=100 ] = 0.165
 Average Precision  (AP) @[ IoU=0.95      | area=   all | maxDets=100 ] = 0.037

Training

python2 tools/train_net.py -c configs/e2e_faster_rcnn_R-50-FPN_2x.yaml

PyTorch re-implementations

Stronger-yolo-pytorch: yolov3 + KL-loss

FAQ

Please create a new issue.

Detectron Readme


kl-loss's Issues

ap95 overwrite ap90 in cocoapi

stats[16] = _summarize(1, iouThr=.9, maxDets=self.params.maxDets[2])
stats[16] = _summarize(1, iouThr=.95, maxDets=self.params.maxDets[2])
I found these two lines in your modified cocoapi. Both write to stats[16], so the AP95 value will overwrite the AP90 value, won't it?
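
The overwrite described above can be reproduced with a plain list. This is a minimal sketch, not the cocoapi code: `summarize` is a hypothetical stand-in for `_summarize`, and using index 17 for AP95 is only an assumption about how the array could be extended.

```python
def summarize(iou_thr):
    # Hypothetical stand-in for cocoapi's _summarize: returns a fake AP value.
    return round(1.0 - iou_thr, 2)

def fill_stats_buggy():
    stats = [0.0] * 18
    stats[16] = summarize(0.90)  # AP90
    stats[16] = summarize(0.95)  # AP95 written to the same slot: AP90 is lost
    return stats

def fill_stats_fixed():
    stats = [0.0] * 18
    stats[16] = summarize(0.90)  # AP90
    stats[17] = summarize(0.95)  # AP95 gets its own slot (assumed index)
    return stats
```

With the second assignment moved to its own index, both values survive in the array.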

Extra term in the KL-loss code and the NaN problem

Hi, thanks for the source code.

Could you please explain the aim of adding the term 0.5 * sq2 * exp(-a) * beta in the loss function?

if (abs_val < beta) {
    out[ind] = (sq2 * exp(-a) * 0.5 * val * val / beta + 0.5 * sq2 * exp(-a) * beta + a) / max(S[0], 1.0);
} else {
    out[ind] = (sq2 * exp(-a) * (abs_val - 0.5 * beta) + 0.5 * sq2 * exp(-a) * beta + a) / max(S[0], 1.0);
}

Besides, when the variance is close to zero while the regression error (abs_val in the code) is relatively large (which probably happens in the first iterations due to initialization), the loss becomes extremely large and results in NaN. Did you encounter this problem? If so, how did you solve it?

Thanks!
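
To make the two branches easier to inspect, here is a Python transcription of the quoted snippet. It is a sketch under assumptions: `sq2` is taken to be sqrt(2), `norm` stands in for `max(S[0], 1.0)`, and what `a` represents is not stated in the snippet (it is treated here as a free scalar).

```python
import math

SQ2 = math.sqrt(2.0)  # assumption: `sq2` in the snippet denotes sqrt(2)

def kl_smooth_l1(val, a, beta=1.0, norm=1.0):
    """Transcription of the two branches quoted above.

    val:  regression error
    a:    the scalar called `a` in the snippet (meaning assumed, not confirmed)
    norm: stands in for max(S[0], 1.0)
    """
    abs_val = abs(val)
    scale = SQ2 * math.exp(-a)
    extra = 0.5 * scale * beta  # the term the question asks about
    if abs_val < beta:
        core = scale * 0.5 * val * val / beta
    else:
        core = scale * (abs_val - 0.5 * beta)
    return (core + extra + a) / norm
```

Algebraically, in the linear branch the extra term cancels the `- 0.5 * beta` offset, so that branch reduces to `sq2 * exp(-a) * abs_val + a`. The sketch also shows the blow-up described above: for a strongly negative `a`, `exp(-a)` multiplies the error by a very large factor.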

Could you also show where softNMS and variance voting are implemented?

about the meaning of Eq. 11

Hi, Eq. 11 recomputes the new location as a weighted mean of the box itself and its neighboring bounding boxes. Why is this form of Eq. 11 'intuitive'? To me, the average of random samples drawn from the neighboring bounding boxes is more intuitive, i.e.

That is to say, the new coordinate is a sample drawn from the average distribution of the neighboring bounding boxes.
Please explain Eq. 11 in more detail, and correct me if I'm wrong. Thanks!
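
For reference, the weighted mean under discussion can be written as follows. This is a reconstruction from the descriptions in these issues, not a quotation of the paper: the weight p_i is the one quoted in the issue about p_i in Eq. 11, and the inverse-variance factor is an assumption based on the code quoted there, which divides each box by `confidence`.

```latex
p_i = \exp\!\left(-\frac{\bigl(1 - \mathrm{IoU}(b_i, b)\bigr)^2}{\sigma_t}\right),
\qquad
x = \frac{\sum_i p_i \, x_i / \sigma_{x,i}^2}{\sum_i p_i / \sigma_{x,i}^2}
```

Here b is the selected box, the b_i are the boxes overlapping it (including b itself), x_i is one coordinate of b_i, and \sigma_{x,i}^2 is its predicted variance.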

How do I add the KL loss to Faster R-CNN? I can only find the Fast R-CNN version

Sorry, how do I add the KL loss to Faster R-CNN? I can only find the Fast R-CNN version.

I can't find the KL loss part of this code (I don't know anything about Caffe)

Hi yihui-he,
Could you point out the KL loss part for me? I want to port it to PyTorch too.
You can email me.
Thank you.

Why stop the gradient?

Hello.
May I ask why, in the program, the gradient from the sigma prediction branch is stopped before it reaches the bbox branch?
Suppose the gradient of the sigma prediction branch does need to be stopped. An article from 2022 proposed bridging the gap between the Gaussian distribution and the Dirac delta distribution by using the IoU as an exponent, since the two distributions can never be exactly equivalent. In that case, does the gradient coming from the sigma loss still need to be stopped?

Could the KL loss be negative?

Hi, thanks for open sourcing the code.

I see from the code model.net.Copy('bbox_pred_std', 'bbox_pred_std_abs') that you copy log(sigma^2) directly as the std output instead of passing it through a ReLU or sigmoid. As I understand it, (log(sigma^2))/2 could then be negative, which may make reg_log negative. Could that be a problem?

Thanks!
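
A small numerical check makes the question concrete. This is a sketch, not the repository's Caffe2 code: it assumes the per-coordinate loss has the form exp(-alpha) * smooth_l1(err) + alpha / 2 with alpha = log(sigma^2), which matches the PyTorch snippet quoted in a later issue on this page.

```python
import math

def kl_reg_loss(err, alpha):
    """Per-coordinate loss exp(-alpha) * smooth_l1(err) + alpha / 2.

    alpha is assumed to be the predicted log(sigma^2); it is unconstrained,
    so it can be negative.
    """
    abs_err = abs(err)
    smooth_l1 = 0.5 * err * err if abs_err <= 1.0 else abs_err - 0.5
    return math.exp(-alpha) * smooth_l1 + 0.5 * alpha

print(kl_reg_loss(0.0, -2.0))  # tiny error, small predicted variance: negative
print(kl_reg_loss(2.0, 0.0))   # large error, unit variance: positive
```

Under this form the loss is a negative log-likelihood with constants dropped, so a negative value is not an error by itself: it appears whenever the error is small and the predicted variance is below 1.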

The value of the KL Loss

Hello Yihui, thank you very much for sharing the code; it is excellent work! I have a question about the value of the KL loss. I read the instructions before creating this issue, but I think my question is a little different. Here it is:
KL divergence is zero only when the two distributions are identical. But on the one hand, our predictions can never be identical to the ground truth; on the other hand, we discard the terms (H(P_D(x)) and log(2pi)/2) that do not depend on the estimated parameters. As a result, the KL loss does not converge to zero (or to any fixed value). Ideally it converges to a minimum, but we do not know the value of that minimum. So my questions are: how do you determine whether training is going well, given that, unlike loss functions that converge to a fixed value such as 0, the ideal value of the proposed KL loss is unknown? And how do you evaluate the variance, given that there is no ground truth for it? Thank you.
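
The point about the unknown minimum can be made explicit with a short derivation. It is a sketch under assumptions: it uses only the quadratic branch of the loss, writes \delta = x_g - x_e for the regression error and \alpha = \log \sigma^2 for the predicted log-variance, and drops the constant terms mentioned above.

```latex
L(\alpha) = \frac{e^{-\alpha}}{2}\,\delta^2 + \frac{\alpha}{2},
\qquad
\frac{\partial L}{\partial \alpha} = -\frac{e^{-\alpha}}{2}\,\delta^2 + \frac{1}{2} = 0
\;\Longrightarrow\;
\alpha^\ast = \log \delta^2,
\qquad
L(\alpha^\ast) = \frac{1}{2} + \frac{1}{2}\log \delta^2
```

For a fixed error, the best achievable loss is therefore 1/2 + (1/2) log \delta^2, which depends on the error itself and tends to minus infinity as \delta goes to 0. Under these assumptions there is no fixed target value such as 0 to compare the training loss against.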

the loss_bbox is 0

When I train on my own data with e2e_faster_rcnn_R-50-FPN_2x.yaml and the pre-trained model, loss_bbox is 0!
My dataset has 13 classes plus background, so I modified net.py:

for blob in model.params:
    # Original line, replaced by the filtered version below:
    # unscoped_param_names[c2_utils.UnscopeName(str(blob))] = True
    keyname = c2_utils.UnscopeName(str(blob))
    # Skip the output layers whose shapes depend on the number of classes,
    # so they are not loaded from the pre-trained weights.
    if keyname in ('cls_score_w', 'cls_score_b', 'bbox_pred_w', 'bbox_pred_b'):
        continue
    unscoped_param_names[keyname] = True

the log is:
{"accuracy_cls": "1.000000", "bbox_pred_std_abs_logw_loss": "0.000000", "bbox_pred_std_abs_mulw_loss": "0.000000", "eta": "0:00:41", "iter": 89920, "loss": "0.005503", "loss_bbox": "0.000000", "loss_cls": "0.000000", "loss_rpn_bbox_fpn2": "0.000000", "loss_rpn_bbox_fpn3": "0.000959", "loss_rpn_bbox_fpn4": "0.000380", "loss_rpn_bbox_fpn5": "0.000077", "loss_rpn_bbox_fpn6": "0.000000", "loss_rpn_cls_fpn2": "0.000117", "loss_rpn_cls_fpn3": "0.000207", "loss_rpn_cls_fpn4": "0.000185", "loss_rpn_cls_fpn5": "0.000018", "loss_rpn_cls_fpn6": "0.000000", "lr": "0.000100", "mb_qsize": 64, "mem": 13183, "time": "0.515552"}

Is there something wrong?

voc dataset

PLEASE FOLLOW THESE INSTRUCTIONS BEFORE POSTING

1. Please thoroughly read README.md, INSTALL.md, GETTING_STARTED.md, and FAQ.md
2. Please search existing open and closed issues in case your issue has already been reported
3. Please try to debug the issue in case you can solve it on your own before posting

After following steps 1-3 above and agreeing to provide the detailed information requested below, you may continue with posting your issue

(Delete this line and the text above it.)

Expected results

What did you expect to see?

Actual results

What did you observe instead?

Detailed steps to reproduce

E.g.:

The command that you ran

System information

Operating system: ?
Compiler version: ?
CUDA version: ?
cuDNN version: ?
NVIDIA driver version: ?
GPU models (for all devices if they are not all the same): ?
PYTHONPATH environment variable: ?
python --version output: ?
Anything else that seems relevant: ?

What do the bbox_pred and bbox_targets mean?

https://github.com/yihui-he/KL-Loss/blob/0a5617f02b5c0ebc57ddedffeb212859f1b3f008/detectron/modeling/fast_rcnn_heads.py#L114

In original bbox regression, the smooth L1 loss is calculated between the offsets, but according to your paper, I think here the 'bbox_pred' is the xyxy coordinates after regression? Can you explain it?

Question about code of KL loss

Thanks for your wonderful work.
When we use KL loss and set cfg.PRED_STD to true, the SmoothL1Loss is also computed. I think loss_bbox is equal to bbox_pred_std_abs_mulw_loss. Is my understanding correct?
https://github.com/yihui-he/KL-Loss/blob/66c0ed9e886a2218f4cf88c0efd4f40199bff54a/detectron/modeling/fast_rcnn_heads.py#L154

negative loss

Did you encounter a negative loss during training?
I am seeing a negative loss after a few iterations on my Pascal VOC training set. I think it might be because of the large alpha values produced. Do you know if we can clamp these values to some limits? Or might it be because I have decoded the bbox coordinates to pixel coordinates? (Is the bbox_pred mentioned in the fastrcnnhead.py script the offset to the target?)

Thanks a lot in advance

What to do with bounding box regression variance typically used when normalizing bounding box targets

Hi @yihui-he,
As Faster-RCNN normally uses the cx, cy, w, h encoding of bounding boxes and for this work you used the x1, y1, x2, y2 encoding, I was wondering the following:

What to do with bounding box regression variance typically used when normalizing bounding box targets?

In bounding box regression, implementations (though not the papers) typically include a 'variance' by which the differences between bounding boxes are normalized (a good explanation of what I mean: https://leimao.github.io/blog/Bounding-Box-Encoding-Decoding/). E.g. in the detectron code:

https://github.com/yihui-he/KL-Loss/blob/962a687c7caca56b3b8562b437a8370077a59074/detectron/core/config.py#L456-L458

(even though I believe that in the original implementation of Faster RCNN these values were (0.1, 0.1, 0.2, 0.2))

But where you transform the encoding from cx, cy, w, h into x1, y1, x2, y2, have you taken the variances into account as well, or could you skip this? I think these variances are somewhat 'magical' values that get carried from implementation to implementation without many people questioning them. But I would love your take on these values.
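
To make the question concrete, the sketch below contrasts the usual center/size encoding with one possible corner encoding. Everything here is illustrative: the weights (10, 10, 5, 5) are assumed to be the Detectron-style values the linked config refers to, and `encode_xyxy` is a hypothetical corner encoding, not necessarily the one used in this repository.

```python
import math

# Assumed Detectron-style weights. Multiplying by 10 and 5 is the same as
# dividing by the "variances" 0.1 and 0.2 mentioned above.
WEIGHTS = (10.0, 10.0, 5.0, 5.0)

def _size(box):
    x1, y1, x2, y2 = box
    return x2 - x1, y2 - y1

def encode_cxcywh(proposal, gt, weights=WEIGHTS):
    """Standard center/size encoding with per-coordinate weights."""
    pw, ph = _size(proposal)
    gw, gh = _size(gt)
    pcx, pcy = proposal[0] + 0.5 * pw, proposal[1] + 0.5 * ph
    gcx, gcy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    wx, wy, ww, wh = weights
    return (wx * (gcx - pcx) / pw, wy * (gcy - pcy) / ph,
            ww * math.log(gw / pw), wh * math.log(gh / ph))

def encode_xyxy(proposal, gt, weights=WEIGHTS):
    """Hypothetical corner encoding: corner offsets divided by proposal size.

    Only the x/y weights are used, since there is no log-size term.
    """
    pw, ph = _size(proposal)
    wx, wy = weights[0], weights[1]
    return (wx * (gt[0] - proposal[0]) / pw, wy * (gt[1] - proposal[1]) / ph,
            wx * (gt[2] - proposal[2]) / pw, wy * (gt[3] - proposal[3]) / ph)
```

In this sketch the weights only rescale the targets, so leaving them out changes the magnitude of the regression targets (and hence of the loss and any predicted variance) but not what is being regressed.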

train err

When I try to train on my data, I get the error "ValueError: numpy.ufunc has the wrong size, try recompiling."

How exactly is the variance predicted here?


Questions about pre-trained weights

Thanks for open sourcing the KL-loss. I have some questions about the training scheme. According to configs/e2e_faster_rcnn_R-50-FPN_2x.yaml, the pre-trained weights are based on MS-COCO rather than ImageNet. So is the total training schedule 4x? It seems that training for more iterations increases the mAP on COCO (by 1.2 to 1.3 points from 1x to 2x), but I am not sure about the improvement from 2x to 4x. Do you have a comparison with normal 4x training without KL-Loss? Thanks.

what do the 'bbox_inside_weights' and 'bbox_outside_weights' mean?

Sorry, but I am not familiar with Caffe2. Could you please tell me what 'bbox_inside_weights' and 'bbox_outside_weights' mean in https://github.com/yihui-he/KL-Loss/blob/0a5617f02b5c0ebc57ddedffeb212859f1b3f008/detectron/modeling/fast_rcnn_heads.py#L116?
And which part of the loss does the loss_box here https://github.com/yihui-he/KL-Loss/blob/0a5617f02b5c0ebc57ddedffeb212859f1b3f008/detectron/modeling/fast_rcnn_heads.py#L154 represent? Why do you set the bbox_outside_weights like this?

AssertionError: Range subprocess failed (exit code: 1)

System information

Operating system: ubuntu16.04
CUDA version: 9.0
pytorch 1.0
1080ti x 1
when I run: python2 tools/test_net.py -c configs/e2e_faster_rcnn_R-50-FPN_2x.yaml
I just get :

Traceback (most recent call last):
  File "/home/user/softer_nms/tools/test_net.py", line 113, in <module>
    check_expected_results=True,
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 128, in run_inference
    all_results = result_getter()
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 125, in result_getter
    gpu_id=gpu_id
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 235, in test_net
    model = initialize_model_from_cfg(weights_file, gpu_id=gpu_id)
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 330, in initialize_model_from_cfg
    model, weights_file, gpu_id=gpu_id,
  File "/home/user/softer_nms/detectron/utils/net.py", line 62, in initialize_gpu_from_weights_file
    src_blobs = load_object(weights_file)
  File "/home/user/softer_nms/detectron/utils/io.py", line 60, in load_object
    return pickle.load(f)
EOFError
Traceback (most recent call last):
  File "tools/test_net.py", line 113, in <module>
    check_expected_results=True,
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 128, in run_inference
    all_results = result_getter()
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 108, in result_getter
    multi_gpu=multi_gpu_testing
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 155, in test_net_on_dataset
    weights_file, dataset_name, proposal_file, num_images, output_dir
  File "/home/user/softer_nms/detectron/core/test_engine.py", line 188, in multi_gpu_test_net_on_dataset
    'detection', num_images, binary, output_dir, opts
  File "/home/user/softer_nms/detectron/utils/subprocess.py", line 95, in process_in_parallel
    log_subprocess_output(i, p, output_dir, tag, start, end)
  File "/home/user/softer_nms/detectron/utils/subprocess.py", line 133, in log_subprocess_output
     assert ret == 0, 'Range subprocess failed (exit code: {})'.format(ret)
AssertionError: Range subprocess failed (exit code: 1)
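
The first traceback ends in an EOFError raised by pickle.load, which is what Python raises when the file ends before a complete object has been read, for example when the weights file is empty or was only partially downloaded. The sketch below is one way to check for that before running inference; `weights_file_is_loadable` is a hypothetical helper, not part of Detectron, and it should only be run on files you trust, since unpickling executes code.

```python
import os
import pickle

def weights_file_is_loadable(path):
    """Return True if `path` exists, is non-empty, and unpickles without error."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    try:
        with open(path, 'rb') as f:
            pickle.load(f)
    except (EOFError, pickle.UnpicklingError):
        # The file ended early or does not contain a valid pickle.
        return False
    return True
```

If the check fails, re-downloading the weights file is a reasonable first step. The second traceback only reports that a worker subprocess exited with an error, so the underlying cause may well be the same failed load.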

How to fix this bug

hello:


To what part in the equations does each part of the loss correspond?

Hi @yihui-he,
I found this issue: #7, where it is mentioned that there are 3 parts to the KL-Loss:

The normal bbox regression loss: loss_bbox (basically the mean of the bbox coordinate prediction)
bbox_pred_std_abs_logw_loss
bbox_pred_std_abs_mulw_loss

I have a couple of questions. Firstly, to which part of which formula in the paper does each of the above correspond?

Similarly, what do bbox_inside_weights, bbox_outside_weights and 'val' (in comments, e.g. line 120) correspond to?

Secondly, I wondered how you backpropagate the gradients from the loss function, as you use the 'StopGradient' function. Do you backpropagate the gradient from all three components through the whole network, or only from the normal bbox regression loss part?

I've never used caffe2 before, so it has taken quite a bit of work to get a feel for the code. As I am trying to implement your work in a (PyTorch) SSD, I want to be sure I do the correct things.

@EternityZY,
I saw you attempted to implement the KL-Loss in YOLOv3. Did you succeed?
As I'm trying to implement the KL-Loss in SSD (a PyTorch version), your YOLOv3 implementation might have some overlap or give some intuition. Would you be willing to share your code?

How to use this repo for detectron2

I have a trained Mask R-CNN model using detectron2. I need to retrain the model from scratch with the KL loss, and I also need the box standard deviations as outputs.
Can you guide me on how to use this repo with the new detectron2?
Thanks.

why scale = -0.5 in model.net.Scale('bbox_pred_std_abs_log_', 'bbox_pred_std_abs_log', scale=-0.5*model.GetLossScale())

https://github.com/yihui-he/KL-Loss/blob/0a5617f02b5c0ebc57ddedffeb212859f1b3f008/detectron/modeling/fast_rcnn_heads.py#L138-L149

It seems the scale should be 0.5 in model.net.Scale('bbox_pred_std_abs_log_', 'bbox_pred_std_abs_log', scale=-0.5*model.GetLossScale()), so that bbox_pred_std_abs_log = 0.5 * log(sigma^2).
This would be the same as the if cfg.PRED_STD_LOG branch above.

By the way, the else branch does not seem to compute bbox_inws_out = e^{-alpha} * smooth_l1.

KL

Could you explain the code below in detail? Thanks.

Code location of KL-Loss

I can't find the definition or usage of the KL-Loss function in the code, and I also cannot find the code that is shown in Issues 2. @yihui-he, @dh562231640, am I missing something? Or have I misunderstood your idea and its implementation? Thank you for your answer.

Is KL loss also useful for one-stage detectors like retinanet?

Hi, thanks for your nice open source project.
I'm doing experiments on retinanet with KL loss using my own customized dataset.
Since you have improved the performance of two-stage detectors like Faster R-CNN and FPN on common datasets like COCO,
I wonder whether KL loss is also helpful for one-stage detectors like retinanet on the COCO dataset.

And does the weight initialization of the uncertainty prediction branch matter? FC layers are used in your two-stage detectors, while conv2d is used in my retinanet experiments.

Thanks a lot.

Lreg always oscillates and does not converge

I tried to use the KL loss in Mask R-CNN. Although the loss was small at first (Lreg = 0.0256), it had barely changed after 50000 training iterations (Lreg = 0.0223). Without the KL loss, the loss is usually 0.0002 after 50000 iterations.
The learning rate is 0.002, and I also tried 0.0002. The learning rate decays at iterations 20000 and 40000, with decay_gamma = 0.1 and max_steps = 50000.
Here is my KL code.

reg_target = self.bbox_transform_inv_xyxy(proposals[0].bbox, coor_target.bbox)
variance = variance[sampled_pos_inds_subset]
kl_loss = torch.exp(-variance)*smooth_l1_loss(box_regression[sampled_pos_inds_subset[:, None], map_inds],
                                              reg_target[sampled_pos_inds_subset], 1/9, KL=True) + variance/2
box_loss = kl_loss.mean(0).sum()*0.1

Thank you for your help!

implementation with KL-loss

I would like to know whether anyone has reproduced the code successfully. I want to apply it to Faster R-CNN (TensorFlow or PyTorch), but I don't know how. Looking forward to your reply!

Is p_i in Eq. 11 not included in the code?

Hi, I'm trying to reproduce your paper in TensorFlow.
It seems that p_i=exp(-(1-iou)^2/sigma) in Eq. 11 is not included in the code. Or did I miss something?
https://github.com/yihui-he/softer-NMS/blob/master/detectron/utils/py_cpu_nms.py#L47

ovr_bbox = np.where((ious[i, :N] > thresh))[0]
avg_std_bbox = (dets[ovr_bbox, :4] / confidence[ovr_bbox]).sum(0) / (1/confidence[ovr_bbox]).sum(0)

Thank you!!
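
The two quoted lines average the overlapping boxes with weights 1/confidence only. The sketch below shows how the p_i factor from the question would enter such a weighted mean. It is an illustration in plain Python, not the repository's implementation: `var_vote`, the per-coordinate `variances` argument and the default `sigma_t=0.02` are all assumptions.

```python
import math

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def var_vote(box, boxes, variances, sigma_t=0.02):
    """Refine `box` as a weighted mean of the boxes that overlap it.

    Each neighbor coordinate gets the weight p_i / variance, where
    p_i = exp(-(1 - IoU)^2 / sigma_t). `sigma_t` is a hypothetical default.
    """
    num = [0.0] * 4
    den = [0.0] * 4
    for b, var in zip(boxes, variances):
        overlap = iou(box, b)
        if overlap <= 0.0:
            continue  # only overlapping boxes vote
        p = math.exp(-((1.0 - overlap) ** 2) / sigma_t)
        for k in range(4):
            w = p / var[k]
            num[k] += w * b[k]
            den[k] += w
    if any(d == 0.0 for d in den):
        return list(box)  # nothing overlapped: keep the box unchanged
    return [n / d for n, d in zip(num, den)]
```

With `box` included in `boxes`, its own p_i is 1, so distant or uncertain neighbors move the result less than close, confident ones.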

How to output Box and Box std?

Hello @yihui-he, I'm reading your paper and am confused about how the model outputs Box and Box std. As Figure 3 in your paper shows,

Is the Box std generated by adding another fully connected layer to the model? Beyond that, have you added a dropout operation to your model to produce uncertainty?
Thanks in advance.

What's the scale of two loss when training KL loss

Sorry to bother you. I got confused while reading the source code to find this, so I am asking directly.
During training, total_loss = cls_loss + a*reg_loss (KL loss).
What value of a did you choose for training?

KL-Loss very large

The KL loss has three parts: bbox_pred_std_abs_mulw_loss, bbox_pred_std_abs_logw_loss, and loss_bbox.

When I add all of them, bbox_pred_std_abs_logw_loss becomes a very large negative number, resulting in a final loss of NaN. If only loss_bbox is optimized, then the log loss becomes a very large positive number, making the final loss_bbox almost zero. I can reproduce the KL loss computation in code, but how do you actually train with it? Can you help me?

Thank you!
