sfzhang15 / refinedet

Single-Shot Refinement Neural Network for Object Detection, CVPR, 2018

License: Other

CMake 2.24% Makefile 0.57% Shell 0.32% Python 12.77% C++ 77.05% Cuda 6.05% MATLAB 0.72% C 0.22% Dockerfile 0.06%

object-detection

refinedet's Issues

FPS of RefineDet

Nice job! Have you tested the FPS and number of boxes of RefineDet512 (ResNet-101), @sfzhang15?

RefineDet_root/examples does not include a CMakeLists.txt

Hi, when I build this repo, the line "add_subdirectory(examples)" in RefineDet_root/CMakeLists.txt requires a CMakeLists.txt in RefineDet_root/examples, which is missing. Should I comment this line out?

Have you tried using DenseNet as the base network?

DenseNet may achieve a higher mAP than VGG and Res101.
If you have done this, would you mind sharing a Densenet_VOC2007_320.py?
Thanks!

Error in VOC0712Plus/create_list.sh

When I run ./data/VOC0712Plus/create_list.sh, I get this error:

./data/VOC0712/create_list.sh and ./data/VOC0712/create_data.sh both work fine.

No NCCL support

There is an NCCL option in the config file, but is NCCL actually not supported?

What's the difference between ARM and RPN?

Hi,
Can you tell me the difference between ARM and RPN?

This error happens when training with finetune_VGG16_VOC2012_320.py

@sfzhang15
Hi, my friend also tried this. He used 320x320, following the steps I used to train 512x512, but he ran into the problem below. Any help would be appreciated.

I0128 22:00:27.281307 5775 layer_factory.hpp:77] Creating layer arm_conf_flatten
I0128 22:00:27.281316 5775 net.cpp:100] Creating Layer arm_conf_flatten
I0128 22:00:27.281322 5775 net.cpp:434] arm_conf_flatten <- arm_conf_softmax
I0128 22:00:27.281328 5775 net.cpp:408] arm_conf_flatten -> arm_conf_flatten
I0128 22:00:27.281378 5775 net.cpp:150] Setting up arm_conf_flatten
I0128 22:00:27.281399 5775 net.cpp:157] Top shape: 8 12750 (102000)
I0128 22:00:27.281401 5775 net.cpp:165] Memory required for data: 2383611300
I0128 22:00:27.281406 5775 layer_factory.hpp:77] Creating layer odm_loss
I0128 22:00:27.281430 5775 net.cpp:100] Creating Layer odm_loss
I0128 22:00:27.281433 5775 net.cpp:434] odm_loss <- odm_loc
I0128 22:00:27.281440 5775 net.cpp:434] odm_loss <- odm_conf
I0128 22:00:27.281443 5775 net.cpp:434] odm_loss <- arm_priorbox_arm_priorbox_0_split_1
I0128 22:00:27.281447 5775 net.cpp:434] odm_loss <- label_data_1_split_1
I0128 22:00:27.281450 5775 net.cpp:434] odm_loss <- arm_conf_flatten
I0128 22:00:27.281453 5775 net.cpp:434] odm_loss <- arm_loc_arm_loc_0_split_1
I0128 22:00:27.281460 5775 net.cpp:408] odm_loss -> odm_loss
I0128 22:00:27.281520 5775 layer_factory.hpp:77] Creating layer odm_loss_smooth_L1_loc
I0128 22:00:27.281627 5775 layer_factory.hpp:77] Creating layer odm_loss_softmax_conf
I0128 22:00:27.281647 5775 layer_factory.hpp:77] Creating layer odm_loss_softmax_conf
F0128 22:00:27.281867 5775 multibox_loss_layer.cpp:141] Check failed: num_priors_ * num_classes_ == bottom[1]->channels() (133875 vs. 12750) Number of priors must match number of confidence predictions.
*** Check failure stack trace: ***

batch_size / accum_batch_size

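@sfzhang15 Hi
For example, when running ResNet101_COCO_320.py, hardware limits mean batch_size can only be set to 4, but accum_batch_size = 32, so iter_size = 8.
In other words, whatever batch_size is, iter_size keeps accum_batch_size at 32, so one gradient descent step is always performed after processing 32 images.

Can we then conclude the following?

As long as accum_batch_size is unchanged, batch_size, large or small, does not change the training result; it only changes the training speed.
BN layers need a sufficiently large batch size. Does that batch size refer to 'batch_size(4)' or 'accum_batch_size(32)'?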
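
The relationship between the three values in this question can be sketched as follows. This is a minimal illustration, assuming iter_size is the ceiling of accum_batch_size / batch_size; the function name is hypothetical and not taken from the repository.

```python
import math


def compute_iter_size(accum_batch_size, batch_size):
    """Number of forward/backward passes accumulated before one weight update."""
    if batch_size <= 0 or accum_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return int(math.ceil(float(accum_batch_size) / batch_size))


# Values from the question: batch_size = 4, accum_batch_size = 32.
iter_size = compute_iter_size(32, 4)
print(iter_size, iter_size * 4)  # 8 32
```

In stock Caffe, BatchNorm statistics are computed on each forward pass, so they would see the per-pass batch rather than the accumulated one; whether that is the whole story for this repository's configuration is not confirmed here.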

Failed to upload the results for test-dev2017

Hi, have you tried uploading results to the COCO evaluation server recently?

I tried to upload results for test-dev2017 (same as test-dev2015).

Using your refinedet_test.py, I got a result JSON file (1.33 GB).

However, I couldn't upload the result file (.zip) to the server, perhaps because the file is too large.

Could you advise me on how to reduce the size of the JSON file?
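
One possible way to shrink such a file is sketched below. It assumes the standard COCO detection results layout (a list of dicts with image_id, category_id, bbox, and score); the function name, the score threshold, and the rounding precision are illustrative assumptions, and dropping low-score detections can slightly change the reported AP.

```python
import json


def shrink_detections(detections, min_score=0.01, ndigits=2):
    """Drop low-score detections and round floats to cut the JSON size."""
    kept = []
    for det in detections:
        if det["score"] < min_score:
            continue
        kept.append({
            "image_id": det["image_id"],
            "category_id": det["category_id"],
            "bbox": [round(v, ndigits) for v in det["bbox"]],
            "score": round(det["score"], 3),
        })
    return kept


sample = [
    {"image_id": 1, "category_id": 18,
     "bbox": [10.123456, 20.987654, 30.5, 40.25], "score": 0.91234},
    {"image_id": 1, "category_id": 3,
     "bbox": [0.0, 0.0, 5.0, 5.0], "score": 0.001},
]
small = shrink_detections(sample)
# Compact separators avoid the spaces json.dumps inserts by default.
text = json.dumps(small, separators=(",", ":"))
print(len(small), len(text) < len(json.dumps(sample)))
```

Rounding and compact separators only change the serialization, while the score threshold removes entries, so the two can be applied independently.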

draw_net.py doesn't work

I tried draw_net.py to visualize net structure, but it doesn't work.

> python2 ./draw_net.py ../src/caffe/proto/caffe.proto graph.png
Traceback (most recent call last):
  File "./draw_net.py", line 58, in <module>
    main()
  File "./draw_net.py", line 44, in main
    text_format.Merge(open(args.input_net_proto_file).read(), net)
  File "/usr/local/lib/python2.7/dist-packages/protobuf-3.5.1-py2.7.egg/google/protobuf/text_format.py", line 533, in Merge
    descriptor_pool=descriptor_pool)
  File "/usr/local/lib/python2.7/dist-packages/protobuf-3.5.1-py2.7.egg/google/protobuf/text_format.py", line 587, in MergeLines
    return parser.MergeLines(lines, message)
  File "/usr/local/lib/python2.7/dist-packages/protobuf-3.5.1-py2.7.egg/google/protobuf/text_format.py", line 620, in MergeLines
    self._ParseOrMerge(lines, message)
  File "/usr/local/lib/python2.7/dist-packages/protobuf-3.5.1-py2.7.egg/google/protobuf/text_format.py", line 635, in _ParseOrMerge
    self._MergeField(tokenizer, message)
  File "/usr/local/lib/python2.7/dist-packages/protobuf-3.5.1-py2.7.egg/google/protobuf/text_format.py", line 703, in _MergeField
    (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 1:1 : Message type "caffe.NetParameter" has no field named "syntax".

I would appreciate it if anyone could help resolve this error. Thanks.
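
The parser fails at position 1:1 on a field named "syntax", which is consistent with the input being the caffe.proto schema rather than a network definition (.prototxt). A small pre-check like the one below could catch this before parsing. It is only a heuristic sketch; the function name is hypothetical and not part of draw_net.py.

```python
def looks_like_proto_schema(text):
    """Heuristic: .proto schemas open with 'syntax', 'package' or 'message'."""
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments in either file type.
        if not stripped or stripped.startswith("//") or stripped.startswith("#"):
            continue
        return stripped.startswith(("syntax", "package", "message"))
    return False


schema_head = 'syntax = "proto2";\npackage caffe;'
net_head = 'name: "example_net"\nlayer {\n  name: "data"\n}'
print(looks_like_proto_schema(schema_head), looks_like_proto_schema(net_head))
# True False
```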

The relative magnitude of the two location losses

Hi, Mr. Zhang. Wonderful work!
I am curious about the two location losses in the ARM and the ODM. Is there any relationship between their magnitudes? For example, the localization in the ARM is coarser than that of the ODM, so perhaps the location loss of the ODM should be smaller than that of the ARM (without considering their predictions). I displayed these two losses and found that this was not the case: either term is larger than the other with almost the same chance (or the ODM location loss is larger). I also observed that the number of positive matches of the ODM is almost always larger than that of the ARM.

In my opinion, the differences between these two losses come from:
1. different predictions due to different features and initialization, and the resulting numbers of positive and negative matches
2. the ODM receiving filtered and refined boxes from the ARM as its prior boxes

How to improve training speed?

I think your work is really great. I tried running your code on 4 Tesla V100 GPUs; VGG16_VOC2007_512.py runs at 1.2 seconds/iter, so the full 120k iterations may take around 40 hours.

Do you have any ideas for improving the speed?

I think the connections between TCBs may slow the whole model down. I assume your idea was inspired by FPN. Did you try removing the connections between TCBs? The whole model would then be more parallel. (I know this may hurt the performance...)
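
The 40-hour estimate follows directly from the two numbers quoted above. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def training_hours(seconds_per_iter, num_iters):
    """Total wall-clock training time in hours."""
    return seconds_per_iter * num_iters / 3600.0


# 1.2 s/iter over 120k iterations, as reported in the question.
print("%.1f hours" % training_hours(1.2, 120000))  # 40.0 hours
```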

detection_output_layer is only supported on CPU

Hi @sfzhang15, I found that DetectionOutput is only supported on the CPU, since there is no detection_output_layer.cu. At 512x512 input, the FPS of RefineDet is higher than that of SSD (24 vs. 19). Might it be even faster if this layer were moved to the GPU?

About the GPU memory

I switched from multiple GPUs to a single one and ran the script examples/refinedet/VGG16_VOC2007_320.py on a GTX 1080 Ti (11 GB), but it failed with an out-of-memory error. Does it need more than 11 GB of GPU memory?

Unable to compile with `make all`, protoc error

I get the following error when running make all:

PROTOC src/caffe/proto/caffe.proto
CXX .build_release/src/caffe/proto/caffe.pb.cc
In file included from .build_release/src/caffe/proto/caffe.pb.cc:5:0:
.build_release/src/caffe/proto/caffe.pb.h:12:2: error: #error This file was generated by a newer version of protoc which is
 #error This file was generated by a newer version of protoc which is
  ^
.build_release/src/caffe/proto/caffe.pb.h:13:2: error: #error incompatible with your Protocol Buffer headers. Please update
 #error incompatible with your Protocol Buffer headers.  Please update
  ^
.build_release/src/caffe/proto/caffe.pb.h:14:2: error: #error your headers.
 #error your headers.
  ^
In file included from .build_release/src/caffe/proto/caffe.pb.cc:5:0:
.build_release/src/caffe/proto/caffe.pb.h:23:35: fatal error: google/protobuf/arena.h: No such file or directory
compilation terminated.
Makefile:582: recipe for target '.build_release/src/caffe/proto/caffe.pb.o' failed
make: *** [.build_release/src/caffe/proto/caffe.pb.o] Error 1

I think it is because I have installed an incompatible version of protoc.

I have updated my $PYTHONPATH.

System is running Ubuntu 16.04, GTX 1080.

Removed detection_output_layer.cu?

Hi, do you plan to implement this in the future?

This problem occurred during training

@sfzhang15

I0119 17:40:49.416178 12387 solver.cpp:243] Iteration 380, loss = 7.68249
I0119 17:40:49.416339 12387 solver.cpp:259] Train net output #0: arm_loss = 3.80381 (* 1 = 3.80381 loss)
I0119 17:40:49.416348 12387 solver.cpp:259] Train net output #1: odm_loss = 4.65142 (* 1 = 4.65142 loss)
I0119 17:40:49.416353 12387 sgd_solver.cpp:138] Iteration 380, lr = 5e-05
OpenCV Error: Assertion failed ((scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F)) in cvtColor, file /home/qi/package/software_package/opencv-3.2.0/modules/imgproc/src/color.cpp, line 9815
terminate called after throwing an instance of 'cv::Exception'
what(): /home/qi/package/software_package/opencv-3.2.0/modules/imgproc/src/color.cpp:9815: error: (-215) (scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F) in function cvtColor

*** Aborted at 1516354852 (unix time) try "date -d @1516354852" if you are using GNU date ***
PC: @ 0x7f7e19d82428 gsignal
*** SIGABRT (@0x3e800003063) received by PID 12387 (TID 0x7f7de6e13700) from PID 12387; stack trace: ***
@ 0x7f7e19d824b0 (unknown)
@ 0x7f7e19d82428 gsignal
@ 0x7f7e19d8402a abort
@ 0x7f7e1ab9584d __gnu_cxx::__verbose_terminate_handler()
@ 0x7f7e1ab936b6 (unknown)
@ 0x7f7e1ab93701 std::terminate()
@ 0x7f7e1ab93919 __cxa_throw
@ 0x7f7e11097202 cv::error()
@ 0x7f7e11097393 cv::error()
@ 0x7f7e0f69e43e cv::cvtColor()
@ 0x7f7e1be8a86c caffe::AdjustHue()
@ 0x7f7e1be8f0ab caffe::RandomHue()
@ 0x7f7e1be8fd44 caffe::ApplyDistort()
@ 0x7f7e1bc9c3b2 caffe::DataTransformer<>::DistortImage()
@ 0x7f7e1bcf3e86 caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7f7e1bd1a01f caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@ 0x7f7e1bc91345 caffe::InternalThread::entry()
@ 0x7f7e0eaa95d5 (unknown)
@ 0x7f7e0e3576ba start_thread
@ 0x7f7e19e5382d clone
@ 0x0 (unknown)

How about RefineDet_Res101_VOC results?

Hi, I think RefineDet achieves very impressive results.

I wonder about the accuracy and FPS of RefineDet_Res101_VOC_512.

Have you tried it?

NameError: global name 'DeconvBNLayer' is not defined

Compilation, the dataset, and the pretrained model are all ready. When I run python examples/refinedet/VGG16_VOC2007_320.py, I get:
Traceback (most recent call last):
File "examples/refinedet/VGG16_VOC2007_320.py", line 430, in <module>
AddExtraLayers(net, use_batchnorm, arm_source_layers, normalizations, lr_mult=lr_mult)
File "examples/refinedet/VGG16_VOC2007_320.py", line 61, in AddExtraLayers
DeconvBNLayer(net, from_layer, out_layer, use_batchnorm, False, 256, 2, 0, 2, lr_mult=lr_mult)
NameError: global name 'DeconvBNLayer' is not defined

Can not download the model in China

The model files do not seem to be downloadable from within China. Could you provide a download link that is not blocked? Thanks.

How to measure FPS?

Hi,

How do you measure the inference speed?

To my knowledge, the SSD author measured FPS from the training log file, which shows the training loss and evaluation results.
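
A generic way to time inference is sketched below. It is not the method used by the RefineDet or SSD authors; `measure_fps` and its parameters are hypothetical, the workload is a stand-in, and the sketch assumes Python 3 and a forward call that blocks until its results are ready.

```python
import time


def measure_fps(run_inference, num_warmup=10, num_runs=100):
    """Average frames per second of a zero-argument callable."""
    for _ in range(num_warmup):
        run_inference()  # warm-up runs are excluded from timing
    start = time.perf_counter()
    for _ in range(num_runs):
        run_inference()
    elapsed = time.perf_counter() - start
    return num_runs / elapsed


# Stand-in workload; replace it with a call that runs the network
# on one preloaded image.
fps = measure_fps(lambda: sum(range(10000)), num_warmup=2, num_runs=20)
print("%.1f FPS" % fps)
```

Timing only the forward pass on preloaded data keeps disk I/O and image decoding out of the measurement; whether post-processing such as NMS is included should be stated alongside any reported number.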

Training and testing on my own dataset

Hi, thank you for your great work!
I am training on my own dataset for a two-class detection task. I think I have successfully obtained the trained prototxt and caffemodel, but I don't know how to run detection on the test set, and I am not sure whether test/refinedet_test.py saves a JSON file or draws bounding boxes. Also, refinedet_test.py is written only for VOC and COCO.
So, could you advise on how to generate detection results from a trained caffemodel and a dataset?

RefineDet_Res101_COCO512 results on test-dev-2017

Hi,

I have been researching backbone networks for object detectors using your wonderful RefineDet.

So I replaced the Res101 backbone with my network in RefineDet and evaluated the performance on COCO test-dev.

For a fair comparison, I followed the training details in your code.

But my network reached 0.361 AP on COCO test-dev, which is slightly lower than your Res101 model.

So I downloaded your model and evaluated it on COCO test-dev using the single_test_deploy of your final model.

The COCO evaluation server reported your model's performance as follows.

These results are not the same as in your paper, so I would like to know the exact settings behind your paper's results.

why arm_loss = 0 and odm_loss = 0?

Hi, Zhang,
When I fine-tune the VOC & COCO model on the WIDER FACE dataset, arm_loss and odm_loss sometimes come out as 0, while the total loss is not zero.
Iteration 1030, loss = 8.67766
I0515 18:38:45.768995 29596 solver.cpp:259] Train net output #0: arm_loss = 0 (* 1 = 0 loss)
I0515 18:38:45.769006 29596 solver.cpp:259] Train net output #1: odm_loss = 0 (* 1 = 0 loss)
Have you seen this before, and what causes it?
Thanks!

Could you share your train log?

@sfzhang15 Hi! Thanks for your great work! When I use your code to train Res101 on COCO, the training loss is very high (both ARM and ODM); the total loss is always around 10 (the learning rate is 0.001). Is this normal?

Number of boxes for YOLO v2

Hi, in YOLO v2:
416x416 input: boxes = (416/32)x(416/32)x5 = 13x13x5 = 845
544x544 input: boxes = (544/32)x(544/32)x5 = 17x17x5 = 1445
So in Readme.md, the number of boxes for YOLO v2 544x544 should be 1445.
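
The arithmetic above can be checked with a few lines. The stride of 32 and the 5 boxes per cell come from the question; the function name is hypothetical.

```python
def yolo_v2_num_boxes(input_size, stride=32, anchors_per_cell=5):
    """Boxes predicted on a single square feature map."""
    cells = input_size // stride
    return cells * cells * anchors_per_cell


for size in (416, 544):
    print(size, yolo_v2_num_boxes(size))
# 416 845
# 544 1445
```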

Are ARM and ODM both based on prior_bboxes in function CasRegFindMatches?

    According to the paper, the ODM localization should be based on the ARM result, but in CasRegFindMatches():

    if (!use_prior_for_matching) {
      for (int c = 0; c < loc_classes; ++c) {
        int label = share_location ? -1 : c;
        if (!share_location && label == background_label_id) {
          // Ignore background loc predictions.
          continue;
        }
        // Decode the prediction into bbox first.
        vector<NormalizedBBox> loc_bboxes;
        bool clip_bbox = false;
        DecodeBBoxes(prior_bboxes, prior_variances,
                     code_type, encode_variance_in_target, clip_bbox,
                     all_loc_preds[i].find(label)->second, &loc_bboxes);
        MatchBBox(gt_bboxes, loc_bboxes, label, match_type,
                  overlap_threshold, ignore_cross_boundary_bbox,
                  &match_indices[label], &match_overlaps[label]);
      }

The MultiBoxLoss layer of the ODM module seems to be based on prior_bboxes instead of prior_bboxes + arm_loc, because no ARM parameters are passed to DecodeBBoxes()?

normalize_layer

In the normalize_layer.cpp,

RefineDet/src/caffe/layers/normalize_layer.cpp

Lines 109 to 115 in 592bbc1

    
    caffe_cpu_gemv<Dtype>(CblasTrans, channels, spatial_dim, Dtype(1),
                          buffer_data, sum_channel_multiplier, Dtype(1),
                          norm_data);
    // compute norm
    caffe_powx<Dtype>(spatial_dim, norm_data, Dtype(0.5), norm_data);
    // scale the layer
    caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, channels, spatial_dim,

This code does not look like a standard L2 norm: it only sums across the channels and then takes the square root. Why not use a standard L2 norm?
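
One reading of the quoted excerpt is sketched below. It assumes that buffer_data already holds the element-wise squares of the input, a step that is not shown in the quoted lines. Under that assumption, the gemv call sums over channels at each spatial position and caffe_powx with exponent 0.5 takes the square root, which is the per-position L2 norm across channels. The Python names are hypothetical.

```python
import math


def channel_l2_norm(feature):
    """feature[c][s] is the value of channel c at spatial position s.

    Returns one norm per spatial position: the square root of the sum of
    squares over channels.
    """
    num_positions = len(feature[0])
    sums = [0.0] * num_positions
    for channel in feature:
        for s, value in enumerate(channel):
            sums[s] += value * value  # assumed squaring step before the sum
    return [math.sqrt(total) for total in sums]


feature = [[3.0, 1.0], [4.0, 2.0], [0.0, 2.0]]  # 3 channels, 2 positions
print(channel_l2_norm(feature))  # [5.0, 3.0]
```

If the squaring step is indeed present upstream, the difference from a "standard" L2 norm is only the axis: the norm is taken across channels at each location rather than over the whole tensor.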

Just Train Network for Person Detection

Dear @sfzhang15,
Thank you for your work. I have a general question. The VOC and COCO datasets contain many classes (e.g., 20 or 80 classes such as person, car, and horse) for object detection. If one trains the network for just one class (e.g., only the person class), will the accuracy (mAP) of person detection increase?

Merge to my own caffe

Hi @sfzhang15 , I've trained your method on my own dataset and it works pretty well. Now that I'd like to merge the related code to my own Caffe, do you know what layers/files I need to copy/modify? As far as I can tell, there is no new layer used during training/testing, and a new "objectness_score" parameter is added in src/caffe/proto/caffe.proto (line 1038 and 1319)

Is the training process the same as for Faster R-CNN?

And is the following step necessary? Thanks.
3. Create the LMDB file.

You can modify the parameters in create_data.sh if needed.

It will create lmdb files for trainval and test with encoded original image:

- $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb

- $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb

and make soft links at examples/VOC0712/

cd $RefineDet_ROOT
./data/VOC0712/create_data.sh

multi-scale training/testing

How should I understand this sentence: "The multi-scale testing is used to reduce the impact of input size for a fair comparison"?
Why can multi-scale testing reduce the impact of input size?

Another question: how is multi-scale testing/training implemented?

1. Train at all the scales, so you have to train/test many times; or
2. Randomly select a scale from multiple scales, so you only need to train/test once?

Does it require Nvidia GPU to run?

It seems your code requires an Nvidia GPU to run. Is there a way to run it without a GPU?

Thanks,

Layer weight copy mismatch

Hi @sfzhang15 , I tried to finetune my own model with the scripts generated by examples/refinedet/finetune_VGG16_VOC2012_320.py and models/VGGNet/VOC0712/refinedet_vgg16_320x320_coco/coco_refinedet_vgg16_320x320.caffemodel. But when I run the training with train.prototxt, it said:

Cannot copy param 0 weights from layer 'P3_mbox_conf'; shape mismatch. Source param shape is 63 256 3 3 (145152); target param shape is 6 256 3 3 (13824). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

Do you know what's wrong? Thank you!
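
The two shapes in the message differ only in the number of output channels, 63 versus 6. One plausible reading, assuming 3 anchors per location, is 3 x 21 classes in the saved model versus 3 x 2 classes in the new network. The sketch below illustrates that arithmetic; the helper names are hypothetical and this is not Caffe's own check.

```python
def conf_num_output(anchors_per_location, num_classes):
    """Output channels of a confidence layer under the stated assumption."""
    return anchors_per_location * num_classes


def shapes_compatible(source_shape, target_shape):
    """Weights can only be copied when every dimension matches."""
    return tuple(source_shape) == tuple(target_shape)


source = (conf_num_output(3, 21), 256, 3, 3)
target = (conf_num_output(3, 2), 256, 3, 3)
print(source, target, shapes_compatible(source, target))
# (63, 256, 3, 3) (6, 256, 3, 3) False
```

As the error text itself suggests, renaming the mismatched layers makes Caffe initialize them from scratch instead of copying the saved weights.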

Installation instructions unclear

The 1st step of installation says:

Get the code. We will call the directory that you cloned Caffe into $RefineDet_ROOT.

I am not able to understand if the sentence is to be parsed as "We will call [the directory that you cloned Caffe into] $RefineDet_ROOT" or that RefineDet will call the directory where caffe is installed, or that one has to clone Caffe somewhere within RefineDet's repo.

mxnet

I have a slightly odd question: why are you still using Caffe instead of, for example, MXNet? Are there any advantages?

very very very very good job!!!

Thanks for your great work! I think it is a significant advance for one-stage methods.
My question is: why use only three anchors [1, 1/2, 2]?
Did you try more anchors?
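
For context, the three values can be read as aspect ratios (width / height). The sketch below shows how such ratios turn into anchor shapes, assuming the SSD-style convention width = s * sqrt(r) and height = s / sqrt(r). The base size of 32 is an arbitrary illustration, not a value confirmed for any particular layer, and the function name is hypothetical.

```python
import math


def anchor_shapes(base_size, aspect_ratios=(1.0, 0.5, 2.0)):
    """(width, height) pairs with equal area for each aspect ratio."""
    shapes = []
    for ratio in aspect_ratios:
        width = base_size * math.sqrt(ratio)
        height = base_size / math.sqrt(ratio)
        shapes.append((width, height))
    return shapes


for w, h in anchor_shapes(32):
    print("%.1f x %.1f" % (w, h))
# 32.0 x 32.0
# 22.6 x 45.3
# 45.3 x 22.6
```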

source code comparing with original ssd

Hi, I found no new layers compared with the original SSD in the prototxt, so I wonder whether I could use the Caffe built from https://github.com/weiliu89/caffe/tree/ssd/examples/ssd for experiments with this repo. Thanks.

train.prototxt

@sfzhang15 HI

A few questions about the ResNet prototxt (all excerpts below are from the train.prototxt in your refinedet_resnet101_320x320):
1.
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
convolution_param {
num_output: 64
bias_term: false
pad: 3
kernel_size: 7
stride: 2
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
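When fine-tuning, is the weight_filler parameter in convolution_param effectively unused, since the weights are overwritten by the pretrained model and need no initialization? Is my understanding correct?

2.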
layer {
name: "bn_conv1"
type: "BatchNorm"
bottom: "conv1"
top: "conv1"
}

For the BatchNorm layer, many examples online are configured as follows:
layer {
bottom: "res5c_branch2b"
top: "res5c_branch2b"
name: "bn5c_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
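}
1) Why are these three default param blocks absent in your version?
2) Isn't use_global_stats supposed to be false during training? Why do many examples set it to true?

(Originally written in Chinese for precision; sorry for the trouble!)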

Training NAN

Hi, Zhang
I am using the RefineDet architecture to train on WIDER FACE; the pretrained model is VGG.
Around iteration 40, the loss becomes NaN.
The loss at the first iteration is about 10 or more.
I can't figure out the reason. Could you give a suggestion?
Thank you!

Train in own dataset

Thank you for your great repo.
However, I would like more instructions on how to train on my own dataset. How should the dataset files be structured? I have images and annotations in .xml files, and it is just a two-class detection task.

Thank you in advance.

F0325 db_lmdb.hpp:15] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory

When I run ssd_pascal.py to train on VOC, I get this error:
I0325 22:45:37.875373 3664 layer_factory.hpp:77] Creating layer data
I0325 22:45:37.882928 3664 net.cpp:100] Creating Layer data
I0325 22:45:37.883014 3664 net.cpp:408] data -> data
I0325 22:45:37.883144 3664 net.cpp:408] data -> label
F0325 22:45:37.885095 3685 db_lmdb.hpp:15] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
@ 0x7f022621c5cd google::LogMessage::Fail()
@ 0x7f022621e433 google::LogMessage::SendToLog()
@ 0x7f022621c15b google::LogMessage::Flush()
@ 0x7f022621ee1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f0226ab42f0 caffe::db::LMDB::Open()
@ 0x7f0226b216d6 caffe::DataReader<>::Body::InternalThreadEntry()
@ 0x7f02268ed725 caffe::InternalThread::entry()
@ 0x7f021a5c15d5 (unknown)
@ 0x7f0214c006ba start_thread
@ 0x7f022527341d clone
@ (nil) (unknown)
Aborted (core dumped)
And when I try to open VOC0712_test_lmdb, I get this message:

The link "VOC0712_test_lmdb" is broken,Move it to Trash?
This link cannot be used because its target “/home/chopin/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb” doesn't exist.

I have been stuck on this problem for a few days. I found many similar questions suggesting fixes such as editing the paths in the prototxt files, but the problem still occurs. I can run SSD on pictures, webcam, and video normally. Please help me, thanks!
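
The file-manager message indicates that the symbolic link exists but its target does not, which matches the "No such file or directory" failure when LMDB tries to open it. A small check like the following could list such links. The function name is hypothetical, and the directory examples/VOC0712 is taken from the setup instructions quoted elsewhere on this page.

```python
import os


def find_broken_links(directory):
    """Return (link, target) pairs for symlinks whose targets do not exist."""
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # islink is True for any symlink; exists follows the link.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append((path, os.readlink(path)))
    return broken


if os.path.isdir("examples/VOC0712"):
    for link, target in find_broken_links("examples/VOC0712"):
        print("broken: %s -> %s" % (link, target))
```

If a link is reported, the LMDB it points to was probably never created or was created under a different path, so rerunning the data creation script with the correct data root is the usual next step.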

Questions about two-stage regression

Hi, the ODM and ARM use the same features for anchor regression, so I think it's the same as using only the ODM for regression. I can understand that in Faster R-CNN the regression in the RPN is useful because it changes the features for the second-stage regression. But in this work, the two-stage regression looks like two independent regressions whose results are merged at the end. Do you have any comments on why this two-stage regression is better than using only the ODM for regression? Thanks.

about odm

I just want to ask how many refined anchor boxes are passed from the ARM to the ODM. For example, RefineDet320 has 6375 boxes in the ARM stage. In this case, how many refined anchors are left?
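
The 6375 figure can be reproduced with the sketch below, assuming four feature maps at strides 8, 16, 32, and 64 with 3 anchors per location. Those settings are assumptions that happen to match the quoted number; the function name is hypothetical.

```python
def num_anchors(input_size, strides=(8, 16, 32, 64), anchors_per_location=3):
    """Total anchors over all feature maps for a square input."""
    total = 0
    for stride in strides:
        side = input_size // stride
        total += side * side * anchors_per_location
    return total


print(num_anchors(320))  # 6375
print(num_anchors(512))  # 16320
```

This only gives the number of anchors before any filtering; how many remain after the ARM's filtering is not fixed by this arithmetic.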

How to evaluate VOC2012 test?

Hi,

Recently, I have been researching backbone networks and replaced Res101 in your RefineDet with my own network.

I want to upload VOC2012 test results. Using refinedet_test.py, I got the detections.pkl file.

However, I don't know how to handle that .pkl file.

Could you let me know how to upload VOC2012 test results?
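
The VOC evaluation server expects one plain-text file per class, with one detection per line in the form "image_id score xmin ymin xmax ymax". The sketch below formats such lines. The internal layout of the .pkl file is not known from this page, so the input structure here (one list of (xmin, ymin, xmax, ymax, score) tuples per image) is a hypothetical stand-in, as is the function name.

```python
def format_voc_lines(image_ids, boxes_per_image):
    """Build '<image_id> <score> <xmin> <ymin> <xmax> <ymax>' lines for one class."""
    lines = []
    for image_id, boxes in zip(image_ids, boxes_per_image):
        for xmin, ymin, xmax, ymax, score in boxes:
            lines.append("%s %.3f %.1f %.1f %.1f %.1f"
                         % (image_id, score, xmin, ymin, xmax, ymax))
    return lines


image_ids = ["2008_000001", "2008_000002"]
boxes = [[(10.0, 20.0, 110.0, 220.0, 0.95)], []]  # second image: no detections
for line in format_voc_lines(image_ids, boxes):
    print(line)
# 2008_000001 0.950 10.0 20.0 110.0 220.0
```

Each class would be written to its own file (the development kit names them like comp4_det_test_<class>.txt), and the set of files is then archived for upload.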

Thanks in advance.

parameter setting in multiscale testing code

@sfzhang15 , Hi

I have read your multiscale testing code and have a question about parameter setting.

In function multi_scale_test_net_320(), 6 different input scales (0.6x, 1.2x, 1.4x, 1.6x, 1.8x, 2.2x) are used for testing. After detection, the following code is used to refine the outputs:

...
# for 0.6x input size
index = np.where(np.maximum(det1[:, 2] - det1[:, 0] + 1, det1[:, 3] - det1[:, 1] + 1) > 32)[0]
...
# for 1.2x input size
index = np.where(np.minimum(det2[:, 2] - det2[:, 0] + 1, det2[:, 3] - det2[:, 1] + 1) < 160)[0]
...
# for 1.4x input size
index = np.where(np.minimum(det3[:, 2] - det3[:, 0] + 1, det3[:, 3] - det3[:, 1] + 1) < 128)[0]
...
# for 1.6x input size
index = np.where(np.minimum(det4[:, 2] - det4[:, 0] + 1, det4[:, 3] - det4[:, 1] + 1) < 96)[0]
...
# for 1.8x input size
index = np.where(np.minimum(det5[:, 2] - det5[:, 0] + 1, det5[:, 3] - det5[:, 1] + 1) < 64)[0]
...
# for 2.2x input size
index = np.where(np.minimum(det7[:, 2] - det7[:, 0] + 1, det7[:, 3] - det7[:, 1] + 1) < 32)[0]
...

In addition, some detection outputs are not filtered in this way in function multi_scale_test_net_512().

I wonder how these bbox size thresholds and refinement strategies were chosen. Thanks.
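
The pattern in the quoted lines appears to be: detections from the shrunken input (0.6x) are kept only if their longer side exceeds a threshold, while detections from enlarged inputs are kept only if their shorter side is below a threshold. A plain-Python stand-in for those np.where calls is sketched below; the function name and the mode strings are hypothetical.

```python
def keep_by_size(dets, threshold, mode):
    """Filter [xmin, ymin, xmax, ymax, score] rows by box side length.

    mode="max_gt": keep boxes whose longer side exceeds `threshold`.
    mode="min_lt": keep boxes whose shorter side is below `threshold`.
    """
    kept = []
    for det in dets:
        width = det[2] - det[0] + 1
        height = det[3] - det[1] + 1
        if mode == "max_gt" and max(width, height) > threshold:
            kept.append(det)
        elif mode == "min_lt" and min(width, height) < threshold:
            kept.append(det)
    return kept


dets = [[0, 0, 19, 19, 0.9],    # 20 x 20 box
        [0, 0, 199, 99, 0.8]]   # 200 x 100 box
print(len(keep_by_size(dets, 32, "max_gt")))  # 1 (only the large box)
print(len(keep_by_size(dets, 64, "min_lt")))  # 1 (only the small box)
```

The intuition would be that each scale is trusted only for the object sizes it handles well, but how the specific thresholds were selected is exactly what the question asks and is not answered by this sketch.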

How do I define arm/odm_source_layers for other networks?

Hi, how do I choose *_source_layers for other networks (like deepnet, inception, etc.)?

What batch size you use for using ResNet-101?

Hi,

When I set the batch size to 20 in your code, I got an out-of-memory error on TITAN X GPUs (a 4-GPU machine).

Perhaps a batch of 5 per GPU does not fit in memory.

How did you set the batch size, and what was your training environment for 512? Can I reproduce the reported result (36.4%) on COCO512 by following your code settings, such as multi_step_value: [400k, 480k, 540k] and the batch size? Or are there other training details?

When I trained your code on P40 GPUs (24 GB memory), the model occupied about 20-21 GB of memory on each GPU.

Problem in Caffe's make test / make runtest process

Hi, sir:
Thank you for your great work. When compiling your Caffe code, make -j8 and make pycaffe succeed, but make test and make runtest fail with the following error:

.../RefineDet/src/caffe/test/test_bbox_util.cpp:757:59: error: no matching function for call to 'GetGroundTruth(float*&, const int&, int, bool, std::map<int, std::vector<caffe::NormalizedBBox> >*)'
GetGroundTruth(gt_data, num_gt, -1, true, &all_gt_bboxes);

Do you know why this happens and how to fix it? Is it necessary to run make test, or can I skip it? Has anyone compiled the code successfully? Please let me know. Thank you!

No paper

The paper link is incorrect.
