goatmessi7 / rfbnet Goto Github PK


Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018

License: MIT License

Python 91.79% Shell 0.86% C++ 0.07% Cuda 2.38% C 4.90%

detection pytorch mobilenet rfbnet

rfbnet's Issues

After training finished, testing with the trained weights raises the size-mismatch error below.
I converted my own dataset into a VOC-like dataset and trained the net. What should I change when I use the weights for testing?
RuntimeError: Error(s) in loading state_dict for RFBNet:
size mismatch for conf.0.weight: copying a param of torch.Size([24, 512, 3, 3]) from checkpoint, where the shape is torch.Size([126, 512, 3, 3]) in current model.
size mismatch for conf.0.bias: copying a param of torch.Size([24]) from checkpoint, where the shape is torch.Size([126]) in current model.
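
One way to read the shapes in this error, sketched under the assumption that each confidence head has (anchors per location) x (number of classes) output channels and that this layer uses 6 anchors per location. The helper name is hypothetical and not part of the repo.

```python
def classes_from_conf_channels(out_channels, anchors_per_location=6):
    """Infer the class count (background included) from a conf head's output channels."""
    if out_channels % anchors_per_location != 0:
        raise ValueError("channel count is not a multiple of the anchor count")
    return out_channels // anchors_per_location


# Shapes taken from the error message above.
checkpoint_classes = classes_from_conf_channels(24)   # shape stored in the checkpoint
model_classes = classes_from_conf_channels(126)       # shape built by the test script
print(checkpoint_classes, model_classes)
```

Under that assumption the checkpoint was trained with 4 classes while the test script builds a 21-class model, so the class count used to build the net for testing would need to match the one used in training.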

The FPS of SSD300* is different between the code and paper.

Hello, first of all, thanks for releasing your code.
The FPS of SSD300* is 46 in your code, but in your paper it is 120. So I just want to know which is faster: SSD300 or RFBNet300?

inds = torch.nonzero(scores[:,j]>0.01).view(-1) is super time consuming.

Hi, when I tested the results, I found that even though the other parts are pretty fast, the NMS time cost is pretty high.

Testing the time cost step by step, I found that inds = torch.nonzero(scores[:,j]>0.01).view(-1) is extremely time consuming. It takes nearly 50 ms per iteration on a K40c.

Does anyone have any ideas about that?

Contributors

Hi,

are you looking for contributors / partners in science ? :)

Lukas

what is the difference between 'RFB_Net_E_vgg.py' and 'RFB_Net_vgg.py'?

Hello, thanks for your code. But I am confused by the files 'RFB_Net_E_vgg.py' and 'RFB_Net_vgg.py'. I've seen there are some differences between these two files, but I don't understand their different purposes. Can you tell me when I should use the former and when the latter?

ImportError: cannot import name '_mask'

Hi, thank you for releasing your code. The idea of this paper is amazing.
I tried to train the model, but I met an error as follows:

python train_RFB.py
Traceback (most recent call last):
  File "train_RFB.py", line 14, in <module>
    from data import VOCroot, COCOroot, VOC_300, VOC_512, COCO_300, COCO_512, COCO_mobile_300, AnnotationTransform, COCODetection, VOCDetection, detection_collate, BaseTransform, preproc
  File "/data_1/models/RFBNet-master/data/__init__.py", line 3, in <module>
    from .coco import COCODetection
  File "/data_1/models/RFBNet-master/data/coco.py", line 21, in <module>
    from utils.pycocotools.coco import COCO
  File "/data_1/models/RFBNet-master/utils/pycocotools/coco.py", line 55, in <module>
    from . import mask as maskUtils
  File "/data_1/models/RFBNet-master/utils/pycocotools/mask.py", line 4, in <module>
    from . import _mask
ImportError: cannot import name '_mask'

How can I fix it? Thanks a lot.

The final loss

Good work! Can you tell me the final loss when you finished training, and how I should judge it? Thanks a lot!

how to deal with the bug of make.sh

Hi @ruinmessi!
I ran ./make.sh
but got:
g++ -pthread -shared -B /home/liye/anaconda3/compiler_compat -L/home/liye/anaconda3/lib -Wl,-rpath=/home/liye/anaconda3/lib,--no-as-needed build/temp.linux-x86_64-3.6/nms/nms_kernel.o build/temp.linux-x86_64-3.6/nms/gpu_nms.o -L/usr/local/cuda-8.0/lib64 -L/home/liye/anaconda3/lib -R/usr/local/cuda-8.0/lib64 -lcudart -lpython3.6m -o /media/ubuntue/extdisk1/liye/RFBNet-master/utils/nms/gpu_nms.cpython-36m-x86_64-linux-gnu.so
g++: error: unrecognized command line option ‘-R’
error: command 'g++' failed with exit status 1

Why does this happen?
Can you help me?

Cannot test on my dataset

When I trained the net on my own dataset, after 300 epochs the lowest localization loss was below 1 while the classification loss still exceeded 1. I then used the weights to evaluate my test set, but if I change the class_num in the test script, the mismatch error occurs. With the VOC class_num of 21 the test script runs normally, but it returns nothing: the weights trained for 300 epochs detect nothing, and both AP and mAP are 0. Can you give me some suggestions on how to fix this? Thanks.

RFB_mobile_20_7.pth load failed

log:

RuntimeError: Error(s) in loading state_dict for RFBNet:
While copying the parameter named "conf.0.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.0.weight", whose dimensions in the model are torch.Size([126, 512, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 512, 1, 1]).
While copying the parameter named "conf.1.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.1.weight", whose dimensions in the model are torch.Size([126, 1024, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 1024, 1, 1]).
While copying the parameter named "conf.2.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.2.weight", whose dimensions in the model are torch.Size([126, 512, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 512, 1, 1]).
While copying the parameter named "conf.3.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.3.weight", whose dimensions in the model are torch.Size([126, 256, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 256, 1, 1]).
While copying the parameter named "conf.4.bias", whose dimensions in the model are torch.Size([84]) and whose dimensions in the checkpoint are torch.Size([324]).
While copying the parameter named "conf.4.weight", whose dimensions in the model are torch.Size([84, 256, 1, 1]) and whose dimensions in the checkpoint are torch.Size([324, 256, 1, 1]).
While copying the parameter named "conf.5.bias", whose dimensions in the model are torch.Size([84]) and whose dimensions in the checkpoint are torch.Size([324]).
While copying the parameter named "conf.5.weight", whose dimensions in the model are torch.Size([84, 128, 1, 1]) and whose dimensions in the checkpoint are torch.Size([324, 128, 1, 1]).

Did you test this mobilenet + RFB ?

mobilenet training schedule

Can you give me some suggestions about the training parameters for the MobileNet version?

Why is your model's inference so much faster than the original SSD?

More details about the parameters in Table 3 of your paper

Thanks for your brilliant work. I want to know how to calculate the parameters in Table 3?
Such as RFB's parameters 34.5M in Table 3, I do not get the result, so can you give me some details about it?

I calculate the parameters as follows:
(1)VGG16(base net)
conv1_1: weights 64(output numbers)x3(input numbers)x3x3(kernel size)=1728; bias 64(output numbers)
conv1_2: weights 64x64x3x3=36864; bias 64
......
fc7: weights 1024x1024x1x1=1048576; bias 1024
total: 12220096(weights)+6272(bias)=12226368=12.226368M
(2)RFB(fc7)
branch0: (1x1conv)weights 256x1024x1x1=262144, bn(batch norm) 256(output numbers)x2=512
(3x3conv)weights 256x256x3x3=589824, bn(batch norm) 256(output numbers)x2=512
branch1:(1x1conv)weights 128x1024x1x1=131072, bn(batch norm) 128(output numbers)x2=256
(3x3conv)weights 256x128x3x3=294912, bn(batch norm) 256(output numbers)x2=512
(3x3conv)weights 256x256x3x3=589824, bn(batch norm) 256(output numbers)x2=512
branch2:(1x1conv)weights 128x1024x1x1=131072, bn(batch norm) 128(output numbers)x2=256
(3x3conv)weights 192x128x3x3=221184, bn(batch norm) 192(output numbers)x2=384
(3x3conv)weights 256x192x3x3=442368, bn(batch norm) 256(output numbers)x2=512
(3x3conv)weights 256x256x3x3=589824, bn(batch norm) 256(output numbers)x2=512
ConvLinear: weights 1024x768x1x1=786432, bn(batch norm) 1024(output numbers)x2=2048
shortcut: weights 1024x1024x1x1=1048576, bn(batch norm) 1024(output numbers)x2=2048
total: 5087232(weights)+8064(bn)=5095296=5.095296M
(3)RFB(stride 2 or conv8)
total: 4169728(weights)+6016(bn)=4175744=4.175744M
(4)RFB(stride 2 or conv9)
total: 1042432(weights)+3008(bn)=1045440=1.04544M
(5)conv10_1: weights 128x256=32768; bn 128x2=256
conv10_2: weights 256x128x3x3=294912; bn 256x2=512
conv11_1: weights 128x256=32768; bn 128x2=256
conv11_2: weights 256x128x3x3=294912; bn 256x2=512
total: 655360(weights)+1536(bn)=656896=0.656896M
(6)multi_box
conv4_3: conf weights (21x6)x512x3x3=580608; conf bias 21x6=126
loc weights (4x6)x512x3x3=110592; loc bias 4x6=24
fc7:conf weights (21x6)x1024x3x3=1161216; conf bias 21x6=126
loc weights (4x6)x1024x3x3=221184; loc bias 4x6=24
conv8:conf weights (21x6)x512x3x3=580608; conf bias 21x6=126
loc weights (4x6)x512x3x3=110592; loc bias 4x6=24
conv9:conf weights (21x6)x256x3x3=290304; conf bias 21x6=126
loc weights (4x6)x256x3x3=55296; loc bias 4x6=24
conv10_2:conf weights (21x4)x256x3x3=193536; conf bias 21x4=84
loc weights (4x4)x256x3x3=36864; loc bias 4x4=16
conv11_2:conf weights (21x4)x256x3x3=193536; conf bias 21x4=84
loc weights (4x4)x256x3x3=36864; loc bias 4x4=16
total: 3571200(weights)+800(bias)=3572000=3.572M
Above all, the total parameters are: 12.226368+5.095296+4.175744+1.04544+0.656896+3.572=26.771744M < 34.5M
Do you think my calculation is correct? Why do I get so many fewer parameters?
Any comments will be appreciated.
Thanks.
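
The backbone figure in step (1) can be re-derived with a few lines of arithmetic. This is only a sketch: it assumes the standard 13-layer VGG16 convolution layout plus the reduced fc6 (3x3, 512 to 1024) and fc7 (1x1, 1024 to 1024) used by SSD-style detectors. The layer list is an assumption and is not read from the repo.

```python
def conv_weights(out_ch, in_ch, kernel):
    """Number of weights in a kernel x kernel convolution (bias excluded)."""
    return out_ch * in_ch * kernel * kernel


# (in_channels, out_channels) of the 13 VGG16 conv layers, all 3x3 (assumed layout).
VGG16_CONVS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

vgg_weights = sum(conv_weights(o, i, 3) for i, o in VGG16_CONVS)
fc6_weights = conv_weights(1024, 512, 3)    # assumed: fc6 as a 3x3 conv
fc7_weights = conv_weights(1024, 1024, 1)   # fc7 as a 1x1 conv, as in step (1)

backbone_weights = vgg_weights + fc6_weights + fc7_weights
backbone_biases = sum(o for _, o in VGG16_CONVS) + 1024 + 1024

print(backbone_weights, backbone_biases)
```

Under these assumptions the backbone has 20,477,632 weights and 6,272 biases. The bias count agrees with step (1) but the weight count does not, so the elided layers in step (1) may be worth rechecking before comparing against Table 3.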

some question about l2 loss

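Hello author, I noticed that, compared with the original SSD, your code no longer has the L2 loss. The L2 normalization originally followed conv4_3, but it is now gone. Would adding it back have any effect? Thank you for your answer!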

Results reproducibility

Hello, @ruinmessi. Thank you for the great research and the open-source implementation of the paper. I tried your implementation on COCO and was disappointed by the results.

In short, I used your code (without any changes) and your weights for validation on COCO 2014 minival and got the following results.

For RFB512-E:

alexkirnas@dev:~/Projects/RFBNet$ CUDA_VISIBLE_DEVICES='0' python test_RFB.py -d COCO \
-v RFB_E_vgg -s 512 --trained_model ./weights/RFB512_E_34_4.pth --save_folder=./eval

....
A lot strings
....

~~~~ Summary metrics ~~~~
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.186
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.273
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.204
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.137
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.271
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.271
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.287
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.430
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.484
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.250
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.521
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.686

For RFB-Mobile:

alexkirnas@dev:~/Projects/RFBNet$  CUDA_VISIBLE_DEVICES='0' python test_RFB.py -d COCO \
-v RFB_mobile -s 300 --trained_model ./weights/RFB_mobile_20_7.pth --save_folder=./eval
....
A lot strings
....

~~~~ Summary metrics ~~~~
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.135
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.217
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.144
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.015
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.159
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.207
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.300
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.311
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.035
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.328
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.584

Note: I ran the code on Python 3.6 with the newest PyTorch (v0.3.0, installed via pip) and CUDA 8.0.

These results differ too much from those reported in the paper for the gap to be explained by the use of different data (you report results on the trainval35k set). Can you reproduce your results with the newest PyTorch?

Thanks,
Alex

Reproducibility of RFB speed

Hi,

I think you should add torch.cuda.synchronize() inside timer(e.g. after net(x) ), because CUDA is asynchronous.
By adding this, I got ~0.12s/forward.
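
A minimal sketch of the timing pattern described above: synchronize before starting the clock and again before stopping it, so that queued asynchronous work is counted. The helper name and its parameters are illustrative, and the workload is a CPU stand-in so the sketch runs without a GPU.

```python
import time


def seconds_per_call(fn, sync=None, repeats=10):
    """Average wall-clock seconds per call of fn.

    sync, if given, is called before the clock starts and again before it
    stops, so pending asynchronous work is included in the measurement.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / repeats


# Stand-in workload; no GPU or PyTorch required for this demonstration.
elapsed = seconds_per_call(lambda: sum(range(10000)))
print("%.6f s per call" % elapsed)
```

With PyTorch on a GPU this would be called as seconds_per_call(lambda: net(x), sync=torch.cuda.synchronize), where net and x are the model and input from the test script.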

Final detection_eval = 0.737428

Thanks for your code! I ran your code for training. After 120K iterations, the loss is 3.09611, but the detection_eval is 0.737428, which is lower than your paper's result. Is something wrong with my training?

Do you have RetinaNet + Resnet101

Can this model be used together with ResNet101, and if so, do you have results for RetinaNet + ResNet101?
Is it possible to share please?

gpu_nms bug

Hi, I ran the following code after running ./make.sh

from utils.nms_wrapper import nms
import numpy as np

for x in range(10):
    #generate random detection box
    x1y1 = np.random.randint(0, 600,(100,2))
    x2y2 = x1y1+np.random.randint(0, 600,(100,2))
    conf = np.random.random((100,1))

    dets = np.concatenate([x1y1,x2y2,conf],axis=1).astype(np.float32)

    keep_cpu = nms(dets.copy(), 0.45, force_cpu=True)
    keep_gpu = nms(dets.copy(), 0.45, force_cpu=False)
    print ("CPU:%s, GPU:%s" %(len(keep_cpu),len(keep_gpu)))

But this gives the following results:

CPU:70, GPU:100
CPU:67, GPU:100
CPU:64, GPU:100
CPU:70, GPU:100
CPU:73, GPU:100
CPU:69, GPU:100
CPU:70, GPU:100
CPU:68, GPU:100
CPU:70, GPU:100
CPU:67, GPU:100

Is gpu_nms working properly on your machine?
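
When the CPU and GPU paths disagree like this, an independent reference can help decide which one is off. Below is a plain-Python greedy NMS sketch, not the repo's implementation. It assumes rows of (x1, y1, x2, y2, score) and computes areas without a +1 pixel convention, so counts could differ slightly from an implementation that uses one.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2); no +1 pixel convention."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def reference_nms(dets, thresh):
    """Greedy NMS over rows of (x1, y1, x2, y2, score); returns kept row indices."""
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(box_iou(dets[i][:4], dets[j][:4]) <= thresh for j in keep):
            keep.append(i)
    return keep


dets = [
    (0, 0, 100, 100, 0.9),
    (5, 5, 105, 105, 0.8),      # overlaps the first box heavily
    (200, 200, 300, 300, 0.7),  # far away from both
]
print(reference_nms(dets, 0.45))
```

A GPU result that keeps all 100 random boxes, while a reference like this suppresses roughly a third of them, would point at the GPU build rather than at the test data.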

The structure of BasicRFB is not same with Fig.4.(a) in your paper

Hello, thanks for releasing your code.

Fig.4(a) in your paper (arXiv:1711.07767v3) shows a branch with a 1×1 conv followed by a 3×3 conv with rate=1, which does not appear in class BasicRFB of models/RFB_Net_vgg.py.
Your code has two other branches (self.branch0 and self.branch1 in class BasicRFB) and a shortcut connection. So which design gives better performance?

check something!

@ruinmessi
hey!
Can you give me your 2015 test-dev result (a .json file) that you submitted to the server? I just want to check my results. Thanks. My email: [email protected]

thanks!

the difference of the backbone between vgg and mobilenet

Hi, Songtao Liu:
Glad to read your paper. I have one question that is unrelated to the code.
In your paper, VGG seems much more powerful than MobileNet. For example, VGGNet300 + RFB gets mAP 30.3 on COCO test, while SSD 300 MobileNet + RFB only gets mAP 20.7.
Is my conclusion right or not?

Loss_l : inf in training ?

Hello, first of all, thanks for releasing your code.
The training loss becomes inf; specifically, loss_l = inf. I use your original code (with only some bugs fixed), but I don't know why I get inf.
Parameters: lr:0.004, batchsize:32, base_model:vgg_reducedfc.pth
GPU: 1080ti

Any comments will be appreciated.
Thanks very much!

USE multi-scale testing strategy

Hello,
Do you use a multi-scale testing strategy in test_RFB.py?
Your paper does not seem to mention using a multi-scale testing strategy.
Thanks!

VOC and COCO results reproduction problem

Hello, first of all, thanks for releasing your code.

Since I wanted to see whether the reported accuracy is reproducible, I trained with exactly the code in the git repository. However, even after several attempts, I could not reach your reported accuracy on VOC2007 (80.5%) and COCO (29.9%). I got 79.9% and 28.8% respectively.

For a fair comparison, I trained SSD using the same training scheme as RFBNet and obtained 78.8%.

Any comments will be appreciated.

Thanks.

High overhead GPU to CPU

The conversion of the boxes (CUDA float tensors) returned from the detector's forward pass to CPU float tensors has extremely high overhead. (I ignored the conversion to a numpy array, which takes about a microsecond.)

boxes = boxes.cpu().numpy()

It takes approximately 22 milliseconds on a 512 input size (detection time is approximately 9 milliseconds)

regarding min_sizes and max_size in config file

Can you please suggest how I should define the min_sizes and max_sizes for a custom dataset?
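
For reference, the original SSD training script derives these sizes from a range of scale ratios expressed as percentages of the input size. The sketch below reproduces that rule. Whether it suits a custom dataset depends on the object sizes in it, and the function name and default values are illustrative assumptions rather than this repo's code.

```python
def ssd_box_sizes(img_dim, num_layers=6, min_ratio=20, max_ratio=90, first_ratio=10):
    """Per-layer (min_sizes, max_sizes) in pixels, following the SSD ratio rule.

    Ratios are percentages of img_dim. The first layer gets its own smaller
    ratio; the remaining layers step evenly from min_ratio to max_ratio.
    """
    step = (max_ratio - min_ratio) // (num_layers - 2)
    min_sizes = [img_dim * first_ratio / 100.0]
    max_sizes = [img_dim * min_ratio / 100.0]
    for ratio in range(min_ratio, max_ratio + 1, step):
        min_sizes.append(img_dim * ratio / 100.0)
        max_sizes.append(img_dim * (ratio + step) / 100.0)
    return min_sizes, max_sizes


min_sizes, max_sizes = ssd_box_sizes(300)
print(min_sizes)
print(max_sizes)
```

With the defaults and img_dim=300 this yields min sizes 30, 60, 111, 162, 213, 264 and max sizes 60, 111, 162, 213, 264, 315. For a dataset dominated by small objects, lowering min_ratio and first_ratio would shift the boxes toward smaller sizes; that is a tuning choice rather than a rule.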

error when running test_RFB.py without cuda

test_RFB.py runs OK when cuda is set to True. However, when I set cuda to False in test_RFB.py, I got the following error:

Traceback (most recent call last):
  File "test_RFB.py", line 193, in <module>
    top_k, thresh=0.01)
  File "demo_RFB.py", line 91, in test_net
    out = net(x) # forward pass
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/topspinn/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 185, in forward
    x = self.base[k](x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 72, in forward
    input = module(input)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected object of type Variable[torch.FloatTensor] but found type Variable[torch.cuda.FloatTensor] for argument #1 'weight'

RFB-max pooling in paper

Hi, ruinmessi:
Thanks for your great code. I have some questions:
Section 4.2 (Ablation Study) of the paper mentions that "by simply replacing the last convolution layer with the RFB-max pooling, we can see that the result is improved to 79.1%".
What does RFB-max pooling refer to? The RFB (stride 2)?
Which is the last convolution layer?
Is the learning rate strategy here the same as in the code? Warmup, then 4e-3, then decay by 0.1?
If it is convenient, could I have your WeChat ID?
Thank you!

My training script hangs at a random epoch

There is no error; it just stops running, and it hangs at a random epoch. I don't know how to deal with it. It makes me mad!

More details about Fig.3 in your paper?

Hi, thanks for your brilliant work.
I want to know how to draw Fig.3 in your paper. I tried the method you described in #11, but I cannot get that beautiful picture.
There are four questions I want to ask:
Q1: The effective receptive field drawn in Fig.3 is which module's input gradient map (RFB module after conv4_3, fc7, conv8 or conv9)
Q2: You said in #11 that the input is an image, what is the number of channels in this image (as conv4_3's RFB has 512 channels; fc7's RFB has 1024 channels...)?
Q3: The author in "Understanding the effective receptive field in deep convolutional neural networks" only describes the case where all convolutional layers are one channel. There is no description of multi-channel conditions, such as the RFB module followed by the fc7 layer, the input of this module has 1024 channels (feature map’s size is 19x19), the output also has 1024 channels, then the gradient of the central pixel on which channel is set to 1.0, that is, \frac{\partial l}{\partial y_{0,0}}(LaTeX code) in the paper.

Q4: The input of the module is multi-channel, that is, the input gradient map is also multi-channel. Then, is the final effective receptive field image selected one of the channels, or is an average value obtained on all channels? Or other operations?

My code for drawing the effective receptive field image is as follows (where the input is a randomly generated 1x1024x19x19 size feature map, the module uses RFB after fc7, the output gradient map takes Zero_grad[0][512][9][9] = 1.0, and the rest are 0): temp.txt (as the uploaded code is messy, I put it in temp.txt and cut the graph as follows)

But the image I got is as follows, and it is very different from Fig.3, so I don't know if it is caused by the details reflected in the above four questions.

More detail about Table 3 in paper

Hi, @ruinmessi .

Thanks for your brilliant work. I want to ask you something about results in Table 3.

Table 3 shows "Performance comparison of different block architectures". The architecture of the RFB block is similar to the inception [34] module in GoogLeNet, so I can regard the results for RFB and inception as a comparison between different block architectures. But the other architectures, such as Deformable CNN [4] and Dilated Conv [3], are not the same as (or similar to) RFB; they are actually deeply embedded in CNN networks such as VGG and ResNet. I wonder whether the results in Table 3 come from replacing components in the RFB block or from directly replacing the whole network with Deformable CNN [4], Dilated Conv [3] and others. I have been confused by these results for a long time and need some clearer details on the experiments.

Thanks.

How to train from scratch

Hi
Thanks for sharing your code.
Is it possible to train from scratch without using the pretrain weights?

I ran into the following problem.

Traceback (most recent call last):
  File "test_RFB.py", line 49, in <module>
    from models.RFB_Net_E_vgg import build_net
  File "/media/media_share/linkfile/RFBNet/models/RFB_Net_E_vgg.py", line 405
    return RFBNet(phase, size, *multibox(size, vgg(base[str(size)], 3),add_extras(size, extras[str(size)], 1024),mbox[str(size)], num_classes), num_classes)
SyntaxError: only named arguments may follow *expression

could you help me?

thank you very much
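
This message is what Python interpreters older than 3.5 report when a positional argument follows a * unpacking in a call; Python 3.5 and later accept that form. Below is a sketch of a call layout that works on both, using hypothetical stand-ins for RFBNet and multibox and assuming that multibox returns three components.

```python
def multibox_stub():
    """Stand-in for multibox(...): returns three components as a tuple."""
    return ("base", "extras", "head")


def build_stub(phase, size, base, extras, head, num_classes):
    """Stand-in for the RFBNet constructor; it just echoes its arguments."""
    return (phase, size, base, extras, head, num_classes)


# Unpack first, then pass every argument positionally. Unlike
# build_stub(phase, size, *multibox_stub(), num_classes), this form is also
# accepted by interpreters older than Python 3.5.
base, extras, head = multibox_stub()
net = build_stub("test", 300, base, extras, head, 21)
print(net)
```

Running the script with Python 3.5 or newer would avoid the error without any code change.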

Can I use my dataset to train this code ?

I converted my dataset into a VOC-like dataset and changed the VOC classes to
VOC_CLASSES = ( 'background', # always index 0
'Car', 'Cyclist', 'Pedestrain')
but the error below occurred:
Traceback (most recent call last):
  File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/jsu/yuyijie/RFBNet/train_RFB.py", line 257, in <module>
    train()
  File "/home/jsu/yuyijie/RFBNet/train_RFB.py", line 208, in train
    images, targets = next(batch_iterator)
  File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
KeyError: 'Traceback (most recent call last):\n File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/home/jsu/yuyijie/RFBNet/data/voc0712.py", line 185, in __getitem__\n target = self.target_transform(target)\n File "/home/jsu/yuyijie/RFBNet/data/voc0712.py", line 136, in __call__\n label_idx = self.class_to_ind[name]\nKeyError: 'car'\n'
Could not find thread pid_14959_id_139789338658968
Available: ['pid_14959_id_139790992515760', 'pid_14959_id_139790992516208', 'pid_14959_id_139790992472160', 'pid_14959_id_139789339129280', 'pid_14959_id_139789452204296']
Could not find thread pid_14959_id_139789339129560
Available: ['pid_14959_id_139790992515760', 'pid_14959_id_139790992516208', 'pid_14959_id_139790992472160', 'pid_14959_id_139789339129280', 'pid_14959_id_139789452204296']
Could not find thread pid_14959_id_139790992329136
Available: ['pid_14959_id_139790992515760', 'pid_14959_id_139790992516208', 'pid_14959_id_139790992472160', 'pid_14959_id_139789339129280', 'pid_14959_id_139789452204296']
Could not find thread pid_14959_id_139789339129840
Available: ['pid_14959_id_139790992515760', 'pid_14959_id_139790992516208', 'pid_14959_id_139790992472160', 'pid_14959_id_139789339129280', 'pid_14959_id_139789452204296']
Could not find thread pid_14959_id_139789339129000
Available: ['pid_14959_id_139790992515760', 'pid_14959_id_139790992516208', 'pid_14959_id_139790992472160', 'pid_14959_id_139789339129280', 'pid_14959_id_139789452204296']
Could not find thread pid_14959_id_139789441732792
Available: ['pid_14959_id_139790992515760', 'pid_14959_id_139790992516208', 'pid_14959_id_139790992472160', 'pid_14959_id_139789339129280', 'pid_14
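
The KeyError: 'car' at the end of the worker traceback suggests that the annotation name is looked up in lowercase while the class tuple contains 'Car'. Below is a minimal sketch of a case-insensitive lookup, assuming the annotation transform lowercases names before indexing. The class_to_ind name mirrors the traceback; the rest is illustrative.

```python
VOC_CLASSES = ('background',  # always index 0
               'Car', 'Cyclist', 'Pedestrain')

# Build the lookup with lowercase keys so it matches lowercased annotation names.
class_to_ind = {name.lower(): idx for idx, name in enumerate(VOC_CLASSES)}


def label_index(raw_name):
    """Map an annotation's object name to its class index, ignoring case."""
    return class_to_ind[raw_name.lower().strip()]


print(label_index('car'), label_index('Cyclist '))
```

Writing the class names in lowercase in VOC_CLASSES would have the same effect, provided the names in the annotation files match them once lowercased.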

How to reproduce the COCO result

Hi, I feel excited about your amazing work and I am trying to reproduce the training process. May I ask about the training arguments of COCO dataset? In detail, --batch_size, --max_epoch and so on.

Thanks for your reply!

How to convert caffe's model to pytorch?

I want to know how vgg16_reducedfc.pth was converted. Would you provide the conversion code?
Thanks!

In config.py, coco-512 settings are different from those of voc-512.

Hi, ruinmessi,

Thank you for sharing your impressive work, but I have some questions while reading the codes.
When using RFB_NET_E_512, I found that the anchor-box sizes (min size, max size) of COCO-512 and VOC-512 differ from each other. Is there any special reason? I expected the anchor boxes to be fixed for the same network.
Thanks & Regards

About the FPS (RFB and SSD)

In the paper, SSD has larger FPS than RFB.
However, RFB has larger FPS than SSD in this repository.

Why are these different?

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda

Environment:
Linux 16.04
python 3.7
cuda 8.0

When I compile the nms and coco tools using
./make.sh
I get this error:
running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
building 'nms.cpu_nms' extension
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/lib/python3.7/site-packages/numpy/core/include -I/usr/local/include/python3.7m -c nms/cpu_nms.c -o build/temp.linux-x86_64-3.7/nms/cpu_nms.o -Wno-cpp -Wno-unused-function
nms/cpu_nms.c: In function ‘__pyx_pf_3nms_7cpu_nms_2cpu_soft_nms’:
nms/cpu_nms.c:3172:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  __pyx_t_8 = ((__pyx_v_pos < __pyx_v_N) != 0);
nms/cpu_nms.c:3683:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  __pyx_t_8 = ((__pyx_v_pos < __pyx_v_N) != 0);
nms/cpu_nms.c: In function ‘__Pyx_PyCFunction_FastCall’:
nms/cpu_nms.c:8431:12: error: too many arguments to function ‘(PyObject * (*)(PyObject *, PyObject * const*, Py_ssize_t))meth’
  return (*((__Pyx_PyCFunctionFast)meth)) (self, args, nargs, NULL);
nms/cpu_nms.c: In function ‘__Pyx__ExceptionSave’:
nms/cpu_nms.c:8892:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
  *type = tstate->exc_type;
nms/cpu_nms.c:8893:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
  *value = tstate->exc_value;
nms/cpu_nms.c:8894:17: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
  *tb = tstate->exc_traceback;
nms/cpu_nms.c: In function ‘__Pyx__ExceptionReset’:
nms/cpu_nms.c:8901:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
  tmp_type = tstate->exc_type;
nms/cpu_nms.c:8902:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
  tmp_value = tstate->exc_value;
nms/cpu_nms.c:8903:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
  tmp_tb = tstate->exc_traceback;
nms/cpu_nms.c:8904:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
  tstate->exc_type = type;
nms/cpu_nms.c:8905:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
  tstate->exc_value = value;
nms/cpu_nms.c:8906:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
  tstate->exc_traceback = tb;
nms/cpu_nms.c: In function ‘__Pyx__GetException’:
nms/cpu_nms.c:8961:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
  tmp_type = tstate->exc_type;
nms/cpu_nms.c:8962:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
  tmp_value = tstate->exc_value;
nms/cpu_nms.c:8963:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
  tmp_tb = tstate->exc_traceback;
nms/cpu_nms.c:8964:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
  tstate->exc_type = local_type;
nms/cpu_nms.c:8965:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
  tstate->exc_value = local_value;
nms/cpu_nms.c:8966:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
  tstate->exc_traceback = local_tb;
error: command 'gcc' failed with exit status 1
Can anyone help? How can I solve this problem?

demo.py

@ruinmessi

I would like to test your pre-trained model on several images I have.
Do you have a demo.py script that takes an image name as input and displays the detection result?

Thanks,

VOC dataset mAP with my trained-model is 10% lower than RFB300_80_5.pth

hello,

I followed your guide step by step and trained a new VOC model (RFB-300-vgg16). I stopped training at epoch 130, so I used the model RFB_vgg_VOC_epoches_130.pth in test_RFB.py, but only got 70% mAP...
Could you help me train an RFB model that reaches results as good as yours?
thanks a lot

mAP of RFB_mobile trained on VOC is 71.16%

How can I get the MS COCO 'train2014(or2017)_gt_roidb.pkl' file in the cache folder?

I'm trying to train this code on my PC following your instructions.
But there is a problem with generating the ~_gt_roidb.pkl file.
I've read the dataset instructions, but nothing explains how to generate the MS COCO cache folder and the ~_gt_roidb.pkl file, and I cannot find the method at https://github.com/rbgirshick/py-faster-rcnn/blob/77b773655505599b94fd8f3f9928dbf1a9a776c7/data/README.md either.
Is there any way to get that file so I can train this model?

The difference of RFBNet in RFB_Net_vgg.py and Fig.5 ?

Hello, thank you for releasing the code.

In Fig.5 there are two branches from the fc7 layer output: one is the input of RFB and the other is the input of RFB (stride 2).

    # apply vgg up to fc7
    for k in range(23, len(self.base)):
        x = self.base[k](x)

    # apply extra layers and cache source layer outputs
    for k, v in enumerate(self.extras):
        x = v(x)
        if k < self.indicator or k%2 ==0:
            sources.append(x)

But in RFB_Net_vgg.py, the output x of the fc7 layer is the input of the RFB layer, and the output of the RFB layer is then the input of the RFB (stride 2) layer. So there might not be two branches. Might the architecture of RFBNet be like the picture below?

RuntimeError: randperm is only implemented for CPU

Traceback (most recent call last):
  File "train_RFB.py", line 253, in <module>
    train()
  File "train_RFB.py", line 189, in train
    shuffle=True, num_workers=args.num_workers, collate_fn=detection_collate))
  File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 451, in __iter__
    return _DataLoaderIter(self)
  File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 247, in __init__
    self._put_indices()
  File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 295, in _put_indices
    indices = next(self.sample_iter, None)
  File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 139, in __iter__
    for idx in self.sampler:
  File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 53, in __iter__
    return iter(torch.randperm(len(self.data_source)).tolist())
RuntimeError: randperm is only implemented for CPU
I am using torch=0.4, python=2.5, ubuntu=14.04, and I have problems during training.

How to depict Fig.3 in your paper?

Hi,
I'm really impressed by your good work.

I want to depict the effective receptive field as you did in Fig.3.

Could you share the method?

error in training mobilenet

@ruinmessi

I follow your instruction below to train VOC with mobilenet, but got an error:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300
300 21
Traceback (most recent call last):
  File "train_RFB.py", line 88, in <module>
    net = build_net('train', img_dim, num_classes)
  File "/home/topspin/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 348, in build_net
    mbox[str(size)], num_classes), num_classes)
TypeError: __init__() missing 2 required positional arguments: 'head' and 'num_classes'

Any idea why this happens?

Thanks,

How to use the txt format label files

The code is based on XML label files, but I want to use txt label files. I have tried to modify voc0712.py, but I failed. Please help me, I really need your help.
Thank you so much for your kindness!
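
A sketch of one way to read plain-text labels, assuming a hypothetical format with one object per line, "class_name xmin ymin xmax ymax", separated by whitespace. The function name, the class tuple, and the row layout [xmin, ymin, xmax, ymax, label_index] are all assumptions to adapt to the actual files and to whatever voc0712.py expects.

```python
CLASSES = ('background', 'car', 'person')  # hypothetical class list, background at index 0


def parse_txt_labels(text, classes=CLASSES):
    """Parse 'name xmin ymin xmax ymax' lines into [xmin, ymin, xmax, ymax, idx] rows."""
    class_to_ind = {name: idx for idx, name in enumerate(classes)}
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        coords = [float(value) for value in parts[1:5]]
        if len(coords) != 4:
            raise ValueError("expected 4 coordinates: %r" % line)
        rows.append(coords + [class_to_ind[name.lower()]])
    return rows


sample = "car 10 20 110 220\nperson 5 5 50 100\n"
print(parse_txt_labels(sample))
```

In a dataset class, the text would come from reading the label file that belongs to each image, and the returned rows would take the place of the parsed XML annotation.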
