chenyuntc / simple-faster-rcnn-pytorch

A simplified implementation of Faster R-CNN that replicates the performance reported in the original paper

License: Other

Python 13.75% Jupyter Notebook 86.25%

pytorch object-detection faster-rcnn voc visdom pythonic cupy

simple-faster-rcnn-pytorch's Issues

cupy cuda error

When I try to train on VOC, I get this error:
from collections import namedtuple
faster_rcnn = FasterRCNNVGG16()
File "/home/yangshao/workspace/faster_rcnn/model/faster_rcnn_vgg16.py", line 51, in __init__
head = VGG16RoIHead(n_class=n_fg_class+1,roi_size=7, spatial_scale=1./self.feat_stride, classifier=classifier)
File "/home/yangshao/workspace/faster_rcnn/model/faster_rcnn_vgg16.py", line 29, in __init__
self.roi = RoIPooling2D(self.roi_size, self.roi_size, self.spatial_scale)
File "/home/yangshao/workspace/faster_rcnn/model/roi_module.py", line 86, in __init__
self.RoI = RoI(outh, outw, spatial_scale)
File "/home/yangshao/workspace/faster_rcnn/model/roi_module.py", line 35, in __init__
self.forward_fn = load_kernel('roi_forward', kernel_forward)
File "cupy/util.pyx", line 34, in cupy.util.memoize.decorator.ret
File "cupy/cuda/device.pyx", line 19, in cupy.cuda.device.get_device_id
File "cupy/cuda/runtime.pyx", line 164, in cupy.cuda.runtime.getDevice
File "cupy/cuda/runtime.pyx", line 136, in cupy.cuda.runtime.check_status
cupy.cuda.runtime.CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected

I installed CuPy with this command:
pip install cupy-cuda80
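
The message cudaErrorNoDevice says the CUDA runtime could not see any GPU from this process, which is a separate question from whether the CuPy wheel matches the CUDA version. A minimal sketch for collecting the basic facts, using only the standard library; the function name describe_cuda_visibility is made up for illustration and is not part of the repository:

```python
import os
import shutil
import subprocess


def describe_cuda_visibility():
    """Collect basic facts about whether this process can see a GPU."""
    info = {
        # An empty string or "-1" hides every GPU from CUDA programs.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        # None means the NVIDIA driver tools are not on PATH.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "gpus": [],
    }
    if info["nvidia_smi_path"]:
        try:
            out = subprocess.run(
                [info["nvidia_smi_path"], "-L"],  # -L lists the GPUs
                stdout=subprocess.PIPE,
                stderr=subprocess.DEVNULL,
                universal_newlines=True,
                timeout=10,
            ).stdout
            # Keep only lines that describe a device, e.g. "GPU 0: ...".
            info["gpus"] = [
                line for line in out.splitlines() if line.startswith("GPU ")
            ]
        except (OSError, subprocess.SubprocessError):
            pass
    return info


if __name__ == "__main__":
    print(describe_cuda_visibility())
```

If the list of GPUs is empty, or CUDA_VISIBLE_DEVICES is set to an empty value, the problem is likely in the driver or container setup rather than in the training code.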

Error in python3 build.py build_ext --inplace

If I run python3 build.py build_ext --inplace, I get the following error:

running build_ext
skipping '_nms_gpu_post.c' Cython extension (up-to-date)
building '_nms_gpu_post' extension
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/root/anaconda3/include/python3.6m -c _nms_gpu_post.c -o build/temp.linux-x86_64-3.6/_nms_gpu_post.o
_nms_gpu_post.c:485:31: fatal error: numpy/arrayobject.h: No such file or directory
#include "numpy/arrayobject.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1

I'm in an nvidia-docker environment with cupy 2.2.0, CUDA 8.0, Python 3.6.3 and torch 0.3.0.

I look forward to your help.

Thanks.
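
The compiler cannot find numpy/arrayobject.h, which usually means either that NumPy is not installed for this interpreter or that its header directory is not passed to the extension. A hedged sketch of how a build script could supply it; the source file name _nms_gpu_post.pyx and the ext_kwargs dictionary are assumptions for illustration, not the repository's actual build.py:

```python
def numpy_include_dirs():
    """Return NumPy's C header directory, or an empty list if NumPy is missing."""
    try:
        import numpy
    except ImportError:
        # Without NumPy installed there are no headers to point at.
        return []
    return [numpy.get_include()]


# Keyword arguments that could be passed to setuptools.Extension(**ext_kwargs).
ext_kwargs = {
    "name": "_nms_gpu_post",
    "sources": ["_nms_gpu_post.pyx"],  # assumed file name
    "include_dirs": numpy_include_dirs(),
}

if __name__ == "__main__":
    print(ext_kwargs["include_dirs"])
```

If the printed list is empty, installing NumPy into the same Python environment that runs the build is the first thing to try.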

KeyError: 'unexpected key "classifier.1.weight" in state_dict'

First of all, thanks for your code. When I run the command python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain, I get the output below. When I searched for the problem, someone said that the torchvision version should be 0.1.7. My torchvision was 0.2.0, but after installing torchvision 0.1.8 I still get the same output:
`======user config========
{'caffe_pretrain': True,
'caffe_pretrain_path': '/root/.torch/models/vgg16-00b39a1b.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'fasterrcnn-caffe',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 600,
'num_workers': 8,
'plot_every': 100,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 10000,
'test_num_workers': 8,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/data04/data/VOCdevkit/VOC2007/',
'weight_decay': 0.0005}
==========end============
load data
Traceback (most recent call last):
File "train.py", line 131, in <module>
fire.Fire()
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 65, in train
faster_rcnn = FasterRCNNVGG16()
File "/home/work/zhx/simple-faster-rcnn-pytorch/model/faster_rcnn_vgg16.py", line 62, in __init__
extractor, classifier = decom_vgg16()
File "/home/work/zhx/simple-faster-rcnn-pytorch/model/faster_rcnn_vgg16.py", line 16, in decom_vgg16
model.load_state_dict(t.load(opt.caffe_pretrain_path))
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 522, in load_state_dict
.format(name))
KeyError: 'unexpected key "classifier.1.weight" in state_dict'

If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]

You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True`
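
An "unexpected key" error from load_state_dict means the checkpoint contains a parameter name that the model does not define, which typically happens when the layer layout of the saved network differs from the one being constructed. A small diagnostic sketch that compares the two sets of names; it works on plain key lists, and the example key names are illustrative rather than taken from the actual files:

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Report which parameter names exist on only one side."""
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    return {
        # In the checkpoint but not in the model.
        "unexpected": sorted(checkpoint_keys - model_keys),
        # In the model but not in the checkpoint.
        "missing": sorted(model_keys - checkpoint_keys),
    }


if __name__ == "__main__":
    # Illustrative names only: the same linear layer stored at index 1 vs 0.
    report = diff_state_dict_keys(
        ["features.0.weight", "classifier.1.weight"],
        ["features.0.weight", "classifier.0.weight"],
    )
    print(report)
```

In practice the two lists would come from the keys of the loaded checkpoint and from model.state_dict().keys(). This only diagnoses the mismatch; it does not decide which layout is the right one.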

Why do you disable decay weight for bias?

See https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/model/faster_rcnn.py#L273

The Fast RCNN paper says:

A momentum of 0.9 and parameter decay of 0.0005 (on weights and biases) are used.

The Faster RCNN paper says:

We use a momentum of 0.9 and a weight decay of 0.0005 [37].

And the original code does not set a special weight decay in the prototxt.
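
To make the two options in this question concrete, here is a sketch of building per-parameter option groups in the list-of-dicts form that torch.optim optimizers accept. The helper build_param_groups and its decay_bias flag are hypothetical and are not the repository's code:

```python
def build_param_groups(named_params, lr, weight_decay, decay_bias=False):
    """Build one option group per parameter.

    named_params: iterable of (name, parameter) pairs, such as the result
    of model.named_parameters().
    """
    groups = []
    for name, param in named_params:
        is_bias = name.endswith("bias")
        groups.append({
            "params": [param],
            "lr": lr,
            # Biases get zero decay unless decay_bias is requested.
            "weight_decay": weight_decay if (decay_bias or not is_bias) else 0.0,
        })
    return groups


if __name__ == "__main__":
    # Placeholder strings stand in for real parameter tensors.
    demo = [("conv.weight", "w"), ("conv.bias", "b")]
    for group in build_param_groups(demo, lr=1e-3, weight_decay=5e-4):
        print(group)
```

Calling it with decay_bias=True gives the behaviour described in the Fast R-CNN quote (decay on weights and biases); the default gives decay on weights only.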

Does fast rcnn have only one fc layer after roi pooling?

From the picture at the end of the README, I see that there should be two fc layers after the RoI pooling of Fast R-CNN, before the two sibling classification and bbox regression networks. However, I see only one fc layer (the classifier object obtained from the original VGG network) in the code here:

simple-faster-rcnn-pytorch/model/faster_rcnn_vgg16.py

Lines 104 to 106 in ce5cf5e

    
    self.classifier = classifier
    self.cls_loc = nn.Linear(4096, n_class * 4)
    self.score = nn.Linear(4096, n_class)

Did I refer to the wrong part of your code?

connection error

When I run train.py, a ConnectionError occurs after the data is loaded:

raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f014c50b438>: Failed to establish a new connection: [Errno 111] Connection refused',))
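
"Connection refused" on localhost:8097 means nothing is listening on that port. 8097 is the port value in the user config printed elsewhere on this page, and it is the port a visdom server started with python -m visdom.server listens on by default. A minimal sketch for checking the port before training, using only the standard library; the function name is_port_open is made up for illustration:

```python
import socket


def is_port_open(host="localhost", port=8097, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if not is_port_open():
        print("Nothing is listening on localhost:8097; start the visdom server first.")
```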

Different bbox inputs for training and test

I found that the train dataloader and the test dataloader return different ground-truth bboxes even for the same input image. Why?
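
One possible explanation, stated here as an assumption rather than a confirmed answer, is that the training pipeline rescales (and possibly flips) the boxes together with the image, while the test pipeline returns the boxes in the original image coordinates. The sketch below shows how such transforms change a box in (y_min, x_min, y_max, x_max) order; the helper names are illustrative:

```python
def resize_bbox(bbox, in_size, out_size):
    """Rescale a (y_min, x_min, y_max, x_max) box; sizes are (height, width)."""
    y_scale = out_size[0] / in_size[0]
    x_scale = out_size[1] / in_size[1]
    y1, x1, y2, x2 = bbox
    return (y1 * y_scale, x1 * x_scale, y2 * y_scale, x2 * x_scale)


def flip_bbox_horizontal(bbox, size):
    """Mirror a (y_min, x_min, y_max, x_max) box left to right."""
    _, width = size
    y1, x1, y2, x2 = bbox
    return (y1, width - x2, y2, width - x1)


if __name__ == "__main__":
    box = (10, 20, 30, 40)
    # The same box after the image is doubled in size, then mirrored.
    resized = resize_bbox(box, in_size=(100, 200), out_size=(200, 400))
    print(resized, flip_bbox_horizontal(resized, size=(200, 400)))
```

Comparing the ratio between the two sets of boxes with the image scale factor is a quick way to check whether this is what is happening.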

ImportError: CuPy is not correctly installed.

When I run python3 train.py train --env='fasterrcnn_caffe' --plot_every=100 --caffe_pretrain, an error occurs like this:

Traceback (most recent call last):
  File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/__init__.py", line 11, in <module>
    from cupy import core  # NOQA
  File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/core/__init__.py", line 1, in <module>
    from cupy.core import core  # NOQA
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 9, in <module>
    from model import FasterRCNNVGG16
  File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/__init__.py", line 1, in <module>
    from .faster_rcnn_vgg16 import FasterRCNNVGG16
  File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/faster_rcnn_vgg16.py", line 4, in <module>
    from model.region_proposal_network import RegionProposalNetwork
  File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/region_proposal_network.py", line 7, in <module>
    from model.utils.creator_tool import ProposalCreator
  File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/utils/creator_tool.py", line 2, in <module>
    import cupy as cp
  File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/__init__.py", line 32, in <module>
    six.reraise(ImportError, ImportError(msg), exc_info[2])
  File "/home/htu/anaconda3/lib/python3.6/site-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/__init__.py", line 11, in <module>
    from cupy import core  # NOQA
  File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/core/__init__.py", line 1, in <module>
    from cupy.core import core  # NOQA
ImportError: CuPy is not correctly installed.

If you are using wheel distribution (cupy-cudaXX), make sure that the version of CuPy you installed matches with the version of CUDA on your host.
Also, confirm that only one CuPy package is installed:
  $ pip freeze

If you are building CuPy from source, please check your environment, uninstall CuPy and reinstall it with:
  $ pip install cupy --no-cache-dir -vvvv

Check the Installation Guide for details:
  https://docs-cupy.chainer.org/en/latest/install.html

original error: libcuda.so.1: cannot open shared object file: No such file or directory

If you suspect this is an IPython bug, please report it at:
    https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]

You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
    %config Application.verbose_crash=True

python 3.6.4
cupy-cuda80 4.0.0
Cython 0.27.3
torch 0.4.0
cuda 8.0
cudnn v5.1
The version of CuPy I installed matches the version of CUDA. Why does this error occur?
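
The original error is about libcuda.so.1, which is installed by the NVIDIA driver rather than by the CUDA toolkit, so a CuPy wheel that matches the toolkit version can still fail if the dynamic loader cannot find the driver library. A minimal check with the standard library; the function name can_load is made up for illustration:

```python
import ctypes


def can_load(library_name="libcuda.so.1"):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("libcuda.so.1 loadable:", can_load())
```

If this prints False, the driver is likely missing or not on the loader's search path (for example inside a container started without GPU support).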

setting training epoch properly

Hi,
I notice that the number of training epochs is set in the configuration file utils/config.py.

However, train.py simply breaks out of the training loop when it reaches epoch 13.

I set 15 epochs for training, but it stopped at epoch 13.

This wasted a lot of my time.

Please delete the break at epoch 13 in train.py. Thanks.
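
What is being asked for is a loop whose length depends only on the configured epoch count. A minimal sketch of that idea; run_training and train_one_epoch are hypothetical names and not the functions in train.py:

```python
def run_training(num_epochs, train_one_epoch):
    """Run exactly num_epochs epochs, with no hard-coded early exit."""
    completed = 0
    for epoch in range(num_epochs):
        train_one_epoch(epoch)
        completed += 1
    return completed


if __name__ == "__main__":
    seen = []
    # seen.append stands in for the real per-epoch training step.
    print(run_training(15, seen.append), seen[-1])
```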

ImportError: cannot import name array_tool ??

How to solve this problem?

A problem about visdom

When I use visdom, I get "PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.5/dist-packages/visdom/static/fonts'", but I cannot find this file. What can I do? Do you have this file? Thank you.

order of x and y

Can I confirm that for the training, the input bounding boxes are (y1, x1, y2, x2) and for the faster_rcnn's predict method, the returned bounding boxes are (x1,y1, x2, y2)?
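
Whichever order each function actually uses, converting between the two conventions is a simple swap. A small sketch, with helper names that are made up for illustration:

```python
def yxyx_to_xyxy(bbox):
    """Convert (y1, x1, y2, x2) to (x1, y1, x2, y2)."""
    y1, x1, y2, x2 = bbox
    return (x1, y1, x2, y2)


def xyxy_to_yxyx(bbox):
    """Convert (x1, y1, x2, y2) to (y1, x1, y2, x2)."""
    x1, y1, x2, y2 = bbox
    return (y1, x1, y2, x2)


if __name__ == "__main__":
    box = (10, 20, 30, 40)  # (y1, x1, y2, x2)
    print(yxyx_to_xyxy(box))
```

Drawing one known ground-truth box on its image with each convention is a reliable way to confirm which order a given function returns.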

Performance on Resnet101 network

Hi, I've implemented the ResNet-101 structure on top of the VGG16 network, but the mAP on the VOC dataset only reaches 0.62 after 20 epochs.
Do you have any idea what the problem would be? You can find the code here. Thank you.

Hi, how can I use batch_size>1 or multiple GPUs? Can you give me an idea?

torchnet

How do I install torchnet on Windows 10?

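Hello, downloading the pretrained model from within China requires bypassing the firewall; could you provide it via Baidu Cloud or a similar channel?

Hello, downloading the pretrained model from within China requires bypassing the firewall. My compute machine is a remote machine located in China and cannot do that. Could you provide the model via Baidu Cloud or a similar channel?

Many thanks.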

log_info prints that lr is 0.002 sometimes

See https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/train.py#L118

I think optimizer.param_groups[0] will sometimes be the bias parameter group.

Debugging the wrong Label in XML files

HI @chenyuntc,
I have created my own custom dataset.
While training the model I am getting the following error:

I suspect this is because a label is wrongly defined in one of the annotation files.

How can I find out which file contains that label?

I tried printing the file id in the get_example function of voc_dataset.py, but the code breaks as soon as it encounters a wrong label and doesn't enter this function.

I know this issue is specific to my own dataset, but it would be great if you could give me some leads.

Regards,
Mahavir
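
One way to locate the offending file without going through the training loop is to scan the annotation directory directly. The sketch below assumes VOC-style XML files with object/name elements and compares names case-insensitively; the function name find_unknown_labels and the directory layout are assumptions for illustration:

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_unknown_labels(annotation_dir, known_labels):
    """Return (file name, label) pairs for labels outside known_labels."""
    known = {label.strip().lower() for label in known_labels}
    problems = []
    for path in sorted(glob.glob(os.path.join(annotation_dir, "*.xml"))):
        root = ET.parse(path).getroot()
        for obj in root.iter("object"):
            # A missing <name> element is reported as an empty label.
            name = (obj.findtext("name") or "").strip().lower()
            if name not in known:
                problems.append((os.path.basename(path), name))
    return problems


if __name__ == "__main__":
    # "Annotations" and the label list are placeholders for a real dataset.
    for file_name, label in find_unknown_labels("Annotations", ["dog", "cat"]):
        print(file_name, repr(label))
```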

How about filtering difficult objects in the Dataset instead?

In other words, only return the labels and gt_bboxes that are not marked difficult in the Dataset, so that difficult objects no longer need to be handled elsewhere.

If you have tried it, I would like to know whether it is equivalent to handling difficult objects elsewhere.
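
The proposal can be sketched as a small filter applied when a sample is loaded. The function drop_difficult below is a hypothetical helper operating on plain lists, not code from the repository:

```python
def drop_difficult(bboxes, labels, difficult):
    """Keep only the objects whose difficult flag is false."""
    keep = [i for i, flag in enumerate(difficult) if not flag]
    return [bboxes[i] for i in keep], [labels[i] for i in keep]


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (1, 1, 4, 4)]
    labels = [3, 7, 3]
    difficult = [False, True, False]
    print(drop_difficult(boxes, labels, difficult))
```

For evaluation this is probably not equivalent: in the PASCAL VOC protocol a detection that matches a difficult object is ignored rather than counted as a false positive, so removing those objects from the ground truth would turn such detections into false positives.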

PyTorch 0.4: item() vs indexing with [0]

In PyTorch 0.4, loss values can only be accessed through tensor.item() rather than by indexing with [0], as is done in get_meter_data in train.py. This matters because with traditional indexing, get_meter_data returns a "tensor" object, which breaks visdom's jsonify encoder.
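
A version-tolerant workaround is to convert every meter value to a plain Python number before it is handed to the plotting code. The sketch below relies only on the presence of an item() method; the helper names and the FakeTensor class are illustrative stand-ins:

```python
import json


def to_python_scalar(value):
    """Return value.item() when available, otherwise the value itself."""
    item = getattr(value, "item", None)
    return item() if callable(item) else value


def meters_to_plain_dict(meters):
    """Convert a {name: value} mapping into JSON-serializable numbers."""
    return {name: to_python_scalar(value) for name, value in meters.items()}


class FakeTensor:
    """Stand-in for a zero-dimensional tensor, used only in this demo."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


if __name__ == "__main__":
    plain = meters_to_plain_dict({"rpn_loc_loss": FakeTensor(0.25), "total_loss": 1.5})
    print(json.dumps(plain, sort_keys=True))
```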

Running without cuda support

Hi @chenyuntc,

How can we disable CUDA when using the predict function? Is there a flag for doing so?

I want to use it in CPU mode.

trainer.faster_rcnn.predict(img,visualize=True)

I am new to pytorch.

Regards,
Mahavir

Using the same model for video

Hi @chenyuntc,

Can I use the same model (chainer_best_model_converted_to_pytorch_0.7053.pth) for detecting objects in a video? If yes, can you give me some leads? What changes do I need to make in the code?

Regards,
Mahavir

When I train on my own dataset, I get the error 'RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorCopy.c:21'

======user config========
{'caffe_pretrain': False,
'caffe_pretrain_path': '/vgg16_caffe.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'fasterrcnn-caffe',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 600,
'num_workers': 8,
'plot_every': 100,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 10000,
'test_num_workers': 8,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/home/chenzw/tensor/Faster-rcnn-hoi/hico_20160224_det/',
'weight_decay': 0.0005}
==========end============
loading dataset
model construct completed
1it [00:01, 1.78s/it]/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [36,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [37,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [38,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [39,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [40,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [41,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [42,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [43,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [60,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [61,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [62,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [63,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [96,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [97,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [98,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [99,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [100,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [101,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [102,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [103,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [104,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [105,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [106,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [107,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [108,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [109,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [110,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [111,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [112,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [113,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [114,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [115,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [116,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [117,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [118,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [119,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [120,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [121,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [122,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [123,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [0,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [1,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [2,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [3,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [4,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [5,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [6,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [7,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [8,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [9,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [10,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [11,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [12,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [13,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [14,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [15,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [28,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [29,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [30,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [31,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [72,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [73,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [74,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [75,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [76,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [77,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [78,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [79,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [80,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [81,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [82,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [83,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [84,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [85,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [86,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [87,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [92,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [93,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [94,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [95,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCTensorCopy.c line=21 error=59 : device-side assert triggered

Traceback (most recent call last):
File "train3.py", line 137, in
fire.Fire()
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train3.py", line 84, in train
trainer.train_step(img, bbox, label, scale)
File "/home/chenzw/tensor/Faster-rcnn-hoi/trainer.py", line 168, in train_step
losses = self.forward(imgs, bboxes, labels, scale)
File "/home/chenzw/tensor/Faster-rcnn-hoi/trainer.py", line 148, in forward
gt_roi_label = at.tovariable(gt_roi_label).long()
File "/home/chenzw/tensor/Faster-rcnn-hoi/utils/array_tool.py", line 31, in tovariable
return tovariable(totensor(data))
File "/home/chenzw/tensor/Faster-rcnn-hoi/utils/array_tool.py", line 25, in totensor
tensor = tensor.cuda()
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/_utils.py", line 69, in cuda
return new_type(self.size()).copy(self, async)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorCopy.c:21

If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]

You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True

faster_rcnn.py:224: UserWarning: volatile was removed and now has no effect

The following line, at faster_rcnn.py line 224:
img = t.autograd.Variable(at.totensor(img).float()[None], volatile=True)
raises the warning "UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead."
I tried substituting "no_grad" for "Variable", but then got another error: "TypeError: __init__() takes exactly 1 argument (2 given)".
I am using PyTorch 0.4.0. I would appreciate any answers or tips.
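
For reference, torch.no_grad() is a context manager, not a replacement constructor for Variable, which would explain the TypeError when a tensor is passed to it. A minimal sketch of the inference-time pattern, assuming img is a NumPy array of shape (C, H, W); the helper name to_batch is made up for illustration and is not part of the repository:

```python
import numpy as np
import torch


def to_batch(img):
    """Convert a (C, H, W) array into a float tensor with a leading batch axis."""
    return torch.from_numpy(np.asarray(img)).float()[None]


img = np.zeros((3, 4, 5), dtype=np.float32)  # stand-in for a real image

# no_grad() wraps the inference code; the tensor is not passed to it.
with torch.no_grad():
    batch = to_batch(img)
    doubled = batch * 2  # operations in this block are not tracked by autograd
```

Everything that previously relied on volatile=True would go inside the with block.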

RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #3 'other'

l1_smooth_loss differs from the original paper

I see that you use a custom formula for the smooth L1 loss. The original paper does not use the custom sigma.
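
For context, here is a sketch of the sigma-parameterised smooth L1 form that several Faster R-CNN implementations use. It is an assumption that the repository's loss matches this form exactly; the point is only that sigma = 1 reduces to the plain smooth L1 of the paper, while a larger sigma narrows the quadratic region.

```python
def smooth_l1(x, sigma=1.0):
    """Smooth L1 of a scalar residual x with a sigma-controlled switch point.

    Quadratic for |x| < 1 / sigma**2, linear otherwise. The two branches
    meet at |x| = 1 / sigma**2, so the function is continuous.
    """
    sigma2 = sigma ** 2
    ax = abs(x)
    if ax < 1.0 / sigma2:
        return 0.5 * sigma2 * x * x
    return ax - 0.5 / sigma2


# sigma = 1 gives the familiar 0.5 * x**2 / |x| - 0.5 form.
plain_small = smooth_l1(0.5)        # quadratic branch
plain_large = smooth_l1(2.0)        # linear branch
sharp = smooth_l1(0.5, sigma=3.0)   # same residual, now on the linear branch
```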

Arraytool missing

Hi I tried to run the code but it seems that array_tool.py is missing.

SGD but no learning rate decay found

The learning rate decay is hard-coded in train.py: it is applied when training reaches epoch 9.

This is not elegant, and it wastes people's time if they don't notice the setting.

Please move this hard-coded learning-rate step size into config.py.
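
One way the request could look, as a sketch only: the decay epochs become a parameter instead of a literal in the training loop. The names lr_for_epoch and decay_epochs are hypothetical and do not exist in the repository; lr and lr_decay mirror the existing config fields.

```python
def lr_for_epoch(base_lr, epoch, decay_epochs=(9,), lr_decay=0.1):
    """Return the learning rate for `epoch` under step decay.

    The rate is multiplied by `lr_decay` once for every entry of
    `decay_epochs` that has been reached.
    """
    n_decays = sum(1 for e in decay_epochs if epoch >= e)
    return base_lr * (lr_decay ** n_decays)


# With the defaults, the rate drops by 10x from epoch 9 onwards.
schedule = [lr_for_epoch(1e-3, epoch) for epoch in range(14)]
```

Putting decay_epochs next to lr and lr_decay in the config would make the schedule visible in the printed user config.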

About mask-rcnn

Hello, I would like to ask, have you tried to implement the mask by yourself? Or what resources are recommended for the mask-rcnn?

model/utils/creator_tool.py#L125

Thank you for your effort. Should pos_roi_per_this_image be neg_roi_per_this_image at model/utils/creator_tool.py#L125?

extracting features for specific boxes

Hello,

First of all, thanks for your great efforts!

Is it possible to extract features for specific boxes, or even to fine-tune Faster R-CNN based on those features? An example case: you have annotated boxes for visual QA and would like to use Faster R-CNN as a feature extractor, and even fine-tune the pretrained model. Any tips in this direction?

Code randomly crashes after some iterations.

So I've tried running the code on my own dataset, but it crashes randomly after a few thousand iterations. I'm having a hard time debugging this, but here's the output:

load data
model construct completed
99it [00:18, 5.41it/s]model/faster_rcnn.py:223: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
img = t.autograd.Variable(at.totensor(img).float()[None], volatile=True)
279it [00:49, 5.62it/s]model/utils/bbox_tools.py:183: RuntimeWarning: invalid value encountered in divide
return area_i / (area_a[:, None] + area_b - area_i)
model/utils/creator_tool.py:108: RuntimeWarning: invalid value encountered in greater_equal
pos_index = np.where(max_iou >= self.pos_iou_thresh)[0]
model/utils/creator_tool.py:116: RuntimeWarning: invalid value encountered in less
neg_index = np.where((max_iou < self.neg_iou_thresh_hi) &
model/utils/creator_tool.py:117: RuntimeWarning: invalid value encountered in greater_equal
(max_iou >= self.neg_iou_thresh_lo))[0]
model/utils/bbox_tools.py:138: RuntimeWarning: divide by zero encountered in log
dh = xp.log(base_height / height)
model/utils/bbox_tools.py:139: RuntimeWarning: divide by zero encountered in log
dw = xp.log(base_width / width)
280it [00:49, 5.62it/s]model/utils/creator_tool.py:411: RuntimeWarning: invalid value encountered in greater_equal
keep = np.where((hs >= min_size) & (ws >= min_size))[0]
12880it [28:03, 7.65it/s]
2097it [02:02, 17.14it/s]Traceback (most recent call last):
File "train.py", line 123, in
fire.Fire()
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 127, in Fire
component_trace = Fire(component, args, context, name)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 366, in Fire
component, remaining_args)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 542, in CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 103, in train
eval_result = eval(test_dataloader, faster_rcnn, test_num=opt.test_num)
File "train.py", line 29, in eval
pred_bboxes, pred_labels, pred_scores = faster_rcnn.predict(imgs, [sizes])
File "model/faster_rcnn.py", line 225, in predict
roi_cls_loc, roi_scores, rois, _ = self(img, scale=scale)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "model/faster_rcnn.py", line 126, in forward
h, rois, roi_indices)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "model/faster_rcnn_vgg16.py", line 136, in forward
rois = at.totensor(rois).float()
File "model/utils/array_tool.py", line 25, in totensor
tensor = tensor.cuda()
File "/usr/local/lib/python2.7/dist-packages/torch/_utils.py", line 69, in cuda
return new_type(self.size()).copy(self, async)
File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 172, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 24699) is killed by signal: Killed.
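
The "divide by zero encountered in log" and "invalid value" warnings above are consistent with ground-truth boxes that have zero height or width, although that is an assumption about this dataset. A minimal sketch of a filter that could be applied when loading annotations, assuming boxes are stored as (ymin, xmin, ymax, xmax) rows; the function name is hypothetical. It does not address the final "killed by signal" message, which may be a separate memory problem.

```python
import numpy as np


def drop_degenerate_boxes(bbox, label, min_side=1.0):
    """Keep only finite boxes whose height and width are at least `min_side`."""
    bbox = np.asarray(bbox, dtype=np.float32).reshape(-1, 4)
    label = np.asarray(label)
    heights = bbox[:, 2] - bbox[:, 0]
    widths = bbox[:, 3] - bbox[:, 1]
    keep = (
        np.isfinite(bbox).all(axis=1)
        & (heights >= min_side)
        & (widths >= min_side)
    )
    return bbox[keep], label[keep]


boxes = [[0, 0, 10, 10], [5, 5, 5, 9], [1, 1, 3, 4]]  # the second box has zero height
kept_boxes, kept_labels = drop_degenerate_boxes(boxes, [0, 1, 2])
```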

Evaluation score threshold

Hi,

is there a specific reason to why score_threshold is set to 0.05 when evaluating?

I understand that setting it to 0.7 when visualizing allows to keep only strong positives, but why is such score set so low (0.05) when computing the mAP? Is that value simply derived by dividing 1 by the number of classes (20 in this case)? Should this value be any different when training the network on a different number of classes?

Thank you.

module 'torch' has no attribute '_TensorBase'

When I run train.py, the following occurs:

Traceback (most recent call last):
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/util/connection.py", line 83, in create_connection
    raise err
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
    self.send(msg)
  File "/usr/lib/python3.5/http/client.py", line 877, in send
    self.connect()
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connection.py", line 166, in connect
    conn = self._new_conn()
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9c8347a2b0>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/htu/.local/lib/python3.5/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/htu/.local/lib/python3.5/site-packages/urllib3/util/retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9c8347a2b0>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/visdom/__init__.py", line 261, in _send
    data=json.dumps(msg),
  File "/home/htu/.local/lib/python3.5/site-packages/requests/api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/home/htu/.local/lib/python3.5/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/htu/.local/lib/python3.5/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/htu/.local/lib/python3.5/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/home/htu/.local/lib/python3.5/site-packages/requests/adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9c8347a2b0>: Failed to establish a new connection: [Errno 111] Connection refused',))
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "train.py", line 131, in <module>
    fire.Fire()
  File "/home/htu/.local/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/htu/.local/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/htu/.local/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "train.py", line 78, in train
    scale = at.scalar(scale)
  File "/home/htu/yc/simple-faster-rcnn-pytorch-master/utils/array_tool.py", line 43, in scalar
    if isinstance(data, t._TensorBase):
AttributeError: module 'torch' has no attribute '_TensorBase'

If you suspect this is an IPython bug, please report it at:
    https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]

You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.

Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
    %config Application.verbose_crash=True
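
The final AttributeError comes from torch._TensorBase, a private name that newer PyTorch releases no longer expose. A sketch of how the scalar helper could be written with the public torch.is_tensor check instead; this is an assumption about what the helper is meant to do (unwrap a one-element array or tensor), not the repository's actual fix.

```python
import numpy as np
import torch


def scalar(data):
    """Unwrap a one-element NumPy array or tensor into a plain Python/NumPy scalar."""
    if isinstance(data, np.ndarray):
        return data.reshape(1)[0]
    if torch.is_tensor(data):
        return data.item()
    return data  # already a plain number


from_tensor = scalar(torch.tensor([2.5]))
from_array = scalar(np.array([3]))
```

The earlier "Connection refused" tracebacks are a separate matter: they show that nothing was listening on localhost:8097, the visdom port.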

About train.py

resource.setrlimit(resource.RLIMIT_NOFILE, (20480, rlimit[1]))
20480 is too large for my system. When I change it to 4096, it runs.
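
A sketch of a more defensive version, assuming the failure is that 20480 exceeds the hard limit on this machine: clamp the requested soft limit to the hard limit, and never lower a soft limit that is already higher. The helper name pick_soft_limit is hypothetical.

```python
import resource


def pick_soft_limit(wanted, soft, hard, infinity=resource.RLIM_INFINITY):
    """Choose a soft limit aiming for `wanted`, never below `soft` or above `hard`."""
    if soft == infinity:
        return soft  # already unlimited, nothing to raise
    target = max(wanted, soft)
    if hard != infinity:
        target = min(target, hard)
    return target


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
new_soft = pick_soft_limit(20480, soft, hard)
try:
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
except (ValueError, OSError):
    pass  # some platforms reject values that the reported hard limit seems to allow
```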

how to detect tiny objects

Hi, I want to train an object detector to detect tiny objects. Which parameters should I change?
For example, the image size is (500, 375) and the bbox is (166, 477, 188, 487). Thanks.
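
A rough way to reason about this is to compare the object's size after resizing with the anchor sizes. The sketch below assumes the common Faster R-CNN defaults (feature stride 16, anchor scales 8/16/32, shorter side resized to 600 with the longer side capped at 1000); these values and the function names are assumptions and should be checked against the actual configuration.

```python
def anchor_sides(base_size=16, anchor_scales=(8, 16, 32)):
    """Side length in pixels of the square (ratio 1) anchor for each scale."""
    return [base_size * s for s in anchor_scales]


def resize_factor(height, width, min_size=600, max_size=1000):
    """Scale that brings the shorter side to min_size without exceeding max_size."""
    scale = min_size / min(height, width)
    if scale * max(height, width) > max_size:
        scale = max_size / max(height, width)
    return scale


scale = resize_factor(500, 375)
# The example box spans 22 and 10 pixels along its two axes.
resized_sides = (22 * scale, 10 * scale)
sides = anchor_sides()
```

Under these assumptions the resized object is a few dozen pixels at most, while the smallest anchor is 128 pixels wide, so smaller anchor scales (or a larger min_size) are the natural parameters to try first.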

when i run training.py , nothing is shown on visdom

Hi, I installed everything and then ran python3 train.py,
but nothing was shown on visdom, and this output appeared in the terminal:
os: <module 'os' from '/root/anaconda3/lib/python3.6/os.py'>
ipdb: <module 'ipdb' from '/root/anaconda3/lib/python3.6/site-packages/ipdb/init.py'>
matplotlib: <module 'matplotlib' from '/root/anaconda3/lib/python3.6/site-packages/matplotlib/init.py'>
tqdm: <class 'tqdm.tqdm.tqdm'>
opt: <utils.config.Config object at 0x7f6ae060cac8>
Dataset: <class 'data.dataset.Dataset'>
TestDataset: <class 'data.dataset.TestDataset'>
inverse_normalize: <function inverse_normalize at 0x7f6ae060e598>
FasterRCNNVGG16: <class 'model.faster_rcnn_vgg16.FasterRCNNVGG16'>
Variable: <class 'torch.autograd.variable.Variable'>
data: <module 'torch.utils.data' from '/root/anaconda3/lib/python3.6/site-packages/torch/utils/data/init.py'>
FasterRCNNTrainer: <class 'trainer.FasterRCNNTrainer'>
at: <module 'utils.array_tool' from '/home/garcons/simple-faster-rcnn-pytorch/utils/array_tool.py'>
visdom_bbox: <function visdom_bbox at 0x7f6a903a1620>
eval_detection_voc: <function eval_detection_voc at 0x7f6a9010f598>
resource: <module 'resource' from '/root/anaconda3/lib/python3.6/lib-dynload/resource.cpython-36m-x86_64-linux-gnu.so'>
rlimit: [524288, 1048576]
eval: <function eval at 0x7f6afe032e18>
train: <function train at 0x7f6a9010f840>
fire: <module 'fire' from '/root/anaconda3/lib/python3.6/site-packages/fire/init.py'>

With trainer.py, nothing happened.

What is the matter?
Thanks!

Problem with demo code

I run your demo code and get the following problem.
`---------------------------------------------------------------------------
RecursionError Traceback (most recent call last)
in ()
1 trainer.load('pretrain/torchvision_pretrain.pth')
2 opt.caffe_pretrain=False # this model was trained from torchvision-pretrained model
----> 3 _bboxes, _labels, _scores = trainer.faster_rcnn.predict(img,visualize=True)
4 vis_bbox(at.tonumpy(img[0]),
5 at.tonumpy(_bboxes[0]),

/home/shared/project/simple-faster-rcnn-pytorch/model/faster_rcnn.py in predict(self, imgs, sizes, visualize)
212 for img in imgs:
213 size = img.shape[1:]
--> 214 img = preprocess(at.tonumpy(img))
215 prepared_imgs.append(img)
216 sizes.append(size)

/home/shared/project/simple-faster-rcnn-pytorch/utils/array_tool.py in tonumpy(data)
12 # return data.cpu().numpy()
13 if isinstance(data, t.autograd.Variable):
---> 14 return tonumpy(data.data)
15
16

... last 1 frames repeated, from the frame below ...

RecursionError: maximum recursion depth exceeded`

Thanks.
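
A likely cause, assuming a PyTorch release in which Variable and Tensor are merged: isinstance(data, Variable) is then true for every tensor and data.data is again a tensor, so the branch at line 14 calls itself forever. A sketch of a tonumpy that avoids the recursion; it is not necessarily how the repository resolved it.

```python
import numpy as np
import torch


def tonumpy(data):
    """Convert a tensor (CPU or GPU, with or without grad) or array-like to NumPy."""
    if isinstance(data, np.ndarray):
        return data
    if torch.is_tensor(data):
        # detach() drops autograd history, cpu() copies from the GPU if needed
        return data.detach().cpu().numpy()
    return np.asarray(data)


converted = tonumpy(torch.ones(2, 3, requires_grad=True))
untouched = tonumpy(np.arange(3))
```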

Data Augmentation

Hi,

I see that in your dataset class you horizontally flip the images and their corresponding bounding box labels, but instead of appending them to the original set you replace the originals.

I have two questions:

1. Why is training done on horizontally flipped images rather than on non-flipped images? If I comment out those lines, the performance is exactly the same (which is intuitive).
2. Adding a data augmentation option would be a nice (and simple) addition to your repo. Is there a reason you have not added it?

P.S.: I will be happy to send a PR for data augmentation.
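
For readers following the discussion, this is a sketch of what a per-sample random horizontal flip looks like. It assumes images are (C, H, W) arrays and boxes are (ymin, xmin, ymax, xmax) rows; the function names are made up for illustration. When the flip is drawn at random each time a sample is loaded, the model sees both orientations over the course of training even though the dataset is not enlarged.

```python
import random

import numpy as np


def hflip(img, bbox):
    """Flip a (C, H, W) image left-right and mirror its boxes accordingly."""
    width = img.shape[2]
    flipped_img = img[:, :, ::-1]
    flipped_bbox = np.array(bbox, dtype=np.float32).reshape(-1, 4)
    xmin = flipped_bbox[:, 1].copy()
    xmax = flipped_bbox[:, 3].copy()
    flipped_bbox[:, 1] = width - xmax  # the old right edge becomes the new left edge
    flipped_bbox[:, 3] = width - xmin
    return flipped_img, flipped_bbox


def random_hflip(img, bbox, p=0.5):
    """Apply hflip with probability p, otherwise return the inputs unchanged."""
    if random.random() < p:
        return hflip(img, bbox)
    return img, np.asarray(bbox, dtype=np.float32).reshape(-1, 4)


image = np.arange(3 * 4 * 10, dtype=np.float32).reshape(3, 4, 10)
flipped_image, flipped_boxes = hflip(image, [[0, 1, 2, 3]])
```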

Runtime error occurs when i train my own data

When I run python3 train.py train, I get:
======user config========
{'caffe_pretrain': False,
'caffe_pretrain_path': '/home/garcons/simple-faster-rcnn-pytorch/fasterrcnn_12211511_0.701052458187_torchvision_pretrain.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'faster-rcnn',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 400,
'num_workers': 4,
'plot_every': 40,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 1000,
'test_num_workers': 4,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/home/garcons/simple-faster-rcnn-pytorch/garconsdata/',
'weight_decay': 0.0005}
==========end============
load data
model construct completed
0it [00:00, ?it/s]/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [32,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [33,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
...
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [31,0,0] Assertion indexAtDim < data.baseSizes[dim] failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorIndex.cu line=648 error=59 : device-side assert triggered

Traceback (most recent call last):
File "train.py", line 130, in
fire.Fire()
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 80, in train
trainer.train_step(img, bbox, label, scale)
File "/home/garcons/simple-faster-rcnn-pytorch/trainer.py", line 168, in train_step
losses = self.forward(imgs, bboxes, labels, scale)
File "/home/garcons/simple-faster-rcnn-pytorch/trainer.py", line 147, in forward
at.totensor(gt_roi_label).long()]
File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 78, in getitem
return Index.apply(self, key)
File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 87, in forward
result = i.index(ctx.index)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorIndex.cu:648

This error occurs when I train with my own data.
With the pretrained model and the VOC2007 dataset, there was no such error.
I tried CUDA_LAUNCH_BLOCKING=1 python3 train.py train, but it did not help.

How can I fix this error?
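
The assertion "indexAtDim < data.baseSizes[dim]" means an index was out of range on the GPU. With a custom dataset, one common cause (an assumption here, not a confirmed diagnosis) is a class label that is negative or not smaller than the number of foreground classes the model was built with. A minimal sketch of a check that could run on the CPU before each training step; the function name is hypothetical.

```python
import numpy as np


def check_labels(labels, n_fg_class):
    """Raise a readable error if any label falls outside [0, n_fg_class)."""
    labels = np.asarray(labels)
    bad = labels[(labels < 0) | (labels >= n_fg_class)]
    if bad.size:
        raise ValueError(
            "labels outside [0, %d): %s" % (n_fg_class, sorted(set(bad.tolist())))
        )
    return labels


ok = check_labels([0, 5, 19], n_fg_class=20)
```

A failure here points at the dataset's class list rather than at CUDA.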

Why does the program become slower and then stop when running the eval() function?

It runs normally in "train" mode, but becomes slower and slower after about 3500 images have been tested. Finally the process is killed. Maybe it is a memory leak?
In addition, I changed "20480" to "4096" on line 22 of the code to avoid an error.
Does anyone have any idea what I could modify?
Thank you.

Training the model for a custom dataset

Hi @chenyuntc,

Thanks for your simplified(simple) implementation of Faster R-CNN in pytorch.

Going by your instructions, I was successfully able to train/test the simple-faster-rcnn-pytorch setup on my system.

Now I have a custom dataset with 36 classes, and I would like to train a Simple-Faster-R-CNN model (VGG/RES101) on it.
I think loading the dataset is the trickiest part; it would be great if you could suggest a tutorial, blog post, or link for this.

Where do I get started, and what changes need to be made?

I will make sure I open source the work I do during this process.

Thanks!
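
As a starting point for the data-loading question, here is a sketch that turns one VOC-style XML annotation into box and label lists. It assumes the custom data is annotated in Pascal VOC format and that boxes are wanted as 0-based (ymin, xmin, ymax, xmax); CLASS_NAMES and parse_voc_annotation are hypothetical names, and the two classes are placeholders for the real 36.

```python
import xml.etree.ElementTree as ET

CLASS_NAMES = ("cat", "dog")  # placeholder; list the custom class names here


def parse_voc_annotation(xml_text, class_names=CLASS_NAMES):
    """Return (boxes, labels) for every object in a VOC-style annotation."""
    root = ET.fromstring(xml_text)
    boxes, labels = [], []
    for obj in root.findall("object"):
        name = obj.find("name").text.strip().lower()
        bnd = obj.find("bndbox")
        # VOC pixel coordinates are 1-based; subtract 1 to make them 0-based.
        box = [int(bnd.find(tag).text) - 1 for tag in ("ymin", "xmin", "ymax", "xmax")]
        boxes.append(box)
        labels.append(class_names.index(name))  # label is the index in the class list
    return boxes, labels


example = """<annotation>
  <object><name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""
boxes, labels = parse_voc_annotation(example)
```

The number of classes passed to the model would then need to match len(CLASS_NAMES).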

Sometimes IoUs are empty, which breaks the code.

I've encountered a new problem which happens rarely, but at least once per epoch, on my custom dataset: the IoUs are empty and the code crashes. What needs to be changed to handle this case elegantly?
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "attack_trainer.py", line 82, in train
trainer.train_step(img, bbox, label, scale)
File "trainer.py", line 406, in train_step
losses = self.forward(imgs, bboxes, labels, scale)
File "trainer.py", line 362, in forward
img_size)
File "model/utils/creator_tool.py", line 209, in call
inside_index, anchor, bbox)
File "model/utils/creator_tool.py", line 226, in _create_label
self._calc_ious(anchor, bbox, inside_index)
File "model/utils/creator_tool.py", line 260, in _calc_ious
gt_argmax_ious = ious.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence
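
NumPy raises this error when argmax reduces over an axis of length zero, which for an (anchors x ground-truth boxes) IoU matrix means there were no usable anchors or no boxes for that image. A minimal sketch of a guard that lets the caller skip such a sample; the function name is hypothetical, and skipping is only one possible policy.

```python
import numpy as np


def ious_usable(ious):
    """True if the IoU matrix has at least one anchor row and one box column."""
    ious = np.asarray(ious)
    return ious.ndim == 2 and ious.shape[0] > 0 and ious.shape[1] > 0


no_anchors = np.zeros((0, 3))   # no anchor survived for this image
normal = np.ones((2, 3))
skip_first = not ious_usable(no_anchors)
keep_second = ious_usable(normal)
```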

imagenet pretrained model

Hi, is there any pre-trained model on the imagenet using this code?

About RPN

Thank you for your contribution. Can I extract the RPN module separately, removing the Faster R-CNN classification steps?

problem with cupy

I have some difficulties installing cupy. Is there any way to run this code without cupy?
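
For the NMS part at least, a pure NumPy fallback is possible. The sketch below is a standard greedy non-maximum suppression, assuming boxes are (ymin, xmin, ymax, xmax) rows; it is slower than a GPU kernel and is not the repository's implementation. Any other cupy-based component would still need its own replacement.

```python
import numpy as np


def nms_cpu(bbox, score, thresh=0.7):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    bbox = np.asarray(bbox, dtype=np.float32).reshape(-1, 4)
    order = np.argsort(-np.asarray(score, dtype=np.float32))
    area = (bbox[:, 2] - bbox[:, 0]) * (bbox[:, 3] - bbox[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of the current box with every remaining box
        tl = np.maximum(bbox[i, :2], bbox[rest, :2])
        br = np.minimum(bbox[i, 2:], bbox[rest, 2:])
        inter = np.prod(np.clip(br - tl, 0, None), axis=1)
        iou = inter / (area[i] + area[rest] - inter)
        order = rest[iou <= thresh]  # drop boxes that overlap too much
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
kept = nms_cpu(boxes, [0.9, 0.8, 0.7], thresh=0.5)
```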

nms_gpu_post installation error

phd@phd-HP-xw8600-Workstation:~/PycharmProjects/Examples/simple-faster-rcnn-pytorch/model/utils/nms$ python3 build.py build_ext --inplace
running build_ext
skipping '_nms_gpu_post.c' Cython extension (up-to-date)
building '_nms_gpu_post' extension
gcc -pthread -B /home/phd/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/phd/anaconda3/include/python3.6m -c _nms_gpu_post.c -o build/temp.linux-x86_64-3.6/_nms_gpu_post.o
_nms_gpu_post.c:525:31: fatal error: numpy/arrayobject.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1

My Python was installed with Anaconda. I now get this error, even though numpy is installed.

model/utils/creator_tool.py, line 259

Hi,

can you please explain to me why are you using the following function to compute gt_argmax_ious?

gt_argmax_ious = np.where(ious == gt_max_ious)[0]

If I understand it correctly, gt_argmax_ious should be an array of size K (K is the number of gt boxes) containing the indexes of the rows of ious (N x K matrix) corresponding to the maximum values.

Isn't this achieved a couple of lines above? (gt_argmax_ious = ious.argmax(axis=0))

To my understanding, where is used to return an ordered list of the indexes, but what exactly is the point if said indexes are only used to fill the vector of positive examples ( label[gt_argmax_ious] = 1, line 209 ).

Thank you.
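
One difference between the two expressions is how ties are handled: argmax returns a single row per ground-truth box, while the np.where form returns every anchor whose IoU equals that box's maximum. A small worked example with made-up IoU values:

```python
import numpy as np

# rows are anchors, columns are ground-truth boxes (values are illustrative)
ious = np.array([
    [0.5, 0.1],
    [0.5, 0.2],
    [0.3, 0.2],
])

first_only = ious.argmax(axis=0)               # one anchor per gt box
gt_max_ious = ious[first_only, np.arange(ious.shape[1])]
with_ties = np.where(ious == gt_max_ious)[0]   # every anchor that ties for a maximum
```

Here anchors 0 and 1 tie for the first box and anchors 1 and 2 tie for the second, so the np.where form marks all three anchors while argmax marks only two.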

NMS Anaconda3 build failed

blake@blake-ubuntu:~/cv/object_detection/faster-rcnn/simple-faster-rcnn-pytorch/model/utils/nms$ python build.py build_ext --inplace
running build_ext
cythoning _nms_gpu_post.pyx to _nms_gpu_post.c
building '_nms_gpu_post' extension
creating build
creating build/temp.linux-x86_64-3.5
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Wformat -Wformat-security -D_FORTIFY_SOURCE=2 -fstack-protector -O3 -fpic -fPIC -fPIC -I/home/blake/anaconda3/include/python3.5m -c _nms_gpu_post.c -o build/temp.linux-x86_64-3.5/_nms_gpu_post.o
_nms_gpu_post.c:266:31: fatal error: numpy/arrayobject.h: No such file or directory
#include "numpy/arrayobject.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
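
This failure and the earlier nms_gpu_post report both stop at the same missing header, numpy/arrayobject.h, which usually means the compiler was not told where NumPy's C headers live. A sketch of the usual remedy, assuming build.py defines a setuptools or distutils Extension: pass numpy.get_include() as an include directory. The keyword values below are assumptions about that script and are shown as a plain dict to keep the example self-contained.

```python
import os

import numpy as np

# Keyword arguments that would be passed to Extension(...) in build.py.
ext_kwargs = dict(
    name="_nms_gpu_post",
    sources=["_nms_gpu_post.pyx"],
    include_dirs=[np.get_include()],  # directory that contains numpy/arrayobject.h
)

header = os.path.join(ext_kwargs["include_dirs"][0], "numpy", "arrayobject.h")
header_found = os.path.exists(header)
```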

thank you, and report bugs of the demo.ipynb

cell 1: "util" should be "utils"
cell 2: no folder named "assets" with demo.jpg inside
cell 3: "PermissionError: [Errno 13] Permission denied: '/home/username/.torch/models/vgg16-397923af.pth' " (ubuntu 16.04)

	self.classifier = classifier
	self.cls_loc = nn.Linear(4096, n_class * 4)
	self.score = nn.Linear(4096, n_class)
