chenyuntc / simple-faster-rcnn-pytorch
A simplified implementation of Faster R-CNN that replicates the performance of the original paper
License: Other
When I try to train on VOC, I get this error:
from collections import namedtuple
faster_rcnn = FasterRCNNVGG16()
File "/home/yangshao/workspace/faster_rcnn/model/faster_rcnn_vgg16.py", line 51, in __init__
head = VGG16RoIHead(n_class=n_fg_class+1,roi_size=7, spatial_scale=1./self.feat_stride, classifier=classifier)
File "/home/yangshao/workspace/faster_rcnn/model/faster_rcnn_vgg16.py", line 29, in __init__
self.roi = RoIPooling2D(self.roi_size, self.roi_size, self.spatial_scale)
File "/home/yangshao/workspace/faster_rcnn/model/roi_module.py", line 86, in __init__
self.RoI = RoI(outh, outw, spatial_scale)
File "/home/yangshao/workspace/faster_rcnn/model/roi_module.py", line 35, in __init__
self.forward_fn = load_kernel('roi_forward', kernel_forward)
File "cupy/util.pyx", line 34, in cupy.util.memoize.decorator.ret
File "cupy/cuda/device.pyx", line 19, in cupy.cuda.device.get_device_id
File "cupy/cuda/runtime.pyx", line 164, in cupy.cuda.runtime.getDevice
File "cupy/cuda/runtime.pyx", line 136, in cupy.cuda.runtime.check_status
cupy.cuda.runtime.CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected
I installed cupy with command:
pip install cupy-cuda80
If I run python3 build.py build_ext --inplace, I get this error:
running build_ext
skipping '_nms_gpu_post.c' Cython extension (up-to-date)
building '_nms_gpu_post' extension
gcc -pthread -B /root/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/root/anaconda3/include/python3.6m -c _nms_gpu_post.c -o build/temp.linux-x86_64-3.6/_nms_gpu_post.o
_nms_gpu_post.c:485:31: fatal error: numpy/arrayobject.h: No such file or directory
#include "numpy/arrayobject.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
I'm in an nvidia-docker environment with cupy 2.2.0, CUDA 8.0, Python 3.6.3 and torch 0.3.0.
I look forward to your help.
Thanks.
First of all, thanks for your code. When I run the command python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain, I get the output below. When I searched for the problem, someone said that the torchvision version should be 0.1.7. My torchvision was 0.2.0, but after installing torchvision 0.1.8 I still get the following:
`======user config========
{'caffe_pretrain': True,
'caffe_pretrain_path': '/root/.torch/models/vgg16-00b39a1b.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'fasterrcnn-caffe',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 600,
'num_workers': 8,
'plot_every': 100,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 10000,
'test_num_workers': 8,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/data04/data/VOCdevkit/VOC2007/',
'weight_decay': 0.0005}
==========end============
load data
Traceback (most recent call last):
File "train.py", line 131, in <module>
fire.Fire()
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 65, in train
faster_rcnn = FasterRCNNVGG16()
File "/home/work/zhx/simple-faster-rcnn-pytorch/model/faster_rcnn_vgg16.py", line 62, in __init__
extractor, classifier = decom_vgg16()
File "/home/work/zhx/simple-faster-rcnn-pytorch/model/faster_rcnn_vgg16.py", line 16, in decom_vgg16
model.load_state_dict(t.load(opt.caffe_pretrain_path))
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 522, in load_state_dict
.format(name))
KeyError: 'unexpected key "classifier.1.weight" in state_dict'
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True`
See https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/model/faster_rcnn.py#L273
The Fast RCNN paper says:
A momentum of 0.9 and parameter decay of 0.0005 (on weights and biases) are used.
The Faster RCNN paper says:
We use a momentum of 0.9 and a weight decay of 0.0005 [37].
And the original code does not set a special weight decay in the prototxt.
From the picture at the end of the README, I see that there should be two fc layers after the RoI pooling of Fast R-CNN, before the two sibling classification and bbox regression networks. However, I see only one fc layer (the classifier object obtained from the original VGG network) in the code here:
simple-faster-rcnn-pytorch/model/faster_rcnn_vgg16.py
Lines 104 to 106 in ce5cf5e
When I run train.py, a ConnectionError occurs after the data is loaded:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f014c50b438>: Failed to establish a new connection: [Errno 111] Connection refused',))
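A "Connection refused" on localhost:8097 suggests that nothing is listening on the visdom port. The visdom server is normally started separately (for example with python -m visdom.server) before train.py is run. A minimal sketch of a pre-flight check follows, assuming the default host and port shown in the user config; port_is_open is a hypothetical helper, not part of the repository.

```python
import socket


def port_is_open(host="localhost", port=8097, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if not port_is_open():
        print("visdom does not seem to be running; "
              "start it with `python -m visdom.server` and retry")
```

Calling a check like this at the top of the training script would turn the long requests traceback into a one-line hint.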
I found that the train dataloader and the test dataloader return different ground-truth bboxes even for the same input image. Why?
When I run python3 train.py train --env='fasterrcnn_caffe' --plot_every=100 --caffe_pretrain, an error occurred like this:
Traceback (most recent call last):
File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/__init__.py", line 11, in <module>
from cupy import core # NOQA
File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/core/__init__.py", line 1, in <module>
from cupy.core import core # NOQA
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 9, in <module>
from model import FasterRCNNVGG16
File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/__init__.py", line 1, in <module>
from .faster_rcnn_vgg16 import FasterRCNNVGG16
File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/faster_rcnn_vgg16.py", line 4, in <module>
from model.region_proposal_network import RegionProposalNetwork
File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/region_proposal_network.py", line 7, in <module>
from model.utils.creator_tool import ProposalCreator
File "/home/htu/yc/simple-faster-rcnn-pytorch-master/model/utils/creator_tool.py", line 2, in <module>
import cupy as cp
File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/__init__.py", line 32, in <module>
six.reraise(ImportError, ImportError(msg), exc_info[2])
File "/home/htu/anaconda3/lib/python3.6/site-packages/six.py", line 692, in reraise
raise value.with_traceback(tb)
File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/__init__.py", line 11, in <module>
from cupy import core # NOQA
File "/home/htu/anaconda3/lib/python3.6/site-packages/cupy/core/__init__.py", line 1, in <module>
from cupy.core import core # NOQA
ImportError: CuPy is not correctly installed.
If you are using wheel distribution (cupy-cudaXX), make sure that the version of CuPy you installed matches with the version of CUDA on your host.
Also, confirm that only one CuPy package is installed:
$ pip freeze
If you are building CuPy from source, please check your environment, uninstall CuPy and reinstall it with:
$ pip install cupy --no-cache-dir -vvvv
Check the Installation Guide for details:
https://docs-cupy.chainer.org/en/latest/install.html
original error: libcuda.so.1: cannot open shared object file: No such file or directory
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
python 3.6.4
cupy-cuda80 4.0.0
Cython 0.27.3
torch 0.4.0
cuda 8.0
cudnn v5.1
The version of cupy I installed matches the version of CUDA. Why does this error occur?
Hi,
I notice that in the configuration file utils/config.py you set the number of training epochs.
However, in train.py you break out of the training loop when it reaches epoch 13.
I set 15 epochs for training, but it stopped at epoch 13, which wasted a lot of my time.
Please remove the break at epoch 13 in train.py.
Thanks.
How to solve this problem?
When I use visdom, I get "PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.5/dist-packages/visdom/static/fonts'", but I cannot find this file. What can I do? Or do you have this file? Thank you.
Can I confirm that for training, the input bounding boxes are (y1, x1, y2, x2), and that for faster_rcnn's predict method, the returned bounding boxes are (x1, y1, x2, y2)?
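Whichever ordering each side uses, converting between the two is only a matter of swapping each coordinate pair. A minimal sketch in plain Python follows; swap_xy is a hypothetical helper, not a function from the repository, and it makes no claim about which ordering predict actually returns.

```python
def swap_xy(bboxes):
    """Swap (y1, x1, y2, x2) <-> (x1, y1, x2, y2) for a list of boxes.

    The same function converts in both directions, since it only
    exchanges the two members of each coordinate pair.
    """
    return [(b[1], b[0], b[3], b[2]) for b in bboxes]


yxyx = [(10, 20, 110, 220)]   # (y1, x1, y2, x2)
xyxy = swap_xy(yxyx)          # (x1, y1, x2, y2)
```

Applying the function twice returns the original boxes, which makes it easy to sanity-check a plotting or evaluation pipeline.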
Hi, I've implemented the ResNet-101 structure on top of the VGG16 network, but the mAP on the VOC dataset only reaches 0.62 after 20 epochs.
Do you have any idea what the problem would be? You can find the code here. Thank you.
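How do I install torchnet on Windows 10?
Hello, from within China the pretrained model can only be downloaded through a VPN. My compute machine is a remote server in China and cannot get around the firewall. Could you provide another way to download it, such as Baidu Cloud?
Many thanks.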
See https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/train.py#L118
I think optimizer.param_groups[0] will sometimes be the bias parameter group.
HI @chenyuntc,
I have created my own custom dataset.
While training the model I am getting the following error:
I suspect this is because of a label that is wrongly defined in one of the annotation files.
How can I debug this to find which file has that label?
I tried printing the file id in the get_example function of voc_dataset.py, but the code breaks as soon as it encounters the wrong label and doesn't enter this function.
I know this issue is closely tied to my own dataset, but it would be great if you could give me some leads.
Regards,
Mahavir
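One way to locate a bad label without going through the training loop is to scan the annotation files directly. A minimal sketch follows, assuming VOC-style XML annotations in which each object element has a name child; find_unknown_labels is a hypothetical helper, not part of the repository, and the comparison is exact after stripping whitespace, so adjust it if your loader also normalises case.

```python
import os
import xml.etree.ElementTree as ET


def find_unknown_labels(annotation_dir, known_labels):
    """Return (file name, label) pairs for labels not in known_labels."""
    known = set(known_labels)
    problems = []
    for fname in sorted(os.listdir(annotation_dir)):
        if not fname.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(annotation_dir, fname)).getroot()
        for obj in root.findall("object"):
            # A missing <name> element is reported as an empty label.
            name = (obj.findtext("name") or "").strip()
            if name not in known:
                problems.append((fname, name))
    return problems
```

Running it once over the annotation directory with the class names the dataset expects lists every offending file and label in one pass.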
In other words, the Dataset would return only the labels and gt_bboxes that are not marked difficult, so that difficult objects no longer need to be handled anywhere else.
If you have tried it, I would like to know whether it is equivalent to handling difficult objects in the other places.
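The proposal can be sketched as a filter applied once, when an example is loaded. This is a minimal plain-Python sketch; drop_difficult and the list-based layout are assumptions for illustration, not the repository's dataset code.

```python
def drop_difficult(bboxes, labels, difficult):
    """Keep only the boxes and labels whose difficult flag is falsy."""
    keep = [i for i, d in enumerate(difficult) if not d]
    return [bboxes[i] for i in keep], [labels[i] for i in keep]


bboxes = [(0, 0, 10, 10), (5, 5, 20, 20), (1, 2, 3, 4)]
labels = [3, 7, 3]
difficult = [False, True, False]
kept_bboxes, kept_labels = drop_difficult(bboxes, labels, difficult)
```

Whether this is equivalent depends on where the flag is used. In the PASCAL VOC evaluation protocol, detections matched to difficult objects are ignored rather than counted as false positives, so dropping those objects before evaluation would likely change the score.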
In PyTorch 0.4, loss values can only be accessed through tensor.item() instead of indexing with [0], as is done in get_meter_data in train.py. This matters because with the old indexing, get_meter_data returns a tensor object, which breaks visdom's jsonify encoder.
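The idea can be sketched as a small conversion step applied before the values are handed to visdom. This is a minimal sketch, not the repository's get_meter_data; to_scalar and meters_to_plain are hypothetical names, and the only assumption about the input is that tensor-like values expose an item() method.

```python
def to_scalar(value):
    """Return a plain Python float for a number or a one-element tensor-like."""
    # PyTorch 0.4+ tensors expose .item(); plain Python numbers do not.
    if hasattr(value, "item"):
        return float(value.item())
    return float(value)


def meters_to_plain(meter_values):
    """Convert a {name: value} dict into JSON-friendly floats."""
    return {name: to_scalar(v) for name, v in meter_values.items()}
```

Converting at the boundary keeps the rest of the training code free to work with tensors while the plotting side only ever sees plain floats.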
Hi @chenyuntc,
How can we disable the CUDA option while using the predict function? Is there a flag for doing so?
I want to use it in CPU mode.
trainer.faster_rcnn.predict(img,visualize=True)
I am new to pytorch.
Regards,
Mahavir
Hi @chenyuntc,
Can I use the same model (chainer_best_model_converted_to_pytorch_0.7053.pth) for detecting objects in a video? If yes, can you give me some leads? What changes do I need to make in the code?
Regards,
Mahavir
======user config========
{'caffe_pretrain': False,
'caffe_pretrain_path': '/vgg16_caffe.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'fasterrcnn-caffe',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 600,
'num_workers': 8,
'plot_every': 100,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 10000,
'test_num_workers': 8,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/home/chenzw/tensor/Faster-rcnn-hoi/hico_20160224_det/',
'weight_decay': 0.0005}
==========end============
loading dataset
model construct completed
1it [00:01, 1.78s/it]/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [36,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [37,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [38,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [39,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [40,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [41,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [42,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [43,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [60,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [61,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [62,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [63,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [96,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [97,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [98,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [99,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [100,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [101,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [102,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [103,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [104,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [105,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [106,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [107,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [108,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [109,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [110,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [111,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [112,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [113,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [114,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [115,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [116,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [117,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [118,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [119,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [120,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [121,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [122,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [123,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [0,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [1,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [2,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [3,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [4,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [5,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [6,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [7,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [8,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [9,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [10,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [11,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [12,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [13,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [14,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [15,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [28,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [29,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [30,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [31,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [72,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [73,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [74,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [75,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [76,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [77,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [78,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [79,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [80,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [81,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [82,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [83,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [84,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [85,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [86,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [87,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [92,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [93,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [94,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/pytorch/torch/lib/THC/THCTensorIndex.cu:417: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [95,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCTensorCopy.c line=21 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train3.py", line 137, in <module>
fire.Fire()
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train3.py", line 84, in train
trainer.train_step(img, bbox, label, scale)
File "/home/chenzw/tensor/Faster-rcnn-hoi/trainer.py", line 168, in train_step
losses = self.forward(imgs, bboxes, labels, scale)
File "/home/chenzw/tensor/Faster-rcnn-hoi/trainer.py", line 148, in forward
gt_roi_label = at.tovariable(gt_roi_label).long()
File "/home/chenzw/tensor/Faster-rcnn-hoi/utils/array_tool.py", line 31, in tovariable
return tovariable(totensor(data))
File "/home/chenzw/tensor/Faster-rcnn-hoi/utils/array_tool.py", line 25, in totensor
tensor = tensor.cuda()
File "/home/chenzw/anaconda3/envs/tensor/lib/python3.5/site-packages/torch/_utils.py", line 69, in cuda
return new_type(self.size()).copy(self, async)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorCopy.c:21
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
This line in faster_rcnn.py (line 224)
img = t.autograd.Variable(at.totensor(img).float()[None], volatile=True)
raises a warning: "UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead."
I tried replacing "Variable" with "no_grad", but that produces another error: "TypeError: init() takes exactly 1 argument (2 given)".
My PyTorch version is 0.4.0. I would appreciate your answer and tips.
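For what it's worth, `torch.no_grad()` is a context manager rather than a wrapper around a tensor, which is probably why passing the tensor to it fails. Below is a minimal sketch of the pattern on the CPU. The helper `prepare_image` is a hypothetical stand-in for `at.totensor(img).float()[None]`, and the small convolution is a toy model used only so the snippet runs on its own.

```python
import numpy as np
import torch


def prepare_image(img):
    """Hypothetical stand-in for at.totensor(img).float()[None]."""
    return torch.from_numpy(np.asarray(img)).float()[None]


# Toy model, used only to make the sketch runnable.
model = torch.nn.Conv2d(3, 4, kernel_size=3, padding=1)
model.eval()

img = np.zeros((3, 8, 8), dtype=np.float32)

# no_grad() takes no tensor argument; it wraps the code that should not
# record gradients, which is the role volatile=True used to play.
with torch.no_grad():
    batch = prepare_image(img)
    out = model(batch)

print(tuple(out.shape), out.requires_grad)
```

With this pattern the `Variable(..., volatile=True)` wrapper is no longer needed; the tensor is passed to the model directly inside the `with` block.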
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #3 'other'
I see that you use a custom computational formula. The original paper does not use the custom sigma.
Hi, I tried to run the code, but it seems that array_tool.py is missing.
The learning-rate decay is hard-coded in train.py:
when training reaches epoch 9, the learning rate is decayed.
This is not elegant, and it wastes people's time if they do not notice the setting.
Please move this hard-coded learning-rate step size into config.py.
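One way this request could look is sketched below. The option name `lr_decay_epochs` and the helper `lr_for_epoch` are hypothetical and not part of the repository; the default values for `lr` and `lr_decay` are the ones printed in the user config elsewhere on this page. Whether the decay should apply at the start or the end of epoch 9 is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class Config:
    lr: float = 1e-3
    lr_decay: float = 0.1
    # Hypothetical option: epochs at which lr is multiplied by lr_decay.
    lr_decay_epochs: tuple = (9,)


def lr_for_epoch(cfg, epoch):
    """Return the learning rate to use during the given epoch."""
    n_decays = sum(1 for e in cfg.lr_decay_epochs if epoch >= e)
    return cfg.lr * cfg.lr_decay ** n_decays


cfg = Config()
for epoch in (0, 8, 9, 13):
    print(epoch, lr_for_epoch(cfg, epoch))
```

Keeping the schedule in the config object means a different decay point can be set from the command line instead of editing the training loop.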
Hello, I would like to ask: have you tried to implement the mask yourself? Or what resources would you recommend for Mask R-CNN?
Thank you for your effort. I wonder whether pos_roi_per_this_image should be neg_roi_per_this_image in model/utils/creator_tool.py#L125?
Hello,
First of all, thanks for your great efforts!
Is it possible to extract features for specific boxes, or even fine-tune the Faster R-CNN based on those generated features? For example, say you have annotated boxes for visual QA and would like to use Faster R-CNN as a feature extractor, and even fine-tune the pretrained model. Any tips in this direction?
So I've tried running the code on my own dataset, but it crashes randomly after a few thousand iterations. I'm having a hard time debugging this, but here's the output:
load data
model construct completed
99it [00:18, 5.41it/s]model/faster_rcnn.py:223: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad():
instead.
img = t.autograd.Variable(at.totensor(img).float()[None], volatile=True)
279it [00:49, 5.62it/s]model/utils/bbox_tools.py:183: RuntimeWarning: invalid value encountered in divide
return area_i / (area_a[:, None] + area_b - area_i)
model/utils/creator_tool.py:108: RuntimeWarning: invalid value encountered in greater_equal
pos_index = np.where(max_iou >= self.pos_iou_thresh)[0]
model/utils/creator_tool.py:116: RuntimeWarning: invalid value encountered in less
neg_index = np.where((max_iou < self.neg_iou_thresh_hi) &
model/utils/creator_tool.py:117: RuntimeWarning: invalid value encountered in greater_equal
(max_iou >= self.neg_iou_thresh_lo))[0]
model/utils/bbox_tools.py:138: RuntimeWarning: divide by zero encountered in log
dh = xp.log(base_height / height)
model/utils/bbox_tools.py:139: RuntimeWarning: divide by zero encountered in log
dw = xp.log(base_width / width)
280it [00:49, 5.62it/s]model/utils/creator_tool.py:411: RuntimeWarning: invalid value encountered in greater_equal
keep = np.where((hs >= min_size) & (ws >= min_size))[0]
12880it [28:03, 7.65it/s]
2097it [02:02, 17.14it/s]Traceback (most recent call last):
File "train.py", line 123, in
fire.Fire()
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 127, in Fire
component_trace = Fire(component, args, context, name)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 366, in Fire
component, remaining_args)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 542, in CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 103, in train
eval_result = eval(test_dataloader, faster_rcnn, test_num=opt.test_num)
File "train.py", line 29, in eval
pred_bboxes, pred_labels, pred_scores = faster_rcnn.predict(imgs, [sizes])
File "model/faster_rcnn.py", line 225, in predict
roi_cls_loc, roi_scores, rois, _ = self(img, scale=scale)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "model/faster_rcnn.py", line 126, in forward
h, rois, roi_indices)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "model/faster_rcnn_vgg16.py", line 136, in forward
rois = at.totensor(rois).float()
File "model/utils/array_tool.py", line 25, in totensor
tensor = tensor.cuda()
File "/usr/local/lib/python2.7/dist-packages/torch/_utils.py", line 69, in cuda
return new_type(self.size()).copy(self, async)
File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 172, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 24699) is killed by signal: Killed.
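The "invalid value encountered in divide" and "divide by zero encountered in log" warnings above are consistent with ground-truth boxes that have zero height or width, although that is only a guess from the log. A minimal sketch of a filter that could be applied when loading a custom dataset follows. The function name `drop_degenerate_boxes` is hypothetical, and the (y_min, x_min, y_max, x_max) box order is an assumption. This does not address the final "killed by signal" error, which may have a separate cause.

```python
import numpy as np


def drop_degenerate_boxes(bbox, label, min_side=1.0):
    """Keep only boxes whose height and width are at least min_side pixels.

    Assumes bbox has shape (R, 4) in (y_min, x_min, y_max, x_max) order.
    """
    bbox = np.asarray(bbox, dtype=np.float32).reshape(-1, 4)
    label = np.asarray(label)
    heights = bbox[:, 2] - bbox[:, 0]
    widths = bbox[:, 3] - bbox[:, 1]
    keep = (heights >= min_side) & (widths >= min_side)
    return bbox[keep], label[keep]


bbox = np.array([[10, 10, 50, 60],   # valid box
                 [20, 30, 20, 80],   # zero height
                 [5, 40, 25, 40]])   # zero width
label = np.array([0, 1, 2])
clean_bbox, clean_label = drop_degenerate_boxes(bbox, label)
print(clean_bbox, clean_label)
```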
Hi,
is there a specific reason why score_threshold is set to 0.05 when evaluating?
I understand that setting it to 0.7 when visualizing keeps only strong positives, but why is the score set so low (0.05) when computing the mAP? Is that value simply derived by dividing 1 by the number of classes (20 in this case)? Should this value be any different when training the network on a different number of classes?
Thank you.
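A general observation that may be relevant: average precision is computed over the whole precision-recall curve, so discarding low-scoring detections cuts the curve off at a lower recall. The sketch below illustrates this with made-up detections for a single class. The function `average_precision` is a simplified all-point AP written for this example, not the repository's `eval_detection_voc`, and whether this is the author's actual reason for choosing 0.05 is an assumption.

```python
import numpy as np


def average_precision(scores, is_tp, n_gt, score_thresh):
    """Simplified all-point AP for one class after score thresholding."""
    scores = np.asarray(scores, dtype=float)
    is_tp = np.asarray(is_tp, dtype=bool)
    keep = scores >= score_thresh
    order = np.argsort(-scores[keep])
    tp = is_tp[keep][order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(~tp)
    recall = cum_tp / n_gt
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1)
    # Pad the curve and make precision monotonically non-increasing.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the area under the curve where recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


# Made-up detections: five boxes, four ground-truth objects.
scores = [0.9, 0.8, 0.6, 0.3, 0.1]
is_tp = [True, False, True, True, True]

ap_low = average_precision(scores, is_tp, n_gt=4, score_thresh=0.05)
ap_high = average_precision(scores, is_tp, n_gt=4, score_thresh=0.7)
print(ap_low, ap_high)
```

In this toy case the high threshold removes three true positives, so recall stops early and the AP is lower.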
When I run train.py, the following error occurs:
Traceback (most recent call last):
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.5/http/client.py", line 1106, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
self.endheaders(body)
File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
self.send(msg)
File "/usr/lib/python3.5/http/client.py", line 877, in send
self.connect()
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9c8347a2b0>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/htu/.local/lib/python3.5/site-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/htu/.local/lib/python3.5/site-packages/urllib3/util/retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9c8347a2b0>: Failed to establish a new connection: [Errno 111] Connection refused',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/visdom/__init__.py", line 261, in _send
data=json.dumps(msg),
File "/home/htu/.local/lib/python3.5/site-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/home/htu/.local/lib/python3.5/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/home/htu/.local/lib/python3.5/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/home/htu/.local/lib/python3.5/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/home/htu/.local/lib/python3.5/site-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /events (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9c8347a2b0>: Failed to establish a new connection: [Errno 111] Connection refused',))
0it [00:00, ?it/s]
Traceback (most recent call last):
File "train.py", line 131, in <module>
fire.Fire()
File "/home/htu/.local/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/htu/.local/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/htu/.local/lib/python3.5/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 78, in train
scale = at.scalar(scale)
File "/home/htu/yc/simple-faster-rcnn-pytorch-master/utils/array_tool.py", line 43, in scalar
if isinstance(data, t._TensorBase):
AttributeError: module 'torch' has no attribute '_TensorBase'
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at [email protected]
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
%config Application.verbose_crash=True
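The final AttributeError suggests that `torch._TensorBase` does not exist in the installed PyTorch version. A minimal sketch of a version of `scalar` that avoids the private class is shown below; it relies on `torch.is_tensor` and `Tensor.item()`, and it is an illustration rather than the repository's own fix.

```python
import numpy as np
import torch


def scalar(data):
    """Return a Python number from a one-element array or tensor."""
    if isinstance(data, np.ndarray):
        # reshape(1) raises if the array holds more than one element.
        return data.reshape(1)[0].item()
    if torch.is_tensor(data):
        # item() raises if the tensor holds more than one element.
        return data.item()
    return data


print(scalar(np.array([2.0])), scalar(torch.tensor([1.5])), scalar(3))
```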
resource.setrlimit(resource.RLIMIT_NOFILE, (20480, rlimit[1]))
20480 is too large for my system. When I change it to 4096, the code runs.
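A fixed value like 20480 can exceed the hard limit on some systems, which would explain the failure. The sketch below caps the requested soft limit at the hard limit instead. The function name `raise_nofile_limit` is hypothetical, and it uses only the standard-library `resource` module (Unix only).

```python
import resource


def raise_nofile_limit(wanted=20480):
    """Raise the soft open-file limit toward `wanted`, never past the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    # Only raise the limit; leave an already higher soft limit untouched.
    if soft != resource.RLIM_INFINITY and soft < target:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            pass  # keep the current limit if the system refuses the change
    return resource.getrlimit(resource.RLIMIT_NOFILE)


print(raise_nofile_limit())
```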
Hi, I want to train an object detector that detects tiny objects. Which parameters should I change?
For example, the image size is (500, 375) and the bbox is (166, 477, 188, 487). Thanks.
Hi, I installed everything and then ran python3 train.py,
but nothing was shown in visdom and the following was printed in the terminal:
os: <module 'os' from '/root/anaconda3/lib/python3.6/os.py'>
ipdb: <module 'ipdb' from '/root/anaconda3/lib/python3.6/site-packages/ipdb/init.py'>
matplotlib: <module 'matplotlib' from '/root/anaconda3/lib/python3.6/site-packages/matplotlib/init.py'>
tqdm: <class 'tqdm.tqdm.tqdm'>
opt: <utils.config.Config object at 0x7f6ae060cac8>
Dataset: <class 'data.dataset.Dataset'>
TestDataset: <class 'data.dataset.TestDataset'>
inverse_normalize: <function inverse_normalize at 0x7f6ae060e598>
FasterRCNNVGG16: <class 'model.faster_rcnn_vgg16.FasterRCNNVGG16'>
Variable: <class 'torch.autograd.variable.Variable'>
data: <module 'torch.utils.data' from '/root/anaconda3/lib/python3.6/site-packages/torch/utils/data/init.py'>
FasterRCNNTrainer: <class 'trainer.FasterRCNNTrainer'>
at: <module 'utils.array_tool' from '/home/garcons/simple-faster-rcnn-pytorch/utils/array_tool.py'>
visdom_bbox: <function visdom_bbox at 0x7f6a903a1620>
eval_detection_voc: <function eval_detection_voc at 0x7f6a9010f598>
resource: <module 'resource' from '/root/anaconda3/lib/python3.6/lib-dynload/resource.cpython-36m-x86_64-linux-gnu.so'>
rlimit: [524288, 1048576]
eval: <function eval at 0x7f6afe032e18>
train: <function train at 0x7f6a9010f840>
fire: <module 'fire' from '/root/anaconda3/lib/python3.6/site-packages/fire/init.py'>
With trainer.py, nothing happened.
What is the matter?
thanks!
I ran your demo code and got the following error.
`---------------------------------------------------------------------------
RecursionError Traceback (most recent call last)
in ()
1 trainer.load('pretrain/torchvision_pretrain.pth')
2 opt.caffe_pretrain=False # this model was trained from torchvision-pretrained model
----> 3 _bboxes, _labels, _scores = trainer.faster_rcnn.predict(img,visualize=True)
4 vis_bbox(at.tonumpy(img[0]),
5 at.tonumpy(_bboxes[0]),
/home/shared/project/simple-faster-rcnn-pytorch/model/faster_rcnn.py in predict(self, imgs, sizes, visualize)
212 for img in imgs:
213 size = img.shape[1:]
--> 214 img = preprocess(at.tonumpy(img))
215 prepared_imgs.append(img)
216 sizes.append(size)
/home/shared/project/simple-faster-rcnn-pytorch/utils/array_tool.py in tonumpy(data)
12 # return data.cpu().numpy()
13 if isinstance(data, t.autograd.Variable):
---> 14 return tonumpy(data.data)
15
16
... last 1 frames repeated, from the frame below ...
/home/shared/project/simple-faster-rcnn-pytorch/utils/array_tool.py in tonumpy(data)
12 # return data.cpu().numpy()
13 if isinstance(data, t.autograd.Variable):
---> 14 return tonumpy(data.data)
15
16
RecursionError: maximum recursion depth exceeded`
Thanks.
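A likely cause, assuming PyTorch 0.4 or later is installed: `Variable` and `Tensor` were merged, so `data.data` is still matched by `isinstance(data, t.autograd.Variable)` and the recursion never ends. A minimal sketch of a `tonumpy` that does not recurse is shown below; it is an illustration, not the repository's own code.

```python
import numpy as np
import torch


def tonumpy(data):
    """Convert an array-like or tensor to a NumPy array without recursion."""
    if isinstance(data, np.ndarray):
        return data
    if torch.is_tensor(data):
        # detach() drops the autograd history; cpu() is a no-op for CPU tensors.
        return data.detach().cpu().numpy()
    return np.asarray(data)


x = torch.ones(2, 3, requires_grad=True)
arr = tonumpy(x)
print(type(arr), arr.shape)
```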
Hi,
I see that in your dataset class you are horizontally flipping the images and their corresponding bounding box labels, but instead of appending them to the original set you are replacing them. Link
I have two questions:
P.S.: I will be happy to send a PR for data augmentation.
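For reference, a random horizontal flip applied on the fly could be sketched as below. Because the flip is drawn again each time a sample is loaded, the network can see both orientations over several epochs without the flipped copies being appended to the dataset. The function `random_hflip` is hypothetical, and the (C, H, W) image layout and (y_min, x_min, y_max, x_max) box order are assumptions.

```python
import numpy as np


def random_hflip(img, bbox, rng, p=0.5):
    """Flip a (C, H, W) image and its (R, 4) boxes horizontally with probability p."""
    if rng.random() >= p:
        return img, bbox
    width = img.shape[2]
    flipped_img = img[:, :, ::-1]
    flipped_bbox = bbox.copy()
    # x_min and x_max swap roles after mirroring around the image width.
    flipped_bbox[:, 1] = width - bbox[:, 3]
    flipped_bbox[:, 3] = width - bbox[:, 1]
    return flipped_img, flipped_bbox


rng = np.random.default_rng(0)
img = np.zeros((3, 4, 10), dtype=np.float32)
bbox = np.array([[1.0, 2.0, 3.0, 5.0]])
_, flipped = random_hflip(img, bbox, rng, p=1.0)  # p=1.0 always flips
print(flipped)
```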
When I run python3 train.py train, I get:
======user config========
{'caffe_pretrain': False,
'caffe_pretrain_path': '/home/garcons/simple-faster-rcnn-pytorch/fasterrcnn_12211511_0.701052458187_torchvision_pretrain.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'faster-rcnn',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 400,
'num_workers': 4,
'plot_every': 40,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 1000,
'test_num_workers': 4,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/home/garcons/simple-faster-rcnn-pytorch/garconsdata/',
'weight_decay': 0.0005}
==========end============
load data
model construct completed
0it [00:00, ?it/s]/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [32,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [33,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
...
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [31,0,0] Assertion indexAtDim < data.baseSizes[dim]
failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorIndex.cu line=648 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train.py", line 130, in
fire.Fire()
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 80, in train
trainer.train_step(img, bbox, label, scale)
File "/home/garcons/simple-faster-rcnn-pytorch/trainer.py", line 168, in train_step
losses = self.forward(imgs, bboxes, labels, scale)
File "/home/garcons/simple-faster-rcnn-pytorch/trainer.py", line 147, in forward
at.totensor(gt_roi_label).long()]
File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 78, in getitem
return Index.apply(self, key)
File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 87, in forward
result = i.index(ctx.index)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorIndex.cu:648
This error occurs when I train with my own data.
With the pretrained model and the VOC2007 dataset, there was no error like this.
I tried CUDA_LAUNCH_BLOCKING=1 python3 train.py train, but it doesn't help.
How can I fix this error?
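An index assertion like `indexAtDim < data.baseSizes[dim]` is often triggered by a class label that falls outside the range the model was built for, for example labels starting at 1 or a dataset with more classes than the model's class count. That is an assumption about this case, not a confirmed diagnosis. A minimal sketch of a check that could be run on the CPU before training is shown below; `check_labels` is a hypothetical helper.

```python
import numpy as np


def check_labels(labels, n_fg_class):
    """Raise ValueError if any label is outside [0, n_fg_class)."""
    labels = np.asarray(labels)
    bad = labels[(labels < 0) | (labels >= n_fg_class)]
    if bad.size:
        raise ValueError(
            "labels %s are outside the range [0, %d)"
            % (sorted(set(bad.tolist())), n_fg_class)
        )
    return labels


print(check_labels([0, 5, 19], n_fg_class=20))

try:
    check_labels([0, 20], n_fg_class=20)
except ValueError as err:
    print("invalid labels:", err)
```

Running such a check over the dataset on the CPU gives a readable Python error instead of a device-side assert.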
Everything is normal in "train" mode, but evaluation becomes slower and slower after about 3500 images have been tested. Finally the process is killed. Could this be a memory leak?
In addition, I changed line 22 of the code from "20480" to "4096" to avoid an error.
Does anyone have any idea how I could fix this?
Thank you ~~
Hi @chenyuntc,
Thanks for your simplified (simple) implementation of Faster R-CNN in PyTorch.
Following your instructions, I was able to train and test the simple-faster-rcnn-pytorch setup on my system.
Now I have a custom dataset with 36 classes, and I would like to train a Simple-Faster-R-CNN model (VGG/RES101) on it.
I think loading the dataset is the trickiest part; it would be great if you could suggest a tutorial, blog, or link for that.
Where do I get started, and what changes need to be made?
I will make sure I open source the work I do during this process.
Thanks!
So I've encountered a new problem, which happens rarely but at least once per epoch on my custom dataset: the IoUs are empty and the code crashes. What needs to be changed to handle this case elegantly?
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/local/lib/python2.7/dist-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "attack_trainer.py", line 82, in train
trainer.train_step(img, bbox, label, scale)
File "trainer.py", line 406, in train_step
losses = self.forward(imgs, bboxes, labels, scale)
File "trainer.py", line 362, in forward
img_size)
File "model/utils/creator_tool.py", line 209, in call
inside_index, anchor, bbox)
File "model/utils/creator_tool.py", line 226, in _create_label
self._calc_ious(anchor, bbox, inside_index)
File "model/utils/creator_tool.py", line 260, in _calc_ious
gt_argmax_ious = ious.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence
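NumPy raises this error when `argmax` is taken along an axis of length zero, so the IoU matrix apparently has no rows (no anchors left inside the image) for that sample. One way to guard against it is sketched below. The function `safe_calc_ious` is hypothetical and only mirrors the shape of the computation seen in the traceback; a caller could then skip the sample when the results are empty.

```python
import numpy as np


def safe_calc_ious(ious):
    """Compute argmax statistics of an (N, K) IoU matrix, tolerating empty axes."""
    n_anchor, n_gt = ious.shape
    if n_anchor == 0 or n_gt == 0:
        # Nothing to match: every anchor gets IoU 0 and no anchor is a best match.
        return (np.zeros(n_anchor, dtype=np.int64),
                np.zeros(n_anchor, dtype=ious.dtype),
                np.zeros(0, dtype=np.int64))
    argmax_ious = ious.argmax(axis=1)
    max_ious = ious[np.arange(n_anchor), argmax_ious]
    gt_max_ious = ious.max(axis=0)
    gt_argmax_ious = np.where(ious == gt_max_ious)[0]
    return argmax_ious, max_ious, gt_argmax_ious


empty = safe_calc_ious(np.zeros((0, 3), dtype=np.float32))
normal = safe_calc_ious(np.array([[0.1, 0.7], [0.6, 0.2]]))
print([a.shape for a in empty], [a.tolist() for a in normal])
```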
Hi, is there a model pre-trained on ImageNet using this code?
Thank you for your contribution. Can I extract the RPN module separately, i.e. remove the Faster R-CNN classification steps?
I have some difficulties installing cupy. Is there any way to run this code without cupy?
phd@phd-HP-xw8600-Workstation:~/PycharmProjects/Examples/simple-faster-rcnn-pytorch/model/utils/nms$ python3 build.py build_ext --inplace
running build_ext
skipping '_nms_gpu_post.c' Cython extension (up-to-date)
building '_nms_gpu_post' extension
gcc -pthread -B /home/phd/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/phd/anaconda3/include/python3.6m -c _nms_gpu_post.c -o build/temp.linux-x86_64-3.6/_nms_gpu_post.o
_nms_gpu_post.c:525:31: fatal error: numpy/arrayobject.h: No such file or directory
compilation terminated.
error: command 'gcc' failed with exit status 1
My Python was installed with Anaconda. I now get this error, even though numpy is installed.
Hi,
can you please explain why you are using the following expression to compute gt_argmax_ious?
gt_argmax_ious = np.where(ious == gt_max_ious)[0]
If I understand it correctly, gt_argmax_ious should be an array of size K (K is the number of gt boxes) containing the indexes of the rows of ious (N x K matrix) corresponding to the maximum values.
Isn't this achieved a couple of lines above (gt_argmax_ious = ious.argmax(axis=0))?
To my understanding, where is used to return an ordered list of the indexes, but what exactly is the point if said indexes are only used to fill the vector of positive examples (label[gt_argmax_ious] = 1, line 209)?
Thank you.
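One observable difference between the two expressions is how they treat ties: `argmax` returns only the first row that attains each column maximum, while the `np.where` form returns every row that attains it. The small example below shows this with a made-up IoU matrix; whether handling ties is the author's intent is an assumption.

```python
import numpy as np

# Made-up (N x K) IoU matrix with N = 3 anchors and K = 2 gt boxes.
ious = np.array([[0.5, 0.1],
                 [0.5, 0.3],
                 [0.2, 0.3]])

first_only = ious.argmax(axis=0)                    # one row index per gt box
gt_max_ious = ious.max(axis=0)
all_ties = np.where(ious == gt_max_ious)[0]         # every row that ties the maximum

print(first_only)  # [0 1]
print(all_ties)    # [0 1 1 2]
```

With the `np.where` form, anchor 2 is also marked as a best match for the second gt box, which `argmax` alone would miss.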
blake@blake-ubuntu:~/cv/object_detection/faster-rcnn/simple-faster-rcnn-pytorch/model/utils/nms$ python build.py build_ext --inplace
running build_ext
cythoning _nms_gpu_post.pyx to _nms_gpu_post.c
building '_nms_gpu_post' extension
creating build
creating build/temp.linux-x86_64-3.5
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Wformat -Wformat-security -D_FORTIFY_SOURCE=2 -fstack-protector -O3 -fpic -fPIC -fPIC -I/home/blake/anaconda3/include/python3.5m -c _nms_gpu_post.c -o build/temp.linux-x86_64-3.5/_nms_gpu_post.o
_nms_gpu_post.c:266:31: fatal error: numpy/arrayobject.h: No such file or directory
#include "numpy/arrayobject.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
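This error usually means the compiler was not told where NumPy's C headers live, even when numpy itself is installed. NumPy exposes that directory through `numpy.get_include()`. The sketch below only locates the header and builds the keyword arguments one might pass to an extension definition; the `ext_kwargs` dictionary is a hypothetical illustration, not the contents of the repository's build.py.

```python
import os

import numpy as np

# Directory that contains numpy/arrayobject.h for the installed NumPy.
include_dir = np.get_include()
header = os.path.join(include_dir, "numpy", "arrayobject.h")
print(include_dir, os.path.isfile(header))

# Hypothetical keyword arguments for a setuptools/Cython Extension:
# passing include_dirs lets gcc find "numpy/arrayobject.h".
ext_kwargs = dict(
    name="_nms_gpu_post",
    sources=["_nms_gpu_post.pyx"],
    include_dirs=[include_dir],
)
print(ext_kwargs["include_dirs"])
```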
cell 1: "util" should be "utils"
cell 2: no folder named "assets" with demo.jpg inside
cell 3: "PermissionError: [Errno 13] Permission denied: '/home/username/.torch/models/vgg16-397923af.pth' " (ubuntu 16.04)