retinanet_tensorflow_rotation's People


yangxue0827



retinanet_tensorflow_rotation's Issues


/home/lyy/hq/RetinaNet_Tensorflow_Rotation-master/libs/box_utils/ RuntimeWarning: divide by zero encountered in log
targets_dh = np.log(gt_rois[:, 3] / ex_rois[:, 3])
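
This warning usually means a box height (or width) of zero reached the log. A minimal sketch of one way to guard the ratio, assuming the sizes are plain NumPy arrays; the helper name `safe_log_ratio` and the `eps` value are hypothetical and not part of the repository:

```python
import numpy as np

def safe_log_ratio(gt_sizes, ex_sizes, eps=1e-6):
    """Return log(gt / ex) with both sizes clamped to at least eps.

    Clamping keeps zero-height or zero-width boxes from producing
    -inf or nan. eps is an illustrative value, not a tuned one.
    """
    gt = np.maximum(np.asarray(gt_sizes, dtype=np.float64), eps)
    ex = np.maximum(np.asarray(ex_sizes, dtype=np.float64), eps)
    return np.log(gt / ex)
```

Clamping hides the symptom only. If the zero sizes come from degenerate ground-truth boxes, filtering those boxes out of the annotations may be the better fix.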

compile error

error information:

python build_ext --inplace
running build_ext
skipping 'bbox.c' Cython extension (up-to-date)
skipping 'nms.c' Cython extension (up-to-date)
building 'cython_bbox' extension
creating build
creating build/temp.linux-x86_64-3.7
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -B /home/huangwei/anaconda3/envs/tensorflow-R3det/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/huangwei/anaconda3/envs/tensorflow-R3det/lib/python3.7/site-packages/numpy/core/include -I/home/huangwei/anaconda3/envs/tensorflow-R3det/include/python3.7m -c bbox.c -o build/temp.linux-x86_64-3.7/bbox.o -Wno-cpp -Wno-unused-function
bbox.c: In function ‘__Pyx__ExceptionSave’:
bbox.c:9439:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
*type = tstate->exc_type;
bbox.c:9440:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
*value = tstate->exc_value;
bbox.c:9441:17: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
*tb = tstate->exc_traceback;
bbox.c: In function ‘__Pyx__ExceptionReset’:
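
The log above shows the shipped `bbox.c` being reused ("up-to-date"). Python 3.7 removed the `exc_type`, `exc_value` and `exc_traceback` fields from `PyThreadState`, so C files generated by an older Cython no longer compile. A common remedy is to delete the generated files and rebuild with a recent Cython. The sketch below only lists candidates; the function name is hypothetical, and it assumes generated files start with Cython's usual "Generated by Cython" header:

```python
from pathlib import Path

def stale_cython_outputs(root):
    """List .c/.cpp files under root that were generated from a .pyx.

    A file counts as generated only if it sits next to a .pyx with the
    same stem and its first line carries the Cython header, so
    hand-written sources are left alone.
    """
    found = []
    for pyx in Path(root).rglob("*.pyx"):
        for suffix in (".c", ".cpp"):
            candidate = pyx.with_suffix(suffix)
            if not candidate.exists():
                continue
            with candidate.open(errors="ignore") as fh:
                first_line = fh.readline()
            if "Generated by Cython" in first_line:
                found.append(candidate)
    return sorted(found)
```

After reviewing the list, each path can be removed with `path.unlink()` and the extension rebuilt, so that Cython regenerates the C sources for the installed Python version.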


OutOfRangeError (see above for traceback): PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 8, current size 0). How should I solve this error?

How to test an image

How do I test an image with this network, and where should I download the weight file and put it?
please ...



OOM using RTX 2080

I cropped to 800x800 with overlap=200 and hit OOM after a few steps (~300). I never met this problem when training on your R2CNN project, even with crops of 1024x1024. Does RetinaNet need more memory? How should I deal with this problem?

Simple question about training

Training runs successfully under CUDA 10 and TensorFlow 1.14, with one minor issue.
I want to ask two things:

  1. Does this code have a total number of epochs after which training ends automatically, or does it keep going until I stop it myself?
    I can see that it saves the weights in the middle of training, but how many iterations make up one epoch?
    ------------------------------------------ Train config
    FIXED_BLOCKS = 1 # allow 0~3
    FREEZE_BLOCKS = [True, False, False, False, False] # for gluoncv backbone
    USE_07_METRIC = True

    MUTILPY_BIAS_GRADIENT = 2.0 # if None, will not multipy
    GRADIENT_CLIPPING_BY_NORM = 10.0 # if None, will not clip

    EPSILON = 1e-5
    LR = 5e-4
    WARM_SETP = int(1.0 / 4.0 * SAVE_WEIGHTS_INTE)

  2. Another question is about a warning. Do you have any idea whether it is serious enough to affect the results? Thanks in advance!

Issue is below
WARNING:tensorflow:Entity <bound method of <tensorflow.python.layers.convolutional.Conv2D object at 0x7fb846a0c550>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: converting <bound of <tensorflow.python.layers.convolutional.Conv2D object at 0x7fb846a0c550>>: AssertionError: Bad argument number for Name: 3, expecting 4

Error of loading checkpoint when training finished

Here are the lines I changed in my

VERSION = 'resnet_v1_50_20190729' # a new name
NET_NAME = 'resnet50_v1d' # 'MobilenetV2'

Then I started to train and had a series of files in ./output/trained_weights/resnet_v1_50_20190729, including:

Then I tested and got the following errors:
2019-07-29 21:13:08.984880: W tensorflow/core/framework/] OP_REQUIRES failed at : Not found: Key resnet50_v1d/C1/conv0/BatchNorm/beta not found in checkpoint
tensorflow.python.framework.errors_impl.NotFoundError: Key resnet50_v1d/C1/conv0/BatchNorm/beta not found in checkpoint
[[{{node save/RestoreV2}}]]
tensorflow.python.framework.errors_impl.NotFoundError: Key resnet50_v1d/C1/conv0/BatchNorm/beta not found in checkpoint
[[node save/RestoreV2 (defined at ../libs/networks/ ]]
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

This error is similar to #14. I put those DOTA_1000model.ckpt files into ./data/pretrained_weights and it didn't help. Any suggestions?

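Training with the updated code: no boxes drawn on the detection result images

1. I trained with the updated code. I first cropped the training images, which gave more than 20,000 crops. My GPU is weak and training is slow, so I have only trained 5k steps so far. When I test with the resulting weights, the images show no detections at all. Is something wrong? (It has not reached one epoch yet, but getting no result at all seems odd.)
2. In TensorBoard the total loss has dropped to about 0.9. gtboxes_h and gtbox_r both show boxes, but final_detection shows no boxes at all. Is this normal?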

compile error

ubuntu 18.04 cuda 10.0 tensorflow-gpu1.13.1
bbox.c:9512:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tstate->exc_value = local_value;
bbox.c:9513:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tstate->exc_traceback = local_tb;
error: command 'gcc' failed with exit status 1
Makefile:2: recipe for target 'all' failed
make: *** [all] Error 1

error: (-215) intersection.size() <= 8 in function rotatedRectangleIntersection

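This error appears randomly during training: sometimes only after 200,000 steps, sometimes very quickly. In the OpenCV project on GitHub I saw others mention this problem and suggest changing float to double in intersection.cpp, but I could not find that file. Also, I removed OpenCV 3.4.2 with conda remove, yet the program still runs and does not report No module named 'cv2'. I do not know why. Has anyone else run into this problem?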
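
On the second point, one possible explanation (an assumption, not confirmed in this thread) is that another copy of `cv2` is still installed, for example through pip. A small sketch for checking which file an import would actually load, using only the standard library; the helper name `module_origin` is hypothetical:

```python
import importlib.util

def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

if __name__ == "__main__":
    # Prints a path if some cv2 is still importable, otherwise None.
    print(module_origin("cv2"))
```

If a path is printed, its location shows which environment or package manager provided the remaining copy.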


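What do the two parameters IMG_SHORT_SIDE_LEN and IMG_MAX_LENGTH in the config file cfgs.py mean? If the images in my training dataset are 1024*1024, do these two parameters need to be changed? Do any other files need to be changed?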
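
In detection code bases, parameters with these names usually mean: resize the image so its shorter side equals IMG_SHORT_SIDE_LEN, unless that would push the longer side past IMG_MAX_LENGTH, in which case the longer side is capped instead. The sketch below illustrates that convention only. It assumes this repository follows it, and the function name and the example values 800 and 1000 are hypothetical:

```python
def short_side_resize_shape(height, width, short_side_len, max_len):
    """Compute the resized (height, width) under the short-side rule.

    The shorter side is scaled to short_side_len. If the longer side
    would then exceed max_len, the scale is reduced so the longer
    side equals max_len.
    """
    scale = short_side_len / min(height, width)
    if max(height, width) * scale > max_len:
        scale = max_len / max(height, width)
    return int(round(height * scale)), int(round(width * scale))

# Hypothetical settings: short side 800, longest side capped at 1000.
print(short_side_resize_shape(1024, 1024, 800, 1000))  # (800, 800)
print(short_side_resize_shape(600, 1200, 800, 1000))   # (500, 1000)
```

Under this reading, square 1024*1024 images are simply scaled to the short-side length, so the two values mainly trade memory use against resolution.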

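Trained for 550k steps; testing on DOTA v1.0 with the default config file gives an mAP of only 0.49

Strangely, I used the config file provided by the author without any modification to train on DOTA, but the result is about 10 points lower than expected. Looking at each class, small-vehicle should reach an mAP of 0.66, but I only get 0.18. When I trained R2CNN earlier, the small-vehicle mAP was also very low, and the overall mAP was likewise about 10 points lower. I suspect the anchor sizes may be too large to detect small objects. Also, do I need to manually change parameters in cfg.py to reach the reported mAP?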

Out of memory when training on my own data

out of memory
invalid argument
an illegal memory access was encountered
an illegal memory access was encountered

Ran into some problems during training: out of memory, invalid argument

out of memory
invalid argument
an illegal memory access was encountered
an illegal memory access was encountered
2019-07-26 12:37:44.321824: E tensorflow/stream_executor/cuda/] failed to record completion event; therefore, failed to create inter-stream dependency
2019-07-26 12:37:44.321869: I tensorflow/stream_executor/] stream 0x55dba725bbb0 did not memcpy host-to-device; source: 0x7faecb400000
2019-07-26 12:37:44.321836: E tensorflow/stream_executor/cuda/] failed to record completion event; therefore, failed to create inter-stream dependency
2019-07-26 12:37:44.321837: E tensorflow/stream_executor/cuda/] failed to record completion event; therefore, failed to create inter-stream dependency
2019-07-26 12:37:44.321896: I tensorflow/stream_executor/] stream 0x55dba725bbb0 did not memcpy host-to-device; source: 0x7faec9800000
2019-07-26 12:37:44.321903: I tensorflow/stream_executor/] stream 0x55dba725bbb0 did not memcpy host-to-device; source: 0x7faeccc00000
2019-07-26 12:37:44.321892: E tensorflow/stream_executor/] Error recording event in stream: error recording CUDA event on stream 0x55dba75d5170: CUDA_ERROR_ILLEGAL_ADDRESS; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
2019-07-26 12:37:44.321837: E tensorflow/stream_executor/cuda/] failed to record completion event; therefore, failed to create inter-stream dependency
2019-07-26 12:37:44.321971: I tensorflow/stream_executor/] stream 0x55dba725bbb0 did not memcpy host-to-device; source: 0x7fb050800000
2019-07-26 12:37:44.321974: E tensorflow/stream_executor/cuda/] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
2019-07-26 12:37:44.321981: F tensorflow/core/common_runtime/gpu/] Unexpected Event status: 1

About training with batch size > 1 on one GPU?

Does this code support training on one GPU with a batch size of 2?

I find that in it does not limit the batchsize, while in the, it limits that batchsize must be 1.

So, can I train with a batch size of 2 on one GPU?



Mask-RCNN Model

Waiting for the Mask-RCNN implementation.
This is just 5% more accuracy than the previous best.

About batch_size and epoch

Is one epoch a single pass over all samples?
If so, is the number of steps needed for one pass equal to sample/batch_size?
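
Under the usual definition (one epoch is one pass over every sample), the relation asked about can be written down directly. A minimal sketch; the function name, the `num_gpus` parameter and the sample counts are illustrative and not taken from the repository:

```python
import math

def steps_per_epoch(num_samples, batch_size, num_gpus=1):
    """Number of training steps needed to see every sample once.

    Each step consumes batch_size samples on each of num_gpus devices.
    The last, partially filled step is rounded up.
    """
    return math.ceil(num_samples / (batch_size * num_gpus))

# Illustrative numbers only.
print(steps_per_epoch(20000, 1))  # 20000
print(steps_per_epoch(20000, 8))  # 2500
```

Whether training stops after a fixed number of epochs is a separate matter that depends on how the training script sets its maximum iteration count.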




[2019-07-31 19:41:10] global_step:1035 current_step:1035 per_cost_time:1.036s
cls_loss:1.436 reg_loss:0.588 total_losses:2.024

[2019-07-31 19:41:15] global_step:1040 current_step:1040 per_cost_time:1.011s
cls_loss:1.184 reg_loss:0.614 total_losses:1.798

[2019-07-31 19:41:20] global_step:1045 current_step:1045 per_cost_time:1.066s
cls_loss:0.467 reg_loss:0.000 total_losses:0.467

[2019-07-31 19:41:26] global_step:1050 current_step:1050 per_cost_time:1.079s
cls_loss:1.418 reg_loss:0.469 total_losses:1.887

[2019-07-31 19:41:31] global_step:1055 current_step:1055 per_cost_time:1.070s
cls_loss:1.206 reg_loss:0.575 total_losses:1.781

[2019-07-31 19:41:36] global_step:1060 current_step:1060 per_cost_time:1.025s
cls_loss:1.335 reg_loss:0.463 total_losses:1.798

[2019-07-31 19:41:42] global_step:1065 current_step:1065 per_cost_time:1.039s
cls_loss:1.209 reg_loss:0.550 total_losses:1.759

[2019-07-31 19:41:47] global_step:1070 current_step:1070 per_cost_time:0.951s
cls_loss:1.000 reg_loss:0.512 total_losses:1.512

[2019-07-31 19:41:52] global_step:1075 current_step:1075 per_cost_time:1.074s
cls_loss:0.891 reg_loss:0.636 total_losses:1.526

About load checkpoint

When I load the resnet50_v1d checkpoint, it shows:

NotFoundError (see above for traceback)Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Tensor name "resnet50_v1d/C1/conv0/BatchNorm/beta" not found in checkpoint files './data/pretrained_weights/resnet50_v1d.ckpt'

I downloaded resnet50_v1.ckpt from the link in the README. Can you give me some suggestions about this?
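
A "Key ... not found in checkpoint" error often means the variable scope in the graph differs from the one stored in the file, for example when the backbone name in the config does not match the downloaded weights. The sketch below compares two lists of variable names. The helper is hypothetical, and it assumes the checkpoint names were collected beforehand, for instance with `tf.train.list_variables(path)`, which returns (name, shape) pairs:

```python
def diagnose_missing_keys(graph_names, checkpoint_names):
    """Compare variable names expected by the graph with a checkpoint.

    Returns three sorted lists: names missing from the checkpoint,
    top-level scopes present only in the graph, and top-level scopes
    present only in the checkpoint.
    """
    ckpt = set(checkpoint_names)
    missing = sorted(n for n in graph_names if n not in ckpt)
    graph_scopes = {n.split("/", 1)[0] for n in graph_names}
    ckpt_scopes = {n.split("/", 1)[0] for n in checkpoint_names}
    return (missing,
            sorted(graph_scopes - ckpt_scopes),
            sorted(ckpt_scopes - graph_scopes))
```

If the last two lists each hold a different backbone scope, the checkpoint most likely belongs to a different network than the one selected in the config.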



About creating a tfrecord from my own dataset

Hello! I have an issue using my own dataset.

I tested the code on TensorFlow 1.15.0 with CUDA 10.1.

I got this error:

Out of range: PaddingFIFOQueue '_1_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)

with the tfrecord file built from my own dataset.
I used your file but I didn't do the data crop before that.
Does this error come from not cropping my data to 600x600?
And yes, the images in my dataset all have different sizes...

I would be very thankful for any help!
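
A queue that closes with "current size 0" frequently means the input pipeline found nothing to read, for example because the tfrecord path pattern matches no file or the file is empty. That cause is an assumption here, not something confirmed in this thread. A quick standard-library check, with a hypothetical helper name and placeholder pattern:

```python
import glob
import os

def nonempty_files(pattern):
    """Return the files matching pattern that contain at least one byte."""
    return sorted(p for p in glob.glob(pattern) if os.path.getsize(p) > 0)

if __name__ == "__main__":
    # Placeholder pattern: replace it with the one the data loader uses.
    matches = nonempty_files("./data/tfrecord/*.tfrecord")
    print(len(matches), "usable tfrecord file(s) found")
```

If this prints zero, the record was probably written somewhere else or under a different name than the reader expects, and image size would not be the cause.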

error in SCRDet link

Hi, I can't find SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects, and the link to it does not work. Can you help?

Compile box_utils under Windows.


Rotation is not accurate as R2CNN

I have trained a model, but why does the output so often form a plus sign? A horizontal detection comes along with another, vertical detection, making a plus sign. I don't know why this is happening.


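I installed and ran the code, and tested several top-down drone images containing cars as well as several images from the DOTA dataset. With --show_box the processed images were saved successfully, but no boxes are drawn on them. I inspected the intermediate variables in test_dota.py and found that det_boxes_r_ is always an empty array. My environment is anaconda3 + python3.5 + tensorflow1.12 + cuda 9.1. In cfgs.py I only changed NET_NAME = 'resnet_v1_50'. Switching to other pretrained networks gives the same result: the program runs but detects nothing.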



Regression loss is NaN


Import rbbx_overlaps fails

When I ran "python", I got an error importing rbbx_overlaps as follows

Traceback (most recent call last):
File "", line 17, in
from libs.networks import build_whole_network
File "../libs/networks/", line 13, in
from libs.losses import losses
File "../libs/losses/", line 9, in
from libs.box_utils.iou_rotate import iou_rotate_calculate2
File "../libs/box_utils/", line 10, in
from libs.box_utils.rbbox_overlaps import rbbx_overlaps
ImportError: ../libs/box_utils/ failed to map segment from shared object

All my packages have the correct versions. I have run the "python build_ext --inplace" commands to generate those .so files. Any suggestions?

By the way, I use English since my laptop has no Pinyin input. Please feel free to reply in Chinese. Thank you.
