faster-rcnn_tensorflow's Introduction

Faster-RCNN_Tensorflow

Abstract

This is a TensorFlow re-implementation of Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

This project was completed by YangXue and YangJirui. Some related projects (R2CNN and RRPN) are based on this code.

Trained on VOC 2007 trainval and tested on VOC 2007 test. (PS: this project also supports COCO training.)

Comparison

use_voc2012_metric

Models mAP sheep horse bicycle bottle cow sofa bus dog cat person train diningtable aeroplane car pottedplant tvmonitor chair bird boat motorbike
resnet50_v1 75.16 74.08 89.27 80.27 55.74 83.38 69.35 85.13 88.80 91.42 81.17 81.71 62.74 78.65 86.86 47.00 76.71 50.29 79.05 60.51 80.96
resnet101_v1 77.03 79.68 89.33 83.89 59.41 85.68 76.59 84.23 88.50 88.50 81.54 79.16 72.66 80.26 88.42 47.50 79.81 52.85 80.70 59.94 81.87
mobilenet_v2 50.36 46.68 70.45 67.43 25.69 53.60 46.26 58.95 37.62 43.97 67.67 61.35 52.14 56.54 75.02 24.47 49.89 27.76 38.04 38.20 65.46

use_voc2007_metric

Models mAP sheep horse bicycle bottle cow sofa bus dog cat person train diningtable aeroplane car pottedplant tvmonitor chair bird boat motorbike
resnet50_v1 73.09 72.11 85.63 77.74 55.82 81.19 67.34 82.44 85.66 87.34 77.49 79.13 62.65 76.54 84.01 47.90 74.13 50.09 76.81 60.34 77.47
resnet101_v1 74.63 76.35 86.18 79.87 58.73 83.4 74.75 80.03 85.4 86.55 78.24 76.07 70.89 78.52 86.26 47.80 76.34 52.14 78.06 58.90 78.04
mobilenet_v2 50.34 46.99 68.45 65.89 28.16 53.21 46.96 57.80 38.60 44.12 66.20 60.49 52.40 56.06 72.68 26.91 49.99 30.18 39.38 38.54 64.74
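
The two tables differ only in how AP is computed from each class's precision/recall curve. The sketch below shows the two conventions commonly associated with the VOC2007 (11-point) and VOC2010+ (area under the curve) metrics. It is an illustration written for this page, not code taken from this project's eval.py, and the function name voc_ap and its arguments are assumptions.

```python
import numpy as np


def voc_ap(recall, precision, use_07_metric=False):
    """Average precision from a precision/recall curve.

    Both inputs are expected in order of decreasing detection score,
    so recall is non-decreasing.
    """
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)

    if use_07_metric:
        # 11-point metric: average the best precision reachable at
        # recall >= t for t = 0.0, 0.1, ..., 1.0.
        ap = 0.0
        for t in np.linspace(0.0, 1.0, 11):
            mask = recall >= t
            best = precision[mask].max() if mask.any() else 0.0
            ap += best / 11.0
        return float(ap)

    # Area-under-curve metric: pad the curve, make precision
    # monotonically non-increasing, then sum the rectangles.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))
```

The same curve generally gives slightly different numbers under the two conventions, which is why results are only comparable when the metric is stated.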

Requirements

1、tensorflow >= 1.2
2、cuda8.0
3、python2.7 (anaconda2 recommended)
4、opencv(cv2)

Download Model

1、Please download the resnet50_v1 and resnet101_v1 models pre-trained on ImageNet and put them in $PATH_ROOT/data/pretrained_weights.
2、Please download the mobilenet_v2 model pre-trained on ImageNet and put it in $PATH_ROOT/data/pretrained_weights/mobilenet.
3、Please download the model trained by this project and put it in $PATH_ROOT/output/trained_weights.

Data Format

├── VOCdevkit
│   ├── VOCdevkit_train
│   │   ├── Annotation
│   │   ├── JPEGImages
│   ├── VOCdevkit_test
│   │   ├── Annotation
│   │   ├── JPEGImages

Compile

cd $PATH_ROOT/libs/box_utils/cython_utils
python setup.py build_ext --inplace

Demo(available)

Select a configuration file in the folder ($PATH_ROOT/libs/configs/) and copy its contents into cfgs.py, then download the corresponding weights.

cd $PATH_ROOT/tools
python inference.py --data_dir='/PATH/TO/IMAGES/' \
                    --save_dir='/PATH/TO/SAVE/RESULTS/' \
                    --GPU='0'

Eval

cd $PATH_ROOT/tools
python eval.py --eval_imgs='/PATH/TO/IMAGES/' \
               --annotation_dir='/PATH/TO/TEST/ANNOTATION/' \
               --GPU='0'

Train

1、If you want to train your own data, please note:

(1) Modify parameters (such as CLASS_NUM, DATASET_NAME, VERSION, etc.) in $PATH_ROOT/libs/configs/cfgs.py
(2) Add category information in $PATH_ROOT/libs/label_name_dict/lable_dict.py     
(3) Add data_name to line 76 of $PATH_ROOT/data/io/read_tfrecord.py 

2、make tfrecord

cd $PATH_ROOT/data/io/  
python convert_data_to_tfrecord.py --VOC_dir='/PATH/TO/VOCdevkit/VOCdevkit_train/' \
                                   --xml_dir='Annotation' \
                                   --image_dir='JPEGImages' \
                                   --save_name='train' \
                                   --img_format='.jpg' \
                                   --dataset='pascal'

3、train

cd $PATH_ROOT/tools
python train.py

Tensorboard

cd $PATH_ROOT/output/summary
tensorboard --logdir=.

Reference

1、https://github.com/endernewton/tf-faster-rcnn
2、https://github.com/zengarden/light_head_rcnn
3、https://github.com/tensorflow/models/tree/master/research/object_detection

faster-rcnn_tensorflow's People

Contributors

yangxue0827

faster-rcnn_tensorflow's Issues

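About the DOTA dataset

I saw that you used the DOTA dataset. How did you process it? Did you convert it to XML files as well? If so, what does the file format look like?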

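Inference does not work correctly

Why does inference with both the pre-trained model and my own trained model produce a dense cluster of many small boxes? Does the IoU threshold need to be changed?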

The precision

Why is the precision so low? Does it represent anything?

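About feature extraction with ResNet

Looking at your code for ResNet feature extraction, I found that the feature map fed into RoI pooling is the one right after conv_4. Why isn't the final feature map fed into RoI pooling? Are you using the NoC approach from Object Detection Networks on Convolutional Feature Maps? I'm not sure whether I understand this correctly.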

DataLossError (see above for traceback): Unable to open table file /home/litao/Algorithm/Faster-RCNN_Tensorflow-master/data/pretrained_weights/mobilenet/mobilenet_v2_1.0_224.ckpt: Data loss: file is too short to be an sstable: perhaps your file is in a different file format and you need to use a different restore operator? [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]] [[Node: save/RestoreV2/_305 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_306_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

I ran into this problem when loading the pre-trained weights during training. Could you give me some advice?

decreasing the amount of ram used on gpu

Hello,

I'm trying to decrease the amount of RAM used on the GPU for inference.

I've used the supplied cfgs.py that uses resnet101,
it uses up about 6G during training,
then I froze the graph -> created a 190MB frozen file,
I then use this frozen file from C++, and it consumes about 800MB on the GPU for inference.

I then tried using cfgs_res50.py; this led to 3.5G of RAM used on the GPU during training,
a 115MB frozen graph file, and the RAM consumed on the GPU during inference (from C++) has not decreased significantly (about 720M).

Can you suggest a method to decrease the RAM consumption on the GPU?
(Maybe some other property of the network?)

thanks,
Omer.

The pb-related code works fine for me with tensorflow 1.8, but with version 1.12 or 1.11 I get the error below. What should I do?

Traceback (most recent call last):
File "./Faster-RCNN_Tensorflow/testwww.py", line 106, in
'dataSet/matrixtime/lear/whole_image_data/train_test_split_json/1/lier_val1.json')
File "./Faster-RCNN_Tensorflow/testwww.py", line 55, in test
img = graph.get_tensor_by_name("input_img:0")
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3666, in get_tensor_by_name
return self.as_graph_element(name, allow_tensor=True, allow_operation=False)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3490, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3532, in _as_graph_element_locked
"graph." % (repr(name), repr(op_name)))
KeyError: "The name 'input_img:0' refers to a Tensor which does not exist. The operation, 'input_img', does not exist in the graph."

About the ResNet structure

In resnet.py, why is stride=1 for block3 here? Shouldn't a standard ResNet use stride=2? Is there a reason for setting it to 1?

Is the roi_pooling function RoI Align?

Hello my friend, I am using your Faster RCNN code for my research. In the code, RoI pooling is used to pool the different feature maps to a uniform spatial size. Here the tf.image.crop_and_resize function uses the bilinear method to get the cropped feature, and the RoI Align method also uses the bilinear method. I wonder whether the roi_pooling function in your code is the same as RoI Align? Thank you.

frcnn_res50 evaluation results of pascal inconsistent with yours

I downloaded your FasterRCNN_20180527 model and replaced libs/configs/cfgs.py with cfgs_res50.py. My other settings are the same as yours. Then I ran tools/eval.py and got 85.6%, while the result in your cfgs_res50.py is 75.2%. That is a big difference. Why did this happen?

rpn_loc_loss = nan

Hi guys, when I train with the default VOC dataset (I did not change the version or num_class or add any dataset_name, I just kept the defaults), I got this: the RPN loss equals nan. Any idea why this happens? Thanks a lot!

PaddingFIFOQueue '_2_get_batch/batch/padding_fifo_queue

PaddingFIFOQueue '_2_get_batch/batch/padding_fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
I was training with my own data, which has two categories. Training ran successfully for several steps, with all the losses looking good, and then I got an error like this.

NMS about postfastrcnn

Hello, my friend,
I'm very interested in your implementations of several excellent object detection algorithms. When I read your Python file 'build_whole_network.py', I found that the call

keep = tf.image.non_max_suppression(
    boxes=tmp_decoded_boxes,
    scores=tmp_score,
    max_output_size=cfgs.FAST_RCNN_NMS_MAX_BOXES_PER_CLASS,
    iou_threshold=cfgs.FAST_RCNN_NMS_IOU_THRESHOLD)

calculates the indices after NMS. That's OK, but one question confuses me.

For 'tf.image.non_max_suppression', the official documentation defines the box coordinates as [ymin, xmin, ymax, xmax]. However, in your code, the order of tmp_decoded_boxes is [xmin, ymin, xmax, ymax]:

def clip_boxes_to_img_boundaries(decode_boxes, img_shape):
    '''

    :param decode_boxes:
    :return: decode boxes, and already clip to boundaries
    '''

    with tf.name_scope('clip_boxes_to_img_boundaries'):

        # xmin, ymin, xmax, ymax = tf.unstack(decode_boxes, axis=1)
        xmin = decode_boxes[:, 0]
        ymin = decode_boxes[:, 1]
        xmax = decode_boxes[:, 2]
        ymax = decode_boxes[:, 3]
        img_h, img_w = img_shape[1], img_shape[2]

        img_h, img_w = tf.cast(img_h, tf.float32), tf.cast(img_w, tf.float32)

        xmin = tf.maximum(tf.minimum(xmin, img_w-1.), 0.)
        ymin = tf.maximum(tf.minimum(ymin, img_h-1.), 0.)

        xmax = tf.maximum(tf.minimum(xmax, img_w-1.), 0.)
        ymax = tf.maximum(tf.minimum(ymax, img_h-1.), 0.)

        return tf.transpose(tf.stack([xmin, ymin, xmax, ymax]))

Is it OK to calculate the NMS indices using [xmin, ymin, xmax, ymax]? Thank you!

Besides, do you plan to write a soft-NMS function in TF?
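
On the coordinate-order question: NMS only depends on the scores and on pairwise IoU, and IoU is unchanged when the x and y axes are swapped consistently for every box. The plain-Python sketch below checks this for one pair of boxes. It is an illustration only, the helper names iou and swap_xy are made up for this example, and it does not call TensorFlow.

```python
def iou(a, b):
    """IoU of two boxes given as [min0, min1, max0, max1] on two axes."""
    overlap_0 = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_1 = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_0 * overlap_1
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def swap_xy(box):
    """Convert [xmin, ymin, xmax, ymax] to [ymin, xmin, ymax, xmax]."""
    return [box[1], box[0], box[3], box[2]]


box_a = [10.0, 20.0, 110.0, 80.0]   # [xmin, ymin, xmax, ymax]
box_b = [30.0, 40.0, 130.0, 100.0]

iou_xy = iou(box_a, box_b)
iou_yx = iou(swap_xy(box_a), swap_xy(box_b))
print(iou_xy, iou_yx)  # the two values are equal
```

So as long as every box passed to the NMS call uses the same ordering, the kept indices should not depend on which axis comes first. The ordering still matters for functions that use the coordinates spatially, such as cropping.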

Comparing VGG and ResNet: what is your opinion?

Although ResNet performs better than VGGNet in many other projects, VGGNet seems to perform better than ResNet in object detection, according to my experimental results. I am not sure whether that is the case for most people. Some of my friends have the same impression. What is your opinion?

How to set the number of training iterations for RPN and Fast R-CNN

When I run train.py, the terminal only prints the rpn_cla_loss information; the other losses (rpn_loc_loss, fast_rcnn_loc_loss, fast_rcnn_cla_loss and so on) do not seem to be trained.
According to the Faster R-CNN paper, training should consist of four steps.

incompatible shapes

When training, I encountered a problem:
Invalid argument:Incompatible shapes:[1,1024,57,38] vs. [1,1024,29,19]

Can anyone help me?

Error loading the pre-trained model

2018-09-19 19:58:32.524151: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /home/user10/notebook/Faster-RCNN_Tensorflow/data/pretrained_weights/resnet_v1_101.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
An error occurs when loading the pre-trained model; it looks like a mismatch.

Problems with resnet_50 and mobilenet

Hi there!

I tried to evaluate all three pre-trained weights but only got similar results to yours with resnet101. Resnet50 and mobilenet produce awful results. Am I missing something? I changed the configs file and completed the demo with success.
Edit: Using Pascal VOC2007.

Batch normalization is not trained.

Thank you for sharing your code, I have used it in my own project.
I have another question. You do not train BN for ResNet, since the batch size is too small (the code only supports batch size 1). This works well when training on VOC data. I think VOC data and ImageNet are both natural scene data, so the BN parameters can be transferred. But in industrial applications the images differ a lot from natural images. Is it appropriate to use fixed BN parameters, or should I turn on the BN training switch?
Thank you again.

mAP of Mobilenet_v2

Hi!
I trained the ResNet model successfully with a high mAP, but when I train the MobileNet model, it seems very difficult to reach the mAP in your README. My specific experimental results are as follows:

  • 200,000 iterations (20 W): 0.26
  • 400,000 iterations (40 W): 0.32

The experimental configuration is consistent with the MobilenetV2 cfgs in your code.

Asking for advice on the training method

Thanks to the author for sharing, @yangxue0827

1. I trained Faster R-CNN using the pre-trained resnet_v1_101, for 200,000 iterations.
mAP:

cls : aeroplane|| Recall: 0.6666666666666666 || Precison: 0.0022756827048114434|| AP: 0.46463509684436277
____________________
cls : bicycle|| Recall: 0.8823529411764706 || Precison: 0.0012340600575894694|| AP: 0.6647693707354104
____________________
cls : bird|| Recall: 0.8260869565217391 || Precison: 0.0014851872117564292|| AP: 0.726061076604555
____________________
cls : boat|| Recall: 0.5714285714285714 || Precison: 0.0006957731779439903|| AP: 0.18661987013613413
____________________
cls : bottle|| Recall: 0.525 || Precison: 0.0017114914425427872|| AP: 0.26288153438397655
____________________
cls : bus|| Recall: 0.8947368421052632 || Precison: 0.0014474244359301831|| AP: 0.7767364360520345
____________________
cls : car|| Recall: 0.8727272727272727 || Precison: 0.0039049788480312398|| AP: 0.6629979033514892
____________________
cls : cat|| Recall: 0.8695652173913043 || Precison: 0.0017488632388947185|| AP: 0.7241643766664854
____________________
cls : chair|| Recall: 0.603448275862069 || Precison: 0.002913267854170135|| AP: 0.3342093527356763
____________________
cls : cow|| Recall: 0.8333333333333334 || Precison: 0.0008138020833333334|| AP: 0.6456923337137465
____________________
cls : diningtable|| Recall: 0.85 || Precison: 0.001575386896487814|| AP: 0.4378677644669432
____________________
cls : dog|| Recall: 0.9423076923076923 || Precison: 0.004218682737839001|| AP: 0.8287655837582466
____________________
cls : horse|| Recall: 0.9230769230769231 || Precison: 0.0010108668182966893|| AP: 0.735318642779188
____________________
cls : motorbike|| Recall: 0.8125 || Precison: 0.0010638297872340426|| AP: 0.6587310606060605
____________________
cls : person|| Recall: 0.7990867579908676 || Precison: 0.013724413771468904|| AP: 0.6867495503929344
____________________
cls : pottedplant|| Recall: 0.55 || Precison: 0.0009232835319791841|| AP: 0.3314001624013871
____________________
cls : sheep|| Recall: 0.6666666666666666 || Precison: 0.001166569452545621|| AP: 0.4209187768091877
____________________
cls : sofa|| Recall: 0.9285714285714286 || Precison: 0.001256402822074031|| AP: 0.3066758517301329
____________________
cls : train|| Recall: 1.0 || Precison: 0.0014677552518117603|| AP: 0.9336013645224172
____________________
cls : tvmonitor|| Recall: 0.8421052631578947 || Precison: 0.0014217167229429537|| AP: 0.7038577828454168
____________________
mAP is : 0.5746326945767893
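
The final figure in a log like this is the unweighted mean of the per-class AP values. A minimal sketch for recomputing it from pasted log text, assuming lines in the "cls : name|| ... || AP: value" format shown above (the helper name mean_ap and the pattern are written for this example and are not part of the project):

```python
import re

# Matches e.g. "cls : aeroplane|| Recall: 0.66 || Precison: 0.002|| AP: 0.46"
AP_PATTERN = re.compile(r"cls\s*:\s*(\w+)\|\|.*?AP:\s*([0-9.]+)")


def mean_ap(log_text):
    """Return (mAP, {class_name: AP}) parsed from evaluation log text."""
    aps = {name: float(ap) for name, ap in AP_PATTERN.findall(log_text)}
    if not aps:
        raise ValueError("no 'cls : ...|| AP: ...' lines found")
    return sum(aps.values()) / len(aps), aps


sample = """\
cls : cat|| Recall: 0.9 || Precison: 0.002|| AP: 0.5
____________________
cls : dog|| Recall: 0.8 || Precison: 0.004|| AP: 0.7
"""
value, per_class = mean_ap(sample)
print(round(value, 4), sorted(per_class))
```

Sorting the returned dictionary by AP is a quick way to see which classes pull the mean down.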

My config:

NET_NAME = 'resnet_v1_101' #'MobilenetV2'
ADD_BOX_IN_TENSORBOARD = True

# ---------------------------------------- System_config
ROOT_PATH = os.path.abspath('../')
print(20*"++--")
print(ROOT_PATH)
GPU_GROUP = "0"
SHOW_TRAIN_INFO_INTE = 50
SMRY_ITER = 100
SAVE_WEIGHTS_INTE = 500
FAST_RCNN_IOU_MAP=0.5 # voc is 05, coco diffrent in voc

SUMMARY_PATH = ROOT_PATH + '/output/summary'
TEST_SAVE_PATH = ROOT_PATH + '/tools/test_result'
# INFERENCE_IMAGE_PATH = ROOT_PATH + '/tools/inference_image'
# INFERENCE_SAVE_PATH = ROOT_PATH + '/tools/inference_results'

if NET_NAME.startswith("resnet"):
    weights_name = NET_NAME
elif NET_NAME.startswith("MobilenetV2"):
    weights_name = "mobilenet/mobilenet_v2_1.0_224"
else:
    raise Exception('net name must in [resnet_v1_101, resnet_v1_50, MobilenetV2]')

PRETRAINED_CKPT = ROOT_PATH + '/data/pretrained_weights/resnet_v1_101_2016_08_28/' + weights_name + '.ckpt'
TRAINED_CKPT = os.path.join(ROOT_PATH, 'output/trained_weights')
EVALUATE_DIR = ROOT_PATH + '/output/evaluate_result_pickle/'

# ------------------------------------------ Train config
RESTORE_FROM_RPN = False
IS_FILTER_OUTSIDE_BOXES = True
FIXED_BLOCKS = 1  # allow 0~3

RPN_LOCATION_LOSS_WEIGHT = 1.
RPN_CLASSIFICATION_LOSS_WEIGHT = 1.

FAST_RCNN_LOCATION_LOSS_WEIGHT = 1.
FAST_RCNN_CLASSIFICATION_LOSS_WEIGHT = 1.
RPN_SIGMA = 3.0
FASTRCNN_SIGMA = 1.0

MUTILPY_BIAS_GRADIENT = None   # 2.0  # if None, will not multipy
GRADIENT_CLIPPING_BY_NORM = None   # 10.0  if None, will not clip

EPSILON = 1e-5
MOMENTUM = 0.9
LR = 0.0003 # 0.001  # 0.0003
DECAY_STEP = [5000, 10000, 50000, 100000]  # 50000, 70000
MAX_ITERATION = 200000

# -------------------------------------------- Data_preprocess_config
DATASET_NAME = 'pascal'  # 'ship', 'spacenet', 'pascal', 'coco' airplane
PIXEL_MEAN = [123.68, 116.779, 103.939]  # R, G, B. In tf, channel is RGB. In openCV, channel is BGR
IMG_SHORT_SIDE_LEN = 600
IMG_MAX_LENGTH = 1000
CLASS_NUM = 20

# --------------------------------------------- Network_config
BATCH_SIZE = 1
INITIALIZER = tf.random_normal_initializer(mean=0.0, stddev=0.01)
BBOX_INITIALIZER = tf.random_normal_initializer(mean=0.0, stddev=0.001)
WEIGHT_DECAY = 0.00004 if NET_NAME.startswith('Mobilenet') else 0.0001

# ---------------------------------------------Anchor config
BASE_ANCHOR_SIZE_LIST = [32, 64, 128,  256, 512]  # can be modified
ANCHOR_STRIDE = [16]  # can not be modified in most situations
ANCHOR_SCALES = [0.5, 1., 1.5, 2.0]  # [4, 8, 16, 32]
ANCHOR_RATIOS = [0.5, 1., 1.5, 2.0]
ROI_SCALE_FACTORS = [10., 10., 5.0, 5.0]
ANCHOR_SCALE_FACTORS = None


# --------------------------------------------RPN config
KERNEL_SIZE = 3
RPN_IOU_POSITIVE_THRESHOLD = 0.7
RPN_IOU_NEGATIVE_THRESHOLD = 0.3
TRAIN_RPN_CLOOBER_POSITIVES = False

RPN_MINIBATCH_SIZE = 256
RPN_POSITIVE_RATE = 0.5
RPN_NMS_IOU_THRESHOLD = 0.7
RPN_TOP_K_NMS_TRAIN = 12000
RPN_MAXIMUM_PROPOSAL_TARIN = 2000

RPN_TOP_K_NMS_TEST = 6000  # 5000
RPN_MAXIMUM_PROPOSAL_TEST = 300  # 300


# -------------------------------------------Fast-RCNN config
ROI_SIZE = 14
ROI_POOL_KERNEL_SIZE = 2
USE_DROPOUT = False
KEEP_PROB = 1.0
SHOW_SCORE_THRSHOLD = 0.5  # only show in tensorboard

FAST_RCNN_NMS_IOU_THRESHOLD = 0.3  # 0.6
FAST_RCNN_NMS_MAX_BOXES_PER_CLASS = 100
FAST_RCNN_IOU_POSITIVE_THRESHOLD = 0.5
FAST_RCNN_IOU_NEGATIVE_THRESHOLD = 0.0   # 0.1 < IOU < 0.5 is negative
FAST_RCNN_MINIBATCH_SIZE = 256  # if is -1, that is train with OHEM
FAST_RCNN_POSITIVE_RATE = 0.25

ADD_GTBOXES_TO_TRAIN = False

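2. I used the provided trained model FasterRCNN_20180517 (voc_200000) and, without changing any parameters (using your defaults), trained for another 30,000 steps on top of it. This gave the voc_233500 model: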

cls : aeroplane|| Recall: 0.7619047619047619 || Precison: 0.001602323368884883|| AP: 0.6741481378722718
____________________
cls : bicycle|| Recall: 0.8823529411764706 || Precison: 0.0007361601884570083|| AP: 0.8715109573241061
____________________
cls : bird|| Recall: 0.9565217391304348 || Precison: 0.0011441647597254005|| AP: 0.8858960564712017
____________________
cls : boat|| Recall: 0.7857142857142857 || Precison: 0.0005639290474725725|| AP: 0.6115246098439377
____________________
cls : bottle|| Recall: 0.7 || Precison: 0.0014211755151761242|| AP: 0.6218996403084555
____________________
cls : bus|| Recall: 0.8421052631578947 || Precison: 0.0008425043441630246|| AP: 0.7195901250230623
____________________
cls : car|| Recall: 0.8545454545454545 || Precison: 0.002362165150525205|| AP: 0.7775615386015804
____________________
cls : cat|| Recall: 1.0 || Precison: 0.001188200650927313|| AP: 0.8049488314763458
____________________
cls : chair|| Recall: 0.7758620689655172 || Precison: 0.002310892004313665|| AP: 0.46282754992507574
____________________
cls : cow|| Recall: 0.9166666666666666 || Precison: 0.0005663097199341021|| AP: 0.7368307019777608
____________________
cls : diningtable|| Recall: 0.65 || Precison: 0.0007833212822366836|| AP: 0.3668397782093261
____________________
cls : dog|| Recall: 0.9615384615384616 || Precison: 0.002598077422707197|| AP: 0.8625212115702469
____________________
cls : horse|| Recall: 0.9230769230769231 || Precison: 0.0006301528120569238|| AP: 0.8547008547008548
____________________
cls : motorbike|| Recall: 0.9375 || Precison: 0.0007374268718352097|| AP: 0.9177083333333333
____________________
cls : person|| Recall: 0.8584474885844748 || Precison: 0.009370015948963317|| AP: 0.7914530917349596
____________________
cls : pottedplant|| Recall: 0.75 || Precison: 0.0007785332433694919|| AP: 0.4599851926825611
____________________
cls : sheep|| Recall: 0.6666666666666666 || Precison: 0.0007417611529087634|| AP: 0.529915371887559
____________________
cls : sofa|| Recall: 0.9285714285714286 || Precison: 0.0007365021811795365|| AP: 0.5368746286393346
____________________
cls : train|| Recall: 1.0 || Precison: 0.0008266597778351847|| AP: 0.9278143274853801
____________________
cls : tvmonitor|| Recall: 0.8421052631578947 || Precison: 0.0008258064516129032|| AP: 0.7607491923281398
____________________
mAP is : 0.7087650065697748

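This model is clearly better than the one I trained, and I don't know why.
Could you provide the config used when training this Faster R-CNN model?
I can't reach this level of mAP (>= 0.7) by tuning. Are there any tricks for training?
Thank you again for your patient reply.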

about voc_eval.py

I observed a difference in mAP depending on "use_diff": with it set to True I get an mAP of 0.69, and with it set to False I get 0.77. So I am trying to understand what the standard setting of "use_diff" is in the VOC metrics.

Training problem with ohem

When training with OHEM, an error occurred:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[2000,7,7,512]
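
For a sense of scale, the size of the tensor named in the error can be worked out directly from its shape. A rough sketch, assuming 4-byte float32 elements (the error line does not state the dtype, and the helper name tensor_bytes is made up for this example):

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for one dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


size = tensor_bytes([2000, 7, 7, 512])
print(size, "bytes =", round(size / 2**20, 1), "MiB")
```

Under that assumption this single tensor takes 200,704,000 bytes, roughly 191 MiB, and any other activations or gradients with the same leading dimension of 2000 would add to that.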

Question about the code in build_whole_network

        with tf.control_dependencies([rpn_labels]):
            with tf.variable_scope('sample_RCNN_minibatch'):
                rois, labels, bbox_targets = \
                tf.py_func(proposal_target_layer,
                           [rois, gtboxes_batch],
                           [tf.float32, tf.float32, tf.float32])
                rois = tf.reshape(rois, [-1, 4])
                labels = tf.to_int32(labels)
                labels = tf.reshape(labels, [-1])
                bbox_targets = tf.reshape(bbox_targets, [-1, 4*(cfgs.CLASS_NUM+1)])
                self.add_roi_batch_img_smry(input_img_batch, rois, labels)

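Why is with tf.control_dependencies([rpn_labels]) used in the code at line 404? As far as I can see, proposal_target_layer does not depend on rpn_labels, so why does rpn_labels need to be executed before proposal_target_layer is computed? Also, TensorFlow executes sequentially, so why add the control_dependencies here? Thank you.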

About OHEM

Hello, may I ask whether I can put the OHEM loss directly into FPN for training?

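About training with mobilenet

To train with mobilenet, is it enough to set
NET_NAME = 'MobilenetV2' #'MobilenetV2'
I have also downloaded the pre-trained model.
I tried training but never get any results: there are no errors, but no boxes are drawn and the mAP stays around 0.08, although the score_greater_05 images look fine.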

Incompatible shapes

Hi, excuse me, I encountered the following problem while training the network. May I ask for some assistance?

Traceback (most recent call last):
File "/home/vivian/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
return fn(*args)
File "/home/vivian/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
status, run_metadata)
File "/home/vivian/anaconda3/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [256,2,4] vs. [2688,2,4]
[[Node: build_loss/FastRCNN_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](build_loss/FastRCNN_loss/Reshape, build_loss/FastRCNN_loss/Reshape_1)]]

About anchor centers

In the function make_anchors(), there is this code:
x_centers = tf.range(featuremap_width, dtype=tf.float32) * stride
y_centers = tf.range(featuremap_height, dtype=tf.float32) * stride
It confuses me.
With this code, if stride is 32, x_centers and y_centers will be 0, 32, 64, 96...
In my opinion, x_centers and y_centers should be 16, 48, 80, 112...; then x and y would be at the center of each cell.
Is my understanding right?
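
The two conventions in this question can be compared side by side. The NumPy sketch below mirrors the quoted tf.range expression and adds an optional half-stride shift. The function anchor_centers and its half_stride_offset flag are hypothetical names for this illustration and do not exist in the project.

```python
import numpy as np


def anchor_centers(feature_len, stride, half_stride_offset=False):
    """Anchor center coordinates along one axis of the feature map.

    Without the offset, centers sit on the top-left corner of each
    stride-sized cell; with it, they sit in the middle of the cell.
    """
    centers = np.arange(feature_len, dtype=np.float32) * stride
    if half_stride_offset:
        centers = centers + stride / 2.0
    return centers


print(anchor_centers(4, 32))                           # 0, 32, 64, 96
print(anchor_centers(4, 32, half_stride_offset=True))  # 16, 48, 80, 112
```

The two grids differ by a constant shift of half a stride. Whichever convention is chosen, it has to be the same when anchors are generated for training and for inference.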

error: python eval.py

--eval_imgs=? --annotation_dir=?
How should this path be written?
I tried various forms and failed
