eragonruan / text-detection-ctpn

text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

License: MIT License

Python 91.50% Shell 0.07% Cython 5.17% C++ 0.09% Cuda 3.17%

text-detection tensorflow ctpn id-card ocr robust-reading

text-detection-ctpn's Introduction

text-detection-ctpn

Text detection mainly based on CTPN (Connectionist Text Proposal Network), implemented in TensorFlow. I use ID card detection as an example to demonstrate the results, but note that this model can be used for almost any horizontal scene text detection task. The original paper can be found here, and the original Caffe repo can be found here. For more detail about the paper and code, see this blog. If you have any questions, check the existing issues first; if the problem persists, open a new issue.

roadmap

freeze the graph for convenient inference
pure python, cython nms and cuda nms
loss function as referred in paper
oriented text connector
BLSTM

demo

for a quick demo, you don't have to build the library; simply use demo_pb.py for inference.
first, git clone git@github.com:eragonruan/text-detection-ctpn.git --depth=1
then, download the pb file from release
put ctpn.pb in data/
put your images in data/demo, then run the demo from the root; the results will be saved in data/results

python ./ctpn/demo_pb.py
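
Before running the demo, it can help to confirm that the files named in the steps above are where the script expects them. The following is a minimal sketch, assuming the layout described in this README (data/ctpn.pb, data/demo, ctpn/demo_pb.py); the helper name missing_demo_inputs is hypothetical and not part of the repo.

```python
import os


def missing_demo_inputs(root):
    """Return the expected demo paths (relative to root) that do not exist yet."""
    expected = [
        os.path.join("data", "ctpn.pb"),     # frozen graph downloaded from the release page
        os.path.join("data", "demo"),        # folder holding the input images
        os.path.join("ctpn", "demo_pb.py"),  # inference script named in the README
    ]
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]
```

Calling missing_demo_inputs(os.getcwd()) from the repo root should return an empty list once everything is in place.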

parameters

there are some parameters you may need to modify according to your requirements; you can find them in ctpn/text.yml

USE_GPU_NMS # whether to use nms implemented in cuda or not
DETECT_MODE # H represents horizontal mode, O represents oriented mode, default is H
checkpoints_path # the model I provided is in checkpoints/; if you train the model yourself, it will be saved in output/

training

setup

requirements: python2.7, tensorflow1.3, cython0.24, opencv-python, easydict (installing Anaconda is recommended)
if you do not have a gpu device, follow here to set up
if you have a gpu device, build the library by

cd lib/utils
chmod +x make.sh
./make.sh

prepare data

First, download the pre-trained VGG net model and put it in data/pretrain/VGG_imagenet.npy. You can download it from google drive or baidu yun.
Second, prepare the training data as described in the paper, or download the data I prepared from google drive or baidu yun. Alternatively, prepare your own data with the following steps.
Modify path and gt_path in prepare_training_data/split_label.py according to your dataset, then run

cd lib/prepare_training_data
python split_label.py

it will generate the prepared data in the current folder; then run

python ToVoc.py

to convert the prepared training data into voc format. It will generate a folder named TEXTVOC. Move this folder to data/ and then run

cd ../../data
ln -s TEXTVOC VOCdevkit2007

train

Simply run

python ./ctpn/train_net.py

you can modify some hyperparameters in ctpn/text.yml, or just use the parameters I set.
The model I provided in checkpoints was trained on a GTX1070 for 50k iters.
If you are using cuda nms, it takes about 0.2s per iter, so it will take about 2.5 hours to finish 50k iterations.
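
As a rough cross-check of the timing quoted above, the estimate is just seconds per iteration times the iteration count. This is only a sketch of that arithmetic; the function name is made up for illustration. Both figures are approximate: at exactly 0.2s per iter, 50k iterations come to about 2.8 hours, while 2.5 hours corresponds to roughly 0.18s per iter.

```python
def training_hours(seconds_per_iter, iterations):
    """Wall-clock estimate in hours for a fixed per-iteration cost."""
    return seconds_per_iter * iterations / 3600.0


print(round(training_hours(0.2, 50000), 2))   # 2.78
print(round(training_hours(0.18, 50000), 2))  # 2.5
```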

some results

NOTICE: all the photos used below are collected from the internet. If it affects you, please contact me to delete them.

oriented text connector

the oriented text connector has been implemented; it's working, but still needs further improvement.
the left figure is the result for DETECT_MODE H, the right figure for DETECT_MODE O


text-detection-ctpn's Issues

After CTPN. What is your idea?

Hello. eragonruan!
I was very impressed with your code.
One more time. Thank you so much.

I have a problem.
When the image passes through the CTPN, a green box is created.
If I put this image in the OCR engine, how should I separate it (green box)?
My idea is to use OpenCV.
But the green box is on the text. so it hides the text.
Can I draw a thin line to solve the problem?

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

When I try to train on my own dataset, I get this core dump:

Computing bounding-box regression targets...
bbox target means:
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[ 0. 0. 0. 0.]
bbox target stdevs:
[[ 0.1 0.1 0.2 0.2]
[ 0.1 0.1 0.2 0.2]]
[ 0.1 0.1 0.2 0.2]
Normalizing targets
done
Solving...
/data/resys/var/python2.7.3/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Segmentation fault (core dumped)

This fault is caused by the function train_model() in lib/fast_rcnn/train.py, when it runs train_op = opt.apply_gradients(list(zip(grads, tvars)), global_step=global_step).

Have you ever faced this error?

I really appreciate your work. And I saw the paper uses rnn(BLSTM). Did you use it in your code?

how to improve the performance in detecting small words?

SCALES and MAX_SIZE

@eragonruan Have you tested which values of SCALES and MAX_SIZE give a good balance between efficiency and accuracy?
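
To make the trade-off behind SCALES and MAX_SIZE concrete, here is a minimal sketch of how a Faster R-CNN style pipeline usually turns the two values into a resize factor. It assumes this repo follows the py-faster-rcnn convention (scale the short side to SCALES, then cap the long side at MAX_SIZE); that is an assumption, not something confirmed by the README. The defaults 600 and 1000 are the values shown in the config dump further down this page.

```python
def resize_factor(height, width, target_size=600, max_size=1000):
    """Scale so the short side reaches target_size, unless the long side would exceed max_size."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    if round(scale * long_side) > max_size:
        # The cap wins: the long side is pinned to max_size instead.
        scale = float(max_size) / long_side
    return scale


print(resize_factor(480, 640))    # 1.25
print(resize_factor(1000, 2000))  # 0.5
```

Under this assumption a 1000x2000 input is shrunk by half, so text that is about 20 pixels tall ends up about 10 pixels tall. Larger SCALES and MAX_SIZE keep small text legible at the cost of more computation per image.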

tensorboard

Dear author, could you add some summaries to the code so that the details of model structure and dataflow could be visualized in a tensorboard? We think it will help a lot. Thanks.

hello, i find ctpn has some trouble with small words!!!

Hello, I find CTPN has some trouble with small words, as in the image below. I have 1500 images, of which maybe 100 contain small words. I fine-tuned the model you provided with batch size 512; after 2000 iters, the total loss is 0.5.

I tried resizing the image to 1000x2000, as in split_label.py, and feeding it to the network, but it still cannot detect the text correctly. In the image, each word is about 20x20 pixels.

I try to change code from python2 to python3

I tried to port the code from python2 to python3. After fixing all the errors I could find, running the code raises the error below. I do not know where it comes from or how to resolve it. What can I do? Thank you so much.

2017-10-26 14:07:03.163176: W tensorflow/core/framework/op_kernel.cc:1158] Unknown: KeyError: b'TEST'
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/usr/local/lib/python3.6/contextlib.py", line 88, in __exit__
next(self.gen)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST'
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]
[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "ctpn/demo.py", line 95, in <module>
_, _ = test_ctpn(sess, net, im)
File "/root/chengjuntao/text-detection-ctpn/lib/fast_rcnn/test.py", line 171, in test_ctpn
rois = sess.run([net.get_output('rois')[0]],feed_dict=feed_dict)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST'
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]
[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

Caused by op 'rois/PyFunc', defined at:
File "ctpn/demo.py", line 85, in <module>
net = get_network("VGGnet_test")
File "/root/chengjuntao/text-detection-ctpn/lib/networks/factory.py", line 20, in get_network
return VGGnet_test()
File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 14, in __init__
self.setup()
File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 68, in setup
.proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois'))
File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 28, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 241, in proposal_layer
[tf.float32,tf.float32])
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 198, in py_func
input=inp, token=token, Tout=Tout, name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_script_ops.py", line 38, in _py_func
name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()

UnknownError (see above for traceback): KeyError: b'TEST'
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]
[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]
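
A likely reading of KeyError: b'TEST': under Python 3, a string passed through a PyFunc op arrives in the Python callback as bytes, so a config lookup keyed by 'TEST' no longer matches. A minimal sketch of one workaround is to normalize the key before the lookup. The helper name as_text_key and the trimmed cfg dict are illustrative assumptions, not code from the repo.

```python
def as_text_key(key):
    """Decode a bytes key (as delivered under Python 3) back to str; leave str untouched."""
    if isinstance(key, bytes):
        return key.decode("utf-8")
    return key


# Trimmed stand-in for the real config; only the lookup pattern matters here.
cfg = {"TEST": {"RPN_NMS_THRESH": 0.7}}
print(cfg[as_text_key(b"TEST")]["RPN_NMS_THRESH"])  # 0.7
```

The same normalization would apply wherever the proposal layer indexes the config with the mode string it receives.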

RuntimeError: Graph is finalized and cannot be modified.

Using config:
{'ANCHOR_SCALES': [16],
'DATA_DIR': '/home/text-detection-ctpn/data',
'DEDUP_BOXES': 0.0625,
'EPS': 1e-14,
'EXP_DIR': 'ctpn_end2end',
'GPU_ID': 0,
'IS_EXTRAPOLATING': True,
'IS_MULTISCALE': False,
'IS_RPN': True,
'LOG_DIR': 'ctpn',
'MATLAB': 'matlab',
'MODELS_DIR': '/home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/models/pascal_voc',
'NCLASSES': 2,
'NET_NAME': 'VGGnet',
'PIXEL_MEANS': array([[[ 102.9801, 115.9465, 122.7717]]]),
'REGION_PROPOSAL': 'RPN',
'RNG_SEED': 3,
'ROOT_DIR': '/home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn',
'SUBCLS_NAME': 'voxel_exemplars',
'TEST': {'BBOX_REG': True,
'HAS_RPN': True,
'MAX_SIZE': 1000,
'NMS': 0.3,
'PROPOSAL_METHOD': 'selective_search',
'RPN_MIN_SIZE': 8,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 1000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SVM': False},
'TRAIN': {'ASPECTS': [1],
'ASPECT_GROUPING': True,
'BATCH_SIZE': 300,
'BBOX_INSIDE_WEIGHTS': [1, 1, 1, 1],
'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_NORMALIZE_TARGETS': True,
'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
'BBOX_REG': True,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'DISPLAY': 10,
'DONTCARE_AREA_INTERSECTION_HI': 0.5,
'FG_FRACTION': 0.3,
'FG_THRESH': 0.5,
'GAMMA': 0.1,
'HAS_RPN': True,
'IMS_PER_BATCH': 1,
'KERNEL_SIZE': 5,
'LEARNING_RATE': 0.001,
'LOG_IMAGE_ITERS': 100,
'MAX_SIZE': 1000,
'MOMENTUM': 0.9,
'OHEM': False,
'PRECLUDE_HARD_SAMPLES': True,
'PROPOSAL_METHOD': 'gt',
'RANDOM_DOWNSAMPLE': False,
'RPN_BATCHSIZE': 256,
'RPN_BBOX_INSIDE_WEIGHTS': [1, 1, 1, 1],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 8,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SCALES_BASE': [0.25, 0.5, 1.0, 2.0, 3.0],
'SNAPSHOT_INFIX': '',
'SNAPSHOT_ITERS': 1000,
'SNAPSHOT_PREFIX': 'VGGnet_fast_rcnn',
'SOLVER': 'Momentum',
'STEPSIZE': 50,
'USE_FLIPPED': True,
'USE_PREFETCH': False,
'WEIGHT_DECAY': 0.0005},
'USE_GPU_NMS': True}
<bound method pascal_voc.default_roidb of <lib.datasets.pascal_voc.pascal_voc object at 0x7f04ee1b6910>>
Loaded dataset voc_2007_trainval for training
Appending horizontally-flipped training examples...
voc_2007_trainval gt roidb loaded from /home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/data/cache/voc_2007_trainval_gt_roidb.pkl
done
Preparing training data...
done
Output will be saved to /home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/output/ctpn_end2end/voc_2007_trainval
Logs will be saved to /home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/logs/ctpn/voc_2007_trainval/2017-10-26-16-03-04
/gpu:0
Tensor("data:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
WARNING:tensorflow:<tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.BasicLSTMCell object at 0x7f0516b88450>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
Tensor("lstm_o/Reshape:0", shape=(?, ?, ?, 128), dtype=float32)
Tensor("lstm_o/Reshape:0", shape=(?, ?, ?, 128), dtype=float32)
Tensor("rpn_cls_score/Reshape:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("gt_boxes:0", shape=(?, 5), dtype=float32)
Tensor("gt_ishard:0", shape=(?,), dtype=int32)
Tensor("dontcare_areas:0", shape=(?, 4), dtype=float32)
Tensor("im_info:0", shape=(?, 3), dtype=float32)
Tensor("rpn_cls_score/Reshape:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("Reshape_5:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("rpn_bbox_pred/Reshape:0", shape=(?, ?, ?, 40), dtype=float32)
Tensor("im_info:0", shape=(?, 3), dtype=float32)
Tensor("rpn_rois_data/Reshape:0", shape=(?, 5), dtype=float32)
Tensor("rpn_rois_data/PyFunc:1", dtype=float32)
Tensor("gt_boxes:0", shape=(?, 5), dtype=float32)
Tensor("gt_ishard:0", shape=(?,), dtype=int32)
Tensor("dontcare_areas:0", shape=(?, 4), dtype=float32)
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Computing bounding-box regression targets...
bbox target means:
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[ 0. 0. 0. 0.]
bbox target stdevs:
[[ 0.1 0.1 0.2 0.2]
[ 0.1 0.1 0.2 0.2]]
[ 0.1 0.1 0.2 0.2]
Normalizing targets
done
Solving...
/home/peng_yuxiang/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Loading pretrained model weights from data/pretrain/VGG_imagenet.npy
assign pretrain model weights to conv5_1
assign pretrain model biases to conv5_1
ignore fc6
ignore fc6
assign pretrain model weights to conv5_3
assign pretrain model biases to conv5_3
ignore fc7
ignore fc7
ignore fc8
ignore fc8
assign pretrain model weights to conv5_2
assign pretrain model biases to conv5_2
assign pretrain model weights to conv4_1
assign pretrain model biases to conv4_1
assign pretrain model weights to conv4_2
assign pretrain model biases to conv4_2
assign pretrain model weights to conv4_3
assign pretrain model biases to conv4_3
assign pretrain model weights to conv3_3
assign pretrain model biases to conv3_3
assign pretrain model weights to conv3_2
assign pretrain model biases to conv3_2
assign pretrain model weights to conv3_1
assign pretrain model biases to conv3_1
assign pretrain model weights to conv1_1
assign pretrain model biases to conv1_1
assign pretrain model weights to conv1_2
assign pretrain model biases to conv1_2
assign pretrain model weights to conv2_2
assign pretrain model biases to conv2_2
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1

iter: 0 / 180000, total loss: 1.8194, rpn_loss_cls: 0.6936, rpn_loss_box: 0.0755, rpn_loss: 1.0502, lr: 0.001000
speed: 18.390s / iter
image: img_5836.jpg iter: 10 / 180000, total loss: 1.6851, rpn_loss_cls: 0.6919, rpn_loss_box: 0.0470, rpn_loss: 0.9462, lr: 0.001000
speed: 20.903s / iter
image: img_2801.jpg iter: 20 / 180000, total loss: 2.7015, rpn_loss_cls: 0.6911, rpn_loss_box: 0.0254, rpn_loss: 1.9850, lr: 0.001000
speed: 18.728s / iter
image: img_1023.jpg iter: 30 / 180000, total loss: 1.9418, rpn_loss_cls: 0.6903, rpn_loss_box: 0.0254, rpn_loss: 1.2261, lr: 0.001000
speed: 14.129s / iter
image: img_3455.jpg iter: 40 / 180000, total loss: 1.7795, rpn_loss_cls: 0.6869, rpn_loss_box: 0.0630, rpn_loss: 1.0296, lr: 0.001000
speed: 16.138s / iter
image: img_1155.jpg
Traceback (most recent call last):
File "ctpn/train_net.py", line 39, in <module>
restore=bool(int(0)))
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 338, in train_net
sw.train_model(sess, max_iters, restore=restore)
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 167, in train_model
sess.run(tf.assign(lr, lr.eval() * cfg.TRAIN.GAMMA))
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
use_locking=use_locking, name=name)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 491, in apply_op
preferred_dtype=default_dtype)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 169, in constant
attrs={"value": tensor_value, "dtype": dtype_value}, name=name).outputs[0]
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2354, in create_op
self._check_not_finalized()
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2077, in _check_not_finalized
raise RuntimeError("Graph is finalized and cannot be modified.")
RuntimeError: Graph is finalized and cannot be modified.

(1) Since I am testing on CPU, I set STEPSIZE=50 in text.yml to run a short training test.
(2) The model loads properly and the training loss changes, so the training step should be right.
But at the end I get: RuntimeError: Graph is finalized and cannot be modified.
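
The traceback shows tf.assign being called inside train_model after the graph has been finalized. tf.assign creates a new op, which a finalized graph rejects. The log stops just after iter 40, which is consistent with the first learning-rate decay firing at iter 50 when STEPSIZE=50 (assuming decay triggers on multiples of STEPSIZE). One possible remedy is to build the assign op once before finalization; another is to compute the rate in plain Python and feed it in through a placeholder. The sketch below covers only the Python-side schedule for the second option; the function name is hypothetical and the defaults mirror LEARNING_RATE, GAMMA and STEPSIZE from the config above.

```python
def learning_rate_at(iteration, base_lr=0.001, gamma=0.1, stepsize=50):
    """Step decay: multiply the base rate by gamma once per completed stepsize block."""
    return base_lr * gamma ** (iteration // stepsize)


for it in (0, 40, 50, 100):
    print(it, learning_rate_at(it))
```

Because no new graph op is created per step, this approach does not touch the finalized graph.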

TypeError: 'NoneType' object is not subscriptable

run demo.py

InvalidArgumentError (see above for traceback): TypeError: 'NoneType' object is not subscriptable

[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

How To Setting Training Data Question

Excuse me, I'm a beginner. When using this program, I don't know how to set up the training data. When I followed the "Prepare" step, I couldn't find the training data on Baidu Yun. Could you tell me where to get the training data (a link), or how I can set it up myself?

Problem with CPU model

Hi author, thanks. I only have a CPU and want to run demo.py.
According to the README.md, it seems that, in CPU mode, we need to
do the following two things before running:
(a) Set USE_GPU_NMS=False in ctpn/text.yml;
(b) Execute chmod +x make.sh and ./make.sh in the directory lib/utils.
Is this right?

However, when I run the command ./make.sh, it complains as follows:

Traceback (most recent call last):
File "setup.py", line 39, in <module>
CUDA = locate_cuda()
File "setup.py", line 26, in locate_cuda
raise EnvironmentError('The nvcc binary could not be '
OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME
mv: cannot stat 'utils/*': No such file or directory

Then what should I do to run the demo with only a CPU?

I have been stuck for a whole day. I am looking forward to your reply.
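
The error message says the build looks for nvcc on $PATH or under $CUDAHOME, so on a CPU-only machine the GPU build path in make.sh will fail as shown. A quick way to confirm which situation you are in is to repeat that lookup yourself. This is a sketch using the Python 3 standard library; find_nvcc is a hypothetical helper, and the bin/nvcc location under CUDAHOME is an assumption about a typical CUDA install.

```python
import os
import shutil


def find_nvcc(env=None):
    """Return a path to nvcc if one can be found via CUDAHOME or PATH, else None."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDAHOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate):
            return candidate
    # Fall back to a normal PATH search.
    return shutil.which("nvcc", path=env.get("PATH", ""))


print(find_nvcc())
```

If this prints None, the CUDA parts of the build cannot succeed on that machine, and the CPU-only setup linked from the README is the route to follow, with USE_GPU_NMS set to False.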

Now it works! But I have a question about VGG16Net.

After failing to train for the first time, I did not sleep for two days and did my research.
I found the optimal value by modifying the parameter part of the text.yml file. (for my computer)
I was able to solve it with your help. Thank you so much.

I am currently researching text detection on business cards.
A business card image contains text, a face, and a company logo.
It is very different from a natural image. The elements are very limited.

Anyway, (Q.1) what data did you use to pre-train VGG16NET?
(Q.2) Would it be okay if both VGG16net and fast_rcnn (CTPN) were trained on business card images?
What are your thoughts?

example)
100 million text, human face, QR code, company logo images -> training -> VGGnet.
100 million business card images -> training -> CTPN.

(Q.3) If I train VGG16NET, which source code should I use?

Thank you for your continued help.
You will receive many blessings.

ModuleNotFoundError: No module named 'lib'

When I try to execute demo.py I receive this error:

File "demo.py", line 9, in <module>
from lib.networks.factory import get_network
ModuleNotFoundError: No module named 'lib'

Can someone please help me with this?
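
The traceback path is just "demo.py", which suggests the script was started from inside ctpn/ rather than from the repo root, where the lib package lives. The README runs the demo from the root. If a script needs to work from any directory, one option is to put the repo root on sys.path explicitly before importing lib. A minimal sketch, assuming the script sits one level below the root; the helper name is hypothetical.

```python
import os
import sys


def add_repo_root_to_path(script_file, levels_up=1):
    """Put the directory `levels_up` above the script's folder on sys.path and return it."""
    root = os.path.dirname(os.path.abspath(script_file))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Inside a script under ctpn/, calling add_repo_root_to_path(__file__) before the lib imports would make "from lib.networks.factory import get_network" resolvable regardless of the working directory.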

Excuse me! Do you have any tricks for training the model? Why does it detect nothing after I train on the MLT dataset with default parameters for 50000 iters?

Excuse me! Do you have any tricks for training the model? Why does the model detect nothing after I train on the MLT dataset with the default parameters for 50000 iters?

AttributeError: 'module' object has no attribute 'ndarray'

When I try to train on my own data, I get this error:

Traceback (most recent call last):
File "ctpn/train_net.py", line 41, in <module>
restore=bool(int(1)))
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/fast_rcnn/train.py", line 234, in train_net
sw = SolverWrapper(sess, network, imdb, roidb, output_dir, logdir= log_dir, pretrained_model=pretrained_model)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/fast_rcnn/train.py", line 27, in __init__
self.bbox_means, self.bbox_stds = rdl_roidb.add_bbox_regression_targets(roidb)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/roi_data_layer/roidb.py", line 58, in add_bbox_regression_targets
_compute_targets(rois, max_overlaps, max_classes)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/roi_data_layer/roidb.py", line 136, in _compute_targets
targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/fast_rcnn/bbox_transform.py", line 21, in bbox_transform
assert np.min(ex_widths) > 0.1 and np.min(ex_heights) > 0.1,
File "/data/resys/var/python2.7.3/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2362, in amin
if type(a) is not mu.ndarray:
AttributeError: 'module' object has no attribute 'ndarray'
Segmentation fault (core dumped)

Did you train your model on a Chinese database?

@eragonruan Hi, thank you for sharing. I have a question. Did you train your model on a Chinese database? I tested your model on some pictures with Chinese text and it did a good job. Could you tell me which database you chose and how to get it? Thank you very much.

Can you provide the training dataset?

I am also reading the CTPN paper and code, but I have never been clear on how its data annotation and side-refinement are done.

Bus error: 10

When I run demo.py, this error appears:
Loading network VGGnet_test... Restoring from checkpoints/VGGnet_fast_rcnn_iter_50000.ckpt... done
done.
Bus error: 10

I compiled the code on a Mac Pro.

output of classification is the same!!!

@eragonruan I am sorry to bother you again! I have maybe 3500 images, and I set batch size 256 and lr 0.0001. After 5000 iters, the classification output is always the same. I do not think it is caused by overfitting, because when I test images from the training dataset, it detects nothing. However, the classification output looked fine early in training. Do you know what happened?

how to deal with skew box

Hello, I saw your blog and that you optimized CTPN so that it can detect skewed words. I want to know: if you add an angle to the loss function, how do you handle the ground truth? Do you divide the ground truth into small boxes along the skew direction? @eragonruan

about the detection speed

It takes about 2s per picture on average. How long does it normally take? If I want to switch to gpu, what should I modify?

rpn_loss_box is always equal 0

When I train the model with MLT data, rpn_loss_box is always equal to 0. Have you changed something in the build_loss function? @eragonruan

loss become nan, invalid value encountered in log

text-detection-ctpn-master1/lib/fast_rcnn/bbox_transform.py:29: RuntimeWarning: invalid value encountered in log
targets_dh = np.log(gt_heights / ex_heights)

In my training process, this warning appears and the loss becomes nan. Why do I get this error?
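
The warning comes from np.log(gt_heights / ex_heights), which is undefined when that ratio is zero or negative. One common cause is a ground-truth box whose corners are swapped or collapsed, for example y2 smaller than y1. The sketch below scans a list of boxes for such cases. It assumes (x1, y1, x2, y2) corner order and the "+ 1" width and height convention used in py-faster-rcnn style code; both are assumptions about this repo, and the function name is hypothetical.

```python
def find_degenerate_boxes(boxes):
    """Return indices of boxes whose width or height is not strictly positive."""
    bad = []
    for index, (x1, y1, x2, y2) in enumerate(boxes):
        width = x2 - x1 + 1.0
        height = y2 - y1 + 1.0
        if width <= 0 or height <= 0:
            bad.append(index)
    return bad


boxes = [(0, 0, 10, 10), (10, 0, 5, 8), (0, 9, 4, 2)]
print(find_degenerate_boxes(boxes))  # [1, 2]
```

Running a check like this over the annotation files before training makes it easier to tell a data problem from a genuine divergence of the loss.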

What files should be in the gt_path folder?

Hello. eragonruan !
I was very surprised to see your code.
Thank you for sharing a good source.
I want to train my own image data.
You told me to edit the file split_label.py ("path" and "gt_path").
I think the images go in the "path" folder.
But I don't know about "gt_path".

What files should be in the "gt_path" folder?
Can you give me a sample? (one image and gt_path.txt)

Thank you!

something goes wrong when I train the model

Hi, I am training the model on the VOC2007 dataset. I changed NCLASSES from 2 to 21 in ctpn/text.yml.
But it goes wrong after about 100 batches.
Here is the output.

assign pretrain model biases to conv1_1
assign pretrain model weights to conv1_2
assign pretrain model biases to conv1_2
assign pretrain model weights to conv2_2
assign pretrain model biases to conv2_2
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1
iter: 0 / 180000, total loss: 3.0214, rpn_loss_cls: 0.6958, rpn_loss_box: 0.7112, rpn_loss: 1.6144, lr: 0.001000
speed: 4.622s / iter
image: 003311.jpg
iter: 10 / 180000, total loss: 3.3073, rpn_loss_cls: 0.6902, rpn_loss_box: 0.8598, rpn_loss: 1.7573, lr: 0.001000
speed: 3.948s / iter
image: 008127.jpg
iter: 20 / 180000, total loss: 2.6549, rpn_loss_cls: 0.6904, rpn_loss_box: 0.5334, rpn_loss: 1.4311, lr: 0.001000
speed: 1.100s / iter
image: 009458.jpg
iter: 30 / 180000, total loss: 3.3055, rpn_loss_cls: 0.6721, rpn_loss_box: 0.8771, rpn_loss: 1.7563, lr: 0.001000
speed: 1.115s / iter
image: 003082.jpg
iter: 40 / 180000, total loss: 3.7248, rpn_loss_cls: 0.6609, rpn_loss_box: 1.0980, rpn_loss: 1.9660, lr: 0.001000
speed: 3.983s / iter
image: 007029.jpg
iter: 50 / 180000, total loss: 2.5844, rpn_loss_cls: 0.6593, rpn_loss_box: 0.5294, rpn_loss: 1.3956, lr: 0.001000
speed: 1.080s / iter
2017-09-27 10:41:52.117726: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
2017-09-27 10:41:52.118615: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
2017-09-27 10:41:52.118645: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
2017-09-27 10:41:52.118648: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
2017-09-27 10:41:52.118656: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]

and the exception is below:

File "/home/text-detection-ctpn/ctpn/train_net.py", line 39, in <module>
  restore=bool(int(0)))
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 341, in train_net
  sw.train_model(sess, max_iters, restore=restore)
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 215, in train_model
  summary_str, _= sess.run(fetches=fetch_list, feed_dict=feed_dict)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
  run_metadata_ptr)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
  feed_dict_tensor, options, run_metadata)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
  options, run_metadata)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
  raise type(e)(node_def, op, message)

tensorflow.python.framework.errors_impl.InvalidArgumentError: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
[[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
[[Node: roi-data/PyFunc/_137 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_630_roi-data/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Caused by op u'roi-data/PyFunc', defined at:
File "/usr/lib/wingide6/bin/wingdb.py", line 978, in <module>
main()
File "/usr/lib/wingide6/bin/wingdb.py", line 918, in main
netserver.abstract.kFileSystemEncoding, orig_sys_path)
File "/usr/lib/wingide6/bin/wingdb.py", line 766, in DebugFile

I am exhausted from trying to fix this bug. Any suggestions?
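One possible cause, offered as an assumption rather than a diagnosis: NumPy raises this ValueError when a minimum is taken over an empty array, which could happen inside the roi-data PyFunc if a training image arrives with no usable ground-truth boxes. Below is a minimal sketch of a check to run over the annotations before training. The function name and the (xmin, ymin, xmax, ymax) layout are hypothetical and not taken from this repo.

```python
def find_images_without_boxes(annotations):
    """Return names of images whose box list is empty or degenerate.

    `annotations` maps an image name to a list of (xmin, ymin, xmax, ymax)
    tuples. This layout is an assumption made for illustration.
    """
    bad = []
    for name, boxes in sorted(annotations.items()):
        # Keep only boxes with positive width and height.
        valid = [b for b in boxes if b[2] > b[0] and b[3] > b[1]]
        if not valid:
            bad.append(name)
    return bad


if __name__ == "__main__":
    sample = {
        "a.jpg": [(10, 10, 26, 40)],
        "b.jpg": [],                # no ground truth at all
        "c.jpg": [(5, 5, 5, 30)],   # zero-width box
    }
    print(find_images_without_boxes(sample))
```

Any image this reports could be removed from the training list, or its label file inspected, before restarting training.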

Is there a way to train my own image data?

Hello. eragonruan!
I was surprised to see your code.
I respect you.
Is there a way to train my own image data?
Thanks!

kernel not found in checkpoint

Hi, I got this problem:
Loading network VGGnet_test...
Traceback (most recent call last):
File "./ctpn/demo.py", line 90, in
saver.restore(sess, os.path.join(os.getcwd(),"checkpoints/model_final.ckpt"))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1548, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]

Caused by op u'save/RestoreV2_28', defined at:
File "./ctpn/demo.py", line 89, in
saver = tf.train.Saver()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1139, in init
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1170, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 691, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 640, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

NotFoundError (see above for traceback): Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]

So I searched the net and found this issue.
I changed your demo code like this:

if __name__ == '__main__':
    if os.path.exists("data/results/"):
        shutil.rmtree("data/results/")
    os.makedirs("data/results/")

    cfg.TEST.HAS_RPN = True  # Use RPN for proposals
    # init session
    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(config=config)
    # load network
    net = get_network("VGGnet_test")
    # load model
    print ('Loading network {:s}... '.format("VGGnet_test")),

I changed this part

OLD_CHECKPOINT_FILE = "checkpoints/model_final.ckpt"
#NEW_CHECKPOINT_FILE = "model_final.ckpt"  
vars_to_rename = {
    "lstm/basic_lstm_cell/weights": "lstm/basic_lstm_cell/kernel",
    "lstm/basic_lstm_cell/biases": "lstm/basic_lstm_cell/bias",
}
new_checkpoint_vars = {}
reader = tf.train.NewCheckpointReader(OLD_CHECKPOINT_FILE)
for old_name in reader.get_variable_to_shape_map():
    if old_name in vars_to_rename:
        new_name = vars_to_rename[old_name]
    else:
        new_name = old_name
    new_checkpoint_vars[new_name] = tf.Variable(reader.get_tensor(old_name))

saver = tf.train.Saver(new_checkpoint_vars)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    saver.restore(sess, os.path.join(os.getcwd(),"checkpoints/model_final.ckpt"))
    print 'done'
    im = 128 * np.ones((300, 300, 3), dtype=np.uint8)
    for i in xrange(2):
        _, _ = test_ctpn(sess, net, im)
    im_names = glob.glob(os.path.join(cfg.DATA_DIR, 'demo', '*.png')) + \
               glob.glob(os.path.join(cfg.DATA_DIR, 'demo', '*.jpg'))
    for im_name in im_names:
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
        #print('Demo for {:s}'.format(im_name))
        ctpn(sess, net, im_name)

It runs now, but in ctpn the line is empty, so the bounding boxes did not find the right proposals.
I thought the threshold might be a little high, so I lowered it.
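A note on the rename table above, offered as a sketch rather than a confirmed fix: the key reported missing is `lstm_o/rnn/basic_lstm_cell/kernel`, while the table only lists names under `lstm/basic_lstm_cell/`, so it may never match anything in the checkpoint. TensorFlow 1.2 renamed RNN cell variables from `weights`/`biases` to `kernel`/`bias`, so matching on the suffix is one option. The helper below is hypothetical and pure Python. The names it receives would come from `reader.get_variable_to_shape_map()`.

```python
# Old suffix -> new suffix, following the TensorFlow 1.2 RNN cell rename.
SUFFIX_RENAMES = {
    "basic_lstm_cell/weights": "basic_lstm_cell/kernel",
    "basic_lstm_cell/biases": "basic_lstm_cell/bias",
}


def build_rename_map(old_names):
    """Map every checkpoint variable name to its name in the newer graph."""
    renamed = {}
    for old in old_names:
        new = old
        for old_suffix, new_suffix in SUFFIX_RENAMES.items():
            if old.endswith(old_suffix):
                # Swap only the trailing part so any scope prefix is kept.
                new = old[:-len(old_suffix)] + new_suffix
                break
        renamed[old] = new
    return renamed


if __name__ == "__main__":
    # Example names only; the real ones must be read from the checkpoint.
    names = ["lstm_o/rnn/basic_lstm_cell/weights", "conv1_1/weights"]
    for old, new in sorted(build_rename_map(names).items()):
        print("{} -> {}".format(old, new))
```

Matching on the suffix keeps whatever scope prefix the checkpoint actually uses, so the table does not have to guess it.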

BTW, can you speak Chinese? :)

Isn't there a test.py to evaluate the model after training?

After training, there is only a .ckpt file generated but not any evaluation.

Do you know which datasets CTPN uses? Did the authors build the dataset themselves without releasing it?

split_label.py deals with .txt files, but VOC2007 provides .xml; also, importing python_nms fails in the demo

Thanks for writing this repo

I have incorporated it in my code / blogpost where I had earlier used the Caffe version.

https://github.com/AKSHAYUBHAT/DeepVideoAnalytics/tree/master/notebooks/OCR

Why are roi_pool_op and psroi_pooling_op undefined?

These two imports are commented out.

roi_pool_op is required on line 211.
psroi_pooling_op is required on line 227.

No results are saved.

I ran demo.py, but no results are saved.

The parameter 'inds' in the function 'save_results' is always equal to 0.

Is there a problem?

How can the model run on multiple GPUs?

Right now I can only run the model on one GPU. Can I run it on multiple GPUs, and what do I need to do?

why is lr so low?

Hello, I have a question. I am trying to train the model. If I set lr to 0.00001, it converges slowly, but if I set lr higher, it never converges and even produces NaN. In fact, I feel that 0.00001 is so low that the model can learn nothing. Why does this happen?
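One common way to try a higher learning rate without hitting NaN is to clip gradients by their global norm before applying them. This is a general technique, not something this repo is confirmed to do, and the pure-Python helper below is only an illustration of the arithmetic. In TensorFlow 1.x the equivalent call is `tf.clip_by_global_norm`.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        # Already small enough: return an unchanged copy.
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    # The norm of [3, 4] is 5, so every value is scaled by 1/5.
    print(clip_by_global_norm([3.0, 4.0], 1.0))
```

Clipping only rescales the step. It does not change its direction, so it limits the damage of an occasional exploding gradient while leaving normal updates untouched.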

Problem

2017-11-05 04:40:59.256258: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /media/D/code/OCR/text-detection-ctpn/output/ctpn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_50000.ckpt: Not found: /media/D/code/OCR/text-detection-ctpn/output/ctpn_end2end/voc_2007_trainval; No such file or directory

how can I fix it?
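The log above says the directory output/ctpn_end2end/voc_2007_trainval itself was not found, so a quick check before restoring can tell a missing directory apart from a missing checkpoint file. This is a minimal sketch. The helper name is hypothetical and it only inspects the file system.

```python
import glob
import os


def describe_checkpoint(prefix):
    """Explain why a checkpoint prefix such as '.../model.ckpt' cannot be read."""
    directory = os.path.dirname(prefix) or "."
    if not os.path.isdir(directory):
        return "missing directory: " + directory
    # A V2 checkpoint is stored as several files that share the prefix.
    if not glob.glob(prefix + ".*"):
        return "no files match: " + prefix + ".*"
    return "ok"


if __name__ == "__main__":
    print(describe_checkpoint(
        "output/ctpn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_50000.ckpt"))
```

If the directory is missing, the path recorded in the `checkpoint` index file may point to where the model was originally trained rather than to where it lives now.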

How to freeze the model graph from ckpt?

I want to freeze the parameters of your pretrained model into a single pb file.
I added some code to demo.py, but it failed.
This requires knowing the output nodes of the model. I printed all nodes in the pretrained model, and I think these two may be the output nodes: rois/Reshape/shape, rois/Reshape. Is that right?
Here is my code:

from tensorflow.python.framework import graph_util  # provides convert_variables_to_constants

if __name__ == '__main__':
    if os.path.exists("data/results/"):
        shutil.rmtree("data/results/")
    os.makedirs("data/results/")

    cfg_from_file('ctpn/text.yml')

    # init session
    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(config=config)
    # load network
    net = get_network("VGGnet_test")
    # load model
    print(('Loading network {:s}... '.format("VGGnet_test")), end=' ')
    saver = tf.train.Saver()

    try:
        ckpt = tf.train.get_checkpoint_state(cfg.TEST.checkpoints_path)
        #ckpt=tf.train.get_checkpoint_state("output/ctpn_end2end/voc_2007_trainval/")
        print('Restoring from {}...'.format(ckpt.model_checkpoint_path), end=' ')
        saver.restore(sess, ckpt.model_checkpoint_path)
        print('done')
    except Exception:
        # A plain string cannot be raised, so wrap the message in an exception.
        raise RuntimeError('Check your pretrained checkpoint in {:s}'.format(cfg.TEST.checkpoints_path))
    print (' done.')

    print('all nodes are:\n')
    graph = tf.get_default_graph()
    input_graph_def = graph.as_graph_def()
    node_names = [node.name for node in input_graph_def.node]
    for x in node_names:
        print(x)
    output_node_names = 'rois/Reshape/shape,rois/Reshape'
    output_graph_def = graph_util.convert_variables_to_constants(sess, input_graph_def, output_node_names.split(','))
    output_graph = 'ctpn.pb'
    with tf.gfile.GFile(output_graph, 'wb') as f:
        f.write(output_graph_def.SerializeToString())
    sess.close()
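A hedged observation on the choice of output nodes: by TensorFlow's usual naming, `rois/Reshape/shape` is the constant shape argument fed to `rois/Reshape`, so it is probably not a useful output. Separately, a traceback elsewhere in this thread shows a `rois/PyFunc` node, and the TensorFlow documentation for `tf.py_func` states that the function body is not serialized in the GraphDef, so a frozen graph that depends on it may not load in another process. The sketch below only classifies requested names against a node list. The helper and the sample nodes are hypothetical.

```python
def check_output_nodes(nodes, spec):
    """Classify each requested output name.

    `nodes` is a list of (name, op_type) pairs, e.g. built from
    [(n.name, n.op) for n in graph_def.node]. `spec` is the comma-separated
    string that would be split and passed to convert_variables_to_constants.
    """
    op_by_name = dict(nodes)
    report = {}
    for name in (part.strip() for part in spec.split(",")):
        if not name:
            continue
        if name not in op_by_name:
            report[name] = "missing"
        elif op_by_name[name] == "Const":
            report[name] = "constant, unlikely to be a real output"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    # Illustrative node list; read the real one from the restored graph.
    sample_nodes = [
        ("rois/PyFunc", "PyFunc"),
        ("rois/Reshape/shape", "Const"),
        ("rois/Reshape", "Reshape"),
    ]
    for name, status in sorted(check_output_nodes(
            sample_nodes, "rois/Reshape/shape,rois/Reshape").items()):
        print("{}: {}".format(name, status))
```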

Bug in cpu only mode, maximum recursion depth exceeded in comparison

Caused by op 'rois/PyFunc', defined at:
File "D:/Project_PictureDetectiveSystem/src/PD-Service/deeplearn/ocr/textdetectionctpn/ctpn/demo.py", line 86, in
net = get_network("VGGnet_test")
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\factory.py", line 8, in get_network
return VGGnet_test()
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\VGGnet_test.py", line 14, in init
self.setup()
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\VGGnet_test.py", line 54, in setup
.proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois'))
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\network.py", line 23, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\network.py", line 178, in proposal_layer
[tf.float32,tf.float32])
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\ops\script_ops.py", line 203, in py_func
input=inp, token=token, Tout=Tout, name=name)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 36, in _py_func
name=name)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\framework\ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\framework\ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

UnknownError (see above for traceback): RecursionError: maximum recursion depth exceeded in comparison
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5, rpn_bbox_pred/Reshape, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

An error is raised here. In the same environment, the previous version ran without errors. What is going on?

Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint

Dear eragonruan,
I think your implementation of CTPN is excellent work. When I run your demo, all layers load successfully except the LSTM layer. It shows the following output:
Loading network VGGnet_test...
Traceback (most recent call last):
File "ctpn/demo.py", line 89, in
saver.restore(sess, os.path.join(os.getcwd(),"checkpoints/model_final.ckpt"))
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]

Caused by op u'save/RestoreV2_28', defined at:
File "ctpn/demo.py", line 88, in
saver = tf.train.Saver()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1140, in init
self.build()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

Could you please tell me where I'm wrong? The version of TensorFlow that I am using is 1.3

model can't converge

Using the same dataset, after updating to the newest code, the model can't converge. Why?

can not detect anything

I used your latest code and your data, but it does not converge at all. Please tell me why.

Bug In cpu only

Running the demo on Windows fails with: Process finished with exit code -1073741571. How can this be solved?
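A small sketch for decoding that exit code: Windows reports it as a signed 32-bit number, and reinterpreting it as unsigned gives 0xC00000FD, which Windows documents as STATUS_STACK_OVERFLOW. That would be consistent with the RecursionError reported in the other CPU-only issue in this thread, though the link between the two is an assumption. The helper name is hypothetical.

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as an unsigned hex string."""
    return "0x{:08X}".format(exit_code & 0xFFFFFFFF)


if __name__ == "__main__":
    print(to_ntstatus(-1073741571))  # 0xC00000FD
```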

fatal error: numpy/arrayobject.h: No such file or directory

When I try to build the nms implementation by running ./make.sh, this error occurs:
bbox.c:517:31: fatal error: numpy/arrayobject.h: No such file or directory
It seems to be a path problem, but where should I modify the path?
The path of numpy in my environment is:
'/usr/local/lib/python2.7/dist-packages/numpy/core/include'
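One way to pass that directory to the compiler, sketched under the assumption that the build honors the CFLAGS environment variable (the "Running with CPU only" post in this thread exports CFLAGS for the same purpose). `numpy.get_include()` is the real NumPy call that returns the header directory. The helper function name is hypothetical.

```python
import os


def cflags_with_include(include_dir, existing=""):
    """Return a CFLAGS value that adds `include_dir` to the header search path."""
    flag = "-I" + include_dir
    # Preserve flags that are already set instead of overwriting them.
    return (existing + " " + flag).strip()


if __name__ == "__main__":
    try:
        import numpy
        include_dir = numpy.get_include()
    except ImportError:
        # Fall back to the path quoted in the post above.
        include_dir = "/usr/local/lib/python2.7/dist-packages/numpy/core/include"
    print('export CFLAGS="{}"'.format(
        cflags_with_include(include_dir, os.environ.get("CFLAGS", ""))))
```

Running the printed export line in the shell before ./make.sh should let the compiler find numpy/arrayobject.h, provided the build reads CFLAGS.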

What is the format of gt file?

I read the code in split_label.py and found the label information for each image is in gt file in gt_path, but I do not know the format you are using in this file. Can you kindly share the format of that? Thx.

There are confusing parts.

Hello, eragonruan!
I have some questions.

I tried running the "./ctpn/train_net.py" file on Google Cloud Compute.
I changed max_iters = 60000 to max_iters = 500 and ran it (as a test).
The output folder was created and a new checkpoint was created.
I moved the new checkpoint folder to "text-detection-ctpn/checkpoint". Then I ran "demo.py" and it did not report any errors.
But there is nothing in the results folder.

(1) Is it because I trained for too few iterations? (max_iters = 500)
(2) After new training, a new checkpoint is created, but what is the "VGG_imagenet.npy" file?
You have provided two pre-trained files.
One, checkpoints, e.g. VGGnet_fast_rcnn_iter_50000.ckpt.data-00000-of-00001
Two, data/pretrain, e.g. VGG_imagenet.npy
checkpoints/train-informations vs data/pretrain/VGG_imagenet.npy
What is the difference between the two?
Also, why do I need VGG_imagenet.npy, and how can I make my own VGG_imagenet.npy?

I would be grateful if you could answer the question.
Thank you sooo much.

Running with CPU only

Hello everyone, it is the first time I could successfully run a demo. Many thanks to the author.

To use the CPU only, I followed the author's instructions and made the following modifications:
(1) Set "USE_GPU_NMS" in the file ./ctpn/text.yml to "False";
(2) Set "__C.USE_GPU_NMS" in the file ./lib/fast_rcnn/config.py to "False";
(3) Comment out the line "from lib.utils.gpu_nms import gpu_nms" in the file ./lib/fast_rcnn/nms_wrapper.py;
(4) Rebuild with a new setup.py:

The author provides this new setup.py for CPU only:

from Cython.Build import cythonize
import numpy as np
from distutils.core import setup

try:
    numpy_include = np.get_include()
except AttributeError:
    numpy_include = np.get_numpy_include()

setup(
    ext_modules=cythonize(["bbox.pyx", "cython_nms.pyx"], include_dirs=[numpy_include]),
)

(a) execute export CFLAGS=-I/home/zhao181/ProGram1/anaconda2/lib/python2.7/site-packages/numpy/core/include
you should use your own numpy path.

(b) cd xxx/text-detection-ctpn-master/lib/utils
and execute:python setup.py build

(5) cd xxx/text-detection-ctpn-master
and execute: python ./ctpn/demo.py

By the way, I am running under ubuntu 16.04 with
Anaconda2-4.2.0-Linux-x86_64.sh and tensorflow-1.3.0-cp27-cp27mu-manylinux1_x86_64.whl(cpu).

BiLSTM and Training Time

Thanks for sharing your implementation with us. I have implemented CTPN with Caffe which failed to converge when adding LSTM.
First, I want to ask whether you have added the BiLSTM in your code or not. I am new to tensorflow. After looking at the code, I think you implemented only the LSTM, not the BiLSTM. Is that right?
Second, how long did you train your model? I ran the training script on a GPU device, and it seems it would take 5-6 days to finish the first 180000 iterations.

Thanks very much.

Could you put your dataset and model in baidu yunpan?

R.T.
Thanks, very much!

Could you kindly tell me whether only horizontal lines can be detected?

Why does it not work for vertical lines?