pengcao / chinese_ocr

Chinese OCR recognition

Python 97.97% C++ 0.04% Shell 0.65% Cuda 1.34%

chinese_ocr's Introduction

This project uses tensorflow and keras/pytorch to implement text detection in natural scenes and end-to-end Chinese OCR text recognition.

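Features

Text detection: end-to-end text detection and recognition implemented in keras (the project contains two models, keras and pytorch).
Variable-length OCR recognition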

Environment setup on Ubuntu

Bash
## GPU environment
sh setup-python3-gpu.sh

## CPU python3 environment
sh setup-python3-cpu.sh

## Additional dependencies
apt install graphviz
pip3 install graphviz
pip3 install pydot
pip3 install torch torchvision

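Models

The project consists of three networks:
1. Text orientation detection network: Classify (vgg16)
2. Text region detection network: CTPN (CNN+RNN)
3. End-to-end text recognition network: CRNN (CNN+GRU/LSTM+CTC)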

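Text orientation detection: VGG classification

This is treated as image classification: starting from the VGG16 model, a classifier is trained to detect orientations of 0, 90, 180, and 270 degrees.
See angle/predict.py for the code. Training used 8,000 images and reached an accuracy of 88.23%.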
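
As a sketch of how a four-way orientation prediction could be used downstream, the snippet below undoes a rotation given the predicted angle. It assumes the predicted angle is the counterclockwise rotation that was applied to the upright image, which the README does not state. `rotate_ccw` and `correct_orientation` are hypothetical helper names, and a nested list stands in for the image.

```python
def rotate_ccw(grid, angle):
    """Rotate a 2D grid (list of rows) counterclockwise by 0, 90, 180 or 270 degrees."""
    if angle not in (0, 90, 180, 270):
        raise ValueError("angle must be one of 0, 90, 180, 270")
    result = [list(row) for row in grid]
    for _ in range(angle // 90):
        # Transpose, then reverse the row order: one 90-degree counterclockwise turn.
        result = [list(col) for col in zip(*result)][::-1]
    return result


def correct_orientation(grid, predicted_angle):
    """Undo a counterclockwise rotation of predicted_angle degrees (assumed convention)."""
    return rotate_ccw(grid, (360 - predicted_angle) % 360)


if __name__ == "__main__":
    upright = [[1, 2, 3],
               [4, 5, 6]]
    for angle in (0, 90, 180, 270):
        rotated = rotate_ccw(upright, angle)
        # Correcting with the true angle should recover the upright grid.
        assert correct_orientation(rotated, angle) == upright
```

If the classifier uses the opposite sign convention, swapping the two rotation directions is the only change needed.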

Model download: [BaiduCloud](link: https://pan.baidu.com/s/1Sqbnoeh1lCMmtp64XBaK9w extraction code: n2v4)

Text region detection: CTPN

Supports CPU and GPU environments with one-click deployment. Text detection training reference:

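End-to-end OCR recognition: CRNN

OCR recognition uses GRU+CTC end-to-end recognition, which reads variable-length text without segmenting it into characters first.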
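
To illustrate why CTC allows recognition without segmentation, here is a minimal sketch of greedy CTC decoding: take the best class at each time step, collapse consecutive repeats, and drop blanks. The blank index of 0, the index-to-character mapping, and the function name `ctc_greedy_decode` are assumptions for illustration; the repository's own decoder may differ.

```python
def ctc_greedy_decode(scores, alphabet, blank=0):
    """Greedy CTC decoding.

    scores: list of per-time-step score lists, one score per class.
    alphabet: string of characters; class index i (i > 0) maps to alphabet[i - 1].
    blank: class index reserved for the CTC blank (assumed to be 0 here).
    """
    # Best class at every time step.
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    decoded = []
    previous = None
    for index in best:
        # Emit a character only when it is not blank and differs from the previous step.
        if index != blank and index != previous:
            decoded.append(alphabet[index - 1])
        previous = index
    return "".join(decoded)


if __name__ == "__main__":
    alphabet = "中文"  # hypothetical two-character alphabet
    scores = [
        [0.90, 0.05, 0.05],  # blank
        [0.10, 0.80, 0.10],  # 中
        [0.10, 0.70, 0.20],  # 中 again (repeat, collapsed)
        [0.80, 0.10, 0.10],  # blank
        [0.10, 0.20, 0.70],  # 文
    ]
    assert ctc_greedy_decode(scores, alphabet) == "中文"
```

Because the number of emitted characters depends only on the collapsed sequence, the same network output length can yield text of different lengths.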

Training code is provided for both keras and pytorch. Once you understand the keras version, you can switch to the pytorch version, which is more stable.

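Usage

Try it out

Run demo.py or pytorch_demo.py (recommended) after setting the path to your test image. To see the CTPN result, edit the last part of the draw_boxes function in ./ctpn/ctpn/other.py so that it calls cv2.imwrite('dest_path', img). You then get both the text region boxes detected by CTPN and the OCR recognition result for the image.

Before trying it out, remember to change some settings in the scripts (for example, the model file paths).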

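Model training

1 Training CTPN

Go to ./ctpn/ctpn/train_net.py
Pretrained VGG network: [VGG_imagenet.npy](link: https://pan.baidu.com/s/1jzrcCr0tX6xAiVoolVRyew extraction code: a5ze ). Download the pretrained weights and point pretrained_model to that path. Pretrained weights for the whole model are also available: [checkpoint](link: https://pan.baidu.com/s/1oS6_kqHgmcunkooTAXE8GA extraction code: xmjv )
The CTPN dataset is also hosted on Baidu Cloud. After downloading and extracting it, set the self.devkit_path parameter of the pascal_voc class in ./ctpn/lib/datasets/pascal_voc.py to the dataset path.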

2 Training CRNN

keras version: ./train/keras_train/train_batch.py. model_path points to the pretrained weights; MODEL_PATH points to where the trained model is saved. [keras pretrained weights](link: https://pan.baidu.com/s/14cTCedz1ESnj0mM9ISm__w extraction code: 1kb9)
pytorch version: ./train/pytorch-train/crnn_main.py

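parser.add_argument(
    '--crnn',
    help="path to crnn (to continue training)",
    # Placeholder: set this to wherever you saved the downloaded pretrained weights
    default='<path to pretrained weights>')
parser.add_argument(
    '--experiment',
    help='Where to store samples and models',
    # Placeholder: directory where trained weights are saved; choose your own
    default='<directory for saved weights>')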

[pytorch pretrained weights](link: https://pan.baidu.com/s/1kAXKudJLqJbEKfGcJUMVtw extraction code: 9six)
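
Since `--crnn` and `--experiment` are ordinary argparse options, they can presumably be passed on the command line instead of editing the defaults. The sketch below reproduces only those two arguments, not the full crnn_main.py; `build_parser` is a hypothetical helper and the paths are made up for illustration.

```python
import argparse


def build_parser():
    """Build a parser with just the two options shown in the excerpt above."""
    parser = argparse.ArgumentParser(description="CRNN training options (sketch)")
    parser.add_argument('--crnn', default='',
                        help="path to crnn (to continue training)")
    parser.add_argument('--experiment', default=None,
                        help='Where to store samples and models')
    return parser


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own locations.
    args = build_parser().parse_args(
        ['--crnn', 'weights/pretrained.pth', '--experiment', 'output/run1'])
    print(args.crnn, args.experiment)
```

On a real run the equivalent would be passing the same two flags after the script name.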

Text detection and OCR recognition results


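Many formula structures are not recognized, mainly because the training data contained only Chinese characters and English letters.

If you run into problems while running the project, please get in touch.

Email: [email protected]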


chinese_ocr's Issues

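Good question

Why can the example images be recognized, while other images in test raise an error during recognition?
Aborted (core dumped)

To run this project, do I just run demo.py directly?

Hello! I'm a beginner and quite interested in text recognition. To run this project, do I just run demo.py directly?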

Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("lstm_o/Reshape_2:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/Reshape_1:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("Reshape_2:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("rpn_bbox_pred/Reshape_1:0", shape=(?, ?, ?, 40), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
2019-11-08 09:57:43.611347: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Tensor_name is : rpn_conv/3x3/biases
Tensor_name is : rpn_cls_score/weights
Tensor_name is : rpn_bbox_pred/biases
Tensor_name is : lstm_o/weights
Tensor_name is : lstm_o/bidirectional_rnn/fw/lstm_cell/bias
Tensor_name is : lstm_o/bidirectional_rnn/bw/lstm_cell/kernel
Tensor_name is : lstm_o/bidirectional_rnn/bw/lstm_cell/bias
Tensor_name is : conv5_3/weights
Tensor_name is : conv5_3/biases
Tensor_name is : lstm_o/biases
Tensor_name is : conv5_2/weights
Tensor_name is : conv2_2/weights
Tensor_name is : conv1_1/weights
Tensor_name is : conv4_2/weights
Tensor_name is : conv2_2/biases
Tensor_name is : conv2_1/biases
Tensor_name is : conv1_2/weights
Tensor_name is : conv4_1/biases
Tensor_name is : conv2_1/weights
Tensor_name is : rpn_cls_score/biases
Tensor_name is : conv1_2/biases
Tensor_name is : rpn_conv/3x3/weights
Tensor_name is : conv3_1/weights
Tensor_name is : conv4_3/weights
Tensor_name is : conv3_2/biases
Tensor_name is : rpn_bbox_pred/weights
Tensor_name is : conv3_2/weights
Tensor_name is : lstm_o/bidirectional_rnn/fw/lstm_cell/kernel
Tensor_name is : conv3_3/biases
Tensor_name is : conv5_2/biases
Tensor_name is : conv5_1/weights
Tensor_name is : conv3_3/weights
Tensor_name is : conv4_1/weights
Tensor_name is : conv1_1/biases
Tensor_name is : conv4_2/biases
Tensor_name is : conv3_1/biases
Tensor_name is : conv4_3/biases
Tensor_name is : conv5_1/biases
load vggnet done
Using TensorFlow backend.
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/zhaoyulu/web/chinese_ocr/model.py", line 16, in <module>
    from ocr.model import predict as ocr
  File "/home/zhaoyulu/web/chinese_ocr/ocr/model.py", line 8, in <module>
    import keras.backend as K
  File "/usr/local/lib/python3.6/dist-packages/keras/__init__.py", line 3, in <module>
    from . import utils
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/__init__.py", line 6, in <module>
    from . import conv_utils
  File "/usr/local/lib/python3.6/dist-packages/keras/utils/conv_utils.py", line 9, in <module>
    from .. import backend as K
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/__init__.py", line 1, in <module>
    from .load_backend import epsilon
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/load_backend.py", line 90, in <module>
    from .tensorflow_backend import *
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 54, in <module>
    get_graph = tf_keras_backend.get_graph
AttributeError: module 'tensorflow.python.keras.backend' has no attribute 'get_graph'

If convenient, please list the version numbers of as many packages as possible. Many thanks.
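
When reporting an import error like the one above, it helps to include the installed package versions. A minimal sketch for collecting them follows. It uses `importlib.metadata`, which needs Python 3.8 or later (newer than the Python 3.6 shown in the traceback). The package list is an assumption based on the libraries named in the README, and `collect_versions` is a hypothetical helper.

```python
from importlib import metadata


def collect_versions(package_names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package list; adjust it to match your environment.
    packages = ["tensorflow", "keras", "torch", "torchvision"]
    for name, version in collect_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```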

ImportError: cannot import name 'bbox'

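This error appears when running python demo.py. How can it be resolved?

Does running the project successfully require installing tensorflow first?

If tensorflow needs to be installed, which version is required?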

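Questions about the dataset and training of the CRNN pretrained model

Hello, I would like to ask two questions.
1. Is the training set you used open source? Roughly how large is it? Would you be able to share it?
2. How many epochs was the model trained for, and roughly how long did training take?
Looking forward to your reply, thank you.

How do I use the text orientation detection (VGG classification) model after downloading it?

Which dataset was used to train crnn?

@pengcao could you say which dataset was used to train crnn?

Is there another way to contact you? I sent you an email, and I still cannot get the environment set up!

The program does not run on win10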

D:\QQ\chinese_ocr-master\chinese_ocr-master> python pytorch_demo.py
Using TensorFlow backend.
Traceback (most recent call last):
  File "pytorch_demo.py", line 8, in <module>
    import pytorch_model as model
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\pytorch_model.py", line 12, in <module>
    from ctpn.text_detect import text_detect
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\text_detect.py", line 3, in <module>
    from .ctpn.detectors import TextDetector
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\ctpn\detectors.py", line 10, in <module>
    from ..lib.fast_rcnn.nms_wrapper import nms
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\lib\__init__.py", line 1, in <module>
    from . import fast_rcnn
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\lib\fast_rcnn\__init__.py", line 2, in <module>
    from . import nms_wrapper
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\lib\fast_rcnn\nms_wrapper.py", line 2, in <module>
    from ..utils.cython_nms import nms as cython_nms
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\lib\utils\__init__.py", line 1, in <module>
    from . import bbox
  File "D:\QQ\chinese_ocr-master\chinese_ocr-master\ctpn\lib\utils\bbox.py", line 9
    cimport numpy as np
                ^
SyntaxError: invalid syntax
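
`cimport` is a Cython statement, not Python, so the plain interpreter rejects it at parse time, which matches the SyntaxError above. That suggests bbox.py holds Cython source that would normally be compiled rather than imported directly; this is an inference from the traceback, not something the README confirms. The sketch below shows the parse-time difference using only the standard library; `is_valid_python` is a hypothetical helper.

```python
import ast


def is_valid_python(source):
    """Return True if source parses as plain Python, False on a SyntaxError."""
    try:
        ast.parse(source)  # parses only; nothing is imported or executed
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_python("import numpy as np"))    # ordinary Python import
    print(is_valid_python("cimport numpy as np"))   # Cython-only statement
```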

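About the one_hot function in train_batch.py

Hello, the one_hot function has a default length of 10. When a label is longer or shorter than 10, the extra positions are all set to 0, and index 0 corresponds to the character ‘. Does this affect the results? Isn't this labeling wrong?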
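
The concern can be reproduced with a small stand-in. This is not the repository's `one_hot` function: `encode_label`, `decode_label`, and the four-character alphabet are hypothetical, chosen only to show that padding with index 0 becomes indistinguishable from a real character when index 0 is part of the alphabet.

```python
def encode_label(text, alphabet, length=10, pad_index=0):
    """Map characters to indices, then truncate or pad to a fixed length."""
    indices = [alphabet.index(ch) for ch in text][:length]
    return indices + [pad_index] * (length - len(indices))


def decode_label(indices, alphabet):
    """Map indices back to characters; padding cannot be told apart from real index 0."""
    return "".join(alphabet[i] for i in indices)


if __name__ == "__main__":
    alphabet = "'abc"  # hypothetical alphabet whose index 0 is a real character
    encoded = encode_label("cab", alphabet, length=5)
    assert encoded == [3, 1, 2, 0, 0]
    # The two padding slots decode as the character at index 0.
    assert decode_label(encoded, alphabet) == "cab''"
```

Reserving an index that is not a real character for padding (or for the CTC blank) avoids this ambiguity; whether the repository does so is not stated in the source.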

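Prediction accuracy on handwritten samples is nearly 0

I tested the trained model on handwritten samples and the accuracy was nearly 0, so the model is of little use there.

Text orientation recognition

I trained a four-orientation classifier with VGG on 40,000 images, and the results were poor. What does the text orientation dataset look like?

When testing crnn: RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

Hello, running sh setup-python3-cpu.sh failed. Please take a look.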

ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-th7tv7ue/grpcio/setup.py'"'"'; __file__='"'"'/tmp/pip-install-th7tv7ue/grpcio/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-th7tv7ue/grpcio/pip-egg-info
cwd: /tmp/pip-install-th7tv7ue/grpcio/
Complete output (2 lines):
Found cython-generated files...
error in grpcio setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.