
end-to-end-for-chinese-plate-recognition's Introduction

end-to-end-for-plate-recognition

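End-to-end Chinese license plate recognition based on MXNet, formulated as multi-label classification. The code is adapted from xlvector's OCR code, with the number of parameters reduced because I have no GPU. Training ran single-threaded at about 9 samples/s on the CPU of a MacBook Pro, over 500,000 samples. Recognition accuracy reached 81%, although the model had not fully converged.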
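
To make the multi-label formulation concrete: a plate is treated as 7 character positions, and each position is its own classification over one shared character set. The sketch below encodes a plate string into 7 class indices and back. The character table and its ordering are an assumption (31 province abbreviations, then 10 digits, then 24 letters without I and O, 65 classes in total), chosen so that it reproduces the label list logged for one plate in the issues further down this page. The names CHARS, encode_plate and decode_labels are hypothetical and are not taken from the repository.

```python
# Assumed character table: 31 provinces + 10 digits + 24 letters = 65 classes.
# The ordering is an assumption, not copied from the repository.
PROVINCES = "京沪津渝冀晋蒙辽吉黑苏浙皖闽赣鲁豫鄂湘粤桂琼川贵云藏陕甘青宁新"
DIGITS = "0123456789"
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # I and O are not used on plates
CHARS = PROVINCES + DIGITS + LETTERS
INDEX = {ch: i for i, ch in enumerate(CHARS)}
PLATE_LEN = 7


def encode_plate(plate):
    """Turn a 7-character plate into 7 class indices, one per position."""
    if len(plate) != PLATE_LEN:
        raise ValueError("expected %d characters, got %d" % (PLATE_LEN, len(plate)))
    return [INDEX[ch] for ch in plate]


def decode_labels(labels):
    """Turn a list of class indices back into a plate string."""
    return "".join(CHARS[i] for i in labels)
```

With this table, the network needs 7 outputs of 65 classes each, and a prediction is decoded by taking the highest-scoring class at every position.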

Trained model

https://github.com/ibyte2011/end-to-end-for-chinese-plate-recognition

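About plate recognition

A model trained only on generated plates does not perform very well on real plates. By combining real samples with a GAN, a better model was trained that performs well on real plates. A complete plate recognition system was also built, named HyperLPR: https://github.com/zeusees/HyperLPR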

Dependencies:

  • NumPy
  • MXNet
  • OpenCV

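Generated plate samples

Plate samples are generated by rendering a plate, adding distortion and noise, and compositing the result onto a natural background.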
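
The pipeline described above (distort, composite onto a background, add noise) can be sketched with NumPy alone. This is a simplified illustration under stated assumptions, not the repository's generator: the row-shift shear stands in for a real perspective warp, and the function names and default parameters are all hypothetical.

```python
import numpy as np


def shear_horizontal(img, max_shift):
    """Shift each row sideways by an amount that grows with the row index.

    Pixels wrap around the edge; a real pipeline would use a proper warp.
    """
    height = img.shape[0]
    out = np.zeros_like(img)
    for y in range(height):
        shift = int(round(max_shift * y / max(height - 1, 1)))
        out[y] = np.roll(img[y], shift, axis=0)
    return out


def composite(plate, background, alpha):
    """Alpha-blend the plate over a background image of the same shape."""
    blended = alpha * plate.astype(np.float32) + (1.0 - alpha) * background.astype(np.float32)
    return np.clip(blended, 0, 255).astype(np.uint8)


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clamp back to the 8-bit range."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def make_sample(plate, background, rng, sigma=8.0, max_shift=4, alpha=0.85):
    """Distort a rendered plate, blend it with a background, then add noise."""
    distorted = shear_horizontal(plate, max_shift)
    mixed = composite(distorted, background, alpha)
    return add_gaussian_noise(mixed, sigma, rng)
```

Passing a seeded generator such as np.random.default_rng(0) makes the generated samples reproducible, which helps when comparing training runs.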


Recognition samples

Author

  • Jack Yu
  • Xiao Xiao

end-to-end-for-chinese-plate-recognition's People

Contributors

leihentulong, szad670401, zhangxinnan


end-to-end-for-chinese-plate-recognition's Issues

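About recognizing vertical text

Hello, I would like to ask whether this multi-label classification method is suitable for vertical text.
How should vertical text be handled? Assume the text to recognize has a fixed length of only 3 characters, drawn from no more than 50 distinct characters.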

LocalFileSystem: fail to open "cnn-ocr-symbol.json"

On Ubuntu 16.04, after downloading and extracting the zip, running python test.py fails with:
python test.py
(30, 120, 3)
(3, 30, 120)
[18:02:27] include/dmlc/logging.h:235: [18:02:27] src/io/local_filesys.cc:154: Check failed: allow_null LocalFileSystem: fail to open "cnn-ocr-symbol.json"
Traceback (most recent call last):
File "test.py", line 84, in
TestRecognizeOne(cv2.imread("./plate/01.jpg"))
File "test.py", line 59, in TestRecognizeOne
_, arg_params, __ = mx.model.load_checkpoint("cnn-ocr", 1)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/model.py", line 372, in load_checkpoint
symbol = sym.load('%s-symbol.json' % prefix)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/symbol.py", line 971, in load
check_call(_LIB.MXSymbolCreateFromFile(c_str(fname), ctypes.byref(handle)))
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/base.py", line 77, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [18:02:27] src/io/local_filesys.cc:154: Check failed: allow_null LocalFileSystem: fail to open "cnn-ocr-symbol.json"
For PIL I installed Pillow, and the Python version is 2.7. Which step went wrong?
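
The traceback shows mx.model.load_checkpoint("cnn-ocr", 1) failing to open cnn-ocr-symbol.json, a relative path that is resolved against the current working directory. A checkpoint with prefix cnn-ocr and epoch 1 consists of that symbol file plus cnn-ocr-0001.params, so a quick check before loading can tell whether the model files are present at all. The helper name below is hypothetical, and whether the zip ships these files is not something this sketch can confirm.

```python
import os


def missing_checkpoint_files(prefix, epoch, directory="."):
    """Return the checkpoint file names that are absent from `directory`."""
    expected = ["%s-symbol.json" % prefix, "%s-%04d.params" % (prefix, epoch)]
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]


if __name__ == "__main__":
    missing = missing_checkpoint_files("cnn-ocr", 1)
    if missing:
        print("Missing in %s: %s" % (os.getcwd(), ", ".join(missing)))
```

If both names are reported missing, the model has either not been trained or downloaded yet, or the script is being run from a different directory than the one holding the files.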

Hello, which OpenCV version is required?

cv2.error: OpenCV(3.4.1) /io/opencv/modules/imgproc/src/resize.cpp:4044: error: (-215) ssize.width > 0 && ssize.height > 0 in function resize
The error is shown above. I am not sure whether it is related to the OpenCV version.

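About the character color of the generated plates

Thanks for sharing. I would like to ask how to adjust the character color of the generated plates. I want to change the font color to black and tried modifying the parameters of the font-drawing function, but the resulting color differs from what I expected. Is some preprocessing applied?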

Reimplement in Keras

I'm trying to reimplement the model in Keras and have some questions about this model:
https://github.com/szad670401/end-to-end-for-chinese-plate-recognition/blob/master/train.py#L108

For each character position there is an fc2n output; in your case there are n = 7 of them, and num_hidden = 65 is the number of unique characters in the dictionary.
So as I understand it, the output is 7 x 65 (rows x columns), and each row contains a single 1.0 with all other values 0.0 (one-hot encoding).

I'm not sure how to deal with a matrix output, because in ordinary classification the output is a vector with softmax + categorical_crossentropy on top.

And what if we have digits, for example ['0','1','2','3','4','5','6','7','8','9'] (num_hidden = 10), and characters, for example ['A','B','C'] (num_hidden = 3)? How would the 3- and 10-element vectors be concatenated into a single matrix?

Can you elaborate on this?

Also, this project seems very similar:
https://github.com/apache/incubator-mxnet/blob/master/example/captcha/mxnet_captcha.R#L13
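
One way to read the 7 x 65 output described in this issue is as 7 independent softmax classifiers whose cross-entropy losses are added together, with the targets kept as 7 integer class indices instead of a one-hot matrix. The framework-free sketch below illustrates that reading; it is a generic illustration, not the repository's train.py and not Keras code, and all function names are hypothetical. For mixed alphabets, one option under the same scheme is a single shared alphabet (13 classes for the digits plus 'A', 'B', 'C') at every position.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def plate_loss(logits, labels):
    """Sum of per-position cross-entropy losses.

    logits: one row of class scores per character position.
    labels: one integer class index per character position.
    """
    if len(logits) != len(labels):
        raise ValueError("need exactly one label per position")
    loss = 0.0
    for row, target in zip(logits, labels):
        loss -= math.log(softmax(row)[target])
    return loss


def predict(logits):
    """Pick the highest-scoring class independently at every position."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]
```

In a framework this typically becomes several classification heads on a shared feature vector, each with its own softmax, and a total loss that is the sum of the per-head losses.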

cv2.resize raises an error. What is going on?

self.bg  = cv2.resize(cv2.imread("./images/template.bmp"),(226,70));

cv2.error: OpenCV(3.4.3) C:\projects\opencv-python\opencv\modules\imgproc\src\resize.cpp:4044: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'
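
The assertion in cv::resize fires when the source image is empty. cv2.imread returns None instead of raising when a file cannot be read, so a likely cause (an assumption, not a confirmed diagnosis) is that ./images/template.bmp does not exist relative to the directory the script is run from. A minimal guard, with a hypothetical helper name, could look like this:

```python
import os


def load_template(path, size=(226, 70)):
    """Read an image and resize it, failing early with a clear message."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "image not found: %s (current directory: %s)" % (path, os.getcwd()))
    import cv2  # imported here so the path check above runs even without OpenCV
    image = cv2.imread(path)
    if image is None:  # cv2.imread returns None instead of raising on failure
        raise ValueError("OpenCV could not decode: %s" % path)
    return cv2.resize(image, size)
```

Calling load_template("./images/template.bmp") from the repository root and from another directory shows quickly whether the working directory is the problem.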

The accuracy is not so high

platform

PC with Quadro 600(1GB)

result

2016-09-06 23:23:50,956 Epoch[0] Resetting Data Iterator
2016-09-06 23:23:50,956 Epoch[0] Time cost=13450.590
2016-09-06 23:24:08,369 Epoch[0] Validation-Accuracy=0.667429
2016-09-06 23:24:08,411 Saved checkpoint to "cnn-ocr-0001.params"
('\xe6\xb8\x9dQ8L3PC', [3, 55, 39, 51, 34, 54, 43])


Trained model

How long did training on the 500,000 images take? Will a trained model be released later?

How can this project be run on a GPU?

After changing dev in FeedForward to mx.gpu(), I get this error:
mxnet.base.MXNetError: [16:44:09] src/imperative/imperative.cc:78: Operator _zeros is not implemented for GPU.

Could anyone who has run this on a GPU help?

genplate.py L163

In my project, the check "len(text) == 9" causes an error. The correct check is "len(text) == 7".
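
A possible explanation for the 9 versus 7 discrepancy, offered as an assumption rather than a confirmed diagnosis: a plate with one Chinese character and six ASCII characters has 7 characters but occupies 9 bytes in UTF-8. Under Python 2 a plain str is a byte string, so len() would give 9, while a unicode string (or any Python 3 str) gives 7. The Python 3 snippet below shows both counts for the plate that appears in a training log elsewhere on this page.

```python
plate = "渝Q8L3PC"

# Number of characters: what len() reports for a text string.
char_count = len(plate)

# Number of UTF-8 bytes: 3 for the Chinese character, 1 for each ASCII one.
utf8_bytes = plate.encode("utf-8")
byte_count = len(utf8_bytes)

print(char_count, byte_count)
```

Which of the two checks is right therefore depends on whether the code handles plates as text or as encoded bytes.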

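Question about the FullyConnected parameters

Hello, I have a question:

def get_ocrnet():
    fc1 = mx.symbol.FullyConnected(data = flatten, num_hidden = 120)
    fc21 = mx.symbol.FullyConnected(data = fc1, num_hidden = 65)
Do the values 65 and 120 in the code have a specific meaning? Does 65 stand for the 31 provinces, 24 uppercase letters (excluding O and I), and 10 digits?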

How can I solve this error? Thanks

[16:28:34] ./dmlc-core/include/dmlc/logging.h:208: [16:28:34] src/io/local_filesys.cc:149: Check failed: allow_null LocalFileSystem: fail to open "cnn-ocr-symbol.json"
Traceback (most recent call last):
File "/opt/github/end-to-end-for-chinese-plate-recognition/test.py", line 84, in
TestRecognizeOne(cv2.imread("./plate/01.jpg"))
File "/opt/github/end-to-end-for-chinese-plate-recognition/test.py", line 59, in TestRecognizeOne
_, arg_params, __ = mx.model.load_checkpoint("cnn-ocr", 1)
File "/usr/local/lib/python2.7/site-packages/mxnet/model.py", line 437, in load_checkpoint
symbol = sym.load('%s-symbol.json' % prefix)

Training accuracy is always zero, why?

Thanks for your contribution.

I downloaded the code and it runs fine for 8-letter recognition. However, something went wrong when I modified the code to recognize 14 or more letters: the training accuracy is always zero, even after a dozen epochs of training with 10,000 pictures per epoch.

2017-08-07 18:16:11,402 - root - INFO - Epoch[11] Batch [100] Speed: 271.67 samples/sec accuracy=0.000000
2017-08-07 18:16:15,152 - root - INFO - Epoch[11] Batch [200] Speed: 266.71 samples/sec accuracy=0.000000
2017-08-07 18:16:18,915 - root - INFO - Epoch[11] Batch [300] Speed: 265.76 samples/sec accuracy=0.000000
2017-08-07 18:16:22,544 - root - INFO - Epoch[11] Batch [400] Speed: 275.58 samples/sec accuracy=0.000000
2017-08-07 18:16:26,258 - root - INFO - Epoch[11] Batch [500] Speed: 269.32 samples/sec accuracy=0.000000
2017-08-07 18:16:30,039 - root - INFO - Epoch[11] Batch [600] Speed: 264.47 samples/sec accuracy=0.000000
2017-08-07 18:16:33,699 - root - INFO - Epoch[11] Batch [700] Speed: 273.27 samples/sec accuracy=0.000000
2017-08-07 18:16:37,393 - root - INFO - Epoch[11] Batch [800] Speed: 270.71 samples/sec accuracy=0.000000
2017-08-07 18:16:41,054 - root - INFO - Epoch[11] Batch [900] Speed: 273.20 samples/sec accuracy=0.000000
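
One thing worth checking, stated as an assumption because the metric code is not shown here: if the reported accuracy counts a sample as correct only when every character matches, it drops sharply as the sequence gets longer. Under the simplifying assumption of independent errors with per-character accuracy p, whole-sequence accuracy is roughly p^n. With p = 0.5 that is about 0.0039 for n = 8 but only about 0.00006 for n = 14. The sketch below computes both metrics side by side; the function names are hypothetical.

```python
def sequence_accuracy(labels, preds):
    """Fraction of samples whose predicted characters all match.

    Assumes a non-empty list of equal-length label and prediction rows.
    """
    hits = sum(1 for truth, guess in zip(labels, preds) if list(truth) == list(guess))
    return hits / len(labels)


def char_accuracy(labels, preds):
    """Fraction of individual characters predicted correctly."""
    total = sum(len(truth) for truth in labels)
    hits = sum(a == b
               for truth, guess in zip(labels, preds)
               for a, b in zip(truth, guess))
    return hits / total


def expected_sequence_accuracy(p, n):
    """Whole-sequence accuracy if each of n characters is right with probability p."""
    return p ** n
```

This would not by itself explain an accuracy of exactly zero over many epochs, but logging the per-character accuracy separates a model that is learning slowly from one that is not learning at all.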

Accuracy is always 0

I ran train.py.

However, I got this output:

[14:21:51] src/operator/tensor/./matrix_op-inl.h:144: Using target_shape will be deprecated.
[14:21:51] src/operator/tensor/./matrix_op-inl.h:144: Using target_shape will be deprecated.
2017-06-28 14:21:59,105 Epoch[0] Batch [50]	Speed: 51.16 samples/sec	Accuracy=0.000000
2017-06-28 14:22:07,049 Epoch[0] Batch [100]	Speed: 50.35 samples/sec	Accuracy=0.000000
2017-06-28 14:22:15,034 Epoch[0] Batch [150]	Speed: 50.09 samples/sec	Accuracy=0.000000
2017-06-28 14:22:23,187 Epoch[0] Batch [200]	Speed: 49.07 samples/sec	Accuracy=0.000000
2017-06-28 14:22:31,466 Epoch[0] Batch [250]	Speed: 48.31 samples/sec	Accuracy=0.000000
2017-06-28 14:22:39,868 Epoch[0] Batch [300]	Speed: 47.61 samples/sec	Accuracy=0.000000
2017-06-28 14:22:48,231 Epoch[0] Batch [350]	Speed: 47.83 samples/sec	Accuracy=0.000000
2017-06-28 14:22:56,492 Epoch[0] Batch [400]	Speed: 48.42 samples/sec	Accuracy=0.000000
....

Does anyone know the reason? How to solve it? Thanks!

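Hello, I would like to ask about the specific recognition method used

As far as I know there are two approaches to plate recognition: segmenting the plate and recognizing individual characters, or sequence recognition with CTC. I could not work out which method this code uses. What does multi-label classification mean here?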

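How are armed-police plates with 7 and 8 characters aligned?

Hello, I have used this kind of end-to-end model before. For fixed-length sequences it can indeed reach state-of-the-art results, but how do you handle variable lengths, for example armed-police plates, which can have either 7 or 8 characters? I also tried CTC and found that it does not perform well either when latency requirements are strict. Could we discuss this? Thanks.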
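
One commonly used way to keep a fixed-length multi-label setup for plates of 7 or 8 characters is to pad every label list to the maximum length with an extra "blank" class. This is not confirmed as the author's approach; the blank index 65 (one past an assumed 65-character table) and the function names are hypothetical.

```python
BLANK = 65  # hypothetical extra class index meaning "no character here"


def pad_labels(labels, max_len=8, blank=BLANK):
    """Pad a label list with the blank class up to a fixed length."""
    if len(labels) > max_len:
        raise ValueError("plate longer than %d characters" % max_len)
    return list(labels) + [blank] * (max_len - len(labels))


def strip_blanks(labels, blank=BLANK):
    """Drop blank entries from a predicted label list before decoding."""
    return [index for index in labels if index != blank]
```

With this scheme the network has 8 outputs of 66 classes each, and a 7-character plate is expected to predict the blank class in the unused position.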

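Recognition rate for Chinese characters is too low

I trained on 50,000 plates. The loss for letters and digits converges to a very small value, but the Chinese characters converge slowly and their recognition accuracy is hard to improve. Do you have any suggestions?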

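Generating double-row plates

Hello, I am working on recognizing double-row plates, but my dataset is limited. Do you have code for generating double-row plates?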
