lstm_ctc_ocr's Issues

Cannot run the code of the master branch

I installed TensorFlow 1.0.1 and it fails like this:

yan@yan:~/Paper_code/lstm_ctc_ocr$ python ./standard/lstm_ocr.py
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
  File "./standard/lstm_ocr.py", line 203, in <module>
    train(train_dir='../train',val_dir='../val')
  File "./standard/lstm_ocr.py", line 112, in train
    g = Graph()
  File "./standard/lstm_ocr.py", line 51, in __init__
    stack = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.LSTMCell(FLAGS.num_hidden,state_is_tuple=True) for _ in range(FLAGS.num_layers)] , state_is_tuple=True)
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/__init__.py", line 35, in __getattr__
    contrib = importlib.import_module('tensorflow.contrib')
  File "/home/yan/anaconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/__init__.py", line 29, in <module>
    from tensorflow.contrib import factorization
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/factorization/__init__.py", line 24, in <module>
    from tensorflow.contrib.factorization.python.ops.gmm import *
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/factorization/python/ops/gmm.py", line 32, in <module>
    from tensorflow.contrib.learn.python.learn import graph_actions
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/__init__.py", line 70, in <module>
    from tensorflow.contrib.learn.python.learn import *
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/__init__.py", line 23, in <module>
    from tensorflow.contrib.learn.python.learn import *
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/__init__.py", line 25, in <module>
    from tensorflow.contrib.learn.python.learn import estimators
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/__init__.py", line 310, in <module>
    from tensorflow.contrib.learn.python.learn.estimators.dnn import DNNClassifier
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn.py", line 29, in <module>
    from tensorflow.contrib.learn.python.learn.estimators import dnn_linear_combined
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 33, in <module>
    from tensorflow.contrib.learn.python.learn.estimators import estimator
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 51, in <module>
    from tensorflow.contrib.learn.python.learn.learn_io import data_feeder
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/learn_io/__init__.py", line 21, in <module>
    from tensorflow.contrib.learn.python.learn.learn_io.dask_io import extract_dask_data
  File "/home/yan/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/learn_io/dask_io.py", line 26, in <module>
    import dask.dataframe as dd
  File "/home/yan/anaconda3/lib/python3.6/site-packages/dask/dataframe/__init__.py", line 3, in <module>
    from .core import (DataFrame, Series, Index, _Frame, map_partitions,
  File "/home/yan/anaconda3/lib/python3.6/site-packages/dask/dataframe/core.py", line 38, in <module>
    pd.computation.expressions.set_use_numexpr(False)
AttributeError: module 'pandas' has no attribute 'computation'

Any help would be appreciated.
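For reference, this particular failure is a pandas/dask version mismatch rather than anything in lstm_ctc_ocr: newer pandas releases removed the top-level pandas.computation module that older dask releases still call. A minimal check, assuming nothing about which versions are actually installed:

import pandas, dask
print(pandas.__version__, dask.__version__)
# recent pandas no longer provides pandas.computation; upgrading dask
# (pip install -U dask) or pinning an older pandas is the usual way out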

accuracy = 0 and the decoded result contains only one character

Thanks a lot for sharing this. I ran a quick test, but the training accuracy stays at 0 and the decoded output contains only one character. Where might the problem be, and what should I adjust?
seq 0: origin: [52, 23, 35, 62, 24, 33] decoded:[26]
seq 1: origin: [54, 49, 2, 40, 26, 38] decoded:[26]
seq 2: origin: [62, 48, 10, 42, 12] decoded:[26]
seq 3: origin: [53, 54, 7, 36, 45] decoded:[26]
seq 4: origin: [35, 43, 45, 7] decoded:[26]
seq 5: origin: [44, 56, 50, 2] decoded:[26]
seq 6: origin: [53, 35, 57, 7] decoded:[26]
seq 7: origin: [58, 31, 37, 8, 43] decoded:[26]
seq 8: origin: [10, 30, 53, 38, 20, 12] decoded:[26]
seq 9: origin: [45, 45, 39, 27, 61] decoded:[26]
4-23 22:24:42 Epoch 2/10000, accuracy = 0.000,train_cost = 22.159, lastbatch_err = 0.987, time = 676.276
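For context, a sketch (assumed shapes, not the repo's exact code) of how decoded sequences and this accuracy figure are typically produced in TensorFlow 1.x: beam-search decode the logits, then compare against the sparse labels with edit distance. A single repeated symbol at every position usually just means the network still emits nearly the same distribution at every time step, so it needs more training (or different hyperparameters) before accuracy moves off zero.

import tensorflow as tf

num_classes = 64
logits = tf.placeholder(tf.float32, [None, None, num_classes])  # [max_time, batch, num_classes], time-major
seq_len = tf.placeholder(tf.int32, [None])                      # valid time steps per example
labels = tf.sparse_placeholder(tf.int32)                        # ground-truth label sequences

decoded, _ = tf.nn.ctc_beam_search_decoder(logits, seq_len, merge_repeated=False)
label_error = tf.reduce_mean(tf.edit_distance(tf.cast(decoded[0], tf.int32), labels))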

Crash when running the code: segmentation fault (core dumped)

Hi ilovin, thanks for open-sourcing this! After cloning your code I ran the following command:
#From lstm_ctc_ocr/
python3 ./lstm/train_net.py --network=LSTM_train --cfg=./lstm/lstm.yml --restore=0
The program crashes immediately with: segmentation fault (core dumped). Any idea what might be causing this?

Problems setting up the environment; which exact versions does your setup use?

My current environment:

ubuntu16.04
cuda7.5
cudnn5
tensorflow1.0.1
gtx1060
16 GB RAM

About the environment: can this code run on TensorFlow 1.4.0 and CUDA 8.0 or later?

The error is as follows:

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
Called with args:
Namespace(cfg_file='./lstm/lstm.yml', gpu_id=0, max_iters=700000, network_name='LSTM_train', pre_train=None, randomize=False, restore=0, set_cfgs=None)
CUDA_VISIBLE_DEVICES: 0 CFG.GPU_ID: 0
Using config:
{'CHARSET': '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
 'EXP_DIR': 'lstm_ctc',
 'FONT': 'fonts/Ubuntu-M.ttf',
 'GPU_ID': 0,
 'IMG_SHAPE': [180, 60],
 'LOG_DIR': 'lstm_ctc',
 'MAX_CHAR_LEN': 6,
 'MAX_LEN': 6,
 'MIN_LEN': 4,
 'NCHANNELS': 1,
 'NCLASSES': 64,
 'NET_NAME': 'LSTM',
 'NUM_FEATURES': 60,
 'POOL_SCALE': 2,
 'RNG_SEED': 3,
 'ROOT_DIR': '/srv/python/lstm_ctc_ocr_with_tf_1.0.1',
 'SPACE_INDEX': 0,
 'SPACE_TOKEN': '',
 'TEST': {},
 'TIME_STEP': 90,
 'TRAIN': {'BATCH_SIZE': 32,
           'DISPLAY': 100,
           'GAMMA': 1.0,
           'LEARNING_RATE': 0.001,
           'LOG_IMAGE_ITERS': 100,
           'MOMENTUM': 0.9,
           'NUM_EPOCHS': 2000,
           'NUM_HID': 128,
           'NUM_LAYERS': 2,
           'SNAPSHOT_INFIX': '',
           'SNAPSHOT_ITERS': 2000,
           'SNAPSHOT_PREFIX': 'lstm',
           'SOLVER': 'RMS',
           'STEPSIZE': 2000,
           'WEIGHT_DECAY': 1e-05},
 'VAL': {'BATCH_SIZE': 128,
         'NUM_EPOCHS': 1000,
         'PRINT_NUM': 5,
         'VAL_STEP': 500}}
Output will be saved to `/srv/python/lstm_ctc_ocr_with_tf_1.0.1/output/lstm_ctc`
Logs will be saved to `/srv/python/lstm_ctc_ocr_with_tf_1.0.1/logs/lstm_ctc/lstm_train/2017-11-11-12-25-00`
/gpu:0
Tensor("data:0", shape=(?, ?, 60), dtype=float32)
Tensor("conv4/BiasAdd:0", shape=(?, ?, 30, 1), dtype=float32)
Tensor("time_step_len:0", shape=(?,), dtype=int32)
Use network `LSTM_train` in training
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 1060 6GB
major: 6 minor: 1 memoryClockRate (GHz) 1.759
pciBusID 0000:01:00.0
Total memory: 5.93GiB
Free memory: 24.38MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.34G (5729727232 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 4.80G (5156754432 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 4.32G (4641078784 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.89G (4176970752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.50G (3759273472 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.15G (3383345920 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 2.83G (3045011200 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 2.55G (2740509952 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 2.30G (2466458880 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 2.07G (2219812864 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.86G (1997831680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.67G (1798048512 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.51G (1618243584 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.36G (1456419328 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.22G (1310777344 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1.10G (1179699712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 1012.54M (1061729792 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 911.29M (955556864 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 820.16M (860001280 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 738.14M (774001152 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 664.33M (696601088 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 597.90M (626941184 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 538.11M (564247040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 484.30M (507822336 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 435.87M (457040128 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 392.28M (411336192 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 353.05M (370202624 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 317.75M (333182464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 285.97M (299864320 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 257.38M (269878016 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 231.64M (242890240 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 208.47M (218601216 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 187.63M (196741120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 168.86M (177067008 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 151.98M (159360512 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 136.78M (143424512 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 123.10M (129082112 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 110.79M (116174080 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 99.71M (104556800 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 89.74M (94101248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 80.77M (84691200 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 72.69M (76222208 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 65.42M (68600064 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 58.88M (61740288 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 52.99M (55566336 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 47.69M (50009856 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 42.92M (45008896 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 38.63M (40508160 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 34.77M (36457472 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 31.29M (32811776 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 28.16M (29530624 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 25.35M (26577664 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 22.81M (23920128 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 20.53M (21528320 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 18.48M (19375616 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 16.63M (17438208 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
done
Solving...
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.32G (5715601920 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.32G (5715601920 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.32G (5715601920 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 5.32G (5715601920 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (256): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (512): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1024): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (2048): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (4096): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (8192): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (16384): 	Total Chunks: 1, Chunks in use: 0 26.2KiB allocated for chunks. 4B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (32768): 	Total Chunks: 1, Chunks in use: 0 36.5KiB allocated for chunks. 512.0KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (65536): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (131072): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (262144): 	Total Chunks: 1, Chunks in use: 0 324.0KiB allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (524288): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1048576): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (2097152): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (4194304): 	Total Chunks: 1, Chunks in use: 0 5.84MiB allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (8388608): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (16777216): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (33554432): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (67108864): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (134217728): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (268435456): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:660] Bin for 37.50MiB was 32.00MiB, Chunk State: 
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400000 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400900 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400a00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400b00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208400c00 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208401000 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208401100 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208401500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208401600 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208401e00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208401f00 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208402700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208402800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208402900 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208402a00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208402b00 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208403000 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208403500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208403600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208403700 of size 73728
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208415700 of size 73728
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208427700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208427800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208427900 of size 2304
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208428200 of size 2304
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208428b00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208428c00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208428d00 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208440500 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208457d00 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208458100 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208458500 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102084d8500 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208558500 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208558d00 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208559500 of size 32768
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208561500 of size 32768
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208569500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208569600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208569700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208569800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208569900 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208569a00 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020856a200 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020856aa00 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020856b200 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020856ba00 of size 32768
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208573a00 of size 32768
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857ba00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857bb00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857bc00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857bd00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857be00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857bf00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857c000 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857c500 of size 2304
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020857ce00 of size 32768
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858e000 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858e100 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858e200 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858e300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858e400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858e500 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858ea00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020858eb00 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102085a6300 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102085a6700 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102085bdf00 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102085be300 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102085d5b00 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020863e300 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020863eb00 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086beb00 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086bf300 of size 2048
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086bfb00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086bfc00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086bfd00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086bfe00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086bff00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c0000 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c0100 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c0200 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c0300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c0400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c0500 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c7300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c7400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c7500 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c7a00 of size 1280
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c7f00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c8000 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086c8100 of size 73728
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086da100 of size 73728
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ec100 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ec200 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ec300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ec400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ec500 of size 2304
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ece00 of size 2304
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ed700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ed800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102086ed900 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208705100 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020871c900 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020871cd00 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020871d100 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208734900 of size 96256
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020874c100 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020874c500 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020874c900 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102087cc900 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020884c900 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102088cc900 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1020894c900 of size 695552
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102089f6600 of size 524288
I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10208a76600 of size 1228800
I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x10208584e00 of size 37376
I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x102085ed300 of size 331776
I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x102086c0a00 of size 26880
I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x10208ba2600 of size 6120192
I tensorflow/core/common_runtime/bfc_allocator.cc:693]      Summary of in-use Chunks by size: 
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 57 Chunks of size 256 totalling 14.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 10 Chunks of size 1024 totalling 10.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 8 Chunks of size 1280 totalling 10.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 11 Chunks of size 2048 totalling 22.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 5 Chunks of size 2304 totalling 11.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 5 Chunks of size 32768 totalling 160.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 73728 totalling 288.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 10 Chunks of size 96256 totalling 940.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 8 Chunks of size 524288 totalling 4.00MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 695552 totalling 679.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1228800 totalling 1.17MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:700] Sum Total of in-use chunks: 7.26MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:702] Stats: 
Limit:                  5729727283
InUse:                     7609088
MaxInUse:                  7609600
NumAllocs:                     146
MaxAllocSize:              1228800

W tensorflow/core/common_runtime/bfc_allocator.cc:274] ***************_*****************************************___________________________________________
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 37.50MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:993] Resource exhausted: OOM when allocating tensor with shape[32,160,60,32]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,160,60,32]
	 [[Node: conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](ExpandDims, conv1/weights/read)]]
	 [[Node: logits/bidirectional_rnn/bw/bw/All/_17 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_615_logits/bidirectional_rnn/bw/bw/All", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./lstm/train_net.py", line 89, in <module>
    restore=bool(int(args.restore)))
  File "./lstm/../lib/lstm/train.py", line 190, in train_net
    sw.train_model(sess, max_iters, restore=restore)
  File "./lstm/../lib/lstm/train.py", line 148, in train_model
    ctc_loss,summary_str, _ =  sess.run(fetches=fetch_list, feed_dict=feed_dict)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,160,60,32]
	 [[Node: conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](ExpandDims, conv1/weights/read)]]
	 [[Node: logits/bidirectional_rnn/bw/bw/All/_17 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_615_logits/bidirectional_rnn/bw/bw/All", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op 'conv1/Conv2D', defined at:
  File "./lstm/train_net.py", line 81, in <module>
    network = get_network(args.network_name)
  File "./lstm/../lib/networks/factory.py", line 17, in get_network
    return LSTM_train()
  File "./lstm/../lib/networks/LSTM_train.py", line 20, in __init__
    self.setup()
  File "./lstm/../lib/networks/LSTM_train.py", line 24, in setup
    .conv_single(3, 3, 32 ,1, 1, name='conv1',c_i=cfg.NCHANNELS)
  File "./lstm/../lib/networks/network.py", line 31, in layer_decorated
    layer_output = op(self, layer_input, *args, **kwargs)
  File "./lstm/../lib/networks/network.py", line 173, in conv_single
    conv = convolve(input, kernel)
  File "./lstm/../lib/networks/network.py", line 165, in <lambda>
    convolve = lambda i, k: tf.nn.conv2d(i, k, [1,s_h, s_w, 1], padding=padding)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 396, in conv2d
    data_format=data_format, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[32,160,60,32]
	 [[Node: conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](ExpandDims, conv1/weights/read)]]
	 [[Node: logits/bidirectional_rnn/bw/bw/All/_17 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_615_logits/bidirectional_rnn/bw/bw/All", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
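Two things stand out in the log above: the device reports only 24.38MiB free out of 5.93GiB at startup, and the allocation that finally fails is just 37.50MiB. That pattern usually means another process already holds most of the GPU memory, not that this model is too large for a GTX 1060. A minimal sketch (not part of this repo) for making TensorFlow 1.x allocate GPU memory on demand instead of reserving the whole card:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True                       # grab GPU memory as needed
# config.gpu_options.per_process_gpu_memory_fraction = 0.5   # or cap the fraction instead
sess = tf.Session(config=config)

Freeing the card first (check nvidia-smi for stale processes) or lowering TRAIN.BATCH_SIZE in the config are the other obvious knobs.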

Can you please share your trained model?

Hi there, I have followed your steps to try to reproduce your result; however, I failed for some unknown reason. I think it has something to do with my parameter settings. Can you share your trained model? Thanks a lot!

warp-ctc speed

Thanks for your answer. One more question: warp-ctc is said to be faster than the built-in CTC, but when I tried it I did not notice any speedup. Have you looked into this? Thanks!

warp-ctc cannot be bound to TensorFlow 1.1

warp-ctc compiles and its setup completes without errors, but importing warpctc_tensorflow fails with:
File "<stdin>", line 1, in <module> File "/Users/edgar/.pyenv/versions/anaconda2-4.0.0/lib/python2.7/site-packages/warpctc_tensorflow-0.1-py2.7-macosx-10.7-x86_64.egg/warpctc_tensorflow/__init__.py", line 7, in <module> _warpctc = tf.load_op_library(lib_file) File "/Users/edgar/.pyenv/versions/anaconda2-4.0.0/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library None, None, error_msg, error_code) tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/edgar/.pyenv/versions/anaconda2-4.0.0/lib/python2.7/site-packages/warpctc_tensorflow-0.1-py2.7-macosx-10.7-x86_64.egg/warpctc_tensorflow/kernels.so, 6): Library not loaded: @rpath/libwarpctc.dylib Referenced from: /Users/edgar/.pyenv/versions/anaconda2-4.0.0/lib/python2.7/site-packages/warpctc_tensorflow-0.1-py2.7-macosx-10.7-x86_64.egg/warpctc_tensorflow/kernels.so Reason: image not found
Has anyone else run into the same problem?
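The dlopen message says kernels.so was found but its dependency libwarpctc.dylib was not ("image not found"), which on macOS usually means the warp-ctc build directory is not on the dynamic-loader path. A hedged check, with a placeholder path for illustration:

# before starting Python, point the loader at the warp-ctc build output, e.g.
#   export DYLD_LIBRARY_PATH=/path/to/warp-ctc/build:$DYLD_LIBRARY_PATH
import os
print(os.environ.get("DYLD_LIBRARY_PATH"))   # should include the directory containing libwarpctc.dylib
import warpctc_tensorflow                    # still fails with "image not found" if it does not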

question about a parameter

Hello, I am confused about the parameter "__C.POOL_SCALE = 2". What does it mean? When I change it to 1, an error occurs.
In addition, my result is:
seq 0: origin: [27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0] decoded:[27, 1, 5, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1]
seq 1: origin: [23 5 19 9 13 2 21 18 5 14 7 7 5 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0] decoded:[9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1]
seq 2: origin: [10 1 12 9 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0] decoded:[9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1]
seq 3: origin: [27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0] decoded:[9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1]
What is the problem?
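Not an authoritative answer, but reading the training config printed in an earlier issue (IMG_SHAPE [180, 60], TIME_STEP 90): POOL_SCALE appears to be the factor by which the convolution/pooling layers shrink the width (time) axis, so the sequence length handed to CTC must shrink by the same factor. A rough illustration with those values (my interpretation, not the author's code):

IMG_WIDTH, POOL_SCALE = 180, 2
time_steps = IMG_WIDTH // POOL_SCALE   # 90, matching TIME_STEP in lstm.yml
# Setting POOL_SCALE = 1 without also removing the pooling layer makes the declared
# sequence length disagree with the actual feature-map width, hence the error.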

accuracy = 0.000 after Epoch 132/10000

5-7 19:8:33 Epoch 130/10000, accuracy = 0.000,train_cost = 21.917, lastbatch_err = 0.983, time = 110.495
('batch', 99, ': time', 0.23030900955200195)
('batch', 199, ': time', 0.3604588508605957)
seq 0: origin: [42, 29, 32, 12, 36, 20] decoded:[40]
seq 1: origin: [30, 45, 45, 5, 54] decoded:[22, 19]
seq 2: origin: [62, 40, 57, 7, 58, 54] decoded:[46]
seq 3: origin: [48, 14, 3, 19] decoded:[19]
seq 4: origin: [17, 32, 3, 59, 29, 55] decoded:[40]
seq 5: origin: [20, 14, 55, 15] decoded:[47]
seq 6: origin: [10, 42, 29, 37, 26, 34] decoded:[46]
seq 7: origin: [22, 48, 36, 41, 26, 55] decoded:[57]
seq 8: origin: [58, 6, 27, 11, 17, 1] decoded:[39]
seq 9: origin: [52, 48, 57, 29, 14] decoded:[46]
5-7 19:11:8 Epoch 131/10000, accuracy = 0.000,train_cost = 21.915, lastbatch_err = 0.983, time = 154.524
('batch', 99, ': time', 0.36012792587280273)
('batch', 199, ': time', 0.3593289852142334)
seq 0: origin: [42, 29, 32, 12, 36, 20] decoded:[40]
seq 1: origin: [30, 45, 45, 5, 54] decoded:[13, 19]
seq 2: origin: [62, 40, 57, 7, 58, 54] decoded:[46]
seq 3: origin: [48, 14, 3, 19] decoded:[19]
seq 4: origin: [17, 32, 3, 59, 29, 55] decoded:[40]
seq 5: origin: [20, 14, 55, 15] decoded:[33]
seq 6: origin: [10, 42, 29, 37, 26, 34] decoded:[46]
seq 7: origin: [22, 48, 36, 41, 26, 55] decoded:[40]
seq 8: origin: [58, 6, 27, 11, 17, 1] decoded:[40]
seq 9: origin: [52, 48, 57, 29, 14] decoded:[46]
5-7 19:13:53 Epoch 132/10000, accuracy = 0.000,train_cost = 21.904, lastbatch_err = 0.984, time = 164.831
('batch', 99, ': time', 0.4470250606536865)
('batch', 199, ': time', 0.36761021614074707)
seq 0: origin: [42, 29, 32, 12, 36, 20] decoded:[40]
seq 1: origin: [30, 45, 45, 5, 54] decoded:[13, 40]
seq 2: origin: [62, 40, 57, 7, 58, 54] decoded:[40]
seq 3: origin: [48, 14, 3, 19] decoded:[40, 19]
seq 4: origin: [17, 32, 3, 59, 29, 55] decoded:[40]
seq 5: origin: [20, 14, 55, 15] decoded:[40]
seq 6: origin: [10, 42, 29, 37, 26, 34] decoded:[46]
seq 7: origin: [22, 48, 36, 41, 26, 55] decoded:[40]

Slower convergence after switching to tf.nn.ctc_loss

Hello,
I made the following changes to the build_loss part of the beta-branch code:
1. Redefined labels as a sparse tensor
2. Converted the labels into a sparse matrix when they are fetched
3. Used the previously obtained time_step_batch as the seq_len input of tf.nn.ctc_loss
I then replaced warpctc in the source code with tf.nn.ctc_loss and got the network running, but it converges very slowly (judging from the README figure you reached 96% accuracy after about 20k iterations, while after 35k iterations I barely pass 90% and the loss has not dropped below 0.1; I did not change any parameters in the code, so our parameters should be identical, and my samples are likewise generated directly in memory).
My questions:
1. Is there anything I missed that still needs to be changed?
2. Is the slow convergence caused by using a different CTC function (tf.nn.ctc_loss vs. warpctc)?
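For reference, a minimal sketch of the substitution described in points 1-3 above, with assumed placeholder shapes rather than the repo's actual build_loss code. One commonly cited difference worth double-checking is the blank index: warp-ctc defaults to class 0, while tf.nn.ctc_loss reserves the last class (num_classes - 1) as blank, and a mismatch there can easily slow or stall convergence.

import tensorflow as tf

num_classes = 64
labels = tf.sparse_placeholder(tf.int32)                        # 1/2: labels as a sparse tensor
logits = tf.placeholder(tf.float32, [None, None, num_classes])  # [max_time, batch, num_classes], time-major by default
time_step_batch = tf.placeholder(tf.int32, [None])              # 3: per-example sequence lengths
loss = tf.reduce_mean(tf.nn.ctc_loss(labels=labels, inputs=logits,
                                     sequence_length=time_step_batch))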

image preprocess

Hello, I saw you mention having built an ID-card recognition model. May I ask how you preprocessed the data? Was that model trained directly on the ID-card number region? Did you have to handle glare/reflections? I am currently working on recognizing digits on meter dials, training on data I generate myself, but the test results on real dial digits are very poor.

Accuracy 0 with latest code

Hello ilovin,

I'm trying to run your exact code (latest, standard) to see if training works, but even after 300 epochs the accuracy is 0. Epoch 354/10000, accuracy = 0.000,avg_train_cost = 22.038, lastbatch_err = 0.980, time = 328.230,lr=0.00000000

I've seen issues #1 and #2 and checked that your latest code contains all those changes, but the accuracy is still 0.
python 2.7 and 3.5.2
tensorflow 1.2.1 without WrapCTC
train data set size 128000

Can you please help? Thanks.
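One hedged observation on the log line above: it prints lr=0.00000000, so it is worth checking whether the learning-rate schedule has decayed to (numerically) zero long before the model had a chance to learn. A toy illustration with made-up numbers, not the repo's actual settings:

import tensorflow as tf

global_step = tf.Variable(354 * 2000, trainable=False)   # hypothetical step count around epoch 354
lr = tf.train.exponential_decay(learning_rate=1e-2, global_step=global_step,
                                decay_steps=2000, decay_rate=0.1, staircase=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(lr))   # underflows to 0.0, which would be printed as lr=0.00000000

If that is what is happening, a gentler decay (larger decay_steps or a decay_rate closer to 1) keeps the rate alive long enough for the CTC loss to start dropping.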

ValueError: zero-size array to reduction operation maximum which has no identity

I tried the code of the master branch, and it fails like this:
=============================begin training=============================
Traceback (most recent call last):
File "./standard/lstm_ocr.py", line 203, in
train(train_dir='../train',val_dir='../val')
File "./standard/lstm_ocr.py", line 142, in train
val_inputs,val_seq_len,val_labels=val_feeder.input_index_generate_batch()
File "/home/yan/lstm_ctc_ocr-master/standard/utils.py", line 112, in input_index_generate_batch
batch_labels = sparse_tuple_from_label(label_batch)
File "/home/yan/lstm_ctc_ocr-master/standard/utils.py", line 143, in sparse_tuple_from_label
shape = np.asarray([len(sequences), np.asarray(indices).max(0)[1]+1], dtype=np.int64)
File "/home/yan/anaconda3/lib/python3.6/site-packages/numpy/core/_methods.py", line 26, in _amax
return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity
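A minimal reproduction of that ValueError, under the assumption that val_feeder found no samples, so the label batch handed to sparse_tuple_from_label is empty:

import numpy as np

sequences = []   # what an empty ../val directory would yield
indices = [(n, i) for n, seq in enumerate(sequences) for i in range(len(seq))]
np.asarray(indices).max(0)   # ValueError: zero-size array to reduction operation maximum ...

If that is the cause, generating the validation images first (or pointing val_dir at a non-empty directory) makes the error go away.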

ctc

undefined symbol: _ZNK10tensorflow14TensorShapeRep11DebugStringB5cxx11Ev

accuracy = 0.000 after Epoch 10000/10000

I just changed the training-set size from 128000 to 12800.

seq 0: origin: [29, 40, 38, 59, 60] decoded:[]
seq 1: origin: [57, 35, 15, 25, 3] decoded:[]
seq 2: origin: [4, 10, 11, 58, 26] decoded:[]
seq 3: origin: [11, 43, 2, 22] decoded:[]
seq 4: origin: [43, 51, 15, 17] decoded:[]
seq 5: origin: [60, 12, 1, 55] decoded:[]
seq 6: origin: [38, 32, 58, 33] decoded:[]
seq 7: origin: [14, 17, 23, 55, 38, 58] decoded:[]
seq 8: origin: [11, 24, 8, 8, 22] decoded:[]
seq 9: origin: [43, 41, 28, 28, 43] decoded:[]
10/22 11:56:11 Epoch 10000/10000, accuracy = 0.000,avg_train_cost = 22.022, lastbatch_err = 1.000, time = 22.802,lr=0.00000000

Where is the problem?

the standard version with:
tensorflow 1.0.1
python 2.7.6

will this work with dual line text?

Hello, I have just started studying the OCR side of deep learning, and this captcha recognizer already works very well. How well does this model handle multi-line text? For example, if an image contains three lines of text to recognize, can I apply the model directly and simply label the three lines in order, or does the model have to be modified first?

A small question

Hi!
What is OFF_TIME_STEP for, and why is this parameter needed?
Thanks!

master standard: accuracy still 0 after 100+ epochs

12/13 9:55:19 Epoch 110/10000, accuracy = 0.000,avg_train_cost = 21.989, lastbatch_err = 1.000, time = 319.316,lr=0.00000000
batch 1599 : time 0.15444302558898926
batch 1699 : time 0.15120339393615723
batch 1799 : time 0.15367650985717773
batch 1899 : time 0.1504526138305664
batch 1999 : time 0.14650583267211914
seq 0: origin: [25, 35, 32, 14, 30, 61] decoded:[]
seq 1: origin: [3, 2, 44, 14] decoded:[]
seq 2: origin: [50, 35, 14, 54, 18, 14] decoded:[]
seq 3: origin: [26, 50, 46, 38, 46] decoded:[]
seq 4: origin: [18, 19, 37, 15, 14] decoded:[]
seq 5: origin: [46, 9, 49, 34, 13] decoded:[]
seq 6: origin: [62, 46, 48, 49, 58, 24] decoded:[]
seq 7: origin: [2, 34, 41, 59, 14] decoded:[]
seq 8: origin: [40, 41, 39, 2, 47] decoded:[]
seq 9: origin: [39, 52, 13, 8, 52] decoded:[]
12/13 9:57:5 Epoch 110/10000, accuracy = 0.000,avg_train_cost = 21.983, lastbatch_err = 1.000, time = 425.382,lr=0.00000000
batch 99 : time 0.15007543563842773
batch 199 : time 0.15009498596191406
batch 299 : time 0.1510307788848877
batch 399 : time 0.15029311180114746
batch 499 : time 0.15467214584350586

OSError: cannot open resource

Any idea?

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/Users/alpullu/Downloads/lstm_ctc_ocr-dev/lib/utils/genImg.py", line 26, in generateImg
data=captcha.generate(theChars)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/captcha/image.py", line 45, in generate
im = self.generate_image(chars)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/captcha/image.py", line 226, in generate_image
im = self.create_captcha_image(chars, color, background)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/captcha/image.py", line 197, in create_captcha_image
images.append(_draw_character(c))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/captcha/image.py", line 164, in _draw_character
font = random.choice(self.truefonts)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/captcha/image.py", line 122, in truefonts
for n in self._fonts
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/captcha/image.py", line 123, in
for s in self._font_sizes
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/PIL/ImageFont.py", line 238, in truetype
return FreeTypeFont(font, size, index, encoding)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/PIL/ImageFont.py", line 127, in init
self.font = core.getfont(font, size, index, encoding)
OSError: cannot open resource
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1596, in
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 974, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/alpullu/Downloads/lstm_ctc_ocr-dev/lib/utils/genImg.py", line 41, in
run(64*2000,'./data/train')
File "/Users/alpullu/Downloads/lstm_ctc_ocr-dev/lib/utils/genImg.py", line 38, in run
pool.map(generateImg,range(num))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 608, in get
raise self._value
OSError: cannot open resource

Process finished with exit code 1
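
The traceback shows PIL failing to open one of the TrueType fonts that genImg.py hands to captcha's ImageCaptcha. A quick way to narrow it down is to try opening each font path with PIL directly; the list below is a placeholder, substitute whatever paths genImg.py actually passes:

    # Hypothetical sanity check for the font files handed to ImageCaptcha.
    from PIL import ImageFont

    font_paths = ['./fonts/example.ttf']  # placeholder paths, not the repo's
    for path in font_paths:
        try:
            ImageFont.truetype(path, 42)
            print(path, 'opened OK')
        except (IOError, OSError) as err:
            print(path, 'cannot be opened:', err)

Any path that fails here is the one ImageCaptcha trips over.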

ImportError: No module named utils.data_util

Hi @ilovin, when I run train.sh, I get an error:

python ./lstm/train_net.py --network=LSTM_train --cfg=./lstm/lstm.yml --restore=0
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "./lstm/train_net.py", line 11, in <module>
from lib.lstm.train import train_net
File "./lstm/../lib/lstm/__init__.py", line 9, in <module>
from . import train
File "./lstm/../lib/lstm/train.py", line 8, in <module>
from lib.lstm.utils.gen import get_batch
File "./lstm/../lib/lstm/utils/gen.py", line 13, in <module>
from lib.utils.data_util import GeneratorEnqueuer
ImportError: No module named utils.data_util
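
The import fails because the repository's lib package is not on the Python path when gen.py executes from lib.utils.data_util import GeneratorEnqueuer. One possible workaround, assuming training is launched from the repository root (a sketch, not necessarily how the repo itself arranges its paths):

    # Sketch: put the repo root on sys.path so `lib` resolves as a package.
    import os
    import sys

    repo_root = os.path.abspath('.')   # assumes the current directory is the repo root
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)

    from lib.utils.data_util import GeneratorEnqueuer

Running train.sh from the repository root rather than from inside ./lstm may achieve the same thing.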

undefined symbol: _ZTIN10tensorflow8OpKernelE

When I run train.sh or test.sh, I get this error:

from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "./lstm/train_net.py", line 13, in <module>
from lib.networks.factory import get_network
File "./lstm/../lib/networks/__init__.py", line 1, in <module>
from . import factory
File "./lstm/../lib/networks/factory.py", line 11, in <module>
from .LSTM_train import LSTM_train
File "./lstm/../lib/networks/LSTM_train.py", line 2, in <module>
from .network import Network
File "./lstm/../lib/networks/network.py", line 6, in <module>
import warpctc_tensorflow
File "/usr/local/lib/python2.7/dist-packages/warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg/warpctc_tensorflow/__init__.py", line 7, in <module>
_warpctc = tf.load_op_library(lib_file)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python2.7/dist-packages/warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg/warpctc_tensorflow/kernels.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

How can I solve this problem? Any advice, @ilovin? Thank you.

got AttributeError when training

Hi,
I ran the program with Python 3.5 using the command:

python ./lstm/train_net.py --network=LSTM_train --cfg=./lstm/lstm.yml --restore=0

and got the following error

  File "./lstm/train_net.py", line 89, in <module>
    restore=bool(int(args.restore)))
  File "./lstm/..\lib\lstm\train.py", line 177, in train_net
    sw.train_model(sess, max_iters, restore=restore)
  File "./lstm/..\lib\lstm\train.py", line 122, in train_model
    img_Batch,targets,time_step_Batch = next(train_gen)
  File "./lstm/..\lib\lstm\utils\gen.py", line 120, in get_batch
    enqueuer.start(max_queue_size=24, workers=num_workers)
  File "./lstm/..\lib\utils\data_util.py", line 81, in start
    thread.start()
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\popen_spawn_win32.py", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'GeneratorEnqueuer.start.<locals>.data_generator_task'

(py3.5) D:\git\lstm_ctc_ocr-beta>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\ugton\Anaconda3\envs\py3.5\lib\multiprocessing\spawn.py", line 116, in _main
    self = pickle.load(from_parent)
EOFError: Ran out of input

It seems like the data_generator_task function cannot be pickled because it is not defined at the top level of a module (https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled), but I don't know how to fix it. Has anyone encountered the same error?

BTW, it runs fine when use_multiprocessing is set to False.
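
Since spawn-based multiprocessing on Windows has to pickle the target function, the usual fix is to move the worker out of GeneratorEnqueuer.start and define it at module level. A minimal sketch with illustrative names (not the repo's actual API):

    # Illustrative: a module-level worker is picklable under the spawn start method.
    import multiprocessing as mp

    def data_generator_task(queue):
        # the real code would push batches from the generator here
        queue.put('batch placeholder')

    if __name__ == '__main__':
        q = mp.Queue()
        p = mp.Process(target=data_generator_task, args=(q,))
        p.start()
        print(q.get())
        p.join()

Keeping use_multiprocessing=False, as noted above, sidesteps the problem because no new processes need to be spawned.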

Loss goes below 0

I don't know why the loss keeps dropping all the way into negative values. I am using my own data, on the dev version.

iter: 100 / 1000000, total loss: 32.5983144, lr: 0.0001000 speed: 0.464s / iter
iter: 200 / 1000000, total loss: -7.0759972, lr: 0.0001000 speed: 0.419s / iter
iter: 300 / 1000000, total loss: -7.6936705, lr: 0.0001000 speed: 0.404s / iter
iter: 400 / 1000000, total loss: -7.8560844, lr: 0.0001000 speed: 0.397s / iter
iter: 500 / 1000000, total loss: -8.0681042, lr: 0.0001000 speed: 0.392s / iter
iter: 600 / 1000000, total loss: -8.1164246, lr: 0.0001000 speed: 0.390s / iter
iter: 700 / 1000000, total loss: -8.1190990, lr: 0.0001000 speed: 0.388s / iter
iter: 800 / 1000000, total loss: -8.1515580, lr: 0.0001000 speed: 0.386s / iter
iter: 900 / 1000000, total loss: -8.1126445, lr: 0.0001000 speed: 0.385s / iter
iter: 1000 / 1000000, total loss: -8.1664192, lr: 0.0001000 speed: 0.384s / iter
iter: 1100 / 1000000, total loss: -8.2163795, lr: 0.0001000 speed: 0.383s / iter
iter: 1200 / 1000000, total loss: -8.2741690, lr: 0.0001000 speed: 0.382s / iter
iter: 1300 / 1000000, total loss: -8.3386535, lr: 0.0001000 speed: 0.382s / iter
iter: 1400 / 1000000, total loss: -8.5193614, lr: 0.0001000 speed: 0.381s / iter
iter: 1500 / 1000000, total loss: -8.5624605, lr: 0.0001000 speed: 0.381s / iter
iter: 1600 / 1000000, total loss: -8.5590217, lr: 0.0001000 speed: 0.380s / iter
iter: 1700 / 1000000, total loss: -8.6395080, lr: 0.0001000 speed: 0.380s / iter
iter: 1800 / 1000000, total loss: -8.6862440, lr: 0.0001000 speed: 0.380s / iter
iter: 1900 / 1000000, total loss: -8.6726618, lr: 0.0001000 speed: 0.380s / iter

Question about recognizing variable-length text in images

I read the code on your beta branch and found that although you added code to automatically pad the image width, the captchas you generate all seem to have the same width and height. Have you tried recognition on text of genuinely variable length?

Master branch cannot converge

The latest master branch cannot converge. I have read all the other related issues and changed decay_steps to 100000, but it does not help. Could you give me some suggestions? I do not want to use warpCTC. Thanks.

Why are the printed "decoded" sequences all the same?

Hi,
thanks for your code, but when I run it on TF 1.2 the printed output looks like this:
[screenshot of training output]

After training for 8 hours the accuracy is 0, and several of the "decoded" sequences are identical.
Is that correct?

The number of classes should be 63, shouldn't it?

The valid characters are a-z, A-Z and 0-9, 62 in total, plus one blank (epsilon), which makes 63. I tried changing it to 63 and it made no difference to the results.

$ git diff
diff --git a/lib/lstm/config.py b/lib/lstm/config.py
index 3b1a322..e61cc3b 100644
--- a/lib/lstm/config.py
+++ b/lib/lstm/config.py
@@ -20,7 +20,7 @@ __C.IMG_HEIGHT = 32
__C.MAX_CHAR_LEN = 6
__C.BLANK_TOKEN=0
__C.CHARSET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
-__C.NCLASSES = len(__C.CHARSET)+2
+__C.NCLASSES = len(__C.CHARSET)+1
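
For TensorFlow's built-in tf.nn.ctc_loss the blank symbol is implicitly the last class (num_classes - 1), so 62 characters plus one blank does give 63 output units, matching the patch above. warp-ctc, on the other hand, treats label 0 as the blank by default, which is presumably where BLANK_TOKEN=0 and the extra class in the original +2 come from. A tiny sketch of the counting, reusing the charset from config.py:

    CHARSET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    NUM_CLASSES = len(CHARSET) + 1     # 62 characters + 1 blank = 63
    BLANK_INDEX = NUM_CLASSES - 1      # tf.nn.ctc_loss reserves the last index for blank
    print(NUM_CLASSES, BLANK_INDEX)    # 63 62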

Problem running after binding warp-ctc with TensorFlow: undefined symbol: _ZNK10tensorflow14TensorShapeRep11DebugStringB5cxx11Ev

Following the Baidu warp-ctc instructions I ran python setup.py install --prefix into my local path, and then ./train.sh produced this problem:
tensorflow.python.framework.errors_impl.NotFoundError: /home/zhangtao/.local/lib/python2.7/site-packages/warpctc_tensorflow-0.1-py2.7-linux-x86_64.egg/warpctc_tensorflow/kernels.so: undefined symbol: _ZNK10tensorflow14TensorShapeRep11DebugStringB5cxx11Ev
Has anyone run into this? Thanks.

CTC on Multiple GPUs

Have you tried Warp-CTC on GPUs?
Do you have any pointers on getting CTC to work on multiple GPUs?

IOError: unknown file format

The following traceback is printed repeatedly, once per generator worker:

Traceback (most recent call last):
File "/root/lstm_ctc_ocr/lib/lstm/utils/gen.py", line 77, in generator
im, label = generateImg()
File "/root/lstm_ctc_ocr/lib/lstm/utils/gen.py", line 38, in generateImg
data=captcha.generate_image(theChars)
File "/usr/local/lib/python2.7/dist-packages/captcha/image.py", line 226, in generate_image
im = self.create_captcha_image(chars, color, background)
File "/usr/local/lib/python2.7/dist-packages/captcha/image.py", line 197, in create_captcha_image
images.append(_draw_character(c))
File "/usr/local/lib/python2.7/dist-packages/captcha/image.py", line 164, in _draw_character
font = random.choice(self.truefonts)
File "/usr/local/lib/python2.7/dist-packages/captcha/image.py", line 123, in truefonts
for s in self._font_sizes
File "/usr/local/lib/python2.7/dist-packages/PIL/ImageFont.py", line 260, in truetype
return FreeTypeFont(font, size, index, encoding, layout_engine)
File "/usr/local/lib/python2.7/dist-packages/PIL/ImageFont.py", line 143, in __init__
self.font = core.getfont(font, size, index, encoding, layout_engine=layout_engine)
IOError: unknown file format

On the master branch code failing to converge

  • Problem description
    Several people have mentioned that the master branch is hard to get to converge, including #1, #28, #31 and others.
  • Cause
    Some cases were simply not trained long enough; others were data problems (too little data, using your own data, mistakes when generating the data, images not actually produced, and so on). Even when none of these apply, the model may still fail to converge; I suspect this comes down to the initialization, the data, the learning rate and the optimizer.
    Anyway, the master branch really is hard to converge and needs careful hyper-parameter tuning (sometimes I look at the decoded output to adjust the lr and the decay step, but master takes a long time to run, so people may give up after a single run with no result), because it has no CNN to extract features and feeds the image straight into LSTM+CTC.
  • Suggestions
  1. This is why the default branch of this repo is beta: a plain clone gives you the beta code (which adds a CNN). It produces results within 20 minutes, reaches higher accuracy, and most importantly I have never seen anyone fail to get it to converge. People probably keep using the master code because it does not require installing warpCTC, or simply because the branch is named master.
    ps: the only reason the master code is still around is my own convenience, as a backup; it seems I had better just delete it.
  2. For the master branch, you can also try changing the network (the initialization scheme and so on).
  3. If you really don't feel like installing warpCTC, you can modify the beta code to use the standard CTC loss; I covered the difference between the two in the blog linked from the README (see the sketch after this list).
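
A minimal sketch of what point 3 might look like with the TF 1.x API (logits, sparse_labels and seq_len are assumed names for the surrounding tensors, not the repo's exact variables):

    import tensorflow as tf

    # logits: [max_time, batch, num_classes] time-major activations, blank = num_classes - 1
    # sparse_labels: tf.SparseTensor of character indices
    # seq_len: [batch] vector of valid time steps per image
    def standard_ctc_loss(logits, sparse_labels, seq_len):
        loss = tf.nn.ctc_loss(labels=sparse_labels,
                              inputs=logits,
                              sequence_length=seq_len,
                              time_major=True)
        return tf.reduce_mean(loss)

The practical differences to keep in mind: warpctc_tensorflow.ctc takes flattened integer labels and treats label 0 as the blank by default, whereas tf.nn.ctc_loss takes a SparseTensor of labels and reserves the last class index for the blank.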

Two questions

Hello, I have two questions:

  1. Looking at the network code, the LSTM part starts with a bidirectional LSTM, its output is then fed into a two-layer unidirectional LSTM, and that output goes through a projection (tf.matmul(lstm_out, W) + b). What is the idea behind this design?
    2. How well does this network generalize?
    I trained it on randomly generated captchas and reached 97% accuracy, but on a different test set, for example a string of digits typed in Word and screenshotted, the recognition accuracy is 0. Where do you think the problem lies?
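
For reference, a rough sketch of the stack described in question 1 (TF 1.x contrib API, illustrative names; this is an interpretation of the question, not the repo's exact code):

    import tensorflow as tf

    def lstm_stack(inputs, seq_len, num_hidden, num_classes):
        # 1) bidirectional LSTM over the feature sequence
        fw = tf.contrib.rnn.LSTMCell(num_hidden)
        bw = tf.contrib.rnn.LSTMCell(num_hidden)
        (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
            fw, bw, inputs, sequence_length=seq_len, dtype=tf.float32)
        bi_out = tf.concat([out_fw, out_bw], axis=-1)

        # 2) two-layer unidirectional LSTM on top of the bidirectional output
        stack = tf.contrib.rnn.MultiRNNCell(
            [tf.contrib.rnn.LSTMCell(num_hidden) for _ in range(2)])
        lstm_out, _ = tf.nn.dynamic_rnn(
            stack, bi_out, sequence_length=seq_len, dtype=tf.float32)

        # 3) per-time-step linear projection to class scores (tf.matmul(lstm_out, W) + b)
        W = tf.get_variable('W', [num_hidden, num_classes])
        b = tf.get_variable('b', [num_classes])
        logits = tf.einsum('btd,dc->btc', lstm_out, W) + b
        return logits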
