bes-dev / crnn-pytorch

PyTorch implementation of an OCR system using CRNN + CTCLoss

License: BSD 2-Clause "Simplified" License

Python 100.00%

crnn-pytorch's Introduction

Convolutional Recurrent Neural Network

This software implements an OCR system using CNN + RNN + CTCLoss, inspired by the CRNN network.

Usage

python ./train.py --help

Demo

Train a simple OCR model using the TestDataset data generator. Training takes ~60-100 epochs.

python train.py --test-init True --test-epoch 10 --output-dir <path_to_folder_with_snapshots>

Run the test for the trained model in visualization mode.

python test.py --snapshot <path_to_folder_with_snapshots>/crnn_resnet18_10_best --visualize True

Train on custom dataset

Create dataset

Structure of dataset:

<root_dataset_dir>
---- data
-------- <img_filename_0>
...
-------- <img_filename_1>
---- desc.json

Structure of desc.json:

{
    "abc": <symbols_in_alphabet>,
    "train": [
        {
            "text": <text_on_image>,
            "name": <img_filename>
        },
        ...
        {
            "text": <text_on_image>,
            "name": <img_filename>
        }
    ],
    "test": [
        {
            "text": <text_on_image>,
            "name": <img_filename>
        },
        ...
        {
            "text": <text_on_image>,
            "name": <img_filename>
        }
    ]
}
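As a rough illustration of the layout above, the following sketch writes a desc.json file with Python's standard json module. The helper name write_desc and its arguments are hypothetical, and it assumes that "abc" is a plain string of allowed characters and that image names are relative to the data folder; check the repository's dataset loader before relying on either.

```python
import json
import os


def write_desc(root_dir, abc, train, test):
    """Write <root_dir>/desc.json (hypothetical helper, not part of the repo).

    `train` and `test` are lists of (image_filename, text) pairs.
    Assumption: the image files themselves live in <root_dir>/data.
    """
    def to_records(pairs):
        return [{"text": text, "name": name} for name, text in pairs]

    desc = {"abc": abc, "train": to_records(train), "test": to_records(test)}
    # Create the data folder too, so the layout matches the structure above.
    os.makedirs(os.path.join(root_dir, "data"), exist_ok=True)
    path = os.path.join(root_dir, "desc.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(desc, f, ensure_ascii=False, indent=4)
    return path
```

For example, write_desc("my_dataset", "0123456789", [("img_0.png", "42")], [("img_1.png", "7")]) would produce a file with one training record and one test record.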

Train a simple OCR model using a custom dataset.

python train.py --test-init True --test-epoch 10 --output-dir <path_to_folder_with_snapshots> --data-path <path_to_custom_dataset>

Run the test for the trained model in visualization mode.

python test.py --snapshot <path_to_folder_with_snapshots>/crnn_resnet18_10_best --visualize True --data-path <path_to_custom_dataset>

Dependencies

pytorch 0.3.0+
warp-ctc

Articles


crnn-pytorch's Issues

What format must my custom dataset have?

Hi, I have trouble understanding the format that my dataset must have. Could you give me an example?

Multiple lines images training

Hello,

Is it okay if I train the network on a dataset in which all images contain multiple lines of text at arbitrary positions? Here are some example images from my custom dataset.

loss: nan when training on a custom dataset

Hi @BelBES

I tried several batch sizes (8, 16, 32, 64, 128, 256), but training on my custom dataset always ends with loss: nan in every epoch.

python train.py --data-path datatrain --test-init True --test-epoch 10 --output-dir snapshot --abc 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz:/. --batch-size 8

Test phase
acc: 0.0000; avg_ed: 0.0000: 100%|██████████| 1/1 [00:00<00:00, 18.10it/s]
acc: 0.0	acc_best: 0; avg_ed: 18.428571428571427
epoch: 0; iter: 1998; lr: 1.0000000000000002e-06; loss_mean: nan; loss: nan: 100%|██████████| 2000/2000 [00:42<00:00, 46.69it/s]
epoch: 1; iter: 3998; lr: 1.0000000000000004e-10; loss_mean: nan; loss: nan: 100%|██████████| 2000/2000 [00:42<00:00, 46.74it/s]
epoch: 2; iter: 5998; lr: 1.0000000000000006e-14; loss_mean: nan; loss: nan: 100%|██████████| 2000/2000 [00:43<00:00, 45.84it/s]

I am using PyTorch 0.4, Python 3.6, GTX 1080 Ti and Ubuntu 16.04

Can you help me solve this problem?

Kind regards
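One thing worth ruling out with a NaN or infinite CTC loss is a label that the loss cannot align to the network output: a symbol missing from the alphabet, or a label longer than the number of output frames. The sketch below checks both. The function name and the seq_len parameter (the number of time steps the network emits per image) are hypothetical, and nothing here confirms that this was the cause in the report above.

```python
def find_ctc_label_problems(labels, abc, seq_len):
    """Return (index, reason) pairs for labels a CTC loss cannot align.

    Hypothetical helper: `seq_len` is assumed to be the number of output
    frames the network produces for one image.
    """
    allowed = set(abc)
    problems = []
    for i, text in enumerate(labels):
        unknown = sorted(set(text) - allowed)
        if unknown:
            problems.append((i, "symbols not in abc: " + "".join(unknown)))
        # CTC needs a blank frame between two equal neighbouring symbols,
        # so each adjacent repeat costs one extra output frame.
        repeats = sum(1 for a, b in zip(text, text[1:]) if a == b)
        if len(text) + repeats > seq_len:
            problems.append((i, "label too long for %d output frames" % seq_len))
    return problems
```

Running it over the "text" fields of desc.json before training gives a quick list of samples to inspect.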

Test for same input with same weight returns different output

Hi,
Testing the same input with the same weights returns different outputs.

I trained the model and used the trained snapshot to test the same input, but every time I get a different output.

Ask about seq projection

Hi everyone,
What does seq_proj mean? As far as I can see, it is used to transform the feature map from the CNN. Specifically, can someone explain this segment of code? Thank you:

def features_to_sequence(self, features):
    b, c, h, w = features.size()
    assert h == 1, "the height of out must be 1"
    if not self.fully_conv:
        features = features.permute(0, 3, 2, 1)
        features = self.proj(features)
        features = features.permute(1, 0, 2, 3)
    else:
        features = features.permute(3, 0, 2, 1)
    features = features.squeeze(2)
    return features
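To follow what the permute and squeeze calls do, it can help to trace only the tensor shapes. The sketch below does that with plain tuples and no PyTorch. It assumes that self.proj maps seq_proj[0] input channels to seq_proj[1] output channels and that fully_conv corresponds to seq_proj[0] == 0; both are assumptions about code not shown in the snippet.

```python
def permute(shape, order):
    """Reorder the axes of a shape tuple, like Tensor.permute."""
    return tuple(shape[i] for i in order)


def squeeze(shape, dim):
    """Drop an axis of size 1, like Tensor.squeeze(dim)."""
    assert shape[dim] == 1
    return shape[:dim] + shape[dim + 1:]


def sequence_shape(feature_shape, seq_proj=(0, 0)):
    """Shape-only trace of features_to_sequence (illustrative sketch)."""
    b, c, h, w = feature_shape
    assert h == 1, "the height of out must be 1"
    if seq_proj[0] != 0:
        assert w == seq_proj[0]  # assumed: proj treats the width as channels
        shape = permute(feature_shape, (0, 3, 2, 1))   # (b, w, h, c)
        shape = (shape[0], seq_proj[1]) + shape[2:]    # proj: w -> seq_proj[1]
        shape = permute(shape, (1, 0, 2, 3))           # (w', b, h, c)
    else:
        shape = permute(feature_shape, (3, 0, 2, 1))   # (w, b, h, c)
    return squeeze(shape, 2)                           # (seq_len, b, c)
```

Under these assumptions, both branches end with a (sequence length, batch, channels) layout, which is the default input layout of PyTorch recurrent layers. The projection only changes how many time steps the sequence has.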

crnn-pytorch-master python3 train.py --test-init True --test-epoch 10 --output-dir ./snap
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /Users/jing/.torch/models/resnet18-5c106cde.pth
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 46827520/46827520 [00:03<00:00, 12340925.60it/s]
Traceback (most recent call last):
  File "train.py", line 108, in <module>
    main()
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 55, in main
    net = load_model(data.get_abc(), seq_proj, backend, snapshot, cuda)
  File "/Users/jing/Desktop/crnn-pytorch-master/models/model_loader.py", line 16, in load_model
    net = CRNN(abc=abc, seq_proj=seq_proj, backend=backend)
  File "/Users/jing/Desktop/crnn-pytorch-master/models/crnn.py", line 43, in __init__
    dropout=rnn_dropout, bidirectional=True)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 485, in __init__
    super(GRU, self).__init__('GRU', *args, **kwargs)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 31, in __init__
    raise ValueError("dropout should be a number in range [0, 1] "
ValueError: dropout should be a number in range [0, 1] representing the probability of an element being zeroed
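The traceback shows the GRU constructor rejecting the rnn_dropout value it was given. One possible workaround is to convert the value into a valid probability before it reaches the RNN. The helper below is a hypothetical sketch and not part of the repository; it treats False or None as "no dropout".

```python
import numbers


def as_dropout_probability(value):
    """Convert a dropout setting into a float in [0, 1] (hypothetical helper)."""
    if value is False or value is None:
        return 0.0  # interpret "disabled" as zero dropout
    if isinstance(value, bool) or not isinstance(value, numbers.Real):
        # True carries no probability, so refuse to guess one.
        raise ValueError("dropout must be a number in [0, 1], got %r" % (value,))
    if not 0 <= value <= 1:
        raise ValueError("dropout must be in [0, 1], got %r" % (value,))
    return float(value)
```

The result could then be passed as the dropout argument, for example dropout=as_dropout_probability(rnn_dropout).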

pretrained weights

Will you share some of your pretrained models? Would be great for quick testing and fine-tuning.

Error when training with custom dataset: cv2.error error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Hi, I'm using the crnn-pytorch project to train a new model on my custom dataset, but when I run the following command, I get this error:

python train.py --data-path datatrain --test-init True --test-epoch 10 --output-dir snapshot --batch-size 8
Test phase
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 108, in <module>
    main()
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 67, in main
    acc, avg_ed = test(net, data, data.get_abc(), cuda, visualize=False)
  File "/home/trungle/PycharmProjects/crnn-pytorch/test.py", line 28, in test
    for sample in iterator:
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/tqdm/_tqdm.py", line 979, in __iter__
    for obj in iterable:
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
cv2.error: Traceback (most recent call last):
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/trungle/PycharmProjects/crnn-pytorch/dataset/text_data.py", line 37, in __getitem__
    sample = self.transform(sample)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 49, in __call__
    img = t(img)
  File "/home/trungle/PycharmProjects/crnn-pytorch/dataset/data_transform.py", line 17, in __call__
    sample["img"] = cv2.resize(sample["img"], self.size)
cv2.error: OpenCV(3.4.2) /tmp/build/80754af9/opencv-suite_1535558553474/work/modules/imgproc/src/resize.cpp:4044: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

I'm pretty sure that the data and desc.json are in the right format.
If anyone can help, I would really appreciate it. Thanks so much.
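This OpenCV assertion usually means cv2.resize received an empty image. cv2.imread returns None instead of raising an error when a file is missing or unreadable, so a wrong path in desc.json can go unnoticed until the resize step. The sketch below lists records whose image file is absent or empty. It assumes images are read from <root_dataset_dir>/data/<name>, as the README layout suggests; the function name is hypothetical.

```python
import json
import os


def find_missing_images(root_dir):
    """Return (split, name) pairs whose image file is missing or empty.

    Assumption: each record's "name" is relative to <root_dir>/data.
    """
    with open(os.path.join(root_dir, "desc.json"), encoding="utf-8") as f:
        desc = json.load(f)
    missing = []
    for split in ("train", "test"):
        for record in desc.get(split, []):
            path = os.path.join(root_dir, "data", record["name"])
            if not os.path.isfile(path) or os.path.getsize(path) == 0:
                missing.append((split, record["name"]))
    return missing
```

An empty result only shows that the files exist. A file that exists but cannot be decoded as an image would still need a separate check with the image library itself.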

Evaluation on CPU

Hi, I am getting poor performance when I test a trained model on CPU (although the checkpoint works well on GPU, according to the validation numbers). Details here:
https://discuss.pytorch.org/t/different-results-on-cpu-and-gpu/22289/3
I'd appreciate any feedback, thanks.

Error in python setup.py install

I followed the instructions for installing warp-ctc, but I get an error at the "python setup.py install" step.

LINK : warning LNK4044: unrecognized option '/Wl,-rpath,C:\Users\argha\warp-ctc\build\Debug'; ignored

LINK : fatal error LNK1181: cannot open input file 'warpctc.lib'

error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.28.29333\\bin\\HostX86\\x64\\link.exe' failed with exit status 1181

Any idea how to solve this?

AttributeError caused by test_data.py

I cloned the repo and installed the dependencies. Then I wanted to train the model for 60 to 100 epochs using the test loader, as described in the README file. But when I run train.py, I first get an error telling me that dropout should be a number between 0 and 1; in crnn.py it is set to False by default. I set it to 0.5 and that error disappeared, but after that I get the following error:

AttributeError: Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ubuntu/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/new/Kaushik/crnn-pytorch/dataset/test_data.py", line 64, in __getitem__
    sample = {"img": img, "seq": seq, "seq_len": len(seq), "aug": self.mode == "train"}
AttributeError: 'TestDataset' object has no attribute 'mode'

I am running it on Python 3.6 and PyTorch 0.4. Please help me out here.

About default training with test dataset generator

When I use the test dataset generator, I expect it to return some training data, but how can I use it for training? In train.py the generated data is used with the method set_mode, which has no definition in test_data.py.
Here is the error report:
(dip) crnn-pytorch-master python3 train.py --test-init True --test-epoch 10 --output-dir ./snap
testdata
Test phase
Traceback (most recent call last):
  File "train.py", line 143, in <module>
    main()
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 83, in main
    data.set_mode('test')
TypeError: set_mode() takes 1 positional argument but 2 were given
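The TypeError indicates that set_mode is being called with a mode argument the method does not accept, and the AttributeError reported earlier indicates that self.mode is read before it is ever assigned. The toy class below sketches the interface that train.py appears to expect. ToyTextDataset and its fields are hypothetical stand-ins, not the repository's TestDataset.

```python
class ToyTextDataset:
    """Hypothetical dataset with switchable train/test modes."""

    def __init__(self, train_size=8, test_size=2):
        self.sizes = {"train": train_size, "test": test_size}
        # Assign the attribute up front so __getitem__ can always read it.
        self.mode = "train"

    def set_mode(self, mode):
        """Accept the mode as an argument, as in data.set_mode('test')."""
        if mode not in self.sizes:
            raise ValueError("unknown mode: %r" % (mode,))
        self.mode = mode

    def __len__(self):
        return self.sizes[self.mode]

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        seq = [idx % 10]  # placeholder label, one symbol per sample
        return {"seq": seq, "seq_len": len(seq), "aug": self.mode == "train"}
```

With this shape, data.set_mode('test') followed by iteration works, and augmentation is switched off whenever the mode is not "train".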

How do I create symbols_in_alphabet?

Could you help me create symbols_in_alphabet?

What is the format of symbols_in_alphabet?

Could you give me an example?

Thank you very much.
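Judging from the --abc flag in a training command quoted earlier on this page (a string such as 0123456789ABC...), the alphabet appears to be a plain string containing every character that can occur in the labels. Under that assumption, it can be derived from the label texts themselves. The function name below is hypothetical.

```python
def build_abc(texts):
    """Collect every distinct character from the labels into one sorted string."""
    return "".join(sorted(set("".join(texts))))
```

For a dataset whose labels are "cab" and "bad", this yields "abcd", which could then be stored under the "abc" key of desc.json.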

IO is too slow, any suggestions?

Does this project support Chinese OCR?

Change input size

Hi,

How can I change the input size of an image?
Currently, other input sizes, e.g. 256x64, result in the assertion error "the height of out must be 1".

longer text

Hi,
Thanks for the repo. It's very well coded and easy to use with a custom dataset.
I first tried on a custom dataset where the average text length is 7 letters. This works quite well.
Now I'm using a more complicated dataset with an average text length of 18 characters, where a space can be one of the characters (so multiple words instead of single words). It is still training (I think the GRU takes time), but so far the results have not been that good.
With both models, the loss goes down quite smoothly, but the average edit distance jumps around quite a lot. For the first model, it improved when I used Adam (your default) instead of AdaDelta which I was playing with. For the second, Adam's not really doing the trick.
If you've worked a lot with this model and have some ideas, please let me know. Thanks.
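For readers unfamiliar with the average edit distance mentioned above, here is a sketch of how such a metric is commonly computed: the Levenshtein distance between each prediction and its target, averaged over all samples. The repository's own implementation may differ, and the function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def average_edit_distance(predictions, targets):
    """Mean edit distance over paired predictions and targets."""
    pairs = list(zip(predictions, targets))
    return sum(edit_distance(p, t) for p, t in pairs) / len(pairs)
```

Because the distance grows with label length, a few badly misread long lines can move the average much more than they would on a dataset of short words.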

Pretrained weights

Can you provide a pretrained model file?