crnn-pytorch's Introduction

Convolutional Recurrent Neural Network

This software implements an OCR system using CNN + RNN + CTC loss, inspired by the CRNN network.

Usage

python ./train.py --help

Demo

  1. Train a simple OCR model using the TestDataset data generator. Train for ~60-100 epochs.
python train.py --test-init True --test-epoch 10 --output-dir <path_to_folder_with_snapshots>
  2. Run the test for the trained model in visualization mode.
python test.py --snapshot <path_to_folder_with_snapshots>/crnn_resnet18_10_best --visualize True

Train on custom dataset

  1. Create dataset
  • Structure of dataset:
<root_dataset_dir>
---- data
-------- <img_filename_0>
...
-------- <img_filename_1>
---- desc.json
  • Structure of desc.json:
{
  "abc": <symbols_in_alphabet>,
  "train": [
    {
      "text": <text_on_image>,
      "name": <img_filename>
    },
    ...
    {
      "text": <text_on_image>,
      "name": <img_filename>
    }
  ],
  "test": [
    {
      "text": <text_on_image>,
      "name": <img_filename>
    },
    ...
    {
      "text": <text_on_image>,
      "name": <img_filename>
    }
  ]
}
  2. Train a simple OCR model using the custom dataset.
python train.py --test-init True --test-epoch 10 --output-dir <path_to_folder_with_snapshots> --data-path <path_to_custom_dataset>
  3. Run the test for the trained model in visualization mode.
python test.py --snapshot <path_to_folder_with_snapshots>/crnn_resnet18_10_best --visualize True --data-path <path_to_custom_dataset>
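
The desc.json layout described above can be produced with a short script. The following is a minimal sketch, not part of the repository: the helper name build_desc and the sample file names are hypothetical, and it assumes the labels are already available as (image filename, text) pairs.

```python
import json


def build_desc(abc, train_samples, test_samples):
    """Build the desc.json structure from (image filename, text) pairs."""
    def to_entries(samples):
        return [{"text": text, "name": name} for name, text in samples]

    return {
        "abc": abc,
        "train": to_entries(train_samples),
        "test": to_entries(test_samples),
    }


# Hypothetical sample data; replace with your own file names and labels.
desc = build_desc(
    abc="0123456789",
    train_samples=[("img_0.png", "123"), ("img_1.png", "4567")],
    test_samples=[("img_2.png", "89")],
)

# Serialize; write this string to <root_dataset_dir>/desc.json.
desc_json = json.dumps(desc, indent=2, ensure_ascii=False)
print(desc_json)
```

The image files themselves still have to be placed under <root_dataset_dir>/data with the same names.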

Dependence

Articles

crnn-pytorch's People

Contributors

bes-dev, danielwang5, lysukhin

crnn-pytorch's Issues

Multiple lines images training

Hello,

Is it okay to train the network on a dataset in which every image contains multiple lines of text at arbitrary positions? Here are some example images from my custom dataset:
[image: business_powerpoint_templates_liner_flow_creative_text_boxes_diagram_sales_ppt_slides_slide01_1]
[image: swot-pack-1-google-slides-template-powerpoint-download]
[image: 30]

loss : nan when train custom data set

Hi @BelBES

I tried several batch sizes (8, 16, 32, 64, 128, 256), but training on my custom dataset always ends with loss: nan in every epoch.

python train.py --data-path datatrain --test-init True --test-epoch 10 --output-dir snapshot --abc 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz:/. --batch-size 8

Test phase
acc: 0.0000; avg_ed: 0.0000: 100%|██████████| 1/1 [00:00<00:00, 18.10it/s]
acc: 0.0	acc_best: 0; avg_ed: 18.428571428571427
epoch: 0; iter: 1998; lr: 1.0000000000000002e-06; loss_mean: nan; loss: nan: 100%|██████████| 2000/2000 [00:42<00:00, 46.69it/s]
epoch: 1; iter: 3998; lr: 1.0000000000000004e-10; loss_mean: nan; loss: nan: 100%|██████████| 2000/2000 [00:42<00:00, 46.74it/s]
epoch: 2; iter: 5998; lr: 1.0000000000000006e-14; loss_mean: nan; loss: nan: 100%|██████████| 2000/2000 [00:43<00:00, 45.84it/s]

I am using PyTorch 0.4, Python 3.6, a GTX 1080 Ti, and Ubuntu 16.04.

Can you help me solve this problem?

Kind regards
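
One thing worth ruling out with a NaN loss is the label data itself. A CTC loss cannot align a label that is longer than the sequence the network emits, and labels that are empty or contain characters missing from the alphabet are also suspect. This is an assumption about possible causes, not a confirmed diagnosis of the report above. The sketch below uses a hypothetical helper, find_label_problems, and a hypothetical max_seq_len value; the real number of output time steps depends on the model and image width.

```python
def find_label_problems(desc, max_seq_len=None):
    """Return (name, reason) pairs for labels that may break CTC training."""
    alphabet = set(desc["abc"])
    problems = []
    for split in ("train", "test"):
        for entry in desc.get(split, []):
            text, name = entry["text"], entry["name"]
            if not text:
                problems.append((name, "empty label"))
                continue
            unknown = sorted(set(text) - alphabet)
            if unknown:
                problems.append((name, "characters not in abc: " + "".join(unknown)))
            # ASSUMPTION: max_seq_len is the number of time steps the network outputs.
            if max_seq_len is not None and len(text) > max_seq_len:
                problems.append((name, "label longer than output sequence"))
    return problems


# Hypothetical in-memory description with the same layout as desc.json.
example = {
    "abc": "0123456789",
    "train": [{"text": "12a", "name": "a.png"}, {"text": "", "name": "b.png"}],
    "test": [{"text": "345", "name": "c.png"}],
}
problems = find_label_problems(example, max_seq_len=2)
for name, reason in problems:
    print(name, "->", reason)
```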

Ask about seq projection

Hi everyone,
What does seq_proj mean? It appears to be used to transform the feature map coming from the CNN. Specifically, can someone explain this segment of code? Thank you:

def features_to_sequence(self, features):
    b, c, h, w = features.size()
    assert h == 1, "the height of out must be 1"
    if not self.fully_conv:
        features = features.permute(0, 3, 2, 1)
        features = self.proj(features)
        features = features.permute(1, 0, 2, 3)
    else:
        features = features.permute(3, 0, 2, 1)
    features = features.squeeze(2)
    return features
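
To follow what the snippet above does, it helps to track only the tensor shapes. The sketch below does that in plain Python without PyTorch. The snippet does not show self.proj, so the sketch assumes it is a layer acting on axis 1 (for example a 1x1 convolution) that changes the width w to a new sequence length proj_out. The helper names are hypothetical. In both branches the result has the layout (sequence length, batch, channels), which is what PyTorch recurrent layers expect by default.

```python
def permute(shape, order):
    """Shape after tensor.permute(*order)."""
    return tuple(shape[i] for i in order)


def squeeze(shape, dim):
    """Shape after tensor.squeeze(dim): the axis is dropped only if its size is 1."""
    if shape[dim] == 1:
        return shape[:dim] + shape[dim + 1:]
    return shape


def sequence_shape(shape, fully_conv, proj_out=None):
    """Shape-only walk through features_to_sequence."""
    b, c, h, w = shape
    assert h == 1, "the height of out must be 1"
    if not fully_conv:
        shape = permute(shape, (0, 3, 2, 1))      # (b, w, 1, c): width becomes axis 1
        # ASSUMPTION: self.proj maps axis 1 from w entries to proj_out entries.
        shape = (shape[0], proj_out) + shape[2:]  # (b, proj_out, 1, c)
        shape = permute(shape, (1, 0, 2, 3))      # (proj_out, b, 1, c)
    else:
        shape = permute(shape, (3, 0, 2, 1))      # (w, b, 1, c)
    return squeeze(shape, 2)                      # drop the height axis


print(sequence_shape((4, 512, 1, 10), fully_conv=True))
print(sequence_shape((4, 512, 1, 10), fully_conv=False, proj_out=20))
```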

dropout value error

crnn-pytorch-master python3 train.py --test-init True --test-epoch 10 --output-dir ./snap
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /Users/jing/.torch/models/resnet18-5c106cde.pth
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 46827520/46827520 [00:03<00:00, 12340925.60it/s]
Traceback (most recent call last):
  File "train.py", line 108, in <module>
    main()
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 55, in main
    net = load_model(data.get_abc(), seq_proj, backend, snapshot, cuda)
  File "/Users/jing/Desktop/crnn-pytorch-master/models/model_loader.py", line 16, in load_model
    net = CRNN(abc=abc, seq_proj=seq_proj, backend=backend)
  File "/Users/jing/Desktop/crnn-pytorch-master/models/crnn.py", line 43, in __init__
    dropout=rnn_dropout, bidirectional=True)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 485, in __init__
    super(GRU, self).__init__('GRU', *args, **kwargs)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 31, in __init__
    raise ValueError("dropout should be a number in range [0, 1] "
ValueError: dropout should be a number in range [0, 1] representing the probability of an element being zeroed
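
The message says the recurrent layer received a dropout value it does not accept. Another report on this page notes that the default in crnn.py is False, and PyTorch's check rejects booleans even though False behaves like 0 in arithmetic. A minimal sketch of a workaround is to normalize the setting to a float before it reaches the GRU. The helper name normalize_dropout is hypothetical, and treating False as "no dropout" is an assumption about the author's intent.

```python
import numbers


def normalize_dropout(value):
    """Convert a possibly boolean dropout setting into a float in [0, 1]."""
    if isinstance(value, bool):
        # ASSUMPTION: False means "no dropout"; True has no obvious probability.
        if value:
            raise ValueError("dropout=True is ambiguous; pass a probability instead")
        return 0.0
    if not isinstance(value, numbers.Real) or not 0 <= value <= 1:
        raise ValueError("dropout must be a number in [0, 1]")
    return float(value)


print(normalize_dropout(False))
print(normalize_dropout(0.5))
```

The returned value could then be passed as the dropout argument when the recurrent layer is constructed.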

pretrained weights

Will you share some of your pretrained models? Would be great for quick testing and fine-tuning.

Error when training with custom dataset: cv2.error error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Hi, I'm using the crnn-pytorch project to train a new model on my custom dataset, but when I run the command below, it fails with the following error:

python train.py --data-path datatrain --test-init True --test-epoch 10 --output-dir snapshot --batch-size 8
Test phase
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 108, in <module>
    main()
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 67, in main
    acc, avg_ed = test(net, data, data.get_abc(), cuda, visualize=False)
  File "/home/trungle/PycharmProjects/crnn-pytorch/test.py", line 28, in test
    for sample in iterator:
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/tqdm/_tqdm.py", line 979, in __iter__
    for obj in iterable:
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
cv2.error: Traceback (most recent call last):
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/trungle/PycharmProjects/crnn-pytorch/dataset/text_data.py", line 37, in __getitem__
    sample = self.transform(sample)
  File "/home/trungle/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 49, in __call__
    img = t(img)
  File "/home/trungle/PycharmProjects/crnn-pytorch/dataset/data_transform.py", line 17, in __call__
    sample["img"] = cv2.resize(sample["img"], self.size)
cv2.error: OpenCV(3.4.2) /tmp/build/80754af9/opencv-suite_1535558553474/work/modules/imgproc/src/resize.cpp:4044: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

I'm pretty sure that the data and desc.json are in the right format.
I would greatly appreciate any help. Thanks so much.
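
The !ssize.empty() assertion means cv2.resize was handed an empty source image. This commonly happens when cv2.imread cannot read a file and returns None instead of raising an error, so it is worth checking that every name in desc.json resolves to a real file. The sketch below is a minimal check that assumes the <root_dataset_dir>/data/<img_filename> layout from the README. The helper name find_missing_images is hypothetical, and the check covers only existence and non-zero size, not whether the file decodes as an image.

```python
import json
import os
import tempfile


def find_missing_images(root_dir):
    """Return paths listed in desc.json that are absent or zero bytes long."""
    with open(os.path.join(root_dir, "desc.json"), encoding="utf-8") as f:
        desc = json.load(f)
    missing = []
    for split in ("train", "test"):
        for entry in desc.get(split, []):
            path = os.path.join(root_dir, "data", entry["name"])
            if not os.path.isfile(path) or os.path.getsize(path) == 0:
                missing.append(path)
    return missing


# Self-contained demo on a temporary dataset in which one image is absent.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "data"))
    with open(os.path.join(root, "data", "ok.png"), "wb") as f:
        f.write(b"\x89PNG")  # placeholder bytes, not a real image
    demo_desc = {
        "abc": "ab",
        "train": [{"text": "a", "name": "ok.png"}],
        "test": [{"text": "b", "name": "absent.png"}],
    }
    with open(os.path.join(root, "desc.json"), "w", encoding="utf-8") as f:
        json.dump(demo_desc, f)
    missing = [os.path.basename(p) for p in find_missing_images(root)]

print(missing)
```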

Error in python setup.py install

I followed the instructions for installing warp-ctc, but I get an error at the "python setup.py install" step.

LINK : warning LNK4044: unrecognized option '/Wl,-rpath,C:\Users\argha\warp-ctc\build\Debug'; ignored

LINK : fatal error LNK1181: cannot open input file 'warpctc.lib'

error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.28.29333\\bin\\HostX86\\x64\\link.exe' failed with exit status 1181

Any idea how to solve this?

Attribute error due to test dataset.py

I cloned the repo and installed the dependencies. Then I tried to train the model for 60 to 100 epochs using the test loader, as described in the README. When I run train.py, I first get an error saying that dropout should be a number between 0 and 1; in crnn.py it is set to False by default. I set it to 0.5 and that error disappeared, but then I get the following error:

AttributeError: Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/ubuntu/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/new/Kaushik/crnn-pytorch/dataset/test_data.py", line 64, in __getitem__
    sample = {"img": img, "seq": seq, "seq_len": len(seq), "aug": self.mode == "train"}
AttributeError: 'TestDataset' object has no attribute 'mode'

I am running Python 3.6 and PyTorch 0.4. Please help me out here.

About default training with test dataset generator

When I use the test dataset generator, I expect it to return some training data, but how can I use it for training? In train.py the generated data is used with the method "set_mode", which has no definition in test_data.py.
Here is the error report:
(dip) crnn-pytorch-master python3 train.py --test-init True --test-epoch 10 --output-dir ./snap
testdata
Test phase
Traceback (most recent call last):
  File "train.py", line 143, in <module>
    main()
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jing/anaconda3/envs/dip/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 83, in main
    data.set_mode('test')
TypeError: set_mode() takes 1 positional argument but 2 were given
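
The TypeError shows train.py calling set_mode with a mode argument that the dataset's method does not accept, and the AttributeError in the previous report shows __getitem__ reading self.mode before anything has set it. A minimal sketch of the interface train.py appears to expect is shown below. The class is a hypothetical stand-in for illustration, not the repository's TestDataset, and the accepted mode names are an assumption based on the 'train' and 'test' strings visible in the tracebacks.

```python
class ModeAwareDataset:
    """Hypothetical stand-in showing the mode interface used by train.py."""

    def __init__(self, samples, mode="train"):
        self.samples = samples
        self.mode = mode  # attribute read later in __getitem__

    def set_mode(self, mode):
        # Accept the mode as an argument, matching data.set_mode('test').
        if mode not in ("train", "test"):
            raise ValueError("unknown mode: " + repr(mode))
        self.mode = mode

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img, seq = self.samples[idx]
        return {"img": img, "seq": seq, "seq_len": len(seq), "aug": self.mode == "train"}


data = ModeAwareDataset([("fake_image", [1, 2, 3])])
data.set_mode("test")
print(data[0]["aug"])
```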

How to create symbols_in_alphabet?

Could you explain how to create symbols_in_alphabet?

What is the format of symbols_in_alphabet?

Could you give me an example?

Thank you very much.
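
Judging by the --abc argument used in another report on this page (a plain string such as 0123456789ABC...), the abc field appears to be a single string containing every character that can occur in the labels. Under that assumption, one way to obtain it is to collect the distinct characters from all label texts. The helper name build_alphabet and the sample labels below are hypothetical.

```python
def build_alphabet(texts):
    """Return a string with every distinct character in texts, in sorted order."""
    return "".join(sorted(set("".join(texts))))


# Hypothetical labels; in practice, gather the "text" values of all train and test entries.
labels = ["cab", "A1", "b2"]
abc = build_alphabet(labels)
print(abc)
```

Building the alphabet from both the train and test labels avoids test samples that contain characters the model has no class for.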

Change input size

Hi,

How can I change the input size of an image?
Currently, other input sizes, e.g. 256x64, result in the assertion error "the height of out must be 1".
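
The assertion fires whenever the CNN output is more than one row high, so the usable input height is tied to how much the backbone downsamples vertically. The sketch below estimates the output height with the usual convolution size formula. The layer parameters are those of a standard torchvision ResNet-18, which is an assumption: the repository's backbone may be configured differently. Under that assumption an input height of 32 reduces to 1, while 64 reduces to 2, which would trigger the assertion.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMPTION: (kernel, stride, padding) of the downsampling layers in a
# standard torchvision ResNet-18; the repository's backbone may differ.
RESNET18_DOWNSAMPLING = [
    (7, 2, 3),  # conv1
    (3, 2, 1),  # maxpool
    (3, 2, 1),  # layer2, first convolution
    (3, 2, 1),  # layer3, first convolution
    (3, 2, 1),  # layer4, first convolution
]


def feature_height(input_height, layers=RESNET18_DOWNSAMPLING):
    """Height of the feature map after all downsampling layers."""
    h = input_height
    for kernel, stride, padding in layers:
        h = conv_out(h, kernel, stride, padding)
    return h


for height in (32, 64):
    print(height, "->", feature_height(height))
```

Supporting a taller input would therefore need either extra vertical downsampling or pooling over the height axis before the features are turned into a sequence.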

longer text

Hi,
Thanks for the repo. It's very well coded and easy to use with a custom dataset.
I first tried on a custom dataset where the average text length is 7 letters. This works quite well.
I am now using a more complicated dataset with an average text length of 18 characters, where a space can be one of the characters (so multiple words instead of single words). It is still training (I think the GRU takes time), but so far the results have not been that good.
With both models, the loss goes down quite smoothly, but the average edit distance jumps around quite a lot. For the first model, it improved when I used Adam (your default) instead of AdaDelta which I was playing with. For the second, Adam's not really doing the trick.
If you've worked a lot with this model and have some ideas, please let me know. Thanks.
