
handwritten-text-recognition-for-apache-mxnet's Introduction

Handwritten Text Recognition (OCR) with MXNet Gluon

Local Setup

git clone https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet --recursive

You need to install SCLITE for WER evaluation. You can follow the bash script below from this folder:

cd ..
git clone https://github.com/usnistgov/SCTK
cd SCTK
export CXXFLAGS="-std=c++11" && make config
make all
make check
make install
make doc
cd -

You also need hnswlib:

pip install pybind11 numpy setuptools
cd ..
git clone https://github.com/nmslib/hnswlib
cd hnswlib/python_bindings
python setup.py install
cd ../..
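A quick way to confirm the hnswlib bindings built correctly is a small smoke test (a generic hnswlib example, not code from this repo):

import numpy as np
import hnswlib

# Build a tiny cosine-similarity index and query it
index = hnswlib.Index(space='cosine', dim=8)
index.init_index(max_elements=100, ef_construction=100, M=16)
index.add_items(np.random.rand(10, 8))
labels, distances = index.knn_query(np.random.rand(1, 8), k=3)
print(labels, distances)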

if "AssertionError: Please enter credentials for the IAM dataset in credentials.json or as arguments" occurs rename credentials.json.example and to credentials.json with your username and password.

Overview

The pipeline is composed of 3 steps:

  • handwritten area detection
  • line detection
  • handwritten text recognition

The entire inference pipeline can be found in the 0_handwriting_ocr.ipynb notebook. See the pretrained models section below for the pretrained models.

A recorded talk detailing the approach is available on YouTube. [video]

The corresponding slides are available on SlideShare. [slides]

Pretrained models:

You can get the models by running python get_models.py.

Sample results

The greedy, lexicon search, and beam search outputs present similar and reasonable predictions for the selected examples. Figure 6 presents some interesting examples. The first line of Figure 6 shows cases where the lexicon search algorithm provided fixes that corrected the words. In the top example, “tovely” (as it was written) was corrected to “lovely” and “woved” was corrected to “waved”. In addition, the beam search output corrected “a” into “all”, but it missed a space between “lovely” and “things”. In the second example, “selt” was converted to “salt” by the lexicon search output; however, “selt” was erroneously converted to “self” by the beam search output, so in this example beam search performed worse. In the third example, none of the three methods provided comprehensible results. Finally, in the fourth example, the lexicon search algorithm incorrectly converted “forhim” into “forum”, whereas the beam search algorithm correctly identified “for him”.

Dataset:

  • To use test_iam_dataset.ipynb, create credentials.json from credentials.json.example and edit the appropriate fields. The username and password can be obtained from http://www.fki.inf.unibe.ch/DBs/iamDB/iLogin/index.php.

  • It is recommended to use an instance with 32GB+ RAM and 100GB of disk space; a GPU is also recommended. A p3.2xlarge is the recommended starter instance on AWS for this project.

Appendix

1) Handwritten area

Model architecture

Results

2) Line Detection

Model architecture

Results

3) Handwritten text recognition

Model architecture

Results

handwritten-text-recognition-for-apache-mxnet's People

Contributors

ehsanmok, jalvathi, jb-delafosse, jonomon, sethuramanio, simoncorstonoliver, thomasdelteil

handwritten-text-recognition-for-apache-mxnet's Issues

Is k-fold cross-validation happening?

Hi,

First, let me appreciate your great work in sharing this handwriting model as open source!

I understand that you are splitting the dataset into a fixed validation set for validation.

Would it be more efficient to implement k-fold cross-validation, so that we can increase the amount of data used for training?

One more question: is it okay to use the entire dataset for testing, or is it necessary to test the model only on data not seen during training?

Thanks,
Anand.

Invalid NDArray file format

I get this error while running the 0_handwriting_ocr notebook: "src/ndarray/ndarray.cc:1851: Check failed: fi->Read(data): Invalid NDArray file format" when loading the handwriting_line8.params file. Could you please help me with this issue? I have mxnet 1.6.0 and gluonnlp 0.9.1. Thank you in advance.

Kernel dying

Kernel dies while downloading/processing the IAM dataset

Kernel Restarting
The kernel for projects/DocByte/handwriting_recog/amazon_HWR/handwritten-text-recognition-for-apache-mxnet/0_handwriting_ocr.ipynb appears to have died. It will restart automatically.

The crash happens when running this line:
test_ds = IAMDataset("form_original", train=False)

Links in the README file are broken

The links in the README file still refer to files on the old repo's master branch, which are unavailable.

Please update the README with the latest changes.

Context set doesn't check for presence of GPU in 0_handwriting_ocr.ipynb

When setting the context in the denoising section of 0_handwriting_ocr.ipynb, the code doesn't check for the presence of GPUs and therefore fails when no GPUs are available.

Based on other context-setting code in the example, I would suggest that this line...

ctx_nlp = mx.gpu(3)

...should read:

ctx_nlp = mx.gpu(3) if mx.context.num_gpus() > 0 else mx.cpu()

Train on IAMDataset "Word" Crashes the code

Hello, I have a need to train the model with words. Can you please help me?

I tried updating the code with max_seq_len = 96 as "jonomon" suggested, but it crashes with this error:
DeferredInitializationError: Parameter 'cnnbilstm0_hybridsequential1_hybridsequential0_encoderlayer0_lstm0_l0_i2h_weight' has not been initialized yet because initialization was deferred. Actual initialization happens during the first forward pass. Please pass one batch of data through the network before accessing Parameters. You can also avoid deferred initialization by specifying in_units, num_features, etc., for network layers.

During handling of the above exception, another exception occurred:

Gradient of Parameter `ssd1_batchnorm0_beta` on context gpu(0) has not been updated by backward since last `step`

I'm trying to execute the code as it is, but I don't know why I'm getting this error:
"UserWarning: Gradient of Parameter ssd1_batchnorm0_beta on context gpu(0) has not been updated by backward since last step. This could mean a bug in your model that made it only use a subset of the Parameters (Blocks) for this iteration. If you are intentionally only using a subset, call step with ignore_stale_grad=True to suppress this warning and skip updating of Parameters with stale gradient" on trainer.step(stepsize)

AssertionError: Please enter credentials for the IAM dataset in credentials.json or as arguments


AssertionError Traceback (most recent call last)
in ()
----> 1 test_ds = IAMDataset("form_original", train=False)

/content/iam_dataset.py in init(self, parse_method, credentials, root, train, output_data, output_parse_method, output_form_text_as_array)
174 self._credentials = (credentials["XXX"], credentials["XXX"])
175 else:
--> 176 assert False, "Please enter credentials for the IAM dataset in credentials.json or as arguments"
177 else:
178 self._credentials = credentials

AssertionError: Please enter credentials for the IAM dataset in credentials.json or as arguments

I see this error when I call IAMDataset("form_original", train=False). I have already registered and have a username and password, but I do not know how to put them into credentials.json. Please help me solve this problem, thank you very much!
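Two ways to resolve this, based on the README and the traceback above: either rename credentials.json.example to credentials.json and fill in your username and password, or pass the credentials directly to the constructor. A minimal sketch of the second option (the (username, password) tuple form is an assumption drawn from line 178 of iam_dataset.py, which assigns the argument straight to self._credentials):

from ocr.utils.iam_dataset import IAMDataset

# Assumption: credentials are accepted as a (username, password) tuple
test_ds = IAMDataset("form_original",
                     credentials=("your-username", "your-password"),
                     train=False)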

How to make predictions from pre-trained models?

Good work, thanks.
I am using the pre-trained models to get text from images. When going through the code on how to do this, I learned that my test images' format has to match what the IAMDataset class in ocr.utils.iam_dataset outputs.
So, how do I modify the IAMDataset class in ocr.utils.iam_dataset so that an input image matches the test dataframe format this class outputs, or how do I get the dataframe for images other than the ones in the IAM dataset? I couldn't understand this class completely, so if anyone has worked on this, please help me solve it.
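A minimal sketch of one possible approach (assumptions: you run it inside 0_handwriting_ocr.ipynb, where paragraph_segmentation_transform and form_size are defined, and the models expect a single-channel grayscale array like the IAMDataset "form" output):

import cv2

# Load your own page as grayscale, matching the single-channel arrays IAMDataset produces
image = cv2.imread("my_handwritten_page.jpg", cv2.IMREAD_GRAYSCALE)

# Resize it the same way the notebook does for IAM forms
resized_image = paragraph_segmentation_transform(image, form_size)

# resized_image can then be fed to the paragraph segmentation network exactly as in the
# notebook; no modification of the IAMDataset class should be needed for inference.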

regions of text not being detected properly

Here is my input image in color:
[input paragraph image]
1. The detection produced by the pre-trained models can be seen below, when the above image is converted 0. from RGB to grayscale and 1. from BGR to grayscale (this happens at the paragraph segmentation step of the 0_handwriting_ocr.ipynb notebook; I have made no changes to the code in that notebook):
[detection output image]
2. Because of this improper region detection, areas that actually have text are being cropped out.
Also, when I try a form size other than the one in the code, form_size = (1120, 800) (because my images have a smaller aspect ratio), the machine crashes. What is causing this, and how can I prevent it?
3. Presumably because of the improper detection above, or maybe because line/word segmentation is not happening properly, here is the word segmentation:
[word segmentation image]
and here is the line segmentation:
[line segmentation image]
How do I fix these?

ConnectionResetError at word segmentation

Hi, I already mentioned my problem elsewhere, but I didn't find an issue describing what I am experiencing at the moment. When I start 2_line_word_segmentation.ipynb I get the following error:

ConnectionResetErrorTraceback (most recent call last)
<ipython-input-13-fbd64d2ad138> in <module>
      3     cls_metric = mx.metric.Accuracy()
      4     box_metric = mx.metric.MAE()
----> 5     train_loss = run_epoch(e, net, train_data, trainer, log_dir, print_name="train", is_train=True, update_metric=False)
      6     test_loss = run_epoch(e, net, test_data, trainer, log_dir, print_name="test", is_train=False, update_metric=True)
      7     if test_loss < best_test_loss:

<ipython-input-6-6b90c6f2ae19> in run_epoch(e, network, dataloader, trainer, log_dir, print_name, is_train, update_metric)
     32 
     33     total_losses = [0 for ctx_i in ctx]
---> 34     for i, (X, Y) in enumerate(dataloader):
     35         X = gluon.utils.split_and_load(X, ctx)
     36         Y = gluon.utils.split_and_load(Y, ctx)

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/data/dataloader.py in __next__(self)
    503         try:
    504             if self._dataset is None:
--> 505                 batch = pickle.loads(ret.get(self._timeout))
    506             else:
    507                 batch = ret.get(self._timeout)

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/data/dataloader.py in rebuild_ndarray(pid, fd, shape, dtype)
     59             fd = multiprocessing.reduction.rebuild_handle(fd)
     60         else:
---> 61             fd = fd.detach()
     62         return nd.NDArray(nd.ndarray._new_from_shared_mem(pid, fd, shape, dtype))
     63 

/usr/lib/python3.6/multiprocessing/resource_sharer.py in detach(self)
     55         def detach(self):
     56             '''Get the fd.  This should only be called once.'''
---> 57             with _resource_sharer.get_connection(self._id) as conn:
     58                 return reduction.recv_handle(conn)
     59 

/usr/lib/python3.6/multiprocessing/resource_sharer.py in get_connection(ident)
     85         from .connection import Client
     86         address, key = ident
---> 87         c = Client(address, authkey=process.current_process().authkey)
     88         c.send((key, os.getpid()))
     89         return c

/usr/lib/python3.6/multiprocessing/connection.py in Client(address, family, authkey)
    491 
    492     if authkey is not None:
--> 493         answer_challenge(c, authkey)
    494         deliver_challenge(c, authkey)
    495 

/usr/lib/python3.6/multiprocessing/connection.py in answer_challenge(connection, authkey)
    730     import hmac
    731     assert isinstance(authkey, bytes)
--> 732     message = connection.recv_bytes(256)         # reject large message
    733     assert message[:len(CHALLENGE)] == CHALLENGE, 'message = %r' % message
    734     message = message[len(CHALLENGE):]

/usr/lib/python3.6/multiprocessing/connection.py in recv_bytes(self, maxlength)
    214         if maxlength is not None and maxlength < 0:
    215             raise ValueError("negative maxlength")
--> 216         buf = self._recv_bytes(maxlength)
    217         if buf is None:
    218             self._bad_message_length()

/usr/lib/python3.6/multiprocessing/connection.py in _recv_bytes(self, maxsize)
    405 
    406     def _recv_bytes(self, maxsize=None):
--> 407         buf = self._recv(4)
    408         size, = struct.unpack("!i", buf.getvalue())
    409         if maxsize is not None and size > maxsize:

/usr/lib/python3.6/multiprocessing/connection.py in _recv(self, size, read)
    377         remaining = size
    378         while remaining > 0:
--> 379             chunk = read(handle, remaining)
    380             n = len(chunk)
    381             if n == 0:

ConnectionResetError: [Errno 104] Connection reset by peer

I am using a Docker image on a Linux system. Can you help me get the notebook to run?

Downloading gbw dataset crashes machine

The code below, from the 'Denoising text output' section of 0_handwriting_ocr.ipynb, crashes the system for an unknown reason when run on a machine with 35.35 GB RAM and 107.77 GB disk space (a Google Colab TPU session).

ctx_nlp = mx.gpu(3)
language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

How do I download this dataset without crashing the machine? Also, I don't want to download it again next time, so can I save this dataset somewhere?

Questions about train and evaluation

Thanks for your great work. I am a rookie at handwriting recognition and have some questions about training and evaluation.

  1. This repo uses SCLITE for WER evaluation. I found that SCLITE ignores the spaces between words when it evaluates the words of one line, but other methods, such as https://github.com/githubharald/SimpleHTR/blob/master/src/main.py#L81 and https://github.com/jpuigcerver/xer/blob/master/xer#L116, do not. Which is the usual criterion?

  2. Why 100.0 - float(er)? I think it should be float(er):

    for line in output_file.readlines():
        match = re.match(match_tar, line.decode('utf-8'), re.M|re.I)
        if match:
            # I think there are matching problems
            number = match.group(1)  # --> match.group().split()[4]
            er = match.group(2)      # --> match.group().split()[-3]
    assert number != None and er != None, "Error in parsing output."
    return float(number), 100.0 - float(er)  # return float(number), float(er)

  3. It's the average CER over all lines, not the global CER:

    # https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet/blob/master/0_handwriting_ocr.ipynb
    def get_qualitative_results_lines(denoise_func):
        sclite.clear()
        test_ds_line = IAMDataset("line", train=False)
        for i in tqdm(range(1, len(test_ds_line))):
            # ....
            sclite.add_text([decoded_text], [actual_text])
        cer, er = sclite.get_cer()
        print("Mean CER = {}".format(cer))
        return cer

  4. The pretrained model handwriting_line8.params works well, but I can't train such a good model myself.

    # https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet/blob/master/ocr/handwriting_line_recognition.py#L30
    # Best results:
    # python handwriting_line_recognition.py --epochs 251 -n handwriting_line.params -g 0 -l 0.0001 -x 0.1 -y 0.1 -j 0.15 -k 0.15 -p 0.75 -o 2 -a 128

Looking forward to your reply. Thanks a lot.
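On points 2 and 3, one way to see the difference between a mean per-line CER and a global CER is to compute both directly. A minimal sketch using the leven package (which this repo already depends on); this is not the SCLITE implementation:

from leven import levenshtein

def global_cer(predictions, references):
    # total edit distance divided by total reference length
    total_dist = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    total_len = sum(len(r) for r in references)
    return total_dist / total_len

def mean_line_cer(predictions, references):
    # average of the per-line CERs, which weights short and long lines equally
    cers = [levenshtein(p, r) / len(r) for p, r in zip(predictions, references)]
    return sum(cers) / len(cers)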

FileNotFound

Hello guys, I have what I think is a simple problem: when I launch test_iam_dataset I get this error:

FileNotFoundError: [Errno 2] File /home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/../../dataset/iamdataset/subject/trainset.txt does not exist: '/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/../../dataset/iamdataset/subject/trainset.txt'

I don't know what kind of file this is.

If someone has an idea, thanks a lot!

is 50% dropout a good value to set?

Hi,

I can see that the dropout percentage set in the handwriting_line_recognition.py script is 50%. Is dropping half of the nodes a good suggestion?

self.p_dropout = 0.5

Please advise why 50% dropout is set here.

Thanks,
Anand.

Mxnet package gives error post installation

getting the error "OSError: [WinError 126] The specified module could not be found" in Windows Server 2016. What is the dll file missing?

Installed using pip.

Python: 3.7.4

Unable to run the test_iam_dataset.ipynb

Traceback (most recent call last):
  File "iam_dataset.py", line 23, in <module>
    from .expand_bounding_box import expand_bounding_box
SystemError: Parent module '' not loaded, cannot perform relative import

Python version = 3.5.2
OS = Windows
Editor = PyCharm

Is the default learning rate 0.0001 good, or is 0.00001 better?

Hi,

I was trying to tweak the learning_rate and dropout parameters for the handwriting_line_recognition.py model.

Since there is not much change in the loss when changing the dropout parameter (20%, 35%, 50%), I'm keeping the default one.

But changing the learning rate from 0.0001 to 0.00001 gives a huge increase in the stability of the model, as plotted below (the training loss is roughly equal to the test loss).

plotted graph image: https://prnt.sc/rv6lzm

graph_label notations:

lr-e5 => learning_rate = 0.00001
lr-e4 => learning_rate = 0.0001

-> The bottom two lines are the train and test loss curves for the 0.0001 learning rate, and all the lines above are plotted for 0.00001. We can see that the bottom two lines are not stable, whereas the other lines are very stable (the training loss is roughly equal to the test loss).

Since lr 0.00001 is better than 0.0001, can we make 0.00001 the default, or would we face any other problem if we use this new learning rate?

Please advise.

Thanks,
Anand.

MXNetError: [11:26:33] C:\Jenkins\workspace\mxnet-tag\mxnet\src\ndarray\ndarray.cc:1279: GPU is not enabled

When executing the following commands:

ctx_nlp = mx.gpu(3)
language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

I got a download of some compressed files as a result, then the download stopped, and when I run that line again I get a GPU error. What can I do to find the files that were being downloaded, and where should I place them?
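A minimal workaround sketch (assumption: the download is cached under ~/.mxnet by default, so re-running the cell picks up or re-fetches the cached files; the context fallback follows the pattern used elsewhere in the notebook):

import mxnet as mx
import gluonnlp as nlp

# mx.gpu(3) assumes at least 4 GPUs; fall back to the first GPU or the CPU instead
ctx_nlp = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu()
language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw',
                                                      pretrained=True,
                                                      ctx=ctx_nlp)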

AssertionError: Shape of params are incompatible

Hi again,

I am a bit confused about this error, which happens in the 4_text_denoising notebook. I did every step before it, but something does not fit with the dimensions. Can you explain why this is happening?

AssertionErrorTraceback (most recent call last)
<ipython-input-31-2a0e848a57c4> in <module>
      1 model_path = 'models/denoiser2.params'
      2 if (os.path.isfile(model_path)):
----> 3     net.load_parameters(model_path, ctx=ctx)
      4     print("Loaded parameters")
      5     best_test_loss = evaluate(net, val_data_ft)

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/block.py in load_parameters(self, filename, ctx, allow_missing, ignore_extra, cast_dtype, dtype_source)
    553                         name, filename, _brief_print_list(self._params.keys())))
    554             if name in params:
--> 555                 params[name]._load_init(loaded[name], ctx, cast_dtype=cast_dtype, dtype_source=dtype_source)
    556 
    557     def load_params(self, filename, ctx=None, allow_missing=False,

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/parameter.py in _load_init(self, data, ctx, cast_dtype, dtype_source)
    280                     "Failed loading Parameter '%s' from saved params: " \
    281                     "shape incompatible expected %s vs saved %s"%(
--> 282                         self.name, str(self.shape), str(data.shape))
    283             self.shape = tuple(i if i != unknown_dim_size else j
    284                                for i, j in zip(self.shape, data.shape))

AssertionError: Failed loading Parameter 'transformer_enc_const' from saved params: shape incompatible expected (150, 512) vs saved (150, 256)

Train on IAMDataset "Word" Crashes the code

The 3_handwriting_recognition.py works fine with IAMDataset("line", output_data="text", train=True) but crashes when using the word IAMDataset. Specifically, doing this crashes the code.

train_ds = IAMDataset("word", output_data="text", train=True)
print("Number of training samples: {}".format(len(train_ds)))

test_ds = IAMDataset("word", output_data="text", train=False)
print("Number of testing samples: {}".format(len(test_ds)))

Gives:
mxnet.base.MXNetError: Shape inconsistent, Provided = [13320192], inferred shape=(8863744,)

Retrain the model with additional dataset

Hello, how can I retrain the model with a new dataset? I looked at the XML files for the bounding box information, but it looks different.

What are the preparation steps for retraining the model?

Please provide information!

Thanks
Dinesh

How to use it for lines or words

I want to use the same code to generate text from a line containing a few words. What changes should I look at, given that the code is made for paragraph text generation?

Issue with resizing the image (`resize_image()` function) from `ocr.utils.iam_dataset.py` for paragraph segmentation!

During paragraph segmentation in "0_handwriting_ocr.ipynb", paragraph_segmentation_transform(image, form_size) is called to paragraph-segment the image, which in turn calls the resize_image() function from ocr.utils.iam_dataset.py to resize the image (the images I have passed are not from the IAM dataset; I have passed my own images into the images array for text recognition). The error occurs at line 72 of that file:

color = image[0][0]
if color < 230:
    color = 230

The problem occurs because image[0][0] is an array, not a single value. How do I fix this and proceed further?
Here is a screenshot of the error:
[screenshot of error]
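A minimal sketch of a likely fix (assumption: the image was read in colour, e.g. with cv2.imread, so each pixel is a 3-value array; converting to grayscale first makes image[0][0] a single intensity value, which is what resize_image expects):

import cv2

image = cv2.imread("my_page.jpg")               # BGR, shape (H, W, 3)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # shape (H, W), single channel
segmented = paragraph_segmentation_transform(gray, (1120, 800))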

How to do incremental learning with the current model?

Thanks for this wonderful piece of work from your team! It is really very helpful for the student community.

I used the model and trained it on a dataset. It is working really well, with close to 85% accuracy.

Now I have a new dataset. I do not want to train from scratch, but rather use the current weights and train only on the new dataset. How can we do that?

Thanks!
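A minimal sketch of fine-tuning rather than training from scratch (Network, new_train_data and loss_fn are placeholders for whatever you already use; this is not the repo's exact training script):

import mxnet as mx
from mxnet import gluon, autograd

ctx = mx.gpu() if mx.context.num_gpus() > 0 else mx.cpu()
net = Network()                                                   # same architecture as before
net.load_parameters("models/handwriting_line8.params", ctx=ctx)   # start from the current weights

trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 0.00001})
for epoch in range(10):
    for x, y in new_train_data:                                   # DataLoader over the new data only
        x, y = x.as_in_context(ctx), y.as_in_context(ctx)
        with autograd.record():
            loss = loss_fn(net(x), y)
        loss.backward()
        trainer.step(x.shape[0])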

Unable to run evaluate_cer

Hi, I just ran the evaluate_cer.
And I got this message: 'ParagraphSegmentationNet' is not defined

Any ideas?

Regarding GPU Integration issue

When I execute "ctx = mx.gpu() if mx.context.num_gpus() > 0 else mx.cpu()", the output of ctx is only cpu() and the output of "mx.context.num_gpus()" is 0, even though I have enabled the GPU. I am using MXNet 1.4.0 and leven==1.0.4.
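One common cause (an assumption, not confirmed in this thread) is that the CPU-only mxnet wheel is installed instead of a CUDA build such as mxnet-cu100; a quick check:

import mxnet as mx

print(mx.__version__)           # e.g. 1.4.0
print(mx.context.num_gpus())    # 0 means MXNet was not built with CUDA or no GPU is visible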

Help regarding running this code on Google Colab

I have uploaded the dataset to Drive and now I don't understand what to change in iam_dataset.py so that my code works. I made some changes and ran this code on my local computer without any problem, but training takes forever and the kernel dies in the middle of it. I am totally new to Google Colab and also not that good at Python. If anyone has tried this code on Google Colab, please share your iam_dataset.py code with me. I've been stuck with it for 4 days and I'm kind of clueless about what to search for on Google.
That's my email: [email protected]

Duration of training 4_text_denoising

Hi, just a short question. I started 4_text_denoising and I have a single GPU (1x RTX 2080 Ti). It has already been running for over 50 hours. Is that normal?
At the moment I can't see the progress in the notebook anymore because I restarted the browser...

Question: How long does 4_text_denoising normally take?

Thanks for the help.
