githubharald / ctcwordbeamsearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

Home Page: https://towardsdatascience.com/b051d28f3d2e

License: MIT License

Shell 0.28% C++ 96.37% Python 3.08% C 0.27%

ctc decoder handwritten-text-recognition language-model recurrent-neural-networks speech-recognition text-recognition

ctcwordbeamsearch's Issues

I compiled TFWordBeamSearch.so on a Linux system with GCC 7.3.1, TF 1.5.0 and Python 3.5.2. When I copy the .so file into the src/ folder on Windows 10 (gcc 5.1.0, Python 3.6.7, TF 1.5.0), it gives me an error stating that it was designed to run on Windows, or that the file was not found. How do I compile it on Windows so that it works here?

If you create a new issue, please provide the following information:

Which program causes the problem

Custom TF operation
C++ test program
Python prototype

Versions

TensorFlow version
Python version
C++ compiler
Operating system

Issue

Which result/error did you get?
If you think the result is wrong - what result did you expect instead?
How to reproduce the issue?
Provide all necessary data, at least these files: chars.txt, wordChars.txt, corpus.txt, gt_X.txt, mat_X.csv

Creating TFWordBeamSearch.so file for custom dataset

@githubharald

I'm trying to embed CTCWordBeamSearch into SimpleHTR.
In order to build the TFWordBeamSearch.so file for my custom data, we need the mat_X.csv and gt_X.txt files. How do I generate these files for my data? Or can I use the same TFWordBeamSearch.so file that was generated for the IAM dataset in the repository? Please advise.

Incorporating my LM into the CTC-beam Issue #720

Hi,

I am following the tutorial and want to create my own beam search for a specific set of characters (e.g. dates) with SimpleHTR. Can we use a custom output dictionary instead of the built-in beam search output dictionary?

Is there any update on issue 720?

Getting Top-N results from Beam Search

I have a slightly different use case: I want the top-N results from beam search, not just the best one.
I was trying to modify the word_beam_search C++ implementation, as I am using the NumPy (Python) implementation.

Can you please help me with the changes needed so that I get the top N (N = beam width) results when I call the wbs.compute() function?

no file " editdistance"

If you create a new issue, please provide the following information:

Which program causes the problem

Python prototype

Versions

Python version

Issue
import editdistance
ImportError: No module named 'editdistance'

Issue in running Custom TF operation

If you create a new issue, please provide the following information:

Which program causes the problem

Custom TF operation
NumPy operation (Python package)
C++ test program
Python prototype

Versions

TensorFlow version
Python version
C++ compiler
Operating system

Issue

Which result/error did you get?
If you think the result is wrong - what result did you expect instead?
How to reproduce the issue?
Provide all necessary data, at least these files: chars.txt, wordChars.txt, corpus.txt, gt_X.txt, mat_X.csv

question about feedMat

Hi @githubharald, thanks for your project. I have a question about the matrix fed into the TF session. I am training a CRNN+CTC model. For example, take an image that represents the text "x181208022". Before the CTC layer I have the RNN output; if I use greedy decoding, I get "--x-11-8-1-2-0-8--0-2-2---", where "-" represents the CTC blank. If I want to use your project, should I just feed the RNN output matrix into the word beam search part?
I ask because I saw this in your testing code:

blank = len(chars)
s = ''
batch = 0
for label in res[batch]:
	if label == blank:
		break
	s += chars[label]

The for loop breaks as soon as it meets a CTC blank. But in my case a CTC blank is not the end of a word, so breaking there gives the wrong result.
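The greedy (best-path) decoding described in this question can be sketched as follows. This is only an illustration of how the raw per-time-step path collapses to the final text; the function name `collapse_ctc_path` and the use of "-" as the blank symbol are assumptions taken from the example in the question, not code from the repository. It operates on the raw path, which is a different thing from the label sequence returned by the decoder in the quoted loop (there, the blank index appears to act as padding after the decoded labels).

```python
from itertools import groupby


def collapse_ctc_path(path, blank="-"):
    """Best-path decoding: merge repeated symbols, then drop blanks."""
    # groupby yields one entry per run of identical consecutive symbols
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != blank)


# Raw path from the question, with "-" standing for the CTC blank
print(collapse_ctc_path("--x-11-8-1-2-0-8--0-2-2---"))  # x181208022
```

Note that the blank between the two final "2" symbols is what keeps them from being merged into one.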

the last word from one sentence was linked with the first word from the next sentence in LM-bigram

Hi @githubharald, thanks for your project!
This is about the Python version.
When I set the dataset to "bentham", I found that the last word of one sentence was linked with the first word of the next sentence and counted as a bigram.

the corpus.txt of "bentham" is:
brain. supposed submitt both mental and corporeal,
the fake friend of the family like the
is far beyond any idea

and the "bigrams" shows:
'the':{'is':2.0, 'family':2.0,'fake':2.0}

The second sentence's "the" is linked with the third sentence's "is".
I think this may affect the OCR results.
I think the correct bi-gram should be: 'the':{'family':1.0,'fake':1.0}
What do you think about this problem?
Is this a bug?
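The difference between the two counting strategies in this report can be reproduced with a small sketch. The function `count_bigrams` and its `per_line` switch are hypothetical and are not the repository's LanguageModel code; the sketch also produces raw counts, so the numbers differ from the values printed in the report.

```python
import re
from collections import defaultdict


def count_bigrams(text, per_line=True):
    """Count word bigrams, optionally without crossing line boundaries."""
    # Either treat every line as its own unit or the whole text as one unit
    units = text.splitlines() if per_line else [text]
    bigrams = defaultdict(lambda: defaultdict(int))
    for unit in units:
        words = re.findall(r"\w+", unit)
        for first, second in zip(words, words[1:]):
            bigrams[first][second] += 1
    return bigrams


corpus = (
    "brain. supposed submitt both mental and corporeal,\n"
    "the fake friend of the family like the\n"
    "is far beyond any idea\n"
)

# Whole text as one unit: 'the' is also followed by 'is' from the next line
print(dict(count_bigrams(corpus, per_line=False)["the"]))
# Line by line: only 'fake' and 'family' follow 'the'
print(dict(count_bigrams(corpus, per_line=True)["the"]))
```

Whether line breaks in corpus.txt should be treated as sentence boundaries is a design choice; the sketch only shows that the choice changes the bigram table.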

Language model kills performance in beam search

This is a continuation of a question on Stack Overflow. The question was: why should word-level LM integration work if it decreases the probability of valid prefixes while not scoring prefixes that haven't been decoded into words yet? I checked my dataset and I have an OOV rate of about 8%. I also calculated the cross-entropy of a bigram model trained on the training set against the test set. The results are as follows.

vocab size: 39105
oov rate: 0.08589909443725743
cross_entropy_train_test: 6.5233285695619125
train_entropy: 6.483290547147171
test_entropy: 6.314921857881477

Unfortunately, I cannot share the distributions from the acoustic model and the corresponding text, since the text is proprietary :/ I do not expect a direct solution, but rather a discussion about why word-level beam search should work. I am now running experiments with a char-level LM and the results seem quite promising. I use the following implementation of beam search.

def log_beam_search(ctc, alphabet, blank_idx, beam_width, lm=False, char_lm=False, alpha=0.3, beta=5,
                    prune=0, prefix_tree=False, end_symbol='>'):

    F = ctc.shape[1]
    ctc = np.vstack((np.zeros(F), ctc))
    T = ctc.shape[0]

    Pb = defaultdict(lambda: defaultdict(lambda : NEG_INF))
    Pnb = defaultdict(lambda: defaultdict(lambda : NEG_INF))
    Pb[0][''] = 0
    Pnb[0][''] = NEG_INF
    A_prev = ['']

    for t in range(1, T):
        if prune:
            pruned_alphabet = [alphabet[i] for i in np.where(ctc[t] > prune)[0]]
        else:
            pruned_alphabet = alphabet

        for l in A_prev:

            if len(l) > 0 and l[-1] == end_symbol:
                Pb[t][l] = Pb[t - 1][l]
                Pnb[t][l] = Pnb[t - 1][l]
                continue

            if prefix_tree:
                if len(l) > 0 and l[-1] != ' ':
                    pruned_alphabet = prefix_tree(l.split()[-1])

            for c in pruned_alphabet:
                c_idx = alphabet.index(c)  # TODO: use a dict to get O(1) lookup instead of O(n)

                if c_idx == blank_idx:

                    Pb[t][l] = logsumexp(
                        Pb[t][l],
                        ctc[t][blank_idx] + Pb[t - 1][l],
                        ctc[t][blank_idx] + Pnb[t - 1][l]
                    )

                else:

                    l_plus = l + c
                    if len(l) > 0 and c == l[-1]:
                        if char_lm:
                            ch = alpha * char_lm(l_plus)
                        else: ch = 0

                        Pnb[t][l_plus] = logsumexp(
                            Pnb[t][l_plus],
                            ctc[t][c_idx] + Pb[t - 1][l] + ch
                        )

                        Pnb[t][l] = logsumexp(
                            Pnb[t][l],
                            ctc[t][c_idx] + Pnb[t - 1][l]
                        )

                    elif len(l.replace(' ', '')) > 0 and c in (' ', end_symbol):

                        lm_prob = 0 if not lm else alpha * lm(l_plus.strip(' >'))

                        Pnb[t][l_plus] = logsumexp(
                                Pnb[t][l_plus],
                                lm_prob + ctc[t][c_idx] + Pb[t - 1][l],
                                lm_prob + ctc[t][c_idx] + Pnb[t - 1][l]
                            )

                    else:

                        if char_lm:
                            ch = alpha * char_lm(l_plus)
                        else: ch = 0

                        Pnb[t][l_plus] = logsumexp(
                                Pnb[t][l_plus],
                                ctc[t][c_idx] + Pb[t - 1][l] + ch,
                                ctc[t][c_idx] + Pnb[t - 1][l] + ch
                            )

                    # Make use of discarded prefixes
                    if l_plus not in A_prev:

                        Pb[t][l_plus] = logsumexp(
                            Pb[t][l_plus],
                            ctc[t][-1] + Pb[t - 1][l_plus],
                            ctc[t][-1] + Pnb[t - 1][l_plus]
                        )

                        Pnb[t][l_plus] = logsumexp(
                            Pnb[t][l_plus],
                            ctc[t][c_idx] + Pnb[t - 1][l_plus]
                        )

        A_next = {
            x: logsumexp(
                Pnb[t].get(x, NEG_INF),
                Pb[t].get(x, NEG_INF)
            )
            for x in set(Pb[t]).union(Pnb[t])
        }
        # word insertion bonus rescoring - do not use word bonus if no LM used!
        bonus = 0 if not lm else beta * math.log(len(words(l)) + 1)
        sorter = lambda l: A_next[l] + bonus
        A_prev = sorted(A_next, key=sorter, reverse=True)[:beam_width]
        del Pnb[t-1], Pb[t-1]

    return A_prev
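The listing above calls `logsumexp` with several positional arguments and uses a `NEG_INF` constant, neither of which is shown in the issue. A minimal sketch of what such helpers might look like is given below; this is an assumption about the missing pieces, not the author's actual code.

```python
import math

# Log-domain representation of probability zero (assumed definition)
NEG_INF = float("-inf")


def logsumexp(*log_values):
    """Return log(sum(exp(v))) for the given log-domain values, stably."""
    largest = max(log_values)
    if largest == NEG_INF:
        # All inputs represent probability zero
        return NEG_INF
    # Subtracting the maximum keeps every exponent at or below zero
    return largest + math.log(sum(math.exp(v - largest) for v in log_values))


# Two paths with probability 0.5 each sum to probability 1, i.e. log 1 = 0
print(logsumexp(math.log(0.5), math.log(0.5)))
```

Handling the all-`NEG_INF` case separately avoids computing `-inf - (-inf)`, which would yield NaN.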

Adapting for Digit String Recognition?

I compiled TFWordBeamSearch.so and successfully incorporated it with wordbeamsearch as the decoderType in the SimpleHTR model (using TensorFlow 1.3 for both projects).

My use case is to recognize 6-digit strings that come from a restricted set of 10 possibilities (they are aircraft serial numbers). So to conduct wordbeamsearch, in the simpleHTR project I modified

data/corpus.txt to contain only the 10 actual serial numbers
model/wordCharList.txt to contain only the digits 0 through 9

I have a test set of 10 images of these 6-digit serial strings. When running the CRNN with the default best-path decoding, I naturally get a lot of the digits mislabeled as letters. I expected that wordbeamsearch decoding would improve the result, but now all the images are labeled by the CRNN as "." with probability 0.50333.

I am trying to understand whether this is related to my using only digits in my "words".

Attached is an example panel showing a few of the input images and the decoded label that results from the default best path search.

corpus.txt
wordCharList.txt

The question about shape (TxBxC)

Hi Mr. Scheidl,
I have just come across your README and I have a question. In the "A First Example" section, you mention that the feed matrix has to have shape (TxBxC). What is the meaning of the term B?
In most articles I have read about word beam search, B refers to the beam width, but in your README I think it is not the beam width.
Could you explain it for me? Thank you very much.
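As an illustration of the shape in question, the sketch below assumes that T is the number of time steps, B the number of batch elements and C the number of characters including the blank. That reading of B is an assumption here and should be checked against the README. The array contents are random placeholders.

```python
import numpy as np

T, B, C = 4, 2, 3  # assumed meaning: time steps, batch elements, characters (incl. blank)

# Many models produce batch-major output of shape (B, T, C)
batch_major = np.random.rand(B, T, C)
batch_major /= batch_major.sum(axis=2, keepdims=True)  # each time step sums to 1

# Swap the first two axes to obtain the time-major layout (T, B, C)
time_major = np.transpose(batch_major, (1, 0, 2))
print(time_major.shape)  # (4, 2, 3)
```

Under this reading, `time_major[t, b]` holds the character distribution of batch element `b` at time step `t`.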

resolving undefined symbol errors caused by _GLIBCXX_USE_CXX11_ABI

I am not sure if this is just an issue with a build for python2 version of tensorflow, or there are larger matters at play, but I wanted to report a workaround for undefined symbol errors in the default build process and the test of the custom op.

My platform:
Python 2.7
Ubuntu 18.04
Tensorflow 1.10.1
g++ 6

Of course, to build with python2.7, I had to change buildTF.sh so that it invoked python rather than python3. The compile succeeds, but with several warnings:

<command-line>:0:0: warning: "__GLIBCXX_USE_CXX11_ABI" redefined

This was the first hint something might be awry.

Upon executing tf.load_op_library in the testCustomOp.py script, I then received the following error

Traceback (most recent call last):
  File "testCustomOp.py", line 85, in <module>
    testMiniExample()
  File "testCustomOp.py", line 62, in testMiniExample
    res=testCustomOp(mat, corpus, chars, wordChars)
  File "testCustomOp.py", line 16, in testCustomOp
    word_beam_search_module = tf.load_op_library('../cpp/proj/TFWordBeamSearch.so')
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: ../cpp/proj/TFWordBeamSearch.so: undefined symbol: _ZN10tensorflow11GetNodeAttrERKNS_9AttrSliceENS_11StringPieceEPSs

When I remove the errant command-line define -D_GLIBCXX_USE_CXX11_ABI=0 from the compile lines, the build succeeds without warning and the test runs as expected.

I'm not sure why the define was there in the first place, but I thought I'd add a report. My tactic involved pulling out the define with an in-script environment variable. If this is a wider issue for others, perhaps such a configuration would make it easy to manually change whether it is defined or not, based on the specific platform's requirements. (I can submit a PR if you wish).

Thanks for developing and sharing this. Congratulations on the best paper award as well (which is how I found out about your repo). I am looking forward to some fruitful experiments blending this with my own CTC-based OCR.
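One way to make the define configurable, as suggested in this report, is to check which ABI setting the installed TensorFlow itself was built with before hard-coding one. The helper below is a hypothetical sketch that only inspects a list of compiler flags; the example flag lists are made up, and in TensorFlow versions that provide `tf.sysconfig.get_compile_flags()` the real list could be taken from there.

```python
def find_abi_define(compile_flags):
    """Return the value of -D_GLIBCXX_USE_CXX11_ABI in a flag list, or None."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    for flag in compile_flags:
        if flag.startswith(prefix):
            return flag[len(prefix):]
    return None


# Hypothetical flag lists for illustration only
flags_old_abi = ["-I/path/to/tensorflow/include", "-D_GLIBCXX_USE_CXX11_ABI=0"]
flags_no_define = ["-I/path/to/tensorflow/include"]

print(find_abi_define(flags_old_abi))    # 0
print(find_abi_define(flags_no_define))  # None
```

A build script could then pass the define only when the function returns a value, instead of always forcing `=0`.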

Does not compile / pass test correctly

If you create a new issue, please provide the following information:

Which program causes the problem

Custom TF operation
C++ test program
Python prototype

Versions

TensorFlow version = 2.1.0
Python version = 3.7.7
C++ compiler = 11.0.3
Operating system = macOS Catalina 10.15.4

Issue

Which result/error did you get?

AttributeError: module 'tensorflow' has no attribute 'Session'

If you think the result is wrong - what result did you expect instead?

 Mini example:
 Label string:  [1 0 3]
 Char string: "ba"

 # ... remainder of example omitted...

How to reproduce the issue?

Just follow the setup documentation with the latest versions of TensorFlow, Python, and Pip. It doesn't work.
Provide all necessary data, at least these files: chars.txt, wordChars.txt, corpus.txt, gt_X.txt, mat_X.csv

N/A

testCustomOp.py error

Hi Harald,
I use:
Ubuntu 18.04
TensorFlow 1.5.0
Python 3.6.5
gcc version 7.5.0
Running buildTF.sh completed without problems.
Next I ran testCustomOp.py, but I got an error:

Mini example:
Label string: [1 0 3]
Char string: "ba"
Traceback (most recent call last):
  File "testCustomOp.py", line 92, in <module>
    testRealExample()
  File "testCustomOp.py", line 81, in testRealExample
    res = testCustomOp(mat, corpus, chars, wordChars)
  File "testCustomOp.py", line 24, in testCustomOp
    assert len(chars) + 1 == mat.shape[2]
AssertionError

Can you help me?
Thank you,
Long
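The failing assertion in the traceback compares the number of characters plus one with the last dimension of the matrix. Assuming the extra column is the CTC blank (an earlier issue quotes `blank = len(chars)` from the test code), a quick diagnostic could look like the sketch below. The function `check_shapes` is hypothetical and not part of testCustomOp.py.

```python
def check_shapes(chars, num_classes):
    """Mirror the failing assertion: one column per character plus one for the blank."""
    expected = len(chars) + 1
    if expected != num_classes:
        return "mismatch: {} chars + 1 blank = {}, but matrix has {} columns".format(
            len(chars), expected, num_classes)
    return "ok"


print(check_shapes("ab", 3))   # ok
print(check_shapes("abc", 3))  # mismatch: 3 chars + 1 blank = 4, but matrix has 3 columns
```

Printing both numbers before the assertion usually shows quickly whether the character file or the matrix file is the one that was read incorrectly.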

Usage as loss

Hi,

Thank you for such complete repo with code examples and frameworks in both Python and CPP and tensorflow ops. Really amazing work!

If I understand correctly, this is only to be used during evaluation/prediction phase, to get more meaningful ones. I was wondering if it could also be used as a loss during training, such that paths which follow the language model are more probable than those which do not.

Or it is actually not relevant since CTC squashes all paths that produce the label and therefore all versions are maximised ?

Many thanks, again, for making your contributions public !

Compiling with buildTF.sh with other version of tensorflow

Compiling with buildTF.sh only works with TensorFlow 1.3.0; for other versions, it gives the following error:

./buildTF.sh
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/mutex.h:31:0,
from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op.h:32,
from ../src/TFWordBeamSearch.cpp:1:
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:25:22: fatal error: nsync_cv.h: No such file or directory
compilation terminated.

Is there any solution?

Why are log probabilities not used?

Thank you for this awesome repo! ;)
I was wondering why log probabilities are not used. Is the beam search stable even for long sequences?
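The numerical concern behind this question can be illustrated with a small sketch. The probabilities below are hypothetical, and whether the repository's decoder actually runs into this depends on the sequence length and on implementation details not shown here.

```python
import math

probs = [0.01] * 200  # hypothetical per-time-step probabilities

# Multiplying plain probabilities: the true value 1e-400 is below the
# smallest positive double, so the product underflows to zero
product = 1.0
for p in probs:
    product *= p

# Adding log probabilities stays comfortably within floating-point range
log_sum = sum(math.log(p) for p in probs)

print(product)  # 0.0
print(log_sum)
```

Once two beams both underflow to 0.0 they can no longer be ranked against each other, which is the usual argument for working in the log domain.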

Building a custom operation fails with python3.7

If you create a new issue, please provide the following information:

Which program causes the problem

buildTF

Versions

TensorFlow version

pip3 freeze | grep tf
tf==1.0.0

pip2 freeze | grep tensorflow
tensorflow==1.12.0

Python version

Python 2.7.15
Python 3.7.0

C++ compiler

Apple LLVM version 10.0.0 (clang-1000.10.44.4)
Target: x86_64-apple-darwin18.2.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Operating system

ProductName: Mac OS X
ProductVersion: 10.14.1
BuildVersion: 18B75

Issue

Which result/error did you get?

Single-threaded decoding
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
Your TF version is
TF versions 1.3.0, 1.4.0, 1.5.0 and 1.6.0 are tested
Compiling for TF 1.5.0 or 1.6.0 now ...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
../src/TFWordBeamSearch.cpp:1:10: fatal error: 'tensorflow/core/framework/op.h' file not found
#include <tensorflow/core/framework/op.h>
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

If you think the result is wrong - what result did you expect instead?

Looking at the code, the pip3 tf package lacks the headers shipped with the pip2 tensorflow (which is available for Python 3.7 via a hack, see tensorflow/tensorflow#20444).

I did compile the binary by switching from python3.7 to python3.6 and python2.7.

Problem with decodings when returning the top n beams' text

Hi!

I made some changes in the code to be able to return the top n beams from BeamList::getBestBeams. However, I'm getting now what looks to be incomplete decodings even when I'm using a dataset that was already successfully decoded with the "normal" version of TFWordBeamSearch.

For instance, with the normal version I decode a result and I get 品川区大井4-56-7, but when I try with the modified version and ask for the top 10 paths, I get this:

[[' 品川区大井4-56', ' 品川区大井456-', ' 品川区大井4-6-', ' 品川区大井4--6', ' 品川区大井 4-5', ' 品川区大井4-65', ' 品川区大井4-56', ' 品川区大井4-56', ' 品川区大井4-56', ' 品川区大井4-56']]

Nowhere in the list can I find the result I got with the original version. I'm not at all proficient in C++, so I'm probably missing something. What could be causing the issue?

EDIT: Sometimes when I'm assigning the labels to the decoded sequence I get an error because the decoded value is way higher than the total number of characters I get. I have around 4000 characters and I sometimes get a decoded value of more than 2000000.

These are the changes I made. Let me know if you'd like the whole files and how I can provide them to you.

Thank you in advance.

On WordBeamSearch.cpp

// Function definition (also modified on WordBeamSearch.hpp)
std::vector<std::vector<uint32_t>> wordBeamSearch(const IMatrix& mat, size_t nBestBeams, size_t beamWidth, const std::shared_ptr<LanguageModel>& lm, LanguageModelType lmType)
...
// Last part

	std::vector<std::vector<uint32_t>> bestBeamsText;
	auto bestBeams = last.getBestBeams(nBestBeams);

	for (size_t t = 0; t < nBestBeams && t < bestBeams.size(); t++)
	  {

	    // return best entry
	    auto bestBeam = bestBeams[t];
	    bestBeam->completeText();
	    bestBeamsText.push_back(bestBeam->getText());

	  }

	return bestBeamsText;

On TFWordBeamSearch.cpp

REGISTER_OP("WordBeamSearch")
.Input("mat: float32")
.Attr("nBestBeams: int")  // Added this
.Attr("beamWidth: int")
.Attr("lmType: string")
.Attr("lmSmoothing: float")
.Attr("corpus: string")
.Attr("chars: string")
.Attr("wordChars: string")
.Output("result: int32")
...

private:
  std::shared_ptr<LanguageModel> m_lm;
  size_t m_beamWidth=0;
  size_t m_numChars=0;
  size_t m_nBestBeams=1; // Added this
  LanguageModelType m_lmType=LanguageModelType::Words;


...

 // read how many beams we're going to return
	  int64 nBestBeams = 0;
	  OP_REQUIRES_OK(context, context->GetAttr("nBestBeams", &nBestBeams));
	  m_nBestBeams = static_cast<size_t>(nBestBeams);


...

// on void Compute(OpKernelContext* context) override
...
OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape({B, m_nBestBeams, T}), &outputTensor));


// Inside for(int b=0; b<B; ++b)
...
// apply decoding algorithm to batch element 
			const std::vector<std::vector<uint32_t>> decoded=wordBeamSearch(mat, m_nBestBeams, m_beamWidth, m_lm, m_lmType);

			// write to output tensor
			for (size_t n=0; n < m_nBestBeams; ++n)
			  {
			    for(int t=0; t<T; ++t)
			      {
				outputMapped(b, n, t)=t<static_cast<int>(decoded.size()) ? decoded[n][t] : blank;
			      }
			  }

Why not use unique words to build prefix tree?

Hello,

Why not use unique words to build prefix tree?

In LanguageModel.py at line 53
self.tree.addWords(words) # add all unique words to tree

But I think 'words' contains duplicates, shouldn't it be 'uniqueWords' ?

Thanks
SR
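Whether duplicates matter depends on how the tree stores words. The sketch below uses a deliberately simplified nested-dict prefix tree, which is an assumption and not the repository's PrefixTree class, to show that in such a structure inserting the same word several times yields the same tree as inserting it once.

```python
def build_prefix_tree(words):
    """Build a nested-dict prefix tree; '$' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for char in word:
            # Reuse the child node if this prefix has been seen before
            node = node.setdefault(char, {})
        node["$"] = True
    return root


words = ["the", "then", "the", "to", "the"]
unique_words = sorted(set(words))

# Both inputs produce an identical tree
print(build_prefix_tree(words) == build_prefix_tree(unique_words))  # True
```

In a tree like this, passing the full word list only costs extra insertion time; it would change the result only if nodes also stored counts.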

Possible memory leak?

Hi!

I've been trying your code and it works very well. I have just been experiencing an issue that maybe you can shed some light on.

I'm working on handwriting recognition. When I run my program, which eventually uses yours (TFWordBeamSearch.so loaded in Python), memory fills up after a fairly consistent number of processed images (around 250) and the process ends with exit code 137. I noticed that memory usage keeps going up and was wondering if there's something happening (or something I can do) on the C++ side to avoid this behavior.

I'll appreciate any hints.

Thank you!

Out of Dictionary prediction with specific wordbeamsearch implementation

If you create a new issue, please provide the following information:

Which program causes the problem

Custom TF operation

Versions

TensorFlow version == 1.15
Python version == 3.7
C++ compiler == 11.0.3
Operating system == MacOS

Issue
I extended your SimpleHTR with word beam search. Accuracy increased 5 percent. But I see that still predictions can be out of dictionary. I know that it is possible with word beam search but I am implementing in a different way. Let me explain.

I want to recognize dates that are written in 19/10/1993 format. There are always 2 slashes. My dataset only contains the dates from 01/01/2018-31/12/2022. This means that I have possible ~8000 targets. I treated each date as a single word then created txt files as below.

I have added all possible outcomes as a single word to corpus.txt.
my charlist.txt is /0123456789
my wordcharlist.txt is also same /0123456789 because slashes are in words (it is my word definition).

However, in prediction, I got '1/01/201' for 10/10/2019. I couldn't understand why I got an out-of-dictionary result. Normally, IIUC, word beam search permits arbitrary results for the characters that don't exist in charlist.txt. Here I don't have any chars that don't exist in wordcharlist.txt.

Do you have any idea what I am doing wrong?

Thanks for your repo and your effort to open source community...

generation of improper mat file.

Hi,
I'm unable to generate a proper mat file for my test data that runs and gives proper results.
Once the mat file is generated using the code provided, I get an error while running main.py.

Can you help me to figure out what could be the possible issue?

Thanks in advance!

mat_0.csv | How is it created?

Which program causes the problem

Python prototype

Versions

Python version

Issue

I am trying to use more IAM validation data, but I don't understand how to get the mat_X.csv files.

integrating (CTCWordBeamSearch)PureNumpy with (SimpleHTR --wordbeamsearch)

Dear Sir,
Thank you for making it compatible with Windows. Your kind suggestion is needed to implement it in the SimpleHTR code (the --wordbeamsearch option).

I followed the steps of the README successfully, up to getting the same output as testPybind.py (since Windows does not support the custom TF operation according to the README, I followed the pure NumPy instructions).

Code snippet of SimpleHTR-->Model.py
##################################
word_beam_search_module = tf.load_op_library('TFWordBeamSearch.so')
chars = str().join(self.charList)
wordChars = open('../model/wordCharList.txt').read().splitlines()[0]
corpus = open('../data/corpus.txt').read()

# decode using the "Words" mode of word beam search

self.decoder =word_beam_search_module.word_beam_search(tf.nn.softmax(self.ctcIn3dTBC, dim=2), 50, 'Words', 0.0, corpus.encode('utf8'), chars.encode('utf8'), wordChars.encode('utf8'))

#################################
What should I replace it with to integrate CTCWordBeamSearch with SimpleHTR?

Note:
- Initially I tried to replace it with WordBeamSearch(), but got confused about what to feed it.
- On my second try I converted tf.nn.softmax(self.ctcIn3dTBC, dim=2) into np.array(tf.nn.softmax(self.ctcIn3dTBC, dim=2)), which results in an error.
- Calling testPybind() with the above parameters also results in an error.
Please guide me on which direction to go next.

Error Running buildTF.sh

Hello,

Thanks a lot for this really nice repo. I tried to use it but wasn't successful running the buildTF.sh script.

Here are the version details for my environment:

CentOS 7
g++ (GCC) 4.8.5 (so I used -std=c++1y instead of c++14)
TensorFlow: 1.3

Do you have any idea what's going on ?

Thanks for your help, and here is the (long) log:

../src/PrefixTree.cpp: In member function ‘void PrefixTree::allWordsAdded()’:
../src/PrefixTree.cpp:56:74: error: parameter declared ‘auto’
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                          ^
../src/PrefixTree.cpp:56:91: error: parameter declared ‘auto’
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                           ^
../src/PrefixTree.cpp: In lambda function:
../src/PrefixTree.cpp:56:104: error: ‘lhs’ was not declared in this scope
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                                        ^
../src/PrefixTree.cpp:56:116: error: ‘rhs’ was not declared in this scope
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                                                    ^
../src/PrefixTree.cpp: In member function ‘std::shared_ptr<PrefixTree::Node> PrefixTree::getNode(const std::vector<unsigned int>&) const’:
../src/PrefixTree.cpp:152:96: error: parameter declared ‘auto’
   auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
                                                                                                ^
../src/PrefixTree.cpp:152:110: error: parameter declared ‘auto’
   auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
                                                                                                              ^
../src/PrefixTree.cpp: In lambda function:
../src/PrefixTree.cpp:152:123: error: ‘p’ was not declared in this scope
   auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
                                                                                                                           ^
../src/PrefixTree.cpp:152:133: error: ‘val’ was not declared in this scope
   auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
                                                                                                                                     ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_FIter std::lower_bound(_FIter, _FIter, const _Tp&, _Compare) [with _FIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Tp = unsigned int; _Compare = PrefixTree::getNode(const std::vector<unsigned int>&) const::__lambda2]’:
../src/PrefixTree.cpp:152:139:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2447:31: error: no match for call to ‘(PrefixTree::getNode(const std::vector<unsigned int>&) const::__lambda2) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, const unsigned int&)’
    if (__comp(*__middle, __val))
                               ^
../src/PrefixTree.cpp:152:82: note: candidates are:
   auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
                                                                                  ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2447:31: note: void (*)() <conversion>
    if (__comp(*__middle, __val))
                               ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2447:31: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:152:113: note: PrefixTree::getNode(const std::vector<unsigned int>&) const::__lambda2
   auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
                                                                                                                 ^
../src/PrefixTree.cpp:152:113: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘void std::__insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2226:70:   required from ‘void std::__final_insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5500:55:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2159:29: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
    if (__comp(*__i, *__first))
                             ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2159:29: note: void (*)() <conversion>
    if (__comp(*__i, *__first))
                             ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2159:29: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘void std::__heap_select(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:5349:59:   required from ‘void std::partial_sort(_RAIter, _RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2332:68:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:1948:27: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
  if (__comp(*__i, *__first))
                           ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:1948:27: note: void (*)() <conversion>
  if (__comp(*__i, *__first))
                           ^
/usr/include/c++/4.8.2/bits/stl_algo.h:1948:27: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘void std::__move_median_to_first(_Iterator, _Iterator, _Iterator, _Iterator, _Compare) [with _Iterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2295:13:   required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2337:62:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:114:28: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
       if (__comp(*__a, *__b))
                            ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:114:28: note: void (*)() <conversion>
       if (__comp(*__a, *__b))
                            ^
/usr/include/c++/4.8.2/bits/stl_algo.h:114:28: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:116:25: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
    if (__comp(*__b, *__c))
                         ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:116:25: note: void (*)() <conversion>
    if (__comp(*__b, *__c))
                         ^
/usr/include/c++/4.8.2/bits/stl_algo.h:116:25: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:118:30: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
    else if (__comp(*__a, *__c))
                              ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:118:30: note: void (*)() <conversion>
    else if (__comp(*__a, *__c))
                              ^
/usr/include/c++/4.8.2/bits/stl_algo.h:118:30: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:123:33: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
       else if (__comp(*__a, *__c))
                                 ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:123:33: note: void (*)() <conversion>
       else if (__comp(*__a, *__c))
                                 ^
/usr/include/c++/4.8.2/bits/stl_algo.h:123:33: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:125:33: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
       else if (__comp(*__b, *__c))
                                 ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:125:33: note: void (*)() <conversion>
       else if (__comp(*__b, *__c))
                                 ^
/usr/include/c++/4.8.2/bits/stl_algo.h:125:33: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_RandomAccessIterator std::__unguarded_partition(_RandomAccessIterator, _RandomAccessIterator, const _Tp&, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Tp = std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2296:78:   required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2337:62:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, const std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
    while (__comp(*__first, __pivot))
                                   ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: note: void (*)() <conversion>
    while (__comp(*__first, __pivot))
                                   ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (const std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
    while (__comp(__pivot, *__last))
                                  ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: note: void (*)() <conversion>
    while (__comp(__pivot, *__last))
                                  ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/bits/stl_algo.h:61:0,
                 from /usr/include/c++/4.8.2/algorithm:62,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_heap.h: In instantiation of ‘void std::__adjust_heap(_RandomAccessIterator, _Distance, _Distance, _Tp, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Distance = long int; _Tp = std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_heap.h:448:15:   required from ‘void std::make_heap(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:1946:47:   required from ‘void std::__heap_select(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5349:59:   required from ‘void std::partial_sort(_RAIter, _RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2332:68:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128:   required from here
/usr/include/c++/4.8.2/bits/stl_heap.h:313:40: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
        *(__first + (__secondChild - 1))))
                                        ^
../src/PrefixTree.cpp:56:60: note: candidates are:
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                            ^
In file included from /usr/include/c++/4.8.2/bits/stl_algo.h:61:0,
                 from /usr/include/c++/4.8.2/algorithm:62,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_heap.h:313:40: note: void (*)() <conversion>
        *(__first + (__secondChild - 1))))
                                        ^
/usr/include/c++/4.8.2/bits/stl_heap.h:313:40: note:   candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
   std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
                                                                                              ^
../src/PrefixTree.cpp:56:94: note:   candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: At global scope:
/usr/include/c++/4.8.2/bits/stl_algo.h:2110:5: error: ‘void std::__unguarded_linear_insert(_RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’, declared using local type ‘PrefixTree::allWordsAdded()::__lambda1’, is used but never defined [-fpermissive]
     __unguarded_linear_insert(_RandomAccessIterator __last,
     ^
In file included from /usr/include/c++/4.8.2/bits/stl_algo.h:61:0,
                 from /usr/include/c++/4.8.2/algorithm:62,
                 from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_heap.h:331:5: error: ‘void std::__pop_heap(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’, declared using local type ‘PrefixTree::allWordsAdded()::__lambda1’, is used but never defined [-fpermissive]
     __pop_heap(_RandomAccessIterator __first, _RandomAccessIterator __last,
     ^
/usr/include/c++/4.8.2/bits/stl_heap.h:178:5: error: ‘void std::__push_heap(_RandomAccessIterator, _Distance, _Distance, _Tp, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Distance = long int; _Tp = std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’, declared using local type ‘PrefixTree::allWordsAdded()::__lambda1’, is used but never defined [-fpermissive]
     __push_heap(_RandomAccessIterator __first, _Distance __holeIndex,
     ^

test error

Hello, thanks for your great work. When I run testCustomOp.py, I get the error message below.
tensorflow.python.framework.errors_impl.NotFoundError: ../cpp/proj/TFWordBeamSearch.so: undefined symbol: _ZN10tensorflow22CheckNotInComputeAsyncEPNS_15OpKernelContextEPKc
Environment: TF 1.4 and CUDA 8.0
However, when I test it on TF 1.12 and CUDA 9.0, it works.

Compiling TensorFlow Custom Operation

The implementation looks promising.
Is there a prebuilt TFWordBeamSearch.so available for testing?

Variable sequence length support

It looks as though the decoder assumes every element in the batch has the same maximum sequence length. Many datasets do not normalize sequence length; their sequences are instead variable in length.

While this issue is mildly inefficient if elements of wildly different lengths are in the same batch (of course, one could use bucketed batching to ameliorate that), my larger concern is with correctness.

Since the CTC model in tensorflow takes the sequence length vector for the batch for training, the model does not need to learn to produce the CTC blank for the padded regions.

Assuming everything so far is correct, I see two solutions:

1. Update the tf module to include a B-dimensional sequence length vector containing values in [1,T], where the RNN activations are TxBx(C+1). This argument is precisely the same as the sequence_length parameter to tf.nn.ctc_loss or tf.nn.ctc_beam_search_decoder. Then wordBeamSearch in WordBeamSearch.cpp would need to be modified to use an element-specific maxT as the loop bound.
2. The user modifies the batched RNN output so that P(blank)=1.0 for any value outside the sequence length for a given element. Though unpleasant, it does not seem particularly difficult to do on the Python side of things. In contrast, it's not obvious how to do this well or efficiently with TensorFlow ops.
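
A minimal NumPy sketch of the second solution, for illustration only. It assumes the RNN output is already a TxBx(C+1) array of probabilities with the blank as the last label; the function name mask_padding_with_blank and the variable names are hypothetical and not part of this repository. Whether the decoder then treats the padded frames correctly is the assumption made in this issue, not something verified here.

```python
import numpy as np

def mask_padding_with_blank(mat, seq_len):
    """Return a copy of mat (TxBx(C+1)) with P(blank)=1 beyond each element's length.

    mat:     probabilities, blank assumed to be the last label (assumption)
    seq_len: length-B sequence of valid time-steps per batch element
    """
    mat = np.array(mat, dtype=np.float64)  # np.array copies, the input stays untouched
    num_steps = mat.shape[0]
    # padded[t, b] is True where time-step t lies beyond the length of element b
    padded = np.arange(num_steps)[:, None] >= np.asarray(seq_len)[None, :]
    mat[padded] = 0.0      # zero all labels in padded frames
    mat[padded, -1] = 1.0  # put all probability mass on the blank
    return mat

# Example: T=3, B=2, C+1=3; the second element has only one valid time-step.
probs = np.full((3, 2, 3), 1.0 / 3.0)
masked = mask_padding_with_blank(probs, [3, 1])
print(masked[:, 1, :])
```

The masking runs on the Python side before the matrix is handed to the decoder, so it does not need any change to the C++ code.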

Are there other ideas or commentary? How difficult is solution (1)? I'm certainly concerned I'll break the whole thing if I attempt it, but I may give it a go if necessary.

Thanks!

boolean value of tensor with more than one value is ambiguous.

Which program causes the problem

Python prototype

Versions

Python version 3.7.7
Operating system ubuntu 18.04
Pytorch version 1.6.0

Issue
Hi, I read your paper and think it is a very good algorithm, so I want to apply word beam search to my research.
However, it is not easy to integrate into my Python project.
My research is about speech recognition. My input data (speech -> spectrogram) is fed into the model, which produces an output of shape [sequence length (T) x batch size (B) x number of characters (C)], e.g. (371, 32, 29).
This output is then fed into the decoder.

def WordBeamSearch(mat, beamWidth, lm, useNGrams):
    "decode matrix, use given beam width and language model"
    chars = lm.getAllChars()
    blankIdx = len(chars)  # blank label is supposed to be last label in RNN output
    #mat = mat.cpu().numpy()
    print(mat.shape)
    maxT, _, _ = mat.shape  # shape of RNN output: TxBxC

    genesisBeam = Beam(lm, useNGrams)  # empty string
    last = BeamList()  # list of beams at time-step before beginning of RNN output
    last.addBeam(genesisBeam)  # start with genesis beam
    # go over all time-steps
    for t in range(maxT):
        curr = BeamList()  # list of beams at current time-step

        # go over best beams
        bestBeams = last.getBestBeams(beamWidth)  # get best beams
        .....

The error occurs when getting the best beams. The error message 'boolean value of tensor with more than one value is ambiguous.' is raised here:

def getBestBeams(self, num):
    "return best beams, specify the max. number of beams to be returned (beam width)"
    u = [v for (_, v) in self.beams.items()]
    lmWeight = 1
    return sorted(u, reverse=True, key=lambda x: x.getPrTotal() * (x.getPrTextual() ** lmWeight))[:num]

I converted the tensor into a NumPy array, but that produces another error:
'The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'

I tried to find the solution and read your code for days, but I don't know what the problem is.
Please help me. If you need more details about the error, please let me know.
I look forward to your comment.
Thank you for your consideration.
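
One likely cause, sketched below under assumptions: the Python prototype indexes the matrix as mat[t, blankIdx], which yields a scalar only for a 2D TxC matrix of a single sample. With a TxBxC array the same expression yields a vector, the beam probabilities become arrays, and sorted() can no longer compare them. The helper name iter_batch_elements is hypothetical; for a PyTorch tensor one would first convert with mat.detach().cpu().numpy().

```python
import numpy as np

def iter_batch_elements(mat):
    """Yield one TxC matrix per batch element of a TxBxC array."""
    mat = np.asarray(mat)
    if mat.ndim != 3:
        raise ValueError("expected a TxBxC array, got shape %s" % (mat.shape,))
    for b in range(mat.shape[1]):
        yield mat[:, b, :]

# Shape taken from the issue text: T=371, B=32, C=29 (values are random here).
batch = np.random.rand(371, 32, 29)
blank_idx = 28

# On the 3D array, the prototype's indexing returns a vector, not a probability.
print(batch[0, blank_idx].shape)  # (29,)

# After splitting, each sample is TxC and the same indexing returns a scalar.
samples = list(iter_batch_elements(batch))
print(samples[0].shape)  # (371, 29)
print(np.ndim(samples[0][0, blank_idx]))  # 0
```

Each yielded matrix can then be decoded on its own, one call per batch element.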

Parallel mode doesn't work for NGramsForecast or NGramsForecastAndSample

I get either a segmentation fault or a corruption error. It works well in Words mode and NGrams mode. Thank you.

Unable to compile TFWordBeamSearch.so on Mac OS (Catalina)

Hi Harald,

I am unable to compile TFWordBeamSearch.so (custom TF op) on macOS Catalina 10.15.2. I get the error below:

ld: library not found for -l:libtensorflow_framework.1.dylib
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Python version: 3.6.9
TF Version: 1.14.0

Any idea what could be wrong here? I tried adding the additional macOS flag -undefined dynamic_lookup to the g++ command that builds the .so file. Please help.

I also tried setting LIBRARY_PATH, CPPFLAGS, LDFLAGS and LD_LIBRARY_PATH, but no luck yet. I look forward to your answer. Thanks and have a good day. Stay safe.
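
A possible workaround, offered as a sketch rather than a confirmed fix: the -l:filename form in the error is a GNU ld convention, and the assumption here is that Apple's linker does not accept it, while the plain -lname form resolves to the same libname.dylib. The helper macos_link_flags below is hypothetical; it only rewrites a list of flag strings, such as the list returned by tf.sysconfig.get_link_flags().

```python
def macos_link_flags(flags):
    """Rewrite GNU-style '-l:libNAME.dylib' flags into '-lNAME'; keep others as-is."""
    prefix, suffix = "-l:lib", ".dylib"
    fixed = []
    for flag in flags:
        if flag.startswith(prefix) and flag.endswith(suffix):
            fixed.append("-l" + flag[len(prefix):-len(suffix)])
        else:
            fixed.append(flag)
    return fixed

# Example with the flag from the error message; the -L path is a placeholder.
flags = ["-L/path/to/tensorflow", "-l:libtensorflow_framework.1.dylib"]
print(" ".join(macos_link_flags(flags)))
# In a real build the input list would come from:
#   import tensorflow as tf; flags = tf.sysconfig.get_link_flags()
```

The rewritten flags would replace the original link flags in the g++ command; the -L directory still has to contain the TensorFlow framework library.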

undefined symbol

If you create a new issue, please provide the following information:

Which program causes the problem

Custom TF operation
C++ test program
Python prototype

Versions

TensorFlow version : 2.2.0
Python version : 3.7
C++ compiler
Operating system : Ubuntu 18.04

Issue

tensorflow.python.framework.errors_impl.NotFoundError: CTCWordBeamSearch/cpp/proj/TFWordBeamSearch.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs

Precondition might be too strong

Which program causes the problem

The C++ code of TFWordBeamSearch

I'm assuming the following:

The charList is generated by the RNN and contains all detected characters, including special characters like ,!#. and so on.
The wordCharList is given by the user and denotes all characters that actually occur in words.

The following condition is set in TFWordBeamSearch.cpp, lines 108-111:

if(!(numWordChars > 0 && numWordChars < m_numChars))
{
    throw std::invalid_argument("wordChars must contain at least one character and at least one character less than chars: 0<len(wordChars)<len(chars)");
}

It seems to me that numWordChars < m_numChars is enforced because m_numChars also counts non-word characters. However, the dataset that we use contains [a-z] only, and so does the wordChars list ([a-z]). As such, it seems this precondition is too strong and should become:

if(!(numWordChars > 0 && numWordChars <= m_numChars))

Please correct me if I'm wrong.
Furthermore, great work on this project. Really top notch.
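
To make the difference concrete, here is a small sketch that mirrors the two conditions in Python. The function names check_strict and check_relaxed are hypothetical, and the sketch only compares the checks themselves; it does not show whether the decoder behaves correctly when no non-word character exists.

```python
def check_strict(num_word_chars, num_chars):
    """Condition quoted from TFWordBeamSearch.cpp: 0 < len(wordChars) < len(chars)."""
    return 0 < num_word_chars < num_chars

def check_relaxed(num_word_chars, num_chars):
    """Relaxation proposed in this issue: 0 < len(wordChars) <= len(chars)."""
    return 0 < num_word_chars <= num_chars

# Dataset from the issue: chars and wordChars are both [a-z], so 26 each.
a_to_z = 26
print(check_strict(a_to_z, a_to_z))   # False: the op would throw invalid_argument
print(check_relaxed(a_to_z, a_to_z))  # True: the proposed check would accept it
```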

NGramsForecastAndSample and NGramsForecast unusable with Vietnamese

First, thanks for the great work, Harald. I'm trying to apply your Word Beam Search algorithm to the Vietnamese language. I modified the charList, wordCharList and the corpus for my needs. It has been working well in Words and NGrams mode. But when I tried to use it in either NGramsForecast or NGramsForecastAndSample mode, it raised this error:

Python(23489,0x70000ee70000) malloc: *** error for object 0x7fa625811e00: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Python(23489,0x70000ef76000) malloc: *** error for object 0x7fa62714f940: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
zsh: abort python3 main.py --validate --wordbeamsearch

I have no idea how to resolve this. I'm using TensorFlow 1.7.0 on macOS 10.13.5, Ubuntu 16.04 and 18.04.

BTW, you can get a sense of the Vietnamese language and its structure from this wiki page: https://en.wikipedia.org/wiki/Vietnamese_language

So I really need your help here. What have I missed? What can I do to resolve this problem?

Thank you.

optimizations for ARM

Has anyone tried this algorithm on the ARM architecture? On an ARM processor it takes very long (around 7 s) for an input of dimension 700*80 with a beam width of 100, which is about 5 times slower than on x86 (1.4 s) with the same hyperparameters. Is there any optimization we can do to reduce the execution time on ARM, ideally down to the x86 level?

Generating the mat_x.csv

Hi Team,
Is there a way to generate mat_x.csv ourselves? I tried to do this by reading the documentation, but I was unable to find the T and B values. Is there a program that generates the file? Or can you document the procedure for end-to-end testing of the application?
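
A rough sketch of one way such files could be written, assuming the network output is a nested list of shape T x B x C (T time-steps, B batch elements, C characters including the blank) and that each mat_x.csv holds the T x C matrix of one batch element. The helper name batch_to_csv_texts and the ";" delimiter are assumptions; whether softmax should be applied before writing should be checked against the repository documentation.

```python
import csv
import io


def batch_to_csv_texts(output, delimiter=";"):
    """Split a T x B x C nested list into B CSV strings of T lines each."""
    num_steps = len(output)      # T: number of time-steps
    batch_size = len(output[0])  # B: number of batch elements
    texts = []
    for b in range(batch_size):
        buf = io.StringIO()
        writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
        for t in range(num_steps):
            writer.writerow(output[t][b])  # C values for time-step t
        texts.append(buf.getvalue())
    return texts


# Toy output with T=2, B=2, C=3 (hypothetical values).
toy = [
    [[0.1, 0.2, 0.7], [0.3, 0.3, 0.4]],
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
]
for i, text in enumerate(batch_to_csv_texts(toy)):
    print(f"mat_{i}.csv")
    print(text)
```

Each returned string can then be written to its own file, one file per batch element.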

While generating the mat.csv file, I'm getting the following error:

File "main.py", line 33, in <module>
res=wordBeamSearch(data.mat, 10, loader.lm, useNGrams)
File "C:\Users\My Pc\Desktop\M.Tech Thesis Project\code\CTCWordBeamSearch-master\py\WordBeamSearch.py", line 33, in wordBeamSearch
prBlank=beam.getPrTotal()*mat[t, blankIdx]
IndexError: index 93 is out of bounds for axis 1 with size 80

Originally posted by @rushaligupta in #3 (comment)
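
The index in the traceback (93) is larger than the number of columns in the matrix (80), which suggests that the matrix was produced for a different character set than the one given to the decoder. Below is a minimal sketch of a sanity check, assuming the matrix needs one column per character plus one extra column for the CTC blank (an assumption to verify against the repository documentation). check_num_columns is a hypothetical helper, not part of the prototype.

```python
def check_num_columns(mat, chars):
    """Raise ValueError unless every row has len(chars) + 1 entries."""
    expected = len(chars) + 1  # assumption: one extra column for the CTC blank
    for t, row in enumerate(mat):
        if len(row) != expected:
            raise ValueError(
                f"time-step {t} has {len(row)} columns, expected {expected} "
                f"({len(chars)} characters plus blank)"
            )


chars = "ab"             # hypothetical 2-character alphabet
mat = [[0.1, 0.2, 0.7],  # T=2 time-steps, C=3 columns
       [0.6, 0.3, 0.1]]
check_num_columns(mat, chars)  # passes silently

try:
    check_num_columns(mat, "abc")  # 3 characters would need 4 columns
except ValueError as err:
    print(err)
```

Running such a check right after loading the matrix and the character list gives a clearer message than the IndexError raised inside the beam search.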

Replacing multiplication with addition can prevent underflow for decimals

Python prototype, version is 3.6

This is similar to subtracting a max value to prevent overflow of exp(y): it is not a logic problem, just a trick to prevent underflow.

The probability of each word is a decimal and the path may be long, so the probability of a sequence is the product of many decimals, for example p(seq) = 0.1 * 0.1 * ... * 0.1. Multiplying like this is inadvisable because it can underflow. I want to fix it and wonder whether you have tried anything to prevent this.

A common approach is to use logarithms, since log(a*b) = log(a) + log(b). Such code can be found in ctc_beam_search.h:

logsumexp = Eigen::numext::log(logsumexp);
float norm_offset = max_coeff + logsumexp;

And log(a+b) can be found in ctc_loss_util.h:

// Add logarithmic probabilities using:
// ln(a + b) = ln(a) + ln(1 + exp(ln(b) - ln(a)))
// The two inputs are assumed to be log probabilities.
// (GravesTh) Eq. 7.18
inline float LogSumExp(float log_prob_1, float log_prob_2) {
  // Always have 'b' be the smaller number to avoid the exponential from
  // blowing up.
  if (log_prob_1 == kLogZero) {
    return log_prob_2;
  } else if (log_prob_2 == kLogZero) {
    return log_prob_1;
  } else {
    return (log_prob_1 > log_prob_2)
               ? log_prob_1 + log1pf(expf(log_prob_2 - log_prob_1))
               : log_prob_2 + log1pf(expf(log_prob_1 - log_prob_2));
  }
}
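
For the Python prototype, the same idea could look roughly like the sketch below. It is a standalone illustration with hypothetical names (LOG_ZERO, log_sum_exp), not code from the repository: products of probabilities become sums of logs, and sums of probabilities go through a two-argument log-sum-exp as in the C++ function above.

```python
import math

LOG_ZERO = float("-inf")  # log of probability 0


def log_sum_exp(log_a: float, log_b: float) -> float:
    """Return log(exp(log_a) + exp(log_b)) without leaving log space."""
    if log_a == LOG_ZERO:
        return log_b
    if log_b == LOG_ZERO:
        return log_a
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    # exp(lo - hi) <= 1, so the exponential cannot blow up.
    return hi + math.log1p(math.exp(lo - hi))


# Multiplying 400 probabilities of 0.1 underflows in double precision ...
product = 1.0
for _ in range(400):
    product *= 0.1

# ... while the sum of their logs stays representable.
log_product = sum(math.log(0.1) for _ in range(400))

print(product)
print(log_product)
```

Beam scores kept this way can still be compared directly, because the logarithm is monotonic.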

Unable to create TFWordBeamSearch.so file in windows

Error while running the buildTF.sh file on Windows using Git Bash.
Error message:

../src/TFWordBeamSearch.cpp:1:10: fatal error: tensorflow/core/framework/op.h: No such file or directory
#include <tensorflow/core/framework/op.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

g++ version:
g++ (MinGW.org GCC-8.2.0-3) 8.2.0

Operating System: Windows 10.

I searched the system for the header file and it is present.
I added its path to the environment variables, but the error persists.

Error while running buildTF.sh

I'm getting the following error while running the buildTF.sh file:

buildTF.sh: [: ==: binary operator expected
Single-threaded decoding
Your TF version is 1.3.0
TF versions 1.3.0, 1.4.0, 1.5.0 and 1.6.0 are tested
buildTF.sh: syntax error near unexpected token `=('
buildTF.sh: buildTF.sh: line 52: `TF_CFLAGS =( $( python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))' ))'

The system specifications are as follows:

OS- windows 8.1
Python: 3.5 version
Tensorflow: 1.3 version
gcc version 6.3.0 (MinGW.org GCC-6.3.0-1)

As a result, I am not able to get the 'TFWordBeamSearch.so' file.

Please help me out! Thanks in advance.

Prediction Score

Is there a way to get a score for how probable it is that the decoded text is correct, given the CTC input? I'm guessing it would be something like the accumulated probability of the path followed to produce the output word. Is there a way to retrieve something like this?
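
One possible score, sketched here independently of the decoder, is the CTC probability of the decoded text: the sum over all alignments that collapse to that text, computed with the CTC forward algorithm. This is only an illustration under stated assumptions: mat is a T x C nested list of per-time-step probabilities, labels are character indices, and the names ctc_log_prob, safe_log and log_add are hypothetical. It does not include any language model term.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf."""
    return math.log(p) if p > 0 else NEG_INF


def log_add(a, b):
    """Return log(exp(a) + exp(b)) for log-space values."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def ctc_log_prob(mat, labels, blank):
    """Log probability of `labels` under CTC for a T x C probability matrix."""
    # Extended labelling: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    # alpha[s]: log probability of all alignments that end in ext[s]
    alpha = [NEG_INF] * len(ext)
    alpha[0] = safe_log(mat[0][blank])
    if len(ext) > 1:
        alpha[1] = safe_log(mat[0][ext[1]])
    for t in range(1, len(mat)):
        new = [NEG_INF] * len(ext)
        for s in range(len(ext)):
            total = alpha[s]  # stay on the same symbol
            if s >= 1:
                total = log_add(total, alpha[s - 1])  # advance by one
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total = log_add(total, alpha[s - 2])  # skip a blank
            new[s] = total + safe_log(mat[t][ext[s]])
        alpha = new
    result = alpha[-1]  # alignment ends in the final blank ...
    if len(ext) > 1:
        result = log_add(result, alpha[-2])  # ... or in the last label
    return result


# Toy matrix: T=2 time-steps, column 0 is "a", column 1 is the blank.
toy = [[0.6, 0.4],
       [0.3, 0.7]]
print(math.exp(ctc_log_prob(toy, [0], blank=1)))  # score of "a"
print(math.exp(ctc_log_prob(toy, [], blank=1)))   # score of the empty text
```

In the toy example the two scores add up to 1, since "a" and the empty text are the only labellings a one-character alphabet can produce in two time-steps.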

How to set decode sequence length

Custom TF operation
TensorFlow version: 1.12.0
Operating system: Ubuntu

How can I set the sequence length parameter, as in the TensorFlow API? Thanks!

Compile custom TF operation

Follow these instructions to integrate word beam search decoding:

1. Clone repository CTCWordBeamSearch.
2. Compile custom TF operation (follow instructions given in README).
3. Copy binary TFWordBeamSearch.so from the CTCWordBeamSearch repository to the src/ directory of the SimpleHTR repository.

Can you please explain what step 2 involves?
I don't understand it, and I want to generate the .so file.

Question about decoder output.

If you create a new issue, please provide the following information:

Which program causes the problem

NumPy operation (Python package)

Versions

Pytorch version 1.6.0
Python version 3.7.7
Operating system ubuntu 18.04

Issue
Hi, I'm trying to adapt your decoder to a speech recognition project.
After doing some experiments, I have a question about the decoder output.

My input data (speech -> spectrogram) is fed into the model (CNN+RNN), which produces an output of shape [sequence length (T) x batch size (B) x number of characters (C)], e.g. (371, 32, 29).
This output is then passed to the function "def testPyBind(feedMat, corpus, chars, wordChars)".
My corpus.txt consists of the transcriptions of the validation and test sets. It has 146,709 sentences.
My chars.txt has 28 characters: ' ABCDEFGHIJKLMNOPQRSTUVWXYZ
My wordchars.txt has 27 characters: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ (the space is removed from chars.txt)
The decoder then returns a list of sentences, one per batch element, e.g. 32 sentences.

My first question: each output sentence from the decoder is 100% identical to a sentence in the corpus.
I expected some character errors in the output, because it is a prediction produced by the model and the decoder.
I don't understand why the decoder only produces sentences that are in the corpus.

My second question: how should I pair each predicted sentence (output) with its target sentence to calculate WER and CER?
I calculated CER and WER, but the sentences do not match at all.
Here is my result.

Epoch: 0/100 | Train loss: 6.999162
Epoch: 0/100 | Val loss: 2.889192
Average CER: 767.66% | Average WER: 1543.43%

Epoch: 5/100 | Train loss: 4.038735
Epoch: 5/100 | Val loss: 1.960615
Average CER: 734.83% | Average WER: 1154.09%

Epoch: 10/100 | Train loss: 3.314095
Epoch: 10/100 | Val loss: 1.570683
Average CER: 747.28% | Average WER: 1155.41%
....

When I use the greedy search algorithm, I get results like this.

Epoch: 0/100 | Train loss: 2.138510
Epoch: 0/100 | Val loss: 2.078584
Average CER: 65.45% | Average WER: 99.07%

Epoch: 5/100 | Train loss: 1.150841
Epoch: 5/100 | Val loss: 1.037751
Average CER: 32.14% | Average WER: 79.66%

Epoch: 10/100 | Train loss: 0.916023
Epoch: 10/100 | Val loss: 0.855249
Average CER: 26.44% | Average WER: 71.63%
...

Please reply to my questions.
Thank you for your consideration.
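
For the second question, a minimal sketch of how CER and WER are commonly computed, assuming the i-th decoded sentence belongs to the i-th target of the same batch. The helper names edit_distance and error_rates are hypothetical. Note that both rates can exceed 100% when the predictions are much longer than the targets or the pairs are misaligned, because the edit distance is divided by the target length.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def error_rates(targets, predictions):
    """Return (CER, WER) accumulated over index-aligned sentence pairs."""
    char_errs = char_total = word_errs = word_total = 0
    for ref, hyp in zip(targets, predictions):
        char_errs += edit_distance(ref, hyp)
        char_total += len(ref)
        word_errs += edit_distance(ref.split(), hyp.split())
        word_total += len(ref.split())
    return char_errs / char_total, word_errs / word_total


# Hypothetical batch of two sentences.
targets = ["HELLO WORLD", "GOOD DAY"]
predictions = ["HELLO WORD", "GOOD DAY"]
cer, wer = error_rates(targets, predictions)
print(f"CER: {cer:.2%} | WER: {wer:.2%}")
```

Accumulating errors and lengths over the whole set before dividing avoids giving short sentences the same weight as long ones.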

GPU support

I tried to compile it with tensorflow-gpu version 1.3.0, but it gives this error: ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory.

System: Ubuntu 16.04, GCC 5.4, TensorFlow 1.3 GPU, CUDA 8.0, cuDNN 5.1.

Please let me know where I am going wrong.

wrong output with Arabic language

I'm using a modified Keras OCR example; the prediction code is here

and this is how I save the .csv file
When I try to run your main.py, I get wrong output, and I really don't know what the problem is. This is the data I used to test the code:
test.zip
Note: when I open the text files in your data [bentham] and save them as UTF-8, the output changes slightly:
Decoding 3 samples now.
Sample: 1
Filenames: ../data/bentham/mat_0.csv|../data/bentham/gt_0.txt
Result: "an"
Ground Truth: "brain."
Editdistance: 5
Accumulated CER and WER so far: CER: 0.7142857142857143 WER: 1.0
Sample: 2
Filenames: ../data/bentham/mat_1.csv|../data/bentham/gt_1.txt
Result: "supposed"
Ground Truth: "supposed"
Editdistance: 1
Accumulated CER and WER so far: CER: 0.375 WER: 1.0
Sample: 3
Filenames: ../data/bentham/mat_2.csv|../data/bentham/gt_2.txt
Result: "submitt,both,mental,and.brain"brain"
Ground Truth: "submitt, both mental and corporeal, is far beyond
Editdistance: 32
Accumulated CER and WER so far: CER: 0.5066666666666667 WER: 0.75

What does 'check length of chars and wordChars: 0<len(wordChars)<=len(chars)' mean?

If you create a new issue, please provide the following information:

Which program causes the problem

Custom TF operation

Versions

TensorFlow version: Version: 1.15.0
Python version: 3.7
C++ compiler: gcc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
Operating system: Linux

Issue
I couldn't understand this error message in your code. I got this error during evaluation of the 1st epoch of training.

Validate NN
terminate called after throwing an instance of 'std::invalid_argument'
  what():  check length of chars and wordChars: 0<len(wordChars)<=len(chars)


My task is to recognize handwritten dates such as "19/10/1993", which contain only slashes and digits. Does the error result from this?

How can I solve the issue?

How to create TFWordBeamSearch.so using my data

Hi Harald,
How do I prepare my data, and how can I create TFWordBeamSearch.so?
Thank you,
Long
