githubharald / ctcwordbeamsearch
Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
Home Page: https://towardsdatascience.com/b051d28f3d2e
License: MIT License
If you create a new issue, please provide the following information:
I'm trying to embed CTCWordBeamSearch into SimpleHTR.
In order to build the TFWordBeamSearch.so file for my custom data, we need the mat_x.csv and gt_.txt files. How do I generate these files for my data? Can I use the same TFWordBeamSearch.so file that was built on the IAM dataset in the repository? Please advise.
Hi,
I am following the tutorial and want to create my own beam search for a specific set of characters (e.g. dates) with SimpleHTR. Can we use a custom output dictionary instead of the built-in beam search output dictionary?
Is there any update on issue 720?
I have a slightly different use case: I want the top-N results from beam search, not just the best one.
I was trying to modify the word_beam_search C++ implementation, as I am using the NumPy (Python) implementation.
Can you please help me with the changes needed to get the top N (N = beam width) results when I call wbs.compute()?
Hi @githubharald, thanks for your project. I have a question about the matrix fed into the TF session. I am training a CRNN+CTC model. Take, for example, an image that shows the text "x181208022". Before the CTC layer I have the RNN output; with greedy decoding I get the path "--x-11-8-1-2-0-8--0-2-2---", where "-" is the CTC blank. If I want to use your project, should I just feed the RNN output matrix into the word beam search part?
I ask because I saw this in your testing code:
    blank = len(chars)
    s = ''
    batch = 0
    for label in res[batch]:
        if label == blank:
            break
        s += chars[label]
The for loop breaks as soon as it meets a CTC blank. But in my case a CTC blank is not the end of a word, so breaking there would give the wrong result.
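The difference between the two situations in this question can be sketched as follows. A raw CTC path (one symbol per time-step) is collapsed by merging repeats and dropping blanks, whereas the test loop quoted above reads a label sequence in which the blank appears to be used as trailing padding. The second reading is an assumption drawn from the loop itself, and the function names below are made up for illustration.

```python
def best_path_decode(path, blank="-"):
    """Collapse a raw CTC path: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)


def labels_to_text(labels, chars):
    """Read an already decoded label sequence that is padded with the blank index.

    Assumption: the blank index is len(chars) and only occurs as padding at the end.
    """
    blank = len(chars)
    text = ""
    for label in labels:
        if label == blank:
            break
        text += chars[label]
    return text


# The greedy path from the question collapses to the expected text.
print(best_path_decode("--x-11-8-1-2-0-8--0-2-2---"))  # x181208022

# A padded label sequence: stopping at the first blank only removes padding.
print(labels_to_text([1, 0, 2, 2], "ab"))  # ba
```

In the first case stopping at a blank would indeed truncate the text; in the second it only strips padding.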
Hi @githubharald, thanks for your project!
This is about the Python version.
When I set the dataset to "bentham", I found that the last word of one sentence is linked with the first word of the next sentence, and the pair is counted as a bigram.
The corpus.txt of "bentham" is:
brain. supposed submitt both mental and corporeal,
the fake friend of the family like the
is far beyond any idea
and the "bigrams" dictionary shows:
'the':{'is':2.0, 'family':2.0,'fake':2.0}
The "the" at the end of the second sentence is linked with the "is" at the start of the third sentence.
I think this may affect the OCR results.
I think the correct bigrams should be: 'the':{'family':1.0,'fake':1.0}
What do you think about this problem?
Is this a bug?
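The effect described in this issue can be reproduced with a small counter. This is only a sketch under simplifying assumptions: words are taken to be runs of lowercase letters, counts are raw (no smoothing or initial counts, so the values differ from the 2.0 shown above), and `count_bigrams` is a hypothetical helper, not the repository's LanguageModel code.

```python
import re
from collections import defaultdict


def count_bigrams(text, per_line=True):
    """Count word bigrams, optionally without crossing line boundaries."""
    word_re = re.compile(r"[a-z]+")  # simplified word definition (assumption)
    counts = defaultdict(lambda: defaultdict(float))
    chunks = text.splitlines() if per_line else [text]
    for chunk in chunks:
        words = word_re.findall(chunk.lower())
        for first, second in zip(words, words[1:]):
            counts[first][second] += 1.0
    return counts


corpus = (
    "brain. supposed submitt both mental and corporeal,\n"
    "the fake friend of the family like the\n"
    "is far beyond any idea\n"
)

whole_text = count_bigrams(corpus, per_line=False)
line_by_line = count_bigrams(corpus, per_line=True)

print(sorted(whole_text["the"]))    # ['fake', 'family', 'is']
print(sorted(line_by_line["the"]))  # ['fake', 'family']
```

Counting over the whole text yields the cross-line pair ('the', 'is'); counting line by line yields only the pairs the poster expects.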
This is a continuation of a question on Stack Overflow. The question was: why should word-level LM integration work if it decreases the probability of valid prefixes while not scoring prefixes that haven't been decoded into words yet? I checked my dataset and it has an OOV rate of about 8.6%. I also calculated the cross-entropy of a bigram model trained on the training set against the test set. The results are as follows.
vocab size: 39105
oov rate: 0.08589909443725743
cross_entropy_train_test: 6.5233285695619125
train_entropy: 6.483290547147171
test_entropy: 6.314921857881477
Unfortunately, I cannot share the distributions from the acoustic model and the corresponding text, since the text is proprietary :/ I do not expect a direct solution, but rather a discussion about why word-level beam search should work. I am now running experiments with a char-level LM and the results seem quite promising. I use the following implementation of beam search.
def log_beam_search(ctc, alphabet, blank_idx, beam_width, lm=False, char_lm=False, alpha=0.3, beta=5,
                    prune=0, prefix_tree=False, end_symbol='>'):
    F = ctc.shape[1]
    ctc = np.vstack((np.zeros(F), ctc))
    T = ctc.shape[0]
    Pb = defaultdict(lambda: defaultdict(lambda : NEG_INF))
    Pnb = defaultdict(lambda: defaultdict(lambda : NEG_INF))
    Pb[0][''] = 0
    Pnb[0][''] = NEG_INF
    A_prev = ['']
    for t in range(1, T):
        if prune:
            pruned_alphabet = [alphabet[i] for i in np.where(ctc[t] > prune)[0]]
        else:
            pruned_alphabet = alphabet
        for l in A_prev:
            if len(l) > 0 and l[-1] == end_symbol:
                Pb[t][l] = Pb[t - 1][l]
                Pnb[t][l] = Pnb[t - 1][l]
                continue
            if prefix_tree:
                if len(l) > 0 and l[-1] != ' ':
                    pruned_alphabet = prefix_tree(l.split()[-1])
            for c in pruned_alphabet:
                c_idx = alphabet.index(c)  # todo: use a dict to get O(1) instead of O(n)
                if c_idx == blank_idx:
                    Pb[t][l] = logsumexp(
                        Pb[t][l],
                        ctc[t][blank_idx] + Pb[t - 1][l],
                        ctc[t][blank_idx] + Pnb[t - 1][l]
                    )
                else:
                    l_plus = l + c
                    if len(l) > 0 and c == l[-1]:
                        if char_lm:
                            ch = alpha * char_lm(l_plus)
                        else: ch = 0
                        Pnb[t][l_plus] = logsumexp(
                            Pnb[t][l_plus],
                            ctc[t][c_idx] + Pb[t - 1][l] + ch
                        )
                        Pnb[t][l] = logsumexp(
                            Pnb[t][l],
                            ctc[t][c_idx] + Pnb[t - 1][l]
                        )
                    elif len(l.replace(' ', '')) > 0 and c in (' ', end_symbol):
                        lm_prob = 0 if not lm else alpha * lm(l_plus.strip(' >'))
                        Pnb[t][l_plus] = logsumexp(
                            Pnb[t][l_plus],
                            lm_prob + ctc[t][c_idx] + Pb[t - 1][l],
                            lm_prob + ctc[t][c_idx] + Pnb[t - 1][l]
                        )
                    else:
                        if char_lm:
                            ch = alpha * char_lm(l_plus)
                        else: ch = 0
                        Pnb[t][l_plus] = logsumexp(
                            Pnb[t][l_plus],
                            ctc[t][c_idx] + Pb[t - 1][l] + ch,
                            ctc[t][c_idx] + Pnb[t - 1][l] + ch
                        )
                    # Make use of discarded prefixes
                    if l_plus not in A_prev:
                        Pb[t][l_plus] = logsumexp(
                            Pb[t][l_plus],
                            ctc[t][-1] + Pb[t - 1][l_plus],
                            ctc[t][-1] + Pnb[t - 1][l_plus]
                        )
                        Pnb[t][l_plus] = logsumexp(
                            Pnb[t][l_plus],
                            ctc[t][c_idx] + Pnb[t - 1][l_plus]
                        )
        A_next = {
            x: logsumexp(
                Pnb[t].get(x, NEG_INF),
                Pb[t].get(x, NEG_INF)
            )
            for x in set(Pb[t]).union(Pnb[t])
        }
        # word insertion bonus rescoring - do not use word bonus if no LM used!
        bonus = 0 if not lm else beta * math.log(len(words(l)) + 1)
        sorter = lambda l: A_next[l] + bonus
        A_prev = sorted(A_next, key=sorter, reverse=True)[:beam_width]
        del Pnb[t-1], Pb[t-1]
    return A_prev
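The snippet above calls `logsumexp` with two or three arguments and uses a `NEG_INF` constant, neither of which is shown in the post. A minimal sketch of what such helpers could look like is given below; it is an assumption about the poster's helpers, not their actual code.

```python
import math

NEG_INF = -float("inf")  # log of probability zero (assumed definition)


def logsumexp(*args):
    """Numerically stable log(sum(exp(a) for a in args)).

    Accepts any number of log-probabilities, including NEG_INF.
    """
    a_max = max(args)
    if a_max == NEG_INF:
        # Every term has probability zero, so the sum does as well.
        return NEG_INF
    return a_max + math.log(sum(math.exp(a - a_max) for a in args))


# Two paths with probability 0.5 each sum to probability 1, i.e. log 0.
print(logsumexp(math.log(0.5), math.log(0.5)))
```

Subtracting the maximum before exponentiating keeps the largest term at exp(0) = 1, which is what prevents underflow for very negative inputs.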
I compiled TFWordBeamSearch.so and successfully incorporated it with wordbeamsearch as the decoderType in the SimpleHTR model (using TensorFlow 1.3 for both projects).
My use case is to recognize 6-digit strings that come from a restricted set of 10 possibilities (they are aircraft serial numbers). So to conduct wordbeamsearch, in the SimpleHTR project I modified
data/corpus.txt to contain only the 10 actual serial numbers
model/wordCharList.txt to contain only the digits 0 through 9
I have a test set of 10 images of these 6-digit serial strings. When running the CRNN with the default best-path decoding, I naturally get a lot of the digits mislabeled as letters. I expected that wordbeamsearch decoding would improve the result, but now all the images are labeled by the CRNN as "." with probability 0.50333.
I am trying to understand whether this is related to my "words" consisting entirely of digits.
Attached is an example panel showing a few of the input images and the decoded label that results from the default best path search.
Hi Mr. Scheidl,
I have just come across your README and I have a question. In the "A First Example" section, you mention that the matrix fed in has to have shape (TxBxC). What does the term B mean?
In most articles I have read about word beam search, B refers to the beam width, but in your README I think it is not the beam width.
Could you explain it to me? Thank you very much.
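Judging from the test snippets quoted on this page (`res[batch]`, and the assertion `len(chars) + 1 == mat.shape[2]`), B looks like the batch dimension, with T the number of time-steps and C the number of characters plus one for the CTC blank. The sketch below builds a random matrix under that reading; the character set and sizes are made up for illustration.

```python
import numpy as np

chars = "ab "        # hypothetical character set
T, B = 5, 2          # 5 time-steps, batch of 2 samples (assumed meaning of B)
C = len(chars) + 1   # one extra entry for the CTC blank

# Random scores turned into per-time-step probability distributions (softmax over C).
rng = np.random.default_rng(0)
logits = rng.normal(size=(T, B, C))
mat = np.exp(logits) / np.exp(logits).sum(axis=2, keepdims=True)

print(mat.shape)  # (5, 2, 4)
assert len(chars) + 1 == mat.shape[2]
```

Under this reading, the beam width is a separate decoder parameter and does not appear in the matrix shape at all.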
I am not sure if this is just an issue with a build for python2 version of tensorflow, or there are larger matters at play, but I wanted to report a workaround for undefined symbol errors in the default build process and the test of the custom op.
My platform:
Python 2.7
Ubuntu 18.04
Tensorflow 1.10.1
g++ 6
Of course, to build with Python 2.7, I had to change buildTF.sh so that it invoked python rather than python3. The compile succeeds, but with several warnings:
<command-line>:0:0: warning: "__GLIBCXX_USE_CXX11_ABI" redefined
This was the first hint something might be awry.
Upon executing tf.load_op_library in the testCustomOp.py script, I then received the following error:
Traceback (most recent call last):
File "testCustomOp.py", line 85, in <module>
testMiniExample()
File "testCustomOp.py", line 62, in testMiniExample
res=testCustomOp(mat, corpus, chars, wordChars)
File "testCustomOp.py", line 16, in testCustomOp
word_beam_search_module = tf.load_op_library('../cpp/proj/TFWordBeamSearch.so')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: ../cpp/proj/TFWordBeamSearch.so: undefined symbol: _ZN10tensorflow11GetNodeAttrERKNS_9AttrSliceENS_11StringPieceEPSs
When I remove the errant command-line define -D_GLIBCXX_USE_CXX11_ABI=0 from the compile lines, the build succeeds without warning and the test runs as expected.
I'm not sure why the define was there in the first place, but I thought I'd add a report. My tactic involved pulling out the define with an in-script environment variable. If this is a wider issue for others, perhaps such a configuration would make it easy to manually change whether it is defined or not, based on the specific platform's requirements. (I can submit a PR if you wish).
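One way to avoid hard-coding the define is to ask the installed TensorFlow which ABI setting it was built with. The sketch below assumes `tf.sysconfig.get_compile_flags()` is available in the installed version (it is in the 1.x releases I am aware of from about 1.4 onwards); `abi_define` is a hypothetical helper, and the flag list in the usage lines is invented for illustration.

```python
def abi_define(compile_flags):
    """Return the -D_GLIBCXX_USE_CXX11_ABI=... entry from a list of flags, or None."""
    for flag in compile_flags:
        if flag.startswith("-D_GLIBCXX_USE_CXX11_ABI="):
            return flag
    return None


def installed_tf_abi_define():
    """Look up the define used by the installed TensorFlow (needs TensorFlow)."""
    import tensorflow as tf  # imported lazily so abi_define works without it
    return abi_define(tf.sysconfig.get_compile_flags())


# Usage with an invented flag list:
example_flags = ["-I/path/to/tensorflow/include", "-D_GLIBCXX_USE_CXX11_ABI=1"]
print(abi_define(example_flags))
```

A build script could then pass the returned define to the compiler, or omit it when `None` is returned, instead of fixing the value to 0.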
Thanks for developing and sharing this. Congratulations on the best paper award as well (that is how I found out about your repo). I am looking forward to some fruitful experiments blending this with my own CTC-based OCR.
If you create a new issue, please provide the following information:
Which result/error did you get?
AttributeError: module 'tensorflow' has no attribute 'Session'
If you think the result is wrong - what result did you expect instead?
Mini example:
Label string: [1 0 3]
Char string: "ba"
# ... remainder of example omitted...
How to reproduce the issue?
Just follow the setup documentation with the latest versions of TensorFlow, Python, and Pip. It doesn't work.
Provide all necessary data, at least these files: chars.txt, wordChars.txt, corpus.txt, gt_X.txt, mat_X.csv
N/A
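The AttributeError above is what TensorFlow 2.x raises for `tf.Session`, since the class lives under `tf.compat.v1` there. A small sketch of picking whichever is available follows; `resolve_session_class` is a hypothetical helper, and the stand-in objects exist only so the example runs without TensorFlow. Whether the rest of the test script and the compiled op work under TensorFlow 2.x is not something this sketch establishes.

```python
from types import SimpleNamespace


def resolve_session_class(tf_module):
    """Return tf.Session if it exists (TF 1.x), else tf.compat.v1.Session."""
    session = getattr(tf_module, "Session", None)
    if session is not None:
        return session
    return tf_module.compat.v1.Session


# Stand-ins for the two module layouts (hypothetical, no TensorFlow needed).
fake_tf1 = SimpleNamespace(Session="v1-session")
fake_tf2 = SimpleNamespace(
    compat=SimpleNamespace(v1=SimpleNamespace(Session="compat-session"))
)

print(resolve_session_class(fake_tf1))  # v1-session
print(resolve_session_class(fake_tf2))  # compat-session
```

With a real installation one would call `resolve_session_class(tf)` after `import tensorflow as tf`.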
Hi Harald,
I use:
ubuntu 18.04
tensorflow 1.5.0
python 3.6.5
gcc version 7.5.0
I ran buildTF.sh and it completed.
Next I ran testCustomOp.py, but I got an error:
Mini example:
Label string: [1 0 3]
Char string: "ba"
Traceback (most recent call last):
File "testCustomOp.py", line 92, in <module>
testRealExample()
File "testCustomOp.py", line 81, in testRealExample
res = testCustomOp(mat, corpus, chars, wordChars)
File "testCustomOp.py", line 24, in testCustomOp
assert len(chars) + 1 == mat.shape[2]
AssertionError
Can you help me?
Thank you,
Long
Hi,
Thank you for such a complete repo with code examples and frameworks in Python, C++ and TensorFlow ops. Really amazing work!
If I understand correctly, this is only meant to be used during the evaluation/prediction phase, to get more meaningful predictions. I was wondering if it could also be used as a loss during training, such that paths which follow the language model are more probable than those which do not.
Or is it actually not relevant, since CTC squashes all paths that produce the label and therefore all versions are maximised?
Many thanks, again, for making your contributions public !
Compiling with buildTF.sh only works with TensorFlow 1.3.0; for other versions it gives the following error:
./buildTF.sh
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/mutex.h:31:0,
from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op.h:32,
from ../src/TFWordBeamSearch.cpp:1:
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:25:22: fatal error: nsync_cv.h: No such file or directory
compilation terminated.
Any solution?
Thank you for this awesome repo! ;)
I was wondering why log probabilities are not used. Is the beam search stable even for long sequences?
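The concern behind this question can be illustrated with plain floating-point arithmetic: a long product of small probabilities underflows to zero in double precision, while the equivalent sum of logs stays finite. The numbers below are arbitrary, and the sketch says nothing about how the repository handles this internally.

```python
import math

p = 0.1      # per-time-step probability (arbitrary illustration value)
steps = 500  # sequence length (arbitrary illustration value)

# Multiplying probabilities directly: 0.1**500 is far below the smallest double.
prob = 1.0
for _ in range(steps):
    prob *= p

# Adding log-probabilities instead keeps the value representable.
log_prob = steps * math.log(p)

print(prob)      # 0.0
print(log_prob)  # about -1151.29
```

Once two competing beams have both underflowed to 0.0 they can no longer be ranked, which is the practical reason log-space scores are often preferred for long sequences.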
If you create a new issue, please provide the following information:
pip3 freeze | grep tf
tf==1.0.0
pip2 freeze | grep tensorflow
tensorflow==1.12.0
Python 2.7.15
Python 3.7.0
Apple LLVM version 10.0.0 (clang-1000.10.44.4)
Target: x86_64-apple-darwin18.2.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
ProductName: Mac OS X
ProductVersion: 10.14.1
BuildVersion: 18B75
Single-threaded decoding
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
Your TF version is
TF versions 1.3.0, 1.4.0, 1.5.0 and 1.6.0 are tested
Compiling for TF 1.5.0 or 1.6.0 now ...
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
../src/TFWordBeamSearch.cpp:1:10: fatal error: 'tensorflow/core/framework/op.h' file not found
#include <tensorflow/core/framework/op.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Looking at the code, the pip3 tf package lacks the headers shipped with the pip2 tensorflow (which is available for Python 3.7 via a hack, see tensorflow/tensorflow#20444).
I did compile the binary by switching from python3.7 to python3.6 and python2.7.
Hi!
I made some changes to the code to be able to return the top n beams from BeamList::getBestBeams. However, I'm now getting what look like incomplete decodings, even with a dataset that was already successfully decoded with the "normal" version of TFWordBeamSearch.
For instance, with the normal version I decode a result and I get 品川区大井4-56-7, but when I try with the modified version and ask for the top 10 paths, I get this:
[[' 品川区大井4-56', ' 品川区大井456-', ' 品川区大井4-6-', ' 品川区大井4--6', ' 品川区大井 4-5', ' 品川区大井4-65', ' 品川区大井4-56', ' 品川区大井4-56', ' 品川区大井4-56', ' 品川区大井4-56']]
Nowhere in the list can I find the result I got with the original version. I'm not proficient in C++ at all, so I'm probably missing something. What could be causing the issue?
EDIT: Sometimes, when I'm mapping the decoded labels back to characters, I get an error because a decoded value is far higher than the total number of characters. I have around 4000 characters and I sometimes get a decoded value of more than 2000000.
These are the changes I made. Let me know if you'd like the whole files and how I can provide them to you.
Thank you in advance.
On WordBeamSearch.cpp
// Function definition (also modified on WordBeamSearch.hpp)
std::vector<std::vector<uint32_t>> wordBeamSearch(const IMatrix& mat, size_t nBestBeams, size_t beamWidth, const std::shared_ptr<LanguageModel>& lm, LanguageModelType lmType)
...
// Last part
std::vector<std::vector<uint32_t>> bestBeamsText;
auto bestBeams = last.getBestBeams(nBestBeams);
for (size_t t = 0; t < nBestBeams && t < bestBeams.size(); t++)
{
    // return best entry
    auto bestBeam = bestBeams[t];
    bestBeam->completeText();
    bestBeamsText.push_back(bestBeam->getText());
}
return bestBeamsText;
On TFWordBeamSearch.cpp
REGISTER_OP("WordBeamSearch")
.Input("mat: float32")
.Attr("nBestBeams: int") // Added this
.Attr("beamWidth: int")
.Attr("lmType: string")
.Attr("lmSmoothing: float")
.Attr("corpus: string")
.Attr("chars: string")
.Attr("wordChars: string")
.Output("result: int32")
...
private:
std::shared_ptr<LanguageModel> m_lm;
size_t m_beamWidth=0;
size_t m_numChars=0;
size_t m_nBestBeams=1; // Added this
LanguageModelType m_lmType=LanguageModelType::Words;
...
// read how many beams we're going to return
int64 nBestBeams = 0;
OP_REQUIRES_OK(context, context->GetAttr("nBestBeams", &nBestBeams));
m_nBestBeams = static_cast<size_t>(nBestBeams);
...
// on void Compute(OpKernelContext* context) override
...
OP_REQUIRES_OK(context, context->allocate_output(0, TensorShape({B, m_nBestBeams, T}), &outputTensor));
// Inside for(int b=0; b<B; ++b)
...
// apply decoding algorithm to batch element
const std::vector<std::vector<uint32_t>> decoded=wordBeamSearch(mat, m_nBestBeams, m_beamWidth, m_lm, m_lmType);
// write to output tensor
for (size_t n=0; n < m_nBestBeams; ++n)
{
    for(int t=0; t<T; ++t)
    {
        outputMapped(b, n, t)=t<static_cast<int>(decoded.size()) ? decoded[n][t] : blank;
    }
}
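One detail in the output loop above may be worth checking: the bounds test compares t with decoded.size(), which is the number of returned beams, while the value read is decoded[n][t]. When 10 beams are requested this would cut every output after 10 labels and could read past the end of shorter beams, which seems consistent with the truncated strings and the out-of-range values reported. That is an inference from the snippet, not a confirmed diagnosis. The intended padding logic, written as a Python sketch with a hypothetical helper name, would look like this:

```python
def pad_beams(decoded, n_best, T, blank):
    """Pad each of the n_best label sequences to length T with the blank label.

    The bounds check uses the length of the individual beam (len(labels)),
    not the number of beams (len(decoded)).
    """
    out = []
    for n in range(n_best):
        # Fewer beams than requested: treat the missing ones as empty.
        labels = decoded[n] if n < len(decoded) else []
        row = [labels[t] if t < len(labels) else blank for t in range(T)]
        out.append(row)
    return out


# Two beams returned, three requested, output length 5, blank label 9.
print(pad_beams([[1, 2, 3], [4]], n_best=3, T=5, blank=9))
```

In the C++ loop the corresponding change would be to test against the size of decoded[n] and to guard n against decoded.size().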
Hello,
Why not use the unique words to build the prefix tree?
In LanguageModel.py at line 53
self.tree.addWords(words) # add all unique words to tree
But I think 'words' contains duplicates; shouldn't it be 'uniqueWords'?
Thanks
SR
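For a generic prefix tree, inserting the same word twice leaves the structure unchanged, so duplicates mainly cost extra insertion time. The sketch below shows this with a minimal dictionary-based trie; it is not the repository's PrefixTree class, and whether that class behaves the same way (for example if it keeps per-node counters) would need to be checked in its source.

```python
class Node:
    """One trie node: children keyed by character, plus an end-of-word flag."""

    def __init__(self):
        self.children = {}
        self.is_word = False


def add_words(root, words):
    """Insert every word; re-inserting an existing word changes nothing."""
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.is_word = True


def list_words(node, prefix=""):
    """Return all stored words in alphabetical order."""
    found = [prefix] if node.is_word else []
    for ch in sorted(node.children):
        found += list_words(node.children[ch], prefix + ch)
    return found


with_duplicates, unique_only = Node(), Node()
add_words(with_duplicates, ["the", "the", "then", "a"])
add_words(unique_only, {"the", "then", "a"})

print(list_words(with_duplicates))  # ['a', 'the', 'then']
print(list_words(unique_only))      # ['a', 'the', 'then']
```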
Hi!
I've been trying your code and it works very well. I have just been experiencing an issue that maybe you can shed some light on.
I'm working on handwriting recognition. When I run my program, which eventually uses yours (TFWordBeamSearch.so loaded in Python), memory fills up after a fairly consistent number of processed images (around 250) and the process ends with exit code 137. I noticed that memory usage keeps going up and was wondering whether something is happening (or there is something I can do) on the C++ side to avoid this behavior.
I'd appreciate any hints.
Thank you!
If you create a new issue, please provide the following information:
I want to recognize dates that are written in the format 19/10/1993. There are always 2 slashes. My dataset only contains dates from 01/01/2018 to 31/12/2022. This means that I have ~8000 possible targets. I treated each date as a single word and then created the txt files as below.
/0123456789
/0123456789
because slashes are part of words (that is my word definition). However, in prediction, I got '1/01/201' for 10/10/2019. I couldn't understand why I got an out-of-dictionary result. Normally, IIUC, word beam search permits arbitrary results for the characters that don't exist in charlist.txt. Here I don't have any chars that don't exist in wordcharlist.txt.
Do you have any idea what I am doing wrong?
Thanks for your repo and your effort to open source community...
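The setup described in this issue, with each date as one dictionary word and the slash counted as a word character, can be generated programmatically. The sketch below uses only the standard library; `date_words` is a hypothetical helper and the space-separated corpus layout is an assumption about what the decoder expects.

```python
from datetime import date, timedelta


def date_words(start, end):
    """Return every date from start to end (inclusive) formatted as DD/MM/YYYY."""
    words = []
    day = start
    while day <= end:
        words.append(day.strftime("%d/%m/%Y"))
        day += timedelta(days=1)
    return words


words = date_words(date(2018, 1, 1), date(2022, 12, 31))

# One date per "word", separated by spaces (assumed corpus layout).
corpus = " ".join(words)

# Characters that occur inside words: the digits and the slash.
word_chars = "".join(sorted(set(corpus) - {" "}))

print(words[0], words[-1])  # 01/01/2018 31/12/2022
print(word_chars)           # /0123456789
```

Printing `len(words)` is a quick way to check the number of distinct targets the dictionary actually contains.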
Hi,
I'm unable to generate a proper mat file for my test data that runs and gives me proper results.
Once the mat file is generated using the code provided, I get an error while running the main.py file.
Can you help me to figure out what could be the possible issue?
Thanks in advance!
I am trying to use more IAM validation data, but I don't understand how to get the mat_X.csv.
Dear Sir,
Thank you for making it compatible with Windows. I would appreciate your suggestions on how to use it for the --wordbeamsearch option in the SimpleHTR code.
Code snippet from SimpleHTR --> Model.py
##################################
word_beam_search_module = tf.load_op_library('TFWordBeamSearch.so')
chars = str().join(self.charList)
wordChars = open('../model/wordCharList.txt').read().splitlines()[0]
corpus = open('../data/corpus.txt').read()
self.decoder =word_beam_search_module.word_beam_search(tf.nn.softmax(self.ctcIn3dTBC, dim=2), 50, 'Words', 0.0, corpus.encode('utf8'), chars.encode('utf8'), wordChars.encode('utf8'))
#################################
What should I replace it with to integrate CTCWordBeamSearch with SimpleHTR?
Note:
** Initially I tried to replace it with WordBeamSearch(), but got confused about what to feed in.
** On my second try I converted tf.nn.softmax(self.ctcIn3dTBC, dim=2) into np.array(tf.nn.softmax(self.ctcIn3dTBC, dim=2)), which results in an error.
** Calling testPybind() with the above parameters results in an error.
Please guide me on which direction to go next.
Hello,
Thanks a lot for this really nice repo. I tried to use it but wasn't successful running the buildTF.sh script.
Here are the detailed versions for my environment:
Do you have any idea what's going on ?
Thanks for your help, and here is the (long) log:
../src/PrefixTree.cpp: In member function ‘void PrefixTree::allWordsAdded()’:
../src/PrefixTree.cpp:56:74: error: parameter declared ‘auto’
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:91: error: parameter declared ‘auto’
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp: In lambda function:
../src/PrefixTree.cpp:56:104: error: ‘lhs’ was not declared in this scope
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:116: error: ‘rhs’ was not declared in this scope
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp: In member function ‘std::shared_ptr<PrefixTree::Node> PrefixTree::getNode(const std::vector<unsigned int>&) const’:
../src/PrefixTree.cpp:152:96: error: parameter declared ‘auto’
auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
^
../src/PrefixTree.cpp:152:110: error: parameter declared ‘auto’
auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
^
../src/PrefixTree.cpp: In lambda function:
../src/PrefixTree.cpp:152:123: error: ‘p’ was not declared in this scope
auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
^
../src/PrefixTree.cpp:152:133: error: ‘val’ was not declared in this scope
auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_FIter std::lower_bound(_FIter, _FIter, const _Tp&, _Compare) [with _FIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Tp = unsigned int; _Compare = PrefixTree::getNode(const std::vector<unsigned int>&) const::__lambda2]’:
../src/PrefixTree.cpp:152:139: required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2447:31: error: no match for call to ‘(PrefixTree::getNode(const std::vector<unsigned int>&) const::__lambda2) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, const unsigned int&)’
if (__comp(*__middle, __val))
^
../src/PrefixTree.cpp:152:82: note: candidates are:
auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2447:31: note: void (*)() <conversion>
if (__comp(*__middle, __val))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:2447:31: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:152:113: note: PrefixTree::getNode(const std::vector<unsigned int>&) const::__lambda2
auto iter = std::lower_bound(node->children.begin(), node->children.end(), c, [](const auto& p, const auto val) {return p.first < val; });
^
../src/PrefixTree.cpp:152:113: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘void std::__insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2226:70: required from ‘void std::__final_insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5500:55: required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128: required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2159:29: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
if (__comp(*__i, *__first))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2159:29: note: void (*)() <conversion>
if (__comp(*__i, *__first))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:2159:29: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘void std::__heap_select(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:5349:59: required from ‘void std::partial_sort(_RAIter, _RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2332:68: required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44: required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128: required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:1948:27: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
if (__comp(*__i, *__first))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:1948:27: note: void (*)() <conversion>
if (__comp(*__i, *__first))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:1948:27: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘void std::__move_median_to_first(_Iterator, _Iterator, _Iterator, _Iterator, _Compare) [with _Iterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2295:13: required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2337:62: required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44: required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128: required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:114:28: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
if (__comp(*__a, *__b))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:114:28: note: void (*)() <conversion>
if (__comp(*__a, *__b))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:114:28: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:116:25: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
if (__comp(*__b, *__c))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:116:25: note: void (*)() <conversion>
if (__comp(*__b, *__c))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:116:25: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:118:30: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
else if (__comp(*__a, *__c))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:118:30: note: void (*)() <conversion>
else if (__comp(*__a, *__c))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:118:30: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:123:33: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
else if (__comp(*__a, *__c))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:123:33: note: void (*)() <conversion>
else if (__comp(*__a, *__c))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:123:33: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:125:33: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
else if (__comp(*__b, *__c))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:125:33: note: void (*)() <conversion>
else if (__comp(*__b, *__c))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:125:33: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_RandomAccessIterator std::__unguarded_partition(_RandomAccessIterator, _RandomAccessIterator, const _Tp&, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Tp = std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2296:78: required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2337:62: required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44: required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128: required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, const std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
while (__comp(*__first, __pivot))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: note: void (*)() <conversion>
while (__comp(*__first, __pivot))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (const std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
while (__comp(__pivot, *__last))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: note: void (*)() <conversion>
while (__comp(__pivot, *__last))
^
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/bits/stl_algo.h:61:0,
from /usr/include/c++/4.8.2/algorithm:62,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_heap.h: In instantiation of ‘void std::__adjust_heap(_RandomAccessIterator, _Distance, _Distance, _Tp, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Distance = long int; _Tp = std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’:
/usr/include/c++/4.8.2/bits/stl_heap.h:448:15: required from ‘void std::make_heap(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:1946:47: required from ‘void std::__heap_select(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5349:59: required from ‘void std::partial_sort(_RAIter, _RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2332:68: required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Size = long int; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44: required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’
../src/PrefixTree.cpp:56:128: required from here
/usr/include/c++/4.8.2/bits/stl_heap.h:313:40: error: no match for call to ‘(PrefixTree::allWordsAdded()::__lambda1) (std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&, std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >&)’
*(__first + (__secondChild - 1))))
^
../src/PrefixTree.cpp:56:60: note: candidates are:
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
In file included from /usr/include/c++/4.8.2/bits/stl_algo.h:61:0,
from /usr/include/c++/4.8.2/algorithm:62,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_heap.h:313:40: note: void (*)() <conversion>
*(__first + (__secondChild - 1))))
^
/usr/include/c++/4.8.2/bits/stl_heap.h:313:40: note: candidate expects 1 argument, 3 provided
../src/PrefixTree.cpp:56:94: note: PrefixTree::allWordsAdded()::__lambda1
std::sort(node->children.begin(), node->children.end(), [](const auto& lhs, const auto& rhs) {return lhs.first < rhs.first; });
^
../src/PrefixTree.cpp:56:94: note: candidate expects 0 arguments, 2 provided
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_algo.h: At global scope:
/usr/include/c++/4.8.2/bits/stl_algo.h:2110:5: error: ‘void std::__unguarded_linear_insert(_RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’, declared using local type ‘PrefixTree::allWordsAdded()::__lambda1’, is used but never defined [-fpermissive]
__unguarded_linear_insert(_RandomAccessIterator __last,
^
In file included from /usr/include/c++/4.8.2/bits/stl_algo.h:61:0,
from /usr/include/c++/4.8.2/algorithm:62,
from ../src/PrefixTree.cpp:3:
/usr/include/c++/4.8.2/bits/stl_heap.h:331:5: error: ‘void std::__pop_heap(_RandomAccessIterator, _RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’, declared using local type ‘PrefixTree::allWordsAdded()::__lambda1’, is used but never defined [-fpermissive]
__pop_heap(_RandomAccessIterator __first, _RandomAccessIterator __last,
^
/usr/include/c++/4.8.2/bits/stl_heap.h:178:5: error: ‘void std::__push_heap(_RandomAccessIterator, _Distance, _Distance, _Tp, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >*, std::vector<std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> > > >; _Distance = long int; _Tp = std::pair<unsigned int, std::shared_ptr<PrefixTree::Node> >; _Compare = PrefixTree::allWordsAdded()::__lambda1]’, declared using local type ‘PrefixTree::allWordsAdded()::__lambda1’, is used but never defined [-fpermissive]
__push_heap(_RandomAccessIterator __first, _Distance __holeIndex,
^
Hello, thanks for your great work. When I run testCustomOp.py, I get the error message below.
tensorflow.python.framework.errors_impl.NotFoundError: ../cpp/proj/TFWordBeamSearch.so: undefined symbol: _ZN10tensorflow22CheckNotInComputeAsyncEPNS_15OpKernelContextEPKc
Environment: TF 1.4 and CUDA 8.0.
However, when I test it with TF 1.12 and CUDA 9.0, it works.
The implementation looks promising.
Is there a prebuilt TFWordBeamSearch.so available for testing?
It looks as though the decoder assumes every element in the batch has the same maximum sequence length. Many datasets do not normalize sequence length; lengths are instead variable.
While this is mildly inefficient if elements of wildly different lengths are in the same batch (of course, one could use bucketed batching to ameliorate that), my larger concern is with correctness.
Since the CTC model in TensorFlow takes the sequence length vector of the batch for training, the model does not need to learn to produce the CTC blank for the padded regions.
Assuming everything so far is correct, I see two solutions:
(1) Update the TF module to include a B-dimensional sequence length vector containing values in [1,T], where the RNN activations are TxBx(C+1). This argument is precisely the same as the sequence_length parameter to tf.nn.ctc_loss or tf.nn.ctc_beam_search_decoder. Then wordBeamSearch in WordBeamSearch.cpp would need to be modified to use an element-specific maxT as the loop bound.
(2) The user modifies the batched RNN output so that P(blank)=1.0 for any time-step outside the sequence length of a given element. Though unpleasant, it does not seem particularly difficult to do on the Python side of things. In contrast, it's not obvious how to do this well or efficiently with TensorFlow ops.
Are there other ideas or commentary? How difficult is solution (1)? I'm certainly concerned I'll break the whole thing if I attempt it, but I may give it a go if necessary.
Thanks!
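The second option in the issue above (forcing P(blank)=1.0 outside each element's sequence length) could look roughly like the following NumPy sketch. It assumes the RNN output is already a softmax-normalized array of shape TxBx(C+1) with the blank as the last label; the function name mask_padding_with_blank and its parameters are hypothetical and not part of the repository.

```python
import numpy as np

def mask_padding_with_blank(mat, seq_lengths, blank_idx=None):
    """Return a copy of mat (TxBx(C+1)) in which every padded time-step
    puts all probability mass on the blank label.

    Hypothetical helper: seq_lengths holds one length in [1, T] per batch element.
    """
    mat = np.array(mat, dtype=np.float64, copy=True)
    num_steps, _, num_labels = mat.shape
    if blank_idx is None:
        blank_idx = num_labels - 1  # assumption: blank is the last label
    seq_lengths = np.asarray(seq_lengths)
    # padded[t, b] is True where time-step t lies beyond the length of element b
    padded = np.arange(num_steps)[:, None] >= seq_lengths[None, :]
    mat[padded] = 0.0              # clear all label probabilities in padded steps
    mat[padded, blank_idx] = 1.0   # then assign probability 1 to the blank
    return mat
```

The masked array can then be passed to the decoder in place of the raw output; time-steps inside each sequence are left untouched.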
def WordBeamSearch(mat, beamWidth, lm, useNGrams):
    "decode matrix, use given beam width and language model"
    chars = lm.getAllChars()
    blankIdx = len(chars)  # blank label is supposed to be last label in RNN output
    #mat = mat.cpu().numpy()
    print(mat.shape)
    maxT, _, _ = mat.shape  # shape of RNN output: TxBxC
    genesisBeam = Beam(lm, useNGrams)  # empty string
    last = BeamList()  # list of beams at time-step before beginning of RNN output
    last.addBeam(genesisBeam)  # start with genesis beam
    # go over all time-steps
    for t in range(maxT):
        curr = BeamList()  # list of beams at current time-step
        # go over best beams
        bestBeams = last.getBestBeams(beamWidth)  # get best beams
        .....
The error occurs when getting the best beams, and the error message 'boolean value of tensor with more than one value is ambiguous.' pops up here:
def getBestBeams(self, num):
    "return best beams, specify the max. number of beams to be returned (beam width)"
    u = [v for (_, v) in self.beams.items()]
    lmWeight = 1
    return sorted(u, reverse=True, key=lambda x: x.getPrTotal() * (x.getPrTextual() ** lmWeight))[:num]
I converted the tensor into a NumPy array, but that produces another error:
'The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'
I have tried to find the solution and read your code for days, but I don't know what the problem is.
Please help me. If you want to know more details about the error, please let me know.
I look forward to your comment.
Thank you for your consideration.
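One plausible cause, stated here as an assumption rather than a confirmed diagnosis: if the scoring code indexes the matrix as mat[t, blankIdx] but mat is a 3-D TxBxC batch, each beam probability becomes an array instead of a scalar, and sorted() can no longer compare the keys. The sketch below reproduces that comparison failure with plain NumPy and shows a hypothetical helper, split_batch, that yields one TxC matrix per batch element so each sample can be decoded separately.

```python
import numpy as np

def split_batch(mat):
    """Yield one TxC matrix per batch element of a TxBxC array (hypothetical helper)."""
    mat = np.asarray(mat)
    if mat.ndim != 3:
        raise ValueError("expected an array of shape TxBxC, got %r" % (mat.shape,))
    for b in range(mat.shape[1]):
        yield mat[:, b, :]

def sorting_fails(keys):
    """Return True if sorting these keys raises the 'ambiguous truth value' error."""
    try:
        sorted(keys, reverse=True)
        return False
    except ValueError:
        return True

# Scalar probabilities sort fine; array-valued "probabilities" do not.
scalar_keys = [0.2, 0.7, 0.1]
array_keys = [np.array([0.2, 0.3]), np.array([0.7, 0.1])]
```

With this, a batch could be decoded as a loop over split_batch(mat), calling the single-sample decoder once per TxC slice.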
Either segmentation fault or corruption error. It works well with Words mode and NGrams mode. Thank you
Hi Harald,
I am unable to compile TFWordBeamSearch.so (Custom TF Op) on MacOS Catalina 10.15.2. Getting error as below:
ld: library not found for -l:libtensorflow_framework.1.dylib
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Python version: 3.6.9
TF Version: 1.14.0
Any idea what could be wrong here? I tried adding the additional macOS flag -undefined dynamic_lookup to the g++ command that builds the .so file. Please help.
I tried setting LIBRARY_PATH, CPPFLAGS, LDFLAGS and LD_LIBRARY_PATH, but no luck yet. Looking forward to your answer. Thanks and have a good day. Stay safe.
TFWordBeamSearch
I'm assuming the following:
charList is generated by the RNN and contains all detected characters, including special characters like ,!#. and so on.
wordCharList is given by the user, denoting all characters that actually occur in words.
The following condition is set in TFWordBeamSearch.cpp, lines 108-111:
if(!(numWordChars > 0 && numWordChars < m_numChars))
{
throw std::invalid_argument("wordChars must contain at least one character and at least one character less than chars: 0<len(wordChars)<len(chars)");
}
It seems to me that numWordChars < m_numChars is enforced as m_numChars also contains non-word characters. The dataset that we use contains [a-z] only though, and so does the wordChars list ([a-z]). As such, it seems this precondition is too strong and should become:
if(!(numWordChars > 0 && numWordChars <= m_numChars))
Please correct me if I'm wrong.
Furthermore, great work on this project. Really top notch.
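For checking character lists on the Python side before they reach the C++ op, the relaxed condition proposed above can be sketched as follows. This is only an illustration: check_char_sets and the allow_equal switch are hypothetical names, and the additional subset check is an assumption of this sketch, not something the quoted C++ code performs.

```python
def check_char_sets(chars, word_chars, allow_equal=True):
    """Validate the two character lists before handing them to the decoder.

    allow_equal=True models the proposed 0 < len(wordChars) <= len(chars);
    allow_equal=False models the stricter 0 < len(wordChars) < len(chars).
    """
    num_chars, num_word_chars = len(chars), len(word_chars)
    upper_ok = num_word_chars <= num_chars if allow_equal else num_word_chars < num_chars
    if not (num_word_chars > 0 and upper_ok):
        raise ValueError("wordChars has %d entries, chars has %d"
                         % (num_word_chars, num_chars))
    # Assumption of this sketch: every word character must also be a known character.
    missing = set(word_chars) - set(chars)
    if missing:
        raise ValueError("wordChars not contained in chars: %r" % sorted(missing))
    return True
```

A dataset containing only [a-z] passes with allow_equal=True and fails with allow_equal=False, which is the situation described in the issue.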
First, thanks for the great work, Harald. I'm trying to apply your Word Beam Search algorithm to the Vietnamese language. I modified the charList, wordcharList and the corpus for my needs. It has been working well in Words and NGrams mode. But when I tried to use it in either NGramsForecast or NGramsForecastAndSample mode, it raised this error:
Python(23489,0x70000ee70000) malloc: *** error for object 0x7fa625811e00: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Python(23489,0x70000ef76000) malloc: *** error for object 0x7fa62714f940: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
zsh: abort python3 main.py --validate --wordbeamsearch
I have no idea how to resolve this. I'm using TensorFlow 1.7.0 on macOS 10.13.5, Ubuntu 16.04 and 18.04.
BTW, you can get a sense of the Vietnamese language and its structure via this wiki page: https://en.wikipedia.org/wiki/Vietnamese_language
So I really need your help here. What have I missed? What can I do to resolve this problem?
Thank you.
Has anyone tried this algorithm on an ARM architecture? Decoding takes very long (around 7 s) for an input of dimension 700*80 with a beam width of 100 on an ARM processor, which is around 5 times slower than on x86 (1.4 s) with the same hyperparameters. Is there any optimization we can do to bring the execution time on ARM down to at least that of x86?
Hi Team,
Is there a way to generate mat_x.csv ourselves? I was trying to do this by reading the documentation, but I was unable to find the T and B values. Is there any program to generate the file? Or can you document the procedure for end-to-end testing of the application?
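As a rough illustration of writing such a file, the sketch below dumps one sample's RNN output (a TxC matrix: T time-steps, C labels including the blank) to CSV with NumPy and reads it back. The semicolon delimiter and the one-row-per-time-step layout are assumptions here and should be checked against the sample files shipped with the repository; save_matrix_csv and load_matrix_csv are hypothetical helper names.

```python
import numpy as np

def save_matrix_csv(mat, path, delimiter=";"):
    """Write a single-sample TxC matrix to a CSV file, one row per time-step.

    The delimiter is an assumption; compare with an existing mat_x.csv first.
    """
    mat = np.asarray(mat, dtype=np.float64)
    if mat.ndim != 2:
        raise ValueError("expected a TxC matrix for one sample, got %r" % (mat.shape,))
    np.savetxt(path, mat, delimiter=delimiter)

def load_matrix_csv(path, delimiter=";"):
    """Read the matrix back as a 2-D array of shape TxC."""
    return np.loadtxt(path, delimiter=delimiter, ndmin=2)
```

For a batched output of shape TxBxC, each batch element mat[:, b, :] would be saved to its own file.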
File "main.py", line 33, in <module>
res=wordBeamSearch(data.mat, 10, loader.lm, useNGrams)
File "C:\Users\My Pc\Desktop\M.Tech Thesis Project\code\CTCWordBeamSearch-master\py\WordBeamSearch.py", line 33, in wordBeamSearch
prBlank=beam.getPrTotal()*mat[t, blankIdx]
IndexError: index 93 is out of bounds for axis 1 with size 80
Originally posted by @rushaligupta in #3 (comment)
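The traceback above shows a blank index of 93 being used on a matrix with only 80 columns, which suggests that the character list loaded for the language model does not match the character set the network was trained with. A small check like the following (check_matrix_matches_chars is a hypothetical name, and the blank-is-last convention is taken from the prototype's comment) makes the mismatch visible before decoding starts.

```python
import numpy as np

def check_matrix_matches_chars(mat, chars):
    """Verify that the last axis of mat has len(chars) + 1 entries (labels plus blank).

    Returns the blank index (the last column) if the shapes agree.
    """
    mat = np.asarray(mat)
    num_columns = mat.shape[-1]
    expected = len(chars) + 1  # assumption: one column per character plus the blank
    if num_columns != expected:
        raise ValueError(
            "matrix has %d columns, but %d characters require %d columns"
            % (num_columns, len(chars), expected)
        )
    return len(chars)
```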
Python prototype, version is 3.6
This is similar to subtracting a maxValue to prevent overflow of e(y): it is not a logic problem, just a trick to prevent underflow.
Since word probabilities are fractions and the path may be long, the probability of a sequence is the product of many small numbers, for example p(seq)=0.1*0.1*...*0.1, which is prone to underflow. I want to fix this and wonder if you have tried anything to prevent it.
A common approach is to use logarithms, as log(a*b) = log(a) + log(b); such code can be found in ctc_beam_search.h:
logsumexp = Eigen::numext::log(logsumexp);
float norm_offset = max_coeff + logsumexp;
And log(a+b) can be found in ctc_loss_util.h:
// Add logarithmic probabilities using:
// ln(a + b) = ln(a) + ln(1 + exp(ln(b) - ln(a)))
// The two inputs are assumed to be log probabilities.
// (GravesTh) Eq. 7.18
inline float LogSumExp(float log_prob_1, float log_prob_2) {
  // Always have 'b' be the smaller number to avoid the exponential from
  // blowing up.
  if (log_prob_1 == kLogZero) {
    return log_prob_2;
  } else if (log_prob_2 == kLogZero) {
    return log_prob_1;
  } else {
    return (log_prob_1 > log_prob_2)
               ? log_prob_1 + log1pf(expf(log_prob_2 - log_prob_1))
               : log_prob_2 + log1pf(expf(log_prob_1 - log_prob_2));
  }
}
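The same idea can be sketched in Python for the prototype. This is a minimal illustration, not the repository's code: LOG_ZERO and log_sum_exp are names chosen here to mirror the C++ snippet, and the 400-step example is only meant to show where a direct product underflows while the sum of logarithms stays representable.

```python
import math

LOG_ZERO = float("-inf")  # log of probability 0

def log_sum_exp(log_a, log_b):
    """Return log(a + b) given log(a) and log(b), without leaving log space."""
    if log_a == LOG_ZERO:
        return log_b
    if log_b == LOG_ZERO:
        return log_a
    larger, smaller = max(log_a, log_b), min(log_a, log_b)
    # exp() only ever sees a non-positive argument, so it cannot overflow
    return larger + math.log1p(math.exp(smaller - larger))

# Multiplying many small probabilities underflows in double precision ...
step_probs = [0.1] * 400
direct_product = 1.0
for p in step_probs:
    direct_product *= p

# ... while adding their logarithms keeps the value finite.
log_product = sum(math.log(p) for p in step_probs)
```

Products of probabilities become sums of log probabilities, and sums of probabilities (for example when merging paths that yield the same text) become log_sum_exp calls.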
Error while running the buildTF.sh file on Windows using Git Bash.
Error message:
../src/TFWordBeamSearch.cpp:1:10: fatal error: tensorflow/core/framework/op.h: No such file or directory
#include <tensorflow/core/framework/op.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
g++ version:
g++ (MinGW.org GCC-8.2.0-3) 8.2.0
Operating System: Windows 10.
I searched the system for the header file and it is present.
I added its path to the environment variables, but the error persists.
I'm getting the following error while running the buildTF.sh file:
buildTF.sh: [: ==: binary operator expected
Single-threaded decoding
Your TF version is 1.3.0
TF versions 1.3.0, 1.4.0, 1.5.0 and 1.6.0 are tested
buildTF.sh: syntax error near unexpected token =(' buildTF.sh: buildTF.sh: line 52:
TF_CFLAGS =( $( python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))' ))'
The system specifications are as follows:
OS- windows 8.1
Python: 3.5 version
Tensorflow: 1.3 version
gcc version 6.3.0 (MinGW.org GCC-6.3.0-1)
As a result, I am not able to build the 'TFWordBeamSearch.so' file.
Please help me out! Thanks in advance.
Is there a way to get a score indicating how probable it is that the decoding is correct given the input CTC matrix? I'm guessing it should be something like the accumulated probability of the path followed to get the output word. Is there a way to retrieve something like this?
How can I set the sequence length parameter like in the TensorFlow API? Thanks!
Follow these instructions to integrate word beam search decoding:
Can you please explain what step 2 is?
I can't figure it out, and I want to generate the .so file.
My input data (speech -> spectrogram) is fed into the model (CNN+RNN), and the model produces an output of shape [sequence length (T) x batch size (B) x number of characters (C)], e.g. (371, 32, 29).
This output is then passed to the function "def testPyBind(feedMat, corpus, chars, wordChars)".
My corpus.txt is composed of the transcriptions of the validation and test sets. It has 146,709 sentences.
My chars.txt has 28 characters ' ABCDEFGHIJKLMNOPQRSTUVWXYZ
My wordchars.txt has 27 characters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ (space is removed from chars.txt)
The decoder then returns a list of sentences, one per batch element, e.g. 32 sentences.
My first question: the output sentences from the decoder are 100% identical to sentences in the corpus.
I expected some character errors in the output sentences, because they are predictions made by the model and the decoder.
I don't understand why the decoder only produces sentences that are in the corpus.
My second question: how can I align each predicted sentence (output) with its target sentence to calculate WER and CER?
I calculated CER and WER, but the sentences are totally mismatched.
This is my result:
Epoch: 0/100 | Train loss: 6.999162
Epoch: 0/100 | Val loss: 2.889192
Average CER: 767.66% | Average WER: 1543.43%
Epoch: 5/100 | Train loss: 4.038735
Epoch: 5/100 | Val loss: 1.960615
Average CER: 734.83% | Average WER: 1154.09%
Epoch: 10/100 | Train loss: 3.314095
Epoch: 10/100 | Val loss: 1.570683
Average CER: 747.28% | Average WER: 1155.41%
....
When I use the greedy search algorithm, I get results like this:
Epoch: 0/100 | Train loss: 2.138510
Epoch: 0/100 | Val loss: 2.078584
Average CER: 65.45% | Average WER: 99.07%
Epoch: 5/100 | Train loss: 1.150841
Epoch: 5/100 | Val loss: 1.037751
Average CER: 32.14% | Average WER: 79.66%
Epoch: 10/100 | Train loss: 0.916023
Epoch: 10/100 | Val loss: 0.855249
Average CER: 26.44% | Average WER: 71.63%
...
Please reply to my questions.
Thank you for your consideration.
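For the second question, one way to pair predictions with targets is to rely on batch order: assuming the decoder returns its results in the same order as the batch elements (an assumption worth verifying for the binding in use), the i-th decoded sentence is compared with the i-th target. The sketch below uses a plain Levenshtein distance; edit_distance and error_rates are hypothetical helper names, not functions from the repository.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]

def error_rates(predictions, targets):
    """Return (CER, WER) accumulated over index-aligned prediction/target pairs."""
    if len(predictions) != len(targets):
        raise ValueError("need exactly one prediction per target")
    char_errors = char_total = word_errors = word_total = 0
    for pred, target in zip(predictions, targets):
        char_errors += edit_distance(pred, target)
        char_total += len(target)
        pred_words, target_words = pred.split(), target.split()
        word_errors += edit_distance(pred_words, target_words)
        word_total += len(target_words)
    return char_errors / char_total, word_errors / word_total
```

Error rates far above 100%, as in the first table, are what one would expect if predictions and targets are paired in a different order or drawn from different batches.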
I tried to compile it with tensorflow-gpu version 1.3.0, but it gives the error ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory.
System: Ubuntu 16.04, GCC 5.4, TensorFlow 1.3 GPU, CUDA 8.0, cuDNN 5.1.
Please let me know where I am going wrong.
I'm using a modified Keras OCR example; the prediction code is here
and this is how I save the .csv file
When I try to apply your code main.py, I get wrong output, and I really don't know what the problem is. This is the data I used to test the code:
test.zip
Note: when I open the text files in your data [bentham] and save them as UTF-8, the output changes slightly.
Decoding 3 samples now.
Sample: 1
Filenames: ../data/bentham/mat_0.csv|../data/bentham/gt_0.txt
Result: "an"
Ground Truth: "brain."
Editdistance: 5
Accumulated CER and WER so far: CER: 0.7142857142857143 WER: 1.0
Sample: 2
Filenames: ../data/bentham/mat_1.csv|../data/bentham/gt_1.txt
Result: "supposed"
Ground Truth: "supposed"
Editdistance: 1
Accumulated CER and WER so far: CER: 0.375 WER: 1.0
Sample: 3
Filenames: ../data/bentham/mat_2.csv|../data/bentham/gt_2.txt
Result: "submitt,both,mental,and.brain"brain"
Ground Truth: "submitt, both mental and corporeal, is far beyond
Editdistance: 32
Accumulated CER and WER so far: CER: 0.5066666666666667 WER: 0.75
Validate NN
terminate called after throwing an instance of 'std::invalid_argument'
what(): check length of chars and wordChars: 0<len(wordChars)<=len(chars)
My task is to recognize handwritten dates, such as "19/10/1993"; they contain only slashes and digits. Is that the cause of the error?
How can I solve the issue?
Hi Harald
How do I prepare data and how can I create a TFWordBeamSearch.so?
Thank you
Long