
Comments (3)

githubharald commented on July 18, 2024

Hi,

can you please dump the content of wbs_chars.encode('utf8') and word_chars.encode('utf8') by doing

print(wbs_chars.encode('utf8'))
print(word_chars.encode('utf8'))

The output of the RNN must have C+1 entries: the C characters to be recognized plus the special "CTC blank" character. The number of word characters should be less than C, e.g. C-1, because the word characters do not include word separation characters such as whitespace.

To give an example: the RNN outputs the characters " AB~" where "~" denotes the special character, the characters that we can recognize by such a model are " AB", and the word characters are "AB", as we use the whitespace " " as a word separation character (as in most languages).

Here is an example of how to use it: https://github.com/githubharald/SimpleHTR/blob/master/src/model.py#L142
And this is where the error comes from, you can see the condition for this error check there:

throw std::invalid_argument("the number of characters (chars) plus 1 must equal dimension 2 of the input tensor (mat)");
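
The size relationships described above can be checked on the Python side before the decoder is constructed. The following is only a minimal sketch: the helper name check_charsets and its parameter num_rnn_outputs (the size of dimension 2 of the RNN output) are hypothetical and not part of the library.

```python
def check_charsets(chars, word_chars, num_rnn_outputs):
    """Return a list of problems found in the character sets (hypothetical helper)."""
    problems = []
    unique_chars = set(chars)

    # A character listed twice inflates len(chars) without adding a class.
    if len(unique_chars) != len(chars):
        problems.append("chars contains duplicate characters")

    # The RNN output has one entry per character plus one for the CTC blank.
    if len(unique_chars) + 1 != num_rnn_outputs:
        problems.append(
            f"expected {len(unique_chars) + 1} RNN outputs, got {num_rnn_outputs}"
        )

    # Word characters must all be recognizable characters.
    missing = set(word_chars) - unique_chars
    if missing:
        problems.append(f"word_chars not contained in chars: {sorted(missing)}")

    # At least one character (e.g. whitespace) must remain as a word separator.
    if len(set(word_chars)) >= len(unique_chars):
        problems.append("word_chars must be smaller than chars")

    return problems


# The example from above: RNN outputs " AB~", chars " AB", word chars "AB".
print(check_charsets(" AB", "AB", 4))  # []
```

An empty list means that the sizes are consistent; it says nothing about whether the order of the characters matches the RNN output.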


UniDuEChristianGold commented on July 18, 2024

Thank you for your answer.
Yes, I understand that word_chars is a smaller subset of chars.

Here is the printout:
print(wbs_chars.encode('utf8'))
b' !"#&'()*+,-./|\0123456789:;?ABCDEFGHIJKLMNOPRQSTUVWXYZabcdefghijklmnopqrstuvwxyz|}\xc3\x84\xc3\x9c\xc3\x9f\xc3\xa4\xc3\xb6\xc3\xbc\xe2\x80\x9c\xe2\x80\x9e'

print(word_chars.encode('utf8'))
b"'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xc3\xa4\xc3\xb6\xc3\xbc\xc3\x84\xc3\x96\xc3\x9c\xc3\x9f"

I added another printout to NPWordBeamSearch.cpp (I am not sure whether TF vs. NP makes a difference here, but I doubt it, as the chars shouldn't be influenced by this):
std::cout << "maxC " << maxC << " m_numChars " << m_numChars <<'\n';
-> maxC 92 m_numChars 90
So one character is missing in m_numChars (derived from wbs_chars): it should be 91.

print(len(wbs_chars)) -> 91(!)
print(wbs_chars.encode('utf8'))
print(word_chars.encode('utf8'))
self.wbs_decoder = WordBeamSearch(50, 'Words', 0.0, corpus.encode('utf8'), wbs_chars.encode('utf8'),
word_chars.encode('utf8'))

So it seems that one character is lost during:
m_numChars = m_lm->getAllChars().size();

I was able to track down the issue: I had added the character | twice to my list, and getAllChars removes duplicate characters. Thank you so much for your help.
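
A duplicate like this can be spotted before the string is passed to the decoder. The snippet below is a minimal sketch; the helper name find_duplicate_chars and the short example string are made up for illustration.

```python
from collections import Counter


def find_duplicate_chars(chars):
    """Map every character that occurs more than once to its count."""
    counts = Counter(chars)
    return {c: n for c, n in counts.items() if n > 1}


# Hypothetical, shortened character list with '|' listed twice.
example_chars = ' !"|AB|'
print(find_duplicate_chars(example_chars))
print(len(example_chars), len(set(example_chars)))
```

If the two printed lengths differ, the number of distinct characters is smaller than len() of the list suggests, which matches the 91 vs. 90 mismatch seen above.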


githubharald commented on July 18, 2024

Good that you found the issue 👍.
Just as a side note: since you removed one character from your list, make sure that the characters now occur in the list in the same order as they occur in the RNN output.
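
One way to check the order is to compare the cleaned list against the character list used for the RNN output. This is only a sketch under assumptions: both helper names are hypothetical, and rnn_chars stands for whatever list the model was trained with. Whether a shortened list still fits an already trained model is a separate question that this sketch does not answer.

```python
def dedupe_keep_order(chars):
    """Remove repeated characters, keeping the first occurrence of each."""
    return "".join(dict.fromkeys(chars))


def first_order_mismatch(decoder_chars, rnn_chars):
    """Return the first index at which the two lists differ, or None if equal."""
    for i, (a, b) in enumerate(zip(decoder_chars, rnn_chars)):
        if a != b:
            return i
    if len(decoder_chars) != len(rnn_chars):
        # One list is a prefix of the other; they differ at the shorter length.
        return min(len(decoder_chars), len(rnn_chars))
    return None


cleaned = dedupe_keep_order(" A|B|")          # hypothetical list with '|' twice
print(repr(cleaned))
print(first_order_mismatch(cleaned, " A|B"))  # None means same order and length
```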
