Comments (4)
This is how the model performs multi-word recognition: the non-word characters let it recognize when to restart the dictionary word recognition process. That of course doesn't play nicely when you only want to recognize a single word, as indicated by your wordChars list.
I haven't thought deeply about changing the precondition as you suggest, but my way of programming around it was to insert a zero-probability character class that I then correct for after running the beam search:
    # CTCWordBeamSearch requires a non-word char. We hack this by
    # prepending a zero-prob " " entry to the rnn_probs
    rnn_probs = tf.pad(rnn_probs,
                       [[0, 0], [0, 0], [1, 0]],  # Add one slice of zeros
                       mode='CONSTANT',
                       constant_values=0.0)
    chars = (' ' + charset.out_charset).encode('utf8')
    # Assume words can be formed from all chars--if punctuation is added
    # or numbers (etc) are to be treated differently, more such
    # categories should be added to the charset module
    wordChars = chars[1:]
    prediction, seq_prob = word_beam_search_module.word_beam_search(
        rnn_probs,
        sequence_length,
        beam_width,
        'Words',  # Use No LM
        0.0,      # Irrelevant: No LM to smooth
        corpus,   # aka lexicon [are unigrams ignored?]
        chars,
        wordChars)
    prediction = prediction - 1  # Remove hacky prepended non-word char
Note that my charset.out_charset would be your [a-z].
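For readers without TensorFlow or the forked op at hand, the pad-then-shift idea can be sketched in plain NumPy. This is only an illustration under assumptions: the helper names are hypothetical, the probabilities are random, and a per-frame argmax stands in for the real word beam search decoder.

```python
import numpy as np

def prepend_zero_class(probs):
    """Prepend one all-zero class along the last axis: (T, B, C) -> (T, B, C + 1)."""
    return np.pad(probs, [(0, 0), (0, 0), (1, 0)],
                  mode="constant", constant_values=0.0)

def shift_labels_back(labels):
    """Undo the index shift introduced by the prepended class."""
    return np.asarray(labels) - 1

# Hypothetical input: 5 time steps, batch of 2, 4 classes.
rng = np.random.default_rng(0)
probs = rng.random((5, 2, 4))
probs /= probs.sum(axis=2, keepdims=True)

padded = prepend_zero_class(probs)

# Stand-in for the decoder (NOT word beam search): best class per frame.
decoded = padded.argmax(axis=2)
original_labels = shift_labels_back(decoded)
```

Because the prepended class has zero probability everywhere, it is never selected, and subtracting 1 maps every label back to its index in the original character set.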
I should add that the above code uses a forked repo that supports a variable sequence length and exposes the resulting beam probability (see also #13). The standard code would neither take the sequence_length argument nor return the seq_prob value.
As Jerod Weinman already explained, the characters are split into two sets to allow multi-word recognition (e.g. for lines containing multiple words separated by spaces, commas or other non-word characters).
If you only want to recognize single words (with charList==wordCharList), it makes sense to change the condition as you suggested.
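As a rough illustration of why two sets are needed, the sketch below splits a line into words wherever a character falls outside the word-character set. It is not the decoder's code; the function name and the example strings are assumptions made for this example.

```python
from itertools import groupby

def split_into_words(text, word_chars):
    """Return the maximal runs of word characters found in text."""
    word_set = set(word_chars)
    return ["".join(group)
            for is_word, group in groupby(text, key=lambda c: c in word_set)
            if is_word]

letters = "abcdefghijklmnopqrstuvwxyz"
multi = split_into_words("the cat, sat", letters)   # space and comma act as separators
single = split_into_words("hello", letters)         # no separator is needed
```

With a single-word input no separator ever occurs, which is the case where the set of all characters and the set of word characters can coincide.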
@jonyvp: However, I think your use case (recognizing only single words, but constraining them to dictionary words) is very common, so I've changed the < operator to the <= operator.
Thanks for your input!
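The effect of relaxing the comparison can be sketched as follows. This does not reproduce the library's actual check; the function and its strict flag are hypothetical and only contrast a strict (<) with a relaxed (<=) size comparison between the two character sets.

```python
def word_chars_valid(chars, word_chars, strict):
    """Check that word_chars is drawn from chars.

    strict=True  models the older behaviour: at least one non-word char required (<).
    strict=False models the relaxed behaviour: the two sets may be equal (<=).
    """
    all_set, word_set = set(chars), set(word_chars)
    if not word_set <= all_set:  # word chars must come from chars
        return False
    if strict:
        return len(word_set) < len(all_set)
    return len(word_set) <= len(all_set)

letters = "abcdefghijklmnopqrstuvwxyz"
single_word_old = word_chars_valid(letters, letters, strict=True)    # rejected
single_word_new = word_chars_valid(letters, letters, strict=False)   # accepted
multi_word_new = word_chars_valid(" " + letters, letters, strict=False)
```

Under the relaxed comparison the single-word setup passes without the zero-probability padding workaround, while multi-word setups with separators are unaffected.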