Comments (6)
See https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html. All work on Tesseract is currently done by volunteers, so you are invited to find the answers to your questions and document them.
from tessdoc.
@stweil : Can you linkify the "100 languages" sentence in the README.md to point to that page?
@eyalroz I went ahead and proposed the change in the tesseract repo: tesseract-ocr/tesseract#4027
I also think it would be very helpful, even though the list itself has no information on languages in v5 yet.
> Even though the list itself has no information on languages in v5 yet.
There was no update for v5. All the v4 data files should work with Tesseract 5.x.
> There was no update for v5. All the v4 data files should work with Tesseract 5.x.
That's at least not obvious from the table.
The information can be found in other parts of the docs, true. Users can easily miss it though.
> Language model traineddata files same as listed above for version 4.0.0 can be used with Tesseract 5.x.x.
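For anyone who wants to confirm which language data files their own install actually picks up, here is a minimal sketch. It assumes the `tesseract` binary is on `PATH` and that `tesseract --list-langs` prints one header line followed by one language code per line. The helper names `parse_list_langs` and `installed_languages` are made up for this illustration and are not part of Tesseract.

```python
import shutil
import subprocess


def parse_list_langs(output: str) -> list[str]:
    """Return language codes from `tesseract --list-langs` output.

    Assumption: the first non-empty line is a header and every
    following non-empty line is a single language code.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return lines[1:]


def installed_languages() -> list[str]:
    """Ask the local tesseract binary which traineddata files it can see."""
    if shutil.which("tesseract") is None:
        return []  # tesseract is not on PATH
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Fall back to stderr in case the list was printed there instead.
    return parse_list_langs(result.stdout or result.stderr)


if __name__ == "__main__":
    print(installed_languages())
```

If a v4 traineddata file placed in the tessdata directory shows up in this list under Tesseract 5.x, that is at least a quick sanity check that the file is being found.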
https://arxiv.org/pdf/2202.13274.pdf