skvark / textractor
OCR application for Sailfish OS. Based on Tesseract OCR engine and Leptonica image processing library.
License: MIT License
Leptonica throws an error:
findFileFormatStream: failed to read first 12 bytes of file
Something has changed in the new update. Cropping itself seems to be working...
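The error message suggests that fewer than 12 bytes could be read from the file handed to Leptonica, for example because the path is wrong or the file is empty. As a debugging aid, a minimal sketch of a pre-check could look like the following. The helper name `hasReadableHeader` is hypothetical and not part of Textractor or Leptonica; only the 12-byte figure comes from the error text above.

```cpp
#include <fstream>
#include <string>

// Hypothetical pre-check: returns true only if the file can be opened
// and at least 12 bytes can be read from its start. The 12-byte figure
// is taken from the Leptonica error message, nothing more.
bool hasReadableHeader(const std::string &path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false; // file is missing or cannot be opened
    }
    char header[12];
    in.read(header, sizeof(header));
    // gcount() reports how many bytes the last read actually delivered.
    return in.gcount() == static_cast<std::streamsize>(sizeof(header));
}
```

Logging the result of such a check for the cropped image path just before recognition would show whether the file is missing or truncated at that point.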
Hi,
I really like the potential of this app, and would like it even more if I could have it in my native language. I took a look at the .ts file and found that it is far from complete. Are you planning to update the translation file and implement localisation? If so, I would be the first to translate.
Great job anyway. Thanks!
Because Google Code is shutting down, the Tesseract OCR codebase and its other data were moved to GitHub. The data will be archived in the Google Code archive, but it would be more future-proof to refactor the language data download so that it fetches the data from GitHub according to the Tesseract release.
Release 3.04 already contains a set of new language data files, while Textractor still uses the 3.02 files.
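One way to tie the download to a Tesseract release is to build the URL from a release tag and a language code. The sketch below is only an illustration: the helper name `traineddataUrl` is hypothetical, and the repository path (`tesseract-ocr/tessdata`), the tag format (`3.04.00`) and the file layout are assumptions that should be checked against the actual GitHub repository.

```cpp
#include <string>

// Hypothetical helper: builds the download URL for one language file.
// Assumptions (not confirmed by the issue text): the data lives in the
// tesseract-ocr/tessdata repository, each release has a matching tag,
// and the files sit at the repository root as <lang>.traineddata.
std::string traineddataUrl(const std::string &releaseTag,
                           const std::string &languageCode)
{
    return "https://github.com/tesseract-ocr/tessdata/raw/" + releaseTag +
           "/" + languageCode + ".traineddata";
}
```

With this shape, moving to a newer set of language files would only mean changing the tag passed in, for example `traineddataUrl("3.04.00", "eng")`.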
Todo:
If the phone is lying flat (camera pointing down), it is practically impossible to detect or guess the correct orientation for the image. The solution is to add a button to the camera page that toggles a manual lock between the different orientations.
Needed: an icon for each of the 4 orientations.
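The toggle could be modelled as a small state that the button cycles through. The following is a minimal sketch, not the app's actual code: the enum, its value names and the extra `Auto` state (meaning "follow the sensor") are assumptions introduced for illustration.

```cpp
// Hypothetical lock states: Auto follows the orientation sensor, the
// other four values pin the camera page to one fixed orientation.
enum class OrientationLock {
    Auto,
    Portrait,
    Landscape,
    PortraitInverted,
    LandscapeInverted
};

// Number of states in OrientationLock, used for wrapping around.
constexpr int kLockStateCount = 5;

// Returns the state that follows `current`; after the last state the
// cycle wraps back to Auto.
OrientationLock nextLock(OrientationLock current)
{
    const int next = (static_cast<int>(current) + 1) % kLockStateCount;
    return static_cast<OrientationLock>(next);
}
```

Each press of the button would call `nextLock` and swap in the icon for the returned state.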
Currently Textractor overwrites the old preprocessed image when a new recognition is run. Do we need to save all the results, or is it enough to see the latest preprocessed image?
Tesseract may leak memory if the recognition is canceled. If this is the case, the cancel function will not be implemented.
The following icons are missing:
Fix: try to find an icon that exists both on SFOS 2.0 and on earlier systems.
Is it possible to skip the image processing completely and send the image directly to Tesseract?
Thank you for your work,
Cosmin Popescu.
Allow the user to crop the image before processing.
The settings page contains complex parameters. Write a guide on how to adjust them.
Additionally, split the settings page into advanced and basic settings.
Textractor sometimes interferes with Jolla's camera app because the camera is not always unloaded properly.
Currently the camera is not unloaded, which makes the app consume too much power if the user pushes the app to the background after recognition.
If the user tries to process an image that was taken with Jolla's camera app in a non-standard orientation, the image appears in the wrong orientation after preprocessing. As far as I know, Jolla's camera app only adds EXIF orientation information and does not actually rotate the image, and Leptonica does not seem to support EXIF data.
Fix: find a way to read the EXIF data and rotate the image before preprocessing.
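Once the EXIF orientation tag has been read (how to read it is left open here), the rotation step reduces to a small lookup. The sketch below maps the standard EXIF orientation values to the number of clockwise quarter turns needed to display the image upright. The function name is hypothetical, and the mirrored values (2, 4, 5, 7) are deliberately reported as unsupported rather than guessed at.

```cpp
// Hypothetical helper: maps an EXIF orientation value to the number of
// clockwise 90-degree turns that bring the image upright.
// Returns -1 for mirrored or unknown values, which this sketch does
// not handle.
int clockwiseQuarterTurnsForExif(int orientation)
{
    switch (orientation) {
    case 1: return 0;  // already upright
    case 6: return 1;  // needs 90 degrees clockwise
    case 3: return 2;  // needs 180 degrees
    case 8: return 3;  // needs 270 degrees clockwise
    default: return -1; // mirrored (2, 4, 5, 7) or invalid value
    }
}
```

The result could then be passed to an orthogonal rotation routine before preprocessing; Leptonica's `pixRotateOrth`, which takes a count of clockwise quarter turns, looks like a candidate, though that pairing is an assumption and not something the issue confirms.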
Currently only JPEG is confirmed to work. Supporting additional image formats may require changes to the Leptonica spec file so that it is built with them.
Scanners often create PDF documents when you use them for copying. It would be really handy if the text in those PDFs were available.