otiai10 / gosseract
Go package for OCR (Optical Character Recognition), by using Tesseract C++ library
Home Page: https://pkg.go.dev/github.com/otiai10/gosseract
License: MIT License
github.com/otiai10/gosseract/tesseract/tess.cpp:1:31: fatal error: tesseract/baseapi.h: No such file or directory (Centos 7)
I am puzzled: how do I set the -l parameter?
Tesseract actually provides a C library, so you don't need to use tesseract running as a service.
Do you know how to do it?
Should add Digest().
I could implement it if you want.
api, _ := gosseract.API()
ver := api.Version()
fmt.Println(ver)
// 3.05.00
.
├── cmd
│ └── gosseract
% gosseract target.png
abcABC
go version go1.9.1 linux/amd64
# github.com/otiai10/gosseract/tesseract
In file included from /usr/include/tesseract/ltrresultiterator.h:26:0,
from /usr/include/tesseract/resultiterator.h:26,
from /usr/include/tesseract/baseapi.h:31,
from tess.cpp:1:
/usr/include/tesseract/unichar.h:164:10: error: «string» does not name a type; did you mean «stdin»?
static string UTF32ToUTF8(const std::vector<char32>& str32);
^~~~~~
stdin
I tried to run the first test, but it failed with the following errors:
[wigywizzle@wigywizzle gosseract]$ go test ./...
Error opening data file /usr/share/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
FAIL github.com/otiai10/gosseract 0.006s
? github.com/otiai10/gosseract/tesseract [no test files]
Error opening data file /usr/share/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
FAIL github.com/otiai10/gosseract/tesseract/test 0.006s
% go test ./...
# github.com/otiai10/gosseract/tesseract
tesseract/tess.cpp:1:31: fatal error: tesseract/baseapi.h: No such file or directory
compilation terminated.
FAIL github.com/otiai10/gosseract [build failed]
My system is Windows 10 64-bit, and my gcc comes from https://github.com/go-vgo/Mingw.
At first, the error was: 'tesseract/baseapi.h' file not found.
Then I downloaded the libraries:
The above error disappeared, but now there is a new one:
d:/git/mingw/bin/../lib/gcc/x86_64-w64-mingw32/4.8.2/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -llept
d:/git/mingw/bin/../lib/gcc/x86_64-w64-mingw32/4.8.2/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -ltesseract
collect2.exe: error: ld returned 1 exit status
I have spent 2 days on this and do not know what to do next. I hope this can be resolved. Thank you!
Add runtime test for #77
related to #10 (comment)
When I use the following code to OCR a JPEG image:
img_url := "http://cityjw.dlut.edu.cn:7001/ACTIONVALIDATERANDOMPICTURE.APPPROCESS"
resp, err := client.Get(img_url)
if err != nil {
// handle error
}
defer resp.Body.Close()
OcrClient, _ := gosseract.NewClient()
img, _ := jpeg.Decode(resp.Body)
out, _ := OcrClient.Image(img).Out()
fmt.Println(out)
I get the following error:
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x43926a]
goroutine 1 [running]:
runtime.panic(0x664a20, 0x965b48)
/usr/lib/go/src/pkg/runtime/panic.c:266 +0xb6
github.com/otiai10/gosseract.(*Client).Image(0x0, 0x7f008865e7a0, 0xc210059480, 0x0)
/home/halfcrazy/gocode/src/github.com/otiai10/gosseract/client.go:58 +0x13a
main.main()
/home/halfcrazy/gocode/src/school_helper/main.go:29 +0x204
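The `0x0` first argument in the `(*Client).Image` frame suggests the client pointer itself may have been nil, which could happen if the constructor returned an error that the snippet discards with `_`. The following self-contained sketch illustrates that failure mode and one way to guard against it. The `client` type, `newClient`, and `setSrc` are hypothetical stand-ins, not gosseract's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for an OCR client type.
type client struct{ src string }

// newClient is a hypothetical constructor that can fail, for example
// when no supported tesseract installation is detected.
func newClient(available bool) (*client, error) {
	if !available {
		return nil, errors.New("no tesseract version is found")
	}
	return &client{}, nil
}

// setSrc writes through the receiver, so calling it on a nil *client panics.
func (c *client) setSrc(path string) *client {
	c.src = path
	return c
}

// recognize checks the constructor error before touching the client.
func recognize(available bool, path string) (string, error) {
	c, err := newClient(available)
	if err != nil {
		return "", fmt.Errorf("creating client: %w", err)
	}
	return c.setSrc(path).src, nil
}

func main() {
	if _, err := recognize(false, "captcha.jpg"); err != nil {
		fmt.Println("error:", err) // reported instead of panicking
	}
}
```

Checking each returned error in the original snippet (from the client constructor, `jpeg.Decode`, and `Out`) would show which step actually failed.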
txt := goss.Must(goss.Params{
Src: "./source.png",
Digest: "/Users/otiai10/digest.txt",
})
Please translate all comments and issues to English so other non-speaking (I do not even know which languages that is 👽) developers can help with the TODO.
As is (almost vacant)
package gosseract
type tesseract0303 struct {
version string
}
func (t tesseract0303) Version() string {
return t.version
}
func (t tesseract0303) Execute(args []string) (res string, e error) {
res = "tesseract0303"
return
}
I want to install the package on my Mac, but I get the error below. I looked for this file in the project and cannot find it. I think it is missing.
go get github.com/otiai10/gosseract
# github.com/otiai10/gosseract/tesseract
../../otiai10/gosseract/tesseract/tess.cpp:5:10: fatal error: 'tesseract/baseapi.h' file not found
// Generates tmp filepath
func genTmpFilePath() string {
id, _ := uuid.NewV4()
return TMPDIR + "/" + id.String()
}
ioutil.TempFile provides this
TempFile creates a new temporary file in the directory dir with a name beginning with prefix, opens the file for reading and writing, and returns the resulting *os.File. If dir is the empty string, TempFile uses the default directory for temporary files (see os.TempDir). Multiple programs calling TempFile simultaneously will not choose the same file. The caller can use f.Name() to find the pathname of the file. It is the caller's responsibility to remove the file when no longer needed.
package gosseract_test, NOT using gospel
Related #13
go get github.com/otiai10/gosseract
/tmp/go-build043179523/github.com/otiai10/gosseract/tesseract/_obj/wrapper.cgo2.o: In function `_cgo_f34bd392845b_Cfunc_simple':
/usr/share/go/src/pkg/github.com/otiai10/gosseract/tesseract/wrapper.go:35: undefined reference to `simple'
collect2: ld returned 1 exit status
The build failed during tests because of this error too.
https://travis-ci.org/otiai10/gosseract/builds/89569700
tip
You can use any tagged version of Go or use tip to get the latest version.
http://docs.travis-ci.com/user/languages/go/
All go version management is handled by gimme.
such as
Servant
- cmdManager
- fileManager
../../vendor/github.com/otiai10/gosseract/goss.go:3:8: no buildable Go source files in /go/src/vendor/github.com/otiai10/gosseract/tesseract
go build:
CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go build app/server
% go test ./...
# github.com/otiai10/gosseract/tesseract
tesseract/tess.cpp:1:10: fatal error: 'tesseract/baseapi.h' file not found
FAIL github.com/otiai10/gosseract [build failed]
When I use gox -osarch="darwin/amd64" to build, it shows:
1 errors occurred:
--> darwin/amd64 error: exit status 1
Stderr: ../vendor/github.com/otiai10/gosseract/goss.go:3:8: no buildable Go source files in /home/viggo/Documents/GoPath/src/vendor/github.com/otiai10/gosseract/tesseract
But it works when I use gox -osarch="linux/amd64".
My system is Debian 8.
[root@localhost gosseract]# go test
/tmp/go-build841552903/gosseract/_test/gosseract.test: error while loading shared libraries: liblept.so.5: cannot open shared object file: No such file or directory
exit status 127
FAIL gosseract 0.001s
[root@localhost gosseract]# ls /usr/local/lib/
codecs/ liblept.so.5.0.1 libtesseract.so.4
liblept.a libpython3.6m.a libtesseract.so.4.0.0
liblept.la libtesseract.a pkgconfig/
liblept.so libtesseract.la python3.6/
liblept.so.5 libtesseract.so
[root@localhost gosseract]# ls /usr/local/lib/libtesseract.
libtesseract.a libtesseract.so libtesseract.so.4.0.0
libtesseract.la libtesseract.so.4
[root@localhost gosseract]# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
[root@localhost gosseract]# go test
Info in bmfCreate: Generating pixa of bitmap fonts from string
Warning. Invalid resolution 0 dpi. Using 70 instead.
Info in bmfCreate: Generating pixa of bitmap fonts from string
Warning. Invalid resolution 0 dpi. Using 70 instead.
Info in bmfCreate: Generating pixa of bitmap fonts from string
Warning. Invalid resolution 0 dpi. Using 70 instead.
all_test.go at line 37
Expected to be 42
But actual 03:41:26
--- FAIL: Test_Must_WithDigest (0.42s)
all_test.go at line 42
Expected to be <nil>
But actual No tesseract version is found, supporting 3.02~, 3.03~, 3.04~ and 3.05~
--- FAIL: Test_NewClient (0.01s)
--- FAIL: TestClient_Src (0.01s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x520c7d]
goroutine 13 [running]:
testing.tRunner.func1(0xc420069040)
/usr/local/go/src/testing/testing.go:622 +0x29d
panic(0x54a340, 0x8119c0)
/usr/local/go/src/runtime/panic.go:489 +0x2cf
gosseract.TestClient_Src(0xc420069040)
/data/software/src/gosseract/all_test.go:55 +0x3d
testing.tRunner(0xc420069040, 0x57e468)
/usr/local/go/src/testing/testing.go:657 +0x96
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:697 +0x2ca
exit status 2
FAIL gosseract 0.707s
Hi,
I was reading through the Tesseract docs here on improving the OCR output quality. It mentions that setting the tessedit_write_images config variable allows a user to view the input file after initial processing by Tesseract.
Would it be possible to add this feature to the wrapper? It seems that a combination of api->SetVariable("tessedit_write_images", writeimages); and api->GetThresholdedImage() in tess.cpp would allow saving the .tif file. I attempted this in a branch, but I don't have much C++ experience and it didn't seem to work...
Thank you
Is it possible to set the languages like this instead of having to use .Must()?
client, _ := gosseract.NewClient()
client.Languages("eng+heb")
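Tesseract's command line accepts several languages joined with `+` in a single `-l` value, which is the form used above. A small sketch of building that value from separate language codes follows. `joinLanguages` is a hypothetical helper, not a gosseract function.

```go
package main

import (
	"fmt"
	"strings"
)

// joinLanguages is a hypothetical helper (not part of gosseract).
// It drops empty entries and joins the rest with "+",
// the separator used in values such as "eng+heb".
func joinLanguages(langs ...string) string {
	kept := make([]string, 0, len(langs))
	for _, l := range langs {
		if l = strings.TrimSpace(l); l != "" {
			kept = append(kept, l)
		}
	}
	return strings.Join(kept, "+")
}

func main() {
	fmt.Println(joinLanguages("eng", "heb")) // prints "eng+heb"
}
```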
I'd like to use Tesseract with Go on Windows 7.
During installation, I ran the following command as stated in the docs:
c:\go\src\proj>go get github.com/otiai10/gosseract
# github.com/otiai10/gosseract/tesseract
C:\go\src\github.com\otiai10\gosseract\tesseract\tess.cpp:1:31: fatal error: tesseract/baseapi.h: No such file or directory
#include <tesseract/baseapi.h>
^
compilation terminated.
And by searching the file system for the header file baseapi.h, I cannot find it.
How can I solve this? Thank you
I have installed everything, but after building the Go file from the README example, I get an error:
ld: warning: ld: warning: ignoring file /usr/local/lib/liblept.dylib, file was built for x86_64 which is not the architecture being linked (i386): /usr/local/lib/liblept.dylibignoring file /usr/local/lib/libtesseract.dylib, file was built for x86_64 which is not the architecture being linked (i386): /usr/local/lib/libtesseract.dylib
What could be causing this?
Is there a way to edit the PSM argument that gets passed into Tesseract?
I cannot build with go build, nor can I go get github.com/otiai10/gosseract.
I get this error:
go build github.com/otiai10/gosseract/tesseract: C:\go\pkg\tool\windows_386\cgo.exe: exit status 2
I installed CGO, tried restarting my pc, nothing works.
should be automated test
Is there a way to set the language, e.g. eng+heb?