thalesxav / tesseractdotnet
Automatically exported from code.google.com/p/tesseractdotnet
What steps will reproduce the problem?
1. Run the wrapper in VS2010 and call the C# method RetriveResultDetail.
2. Check CharList: it does not return the proper word.
3.
What is the expected output? What do you see instead?
The CharList must contain all the characters of every word. In some cases,
however, two boxes appear in a single word: one box covers the first character
of the word and the other covers the entire word. In that scenario the last
character is missing from the CharList.
What version of the product are you using? On what operating system?
VS 2010 C# wrapper rc2_r552 in windows xp
Please provide any additional information below.
Kindly note the sample OCRed file, which contains two boxes in the word
Tahsil/Taluk. The CharList returns the ASCII values for "Tahsil/Talu" in one
box and "k" as a separate word for the second box. The small box in the image
actually contains the character T, yet the value returned for that box is the
ASCII value of k instead of the ASCII value of T.
Original issue reported on code.google.com by [email protected]
on 9 Nov 2011 at 7:00
Attachments:
The source in svn appears to be out of date. For instance the latest downloads,
tesseractdotnetwrapper_r590.zip and IPoVn_Release_x86.zip at time of writing,
have additional methods and functionality compared to what is in the svn
repository.
There also appear to be two different versions of the
'tesseractenginewrapper.h' and 'tesseractenginewrapper.cpp' files, one under
'.\dotnetwrapper\TesseractEngineWrapper' and another under
'.\dotnetwrapper\Source\api', where the former appears to be an older version.
Assuming I haven't made a mistake, would you be able to update the svn
repository so that we can build tesseractdotnetwrapper_r590 ourselves?
Original issue reported on code.google.com by [email protected]
on 21 Jul 2011 at 11:34
1. Error
1bppIndexed image -> AccessViolationException
Solution in ccmain->output.cpp->
void Tesseract::write_results( //write output
ETEXT_DESC *monitor,
WERD_RES *word, //word to do
BLOCK *block, //block it is from
ROW_RES *row, //row it is from
const STRING &text, //text to write
const STRING &text_lengths) {.....}
This function calls "ocr_append_char" three times, each time passing pix_grey_.
If you change "pix_grey_" to "pix_binary_", the problem improves.
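The fix above changes Tesseract itself. A possible caller-side workaround, which is not taken from the report and is untested against this wrapper, is to expand the 1bpp data to 8 bits per pixel before handing it over. The sketch below assumes MSB-first bit packing (as GDI+ uses for Format1bppIndexed) and that palette index 1 means white; Expand1bppTo8bpp is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a packed 1-bit-per-pixel buffer into one byte per pixel.
// `stride` is the number of bytes per source row (rows may be padded).
// Assumptions: the leftmost pixel is the most significant bit of each byte,
// and palette index 1 is white (255), index 0 is black (0).
std::vector<std::uint8_t> Expand1bppTo8bpp(const std::uint8_t* src,
                                           int width, int height, int stride) {
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::size_t>(y) * stride;
        for (int x = 0; x < width; ++x) {
            const int bit = (row[x / 8] >> (7 - (x % 8))) & 1;
            dst[static_cast<std::size_t>(y) * width + x] = bit ? 255 : 0;
        }
    }
    return dst;
}
```

The resulting buffer has no row padding, so its stride equals the image width.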
2. Error
Large image (more than 127*100 characters in the image) -> AccessViolationException
First solution
tesseract->tesseractenginewrapper.cpp->void
TesseractProcessor::InitializeMonitor(){..}
Change the value of the "fixed_buffer_factor" variable in this function,
for example increase it from 100 to 1000.
Second solution
Add a function to the API that lets .NET users set the "fixed_buffer_factor"
value.
Third solution
The "monitor" variable is an array; change it to a dynamic structure such as a
linked list.
3. Error: encoding problem
When I have spare time I want to look into this in the C++ code, but there is
an easy solution on the .NET platform:
string k= Encoding.UTF8.GetString(Encoding.Default.GetBytes(tesseractProcessor.Apply(bmp)));
I think the RetriveResultDetail function is safer than r590's layout manager.
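For the second error (the fixed-size monitor buffer), the idea of a dynamic structure can be sketched with a container that grows on demand. This is only an illustration: it uses std::vector rather than a linked list, and CharRecord and CharRecordBuffer are hypothetical types, not the real ETEXT_DESC / EANYCODE_CHAR layout used by Tesseract.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-character record; the real Tesseract structures differ.
struct CharRecord {
    int char_code;
    int left, top, right, bottom;
    int confidence;
};

// Stores records in a std::vector, so no "fixed_buffer_factor" style
// capacity has to be chosen up front.
class CharRecordBuffer {
public:
    void Append(const CharRecord& rec) { records_.push_back(rec); }
    std::size_t Count() const { return records_.size(); }
    // Bounds-checked access: throws std::out_of_range instead of reading
    // past the end of the buffer.
    const CharRecord& At(std::size_t i) const { return records_.at(i); }

private:
    std::vector<CharRecord> records_;
};
```

With this shape, an image with more than 127*100 characters only costs more memory; it cannot run past a preallocated array.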
Original issue reported on code.google.com by [email protected]
on 10 Jun 2012 at 10:42
What steps will reproduce the problem?
1. Compile tesseract-ocr-3.02-vs2008
What is the expected output? What do you see instead?
"error 1 error C1083: Can not open include file:“allheaders.h”: No such
file or directory"
What version of the product are you using? On what operating system?
vs2010 64bit
Please provide any additional information below.
I can't find "allheaders.h" in tesseract-ocr-3.02.02 (the source code).
Why doesn't the source code include "allheaders.h"?
Original issue reported on code.google.com by [email protected]
on 29 Jun 2013 at 1:55
What steps will reproduce the problem?
1. bool succed = api->Recognize(monitor) >= 0;
succed is true, but in the function RetriveResultDetail,
int nChars = head->count;
nChars is always zero.
2.
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
tesseract 3.02 r729,Windows XP, VS2008
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 7 Jul 2012 at 1:31
Operating System.
Windows 7 64 bit
Visual Studio 2010 C#
Simple example app downloaded from
tesseractdotnet - Revision 41: /trunk/dotnetwrapper/TesseractEngineWrapper
http://tesseractdotnet.googlecode.com/svn/trunk/dotnetwrapper/TesseractEngineWrapper/
What steps will reproduce the problem?
1. Build the simple example application
2. (5 Warnings) Unreachable code detected
ImageViewer.cs Line 156
Histogram.cs line 201 / 210
GreyImage.cs line 253
RGBImage.cs line 309
What is the expected output?
Tried to run the app Tesseract.OCR.AppEntry.
Error under Windows 7: the application stopped working.
Original issue reported on code.google.com by [email protected]
on 22 May 2011 at 6:07
var _ocr = new TesseractProcessor();
_ocr.SetPageSegMode(ePageSegMode.PSM_SINGLE_CHAR);
_ocr.SetVariable("tessedit_char_whitelist",
"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");
_ocr.Init(Program.AppPath + "tessdata\\", "eng",
(int)Enums.EOcrEngineMode.TesseractOnly);
-- Expecting only alphanumeric output, but I'm getting all kinds of odd
characters such as !*&().
I've read that the blacklist overrides the whitelist if the blacklist is null
or empty... Does this mean that the whitelist is ignored if the blacklist isn't
specified?
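Independently of why the whitelist is not honoured, one stopgap is to filter the recognized text afterwards. The sketch below is a workaround, not a fix, and FilterToWhitelist is a hypothetical helper. As a separate thing to try: the native Tesseract API documents SetVariable as taking effect only when called after Init, so moving the SetVariable call after Init may help; whether this wrapper behaves the same way is an assumption.

```cpp
#include <string>

// Keep only characters that appear in `whitelist`.
// Spaces and newlines are preserved so word and line breaks survive.
std::string FilterToWhitelist(const std::string& text,
                              const std::string& whitelist) {
    std::string out;
    for (char c : text) {
        if (c == ' ' || c == '\n' || whitelist.find(c) != std::string::npos) {
            out.push_back(c);
        }
    }
    return out;
}
```

Note that this drops unwanted characters rather than re-recognizing them, so a misread "(" stays lost instead of becoming the "C" it may have been.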
What version of the product are you using? On what operating system?
r591 on windows 7
Please provide any additional information below.
Also, I need to get at the confidence level for the characters...
Original issue reported on code.google.com by [email protected]
on 24 Aug 2011 at 5:52
Hi,
I'm working with Visual Studio 2010 on Windows 7 and need to know how to add
the French language. Do I also need to know which version of Emgu I have? If
so, how do I find out?
Thanks
Original issue reported on code.google.com by [email protected]
on 9 Aug 2014 at 2:35
What steps will reproduce the problem?
I am using the vs 3 .net wrapper.
When I run the function Recognize it ocrs the image fine and I can get
the string.
I need the confidence level of each character, but it is always 0.
What am I doing wrong?
Dim image As New Bitmap("C:\MyImage.tif")
Dim ocr As New TesseractProcessor
ocr.Init(Nothing, "eng", False)
Console.WriteLine(ocr.Recognize(image))
ocr.InitForAnalysePage()
ocr.SetVariable("tessedit_thresholding_method", "1")
ocr.SetVariable("save_best_choices", "T")
Dim doc As DocumentLayout = ocr.AnalyseLayout(image)
For Each blk As OCR.TesseractWrapper.Block In doc.Blocks
    Console.WriteLine("Block Confidence: " & blk.Confidence)
    For Each para As Paragraph In blk.Paragraphs
        Console.WriteLine("para Confidence: " & para.Confidence)
        For Each ln As TextLine In para.Lines
            Console.WriteLine("ln Confidence: " & ln.Confidence)
            For Each wrd As Word In ln.Words
                Console.WriteLine("wrd Confidence: " & wrd.Confidence)
                Console.WriteLine("wrd Text: " & wrd.Text)
                For Each ch As Character In wrd.CharList
                    Console.WriteLine("V:" & ch.Value)
                    Console.WriteLine("C:" & ch.Confidence)
                Next
            Next
        Next
    Next
Next
What is the expected output? What do you see instead?
The confidence is always zero.
What version of the product are you using? On what operating system?
tesseract engine 3.x .net wrapper v1.0 RC2
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 15 Mar 2012 at 2:19
What steps will reproduce the problem?
1. Attached sample files can be OCRed using non .net wrapper
2. But cannot be OCRed using .Net wrapper; It gives all garbage
3.
What is the expected output? What do you see instead?
If the characters are not broken, the .NET wrapper works great. But the attached
images come from dot-matrix printouts.
If legacy Tesseract can OCR the sample images, why can't the .NET wrapper?
Also, how can we update the "eng.traineddata" file for the .NET wrapper,
especially if it is possible to update "eng.traineddata" in legacy Tesseract?
What version of the product are you using? On what operating system?
tesseractdotnetwrapper_r590
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 27 Jun 2014 at 1:42
Attachments:
What steps will reproduce the problem?
1. Ran it with PSM 0/1/6, yet did not see auto-rotation of the image
What is the expected output? What do you see instead?
Was expecting the image to be rotated to the correct orientation
What version of the product are you using? On what operating system?
latest build, win 7
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 31 Aug 2012 at 7:10
What steps will reproduce the problem?
1. _ocr.Apply(image)
2. string result = _ocr.Apply(image);
3. List<Word> detectedWords = _ocr.RetriveResultDetail();
What is the expected output? What do you see instead?
If the image contains only letters, the word list should contain words made of
letters only; however, I get words containing numbers.
What version of the product are you using? On what operating system?
Windows XP, Dotnet wrapper 3.02 with tessdata version 3.01
Please provide any additional information below.
tessdata language "eng"
Original issue reported on code.google.com by [email protected]
on 6 Jun 2015 at 8:24
What steps will reproduce the problem?
1. After 3.2, eng.traineddata is big (> 20 MB). If I use it in VS2012, I get
a System.AccessViolationException at: string iden =
ocr.ToCR(bitmap);
and the command line displays: actual_tessdata_num_entries_ <=
TESSDATA_NUM_ENTRIES:Error:Assert failed: in file ..\ccutil\tessdatamanager.cpp,
line 48
If I use a smaller traineddata file, the problem does not occur. So please
build a 3.2 version soon.
Original issue reported on code.google.com by [email protected]
on 12 Jan 2013 at 4:40
What steps will reproduce the problem?
1. Create new WinForms project
2. Add reference to tesseractengine3.dll
3. var x = new TesseractProcessor();
What is the expected output? What do you see instead?
Main form window
What version of the product are you using? On what operating system?
RC
Please provide any additional information below.
A System.IO.FileLoadException is thrown.
This exception is thrown if the file is not a valid .NET Framework assembly.
Many thanks for this library. I have been searching for something like it for
the last 2 weeks.
Original issue reported on code.google.com by [email protected]
on 28 Feb 2011 at 2:15
Hello everybody! Can you help me resolve the problems below?
1. How do I pre-process an image (binarization, noise detection and
reduction, skew and orientation detection...) using the Tesseract OCR 3.01
source code?
2. How do I combine "page layout analysis" with character recognition (using
the Tesseract OCR 3.01 source code)?
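As one illustration of the binarization step asked about here, the sketch below applies Otsu's method to an 8-bit greyscale buffer. It is a generic, standalone example and does not use or reflect Tesseract's own thresholding code; OtsuThreshold and Binarize are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pick the threshold that maximizes the between-class variance (Otsu).
int OtsuThreshold(const std::vector<std::uint8_t>& gray) {
    if (gray.empty()) return 0;
    std::size_t hist[256] = {0};
    for (std::uint8_t v : gray) ++hist[v];

    const double total = static_cast<double>(gray.size());
    double sumAll = 0.0;
    for (int t = 0; t < 256; ++t) sumAll += t * static_cast<double>(hist[t]);

    double sumBg = 0.0, weightBg = 0.0, bestVar = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightBg += static_cast<double>(hist[t]);
        if (weightBg == 0.0) continue;          // no background pixels yet
        const double weightFg = total - weightBg;
        if (weightFg == 0.0) break;             // everything is background
        sumBg += t * static_cast<double>(hist[t]);
        const double meanBg = sumBg / weightBg;
        const double meanFg = (sumAll - sumBg) / weightFg;
        const double var =
            weightBg * weightFg * (meanBg - meanFg) * (meanBg - meanFg);
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}

// Pixels above the threshold become white (255), the rest black (0).
std::vector<std::uint8_t> Binarize(const std::vector<std::uint8_t>& gray) {
    const int t = OtsuThreshold(gray);
    std::vector<std::uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) out[i] = gray[i] > t ? 255 : 0;
    return out;
}
```

A single global threshold like this works best on evenly lit scans; noise reduction and skew detection are separate steps not covered here.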
Original issue reported on code.google.com by [email protected]
on 11 Aug 2011 at 1:28
What steps will reproduce the problem?
1. I downloaded VietOCR.Net 3.2 and made the changes I wanted. It runs well
when I run the project from Visual Studio.
2. I then tried to build a setup file. No error occurs while the setup is
being created.
3. But when I open the program from Start -> Programs after installing, it
shows the following error
What is the expected output? What do you see instead?
Could not load file or assembly 'tesseract, Version=0.0.0.0, Culture=neutral,
PublicKeyToken=null' or one of its dependencies. The system cannot find the
file specified.
What version of the product are you using? On what operating system?
visual studio 2008
Please provide any additional information below.
I have included the setup file that I generated.
Can anyone please help with my issue?
Original issue reported on code.google.com by [email protected]
on 20 Mar 2012 at 1:18
Attachments:
I have done the training as specified on the site for the Burmese language.
Instead of using another scanned page, I am trying to use the same image that
I used for training Tesseract, so this procedure should give maximum accuracy.
What steps will reproduce the problem?
1. Please find attached the trained data and the TIFF file I used for training
(for testing I used a 300 dpi paper-scan TIFF image).
2. Run Tesseract on the same image with the attached trained data.
3. Tesseract still confuses the characters. Accuracy is only 60%.
What is the expected output? What do you see instead?
Since the same training image is used for recognition, the accuracy should be
high.
I am not sure why Tesseract has problems identifying the characters.
Please help me with how to proceed.
What version of the product are you using? On what operating system?
Tesseract 3.02 on windows 7 64 bit
Original issue reported on code.google.com by [email protected]
on 10 Jul 2013 at 5:43
What steps will reproduce the problem?
1. Checked out the project via svn (svn co
https://tesseractdotnet.googlecode.com/svn/trunk/dotnetwrapper)
2. Opened in Visual Studio 2010
3. Hit F5 to see the error; running it without debugging simply closes the
program
What is the expected output? What do you see instead?
I expected the form application to load, instead it simply closes
What version of the product are you using? On what operating system?
Revision 48, Windows 7, on Visual Studio 2010
Please provide any additional information below.
Just downloaded the project today and figured I'd play around with it and see
what results I get. However, when I'm trying to run the project I get the below
error:
Could not load file or assembly 'tesseractengine3, Version=0.0.0.0,
Culture=neutral, PublicKeyToken=null' or one of its dependencies. The
application has failed to start because its side-by-side configuration is
incorrect. Please see the application event log or use the command-line
sxstrace.exe tool for more detail. (Exception from HRESULT: 0x800736B1)
at Tesseract.OCR.AppEntry.MainForm..ctor()
at Tesseract.OCR.AppEntry.Program.Main() in C:\Users\Guest\Desktop\dotnetwrapper\TesseractBasedOCRAnalysis\Tesseract.OCR.AppEntry\Program.cs:line 36
Line 36 refers to the new MainForm I found, but the actual error break occurs
on line 21 in the Main.Form.Designer.cs where this.end() is called. I feel that
my issue is simple since I haven't seen it on the forums. Anyway, thanks for
any help :D
Original issue reported on code.google.com by [email protected]
on 9 Aug 2011 at 8:29
What steps will reproduce the problem?
1. Create a WPF project in Visual Studio 2012 (C#, target framework .NET 4.0,
platform set to x86)
2. Add a reference to tesseractengine3.dll
3. Create TesseractProcessor tp = new TesseractProcessor();
4. Compile in Debug and Release; the result is the same.
5. I get "unhandled exception of type
'System.Windows.Markup.XamlParseException' occurred in
PresentationFramework.dll"
Additional information: 'The invocation of the constructor on type
'ProjectTest.MainWindow' that matches the specified binding constraints threw
an exception.' Line number '3' and line position '9'.
What is the expected output? What do you see instead?
Just the empty window.
What version of the product are you using? On what operating system?
Visual studio 2012 express with .net framework 4.0
Please provide any additional information below.
I had done an OCR project in Visual Studio 2008 and it worked very well.
I am trying this project in VS2012 because the Kinect for Windows SDK is
compatible with VS2010 and newer, and that is where this error occurs.
I have spent 1 week on this error with no solution.
Thanks for your help.
Original issue reported on code.google.com by [email protected]
on 30 Aug 2013 at 10:42
Attachments:
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
Tesseract Net Wrapper x64
What version of the product are you using? On what operating system?
Win 7 x64
Please provide any additional information below.
Hello all,
Has anyone succeeded in compiling this project as a native x64 dll?
Is it possible?
The unfortunate situation is the current dll is compiled as x86 so
it cannot be included in an x64/any CPU project in windows.
Thanks for any assistance
Original issue reported on code.google.com by [email protected]
on 6 Sep 2011 at 12:55
What steps will reproduce the problem?
1. _ocrProcessor.Apply(image_object)
2.
3.
What is the expected output? What do you see instead?
I expect it to accept any image of type System.Drawing.Image. Instead, i'm
getting a corrupt memory message.
What version of the product are you using? On what operating system?
1.0 on windows vista 32 bit
Please provide any additional information below.
Whenever I save the image as a TIFF file using FreeImage.NET, the wrapper has no
problems loading the TIFF file from the path and OCRing it, but when I pass a
TIFF image object into the Apply method, the wrapper complains about corrupt
memory. I've also tried bitmaps with the same results. It would be nice if the
wrapper took any System.Drawing.Image object and converted the image into a
format that Tesseract will not choke on.
One more thing: I'm also not receiving the IList<Word> results when calling
RecieveResults. Other than that, I want to thank the author for the time and
effort put into this library. I really appreciate it.
Original issue reported on code.google.com by [email protected]
on 10 May 2011 at 11:57
What steps will reproduce the problem?
1. processor.Recognize(bmp) or processor.AnalyseLayout(bmp)
Tesseract 2 made it possible to retrieve the confidence and position of each
word through the OCRWord class. When testing with the latest version, I noticed
that this class is empty: it doesn't contain a word, and the confidence is
always 0, even when I OCRed an image upside down.
How can I access these values?
Original issue reported on code.google.com by [email protected]
on 12 Jul 2011 at 12:14
What steps will reproduce the problem?
1. Load tesseract.dll into a VS2008 VB.NET project
2. Go to Object Browser or try to use functions in question.
What is the expected output? What do you see instead?
The functions appear when the same DLL is loaded into a C# project. It is
expected they would appear in a VB.NET project. They do not.
What version of the product are you using? On what operating system?
Version in 7/4/2011 build. tesseract.dll SHA1 is
146404737CE2D6F1A934BE54FF5A0817BEC82A81.
Please provide any additional information below.
The functions in question (detailed below) appear when using tesseract.dll in a
C# project. However, when you bring the DLL into a VB.NET project, the
functions are nowhere to be found.
Functions in question:
AnalyzeLayoutBinaryImage(byte*, int, int)
AnalyzeLayoutGreyImage(byte*, int, int)
AnalyzeLayoutGreyImage(ushort*, int, int)
RecognizeBinaryImage(byte*, int, int)
RecognizeGreyImage(byte*, int, int)
RecognizeBinaryImage(ushort*, int, int)
Am I missing a setting or a tweak in the VS2008 environment that would bring
them in? Are there functions that aren't supposed to show up in VB.NET? Any
help or guidance provided would be appreciated.
Original issue reported on code.google.com by [email protected]
on 6 Jun 2013 at 10:27
What steps will reproduce the problem?
1. Two unexpected power interruptions occurred while using OCR
2. Uninstalling and reinstalling did not help
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
using win 7 32 bit
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 19 Mar 2014 at 5:05
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
To be able to build the .dll in Release mode.
What version of the product are you using? On what operating system?
r48; Win7 64-bit
Please provide any additional information below.
The .dll is built fine in Debug mode; however, when the Solution/Project is
switched to Release mode, no .dll is generated.
Original issue reported on code.google.com by [email protected]
on 17 Jul 2011 at 1:31
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
Expect to receive Unicode string; got UTF-8 string instead.
What version of the product are you using? On what operating system?
r42, Win7 64-bit
Please provide any additional information below.
Modify TesseractProcessor::Process(TessBaseAPI* api, Pix* pix) method in
TesseractRecognizer.cpp as follows:
Old:
String* result = new String(text);
New:
String* result = new String(text, 0, strlen(text), Encoding::UTF8);
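The change above works because the text returned by Tesseract is UTF-8, and the String constructor has to be told so. As a standalone illustration of what that decoding involves, here is a minimal UTF-8 to UTF-16 converter in standard C++. Utf8ToUtf16 is a hypothetical helper, not part of the wrapper, and it is deliberately simple: it does not reject overlong encodings or out-of-range code points.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into UTF-16 code units.
// Invalid lead bytes, truncated sequences and bad continuation bytes are
// replaced with U+FFFD and decoding resumes at the next byte.
std::u16string Utf8ToUtf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }

        bool ok = len != 0 && i + len <= utf8.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            ok = (c & 0xC0) == 0x80;            // must be a 10xxxxxx byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) {
            out.push_back(static_cast<char16_t>(0xFFFD));
            ++i;
            continue;
        }
        if (cp >= 0x10000) {                    // needs a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += len;
    }
    return out;
}
```

For example, the two bytes 0xC3 0xA9 decode to the single code unit U+00E9 (é), which is why treating the buffer as single-byte text shows two wrong characters instead.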
Original issue reported on code.google.com by [email protected]
on 3 Jul 2011 at 2:13
What steps will reproduce the problem?
1. Using libtesseract303.dll with the JNA wrapper tess4j on a Windows Server
2012 machine
2. tessAPI1.java should read libtesseract303 from the respective path
3.
What is the expected output? What do you see instead?
should read the text from the image.
What version of the product are you using? On what operating system?
tess4j 64 bit version 64 bit dlls on windows server 2012
Please provide any additional information below.
JNA looks for libtesseract303.dll in the respective path but is not able to read
the file. I think it may be due to a compatibility issue between the DLLs and
Windows Server 2012.
Kindly provide a solution.
Original issue reported on code.google.com by [email protected]
on 11 Dec 2014 at 10:46
I'm passing the TIFF to tesseract.doOCR(imageFile); but before doing this, I
want every page of my multipage TIFF to be in portrait orientation.
How can I achieve this? I'm attaching a TIFF file. I want each page to be in
portrait before processing it.
Thanks In advance.
Original issue reported on code.google.com by [email protected]
on 3 Oct 2011 at 5:15
Attachments:
What steps will reproduce the problem?
1. Followed Steps outlined in the Wiki
What is the expected output? What do you see instead?
Expected compilation of DLL. Fails with one error
"error C2061: syntax error : identifier 'FILE'"
"d:\tesseract-ocr\api\baseapi.h 134"
What version of the product are you using? On what operating system?
VS2008, 32 bit Vista.
Please provide any additional information below.
After making the following changes to the tesseract project:
Configuration Type: Dynamic Library (.dll)
Common Language Runtime Support: Old Syntax (/clr:oldSyntax)
Output File: tesseractengine3.dll
(the System and System.Drawing assemblies also need to be added)
tesseractengine3.dll DOES compile.
After adding the tesseractenginewrapper.h and tesseractenginewrapper.cpp
files, the project will not compile.
Original issue reported on code.google.com by [email protected]
on 2 Mar 2011 at 3:15
What steps will reproduce the problem?
This wrapper is great. How can I make it compatible with the latest version, r581?
I downloaded the demo source you supplied and compiled it; on startup it
displayed "Failed to initialize Tesseract Engine 3.01". I noticed that the
selected Tesseract data path always has a slash appended to it (see the
attachment below). I remember that the Tesseract 2.04 data path must not have a
trailing slash. I removed the trailing slash, recompiled and ran it, and it works.
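The workaround described above (removing the trailing slash from the data path) can be sketched as a small helper. StripTrailingSeparators is a hypothetical name, and whether a given Tesseract or wrapper version wants the trailing separator or not is exactly the open question here, so treat this as something to experiment with rather than a rule.

```cpp
#include <string>

// Remove trailing '/' or '\' characters from a path.
// A lone separator ("/") and a drive root such as "C:\" are left intact.
std::string StripTrailingSeparators(std::string path) {
    while (path.size() > 1 &&
           (path.back() == '/' || path.back() == '\\') &&
           !(path.size() == 3 && path[1] == ':')) {
        path.pop_back();
    }
    return path;
}
```

For instance, `StripTrailingSeparators("C:\\tessdata\\")` yields `C:\tessdata`, while `C:\` is returned unchanged.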
Sorry for my bad English.
What version of the product are you using? On what operating system?
win7 32bit vs2008
Original issue reported on code.google.com by [email protected]
on 12 Apr 2011 at 8:24
Attachments:
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
r42, Win7 64-bit
Please provide any additional information below.
Add a method to recognize a rectangular region of the image. The changes
follow.
Add to TesseractRecognizer.cpp:
String* TesseractProcessor::Recognize(System::Drawing::Image* image,
                                      System::Drawing::Rectangle rect)
{
    if (_apiInstance == null || image == null)
        return null;
    String* result = "";
    Pix* pix = null;
    try
    {
        pix = PixConverter::PixFromImage(image);
        /* restrict recognition to rect unless it is empty */
        if (rect != System::Drawing::Rectangle::Empty)
            this->EngineAPI->SetRectangle(rect.Left, rect.Top, rect.Width, rect.Height);
        result = this->Process(this->EngineAPI, pix);
    }
    catch (System::Exception* exp)
    {
        throw exp;
    }
    __finally
    {
        /* always release the Leptonica image */
        if (pix != null)
        {
            pixDestroy(&pix);
            pix = null;
        }
    }
    return result;
}
Add a declaration to TesseractEngineWrapper.h:
String* Recognize(System::Drawing::Image* image, System::Drawing::Rectangle rect);
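The patch passes the rectangle straight through to SetRectangle. What happens when the rectangle extends past the image is not stated, so a caller may want to clamp it first. The sketch below shows one way to do that in plain C++; Rect and ClampToImage are hypothetical stand-ins for System::Drawing::Rectangle and are not part of the wrapper.

```cpp
#include <algorithm>

// Hypothetical stand-in for System::Drawing::Rectangle.
struct Rect {
    int left, top, width, height;
};

// Intersect `r` with the image area [0, imageWidth) x [0, imageHeight).
// Returns a zero-sized rectangle when there is no overlap.
Rect ClampToImage(const Rect& r, int imageWidth, int imageHeight) {
    const int x0 = std::max(r.left, 0);
    const int y0 = std::max(r.top, 0);
    const int x1 = std::min(r.left + r.width, imageWidth);
    const int y1 = std::min(r.top + r.height, imageHeight);
    if (x1 <= x0 || y1 <= y0) {
        return Rect{0, 0, 0, 0};
    }
    return Rect{x0, y0, x1 - x0, y1 - y0};
}
```

A zero-sized result is a signal to skip recognition for that region rather than to call the engine with an empty rectangle.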
Original issue reported on code.google.com by [email protected]
on 3 Jul 2011 at 2:21
What steps will reproduce the problem?
1. Use the application with French text
2.
3.
What is the expected output? What do you see instead?
Special characters (é, è, à...) are not recognized correctly
What version of the product are you using? On what operating system?
Latest version, Windows
Please provide any additional information below.
The bug could be fixed in tesseractenginewrapper.cpp:
static wchar_t *make_unicode_string(const char *utf8)
{
    int size = 0, out_index = 0;
    wchar_t *out;
    /* first calculate the size of the target string */
    int used = 0;
    int utf8_len = strlen(utf8);
    while (used < utf8_len) {
        int step = UNICHAR::utf8_step(utf8 + used);
        if (step == 0)
            break;
        used += step;
        ++size;
    }
    out = (wchar_t *) malloc((size + 1) * sizeof(wchar_t));
    if (out == NULL)
        return NULL;
    /* now convert to Unicode */
    used = 0;
    while (used < utf8_len) {
        int step = UNICHAR::utf8_step(utf8 + used);
        if (step == 0)
            break;
        UNICHAR ch(utf8 + used, step);
        out[out_index++] = ch.first_uni();
        used += step;
    }
    out[out_index] = 0;
    return out;
}
System::Collections::Generic::List<Word*>*
TesseractProcessor::RetriveResultDetail()
{
    if (!_doMonitor || _monitorInstance == null)
        return null;
    System::Collections::Generic::List<Word*>* wordList = null;
    ETEXT_DESC* monitor = null;
    ETEXT_DESC* head = null;
    Word* currentWord = null;
    try
    {
        monitor = (ETEXT_DESC*)_monitorInstance.ToPointer();
        head = &monitor[1];
        int lineIdx = 0;
        int nChars = head->count;
        int i = 0;
        int j;
        while (i < nChars)
        {
            EANYCODE_CHAR* ch = &(head + i)->text[0];
            if (ch->blanks > 0)
            { /*new word condition meets*/
                if (currentWord != null)
                    wordList = currentWord->UpdateConfidenceAndInsertTo(wordList);
                currentWord = null; // reset current word
            }
            if (currentWord != null &&
                (ch->left <= currentWord->Left || ch->top >= currentWord->Bottom))
            { /*new line condition meets*/
                wordList = currentWord->UpdateConfidenceAndInsertTo(wordList);
                lineIdx++;
                currentWord = null; // reset current word
            }
            if (currentWord == null)
            { /*create new word*/
                currentWord = new Word();
                currentWord->LineIndex = lineIdx;
                currentWord->FontIndex = ch->font_index;
                currentWord->PointSize = ch->point_size;
                currentWord->Formating = ch->formatting;
            }
            /* collect the UTF-8 bytes that share this character's box;
               stop early so unistr (and its terminator) cannot overflow */
            unsigned char unistr[24];
            for (j = i; j < nChars && (j - i) < (int)sizeof(unistr) - 1; j++)
            {
                const EANYCODE_CHAR* unich = &(head + j)->text[0];
                if (ch->left != unich->left || ch->right != unich->right ||
                    ch->top != unich->top || ch->bottom != unich->bottom)
                    break;
                unistr[j - i] = static_cast<unsigned char>(unich->char_code);
            }
            unistr[j - i] = '\0';
            wchar_t *utf16ch=make_unicode_string(reinterpret_cast<const char*>(unistr));
            if (utf16ch == NULL)
                break; /* allocation failed; keep what was collected so far */
            Character* c = new Character(
                static_cast<char>(*utf16ch),
                ch->confidence,
                ch->left, ch->top, ch->right, ch->bottom);
            /* update current word */
            currentWord->CharList->Add(c);
            System::String* sc = new String(*utf16ch, 1);
            currentWord->Text = System::String::Format(
                "{0}{1}", currentWord->Text->ToString(), sc);
            free(utf16ch);
            currentWord->Left = Math::Min(currentWord->Left, (int)ch->left);
            currentWord->Top = Math::Min(currentWord->Top, (int)ch->top);
            currentWord->Right = Math::Max(currentWord->Right, (int)ch->right);
            currentWord->Bottom = Math::Max(currentWord->Bottom, (int)ch->bottom);
            currentWord->Confidence += ch->confidence;
            i=j; /*go to next char*/
        } /* end while */
        if (currentWord != null)
            wordList = currentWord->UpdateConfidenceAndInsertTo(wordList);
    }
    catch (System::Exception* exp)
    {
        throw exp;
    }
    __finally
    {
        currentWord = null;
        head = null;
        monitor = null;
    }
    return wordList;
}
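The word-building logic in RetriveResultDetail can be hard to follow inside the managed code, so here is the same grouping idea as a standalone standard C++ sketch. CharBox, WordBox and GroupIntoWords are hypothetical names, and the sketch assumes that the word confidence is the mean of its character confidences; what UpdateConfidenceAndInsertTo really computes is not shown in this report.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct CharBox { char value; int blanks; int left, top, right, bottom; int confidence; };
struct WordBox { std::string text; int left, top, right, bottom; double confidence; };

// Group characters into words. A new word starts when a character is preceded
// by blanks, or when it lies left of / below the current word (a new line).
std::vector<WordBox> GroupIntoWords(const std::vector<CharBox>& chars) {
    std::vector<WordBox> words;
    bool open = false;          // true while words.back() is still being built
    int confidenceSum = 0;
    auto closeWord = [&]() {
        if (!open) return;
        // Assumption: word confidence = average character confidence.
        words.back().confidence =
            static_cast<double>(confidenceSum) / words.back().text.size();
        open = false;
    };
    for (const CharBox& c : chars) {
        if (c.blanks > 0) closeWord();
        if (open && (c.left <= words.back().left || c.top >= words.back().bottom))
            closeWord();
        if (!open) {
            words.push_back(WordBox{std::string(), c.left, c.top, c.right, c.bottom, 0.0});
            confidenceSum = 0;
            open = true;
        }
        WordBox& w = words.back();
        w.text.push_back(c.value);
        w.left = std::min(w.left, c.left);
        w.top = std::min(w.top, c.top);
        w.right = std::max(w.right, c.right);
        w.bottom = std::max(w.bottom, c.bottom);
        confidenceSum += c.confidence;
    }
    closeWord();
    return words;
}
```

Unlike the wrapper code, this version handles only single-byte characters; the multi-byte merging done via make_unicode_string is left out to keep the grouping rule visible.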
Original issue reported on code.google.com by [email protected]
on 25 May 2011 at 4:11
hi ,
I have searched a lot of forums for an easy Tesseract example for OCRing an
image in VB.NET, but I can't find a simple or complete answer that covers what
I should reference, how to initialise Tesseract, and how to process the
image.
If anyone has the time to explain this to me, thanks in advance.
Original issue reported on code.google.com by [email protected]
on 11 Jun 2015 at 5:02
What steps will reproduce the problem?
1. Process analyze
2.
3.
What is the expected output? What do you see instead?
List with tesseract.words
What version of the product are you using? On what operating system?
Windows XP x86
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 31 Mar 2011 at 6:55