tesseract-ocr's Introduction

Viam OCR Vision Service Module

This Viam vision service module uses tesseract-ocr through the gosseract wrapper to extract text from images. One example use is reading license plates to open a gate automatically.

Tesseract is powerful and exposes a large number of configuration parameters. The most important setting is the page segmentation mode, tessedit_pageseg_mode. This article gives a good overview of the different modes: https://pyimagesearch.com/2021/11/15/tesseract-page-segmentation-modes-psms-explained-how-to-improve-your-ocr-accuracy/
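
As a quick reference, the sketch below maps each page segmentation mode to the description printed by `tesseract --help-psm` in Tesseract 4 and 5, and checks a configured value against that list. The helper name `describePSM` is hypothetical and is not part of this module or of gosseract.

```go
package main

import (
	"fmt"
	"strconv"
)

// psmDescriptions lists the page segmentation modes 0-13 of Tesseract 4 and 5.
var psmDescriptions = map[int]string{
	0:  "orientation and script detection (OSD) only",
	1:  "automatic page segmentation with OSD",
	2:  "automatic page segmentation, no OSD or OCR",
	3:  "fully automatic page segmentation, no OSD (default)",
	4:  "single column of text of variable sizes",
	5:  "single uniform block of vertically aligned text",
	6:  "single uniform block of text",
	7:  "single text line",
	8:  "single word",
	9:  "single word in a circle",
	10: "single character",
	11: "sparse text, in no particular order",
	12: "sparse text with OSD",
	13: "raw line, bypassing Tesseract-specific hacks",
}

// describePSM is a hypothetical helper. It parses the string value used for
// "tessedit_pageseg_mode" and returns the description of that mode.
func describePSM(value string) (string, error) {
	mode, err := strconv.Atoi(value)
	if err != nil {
		return "", fmt.Errorf("page segmentation mode %q is not a number: %w", value, err)
	}
	desc, ok := psmDescriptions[mode]
	if !ok {
		return "", fmt.Errorf("page segmentation mode %d is outside 0-13", mode)
	}
	return desc, nil
}

func main() {
	desc, err := describePSM("7")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("mode 7:", desc)
}
```

Mode 7 treats the whole image as one line of text, which suits a cropped license plate.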

Configure Component

Add this sample configuration to the smart machine "components" part, either in raw JSON mode or through the web user interface by choosing "local service" in the menu.

    {
      "name": "license-plates",
      "type": "vision",
      "namespace": "rdk",
      "model": "felixreichenbach:vision:ocr",
      "attributes": {
        "languages": [
          "eng"
        ],
        "parameters": {
          "tessedit_char_blacklist": "*+",
          "tessedit_pageseg_mode": "7"
        },
        "tessdata_local": "./tessdata/",
        "tessdata_remote": "https://github.com/tesseract-ocr/tessdata_fast/raw/main/"
      }
    }
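
To make the shape of the "attributes" object concrete, here is a minimal sketch that decodes it with Go's standard library. The type `ocrAttributes` and its helpers are hypothetical and are not taken from the module's source. The sketch also assumes that each language model is a file named `<lang>.traineddata` directly under `tessdata_remote`; the README does not confirm how the module actually uses that URL.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ocrAttributes mirrors the "attributes" object of the sample configuration.
// The type is hypothetical, not the module's own code.
type ocrAttributes struct {
	Languages      []string          `json:"languages"`
	Parameters     map[string]string `json:"parameters"`
	TessdataLocal  string            `json:"tessdata_local"`
	TessdataRemote string            `json:"tessdata_remote"`
}

// parseAttributes decodes the raw JSON of the "attributes" object.
func parseAttributes(raw []byte) (*ocrAttributes, error) {
	var attrs ocrAttributes
	if err := json.Unmarshal(raw, &attrs); err != nil {
		return nil, fmt.Errorf("decoding attributes: %w", err)
	}
	return &attrs, nil
}

// remoteModelURLs builds one URL per language. ASSUMPTION: models are named
// "<lang>.traineddata" and live directly under tessdata_remote.
func (a *ocrAttributes) remoteModelURLs() []string {
	base := strings.TrimSuffix(a.TessdataRemote, "/") + "/"
	urls := make([]string, 0, len(a.Languages))
	for _, lang := range a.Languages {
		urls = append(urls, base+lang+".traineddata")
	}
	return urls
}

func main() {
	raw := []byte(`{
		"languages": ["eng"],
		"parameters": {
			"tessedit_char_blacklist": "*+",
			"tessedit_pageseg_mode": "7"
		},
		"tessdata_local": "./tessdata/",
		"tessdata_remote": "https://github.com/tesseract-ocr/tessdata_fast/raw/main/"
	}`)

	attrs, err := parseAttributes(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("page segmentation mode:", attrs.Parameters["tessedit_pageseg_mode"])
	for _, u := range attrs.remoteModelURLs() {
		fmt.Println("model URL:", u)
	}
}
```

Note that the parameter values are JSON strings ("7", not 7), which is why the sketch decodes "parameters" into a map of strings.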

You can find a table of all possible tesseract configuration attributes here.

Build the Module

From within the "src" directory run:

go build -o ../bin/ocr .

Build on macOS

To build the module successfully, the following libraries are required. I also uninstalled leptonica and tesseract with brew, ignoring dependencies, as mentioned here: otiai10/gosseract#234 (comment)

# Download leptonica 1.78.0; a newer release (1.84.1) is available at
# https://github.com/DanBloomberg/leptonica/releases/tag/1.84.1
wget http://www.leptonica.org/source/leptonica-1.78.0.tar.gz
tar -xzvf leptonica-1.78.0.tar.gz
cd leptonica-1.78.0
./configure
make && sudo make install
brew install automake

git clone https://github.com/tesseract-ocr/tesseract.git
cd tesseract
./autogen.sh
./configure
make
sudo make install

Build on Ubuntu

sudo apt install build-essential

sudo apt install pkg-config

# If you see "Unable to find a valid copy of libtoolize or glibtoolize in your PATH!":
sudo apt-get install libtool

sudo apt install libjpeg-dev

sudo apt-get install libleptonica-dev

# Install tesseract libraries
git clone https://github.com/tesseract-ocr/tesseract.git
cd tesseract
./autogen.sh
./configure
# Running make separately does not seem to be required; it happens as part of the next step
sudo make install

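# From the module's "src" directory, build the module
go build -o ../bin/ocr .

# Alternatively, build for arm64 with cgo enabled and the tesseract library linked statically
CGO_ENABLED=1 GOARCH=arm64 go build --ldflags '-extldflags "-fopenmp -L/usr/local/lib/ -Bstatic -ltesseract"' -o ../bin/tesseract-ocr .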

# Check which shared libraries the binary links against
ldd fileName

# If the tesseract library is reported as missing, refresh the linker cache
sudo ldconfig


# Build the Go binary with the tesseract library linked statically
go build --ldflags '-extldflags "-fopenmp -L/usr/local/lib/ -Bstatic -ltesseract"' -o ../bin/tesseract-ocr .
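
The `ldd` check above can also be approximated from Go. The following is a minimal sketch that uses the standard `debug/elf` package to list the shared libraries an ELF binary declares as needed, then looks for one whose name starts with `libtesseract`. The helper names are hypothetical. Unlike `ldd`, this reads only the binary's direct dependencies and does not resolve them to paths on disk, and it works on ELF (Linux) binaries only.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// importedLibraries returns the shared libraries that the ELF binary at path
// declares as direct dependencies (its DT_NEEDED entries).
func importedLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, fmt.Errorf("opening %s: %w", path, err)
	}
	defer f.Close()
	return f.ImportedLibraries()
}

// findLibrary returns the first library whose name starts with prefix.
func findLibrary(libs []string, prefix string) (string, bool) {
	for _, lib := range libs {
		if strings.HasPrefix(lib, prefix) {
			return lib, true
		}
	}
	return "", false
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checklibs <binary>")
		return
	}
	libs, err := importedLibraries(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if lib, ok := findLibrary(libs, "libtesseract"); ok {
		fmt.Println("dynamically linked against", lib)
	} else {
		fmt.Println("no dynamic libtesseract dependency found")
	}
}
```

If the static build worked as intended, the binary should no longer list a `libtesseract` dependency.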


Build AppImage

Sample makefile: https://github.com/jeremyrhyde/viam-rplidar/blob/main/Makefile

tesseract-ocr's People

Contributors

felixreichenbach

tesseract-ocr's Issues

Remote OCR support

This may be a feature request or it may be a "how do I" request.

Is there a way to have a remote tesseract-ocr instance on say an Ubuntu server or Jetson Orin device, and access it remotely using a smaller Raspberry Pi or Jetson Nano?

Not sure whether this works or how it would be configured.
