
ocrmultieval

Proof-of-concept for extensible evaluation of (intermediate) results of an OCR workflow

Installation

make deps install

Architecture

All evaluation functionality is provided by backends.

Every backend inherits from EvalBackend and must implement a compare_files method that accepts paths to, and media types of, the Ground Truth and detection results, performs the actual evaluation, and returns an EvalReport.

An EvalReport is a map of metrics to their respective values and can be serialized as JSON or CSV for further processing/analysis.
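
To make that contract concrete, here is a minimal sketch of a custom backend. The compare_files signature and make_report are taken from the CorAsvAnnEval issue below; the import path, class name, and the toy metric are hypothetical illustrations:

    from ocrmultieval.backends import EvalBackend  # exact import path is an assumption

    class CharAccuracyEval(EvalBackend):
        """Toy backend: positional character accuracy over plain-text files."""

        def compare_files(self, gt_mediatype, gt_file, ocr_mediatype, ocr_file, pageId):
            # this sketch only handles plain text
            assert gt_mediatype == ocr_mediatype == 'text/plain'
            with open(gt_file, encoding='utf-8') as f:
                gt = f.read()
            with open(ocr_file, encoding='utf-8') as f:
                ocr = f.read()
            # naive positional comparison; a real backend would align first
            matches = sum(a == b for a, b in zip(gt, ocr))
            accuracy = matches / max(len(gt), 1)
            # make_report wraps the metrics into an EvalReport
            return self.make_report(gt_file, ocr_file, pageId, accuracy=accuracy)

Since the resulting EvalReport is just a map of metrics to values, serializing it as JSON or CSV amounts to dumping that map.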

The glue code for running the backends is in ocrmultieval/runner.py.

Usage

CLI

The ocrmultieval compare command line tool allows evaluating individual pages of Ground Truth and detection results with any of the available backends.

Usage: ocrmultieval compare [OPTIONS] {dinglehopper|ocrevalUAtion|PrimaTextEval|CorAsvAnnEval|CorAsvAnnCompare|OcrdSegmentEvaluate|IsriOcreval} GT_FILE OCR_FILE

Options:
  --gt-mediatype TEXT
  --ocr-mediatype TEXT
  --format [csv|json|yaml|xml]
  -g, --pageId TEXT             pageId to uniquely identify pages in a workspace
  --help                        Show this message and exit.
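
For example, to evaluate one page with the dinglehopper backend and emit a JSON report (the file names and pageId are hypothetical):

    ocrmultieval compare --format json --pageId phys_0001 dinglehopper GT/phys_0001.xml OCR/phys_0001.xml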

OCR-D processor

The ocrd-ocrmultieval command line tool implements the OCR-D processor API and can be used to process complete workspaces.

Usage: ocrd-ocrmultieval [OPTIONS]

  Evaluate

  > Eval processor

Options:
  -I, --input-file-grp USE        File group(s) used as input
  -O, --output-file-grp USE       File group(s) used as output
  -g, --page-id ID                Physical page ID(s) to process
  --overwrite                     Remove existing output pages/images
                                  (with --page-id, remove only those)
  -p, --parameter JSON-PATH       Parameters, either verbatim JSON string
                                  or JSON file path
  -P, --param-override KEY VAL    Override a single JSON object key-value pair,
                                  taking precedence over --parameter
  -m, --mets URL-PATH             URL or file path of METS to process
  -w, --working-dir PATH          Working directory of local workspace
  -l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
                                  Log level
  -C, --show-resource RESNAME     Dump the content of processor resource RESNAME
  -L, --list-resources            List names of processor resources
  -J, --dump-json                 Dump tool description as JSON and exit
  -h, --help                      This help message
  -V, --version                   Show version

Parameters:
   "backend" [string - "PrimaTextEval"]
    Backend to use
    Possible values: ["PrimaTextEval", "ocrevalUAtion", "dinglehopper",
    "OcrdSegmentEvaluate", "IsriOcreval", "CorAsvAnnCompare"]
   "format" [string - "csv"]
    Output format
    Possible values: ["csv", "json", "yaml", "xml"]
   "config" [object]
    Configuration to override default

Default Wiring:
  ['GT,OCR1'] -> ['GT_VS_OCR1']
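
A hypothetical invocation following that default wiring, overriding the default backend and output format via -P:

    ocrd-ocrmultieval -m mets.xml -I GT,OCR1 -O GT_VS_OCR1 -P backend dinglehopper -P format json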


ocrmultieval's Issues

CorAsvAnnEval: wrap CLI instead of API

https://github.com/kba/ocrmultieval/blob/5de79f3021b48f83f9cb798a484fd472d21ed94b/ocrmultieval/backends/CorAsvAnnEval.py#L18-L20

I can see it would be hard to repeat everything in ocrd_cor_asv_ann.scripts.compare. So why not instead call the CLI and have it write its report where we want it?

Alternatively, keep the API as it is and emulate Click's context as in:

from io import StringIO
import json

from ocrd_cor_asv_ann.scripts.compare import cli as standalone_cli

...

def compare_files(self, gt_mediatype, gt_file, ocr_mediatype, ocr_file, pageId):
    outfile = StringIO()
    kwargs = {'output_file': outfile,
              'gt_file': gt_file,
              'ocr_files': [ocr_file],
              # add more of your non-default choices here, e.g. normalization,
              # gt_level, confusion, histogram;
              # if gt_mediatype is a plaintext path list, you need to set
              # 'file_lists': True
    }
    # call the undecorated callback directly to bypass Click's CLI parsing
    standalone_cli.callback(**kwargs)
    # parse the report back into a dict (assuming JSON serialization)
    report = json.loads(outfile.getvalue())
    # dive into this file pair's aggregates
    metrics = report[gt_file + ',' + ocr_file]
    # maybe delete the individual line metrics
    del metrics['lines']
    return self.make_report(gt_file, ocr_file, pageId, **metrics)

OcrdSegmentEvaluate: use different metric

https://github.com/kba/ocrmultieval/blob/5de79f3021b48f83f9cb798a484fd472d21ed94b/ocrmultieval/backends/OcrdSegmentEvaluate.py#L27-L28

This retrieves only the mAP score, which is the least useful/adequate metric and has only been added for comparison with similar benchmarks. The better keys would be precision | recall | pixel_precision | pixel_recall | pixel_iou | oversegmentation | undersegmentation, nested under either by-category → category or by-image → pageId → category.
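
For illustration, a sketch of collecting those keys instead of mAP; the nesting is inferred from the key names above and not verified against ocrd_segment, and pick_metrics is a hypothetical helper:

    SEGMENT_METRICS = ('precision', 'recall', 'pixel_precision', 'pixel_recall',
                       'pixel_iou', 'oversegmentation', 'undersegmentation')

    def pick_metrics(result, category):
        # `result` is assumed to be the raw evaluation dict (see above)
        return {key: result['by-category'][category][key]
                for key in SEGMENT_METRICS}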

deployment: separate dependencies for backends?

I'm afraid the heavy dependency footprint of packages like ocrd_segment and ocrd_cor_asv_ann (both of which require TensorFlow) might not only be annoying here, but could quickly lead to dependency clashes between backends.

I understand it would be harder to integrate backends via their CLIs, but that would make it possible to call them from sub-venvs or Docker, and to opt out of them (at least during ocrmultieval setup).

OcrdSegmentEvaluate: ensure binarized image fits page

https://github.com/kba/ocrmultieval/blob/5de79f3021b48f83f9cb798a484fd472d21ed94b/ocrmultieval/backends/OcrdSegmentEvaluate.py#L21-L23

This does not cover the case where the binary image is itself cropped or deskewed, i.e. does not represent the full PAGE; if that condition is not satisfied, it will run into an assertion failure. You should check the @comments for cropped or deskewed: either find an image without them (i.e. one binarized before cropping), or do not pass the binary image at all (which effectively runs without only-fg, i.e. on the full segment masks). Also, try to fetch an image without clipped, as clipping obviously distorts the evaluation.

In ocrd_segment.evaluate.EvaluateSegmentation (the OCR-D wrapper) we add relative coordinates in this case (i.e. whatever is consistent with the binary image).
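
A minimal sketch of fetching a suitable image via the OCR-D Workspace API, filtering out derived images whose @comments contain the problematic features (variable names are placeholders):

    # `workspace` is an ocrd.Workspace, `page` the parsed PAGE, `page_id` its ID
    page_image, page_coords, _ = workspace.image_from_page(
        page, page_id,
        feature_selector='binarized',               # we do want a binarized image
        feature_filter='cropped,deskewed,clipped')  # but none of these features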
