Please help! Running ...

Running tests delivers no segmentation, about ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation

Comments (13)

wrznr commented on September 27, 2024

@crater2150 Many thanks for your efforts.

Any progress on the OCR-D vs. non-OCR-D invocation issue?

from ocrd-pixelclassifier-segmentation.

wrznr commented on September 27, 2024

Any ideas what's going wrong in my case? Does make test-cli work for anyone?

crater2150 commented on September 27, 2024

It seems like the pixel classifier is giving different results when running in the segmentation tool, which is strange.
I'm looking into it.

wrznr commented on September 27, 2024

@crater2150 Many thanks. I would really like to test your tool. How would I run it outside the segmentation tool?

crater2150 commented on September 27, 2024

It looks like the standalone pixel classifier applies some preprocessing to the image while loading the file from disk, and this step is bypassed when the image is loaded via the workspace. I'm currently separating the processing from the loading and testing whether that fixes the issue.
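A rough sketch of what separating preprocessing from loading could look like is given here. All function names, the plain-text file format, and the specific preprocessing step (scaling 8-bit gray values to [0, 1]) are assumptions made for illustration; they are not taken from ocr4all-pixel-classifier.

```python
from typing import List

Image = List[List[float]]  # hypothetical stand-in for an image array


def preprocess(raw: List[List[int]]) -> Image:
    """Hypothetical preprocessing step: scale 8-bit gray values to [0, 1]."""
    return [[value / 255.0 for value in row] for row in raw]


def load_from_disk(path: str) -> Image:
    """Hypothetical disk path: read whitespace-separated gray values, then preprocess."""
    with open(path) as handle:
        raw = [[int(tok) for tok in line.split()] for line in handle if line.strip()]
    return preprocess(raw)


def load_from_workspace(raw: List[List[int]]) -> Image:
    """Hypothetical workspace path: the pixels are already in memory."""
    return preprocess(raw)
```

Because both entry points end in the same `preprocess` call, the classifier would see identically prepared input regardless of where the image came from.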

@wrznr To run the pixel classifier by itself, you can use

ocr4all-pixel-classifier predict --load $PATH_TO_MODEL \
  --binary $PATH_TO_IMAGE --images $PATH_TO_IMAGE \
  --color_map color_map.json --char_height $XHEIGHT --output out

with color_map.json containing:

{"(255, 255, 255)": [0, "none"], "(255, 0, 0)": [1, "text"], "(0, 255, 0)": [2, "image"]}

This should run the pixel classifier and write three subfolders to out/. color contains the CNN output (red = text, green = image, white = neither). overlay contains the output combined with the input image for visualization. inverted also combines output and input: all background pixels are black, and foreground pixels take the color of their classification.
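To make the color map and the inverted output more concrete, here is a minimal sketch in plain Python. It parses the JSON shown above and rebuilds an "inverted"-style image from a foreground mask and a per-pixel class color. The helper names and the list-of-lists image representation are assumptions for illustration, not the tool's own code.

```python
import ast
import json
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

COLOR_MAP_JSON = (
    '{"(255, 255, 255)": [0, "none"], '
    '"(255, 0, 0)": [1, "text"], '
    '"(0, 255, 0)": [2, "image"]}'
)


def parse_color_map(text: str) -> Dict[RGB, Tuple[int, str]]:
    """Turn the JSON color map into {(r, g, b): (class_index, label)}."""
    raw = json.loads(text)
    return {
        tuple(ast.literal_eval(key)): (index, label)
        for key, (index, label) in raw.items()
    }


def inverted_view(
    foreground: List[List[bool]], colors: List[List[RGB]]
) -> List[List[RGB]]:
    """Background pixels become black; foreground pixels keep their class color."""
    black = (0, 0, 0)
    return [
        [color if is_fg else black for is_fg, color in zip(fg_row, color_row)]
        for fg_row, color_row in zip(foreground, colors)
    ]
```

Here `foreground` would come from the binarized input image and `colors` from the color output.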

wrznr commented on September 27, 2024

@crater2150 Thank you! I will test this asap. But two questions pop up:

Does this mean the pixel classifier is only able to distinguish text from image and non-text (@cneud)?
Did the test ever run successfully in your environment? If yes, what was the configuration? If not, why do you provide it at all?

crater2150 commented on September 27, 2024

The current model can only distinguish these categories, but the pixel classifier can in theory be trained with more classes (you can set the number of classes during training and provide a JSON file like the one above that defines which color represents which class in the mask). As the results with more classes were worse on the current training data (text often not being detected), we currently only have this model.
I tried the tool locally with files read from disk and wrote the test to check whether the ocrd-interface could read files from the workspace and pass them on. I did not check the results, so I missed the difference in file reading. Sorry.
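The color-to-class mapping for training masks mentioned above can be pictured with a small sketch. It converts an RGB mask into class indices using a color map with one extra class. The fourth class (blue, index 3) and the function name are made up for illustration, and this is not how the pixel classifier itself reads masks.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]

# The first three entries mirror the color map from the thread.
# The blue entry is a hypothetical additional class.
COLOR_TO_CLASS: Dict[RGB, int] = {
    (255, 255, 255): 0,  # none
    (255, 0, 0): 1,      # text
    (0, 255, 0): 2,      # image
    (0, 0, 255): 3,      # hypothetical extra class
}


def mask_to_labels(
    mask: List[List[RGB]], color_to_class: Dict[RGB, int]
) -> List[List[int]]:
    """Replace every mask color with its class index; reject unknown colors."""
    labels = []
    for row in mask:
        try:
            labels.append([color_to_class[pixel] for pixel in row])
        except KeyError as err:
            raise ValueError(f"mask color {err.args[0]} is not in the color map") from err
    return labels
```

Failing loudly on unknown colors is a design choice here: a mask painted with a slightly different color would otherwise silently end up in the wrong class.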

wrznr commented on September 27, 2024

@crater2150 Many thanks for the information! It would be very helpful if you could document the creation of your current model step by step, for example as a Gist. We could then extend the model more easily as soon as more, and more reliable, training data become available.

crater2150 commented on September 27, 2024

Ok, I will create a gist with the model creation tomorrow.

I fixed some bugs in the segmentation that caused no output to be produced with the ocrd-wrapper. But for one of the two images, no output segments are produced even with the fixes, while the other one works. I have a suspicion about the reason (based on the CNN output, only a single segment should be created, which maybe leads to problems in the XYCut implementation), but will have to check.
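For readers unfamiliar with XY-cut, the following is a minimal recursive sketch on a binary grid. It shows one way the single-segment case can be handled: when no empty row or column gap is found, the trimmed bounding box is returned as the only region. This is a generic textbook-style version written for illustration, not the implementation used in this repository; the `min_gap` parameter and all names are assumptions.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def _widest_gap(profile: List[int], min_gap: int) -> Tuple[int, int]:
    """Return (start, end) of the widest run of zeros, or (-1, -1) if none."""
    best = (-1, -1)
    start = None
    for i, count in enumerate(profile + [1]):  # sentinel closes a trailing run
        if count == 0 and start is None:
            start = i
        elif count != 0 and start is not None:
            if i - start >= min_gap and i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def xy_cut(grid: List[List[int]], box: Optional[Box] = None, min_gap: int = 1) -> List[Box]:
    """Split the foreground (1s) of grid into boxes along empty rows and columns."""
    if box is None:
        box = (0, 0, len(grid), len(grid[0]) if grid else 0)
    top, left, bottom, right = box
    rows = [sum(grid[y][left:right]) for y in range(top, bottom)]
    cols = [sum(grid[y][x] for y in range(top, bottom)) for x in range(left, right)]
    if not any(rows):
        return []  # no foreground in this box
    # Trim the box to the bounding box of its foreground pixels.
    ys = [i for i, c in enumerate(rows) if c]
    xs = [i for i, c in enumerate(cols) if c]
    rows, cols = rows[ys[0]:ys[-1] + 1], cols[xs[0]:xs[-1] + 1]
    top, bottom = top + ys[0], top + ys[-1] + 1
    left, right = left + xs[0], left + xs[-1] + 1
    y0, y1 = _widest_gap(rows, min_gap)
    x0, x1 = _widest_gap(cols, min_gap)
    if y1 - y0 <= 0 and x1 - x0 <= 0:
        return [(top, left, bottom, right)]  # no cut possible: a single region
    if y1 - y0 >= x1 - x0:  # cut along the wider gap, rows first on ties
        return (xy_cut(grid, (top, left, top + y0, right), min_gap)
                + xy_cut(grid, (top + y1, left, bottom, right), min_gap))
    return (xy_cut(grid, (top, left, bottom, left + x0), min_gap)
            + xy_cut(grid, (top, left + x1, bottom, right), min_gap))
```

With this structure, a page whose classifier output forms one connected block still yields exactly one box instead of an empty result.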

crater2150 commented on September 27, 2024

@wrznr I added examples for dataset preparation and training to the pixel classifier repository, after adding and polishing the tool for mask generation.

crater2150 commented on September 27, 2024

@wrznr As I mentioned before, the test should now output something; it just fails when only a single region is found on the page.
As I'm currently sick, I can't say for sure whether I can fix that issue before the workshop, sorry :(

crater2150 commented on September 27, 2024

It seems that the model file shipped in this repo was broken. I replaced it with another one (trained on 9137 pages from DTA-2). This does produce a segmentation for me with make test-cli and a fresh environment. Can you retry it?

VolkerHartmann commented on September 27, 2024

Now the segmentation produces results. 👍
Attention: It will not work in combination with the dewarp of DFKI.

Running tests delivers no segmentation about ocrd-pixelclassifier-segmentation HOT 13 CLOSED
