This repository has been archived by the owner on Oct 3, 2022. It is now read-only.

error msg "No image suitable for OCR" is too vague #21

Open
ghost opened this issue Mar 15, 2017 · 1 comment

Comments

ghost commented Mar 15, 2017

Every document I receive from a particular source is deemed "unsuitable" by ocrodjvu and results in a session that looks like this:

$ ocrodjvu --debug --engine=tesseract -l eng --in-place document.djvu
Processing 'document.djvu':

The same error results if cuneiform is the engine, so apparently the error is not coming from the engine. Is ocrodjvu enforcing a certain image property, such as DPI? I see no image requirements in the manpage, so it would be useful if the error message listed the requirements and, ideally, indicated the unmet ones.

jwilk (Member) commented Apr 6, 2017

Thanks for the bug report.

Yes, the warning comes from ocrodjvu itself. I agree that the message is rather obscure.

By default, ocrodjvu passes only the page's mask to the OCR engine. (See the --render option in the manpage.)
The warning is emitted if the page has no mask at all.
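
For pages without a mask, one possible workaround is to ask ocrodjvu to render more than the mask layer. The sketch below is not taken from the ocrodjvu documentation: ocr_cmd is a hypothetical helper, and the accepted --render values (mask, foreground, all) are an assumption to verify against the manpage of the installed version. It only prints the command, because --in-place modifies the document.

```shell
#!/bin/sh
# Sketch: build an ocrodjvu command line with an explicit --render mode.
# Assumption: --render accepts mask (the default), foreground, or all.
# ocr_cmd is a hypothetical helper, not part of ocrodjvu.
ocr_cmd() {
    render=$1
    file=$2
    case $render in
        mask|foreground|all) ;;
        *)
            echo "unknown render mode: $render" >&2
            return 1
            ;;
    esac
    # Print the command instead of running it, so it can be reviewed first.
    printf 'ocrodjvu --render=%s --engine=tesseract -l eng --in-place %s\n' \
        "$render" "$file"
}

ocr_cmd all document.djvu
```

If the printed command looks right, it can be run on a copy of the document to see whether the warning still appears.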
