Has anyone here tried the new Tesseract OCR engine that Google recently released?

I gave it a whirl last night. It compiles more or less fine on CentOS 4.4 if you don't mind lots of warnings.

I took a scanned image I already had that contained a column of newspaper text, GIMP'ed it to cut everything but the text, increased the contrast to get rid of the grayish background, and saved it as an uncompressed TIFF.

I fired up tesseract and it is STILL running, around ten hours later, consuming 90% of the CPU. This doesn't seem right...

Clues?

--
---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
The Lord detests the way of the wicked
but he loves those who pursue righteousness.
----------------------------- Proverbs 15:9 (niv) -----------------------------
On Sun, 1 Oct 2006, fredex wrote:

> Has anyone here tried the new Tesseract OCR engine that Google has recently
> released?
>
> I gave it a whirl last night. It compiles more or less fine on CentOS 4.4
> if you don't mind lots of warnings.
>
> Took a scanned image I already had that contained a column of newspaper
> text, GIMP'ed it to cut everything but the text, increased contrast to
> get rid of grayish background, saved as uncompressed tiff.
>
> Fired up tesseract and it is STILL running, around ten hours later,
> consuming 90% of the CPU. This doesn't seem right...
>
> Clues?

Wrong community. I don't know what the value of this post is. Chances are this is not related to CentOS 4.4, and you will get a better response on a Tesseract or OCR-related forum.

Thanks for playing, better luck next time :)

-- dag wieers, dag at wieers.com, http://dag.wieers.com/ --
[all I want is a warm bed and a kind word and unlimited power]