(old) Puppy Linux Discussion Forum

Posted: **Tue 19 Jan 2010, 08:57**

Tesseract is a fairly accurate OCR program for converting scanned documents to text. I use it together with ImageMagick to break some silly visual CAPTCHAs, and it works!

You must install two packages to get it to work: the main program and a language file. I only packaged the English language file, but you can get the others from http://code.google.com/p/tesseract-ocr/downloads/list.

Usage:
# tesseract input.tif output
The output will be written to the text file output.txt in the current directory.
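
If you have a whole folder of scans, the usage above can be wrapped in a small loop. This is only a rough sketch: `ocr_base` and `ocr_dir` are names I made up for illustration, and it assumes the scans are `.tif` files and that `tesseract` adds the `.txt` extension to the output base name itself.

```shell
#!/bin/sh
# Sketch only: ocr_base and ocr_dir are hypothetical helpers, not part of tesseract.

# Strip a trailing .tif to get the output base name.
# Assumption: tesseract appends .txt to this base on its own.
ocr_base() {
    printf '%s\n' "${1%.tif}"
}

# Run tesseract on every .tif file in the given directory.
ocr_dir() {
    for img in "$1"/*.tif; do
        [ -e "$img" ] || continue   # skip when the glob matched nothing
        tesseract "$img" "$(ocr_base "$img")"
    done
}
```

Calling `ocr_dir scans` would then be expected to leave a `page1.txt` next to `scans/page1.tif`, and so on for each image.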

More info: PuppyForum: tesseract-ocr

Download Mirror:
tesseract-2.04-i486.pet [600kB]
tesseract-2.00.eng.pet [1MB]

MD5sum:
7b8c127764e7c18f41726b2ca4faedc9 - tesseract-2.00.eng.pet
86db54dc487d8da7aaa378be11f23da2 - tesseract-2.04-i486.pet
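
To check a download against the sums above, something like the following should do. It is a minimal sketch: `verify_md5` is a made-up helper name, and it assumes the standard `md5sum` and `cut` tools are available.

```shell
#!/bin/sh
# Sketch only: verify_md5 is a hypothetical helper, not part of any package.
# Succeeds (exit status 0) when the file's MD5 matches the expected value.
verify_md5() {
    expected=$1
    file=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual=$(md5sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}
```

For example, `verify_md5 86db54dc487d8da7aaa378be11f23da2 tesseract-2.04-i486.pet && echo OK` prints OK only if the main package matches the sum listed above.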

Posted: **Thu 19 Nov 2015, 18:22**

Unfortunately these links are dead. I am keen to try this older version if anyone has it (sometimes older stuff works better in specific circumstances). Cheers!

Posted: **Sun 31 Jan 2016, 04:56**

Use PuppyOCR; it does the job.

Posted: **Sun 31 Jan 2016, 12:52**

Here is an OCR project that is still maintained.

Posted: **Wed 03 Feb 2016, 11:33**

What I would like on this forum is feedback on how applications work once installed. Tesseract is one of those I didn't succeed in using, but with PuppyOCR I really do convert old documents, very old documents, to text. It's not easy, and much time must be spent correcting wrongly recognized words, but it's possible.
If there are users having success with Tesseract, please post here.
Once installed... auriza, you are welcome.
"Unpaper is a tool for straightening pages and removing black edges, including in the middle, where you have photocopied an open book!"
Sometimes you will wonder whether it wouldn't be faster to type the text directly by hand. But there is something magical in OCR: not efficient, but pleasant.
What would be nice is voice recognition: you read, the computer writes (in French).

Or in Latin, for very, very old papers.

Posted: **Wed 03 Feb 2016, 18:24**

What engine does PuppyOCR use? Anything using Tesseract or Cuneiform should give you pretty good results if you feed it a good scan.
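
As a rough idea of what "a good scan" can mean in practice, here is a sketch that cleans an image with ImageMagick before handing it to tesseract. The helper names `prep_name` and `prep_scan` are made up, the 50% threshold is just a starting guess to tune per document, and writing an uncompressed TIFF is my assumption about what older tesseract builds read most reliably.

```shell
#!/bin/sh
# Sketch only: prep_name and prep_scan are hypothetical helpers.

# Map e.g. page.png -> page-clean.tif (drops the last extension).
prep_name() {
    printf '%s-clean.tif\n' "${1%.*}"
}

# Convert to grayscale, binarize, save as uncompressed TIFF, then OCR it.
# Assumption: a 50% threshold suits the scan; adjust it for faint or dark pages.
prep_scan() {
    out=$(prep_name "$1")
    convert "$1" -colorspace Gray -threshold 50% -compress none "$out" &&
        tesseract "$out" "${out%.tif}"
}
```

With this, `prep_scan page.png` would be expected to produce `page-clean.tif` and then `page-clean.txt`.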

Posted: **Fri 05 Feb 2016, 03:54**

disciple wrote: What engine does Puppyocr use?

PuppyOCR appears to be Tesseract 2.04 with a GUI front-end. For some alternatives, read here.