Tesseract 2.04

auriza
Posts: 46
Joined: Tue 06 Jan 2009, 02:32
Location: Surakarta, Java

#1 Post by auriza »

A pretty accurate OCR program for converting scanned documents to text. I use it together with ImageMagick to break some silly visual CAPTCHAs, and it works!

You must install two packages to get it to work: the main program and a language file. I only packaged the English language file, but you can get the others from http://code.google.com/p/tesseract-ocr/downloads/list.

Usage:
# tesseract input.tif output
The output will be written to the text file output.txt in the same directory (tesseract appends .txt to the output name you give it).
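
For anyone wanting to try the ImageMagick + tesseract combination mentioned above, here is a rough sketch of how the two could be chained. The function names, file names and the grayscale/uncompressed-TIFF conversion options are my own assumptions, not taken from the original post; Tesseract 2.x builds generally want an uncompressed TIFF as input, so the sketch converts to that first. Adjust the preprocessing to suit your images.

```shell
#!/bin/sh
# Sketch only: ocr_output_path and ocr_image are hypothetical helper names,
# and the ImageMagick options below are an assumed, minimal preprocessing step.

# tesseract appends ".txt" to the output base name it is given.
ocr_output_path() {
    printf '%s.txt\n' "$1"
}

# Convert an image to an uncompressed grayscale TIFF with ImageMagick,
# run tesseract on it, then print the recognized text.
# Usage: ocr_image captcha.png result
ocr_image() {
    src=$1
    base=$2
    tif="$base.tif"
    convert "$src" -colorspace Gray -compress none "$tif" || return 1
    tesseract "$tif" "$base" || return 1
    cat "$(ocr_output_path "$base")"
}
```

Calling `ocr_image scan.png page1` would leave page1.tif and page1.txt in the current directory, provided both `convert` and `tesseract` are installed.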

More info: PuppyForum: tesseract-ocr

Download Mirror:
tesseract-2.04-i486.pet [600kB]
tesseract-2.00.eng.pet [1MB]

MD5sum:
7b8c127764e7c18f41726b2ca4faedc9 - tesseract-2.00.eng.pet
86db54dc487d8da7aaa378be11f23da2 - tesseract-2.04-i486.pet
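
If you do find a copy of the packages somewhere, the MD5 sums above can be checked before installing. A minimal sketch, assuming the standard `md5sum` tool is available; `verify_md5` is a hypothetical helper name of mine, not part of any package.

```shell
#!/bin/sh
# Sketch only: verify_md5 is a hypothetical helper, not part of Puppy or tesseract.
# Returns 0 when FILE's MD5 sum equals EXPECTED_HASH, non-zero otherwise
# (including when FILE does not exist).
# Usage: verify_md5 EXPECTED_HASH FILE
verify_md5() {
    expected=$1
    file=$2
    actual=$(md5sum "$file" 2>/dev/null | cut -d ' ' -f 1)
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example with the sums listed in this post:
#   verify_md5 86db54dc487d8da7aaa378be11f23da2 tesseract-2.04-i486.pet && echo "main package OK"
#   verify_md5 7b8c127764e7c18f41726b2ca4faedc9 tesseract-2.00.eng.pet && echo "language file OK"
```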

greengeek
Posts: 5789
Joined: Tue 20 Jul 2010, 09:34
Location: Republic of Novo Zelande

#2 Post by greengeek »

Unfortunately these links are dead. I am keen to try this older version if anyone has it (sometimes older stuff works better in specific circumstances). Cheers!

Pelo


#3 Post by Pelo »

Use Puppyocr; it does the job.

rcrsn51
Posts: 13096
Joined: Tue 05 Sep 2006, 13:50
Location: Stratford, Ontario

#4 Post by rcrsn51 »

Here is an OCR project that is still maintained.

Pelo


#5 Post by Pelo »

What I would like on this forum is feedback on how applications work once installed. Tesseract is one of those I didn't succeed in using, but with Puppyocr I really do convert old documents, very old documents, to text. It's not easy, and much time must be spent correcting wrongly recognized words, but it's possible.
If there are users having success with tesseract, please post here.
Once installed... auriza, you are welcome.
"Unpaper is a tool for straightening pages and removing black edges, including in the middle, where you have photocopied an open book!"
Sometimes you will wonder whether it wouldn't be faster to type the text directly by hand. But there is something magic in OCR: not efficient, but pleasant.
What would be nice is voice recognition: you read, the computer writes (in French). :) Or in Latin, for very, very old papers.

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#6 Post by disciple »

What engine does Puppyocr use? Anything using tesseract or cuneiform should give you pretty good results if you feed it a good scan.

rcrsn51
Posts: 13096
Joined: Tue 05 Sep 2006, 13:50
Location: Stratford, Ontario

#7 Post by rcrsn51 »

disciple wrote: What engine does Puppyocr use?
PuppyOCR appears to be Tesseract 2.04 with a GUI front-end. For some alternatives, read here.
