Tesseract 2.04
Posted: Tue 19 Jan 2010, 08:57
A pretty accurate OCR program for converting scanned documents to text. I use it together with ImageMagick to break some silly visual CAPTCHAs, and it works!
You must install two packages to get it to work: the main program and a language file. I only packaged the English language file, but you can get the others from http://code.google.com/p/tesseract-ocr/downloads/list.
Usage:
# tesseract input.tif output
The recognized text is written to output.txt in the current directory; Tesseract adds the .txt extension itself.
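If your image is not already a TIFF, something along these lines may help. This is only a rough sketch, not the exact commands I use: the function names ocr_output_path and ocr_image are made up for illustration, the ImageMagick options (grayscale, uncompressed TIFF) are an assumption about what this Tesseract version reads most reliably, and "-l eng" assumes the English language package is installed.

```shell
#!/bin/sh
# Hypothetical helper: name of the text file tesseract writes
# for a given output base (tesseract appends ".txt").
ocr_output_path() {
    printf '%s.txt\n' "$1"
}

# Hypothetical helper: convert an image to an uncompressed grayscale
# TIFF with ImageMagick, run tesseract on it, and print the result.
# $1 = source image, $2 = output base name (no extension)
ocr_image() {
    src=$1
    base=$2
    # Assumption: uncompressed grayscale TIFF is the safest input format.
    convert "$src" -colorspace Gray -compress none "$base.tif" || return 1
    # Assumption: the English language package is installed.
    tesseract "$base.tif" "$base" -l eng || return 1
    cat "$(ocr_output_path "$base")"
}
```

For example, "ocr_image captcha.png captcha" would leave captcha.tif and captcha.txt in the current directory and print the recognized text.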
More info: PuppyForum: tesseract-ocr
Download Mirror:
tesseract-2.04-i486.pet [600kB]
tesseract-2.00.eng.pet [1MB]
MD5sum:
7b8c127764e7c18f41726b2ca4faedc9 - tesseract-2.00.eng.pet
86db54dc487d8da7aaa378be11f23da2 - tesseract-2.04-i486.pet
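To check a download against the sums above before installing, a small shell function like the following could be used. It is a sketch only: check_md5 is a made-up name, and it assumes the md5sum utility is available on your system.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file's MD5 matches the expected sum.
# $1 = expected MD5 (hex), $2 = path to the downloaded file
check_md5() {
    expected=$1
    file=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}
```

Usage would look like: check_md5 86db54dc487d8da7aaa378be11f23da2 tesseract-2.04-i486.pet && echo OK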