Dingo

Joined: 11 Dec 2007 Posts: 1397 Location: somewhere at the end of rainbow...
|
Posted: Fri 19 Aug 2011, 11:46 Post subject:
tesseract 3.0 Subject description: OCR engine for puppy 3.01, 4.3.1, 5.2.5 |
|
tesseract 3.0
http://www.dokupuppylinux.info/programs:ocr?&#tesseract-30
for Puppy 3.01 direct download
for Puppy 4.3.1 direct download
for Puppy 5.2.5 direct download
dependencies:
liblept
leptonica-1.68-i486.pet
- additional language data
| Quote: | total 213 MB once unpacked
bul.traineddata cat.traineddata ces.traineddata chi_sim.traineddata chi_tra.traineddata dan-frak.traineddata dan.traineddata deu-frak.traineddata deu.traineddata ell.traineddata eng.traineddata fin.traineddata fra.traineddata heb.traineddata hrv.traineddata hun.traineddata ind.traineddata ita.traineddata jpn.traineddata kor.traineddata lav.traineddata lit.traineddata nld.traineddata nor.traineddata pol.traineddata por.traineddata ron.traineddata rus.traineddata slk-frak.traineddata slv.traineddata spa.traineddata srp.traineddata swe-frak.traineddata swe.traineddata tgl.traineddata tur.traineddata ukr.traineddata vie.traineddata |
- as sfs v. 4 (for the Puppy 4.3.x-5.2.x series) - tesseract3-langdata_431.sfs
- as a tar archive compressed with xz (.tar.xz) - tesseract3-langdata.tar.xz; extract it, then move the language files you need into /usr/share/tessdata
Puppy 5.2.5 users can easily extract the contents of this archive with xarchiver
Puppy 4.3.1 users need xz utils
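For anyone who prefers the terminal, the extract-and-move step could look roughly like the sketch below. The function name install_langs is made up for illustration, and the layout of the archive (files named <lang>.traineddata, possibly inside a subfolder) is an assumption, so check what the archive really contains before relying on it.

```shell
# Sketch only: unpack the language archive into a scratch directory and copy
# just the requested languages into the tessdata directory.
# Assumptions: xz and tar are installed, and the archive contains files named
# <lang>.traineddata (possibly inside a subdirectory).
install_langs() {
    archive=$1
    dest=$2
    shift 2
    tmp=$(mktemp -d) || return 1
    # Decompress with xz and let tar read the stream from stdin.
    if ! xz -dc "$archive" | tar -xf - -C "$tmp"; then
        rm -rf "$tmp"
        return 1
    fi
    mkdir -p "$dest"
    for lang in "$@"; do
        # find also locates the file when the archive has a top-level folder.
        find "$tmp" -name "$lang.traineddata" -exec cp {} "$dest/" \;
    done
    rm -rf "$tmp"
}
```

Something like `install_langs tesseract3-langdata.tar.xz /usr/share/tessdata eng ita` would then copy only the English and Italian data instead of all 213 MB.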
changelog
| Quote: | 2010-09-21 - V3.00
* Preparations for thread safety:
* Changed TessBaseAPI methods to be non-static
* Created a class hierarchy for the directories to hold instance data,
and began moving code into the classes.
* Moved thresholding code to a separate class.
* Added major new page layout analysis module.
* Added HOCR output (issues 221, 263: thanks to amkryukov).
* Added Leptonica as main image I/O and handling. Currently optional,
but in future releases linking with Leptonica will be mandatory.
* Ambiguity table rewritten to allow definite replacements in place
of fix_quotes.
* Added TessdataManager to combine data files into a single file.
* Some dead code deleted.
* VC++6 no longer supported. It can't cope with the use of templates.
* Many more languages added.
* Doxygenation of most of the function header comments.
* Added man pages.
* Added bash completion script (issue 247: thanks to neskiem)
* Fix integer overflow in thresholding (issue 366: thanks to Cyanide.Drake)
* Add Danish Fraktur support (issues 300, 360: thanks to
dsl602230@vip.cybercity.dk)
* Fix file pointer leak (issue 359, thanks to yukihiro.nakadaira)
* Fix an error using user-words (Issue 345: thanks to max.markin)
* Fix a memory leak in tablefind.cpp (Issue 342, thanks to zdravco)
* Fix a segfault due to double fclose (Issue 320, thanks to souther)
* Fix an automake error (Issue 318, thanks to ichanjz)
* Fix a Win32 crash on fileFormatIsTiff() (Issues 304, 316, 317, 330, 347,
349, 352: thanks to nguyenq87, max.markin, zdenop)
* Fixed a number of errors in newer (stricter) versions of VC++ (Issues
301, among others) |
_________________ replace .co.cc with .info to get access to stuff I posted in forum
dropbox 2GB free
OpenOffice for Puppy Linux
Last edited by Dingo on Sat 17 Nov 2012, 17:39; edited 7 times in total
seaside
Joined: 11 Apr 2007 Posts: 835
|
Posted: Fri 19 Aug 2011, 16:35 Post subject:
|
|
Dingo,
Thanks for this. I'm missing "liblept.so.2" in pup431 and in pup425.
Regards,
s
Dingo

Joined: 11 Dec 2007 Posts: 1397 Location: somewhere at the end of rainbow...
|
Posted: Fri 19 Aug 2011, 16:58 Post subject:
|
|
Sorry, I forgot to add the dependencies.
You can download them here:
liblept
leptonica-1.68-i486.pet
I compiled tesseract against the leptonica libs in order to add support for all available image formats.
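To see which libraries are still missing after installing a build like this, a check along these lines may help. It is only a sketch: missing_libs is a made-up helper name, and it assumes ldd is available and reports unresolved libraries with the usual "not found" wording.

```shell
# Sketch only: print the names of shared libraries that a binary needs but
# the dynamic loader cannot resolve. ldd marks those lines with "not found",
# e.g. "liblept.so.2 => not found"; the first field is the library name.
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}
```

Running `missing_libs "$(which tesseract)"` should print nothing once the leptonica package is installed.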
seaside
Joined: 11 Apr 2007 Posts: 835
|
Posted: Fri 19 Aug 2011, 17:31 Post subject:
|
|
Dingo,
Thanks. It's working now.
s.
Dingo

Joined: 11 Dec 2007 Posts: 1397 Location: somewhere at the end of rainbow...
|
Posted: Sun 21 Aug 2011, 10:11 Post subject:
|
|
Added tesseract builds for Puppy 3.01 and Lucid 5.2.5.
Laie
Joined: 20 Jan 2008 Posts: 268 Location: Germany
|
Posted: Tue 30 Aug 2011, 16:49 Post subject:
thanks |
|
Wow, that's what I've been looking for for a long time! Thanks
bones01
Joined: 11 Aug 2008 Posts: 363 Location: Melbourne, Aus
|
Posted: Wed 09 May 2012, 00:02 Post subject:
|
|
I'm having some trouble getting this to work, and I'm not sure where I've gone wrong.
I've downloaded the Lucid version, the library dependency, and the language pack. I've extracted the English language data, so I think I've done everything, but I'm still lost.
I don't have a menu entry for Tesseract anywhere either.
I'm using Lucid 528.004 (frugal) with fluxbox.
Any suggestions would be appreciated.
Bones.
_________________ Dell Latitude D630 running Puppy 5.2.8 frugal, Macpup 525 frugal (if I can get it working again)
Precise Puppy 5.4 live DVD
rcrsn51

Joined: 05 Sep 2006 Posts: 7753 Location: Stratford, Ontario
|
Posted: Wed 09 May 2012, 00:20 Post subject:
|
|
Read here about using Peasyscan with Tesseract.
If you open a terminal and type "tesseract", what happens?
bones01
Joined: 11 Aug 2008 Posts: 363 Location: Melbourne, Aus
|
Posted: Wed 09 May 2012, 21:00 Post subject:
|
|
| rcrsn51 wrote: | Read here about using Peasyscan with Tesseract.
If you open a terminal and type "tesseract", what happens? |
Using LXTerminal, I get this result:
sh-4.1# tesseract
Usage:tesseract imagename outputbase [-l lang] [configfile [[+|-]varfile]...]
sh-4.1#
I'll have a read of Peasyscan later, but I should point out that I don't have a scanner attached to my Puppy computer. The files I have are from another scanner.
Bones
rcrsn51

Joined: 05 Sep 2006 Posts: 7753 Location: Stratford, Ontario
|
Posted: Wed 09 May 2012, 21:44 Post subject:
|
|
| Quote: | | The files I have are from another scanner. |
In that case, you must run tesseract from the command line. If the source file is named "scan.tif", you will use the command
| Code: | | tesseract scan.tif scan |
This will produce the file "scan.txt".
Since you have tesseract 3.0, I believe that the input must be in TIFF format. If your files are something else, you will need to run them through a converter like mtpaint.
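If there are many scans to process, the same command can be wrapped in a small loop. This is a sketch, not a tested recipe: ocr_dir and the TESSERACT variable are names invented here, the default language "eng" is an assumption, and the argument order follows the usage line printed earlier in the thread (imagename outputbase [-l lang]).

```shell
# Sketch only: run tesseract on every .tif file in a directory, writing
# <name>.txt next to each <name>.tif.
# TESSERACT lets you point at a different binary; it defaults to "tesseract".
ocr_dir() {
    dir=$1
    lang=${2:-eng}
    for img in "$dir"/*.tif; do
        # With no matching files the pattern stays literal, so skip it.
        [ -e "$img" ] || continue
        "${TESSERACT:-tesseract}" "$img" "${img%.tif}" -l "$lang"
    done
}
```

For example, `ocr_dir /root/scans` would turn /root/scans/page1.tif into /root/scans/page1.txt, assuming eng.traineddata is in /usr/share/tessdata.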
Dingo

Joined: 11 Dec 2007 Posts: 1397 Location: somewhere at the end of rainbow...
|
Posted: Fri 11 May 2012, 09:30 Post subject:
|
|
| rcrsn51 wrote: | | Since you have tesseract 3.0, I believe that the input must be in TIFF format. If your files are something else, you will need to run them through a converter like mtpaint. |
As far as I remember (I compiled tesseract some time ago), the leptonica library gives tesseract the ability to load most common image formats.