ndujoe1
Joined: 04 Dec 2005 Posts: 616
Posted: Tue 13 Oct 2009, 21:40 Post subject: tesseract image requirements
Since tesseract will only operate on uncompressed TIFF files, you need a few extra steps to achieve compatibility with xsane.
Go to Preferences --> Setup --> Filetype
For the TIFF options:
Set compression rate to 1
In the next three TIFF dialog boxes select no compression.
Click OK
Click Preferences again and select Save settings.
When scanning a file for OCR, in the XSANE menu I select type: TIFF
color: gray
enter 300 for scan resolution
And save the file with the extension .tif, not .tiff.
Then, when finished, invoke tesseract from the command line with
tesseract filename.tif outputname
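If you want to double-check a scan before running OCR, something like the following might help. It is only a sketch in Python: it reads the Compression tag (259) from the first image directory of a classic TIFF file, where 1 means uncompressed, and then builds the same command line as above. The function names (`tiff_compression`, `tesseract_command`, `run_ocr`) are made up for this example, and the `.tif` check simply encodes the advice in this post.

```python
import struct
import subprocess


def tiff_compression(path):
    """Return the Compression tag (259) of the first image in a TIFF file.

    1 means uncompressed. A missing tag is treated as 1, the TIFF default.
    """
    with open(path, "rb") as f:
        header = f.read(8)
        if header[:2] == b"II":
            endian = "<"  # little-endian byte order
        elif header[:2] == b"MM":
            endian = ">"  # big-endian byte order
        else:
            raise ValueError("not a TIFF file")
        magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("not a classic TIFF file")
        f.seek(ifd_offset)
        (entry_count,) = struct.unpack(endian + "H", f.read(2))
        for _ in range(entry_count):
            entry = f.read(12)
            tag, field_type, count = struct.unpack(endian + "HHI", entry[:8])
            if tag == 259:
                # Compression is a SHORT held in the first two value bytes.
                (value,) = struct.unpack(endian + "H", entry[8:10])
                return value
    return 1


def tesseract_command(image_path, output_base):
    """Build the argument list for: tesseract filename.tif outputname"""
    if not image_path.endswith(".tif"):
        raise ValueError("save the scan with a .tif extension, not .tiff")
    return ["tesseract", image_path, output_base]


def run_ocr(image_path, output_base):
    """Refuse compressed scans, otherwise call tesseract (must be installed)."""
    if tiff_compression(image_path) != 1:
        raise ValueError("TIFF is compressed; rescan with no compression")
    return subprocess.call(tesseract_command(image_path, output_base))
```

Calling `run_ocr("filename.tif", "outputname")` would then behave like the command above, but stop early if xsane saved the scan with compression.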
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Tue 12 Jan 2010, 08:42
Come on people, why did no one report before now that the package was broken?
Or did it work in older versions of Puppy? Maybe petget was different...
_________________ DEATH TO SPREADSHEETS
- - -
Classic Puppy quotes
- - -
Beware the demented serfers!
ndujoe1
Joined: 04 Dec 2005 Posts: 616
Posted: Tue 12 Jan 2010, 11:04 Post subject: tesseract
It is not broken; I forgot to post that you need to move the tesseract location from /local/tesseract to /usr/local/tesseract. Then you will be able to reference it from the command line. It works well on my machine.
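One way to confirm that tesseract can be found after the move is to look it up on the search path. This is a minimal sketch in Python using the standard `shutil.which`; the helper name `find_tesseract` is invented for the example, and whether /usr/local/tesseract ends up on your PATH depends on your setup.

```python
import shutil


def find_tesseract(search_path=None):
    """Return the full path of the tesseract executable, or None if absent.

    search_path is an optional PATH-style string; by default the
    PATH environment variable is used.
    """
    return shutil.which("tesseract", path=search_path)


location = find_tesseract()
if location is None:
    print("tesseract is not on the PATH yet")
else:
    print("tesseract found at", location)
```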
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Wed 13 Jan 2010, 04:05
Yes, I know the build isn't broken, and neither are your instructions... but my package is.
I obviously packaged it wrong... unless my package somehow got replaced by a different, broken one.
zygo
Joined: 08 Apr 2006 Posts: 206 Location: UK
Posted: Wed 13 Jan 2010, 11:55
I'm using Puppy 431. I read only the first post in this thread and got it working, after a fashion: the command simply returned the dots per pixel and size of the image file. A 1-byte file was made, containing a newline character. No error on the command line, and nothing in /var/log/messages either. "Check for dependencies" from the menu lists none.
Now I see ndujoe1 says it needs xsane. Which xsane pet from the official Puppy 4 repo should I use and does that need sane?
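The symptom described above (an output file holding nothing but a newline) is easy to test for in a script. A rough sketch in Python follows; it assumes tesseract wrote its text to outputname.txt, and the helper name `ocr_output_is_blank` is made up for this example.

```python
def ocr_output_is_blank(text_path):
    """Return True if the OCR output file holds only whitespace."""
    with open(text_path, "r", errors="replace") as f:
        return f.read().strip() == ""
```

A blank result with no error message suggests tesseract ran but recognised nothing, so the input image (compression, resolution, colour mode) is the first thing to check.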
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Fri 09 Jul 2010, 09:59
There is a py/gtk gui for tesseract at http://groups.google.com/group/ocropus/files/ that is worth looking at. Just find guitesseract.py on that page.
There are a couple of other guis I'm still looking at.
abushcrafter

Joined: 30 Oct 2009 Posts: 1447 Location: England
Posted: Fri 09 Jul 2010, 14:38
That GUI looks promising. Thanks.
_________________ adobe flash is rubbish!
My Quote:"Humans are stupid, though some are clever but stupid." http://www.dependent.de/media/audio/mp3/System_Syn_Heres_to_You.zip http://www.systemsyn.com/
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Sat 10 Jul 2010, 22:02 Post subject: OCRfeeder - gui for OCR
There's another py/gtk gui at http://ftp.gnome.org/pub/GNOME/sources/ocrfeeder/0.6/
This one is a bit more capable (e.g. page layout analysis) and looks more like it will be maintained.
You need to comment out one line of code which requires Gnome support, just to display the About page!
It also uses unpaper, which I posted above, and requires libgoocanvas, pygoocanvas and the Python Imaging Library (PIL).
It exports to ODF or HTML, but unfortunately this isn't working for me; I think my Python Imaging Library may be faulty. If it does work for anyone, please let us know which PIL and which Python you're using.
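For anyone reporting back, a short script along these lines could collect the two version numbers asked for. It is a sketch only: the function name `imaging_versions` is invented here, and the attribute lookups are a best guess that covers both current Pillow (`PIL.__version__`) and the older PIL releases (`Image.VERSION`).

```python
import platform


def imaging_versions():
    """Report the Python version and, if importable, the imaging library version."""
    info = {"python": platform.python_version(), "imaging": None}
    try:
        import PIL
        from PIL import Image
    except ImportError:
        # No imaging library importable under the name PIL.
        return info
    info["imaging"] = (
        getattr(PIL, "__version__", None)
        or getattr(Image, "VERSION", None)
        or "unknown"
    )
    return info


print(imaging_versions())
```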
Last edited by disciple on Sat 10 Jul 2010, 22:21; edited 1 time in total
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Sat 10 Jul 2010, 22:20
I couldn't find a goocanvas that worked for me, so here's the one I built, and a repackaged py-goocanvas stolen, I think, from Debian.
- Filename: python-pygoocanvas_0.10.0-1_i386.pet, Filesize: 40.21 KB, Downloaded: 465 time(s)
- Filename: goocanvas_DEV-0.15-i486.pet, Filesize: 515.15 KB, Downloaded: 485 time(s)
- Filename: goocanvas-0.15-i486.pet, Filesize: 90.9 KB, Downloaded: 439 time(s)
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Sat 10 Jul 2010, 22:36
The other gui for tesseract is at http://sourceforge.net/projects/ocrgui/
It is in C/GTK (yay, no Python) but I suspect it is not as capable.
My current Puppy doesn't have a new enough GTK to try it, although I think the latest Puppies do. You'll also need to install hunspell (or hack it to use enchant instead) and, it says, ImageMagick's convert.
abushcrafter

Joined: 30 Oct 2009 Posts: 1447 Location: England
Posted: Wed 16 Feb 2011, 15:02
There is a new version of tesseract out.
Tesseract-GUI
Juan Ramon Castan has improved on Filip Domenic's "guitesseract.py". I did not manage to OCR an image with it because the language drop-down box had no options.
While on SourceForge I also found another tesseract GUI: http://sourceforge.net/projects/gimagereader/
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Thu 17 Feb 2011, 05:24
Another one! Thanks.
Is it really Python/Gnome, or just PyGtk?
If you haven't been following the ocropus thread, you might like to check out cuneiform, which I mentioned there... along with a variety of guis.
abushcrafter

Joined: 30 Oct 2009 Posts: 1447 Location: England
Posted: Fri 18 Feb 2011, 09:22
disciple wrote:
Another one! Thanks.
Is it really Python/Gnome, or just PyGtk?

I have not tried it yet because I could not face getting and compiling any more Python bindings, and I am short of time. Its dependencies are:
- python
- pygtk
- pycairo
- gnome-python2-gtkspell
- python-enchant
- python-imaging
- pypoppler
- tesseract (along with its dictionaries)
- python-imaging-sane (optional)
So I guess it's PyGtk, not Gnome.

disciple wrote:
If you haven't been following the ocropus thread, you might like to check out cuneiform, which I mentioned there... along with a variety of guis.

No, I haven't. Thanks for the pointer.
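Before compiling anything, a quick script could show which of the Python bindings in the dependency list above are already importable. This is only a sketch: the import names in `ASSUMED_IMPORT_NAMES` are my assumptions about what each package installs, the helper `missing_packages` is made up for the example, and it is written for a current Python 3 even though these bindings targeted Python 2. The tesseract binary itself is not a Python module, so it is left out.

```python
import importlib.util

# Package name from the list above -> assumed Python import name.
ASSUMED_IMPORT_NAMES = {
    "pygtk": "gtk",
    "pycairo": "cairo",
    "gnome-python2-gtkspell": "gtkspell",
    "python-enchant": "enchant",
    "python-imaging": "PIL",
    "pypoppler": "poppler",
    "python-imaging-sane (optional)": "sane",
}


def missing_packages(import_names=ASSUMED_IMPORT_NAMES):
    """Return the sorted package names whose module cannot be located."""
    return sorted(
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


print("Missing:", missing_packages())
```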
disciple
Joined: 20 May 2006 Posts: 6179 Location: Auckland, New Zealand
Posted: Fri 18 Feb 2011, 19:38
abushcrafter wrote:
I have not tried it yet because I could not face getting and compiling any more Python bindings, and I am short of time.

I know the feeling.
Thanks for the list of dependencies - I couldn't find it for some reason.
boxR

Joined: 13 Aug 2011 Posts: 149 Location: France
Posted: Tue 01 Jan 2013, 19:51
And now, what is your favorite OCR + GUI? What do you use?
Happy New Year