Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
PeasyScan Image Scanner Program
Page 8 of 8 [114 Posts]
Argolance


Joined: 06 Jan 2008
Posts: 3070
Location: PORT-BRILLET (Mayenne - France)

Posted: Mon 06 Nov 2017, 06:11

Hello,
rcrsn51 wrote:
Peasyscan generates some large, temporary PNM image files in /root. They are deleted when the program terminates. Maybe they should be placed elsewhere.
Yes indeed! If the program does not terminate properly for any reason, the image will not be deleted and may cause problems with a small, nearly full pupsave. So, just out of curiosity, why /root/scan (by default)?

Regards.

rcrsn51


Joined: 05 Sep 2006
Posts: 11729
Location: Stratford, Ontario

Posted: Mon 06 Nov 2017, 08:20

That was the original program in 2010. Now the large pnm file is stored in /tmp. Only the final png file is stored in /root after you save it.
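
For readers worried about leftover temporary files, here is a minimal sketch of one common way to clean up a file in /tmp even when a script is interrupted. It is not taken from the PeasyScan script; SCAN_TMP and cleanup are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: SCAN_TMP and cleanup are hypothetical names, not PeasyScan's.

# Create a uniquely named temporary file for the raw scan data.
SCAN_TMP=$(mktemp /tmp/peasyscan.XXXXXX) || exit 1

# Remove the temporary file; safe to call more than once.
cleanup() {
    rm -f "$SCAN_TMP"
}

# Run cleanup whenever the script exits.
trap cleanup EXIT
# Turn common termination signals into a normal exit so the EXIT trap runs.
trap 'exit 1' HUP INT TERM

# ... the scan would write its PNM data to "$SCAN_TMP" here ...
```

A trap cannot help if the process is killed with SIGKILL or the machine loses power, but on Puppy /tmp normally lives outside the pupsave, so a stray file there is less of a problem than one in /root.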
Argolance


Joined: 06 Jan 2008
Posts: 3070
Location: PORT-BRILLET (Mayenne - France)

Posted: Mon 06 Nov 2017, 12:43

rcrsn51 wrote:
That was the original program in 2010. Now the large pnm file is stored in /tmp. Only the final png file is stored in /root after you save it.
OK! Thanks.
Argolance


Joined: 06 Jan 2008
Posts: 3070
Location: PORT-BRILLET (Mayenne - France)

Posted: Fri 10 Nov 2017, 06:32

Hello,
- While scanning an image for OCR, I noticed that PeasyScan searches for the "tessdata" directory inside /usr/share/, while it is (usually?) inside /usr/share/tesseract-ocr/. So the conversion to text is not done.
Code:
ls: cannot access /usr/share/tessdata/*.traineddata: No such file or directory
pnmtotiff: computing colormap...
pnmtotiff: Too many colors - proceeding to write a 24-bit RGB file.
pnmtotiff: If you want an 8-bit palette file, try doing a 'pnmquant 256'.
Error opening data file /usr/share/tesseract-ocr/tessdata/.traineddata.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language '.traineddata'
Tesseract couldn't load any languages!

If I make a symbolic link from /usr/share/tessdata to /usr/share/tesseract-ocr/tessdata, it runs well.
I then had a look at the PeasyScan script and changed the line:
Code:
LANGUAGE=$(basename $(ls -1 /usr/share/tessdata/*.traineddata | head -n1) .traineddata)

to:
Code:
LANGUAGE=$(basename $(ls -1 /usr/share/tesseract-ocr/tessdata/*.traineddata | head -n1) .traineddata)

And now all is OK!

- When scanning to PDF, the generated file has no .pdf extension unless one is typed as part of the file name in the name field.
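
One possible way to handle the missing extension, sketched here under assumptions: ensure_extension is a hypothetical helper, not a function in the PeasyScan script, and it only shows the idea of appending the suffix when the name does not already carry it.

```shell
#!/bin/sh
# Sketch only: ensure_extension is a hypothetical helper, not part of PeasyScan.

# Print NAME with ".EXT" appended, unless NAME already ends in ".EXT".
ensure_extension() {
    name=$1
    ext=$2
    case "$name" in
        *."$ext") printf '%s\n' "$name" ;;
        *)        printf '%s\n' "$name.$ext" ;;
    esac
}

ensure_extension "scan" pdf       # prints scan.pdf
ensure_extension "scan.pdf" pdf   # prints scan.pdf
```

The comparison is case-sensitive, so a name ending in ".PDF" would still get ".pdf" appended.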

Small suggestion: would it be possible to display the text file using defaulttextviewer at the end of the OCR process, and the PDF file using defaultpdfviewer? I think that is what the user expects, rather than the image, which is not really welcome in this case.

I think this could be useful.

Regards.

rcrsn51


Joined: 05 Sep 2006
Posts: 11729
Location: Stratford, Ontario

Posted: Fri 10 Nov 2017, 07:56

Argolance wrote:
- While scanning an image for OCR, I noticed that PeasyScan searches for the "tessdata" directory inside /usr/share/, while it is (usually?) inside /usr/share/tesseract-ocr/. So the conversion to text is not done.

Where did you get your "tessdata" package? On page 1, I have given the instruction:
Quote:
3. Copy the file xxx.traineddata to /usr/share/tessdata

Quote:
Small suggestion: would it be possible to display the text file using defaulttextviewer at the end of the OCR process, and the PDF file using defaultpdfviewer? I think that is what the user expects, rather than the image, which is not really welcome in this case.

Try this: Between lines 114 and 115, insert
Code:
 defaulttexteditor "$SAVEFILENAME"

I think that seeing the image is still useful. Its quality determines how well the OCR works.
Argolance


Joined: 06 Jan 2008
Posts: 3070
Location: PORT-BRILLET (Mayenne - France)

Posted: Tue 14 Nov 2017, 08:13

Hello,
rcrsn51 wrote:
Where did you get your "tessdata" package? On page 1, I have given the instruction:
Quote:
3. Copy the file xxx.traineddata to /usr/share/tessdata

I simply installed Tesseract and all its required dependencies from the PPM...
Quote:
Try this: Between lines 114 and 115, insert
Code:
 defaulttexteditor "$SAVEFILENAME"

That is what I did for my own use, and the same for the PDF file: with some minor changes, it works quite well.

Now, speaking as a simple user, a few things are a bit confusing:
    - "Select the image format"? PDF and TXT are not "images" as such.
    - "Name the scanned image as"? The name "scan" is given not only to scanned images but to the *.png, *.jpg, *.pdf and *.txt files, so "scan" is only their base name?

In the basic recipe for using PeasyScan, you first mention:
1. Select the image format.
This order is necessary when automating scans, because everything is done in one pass. But I noticed that when scanning a single document, it is possible to select the image format (the "Output file type") after scanning, just before saving. So the same scanned image can be saved in any of the image formats, as PDF or as text (provided that the script is adapted for that; I did this for my own use too).
This can be useful.

Regards.

rcrsn51


Joined: 05 Sep 2006
Posts: 11729
Location: Stratford, Ontario

Posted: Tue 14 Nov 2017, 09:09

When I added OCR to PeasyScan in 2010, I included my own Tesseract 3.00 package. It's still on page 1. Its default location is /usr/share/tessdata, so PeasyScan is written to look there.

If you get a Tesseract package elsewhere, I can't predict where the language files will be located. So you need to provide the link into /usr/share/tessdata.
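
A minimal sketch of providing that link, assuming the language files were installed under /usr/share/tesseract-ocr/tessdata as reported earlier in this thread. link_tessdata is a hypothetical helper written for illustration; it leaves anything already present at the expected location untouched and refuses to create a dangling link.

```shell
#!/bin/sh
# Sketch only: link_tessdata is a hypothetical helper.
# Usage: link_tessdata ACTUAL_DIR EXPECTED_DIR

link_tessdata() {
    actual=$1
    expected=$2
    # Do nothing if something already exists at the expected location.
    if [ -e "$expected" ] || [ -L "$expected" ]; then
        return 0
    fi
    # Refuse to create a link that would point at a missing directory.
    [ -d "$actual" ] || return 1
    ln -s "$actual" "$expected"
}

# On the system described in this thread, the call would be (as root):
# link_tessdata /usr/share/tesseract-ocr/tessdata /usr/share/tessdata
```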

I agree that the phrases are confusing. The original versions of PeasyScan only saved to graphics files, so they made sense then.

How about "Select the output format" and "Name the output file as"?
Argolance


Joined: 06 Jan 2008
Posts: 3070
Location: PORT-BRILLET (Mayenne - France)

Posted: Tue 14 Nov 2017, 12:29

rcrsn51 wrote:
When I added OCR to PeasyScan in 2010, I included my own Tesseract 3.00 package. It's still on page 1. Its default location is /usr/share/tessdata, so PeasyScan is written to look there.

When I installed Tesseract, as is the case for all PPM users, Puppy didn't ask me where to copy the tessdata folder; it put it inside the /usr/share/tesseract-ocr directory, which seems to be the default/usual one. In any case, I think it would be appropriate for your script to take this into account and search this directory too.
Quote:
If you get a Tesseract package elsewhere, I can't predict where the language files will be located. So you need to provide the link into /usr/share/tessdata.
It is not 'elsewhere' but very probably 'where' most (ToOpPy) users will usually find the package and install it from.
Quote:
I agree that the phrases are confusing. The original versions of PeasyScan only saved to graphics files, so they made sense then.

Your script, as always, is very interesting, as simple and efficient as possible... but it looks a bit rough for my taste!
So I took the liberty of dressing it up a little, and had fun doing so! I only hope I have not impaired its functions!
Quote:
How about "Select the output format" and "Name the output file as"?
See the pictures below; these are the choices I made...

Regards.
[Attachment: 171116_190816_376x42_easyshot.png, 2.74 KB]
[Attachment: 171116_190837_598x299_easyshot.png, 31.47 KB]

Last edited by Argolance on Thu 16 Nov 2017, 14:10; edited 1 time in total
rcrsn51


Joined: 05 Sep 2006
Posts: 11729
Location: Stratford, Ontario

Posted: Tue 14 Nov 2017, 16:30

I have updated my version to look for the language files in both places.
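
The updated script itself is not shown in this thread. As a rough sketch of how a lookup across both locations might be written, assuming only the two directories mentioned above: find_ocr_language is a hypothetical helper, not the actual PeasyScan code.

```shell
#!/bin/sh
# Sketch only: find_ocr_language is a hypothetical helper, not the actual
# PeasyScan code.

# Print the name of the first language file found in the given directories,
# without the .traineddata suffix. Returns 1 if none is found.
find_ocr_language() {
    for dir in "$@"; do
        for file in "$dir"/*.traineddata; do
            # If the glob matched nothing, $file is the literal pattern.
            [ -f "$file" ] || continue
            basename "$file" .traineddata
            return 0
        done
    done
    return 1
}

LANGUAGE=$(find_ocr_language /usr/share/tessdata /usr/share/tesseract-ocr/tessdata) ||
    echo "No Tesseract language files found" >&2
```

Directories are searched in the order given, so the original /usr/share/tessdata location still takes precedence when both exist.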