Looking for Bash script to text search wildcard PDF's

Semme
Posts: 8399
Joined: Sun 07 Aug 2011, 20:07
Location: World_Hub


#1 Post by Semme »

:D Greetings!

I'd like to have this modified to:
    1) Open a shell.
    2) Ask for a path.
    3) Ask for a pattern.
    4) Results to stdout.
I'm not after a Gtk window as I'd prefer to keep the mouse quiet.

If easy enough and you have the time, I'd love a walk-through.

Thanks.

SFR
Posts: 1800
Joined: Wed 26 Oct 2011, 21:52

#2 Post by SFR »

Hey Semme

You mean something (simple) like this?

Code: Select all

#!/bin/bash
tail -n +5 "$0" > /tmp/pdfgrepcli				# Copy itself, except first 4 lines, to /tmp/pdfgrepcli
exec xterm -hold -e bash /tmp/pdfgrepcli		# open terminal and execute /tmp/pdfgrepcli
# -----------------------------------------------------------------------------
[ ! `which pdfgrep` ] && echo "Install 'pdfgrep' first, exiting..." && exit 1

read -p "Path: " PDFPATH
read -p "Pattern: " PATTERN

find "$PDFPATH" -type f -iname "*.pdf" -exec pdfgrep "$PATTERN" {} +
Regarding no. 1: if I understood correctly, you want a terminal window to open when the script is clicked, right?
If not, just comment out or delete the 2nd and 3rd lines (tail... and exec...).
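
For a terminal-only version (lines 2 and 3 removed), a minimal sketch could look like the one below. It is not the script above: the function names `require_tool` and `search_prompted` and the optional "tool" argument are hypothetical, and `command -v` is swapped in for the `which` test as a more portable way to check for a program.

```shell
#!/bin/bash
# Terminal-only sketch: no xterm relaunch. Function names and the optional
# "tool" argument are hypothetical, added for illustration only.

# Return 0 if the named program is on PATH; otherwise print a hint and return 1.
require_tool() {
    command -v "$1" >/dev/null 2>&1 && return 0
    echo "Install '$1' first, exiting..." >&2
    return 1
}

# Read a path and a pattern from stdin, then search every *.pdf under the path.
# $1 (optional) names the search program; it defaults to pdfgrep.
search_prompted() {
    local tool=${1:-pdfgrep} pdfpath pattern
    require_tool "$tool" || return 1
    read -r -p "Path: " pdfpath        # -r keeps backslashes in the input literal
    read -r -p "Pattern: " pattern
    find "$pdfpath" -type f -iname '*.pdf' -exec "$tool" "$pattern" {} +
}
```

Calling `search_prompted` at the end of the file runs it interactively. The "tool" argument is only there so the flow can be tried with plain `grep` on text files when pdfgrep is not installed.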

BTW: From the link you posted there - Puppy (at least Slacko) also has 'pdftotext', and in combination with 'grep --color=always' it looks quite nice; check it out. :wink:

Code: Select all

pdftotext /usr/share/examples/ps-pdf/Acrobat.pdf - | grep --color=always "document"
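
The one-liner above handles a single file. A rough sketch of the same idea applied to every PDF under a directory follows; `grep_pdfs` and its optional third argument (the extractor command, defaulting to `pdftotext`) are hypothetical names, and the file name prefix is added by hand because grep only sees stdin here.

```shell
# Sketch: run a text extractor over every PDF under a directory and grep its output.
# grep_pdfs and its optional third argument are hypothetical names.
grep_pdfs() {
    local dir=$1 pattern=$2 extract=${3:-pdftotext} file line
    # -print0 / read -d '' keeps file names with spaces or newlines intact
    find "$dir" -type f -iname '*.pdf' -print0 |
    while IFS= read -r -d '' file; do
        # "file -" asks pdftotext to write the extracted text to stdout
        "$extract" "$file" - 2>/dev/null </dev/null | grep -e "$pattern" |
        while IFS= read -r line; do
            printf '%s: %s\n' "$file" "$line"   # prefix each hit with its file
        done
    done
}
```

Usage would be something like `grep_pdfs /usr/share/examples document`. The sketch leaves out `--color=always`; adding it to the grep call should give highlighted matches in a terminal.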
Greetings!
[O]bdurate [R]ules [D]estroy [E]nthusiastic [R]ebels => [C]reative [H]umans [A]lways [O]pen [S]ource
Omnia mea mecum porto.

Semme
Posts: 8399
Joined: Sun 07 Aug 2011, 20:07
Location: World_Hub

#3 Post by Semme »

AWESOME! Works a treat SFR. 8)

Is tail -n +5 "$0" > /tmp/pdfgrepcli setup as a buffer to store input while waiting for further instruction?

OK.. xterm >> retain window, have bash execute (code I have yet to understand):

Code: Select all

[ ! `which pdfgrep` ] && echo "Install 'pdfgrep' first, exiting..." && exit 1 

read -p "Path: " PDFPATH 
read -p "Pattern: " PATTERN 

find "$PDFPATH" -type f -iname "*.pdf" -exec pdfgrep "$PATTERN" {} +
The rest is gonna take me time to study and digest.

A little Bash know-how coupled with regex, right?

SFR
Posts: 1800
Joined: Wed 26 Oct 2011, 21:52

#4 Post by SFR »

Hey Semme

Glad it works. :)
Semme wrote: Is tail -n +5 "$0" > /tmp/pdfgrepcli setup as a buffer to store input while waiting for further instruction?
This is one of the first tricks I have learned. 8)
It's a kind of "self-extracting" routine: this line copies the script itself (its path is in $0), minus the first 4 lines, to a temp file. Then, in order to open a terminal window, 'exec xterm...' hands execution over to (let's say, jumps to) that file.
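
The trick can be tried without xterm. The sketch below is only an illustration: it builds a throwaway five-line script in a scratch directory (all paths and the `.payload` suffix are made up for the demo) and uses plain `exec bash` in place of `exec xterm -hold -e bash`.

```shell
#!/bin/bash
# Build a tiny self-extracting script in a scratch directory (hypothetical paths).
workdir=$(mktemp -d)
cat > "$workdir/demo.sh" <<'EOF'
#!/bin/bash
tail -n +5 "$0" > "$0.payload"   # copy line 5 onward into a second file
exec bash "$0.payload"           # replace this process with bash running the copy
# ---- everything below this line is the payload ----
echo "payload running from: $(basename "$0")"
EOF

# Run the launcher; only the payload (line 5) produces output.
output=$(bash "$workdir/demo.sh")
echo "$output"
rm -rf "$workdir"
```

Because of `exec`, nothing after line 3 of the launcher ever runs in the original process; the payload's `$0` is the copied file, not the launcher.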

The next line makes sure that 'pdfgrep' is really available (which pdfgrep), then 'read' gets the user's input, and finally comes the line you supplied. However, I never use 'find ... -exec'

Code: Select all

-exec pdfgrep "$PATTERN" {} +
so honestly I don't fully understand that syntax. :roll:
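
For what it's worth, the `{}` is replaced by the file names find has matched, and the closing `+` (as opposed to `\;`) tells find to pack many names onto one command line instead of running the command once per file. A small sketch, with `echo` standing in for pdfgrep and a made-up scratch directory:

```shell
# Sketch: compare "-exec ... \;" with "-exec ... +", using echo as a stand-in for pdfgrep.
workdir=$(mktemp -d)          # hypothetical scratch directory
touch "$workdir/a.pdf" "$workdir/b.pdf" "$workdir/c.pdf"

# "\;" runs the command once per matching file: three separate echo calls.
per_file=$(find "$workdir" -type f -iname '*.pdf' -exec echo run: {} \; | grep -c '^run:')

# "+" appends as many file names as fit onto one command line: a single echo call here.
batched=$(find "$workdir" -type f -iname '*.pdf' -exec echo run: {} + | grep -c '^run:')

echo "calls with \\;: $per_file"
echo "calls with +: $batched"
rm -rf "$workdir"
```

With pdfgrep the batched form means fewer process launches, and since pdfgrep then sees several files at once it can label each match with the file it came from.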

Greetings!
