Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
All times are UTC - 4
[bash] variable persistence in piping
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Wed 28 Jul 2010, 09:39    Subject: [bash] variable persistence in piping

I'm learning a bit of shell scripting and trying to set up a simple Gmail checker script.

This is what I got reading here and there:

Code:

#!/bin/bash

opentag=\<title\>
closetag=\</title\>

#get Gmail rss atom
rss=$(curl -su USER:PASS https://mail.google.com/mail/feed/atom)

#find <title> lines
lines=$(echo "$rss" | grep "$opentag")

#line number
lnum=0

#iterate each line
echo "$lines" | while read -r line
do
    #strip open tag
    line=${line#*$opentag}
    #strip close tag
    line=${line%$closetag*}
    #skip the first occurrence since it's the title of the whole XML feed
    if (( lnum )); then
        echo "$lnum $line"
    fi
    (( lnum ++ ))
done

#show number of messages
echo "$lnum messages"



The problem is that the lnum variable I declare is not shared with the piped while loop: the loop runs in a subshell with its own copy, so the increments never reach the outer shell and at the end I get "0 messages".

The only solution I found is to save the variables to files and avoid piping, like this:

Code:

#!/bin/bash

opentag=\<title\>
closetag=\</title\>

#get Gmail rss atom on temp file
curl -su USER:PASS https://mail.google.com/mail/feed/atom > /tmp/gmailrss

#find <title> lines
grep "$opentag" /tmp/gmailrss > /tmp/gmaillines

#line number
lnum=0

#iterate each line
while read -r line
do
    #strip open tag
    line=${line#*$opentag}
    #strip close tag
    line=${line%$closetag*}
    #skip the first occurrence since it's the title of the whole XML feed
    if (( lnum )); then
        echo "$lnum $line"
    fi
    (( lnum ++ ))
done < /tmp/gmaillines

echo "$lnum messages"


I'm used to programming in Python, and using two files instead of two variables is... well... I won't even start listing the cons.

I hope there's a better way than piping to feed the content of my variables to read and grep, so that no new subshell is started and I can keep using my lnum counter.

I know this topic belongs on a bash scripting forum, but there are a lot of good shell script writers here, so why not?
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Wed 28 Jul 2010, 09:59

This can be a workaround, but I might need to get more than a line count out of the loop I still run...

Code:

#!/bin/bash

opentag=\<title\>
closetag=\</title\>

#get Gmail rss atom
rss=$(curl -su USER:PASS https://mail.google.com/mail/feed/atom)

#find <title> lines
lines=$(echo "$rss" | grep "$opentag")

#iterate each line
echo "$lines" | while read -r line
do
    #strip open tag
    line=${line#*$opentag}
    #strip close tag
    line=${line%$closetag*}
    #skip the first occurrence since it's the title of the whole XML feed
    if (( lnum )); then echo "$lnum $line"
    else lnum=0
    fi
    (( lnum ++ ))
done

#show number of messages
lnum=$(echo "$lines" | wc -l)
echo "Total $lnum messages"

ken geometrics

Joined: 23 Jan 2009
Posts: 76
Location: California

Posted: Wed 28 Jul 2010, 10:43    Subject: Re: [bash] variable persistence in piping

neurino wrote:
I'm learning a bit of shell scripting and trying to set up a simple Gmail checker script.

This is what I got reading here and there:

Code:

echo "$lines" | while read -r line


This line makes bash run the while loop in a subshell and feed the output of the echo into its stdin. Try a "here-document" instead.

Code:

while read -r line ; do
echo "do stuff"
done <<XYZZY
$lines
XYZZY


This shouldn't start a subshell. If you don't start a subshell, you don't lose your variables.
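
To see the difference side by side, here is a minimal sketch. The sample subjects are made up and stand in for the grep output; the counter names piped and kept are only for illustration. It assumes bash with its default settings, where every part of a pipeline runs in a subshell.

```shell
#!/bin/bash
# Sample data standing in for the grep output (the subjects are made up).
lines=$(printf '%s\n' 'feed title' 'first subject' 'second subject')

# Pipe: the loop runs in a subshell, so its increments are lost on exit.
piped=0
echo "$lines" | while read -r line; do
    piped=$(( piped + 1 ))
done

# Here-document: the loop runs in the current shell, so the count survives.
kept=0
while read -r line; do
    kept=$(( kept + 1 ))
done <<EOF
$lines
EOF

echo "piped=$piped kept=$kept"
```

This should print "piped=0 kept=3". In bash, a here-string (done <<< "$lines") or process substitution (done < <(some_command)) should behave the same way as the here-document, since neither puts the loop in a pipeline.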
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Wed 28 Jul 2010, 10:53

Pretty brilliant!

thank you
technosaurus


Joined: 18 May 2008
Posts: 4353

Posted: Wed 28 Jul 2010, 20:38

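You can set up each section of code as a function and have it hand back values of your choosing (note that return only carries an exit status from 0 to 255, so anything else has to be printed and captured with $(...)), or you can use export, although export only passes variables down to child processes, never back up to the parent (and it is probably not secure to export the user and password anyway).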
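
A minimal sketch of the function approach, assuming the value is handed back on stdout and captured with command substitution. The helper name count_titles and the sample feed are hypothetical, not part of the original script.

```shell
#!/bin/bash
# count_titles is a hypothetical helper: it prints its result on stdout,
# because return can only carry an exit status (0-255), not a string.
count_titles() {
    # grep -c prints the number of lines that match
    printf '%s\n' "$1" | grep -c '<title>'
}

# Made-up sample standing in for the downloaded feed.
feed='<title>Inbox</title>
<title>first subject</title>
<title>second subject</title>'

# The caller captures the printed value; no variable has to cross a subshell.
total=$(count_titles "$feed")
echo "$total title lines"
```

With this sample it should print "3 title lines". The trade-off is that each call hands back a single string, which gets awkward when several variables have to be updated at once.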
_________________
Web Programming - Pet Packaging 100 & 101
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Thu 29 Jul 2010, 03:47

You're right, but for what I needed in this case (just keeping variables in sync inside and outside a loop) using functions would be overkill.
Just imagine I had to change the value of 10 existing variables inside the loop: I would have to pass them all to the function and get them all returned back...

A good solution for other cases though... thanks for answering.
potong

Joined: 06 Mar 2009
Posts: 88

Posted: Thu 29 Jul 2010, 09:39

I have found that bash scripts can be a lot simpler when you can arrange the data the way you want to end up using it.

Bash is the "glue" for the countless other command line tools.

For instance: why stop at grep, having retrieved information from the web?

Other little languages can do far more, think sed/awk/perl etc

If you are going to invoke a 2nd process, make it count!

Here's an alternative:
Code:
#!/bin/bash

# set some variables
authority="USER:PASSWORD@"
url="https://${authority}mail.google.com/mail/feed/atom"
db_file=/tmp/debug.txt
tag="title"

# set IFS to newline for slurping lines into arrays
IFS="
"
# save result of gmail messages into an array
# save results of curl into a debug file (drop when goes to production)
# use sed to drop first tag (line 3) and strip remaining ones
line=($(curl -s "$url" |tee "$db_file" |sed -nr '3d;s|<('$tag'>)(.*)</\1|\2|p'))

# show number of messages
echo "${#line[@]} messages"
# show messages
# pass the subject as a %s argument so a "%" in a subject cannot break the format
for (( i=0;i<${#line[@]};i++)); do printf '[%02d] %s\n' $((i+1)) "${line[i]}"; done


Also see here for more bash best practices.

HTH

Potong
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Thu 29 Jul 2010, 10:38

Thank you, I'll have to take a closer look at sed and awk.

Since I'm trying to learn a bit of shell scripting, I'd prefer to avoid other 'proper' languages like Perl and similar: this would be an easy task for me in Python, and I could use bash just to call the Python script, but I wouldn't learn any bash that way ^^

Anyway, using bash only as glue is THE right way... and thanks for the very useful wiki you linked.

P.S.: your script needs a fix, because as posted it outputs a new mail for each word in the subject:

Code:

# set IFS to newline for slurping lines into arrays
IFS=$'\n'


found googling around for IFS
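
Why the IFS value matters can be checked with a small sketch like the following. The two subjects are made up, and the array names by_word and by_line are only for illustration.

```shell
#!/bin/bash
# Two made-up subjects, one per line, standing in for the sed output.
subjects=$'Hello world\nSecond mail here'

# Default IFS (space, tab, newline): every word becomes an array element.
by_word=($subjects)

# Newline-only IFS: every line becomes an array element.
# The old value is restored so the rest of the script is unaffected.
old_ifs=$IFS
IFS=$'\n'
by_line=($subjects)
IFS=$old_ifs

echo "${#by_word[@]} words, ${#by_line[@]} lines"
```

This should print "5 words, 2 lines". Note that an unquoted expansion is still subject to filename globbing, so a subject containing * or ? could expand to file names; on bash 4 or later, mapfile -t by_line <<< "$subjects" is one way to avoid both the IFS change and the globbing.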
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Thu 29 Jul 2010, 11:08

Another thing I hate about shell scripting and its tools is whitespace handling: readability is awful... all the more so for someone coming from Python!

I had to re-indent and space out your code to understand how it works.
Then I had to reset it, otherwise it doesn't work...

Edit: adding spaces doesn't work, but adding newlines does... sorry...

The way you suggested is way way simpler... here are names and links extracted the same way

Code:


#!/bin/bash

# set some variables
authority="USER:PASS@"
url="https://${authority}mail.google.com/mail/feed/atom"
db_file=/tmp/debug.txt

titletag="title"
nametag="name"
linktag="link"

# set IFS to newline for slurping lines into arrays
IFS=$'\n'

#get feed
feed=$(curl -s $url)
 
# save subjects into an array
titles=($(echo "$feed" | \
    sed -nr '
        #cut 3rd line
        3d
        #get only 2nd group ( \2 ) in regexp
        s|<('$titletag'>)(.*)</\1|\2|p
        ') \
    )

# save names into another array
names=($(echo "$feed" | \
    sed -nr '
        #get only 2nd group ( \2 ) in regexp
        s|<('$nametag'>)(.*)</\1|\2|p
        ') \
    )

# save links to conversations
links=($(echo "$feed" | \
    sed -nr '
        s|<'$linktag'.*href="(.*)".*/>|\1|p
        ') \
    )

# show number of messages
echo "${#titles[@]} messages"
# show messages
# pass the data as %s arguments so a "%" in a subject cannot break the format
for (( i=0; i<${#titles[@]}; i++ )); do
    printf '[%02d] %s | %s\n' $(( i + 1 )) "${names[i]}" "${titles[i]}"
    printf 'URL: %s\n' "${links[i]}"
done


neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Thu 29 Jul 2010, 11:34

Just a question: does sed work on a line-by-line basis?

If so, there's no chance to extend the regexp to match only <title> tags enclosed in <entry> tags:

Code:

sed -nr 's|<entry>.*<('$titletag'>)(.*)</\1.*</entry>|\2|p'



P.S.: Forum admins, please switch code blocks to a monospaced font! It's all about readability... come on, it's not Linuxish!
Doesn't your terminal use that kind of font? I guess there's a precise reason for that...


Sorry for the big font.
potong

Joined: 06 Mar 2009
Posts: 88

Posted: Fri 30 Jul 2010, 02:51

neurino:

If you look at the XML file provided by Google via the curl command, you will see that the <entry> and </entry> tags are each on a separate line. So...
Code:
tag1=entry tag2=title
line=($(curl -s $url |tee $db_file |sed -nr '/<'$tag1'>/,/<\/'$tag1'>/s|<('$tag2'>)(.*)</\1|\2|p'))

should fit the bill
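
The range address can be tried offline with a sketch like this one. The feed below is a made-up, trimmed-down sample (the real Gmail feed has more elements), and the tag names are written out literally instead of using $tag1 and $tag2.

```shell
#!/bin/bash
# Hypothetical, trimmed-down feed; the real Gmail feed has more elements.
feed='<feed>
<title>Gmail - Inbox</title>
<entry>
<title>first subject</title>
</entry>
<entry>
<title>second subject</title>
</entry>
</feed>'

# Between a line matching <entry> and the next line matching </entry>,
# print only the text inside <title>...</title>.
titles=$(printf '%s\n' "$feed" |
    sed -n '/<entry>/,/<\/entry>/s|.*<title>\(.*\)</title>.*|\1|p')

printf '%s\n' "$titles"
```

Only the two subjects should be printed; the feed-level title sits outside any <entry> range, so it is skipped without needing the 3d trick.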

However if you have a not so nicely formatted xml file try:
Code:
tag1=entry tag2=title
line=($(curl -s $url |xmllint --format - |tee $db_file |sed -nr '/<'$tag1'>/,/<\/'$tag1'>/s|<('$tag2'>)(.*)</\1|\2|p'))

A good sed tutorial can be found here.

HTH

Potong

p.s. a tip to grab code from the browser is to select it then switch to a terminal and type:
Code:
xclip -o>filename && chmod +x filename && ./filename

of course make sure the code is not malicious first!
neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Fri 30 Jul 2010, 11:05

potong wrote:
neurino:

If you look at the XML file provided by Google via the curl command, you will see that the <entry> and </entry> tags are each on a separate line. So...
Code:
tag1=entry tag2=title
line=($(curl -s $url |tee $db_file |sed -nr '/<'$tag1'>/,/<\/'$tag1'>/s|<('$tag2'>)(.*)</\1|\2|p'))

should fit the bill



That comes in handy, since the "name" tag is used not only for the mail author but also for "contributors", so I need some way to parse the XML...
ljfr

Joined: 23 Apr 2009
Posts: 176

Posted: Fri 30 Jul 2010, 14:00    Subject: xmllint shell

Hi,

To do basic parsing you could have a look at xmllint shell:
Code:
echo "cat //contributors/name" | xmllint $my_xml_file_path --shell

...but the results would need some formatting,
or you could use a small XML parser built using libxml2 (find an example attached),

regards,
Attachment: xmltool.c.tar.bz2 (2.87 KB, downloaded 276 times)
technosaurus


Joined: 18 May 2008
Posts: 4353

Posted: Fri 30 Jul 2010, 18:26

you may also be able to use sed. See the script here:
http://www.dotkam.com/2007/04/04/sed-to-parse-and-modify-xml-element-nodes/

Edit
for multiple parameters within the same tag you'd need another function that does something like

Code:
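my_untested_function(){
    # prints the result: return can only carry an exit status, not a string
    OUT=""
    for x in "$@"; do
        # keep only the fields that mention $PARAM and replace everything after the "="
        x=$(echo "$x" | grep "$PARAM" | sed "s/=.*/=$NEWVALUE/") && OUT="$OUT $x"
    done
    echo "$OUT"
}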


where $@ (input) is the entire contents of the tag, PARAM is the field before the "=" (name, etc...) and $NEWVALUE is the new value that you want to assign to the field

neurino


Joined: 15 Oct 2009
Posts: 360

Posted: Sat 31 Jul 2010, 08:43

Thank you all. Reading the sed tutorial linked above, I found out how to filter tags according to the preceding line (for example, only <name> tags right after <author> ones) using the N command, and how to match empty subjects and author names, which otherwise cause wrong array splits.

edit, here is the code, if someone wants to test it:

Code:

#!/bin/bash

# set some variables
authority="USER:PASSWD@"
url="https://${authority}mail.google.com/mail/feed/atom"

# set IFS to newline for slurping lines into arrays
IFS=$'\n'

#get feed
feed=$(curl -s "$url")
 
#<author>=><name>
authors=($(echo "$feed" | \
    sed -nr '\|<author>| {
            N
            \|<author>.*<name>.*</name>| {
            #in case of NOT empty tag
            s|<author>.*<name>(.+)</name>|\1| p
            #in case of empty tag
            s|<author>.*<name></name>|--| p
        }
    }'
))

#<entry>=><title>
subjects=($(echo "$feed" | \
    sed -nr '\|<entry>| {
            N
            \|<entry>.*<title>.*</title>| {
            #in case of NOT empty tag
            s|<entry>.*<title>(.+)</title>|\1| p
            #in case of empty tag
            s|<entry>.*<title></title>|no subject| p
        }
    }'
))

#<summary>=><link>
links=($(echo "$feed" | \
    sed -nr '\|<summary>| {
            N
            \|<summary>.*<link.*/>| {
            #([^"]*) instead of .* is sed way for non greedy matches
            s|<summary>.*<link.*href="([^"]*)".*/>|\1| p
        }
    }'
))

# show number of messages
echo "${#authors[@]} messages"
# show messages
# pass the data as %s arguments so a "%" in a subject cannot break the format
for (( i=0; i<${#authors[@]}; i++ )); do
    printf '[%02d] %s | %s\n' $(( i + 1 )) "${authors[i]}" "${subjects[i]}"
    printf 'URL: %s\n\n' "${links[i]}"
done
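
The N-based filtering in the script above can be tried without a Gmail account. This is a minimal sketch on a made-up entry, assuming (as in the real feed) that <name> sits on the line right after <author>; it leaves out the empty-tag handling.

```shell
#!/bin/bash
# Made-up entry: one <name> follows <author>, another follows <contributor>.
feed='<entry>
<title>hello</title>
<author>
<name>Alice</name>
</author>
<contributor>
<name>Bob</name>
</contributor>
</entry>'

# On a line matching <author>, N appends the next line to the pattern space,
# so the substitution only ever sees a <name> that directly follows <author>.
authors=$(printf '%s\n' "$feed" | sed -n '/<author>/{
N
s|.*<name>\(.*\)</name>.*|\1|p
}')

echo "$authors"
```

With this sample only "Alice" should be printed; the contributor's <name> is never joined to an <author> line, so it never reaches the substitution.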