Split Index and File Name from string?

RSH
Posts: 2397
Joined: Mon 05 Sep 2011, 14:21
Location: Germany

#1 Post by RSH »

Hi.

I have list files containing strings like 59170-"LazY-Fred-English-Locals.tar.gz", which is the last entry in one list file.

This command COUNT=$((`tail -1 $INDEXFILE | cut -d "-" -f1`)) gives me 59170 as the result in $COUNT, taken from the last entry of the list file.

Let's say I have a single string (not a file) 59170-"LazY-Fred-English-Locals.tar.gz" in $OUTSTR.

What do I have to change in COUNT=$((`tail -1 $INDEXFILE | cut -d "-" -f1`)) to get the index and the file name (without the double quotes) as results in $INDEX and $FILENAME?

INDEX= ? ? ?
FILENAME= ? ? ?

Thanks

RSH
[b][url=http://lazy-puppy.weebly.com]LazY Puppy[/url][/b]
[b][url=http://rshs-dna.weebly.com]RSH's DNA[/url][/b]
[url=http://murga-linux.com/puppy/viewtopic.php?t=91422][b]SARA B.[/b][/url]

Bruce B

#2 Post by Bruce B »

It could be easier if all your 'indexes' were 5 digits like in the example above. Are they? If so, then something like this:

Code: Select all

STRING="59170-LazY-Fred-English-Locals.tar.gz"

# strip the five digits and the dash that follows them
FILENAME=`echo $STRING | sed 's/^.....-//'`

COUNT=`echo $STRING | cut -d - -f 1`

#test content of variables

echo \$STRING = $STRING
echo \$FILENAME = $FILENAME
echo \$COUNT = $COUNT

testing outputs

$STRING = 59170-LazY-Fred-English-Locals.tar.gz
$FILENAME = LazY-Fred-English-Locals.tar.gz
$COUNT = 59170
Last edited by Bruce B on Sat 08 Sep 2012, 06:09, edited 1 time in total.


#3 Post by RSH »

Bruce B wrote:It could be easier if all your 'indexes' were 5 digits like in the example above. Are they?

~
Unfortunately not.

Is there any bash function that gives me the position of the - ? I could do the rest using string functions that I know and have already used. I just need to find the - ! I could do it in a loop, but I thought there would be an easier way to do that.
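
One possible way to get that position without a loop, as a rough sketch: cut everything from the first - onward and take the length of what remains. The names prefix and pos are only for illustration; if the string has no - at all, pos simply ends up as the full string length.

```shell
#!/bin/bash
# Sketch: find the position of the first "-" without looping.
OUTSTR='59170-"LazY-Fred-English-Locals.tar.gz"'

prefix=${OUTSTR%%-*}       # everything before the first "-"
pos=${#prefix}             # 0-based offset of that "-"

echo "$pos"                # offset of the dash
echo "${OUTSTR:0:pos}"     # the index part
echo "${OUTSTR:pos+1}"     # the rest, double quotes still included
```

From there the usual ${var:offset:length} string functions can do the rest.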

Bruce B

#4 Post by Bruce B »

I think a loop would be helpful

Please post about 10 or more actual filenames

~


#5 Post by RSH »

Bruce B wrote:I think a loop would be helpful

Please post about 10 or more actual filenames

~
Hi Bruce B.

Thanks for your code example. Please take the files below. It would be nice to have a smarter solution.

How I solved this:

Code: Select all

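# written like this, the shell drops the inner double quotes,
# so INDEXSTR holds 108-grubfd01.zip
INDEXSTR="108-"grubfd01.zip""
#INDEXSTR="1168-"FreeSans.zip""
#INDEXSTR="58992-"LazY-FReD-1.0.2.sfs.gz""
#INDEXSTR="58993-"LazY-FReD-1.0.2.pet""
#INDEXSTR="59170-"LazY-Fred-English-Locals.tar.gz""
nlen=${#INDEXSTR}
echo $nlen

# walk through the string one character at a time until the first "-"
# careful: this never ends if the string contains no "-"
doloop="true"
start=0
while $doloop; do
	s0=${INDEXSTR:$start:1}
	echo $s0
	if [ "$s0" = "-" ]; then
		doloop="false"
	fi
	((start++))
	echo $start
done

# s1 = the index, s2 = the rest starting at the "-", s3 = s2 without the "-"
s1=${INDEXSTR:0:$start-1}
s2=${INDEXSTR:$start-1:$nlen-$start+1}
echo $s2
nlen=${#s2}
echo $nlen
s3=${s2:1:$nlen-1}
echo $s1
echo $s3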
I wonder whether this code could be turned into a function in a separate script that is called from another script and returns $s1 and $s3, or similarly $COUNT and $FILENAME?
RSH

Bruce B

#6 Post by Bruce B »

what the files have in common is the - after the numbers.

to get just the numbers you could use cut -d - -f 1

Bruce B

#7 Post by Bruce B »

I don't understand what you are trying to accomplish; if I knew, it would be most helpful.

BTW, if you want to remove the quotes from the file name in the variable, I have a better idea: don't use quotes, special characters or spaces in file names at all. I don't, and it makes writing scripts to handle files much easier.

~


#8 Post by RSH »

I am working on a program that can download the attached files from the murga forum, selectable from a list in a GUI. To do that I read out the database using attach&id, which is part of the download link of each attached file.

After building the index file I get the listed names as a combination of the index and the file name, formatted as shown: 108-"grubfd01.zip"

To download the file I need its index number. But the file is downloaded as viewtopic.php?mode=attach&id=108, which gives me absolutely no information about its file type. That is why I need the file name. After downloading it as viewtopic.php?mode=attach&id=108, I move the file and give it a new name: the file name.

This is the Project of it all.
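
The download-and-rename step described in this post could look roughly like the sketch below. BASE_URL is an assumption based on the forum links in this thread, and fetch_attachment is a made-up name. The sketch only prints the wget command instead of running it; with -O, wget would write straight to the real file name, so no separate move would be needed.

```shell
#!/bin/bash
# Sketch only: BASE_URL and fetch_attachment are assumptions for illustration.
BASE_URL="http://www.murga-linux.com/puppy/viewtopic.php"

fetch_attachment() {
	local entry=$1                # e.g. 108-"grubfd01.zip"
	local index=${entry%%-*}      # part before the first "-"
	local name=${entry#*-}        # part after the first "-"
	name=${name#\"}               # drop a leading double quote, if any
	name=${name%\"}               # drop a trailing double quote, if any
	# Dry run: print the command; remove "echo" to actually download.
	echo wget -O "$name" "$BASE_URL?mode=attach&id=$index"
}

CMD=$(fetch_attachment '108-"grubfd01.zip"')
echo "$CMD"
```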

#9 Post by RSH »

And this is the Application :D

SFR
Posts: 1800
Joined: Wed 26 Oct 2011, 21:52

#10 Post by SFR »

Maybe something like this:

Code: Select all

#!/bin/bash

INDEXSTR="108-"grubfd01.zip"" 
#INDEXSTR="1168-"FreeSans.zip"" 
#INDEXSTR="58992-"LazY-FReD-1.0.2.sfs.gz"" 
#INDEXSTR="58993-"LazY-FReD-1.0.2.pet"" 
#INDEXSTR="59170-"LazY-Fred-English-Locals.tar.gz"" 

COUNT=`echo $INDEXSTR | cut -d '-' -f1`
FILENAME=`echo $INDEXSTR | cut -d '-' -f2-`

echo $INDEXSTR
echo $COUNT
echo $FILENAME
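
A side note on the test strings: in an assignment like INDEXSTR="108-"grubfd01.zip"" the shell consumes the inner double quotes, so the variable ends up without them. If the list entries really carry literal quotes, a sketch along these lines might be closer to what is needed (single-quoted test string, and tr to delete the quotes, both added here for illustration):

```shell
#!/bin/bash
# Single quotes keep the inner double quotes as part of the string.
INDEXSTR='59170-"LazY-Fred-English-Locals.tar.gz"'

COUNT=$(echo "$INDEXSTR" | cut -d '-' -f1)
# -f2- keeps everything after the first "-"; tr -d removes the double quotes
FILENAME=$(echo "$INDEXSTR" | cut -d '-' -f2- | tr -d '"')

echo "$COUNT"
echo "$FILENAME"
```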
RSH wrote:I wonder, if it could be done to make this shown code a function in a single script that would be called from another script and would return $s1 and $s3 ---> or similar $COUNT and $FILENAME ?
As far as I know there's no simple way to communicate between scripts.
You can use a temporary file (script1 writes to it, script2 reads from it) or you can try this:
http://www.murga-linux.com/puppy/viewtopic.php?t=75778
I have never tried it, however, so I don't know exactly how to use it.

Greetings!
[color=red][size=75][O]bdurate [R]ules [D]estroy [E]nthusiastic [R]ebels => [C]reative [H]umans [A]lways [O]pen [S]ource[/size][/color]
[b][color=green]Omnia mea mecum porto.[/color][/b]

rcrsn51
Posts: 13096
Joined: Tue 05 Sep 2006, 13:50
Location: Stratford, Ontario

#11 Post by rcrsn51 »

RSH wrote:I wonder, if it could be done to make this shown code a function in a single script that would be called from another script and would return $s1 and $s3 ---> or similar $COUNT and $FILENAME ?
RSH
If script1 sends its data to stdout using echo statements, then script2 can retrieve it with code like

Code: Select all

OUT=$(script1)
COUNT=$(echo $OUT | cut -d " " -f 1)
FILENAME=$(echo $OUT | cut -d " " -f 2)
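
As a sketch of that idea, the split could live in a function (or in a separate script) that prints both values on one line, and the caller reads them back from stdout. split_entry is a hypothetical name, and the quote stripping is an assumption about the list format shown earlier in the thread.

```shell
#!/bin/bash
# Hypothetical helper: prints "INDEX FILENAME" for one list entry.
split_entry() {
	local entry=$1
	local name=${entry#*-}        # text after the first "-"
	name=${name#\"}               # strip a leading double quote
	name=${name%\"}               # strip a trailing double quote
	echo "${entry%%-*} $name"
}

# The caller captures stdout and splits it at the first space.
OUT=$(split_entry '1168-"FreeSans.zip"')
COUNT=${OUT%% *}
FILENAME=${OUT#* }

echo "$COUNT"
echo "$FILENAME"
```

If split_entry were put into its own executable script, the OUT=$(...) line would call that script instead and the rest should stay the same.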

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#12 Post by amigo »

All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:

Code: Select all

INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
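
Applied to the string from the first post, which also carries literal double quotes, this might look like the following sketch. The two extra expansions that strip the quotes are an addition for illustration and not part of the two lines above.

```shell
#!/bin/bash
OUTSTR='59170-"LazY-Fred-English-Locals.tar.gz"'

INDEX=${OUTSTR%%-*}         # remove the longest suffix starting with "-"
FILENAME=${OUTSTR#*-}       # remove the shortest prefix ending with "-"
FILENAME=${FILENAME#\"}     # drop the leading double quote
FILENAME=${FILENAME%\"}     # drop the trailing double quote

echo "$INDEX"
echo "$FILENAME"
```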


#13 Post by RSH »

amigo wrote:All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:

Code: Select all

INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
Thanks amigo.

Looks like this could be what i'm looking for. Will test it later.

RSH


#14 Post by rcrsn51 »

[Edit] My mistake.

stu91
Posts: 145
Joined: Mon 06 Aug 2012, 15:11
Location: England. Dpup. Dell Inspiron 1501

#15 Post by stu91 »

amigo wrote:All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:

Code: Select all

INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
Hi Amigo,
Could you give a breakdown of what the different characters represent in your code, or any links etc. that might expand on such code further?

Thanks in advance.

amigo
Posts: 2629
Joined: Mon 02 Apr 2007, 06:52

#16 Post by amigo »

Offhand I don't even remember what this bash-centric feature is called! :oops: Maybe it's 'variable substitution'.

But this one means "everything(*) to the left of the first '-'":
INDEX=${OUTSTR%%-*}

And this one means "everything to the right of the first '-'":
FILENAME=${OUTSTR#*-}

You can play with it in a terminal to understand it better:

Code: Select all

TEST=59170-LazY-Fred-English-Locals.tar.gz
echo ${TEST%%-*}
echo ${TEST#*-}
Then vary that like this:

Code: Select all

echo ${TEST%-*}
echo ${TEST##*-}
Using one or two '#' characters removes the matching pattern from the left end of the string. Using one or two '%' characters removes it from the right end. A doubled character removes the longest match, a single character removes the shortest match.
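
For reference, here are the four forms side by side with the same TEST string; the variable names A to D are just for illustration.

```shell
#!/bin/bash
TEST=59170-LazY-Fred-English-Locals.tar.gz

A=${TEST%%-*}    # longest match of "-*" removed from the right
B=${TEST#*-}     # shortest match of "*-" removed from the left
C=${TEST%-*}     # shortest match of "-*" removed from the right
D=${TEST##*-}    # longest match of "*-" removed from the left

echo "$A"   # 59170
echo "$B"   # LazY-Fred-English-Locals.tar.gz
echo "$C"   # 59170-LazY-Fred-English
echo "$D"   # Locals.tar.gz
```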

Anyway, a little trick like that can make your code run hundreds of times faster than some multi-command pipeline. For a single instance you'd not notice the difference, but if that line is being used inside a loop which runs many times, then the difference can be dramatic.

Karl Godt
Posts: 4199
Joined: Sun 20 Jun 2010, 13:52
Location: Kiel,Germany

#17 Post by Karl Godt »

There are many possibilities using cut, awk, sed or grep. Since cut in combination with echo has already been shown, here are the others:

Code: Select all

STRING="1234-\"attached.file.pet\""
awk :

Code: Select all

echo "$STRING" |awk -F '"' '{print "\""$2"\""}'
sed :

Code: Select all

echo "$STRING" |sed 's:^[^-]*-::'
grep :

Code: Select all

echo "$STRING" |grep -o '".*"'
*

On a large database file :

Code: Select all

awk -F '"' '{print "\""$2"\""}' /database_file.db >/newfile.filenames-only.awk.list

Code: Select all

sed 's:^[^-]*-::' /database_file.db >/newfile.filenames-only.sed.list

Code: Select all

grep -o '".*"' /database_file.db >/newfile.filenames-only.grep.list
might be faster than a loop .
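
To try this on a whole list file, a sketch with a throwaway test file could look like this. The file is created with mktemp and filled with sample names taken from this thread. The sed pattern stops at the first -, which matters for file names that contain dashes themselves, and a second expression removes the double quotes.

```shell
#!/bin/bash
# Build a small test list; the entries are sample names from this thread.
LIST=$(mktemp)
cat > "$LIST" <<'EOF'
108-"grubfd01.zip"
58992-"LazY-FReD-1.0.2.sfs.gz"
59170-"LazY-Fred-English-Locals.tar.gz"
EOF

# One sed call for the whole file: drop the index and "-", then the quotes.
NAMES=$(sed -e 's/^[^-]*-//' -e 's/"//g' "$LIST")
echo "$NAMES"

# The last index in the list, as asked for in the first post.
LAST=$(tail -n 1 "$LIST" | cut -d '-' -f1)
echo "$LAST"

rm -f "$LIST"
```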

http://www.grymoire.com/Unix/Sed.html is the reference I download whenever I have sed questions.
http://www.grymoire.com/Unix/Awk.html is, in my impression, not as good as the sed tutorial.
I still have to check out
http://www.grymoire.com/Unix/Grep.html .

Googling for "$COMMAND tutorial" brings up a lot of material.

*

Other operators for shell variable substitution are '/' and '//' and, since bash 4, '^', '^^', ',' and ',,':

Code: Select all

STRING="ABCDCEFG-123.456.tar"
echo "${STRING//C/cccCccc}"
echo "${STRING/C/cccCccc}"

Code: Select all

echo "${STRING,,C}"
Learn here :
http://www.gnu.org/software/bash/manual ... -Expansion
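
A small sketch of these operators with the same STRING, using a shorter replacement text so the results are easier to read. It needs bash 4 or later for the ,, and ^^ forms; the variable names are only for illustration.

```shell
#!/bin/bash
# Needs bash 4 or later for the ^^ and ,, forms.
STRING="ABCDCEFG-123.456.tar"

FIRST=${STRING/C/c}       # replace only the first C
ALL=${STRING//C/c}        # replace every C
LOWER=${STRING,,}         # whole string to lower case
EXT=${STRING##*.}         # text after the last "."
UPPER_EXT=${EXT^^}        # that text in upper case

echo "$FIRST"      # ABcDCEFG-123.456.tar
echo "$ALL"        # ABcDcEFG-123.456.tar
echo "$LOWER"      # abcdcefg-123.456.tar
echo "$UPPER_EXT"  # TAR
```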

Bruce B

#18 Post by Bruce B »

You can mostly automate this by downloading the page as html only. Then run the script against the page(s). And have that script download the attachments with both numbers and filenames.

Is this even what you are thinking about?

~

seaside
Posts: 934
Joined: Thu 12 Apr 2007, 00:19

#19 Post by seaside »

amigo wrote:Offhand I don't even remember what this bash-centric feature is called! :oops: Maybe it's 'variable substitution'.

................................

Anyway, a little trick like that can make your code run hundreds of times faster than some multi-command pipeline. For a single instance you'd not notice the difference, but if that line is being used inside a loop which runs many times, then the difference can be dramatic.
amigo,

Yes, I spent time avoiding using bash string manipulations because when I read the explanations, I thought I understood how they worked, and then later when I went to use them, I'd draw a blank and have to extensively play with the expressions in a terminal to get it right, until this change in thought process......

Think of what you don't want in the string (what you want to cut out)

# = cut from the left side (beginning) of the string up to the first instance; ## = up to the last instance encountered
% = cut from the right side (end) of the string back to the first instance encountered; %% = back to the last instance encountered

str="don't do what I do, do what I say"
eliminate 'don't do' from the left (return "what I do, do what I say")
${str#*do }
starting from the left (#) eliminate *(all chars) up to the first "do " encountered -notice the space after do, if it's not there, it will only eliminate the first "do" in "don't" , leaving "n't do what I do....."

eliminate "do what I say" from the right
${str%do*}
starting from the right (%) eliminate all chars (*) up to the first encountered "do"

In addition to speed and efficiency, another advantage over "cut" is that you can use more than one char as a delimiter.
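
The two examples above, plus one with a delimiter of more than one character, can be tried like this. The pair string is made up for illustration, and the square brackets in the second echo are only there to make the trailing space visible.

```shell
#!/bin/bash
str="don't do what I do, do what I say"

LEFT=${str#*do }     # cut from the left up to the first "do "
RIGHT=${str%do*}     # cut from the right back to the last "do"

echo "$LEFT"         # what I do, do what I say
echo "[$RIGHT]"      # [don't do what I do, ]

# A delimiter of more than one character, which cut cannot handle:
pair="key::value::more"
KEY=${pair%%::*}
REST=${pair#*::}
echo "$KEY"          # key
echo "$REST"         # value::more
```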

Regards,
s

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO

#20 Post by technosaurus »

stu91 wrote:
amigo wrote:All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:

Code: Select all

INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
Hi Amigo,
Could you give a breakdown of what the different characters represent in your code, or any links etc. that might expand on such code further?

Thanks in advance.
See "substring manipulation" in the Advanced Bash-Scripting Guide.
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].
