Split Index and File Name from string?
Hi.
I have list files containing strings like: 59170-"LazY-Fred-English-Locals.tar.gz" which is the last entry in one list file.
This command COUNT=$((`tail -1 $INDEXFILE | cut -d "-" -f1`)) gives me 59170 as a result in $COUNT, coming from the last entry of the list file.
Let's say I have a single string (no file) 59170-"LazY-Fred-English-Locals.tar.gz" in $OUTSTR
What do I have to change in COUNT=$((`tail -1 $INDEXFILE | cut -d "-" -f1`)) to get the index and file name (without the double quotes) as a result in $INDEX and $FILENAME?
INDEX= ? ? ?
FILENAME= ? ? ?
Thanks
RSH
[b][url=http://lazy-puppy.weebly.com]LazY Puppy[/url][/b]
[b][url=http://rshs-dna.weebly.com]RSH's DNA[/url][/b]
[url=http://murga-linux.com/puppy/viewtopic.php?t=91422][b]SARA B.[/b][/url]
It could be easier if all your 'indexes' were 5 digits like in the example above. Are they? If so, then something like this:
Code: Select all
STRING="59170-LazY-Fred-English-Locals.tar.gz"
FILENAME=`echo $STRING | sed 's/^.....//'`
COUNT=`echo $STRING | cut -d - -f 1`
#test content of variables
echo \$STRING = $STRING
echo \$FILENAME = $FILENAME
echo \$COUNT = $COUNT
testing outputs
$STRING = 59170-LazY-Fred-English-Locals.tar.gz
$FILENAME = -LazY-Fred-English-Locals.tar.gz
$COUNT = 59170
Last edited by Bruce B on Sat 08 Sep 2012, 06:09, edited 1 time in total.
Bruce B wrote: It could be easier if all your 'indexes' were 5 digits like in the example above. Are they?
~
Unfortunately not.
Is there any bash function that gives me the position of the - ? I could do the rest using string functions that I know and have already used. I just need to find the - ! I could do it in a loop, but I thought there would be an easier way to do that.
Bruce B wrote: I think a loop would be helpful
Please post about 10 or more actual filenames
~
Hi Bruce B.
Thanks for your code example. Please take the files below. It would be nice to have a smarter solution.
This is how I solved it:
Code: Select all
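# Note: the inner quotes close and reopen the string, so bash stores 108-grubfd01.zip without quotes.
INDEXSTR="108-"grubfd01.zip""
#INDEXSTR="1168-"FreeSans.zip""
#INDEXSTR="58992-"LazY-FReD-1.0.2.sfs.gz""
#INDEXSTR="58993-"LazY-FReD-1.0.2.pet""
#INDEXSTR="59170-"LazY-Fred-English-Locals.tar.gz""
nlen=${#INDEXSTR}
echo $nlen
doloop="true"
start=0
# Walk the string one character at a time until the first "-" is found.
while $doloop; do
    s0=${INDEXSTR:$start:1}
    echo $s0
    if [ "$s0" = "-" ]; then
        doloop="false"
    fi
    ((start++))
    echo $start
done
# s1 = index (everything before the "-")
s1=${INDEXSTR:0:$start-1}
# s2 = the "-" plus the file name
s2=${INDEXSTR:$start-1:$nlen-$start+1}
echo $s2
nlen=${#s2}
echo $nlen
# s3 = file name without the leading "-"
s3=${s2:1:$nlen-1}
echo $s1
echo $s3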
RSH
I am working on a program that can download the attached files from the murga forum, selectable from a list in a GUI. Therefore I read out the database using attach&id, which is in the download link of each attached file.
After building the index file I get the listed names as a combination of the index and the file name, formatted as shown: 108-"grubfd01.zip"
To download the file I need its index number. But the file is downloaded as: viewtopic.php?mode=attach&id=108 which gives me absolutely no information on what file type it is. Therefore I need the file name. After downloading as viewtopic.php?mode=attach&id=108 I move the file and give it a new name ---> the file name.
This is the Project of it all.
And this is the Application
RSH wrote: Would be nice, to have a smarter solution.
Maybe something like this:
Code: Select all
#!/bin/bash
INDEXSTR="108-"grubfd01.zip""
#INDEXSTR="1168-"FreeSans.zip""
#INDEXSTR="58992-"LazY-FReD-1.0.2.sfs.gz""
#INDEXSTR="58993-"LazY-FReD-1.0.2.pet""
#INDEXSTR="59170-"LazY-Fred-English-Locals.tar.gz""
COUNT=`echo $INDEXSTR | cut -d '-' -f1`
FILENAME=`echo $INDEXSTR | cut -d '-' -f2-`
echo $INDEXSTR
echo $COUNT
echo $FILENAME
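For the case in the original question, where the entry is the last line of a list file, the same cut calls can be combined with tail. This is only a minimal sketch: the temporary file and its two sample entries stand in for a real $INDEXFILE, and the tr -d '"' step for dropping the double quotes is an addition that does not appear in the posts above.

```shell
#!/bin/bash
# Hypothetical stand-in for a real index file: one INDEX-"NAME" entry per line.
INDEXFILE=$(mktemp)
printf '%s\n' '108-"grubfd01.zip"' '59170-"LazY-Fred-English-Locals.tar.gz"' > "$INDEXFILE"

# Take the last entry, as in the original COUNT=... command.
LAST=$(tail -n 1 "$INDEXFILE")

# Field 1 is the index; -f2- keeps everything after the first "-",
# so hyphens inside the file name survive.
INDEX=$(echo "$LAST" | cut -d '-' -f1)
# tr -d '"' removes the surrounding double quotes from the name.
FILENAME=$(echo "$LAST" | cut -d '-' -f2- | tr -d '"')

echo "$INDEX"      # 59170
echo "$FILENAME"   # LazY-Fred-English-Locals.tar.gz
rm -f "$INDEXFILE"
```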
RSH wrote: I wonder, if it could be done to make this shown code a function in a single script that would be called from another script and would return $s1 and $s3 ---> or similar $COUNT and $FILENAME ?
As far as I know there's no simple way to communicate between scripts.
You can use a temporary file (script1 writes to it -> script2 reads from it) or you can try this:
http://www.murga-linux.com/puppy/viewtopic.php?t=75778
I never tried it, however, so I don't know exactly how to use it.
Greetings!
[color=red][size=75][O]bdurate [R]ules [D]estroy [E]nthusiastic [R]ebels => [C]reative [H]umans [A]lways [O]pen [S]ource[/size][/color]
[b][color=green]Omnia mea mecum porto.[/color][/b]
RSH wrote: I wonder, if it could be done to make this shown code a function in a single script that would be called from another script and would return $s1 and $s3 ---> or similar $COUNT and $FILENAME ?
RSH
If script1 sends its data to stdout using echo statements, then script2 can retrieve it with code like
Code: Select all
OUT=$(script1)
COUNT=$(echo $OUT | cut -d " " -f 1)
FILENAME=$(echo $OUT | cut -d " " -f 2)
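The same stdout idea also works with a shell function, which may be closer to what RSH asked for. The sketch below is not from the thread: the function name split_entry is hypothetical, and in practice it could sit in its own file that the calling script loads with the . (source) command. Since a shell function cannot return strings, it prints the two values and the caller reads them back.

```shell
#!/bin/bash
# split_entry is a hypothetical helper; it could live in a separate file
# that the calling script loads with the "." (source) command.
split_entry() {
    local entry=$1
    local index name
    index=$(echo "$entry" | cut -d '-' -f1)
    name=$(echo "$entry" | cut -d '-' -f2- | tr -d '"')
    # Print index and name on separate lines so names with spaces survive.
    printf '%s\n%s\n' "$index" "$name"
}

# Caller: read the two output lines into two variables.
{ read -r COUNT; read -r FILENAME; } < <(split_entry '59170-"LazY-Fred-English-Locals.tar.gz"')

echo "$COUNT"      # 59170
echo "$FILENAME"   # LazY-Fred-English-Locals.tar.gz
```

Printing one value per line avoids the problem that cut -d " " would have with file names containing spaces.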
All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:
Code: Select all
INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
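The question also asked for the file name without the double quotes. A minimal sketch, assuming $OUTSTR holds the entry with literal quotes in it; the two extra expansions that strip the quotes are an addition to the two lines above, not part of the original answer.

```shell
#!/bin/bash
OUTSTR='59170-"LazY-Fred-English-Locals.tar.gz"'

INDEX=${OUTSTR%%-*}      # remove the longest suffix matching -*
FILENAME=${OUTSTR#*-}    # remove the shortest prefix matching *-
FILENAME=${FILENAME#\"}  # strip a leading double quote, if present
FILENAME=${FILENAME%\"}  # strip a trailing double quote, if present

echo "$INDEX"      # 59170
echo "$FILENAME"   # LazY-Fred-English-Locals.tar.gz
```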
amigo wrote: All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:
Code: Select all
INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
Thanks amigo.
Looks like this could be what I'm looking for. I will test it later.
RSH
amigo wrote: All these loops and extraneous use of echo, sed and cut... Bash makes this very simple:
Code: Select all
INDEX=${OUTSTR%%-*}
FILENAME=${OUTSTR#*-}
Hi Amigo,
Could you give a breakdown of what the different characters represent in your code, or any links etc. that might expand on such code further?
Thanks in advance.
Offhand I don't even remember what this bash-centric feature is called! Maybe it's 'variable substitution'.
But this one means "everything(*) to the left of the first '-'":
INDEX=${OUTSTR%%-*}
And this one means "everything to the right of the first '-'":
FILENAME=${OUTSTR#*-}
You can play with it in a terminal to understand it better:
Code: Select all
TEST=59170-LazY-Fred-English-Locals.tar.gz
echo ${TEST%%-*}
echo ${TEST#*-}
Then vary that like this:
Code: Select all
echo ${TEST%-*}
echo ${TEST##*-}
Using one or two '#' characters removes the matching text from the left end of the string. Using one or two '%' chars removes it from the right end. Using double characters means to use the longest match, using a single char uses the shortest match.
Anyway, a little trick like that can make your code run hundreds of times faster than some multi-command pipeline. For a single instance you'd not notice the difference, but if that line is being used inside a loop which runs many times, then the difference can be dramatic.
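For reference, here is what the four variations give for the same TEST string. This is only a sketch for comparison; the variable names A to D are placeholders introduced here.

```shell
#!/bin/bash
TEST=59170-LazY-Fred-English-Locals.tar.gz

A=${TEST%%-*}   # longest match of -* removed from the end
B=${TEST#*-}    # shortest match of *- removed from the start
C=${TEST%-*}    # shortest match of -* removed from the end
D=${TEST##*-}   # longest match of *- removed from the start

echo "$A"   # 59170
echo "$B"   # LazY-Fred-English-Locals.tar.gz
echo "$C"   # 59170-LazY-Fred-English
echo "$D"   # Locals.tar.gz
```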
There are many possibilities using cut, awk, sed and grep. Since cut has already been shown in combination with echo, here are the others:
Code: Select all
STRING="1234-\"attached.file.pet\""
awk :
Code: Select all
echo "$STRING" | awk -F '"' '{print "\""$2"\""}'
sed :
Code: Select all
# [^-]* stops at the first "-", so hyphens inside the file name are kept
echo "$STRING" | sed 's:^[^-]*-::'
grep :
Code: Select all
echo "$STRING" | grep -o '".*"'
*
On a large database file :
Code: Select all
awk -F '"' '{print "\""$2"\""}' /database_file.db >/newfile.filenames-only.awk.list
Code: Select all
sed 's:^[^-]*-::' /database_file.db >/newfile.filenames-only.sed.list
Code: Select all
grep -o '".*"' /database_file.db >/newfile.filenames-only.grep.list
might be faster than a loop.
http://www.grymoire.com/Unix/Sed.html is the reference I open whenever I have sed questions.
http://www.grymoire.com/Unix/Awk.html is, in my impression, not as good as the sed tutorial.
I still have to check out
http://www.grymoire.com/Unix/Grep.html .
Googling for " $COMMAND tutorial " brings up a lot of stuff.
*
Other chars for shell variable substitution are '/' & '//' and, since bash-4, '^' ',' & '^^' ',,' :
Code: Select all
STRING="ABCDCEFG-123.456.tar"
echo "${STRING//C/cccCccc}"
echo "${STRING/C/cccCccc}"
Code: Select all
echo "${STRING,C}"
Learn here :
http://www.gnu.org/software/bash/manual ... -Expansion
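To make the effect of these operators easier to see, here is a small sketch with the results noted as comments. It assumes bash 4 or newer for the case operators; the variable names and the sample value "fred" are placeholders introduced here. When a pattern is given, as in ${STRING,C}, only characters matching that pattern are changed, so that expression leaves this STRING as it is because the string starts with A.

```shell
#!/bin/bash
STRING="ABCDCEFG-123.456.tar"

FIRST=${STRING/C/c}    # replace the first C only
ALL=${STRING//C/c}     # replace every C
LOWER=${STRING,,}      # bash 4+: lowercase the whole string
LOWER1=${STRING,}      # bash 4+: lowercase the first character only

NAME="fred"            # hypothetical sample value
UPPER=${NAME^^}        # bash 4+: uppercase the whole string
UPPER1=${NAME^}        # bash 4+: uppercase the first character only

echo "$FIRST"    # ABcDCEFG-123.456.tar
echo "$ALL"      # ABcDcEFG-123.456.tar
echo "$LOWER"    # abcdcefg-123.456.tar
echo "$LOWER1"   # aBCDCEFG-123.456.tar
echo "$UPPER"    # FRED
echo "$UPPER1"   # Fred
```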
amigo wrote: Offhand I don't even remember what this bash-centric feature is called! Maybe it's 'variable substitution'.
................................
Anyway, a little trick like that can make your code run hundreds of times faster than some multi-command pipeline. For a single instance you'd not notice the difference, but if that line is being used inside a loop which runs many times, then the difference can be dramatic.
amigo,
Yes, I spent time avoiding bash string manipulations because when I read the explanations, I thought I understood how they worked, and then later when I went to use them, I'd draw a blank and have to play extensively with the expressions in a terminal to get it right, until this change in thought process......
Think of what you don't want in the string (what you want to cut out)
# = cut from the left side (beginning) of the string up to the first instance encountered; ## = up to the last instance encountered
% = cut from the right side (end) of the string back to the first instance encountered; %% = back to the last instance encountered
str="don't do what I do, do what I say"
eliminate "don't do " from the left (return "what I do, do what I say")
${str#*do }
starting from the left (#) eliminate *(all chars) up to and including the first "do " encountered. Notice the space after do: if it's not there, it will only eliminate the "do" in "don't", leaving "n't do what I do....."
eliminate "do what I say" from the right
${str%do*}
starting from the right (%) eliminate all chars (*) back to and including the first encountered "do"
In addition to speed and efficiency, another advantage over "cut" is that you can use more than one char as a delimiter.
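The expressions above can be checked in a terminal with a sketch like the following. The brackets in the echo lines only make leading and trailing spaces visible; the R1 to R4 names and the "name::value" sample for the multi-character delimiter are placeholders introduced here.

```shell
#!/bin/bash
str="don't do what I do, do what I say"

R1=${str#*do }    # cut from the left through the first "do "
R2=${str#*do}     # without the space, the "do" in "don't" matches first
R3=${str%do*}     # cut from the right back to the nearest "do"
R4=${str%%do*}    # cut from the right back to the farthest "do"

echo "[$R1]"   # [what I do, do what I say]
echo "[$R2]"   # [n't do what I do, do what I say]
echo "[$R3]"   # [don't do what I do, ]
echo "[$R4]"   # []

# A two-character delimiter, which cut -d does not accept:
pair="name::value"      # hypothetical sample string
KEY=${pair%%::*}
VALUE=${pair#*::}
echo "$KEY"     # name
echo "$VALUE"   # value
```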
Regards,
s
- technosaurus
stu91 wrote: Could you give a breakdown on what the different characters represent in you code - or any links etc that might expand on such code further.
See "substring manipulation" in the Advanced Bash-Scripting Guide.
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].