Bash: Count numbers in list
My submission:
With TIMES=3650 and SONGS=2000 as above (meaning, 3650 timestamps per song, with 2000 songs), result: .
Using Ziggy's original parameter of 25 timestamps per song and 2000 songs (TIMES=25, SONGS=2000), result
Using 25 timestamps for 20,000 songs (TIMES=25, SONGS=20000), result:
Using 3650 timestamps for 20,000 songs (TIMES=3650, SONGS=20000), result:
Of course these are on my machine and is not directly comparable with anyone else's number. Ziggy need to run this on his own machine to test
cheers!
Code: Select all
#!/bin/ash
LANG=C
FIND=3421 # timestamp to find (random)
TIMES=3650 # 3650 timestamps (tracks played everyday for 10 years) - number from 1 to $TIMES
SONGS=2000 # 2000 songs
# generate fake timestamp data
tstamps=$(seq -s, 1 $TIMES)
line="/path/to/song.mp3 | $tstamps"
#echo $line
# generate fake playlist (repeat timestamp data $SONGS times)
for a in $(seq 1 $SONGS); do
echo $line
done |
# do actual work
awk -F, -v TS=$FIND '{
split($1, a, "|"); $1=a[2] # fix first entry
max=NF; min=1;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS >= $i) min = i
else max = i
}
print a[1], min
}'
Code: Select all
# time ./fav.sh > /dev/null
real 0m1.641s
user 0m2.916s
sys 0m0.073s
Using Ziggy's original parameter of 25 timestamps per song and 2000 songs (TIMES=25, SONGS=2000), result
Code: Select all
# time ./fav.sh > /dev/null
real 0m0.068s
user 0m0.080s
sys 0m0.010s
Code: Select all
# time ./fav.sh > /dev/null
real 0m0.479s
user 0m0.730s
sys 0m0.057s
Code: Select all
# time ./fav.sh > /dev/null
real 0m16.855s
user 0m30.308s
sys 0m0.520s
cheers!
Fatdog64 forum links: [url=http://murga-linux.com/puppy/viewtopic.php?t=117546]Latest version[/url] | [url=https://cutt.ly/ke8sn5H]Contributed packages[/url] | [url=https://cutt.ly/se8scrb]ISO builder[/url]
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Count numbers in list
changezigbert wrote:....and that means????technosaurus wrote:Oops need to prepend the -lt test with a [ $1 ] && ... For the case where it has not been played in the time period
Code: Select all
while ([ $1 -lt $timestamp ]) do
Code: Select all
while ( [ $1 ] && [ $1 -lt $timestamp ]) do
# ./file_times_played_since 1345648694
The number of times played /root/file1.mp3 since 1345648694 is 0
The number of times played /root/file2.mp3 since 1345648694 is 1
#
- technosaurus
- Posts: 4853
- Joined: Mon 19 May 2008, 01:24
- Location: Blue Springs, MO
- Contact:
Re: Count numbers in list
@jamesbond - over 100 lines grep becomes faster than using builtins, and it doesn't slow it down significantly for less than 100 lines, so its probably worth calling grep in this case since we can't assume a user has/plays only a few songs.
thanks, I was/am posting from my phone.L18L wrote:changezigbert wrote:....and that means????technosaurus wrote:Oops need to prepend the -lt test with a [ $1 ] && ... For the case where it has not been played in the time periodtoCode: Select all
while ([ $1 -lt $timestamp ]) do
tested successfully (no more: ash: 1345623045: unknown operand ):Code: Select all
while ( [ $1 ] && [ $1 -lt $timestamp ]) do
# ./file_times_played_since 1345648694
The number of times played /root/file1.mp3 since 1345648694 is 0
The number of times played /root/file2.mp3 since 1345648694 is 1
#
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].
-
- Posts: 229
- Joined: Wed 25 Aug 2010, 15:38
- Location: ISM Dhanbad, Jharkhand, India
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Count numbers in list
I have been testing jamesbond´s submission, too.
change
print a[1], min
to
print a[1], NF-min
to get TIMES from FIND to NOW as required.
Speed: wow, it is TIME to learn awk now.
change
print a[1], min
to
print a[1], NF-min
to get TIMES from FIND to NOW as required.
Speed: wow, it is TIME to learn awk now.
- technosaurus
- Posts: 4853
- Joined: Mon 19 May 2008, 01:24
- Location: Blue Springs, MO
- Contact:
@L18L
the awk would be even faster if it only computed stuff for the match:not sure how well awk works on non-ascii filenames though
the awk would be even faster if it only computed stuff for the match:
Code: Select all
FILEPATH=${FILEPATH//\//\\\/} #escape the slashes for awk ... if you care about paths
#FILEPATH=${FILEPATH##*/} #remove the path for awk ... if you only care about names
awk -F, -v TS=$FIND '/'$FILEPATH'/{...
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Count numbers in list
Yes, and slower if we care about accuracy.technosaurus wrote:the awk would be even faster if it only computed stuff for the match
The binary search algorithm is not accurate if case of some identic values.
But on a single user system there are no doubles
Test results so far with the following file
########################################
########################################
########################################
########################################
All worked with non-english filename /root/file2000øæå.mp3
Thank you all
Sigmund
Code: Select all
/root/file1.mp3| 1345641102,1345644563,1345647584,1345647585,1345647586,1345647587,1345647588,1345647589,1345647590,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
/root/file2.mp3| 1345623045,1345643786,1345648695,1345648696,1345648697,1345648698,1345648699,1345648700,1345648701,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
[...]
/root/file1999.mp3| 1345641102,1345644563,1345647584,1345647585,1345647586,1345647587,1345647588,1345647589,1345647590,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
/root/file2000øæå.mp3| 1345623045,1345643786,1345648695,1345648696,1345648697,1345648698,1345648699,1345648700,1345648701,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
Code: Select all
#!/bin/bash
#usage: played_since_foreach time
time="$1";
ifs_bak="$IFS"
IFS=""
while read -d "|" record; do
IFS=","
#Get all timestamps in array
read -a times
#Binary search
i=$(( ${#times[*]}/2 ))
d=$(( $i/2 ))
test "$d" -eq "0" && d=1 #precaution
ulim=$(( ${#times[*]}-1 ))
count=-1
while test "$i" -ge "0" -a "$i" -lt "$ulim"; do #check whether index is in range
if test "$time" -lt "${times[$i]}"; then
i=$(( $i-$d ))
elif test "$time" -gt "${times[$(( $i+1 ))]}"; then
i=$(( $i+$d ))
else
#found, tell no. of records greater than this
count="$(( $ulim-$i ))"
break
fi
d=$(( ($d/2) ))
test "$d" -eq "0" && d=1 #precaution
done
#In case of 'out of range'
if test "$count" -lt "0"; then
if test "$ulim" -lt "0"; then
count=0
else
if test "$time" -lt "$times"; then
count="${#times[*]}"
elif test "$time" -gt "${times[$ulim]}"; then
count=0
else
count="bug" #This should not be reached
fi
fi
fi
#Show result for this record
echo "$record: $count"
done < database.txt
IFS="$ifs_bak"
Code: Select all
real 0m1.129s
user 0m1.080s
sys 0m0.047s
Code: Select all
#!/bin/ash
timestamp=$1
ifs_bak=$IFS
IFS=","
while read LOGENTRY; do
filename=${LOGENTRY%%|*}
#LOGENTRY=`grep $filename database.txt`
#grep is faster than while read case on large files
STAMPS=${LOGENTRY#*|}
#you could use bash arrays instead - set allows "arrays" in sh, ash, etc...
#set ${STAMPS//,/ }
set ${STAMPS}
#for speed change IFS to "," vs. the //,/ } but don't forget to reset it
while ( [ $1 ] && [ $1 -lt $timestamp ]) do
shift #a hacky way to remove all entries < timestamp
done
# $# is the number of args passed or in this case set with "set ..."
echo The number of times played $filename since $timestamp is $#
done < database.txt
IFS=$ifs_bak
Code: Select all
real 0m1.434s
user 0m0.190s
sys 0m0.633s
Code: Select all
#!/bin/ash
LANG=C
FIND=$1
cat /root/database.txt |
awk -F, -v TS=$FIND '{
split($1, a, "|"); $1=a[2] # fix first entry
max=NF; min=1;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS >= $i) min = i
else max = i
}
print a[1], NF-min
}'
Code: Select all
real 0m0.026s
user 0m0.020s
sys 0m0.003s
All worked with non-english filename /root/file2000øæå.mp3
Thank you all
Sigmund
Ok, next stage
I don't understand a bit of the awk-code, so I need some help to integrate it fully into pMusic code.
It now works ok (only ok) to find the last played songs for a given period:
My NEW index file looks right now like this:
To get it work I made a workaround and used awk separator |: instead of simply |
... I couldn't get the awk code to read column 3 instead of column 2 (as in the example in the main post.)
Some #comments would be great
The function in pMusic:
Also, what we want is to define a alternative max value to find timestamps in a specific period. - Like february 2012. Let's say that is $4.
Anyone???
Sigmund
I don't understand a bit of the awk-code, so I need some help to integrate it fully into pMusic code.
It now works ok (only ok) to find the last played songs for a given period:
My NEW index file looks right now like this:
Code: Select all
/mnt/sdb1/musikk/mp3/Judas priest - Monsters of rock.mp3| Judas priest - Monsters of rock.mp3|:,1345762237
/mnt/sdb1/musikk/mp3/Iron maiden - Seventh son of a seventh son.mp3| Iron maiden - Seventh son of a seventh son.mp3|:,1345762833
/mnt/sdb1/musikk/mp3/Savatage - Dead winter dead.mp3| Savatage - Dead winter dead.mp3|:,1345763094
/mnt/sdb1/musikk/mp3/Twisted sister - Hot love.mp3| Twisted sister - Hot love.mp3|:,1345763569
/mnt/sdb1/musikk/mp3/Twisted sister - I believe in you.mp3| Twisted sister - I believe in you.mp3|:,1345763892
/mnt/sdb1/musikk/mp3/Twisted sister - I'm so hot for you.mp3| Twisted sister - I'm so hot for you.mp3|:,1345764140
/mnt/sdb1/musikk/mp3/Twisted sister - I wanna rock.mp3| Twisted sister - I wanna rock.mp3|:,1345764326
/mnt/sdb1/musikk/mp3/Twisted sister - King of the fools.mp3| Twisted sister - King of the fools.mp3|:,1345764715
/mnt/sdb1/musikk/mp3/Twisted sister - Leader of the pack.mp3| Twisted sister - Leader of the pack.mp3|:,1345764940
/mnt/sdb1/musikk/mp3/Twisted sister - Lookin out for no. 1.mp3| Twisted sister - Lookin out for no. 1.mp3|:,1345765130
/mnt/sdb1/musikk/mp3/Twisted sister - One bad habit.mp3| Twisted sister - One bad habit.mp3|:,1345765538
/mnt/sdb1/musikk/mp3/Twisted sister - Run for your life.mp3| Twisted sister - Run for your life.mp3|:,1345766019
/mnt/sdb1/musikk/mp3/Twisted sister - The beast.mp3| Twisted sister - The beast.mp3|:,1345760757,1345766230
/mnt/sdb1/musikk/mp3/Twisted sister - The price.mp3| Twisted sister - The price.mp3|:,1345766462
/mnt/sdb1/musikk/mp3/Twisted sister - Under the blade.mp3| Twisted sister - Under the blade.mp3|:,1345766745
/mnt/sdb1/musikk/mp3/Twisted sister - Wake up (the sleeping giant).mp3| Twisted sister - Wake up (the sleeping giant).mp3|:,1345767008
/mnt/sdb1/musikk/mp3/Twisted sister - We're not gonna take it.mp3| Twisted sister - We're not gonna take it.mp3|:,1345767230
/mnt/sdb1/musikk/mp3/Twisted sister - You can't stop rock'n'roll.mp3| Twisted sister - You can't stop rock'n'roll.mp3|:,1345767514
/mnt/sdb1/musikk/mp3/Twisted sister - You're not alone (Suzette's song).mp3| Twisted sister - You're not alone (Suzette's song).mp3|:,1345767762
/mnt/sdb1/musikk/mp3/Twisted sister - You want what we got.mp3| Twisted sister - You want what we got.mp3|:,1345767984
/mnt/sdb1/musikk/mp3/Saxon - Wheels of steel.mp3| Saxon - Wheels of steel.mp3|:,1345789853
/mnt/sdb1/musikk/mp3/Judas priest - Parental guidance (LP).mp3| Judas priest - Parental guidance (LP).mp3|:,1345790062
/mnt/sdb1/musikk/mp3/Judas priest - Turbo lover (LP).mp3| Judas priest - Turbo lover (LP).mp3|:,1345790660
/mnt/sdb1/musikk/mp3/Alice Cooper - Poison.mp3| Alice Cooper - Poison.mp3|:,1345790933
/mnt/sdb1/musikk/mp3/Faith no more - Easy.mp3| Faith no more - Easy.mp3|:,1345791129
/mnt/sdb1/musikk/mp3/Bangles - Manic monday.mp3| Bangles - Manic monday.mp3|:,1345791315
/mnt/sdb1/musikk/mp3/Secret garden - Nocturne.mp3| Secret garden - Nocturne.mp3|:,1345791511
/mnt/sdb1/musikk/mp3/Dimmu borgir - Vredesbyrd.mp3| Dimmu borgir - Vredesbyrd.mp3|:,1345760201,1345760237,1345794966
/mnt/sdb1/musikk/mp3/Black sabbath - Heaven and hell.mp3| Black sabbath - Heaven and hell.mp3|:,1345760180,1345761905,1345796352
/mnt/sdb1/musikk/mp3/Twisted sister - Love is for suckers.mp3| Twisted sister - Love is for suckers.mp3|:,1345765338,1345797921
/mnt/sdb1/musikk/mp3/Dimmu borgir - Burn in hell.mp3| Dimmu borgir - Burn in hell.mp3|:,1345760219,1345760545,1345795930,1345798229
/mnt/sdb1/musikk/mp3/Twisted sister - Destroyer.mp3| Twisted sister - Destroyer.mp3|:,1345763352,1346008426
/mnt/sdb1/musikk/flac/Candlemass - At the gallows end.flac| Candlemass - At the gallows end.flac|:,1345761248,1345796972,1346008710
/mnt/sdb1/musikk/mp3/Saint deamon - My heart.mp3| Saint deamon - My heart.mp3|:,1345761483,1345797207,1346008944
/mnt/sdb1/musikk/mp3/Ræva Rockers - Depp (LP).mp3| Ræva Rockers - Depp (LP).mp3|:,1345795310,1345797401,1346009139
... I couldn't get the awk code to read column 3 instead of column 2 (as in the example in the main post.)
Some #comments would be great
The function in pMusic:
Code: Select all
-index_rating_buildlist)
TIMESTAMP="$2"
cat "$3" |
awk -F, -v TS=$TIMESTAMP '{
split($1, a, "|:"); $1=a[2] # fix first entry
max=NF; min=1;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS >= $i) min = i
else max = i
}
print a[1], NF-min
}'
;;
Anyone???
Sigmund
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Awk: Count numbers in list
@zigbert - faster than
read LINE; do ..... done < database.txt
is your
cat /root/database.txt | awk -F, -v TS=$FIND '...'
But even faster (10%) is:
awk -F, -v TS=$FIND ' ...' database.txt
read LINE; do ..... done < database.txt
is your
cat /root/database.txt | awk -F, -v TS=$FIND '...'
But even faster (10%) is:
awk -F, -v TS=$FIND ' ...' database.txt
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Awk: Count numbers in list
Why 3 columns?zigbert wrote:...My NEW index file looks right now like this:To get it work I made a workaround and used awk separator |: instead of simply |Code: Select all
/mnt/sdb1/musikk/mp3/Judas priest - Monsters of rock.mp3| Judas priest - Monsters of rock.mp3|:,1345762237
... I couldn't get the awk code to read column 3 instead of column 2 (as in the example in the main post.)
Some #comments would be great
I have successfully tried to print file and basename from 2 column database:
changing just
print a[1], NF-min
to
num=split (a[1], b , "/")
print a[1], b[num], NF-min
Learning awk:
split(s, a [, r]) Splits the string s into the array a on the
regular expression r, and returns the number of
fields. If r is omitted, FS is used instead.
The array a is cleared first. Splitting behaves
identically to field splitting.
Go back to 2 column database
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Awk: Count numbers in list
Where is $1 ?zigbert wrote:Also, what we want is to define a alternative max value to find timestamps in a specific period. - Like february 2012. Let's say that is $4.
Here is a solution using $1 and $2 timestamp from and -to
#!/bin/ash
#additional output filename only=basename
#TIMESTAMP from to
#
LANG=C
#
FIND_FROM=$1 #timestamp
FIND_TO=$2
awk -F, -v TS1=$FIND_FROM -v TS2=$FIND_TO '{
split($1, a, "|"); $1=a[2] # fix first entry
max=NF; min=0;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS1 >= $i) min = i
else max = i
}
from=NF-min
max=NF; min=0;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS2 >= $i) min = i
else max = i
}
to=NF-min
num=split (a[1], b , "/")
print a[1], b[num], from-to
}' database.txt
Re: Awk: Count numbers in list
DoneL18L wrote:But even faster (10%) is:
awk -F, -v TS=$FIND ' ...' database.txt
Re: Awk: Count numbers in list
Else, I find this excellent!!!!!L18L wrote:Where is $1 ? $1 is for pMusic internal redirectionzigbert wrote:Also, what we want is to define a alternative max value to find timestamps in a specific period. - Like february 2012. Let's say that is $4.
Here is a solution using $1 and $2 timestamp from and -to
#!/bin/ash
#additional output filename only=basename
#TIMESTAMP from to
#
LANG=C
#
FIND_FROM=$1 #timestamp
FIND_TO=$2
awk -F, -v TS1=$FIND_FROM -v TS2=$FIND_TO '{
split($1, a, "|"); $1=a[2] # fix first entry
max=NF; min=0;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS1 >= $i) min = i
else max = i
}
from=NF-min
max=NF; min=0;
while (max-min > 1) {
i=int( (max+min)/2 )
if (TS2 >= $i) min = i
else max = i
}
to=NF-min
variable 'to' is never 0, even if zero hits
num=split (a[1], b , "/")
print a[1], b[num], from-to
}' database.txt
- L18L
- Posts: 3479
- Joined: Sat 19 Jun 2010, 18:56
- Location: www.eussenheim.de/
Re: Awk: Count numbers in list
And our thanks go to: jamesbondzigbert wrote:Else, I find this excellent!!!!!