Puppy Linux Discussion Forum
Bash: Count numbers in list
zigbert


Joined: 29 Mar 2006
Posts: 5562
Location: Valåmoen, Norway

PostPosted: Fri 24 Aug 2012, 10:47    Post subject:  

This sounds interesting.
I am on the run this weekend, so I can't compare execution times until Monday.


Sigmund

_________________
Stardust resources
zigbert


Joined: 29 Mar 2006
Posts: 5562
Location: Valåmoen, Norway

PostPosted: Fri 24 Aug 2012, 10:50    Post subject:  

technosaurus wrote:
Oops need to prepend the -lt test with a [ $1 ] && ... For the case where it has not been played in the time period
....and that means???? Embarassed
_________________
Stardust resources
jamesbond

Joined: 26 Feb 2007
Posts: 1876
Location: The Blue Marble

PostPosted: Fri 24 Aug 2012, 12:27    Post subject:  

My submission:

Code:
#!/bin/ash
LANG=C

FIND=3421  # timestamp to find (random)
TIMES=3650 # 3650 timestamps (tracks played everyday for 10 years) - number from 1 to $TIMES
SONGS=2000 # 2000 songs

# generate fake timestamp data
tstamps=$(seq -s, 1 $TIMES)
line="/path/to/song.mp3 | $tstamps"
#echo $line

# generate fake playlist (repeat timestamp data $SONGS times)
for a in $(seq 1 $SONGS); do
   echo $line
done |

# do actual work
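awk -F, -v TS=$FIND '{
      split($1, a, "|"); $1=a[2] # fix first entry: a[1] = path, $1 = first timestamp
      max=NF; min=1;             # search bounds (field indexes)
      
      # binary search; assumes $min <= TS < $max, so fields 1 and NF
      # themselves are never compared against TS
      while (max-min > 1) {
         i=int( (max+min)/2 )    # midpoint field
         if (TS >= $i) min = i   # TS is at or after field i
         else max = i            # TS is before field i
      }
      print a[1], min            # path and index of the last field <= TS
}'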

With TIMES=3650 and SONGS=2000 as above (meaning, 3650 timestamps per song, with 2000 songs), result:
Code:
# time ./fav.sh  > /dev/null

real   0m1.641s
user   0m2.916s
sys   0m0.073s

Using Ziggy's original parameters of 25 timestamps per song and 2000 songs (TIMES=25, SONGS=2000), result:
Code:
# time ./fav.sh  > /dev/null

real   0m0.068s
user   0m0.080s
sys   0m0.010s


Using 25 timestamps for 20,000 songs (TIMES=25, SONGS=20000), result:
Code:
# time ./fav.sh  > /dev/null

real   0m0.479s
user   0m0.730s
sys   0m0.057s


Using 3650 timestamps for 20,000 songs (TIMES=3650, SONGS=20000), result:
Code:
# time ./fav.sh  > /dev/null

real   0m16.855s
user   0m30.308s
sys   0m0.520s


Of course, these were measured on my machine and are not directly comparable with anyone else's numbers. Ziggy needs to run this on his own machine to test Smile

cheers!

_________________
Fatdog64, Slacko and Puppeee user. Puppy user since 2.13
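
A commented sketch of the same binary search, for readers who want to check the boundary cases. It is not jamesbond's script: the `count_after` wrapper and the sentinel bounds `min=0` and `max=NF+1` are assumptions added here so that the first and last timestamps are also compared, and it prints the number of timestamps newer than TS rather than an index. Timestamps are assumed to be sorted in ascending order.

```shell
#!/bin/sh
# Sketch: count timestamps strictly greater than TS on each line.
# count_after is a hypothetical helper name, not part of the original script.
count_after() {
    # $1 = timestamp to search for; input lines come from stdin
    LANG=C awk -F, -v TS="$1" '{
        split($1, a, "|"); $1 = a[2]   # a[1] = path, $1 = first timestamp
        min = 0                        # index just below the first field
        max = NF + 1                   # index just above the last field
        # invariant: fields 1..min are <= TS, fields max..NF are > TS
        while (max - min > 1) {
            i = int((max + min) / 2)
            if (TS + 0 >= $i + 0) min = i
            else max = i
        }
        print a[1], NF - min           # path and number of newer timestamps
    }'
}

printf '%s\n' '/a.mp3| 10,20,30' | count_after 5    # all three are newer
printf '%s\n' '/a.mp3| 10,20,30' | count_after 20   # only 30 is newer
printf '%s\n' '/a.mp3| 10,20,30' | count_after 30   # nothing is newer
```

The three calls print 3, 1 and 0 after the path. The `+ 0` on both sides only forces a numeric comparison in awk.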
L18L

Joined: 19 Jun 2010
Posts: 2476
Location: Burghaslach, Germany somewhere also known as "Hosla"

PostPosted: Fri 24 Aug 2012, 13:05    Post subject: Count numbers in list
Subject description: technosaurus´ "something like" implemented by L18L
 

zigbert wrote:
technosaurus wrote:
Oops need to prepend the -lt test with a [ $1 ] && ... For the case where it has not been played in the time period
....and that means???? Embarassed


change
Code:
while ([ $1 -lt $timestamp ]) do

to
Code:
while ( [ $1 ] && [ $1 -lt $timestamp ]) do


tested successfully (no more: ash: 1345623045: unknown operand ):
# ./file_times_played_since 1345648694
The number of times played /root/file1.mp3 since 1345648694 is 0
The number of times played /root/file2.mp3 since 1345648694 is 1
#
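
Why the extra test matters can be shown with a small sketch: once every timestamp has been shifted away, an unquoted `$1` expands to nothing and the `-lt` test is malformed. The function name `count_since` is made up for this example, and it uses `[ $# -gt 0 ]` as the guard, which does the same job as `[ $1 ]` as long as no argument is empty. The timestamps are assumed to be sorted in ascending order.

```shell
#!/bin/sh
# Sketch: count how many of the given timestamps are >= a cutoff.
# count_since is a hypothetical name; the arguments after the cutoff
# must be sorted in ascending order, as in the log file.
count_since() {
    cutoff=$1
    shift
    # Test $# first: once every argument has been shifted away, "$1" is
    # empty and the -lt comparison is an error rather than simply false.
    while [ $# -gt 0 ] && [ "$1" -lt "$cutoff" ]; do
        shift    # drop the oldest remaining timestamp
    done
    echo $#      # what is left was played at or after the cutoff
}

count_since 1345648694 1345641102 1345644563 1345647584   # all older
count_since 1345648694 1345623045 1345643786 1345648695   # one newer
```

The first call prints 0 and the second prints 1.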
technosaurus


Joined: 18 May 2008
Posts: 4134

PostPosted: Fri 24 Aug 2012, 13:16    Post subject: Re: Count numbers in list
Subject description: technosaurus´ "something like" implemented by L18L
 

@jamesbond - above about 100 lines, grep becomes faster than using builtins, and it doesn't slow things down significantly below 100 lines, so it's probably worth calling grep in this case, since we can't assume a user has or plays only a few songs.

L18L wrote:
zigbert wrote:
technosaurus wrote:
Oops need to prepend the -lt test with a [ $1 ] && ... For the case where it has not been played in the time period
....and that means???? Embarassed


change
Code:
while ([ $1 -lt $timestamp ]) do

to
Code:
while ( [ $1 ] && [ $1 -lt $timestamp ]) do


tested successfully (no more: ash: 1345623045: unknown operand ):
# ./file_times_played_since 1345648694
The number of times played /root/file1.mp3 since 1345648694 is 0
The number of times played /root/file2.mp3 since 1345648694 is 1
#
thanks, I was/am posting from my phone.
_________________
Web Programming - Pet Packaging 100 & 101
akash_rawal

Joined: 25 Aug 2010
Posts: 232
Location: ISM Dhanbad, Jharkhand, India

PostPosted: Fri 24 Aug 2012, 13:55    Post subject:  

jamesbond wrote:

My submission:
...

I tested your solution; it takes only one second to process 100 songs with 100000 timestamps. Faster by orders of magnitude Smile
L18L

Joined: 19 Jun 2010
Posts: 2476
Location: Burghaslach, Germany somewhere also known as "Hosla"

PostPosted: Sat 25 Aug 2012, 06:45    Post subject: Count numbers in list
Subject description: awk
 

I have been testing jamesbond's submission, too.

Change
print a[1], min
to
print a[1], NF-min
to get the number of TIMES played from FIND to NOW, as required.

Speed: wow, it is TIME to learn awk now. Cool
technosaurus


Joined: 18 May 2008
Posts: 4134

PostPosted: Sat 25 Aug 2012, 08:32    Post subject:  

@L18L
The awk would be even faster if it only computed stuff for the match:
Code:
FILEPATH=${FILEPATH//\//\\\/} #escape the slashes for awk ... if you care about paths
#FILEPATH=${FILEPATH##*/} #remove the path for awk ... if you only care about names
awk -F, -v TS=$FIND '/'"$FILEPATH"'/{...
Not sure how well awk works on non-ASCII filenames, though.
_________________
Web Programming - Pet Packaging 100 & 101
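
One way to restrict the work to a single song without building a regex from the path is to pass the path through the environment and compare it as a plain string, so slashes, dots and brackets need no escaping. A minimal sketch, assuming the path is stored exactly as it appears before the `|`; `count_for_file` and `WANT` are names made up for this example, and the count is a simple linear scan rather than the binary search.

```shell
#!/bin/sh
# Sketch: count plays newer than a timestamp for one file only.
# count_for_file and WANT are hypothetical names used for illustration.
count_for_file() {
    # $1 = exact path as stored before the "|", $2 = timestamp
    WANT="$1" LANG=C awk -F, -v TS="$2" '{
        split($1, a, "|")
        if (a[1] != ENVIRON["WANT"]) next   # plain string comparison, no regex
        $1 = a[2]                           # first field now holds a timestamp
        n = 0
        for (i = 1; i <= NF; i++)
            if ($i + 0 > TS + 0) n++        # linear scan of the matching line
        print a[1], n
    }'
}

printf '%s\n' '/root/a b.mp3| 10,20,30' '/root/c.mp3| 5,25' |
    count_for_file '/root/a b.mp3' 15
```

Only the line for `/root/a b.mp3` is processed; the call prints the path followed by 2.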
L18L

Joined: 19 Jun 2010
Posts: 2476
Location: Burghaslach, Germany somewhere also known as "Hosla"

PostPosted: Sun 26 Aug 2012, 03:56    Post subject: Count numbers in list  

technosaurus wrote:
the awk would be even faster if it only computed stuff for the match
Yes, and slower if we care about accuracy.

The binary search algorithm is not accurate in the case of identical values.
But on a single-user system there are no duplicates Laughing
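
One possible reading of the point about identical values: when a timestamp is repeated, the number of plays after TS and the number of plays at or after TS can differ by more than one, so it matters which of the two a search returns. A linear scan gives both numbers and can serve as a reference when testing a faster search. This is only a sketch; `count_both` is a made-up name.

```shell
#!/bin/sh
# Sketch: reference counts by linear scan, useful for checking a faster
# search on lines that contain repeated timestamps.
# count_both is a hypothetical helper name.
count_both() {
    # prints: path, count of stamps > TS, count of stamps >= TS
    LANG=C awk -F, -v TS="$1" '{
        split($1, a, "|"); $1 = a[2]
        gt = 0; ge = 0
        for (i = 1; i <= NF; i++) {
            if ($i + 0 >  TS + 0) gt++
            if ($i + 0 >= TS + 0) ge++
        }
        print a[1], gt, ge
    }'
}

printf '%s\n' '/a.mp3| 10,20,20,30' | count_both 20
```

For this line the call prints 1 and 3 after the path: one play after 20, three plays at or after 20.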
zigbert


Joined: 29 Mar 2006
Posts: 5562
Location: Valåmoen, Norway

PostPosted: Sun 26 Aug 2012, 14:56    Post subject:  

Test results so far with the following file
Code:
/root/file1.mp3| 1345641102,1345644563,1345647584,1345647585,1345647586,1345647587,1345647588,1345647589,1345647590,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
/root/file2.mp3| 1345623045,1345643786,1345648695,1345648696,1345648697,1345648698,1345648699,1345648700,1345648701,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
[...]
/root/file1999.mp3| 1345641102,1345644563,1345647584,1345647585,1345647586,1345647587,1345647588,1345647589,1345647590,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300
/root/file2000øæå.mp3| 1345623045,1345643786,1345648695,1345648696,1345648697,1345648698,1345648699,1345648700,1345648701,1345740286,1345740287,1345740288,1345740290,1345740291,1345740291,1345740292,1345740293,1345740293,1345740294,1345740295,1345740296,1345740297,1345740298,1345740299,1345740300


########################################

Code:
#!/bin/bash

#usage: played_since_foreach time

time="$1";

ifs_bak="$IFS"
IFS=""

while read -d "|" record; do
   IFS=","
   #Get all timestamps in array
   read -a times
   #Binary search
   i=$(( ${#times[*]}/2 ))
   d=$(( $i/2 ))
   test "$d" -eq "0" && d=1 #precaution
   ulim=$(( ${#times[*]}-1 ))
   count=-1
   while test "$i" -ge "0" -a "$i" -lt "$ulim"; do #check whether index is in range
      if test "$time" -lt "${times[$i]}"; then
         i=$(( $i-$d ))
      elif test "$time" -gt "${times[$(( $i+1 ))]}"; then
         i=$(( $i+$d ))
      else
         #found, tell no. of records greater than this
         count="$(( $ulim-$i ))"
         break
      fi
      d=$(( ($d/2) ))
      test "$d" -eq "0" && d=1 #precaution
   done
   #In case of 'out of range'
   if test "$count" -lt "0"; then
      if test "$ulim" -lt "0"; then
         count=0
      else
         if test "$time" -lt "$times"; then
            count="${#times[*]}"
         elif test "$time" -gt "${times[$ulim]}"; then
            count=0
         else
            count="bug" #This should not be reached
         fi
      fi
   fi
   #Show result for this record
   echo "$record: $count"
done < database.txt

IFS="$ifs_bak"
Code:
real   0m1.129s
user   0m1.080s
sys   0m0.047s


########################################

Code:
#!/bin/ash
timestamp=$1

ifs_bak=$IFS
IFS=","

while read LOGENTRY; do

filename=${LOGENTRY%%|*}
#LOGENTRY=`grep $filename database.txt`
#grep is faster than while read case on large files

STAMPS=${LOGENTRY#*|}
#you could use bash arrays instead - set allows "arrays" in sh, ash, etc...
#set ${STAMPS//,/ }
set ${STAMPS}
#for speed change IFS to "," vs. the //,/ } but don't forget to reset it
while ( [ $1 ] && [ $1 -lt $timestamp ]) do
shift #a hacky way to remove all entries < timestamp
done
# $# is the number of args passed or in this case set with "set ..."
echo The number of times played $filename since $timestamp is $#

done < database.txt

IFS=$ifs_bak
Code:
real   0m1.434s
user   0m0.190s
sys   0m0.633s


########################################

Code:
#!/bin/ash
LANG=C
FIND=$1

cat /root/database.txt |
awk -F, -v TS=$FIND '{
      split($1, a, "|"); $1=a[2] # fix first entry
      max=NF; min=1;
     
      while (max-min > 1) {
         i=int( (max+min)/2 )
         if (TS >= $i) min = i
         else max = i
      }
      print a[1], NF-min
}'
Code:
real   0m0.026s
user   0m0.020s
sys   0m0.003s


########################################

All worked with the non-English filename /root/file2000øæå.mp3


Thank you all
Sigmund

_________________
Stardust resources
zigbert


Joined: 29 Mar 2006
Posts: 5562
Location: Valåmoen, Norway

PostPosted: Sun 26 Aug 2012, 15:30    Post subject:  

OK, next stage.
I don't understand a bit of the awk code, so I need some help to integrate it fully into the pMusic code.

It now works OK (only OK) for finding the last played songs for a given period.
My NEW index file currently looks like this:
Code:
  /mnt/sdb1/musikk/mp3/Judas priest - Monsters of rock.mp3|    Judas priest - Monsters of rock.mp3|:,1345762237
  /mnt/sdb1/musikk/mp3/Iron maiden - Seventh son of a seventh son.mp3|    Iron maiden - Seventh son of a seventh son.mp3|:,1345762833
  /mnt/sdb1/musikk/mp3/Savatage - Dead winter dead.mp3|    Savatage - Dead winter dead.mp3|:,1345763094
  /mnt/sdb1/musikk/mp3/Twisted sister - Hot love.mp3|    Twisted sister - Hot love.mp3|:,1345763569
  /mnt/sdb1/musikk/mp3/Twisted sister - I believe in you.mp3|    Twisted sister - I believe in you.mp3|:,1345763892
  /mnt/sdb1/musikk/mp3/Twisted sister - I'm so hot for you.mp3|    Twisted sister - I'm so hot for you.mp3|:,1345764140
  /mnt/sdb1/musikk/mp3/Twisted sister - I wanna rock.mp3|    Twisted sister - I wanna rock.mp3|:,1345764326
  /mnt/sdb1/musikk/mp3/Twisted sister - King of the fools.mp3|    Twisted sister - King of the fools.mp3|:,1345764715
  /mnt/sdb1/musikk/mp3/Twisted sister - Leader of the pack.mp3|    Twisted sister - Leader of the pack.mp3|:,1345764940
  /mnt/sdb1/musikk/mp3/Twisted sister - Lookin out for no. 1.mp3|    Twisted sister - Lookin out for no. 1.mp3|:,1345765130
  /mnt/sdb1/musikk/mp3/Twisted sister - One bad habit.mp3|    Twisted sister - One bad habit.mp3|:,1345765538
  /mnt/sdb1/musikk/mp3/Twisted sister - Run for your life.mp3|    Twisted sister - Run for your life.mp3|:,1345766019
  /mnt/sdb1/musikk/mp3/Twisted sister - The beast.mp3|    Twisted sister - The beast.mp3|:,1345760757,1345766230
  /mnt/sdb1/musikk/mp3/Twisted sister - The price.mp3|    Twisted sister - The price.mp3|:,1345766462
  /mnt/sdb1/musikk/mp3/Twisted sister - Under the blade.mp3|    Twisted sister - Under the blade.mp3|:,1345766745
  /mnt/sdb1/musikk/mp3/Twisted sister - Wake up (the sleeping giant).mp3|    Twisted sister - Wake up (the sleeping giant).mp3|:,1345767008
  /mnt/sdb1/musikk/mp3/Twisted sister - We're not gonna take it.mp3|    Twisted sister - We're not gonna take it.mp3|:,1345767230
  /mnt/sdb1/musikk/mp3/Twisted sister - You can't stop rock'n'roll.mp3|    Twisted sister - You can't stop rock'n'roll.mp3|:,1345767514
  /mnt/sdb1/musikk/mp3/Twisted sister - You're not alone (Suzette's song).mp3|    Twisted sister - You're not alone (Suzette's song).mp3|:,1345767762
  /mnt/sdb1/musikk/mp3/Twisted sister - You want what we got.mp3|    Twisted sister - You want what we got.mp3|:,1345767984
  /mnt/sdb1/musikk/mp3/Saxon - Wheels of steel.mp3|    Saxon - Wheels of steel.mp3|:,1345789853
  /mnt/sdb1/musikk/mp3/Judas priest - Parental guidance (LP).mp3|    Judas priest - Parental guidance (LP).mp3|:,1345790062
  /mnt/sdb1/musikk/mp3/Judas priest - Turbo lover (LP).mp3|    Judas priest - Turbo lover (LP).mp3|:,1345790660
  /mnt/sdb1/musikk/mp3/Alice Cooper - Poison.mp3|    Alice Cooper - Poison.mp3|:,1345790933
  /mnt/sdb1/musikk/mp3/Faith no more - Easy.mp3|    Faith no more - Easy.mp3|:,1345791129
  /mnt/sdb1/musikk/mp3/Bangles - Manic monday.mp3|    Bangles - Manic monday.mp3|:,1345791315
  /mnt/sdb1/musikk/mp3/Secret garden - Nocturne.mp3|    Secret garden - Nocturne.mp3|:,1345791511
  /mnt/sdb1/musikk/mp3/Dimmu borgir - Vredesbyrd.mp3|    Dimmu borgir - Vredesbyrd.mp3|:,1345760201,1345760237,1345794966
  /mnt/sdb1/musikk/mp3/Black sabbath - Heaven and hell.mp3|    Black sabbath - Heaven and hell.mp3|:,1345760180,1345761905,1345796352
  /mnt/sdb1/musikk/mp3/Twisted sister - Love is for suckers.mp3|    Twisted sister - Love is for suckers.mp3|:,1345765338,1345797921
  /mnt/sdb1/musikk/mp3/Dimmu borgir - Burn in hell.mp3|    Dimmu borgir - Burn in hell.mp3|:,1345760219,1345760545,1345795930,1345798229
  /mnt/sdb1/musikk/mp3/Twisted sister - Destroyer.mp3|    Twisted sister - Destroyer.mp3|:,1345763352,1346008426
  /mnt/sdb1/musikk/flac/Candlemass - At the gallows end.flac|    Candlemass - At the gallows end.flac|:,1345761248,1345796972,1346008710
  /mnt/sdb1/musikk/mp3/Saint deamon - My heart.mp3|    Saint deamon - My heart.mp3|:,1345761483,1345797207,1346008944
  /mnt/sdb1/musikk/mp3/Ræva Rockers - Depp (LP).mp3|    Ræva Rockers - Depp (LP).mp3|:,1345795310,1345797401,1346009139


To get it to work, I made a workaround and used the awk separator |: instead of simply |
... I couldn't get the awk code to read column 3 instead of column 2 (as in the example in the main post).
Some #comments would be great Very Happy

The function in pMusic:
Code:
-index_rating_buildlist)
   TIMESTAMP="$2"
   cat "$3" |
   awk -F, -v TS=$TIMESTAMP '{
        split($1, a, "|:"); $1=a[2] # fix first entry
        max=NF; min=1;
       
        while (max-min > 1) {
          i=int( (max+min)/2 )
          if (TS >= $i) min = i
          else max = i
        }
        print a[1], NF-min
   }'
   ;;


Also, what we want is to define an alternative max value to find timestamps in a specific period, like February 2012. Let's say that is $4.


Anyone???
Sigmund

_________________
Stardust resources
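
If the three-column layout is kept, one option is to let awk split each line on `|` and then split only the third column on commas. A minimal sketch under a few assumptions: the third column always starts with the `:` marker as in the index above, paths contain no `|`, and the timestamps are sorted. The name `count_col3` is made up for this example.

```shell
#!/bin/sh
# Sketch: count plays newer than a timestamp in a path|name|:,stamps index.
# count_col3 is a hypothetical helper name.
count_col3() {
    # $1 = timestamp; index lines come from stdin
    LANG=C awk -F'|' -v TS="$1" '{
        n = split($3, t, ",")     # t[1] is the ":" marker, t[2..n] are timestamps
        lo = 1; hi = n + 1        # the marker doubles as the lower bound
        # invariant: t[2..lo] are <= TS, t[hi..n] are > TS
        while (hi - lo > 1) {
            mid = int((lo + hi) / 2)
            if (TS + 0 >= t[mid] + 0) lo = mid
            else hi = mid
        }
        sub(/^ +/, "", $1)        # drop the leading spaces of the path
        print $1, n - lo          # path and number of newer timestamps
    }'
}

printf '%s\n' '  /m/a.mp3|    a.mp3|:,1345760201,1345760237,1345794966' |
    count_col3 1345760237
```

The call prints the path followed by 1. Because only the third column is split on commas, a comma in a filename does not disturb the count.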
L18L

Joined: 19 Jun 2010
Posts: 2476
Location: Burghaslach, Germany somewhere also known as "Hosla"

PostPosted: Mon 27 Aug 2012, 06:42    Post subject: Awk: Count numbers in list
Subject description: cat /root/database.txt | awk
 

@zigbert - your
cat /root/database.txt | awk -F, -v TS=$FIND '...'
is faster than
while read LINE; do ..... done < database.txt

But even faster (10%) is:
awk -F, -v TS=$FIND ' ...' database.txt
L18L

Joined: 19 Jun 2010
Posts: 2476
Location: Burghaslach, Germany somewhere also known as "Hosla"

PostPosted: Mon 27 Aug 2012, 08:05    Post subject: Awk: Count numbers in list
Subject description: emulating function basename
 

zigbert wrote:
...My NEW index file looks right now like this:
Code:
  /mnt/sdb1/musikk/mp3/Judas priest - Monsters of rock.mp3|    Judas priest - Monsters of rock.mp3|:,1345762237

To get it to work, I made a workaround and used the awk separator |: instead of simply |
... I couldn't get the awk code to read column 3 instead of column 2 (as in the example in the main post).
Some #comments would be great Very Happy

Why 3 columns?
I have successfully printed both the file path and the basename from the 2-column database by changing just
print a[1], NF-min
to
num=split (a[1], b , "/")
print a[1], b[num], NF-min

Learning awk:
split(s, a [, r])
Splits the string s into the array a on the
regular expression r, and returns the number of
fields. If r is omitted, FS is used instead.
The array a is cleared first. Splitting behaves
identically to field splitting.

Go back to the 2-column database?
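
The return value of split is what makes the basename trick work: it is the number of pieces, so the last piece is the filename. A small standalone sketch, with `basename_awk` as a made-up name:

```shell
#!/bin/sh
# Sketch: emulate basename inside awk with split().
# basename_awk is a hypothetical helper name.
basename_awk() {
    # reads one path per line from stdin
    awk '{
        n = split($0, parts, "/")   # n = number of pieces between slashes
        print parts[n]              # the last piece is the filename
    }'
}

printf '%s\n' '/mnt/sdb1/musikk/mp3/Alice Cooper - Poison.mp3' | basename_awk
```

This prints `Alice Cooper - Poison.mp3`.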
L18L

Joined: 19 Jun 2010
Posts: 2476
Location: Burghaslach, Germany somewhere also known as "Hosla"

PostPosted: Mon 27 Aug 2012, 08:52    Post subject: Awk: Count numbers in list  

zigbert wrote:
Also, what we want is to define an alternative max value to find timestamps in a specific period, like February 2012. Let's say that is $4.

Where is $1?

Here is a solution using $1 and $2 as the "from" and "to" timestamps:

Quote:
#!/bin/ash
#additional output filename only=basename
#TIMESTAMP from to
#
LANG=C
#
FIND_FROM=$1 #timestamp
FIND_TO=$2

awk -F, -v TS1=$FIND_FROM -v TS2=$FIND_TO '{
   split($1, a, "|"); $1=a[2] # fix first entry

   # from = number of fields after the last one <= TS1
   # (field NF is assumed to be > TS1 and is never compared)
   max=NF; min=0;
   while (max-min > 1) {
      i=int( (max+min)/2 )
      if (TS1 >= $i) min = i
      else max = i
   }
   from=NF-min

   # to = the same count for TS2
   max=NF; min=0;
   while (max-min > 1) {
      i=int( (max+min)/2 )
      if (TS2 >= $i) min = i
      else max = i
   }
   to=NF-min

   num=split (a[1], b , "/") # b[num] is the basename
   print a[1], b[num], from-to # plays after TS1 up to and including TS2
}' database.txt
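
The two searches can also be folded into one awk function. The following is a sketch, not L18L's script: the names `count_between` and `count_le` are made up, the upper bound is `NF+1` so that the last timestamp is compared as well, and the range is taken as from < stamp <= to on sorted timestamps.

```shell
#!/bin/sh
# Sketch: count plays in the range (FROM, TO] with one reusable search.
# count_between and count_le are hypothetical names.
count_between() {
    # $1 = from, $2 = to; database lines come from stdin
    LANG=C awk -F, -v FROM="$1" -v TO="$2" '
    # count_le(ts): number of fields <= ts on the current line (binary search)
    function count_le(ts,    lo, hi, mid) {
        lo = 0; hi = NF + 1
        while (hi - lo > 1) {
            mid = int((lo + hi) / 2)
            if (ts + 0 >= $mid + 0) lo = mid
            else hi = mid
        }
        return lo
    }
    {
        split($1, a, "|"); $1 = a[2]             # a[1] = path
        print a[1], count_le(TO) - count_le(FROM)
    }'
}

printf '%s\n' '/a.mp3| 10,20,30,40' | count_between 10 30
```

The call prints the path followed by 2, for the plays at 20 and 30. To include plays at exactly FROM, pass FROM minus one.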
zigbert


Joined: 29 Mar 2006
Posts: 5562
Location: Valåmoen, Norway

PostPosted: Mon 27 Aug 2012, 16:30    Post subject: Re: Awk: Count numbers in list
Subject description: cat /root/database.txt | awk
 

L18L wrote:
But even faster (10%) is:
awk -F, -v TS=$FIND ' ...' database.txt
Done Very Happy
_________________
Stardust resources