Getting line numbers and jump in to
Posted: Mon 10 Sep 2012, 08:17
by RSH
Hi.
Again my List:
Code: Select all
100-"osx-brushed-mplayer-theme.pup"
101-"wmxmms-xpm.pup"
102-"internet-time.pup"
103-"gtkfind.pup"
104-"alsa-screen.jpg"
105-"text.jpg"
...
...
...
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
59226-"Chromium%20libs%20.png"
59227-"broadcom_wl_delta-k3.3.8-mage2-p64gsw-i586.pet"
In LazY MAID I use the indexes to search for the files on the web. I want to use the full list, which is already downloaded and updated on every run, to search for the files by index (this might be faster).
If I use grep to check whether index 59217 is in the file, it returns true. But how can I get the line number as a result, then jump to that line in the text file and read the file from that point onward?
RSH
Posted: Mon 10 Sep 2012, 12:22
by Flash
RSH, I read through your post several times and still don't understand what you're doing, so I'm afraid I can't help with your subject line. Are you continuing a discussion from another thread? If so, at least put in a link to the other thread.
And you do know that you can edit your posts, including their subject lines?
Posted: Mon 10 Sep 2012, 12:56
by SFR
Hmm, I too am not sure if I understood correctly...
Assuming that index file looks like:
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
59226-"Chromium%20libs%20.png"
59227-"broadcom_wl_delta-k3.3.8-mage2-p64gsw-i586.pet"
Code: Select all
grep -n '59223-"' index_file_name.txt
will return:
3:59223-"image-2.jpg"
which is
line_number:line_content
To "jump-in" that line number and read contents I usually use simple:
Code: Select all
head -3 index_file_name.txt | tail -1
Is that what you need?
Greetings!
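The two steps above (get the line number, then print that line) can be combined in one small script. This is only a sketch: the temporary file and its four lines are stand-ins for the real index file, and the `^` anchor is an addition so that only an index at the start of a line matches.

```shell
#!/bin/sh
# Hypothetical sample file standing in for the real index list.
index_file=$(mktemp)
cat > "$index_file" <<'EOF'
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
EOF

# grep -n prefixes each match with "line_number:"; cut keeps only the number.
line_number=$(grep -n '^59223-"' "$index_file" | cut -d: -f1)

# Print exactly that line: take the first N lines, then the last of those.
line=$(head -n "$line_number" "$index_file" | tail -n 1)

echo "$line_number"
echo "$line"
rm -f "$index_file"
```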
Posted: Tue 11 Sep 2012, 00:28
by seaside
SFR,
If you just want the corresponding filename belonging to the index line number, you might do this
Code: Select all
item=$(grep '59223' index_file_name)
filename="${item#*-}"
$filename will then equal "image-2.jpg" (the surrounding double quotes are still part of the value)
Cheers,
s
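As a possible follow-up to the parameter expansion above, the same technique can also split off the index and strip the double quotes. A minimal sketch, assuming a line in the format shown in this thread; the variable names are only illustrative.

```shell
#!/bin/sh
item='59223-"image-2.jpg"'   # one line as grep would return it

index=${item%%-*}            # drop everything from the first "-" onward
filename=${item#*-}          # drop everything up to the first "-"
filename=${filename#\"}      # strip the leading double quote
filename=${filename%\"}      # strip the trailing double quote

echo "$index"
echo "$filename"
```

Because only the part up to the first "-" is removed, dashes inside the file name itself are left alone.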
Posted: Tue 11 Sep 2012, 10:24
by Karl Godt
cat -n and grep -n are the only ways I know of to get the line number in files, too.
PATTERN="59224"
Code: Select all
grep -Hn "^${PATTERN}\-\".*\"" /lazyMAID.main.db
would print something like this into console :
Code: Select all
test.db:4:59224-"Newest%20release.png"
and cat -n :
Code: Select all
cat -n test.db |grep "${PATTERN}\-\""
prints the line number and the line separated by a tab, which is somewhat less convenient to parse.
Now sed can be used in these two forms for example :
Code: Select all
line_number=`grep -Hn "^${PATTERN}\-\".*\"" /lazyMAID.main.db |cut -f2 -d':'`
[ "$line_number" ] || { echo "Failed to get a line number for '${PATTERN}-'"; return || exit; }
Code: Select all
sed -n "$line_number p" /lazyMAID.main.db
or
Code: Select all
sed "$line_number p" /lazyMAID.main.db |uniq -d
Posted: Tue 11 Sep 2012, 22:46
by RSH
Hi to all.
Thanks for the replies. I have to look deeper into that later.
I had a go on what SFR did post, but couldn't get it finally to work. I did get the line number but could not read the file continually, starting at the line number.
Maybe Karl's Code will do it, but that's a lot of confusing code to me - still.
So i have to learn a bit more.
RSH
Posted: Thu 13 Sep 2012, 20:00
by Keef
Code: Select all
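PATTERN=59100
# Show the matching line with its line number (for checking only)
cat -n index_list.txt | grep "${PATTERN}-\""
# Keep only the line number of the match
line_number=`grep -n "${PATTERN}" index_list.txt | cut -f1 -d':'`
# Print the file from that line to the end
tail -n +$line_number index_list.txt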
If I get it right and you want a list starting from a particular line number, will the above do? It is just a mash-up of Karl and seaside's suggestions.
If I've completely got the wrong idea, I promise to stick to lurking...
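A self-contained version of the idea above, for trying it out without the real list. It is a sketch under assumptions: the four sample lines are made up, and the match is anchored with `^` so that the index is only found at the start of a line.

```shell
#!/bin/sh
# Hypothetical sample list standing in for index_list.txt.
list=$(mktemp)
printf '%s\n' '59100-"a.pet"' '59101-"b.png"' '59103-"c.jpg"' '59104-"d.gz"' > "$list"

PATTERN=59101
line_number=$(grep -n "^${PATTERN}-" "$list" | cut -d: -f1)

# tail -n +N starts output at line N and runs to the end of the file.
rest=$(tail -n +"$line_number" "$list")

echo "$rest"
rm -f "$list"
```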
Posted: Thu 13 Sep 2012, 23:23
by RSH
Yes, Keef. This does the job exactly. Thanks.
Just another issue: it reads the file until its end. I want to stop reading once the returned PATTERN is equal to or bigger than a defined end-PATTERN.
Let's say to read from 59100 to 59225.
How can I check whether it is bigger than the end-PATTERN?
In PASCAL I would write:
Code: Select all
if ret-PATTERN >= end-PATTERN then
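In sh the comparison from the Pascal line can be written with `[ ... -gt ... ]` inside a read loop. The following is a sketch, not a tested part of LazY MAID: the sample lines and the names start_index, end_index and selected are assumptions, and the loop stops after the end index so that 59225 itself is still included.

```shell
#!/bin/sh
# Hypothetical sample list; the real list is assumed to be sorted by index.
list=$(mktemp)
printf '%s\n' '59100-"a.pet"' '59101-"b.png"' '59222-"c.jpg"' '59225-"d.gz"' '59226-"e.png"' > "$list"

start_index=59100
end_index=59225
selected=""

# IFS=- splits each line at the first "-" into the index and the rest.
while IFS=- read -r index name; do
    [ "$index" -lt "$start_index" ] && continue   # not yet at the start
    [ "$index" -gt "$end_index" ] && break        # past the end: stop reading
    selected="${selected}${index}-${name}
"
done < "$list"

printf '%s' "$selected"
rm -f "$list"
```

Comparing the numbers instead of searching for them also works when the exact start or end index is missing from the list.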
Posted: Thu 13 Sep 2012, 23:45
by Karl Godt
Code: Select all
sed -n "$line_nr_begin,$line_nr_end p" index_list.txt
Let's say line_nr_begin=100 and line_nr_end=200: this should print lines 100 to 200 of the index_list.txt file.
But i think that you are still trying to loop .
Output control:
-m, --max-count=NUM stop after NUM matches
from
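The sed range above can also be told to quit at the last wanted line, so a long list is not read to the end. A small sketch, assuming a generated 1000-line file in place of the real list:

```shell
#!/bin/sh
# Hypothetical 1000-line file; line N contains the number N.
list=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) print i }' > "$list"

line_nr_begin=100
line_nr_end=200

# Print lines 100..200, then quit at line 200 so the rest is never read.
range=$(sed -n "${line_nr_begin},${line_nr_end}p;${line_nr_end}q" "$list")

echo "$range" | wc -l
rm -f "$list"
```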
Posted: Fri 14 Sep 2012, 00:46
by RSH
Thank you Karl. Yes, i did try to use a loop - but failed. Just hacked a little (trial and error) and then, suddenly: solved!
(i thought so)
Used script:
Code: Select all
#!/bin/sh
PATTERN=59203
ENDPATTERN=59225
listfile="murga-linux_attachments_index_full.lst"
# Show the start line with its line number, then store the number
cat -n $listfile | grep "${PATTERN}-\""
line_number=`grep -n "${PATTERN}" $listfile | cut -f1 -d':'`
# Same for the end line
cat -n $listfile | grep "${ENDPATTERN}-\""
end_line_number=`grep -n "${ENDPATTERN}" $listfile | cut -f1 -d':'`
# Number of lines from start to end, inclusive
CNTLINES=$(($end_line_number-$line_number))
CNTLINES=$(($CNTLINES+1))
echo $CNTLINES
# Skip to the start line, then keep CNTLINES lines
tail -n +$line_number $listfile | head -n $CNTLINES > test.lst
exit
Output ---> test.lst:
Code: Select all
59203-"broadcom_wl_delta-k2.6.32.28.pet"
59205-"dmesg.gz"
59206-"gtk-png-icons-0.0.1.tar.gz"
59207-"Power-off.png"
59209-"Frisbee%2Bxpupsay-beta2-0912.pet"
59210-"xfe133scrn.jpg"
59213-"Frisbee%2Bxpupsay-beta2-0912.pet"
59215-"firewallstate-2.0.c.gz"
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
But now there is another problem. If I use PATTERN=203 and ENDPATTERN=225, it finds 203, 1203, 2203, 3203 ... for the PATTERN and 225, 1225, 2225, 3225 ... for the ENDPATTERN and returns several line numbers, which makes the script above useless.
I need to find exactly and only 203 if this is the PATTERN, returning only the line_number of PATTERN (to start reading), and 225 if this is the ENDPATTERN, returning only the end_line_number of ENDPATTERN (to end reading).
Used list attached ---> remove .gz
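One way to get only the exact index is to anchor the search at the beginning of the line with `^` and to include the "-" that follows the index. The sketch below uses made-up sample lines to show the difference between the loose and the anchored search.

```shell
#!/bin/sh
# Hypothetical sample list with indexes that all end in 203 or 225.
list=$(mktemp)
printf '%s\n' '203-"a.pup"' '225-"b.pup"' '1203-"c.pup"' '1225-"d.pup"' '2203-"e.pup"' > "$list"

PATTERN=203

# Unanchored: also hits 1203 and 2203, so several line numbers come back.
loose=$(grep -n "${PATTERN}-" "$list" | cut -d: -f1 | tr '\n' ' ')

# Anchored with ^ and the trailing "-": only the line that starts with 203-.
exact=$(grep -n "^${PATTERN}-" "$list" | cut -d: -f1)

echo "loose: $loose"
echo "exact: $exact"
rm -f "$list"
```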
Posted: Fri 14 Sep 2012, 01:33
by RSH
This seems to work properly (as long as the index exists). I just had to refine the format of the indexes (adding 00 in front).
Indexes now listed as 00215, 001215, 002215, 003215 etc.
Code: Select all
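#!/bin/sh
# The trailing "-" is part of the patterns, so 003215 cannot match 0032150
PATTERN="003215-"
ENDPATTERN="003406-"
listfile="murga-linux_attachments_index_full.lst"
# Show the start line with its line number, then store the number
cat -n $listfile | grep "${PATTERN}\""
line_number=`grep -n "${PATTERN}" $listfile | cut -f1 -d':'`
# Same for the end line
cat -n $listfile | grep "${ENDPATTERN}\""
end_line_number=`grep -n "${ENDPATTERN}" $listfile | cut -f1 -d':'`
# Number of lines from start to end, inclusive
CNTLINES=$(($end_line_number-$line_number))
CNTLINES=$(($CNTLINES+1))
echo $CNTLINES
# Skip to the start line, then keep CNTLINES lines
tail -n +$line_number $listfile | head -n $CNTLINES > test.lst
exit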
To make sure 0032150 is not found for 003215, I add the - at the end of PATTERN and ENDPATTERN.
Output:
Code: Select all
003215-"Lynx-2.8.6-rel4.pup"
003216-"sc.png"
003218-"CEwallpaper.pup"
003219-"dscf1485.jpg"
003220-"langpack-es-213-fix-05feb07.pup"
003221-"langpack-es-213-fix-05feb07.pup"
003222-"Mp3Wrap-0.5.pup"
003224-"rc.network-2.14-3.pup"
003226-"CE-IcewmThemes.pup"
003228-"usr.tar.gz"
003229-"expose24.png"
003230-"xbootmount.gz"
003231-"isomaster10_1x.pup"
003232-"error.jpg"
003233-"xorgwizard.gz"
003234-"gnupuppy.jpg"
003236-"gpkgtool2.jpeg"
003237-"gpkgtool1.jpeg"
003241-"batmon2.png"
003242-"batmon1.png"
003243-"batmon0.0.7.pup"
003244-"error.png"
003245-"gdmap_large_file_patch.pup"
003246-"gdmap-patched2.pup"
003247-"probedisk-probepart.tar.gz"
003248-"wkpup2-02-iso.zip"
003249-"pmount.tar.gz"
003251-"xorgwizard_temporary_files.tar.gz"
003252-"sndconfig.pup"
003253-"firefox-2.0.0.1-locale_es-AR.tar.bz2"
003254-"firefox-2.0.0.1-locale_es-ES.tar.bz2"
003255-"desktop-214-JP.png"
003256-"gnocl-0.9.91.pup"
003258-"config%20log.tar.gz"
003259-"skype-tv-320.jpg"
003260-"error-firefox2.png"
003261-"xmix21.pup"
003262-"xsetnumlock.pup"
003263-"part.exe.gz"
003264-"WebGen01-en.tar.gz"
003266-"Bureau%20TTL%202-14.jpg"
003267-"Bureau%20TTL%202-14.jpg"
003268-"Image.gif"
003269-"Blinky-0.8-patched-trayicon.tar.gz"
003271-"smm1.0.pup"
003272-"main.jpg"
003273-"net-setup-2.15-1.pet"
003276-"SP214.pup"
003277-"dvdvob.c.tar.gz"
003278-"dvdauthor-patched-0.6.14.pet"
003279-"vobcopy-1.1.0.pet"
003281-"fonts_symlink.pup"
003284-"Default_filemanager-0.2.pet"
003285-"volume.jpg"
003286-"xmksfx_TEST.gz"
003287-"batmon-0.1.0.tar.gz"
003288-"internet-mail.png"
003289-"3dfm-0.7-bin.tar.gz"
003293-"gltt-2.5.2.tar.gz"
003294-"openglut-0.6.3.tar.gz"
003295-"openglut-0.6.3.tar.gz"
003296-"libglut.so.3.7.1.tar.gz"
003298-"Jwm_tray_button_is_pressed-2nd_from_left.png"
003299-"batcycle7.gif"
003300-"freememapplet-trayicon.tar.gz"
003301-"screen.png"
003302-"remotedesktopclient-2.15-2.pup"
003303-"refresh.ppm.gz"
003304-"snapshot8.png"
003308-"preferences.gz"
003309-"tkmines-2.15-1.pup"
003310-"tkmines-2.15-1.pet"
003311-"tkConvert-2.15-1.pup"
003313-"mself.zip"
003314-"missing_files.rar"
003315-"mksfs.tar.gz"
003316-"mksfs.png"
003317-"sfsManager.png"
003319-"opera48.png"
003321-"martianfull-20061203-i486.pet"
003322-"martianfull-20061203-i486.pet"
003323-"systrayapplet-0.0.1.tar.gz"
003324-"tvtime-1.0.2-i486-1kjz.pet"
003325-"gqradio-1.9.2-i486.pet"
003326-"teen1.jpg"
003327-"teen2.jpg"
003328-"teen3.jpg"
003329-"teen4.jpg"
003331-"ttoutoulinux222.jpg"
003332-"devx2hd.sh.gz"
003335-"pConvert-2.15-1.pet"
003338-"teen5.jpg"
003339-"mini-volume-0.7.pet"
003340-"pvolume-mixer-0.3.pet"
003341-"prename-0.7.pet"
003342-"prename.jpg"
003343-"mixsc.png"
003344-"bootmanager.png"
003345-"icewinconfig.pup"
003348-"xcalcscreen_.zip"
003349-"xcalc_wrapper.pup"
003351-"mhwaveedit-1.4.13.pet"
003353-"Snd-8.8.pet"
003355-"bootManager_ALT.png"
003356-"xonclock-0.0.8.7.pet"
003357-"tkdvd-4.0.5.pet"
003358-"gtkdialog3.tar.gz"
003359-"sweep-0.9.2.pet"
003360-"snack-2.2.pet"
003361-"wavesurfer-1.8.5.pet"
003362-"wavesurfer-1.8.5-stereo_patch.pet"
003363-"amixer.tar.gz"
003366-"amixer.tar.gz"
003367-"LanPuppyAdmin.png"
003368-"gtk-tab-tool_cli.tar.gz"
003372-"screen.jpeg.jpg"
003375-"IPTSTATE.jpg"
003376-"3Loadmeter.jpg"
003377-"3Loadmeter.jpg"
003380-"terminal.png"
003381-"215ce-missing-libs.pup"
003385-"amixer.tar.gz"
003386-"blinky.jpeg"
003387-"envelope_printer_1.0.2.tgz.tar"
003388-"blpup.jpg"
003389-"exit.png"
003390-"amixer.tar.gz"
003391-"amixer.tar.gz"
003392-"amixer.tar.gz"
003395-"amixer.tar.gz"
003402-"pvideoconv-0.1.pup"
003403-"pwine-0.2.pup"
003404-"H2O-gtk2-theme.tar.gz"
003406-"customcd.tar.gz"
Puuhhhh...
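The script in this post depends on both indexes being present in the list. A possible alternative, sketched here with awk, compares the indexes as numbers in a single pass, so a missing start or end index does not matter. The sample lines, the variable names first and last, and the assumption that the list is sorted by index are all illustrative.

```shell
#!/bin/sh
# Hypothetical sample list in the zero-padded format, sorted by index.
list=$(mktemp)
printf '%s\n' '003215-"Lynx-2.8.6-rel4.pup"' '003216-"sc.png"' '003218-"CEwallpaper.pup"' '003406-"customcd.tar.gz"' '003407-"next.png"' > "$list"

# Index 3217 does not exist in the list; the range still starts at 003218.
range=$(awk -F- -v first=3217 -v last=3406 '
    { idx = $1 + 0 }            # numeric value: "003218" becomes 3218
    idx > last   { exit }       # past the end of the range: stop reading
    idx >= first { print }      # inside the range: print the whole line
' "$list")

echo "$range"
rm -f "$list"
```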
Posted: Fri 14 Sep 2012, 02:56
by seaside
RSH wrote: This seems to work properly (as long as the index is existing). I just had to refine format of the indexes (adding 00 in front).
RSH,
Formatting numbers is handy using "printf".
It can create six digits, padded with zeros in front.
Cheers,
s
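A short sketch of the zero padding described above, assuming the format string `%06d` (six digits, zero padded). The second command, which pads every index of a list with awk, and the two sample lines are illustrative additions.

```shell
#!/bin/sh
# Pad a single number with zeros to a fixed width of six digits.
padded=$(printf '%06d' 3215)
echo "$padded"

# Caution: pass the number without leading zeros. The shell printf reads an
# argument such as 0032 as an octal number.

# Pad the index of every line; the text after the first "-" is kept as it is.
reformatted=$(printf '%s\n' '215-"a.pup"' '1215-"b-c.pup"' |
    awk '{ i = index($0, "-"); printf "%06d-%s\n", substr($0, 1, i - 1), substr($0, i + 1) }')
echo "$reformatted"
```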
Posted: Fri 14 Sep 2012, 03:30
by Karl Godt
grep -w $PATTERN is handy too.
The -w option matches PATTERN only as a whole WORD, i.e.
Code: Select all
A="
321-\"FILE.ext\"
3210-\"File2.ext\"
321321-\"File3.ext\""
echo "$A" |grep -w 321
# or, on a file: grep -w 321 testfile.txt
should only match the first of the three lines above.
Also useful are the special chars ' ^ ' for the beginning of a line and ' $ ' for the end
ie
Code: Select all
grep -w "^${PATTERN}\-" testfile.txt
.
*
Barry does not use grep -w (much) , he formats mostly before like
*
And if you want to grep the extensions ie .pet it would be something like
Code: Select all
grep '\.pet"$' testfile.txt >here_all_pets.db
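The options from this post can be compared on a small test file. This is a sketch with made-up lines; the fourth line is added on purpose to show that -w can also match a number inside a file name, while the `^` anchor only matches the index.

```shell
#!/bin/sh
# Hypothetical test file; the last line has 321 as a word inside the name.
list=$(mktemp)
printf '%s\n' '321-"FILE.pet"' '3210-"File2.ext"' '321321-"File3.pet"' '500-"foo-321.pet"' > "$list"

word_hits=$(grep -c -w 321 "$list")       # whole-word matches anywhere in the line
anchored_hits=$(grep -c '^321-' "$list")  # only lines whose index is exactly 321
pets=$(grep -c '\.pet"$' "$list")         # lines whose file name ends in .pet

echo "$word_hits $anchored_hits $pets"
rm -f "$list"
```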
Posted: Fri 14 Sep 2012, 06:59
by technosaurus
if you are concerned about speed, Awk can basically replace:
grep ... /string/
Print line # ... FNR
cut ... By changing FS or using split
sed/tr ... sub and gsub
bash
wc
And many more
To jump to a matching line it would resemble:
(i say resemble because I am speculating from my phone ....untested)
awk 'BEGIN{FS="-";}/STRING/{print FNR, $2}'
Though it would probably be unnecessary to have the numbers at all, and then the FS would not need to change to - either
Awk combines bash, cut, grep, sed and others into 1 tool with pretty print capabilities and floating point math.
P.s. "-" is not the best separator: think how many file names contain it. Compared to tabs it's just an extra complication, though easy enough to cope with if necessary
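A runnable variant of the awk idea above, as a sketch only. The sample lines and the variable name idx are assumptions. Because file names may contain "-" themselves, the name is cut off at the first "-" with substr() instead of printing $2.

```shell
#!/bin/sh
# Hypothetical sample list standing in for the real index file.
list=$(mktemp)
printf '%s\n' '59222-"ctwm-3.8a-plus.tar.gz"' '59223-"image-2.jpg"' '59224-"Newest%20release.png"' > "$list"

# One awk call replaces grep -n, cut and the jump to the line.
result=$(awk -F- -v idx=59223 '
    $1 == idx {
        name = substr($0, index($0, "-") + 1)   # everything after the first "-"
        print FNR ": " name                     # FNR is the current line number
        exit                                    # stop at the first match
    }
' "$list")

echo "$result"
rm -f "$list"
```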
Posted: Fri 14 Sep 2012, 12:16
by Karl Godt
Technosaurus, thanks for mentioning how to use the FNR special variable .
Will have to test it .
Does FNR mean the same as '-f2-' for cut (all variables ($3,$4,$5,...))? Because I don't know an 'all variables until the end' syntax, I have not used awk much.
Posted: Fri 14 Sep 2012, 12:58
by technosaurus
FNR is basically the line number. The trick to printing all fields after $1 is to set $1 to "" and print $0 (or you can loop from 2 to NF). This technique is similar in bash/sh to using while read LINE; do set $LINE; shift; echo $@....
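The three techniques mentioned above (emptying $1, looping from 2 to NF, and set/shift in sh) can be put side by side. A minimal sketch with a made-up, space-separated sample line:

```shell
#!/bin/sh
line='59222 ctwm 3.8a plus'   # hypothetical space-separated sample line

# awk: empty the first field, remove the leftover leading space, print the rest.
rest_awk=$(echo "$line" | awk '{ $1 = ""; sub(/^ /, ""); print }')

# awk: loop from field 2 to NF and join the fields again.
rest_loop=$(echo "$line" | awk '{ out = $2; for (i = 3; i <= NF; i++) out = out " " $i; print out }')

# sh: split the line into positional parameters and drop the first one.
# $line is left unquoted on purpose so that the shell splits it into words.
set -- $line
shift
rest_sh="$*"

echo "$rest_awk"
echo "$rest_loop"
echo "$rest_sh"
```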