Getting line numbers and jump in to

RSH
Posts: 2397
Joined: Mon 05 Sep 2011, 14:21
Location: Germany

#1 Post by RSH »

Hi.

Again my List:

Code: Select all

100-"osx-brushed-mplayer-theme.pup"
101-"wmxmms-xpm.pup"
102-"internet-time.pup"
103-"gtkfind.pup"
104-"alsa-screen.jpg"
105-"text.jpg"
...
...
...
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
59226-"Chromium%20libs%20.png"
59227-"broadcom_wl_delta-k3.3.8-mage2-p64gsw-i586.pet"
In LazY MAID I use the indexes to search for the files on the web. I want to use the full list, which is already downloaded and updated on every run, to search for the files by index (it might be faster?).

So if I use grep to check whether index 59217 is in the file, it returns true. But how can I get the line number as the result, then jump to that line in the text file and read the file from that point on?

RSH
[b][url=http://lazy-puppy.weebly.com]LazY Puppy[/url][/b]
[b][url=http://rshs-dna.weebly.com]RSH's DNA[/url][/b]
[url=http://murga-linux.com/puppy/viewtopic.php?t=91422][b]SARA B.[/b][/url]

Flash
Official Dog Handler
Posts: 13071
Joined: Wed 04 May 2005, 16:04
Location: Arizona USA

#2 Post by Flash »

RSH, I read through your post several times and still don't understand what you're doing, so I'm afraid I can't help with your subject line. Are you continuing a discussion from another thread? If so, at least put in a link to the other thread.

And you do know that you can edit your posts, including their subject lines?

SFR
Posts: 1800
Joined: Wed 26 Oct 2011, 21:52

#3 Post by SFR »

Hmm, I too am not sure if I understood correctly...
Assuming that index file looks like:
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
59226-"Chromium%20libs%20.png"
59227-"broadcom_wl_delta-k3.3.8-mage2-p64gsw-i586.pet"

Code: Select all

grep -n '59223-"' index_file_name.txt
will return:
3:59223-"image-2.jpg"
which is line_number:line_content

To "jump-in" that line number and read contents I usually use simple:

Code: Select all

head -3 index_file_name.txt | tail -1
Is that what you need?
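
The two steps above (find the line number, then read from it) can be combined. This is only a sketch under assumptions: the sample data and the helper name line_of are made up for illustration, and the pattern is anchored with ^ so that only the index at the start of a line is matched.

```shell
#!/bin/sh
# Hypothetical sample data; the real list would be the LazY MAID index file.
index_file=$(mktemp)
cat > "$index_file" <<'EOF'
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
EOF

# line_of INDEX FILE: print the line number of the first entry
# whose index (the text before the first "-") is exactly INDEX.
line_of() {
    grep -n "^$1-" "$2" | head -n 1 | cut -d: -f1
}

n=$(line_of 59223 "$index_file")
echo "line number: $n"

# Print everything from that line to the end of the file.
tail -n +"$n" "$index_file"
```

To read only the single matching line, head -n "$n" "$index_file" | tail -n 1 does the same job as the head/tail pair above.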

Greetings!
[color=red][size=75][O]bdurate [R]ules [D]estroy [E]nthusiastic [R]ebels => [C]reative [H]umans [A]lways [O]pen [S]ource[/size][/color]
[b][color=green]Omnia mea mecum porto.[/color][/b]

seaside
Posts: 934
Joined: Thu 12 Apr 2007, 00:19

#4 Post by seaside »

SFR,

If you just want the filename that belongs to an index, you might do this

Code: Select all

item=$(grep '^59223-' index_file_name)
filename="${item#*-}"
$filename will then equal "image-2.jpg" (the surrounding quotes are still part of the string)
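
If the quotes around the name are not wanted, two more parameter expansions can strip them. A small sketch, where the helper name name_of is an assumption made for illustration:

```shell
#!/bin/sh
# name_of ENTRY: print the file name part of an entry like 59223-"image-2.jpg"
name_of() {
    name=${1#*-}        # drop everything up to and including the first "-"
    name=${name#\"}     # drop the leading double quote, if any
    name=${name%\"}     # drop the trailing double quote, if any
    printf '%s\n' "$name"
}

name_of '59223-"image-2.jpg"'
name_of '59222-"ctwm-3.8a-plus.tar.gz"'
```

Because ${1#*-} removes the shortest match, dashes inside the file name itself are left alone.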

Cheers,
s

Karl Godt
Posts: 4199
Joined: Sun 20 Jun 2010, 13:52
Location: Kiel,Germany

#5 Post by Karl Godt »

cat -n and grep -n are the only possibilities to get the line number in files i know of too .

PATTERN="59224"

Code: Select all

grep -Hn "^${PATTERN}-\".*\"" /lazyMAID.main.db
would print something like this into console :

Code: Select all

test.db:4:59224-"Newest%20release.png"
and cat -n :

Code: Select all

cat -n test.db |grep "${PATTERN}-\""
prints like

Code: Select all

     4	59224-"Newest%20release.png"
with tabs . Somewhat more uncomfortable .

Now sed can be used in these two forms for example :

Code: Select all

line_number=`grep -Hn "^${PATTERN}-\".*\"" /lazyMAID.main.db |cut -f2 -d':'`
[ "$line_number" ] || { echo "Failed to get a line number for '${PATTERN}-'"; return || exit; }

Code: Select all

sed -n "$line_number p" /lazyMAID.main.db
or

Code: Select all

sed "$line_number p" /lazyMAID.main.db |uniq -d

RSH
Posts: 2397
Joined: Mon 05 Sep 2011, 14:21
Location: Germany

#6 Post by RSH »

Hi to all.

Thanks for the replies. I have to look deeper into that later.

I had a go on what SFR did post, but couldn't get it finally to work. I did get the line number but could not read the file continually, starting at the line number.

Maybe Karl's Code will do it, but that's a lot of confusing code to me - still. :lol:

So i have to learn a bit more.

RSH

Keef
Posts: 987
Joined: Thu 20 Dec 2007, 22:12
Location: Staffordshire

#7 Post by Keef »

Code: Select all

PATTERN=59100
cat -n index_list.txt |grep "${PATTERN}-\""
line_number=`grep -n "${PATTERN}" index_list.txt | cut -f1 -d':'`
tail -n +$line_number index_list.txt
If I get it right and you want a list starting from a particular line number, will the above do? It is just a mash-up of Karl and seaside's suggestions.

If I've completely got the wrong idea, I promise to stick to lurking...

RSH
Posts: 2397
Joined: Mon 05 Sep 2011, 14:21
Location: Germany

#8 Post by RSH »

Keef wrote:

Code: Select all

PATTERN=59100
cat -n index_list.txt |grep "${PATTERN}-\""
line_number=`grep -n "${PATTERN}" index_list.txt | cut -f1 -d':'`
tail -n +$line_number index_list.txt
If I get it right and you want a list starting from a particular line number, will the above do? It is just a mash-up of Karl and seaside's suggestions.

If I've completely got the wrong idea, I promise to stick to lurking...
Yes, Keef. This does the job exactly. Thanks.

Just another issue ---> it reads the file until its end. I want to stop reading the file if returned PATTERN is equal or even bigger than a defined end-PATTERN:

Let's say to read from 59100 to 59225.

How to check if it is bigger than the end-PATTERN.

In PASCAL i would write:

Code: Select all

if ret-PATTERN >= end-PATTERN then 
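
In shell the same numeric check can be left to awk, which compares the index in front of the first "-" as a number and stops reading once it is past the end index. This is a sketch only: the function name print_range is hypothetical, and it assumes the list is sorted by index in ascending order.

```shell
#!/bin/sh
# print_range START END FILE
# Print all entries whose index is between START and END (inclusive).
# The indexes themselves do not have to exist in the list.
print_range() {
    awk -F- -v lo="$1" -v hi="$2" '
        $1 + 0 >  hi + 0 { exit }     # past the end index: stop reading
        $1 + 0 >= lo + 0 { print }    # inside the range: print the line
    ' "$3"
}

# Hypothetical sample data for a quick try.
sample=$(mktemp)
printf '%s\n' '59217-"firewallstate-2.2.pet"' '59222-"ctwm-3.8a-plus.tar.gz"' \
    '59223-"image-2.jpg"' '59224-"Newest%20release.png"' \
    '59225-"Htop.png"' '59226-"Chromium%20libs%20.png"' > "$sample"

print_range 59220 59224 "$sample"
```

Since the comparison is numeric, a start index such as 59220 that is missing from the list still works, and zero-padded indexes compare the same as unpadded ones.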

Karl Godt
Posts: 4199
Joined: Sun 20 Jun 2010, 13:52
Location: Kiel,Germany

#9 Post by Karl Godt »

Code: Select all

sed -n "$line_nr_begin,$line_nr_end p" index_list.txt
Let's say line_nr_begin=100 and line_nr_end=200 : this should print lines 100 to 200 of the index_list.txt file .

But i think that you are still trying to loop .

Output control:
-m, --max-count=NUM stop after NUM matches
from

Code: Select all

grep --help
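
A sketch of the sed range form wrapped in a function, with one addition that is not from the post above: a q command so sed stops reading at the end line instead of scanning the rest of a long list. The function name print_lines and the sample file are assumptions for illustration.

```shell
#!/bin/sh
# print_lines BEGIN END FILE
# Print lines BEGIN..END (inclusive) and quit right after END.
print_lines() {
    sed -n "${1},${2}p; ${2}q" "$3"
}

# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' one two three four five > "$sample"

print_lines 2 4 "$sample"
```

With grep, -m 1 plays a similar role: it stops at the first match, which also guarantees that only one line number comes back.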

RSH
Posts: 2397
Joined: Mon 05 Sep 2011, 14:21
Location: Germany

#10 Post by RSH »

Karl Godt wrote:

Code: Select all

sed -n "$line_nr_begin,$line_nr_end p" index_list.txt
Let's say line_nr_begin=100 and line_nr_end=200 : this should print lines 100 to 200 of the index_list.txt file .

But i think that you are still trying to loop .
Output control:
-m, --max-count=NUM stop after NUM matches
from

Code: Select all

grep --help
Thank you Karl. Yes, i did try to use a loop - but failed. Just hacked a little (trial and error) and then, suddenly: solved! :D (i thought so) :cry:

Used script:

Code: Select all

#!/bin/sh

PATTERN=59203
ENDPATTERN=59225
listfile="murga-linux_attachments_index_full.lst"

# show the start entry, then store its line number
cat -n "$listfile" |grep "${PATTERN}-\""
line_number=`grep -n "${PATTERN}" "$listfile" | cut -f1 -d':'`

# show the end entry, then store its line number
cat -n "$listfile" |grep "${ENDPATTERN}-\""
end_line_number=`grep -n "${ENDPATTERN}" "$listfile" | cut -f1 -d':'`

# number of lines from start to end, inclusive
CNTLINES=$(($end_line_number-$line_number))
CNTLINES=$(($CNTLINES+1))

echo $CNTLINES
tail -n +$line_number "$listfile" | head -n $CNTLINES > test.lst

exit
Output ---> test.lst:

Code: Select all

59203-"broadcom_wl_delta-k2.6.32.28.pet"
59205-"dmesg.gz"
59206-"gtk-png-icons-0.0.1.tar.gz"
59207-"Power-off.png"
59209-"Frisbee%2Bxpupsay-beta2-0912.pet"
59210-"xfe133scrn.jpg"
59213-"Frisbee%2Bxpupsay-beta2-0912.pet"
59215-"firewallstate-2.0.c.gz"
59217-"firewallstate-2.2.pet"
59222-"ctwm-3.8a-plus.tar.gz"
59223-"image-2.jpg"
59224-"Newest%20release.png"
59225-"Htop.png"
But now there is another problem. If i use index PATTERN=203 and ENDPATTERN=225 it finds 203, 1203, 2203, 3203 ... ... ... for the PATTERN and 225, 1225, 2225, 3225 ... ... ... for the ENDPATTERN and returns several line numbers, which makes the above used script useless. :roll:

I need to find exactly and only 203 if this is the PATTERN and returning only the line_number of PATTERN (to start reading) and 225 if this is the ENDPATTERN and only the end_line_number of ENDPATTERN (to end reading).
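
The multiple hits come from grep matching 203 anywhere in a line. A sketch of the difference, assuming the index always starts the line and is followed by "-"; the sample data and the helper name exact_line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data with indexes that contain each other.
list=$(mktemp)
printf '%s\n' '203-"a.pet"' '1203-"b.pet"' '2203-"c.pet"' '225-"d.png"' > "$list"

grep -c '203' "$list"      # counts every line that contains 203 anywhere
grep -c '^203-' "$list"    # counts only the line whose index is exactly 203

# exact_line INDEX FILE: line number of the entry with exactly this index.
exact_line() {
    grep -n "^$1-" "$2" | cut -d: -f1
}

exact_line 203 "$list"
exact_line 225 "$list"
```

The ^ pins the match to the start of the line, so 1203 and 2203 no longer qualify, and the trailing - keeps 2030 from matching 203.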

Used list attached ---> remove .gz

RSH
Posts: 2397
Joined: Mon 05 Sep 2011, 14:21
Location: Germany

#11 Post by RSH »

This seems to work properly (as long as the index is existing). I just had to refine format of the indexes (adding 00 in front). 8)

Indexes now listed as 000215, 001215, 002215, 003215 etc.

Code: Select all

#!/bin/sh

PATTERN="003215-"
ENDPATTERN="003406-"
listfile="murga-linux_attachments_index_full.lst"

# PATTERN already ends in "-", so only the opening quote is appended here
cat -n "$listfile" |grep "${PATTERN}\""
line_number=`grep -n "${PATTERN}" "$listfile" | cut -f1 -d':'`

cat -n "$listfile" |grep "${ENDPATTERN}\""
end_line_number=`grep -n "${ENDPATTERN}" "$listfile" | cut -f1 -d':'`

# number of lines from start to end, inclusive
CNTLINES=$(($end_line_number-$line_number))
CNTLINES=$(($CNTLINES+1))

echo $CNTLINES
tail -n +$line_number "$listfile" | head -n $CNTLINES > test.lst

exit
To make sure 0032150 is not found for 003215 i add the - at the end of PATTERN and ENDPATTERN.

Output:

Code: Select all

003215-"Lynx-2.8.6-rel4.pup"
003216-"sc.png"
003218-"CEwallpaper.pup"
003219-"dscf1485.jpg"
003220-"langpack-es-213-fix-05feb07.pup"
003221-"langpack-es-213-fix-05feb07.pup"
003222-"Mp3Wrap-0.5.pup"
003224-"rc.network-2.14-3.pup"
003226-"CE-IcewmThemes.pup"
003228-"usr.tar.gz"
003229-"expose24.png"
003230-"xbootmount.gz"
003231-"isomaster10_1x.pup"
003232-"error.jpg"
003233-"xorgwizard.gz"
003234-"gnupuppy.jpg"
003236-"gpkgtool2.jpeg"
003237-"gpkgtool1.jpeg"
003241-"batmon2.png"
003242-"batmon1.png"
003243-"batmon0.0.7.pup"
003244-"error.png"
003245-"gdmap_large_file_patch.pup"
003246-"gdmap-patched2.pup"
003247-"probedisk-probepart.tar.gz"
003248-"wkpup2-02-iso.zip"
003249-"pmount.tar.gz"
003251-"xorgwizard_temporary_files.tar.gz"
003252-"sndconfig.pup"
003253-"firefox-2.0.0.1-locale_es-AR.tar.bz2"
003254-"firefox-2.0.0.1-locale_es-ES.tar.bz2"
003255-"desktop-214-JP.png"
003256-"gnocl-0.9.91.pup"
003258-"config%20log.tar.gz"
003259-"skype-tv-320.jpg"
003260-"error-firefox2.png"
003261-"xmix21.pup"
003262-"xsetnumlock.pup"
003263-"part.exe.gz"
003264-"WebGen01-en.tar.gz"
003266-"Bureau%20TTL%202-14.jpg"
003267-"Bureau%20TTL%202-14.jpg"
003268-"Image.gif"
003269-"Blinky-0.8-patched-trayicon.tar.gz"
003271-"smm1.0.pup"
003272-"main.jpg"
003273-"net-setup-2.15-1.pet"
003276-"SP214.pup"
003277-"dvdvob.c.tar.gz"
003278-"dvdauthor-patched-0.6.14.pet"
003279-"vobcopy-1.1.0.pet"
003281-"fonts_symlink.pup"
003284-"Default_filemanager-0.2.pet"
003285-"volume.jpg"
003286-"xmksfx_TEST.gz"
003287-"batmon-0.1.0.tar.gz"
003288-"internet-mail.png"
003289-"3dfm-0.7-bin.tar.gz"
003293-"gltt-2.5.2.tar.gz"
003294-"openglut-0.6.3.tar.gz"
003295-"openglut-0.6.3.tar.gz"
003296-"libglut.so.3.7.1.tar.gz"
003298-"Jwm_tray_button_is_pressed-2nd_from_left.png"
003299-"batcycle7.gif"
003300-"freememapplet-trayicon.tar.gz"
003301-"screen.png"
003302-"remotedesktopclient-2.15-2.pup"
003303-"refresh.ppm.gz"
003304-"snapshot8.png"
003308-"preferences.gz"
003309-"tkmines-2.15-1.pup"
003310-"tkmines-2.15-1.pet"
003311-"tkConvert-2.15-1.pup"
003313-"mself.zip"
003314-"missing_files.rar"
003315-"mksfs.tar.gz"
003316-"mksfs.png"
003317-"sfsManager.png"
003319-"opera48.png"
003321-"martianfull-20061203-i486.pet"
003322-"martianfull-20061203-i486.pet"
003323-"systrayapplet-0.0.1.tar.gz"
003324-"tvtime-1.0.2-i486-1kjz.pet"
003325-"gqradio-1.9.2-i486.pet"
003326-"teen1.jpg"
003327-"teen2.jpg"
003328-"teen3.jpg"
003329-"teen4.jpg"
003331-"ttoutoulinux222.jpg"
003332-"devx2hd.sh.gz"
003335-"pConvert-2.15-1.pet"
003338-"teen5.jpg"
003339-"mini-volume-0.7.pet"
003340-"pvolume-mixer-0.3.pet"
003341-"prename-0.7.pet"
003342-"prename.jpg"
003343-"mixsc.png"
003344-"bootmanager.png"
003345-"icewinconfig.pup"
003348-"xcalcscreen_.zip"
003349-"xcalc_wrapper.pup"
003351-"mhwaveedit-1.4.13.pet"
003353-"Snd-8.8.pet"
003355-"bootManager_ALT.png"
003356-"xonclock-0.0.8.7.pet"
003357-"tkdvd-4.0.5.pet"
003358-"gtkdialog3.tar.gz"
003359-"sweep-0.9.2.pet"
003360-"snack-2.2.pet"
003361-"wavesurfer-1.8.5.pet"
003362-"wavesurfer-1.8.5-stereo_patch.pet"
003363-"amixer.tar.gz"
003366-"amixer.tar.gz"
003367-"LanPuppyAdmin.png"
003368-"gtk-tab-tool_cli.tar.gz"
003372-"screen.jpeg.jpg"
003375-"IPTSTATE.jpg"
003376-"3Loadmeter.jpg"
003377-"3Loadmeter.jpg"
003380-"terminal.png"
003381-"215ce-missing-libs.pup"
003385-"amixer.tar.gz"
003386-"blinky.jpeg"
003387-"envelope_printer_1.0.2.tgz.tar"
003388-"blpup.jpg"
003389-"exit.png"
003390-"amixer.tar.gz"
003391-"amixer.tar.gz"
003392-"amixer.tar.gz"
003395-"amixer.tar.gz"
003402-"pvideoconv-0.1.pup"
003403-"pwine-0.2.pup"
003404-"H2O-gtk2-theme.tar.gz"
003406-"customcd.tar.gz"
Puuhhhh...

seaside
Posts: 934
Joined: Thu 12 Apr 2007, 00:19

#12 Post by seaside »

RSH wrote: This seems to work properly (as long as the index is existing). I just had to refine format of the indexes (adding 00 in front).
RSH,

Formatting numbers is handy using "printf".

Code: Select all

# printf "%06d" 3215
003215
Creates six digits and pads with zeros in front.
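
One catch when the number comes from a variable: printf reads a value with a leading zero as octal, so an index that is already padded has to be stripped first. A sketch, with the helper name pad_index being an assumption for illustration:

```shell
#!/bin/sh
# pad_index NUMBER: print NUMBER padded with zeros to six digits.
pad_index() {
    n=$1
    # Strip leading zeros (but keep a lone 0) so printf sees a decimal number.
    while [ "${n#0}" != "$n" ] && [ "${#n}" -gt 1 ]; do
        n=${n#0}
    done
    printf '%06d\n' "$n"
}

pad_index 3215
pad_index 003215

# Build a search pattern from an unpadded index.
PATTERN="^$(pad_index 215)-"
echo "$PATTERN"
```

The resulting PATTERN can be handed to grep -n as it is.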

Cheers,
s

Karl Godt
Posts: 4199
Joined: Sun 20 Jun 2010, 13:52
Location: Kiel,Germany

#13 Post by Karl Godt »

grep -w $PATTERN is handy too .

-w option uses WORD ie

Code: Select all

A="
321-\"FILE.ext\"
3210-\"File2.ext\"
321321-\"File3.ext\""
echo "$A" |grep -w 321
# or, with a file: grep -w 321 testfile.txt

should only grep the first of the three above patterns .

Also useful are the special chars ' ^ ' for the beginning of a line and ' $ ' for the end

ie

Code: Select all

grep -w "^${PATTERN}-" testfile.txt
.

*

Barry does not use grep -w (much) , he formats mostly before like

Code: Select all

PATTERN='^'"${PATTERN}"'-'
*

And if you want to grep the extensions ie .pet it would be something like

Code: Select all

grep '\.pet"$' testfile.txt >here_all_pets.db

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#14 Post by technosaurus »

if you are concerned about speed, Awk can basically replace:
grep ... /string/
Print line # ... FNR
cut ... By changing FS or using split
sed/tr ... sub and gsub
bash
wc
And many more

To jump to a matching line it would resemble:
(i say resemble because I am speculating from my phone ....untested)
awk 'BEGIN{FS="-";}/STRING/{print FNR, $2}'
Though it would probably be unnecessary to have the numbers at all, then the FS would not need to change to - either

Awk combines bash, cut, grep, sed and others into 1 tool with pretty print capabilities and floating point math.

P.s. "-" is not the best separator: think how many file names contain it. Compared to tabs it's just an extra complication, though easy enough to cope with if necessary.
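
A tested variant of the idea, as a sketch only: because file names contain "-" themselves, it does not rely on FS at all but splits each line at the first dash with index() and substr(). The function name find_entry and the sample data are assumptions for illustration.

```shell
#!/bin/sh
# find_entry INDEX FILE
# Print "line_number file_name" for the entry with exactly this index.
find_entry() {
    awk -v idx="$1" '
        {
            pos = index($0, "-")          # the first dash ends the index
            if (pos == 0) next            # no dash: not an entry line
            if (substr($0, 1, pos - 1) + 0 == idx + 0) {
                name = substr($0, pos + 1)
                gsub(/"/, "", name)       # drop the double quotes
                print FNR, name
                exit                      # stop at the first hit
            }
        }' "$2"
}

# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' '59217-"firewallstate-2.2.pet"' '59222-"ctwm-3.8a-plus.tar.gz"' > "$sample"

find_entry 59222 "$sample"
```

The index is compared as a number, so 3215 also finds an entry stored as 003215.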
Check out my [url=https://github.com/technosaurus]github repositories[/url]. I may eventually get around to updating my [url=http://bashismal.blogspot.com]blogspot[/url].

Karl Godt
Posts: 4199
Joined: Sun 20 Jun 2010, 13:52
Location: Kiel,Germany

#15 Post by Karl Godt »

Technosaurus, thanks for mentioning how to use the FNR special variable .
Will have to test it .
Does FNR mean the same as '-f2-' for cut ( all variables ($3,$4,$5,...) ) ? Because of not knowing how to use an 'all variables until the end' syntax i have not used awk much .

technosaurus
Posts: 4853
Joined: Mon 19 May 2008, 01:24
Location: Blue Springs, MO
Contact:

#16 Post by technosaurus »

FNR is basically the line number. The trick to printing all fields after $1 is to set $1 to "" and print $0 (or you can loop from 2 to NF). This technique is similar in bash/sh to using while read LINE; do set $LINE; shift; echo $@....
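
Both tricks written out as a sketch (the function names rest_awk and rest_sh are made up for illustration). Blanking $1 makes awk rebuild $0 with a separator left at the front, so the awk version trims that before printing.

```shell
#!/bin/sh
# awk version: blank the first field, trim the separator left behind, print.
rest_awk() {
    awk '{ $1 = ""; sub(/^[ \t]+/, ""); print }'
}

# sh version: read puts the first word into "first" and the remainder into "rest".
rest_sh() {
    while read -r first rest; do
        printf '%s\n' "$rest"
    done
}

printf '%s\n' 'one two three four' | rest_awk
printf '%s\n' 'one two three four' | rest_sh
```

Both print everything after the first field, which is what cut -d' ' -f2- does for space-separated input.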
