another coding tutorial-- how to write pupraytally



#1 Post by nosystemdthanks

the right language for the job is the one that helps you accomplish your goals. to a puppy fan, python might look like a very bloated framework.

i understand. for a python fan, you can download developer suites that are 50 times the size of python itself-- larger than your entire puppy cd. i dont use those suites.

i invested about a decade of practice and application with bash. i still use it daily, i have 5 term windows open and 3 tabbed windows with 3 tabs each. only a few are running alex.

but when i want to do something complex, which is possible with bash, i dont enjoy it quite as much. not usually:

wget -O- https://ptpb.pw/UFuV > bashlogo ; md5sum bashlogo | grep 24b6dc7bb0a52f1c92af36c3710cbec2

i didnt write fig as a bash replacement, but as an alternative to basic. trying to teach basic and python got me farther than javascript did, but for mathphobes it was still a hard sell ("this looks too much like math.")

i wanted something that allowed more words, and looked more like language (of course math is a language too, but its a very different language.) all coding is math when it gets to the cpu.

when i write bash, i feel like every little change could break the whole thing. this is true of most languages, but some break unexpectedly more easily than others. for example, in python and javascript, some but fewer things like this matter:

Code: Select all

p="hello";
p= "hello";
p ="hello"
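we can debate about stuff like that all day, like how javascript conventionally ends statements with a final ; (a perfect example of what im complaining about, even though the language will usually insert a missing one for you) while most scripting languages dont need that. c and c++ do require it.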

bash requires p="no whitespace on either side of the equals" and i understand why-- its not arbitrary, but things like that can be tedious.

some languages are more finicky than others. of course bash is very powerful and useful and in some ways, flexible. which is why we use it every day.

but i get tired of the mountain of little details, things like this:

Code: Select all

[[ "" == "" ]]
and then there are the debates about whether you should use [ ] or [[ ]], which take 45 minutes only to find out that there are some comparisons where full bash doesnt work the same or as reliably with certain types of data. that is why i decided years ago that i prefer [[ ]], and if i use [ ] then other things start breaking.

thats not how i prefer to write most of my code.

any example we could spend a long time talking about, and ive done as much of that as i want to. besides-- i got into bash years later.

python (like basic) does let you call bash code, and fig lets you call bash code or python code. you can also call some inline python code from bash but trust me, it gets real tedious real fast.
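
for the python side of that, here is a minimal sketch of calling bash code from python 3 with the standard subprocess module. run_bash is a name made up for this example (it is not part of fig or pupray), and it assumes bash is on your path:

```python
import subprocess

def run_bash(code):
    """Run a snippet of bash and return its stdout as a list of lines."""
    result = subprocess.run(
        ["bash", "-c", code],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.splitlines()

if __name__ == "__main__":
    # a small loop written in bash, consumed as a python list
    for line in run_bash('for n in 1 2 3; do echo "line $n"; done'):
        print(line)
```

passing the code as one argument to bash -c keeps the quoting problems on the bash side, where they belong.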

ive written a lot of python and its very nice, but i wanted something even easier to explain.

so here we go.

normally in fig, when i want to create arrays it is very straightforward:

Code: Select all

people arr
places arr
things arr
but suppose we want 1000 item string arrays? alright:

Code: Select all

people "" arr times 1000
places "" arr times 1000
things "" arr times 1000
we can build them gradually too:

Code: Select all

people arr mid 0 1
places arr mid 0 1
things arr mid 0 1

people plus "bob" plus "joe"
things plus "honda civic"
people plus "cindy"
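
for comparison, roughly the same ideas with plain python lists. this is only a sketch: im assuming the fig lines above behave like this, and the exact starting contents of a fresh fig arr may differ from an empty python list. big_people is a name made up for the example.

```python
# three empty lists to start from
people, places, things = [], [], []

# a 1000-item list of empty strings
big_people = [""] * 1000

# grow the lists one item at a time
people.append("bob")
people.append("joe")
things.append("honda civic")
people.append("cindy")

print(people)           # the three names, in the order they were added
print(len(big_people))  # how many slots the pre-sized list has
```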
theres a special version of fig with dictionary support. i try to keep the feature set of fig very stable. there are < 100 commands.

but we can use python dictionaries in fig:

Code: Select all

tal arr
python
    tal = {}
    fig
pupraytally uses only 3 user defined functions:

Code: Select all

function getsize p
    fig 

function gettype p
    fig

function taldo p t
python
    fig
    fig
the first two are written purely in fig, the third is a wrapper for handling the dictionary.

getsize and gettype each take one parameter and return a value. taldo takes two parameters and accesses a dictionary that we happen to make global.
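
the function bodies arent shown here, so this is a guess at what the three might look like in plain python 3, not the actual pupraytally code. it assumes each input line looks like "size hash date time<tab>path", that gettype means the file extension, and that taldo keeps a running total of bytes per extension:

```python
import os

tal = {}  # global tally: extension -> total bytes (assumed meaning)

def getsize(line):
    """Return the leading size field of a line as an int, or 0 if it is missing."""
    first = line.split(" ", 1)[0]
    return int(first) if first.isdigit() else 0

def gettype(line):
    """Return the extension of the path that follows the tab, e.g. '.gz'."""
    path = line.rstrip("\n").partition("\t")[2]
    return os.path.splitext(path)[1]

def taldo(filetype, size):
    """Add size to the running total for filetype in the global dictionary."""
    tal[filetype] = tal.get(filetype, 0) + size
```

taldo only mutates the dictionary, so it has nothing to return, and the two parsing functions never touch the global at all.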

for the most part, local function scope means its a lot less work to build larger programs.

variables in bash functions are global by default (you have to ask for local ones with the local builtin). bash scripts do get their own scope:

Code: Select all

$ cat > prog1
p="hello" ; echo $p ; ./prog2 
$ cat > prog2
echo $p ; p="there" ; echo $p 
$ cat >> prog1
echo $p
$ chmod +x prog1 prog2
$ ./prog1
hello

there
hello
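
for contrast, a small python sketch of function scope. shadow and rebind are names made up for the example: assigning to a name inside a function makes a new local unless you opt in with global.

```python
p = "hello"  # module-level (global) name

def shadow():
    p = "there"  # a new local p; the global p is untouched
    return p

def rebind():
    global p     # explicitly opt in to changing the global
    p = "changed"

print(shadow())  # there
print(p)         # hello
rebind()
print(p)         # changed
```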
so lets look at the main loop in pupraytally, and what it does is:

1. take input from stdin
2. ignore lines unless they contain a tab, and either: /mnt/pupray/fs/ or /mnt/pupray/sfs/

you can do this with regexes, but if you want to keep it basic you can do this in bash:

Code: Select all

| grep $'\t' | grep -E "/mnt/pupray/sfs/|/mnt/pupray/fs/"
we do this:

Code: Select all

files arrstdin
forin p files
    tab 9 chr
    ctab instr p tab
    csf  instr p "/mnt/pupray/sfs/"
    cf   instr p "/mnt/pupray/fs/"  plus csf
    iftrue ctab
        iftrue cf
not a shell language, eh? in python, this is:

Code: Select all

from sys import stdin
files = stdin.readlines()
for p in files:
    if "\t" in p:
        if "/mnt/pupray/sfs/" in p or "/mnt/pupray/fs/" in p:
and you could create a function around this if you prefer:

Code: Select all

function validline t
python
    if "\t" in t:
        if "/mnt/pupray/sfs/" in t or "/mnt/pupray/fs/" in t:
            return t
    return ""
    fig
    fig
now this:

Code: Select all

files arrstdin
forin p files
    tab 9 chr
    ctab instr p tab
    csf  instr p "/mnt/pupray/sfs/"
    cf   instr p "/mnt/pupray/fs/"  plus csf
    iftrue ctab
        iftrue cf
becomes simply:

Code: Select all

files arrstdin
forin p files
    each validline p
    iftrue each
note that indentation is not required in fig code, it is only required for inline python.
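
the regex route mentioned earlier could look something like this in python 3. its a sketch of the same filter, not code from pupraytally, and PATTERN is a name made up for the example:

```python
import re

# the lookahead requires a tab somewhere on the line;
# "s?fs" covers both /mnt/pupray/sfs/ and /mnt/pupray/fs/
PATTERN = re.compile(r"^(?=.*\t).*/mnt/pupray/s?fs/")

def validline(t):
    """Return the line if it passes the filter, otherwise an empty string."""
    return t if PATTERN.search(t) else ""

if __name__ == "__main__":
    print(repr(validline("12 abc\t/mnt/pupray/sfs/x.png")))  # kept
    print(repr(validline("12 abc /mnt/pupray/fs/x.png")))    # no tab, dropped
```

whether that reads better than two plain "in" tests is a matter of taste.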

continuing with our program:

Code: Select all

files arrstdin
forin p files

    each validline p
    iftrue each

        size getsize p
        filetype gettype p   
        now taldo filetype size
        fig

    next
so what does it do?

1. go through each line: forin p files
2. filter lines we dont want: iftrue each
3. get the filesize: size = getsize p (the equals is optional.)
4. get the file extension: filetype = gettype p
5. add the two to the dictionary: now = taldo filetype size (taldo doesnt return a value, so now is 0.)
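
the five steps above can be sketched in plain python 3 as one loop. this is not the pupraytally source: it assumes the size is the first space-separated field, the path follows the tab, and the tally is a running sum per extension. the sample lines are made up.

```python
import os

def tally(lines):
    """Return {extension: total bytes} for the lines worth counting."""
    totals = {}
    for line in lines:                       # step 1: go through each line
        line = line.rstrip("\n")
        if "\t" not in line:                 # step 2: filter lines we dont want
            continue
        if "/mnt/pupray/sfs/" not in line and "/mnt/pupray/fs/" not in line:
            continue
        size_field = line.partition(" ")[0]  # step 3: get the filesize
        if not size_field.isdigit():
            continue
        path = line.partition("\t")[2]       # step 4: get the file extension
        ext = os.path.splitext(path)[1]
        totals[ext] = totals.get(ext, 0) + int(size_field)  # step 5: add to the dictionary
    return totals

if __name__ == "__main__":
    pad = " " + "-" * 64 + " 2018-01-01 00:00:00\t"
    sample = [
        "100" + pad + "/mnt/pupray/fs/a.png",
        "250" + pad + "/mnt/pupray/sfs/b.png",
        "40" + pad + "/mnt/pupray/fs/c.txt",
        "999" + pad + "/root/elsewhere/d.png",  # wrong place, ignored
    ]
    # smallest total first; for real use, pass sys.stdin instead of sample
    for ext, total in sorted(tally(sample).items(), key=lambda kv: kv[1]):
        print(total, ext)
```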

and thats the program.

as it happens, this program is designed to process output from a routine i wrote in python called fsortplus. you can find that routine in pupray, which i put in a fig function here:

Code: Select all

function fsortplus sortlist
python
    import os
    from hashlib import sha256
    from datetime import datetime
    b = []
    outlist = []
    for p in sortlist:
        if len(p):
            try: fs = int(os.path.getsize(p))
            except: fs = -1
            try: s256 = sha256(open(p).read()).hexdigest()
            except: s256 = "-" * 64
            try: filetime = str(datetime.fromtimestamp(os.path.getmtime(p)))[0:19]
            except: filetime = -1
            try: b += [(fs, s256, filetime, p)]
            except: b += [(0, "problem", "with", "fsortplus")]
    b.sort()
    for p in b:
        tab = chr(9)
        try: outlist += [str(p[0]) + " " + p[1] + " " + p[2] + tab + p[3]] 
        except: outlist += ["-1" + chr(32) + "-" * 64 + chr(32) + "?" + chr(32) + "?" + chr(32) + p[3]]
    return outlist
    fig
    fig
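
if you want the same routine outside of fig, here is a standalone python 3 sketch of it. its a rewrite under a few assumptions, not the pupray code: files are opened in binary mode for hashing, only OSError is caught, and a "?" stands in when the timestamp cant be read (my own choice).

```python
import hashlib
import os
from datetime import datetime

def fsortplus(paths):
    """Return 'size sha256 mtime<TAB>path' lines, sorted by size."""
    rows = []
    for p in paths:
        if not p:
            continue  # skip empty entries
        try:
            size = os.path.getsize(p)
        except OSError:
            size = -1
        try:
            with open(p, "rb") as f:  # binary mode: hash the raw bytes
                digest = hashlib.sha256(f.read()).hexdigest()
        except OSError:
            digest = "-" * 64
        try:
            stamp = datetime.fromtimestamp(os.path.getmtime(p))
            mtime = stamp.strftime("%Y-%m-%d %H:%M:%S")
        except OSError:
            mtime = "?"
        rows.append((size, digest, mtime, p))
    rows.sort()  # tuples sort by size first
    return ["%d %s %s\t%s" % row for row in rows]
```

reading each file whole is fine for a downloads folder; for very large files you would want to feed the hash in chunks.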
how do we use that?

Code: Select all

filelist = "find /root/Downloads/ -type f" ; arrshell ; fsortplus filelist
the = and ; and ; are optional.

so that takes our simple find output (-type f can be important, the routine doesnt like /dev either) and turns it into this:

Code: Select all

1204 b192dfc6c0099bc5377abcd2e81332a875e42cee144812911f2032522ccf977a 2018-11-30 09:29:41	/root/Downloads/pupraytally.fig.gz
7184 919d050bd22a1abacff7caea2eb1227681f3967938736df681c8514c7a904910 2018-06-19 09:32:15	/root/Downloads/mkfigos28.fig.LICENSE
30561 6d7f5be260afedf4750e8297c77dcbc18d6732a36c0fb7708c6d0235e2f5b179 2018-10-31 07:00:08	/root/Downloads/pi.png
56686 7bd9820e4bb496aa357f94d54001449eaad80c64b0cbd83ee079d327907f2445 2018-07-19 18:32:35	/root/Downloads/mcorepup04.fig.gz
63284 d4ceb8b7beff7dfe4540bf8d14005a8e77d66c8c6ac563131bac03d8e78c2fec 2018-06-19 09:32:15	/root/Downloads/mkfigos29.fig
111296 7b44b61192960f2e833c76176ac19110aeb20daf1df91bdb514d0a12a3fca85d 2018-10-27 09:14:49	/root/Downloads/ckyf.png
183765 b6d37f93af94799f7240505271e019a667e957ce6b5b9ce024d84f01bf0b0c80 2018-09-21 03:18:37	/root/Downloads/fsf2.html
pupraytally is specific: it filters output from pupray, so it wont report anything for the listing above. if we change these lines in validline:

Code: Select all

    if "\t" in t:
        if "/mnt/pupray/sfs/" in t or "/mnt/pupray/fs/" in t:
to this:

Code: Select all

    if "\t" in t:
        if "/" in t:
now we can take our fsortplus output and our more general purpose version of tally will give this:

Code: Select all

56686 .gz
63284 .fig
111296 .png
183765 .html
so of those files, .html files are collectively the largest.

pupraytally is a bloat tallying routine.

my favourite bash features are:

$() and <() and |

i use bash to connect fig programs together, i use fig to run bash code.

but i try to use the code that will let me spend less time guessing whats wrong and more time getting things right the first try. pure bash? not the tool for me.

but if youre comfortable with it, if it is convenient enough for your uses, then bash is the right tool for you.
Attachments
bashlogo.sh.gz (4.41 KiB)
ptpb.pw seems to be down at the moment
The freedom to NOT run the software, to be free to avoid vendor lock-in through appropriate modularization/encapsulation and minimized dependencies; meaning any free software can be replaced with a user’s preferred alternatives.
