Speeding up scripts
Yes, sed can produce some amazingly fast results. So much code uses multiple while read loops and pipes via cut and grep where a single sed call could do the whole job (via multiple commands given to the one sed instance). There are also many instances of sed output being piped into sed again, and sometimes yet again, when a single sed call (with multiple sed command instructions) would do.
Part of the problem is that many just know the simple basics of sed (mainly just s/old/new/ substitutions). It can do a lot more than that, and all on the one line... but there is a bit of a learning curve in understanding all the things it can do and how to do them all together in one sed call.
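A minimal sketch of the point - the file name and field layout here are invented for the example. First the multi-process pipeline, then one sed instance given multiple commands:
Code: Select all
# three processes and two pipes:
grep '^Name=' app.desktop | cut -d= -f2 | sed 's/ /_/g'

# one sed instance doing all three jobs (GNU/busybox sed):
sed -n '/^Name=/{s/^Name=//;s/ /_/g;p;}' app.desktop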
wiak
I only know what you guys have shared, but this seems legit:
David Butcher: Speeding Up Your UNIX Shell Scripts
http://www.los-gatos.ca.us/davidbu/faster_sh.html
And I am guilty of piping sed to sed, among a thousand other shell-related sins..
sc0ttman wrote: And I am guilty of piping sed to sed, among a thousand other shell-related sins..
Yes, me too, but it doesn't matter anyway as long as it's not in some long loop type situation - sometimes/often it's more important that the code is easy to read. Nice link about shell code speed-up, by the way.
I'm pretty sure jlst is constantly tuning woof-CE code to gradually speed it up, but there is a lot of code to work through - plenty of speed-up possible, I'm sure.
wiak
Yeah, that link is quite good.. Some stuff I never even considered..
I like the idea of replacing
Code: Select all
[ "$var1" = 'foo' -a "$var2" = 'bar' ] && echo blah
with
Code: Select all
case "$var1$var2" in
foobar) echo blah ;;
esac
Although readability is not great (imho)..
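For what it's worth, a rough way to time the two yourself in bash - the loop count is arbitrary and real numbers will vary by shell and machine:
Code: Select all
var1=foo var2=bar
time for i in $(seq 1 100000); do
    [ "$var1" = 'foo' -a "$var2" = 'bar' ] && :   # : is a no-op
done
time for i in $(seq 1 100000); do
    case "$var1$var2" in foobar) : ;; esac
done
One caveat with the case version: plain concatenation can match unintended combinations (var1=f with var2=oobar also gives foobar), so a separator is safer, e.g. case "$var1|$var2" in "foo|bar").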
Is this
Code: Select all
if "$var" = 'foo'; then
faster or (different at all) than this:
Code: Select all
if [ "$var" = 'foo']; then
?
- MochiMoppel
sc0ttman wrote: Is this
Code: Select all
if "$var" = 'foo'; then
faster or (different at all) than this:
Code: Select all
if [ "$var" = 'foo']; then
?
Both are incorrect. You mean which of the error messages you will receive is faster? (The first line executes the value of $var as a command; the second is missing the space before the closing ].)
MochiMoppel wrote: Both are incorrect. You mean which of the error messages you will receive is faster?
lol yeah fine, add the fixes mentally, I think u know what I mean..
Hi sc0ttman.
In the 1st issue of the Puppy Linux Newsletter, January of this year, I explained a number of tricks I used to very rapidly (+/- 1 second) create a +/- 16Kb wmx menu.
It boils down to:
-- use ash instead of bash if appropriate (ash does not have all of bash's
string manipulation capacities; bash may be faster than ash in such cases.)
-- use internal bash or ash commands as much as possible
-- take advantage of bash's fantastic string manipulation capacities: they are generally faster than using awk for the same result (see the sketch below)
-- sort lists and items that need sorting before processing them.
Remember that somewhere in a computer, the alphabetical and numerical
orders are still built-in. It is not just an historical thing. For example, if you
don't respect alphabetical-order processing when you can, expect to lose
precious time.
-- use the case...esac conditions structure whenever possible
-- use LC_ALL=C before any important non-linguistic processing and know where and when to release it with LC_ALL="" -- at the proper place in the processing. Otherwise, you'll get junk results. (The sketch below shows a per-command form that needs no release step.)
(By "non-linguistic processing", I mean any processing not based on
human language.)
With LC_ALL=C, the LANG variable remains untouched. You're only
suspending it for the time being.
LC_ALL=C suspends utf-8 and makes the raw bytes available for general processing, not just for language. So this multiplies the speed of your script
by a factor of 2 to 4. "Your script gets the whole boulevard to itself," in a
manner of speaking.
This will make a bash script approach C processing speed. But you have to
know when to release it, especially if you want to integrate human
languages other than bare-bones English. (No special characters
allowed.)
-- time the loops for speed. A "while read line; do" loop may be faster or slower than a "for i in bla ble bli; do" loop, depending on the material.
-- avoid writing to disk as much as possible. Prefer string manipulation. If
you need to write to disk, write in the same directory as the script. You'll
save only a millisecond, but they add up when using a loop.
-- use the most efficient logic for the problem at hand. You may gain speed
by changing the order of the "processing steps". This you learn by what I
call "living with the problem" for a little -- or a long! -- while.
Finally, please note that the above are "working notes" derived from my
personal experience. I know they work, but I'm just a "ground-hog" :
some "eagle" with general perspective will have to provide the theory of it.
IHTH.
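To make the string-manipulation and LC_ALL=C points concrete, a small sketch - the sample line and the file names are invented for the example:
Code: Select all
line='Name=Text Editor;Exec=geany;Categories=Utility'

# parameter expansion instead of spawning awk or cut:
field=${line#*Exec=}    # drop everything up to and including "Exec="
field=${field%%;*}      # drop everything from the first ";" onward
echo "$field"           # -> geany

# LC_ALL=C for a single command: LANG and the session locale stay
# untouched, so no release step is needed in this form
LC_ALL=C sort -u wordlist.txt > sorted.txt   # wordlist.txt is hypothetical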
musher0
~~~~~~~~~~
"You want it darker? We kill the flame." (L. Cohen)
- technosaurus
MochiMoppel wrote: Even when reading smaller files sed tends to be faster.
Not if you are running sed from inside a script, using a non-bash shell with LANG=C.
If memory serves me, you have made several localization contributions, so I am guessing you bothered to set $LANG and probably have /bin/sh as a link to bash since it minimizes problems with all the bashisms that riddle Puppy scripts.
I did a bunch of testing to figure out when to use read loops vs. sed, grep and awk, and IIRC the break-even point came out just short of 100 lines on average (depending on the shell).
FWIW, those timing values are quite suspect because of all the echoes, considering I can process all the desktop files in /usr/share/applications/ to generate my jwmrc file in about the same amount of time... but then again, I build a large string and print it once to a file instead of doing it one echo at a time to stdout. I guess that's another tip: printing to the console/terminal is slow, so batch writes up if possible.
Edit - for clarification
ex: instead of having echo "$line" inside a while loop, use
Code: Select all
OUTPUT="$OUTPUT
$line" #note: some shells have a faster string concatenation operator
This is because every write to stdout/tty/etc... takes a (variably) long time, so only output when necessary and do as much of it in one go as possible.
I could explain why this is but it gets a bit off topic (filesystems, kernel syscalls and the C interface)
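A fuller sketch of that pattern with a hypothetical input file - all of the loop's output lands in one final write instead of one write per line:
Code: Select all
OUTPUT=''
while read -r line; do
    OUTPUT="$OUTPUT$line
"                            # append line plus newline; bash also offers OUTPUT+=
done < input.txt             # input.txt is hypothetical
printf '%s' "$OUTPUT" > result.txt   # one buffered write at the end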