Puppy Linux Discussion Forum
Puppy HOME page : puppylinux.com
"THE" alternative forum : puppylinux.info
 
All times are UTC - 4
Creating an ASCII table
Page 1 of 2 [23 Posts]
MochiMoppel


Joined: 26 Jan 2011
Posts: 1925
Location: Japan

Posted: Sun 29 Sep 2019, 23:05    Post subject: Creating an ASCII table

My next update of MMview will add more viewing options for binary files. Dealing with binaries involves identifying control characters and hex values, and I have often felt the need to consult an ASCII map to find out what a specific hex value represents.

It's easy to find ASCII tables online but it seems that Puppy doesn't include an offline version, so I wrote one myself.

I started with a table generated with awk, then found that older Puppies ship /usr/share/cups/charmaps/windows-1252.txt and newer Puppies ship /usr/lib/aspell/cp1252.cset. Both files include a description for each character (though not the characters themselves). They also contain the Unicode codepoints for the characters in the extended ASCII range 0x80-0x9F of codepage 1252. These codepoints do not correspond to the byte values in this range and therefore can't easily be generated otherwise.
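
To illustrate the mismatch, here is a minimal sketch, assuming an iconv that knows CP1252. The helper name cp1252_byte_to_utf8_hex is made up for this example. It feeds a single CP1252 byte through iconv and prints the resulting UTF-8 bytes in hex:

```shell
#!/bin/sh
# Hypothetical helper: convert one CP1252 byte to UTF-8 and show the result in hex.
# The byte is given as three octal digits because POSIX printf has no \x escape.
cp1252_byte_to_utf8_hex() {
    printf "\\$1" | iconv -f CP1252 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

# Byte 0x80 (octal 200) is the EURO SIGN in CP1252, i.e. U+20AC rather than U+0080.
cp1252_byte_to_utf8_hex 200; echo    # prints e282ac, the UTF-8 encoding of U+20AC
```

The byte value 0x80 and the codepoint U+20AC have no arithmetic relationship, which is why a lookup table is needed for this range.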

I intend to integrate the following script into MMview, but before I do I would like to know

1) if it contains unexpected bugs
2) if it could possibly be simplified
3) if either one of the 2 files I mentioned is indeed present in all Puppies, which would eliminate the need for the fall-back option /tmp/asciimap.tmp

Thanks.

Code:
#!/bin/bash
LOCATION_1=/usr/share/cups/charmaps/windows-1252.txt
LOCATION_2=/usr/lib/aspell/cp1252.cset
LOCATION_3=/usr/lib64/aspell/cp1252.cset
if   [[ -s $LOCATION_1 ]];then CHARMAP=$LOCATION_1
elif [[ -s $LOCATION_2 ]];then CHARMAP=$LOCATION_2
elif [[ -s $LOCATION_3 ]];then CHARMAP=$LOCATION_3
else #may never be needed as one of above should always be present
    CHARMAP=/tmp/asciimap.tmp
    awk '
    BEGIN {
    for (i=0  ;i<=127;i++) printf "%02X\t%04X\t%s\n",i,i,"#"
    for (i=128;i<=159;i++) printf "%02X\t%04X_<control_character>\t%s\n",i,i,"#"
    for (i=160;i<=255;i++) printf "%02X\t%04X\t%s\n",i,i,"#"
    }' > $CHARMAP
fi
# strtonum() below is a gawk extension, so call gawk explicitly
LC_ALL=C gawk '
BEGIN {
hline="---------------------------------------------------------"
print "=========== ASCII Table ==========\n\nDEC\tHEX\tCHR\tCODEPNT NAME\n"hline
}
/^[0-9A-F]/{                        # process only lines starting with hex characters
text=substr($0, match($0,/#/)+1)    # extract from position of first encountered #, plus 1 to excl. #
sub(/^ /,"",text)                   # remove leading space from text (cp1252.cset)
$1=substr($1, match($1,/x/)+1)      # remove 0x from 0xnn (windows-1252.txt) or keep nn (cp1252.cset)
dec=strtonum("0x" $1)               # hex to dec
cpnt="U+"substr($2, match($2,/x/)+1)# remove 0x from 0xnnnn (windows-1252.txt) or keep nnnn (cp1252.cset)
if (cpnt ~ /#/) cpnt="U+00"$1       # if cpnt rendered as U+#UNDEFINED (windows-1252.txt)
if (dec<32||dec==127) char=""; else char=sprintf("%c",dec)
if (dec==7) text="BELL (esc \\a)"
if (dec==8) text="BACKSPACE (esc \\b)"
if (dec==9) {text="HORIZONTAL TABULATION (esc \\t)";char="TAB"}
if (dec==10)    {text="LINE FEED (esc \\n)";char="LF" }
if (dec==11)    text="VERTICAL TABULATION (esc \\v)"
if (dec==12)    text="FORM FEED (esc \\f)"
if (dec==13)    {text="CARRIAGE RETURN (esc \\r)";char="CR"}
if (dec==32)  print hline"\nPrintable ASCII\n"hline
if (dec==128) print hline"\nExtended ASCII (example: codepage 1252)\n"hline
printf "%03d\t%s\t%s\t%s\t%s\n" ,dec,$1,char,cpnt,text
}' "$CHARMAP" | iconv -c -f CP1252 -t UTF-8 | gxmessage -file -


EDIT: Tentatively added LOCATION_3 for Fatdog users.
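
The prefix-stripping idiom in the script above (match() returns 0 when there is no "x", so substr() then keeps the whole field) can be tried in isolation. This is only a sketch: the two sample lines are made up to mimic the two layouts described in the script's comments and are not copied from the real files, and strip_prefix_demo is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical demo of the match()/substr() idiom on two assumed line layouts.
strip_prefix_demo() {
    awk '{
        hex  = substr($1, match($1, /x/) + 1)         # "0x80" -> "80", "80" stays "80"
        cpnt = "U+" substr($2, match($2, /x/) + 1)    # "0x20AC" or "20AC" -> "U+20AC"
        text = substr($0, match($0, /#/) + 1)         # everything after the first #
        sub(/^ /, "", text)                           # drop one leading space, if any
        print hex, cpnt, text
    }'
}

# Both made-up sample lines produce the same output line.
printf '%s\n' '0x80 0x20AC #EURO SIGN' '80 20AC # EURO SIGN' | strip_prefix_demo
```

Only POSIX awk features are used here, so this part does not depend on gawk.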

.
Screenshot.png
 Description   Screenshot shows output using /usr/share/cups/charmaps/windows-1252.txt
Output using /usr/lib/aspell/cp1252.cset would show different descriptions


Last edited by MochiMoppel on Fri 04 Oct 2019, 23:06; edited 3 times in total
williams2

Joined: 14 Dec 2018
Posts: 190

Posted: Sun 29 Sep 2019, 23:50

Quote:
it seems that Puppy doesn't include an offline version

BionicPup64 has /usr/local/bin/ascii.sh
It displays using gtk_text_info

The bash script is modified from a script here:
http://tldp.org/LDP/abs/html/asciitable.html
There is an awk script on that web page, too.
tallboy


Joined: 21 Sep 2010
Posts: 1538
Location: Drøbak, Norway

Posted: Mon 30 Sep 2019, 00:09

Hi MochiMoppel.
In both my Dpup Stretch-7.5 RC4, and Tahr64-6.0.6-uefi, there is no /usr/share/cups/charmaps/windows-1252.txt. There is no directory /usr/share/cups/charmaps/ at all, but there is one named /charset/.
The windows-1252.txt file is not found by pFind in any system file.

The file /usr/lib/aspell/cp1252.cset is there, in both Puppys.

Dpup Stretch-7.5 RC4 has /usr/local/bin/ascii.sh

_________________
True freedom is a live Puppy on a multisession CD/DVD.
step

Joined: 04 May 2012
Posts: 1220

Posted: Mon 30 Sep 2019, 18:18

Hi MochiMoppel,

Fatdog64 has /usr/lib64/aspell/cp1252.cset but no /usr/share/cups/charmaps. There's /usr/share/cups/charsets but the files in there define font mappings. Perhaps the getunimap command might be of help:
> getunimap - dump the unicode map for the current console to stdout

_________________
Fatdog64-802|+Packages|Kodi|Findnrun|+forum|gtkmenuplus
some1

Joined: 17 Jan 2013
Posts: 104

Posted: Mon 30 Sep 2019, 20:51

On a decent distro :)

/usr/share/i18n/charmaps

-Contains even an edition of all unicodes
..
The compression used may vary between distros.
MochiMoppel


Joined: 26 Jan 2011
Posts: 1925
Location: Japan

Posted: Mon 30 Sep 2019, 22:56

@williams2, @tallboy: Thanks for pointing me to ascii.sh. Not exactly what I was looking for, but good to know that an ASCII table exists in some Puppies.

williams2 wrote:
The bash script is modified from a script here:
http://tldp.org/LDP/abs/html/asciitable.html
There is an awk script on that web page, too.
Makes me wonder why the awk script hasn't been used for the modification. Would be about 20 times faster.

@tallboy: I expect that only one of the 2 files would be present but not both. I edited my post to make it clearer (?).

@step: So the answer to my question 3) is no? Would the script work for you when replacing /usr/lib/aspell/cp1252.cset with /usr/lib64/aspell/cp1252.cset ?
On my system there is no getunimap command.

@some1: /usr/share/i18n/charmaps doesn't seem to contain charmap 1252, which is strange since 1252 is said to be the most widely used charmap. Though ISO-8859-1 comes close, it is not the same.
As you already mentioned, differing compression methods could be another show stopper. It would be interesting to know whether such differences really exist between distros.
some1

Joined: 17 Jan 2013
Posts: 104

Posted: Tue 01 Oct 2019, 00:14

https://en.wikipedia.org/wiki/Windows-1252

Yes I know about the differences.

The € might be handy, but who needs the flyspecks? :)

One may note that 2% seem to like the ISO, 6% the Windows codepage;
the rest can do without.

Anyway, nice awk and idea.
MochiMoppel


Joined: 26 Jan 2011
Posts: 1925
Location: Japan

Posted: Tue 01 Oct 2019, 06:54

some1 wrote:
One may note that 2% seem to like the ISO, 6% the Windows codepage;
the rest can do without.
According to your link it's even much less. UTF-8 rulez!
Nevertheless, when searching for ways to display old MS Word or WordPerfect documents I found that the "body texts" of such documents often contain strange non-ASCII characters, represented in hexdump as periods. That's where your flyspecks come into play. All sorts of curly, left, right and who knows what other quotation marks were in use, and cp1252 helps to identify them.
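
As a rough sketch of how one might find out which of those bytes a document actually contains (so they can be looked up in the table), the following lists the distinct byte values from 0x80 upward in a file. The function name high_bytes and the file name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct non-ASCII byte values (hex) found in a file.
high_bytes() {
    LC_ALL=C od -An -v -tx1 "$1" |         # dump every byte as two lowercase hex digits
        tr -s ' ' '\n' |                   # one byte per line
        grep -E '^[89a-f][0-9a-f]$' |      # keep only 0x80..0xff
        sort -u                            # each value once
}

# Usage (hypothetical file name):
#   high_bytes old-document.doc
```

Each value printed can then be looked up in the HEX column of the table.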
rufwoof


Joined: 24 Feb 2014
Posts: 3610

Posted: Tue 01 Oct 2019, 20:39

some1 wrote:
The € might be handy

On my UK laptop keyboard the € is AltGr 4 (AltGr being the Alt key to the right of the spacebar). I used to almost never use AltGr other than for the €, and even then very rarely. Nowadays, however, I use AltGr SPACE regularly, as both that and the regular Alt SPACE launch my program launcher (xlunch). WIN SPACE launches the skippy-xd (live) window selector. jwmrc snippet:
Code:
    <Key mask="C" key="Down">exec:amixer -c 1 set Master 2%- </Key>
    <Key mask="C" key="Up">exec:amixer set -c 1 Master 2%+ </Key>
    <Key mask="C" key="0">exec:amixer -c 1 sset Master,0 toggle </Key>
    <Key mask="4" key="space">exec:skippy-xd</Key>
    <Key mask="A" key="space">exec:/usr/local/bin/xlunch-show.sh</Key>
    <Key mask="5" keycode="65">exec:/usr/local/bin/xlunch-show.sh</Key> # AltGr space

Those key combinations for window selecting and program launching, alongside the touchpad, fit well for me. I more often use them with the arrow keys for program selecting than I use the touchpad. They also blend well, IMO, with the laptop's rightmost vertical strip of keys for jumping to the top of a web page (HOME), the bottom (END) and Page Up/Down (or arrow up/down).

_________________
( ͡° ͜ʖ ͡°) :wq
Fatdog multi-session usb

echo url|sed -e 's/^/(c/' -e 's/$/ hashbang.sh)/'|sh
MochiMoppel


Joined: 26 Jan 2011
Posts: 1925
Location: Japan

Posted: Thu 03 Oct 2019, 23:05

No response from step yet, so maybe a fatdog user can answer my question to him.
I don't know fatdog, but in another 64bit Puppy (bionicpup64-8.0) the directory /usr/lib64/aspell is symlinked to /usr/lib/aspell, so in bionicpup64-8.0 there shouldn't be any problems. Same as in fatdog?
rufwoof


Joined: 24 Feb 2014
Posts: 3610

Posted: Fri 04 Oct 2019, 09:07

Fatdog ...

# pwd
/usr/lib64
# ls -l aspell
lrwxrwxrwx 1 root root 11 Jul 31 19:19 aspell -> aspell-0.60

... and ...

/usr/lib has no aspell entry at all

_________________
( ͡° ͜ʖ ͡°) :wq
Fatdog multi-session usb

echo url|sed -e 's/^/(c/' -e 's/$/ hashbang.sh)/'|sh
MochiMoppel


Joined: 26 Jan 2011
Posts: 1925
Location: Japan

Posted: Fri 04 Oct 2019, 22:39

Thanks. I added LOCATION_3 to the above script and *assume* that this will work for Fatdog users. I can't test it.
If there are no other locations in the Puppy universe I can remove the fall-back solution.
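
If more locations do turn up, the if/elif chain could be replaced by a loop over candidates. A minimal sketch of that idea follows; the function name first_nonempty_file is made up, and the three paths are the ones discussed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that names an existing, non-empty file.
first_nonempty_file() {
    for f in "$@"; do
        if [ -s "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1    # none of the candidates exists
}

# Candidate locations mentioned in this thread; add further ones as they are reported.
CHARMAP=$(first_nonempty_file \
    /usr/share/cups/charmaps/windows-1252.txt \
    /usr/lib/aspell/cp1252.cset \
    /usr/lib64/aspell/cp1252.cset) ||
    echo "no charmap found, a generated fall-back would be needed" >&2
```

Adding a new location then only means adding one line to the candidate list.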
rufwoof


Joined: 24 Feb 2014
Posts: 3610

Posted: Sat 05 Oct 2019, 08:46

In my (wiak's build scripts) voidlinux, aspell is in /sbin

charmaps are in /usr/share/i18n/charmaps

_________________
( ͡° ͜ʖ ͡°) :wq
Fatdog multi-session usb

echo url|sed -e 's/^/(c/' -e 's/$/ hashbang.sh)/'|sh
step

Joined: 04 May 2012
Posts: 1220

Posted: Sat 05 Oct 2019, 18:10

Hi MochiMoppel, I tested the script with LOCATION_3 in Fatdog64 and it works. Thanks.
_________________
Fatdog64-802|+Packages|Kodi|Findnrun|+forum|gtkmenuplus
MochiMoppel


Joined: 26 Jan 2011
Posts: 1925
Location: Japan

Posted: Mon 07 Oct 2019, 09:39

@rufwoof: "aspell is in /sbin"? :o Are you sure you mean the directory aspell and not the executable file with the same name?

@step: Thanks for testing.

some1 wrote:
On a decent distro :)
/usr/share/i18n/charmaps

Seems that at least the location is consistent in all Puppies. Let's give it a try. I adapted the script and I'm surprised that, despite the decompression involved, it's pretty fast. As expected, not all charmaps are supported by iconv (at least not in my iconv version). I tested all of them and listed those that work/don't work with the script here:
Code:
File                works
---------------------------
ANSI_X3.4-1968.gz   yes
CP737.gz            no
CP775.gz            no
IBM437.gz           no
IBM850.gz           yes
IBM852.gz           no
IBM855.gz           no
IBM857.gz           no
IBM860.gz           no
IBM861.gz           no
IBM862.gz           no
IBM863.gz           no
IBM865.gz           no
IBM866.gz           no
IBM866NAV.gz        no
IBM869.gz           no
ISO-8859-1.gz       yes
ISO-8859-2.gz       yes
ISO-8859-2.gz       yes
ISO-8859-10.gz      yes
ISO-8859-11.gz      yes
ISO-8859-13.gz      yes
ISO-8859-14.gz      yes
ISO-8859-15.gz      yes
ISO-8859-16.gz      yes
UTF-8.gz            n.a.

My favorite is IBM850 because it's the only one with printable characters in hex range 80~9F.
ISO-8859-15 and ISO-8859-16 include the EURO sign.
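
One plausible way to produce such a list automatically is to ask the local iconv whether it accepts each charmap name as a source encoding. This is a sketch only: it may not be the exact criterion used for the list above, and the function name iconv_supports is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the local iconv accepts $1 as a source encoding.
# Empty input is used, so only the encoding name is checked, not any actual data.
iconv_supports() {
    iconv -f "$1" -t UTF-8 </dev/null >/dev/null 2>&1
}

for f in /usr/share/i18n/charmaps/*.gz; do
    [ -e "$f" ] || continue               # directory missing or no .gz files
    name=$(basename "$f" .gz)
    if iconv_supports "$name"; then
        echo "$name yes"
    else
        echo "$name no"
    fi
done
```

The loop assumes gzip-compressed charmaps, as in the file list above. Other distros may use a different compression and therefore a different suffix.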

I hope that this script works in all Puppies:
Code:
#!/bin/bash
CODEPAGE=IBM850
FONTSIZE=10

gzip -cd /usr/share/i18n/charmaps/$CODEPAGE |  LC_ALL=C gawk -v cp=$CODEPAGE '
BEGIN {
hline="---------------------------------------------------------"
print "======== CODEPAGE "cp" =======\n\nDEC\tHEX\tCHR\tCODEPNT NAME\n"hline
}
/^<U.*> /{
utf="U+"substr($1,3,4)
hex=substr($2,3)
dec=strtonum("0x" hex)
txt=substr($0,index($0,$3))
if (dec<32||dec==127) char=""; else char=sprintf("%c",dec)
if (dec==9)  char="TAB"
if (dec==10) char="LF"
if (dec==13) char="CR"
if (dec==32)  print hline"\nPrintable ASCII\n"hline
if (dec==128) print hline"\nExtended  ASCII\n"hline
printf "%03d\t%s\t%s\t%s\t%s\n",dec,hex,char,utf,txt
}' | iconv -c -f $CODEPAGE -t UTF-8 2>&1 | gxmessage -title "CODEPAGE $CODEPAGE" -c -fn $FONTSIZE -file -
EDIT1: Changed awk to gawk and /<U.*> /{ to /^<U.*> /{
EDIT2: Added LC_ALL=C
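
strtonum() is a gawk extension, which is presumably why the script needs gawk. For systems where only another awk is available, the hex conversion could be done with a small user-defined function instead. The following is a sketch under that assumption, not part of the script above; the name hex2dec is made up.

```shell
#!/bin/sh
# Hypothetical replacement for strtonum("0x" hex), using only POSIX awk features.
hex2dec() {
    awk -v hex="$1" '
    function hex2dec(h,    i, n, d) {
        h = toupper(h)
        n = 0
        for (i = 1; i <= length(h); i++) {
            d = index("0123456789ABCDEF", substr(h, i, 1))   # 1..16, or 0 if not a hex digit
            if (d == 0) return -1
            n = n * 16 + d - 1
        }
        return n
    }
    BEGIN { print hex2dec(hex) }'
}

hex2dec 82    # prints 130
```

Inside the main awk program the same function could be called as dec=hex2dec(hex) in place of the strtonum() line.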
Screenshot.png


Last edited by MochiMoppel on Tue 08 Oct 2019, 10:39; edited 2 times in total