Displaying text correctly in the terminal (no X)

d2390576
Posts: 1
Joined: Tue 30 Oct 2012, 22:37

#1 Post by d2390576 »

I like to do a lot of ebook reading with txt files on my laptop. Usually when I boot up, I don't bother loading X if all I'm going to do is read. I notice that some text files display punctuation incorrectly in vi, mp, and using the less command. I've tried switching encoding using Geany and even Abiword to US ASCII, UTF-8, and others with no effect.

However, starting X and opening the text files in Geany, Leafpad, or Abiword makes them display correctly. I've also tried viewing the files from the command line in another distro (SLAX), but they are corrupted there too, which makes me think this is unique to Linux. Does anyone have any experience with this?
Attachments
punctuation_errors.png
Showing the difference between how Geany displays quotes and Vi

npierce
Posts: 858
Joined: Tue 29 Dec 2009, 01:40

#2 Post by npierce »

Hi d2390576,

Welcome to the forum.

As someone who also spends a fair bit of time in the text console, I can certainly appreciate your desire to have text displayed properly.

By providing the output from vi, you have shown that the pg108.txt file doesn't use an ASCII quotation mark (unless your version of vi or your font is seriously broken, which is unlikely). The dot is what is displayed for an unsupported character.

In order to determine what is going wrong, we first need to find out which code the text file is using for a quotation mark. Other common codes for quotation marks are Unicode U+201C and U+201D (which would be encoded in a UTF-8 text file as three-byte sequences: "\xe2\x80\x9c" and "\xe2\x80\x9d"), and Windows-1252 character codes '\x93' and '\x94'. But those codes are all for separate opening and closing quotation marks, yet you show that Geany only displays a neutral quotation mark, so I'm guessing it is something else. Also, if it were UTF-8, I would expect to see three dots (one for each byte).
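For reference, here is a minimal shell sketch that prints the byte values of the quotation marks mentioned above. It assumes a POSIX printf, od, tr, and sed are available; hex_bytes is just an illustrative helper name, not a standard tool. The bytes are written with octal escapes because \x escapes are not portable in printf.

```shell
# hex_bytes (hypothetical helper): show the bytes on stdin as space-separated hex values
hex_bytes() {
    od -An -v -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Plain ASCII neutral quotation mark: one byte
printf 'ASCII quote:                %s\n' "$(printf '\042' | hex_bytes)"

# U+201C and U+201D encoded as UTF-8: three bytes each
printf 'UTF-8 opening quote:        %s\n' "$(printf '\342\200\234' | hex_bytes)"
printf 'UTF-8 closing quote:        %s\n' "$(printf '\342\200\235' | hex_bytes)"

# Windows-1252 opening and closing quotes: one byte each
printf 'Windows-1252 opening quote: %s\n' "$(printf '\223' | hex_bytes)"
printf 'Windows-1252 closing quote: %s\n' "$(printf '\224' | hex_bytes)"
```

Comparing these values with the hexdump of the file should show which of these encodings, if any, is in use.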

So it's a bit of a mystery. I downloaded three different versions of The Adventure Of The Empty House from Project Gutenberg, but all had standard ASCII quotation marks. I didn't find the version that you have.

Let's have a look at your file. Please go to the directory with the text file, run this command (it prints lines 227 and 228 of the file as a hex dump):

Code:

tail -n +227 pg108.txt | head -n 2 | hexdump -C
and post the results. It should provide output similar to this (except the values for the quotation marks will be different):

Code:

00000000  22 59 6f 75 27 72 65 20  73 75 72 70 72 69 73 65  |"You're surprise|
00000010  64 20 74 6f 20 73 65 65  20 6d 65 2c 20 73 69 72  |d to see me, sir|
00000020  2c 22 20 73 61 69 64 20  68 65 2c 20 69 6e 20 61  |," said he, in a|
00000030  20 73 74 72 61 6e 67 65  2c 20 63 72 6f 61 6b 69  | strange, croaki|
00000040  6e 67 0d 0a 76 6f 69 63  65 2e 0d 0a              |ng..voice...|
0000004c
Alternatively (or additionally), if you remember where you got the file, you could provide a link to it.
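If the quotation marks turn out to be on different lines in your copy of the file, a rough sketch like the following can list every non-ASCII byte value in the file along with how often it occurs. It assumes tr, od, grep, sort, and uniq are available; list_non_ascii is only an illustrative name.

```shell
# list_non_ascii FILE (hypothetical helper): count each distinct byte value above 0x7f
list_non_ascii() {
    # Delete all ASCII bytes (octal 000-177), then dump what is left as hex,
    # one value per line, and count the duplicates.
    LC_ALL=C tr -d '\000-\177' < "$1" \
        | od -An -v -tx1 \
        | tr -s ' ' '\n' \
        | grep . \
        | sort \
        | uniq -c
}

# Example use (run in the directory that holds the file):
#   list_non_ascii pg108.txt
```

Each output line is a count followed by a hex byte value. No output at all would mean the file is pure ASCII.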

Also, it never hurts to mention the version of Puppy that you are using, which will help others when trying to reproduce the behavior you are seeing.
