nubc

Joined: 23 Jan 2007 Posts: 1907 Location: USA
|
Posted: Thu 15 Aug 2013, 11:22
Post subject: ROX-filer alphabetization bug
Subject description: Christophe comes before Christoph
|
There may be other alphabetization bugs, but today I noticed this one: a name with a letter omitted does not sort before the otherwise identical longer name. For example, the name Christoph/Christophe is spelled at least two ways. ROX lists Christophe before Christoph. The way I learned it, this is incorrect, and in a long list the error could be significant.
Likewise, if two similar files are distinguished by Roman numerals I and II, the file designated II will come before the file designated I.
When alphabetizing, ROX-filer ignores any space and looks to the next letter.
|
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Fri 16 Aug 2013, 07:35
|
I presume you are not talking about "Christophe" and "Christoph", but "Christophe something" and "Christoph something" (well, I guess the "something" is optional; the real issue is whitespace characters, i.e. "Christoph ").
The problem is not with Rox. Try running ls in a terminal and files will be sorted in exactly the same way as in Rox:
Code: | # ls
Christoph
Christophe
Christoph e
Christoph.er
|
I agree - "Christoph e" should come before "Christophe". I also think "Christoph.er" should come before it. In fact, any punctuation mark should come before letters and numbers. Here is a better example of why it doesn't make sense:
Code: | test
test1
test.doc
test.txt |
The sort order is defined in your locale settings or something, and I believe you could modify it to make it sensible. But I've never got around to looking at how (if you do please let us know!). I've always wondered how the entire *nix world (well, at least the English speaking part) has gone for so long without recognising this insanity ...
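For what it's worth, the locale category that controls ordering appears to be LC_COLLATE. A minimal sketch, assuming a POSIX `sort` is available: forcing the C locale makes the comparison a plain byte comparison, and since "." (0x2E) has a lower byte value than any digit or letter, the extensions group together ahead of "test1". The function name `byte_order_sort` is made up for illustration.

```shell
# Sort lines from stdin by raw byte value, ignoring the user's locale.
# The LC_ALL=C prefix applies to this one sort invocation only.
byte_order_sort() {
  LC_ALL=C sort
}

# Prints: test, test.doc, test.txt, test1 (one per line)
printf '%s\n' test1 test.txt test test.doc | byte_order_sort
```

The later posts in this thread suggest the same prefix works for `ls` as well.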
_________________ If you have or know of a good gtkdialog application, please post a link here
Classic Puppy quotes
ROOT FOREVER
|
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Fri 16 Aug 2013, 07:41
|
To be sensible, things should sort like this:
punctuation characters first.
whitespace characters next.
numbers next.
letters last.
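The four rules above can be sketched as a filter, without touching any locale definition. This is only an illustration under stated assumptions: names are ASCII, contain no tab characters, and upper case still sorts before lower case. The name `sensible_sort` and the idea of tagging each character with a rank digit are mine, not something ROX or `ls` provides.

```shell
# Reorder lines from stdin as: punctuation, whitespace, digits, letters.
# Each character is prefixed with a rank digit (1-4) to build a sort key;
# the key is compared bytewise and then stripped off again.
# Assumptions: ASCII names, no tabs, case-sensitive (A-Z before a-z).
sensible_sort() {
  LC_ALL=C awk '{
    key = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c ~ /^[0-9]$/)         rank = 3   # numbers
      else if (c ~ /^[A-Za-z]$/) rank = 4   # letters
      else if (c == " ")         rank = 2   # whitespace
      else                       rank = 1   # anything else counts as punctuation
      key = key rank c
    }
    print key "\t" $0
  }' | LC_ALL=C sort | cut -f 2-
}

printf '%s\n' Christophe 'Christoph e' Christoph1er Christoph.er Christoph | sensible_sort
```

With the names used in this thread, the result is Christoph, Christoph.er, "Christoph e", Christoph1er, Christophe. The tab between key and name also sorts below every rank digit, which is what makes a shorter name come before a longer one that starts the same way.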
|
L18L
Joined: 19 Jun 2010 Posts: 3431 Location: www.eussenheim.de/
|
Posted: Fri 16 Aug 2013, 08:23
|
disciple wrote: | ...The sort order is defined in your locale settings or something, and I believe you could modify it to make it sensible. But I've never got around to looking at how (if you do please let us know!)... |
I have added Christoph1er to the example
Code: | # ls
Christoph Christoph1er Christophe Christoph e Christoph.er
#
# LC_COLLATE=/usr/lib/locale/en_US.utf8 ls
Christoph Christoph e Christoph.er Christoph1er Christophe
# |
|
musher0

Joined: 04 Jan 2009 Posts: 12076 Location: Gatineau (Qc), Canada
|
Posted: Fri 16 Aug 2013, 11:27
|
Thanks, guys, for this instructive thread.
_________________ musher0
~~~~~~~~~~
"Logical entities must not be multiplied beyond necessity." | |
« Il ne faut pas multiplier les entités logiques sans nécessité. » (Ockham)
|
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Sat 17 Aug 2013, 08:00
|
So what setting is being used when you don't set LC_COLLATE?
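As far as I can tell from POSIX (the environment variables chapter of the Base Definitions), the lookup order is LC_ALL first, then the category variable (LC_COLLATE here), then LANG, and only then an implementation-defined default. An empty value counts as unset. Below is a sketch of that precedence as a shell function. `effective_collate` is a made-up name, and reporting the last-resort default as "C" is an assumption, since POSIX leaves it to the implementation.

```shell
# Print the locale name that would govern collation, given the values of
# LC_ALL, LC_COLLATE and LANG (in that order) as arguments.
# Empty strings are treated as unset.
effective_collate() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"   # LC_ALL overrides everything
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"   # then the category variable itself
  elif [ -n "$3" ]; then
    printf '%s\n' "$3"   # then LANG
  else
    printf '%s\n' "C"    # assumed default; implementation-defined in POSIX
  fi
}

effective_collate "${LC_ALL:-}" "${LC_COLLATE:-}" "${LANG:-}"
```

Note that this only reports which name wins. It does not check that the name refers to a locale that is actually installed.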
|
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Sat 17 Aug 2013, 08:23
|
You might want to add more to the example to demonstrate case (in)sensitivity:
Code: | [root@archie tmp]# locale
LANG=en_NZ.UTF-8
LC_CTYPE="en_NZ.UTF-8"
LC_NUMERIC="en_NZ.UTF-8"
LC_TIME="en_NZ.UTF-8"
LC_COLLATE="en_NZ.UTF-8"
LC_MONETARY="en_NZ.UTF-8"
LC_MESSAGES="en_NZ.UTF-8"
LC_PAPER="en_NZ.UTF-8"
LC_NAME="en_NZ.UTF-8"
LC_ADDRESS="en_NZ.UTF-8"
LC_TELEPHONE="en_NZ.UTF-8"
LC_MEASUREMENT="en_NZ.UTF-8"
LC_IDENTIFICATION="en_NZ.UTF-8"
LC_ALL=
[root@archie tmp]# ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christoph.er
[root@archie tmp]# LC_COLLATE="" ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christoph.er
[root@archie tmp]# LC_COLLATE="en_NZ.utf8" ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christoph.er
[root@archie tmp]# LC_COLLATE="POSIX" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1
[root@archie tmp]# LC_COLLATE="C" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1
[root@archie tmp]# LC_COLLATE="UTF-8" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1
[root@archie tmp]# LC_COLLATE="en_NZ" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1
|
I'm assuming that since "en_NZ.UTF-8" gives the same result as "", and a different result from both en_NZ and UTF-8, it is either not valid or does not specify a collate order, and so falls back to the default.
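One way to probe that assumption is to compare a locale's ordering of a small sample against the C locale's. The sketch assumes `sort` honours LC_ALL and quietly falls back to byte order when it cannot load the named locale, which is what GNU sort appears to do. A match cuts both ways: the locale may genuinely collate like C, or the name may not be an installed locale at all. On that second reading, en_NZ and UTF-8 could be the ones falling back rather than en_NZ.UTF-8. `locale -a` lists the names that are actually installed. The helper name `collates_like_c` is made up.

```shell
# Succeed (exit 0) if sorting under locale "$1" gives the same order as the
# C locale for a small sample mixing case, a space, punctuation and a digit.
collates_like_c() {
  sample=$(printf '%s\n' christoph Christoph 'Christoph e' Christoph.er Christoph1er)
  under_test=$(printf '%s\n' "$sample" | LC_ALL="$1" sort 2>/dev/null)
  reference=$(printf '%s\n' "$sample" | LC_ALL=C sort)
  [ "$under_test" = "$reference" ]
}

# Report each candidate name; results for anything other than C and POSIX
# depend on which locales are installed on the machine.
for name in C POSIX en_NZ en_NZ.UTF-8; do
  if collates_like_c "$name"; then
    echo "$name: same order as C"
  else
    echo "$name: different order"
  fi
done
```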
|
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Sat 17 Aug 2013, 08:39
|
This doesn't really make sense, because people say to set LC_COLLATE="C" to make dotfiles sort at the beginning of directories, but setting it to anything else would also do this! (Though if a file name starts with a space, it will come before the dotfile anyway.)
|
disciple
Joined: 20 May 2006 Posts: 6781 Location: Auckland, New Zealand
|
Posted: Sat 17 Aug 2013, 08:48
|
Oh, actually I see whitespace is kind of ignored by default:
Code: | [root@archie tmp]# LC_COLLATE="" ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christopher
Christoph er
Christoph.er
[root@archie tmp]# LC_COLLATE="en_NZ" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph er
Christoph.er
Christoph1er
Christophe
Christopher
christoph
christoph1
|
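This looks consistent with how the glibc UTF-8 locales of this era seem to behave: on the first comparison pass, spaces and punctuation are skipped and case is not considered, and they only act as tie-breakers afterwards. A rough way to see that first-pass key is to strip everything that is not a letter or digit and fold case. This is only an approximation of the real collation rules, and `first_pass_key` is a made-up helper name.

```shell
# Approximate the first-pass collation key: drop everything except ASCII
# letters and digits, then fold to lower case.
# Reads names on stdin, one per line; newlines are preserved.
first_pass_key() {
  LC_ALL=C tr -cd 'A-Za-z0-9\n' | LC_ALL=C tr 'A-Z' 'a-z'
}

printf '%s\n' 'Christophe' 'Christoph e' 'Christopher' 'Christoph er' 'Christoph.er' | first_pass_key
```

"Christophe" and "Christoph e" both reduce to christophe, and the three "er" variants all reduce to christopher. That would explain why they end up next to each other in the default listing.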
|
L18L
Joined: 19 Jun 2010 Posts: 3431 Location: www.eussenheim.de/
|
Posted: Sat 17 Aug 2013, 09:20
|
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
http://www.dict.cc/deutsch-englisch/Da+steh+ich+nun+ich+armer+Tor++Und+bin+so+klug+als+wie+zuvor+%5BJ+W+v+Goethe+Faust+I%5D.html
|