ROX-filer alphabetization bug

nubc
Posts: 2062
Joined: Tue 23 Jan 2007, 18:41
Location: USA

#1 Post by nubc »

There may be other alphabetization bugs, but today I noticed this one: a name that is missing a final letter does not sort before the longer, otherwise identical name. For example, the name Christoph/Christophe is spelled at least two ways. ROX orders Christophe before Christoph. The way I learned it, this is incorrect: the shorter name should come first. In a long list, this error could be significant.

Likewise, if two otherwise similar files are distinguished by the Roman numerals I and II, the file designated II comes before the file designated I.

When alphabetizing, ROX-filer ignores any space and looks to the next letter.
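
A minimal sketch of that behaviour, assuming the comparison simply drops everything except letters and digits and folds case before comparing. The function name sort_ignoring_punctuation and the sample names are made up for illustration; this is not how ROX-filer is actually implemented.

```shell
# Hypothetical illustration: sort lines by a key that keeps only ASCII
# letters and digits (lower-cased), roughly mimicking a comparison that
# skips spaces and punctuation. Assumes names contain no tab characters.
sort_ignoring_punctuation() {
    LC_ALL=C awk '{
        key = $0
        gsub(/[^A-Za-z0-9]/, "", key)      # drop spaces and punctuation
        print tolower(key) "\t" $0         # key, TAB, original name
    }' |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
    cut -f2-
}

printf '%s\n' 'Christoph Smith' 'Christophe Smith' | sort_ignoring_punctuation
printf '%s\n' 'Report I.txt' 'Report II.txt' | sort_ignoring_punctuation
```

With the space skipped, "Christoph Smith" is compared as "christophsmith" and "Christophe Smith" as "christophesmith", so the "e" beats the "s" and Christophe lands first. The same reasoning puts "Report II.txt" ahead of "Report I.txt".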

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#2 Post by disciple »

I presume you are not talking about "Christophe" and "Christoph", but "Christophe something" and "Christoph something" (well, I guess the "something" is optional; the real issue is whitespace characters, i.e. "Christoph ").
The problem is not with ROX. Try running ls in a terminal and the files will be sorted in exactly the same way as in ROX:

# ls
Christoph
Christophe
Christoph e
Christoph.er

I agree: "Christoph e" should come before "Christophe". I also think "Christoph.er" should come before it. In fact, any punctuation mark should come before letters and numbers. Here is a better example of why the current order doesn't make sense:

test
test1
test.doc
test.txt

The sort order is defined in your locale settings or something, and I believe you could modify it to make it sensible. But I've never got around to looking at how (if you do please let us know!). I've always wondered how the entire *nix world (well, at least the English-speaking part) has gone for so long without recognising this insanity :) ...
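
One commonly suggested way to get punctuation ahead of letters and digits is plain byte order, which is what the "C" (or "POSIX") locale uses. A small sketch using sort instead of ls, so it does not depend on which files exist (the helper name c_order is only for illustration):

```shell
# Sort lines in plain byte order by forcing the C locale for this one command.
c_order() {
    LC_ALL=C sort
}

printf '%s\n' test test1 test.doc test.txt | c_order
```

In byte order "." (46) sorts before "1" (49), so test.doc and test.txt land before test1. The trade-off is that all uppercase letters then sort before all lowercase ones.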
Do you know a good gtkdialog program? Please post a link here

Classic Puppy quotes

ROOT FOREVER
GTK2 FOREVER

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#3 Post by disciple »

To be sensible, things should sort like this:
punctuation characters first.
whitespace characters next.
numbers next.
letters last.
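
A rough sketch of that ordering, assuming ASCII names without tab characters and assuming letters should compare case-insensitively (the post does not say how case should be handled). The function name sort_by_class is hypothetical; no standard locale is claimed to behave this way.

```shell
# Hypothetical sketch: rank every character by class before comparing.
#   0 = punctuation/other, 1 = space, 2 = digit, 3 = letter
# Each character becomes "<rank><char>", so a lower-ranked class always wins
# and a name that is a prefix of another sorts first.
sort_by_class() {
    LC_ALL=C awk '{
        key = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c ~ /[A-Za-z]/)      rank = 3
            else if (c ~ /[0-9]/)    rank = 2
            else if (c == " ")       rank = 1
            else                     rank = 0
            key = key rank tolower(c)
        }
        print key "\t" $0              # key, TAB, original name
    }' |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
    cut -f2-
}

printf '%s\n' Christophe Christoph1er Christoph 'Christoph e' Christoph.er |
    sort_by_class
```

For the names used in this thread the result is Christoph, Christoph.er, Christoph e, Christoph1er, Christophe.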

L18L
Posts: 3479
Joined: Sat 19 Jun 2010, 18:56
Location: www.eussenheim.de/

#4 Post by L18L »

disciple wrote: ...The sort order is defined in your locale settings or something, and I believe you could modify it to make it sensible. But I've never got around to looking at how (if you do please let us know!)...

I have added Christoph1er to the example:

# ls
Christoph  Christoph1er  Christophe  Christoph e  Christoph.er
# 
# LC_COLLATE=/usr/lib/locale/en_US.utf8 ls
Christoph  Christoph e  Christoph.er  Christoph1er  Christophe
# 

musher0
Posts: 14629
Joined: Mon 05 Jan 2009, 00:54
Location: Gatineau (Qc), Canada

#5 Post by musher0 »

Thanks, guys, for this instructive thread.
musher0
~~~~~~~~~~
"You want it darker? We kill the flame." (L. Cohen)

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#6 Post by disciple »

So what setting is being used when you don't set LC_COLLATE?
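
Under the usual POSIX precedence, the collation locale comes from the first of LC_ALL, LC_COLLATE and LANG that is set to a non-empty value. A sketch that applies that rule to the current environment (the function name effective_collate is made up; the final fallback is implementation-defined, and printing "C" there is an assumption that matches glibc):

```shell
# Report which locale name governs collation, following the precedence
# LC_ALL > LC_COLLATE > LANG. Empty values are treated as unset.
effective_collate() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_COLLATE:-}" ]; then
        printf '%s\n' "$LC_COLLATE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' "C"      # assumed default when nothing is set
    fi
}

echo "Collation locale in effect: $(effective_collate)"
```

So with LC_COLLATE unset or empty and LC_ALL empty, the value of LANG is what decides the order.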

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#7 Post by disciple »

You might want to add more to the example to demonstrate case (in)sensitivity:

[root@archie tmp]# locale
LANG=en_NZ.UTF-8
LC_CTYPE="en_NZ.UTF-8"
LC_NUMERIC="en_NZ.UTF-8"
LC_TIME="en_NZ.UTF-8"
LC_COLLATE="en_NZ.UTF-8"
LC_MONETARY="en_NZ.UTF-8"
LC_MESSAGES="en_NZ.UTF-8"
LC_PAPER="en_NZ.UTF-8"
LC_NAME="en_NZ.UTF-8"
LC_ADDRESS="en_NZ.UTF-8"
LC_TELEPHONE="en_NZ.UTF-8"
LC_MEASUREMENT="en_NZ.UTF-8"
LC_IDENTIFICATION="en_NZ.UTF-8"
LC_ALL=


[root@archie tmp]# ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christoph.er

[root@archie tmp]# LC_COLLATE="" ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christoph.er

[root@archie tmp]# LC_COLLATE="en_NZ.utf8" ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christoph.er

[root@archie tmp]# LC_COLLATE="POSIX" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1

[root@archie tmp]# LC_COLLATE="C" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1


[root@archie tmp]# LC_COLLATE="UTF-8" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1

[root@archie tmp]# LC_COLLATE="en_NZ" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph.er
Christoph1er
Christophe
christoph
christoph1

I'm assuming that since "en_NZ.UTF-8" gives the same result as "", and a different result from both "en_NZ" and "UTF-8", it is either not valid or doesn't specify a collation order, and so falls back to the default.
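
One way to test an assumption like this is to check which of the names are actually installed. The sketch below compares a name against the output of locale -a, ignoring case and hyphens so that "en_NZ.UTF-8" matches the "en_NZ.utf8" spelling that glibc prints. The function names are hypothetical. With glibc, a program that cannot load the requested locale typically stays in the "C" locale, which would be consistent with byte-order output for any name that is not installed.

```shell
# Normalise a locale name for comparison: lower-case it and drop hyphens,
# so "en_NZ.UTF-8" and "en_NZ.utf8" compare equal.
normalize_locale_name() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-'
}

# Succeed if the given name appears in the output of `locale -a`.
locale_installed() {
    wanted=$(normalize_locale_name "$1")
    locale -a 2>/dev/null | tr 'A-Z' 'a-z' | tr -d '-' |
        grep -Fx -- "$wanted" >/dev/null
}

for name in en_NZ.UTF-8 en_NZ UTF-8 POSIX C; do
    if locale_installed "$name"; then
        echo "$name: installed"
    else
        echo "$name: not listed by locale -a"
    fi
done
```

The output depends entirely on the machine it runs on, so no expected result is shown here.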

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#8 Post by disciple »

This doesn't really make sense, because people say to set LC_COLLATE="C" to make dotfiles sort at the beginning of directories. But setting it to anything else would also do this! (Though if a file name starts with a space, it will come before the dotfiles anyway.)
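
For reference, in the "C" locale the order comes straight from byte values, which is why a leading space beats a leading dot, and both beat digits and letters. A small sketch that prints the values involved (the helper name byte_value is only for illustration):

```shell
# Print the numeric byte value of a single ASCII character.
# Relies on the POSIX printf rule that an argument starting with a quote
# yields the value of the character that follows it.
byte_value() {
    printf '%d\n' "'$1"
}

for ch in ' ' '.' '1' 'C' 'c'; do
    printf '%3d  "%s"\n' "$(byte_value "$ch")" "$ch"
done
```

Space is 32, "." is 46, "1" is 49, "C" is 67 and "c" is 99, which matches the POSIX and C listings earlier in the thread: space, then dot, then digits, then uppercase, then lowercase.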

disciple
Posts: 6984
Joined: Sun 21 May 2006, 01:46
Location: Auckland, New Zealand

#9 Post by disciple »

Oh, actually I see whitespace is kind of ignored by default:

[root@archie tmp]# LC_COLLATE="" ls -1
520ee2edb597d
christoph
Christoph
christoph1
Christoph1er
Christophe
Christoph e
Christopher
Christoph er
Christoph.er

[root@archie tmp]# LC_COLLATE="en_NZ" ls -1
520ee2edb597d
Christoph
Christoph e
Christoph er
Christoph.er
Christoph1er
Christophe
Christopher
christoph
christoph1

