.. and quite commonly, there are multiple languages per document.
The good news is that sorting is almost never relevant or done over
general documents. You almost only ever sort well-behaved data, and quite
often the exact order is not that important; and when it is, you have very
specific rules (which probably seldom have anything whatsoever to do
with general Unicode ;).
Well, Unicode already handles the "reading" part, just not the sorting.
I think Unicode in general (and UTF-8 in particular) is a great thing. I
do not argue against Unicode at all. It's what I use myself.
The thing I argue against is that they force normalization (and then, as a
secondary complaint, their insane choice of target format).
Linux is generally UTF-8 too, and does all of this much better. No forced
normalization, and it uses UTF-8 everywhere as the encoding model. Joliet
and RR work beautifully.
(I don't think RR is NFD, btw. It's the standard Microsoft UTF-16 without
normalization, afaik. I think you can happily generate a Rock Ridge disk
that has two _different_ filenames that OS X cannot tell apart, but that
both Linux and Windows can see properly.)
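
To make the "two different filenames" case concrete, here is a minimal
sketch in Python. The example name and the helper
same_after_normalization() are made up for illustration; the only
assumption is that a normalizing filesystem compares names after
converting them to one normalization form (NFD is used here).

```python
import unicodedata

# Two spellings of what looks like the same name on screen.
# Both names are hypothetical examples, not taken from any real disk.
nfc_name = "\u00e9.txt"    # precomposed "e with acute" (U+00E9)
nfd_name = "e\u0301.txt"   # plain "e" followed by combining acute (U+0301)

def same_after_normalization(a, b, form="NFD"):
    """Return True if a and b become identical once both are normalized."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Compared as raw code points (or raw UTF-8 bytes), the names differ,
# so a filesystem that does not normalize keeps them as two entries.
print(nfc_name == nfd_name)
print(nfc_name.encode("utf-8"), nfd_name.encode("utf-8"))

# After normalization they collapse into one name, so a filesystem that
# normalizes on lookup cannot tell them apart.
print(same_after_normalization(nfc_name, nfd_name))
```

A filesystem that treats names as plain byte strings sees two distinct
entries here, while one that normalizes before comparing sees a single
name.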
Linus