Linus Torvalds:

But they are not different strings; they are canonically equivalent as far as Unicode is concerned. They're even supposed to map to the same glyph (if the font has an "ä", it should display it in both cases; if it has an "a" and a combining diaeresis, it should make up one). You cannot do a binary comparison of text to see if two strings are equivalent.

Whereas you are confusing characters and code points. "ä" and "a¨" use different code points, but they encode the same character, and from the user's perspective it is the *character* that is interesting (although he might confuse it with the glyph).

Actually, NTFS is a bit broken. It sees file names as a string of 16-bit words. It doesn't check that the name is valid UTF-16, or even valid UCS-2; it allows almost anything.

Apple made Mac OS X handle filenames properly, by seeing that file names are a string of characters, not code points, so they use a canonical form for all characters (personally, I would have preferred the pre-composed form, though).

-- 
\\// Peter - http://www.softwolves.pp.se/
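To make the distinction between code points and characters concrete, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `canonically_equal` is made up for this illustration and is not something git or any file system provides; it only shows that a binary comparison fails where a comparison after canonical normalization succeeds.

```python
import unicodedata

# The same character, encoded two ways.
precomposed = "\u00e4"   # "ä" as the single code point U+00E4
decomposed = "a\u0308"   # "a" followed by U+0308 COMBINING DIAERESIS


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A byte-for-byte comparison sees two different strings.
binary_equal = precomposed.encode("utf-8") == decomposed.encode("utf-8")

# Comparing in a canonical form sees one and the same character.
canonical_equal = canonically_equal(precomposed, decomposed)

# Normalizing to the pre-composed form (NFC) turns one spelling into the other.
recomposed = unicodedata.normalize("NFC", decomposed)

print(binary_equal, canonical_equal, recomposed == precomposed)
```

Either NFC (pre-composed) or NFD (decomposed) works as the canonical form for such a comparison, as long as both sides use the same one.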
