Cc: Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Johannes Schindelin <Johannes.Schindelin@...>, Mark Junker <mjscod@...>, git@vger.kernel.org <git@...>
On Jan 16, 2008, at 11:51 PM, Martin Langhoff wrote:
I can imagine. However, I've never been hit by such a situation. This
doesn't mean a case-insensitive filesystem is a problem per se; it
means interactions between a case-insensitive and a case-sensitive
filesystem can be a problem. That doesn't mean either way is "correct";
it just means the two don't work well together.
I like ice cream, and I like steak, but I sure don't think a mixture
of steak and ice cream would go well together. Do you?
Both of which would be replicating the directory contents, not a
listing of files specified by the user. If, as a user, I were to say
"please replicate file FOO" and the file was really called "foo", I
wouldn't be in the least surprised to see the tool take me at my word
and produce a file called "FOO" with the contents of "foo". But in
general, things like this operate on the filesystem, not on the user
args.
If I say "track FOO", I probably mean it. So go ahead and track "FOO",
even if you end up tracking the contents of file "foo". I certainly
won't blame the tool for doing what I told it.
Sure I do. I find it very convenient, for example, to say "cd
documents/school" when I really want to go to "Documents/School".
Similarly, if I'm trying to reference gitweb/tests/Märchen, I'm quite
happy to not have to figure out what normalization the filename is
using and attempt to replicate that (especially as I have no idea
which normalization my input mechanism uses - unlike Linus, I don't
have a key dedicated to ä, and even if I did I wouldn't necessarily
expect it to use precomposed vs decomposed). I can't think of a single
reason why I'd want to be able to have 2 different files named
"Märchen" on my disk. On the other hand, treating Unicode
normalization as significant can pose security risks - how am I to
know that the file that is named "foo.txt" is really the same file
"foo.txt" that I last saw? Someone I know on IRC sent me this
image[1], which shows 6 files all apparently named "foo.txt" on a disk
image. This is possible because on a case-sensitive HFS+ volume, the
file system doesn't ignore ignorables when comparing filenames (it
does on a case-insensitive HFS+ system), and so all of those filenames
look identical up until you actually pipe their names through xxd and
look at the byte sequence. When this sort of tomfoolery is possible, I
simply cannot trust the names of any of my files anymore.
[1]: http://sailor=E6=9C=88.com/imgs/ignorable.png
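To make the normalization and "ignorables" points concrete, here is a
minimal Python sketch. It is only an illustration, not anything git or
HFS+ does internally: the helper names are made up for this example,
and U+200C (ZERO WIDTH NON-JOINER) is just one invisible code point
picked for the demo; I am not claiming it is what the screenshot used.

```python
import unicodedata

# Two spellings of the same visible name: precomposed (NFC) and
# decomposed (NFD). They render identically but differ byte for byte.
nfc = unicodedata.normalize("NFC", "M\u00e4rchen")
nfd = unicodedata.normalize("NFD", "M\u00e4rchen")

# A name with an invisible code point hidden in it (illustrative choice).
plain = "foo.txt"
sneaky = "foo\u200c.txt"

def same_after_normalizing(a, b):
    """Compare two names after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def hexdump(name):
    """UTF-8 bytes of a name as hex, roughly what xxd would reveal."""
    return " ".join(f"{byte:02x}" for byte in name.encode("utf-8"))

print(nfc == nfd)                        # raw comparison of the two spellings
print(same_after_normalizing(nfc, nfd))  # comparison after normalizing both
print(hexdump(nfc))
print(hexdump(nfd))
print(plain == sneaky)                   # look alike on screen, differ in bytes
print(hexdump(sneaky))
```

Note that normalization alone does not remove the invisible code
point; ignoring ignorables is a separate step from normalizing.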
Extra code? I don't think so. The only reason I'd need extra code is
if I were attempting to explicitly detect the "real" filename for a
user-supplied argument, by scanning the directory contents until I
found a file that was equivalent to the given argument. But there's no
reason to do that. None of the code I've ever written, or any of the
code I've ever seen, has had to do any extra work because it was on a
case-insensitive filesystem. I contribute to a packaging system for
the Mac called MacPorts, and I've never seen any patches on any of the
4000+ ports to handle case insensitivity (granted, I haven't looked at
every port, but I've looked at a significant fraction). It's a
complete non-issue.
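For reference, the directory scan described above would look roughly
like the following Python sketch. The names fold() and
find_real_name() are hypothetical, and NFC plus casefold() is only a
rough stand-in for whatever equivalence a given filesystem actually
implements.

```python
import os
import tempfile
import unicodedata

def fold(name):
    """Comparison key: NFC-normalize, then case-fold (an approximation)."""
    return unicodedata.normalize("NFC", name).casefold()

def find_real_name(directory, wanted):
    """Return the on-disk entry equivalent to `wanted`, or None if absent."""
    key = fold(wanted)
    for entry in os.listdir(directory):
        if fold(entry) == key:
            return entry
    return None

# Demo in a scratch directory containing a single file called "foo".
with tempfile.TemporaryDirectory() as scratch:
    open(os.path.join(scratch, "foo"), "w").close()
    print(find_real_name(scratch, "FOO"))
    print(find_real_name(scratch, "bar"))
```

This is the only kind of "extra code" I mean, and it is needed only
when a tool insists on recovering the stored spelling of a name.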
The content of files is sacred. The filename is only there to provide
a handle to locate the contents. I don't see any problem with
expanding the equivalency scope of the filename to accept multiple
encodings and cases. The only arguments I can see that have any
validity at all are the ones that sound like "we use case-sensitive
filesystems, and your case-insensitivity and normalization are causing
problems with our tools! Conform to our world!". As I said above, this
isn't a problem of case-insensitivity or normalization; it's a problem
of interaction between two incompatible viewpoints. All I want to do
is make git play nicer in an HFS+ world, and this would be far easier
if you guys were willing to admit this is a problem that should be
solved in the tool rather than a problem with the system.
-Kevin Ballard
-- 
Kevin Ballard
http://kevin.sb.org
kevin@sb.org
http://www.tildesoft.com