Re: git on MacOSX and files with decomposed utf-8 file names

To: Kevin Ballard <kevin@...>
Cc: Jakub Narebski <jnareb@...>, Johannes Schindelin <Johannes.Schindelin@...>, Mark Junker <mjscod@...>, git@vger.kernel.org <git@...>
Date: Thursday, January 17, 2008 - 12:08 am

On Wed, 16 Jan 2008, Kevin Ballard wrote:

I do agree. And I think starting out case-insensitive (something they must 
really hate by now) also made it less of an issue. When you're 
case-insensitive, the issues with any UTF-8 normalization are simply 
swamped by all the issues of case, so you probably don't even think about 
it very much.

The big problem with any name rewriting is that I can open file 'xyz', and 
I literally have a very hard time knowing whether that file I know I 
opened and created has anything to do with the file 'Xyz' that I see when 
I do a readdir().

Are they the same? Maybe. But it's literally hard to tell on OS X. I can 
do an fstat() on my file descriptor and compare its st_ino with the d_ino 
of the directory entry, and if they match they *probably* are the same 
entry, but even then 
it actually could have been a hardlink (and my 'xyz' is really *another* 
name for it entirely, and the filesystem is actually case-sensitive and 
'Xyz' was a *different* name that somebody else did!).
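
To make the inode comparison above concrete, here is a minimal Python sketch. It is only an illustration, not anything git does: the helper names entries_matching_fd() and demo() are made up for this example, and it assumes a POSIX filesystem that supports hard links. It lists every directory entry whose device and inode match an open file descriptor, and it shows that a hardlink under a completely different name matches just as well as the name that was opened.

```python
import os
import tempfile

def entries_matching_fd(fd, dirpath):
    """Return the names in dirpath whose (device, inode) pair matches fd."""
    st = os.fstat(fd)
    matches = []
    with os.scandir(dirpath) as it:
        for entry in it:
            # lstat-like lookup on the directory entry itself
            est = entry.stat(follow_symlinks=False)
            if (est.st_dev, est.st_ino) == (st.st_dev, st.st_ino):
                matches.append(entry.name)
    return sorted(matches)

def demo():
    """Create 'xyz', hardlink it under another name, and list the matches."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "xyz")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        try:
            # A second, unrelated name for the same inode.
            os.link(path, os.path.join(d, "other-name"))
            return entries_matching_fd(fd, d)
        finally:
            os.close(fd)
```

Both names come back from demo(), so a matching inode alone cannot say which name was the one that was opened.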

See? If you're creating a content tracker, these kinds of issues are not 
"idle chatter". It's really *really* important. Was that file the one I 
was told to track? Or was it a temporary file that was just hardlinked? 

This is why case-insensitivity is so hard: you have a very real "aliasing" 
on the filesystem level, where all those really *different* pathnames end 
up being the same thing.

And all the same issues show up with utf-8 rewriting, so if you normalize 
utf-8 names, you actually end up having almost all the same problems that 
a case-insensitive filesystem has. They're just much rarer in practice, so 
you just won't hit them as often - but when you do, they are equally 
painful!

(In fact, they can be a whole lot *more* painful, because now they are 
really rare, and really confusing when they happen!)
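
As a small illustration of that aliasing, the Python sketch below builds the same visible file name in composed and decomposed form. The example name and the helper same_after_normalization() are hypothetical; the only library call used is the standard unicodedata.normalize(). The two names are different byte strings, yet a filesystem or tool that normalizes would treat them as one name.

```python
import unicodedata

# The same visible name, spelled two ways.
composed = "\u00e9.txt"      # 'e with acute' as a single code point (NFC form)
decomposed = "e\u0301.txt"   # 'e' followed by a combining acute accent (NFD form)

def same_after_normalization(a, b, form="NFC"):
    """Return True if a and b are equal once both are normalized to form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# Different byte sequences on disk ...
bytes_differ = composed.encode("utf-8") != decomposed.encode("utf-8")
# ... but aliases of each other under normalization.
aliases = same_after_normalization(composed, decomposed)
```

A content tracker that compares names byte for byte sees two different paths here, while a normalizing filesystem sees one.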

But if you come from a case-insensitive background, all the UTF-8 
rewriting really looks like such a small problem compared to all the 
horrid problems that you had with different locales and cases, so I 
suspect they didn't even realize what a big mistake they made!

			Linus