Re: Re: Moving a directory into another fails

From: Shawn Pearce
Date: Monday, December 4, 2006 - 1:54 pm

Linus Torvalds <torvalds@osdl.org> wrote:

Yes!

In jgit I assumed all tree entry names were encoded in UTF-8.
Then I later learned they aren't.  Foolish me.

As Linus points out, it's a HUGE problem that the caller of
git-write-tree gets to decide what encoding should be used for
that tree.  Especially if someone else wants to use a different
encoding for the same filename (think ISO-8859-1 vs. UTF-8)!

I'd rather just force the tree entry names to always be encoded in
UTF-8, as it's compact for most Western text (which many filenames
are), and at least degrades to supporting non-Western text.
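
To make the size argument concrete, here is a rough sketch that
compares byte lengths of a few made-up filenames.  UTF-16 is only my
choice of a comparison point, and encoded_sizes() is a hypothetical
helper, not anything in git or jgit.

```python
def encoded_sizes(name: str) -> dict:
    """Return the byte length of a filename under two Unicode encodings."""
    return {enc: len(name.encode(enc)) for enc in ("utf-8", "utf-16-le")}

# Illustrative filenames: plain ASCII, Western with one accent, Japanese.
for name in ("README.txt", "caf\u00e9.txt", "\u65e5\u672c\u8a9e.txt"):
    # ascii() keeps the output printable regardless of terminal encoding.
    print(ascii(name), encoded_sizes(name))
```

ASCII-only names cost one byte per character in UTF-8, accented
Latin characters cost two, and the CJK characters here cost three.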

A per-project setting is essentially impossible as we have
no such concept today, and a per-repository setting (like
i18n.commitEncoding) lets two different users encode the same
filename differently, which means two different tree SHA1s with
the exact same content... not correct!
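
A minimal sketch of that failure, assuming the usual tree object
layout ("<mode> <name>\0<20-byte SHA1>" entries behind a
"tree <size>\0" header).  tree_sha1() is a hypothetical helper and its
plain byte sort is a simplification of git's real entry ordering; the
filename is made up.

```python
import hashlib

def tree_sha1(entries):
    """Hash a tree object built from (mode, name_bytes, blob_sha1_hex) tuples."""
    body = b"".join(
        mode + b" " + name + b"\x00" + bytes.fromhex(sha)
        # Simplification: sort by raw name bytes only.
        for mode, name, sha in sorted(entries, key=lambda e: e[1])
    )
    header = b"tree " + str(len(body)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

# Both users track the same (empty) file content under the same name.
empty_blob = hashlib.sha1(b"blob 0\x00").hexdigest()
name = "caf\u00e9.txt"

utf8_tree = tree_sha1([(b"100644", name.encode("utf-8"), empty_blob)])
latin1_tree = tree_sha1([(b"100644", name.encode("iso-8859-1"), empty_blob)])

print(utf8_tree)
print(latin1_tree)
print("same tree?", utf8_tree == latin1_tree)
```

The blob SHA1 is identical in both trees; only the name bytes differ,
and that alone changes the tree SHA1.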
 

Commit encoding is a problem.  Clearly the "header parts"
(tree, parent) are US-ASCII but the author and committer lines
can be anything.  So can the body.  And years later we have no way
of knowing what encoding was used; we can only guess, and risk
displaying it wrong.

We really should either normalize all commit messages to a single
encoding (again, UTF-8) or embed the encoding as part of the headers
somehow (e.g. look at how XML embeds the document encoding in the
start of the document).
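
As a sketch of the second option: suppose the commit carried one extra
ASCII header line naming the encoding.  The header name "encoding",
the decode_commit() helper and the sample commit below are all
assumptions for illustration, not an existing format.

```python
def decode_commit(raw: bytes, default: str = "utf-8") -> str:
    """Decode a raw commit using a hypothetical 'encoding <name>' header."""
    head, _, _ = raw.partition(b"\n\n")   # headers end at the first blank line
    encoding = default
    for line in head.split(b"\n"):
        # The header name and the encoding name are plain ASCII, so they
        # can be found before we know how to decode anything else.
        if line.startswith(b"encoding "):
            encoding = line[len(b"encoding "):].decode("ascii")
    return raw.decode(encoding)

# Made-up commit written by someone using ISO-8859-1.
raw = (
    "tree " + "0" * 40 + "\n"
    "author J\u00fcrgen <j@example.com> 0 +0000\n"
    "encoding ISO-8859-1\n"
    "\n"
    "Fix caf\u00e9 menu\n"
).encode("iso-8859-1")

text = decode_commit(raw)
assert "J\u00fcrgen" in text and "caf\u00e9" in text
```

Without the header a reader has to guess: these same bytes are not
even valid UTF-8, so the default would fail outright here.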

-- 
Shawn.