Re: [PATCH RFC 5/5] cache: Use ce_norm_sha1().

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Junio C Hamano
Date: Tuesday, April 20, 2010 - 12:12 pm

Henrik Grubbström <grubba@roxen.com> writes:


I don't think you tried it yourself.  Here is what should happen with the
current code.

	# step 0 & 1. a project with LF ending
	$ git init two && cd two
        $ echo a quick brown fox >kuzu
        $ git add kuzu && git commit -m kuzu

        # step 2. you want CRLF in your work area
        $ echo -e "a quick brown fox\015" >kuzu
        $ git config core.autocrlf true
	$ git diff
        diff --git a/kuzu b/kuzu

        # step 3. oops, refresh
        $ git update-index --refresh
        kuzu: needs update

And it is a common thing people wish to do.  Admittedly, this is an one-off
event, so it is not _that_ urgent to fix.  You can for example do:

	# step 4. you haven't changed anything really, so you can add
        $ git add kuzu
        $ git diff
        $ git diff --cached ;# empty!

to force re-index.


	# step 0. a blob with CRLF
        $ git init one && cd one
        $ echo -e "a quick brown fox\015" >kuzu
        $ git add kuzu && git commit -m kuzu


	# step 1. you want CRLF in work area, LF in repository
        $ git config core.autocrlf true
        $ git diff ;# clean!
        $ git checkout -- kuzu
        $ git diff ;# clean!
        $ cat -v kuzu
        a quick brown fox^M

One glitch is that this "checkout" becomes a no-op because the file is
stat-clean.  This is something your "record in the index entry what
normalization was used when we checked it out (or when we read and
hashed)" approach should be able to fix.  It however does not need the
re-indexed object name.

	Side note: if you want to have LF in both work tree and in
        repository, then you wouldn't use core.autocrlf.  Instead you
        would do:

	# step 1 (irrelevant alternative). you want LF in both
        $ dos2unix kuzu
        $ git diff ;# clean!


I think you see exactly the same behaviour in the example sequence I gave
you (blobs with LF with working files with CRLF, core.autocrlf set) and in
your example sequence (blobs with CRLF with working files with LF,
core.autocrlf set) in this case.  What happens to my example are already
shown above.  Continuing your example, because in reality the index is not
dirty, we would need to make it stat-dirty first.

	# step 2. you try to refresh
        $ touch kuzu
        $ git update-index --refresh
        kuzu: needs update
        $ git checkout -- kuzu
        $ cat -v kuzu
        a quick brown fox^M
        $ git diff ;# shows changes!
        diff --git a/kuzu b/kuzu
        index ....

If you are trying to somehow make this last "git diff" silent, then I
think you are solving a _wrong_ problem.  By setting retroactively the
CRLF setting, you are saying that you do not want to have CRLF in the
blobs recorded in the repository, and it is a _good thing_ that there are
differences (tons of them) between what is recorded currently and what you
are going to commit to fix the earlier mistake.


I think that is not just solving a wrong problem but also is encouraging a
bad workflow at the same time, which is even worse.  If you recorded CRLF
blobs in a project that you do not want to have CRLF blob, fixing that
mistake is one logical change that people who later browse the history
would not want to see intermixed with the "lines that actually have been
edited".


As I already said, I agree that it would be beneficial to store what
normalization settings were used and comparing that with what settings are
in effect to detect the possible phamtom difference caused by the change
of the settings.  But once we know that the result of a re-normalization
is different from what is recorded in the index (or tree), then the
difference should be shown.  The actual difference would change every time
the work tree file is edited, so I don't see the benefit of contaminate
the object database with intermediate "blobs" that is not "added".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH RFC 5/5] cache: Use ce_norm_sha1()., =?UTF-8?q?Henrik=20G ..., (Fri Apr 16, 9:10 am)
Re: [PATCH RFC 5/5] cache: Use ce_norm_sha1()., Junio C Hamano, (Tue Apr 20, 12:25 am)
Re: [PATCH RFC 5/5] cache: Use ce_norm_sha1()., Henrik Grubbström, (Tue Apr 20, 8:39 am)
Re: [PATCH RFC 5/5] cache: Use ce_norm_sha1()., Junio C Hamano, (Tue Apr 20, 12:12 pm)
Re: [PATCH RFC 5/5] cache: Use ce_norm_sha1()., Henrik Grubbström, (Sun Apr 25, 4:25 am)