Re: core.autocrlf considered half-assed

Previous thread: Re: git-core: conflictstyle=diff3 doesn't actually use diff3 compatible format by Jonathan Nieder on Friday, March 5, 2010 - 3:19 pm. (23 messages)

Next thread: a few beginner git questions by Thomas Anderson on Friday, March 5, 2010 - 11:42 pm. (9 messages)
From: Johannes Schindelin
Date: Friday, March 5, 2010 - 4:23 pm

Hi,

back then, I was not a fan of the core.autocrlf support. But I have to 
admit that in the meantime, I turned into an outright un-fan of the 
feature. Not because its intent is wrong, but because its implementation 
is lousy.

Just try to "git reset --hard" or "git stash" when there are files with 
DOS line endings and when core.autocrlf is not false.

And then despair.

So again, for the record this time: core.autocrlf=true is handled 
_lousily_.

Ciao,
Dscho

--

From: Dmitry Potapov
Date: Sunday, March 7, 2010 - 2:27 am

Well, I agree there are some issues with it. In particularly, when
someone changes core.autocrlf in his/her repository, and then git
behavior is outright confusing. IMHO, the nuts of the problem is that
does not store in the index how files were checkout. Instead it uses
core.autocrlf, which specifies how the user _wants_ files to be check-
out. So, when the autocrlf option changes, things get very confusing.

I did, and I have not noticed any problem with that.

git init
git config core.autocrlf true
echo foo^ | tr ^ '\r' > foo
git add foo
git commit -m 'add foo'
echo more^ | tr ^ '\r' >> foo
echo "Before reset:"
tr '\r' ^ < foo
git reset --hard
echo "After reset:"
git diff
tr '\r' ^ < foo


Dmitry
--

From: Linus Torvalds
Date: Sunday, March 7, 2010 - 4:45 pm

I do agree. It would probably have been a good idea to mark the CRLF 
status in the index, but we didn't. And crlf isn't actually the _only_ 
thing that can cause confusion, the 'ident' and 'filter' can do the same 
thing.

One option might be to have "git config" know about crlf, so that if you 
change crlf state with 'git config' rather than manually, we could at 
least _warn_ about the effects and tell people that they may need to do a 
full new checkout (or reset the stat info in the index, or whatever). But 
I like editing config files by hand, and I don't think I'm the only one.

			Linus
--

From: Tait
Date: Monday, March 8, 2010 - 11:57 am

We already have .gitattributes for tracking information about files. Maybe
add an attribute to describe the in-repository line endings? The default
would be LF, as now, and a new attribute could change the checked-in
format to be CRLF.

Tait

--

From: Johannes Schindelin
Date: Monday, March 8, 2010 - 12:15 pm

Hi,


No.

The problem is not the description of the line endings in the repository. 
The information what line endings are used can be easily extracted from 
every blob, by a simple inspection.

The problem is that the core.autocrlf code blindly assumes that Unix line 
endings are the only thing you would ever commit. And worse, the mistake 
is repeated when updating the index. Git converts the DOS line endings 
into Unix line endings, then compares with what it has in the repository 
and says: "Ooops, it is different!" even if it just checked the files out.

And I demonstrated with the "html" example that even long-time Gitsters 
sometimes commit DOS line endings as-are, unconverted.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Monday, March 8, 2010 - 1:31 pm

FWIW, not using CRLF conversion in the auto-generated 'html' repository
was a deliberate choice, as the contents there (the ones generated by
AsciiDoc with '.html' suffix) were intended to be served directly from the
web servers.  I presume AsciiDoc writes them with CRLF because html
documents are supposed to be, and it would be wrong to apply core.autocrlf
in the auto-generated repository.  And it is not correct to force
core.autocrlf on the recipient side either.  I know you wanted to have a
sample repository that people can easily access and understand what you
see as a problem, and the autogenerated 'html' is a good sample to point
at for that purpose, but "even long-time gitsters..." is stretching the
truth.

Nevertheless, I agree with you that if a similar situation happened by
mistake and your project does want to enforce core.autocrlf, it would be
nicer if there is an easy-to-use one-time clean-up procedure.  It hasn't
been my itch, and I suspect it wasn't Linus's itch either.

--

From: Johannes Schindelin
Date: Tuesday, March 9, 2010 - 2:28 am

Hi,


And here you are wrong, because the branch contains _also_ .txt files that 
are _not_ CR/LF, but need to be CR/LF on Windows. Believe me, we did think 

The problem is that the whole thing was not your itch, but your 
implementation. You never used it, so you never caught the obvious flaws 
in the design.

Sorry to be so direct, but it seems that my more subtle attempts to 
explain the situation failed.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Tuesday, March 9, 2010 - 10:11 am

No, being direct is good---it shows the thought behind what you say more
clearly.

Think what "your implementation" means in the open source setting.  It is
what you were given for free, you could try to improve upon it if you so
desire, and you should be thankful for it.  It also means that you know
the people to ask for help as it is "their" implementation and they are
probably more familiar with it than others.  If you happen to be in the
position where you can see shortcomings in the implementation better than
they do, that's good; they and you can complement what each is good at,
and make progress collectively.

What it does _not_ mean is that you can _demand_ anything out of them,
though.  An attempt to shaming them into doing something amounts to the
same thing.  Having seen the way you have behaved on the msysgit list and
its tracker for a while, I thought you understood all that.
--

From: Johannes Schindelin
Date: Monday, March 8, 2010 - 4:29 am

Hi,


Unfortunately, this is not the common case. The common case is that 
somebody committed _without_ autocrlf (implicitly =false), and you clone 
from there.

Easiest example:

$ git clone -n git://repo.or.cz/git.git html-docs
$ cd html-docs/
$ git config core.autocrlf true
$ git checkout -t origin/html
$ git status

... and despair.

Hth,
Dscho
--

From: Dmitry Potapov
Date: Tuesday, March 9, 2010 - 12:24 am

As Junio explained in another mail, it was intentional to have all HTML
files with CRLF, because they are supposed to have that ending on all
platforms. What is missing, however, is .gitattributes, which would tell
to Git that we do not want to autocrlf conversion for HTML files. This
can be done by adding .gitattributes:

$ cat >> .gitattributes <<EOF
*.html -crlf
EOF

I've just noticed that user-manual.html differs from other HTML files in
that it uses LF ending. I think it is a mistake, and this file should be
converted to have CRLF, but if you want to have all HTML files except
user-manual.html to have CRLF then you can do that too:

$ cat > .gitattributes <<EOF
*.html -crlf
user-manual.html crlf
EOF

I hope Junio will add the right version of .gitattributes, so users with
autocrlf=true will not suffer.


Dmitry
--

From: Johannes Schindelin
Date: Tuesday, March 9, 2010 - 2:29 am

Hi,


That is just papering over the real culprit: Git checks something out. 
This should be clean. But then Git says it is not.

Ciao,
Dscho

--

From: Dmitry Potapov
Date: Tuesday, March 9, 2010 - 3:11 am

We have two filters: from-git-to-worktree and from-worktree-to-git. If
those conversions are inconsistent for whatever reason, it is not going
to be well even if we found a way to solve this not-being-clean-after-
checkout problem.

One possible approach is to mark all files that had CRLF conversion
during checkout and apply the opposite conversion only to them. But what
about newly added files? Wouldn't this lead that to the situation where
newly added files will have the incorrect ending?

IMHO, this not clean after checkout repo demonstrates that you have the
incorrectly crlf configuration in it. So, you probably should correct
.gitattributes (or even to disable autocrlf in it if this repository is
not intended to be used with autocrlf=true). Either way, your current
settings are incorrect. It is better to fix it than try to hide it!


Dmitry
--

Previous thread: Re: git-core: conflictstyle=diff3 doesn't actually use diff3 compatible format by Jonathan Nieder on Friday, March 5, 2010 - 3:19 pm. (23 messages)

Next thread: a few beginner git questions by Thomas Anderson on Friday, March 5, 2010 - 11:42 pm. (9 messages)