crlf with git-svn driving me nuts...

Previous thread: git-push: "error: pack-objects died with strange error" by Bob Cotton on Wednesday, April 16, 2008 - 11:55 am. (1 message)

Next thread: Re: [GIT PULL] sh updates for 2.6.25 by Junio C Hamano on Wednesday, April 16, 2008 - 1:02 pm. (2 messages)
From: Nigel Magnay
Date: Wednesday, April 16, 2008 - 12:10 pm

We've got projects with a mixed userbase of windows / *nix; I'm trying
to migrate some users onto git, whilst everyone else stays happy in
their SVN repo.

However, there's one issue that has been driving me slowly insane.
This is best illustrated thusly (on windows) :

  $ git init
  $ git config core.autocrlf false

-->Create a file with some text content on a few lines
  $ notepad file.txt

  $ git add file.txt
  $ git commit -m "initial checkin"

  $ git status
# On branch master
nothing to commit (working directory clean)
--> Yarp, what I wanted

  $ git config core.autocrlf true
  $ git status

# On branch master
nothing to commit (working directory clean)
--> Yarp, still all good

--> Simulate non-change happened by an editor opening file...
  $ touch file.txt
  $ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

--> Oh Noes! I wonder what it could be
  $ git diff file.txt
diff --git a/file.txt b/file.txt
index 7a2051f..31ca3a0 100644
--- a/file.txt
+++ b/file.txt
@@ -1,3 +1,3 @@
-<xml>
-       wooot
-</xml>
+<xml>
+       wooot
+</xml>

--> Huh? ...
  $ git diff -b file.txt
diff --git a/file.txt b/file.txt
index 7a2051f..31ca3a0 100644

--> Bah... don't care! get me back to the start...
  $ git reset --hard

HEAD is now at 4762c31... initial checkin

  $ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

--> ARGH!
  $ git config core.autocrlf false
  $ git status
# On branch master
nothing to commit (working directory clean)

  $ git config core.autocrlf true
  $ git status
# On branch master
nothing to commit (working directory clean)

--> WtF?

Why does it think in this instance that ...
From: Avery Pennarun
Date: Wednesday, April 16, 2008 - 1:03 pm

We got quite confused by this here too.  I'm pretty sure git's
autocrlf feature is buggy, as you've noticed.  Combined with that, svn
has its *own* kind of autocrlf feature (svn:eol-style property on each
file) that acts completely differently.

As an added bonus, I don't know if you've run into this yet, but
cygwin's "patch" command seems to unconditionally strip CR from
patches *before* trying to apply them at all, *even if* the target
file is CRLF, so patches just never apply to CRLF files ever.  Ha ha!

I managed to make the two systems stop stomping on each other, in our
case, by using svn:eol-style of "native" (which means when git-svn
checks out the file, it gets only LF, since it seems to always claim
to be Unix) and not using git's autocrlf at all.  However, this isn't
optimal since then Windows git users end up with LF instead of CRLF in
their files, which confuses them.

On the other hand, the conflicts and the random-newline-changing diffs
go away, as svn fixes things up at checkin time no matter how badly
they got mangled by the windows user (most commonly, they run a
program that resaves the whole file as CRLF).

Obviously a working git autocrlf feature would be better, but I
haven't looked into it closely enough to say where the problem
actually lies.

Have fun,

Avery
--

From: Dmitry Potapov
Date: Wednesday, April 16, 2008 - 1:01 pm

You added a file with CRLF line endings to the repository!

You should not change core.autocrlf during your work, or you
are going to have some funny problems. If you really need to
change it, it should be followed by "git reset --hard".

In this case, you already have a file with the wrong ending,
so file.txt will be shown as changed now, because if you commit
it again then it will be committed with <LF>, which should have

Actually, it is

@@ -1,3 +1,3 @@
-<xml>^M
-       wooot^M
-</xml>^M
+<xml>
+       wooot
+</xml>
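
One way to make those carriage returns visible in a terminal is to pipe the output through `cat -v`, which prints each CR as ^M. A minimal sketch; the helper name `show_cr` is made up for illustration and is not part of git:

```shell
# Hypothetical helper: pass stdin through cat -v so that every
# carriage return shows up as a literal ^M at the end of the line.
show_cr() {
    cat -v
}
```

For example, `git diff file.txt | show_cr` should show which side of the diff still carries CRs.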


If you do not want problems, you should use core.autocrlf=true
on Windows. Then all text files will be stored in the repository
with <LF>, but they will have <CR><LF> in your work tree.
Users on *nix should set core.autocrlf=input or false, so they
will have <LF> in their work tree.
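
The per-platform recommendation above could be scripted roughly like this. It is only a sketch: the function name `pick_autocrlf` is hypothetical, and matching on the `uname -s` prefixes MINGW, MSYS and CYGWIN is an assumption about how the Windows shells identify themselves.

```shell
# Hypothetical helper: map a `uname -s` string to the suggested
# core.autocrlf value (true on Windows shells, input elsewhere).
pick_autocrlf() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) echo true ;;
        *)                    echo input ;;
    esac
}
```

Inside a repository it would be used as `git config core.autocrlf "$(pick_autocrlf "$(uname -s)")"`.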

Dmitry
--

From: Avery Pennarun
Date: Wednesday, April 16, 2008 - 1:20 pm

Alas, the subject of this thread involves git-svn, and the typical
git-svn user is someone who has no way of rewriting the existing
history in their svn repositories.  Thus, files *will* be in the
repository that have the wrong line endings, and (as you noted) git
just gets totally confused in that case.

Nigel's example showed a few situations where git *thought* the file
had changed when it hadn't, and yet is incapable of checking in the
changes.

If all I had to do was checkout (thus converting everything to LF),
and then "git commit -a" to check in all the corrected files, then
git-svn would make one giant, very rude checkin to svn, and my
problems would be largely solved.  However, this does not seem to be
possible due to the problems you noted ("you are going to have
problems now").

Have fun,

Avery
--

From: Dmitry Potapov
Date: Wednesday, April 16, 2008 - 1:39 pm

Actually, what matters is what format the files are in within the _Git_
repository. Maybe there is a problem with git-svn and how it imports SVN commits

Incapable of checking in? I have not found a single example in
his mail where it was impossible. The only quirk with autocrlf
is that you need to re-checkout your work tree after changing
it. There are no other problems with it as far as I know.

Dmitry
--

From: Nigel Magnay
Date: Wednesday, April 16, 2008 - 2:56 pm

My (initial) setting of core.autocrlf to false was because that's what
it was on all the windows clients (I know the default has now changed),
and to make it obvious in the later parts of the script that the file in
the repo had a CRLF ending, rather than having been converted to LF. That's
the situation we have, because the files have all come from SVN.

The bit I really don't understand is why git thinks a file that has
just been touched has changed when it hasn't, and why doing a 'git reset
--hard' doesn't actually help at all (but, bizarrely, git config
core.autocrlf false followed by git config core.autocrlf true *does*!). The
repo copy is CRLF, the working copy is CRLF, but git thinks it's
changed...
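
A quick way to check the claim that both copies are CRLF is to count the lines that end in CR LF on each side. A minimal sketch; the helper name `count_crlf_lines` is invented for this example:

```shell
# Hypothetical helper: print how many lines on stdin end in CR LF.
count_crlf_lines() {
    cr=$(printf '\r')
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so ignore its exit status.
    grep -c "${cr}\$" || true
}
```

`git cat-file -p HEAD:file.txt | count_crlf_lines` reports on the committed blob, and `count_crlf_lines < file.txt` on the working copy.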
--

From: Martin Langhoff
Date: Wednesday, April 16, 2008 - 1:56 pm

If you are making the above statements about git in general, I
disagree. I have used msysgit a lot with unix-newlines projects, and
it works fantastic. I am careful to work with newline-smart editors
but any half-decent editor will cope. The general hint is: avoid any
content-mangling options if possible, and git will do the right thing.

OTOH, you might be referring to git-svn on Windows, which I have no
experience with :-)

cheers,



martin
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
--

From: Avery Pennarun
Date: Wednesday, April 16, 2008 - 2:02 pm

Various Windows IDEs (notably Delphi... and notepad :)) get confused
by non-CRLF files and either do random things to the file, fail to
compile, or "helpfully" change all the line endings back to CRLF.  I
agree that any program that does any such thing is braindead, but
unfortunately, some people are stuck with such programs.

Avery
--

From: Dmitry Potapov
Date: Wednesday, April 16, 2008 - 2:17 pm

I stand corrected. It should be either core.autocrlf=true if you
like DOS endings or core.autocrlf=input if you prefer unix-newlines.
In both cases, your Git repository will have only LF, which is the
Right Thing. The only argument for core.autocrlf=false was that the
automatic heuristic may incorrectly detect some binary file as text, and
then your file will be corrupted. So the core.safecrlf option was
introduced to warn the user if an irreversible change happens. In fact,
there are two possible causes of an irreversible change: mixed line
endings in a text file, in which case normalization is desirable and the
warning can be ignored, or (very unlikely) Git incorrectly detecting your
binary file as text. In the latter case you need to use attributes to
tell Git that the file is binary.
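
Telling Git that a file is binary is done with a line in .gitattributes. A minimal sketch, assuming the built-in `binary` attribute macro is what is wanted; the helper name `mark_binary` and the `*.bin` pattern used below are hypothetical:

```shell
# Hypothetical helper: append a rule to .gitattributes in the current
# directory that marks files matching the given pattern as binary,
# so Git leaves their line endings alone.
mark_binary() {
    printf '%s binary\n' "$1" >> .gitattributes
}
```

Running `mark_binary '*.bin'` at the top of the work tree adds the line `*.bin binary`.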

I have not used git-svn on Windows for some time now, because now I have
a mirror running on Linux, so I clone directly from it.

Dmitry
--
