Git supports renames/moves in a different way. Instead of recording renames
(which has trouble of its own, for example a rename via applying a patch)

There is trouble with file-ids. The most common example is a file
that was created in two branches (two repositories) independently, after which
the branches got merged. Most (all?) file-id based rename detection has trouble
with repeated merging of those branches, even if there are no true
conflicts.

Read Linus' post about file-id based rename detection:
Message-ID: <Pine.LNX.4.64.0610201049250.3962@g5.osdl.org>
http://permalink.gmane.org/gmane.comp.version-control.bazaar-ng.general/...

Not that contents-based rename detection doesn't have its own pitfalls:
Message-ID: <7virha4cnm.fsf@assigned-by-dhcp.cox.net>
http://permalink.gmane.org/gmane.comp.version-control.git/31899
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
-
Having not used git I can't really say whether git is better than bzr or
not in this regard. I know in the kind of development I do the case
where a file with the same name has been added independently in 2
different branches is a pretty rare one. Usually, when it has happened
the files should have been 2 separate files with different names anyway
- so bzr would have no problem with this.

However, renaming a file is pretty common and I would rather be explicit
about it and have file name changes easily visible/searchable in my log.

Just out of curiosity: How does git handle the case where one file is
renamed differently in 2 branches and then the branches are repeatedly
merged? I know that bzr handles this very well and in various tests I
did there were absolutely no repeated conflicts. Would git behave as
well in this scenario?

Nick
-
Not so rare in a true DSCM scenario where people submit patches via
email or a bug tracker. Say two developers apply the same patch to
their trees, and one of them tweaks it a bit. While I don't personally
do kernel development, I understand that's reasonably common in the
linux dev team.

It also happens quite a bit if you cherry pick across branches patches
that create files.

In such cases, I find GIT does the right thing 99% of the time,
including spotting situations where the file got added at different
patchlevels in different branches.

cheers,
martin
-
Ok - I got curious and decided to install git and try this myself.
In this test I had a file hello.txt that got renamed to hello1.txt in
one branch and hello2.txt in another. Then I merged the changes between
the 2 branches.

Here is how it looked after the merge in bzr:

bzr status
renamed:
hello2.txt => hello1.txt
conflicts:
Path conflict: hello2.txt / hello1.txt
pending merges:
Nicholas Allen 2006-11-28 Renamed hello to hello1

and here's how it looked in git:

git status
#
# Changed but not updated:
# (use git-update-index to mark for commit)
#
# unmerged: hello.txt
# unmerged: hello1.txt
# unmerged: hello2.txt
# modified: hello2.txt
#
nothing to commit

So git is not telling me that I have a conflict due to the same file
being renamed differently in 2 branches - well at least not in a way I
can comprehend anyway! Whereas bzr made this very clear. Also, in git I
ended up with 2 files:

ls
hello1.txt hello2.txt

whereas in bzr there was only one file and I just had to decide which
name it was to be given to resolve the conflict.

I'm not sure how I should resolve the conflict in git but that's
probably just because I am not familiar with it yet and the message it
gave was not comprehensible or helpful to me in the slightest. In bzr it
was very easy and repeatedly merging caused no trouble at all - the name
conflict had to be resolved only once.

While it was good that git detected my file rename (although this was
not hard as the contents did not change at all) the process in bzr was
*much* smoother and more user friendly than it was in git. When you have
conflicts I think it's especially important that the RCS inform you of
what is really happening so you do not make mistakes. Bzr was much more
informative than git was and told me exactly why there was a conflict
and made it easy to resolve it.

This situation is a pretty common one and it seems to me that git's
content based approach is not as useful in this case...
-
Ehh. It told you exactly what happened when you actually did the merge,
didn't it?

Yeah, "git status" won't tell you _why_ it results in unmerged paths, but
the merge will have told you. You must have seen that, but decided to
just ignore it and not post it, because it didn't support the conclusion
you wanted to get, did it?

There are lots of reasons why "git status" may tell you that something
isn't merged. The most common one by far being an actual data conflict,
not a name conflict. The reason for why something conflicts is always told
at merge-time.

Linus
-
I didn't do this deliberately - it's just because merge spewed out a
whole load of stuff at me that I didn't understand and therefore
overlooked the conflict message in it. I wasn't expecting to see it here
anyway and was hoping for a short and informative summary that I would
understand when I did a status.

Also what happens if I lose the messages because they scrolled off
screen or the power goes down, I need to reboot for some reason, or I
don't have time and want to shut down my computer, restart another day and
resolve the conflicts then? All useful conflict status is lost isn't it?
That's why I expected git status to tell me this in some understandable
manner and was not even expecting it to only be in the merge output....

Nick
-
I'd suggest just re-doing the merge. Something like

git reset --hard
git merge -m "dummy message" MERGE_HEAD

will do it for you (that's the new "nicer syntax" for doing a merge, in

No, it's actually there, but "git status" doesn't really explain it to
you.

The go-to command tends to be "git diff", which after a merge will not
show anything that already merged correctly (because it will have been
updated in the git index _and_ updated in the working tree, so there will
be no diff from stuff that auto-merged). So any output at all after a
failed merge from "git diff" generally tells you exactly what failed.

But since 99%+ of all merge conflicts are data-conflicts, I suspect the
output is mostly geared towards that.

The other useful tools to be used are "git log --merge" (explained in a
separate mail) and for people like me who like the git index and grok it
fully, doing a

git ls-files --unmerged --stage

is probably what I'd do (but I have to admit, that is _not_ a very
user-friendly interface - you need to not only have understood the index
file, you actually need to understand it on a very deep level).

"git status" really used to be just a stupid wrapper around "git ls-files"
(it's now largely a built-in), but it was really _so_ stupid that it
doesn't really try to explain what it does - it's more like a simplified
version of ls-files with some of the information pruned away, and other
parts in a slightly more palatable format ;)

So improving "git status" might mean that some people could avoid having
to learn about the index file details ;)

Linus
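
For readers who want to try this, here is a minimal sketch that reproduces Nick's rename/rename scenario with a present-day git and then looks at it the way Linus describes. The throwaway repository, the branch names rename1 and rename2, and the user identity are assumptions made for illustration, and the exact wording of the merge message differs between git versions. Saving the merge output to a file also covers the "what if it scrolled off screen" worry.

```shell
set -e
# Throwaway repository; the identity below is a made-up placeholder.
repo=$(mktemp -d)
log=$(mktemp)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'hello world\n' > hello.txt
git add hello.txt
git commit -q -m "Add hello.txt"
base=$(git rev-parse HEAD)

# Rename the same file differently on two branches (hypothetical names).
git checkout -q -b rename1 "$base"
git mv hello.txt hello1.txt
git commit -q -m "Renamed hello to hello1"

git checkout -q -b rename2 "$base"
git mv hello.txt hello2.txt
git commit -q -m "Renamed hello to hello2"

# Merge, keeping the output in a file so the reason for the conflict
# survives a short scrollback or a reboot.
merge_status=0
git merge rename1 > "$log" 2>&1 || merge_status=$?
cat "$log"

# The index view: one line per unmerged stage of each path.
git ls-files --unmerged --stage
unmerged_count=$(git ls-files --unmerged | wc -l | tr -d ' ')
```

In this sketch the old name shows up at stage 1 and each new name at the stage of the side that introduced it, which is the information a smarter "git status" could use to say "name conflict".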
-
Hi,
This is actually the most meaningful argument for not hiding the index.
Usually I explain it to people as a "staging area" standing between your
working directory, and the next committed state.

But I will start explaining the index with "what if your merge failed?".

Ciao,
Dscho
-
The thing is, the staging area is needed for a lot more than just merges.
Every single SCM has one, because even something as _trivial_ as "commit
all files" actually needs it. People just don't always think about it, and
the git staging area is "bigger" than most others.

Most other SCM's have a staging area that is just a list of filenames
(nobody thinks about it, but "commit everything" doesn't actually commit
everything at all - it just commits everything /in the list of files that
the SCM knows about/).

Git's staging area is just more complete than most other SCM's. It
contains not just the list of filenames, but their permissions too (where
a lot of other SCM's *cough*CVS*cough* don't do permissions at all), but
also their content, and in the case of a merge conflict, the content of
the base version and the two branches to be merged.

So the index really _is_ required for pretty much all operations
(including very much "git commit -a", if only because of the filename
list), but yeah, if you start by talking about merge conflicts, maybe
people understand WHY it's also important to actually stage the _contents_
of a file too (multiple times, in fact, for a merge conflict), not just
its name.

So most of the time, when you use git, you can ignore the index. It's
really important, and it's used _all_ the time, but you can still mostly
ignore it. But when handling a merge conflict, the index is really what
sets git apart, and what really helps a LOT.

I've used other systems, but the git handling of merge conflicts really is
superior. Other SCM's think that the merge algorithm is interesting and
important, and that's bullshit. Merge algorithms are largely trivial and
uninteresting. The interesting and important thing is to just handle
failures well, and git does that _really_ well.

Linus
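
The "content staged multiple times" point can be seen directly. The following is a minimal sketch, assuming a throwaway repository, a made-up file notes.txt and made-up branch names left and right: after a conflicting merge the index holds the base version at stage 1, the current branch's version at stage 2 and the merged-in branch's version at stage 3, and "git log --merge" lists the commits from both sides that touched the conflicted file.

```shell
set -e
# Throwaway repository; names and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'one\ntwo\nthree\n' > notes.txt
git add notes.txt
git commit -q -m "Base version"
base=$(git rev-parse HEAD)

# Two branches change the same line in different ways.
git checkout -q -b left "$base"
printf 'one\nTWO (left)\nthree\n' > notes.txt
git commit -q -a -m "Left edit"

git checkout -q -b right "$base"
printf 'one\nTWO (right)\nthree\n' > notes.txt
git commit -q -a -m "Right edit"

# Merge right into left; the content conflict makes the merge stop.
git checkout -q left
git merge right > /dev/null 2>&1 || true

# Mode, blob id and stage number for every staged copy of the file.
git ls-files --stage notes.txt

# Each stage can be read back out of the index.
stage_base=$(git show :1:notes.txt | sed -n 2p)
stage_ours=$(git show :2:notes.txt | sed -n 2p)
stage_theirs=$(git show :3:notes.txt | sed -n 2p)

# Commits from either side that touch the unmerged path.
merge_commits=$(git log --merge --oneline | wc -l | tr -d ' ')
```

Because all three versions stay in the index, the conflict can be picked up again later without re-running the merge.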
-
Actually, people (at least me) dislike the index because in the most common
operations (status, diff, commit), they have to know that the command doesn't actually
display all their work but just the 'indexed' part of it.

For people used to cvs, svn and other systems it would be nicer if diff -a
and commit -a (and possibly other commands) were the default.

The index is of course necessary during merging, ... and as a speed optimization
for applying patches when you know the working copy is clean.

Mark
-
Unless you do "git update-index" (and thus are already using the index)
on any files, "git diff" shows you exactly the changes between your last
commit and the working tree. There's nothing magic, odd or confusing
about it, no matter which scm you come from.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
Until you make the mistake of reading the git-diff man page, at which
point the novice git user runs screaming into the night...

Show changes between two ents, an ent and the working tree, an
ent and the index file, or the index file and the working
tree. The combination of what is compared with what is
determined by the number of ents given to the command.

* When no <ent> is given, the working tree and the index file
is compared, using git-diff-files.

* When one <ent> is given, the working tree and the named tree
is compared, using git-diff-index. The option --cached can
be given to compare the index file and the named tree.

* When two <ent>s are given, these two trees are compared using
git-diff-tree.

Looking at the man page, it does raise one interesting question ---
So exactly what is the difference between Treebeard and Quickbeam?

And how many working trees do we need before we call it an Entmoot? :-)

- Ted
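
The comparisons quoted from the man page are easier to follow side by side. This is a minimal sketch, assuming a throwaway repository with two made-up files; it uses "git add" to stage a tracked file, which for this purpose does the same job as the "git update-index" mentioned in the thread.

```shell
set -e
# Throwaway repository; file names and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'first version\n' > a.txt
printf 'first version\n' > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# a.txt: edited and staged.  b.txt: edited only in the working tree.
printf 'second version, staged\n' > a.txt
git add a.txt
printf 'second version, not staged\n' > b.txt

# No tree given: working tree compared with the index.
unstaged=$(git diff --name-only)

# One tree with --cached: index compared with that tree (HEAD by default).
staged=$(git diff --cached --name-only)

# One tree without --cached: working tree compared with that tree.
all_changes=$(git diff --name-only HEAD)

printf 'unstaged: %s\nstaged: %s\nall:\n%s\n' "$unstaged" "$staged" "$all_changes"
```

As Andreas says, as long as nothing has been staged the first and third forms show the same thing, so a user who never touches the index sees all of their work in a plain "git diff".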
-
I don't see your point, really.
Nothing forces you to change the index. None of the normal operations do
that, for example, and you really have to _explicitly_ ask git to update
the index for you.

So you can really think of it as a better list of names than what CVS and
others maintain for you. It's exactly the same as the CVS "Entries" file,
except it's got capabilities that CVS will never have - tracking not just
the filename, but the merge status, the permissions, and the actual
contents of an entry.

And by default, and in the absence of any failed merges, you will _never_

Why? I mean really.. Why do people mind the index? If you've not done
anything to explicitly update it, and you just write "git commit", it will
tell you exactly which files are dirty, which files are untracked, and
then say "nothing to commit".

Maybe we shouldn't even say "use git-update-index to mark for commit", we
should just say "use 'git commit -a' to mark for commit", but the point
is, there really is no downside. So you forget to mention which files to
commit, what's the downside really? It tells you what is up, and you can
just mention the files explicitly, or use "-a" to say "ok, commit
everything that is dirty", and it doesn't really get any simpler than
that.

And the ADVANTAGES of the index are legion. You may not appreciate them
initially, but the disadvantages people talk about really don't exist in
real life, and once you actually start doing merges with conflicts, and
fix things up one file at a time (and perhaps take a break and do
something else before you come back to the rest of the conflicts), the
index saves your sorry ass, and is a _huge_ advantage.

Similarly, it _allows_ you to do things that just a list of files never
allows you to. You don't _have_ to use it to mark individual files as
being ready to be committed, but you _can_. It's nothing that you need to
know or worry about if you're not aware of the index, but it's a
capability that is ...
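
The three ways of committing discussed here can be compared in a small experiment. This is a minimal sketch with a present-day git, assuming a throwaway repository and two made-up files; the message printed by a bare "git commit" is not the 2006 wording quoted in the thread.

```shell
set -e
# Throwaway repository; file names and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# Edit both files without touching the index.
printf 'a2 changed\n' > a.txt
printf 'b2 changed\n' > b.txt

# 1. Bare "git commit": nothing is staged, so nothing is committed.
plain_status=0
git commit -m "Nothing staged" > /dev/null 2>&1 || plain_status=$?

# 2. Mention the file explicitly: only a.txt is committed.
git commit -q -m "Update a.txt only" a.txt
after_named=$(git diff --name-only HEAD)

# 3. "-a": commit every tracked file that is dirty.
git commit -q -a -m "Update everything else"
after_all=$(git diff --name-only HEAD)
```

The first attempt is the bump new users hit; the other two are the ways out that the thread argues about making the default.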
-
To start with, that message confuses a lot of new users. "What do you
mean there's nothing to commit? I just made changes. And I know you
noticed them because you just mentioned the names of the files with
the changes to me!".

So at the very least, there's some missing guidance as to how to get
from the "nothing to commit" stage to actually commit the files the
user was trying to commit when they typed "git commit" in the first

Yes, I submitted a patch for this. I don't think Junio picked it up
because it got him thinking about all the other situations where "git
status" doesn't give as much guidance as it should

Even with that, the user has to go through the process of:

git commit
"hmm... why didn't that work"
read message
git commit -a

That's not a _huge_ problem, but it is a little road-bump that a lot of
people meet on their first attempt at git. In the thread on the fedora
mailing list that prompted my first "user-interface warts" and the
patch I mentioned above, the process was worse:

git commit
"hmm... why didn't that work"
read message
git update-index
git commit
"crap... it still didn't work even when I did what it told me to do"

Here's the original version of that report:

In none of these recent threads have I been arguing disadvantages of
the index. I'm really just trying to remove one small hurdle that
does trip up new users, (see above). I'm not trying to introduce any
large conceptual change into how git works, nor even what experienced

Let's help people do exactly that by making the behavior of "git
commit -a" be the default for "git commit".

-Carl
-
Maybe we could do that _only_ if the index matches HEAD, and otherwise
keep current behavior?
So people who don't care about the index won't get tripped up, and when

--
best regards
Ray
-
Sounds sane. Especially if we couple it with a hint for the user to use
"commit -a" when he/she wants to do blanket commits.

So in essence that would mean:
If no pathspecs are given and index matches current HEAD, print out
"Nothing to commit but changes in working tree. Assuming 'git commit -a'"

and then act accordingly. Carl, do you think that would satisfy the
desires of your RedHat peers? Always doing '-a' by default is terribly
wrong for those of us who actually use partial commits a lot, and it
would also rob git of a lot of its power.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
Hi,
So many people spoke for it, it's time I crash the wedding.
From a usability viewpoint, it is a horrible convention. The user has to
remember too much of the side effects to handle the commit operation.
The function of the program would no longer be dependent on the command
line arguments and your config, but _also_ on something as volatile as
the index.

You would literally end up asking "did I change the index?" _every time_
before you commit.

And remember, even a simple "git add" changes the index! (Why it does is
brutally clear once you grasp the concept of the staging area.)

Worse, doing a "git commit --amend" should _not_ automatically add "-a"
_even_ if the index matches the HEAD, since it is quite possible that you
had a typo in the message you want to fix up. And quite possibly other
options would not want that either.

But here's an idea: tell the user that she has to tell git-commit which
files she wants committed. Yes! That's it. Just tell it the friggin'
files. And if you are a lazy bum, and want to commit _all_ modified
files, git has a nice shortcut for ya: "-a".

Ciao,
Dscho
-
It reminds me of the Microsoft Office Assistant :-) Let's make a "git assistant
mode" that tries hard to guess users' desires and give them guidance.
Once they get used to git, they can disable that mode and go back to
"plain git".
--
Duy
-
Hi,
See git-gui from Shawn. It should really help new users with a graphical
user interface.

Ciao,
Dscho
-
I hate the if clause. Suppose I prefer the update-index way, I would have
to check whether HEAD matches the index every time I do a commit to make
sure it won't do it the other way.
Either -a or -i is the default, not an "if", please.

By the way I do use the update-index way, but vote for -a by default. I
don't mind adding " -i" after every commit command.
--
Duy
-
No you won't.
If you don't use update-index, then index will match HEAD and you will commit
changes in the working tree. That is the way for newbies.

As soon as you do the first update-index the index will no longer match HEAD,
so commit will do the same as it does now.

And if you are not sure which you have done then presumably you do what you do
now, or git commit -a or git commit -i as you need.

--
Alan Chandler
http://www.chandlerfamily.org.uk
-
Plus, one assumes, the git-generated comments in the commit message will
tell you what kind of commit it has decided to do.

I like this suggestion a lot. Thinking back over my git usage recently,
which has included both styles of commits (though mostly -a ones), I
think this would have done the right thing by default in every case.

-Steve
-
I have been (silently) following the git commit discussion and started being
fully on the side of git commit -a being the default, but was slowly moving
over towards the git commit -i being the default camp.

This post seems like a Eureka moment - chew over the problem long enough and
someone comes in from left field with an off the wall remark that suddenly
clarifies everything.

--
Alan Chandler
http://www.chandlerfamily.org.uk
-
I thought of that tonight and almost suggested it myself. It would be
an attempt to satisfy both "sides" of the debate without either side
having to fight with a default they didn't like or configure it away.

I did wonder if the powers that be would find it a bit too magic, (the
problem with magic things is that they can sometimes be quite
confusing when they don't do exactly what you want).

But this might just work. It wouldn't be too bad to document, (we
already have several commands that change slightly if the index
doesn't match, (often by just refusing to do anything in a dirty
tree)).

And, significantly this would allow for documenting the simple
sequence of:

# edit file
git commit

in the tutorial while also allowing what Junio wanted:

git update-index file
git commit

with the behavior of, ("I already said I wanted to do a staged commit
when I explicitly updated the index, so don't make me say anything
special again when I go to commit").

Can we really get the best of both worlds here?

-Carl
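
To make the proposal concrete, here is a sketch of the suggested rule as a small wrapper function. This is not how "git commit" behaves; smart_commit is a hypothetical name, and the sketch only assumes that "git diff --cached --quiet" exits with status 0 when the index matches HEAD and non-zero otherwise.

```shell
set -e
# Hypothetical wrapper implementing the rule proposed in the thread:
# if nothing has been staged, behave like "git commit -a";
# if something has been staged, commit exactly what is in the index.
smart_commit() {
    if git diff --cached --quiet; then
        git commit -q -a "$@"
    else
        git commit -q "$@"
    fi
}

# Throwaway repository; file names and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# Case 1: edit both files, stage nothing. Everything dirty is committed.
printf 'a2 changed\n' > a.txt
printf 'b2 changed\n' > b.txt
smart_commit -m "Index matched HEAD, so commit all"
left_after_case1=$(git diff --name-only HEAD)

# Case 2: edit both files, stage only a.txt. Only a.txt is committed.
printf 'a3 changed again\n' > a.txt
printf 'b3 changed again\n' > b.txt
git add a.txt
smart_commit -m "Index differed from HEAD, so commit the staged part"
left_after_case2=$(git diff --name-only HEAD)
```

The sketch also shows the objection raised against the idea: what the same command does depends on the state of the index rather than on its arguments.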
-
I meant "good" there for anyone confused, (I'm not sure how that
slipped past my spell-checker).

-Carl
-
Hi,
No. It does display all your work.
However, as Linus pointed out, if there are automatically merged entries
without conflicts, it will not display them. Which is sane!

And yes, you can hide some modifications by putting the modified file into

And what exactly do you think is happening when "cvs add" and "svn add"
did _not_ really add the file to the repository, but only a subsequent

I think that it is one major achievement of git to make clear and sane
definitions of branches (which are really just pointers
into the revision graph), and the index (which is the staging area).

Ciao,
Dscho
-
Something resembling index is needed anyway: 1) for "commit all changed
files" to prepare list of files to commit, excluding ignored files,
2) to mark files as "to be added" or "to be removed" (well, git index
could be a little bit smarter here in marking "intent to add"), 3) as
a place for doing the merging. Git just doesn't hide it.

I agree that git's definition of branches, and git not hiding the index, is
its advantage... and disadvantage to those who learned using version
control on other SCM.
--
Jakub Narebski
Poland
-
That sounds good. Better output on status would be nice ;-)
Nick
-
Side note, to clarify: in the _simple_ cases it's all actually there.
I can well imagine that in more complex cases, involving multiple
different files, you may well want to re-do the merge and let the merge
tell you why it refused to merge something.

So the index, for example, contains just a "final end result" of what the
merge gave up on, and while for a simple rename conflict like your example
you could certainly see that directly from the index state (and thus we
could, for example, have a "git status" that talks about it being a
filename conflict), if you have a criss-cross rename, the index itself
doesn't really tell you _why_, and it could look superficially like a data
conflict.

In such a case, you'd really have to either go back to the merge itself to
see what happened, or you'd use the "git log" thing and just work it out
from there (ie you can ask "git log" to tell you about any renames as they
happened etc).

I don't think I've actually hit a complex enough merge to need this yet,
but the graphical tools should help too, ie "gitk --merge" should give you
everything that "git log --merge" gives you (ie just the commits that
aren't common, and simplified to just the ones that matter for the
unmerged filenames in the end result). I can well imagine that being
useful too.

So the tools are certainly there. "git status" just isn't necessarily the
best one (or the best that it could be, for that matter)..

Linus
-
I guess I hit a limitation in the output of status as opposed to a
limitation in what git can do ;-)

Nick
-
Hi,
I think it is something different altogether: you learnt how to use CVS,
and you learnt how to use bzr, and you are now biased towards using the
same names for the same operations in git.

I actually use git-status quite often, just before committing, to know
what I changed. But I will probably retrain my mind to use "git diff" or
even "git diff --stat", because it is more informative.

As for your scenario: There really should be a "what to do when my merge
screwed up?" document.

Ciao,
Dscho
-
I have a few example scenarios and some notes on
cleaning up after failed merges in my slides from
the presentation I did at OLS last summer.

Feel free to look at it off of www.jdl.com!
jdl
-
It would be nice to have git-resolved (or git-resolve) wrapper around
git-update-index similar to git-add, git-mv, git-rm which would mark
file as resolved, without need for git-update-index, git-add and git-rm
even in the case of CONFLICT(rename/rename). Although I'm not sure
if it could work in all cases in the simple form of "git resolved <file>",
e.g. in the case of CONFLICT(add/add).

By the way, I wonder if git can detect the case when the same (or nearly
the same) file was added in two different branches under different
filename...
--
Jakub Narebski
Poland
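
With a present-day git the resolution Jakub asks about can be done with "git rm" and "git add" alone. The following is a minimal sketch under the same assumptions as Nick's test (one file renamed differently on two made-up branches, rename1 and rename2); it keeps the current branch's name, concludes the merge, and then merges again to check that the name conflict does not come back.

```shell
set -e
# Throwaway repository; branch names and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'hello world\n' > hello.txt
git add hello.txt
git commit -q -m "Add hello.txt"
base=$(git rev-parse HEAD)

git checkout -q -b rename1 "$base"
git mv hello.txt hello1.txt
git commit -q -m "Renamed hello to hello1"

git checkout -q -b rename2 "$base"
git mv hello.txt hello2.txt
git commit -q -m "Renamed hello to hello2"

# The merge stops with a rename/rename conflict.
git merge rename1 > /dev/null 2>&1 || true

# Resolve in favour of hello2.txt: drop the old name and the other
# branch's name, then mark the surviving name as resolved.
git rm -q hello.txt hello1.txt
git add hello2.txt
git commit -q --no-edit

tracked=$(git ls-files)
still_unmerged=$(git ls-files --unmerged | wc -l | tr -d ' ')

# Merging the same branch again: the decision was recorded in the
# merge commit, so there is nothing left to conflict.
remerge_status=0
git merge rename1 > /dev/null 2>&1 || remerge_status=$?
```

In this sketch the name conflict has to be resolved only once, which is the property Nick was testing for.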
-
Except when you are doing a large merge, your terminal scrollback
is really short, and there's a lot of conflicts. Then you can't
see what merge said about any given file. :-(

Fortunately it's easy to back out of the merge and redo it with
large enough scrollback, or redirecting it to a file for later
review, but it's annoying that we don't save that information off
for later review.

--
Shawn.
-
Heh. Which is partly why I just do "git diff", which usually tells me what
is up, or "git log --stat --merge", which is usually even better. I've
never actually had to scroll up.

[ But I'll also admit that I used to have a "xterm*savedlines=5000" in my
.Xdefaults, and it might be worth it for some people. I haven't actually
needed it with git, because the _real_ reason for it used to be applying
patch-sets, and I've made sure that the git patch-application is so
robust that I never need to go back and look for reasons for conflicts -
if something conflicts, it just _stops_ and undoes the whole patch
instead of continuing to apply the rest or leave the already-applied
part applied. ]

Although I agree that we could probably also improve "git status" output,
especially as I doubt it has been tested much.

People don't tend to use "git status" very much, I suspect - the most
common usage is not in "git status" itself, but simply as the commit
message template, and that one obviously cannot have any unmerged stuff at
all (since then we'd refuse to even go as far as asking for a commit
message in the first place).

Figuring out that the reason for a conflict is a name clash is not
necessarily possible after the merge, though: it's really up to the merge
policy to decide to merge a file cleanly or not, and the "Why" part of why
some particular merge policy decided not to resolve a file is really
internal to the policy, and not externally visible in the tree itself.

(But we can certainly see whether it was a pure content conflict or
whether it had some component of a name clash by just looking at what
stages we have for a name: so we could at least separate out the causes

I personally find "git log --merge" to be a huge timesaver. But I have to
say, I don't think I've seen more than one or two name conflicts ever, and
almost all of the true issues tend to be just regular data conflicts. So
that's what I personally care about most.

[ Fo...
-
This can't be fail safe though. I would prefer to also have the option
to be able to *explicitly* tell the RCS that a file was renamed and not
have it try to detect from the content which is bound to have corner
cases that fail. When I know I renamed a file why can't I explicitly
tell the RCS and it records the change with the *file identifier*. If I
change the content then the change is not recorded with the file
identifier but with the line/content identifier.

Nick
-
You want to tell git about a rename that will never fail to be detected? No
problem.

$ git mv oldname newname
$ git commit

The corner cases you speak about are when you rename and edit.
For me, I prefer that to be detected as at least the detection algorithm can
be tuned - there is no fixing it if the VCS was forced to consider it a
rename.

When I started using git I was worried about the lack of a rename, but now I
realise that it's not needed - it's pointless. The VCS is snapshotting
moments in time, that's it. Then by making cleverer and cleverer
interpreters of those snapshots you have the potential to do stuff that is
far more useful than "just" rename recording.

Andy
--
Dr Andrew Parkins, M Eng (Hons), AMIEE
andyparkins@gmail.com
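
Andy's two commands can be followed by a check that the rename really is recovered from the snapshots alone. This is a minimal sketch, assuming a throwaway repository and the file names from his example; the "-M" option asks "git diff" to run rename detection, and "--follow" asks "git log" to continue the history of a single file across the rename.

```shell
set -e
# Throwaway repository; the identity is a made-up placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line 1\nline 2\nline 3\n' > oldname
git add oldname
git commit -q -m "Add oldname"

# A pure rename: content unchanged, so detection is exact.
git mv oldname newname
git commit -q -m "Rename oldname to newname"

# Nothing about the rename was recorded; it is worked out by comparing
# the two snapshots. R100 means "rename, 100% similar".
rename_line=$(git diff -M --name-status HEAD~1 HEAD)
echo "$rename_line"

# History of the file under its new name, followed back past the rename.
follow_count=$(git log --follow --oneline -- newname | wc -l | tr -d ' ')
```

The rename-plus-edit corner case Andy mentions is where the similarity score drops below 100 and the detection threshold starts to matter.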
-
Do you mean if the 2 files should be merged into 1 file? If they should
be 2 files with different names there is no problem using file
identifiers but if they should be merged into one file then I can see
that this would cause problems. You would have to delete one of the
files and copy its changes into the other which would create conflicts
when that file is modified in the other branch. This is a problem if you
*only* have file identifiers.

But if you tracked both file identifiers *and* content identifiers (as I
was trying to say in my first post) this wouldn't be a problem would it?
When content is changed you use the content identifiers but when files
are changed by renaming or deleting you use file identifiers. To me at
least it doesn't seem like it's a choice of one or the other or that one
is stupid and the other isn't but that you need them both. bzr uses file
ids and git uses content ids. It would be nice if there were an RCS
that used both - then you get the best of both worlds don't you?

So I don't think you want to use file identifiers to track changes to
content (as bzr would do in this case) and you don't want to use content
identifiers to track changes to files (as git does, to my understanding,
when a file is renamed).

Nick
-
