Hi folks, I hope this is not a daily series of mine, being confused about Git merging, but I've run my head against a wall again and before I crush my skull, I'd prefer to reach out to you to help me regain an understanding. This is about the new mdadm for Debian packaging effort. You can clone from git://git.debian.org/git/pkg-mdadm/mdadm-new.git and see the repo at http://git.debian.org/?p=3Dpkg-mdadm/mdadm-new.git;a=3Dsummary. Do not track as this repo is subject to change. My master branch was last merged with upstream's mdadm-2.6.2 tag (commit 263a535). Since then, I've committed a couple of changes to master including three cherry-picks from upstream since mdadm-2.6.2. I tagged upstream at the point which I want to merge into master: mdadm-2.6.3+200709292116+4450e59. When I merge that tag into master, I get a merge conflict on Monitor.c: <<<<<<< HEAD:Monitor.c if (mse->devnum !=3D MAXINT && =3D=3D=3D=3D=3D=3D=3D if (mse->devnum !=3D INT_MAX && >>>>>>> upstream:Monitor.c There are five commits between mdadm-2.6.2 and mdadm-2.6.3+200709292116+4450e59 that affect Monitor.c: 01d9299 e4dc510 * 66f8bbb 98127a6 4450e59 The third commit, the one with the asterisk is the one that I cherry-picked (as 845eef9); the other two cherries I picked do not touch Monitor.c. The fifth/last commit (4450e59) is the one responsible for the change which seems to cause the conflict. It is the *only* commit since the common ancestor of *both* branches that touches the conflicting lines. The fourth commit (98127a6) inserts a single line at the top of the file, so that's nothing that would cause a conflict. To be honest, I can't explain it. But I didn't give up. I branched master2 off 845eef9b~1, cherry-picked the first two commits that touch Monitor.c, cherry-picked all the commits 845eef9b..master into master2 and merge upstream... =2E.. to get exactly the same conflict in exactly ...
Side note - run
gitk --merge
when you have a merge conflict, and it will basically show you the thing
graphically (ie history as it is relevant to the merge, and only to the
files that get conflicts).
But basically, both sides have modified the code *around* that line, and
they have modified it differently.
Do this in your partial merge tree on 'master':
git diff ...mdadm-2.6.3+200709292116+4450e59 Monitor.c
git diff mdadm-2.6.3+200709292116+4450e59... Monitor.c
which will show you the diff from the common base ancestor. And in
particular, it will show how one branch did this:
@@ -399,9 +401,8 @@ int Monitor(mddev_dev_t devlist,
struct mdstat_ent *mse;
for (mse=mdstat; mse; mse=mse->next)
if (mse->devnum != MAXINT &&
- (strcmp(mse->level, "raid1")==0 ||
- strcmp(mse->level, "raid5")==0 ||
- strcmp(mse->level, "multipath")==0)
+ (strcmp(mse->level, "raid0")!=0 &&
+ strcmp(mse->level, "linear")!=0)
) {
struct state *st = malloc(sizeof *st);
mdu_array_info_t array;
and the other one did
@@ -398,10 +402,9 @@ int Monitor(mddev_dev_t devlist,
if (scan) {
struct mdstat_ent *mse;
for (mse=mdstat; mse; mse=mse->next)
- if (mse->devnum != MAXINT &&
- (strcmp(mse->level, "raid1")==0 ||
- strcmp(mse->level, "raid5")==0 ||
- strcmp(mse->level, "multipath")==0)
+ if (mse->devnum != INT_MAX &&
+ (strcmp(mse->level, ...also sprach Linus Torvalds <torvalds@linux-foundation.org> [2007.10.10.0354= This is the part I over-estimated. I thought that Git would figure out that commits 1-3 had been merged into the target and thus apply, in sequence, only the commits from the source which had not been merged. Many thanks (again), Linus! Looking forward to your next content manager; you know, the one with artificial intelligence built in! You could call it "wit" :) --=20 martin; (greetings from the heart of the sun.) \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck =20 dies ist eine manuell generierte email. sie beinhaltet tippfehler und ist auch ohne gro=DFbuchstaben g=FCltig. =20 spamtraps: madduck.bogus@madduck.net
Well, there is also an obvious name choice when the distinguishing innovation is a well-rounded feature set, but it would cause a name collision for the equivalent of "tig". -- David Kastrup, Kriemhildstr. 15, 44793 Bochum -
Yes, *some* SCM's have tried to do that. In particular, the ones that are "patch-based" tend to think that patches are "identical" regardless of where they are, and while re-ordering of them is a special event, it's not somethign that changes the fundamental 'ID' of the patch. For example, I think the darcs "patch algebra" works that way. It's a really horrible model. Not only doesn't it scale, but it leads to various very strange linkages between patches, and it fails the most important part: it means that merges get different results just because Well, the git model is really largely the reverse: the system is supposed to be as *stupid* as humanly possible, but: - make it predictable exactly because it's stupid and doesn't do anything even half-ways smart. This is part of the "it doesn't matter *how* you got to a particular state, git will always do the same thing regardless of whether you moved an existing patch around or whether you re-did the changes as (possibly more than one) new and unrelated commits". - conflicts aren't bad - they're *good*. Trying to aggressively resolve them automatically when two branches have done slightly different things in the same area is stupid and just results in more problems. Instead, git tries to do what I don't think *anybody* else has done: make the conflicts easy to resolve, by allowing you to work with them in your normal working tree, and still giving you a lot of tools to help you see what's going on. So git doesn't try to avoid conflicts per se: the merge strategies are fundamentally pretty simple (rename detection and the whole "recursive merge" thing may not be simple code, but the concepts are pretty straightforward), and they handle all the really *obvious* cases, but at the same time, I feel strongly that anything even half-way subtle should not be left to the SCM - the SCM should show it and make it really easy for the user to then fix it up. Side ...
Actually, specifically darcs, different merges _always_ result in the same
data. It's a fundamental part of is patch algebra. No matter what order
you apply a given set of patches, even with conflicts and reordering, you
always get the same result, or no result. Conflicts are "resolved" by
inserting conflict markers in the file, ordered by the patch ID. It
doesn't matter which order you apply them in, you get the same markers.
Then there will be a merge patch which fixes the markers that someone could
apply, no matter what order the applied the previous patches.
Darcs breaks down in a few places, though.
- The no result. Sometimes, it just can't figure out how to reorder
patches. Even worse, occasionally, the implementation will fail to
terminate try to figure this out. There isn't much to do at this
point, except manually apply the patch, hence generating a new patch
ID.
- It doesn't scale well.
The strange linkages between patches could be thought of as a feature,
since it is basically constraining the order that the patches can be
applied in.
There is a darcs-git project that tries to do the darcs things on top of
git.
Dave
-
No they don't. You don't understand the problem. Yes, different merges WITH THE SAME PATCHES always result in the same data. But that's not a realistic - or even very interesting - schenario. What's much more common is that the same problem gets solved slightly differently in two different branches. For example, maybe somebody does it as two different patches - where the second one fixes a bug in the first fix. And another person does the same fix, but without the bug in the first place. See? A patch-based system gets confused by those kinds of issues (or they turn into various special cases). And that is fundamentally why you MUST NOT take history into account (where "history" is some series of individual patches). Yes, history is interesting for historical reasons, and to explain what the context was, but in many ways, history is exactly the *wrong* thing to use when it comes to merging. You should look at the end result, since people can - and do - come to the same result through different ways. Linus -
[ ehh, sorry for my previous mail, i must be doing something wrong.. ] actually it's broken, according to its author: http://www.mail-archive.com/darcs-users@darcs.net/msg03161.html though i loved darcs, but yes - it scales horribly - VMiklos
This is true. However I think there are some obvious places for improvement that does look at the file history, when the regular algorithm fails; 1. do a --cherry-pick rev-list on just the file being merged and see if all the changes on one side disappear, in which case just take the result. 2. see if the files were identical at some point, in which case use a new merge base for that file based on the changes since that revision. I actually thought #2 was already the way recursive worked! Sam. -
Actually, I think both of these are fundamentally wrong. The reason is that you talk about "the file". Anything that is based on per-file heuristics is going to mean that you use a history that is not necessarily compatible with the _other_ files in the project. I agree 100% that per-file history information is going to find more things to merge automatically. But the point I was trying to make was that "automatic merges" aren't always *good*. I realize that pretty much every single theoretical merge algorithm out there tries to make merges happen automatically as much as possible, but they all en dup having strange issues. For example, take a patch that cherry-picked into mainline from a development branch, but that partly depended on some support-feature that wasn't in mainline yet. So then there is another patch that removes part of that patch from mainline. So mainline is fine. Now, three months later, the development branch is stable, and is fully merged. What happens? Git will largely get this right. Git will look at the last *global* common base, and will just look at the contents, and do a reasonable job. Yes, there will probably be conflicts (because both the development branch and the mainline ended up touching the same parts of the files thanks to the cherry-pick, and yet mainline has some added hacks on top to disable it), but on the whole that's exactly what you want! (Alternatively, maybe the "remove the part that wasn't supported yet" ended up meaning that that particular part of the patch was excised entirely from mainline, and there was no conflict at all, and git just merged the new stuff from the development branch cleanly! So I'm not saying that it *has* to conflict, I'm just saying that it might have). In other words: git always "does the right thing". Assuming both branches are stable and working, git does a very reasonable thing. It's obviously not always the thing people may *want*, but it's guaranteed to be a ...
