Ok, over the last week or so, I've been having a lot more content
conflicts than usual, mostly because of
(a) just the fact that the way the merge window happens for the kernel
these days, rather than have incremental small merges, we often end
up having lots of big ones.
and
(b) I ended up merging a few trees that had lots of small changes all
over, notably to header files having their <config.h> include
removed, causing trivial conflicts.
Now, the good news is that I have to say that our conflict resolution
rocks. It's all been _very_ easy to do. In fact, it's been even more
pleasant than BK was, because of one big issue: you could resolve the
conflict in the tree, then _test_ it (perhaps just compile-test it), and
commit the resolved result separately. With BK, you had to resolve and
commit atomically, and you never had access to a "preliminary resolve" to
test.
However, I also notived one particular thing that I did that we make less
than perfectly easy.
One thing that is _very_ useful to do is to do when you have a conflict is
this:
git log -p HEAD MERGE_BASE..MERGE_HEAD -- conflicting-filename
because this shows all the changes (with their explanations) for that
filename since the MERGE_BASE in _both_ branches you're trying to merge.
This simple command really makes conflict resolution a hell of a lot
easier, because you can see what caused the conflict, and you get a real
feel for what both branches were doing, making it a _lot_ more likely that
you actually do the right thing.
Now, the downside is that the above is both a pain to type, and we don't
actually even save the MERGE_BASE as a head, so you actually have to
compute it yourself. It's easy enough to do:
git-merge-base HEAD MERGE_HEAD > .git/MERGE_BASE
will do it, but the fact is, we should make this even easier.
In fact, after writing the above a few times, I really think there's a
case for making a helper function that does exactly the a...Heh, that's why I kept saying I want somebody to teach rev-list I fall in the former category but as the current maintainer I feel I should leave chance to do the latter to others first. I wouldn't call it "trivial" but it is not that hard -- I think I can write it in my head (as Linus can). -
I actually don't think that expression makes any sense. $(merge-base A B)..B as an expression only makes sense if there is a single point of forking, and no contact apart from that. In that case, what you suggest makes sense, because doing git diff A...B is exactly what you want. HOWEVER. If there has been any other merges in between (but they aren't merge-bases because either branch _also_ did other things), your A...B expression is meaningless, I think. To do a diff in that case, you really need to do my "merge+diff" thing, and no amount of "A...B" expressions on a commit relationship level can be meaningful. Now, the expression A...B == B...A == A B --not $(git-merge-base --all A B) is meaningful (and the one I want for merges), but it's largely useless for anything else. It just means "the set of all commits that aren't trivially in both" (it's not strictly a valid set operation, but it approaches being an "xor" instead of a union or an intersection or a difference). But the above isn't useful for "git diff" and friends any more, it's mainly just for merging. Linus -
You mean something like the following patch on top of the 'next' branch?
It also documents the --not switch because I needed it for the example.
TODO: There are still a few undocumented options left and setup_revisions()
is fat and ugly. Any volunteers? I'd clean it up if I only could
write comprehensible documentation and wasn't that lazy..
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ad6d14c..ffbf0c9 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -15,6 +15,7 @@ SYNOPSIS
[ \--sparse ]
[ \--no-merges ]
[ \--remove-empty ]
+ [ \--not ]
[ \--all ]
[ \--topo-order ]
[ \--parents ]
@@ -37,6 +38,14 @@ not in 'baz'".
A special notation <commit1>..<commit2> can be used as a
short-hand for {caret}<commit1> <commit2>.
+Another special notation is <commit1>...<commit2> which is useful for
+merges. The resulting set of commits is the symmetric difference
+between the two operands. The following two commands are equivalent:
+
+------------
+$ git-rev-list A B --not $(git-merge-base --all A B)
+$ git-rev-list A...B
+------------
OPTIONS
-------
@@ -93,6 +102,11 @@ OPTIONS
--remove-empty::
Stop when a given path disappears from the tree.
+--not::
+ Reverses the meaning of the '{caret}' prefix (or lack
+ thereof) for all following revision specifiers, up to
+ the next `--not`.
+
--all::
Pretend as if all the refs in `$GIT_DIR/refs/` are
listed on the command line as <commit>.
diff --git a/revision.c b/revision.c
index ae4ca82..d4224a1 100644
--- a/revision.c
+++ b/revision.c
@@ -766,6 +766,47 @@ int setup_revisions(int argc, const char
left++;
continue;
}
+ dotdot = strstr(arg, "...");
+ if (dotdot) {
+ unsigned char other_sha1[20];
+ const char *one = arg;
+ const char *two = dotdot + 3;
+ *dotdo...I suspect this has the same problem I pointed out to Kristian's
attempt to make git-branch a built-in.
Subject: Re: [PATCH] Implement git-branch and git-merge-base as built-ins.
Date: Thu, 08 Jun 2006 11:53:48 -0700
Message-ID: <7vverbsclf.fsf@assigned-by-dhcp.cox.net>
Namely, merge-base code is not set up to be called more than
once without cleaning things up.
-Eek! This is not a nice interface. Your example IDs from the your mail
to Kristian:
$ ./git-rev-list 89719209...262a6ef7 66ae0c77...ced9456a | wc
75 75 3075
$ git-rev-list 89719209 262a6ef7 \
--not $(git-merge-base --all 89719209 262a6ef7) \
--not 66ae0c77 ced9456a \
--not $(git-merge-base --all 66ae0c77 ced9456a) | wc
75 75 3075
$ ./git-rev-list 66ae0c77...ced9456a 89719209...262a6ef7 | wc
76 76 3116
$ git-rev-list 66ae0c77 ced9456a \
--not $(git-merge-base --all 66ae0c77 ced9456a) \
--not 89719209 262a6ef7 \
--not $(git-merge-base --all 89719209 262a6ef7) | wc
75 75 3075
Yep, that doesn't seem right. The additional line is 262a6ef (which is
the merge base for 66ae0c77 and ced9456a btw.). The other 4x 75 lines
match.
I wonder why the two clear_commit_marks() calls at the end of
get_merge_bases() are not sufficient, though.
RenWhy does that thing check for "parent->object.parsed"? Also, it only clears commit marks if they are contiguous, but some commit marking may not be dense (eg, the "UNINTERESTING" mark may have been set by (PARENT1 && PARENT2) triggering, but is not set in the commits that reach it. Linus -
That is true. -
I'll be offline for a few hours, but if anybody is inclined to
fix this up, I'd appreciate it to be based on top of 7c6f8aa
commit (which is the tip of js/merge-base topic branch).
Perhaps as a starter (not even compile tested)...
diff --git a/commit.c b/commit.c
index 0431027..23ac210 100644
--- a/commit.c
+++ b/commit.c
@@ -397,13 +397,13 @@ void clear_commit_marks(struct commit *c
{
struct commit_list *parents;
- parents = commit->parents;
+ if (!commit)
+ return;
commit->object.flags &= ~mark;
+ parents = commit->parents;
while (parents) {
struct commit *parent = parents->item;
- if (parent && parent->object.parsed &&
- (parent->object.flags & mark))
- clear_commit_marks(parent, mark);
+ clear_commit_marks(parent, mark);
parents = parents->next;
}
}
@@ -1012,7 +1012,8 @@ static void mark_reachable_commits(struc
}
}
-struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2)
+struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2,
+ int cleanup_needed)
{
struct commit_list *list = NULL;
struct commit_list *result = NULL;
@@ -1081,8 +1082,10 @@ struct commit_list *get_merge_bases(stru
}
/* reset flags */
- clear_commit_marks(rev1, PARENT1 | PARENT2 | UNINTERESTING);
- clear_commit_marks(rev2, PARENT1 | PARENT2 | UNINTERESTING);
+ if (cleanup_needed) {
+ clear_commit_marks(rev1, PARENT1 | PARENT2 | UNINTERESTING);
+ clear_commit_marks(rev2, PARENT1 | PARENT2 | UNINTERESTING);
+ }
return result;
}
diff --git a/commit.h b/commit.h
index 89b9dad..53bc697 100644
--- a/commit.h
+++ b/commit.h
@@ -105,6 +105,6 @@ struct commit_graft *read_graft_line(cha
int register_commit_graft(struct commit_graft *, int);
int read_graft_file(const char *graft_file);
-extern struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2);
+extern struct commit_list *get_merge_bases(struct commit *rev1, struct c...Due to popular request ;-), change get_merge_bases() to be able to
clean up after itself if needed by adding a cleanup parameter.
We don't need to save the flags and restore them afterwards anymore;
that was a leftover from before the flags were moved out of the
range used in revision.c. clear_commit_marks() sets them to zero,
which is enough.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
diff --git a/commit.c b/commit.c
index 70a4eff..94c1d0e 100644
--- a/commit.c
+++ b/commit.c
@@ -1012,7 +1012,8 @@ static void mark_reachable_commits(struc
}
}
-struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2)
+struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2,
+ int cleanup)
{
struct commit_list *list = NULL;
struct commit_list *result = NULL;
@@ -1080,20 +1081,10 @@ struct commit_list *get_merge_bases(stru
tmp = next;
}
- return result;
-}
-
-struct commit_list *get_merge_bases_clean(struct commit *rev1,
- struct commit *rev2)
-{
- unsigned int flags1 = rev1->object.flags;
- unsigned int flags2 = rev2->object.flags;
- struct commit_list *result = get_merge_bases(rev1, rev2);
-
- clear_commit_marks(rev1, PARENT1 | PARENT2 | UNINTERESTING);
- clear_commit_marks(rev2, PARENT1 | PARENT2 | UNINTERESTING);
- rev1->object.flags = flags1;
- rev2->object.flags = flags2;
+ if (cleanup) {
+ clear_commit_marks(rev1, PARENT1 | PARENT2 | UNINTERESTING);
+ clear_commit_marks(rev2, PARENT1 | PARENT2 | UNINTERESTING);
+ }
return result;
}
diff --git a/commit.h b/commit.h
index 3a26e29..779ed82 100644
--- a/commit.h
+++ b/commit.h
@@ -105,7 +105,6 @@ struct commit_graft *read_graft_line(cha
int register_commit_graft(struct commit_graft *, int);
int read_graft_file(const char *graft_file);
-extern struct commit_list *get_merge_bases(struct commit *rev1, struct commit *rev2);
-extern struct commit_l...Btw, I think the symmetric thing is still wrong.
Try this:
git rev-list ^HEAD~1 HEAD...HEAD~2
which _should_ return just HEAD ("HEAD...HEAD~2" should basically expand
into "HEAD HEAD~2 ^HEAD~2")
I haven't actually tested your patch, but I think it returns HEAD and
HEAD~1.
The reason? You clear UNINTERESTING as part of clear_commit_marks(), so
HEAD~1, which was marked thus by the user, gets cleared of its mark as
part of the get_merge_bases() invocation.
I suspect the only way to fix that is to make "get_merge_bases()" not use
UNINTERESTING at all, but instead just explicitly use something like
(obj.flags & (LEFT | RIGHT)) == (LEFT | RIGHT)
instead.
Linus
-No and yes. Patch 1 in the 3+1 series changes the flags used in commit.c to not conflict with the ones in revision.h[*]. So we have two different UNINTERESTINGs, and get_merge_bases() doesn't mess up the show/no-show markings. Ren
Gaah. So UNINTERESTING in commit.c needs something else than everywhere else? That's a bug waiting to happen. Please give it a name of its own. Linus -
Gaah. commit.c defines its own UNINTERESTING and you rely on
not including revision.h which is ... gasp ... #$@#$!!!
Could we do something like this, pretty please?
---
diff --git a/commit.c b/commit.c
index 94c1d0e..a608faf 100644
--- a/commit.c
+++ b/commit.c
@@ -851,14 +851,14 @@ void sort_in_topological_order_fn(struct
/* bits #0..7 in revision.h */
#define PARENT1 (1u<< 8)
#define PARENT2 (1u<< 9)
-#define UNINTERESTING (1u<<10)
+#define STALE (1u<<10)
static struct commit *interesting(struct commit_list *list)
{
while (list) {
struct commit *commit = list->item;
list = list->next;
- if (commit->object.flags & UNINTERESTING)
+ if (commit->object.flags & STALE)
continue;
return commit;
}
@@ -920,17 +920,17 @@ static struct commit *interesting(struct
*
* Next, we pop B and something very interesting happens. It has flags==3
* so it is also placed on the result list, and its parents are marked
- * uninteresting, retroactively, and placed back on the list:
+ * stale, retroactively, and placed back on the list:
*
* list=C(7), result=C(7) B(3)
*
* Now, list does not have any interesting commit. So we find the newest
- * commit from the result list that is not marked uninteresting. Which is
+ * commit from the result list that is not marked stale. Which is
* commit B.
*
*
* Another pathological example how this thing used to fail to mark an
- * ancestor of a merge base as UNINTERESTING before we introduced the
+ * ancestor of a merge base as STALE before we introduced the
* postprocessing phase (mark_reachable_commits).
*
* 2
@@ -960,8 +960,8 @@ static struct commit *interesting(struct
* C7 2 3 7 1 3 2 1 2
*
* At this point, unfortunately, everybody in the list is
- * uninteresting, so we fail to complete the following two
- * steps to fully marking uninteresting commits.
+ * stale, so we fail to complete the following two
+ * steps to fu...You mean crap? Yes, it is. That's why I said we should centralize the commit flags. There are several duplicate names for different commit flags in different files. I couldn't come up with a clean solution, though. I have an idea. Would it help to gather a list of all commit object flags from the different files? And then we would come up with unique names? And then move the definition of all of them to commit.h? Ren
We already did that once, for a subset of the programs. Except we put them in revision.h instead ;) And yes, we should probably do it for most of the rest. Some of them (git-show-branch) have really special flags usage, and don't want to pollute their magic flags with anybody else at all, but most of the rest (fetch.c etc) have just a few flags and could in fact mostly re-use the standard ones. Linus -
Err, I mean object flags (not only commit flags) and object.h instead of commit.h. Ren
Hi, Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Ciao, Dscho -
Don't care if objects have been parsed or not and don't stop when we
reach a commit that is already clean -- its parents could be dirty.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
commit.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/commit.c b/commit.c
index 593414d..70a4eff 100644
--- a/commit.c
+++ b/commit.c
@@ -397,13 +397,12 @@ void clear_commit_marks(struct commit *c
{
struct commit_list *parents;
+ if (!commit)
+ return;
parents = commit->parents;
commit->object.flags &= ~mark;
while (parents) {
- struct commit *parent = parents->item;
- if (parent && parent->object.parsed &&
- (parent->object.flags & mark))
- clear_commit_marks(parent, mark);
+ clear_commit_marks(parents->item, mark);
parents = parents->next;
}
}
--
1.4.1.rc2.gfc04
-There is something quite wrong with this patch. When you are dealing with complex commit ancestry, this ends up traversing the same parent over and over again. I tried to do a merge in linux-2.6 history with these two commits: v2.6.17-g29454dd v2.6.17-gd6b0c53 The former is Linus's head at this writing, and the latter is James Bottomley's scsi-misc head. git describe 29454dd d6b0c53 from the "master" branch returns immediately (the use there can assume that the mark is set and contiguous, I think) while the one with this patch takes forever. Another example is to try changing get_merge_bases() to always clean-up as Johannes had originally and try computing the merge base between the two. Just before it starts to clean-up, it has only seen 5983 objects (object.c::nr_objs) and it definitely would be faster to clean flags from all these objects than to wait for the two calls to clear_commit_marks() to complete, which seems to also take forever. -
Hi, I always had the feeling that it was wrong to traverse not-yet-parsed parents: How could a revision walk possibly come to a certain commit without at least one continuous history of now-parsed objects? Also, AFAIK the revision walk sets flags for each commit it touched, and we should not try to be smart-asses about the flags, but just unset these flags. BTW some very quick tests showed that the clear_commit_marks() thing that I sent to the list was much faster than traversing all objects (which was in my original version). Ciao, Dscho -
The main points were made by Linus already. Traversing is not needed -- not clearing not-yet-parsed is I have a crude workaround pushed out last night but will be replacing it with something less drastic. I think the final version should be what you had, perhaps minus not looking at the parsed flag for unmarking purposes. -
Hi, Traversing is actually wrong. Clearing the marks does not mean to clear them on commits we did not even mark! But clearing on commits we _have_ -- but not parsed -- is important, Isn't the right way to go about it to just clear the marks if we have a commit that has at least one of the marks set, but traverse further only if _in addition to having at least one mark set_ the commit has been parsed already? That would not be a crude workaround. BTW what cases could have a commit, being seen by the revision walk, not have the SEEN mark set? Ciao, Dscho -
If we didn't mark them, then clearing them would be a no-op, so nobody Right. The point is, some logic can choose to mark commits UNINTERESTING without even parsing that commit, and it would be a good thing. You only need to parse the commit once you decide that you need its parents (or it's tree, of course), but you may be able to mark it uninteresting before that. This is why it is _wrong_ to care about the "parsed" bit when clearing the flags. Linus -
Hi, My point being: if we try to clear commits we could not have possibly marked, because they were not yet parsed, this is wrong. Basically, I I will have another look at the users of clear_commit_marks() tomorrow. Ciao, Dscho -
While you say above is correct, object.parsed bit does not give you enough information to decide if we could not have possibly marked or not, because it is perfectly valid to mark a commit that we have not parsed. As Linus said in this thread already (I am rephrasing): - When you have a commit object, which is parsed, you can tell who its parents are; more importantly, at that point you have access to the commit objects for the parents already, but they may not have been parsed. - If you can make a decision whether the parents of a commit should be marked or not solely by looking at the commit, you are allowed to do so before parsing these parents. The commit you look at to make that decision has to be parsed for you to know who the parents are, though. Marking unparsed objects is a valid operation because even unparsed objects have flags field that is retained when they are later lazily parsed. Right now, in the inner loop of the main loop of merge_base() code, we parse each parent and insert it into the &list, but instead we could parse commit (if not parsed yet) just before taking its parents list in the outer loop. That way we would parse the parents lazily and will have commits marked but still not parsed. -
No, that's not the problem.
The problem is that if we unconditionally traverse parents - whether
parsed or not - any merge will basically result in a 2* expansion of the
working set: we'll traverse all children twice (whether they meet again or
not).
So the cost of doign unconditional traversals of parents basically
approaches 2^n, where 'n' is the number of merges.
Now, the fact that we only traverse parents without adding new ones (and
the decision on whether it is parsed or not is irrelevant - the only
relevant part being that we don't parse any _new_ ones) means that each
commit itself is very cheap to traverse, but O(2^n) ends up meaning that
even a small constant will eventually be pretty big.
The proper fix is _not_ to add the "object->parsed" check (which is silly,
wrong, and doesn't fix anything at all), but to add a check for whether
the object has been seen or not.
In the case of clearing flags, you have two choices:
- _set_ a new flag ("already cleared"). This would work - once - but is
obviously pretty bad.
This is what we do in all the other cases. We usually call the flag
SEEN or similar.
- depend on the flags being "dense", and saying that we depend on the
fact that in order for any of the flags to have been set in the first
place, at least _one_ of them needs to be set in the path leading up to
that commit.
Now, for the specific case of get_merge_bases(), the flags _are_ dense.
Individual flags may not be (eg the "UNINTERESTING" flag, whatever it's
called, will not be dense), but the question of "is _any_ of the flags we
care about set" _will_ be dense.
As such, adding a
/* Have we already cleared this? */
if (!(mask & object->flags))
return;
object->flags &= ~mask;
to the traversal function will fix the O(m+2^n) behaviour, and turn the
traversal into O(m+n) (where "n" is the number of merges, and "m" is the
total number of commits).
Linus
-Hi, I thought we already did this, and did not even check. My bad. Ciao, Dscho -
'A...B' is a shortcut for 'A B --not $(git-merge-base --all A B)'.
This XOR-like operation is called symmetric difference in set
theory.
The symbol '...' has been chosen because it's rather similar to the
existing '..' operator and the somewhat more natural caret ('^') is
already taken.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
Documentation/git-rev-list.txt | 14 ++++++++++++
revision.c | 48 +++++++++++++++++++++++++++++++++-------
2 files changed, 53 insertions(+), 9 deletions(-)
diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ad6d14c..6c370e1 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -15,6 +15,7 @@ SYNOPSIS
[ \--sparse ]
[ \--no-merges ]
[ \--remove-empty ]
+ [ \--not ]
[ \--all ]
[ \--topo-order ]
[ \--parents ]
@@ -37,6 +38,14 @@ not in 'baz'".
A special notation <commit1>..<commit2> can be used as a
short-hand for {caret}<commit1> <commit2>.
+Another special notation is <commit1>...<commit2> which is useful for
+merges. The resulting set of commits is the symmetric difference
+between the two operands. The following two commands are equivalent:
+
+------------
+$ git-rev-list A B --not $(git-merge-base --all A B)
+$ git-rev-list A...B
+------------
OPTIONS
-------
@@ -93,6 +102,11 @@ OPTIONS
--remove-empty::
Stop when a given path disappears from the tree.
+--not::
+ Reverses the meaning of the '{caret}' prefix (or lack
+ thereof) for all following revision specifiers, up to
+ the next `--not`.
+
--all::
Pretend as if all the refs in `$GIT_DIR/refs/` are
listed on the command line as <commit>.
diff --git a/revision.c b/revision.c
index b963f2a..9eb0b6d 100644
--- a/revision.c
+++ b/revision.c
@@ -536,6 +536,18 @@ void init_revisions(struct rev_info *rev
diff_setup(&revs->diffopt);
}
+static void add...Add get_merge_bases_clean(), a wrapper for get_merge_bases() that cleans
up after doing its work and make get_merge_bases() NOT clean up.
Single-shot programs like git-merge-base can use the dirty and fast
version.
Also move the object flags used in get_merge_bases() out of the range
defined in revision.h. This fixes the "66ae0c77...ced9456a
89719209...262a6ef7" test of the ... operator which is introduced with
the next patch.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
All object flags should probably be gathered into one place to avoid
future conflicts.
commit.c | 20 ++++++++++++++++----
commit.h | 1 +
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/commit.c b/commit.c
index 0431027..593414d 100644
--- a/commit.c
+++ b/commit.c
@@ -849,9 +849,10 @@ void sort_in_topological_order_fn(struct
/* merge-rebase stuff */
-#define PARENT1 1
-#define PARENT2 2
-#define UNINTERESTING 4
+/* bits #0..7 in revision.h */
+#define PARENT1 (1u<< 8)
+#define PARENT2 (1u<< 9)
+#define UNINTERESTING (1u<<10)
static struct commit *interesting(struct commit_list *list)
{
@@ -1080,9 +1081,20 @@ struct commit_list *get_merge_bases(stru
tmp = next;
}
- /* reset flags */
+ return result;
+}
+
+struct commit_list *get_merge_bases_clean(struct commit *rev1,
+ struct commit *rev2)
+{
+ unsigned int flags1 = rev1->object.flags;
+ unsigned int flags2 = rev2->object.flags;
+ struct commit_list *result = get_merge_bases(rev1, rev2);
+
clear_commit_marks(rev1, PARENT1 | PARENT2 | UNINTERESTING);
clear_commit_marks(rev2, PARENT1 | PARENT2 | UNINTERESTING);
+ rev1->object.flags = flags1;
+ rev2->object.flags = flags2;
return result;
}
diff --git a/commit.h b/commit.h
index 89b9dad..3a26e29 100644
--- a/commit.h
+++ b/commit.h
@@ -106,5 +106,6 @@ int register_commit_graft(struct commit_
int read_graft_file(const char *graft_file);
...Hi, Of COURSE! *hits his head on the table; table breaks; has to buy a new table* Ciao, Dscho -
I missed to notice that Johannes had added those calls there; we should remove them from get_merge_bases(). The normal case of git-merge-base calling get_merge_bases() once and exiting should NOT have to pay for the clean-up cost at all. -
Hi, I wonder, too. In all of my tests, this call was sufficient to reset the marks. I really wonder what happens here. At least one of the three flags As you hinted with your patch in another mail: no, we should not. Stuff them into an if(), yes. But the point of a library function is not to make it hard on users, but rather to make life easier. The user should _not_ have to investigate which loops to jump through to make use of this Note that this was partly the reason for my hesitation to propose this patch for inclusion already. If you make this function a library function, the main user is _not_ git-merge-base any more. Everybody can use it from now on. Ciao, Dscho P.S.: I was having to much fun tonight to investigate in detail... Sorry. -
What's the logic behind naming the operator "..."?
Seems like asking for trouble to have two visually similar operators (".." and
"...") with different meanings, and "..." seems like kind of an arbitrary
choice anyway.
A symmetric difference is basically equivalent to an xor--would a carat ("^")
work? Or could we just stick a word there instead of using some tricky
notation?
--b.
-Well, if ".." is set difference, why not "..." for symmetric set difference. The operations really _are_ related. Also, the parse syntax and logic really is the same, even if Rene's patch didn't take advantage of that. That said, it does have a real downside, and that's simply that it can take a long time to compute. Somebody should also verify that there are no interesting interaction with the fact that we end up traversing the commit lists twice (no object flag interactions etc) with the new "get_merge_bases()" Linus -
I think a...b can be computed by (in pseudocode, obviously):
mark(a, 1);
add(list, a);
mark(b, 2);
add(list, b);
while (interesting(list)) {
if (*list->marks != 3)
output(*list);
for (parent : *list->parents) {
mark(parent, *list->marks);
add(list, parent);
}
}
It's basically the original merge-bases code, from way back; the several
flaws with it for computing a good merge base don't matter if you're just
excluding the merge bases. If you look at the big examples in merge-base.c
(pre-libification), it's obvious that what we want for a...b is everything
marked 1 or 2, and the trickiness in that code is getting things correctly
marked 3 versus 7, which doesn't matter here.
-Daniel
*This .sig left intentionally blank*
-And it has basically the same bug. It is possible to have a / \ b c |\ /| d e f \|/ g and clearly "e" is the only valid merge-base of b and c. HOWEVER. It's actually possible that we traverse d, f and g before we even look at 'e' (because somebody had a bogus date, and 'e' _looks_ old). Remember: in a distributed system we have no global clock, so any graph traversal ordering we choose is by definition always arbitrary, even though we can obviously _try_ to choose one that is efficient in practice (ie the "sort the heap by date). So that's why git-merge-base has all that extra "unnecessary" complexity. You cannot output anything at all until you've guaranteed that all pending objects are uninteresting. Linus -
But that wouldn't actually affect b...c, because we don't actually care that 'e' is the correct merge-base and 'g' is not, because "b c ^e ^g" is the same as "b c ^e". Your point is correct, though; if we look at e before c, we could think that it's interesting when it isn't, so we have to wait until we've That's not all the complexity in git-merge-base, though. There's a ton more that's about why e is right and g is wrong in your example, and we don't care about *that* part in b...c. Actually, I think that it would work to have object flags "LEFT" and "RIGHT", mark b with left, mark c with right, and mark anything with both LEFT and RIGHT as UNINTERESTING as we go through the revisions. The time-ordering problem with symmetric difference isn't absent with regular difference, and, assuming that b..c works in the tricky cases, the same logic should handle symmetric difference. -Daniel *This .sig left intentionally blank* -
You're right - in ths case we don't care about a minimal base commit set at all, it's fine to have too many. I think your patch to do the LEFT/RIGHT thing in git-rev-list internally, instead of generating it as part of the command line, looks fine in theory. Except I think you need to set "revs->limited" for that case too (normally it gets set by "handle_commit()", and only if there is an UNINTERESTING commit: we'd need to add code to set it for LEFT/RIGHT commits too. Linus -
That is: (this only has the logic portion, and it's against master, so it
isn't actually a really working patch or anything; also, it doesn't handle
"--not a...b" correctly, whatever that should mean)
---
diff --git a/revision.c b/revision.c
index 6a6952c..c21d332 100644
--- a/revision.c
+++ b/revision.c
@@ -351,6 +351,9 @@ static void add_parents_to_list(struct r
return;
commit->object.flags |= ADDED;
+ if (commit->object.flags & LEFT && commit->objects.flags & RIGHT)
+ commit->object.flags |= UNINTERESTING;
+
/*
* If the commit is uninteresting, don't try to
* prune parents - we want the maximal uninteresting
@@ -781,8 +784,13 @@ int setup_revisions(int argc, const char
struct object *exclude;
struct object *include;
- exclude = get_reference(revs, this, from_sha1, flags ^ UNINTERESTING);
- include = get_reference(revs, next, sha1, flags);
+ if (symmetric) {
+ exclude = get_reference(revs, this, from_sha1, flags ^ UNINTERESTING);
+ include = get_reference(revs, next, sha1, flags);
+ } else {
+ exclude = get_reference(revs, this, from_sha1, flags | LEFT_HALF);
+ include = get_reference(revs, next, sha1, flags | RIGHT_HALF);
+ }
if (!exclude || !include)
die("Invalid revision range %s..%s", arg, next);
diff --git a/revision.h b/revision.h
index 7d85b0f..93421e6 100644
--- a/revision.h
+++ b/revision.h
@@ -9,6 +9,8 @@
#define BOUNDARY (1u<<5)
#define BOUNDARY_SHOW (1u<<6)
#define ADDED (1u<<7) /* Parents already parsed and added? */
+#define LEFT_HALF (1u<<8) /* Reachable from start of dotdotdot */
+#define RIGHT_HALF (1u<<9) /* Reachable from end of dotdotdot */
struct rev_info;
struct log_info;
--
1.2.4
-[concept patch snipped]
You mean something like the patch below? It seems to work, but in my
unscientific tests it's significant slower than the version based on
get_merge_bases() (0.17s vs 0.05s for
"git-rev-list 89719209...262a6ef7 66ae0c77...ced9456a"). Did I do
something wrong?
You had no mark_parents_left_right() in your patch. I added it because
otherwise it wouldn't remove any common commits. Was this supposed to
work some other way?
We still need an automatic test case, and a better benchmark.
diff --git a/revision.c b/revision.c
index ebee05a..8c494ee 100644
--- a/revision.c
+++ b/revision.c
@@ -339,6 +339,20 @@ static void try_to_simplify_commit(struc
commit->object.flags |= TREECHANGE;
}
+static void mark_parents_left_right(struct commit *commit)
+{
+ unsigned int flags = commit->object.flags & (RIGHT_HALF | LEFT_HALF);
+ struct commit_list *parents = commit->parents;
+
+ while (parents) {
+ struct commit *p = parents->item;
+ p->object.flags |= flags;
+ if (p->parents)
+ mark_parents_left_right(p);
+ parents = parents->next;
+ }
+}
+
static void add_parents_to_list(struct rev_info *revs, struct commit *commit, struct commit_list **list)
{
struct commit_list *parent = commit->parents;
@@ -347,6 +361,13 @@ static void add_parents_to_list(struct r
return;
commit->object.flags |= ADDED;
+ if (commit->object.flags & (RIGHT_HALF | LEFT_HALF)) {
+ if (commit->object.flags & RIGHT_HALF &&
+ commit->object.flags & LEFT_HALF)
+ commit->object.flags |= UNINTERESTING;
+ mark_parents_left_right(commit);
+ }
+
/*
* If the commit is uninteresting, don't try to
* prune parents - we want the maximal uninteresting
@@ -772,7 +793,10 @@ int setup_revisions(int argc, const char
unsigned char from_sha1[20];
const char *next = dotdot + 2;
const char *this = arg;
+ int symmetric = *next == '.';
+
*dotdot = 0;
+ next += symmetric;
...I'd been assuming that there was something that would propagate flags to parents in general in add_parents_to_list(). Of course, that doesn't make sense for arbitrary flags. It might be better to handle it there, and avoid traversing parent lists twice. I'm surprised that it isn't faster than using get_merge_bases(); I'd expect it to be faster than the call to get_merge_bases(), let alone get_merge_bases() plus the processing of output candidates. It should be doing less work that get_merge_bases() ultimately does (since get_merge_bases() has to do the boundary calculation after doing practically everything that the left and right addition to revision.c does), so there's clearly something strange going on. -Daniel *This .sig left intentionally blank* -
Yes.
However, I think that 90% of the code for the ".." and "..." case are the
same, as is largely the finding of it.
So why not just do this all inside the already existing
dotdot = strstr(arg, "..");
if (dotdot) {
unsigned char other_sha1[20];
const char *one = arg;
const char *two = arg + 2;
int symmetric = *two == '.';
*dotdot = '\0';
two += symmetric;
if (one == arg)
one = "HEAD";
if (!*two)
two = "HEAD";
...
because the only difference is really at the very end.
Did you test that it looks correct too?
Linus
-Hrm, I'm not sure this is really cleaner. The two operators consist
of all dots only coincidentally, this is not functionally inherent.
So I think it's better to keep them apart.
Let's see.. [Time passes. A patch materializes at six o'clock.]
With a little helper factored out this doesn't look as bad as I
Sort of; I checked that the two forms (with ... and $(git-merge-base))
gave the same results for 7b8cf0cf and 51d1e83f, that's all. For a
proper test script we'd need to create a repo for which git-merge-base
can report multiple results. I wasn't able to come up with the needed
commands without thinking and gave up for now. Am working on it..
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index ad6d14c..6c370e1 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -15,6 +15,7 @@ SYNOPSIS
[ \--sparse ]
[ \--no-merges ]
[ \--remove-empty ]
+ [ \--not ]
[ \--all ]
[ \--topo-order ]
[ \--parents ]
@@ -37,6 +38,14 @@ not in 'baz'".
A special notation <commit1>..<commit2> can be used as a
short-hand for {caret}<commit1> <commit2>.
+Another special notation is <commit1>...<commit2> which is useful for
+merges. The resulting set of commits is the symmetric difference
+between the two operands. The following two commands are equivalent:
+
+------------
+$ git-rev-list A B --not $(git-merge-base --all A B)
+$ git-rev-list A...B
+------------
OPTIONS
-------
@@ -93,6 +102,11 @@ OPTIONS
--remove-empty::
Stop when a given path disappears from the tree.
+--not::
+ Reverses the meaning of the '{caret}' prefix (or lack
+ thereof) for all following revision specifiers, up to
+ the next `--not`.
+
--all::
Pretend as if all the refs in `$GIT_DIR/refs/` are
listed on the command line as <commit>.
diff --git a/revision.c b/revision.c
in...Hi, Aaah! Junio, Linus, I see the light now. Ciao, Dscho -
Oh, I guess it _is_ perfectly valid. It's called a "symmetric difference" in set theory. So from a set standpoint: Git op: Set theory: git-rev-list a..b // difference: B - A git-rev-list b..a // difference: A - B git-rev-list a b // union of A B (order doesn't matter) git-rev-list a...b // symmetric difference A B (order doesn't matter) git-rev-list $(git-merge-base --all a b) // intersection of A and B I think. Linus -
| Jacob Meuser | Re: Wasting our Freedom |
| Can E. Acar | Re: Wasting our Freedom |
| Ingo Molnar | [announce] CFS-devel, performance improvements |
| Pardo | Re: pthread_create() slow for many threads; also time to revisit 64b context switc... |
git: | |
| Li Frank-B20596 | why not TortoiseGit |
| Jon Smirl | ! [rejected] master -> master (non-fast forward) |
| Johannes Schindelin | Re: [PATCH] "not uptodate" changed to "has local changes" |
| Nigel Magnay | crlf with git-svn driving me nuts... |
| Damian Gerow | Oddly high load average |
| Bertram Scharpf | First install: Grub doesn't find partitions |
| Jon Morby | IPv6 and OpenBGPD - Protocol not available |
| Brandon Lee | DELL PERC 5iR slow performance |
| Volker Armin Hemmann | build error with 2.6.27.6+reiser4+ehci-hub patch. ERROR: "mii_ethtool_gset" [drive... |
| Evgeniy Polyakov | [resend take 2 0/4] Distributed storage. |
| Rafael J. Wysocki | Re: Regression in 2.6.27-rc7: Wake On LAN with sky2 broken |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| kernel module to intercept socket creation | 31 minutes ago | Linux kernel |
| Question on swap as ramdisk partition | 2 hours ago | Linux kernel |
| sysctl - dynamic registration problem | 2 hours ago | Linux kernel |
| serial driver xmit problem | 6 hours ago | Linux kernel |
| Generic Netlink subsytem | 7 hours ago | Linux kernel |
| 'Report spam filter error' page broken | 9 hours ago | KernelTrap Suggestions and Feedback |
| Netfilter kernel module | 1 day ago | Linux kernel |
| Why Windows is better than Linux | 1 day ago | Linux general |
| How can I see my kernel messages in vt12? | 1 day ago | Linux kernel |
| Grub | 1 day ago | Linux general |
