I've been converting some old CVS repositories to git, and as it turns
out, these repositories consist of a number of main branches of the same
project that were created at several points in time (the stable release
branches), and the branches contain numerous backports (and a few
forward ports) between each other.

I.e. the branches split off each other at various points in time, and
evolved independently ever since (except for the numerous backports).

Now, the backports can be implemented using a mere "git cherry-pick -x",
but that creates these silly text references to the original commits.
I'd rather use something that gitk can visualise.

So I tried to use the parents of the commit to reference the origin(s).
I.e. the first parent links to the linear history on a given branch, but
the second (and possibly more) parents point to the cherry-picked
back-ported commit from another branch. This graft-constructed
repository is then fed to filter-branch to make it permanent.
To view it try: git://pike-git.lysator.liu.se/pikex

This works quite well and shows the following results:
- gitk shows proper grafts.
- gitk properly shows a zero-diff between the new commit and the
  commits we cherry-picked from.
- It even works perfectly when picking from multiple parents.
- gitk is confused in its display of tags preceding and following this
  commit (depending on the situation it mixes up the branches).

Obviously the reason it works rather well is because git can actually
distinguish between a merge and a backport because of the way the
contents of the trees change.

The questions now are:
- Would there be good reason not to record the backport/forwardport
  relationship in the additional parents of a commit?
- Since most of the git machinery (git diff, and gitk, most notably)
  seems to work just fine when using parents for that purpose, would it
  be acceptable to create a patch to cherry-pick to support an option to
  record the backport/forwardport relationship in the second (or...
Even though the answer to "the previous question" is a solid no, it is not
just acceptable but it would be very useful to teach gitk that the commit
you cherry-picked from is somehow related to the resulting commit from the
cherry-picking, and teach it to give you an easy access (and even a visual
cue about their relationship) to the other commit when it is showing the
cherry-picked commit.

I think the commit object name -x records in the commit message of the
cherry-picked one is noticed by gitweb to give you an easy access. You
could teach gitk a similar trick, and that would not just help cherry
picking but also reverts, and a fix-up commit that says "This fixes the
regression introduced by commit 90ff09a5".

You could further draw a _different_ kind of line on the upper "graph" pane,
to show that a commit is _related_ to another commit. Because cherry-pick
relation is about the resulting commit and the _single_ commit that was
cherry-picked (in other words, the parent of the cherry-picked original
does not have _any_ relation to the commit that results from the
cherry-pick), such a line should be visually very distinct from the usual
parent-child relationship, which is what the gitk graph (or any other commit
ancestry graph) is about. But if it can be represented clearly, I'd
imagine that it would be interesting to see.
--
Checking the on-disk format I see that it has been defined in a rather
extensible way.

If we were to put the SHA1-ref somewhere in the commit message,
finding references to a certain commit through cherry-picks becomes
rather disk/CPU-intensive.

Would there be any objections against extending the on-disk format to
accommodate something like the following:

commit 7df437e56b5a2c5ec7140dd097b517563db4972c
tree a006f20b481d811ccb4846534ef6394be5bc78a8
parent ff1e8bfcd69e5e0ee1a3167e80ef75b611f72123
parent bbb896d8e10f736bfda8f587c0009c358c9a8599
cousin 6ffaecc7d8b2c3c188a2efa5977a6e6605d878d9
cousin a1184d85e8752658f02746982822f43f32316803
author Junio C Hamano <gitster@pobox.com> 1220153499 -0700
committer Junio C Hamano <gitster@pobox.com> 1220153499 -0700

Whereas cherry-pick would (optionally) generate a cousin reference for every
commit it picks.

I'm willing to do the work to fix up git-core to support the new field.
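
For comparison, the headers that a commit object carries today can be
inspected with "git cat-file -p". The following is only a sketch: it
builds a throwaway repository (the identity, file name and commit
messages are made up for the example) and lists the header names of the
newest commit. No "cousin" header appears, since that field is only a
proposal in this thread.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q
echo one > file
git add file
git commit -q -m first
echo two > file
git commit -q -a -m second

# Print the raw commit object: header lines, a blank line, then the message.
git cat-file -p HEAD

# Collect just the header names (everything before the first blank line).
headers=$(git cat-file -p HEAD |
    awk 'NF == 0 { exit } { printf "%s%s", sep, $1; sep = " " }')
echo "$headers"
```

A root commit would show no "parent" line at all, and a merge would show
one "parent" line per parent.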
--
Sincerely,
Stephen R. van den Berg.
The Horkheimer Effect: "The odds of it being cloudy are directly proportional
to the importance of an astronomical event."
--
Sorry for wandering into a thread in the middle. But we've already
been down this road before, and decided the additional header wasn't
worth it from cherry-pick. What's changed? The fact that gitk
wants to hyperlink this? Why can't it just regex out a string of
hex digits longer than 6 and see if there is a commit that matches?
--
Shawn.
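
A minimal sketch of the regex approach suggested above, assuming a POSIX
shell and a grep that supports -o and -w. The commit message is made up
for illustration; the two object names in it are the ones quoted earlier
in this thread.

```shell
# A made-up commit message containing one full and one abbreviated object name.
message='Backport the fix to the stable branch.

(cherry picked from commit 6ffaecc7d8b2c3c188a2efa5977a6e6605d878d9)
This fixes the regression introduced by commit 90ff09a5.'

# Pull out whole words made of 7 to 40 hex digits ("longer than 6").
candidates=$(printf '%s\n' "$message" | grep -owE '[0-9a-f]{7,40}')
echo "$candidates"

# Inside a repository, each candidate could then be checked with:
#   git rev-parse --verify --quiet "$id^{commit}"
```

Ordinary words that happen to consist of hex letters would also match, so
the verification step is what separates real commits from noise.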
--
The main difference is that git-cherry too often does not work right now
(I have a local patch so that it more-or-less works with my GNU
ChangeLogs, but it does not improve the situation if rerere kicks in).

(Of course git cherry could also regex out a string of hex digits longer
than 6 and see if there is a commit that matches).

Paolo
--
I'm not familiar with the old thread. Any pointers? (I tried googling,
To avoid (accidental) duplication of the old thread, I'll try and read
that first.
--
Sincerely,
Stephen R. van den Berg.
--
Found it, thanks. Digested it, and yes, some things changed.
I'll make a formal proposal which should take care of all the old
objections.
--
Sincerely,
Stephen R. van den Berg.
"Be spontaneous!"
--
What about "origin", and making it propagated through cherry-picks? In
other words, if I "cherry-pick -x" A generating B, and do the same on B
generating C, C should have A as origin. Also, "git cherry-pick -n -x"
should add the commit to a list of origins somewhere so that "git
commit" can reuse it.

Furthermore, "git cherry" should use origins if available.

Paolo
--
That is debatable, and should be configurable with a switch.
It depends on the way of operation, I guess.
If one picks A -> B, and then B -> C, then usually for C you want B
to be the origin to indicate that the patch has been tested and shaved
to fit from A -> B, and further polished to fit from B -> C.
Usually backporting involves shaving the patch slightly to fit the older
branch, and in that case it is truly more honest to point back to B
instead of A from C. And besides, you can follow the chain C -> B -> A.

That is one of the places in git that needs to accommodate the new field;
luckily the impact on the rest of git-core is rather minimal, I think.
--
Sincerely,
Stephen R. van den Berg.
--
Unfortunately I think it is more complicated than that.
If I understand correctly (please correct me if I am wrong) you meant
'cousin' / 'origin' link to refer only to single commit, and not to
the whole history ending with given commit, as it is in the case of
'parent' link. One thing to consider is the fact that git is
_snapshot_ based, while cherry-picking is _changeset_ based. When you
cherry-pick commit B to apply on top of commit A, what you do in fact
is to pick the (B^..B) or in other syntax (B-B^) change, and apply it on
top of A. So the cherry-picked B, let's denote it by B', is in fact
B'=A+(B-B^). For example, having only commit B is not enough to replay
cherry-picking.

Second, unless such header would be for informational purposes only
(there was even proposal to add generic 'note <sth>' informational
only header, but it was shot down; see the archives), you would have
to do quite a bit of surgery to revision walking code. For example
you would have to think about how commits pointed by 'origin' header
would be protected against pruning; if you allow to prune parents of
grandparents of cherry-picked commits, you would break I think a lot
of assumptions in the code, and assumption in git design that if we
have commit, then all that it references should be available (well,
there are grafts, and there is shallow clone, but those modify
reachability...).
--
Jakub Narebski
Poland
ShadeHawk on #git
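
The changeset view of cherry-picking can be sketched as follows. This is
an illustration only: the throwaway repository, the "commit" helper, and
the branch and file names are all made up. It replays the difference
between a commit and its parent by hand and compares the resulting tree
with what "git cherry-pick" produces.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q

# commit <name>: add a file called <name>.txt, commit it, tag the commit <name>.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
    git tag "$1"
}

commit base
git checkout -q -b side            # side:   base---P---B
commit P
commit B
git checkout -q -b target base     # target: base---A
commit A

# Replay only the change B^..B on top of A; note that B's parent is needed.
git diff B^ B -- | git apply --index
by_patch=$(git write-tree)

# Throw that away and let cherry-pick do the same job.
git reset -q --hard
git cherry-pick B > /dev/null
by_pick=$(git rev-parse 'HEAD^{tree}')

echo "$by_patch"
echo "$by_pick"
```

Both lines print the same tree name in this simple case. A real
cherry-pick does a three-way merge, so it can succeed where a plain patch
would not apply cleanly.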
--
I think it would be used mostly for informational purposes (for
hyperlinks in gitk, and in git-cherry).

Paolo
--
In general, I do not think what you did is a good idea. The _only_ case
you can do what you did and keep your sanity is if you cherry-picked every
single commit that matters from one branch to the other.

If something is not "parent", you shouldn't be recording it as such.
Remember, when you are making a commit on top of one or more parents, you
are making this statement:

 * I have considered all histories leading to these parent commits, and
   based on that I decided that the tree I am recording as a child of
   these parents suits the purpose of my branch better than any of them.

This applies to one-parent case as well.

Imagine you have two histories, forked a long time ago, and have a
side-port of one commit:

           o---...o---B---A
          /
  ---o---o---...o---X---A'

What side-porting A from the top history to create A' in the bottom
history means is that the change between B and A in the top history, and
no other change from the top history, is applied on top of X to produce
state A' in the bottom history. What B did is not included in the bottom
history.

If you recorded A' with parents A and X, here is what you would get:

           o---...o---B---A
          /                \ (wrong)
  ---o---o---...o---X-------A'

But that is not what you did. The tree state A' lacks what B did, which
could be a critical security fix, and you didn't consider all history that
leads to A when you cherry-picked it to create A'.

To put it another way, having the parent link from A' to A is a statement
that A' is a superset of A. Because A contains B, you are claiming A'
also contains B, which is not the case in your cherry-picked history.
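
The mismatch between what the graph claims and what the tree holds can be
checked with a small experiment. This is a sketch under assumptions: the
throwaway repository, the "commit" helper, and the branch, tag and file
names are invented to mirror the B, A, X and A' of the pictures above.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q

# commit <name>: add a file called <name>.txt, commit it, tag the commit <name>.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
    git tag "$1"
}

commit base
git checkout -q -b top             # top:    base---B---A
commit B
commit A
git checkout -q -b bottom base     # bottom: base---X
commit X
git cherry-pick A > /dev/null      # A' with the single parent X

# Does history claim that B is included?  For the plain cherry-pick: no.
if git merge-base --is-ancestor B HEAD; then honest=claims-B; else honest=no-B; fi

# Same tree as A', but with A recorded as a second parent (the "wrong" graph).
fake=$(echo "A' with A as parent" | git commit-tree 'HEAD^{tree}' -p X -p A)
if git merge-base --is-ancestor B "$fake"; then faked=claims-B; else faked=no-B; fi

# The recorded tree never contained B's change either way.
if git cat-file -e "$fake:B.txt" 2> /dev/null; then tree=has-B; else tree=lacks-B; fi

echo "plain cherry-pick: $honest; extra parent: $faked; tree: $tree"
```

With the extra parent, ancestry says B is included while the tree says it
is not, which is exactly the false claim described above.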
--
I agree with this, though I can see the motivation (for example that
git-patch-id, and hence git-cherry, often do not work because of context
changes).

This thread, however, spurred one observation and a question.
Observation: it seems to me that cherry-picking and merging are mutually
exclusive workflows. You cherry-pick from a development branch to a
stable branch, you merge or rebase in the other direction. Is this true
in general? (I can see the obvious exception: you might cherry-pick
that very important bugfix, if you're not ready to do a full merge; but
if you rebase, that commit will go away as soon as you do it).

Question: how are topic branches managed in git.git? In particular, how
are "graduations to master" done? Do you cherry-pick the merge commit
that went into "next"?

Paolo
--
Topics meant for master are always forked at the tip of master (or older)
and they are merged back to master when they prove Ok.

Topics that are fixes are forked at maint (or older) and if they are
trivial they are merged straight to maint and get merged up to master.
Otherwise they are merged to next, cooked for a while, then merged to
master, cooked even more, and then finally merged to maint.

Of course, I am not perfect (I said I am not Linus, didn't I?) and do not
have perfect foresight, so sometimes I do a fix directly on 'master' and
realize it should also apply to 'maint' some time later. They need to be
cherry-picked.
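
The path of a non-trivial fix described above could look roughly like
this. It is a sketch, not the actual git.git procedure: the throwaway
repository, the branch name "fix-topic", the file names, and the use of
--no-ff are assumptions made for the example, and the "cooking" periods
between the merges are of course left out.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

echo base > base.txt
git add base.txt
git commit -q -m base
git branch maint
git branch next

# A non-trivial fix forks from maint ...
git checkout -q -b fix-topic maint
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix a bug"

# ... is merged to next first, later to master, and finally to maint.
for branch in next master maint; do
    git checkout -q "$branch"
    git merge -q --no-ff -m "Merge fix-topic into $branch" fix-topic
done

# Every integration branch now contains the very same fix commit.
reached=
for branch in next master maint; do
    if git merge-base --is-ancestor fix-topic "$branch"; then
        reached="$reached $branch"
    fi
done
echo "fix-topic is part of:$reached"
```

Because the topic forks at maint, it can be merged everywhere without any
cherry-pick; a fix made directly on master cannot.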
--
Wouldn't that be the normal use case for these kind of side-port

It depends on what you define to be a parent. The git repository
doesn't care either way (that's the beauty of the format definition of
the git repository, just as the tree snapshots allow for later more
complicated diff/blame processing history, so do the parent
relationships allow for more complicated parent references which were

That is a statement which depends on the view of the user. I concur
that up till now, that is what a user says. But maybe it is possible to
accommodate both the traditional statement and the sideport-statement

Which existing git command actually misbehaves because it makes the
above assumption?
--
Sincerely,
Stephen R. van den Berg.
"The future is here, it's just not widely distributed yet." -- William Gibson
--
Read what you are quoting again and notice I explicitly said "suits the
purpose of *MY* branch better". In your side-port example, if (perhaps a
critical security bugfix) B does *not* matter for the purpose of *your*
branch (perhaps because you know the product built from the branch you are
cherry-picking into will not be used in a context that would be affected
by the bug), it is perfectly fine to record the cherry pick source as a
parent of A'.

One ramification of this, however, is that you will give the wrong impression
that such a branch contains the bugfix B to other people. By merging A'
(not A) to their history, they think they obtained the bugfix B through
you, but in fact they are *not* getting the fix. Running diff between A
to A' will reveal that in fact with the "merge" A' you discarded the fix
in B. This makes your branch that has A' in its history useless for
people other than you. But it can still be said that the resulting
history suits the purpose of *your* branch.

I said (and I maintain) it is not a good idea *in general*; building that
kind of history is just not a normal thing to do, and it will lead to
confusion unless you are careful and know what you are doing. I still do
not necessarily agree that what you did is "the normal use case for these
kind of side-port", but people who consider it the normal use case would
be careful and know exactly what they are doing. It is Ok in that kind of
context.

But just do not recommend it blindly to people who do not understand the
consequences, one example of which is that you cannot get the bugfix B by
merging A' to your branch as I mentioned above.

At that point, the choice becomes between merging from you (i.e. A') or
not merging from you. The other people may find that merging from you
to honor *your* choice of discarding the bugfix made by B does *not* suit
the purpose of *their* branch, in which case they just do not merge from
you, and that is perfectly fine.

It is all relative --- each owner of hist...
Most importantly: merge.
If you later merge the top branch into the bottom one, the merge-base
is now A. So any such merge, that under normal circumstances would
have integrated B (which as Junio said could be a very important fix!)
would not do so in your version.

But other things fail too: take the '..' and '...' way of specifying
revisions (because they consider B as "on the bottom branch" with that
extra parent relationship).

- Thomas

--
Thomas Rast
trast@student.ethz.ch
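
The effect on '..' and '...' can be sketched with the same kind of
throwaway repository. The "commit" helper and the branch, tag and file
names are invented for the example; "top" and "bottom" stand for the two
histories discussed in this thread.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q

# commit <name>: add a file called <name>.txt, commit it, tag the commit <name>.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
    git tag "$1"
}

commit base
git checkout -q -b top             # top:    base---B---A
commit B
commit A
git checkout -q -b bottom base     # bottom: base---X
commit X
git cherry-pick A > /dev/null      # plain cherry-pick: base---X---A'

# What is on top but not on bottom?  B and A, as expected.
plain=$(git rev-list --count bottom..top)
plain_sym=$(git rev-list --count bottom...top)

# Same tree as A', but with A recorded as an extra parent.
fake=$(echo "A' with A as parent" | git commit-tree 'bottom^{tree}' -p X -p A)
faked=$(git rev-list --count "$fake"..top)
faked_sym=$(git rev-list --count "$fake"...top)

echo "'..'  lists $plain commits normally, $faked with the extra parent"
echo "'...' lists $plain_sym commits normally, $faked_sym with the extra parent"
```

With the extra parent, B no longer shows up as missing from the bottom
history, although its change was never applied there.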
Yes, and this is not about "misbehave" and "assumption". It is much more
fundamental.

Stephen earlier seems to have mistaken what I taught in an earlier message:

    Remember, when you are making a commit on top of one or more parents, you
    are making this statement:

     * I have considered all histories leading to these parent commits, and
       based on that I decided that the tree I am recording as a child of
       these parents suits the purpose of my branch better than any of them.

    This applies to one-parent case as well.

as a mere convention or something, but it is not. It is what the merge
semantics and merge-base computation (implemented in mathematical terms
over commit DAG) do, expressed in layman's terms.

The decision to merge your history with A' means that (1) you trust what
you have done so far, (2) you trust the judgement of the other person
who built the history that leads to A' as well, and (3) you deem that the
purposes of these two histories are compatible with each other. The last
item is why you are merging with him.

                                v you were here
                    o---o---o---M
                   /           /
           o---...O---B---A   /
          /                \ /
  ---o---o---...o---X-------A'

Because of the "I have considered all things behind this commit, and I
declare this tree suits my purpose better than any of them" statement when
a merge A' was made, and the fact that you trust the judgement of the
person who made that statement, and the fact that you think the purpose of
his history is compatible with yours, we look at O as the merge base when
merging with A' to create M, and the resulting history that leads to M claims
that it has A _and B_, in addition to X and other developments done on the
bottom branch of his, on top of what you used to have.

However, the transition from A to A' involves reverting B among other
things, and the history M includes that revert. That's the consequence of
you trusting ...
Parents mean something different than just a link. If A is a parent of
B, then that implies that at point B, we considered all of the history
leading up to B (including A), and arrived at a certain tree state.

But cherry-picking means we looked at just A and used it to find a
certain tree-state. It says nothing about anything that came _before_ A.

So imagine this history:

  A--B--C <-- master
   \
    D--E--F <-- side branch

Now let's say we want to cherry-pick E. If we mark the cherry-picked
commit as a parent, we get:

  A--B--C--E' <-- master
   \      /
    D----E--F <-- side branch

Now let's say we want to merge the branches. What's our merge base?
Without your proposal, it is A, but now it is actually E. So doing a
three-way merge between E' and F with base E, it will look like our
master branch _removed_ the change from D which is still present in F.
And in a 3-way merge if one side removes something but the other side
leaves it untouched, then the result removes it.

So the merge result is bogus, as it is missing D.

I'm including a quick script below which creates this situation (it may
need tweaking to run on your system, but hopefully you get the point).

-Peff
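-- >8 --
#!/bin/sh -ex

rm -rf repo

# change <word>: append a "changed <word>" line after the line matching
# <word> in the words file, then commit and tag the result as <word>.
change() {
	perl -pi -e '/^'$1'$/ and $_ .= "changed $_"' words &&
	git commit -a -m $1 &&
	git tag $1
}

mkdir repo && cd repo && git init
cp /usr/share/dict/words . && git add words && git commit -m initial
change A
change B
change C
git checkout -b other A
change D
change E
change F
git checkout master

# Apply E's change without committing, then write the commit by hand so
# that E is recorded as a second parent. The first parent is HEAD (C),
# matching the A--B--C--E' picture above.
git cherry-pick -n E
tree=`git write-tree`
commit=`echo cherry pick | git commit-tree $tree -p HEAD -p E`
git update-ref HEAD $commit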
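
A variation that also performs the merge is sketched below. It differs
from the script above in ways that are assumptions made for the example:
it does not depend on /usr/share/dict/words (each commit adds its own
file through a made-up "commit" helper), it runs in a temporary
directory, and it cherry-picks normally before rewriting the result with
E as a second parent.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# commit <name>: add a file called <name>.txt, commit it, tag the commit <name>.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
    git tag "$1"
}

commit A; commit B; commit C               # master: A--B--C
git checkout -q -b other A                 # other:  A--D--E--F
commit D; commit E; commit F
git checkout -q master

# Cherry-pick E normally, then rewrite the result so that E is a second parent.
git cherry-pick E > /dev/null
picked=$(echo "cherry pick" | git commit-tree 'HEAD^{tree}' -p 'HEAD^' -p E)
git reset -q --hard "$picked"

# E is now reachable from both branches, so it becomes the merge base.
if [ "$(git merge-base master other)" = "$(git rev-parse 'E^{commit}')" ]; then
    base=E
else
    base=other
fi

git merge -q -m "merge other" other > /dev/null
if [ -e D.txt ]; then d_file=present; else d_file=missing; fi
echo "merge base: $base; D.txt after the merge: $d_file"
```

The merge completes without any conflict, and D's file is silently absent
from the result.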
--
That implication is not a technical one, but merely a convention in the
mind of the git-user. Relevant, of course, but maybe we can accommodate

True. However...
What if the merge-base determination code is modified to behave as
if --first-parent is specified while searching for the merge-base?
In that case it *will* find A as the merge-base, even in the presence of
"sideportlinks".

Does that resolve all technical issues?
--
Sincerely,
Stephen R. van den Berg.
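
A first-parent-only merge-base search is not something git offers; the
loop below is a hypothetical emulation, written only to show what the
idea above would compute on the A--B--C--E' topology. The throwaway
repository, the "commit" helper and all names are assumptions made for
the example.

```shell
set -e
# Throwaway identity and repository; every name here is made up for the example.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# commit <name>: add a file called <name>.txt, commit it, tag the commit <name>.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
    git tag "$1"
}

commit A; commit B; commit C               # master: A--B--C
git checkout -q -b other A                 # other:  A--D--E--F
commit D; commit E; commit F
git checkout -q master
git cherry-pick E > /dev/null
picked=$(echo "cherry pick" | git commit-tree 'HEAD^{tree}' -p 'HEAD^' -p E)
git reset -q --hard "$picked"              # E' now has parents C and E

# Ordinary merge base: E, reached through the extra parent.
regular=other
if [ "$(git merge-base master other)" = "$(git rev-parse 'E^{commit}')" ]; then
    regular=E
fi

# Hypothetical first-parent variant: walk only first parents on both sides
# and take the newest commit of "other" that is also on master's chain.
fp_base=
for c in $(git rev-list --first-parent other); do
    if git rev-list --first-parent master | grep -q -x "$c"; then
        fp_base=$c
        break
    fi
done
first_parent=other
if [ "$fp_base" = "$(git rev-parse 'A^{commit}')" ]; then first_parent=A; fi

echo "regular merge base: $regular; first-parent-only base: $first_parent"
```

The emulation finds A here, but it looks only at the shape of the graph,
so it would skip E just the same if E' were a genuine merge.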
--
I'm not sure I agree. I believe that property is part of the definition
of the commit DAG as originally conceived (but somebody like Linus could
say more). Obviously there is no formal definition, but I already
pointed out one thing that will break in that instance. I don't know if

But then it will fail to find legitimate merge bases. So yes, you _can_
come up with a merge algorithm that handles this situation. But is it
then up to the user to say "Oh, this parent link means something else.
Use this other algorithm"? In that case, it really seems we are abusing
the "parent" link and it would be more appropriate to have some _other_
type of link.

Though I think if you look through the archives, people have argued
against having any git-level link to cherry-picked commits. The history
leading up to that cherry-pick is not necessarily of interest (though I
think you are proposing that it be optional to create such a link via

I really don't know. I think you are proposing changing a core
assumption of the data structure, so I wouldn't be too surprised if
there is other code that relies on it.You can use the script I posted in my last email as a basis for a
cherry-pick that does what you want (cherry-pick -n, write-tree,
commit-tree, update-ref). You might try a few experiments with that.

-Peff
--
Yes, of course. But even then, it's merely a formal definition, the
thing I'm after now is if there is any code that actually relies on that
formal definition. That would be the code to review and perhaps adapt
in order to make it support the sideport-parents without hurting the old

That, of course, is unacceptable. It either is seamless and supports
both uses transparently, or it has to be done (if at all) using a separate

I will, thanks.
--
Sincerely,
Stephen R. van den Berg.
--
How about the example I gave already? The first merge-base is E, but
that is not correct for the merge I gave. So you propose an algorithm
which will find A. But now imagine the exact same topology, but there
was no cherry-pick; instead, E' is actually a merge. Wouldn't E be the
right merge-base then?

And yes, of course the _content_ of the trees in E' will be different in
those two cases. But the shape of the history graph will be the same,
and that is the only thing that goes into finding a merge base.

-Peff
--
Indeed. Q.E.D.
I'll drop the idea with the parentlinks.
--
Sincerely,
Stephen R. van den Berg.
--
