My memory is playing tricks on me. I seem to remember running linux in the 1980's, but the earliest kernel I can find on kernel.org is dated 1994. Maybe I'm remembering xenix...dunno. Anyway, I've been tracking Linus's kernel for many years -- long before bitkeeper or git. I know just enough to compile and run a kernel, but not enough to be a software developer. And that is where my question comes from: Could someone explain to me the shortcomings of CVS which prompted the development of bk (and then git) -- in a way that a non-developer like me can understand? Pretend that you are Albert Einstein, trying to explain your theories to a ten-year-old -- this is always a useful exercise for those of you who are drowning in micro-details! I've already done some googling on this subject, but everything I've found is drenched in those micro-details which make the subject opaque to me. Thanks for any pointers! -
-91 was the first version. It was usable (depending on your definition of
It's really not very easy to explain why CVS sucks. After all, sometimes
people who have used it for decades have a hard time understanding the
suckiness.
I've used CVS for "real work" at Transmeta, and hey, it worked well
enough. When you have groups of just a couple of tens of people max, and
very strict rules on how to do things, and you trust everybody, CVS works
fine. It starts to really show its problems whenever you need to work
remotely, but there are things you can do to make the pain less.
A lot of CVS people will tell you that it sucks because it can't do
renames, and because certain operations take forever (tagging etc). That's
only superficially true, and it is really a suckiness that comes from some
implementation issues.
SVN fixes (supposedly) those "implementation suckiness" issues. It does so
largely by doing a much better database, which allows it to do certain
things much more efficiently. Personally that part scares me, since I
think it's also a much more fragile setup and there's apparently been
people who lost their entire database to corruption (something that is
very hard to do with CVS, since the "database" is so weak), but that's a
different issue.
But the things that SVN fixes are not the things that really matter in the
end. SVN is a better CVS, but it still has all the basic fundamental
problems. Namely the fact that it's centralized.
The problem with a centralized model is that there's one point of contact:
you can replicate the central database endlessly, but you can only really
modify it in one place. Which means that anybody who wants to modify
anything at all needs to have write access to that one repository.
Now, you can limit write access in various ways ("user xyz can only write
to these files"), but it still requires an a-priori trust network rather
than a dynamic one. So every single CVS project (and SVN does zero in this
regard)...<explanation snipped> But you explained very well, thank you! And thanks to the others who responded -- all very helpful. I'm off to read the link that Martin supplied -- looks gossipy enough to keep me awake :o) -
Hi,

How about adding the whole explanation as git/Documentation/howto/tell-why-cvs-sucks.txt (maybe with some more polite name)?

Also, I'd like to add that CVS branching/merging is no good:

<tryingtoputonalbertsshoes>

Sometimes a developer gets an idea, or the need, to implement a certain feature to a piece of free software. Now, this idea might seem good, but it might take a while to

- implement it,
- flesh the bugs out, and
- maybe realize the idea was not all that good.

All the while, the project is prospering, and you have to keep up-to-date. With CVS, you would do "cvs update" every once in a while, and clean up the merge conflicts. In effect, you would track the history of the upstream project.

Often, however, you would like to track *your* changes, too. This is not possible in CVS. You just can't track two different histories in the same working directory.

Now, if you are working on two or more different ideas, which you want to test separately *and* together, you need to merge your local branches every once in a while. If it weren't for "every once in a while", but "once", you still could do it in CVS. If you want to merge several times (keeping the separate development branches), you can't.

</tryingtoputonalbertsshoesfailingmiserably>

Ciao, Dscho
Hey, if somebody else does it, that's fine. I'm personally _so_ biased against CVS that I'm not neutral. I really hate the thing. I'd much rather use tar-balls and patches than CVS: I think "quilt" ends up being much nicer in many ways than CVS can be. So feel free to take my explanations and write something up. I just don't want to do it, because I fear I might be unfair to CVS (well.. I'm personally 120% convinced I'm not, but still, there's a lot of people who actually _use_ it, so..). > Also, I
Actually, Linus, this provokes a question I've always wanted the
answer to. I'm well aware of the centralized/distributed stuff you are
discussing, but there is policy regarding the distributed merges I've
never been quite clear on.
When one does a feature branch, one creates a "throw-away"
repository. They work on the feature, and when they are done, they
pull/push back to the main repository. This pattern is pretty much
identical in both centralized and distributed environments, even if the
nuts-and-bolts are different.
In the CVS/Subversion world, this merge becomes a single commit
on the "main" line of development ("trunk", or whatever you call it).
The merge has no concept of the steps taken to create the change, just
the actual patch. This has the disadvantage that you have to work hard
in the branch namespace to find the actual steps taken (the working
repository for the feature), but the advantage that a quick look does
not have to wade through fits and starts as the feature takes shape.
In the distributed world, a pull of the "feature" repository
pulls in all changes - the full history of the work. This includes
aborted tries, rewritten pieces, bug fixes, etc. Here, the main
repository has the detritus of the development process, but that also
because that history will contain all your something stupids, plus your
fixes for them.
But that's not how the kernel and git appear to work. Many
developers have popularized dropping that context. They take their
working repository, diff it against your mainline repository, and then
create a new repository that is merely your mainline plus one commit,
the patch of their changes.
This violently breaks the model of "work in a new repository,
then have it pulled into the 'main' repository." It has no real support
in the git/cogito command space (that I know of). It does, however,
leave all the intermediate commits out of your tree, with only a feature
commit remaining.
Where do you stand on this? Would you rather s...Note that I'm a big proponent of people cleaning up their private work-in-progress trees before merging. In fact, I'll refuse to merge with too dirty a repository. It's ok to have some fixes for mistakes, but if you have a lot of ugly stuff, use git to first track the development, and then start a new branch that has the No, exactly because you do _not_ have to publicly humiliate yourself with showing what a nincompoop you are. People should try things out, but they should clean up their worst mistakes too. Git allows both. Linus -
Do you think anybody is that perfect? What happens in reality is something like this:

 - you have a master tree, and your own throwaway topic branches.
 - you play in your own topic branches. You make stupid mistakes and redo your changes many times.
 - when the tips of your topic branches are in good shape, you review the changes from the master tree as a whole, without the history.
 - you decompose the diff between the tips of your topic branch and the master branch into logical steps.
 - you branch off "sanitized" branches from the tip of the master, and apply the decomposed diffs, making one commit per logical change, until all your decomposed diffs are applied.
 - after making sure that the tips of the sanitized branches match the tips of their corresponding topic branch you did your work on, you throw away your true history and pretend you are a perfect human. Either ask your peer to pull from your sanitized branch tips, or run git-format-patch between the master and the sanitized branch tips and send them out as patches.

I do not know about the kernel tree but I would be surprised if any self-respecting developer wouldn't be doing this. The review-decomposition-reapplication cycle is *very* important for both keeping the public history clean and reviewable, and preserving your public image ;-). -
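The cycle described above can be sketched with present-day git commands. This is only an illustration under stated assumptions, not the exact procedure anyone in this thread used: the scratch repository, the file names, the branch names `topic` and `sanitized`, and the `outgoing` directory are all made up, and the diff is "decomposed" per file purely to keep the example short.

```shell
#!/bin/sh
# Sketch: rebuild a messy topic branch as a clean, reviewable series.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Hacker"          # hypothetical identity
git config user.email "hacker@example.invalid"

echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m "initial import"
git branch -M master                           # make sure the branch is called master

# The "true" history: a throwaway topic branch with mistakes in it.
git checkout -q -b topic
echo "feature A, first try" > a.txt
git commit -q -a -m "wip"
echo "feature A" > a.txt
echo "cleanup B" > b.txt
git commit -q -a -m "oops, redo A; also touch B"

# Branch "sanitized" off the tip of master and apply the decomposed diff,
# one commit per logical change (here: one per file).
git checkout -q -b sanitized master
git diff master topic -- a.txt | git apply --index
git commit -q -m "Implement feature A"
git diff master topic -- b.txt | git apply --index
git commit -q -m "Clean up B"

# Make sure the sanitized tip has exactly the same tree as the topic tip.
topic_tree=$(git rev-parse 'topic^{tree}')
clean_tree=$(git rev-parse 'sanitized^{tree}')
[ "$topic_tree" = "$clean_tree" ]

# Export the clean series as patches, one file per commit.
count=$(git format-patch -o outgoing master..sanitized | wc -l | tr -d ' ')
echo "exported $count patches"
```

Comparing the two tree ids is the "make sure the tips match" step: if they differ, some hunk was lost or mangled while decomposing.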
I was being slightly facetious. Of course everyone makes mistakes and corrects them. But if you _want_ the history, you have to take it. Otherwise, you are required to throw away the history completely. And that -- do you want the whole history or none of it -- I could care less about preserving my public image. I'm an idiot, I screw up all the time. I only care that the tip of my tree is respectable. I've seen arguments from folks on both sides -- the intermediate history is important, warts and all, vs throw it all out for a clean public history. It seems that you fall into the second camp. That's fine, but can we make that work model a first-class citizen? Can we get a script that pulls one branch as a single, un-historied (sic) commit into the current branch? If this is The Way, I shouldn't have to be mucking about with many steps of diff/patch (at least unless my change is large enough to require split patches). Joel -- "It is not the function of our government to keep the citizen from falling into error; it is the function of the citizen to keep the government from falling into error." - Robert H. Jackson Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 -
I think you read me wrong. Didn't I say "decompose and make them into logical stepS", emphasis on plural "S"? Single big consolidated patch is not what I am advocating for. It is impossible to review and evaluate. To be merged into a public tree, such unhistoried commit is often unacceptable. -
No, you're reading me wrong, but I wasn't clear enough either. At the end of my message, I'm noting that I'm considering smaller changes here, not huge features. Basically, I'm not talking about merging with Linus. I'm talking about merging with myself. Let's assume we're all going with the clean-up-your-history model. It is quite clear that you and Linus agree on that model, and I wasn't so much arguing against it as querying everyone's opinion on it. So, I have a git repository that is my For-Linus repository. It's got a clean history. What's my workflow? 1) Clone the repo to a Work tree. 2) Create and test fix X, with perhaps some >1 number of commits. 3) Bring that fix back to the For-Linus repository. This is a small change. It's not something that needs stepS, as you put them. But my history in the Work tree is "dirty," so I cannot just pull from Work to For-Linus. As the tools currently stand, I need to hand-diff and patch my commits. Neither git nor cogito have a command to do this first-class "the way you should do it" common operation. It is, in my experience, a pain. Not as large a pain as some things, but certainly second class to much of the workflow git/cogito provide. If it is supposed to be a regular part of my workflow, what's wrong with making it a first-class operation? Obviously, large features should and do have logical steps. I'm never going to be against that. Joel -- Life's Little Instruction Book #15 "Own a great stereo system." Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 -
I actually have a set of scripts that I use for this, which I've been too
lame to clean up properly and send in. The basic idea is:
(1) "git branch clean mainline"
(2) "git checkout clean"
(now I'm looking at the clean history, which doesn't have anything
yet)
(3) "git refine dirty"
(this says I'm trying to match the content of the head with the
dirty history)
(4) editor window pops up with the diff between the working tree and
dirty
(5) edit the patch, removing hunks which go later in the series, or which
I don't want to do at all and forgot to revert.
(6) it applies the patch; if there are rejects, it goes back to (4)
(7) normal thing for committing happens
(8) if there is any difference between the working tree and dirty, it
goes back to (4) for the next in the series
I still need to correct the flow control and make it invoke the editor
automatically and such, and provide some way out of the middle if you want
to give up or stop without reaching the end, and I have to detect the done
condition. But the general method does work, provided you're at least
somewhat comfortable editing patches (with the safety net that nobody else
will ever see the patch, so it doesn't matter too much if you screw it
up).
If somebody else wants to clean this up, I can post my version; dunno when
I'll get around to making it really right.
-Daniel
*This .sig left intentionally blank*
For an example of how to make it a first-class operation, it might be worthwhile to look at Chris Mason's "Mercurial Queues" extension to Mercurial: http://www.selenic.com/mercurial/wiki/index.cgi/MqExtension I've used it once or twice, and hg mq is definitely very nice and convenient, and it makes commits a first-class operation. On the other hand, I've found that the combination of quilt and Mercurial/BK/git works just fine, even for my own internal development of (for example) the e2fsprogs tree. - Ted -
Dear diary, on Tue, Nov 01, 2005 at 01:25:54AM CET, I got a letter Did anyone do any current detailed comparison between hg mq and StGIT? I'm very happy with StGIT, modulo few UI gripes I'm still not getting around to fix, and the fact that I cannot version my changes to patches - this is one advantage of having quilt stuff tracked by GIT, I think, but that feels ugly. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
I don't think so, but I'll give it a rough try. I have not used stgit extensively, so please correct any mistakes below. Most of the differences center around the ways we store patches. Both tools make patches into commits during push. This allows the various history commands to see the currently applied patches. Both tools allow you to make changes to files without running some form of quilt add first. StGIT has the ability to rebase patches via three-way merge. This is still on my todo list for mq. StGIT patch storage is very different from quilt and mq. StGIT keeps git commit/tree objects around for patches that have been applied. It then stores a directory with metadata about the patch (author/description etc) and the ids of the git commit objects. In StGIT, importing new patches seems to require stg import, and exporting patches requires stg export (or a similar git command). But once the patches are stored in stgit, push/pop will be very fast. mq is closer to quilt. The patches are stored as patches, and hg qpush is very similar to importing a patch. This means metadata must be stored at the top of the patch in some form the import code can understand (it tries to be smart about this). hg qrefresh will update the patch file, so the patch is always up to date wrt to the hg repo. You can import/export patches with hg commands, or by copying patches into/from the .hg/patches directory. This also means you can take a quilt patch dir, copy it into .hg/patches and just start using mq. mq has some support for putting the patches directory under revision control (as a separate repository). Most of the other differences come from differences between hg and git. I'm not sure if stgit has some form of annotate, but it's a nice way to find out which patch changed a given loc in hg/mq. -chris -
The problem with this is allowing people to modify the patch directly (with vi). This would make it difficult to do a three-way merge without either losing the direct changes or simply failing to apply a modified patch to its old base (I thought about using patches as an optimisation but after some benchmarking found that "git-diff-tree | git-apply" is fast enough and most of the time when pushing is Chuck, I think, has a patch to automatically export the patch when pushing or refreshing. With the latest StGIT snapshot, the tool reports if the patch was modified during push and can only be exported in this case (the way it detects this is by assuming that if git-apply is successful, the patch is unmodified since no fuzzy applying is accepted; the fall back to three-way merge just reports the patch as There is git-whatchanged which also reports the StGIT patches applied onto the stack. But there is no command similar to 'quilt patches' yet. -- Catalin -
The three way merge is still possible even if someone hand edits the patch. For a three way merge, you just need to know the parent revision of the change you want to merge. parent can mean the revision in the repository that precedes this patch (mq stores this information, just not in the patch), or it can mean any revision where the patch applies cleanly. Both approaches (mq vs stgit) have advantages...you can get roughly the same functionality either way. -chris -
Yes, but what I meant is that someone may modify the patch in a way that it is no longer applicable to its parent or to any other revision in the tree. At this point, a three-way merge is no longer possible (but, well, if someone modifies the patches this way should be able to Yes, you are right. The big difference is the underlying tool (hg or git). -- Catalin -
Btw, I have to say that I was a bit uncertain about doing the rebasing by way of a three-way merge, but when I recently did a revert, I was _really_ happy with how well "git revert" did the rebasing of the revert. It wasn't even a clean merge, but leaving the conflict in the tree and allowing me to fix it up made what would otherwise have been a much more complex manual operation be 99% automated. So I'm _neither_ a StGIT nor mq user, but I can definitely say that rebasing with a three-way merge instead of just trying to apply the patch (whether in reverse like in a merge, or just re-apply it straight) is really really nice. Linus -
StGIT first tries a "git-diff-tree | git-apply" since it is faster but when this fails it falls back to a three-way merge. A 'stg status' command would show the conflicted files and they should be marked as resolved before refreshing the patch. One of the good parts of the three-way merge is that it detects when a patch you sent was fully merged upstream, the local patch becoming empty after the merge. If not, you either get a conflict or the merge leaves the patch with only the unmerged parts. -- Catalin -
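A small file-level sketch of the "patch becomes empty once it has been merged upstream" effect, using `git merge-file` as a stand-in for the tree-level three-way merge. This is not StGIT's implementation; the file names `base`, `patched`, `upstream` and `rebased` are invented for the illustration.

```shell
#!/bin/sh
# Sketch: a three-way merge notices that a patch is already in the new base.
set -e
work=$(mktemp -d)
cd "$work"

# base: the revision the patch was originally written against
printf 'one\ntwo\nthree\nfour\nfive\n' > base
# patched: base plus the local patch (it changes "two")
printf 'one\nTWO\nthree\nfour\nfive\n' > patched
# upstream: the new base, which already took our change and added a line
printf 'one\nTWO\nthree\nfour\nfive\nsix\n' > upstream

# Replay the change (base -> patched) on top of upstream.
# -p prints the merge result instead of overwriting the first file.
git merge-file -p upstream base patched > rebased

# Whatever differs between upstream and the merge result is what remains
# of the patch after the rebase.
if cmp -s upstream rebased; then
    result="patch is empty: fully merged upstream"
else
    result="patch still carries changes"
fi
echo "$result"
```

If upstream had taken only part of the change, `rebased` would differ from `upstream` by exactly the unmerged part, and an incompatible upstream change would make `git merge-file` exit non-zero with conflict markers in its output.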
That's not too far away. Chuck Lever has a patch (and there were some other discussions in the past) for tracking the history of a patch. Basically, there would be another commit object, not reachable from HEAD but only via an StGIT command, which would chain all the versions of a patch. You would be able to view them with gitk for example. My main issue was whether we should store every state resulted from a refresh or use a separate command (somebody suggested 'freeze') to mark the states that should be preserved in the history. Chuck's patch implements the first. The drawback is that a future 'stg prune' command would not be able to remove the history and some states of the patch might not be useful (there are times when I do a refresh only to pop the patch and modify a different one, without any logical meaning for the state of the patch). I'm open to other suggestions as well. Otherwise, Chuck's patch should do the job. -- Catalin -
if there is interest i can post what i have. unfortunately there's some
other stuff in front of it so i don't think it will apply directly to
catalin's stgit without some futzing. in lieu of that, here's a command
synopsis:
[cel@seattle ~]$ stg revisions -h
usage: stg revisions [options] [patch-name]
Display the change history of a patch or revert a patch to a previous
commit. By itself, the command will display all committed changes,
ordered by date, of a patch. Each committed change is listed with a
numeric label. The label can be used with the --patch or --diff options
to examine specific changes in detail. The --revert option can revert
a patch to any previous version.
options:
--commit=commit-label
show the commit details of the specified commit
--diff=commit-label show changes between the specified commit and
the next
--file=<file name> show changes made to a specific file
--patch=commit-label show the state of patch-name at the specified
commit
--revert=commit-label
revert the patch to the specified previous commit
-h, --help show this help message and exit
[cel@seattle ~]$
and some usage examples:
[cel@seattle main]$ stg revisions
Previous revisions of patch "revisions-command":
0: Sat Oct 1 21:54:43 2005 -0400
1: Sat Oct 1 21:58:45 2005 -0400
2: Sat Oct 1 22:13:27 2005 -0400
3: Sat Oct 1 22:55:28 2005 -0400
4: Sat Oct 1 23:02:22 2005 -0400
... snipped ...
86: Mon Oct 31 14:19:25 2005 -0500
87: Mon Oct 31 14:22:00 2005 -0500
88: Mon Oct 31 14:23:40 2005 -0500
89: Mon Oct 31 14:24:39 2005 -0500
90: Mon Oct 31 14:27:34 2005 -0500
[cel@seattle main]$
an entry is added to this list automatically after every operation that
does a "refresh".
the idea is to expose and manipulate the change history of a patch
without having to use cumbersome sha1 hash values.
without options, "stg revisions"...On Tue, Nov 01, 2005 at 10:20:29AM -0500, Chuck Lever wrote: I'm probably not familiar enough with stgit, but it looks to me as though you're tracking individual patch history only. In trees I work with, patches rarely stand alone. There are typically collections of patches implementing a given feature, or a change to one patch requires rebasing a number of (perhaps unrelated) others. I think the command set you describe above will lose that grouping. -chris -
That's true, but you can use a 'git tag' command to mark the whole stack as something useful and this would include the state of all the patches on the stack. This would be a whole stack history, not individual patch history. Maybe we should implement this as well (or maybe only this). Anyway, I wasn't sure that's the right implementation and that's why I didn't include Chuck's patch yet. -- Catalin -
I would suggest just putting the .git/patches directory under revision
control. If you make it a head in git and then add helper functions so
that common operations are easy to do, you won't be reimplementing the
whole SCM wheel just for patches.
For example:
stg commit-patch-tree:
does git-write-tree and git-commit-tree on .git/patches
stg checkout-patches sha1:
updates .git/patches to a given patch commit
stg diff-patches [-p] [-f]:
by default this does the same as git-diff-tree
-p, read the patch commit objects and diff the patch files
-f, read the patch commit objects and diff the source files
The command names could be better, but the idea is to make commits that
change the state of your patch tree. Later on, you'll be able to find the
one commit where you added 6 patches, or the one commit where you
adapted the whole tree to some new feature.
More importantly, you can reuse gitk and all of the other history
functionality in the SCM.
-chris
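A rough sketch of what the first helper could look like, built from the same plumbing the message names. Everything here is an assumption made for illustration: the function name `commit_patch_tree`, the ref `refs/heads/patch-history`, the separate index file, and a plain `patches/` directory standing in for the real patch storage (whose actual layout is not reproduced here).

```shell
#!/bin/sh
# Sketch: snapshot a patch directory as commits on its own head.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Hacker"          # hypothetical identity
git config user.email "hacker@example.invalid"

# Stand-in patch directory with a series file and one patch.
mkdir patches
echo "fix-one.diff" > patches/series
echo "placeholder for a real diff" > patches/fix-one.diff

commit_patch_tree() {
    # $1 = log message for this snapshot of the patch directory
    tree=$(
        cd "$repo/patches" &&
        export GIT_DIR="$repo/.git" \
               GIT_WORK_TREE="$repo/patches" \
               GIT_INDEX_FILE="$repo/.git/patches-index" &&
        git add -A . &&
        git write-tree
    )
    # Chain onto the previous snapshot if there is one.
    parent=$(git rev-parse -q --verify refs/heads/patch-history || true)
    if [ -n "$parent" ]; then
        commit=$(echo "$1" | git commit-tree "$tree" -p "$parent")
    else
        commit=$(echo "$1" | git commit-tree "$tree")
    fi
    git update-ref refs/heads/patch-history "$commit"
}

commit_patch_tree "start the series with fix-one"

echo "fix-two.diff" >> patches/series
echo "another placeholder diff" > patches/fix-two.diff
commit_patch_tree "add fix-two"

# The patch directory now has ordinary history that the usual tools can read.
git log --format='%h %s' refs/heads/patch-history
git diff-tree -r --name-status 'refs/heads/patch-history~1' refs/heads/patch-history
```

The separate index file keeps the patch snapshots from disturbing the index of the source tree itself; whether that is the right trade-off for a real tool is left open here.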
Putting them under a separate revisions repository, i.e. having a .git/patches/.git directory? Otherwise there would be some problems Doing it the above way wouldn't be of much help with gitk. You would get files like .git/patches/master/patchX/top etc. under revision control which only contain some hash strings, not meaningful. With GIT you have the advantage of being able to specify the DAG structure. It is pretty simple to just link the commit objects corresponding to a patch into a DAG and using gitk would allow you to navigate through the history and also look at the diff itself. -- Catalin -
I think we're talking past each other a little, partially because I'm not sure exactly what features you want from revision control on the patches. But, my suggestion is to remember that once you add some sort of revision control, people are going to want all of the features they are used to with git/hg/their favorite SCM. You'll probably get better results if you patch git to your needs than if you try to reimplement things all over again. -chris -
Sorry for the delay in replying. That's unclear for me too :-). I would like to have a way of checking the changes to individual patches, just to be able to go back if some changes broke it. It's also useful to have some kind of revision control for the whole stack, but this can be achieved with tags at the moment. What I usually do is export the series when I'm happy with it and keep that directory safe. I could add revision control for the directory containing the exported series but this would be somehow That's true. I think that people who want a full revision control of the patches should rather use separate branches instead of stacked patches. It's indeed more convenient to be able to add or remove features with push/pop but providing yet another SCM layer on top of these would make the tool hard to understand (and maybe make Quilt fans run away from it). The current StGIT features are enough for my needs but I'll accept/implement new features based on others' requirements. BTW, the latest StGIT snapshot has support for a 'patches' command which shows the patches modifying a file or set of files (that's because I needed this feature recently). -- Catalin -
Dear diary, on Sat, Nov 05, 2005 at 09:23:33PM CET, I got a letter Yes, that would be nice to have as well (although in general I lean to recording the whole stack now). Model scenario: A night city, the snow slowly falling. Approaching the roofs covered in white and illuminated by the yellow street lighting, dark windows - but one dimly glowing, a computer screen inside. Close-up on a hacker: $EDITOR opened, lost deep in hack mode, fingers dancing over the keyboard. Dreamy-monumental music in the background. StGIT user, only part of the patches in stack, and the rest depends on the one currently edited, and I want to record my work on this one. I can either: (i) Just keep per-patch history only. (ii) Keep _both_ per-patch and per-stack history (since I don't want to record the stack when I have to keep some patches out of it - the history would look like randomly removing and adding tons of patches, and jumping around would be difficult because of this too). (iii) Keep per-patchlist history - do not actually record only our current stack, but all the patches StGIT knows about. The patches depending on the one currently being changed will not be in consistent state, but that's tough. Actually, this seems to be the most viable strategy. One question is whether to record if some patch is actually applied right now or not (I'd say don't record it since you again have the "bouncing problem" otherwise). Yes, now let's sequence the tags... ;-) -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
It happens to me to keep some patches popped which aren't really part of the stack (i.e. splitting a big patch, I still keep it in the unapplied patches to push it later and check what was left after splitting). From this point of view, (ii) would be better but with the drawback that you need to have a valid stack with all the patches (iii) is the most comprehensive method and, as Pavel said, we should record what patches were applied or not and reproduce them exactly when retrieving a different state. Another big problem is the base of the stack, which can change. Would retrieving an old state of the stack also restore the old base? I think it should and it's up to the user to rebase it. A simple way to partially achieve (iii) is to extend the existing 'branch' command to clone the whole series into a new one, including all the patches. The problem with this approach is that there is no temporal relation between branches. How would you expect to switch between different states of the stack? As Chris Mason mentioned, once you start doing this people might ask for full SCM features (like diffs between revisions) where the objects are stack states. This would complicate StGIT quite a lot. -- Catalin -
Are you sure you are git hacker? Maybe you should have been fiction I do not know if ii or iii is better, but please *do* record what patches were applied at what moment. That is useful info. "I'd like to go back to a known working configuration". If I do not know what patches were applied at what moment, going back to working config is hard to do. Pavel -- Thanks, Sharp! -
Dear diary, on Tue, Nov 01, 2005 at 10:23:55AM CET, I got a letter Perhaps you could emulate the topical branches - one patch == one head. E.g. for patch foo-bar on branch 'master', you would create head I'd prefer the snapshotting being done in refresh anyway. Perhaps you would be asked for log message when you refresh by default, but when you refresh -n or something, only a temporary commit would be created and next refresh would mutate it instead of creating another commit. Anyway, "freeze" is confusing. Perhaps "snapshot" if anything... -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
The patches need to be chained so a top patch would also refer to the previously applied patches since they are its base. Anyway, I don't The refresh -n should be the default and maybe just specifying an option when you want to add a comment to that commit. But, by mutating the temporary commit, wouldn't this mean that you lose the refresh You are right. -- Catalin -
OK. Consolidating two or more patches into one is something I
have done too, but I never felt need for tool support, so that's
probably why I misunderstood you.
After extracting a sequence of the dirty commits using
git-format-patch, I would say:
for i in 0*.txt; do git-apply --index "$i"; done
to bring my tree up to date, and then just say "git commit".
Typically when I do this, I have one "significant" commit among
them, usually early in the series, which is followed by smaller
"fix this, fix that, oops fix that too" commits. So I edit the
log message using the log of the significant commit, and add
some missing bits.
I guess another way to do it without even first extracting them
as patches would be:
$ git checkout -b mytopic master
$ work work work, commit commit commit.
$ git checkout master
$ git-read-tree -m -u master mytopic
$ git-commit -c <that-significant-commit-in-mytopic-branch>
If you want a tool support for this workflow, probably the last
two could be somewhat automated. But what would the user input
for that be? You need to tell what tree shape you want the
after-commit tree to be in, and where you would want the bulk of
your commit message to come from.
One possibility.
$ git-squash-pick mytopic
would be something like this:
#!/bin/sh
#
git-read-tree -m -u HEAD "$1"
git log HEAD.."$1" >.tmp-commit
git-commit -F .tmp-commit -e
Your commit-log edit buffer would start with the concatenation
of all the commit logs in that throwaway history, and hopefully
you would mostly need to delete lines and move some parts around
before committing. I do not know how useful this kind of
specialized tool would be, though...
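For readers on a current git: later releases grew `git merge --squash`, which covers much the same ground as the git-squash-pick sketch above. The example below is a minimal illustration in a scratch repository; the file name, branch name and commit messages are invented.

```shell
#!/bin/sh
# Sketch: fold a throwaway topic branch into master as a single commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Hacker"          # hypothetical identity
git config user.email "hacker@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "initial import"
git branch -M master                           # make sure the branch is called master

# Throwaway history with checkpoint commits.
git checkout -q -b mytopic
echo "half done" >> file.txt
git commit -q -a -m "checkpoint: 1/2 the work"
echo "done" >> file.txt
git commit -q -a -m "fix; the rest"

# Stage the combined result of mytopic on master, without its history
# and without recording a merge.
git checkout -q master
git merge -q --squash mytopic
git commit -q -m "Add the feature as one commit"

commits_on_master=$(git rev-list --count master)
echo "master has $commits_on_master commits"
```

Leaving out `-m` on the final commit opens the editor on a message assembled from the squashed commits' logs, much like the edit buffer described above.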
Splitting and merging patches into logical steps is something I
dream of to be automated, but I do not know how (nor even if it
is possible) offhand. Sometimes, when you want truly logical
steps, you would end up needing intermediate...

When I work, regardless of SCM, I generally have many checkpoints along the way. It might be a particular subfeature is complete (and probably deserves a split-out patch of its own when I do the "clean" merge), but it could also just be "I changed a lot today, and I'd really like to save that off." So, while sometimes it looks just like your "significant commit + fixes" model, it might also be "1/4 the work", "compile fix on other platform", "1/2 the work", "fix", "the rest Replace <that-...> with <overall-concept-of-the-change> and you have the workflow I'm talking about.

You know, this is a simpler command set than I am using. I've been using Cogito, because it makes many of the 5-step git operations a single step, more like some other tools. But I know no way to tell Cogito to merge all the changes of the branch into the master without also pulling in the commit history. That's the thing here. Petr, do you have a way of doing this that I don't know about? What I mean is, for the "naive" Cogito workflow:

    cg-clone repo working
    cd working
    hack hack hack, commit commit commit
    cd mainline
    cg-pull working

the cg-pull command merges the changes back, but it also includes the full commit history. Not what we want. Compare the "identical" workflow:

    cg-clone repo working
    cd working
    hack hack hack, commit commit commit
    cg-diff mainline working > patch
    cd mainline
    cg-apply < patch
    cg-commit

My basic premise is that I shouldn't have to deal with diff/patch as an external step, especially since git knows more about the tree than diff/patch do. It's a useless hoop to jump through. Maybe Cogito contains something like what you describe above, a way to get all the file changes without actually pulling in the commit history. I don't care that the read-tree and the commit are separate Yes, I always do. But I'm not talking about that sort of large feature add or whatever. I'm talking about merely doing something on a small scale, ...
I'm really surprised that Catalin hasn't chimed in. If you are into rewriting/merging/splitting your patches, StGIT is your friend. Check out: http://www.procode.org/stgit/ cheers, martin -
StGIT mainly resembles the Quilt workflow, but there are no patches, only commit objects which are indefinitely replaceable (push/pop/refresh). What I usually do is create smaller commits for different features and just stack them together. That's usually for features which are dependent on each other and which you can control with a finer grain than having separate branches. One can push/pop patches (commits) to bring the patch to be modified to the top. After modification, a refresh command saves it as a commit. All the patches (commits) in the stack are accessible via HEAD and are seen as GIT commits.

It may happen that you just have a bigger patch which needs splitting. What I usually do in this case is import the patch as an StGIT patch (i.e. GIT commit object), pop it from the stack so that it is no longer applied, split the physical patch (diff file) into smaller, logical changes and apply them one by one with StGIT. When you think all of the big patch has been applied, pushing it should result in an empty patch; otherwise you might have missed something that needs applying.

With StGIT you can also pick a commit object from a different branch as a StGIT patch, or you could merge two patches into one. Once you are OK with the patches in the stack, just ask the gatekeeper to pull the changes from your tree using plain GIT or mail them automatically with StGIT. -- Catalin -
But I'm not. I don't want patches in the first place. I want
cg-pull but with a flattened history.
Joel
--
"Any man who is under 30, and is not a liberal, has no heart;
and any man who is over 30, and is not a conservative, has no brains."
- Sir Winston Churchill
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
-Dear diary, on Tue, Nov 01, 2005 at 02:29:15AM CET, I got a letter StGIT does not work with patches but with commits. You can manage the logical changes with StGIT and when it's time, just merge your StGIT-tracked branch with whatever else. "Patch" here is really just a different name for logical change / commit. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
Well, that you've largely got in "git rebase". But if you want to merge commits, you'd have to do that logic yourself.. Linus -
I definitely care about more than just the tip. A broken history is a _problem_. Automatic tools like "git bisect" can't help you if you have lots of commits in between that are fundamentally broken. And even ignoring that, it just makes it harder for everybody to understand what the code does when "git-whatchanged" shows total crap that was undone.

History is important. It's important enough that you should keep it meaningful. And "meaningful" does not mean "show all your mistakes". Some people will say that the mistakes are as important as the fixes. I call bull on that. Mistakes are mistakes. Dead ends aren't useful, even as historical examples.

At the same time, I'm not a rabid "history must be perfect" freak. Mistakes happen. Just fix them and move on.

When you have guests over, I sure hope that you don't walk around in your bathrobe, with pieces of your anatomy sticking out that shouldn't stick out. Sure, it may be the "real you", but there's a difference between being honest, and just being disgusting. The same is true of SCM history. There's "honesty", and there's "disgusting mess". At least when it comes to the kernel, I want the "honest" kind of history, not the "disgusting" kind. Linus -
You can do a diff that spans all the commits and apply it with a new commit msg. With cogito:

    cg-diff -r from:to | patch -p1

With git you can also do it directly within the repo/index with

    git-read-tree -m from HEAD to

In practice, a new developer will often roll up commits to avoid sending a string of shameful patches and corrections on top -- I often do that ;-) . Developers with more "mana" will have published repos where Junio pulls directly from -- and they get merged with full history. Of course -- they don't have brown-paper-bag commits like I do... Sounds like a reasonable, organic/dynamic way of doing it. cheers, martin -
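For readers following along with present-day git rather than Cogito, the "diff that spans all the commits" idea can be sketched roughly as below. This is only an illustration under assumptions: the branch names (mainline, scratch), the file name and the demo identity are made up, and it uses "git diff" piped into "git apply --index" instead of the cg-diff and git-read-tree commands quoted above.

```shell
#!/bin/sh
# Sketch: roll several messy commits up into one clean commit.
# Branch names, file names and the identity below are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # name the initial branch

echo base > file.txt
git add file.txt
git commit -q -m "base"

# Messy work happens on a scratch branch: three small checkpoint commits.
git checkout -q -b scratch
for step in one two three; do
    echo "$step" >> file.txt
    git commit -q -a -m "wip: $step"
done

# Back on mainline, apply the combined diff and record it as ONE commit.
git checkout -q mainline
git diff mainline scratch | git apply --index
git commit -q -m "Add steps one through three"

# mainline now has base + one rolled-up commit, and the same tree as scratch.
mainline_commits=$(git rev-list --count mainline)
tree_delta=$(git diff mainline scratch | wc -l | tr -d ' ')
echo "commits on mainline: $mainline_commits"
echo "lines of diff against scratch: $tree_delta"
```

The scratch branch keeps its checkpoint history locally; only the rolled-up commit exists on mainline.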
Martin Langhoff <martin.langhoff@gmail.com> wrote: I bet they have a scratchpad on their laptop (full of brown-paper-bag commits and backtracking) from which they push into a cleaned up repository for public consumption. -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 654431 Universidad Tecnica Federico Santa Maria +56 32 654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513 -
I just do a lot of branches. Incidentally though, is there any way to make a commit completely go away without resetting the files? It'd be a nice feature even if it can only do it from the top down, unlike git-revert which can do it anywhere in the middle as well. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 -
I am not sure if this is what you are looking for, but after I
make a commit and find a mistake (either in the checked-in files
or commit log message), I do this:
$ git reset --soft HEAD^
... fix the checked-in files, maybe do git-add files that
... I forgot to add when I made the last commit.
$ git commit -a -c ORIG_HEAD ;# edit commit log, too.
Soft reset leaves the working tree files intact and just rewinds
the .git/HEAD to whatever commit you specify, and as a side
effect stores the original .git/HEAD in .git/ORIG_HEAD. The
lowercase -c flag to git-commit means "bring up the commit log editor,
initialized with the commit log message from that commit".
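The soft-reset recipe above can be tried end to end in a throwaway repository. The following is a minimal sketch, assuming a current git; the file names and demo identity are invented, and it uses the uppercase -C flag (reuse the message without opening an editor) so that it runs unattended, where the recipe above uses lowercase -c to edit the log as well.

```shell
#!/bin/sh
# Sketch: fix up the most recent commit with a soft reset.
# File names and the identity below are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q

echo base > README
git add README
git commit -q -m "base"

# The mistake: Makefile was meant to be part of this commit but was not added.
echo "int main(void) { return 0; }" > main.c
echo "all: main" > Makefile
git add main.c
git commit -q -m "Add main program"

# Rewind HEAD by one commit. The working tree and index are left alone,
# and the old HEAD is remembered in ORIG_HEAD.
git reset --soft HEAD^

# Add what was forgotten, then recommit reusing the old log message.
git add Makefile
git commit -q -C ORIG_HEAD

commit_count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "commits: $commit_count"
echo "subject: $subject"
git ls-tree --name-only HEAD
```

The history still shows two commits, but the second one now contains the forgotten file.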
->>>>> "Martin" == Martin Langhoff <martin.langhoff@gmail.com> writes: Martin> You can do a diff that spans all the commits and apply it with a new Martin> commit msg. With cogito: Martin> cg-diff -r from:to | patch -p1 What's the easiest way then to toss all that intermediate history? I'm thinking of the rcs "-o" switch that "outdates" any deltas in that range. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! -
Start a new branch before the sequence you want to clean up. Then, move the cleaned-up history to that branch, and eventually you can just delete the old one. Linus -
Big caveat --- do that before you make that dirty tree available to outside, otherwise you would be in hot water ;-) -
>>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes: Linus> Start a new branch before the sequence you want to clean Linus> up. Then, move the cleaned-up history to that branch, and Linus> eventually you can just delete the old one. So if I toss something in git/refs, the objects pointed to by that are eventually reclaimed? Do I need to git-fsck-objects to do that? Or is there some cg command to do the whole thing? -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! -
You can do "git prune". It's pretty expensive, though, and the extra objects don't _hurt_, so there's no reason to do pruning very aggressively. I tend to prune immediately just because I run git-fsck-objects all the time, and if you don't prune, it will nag you about "dangling commit". You may also decide to just rename the old broken branch. Keeping it around for local historical reasons and never push it out. Linus -
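To see the "dangling commit" nag and the effect of pruning, something like the following can be run in a scratch repository. It is a sketch under assumptions: the branch names and identity are made up, "git fsck" stands in for the git-fsck-objects of the time, and the "git reflog expire" step is needed only because current git keeps recently dropped commits reachable through the reflog.

```shell
#!/bin/sh
# Sketch: drop a broken branch, watch fsck report it, then prune it.
# Branch names, file names and the identity below are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # name the initial branch

echo base > file.txt
git add file.txt
git commit -q -m "base"

# A branch holding history we later decide to throw away.
git checkout -q -b broken
echo oops >> file.txt
git commit -q -a -m "broken attempt"
broken_sha=$(git rev-parse HEAD)

# Delete the branch; nothing refers to its commit by name any more.
git checkout -q mainline
git branch -D broken >/dev/null

# Forget reflog entries so the commit is truly unreachable (demo only).
git reflog expire --expire=now --all

# fsck now reports the leftover commit as dangling.
dangling=$(git fsck 2>/dev/null | grep -c "dangling commit" || true)
echo "dangling commits before prune: $dangling"

# prune removes unreachable loose objects from the object store.
git prune
if git cat-file -e "$broken_sha" 2>/dev/null; then
    pruned=no
else
    pruned=yes
fi
echo "broken commit removed: $pruned"
```

Renaming the branch instead of deleting it, as suggested above, simply skips everything after the checkout and keeps the old history around locally.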
I'm well aware of this, my question was rather one of
applicability. First, do we want it to work this way, losing the
history. Second, you'd like the process to be all-encompassing if you go
this route.
((cd old-repo && cg-diff -r from) | patch -p1) && cg-commit
or any equivalent. Why should I have to muck with patch and diff, when
I can have a 'pull-as-one' operation. Sure, it's a wrapper, but if it's
the intended mode of development, let's make it a first-class citizen.
Joel
--
Life's Little Instruction Book #157
"Take time to smell the roses."
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
-Dear diary, on Mon, Oct 31, 2005 at 10:30:03PM CET, I got a letter Personally, from my POV it is the intended mode of development only if you keep strictly topical branches (a single logical change and fixes of it on top of that). Otherwise, this is horrid because it loses the _precious_ history and bundles up different changes into a single commit, which is one of the things that are wrong with CVS/SVN merging. That said, I would be willing to do something like cg-merge -s and cg-update -s (s as squash), with a big warning that this is suitable only for topical branches. And I think it'd still be much better to spend the work making StGIT able to track the history of changes to a particular patch. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
Dear diary, on Tue, Nov 01, 2005 at 10:15:33AM CET, I got a letter FWIW, cg-merge -s and cg-update -s is supported now. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
The -s option of git merge is about choosing a strategy. How can I choose the "recursive" strategy with cg-merge? Some consistency would be good here. Josef -
Dear diary, on Tue, Nov 08, 2005 at 11:50:10AM CET, I got a letter Good point. You can't now, but you should be able to in the future. I renamed this from -s to --squash. Thanks, -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
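For comparison, present-day git has a --squash option on "git merge" that behaves much like the squash mode discussed here: it stages the combined change of a branch without recording the branch as a parent. The sketch below is an illustration only; it is plain git rather than Cogito's cg-merge, and the branch names, file names and identity are assumptions.

```shell
#!/bin/sh
# Sketch: bring a topical branch into mainline as a single commit.
# Branch names, file names and the identity below are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # name the initial branch

echo base > file.txt
git add file.txt
git commit -q -m "base"

# A strictly topical branch: one logical change plus a fix on top.
git checkout -q -b topic
echo feature >> file.txt
git commit -q -a -m "feature, first attempt"
echo fix >> file.txt
git commit -q -a -m "fix the first attempt"

# --squash stages the combined result but does not commit or record
# topic as a parent, so the commit below is an ordinary one-parent commit.
git checkout -q mainline
git merge --squash topic >/dev/null
git commit -q -m "Topic: one logical change"

# One line of output: the new commit followed by its parents.
words=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
mainline_commits=$(git rev-list --count mainline)
echo "commit plus parents: $words"
echo "commits on mainline: $mainline_commits"
```

As noted above, this only makes sense for strictly topical branches, since the intermediate commits are not carried over to mainline.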
Here we have the "precious" history vs the "throwaway" history
argument again. You are correct, this does look like CVS/Subversion
merging. But I'm quite capable of keeping my patches single-topic.
Anything that requires multiple patches in a logical separation still
Wouldn't it be cg-pull? I guess I'm not conversant enough of
I like quilt for certain work, and what I read from you and
Catalin makes me interested in StGIT for those large changes that
require split-out patches. But for simple tasks, I just want to use the
SCM, you know?
Joel
--
"The cynics are right nine times out of ten."
- H. L. Mencken
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
-Dear diary, on Tue, Nov 01, 2005 at 05:17:30PM CET, I got a letter Well, ok, so I assume you are indeed using strictly topical branches. cg-pull just fetches stuff, no merging done. Ok, in theory you do not actually need to fetch the intermediate history in case you are going to squash (unless you are going to default the final commit message to a concatenation of the intermediate ones), but that would not be easy to arrange with the current git tools, I think. Nor would it be feasible. But actually, I would like to do something like this later, support for CVS/SVN-like tracking by always having only the latest tree and no intermediate states, so that people who just want to run the latest and want to do no development are not

Well, if you are already going to deform the history, StGIT (able to track patch history) is just the best tool for that. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ VI has two modes: the one in which it beeps and the one in which it doesn't. -
Ahh -- the lightbulb just lit up. Using CVS is just like being married. No wonder you hate it... -
>>>>> "wa1ter" == wa1ter <wa1ter@myrealbox.com> writes: wa1ter> Ahh -- the lightbulb just lit up. Using CVS is just like wa1ter> being married. No wonder you hate it... Now, hey hey. I've met Tove. She's very nice. I doubt that Linus would compare his marriage to her with CVS. :) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! -
He did say that CVS worked reasonably at Transmeta. He's probably just as glad she's not married to the entire Linux development community... Being married is great, but it just doesn't scale past a dozen people who trust each other and have rules on how they do things. -Daniel *This .sig left intentionally blank* -
This thread is getting a bit psychedelic. People, take your meds, please, Linus -
I dunno why not. If my wife knew CVS she would probably agree with me (for a change). I've learned a great deal from reading this thread -- as I hope others have. I did learn one important thing while reading about old Al Einstein: you can get some astonishing insights by asking really dumb questions of really smart people. I've had many such astonishing insights from reading Linus's posts over the years (and I look forward to many more). I've been amazed by Linus's understanding of both machines and people -- this combo is rare indeed! Linus, have you considered a career in marriage counseling? -
The very first version of Linux came out in 1991. -hpa -
You need to understand the SCM "problem space" at least a little bit. Can't cheat on that unfortunately. The writeup at http://www.dwheeler.com/essays/scm.html is not perfect, but should give you a bit of background. It barely covers git -- we need to prod the author to update it ;) martin -
