Seems to me the concept of the "index" is a half-baked version of what I really want, which is the ability to factor a working tree's changes into its constituent parts in preparation for committing them. The index provides some very nice facilities to factor out changes in a working tree into a "staging area", but the fundamental flaw of this in my view is that this "staging area" is not instantiated as a tree, so it cannot be compiled and/or tested before committing.

Consider a facility where the state you want to commit next is built up in the current working directory, and the original set of changes exists in some proto-space like the index currently inhabits, where you can query and manipulate that state, but it isn't instantiated in your working tree.

Imagine a session like this: You've got a couple of conflated changes in your working tree, that you think you can break up into two orthogonal changes, each of which will compile and pass a set of tests you've got. You think. You'd like to verify the build and test before you commit each piece.

git prep

where "prep" means "prepare commit". Don't get hung up on command or option names I'm using as placeholders, I just made that up without much deep thought about what to call it.

Now my tree appears clean (and git diff returns nothing). I can now start adding the changes I had in my working tree that I want to include in the next commit, using git add (which would know I am in the "prep" mode). I can examine those original working dir changes I am choosing from with:

git diff --prep

which, at this point, shows the same output that "git diff" did before I ran "git prep." Now I want to add some subset of my original changes:

git add newfile.c
git add -i
<add a couple of hunks of the changes from file modfile.c>

Now I have a working tree state that I think I want to commit. I can examine it with:

git diff

and I can compile and test it. Yep, it works and passes my test suite (an option I did ...
Hm, I use "stash" for that purpose, which leads to kind of the reverse of your approach. So I do something like this:

- hack hack hack
- Notice that I want to make two commits out of what I have in my working tree
- git add -p -- stage what I want in the first commit
- git commit -m tmp -- temporary commit
- git stash -- stash away what doesn't belong in the first commit
- git reset HEAD^ -- drop the temporary commit, with the changes kept in the working tree
- test, fix bugs, read the diff, whatever
- git commit -- this time for good
- git stash apply -- get back the changes for the second commit

Instead of using reset, you could also use "commit --amend" (I actually used to do that), but that needs you to do "git diff HEAD^" to see the full changes, and (IMHO) sometimes makes it harder to review your stuff, because you now have three places where the changes for one commit might reside (HEAD, index and working tree).

Björn --
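The steps above can be replayed non-interactively in a scratch repository. This is only a sketch under stated assumptions: the file names a.txt and b.txt are made up, a plain "git add a.txt" stands in for the interactive "git add -p" selection, and the final commit uses "-a" because "git reset HEAD^" leaves the kept changes unstaged.

```shell
#!/bin/sh
set -e

# Scratch repository with one base commit (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m "base"

# hack hack hack: two unrelated changes end up in the working tree
echo "change one" >> a.txt
echo "change two" >> b.txt

git add a.txt            # stand-in for picking hunks with "git add -p"
git commit -q -m tmp     # temporary commit
git stash -q             # stash away what doesn't belong in the first commit
git reset -q HEAD^       # drop the temporary commit, keep its changes in the tree

# At this point the working tree holds only the first change, so it can be
# built and tested as a whole before committing.
testable=$(git diff --name-only)

git commit -q -a -m "first change"   # this time for good
git stash apply -q                   # get back the changes for the second commit
```

After the script runs, "$testable" names only a.txt, and the working tree again carries the b.txt change on top of the first real commit.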
'git stash save' saves local modifications to a new stash, and runs 'git reset --hard' to revert them to a clean index and work tree. When the '--keep-index' option is specified, after that 'git reset --hard' the previous contents of the index is restored and the work tree is updated to match the index. This option is useful if the user wants to commit only parts of his local modifications, but wants to test those parts before committing. Also add support for the completion of the new option, and add an example use case to the documentation. Signed-off-by: SZEDER Gábor <szeder@ira.uka.de> --- I used to do the same, so I have added a '--keep-index' option to 'git stash save' to simplify this workflow. Have a look at the use case added to the documentation to see, how you could spare the temporary commit and the 'reset HEAD^'. RFC, because I'm not quite confident with using plumbing like 'git read-tree'... and there are no tests. Documentation/git-stash.txt | 22 +++++++++++++++++++++- contrib/completion/git-completion.bash | 13 ++++++++++++- git-stash.sh | 22 ++++++++++++++++++---- 3 files changed, 51 insertions(+), 6 deletions(-) diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt index baa4f55..936864f 100644 --- a/Documentation/git-stash.txt +++ b/Documentation/git-stash.txt @@ -36,12 +36,15 @@ is also possible). OPTIONS ------- -save [<message>]:: +save [--keep-index] [<message>]:: Save your local modifications to a new 'stash', and run `git-reset --hard` to revert them. This is the default action when no subcommand is given. The <message> part is optional and gives the description along with the stashed state. ++ +If the `--keep-index` option is used, all changes already added to the +index are left intact. list [<options>]:: @@ -169,6 +172,23 @@ $ git stash apply ... continue hacking ... ---------------------------------------------------------------- ...
Please do not describe how it does, before what it does and what it is good for. Here is an example: When preparing for a partial commit (iow, committing only part of the changes you made in your work tree), you would use various forms of "git add" to prepare the index to a shape you think is appropriate for committing, and finally run "git commit". This workflow however has a flaw in that you are committing something that you could never have tested as a whole (and without any other modification) in your work tree. With the new "--keep-index" option, "git stash" takes a snapshot of your index and the work tree state, and updates the work tree to match your index, i.e. what you are about to commit. This way, you can commit with confidence, knowing that you are committing what you saw (and hopefully tested) as a whole in your work tree. After making that initial commit, "git stash pop" will bring the work tree state back without touching the index, so you can start the next cycle of "git add" to prepare the second batch. I do not know if your --keep-index implementation would actually allow the above workflow; I haven't read the patch. --
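The cycle described here can be sketched in a scratch repository as follows. The file names are hypothetical, "git add a.txt" stands in for whatever form of "git add" is used to shape the index, and the option is spelled "git stash push --keep-index", the current name of what the patch calls "git stash save --keep-index". Whether the RFC patch itself behaves exactly this way is not something this sketch can confirm.

```shell
#!/bin/sh
set -e

# Scratch repository with one base commit (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m "base"

# Two unrelated changes in the working tree
echo "change one" >> a.txt
echo "change two" >> b.txt

git add a.txt                       # shape the index for the first commit
git stash push -q --keep-index      # snapshot everything, leave the tree matching the index

staged=$(git diff --cached --name-only)   # what is about to be committed
unstaged=$(git diff --name-only)          # empty: the tree is exactly the index

# ... build and test the working tree here ...

git commit -q -m "first change"
git stash pop -q                    # bring back the rest for the next cycle
```

In this simple case the pop applies cleanly because the stashed copy of a.txt is identical to what was committed; if the staged file were edited further before committing, the pop could report a conflict.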
Now I have my guess at the first commit as my tree state, correct? What happens when I decide I need a couple of hunks from another file which I missed in my first guess, and is now in the stashed state? How do I get those out of the stash and into the working tree? If there is no convenient way to do that, then this method is not sufficient to cover the use case I am talking about. Thanks, Bob --
It's rather a work-around to stash away only the changes that are not in the index. See the other reply to my mail for a patch that adds an

git stash pop
eventually fix conflicts if you changed the working tree in the meantime
go back to the "git add -p" step

Björn --
But that pops the entire stash, right? Inconvenient at best. A good UI here would allow you to move pieces bidirectionally to/from the stash at will until the desired, verifiable factorization of changes has been achieved. Thanks, Bob --
I do this all the time. After I have made $N commits out of my worktree, I usually $ git rebase -i HEAD~$N and turn all 'pick's into 'edit's and 'squash's. Then I can compile and test each commit, perhaps add some fixups, in isolation. -- Hannes --
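A non-interactive cousin of this approach is "git rebase --exec", which runs a command after each replayed commit instead of stopping at an 'edit'. The sketch below is not Hannes's exact procedure: the repository contents are made up, "test -s a.txt" is a hypothetical stand-in for a real build and test command, and it assumes a git recent enough to accept --exec without an explicit -i.

```shell
#!/bin/sh
set -e

# Scratch repository with a base commit and N=2 commits on top (hypothetical).
repo=$(mktemp -d)
log=$(mktemp)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo base > a.txt
git add a.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
echo one >> a.txt
git commit -q -a -m "first"
echo two >> a.txt
git commit -q -a -m "second"
before=$(git rev-parse HEAD)

# Replay the two commits and run the stand-in test after each one.
# A failing command stops the rebase at that commit so it can be fixed up.
git rebase -q --exec "test -s a.txt && echo ok >> '$log'" "$base"
```

The log file then holds one "ok" line per commit that was checked.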
Hannes, I do not have N commits. I have a modified working tree from which I would like to create N commits. I would like to compile and test an instantiation of each of those to-be-committed states before committing them. Thanks, Bob --
Hi, I wanted to suggest using commit and commit --amend, but I realized that, frankly, I don't quite understand what you want to do. Through the process, are you preparing a sequence of two commits at once, or merely a single commit? With s/--prep/--cached/ and throwing git prep away completely, it's not clear to me how what you present would be different at all from just using the index - could you point out what is actually different in your workflow compared to the prep workflow you propose? -- Petr "Pasky" Baudis The last good thing written in C++ was the Pachelbel Canon. -- J. Olson --
As I said in my original message, twice, the index state is never instantiated and therefore cannot be compiled or tested. Thanks, Bob --
Hi, Half-baked is probably too strong a word. What you are basically asking for is to have the working directory as staging area, and to be able to stash away changes that are not to be committed. Now, this is not necessarily what everybody wants, which is why many people are fine with the index. Having said that, I played with the idea of a "git stash -i", which would allow you to select the changes to stash away. (And by extension, "git stash -e" using the "git add -e" command.) Hmm? Ciao, Dscho --
On Fri, Jun 27, 2008 at 02:33:33PM +0100, Johannes Schindelin wrote:

If we are at it, git checkout -i is also something which may be useful, like:

1) Do two unrelated changes in a file.
2) You realize one of them is unnecessary.

Currently what you can do is something like:

1) You stage the first hunk using git add -p
2) git commit
3) git checkout file

But this forces you to commit early, and to commit --amend later. It would be nice to be able to completely drop a hunk without first committing.

(Feel free to point out if this is something bad, I just remember from the past that darcs revert - which is like git checkout - had such an interactive mode.)
On Fri, Jun 27, 2008 at 6:33 AM, Johannes Schindelin It is too subtle. That the index state - which becomes the next committed state - is not available for building or testing before But it is something they should want, and should have, if they care about the quality of their commits. Especially in the common case of a project with development lines which have some sort of policy about build/test requirements. How do you ensure your commits obey that policy if you cannot verify it? That is why the index is not a sufficient mechanism for preparing partial commits. It's fine for quick and dirty operation when the factorization of the conflated I meant to mention that - at least in the model I described - this has some overlap with "stash" and could possibly be folded into it. In my ideal UI, changes (from all changes to hunk level) could be moved back and forth between the stash and the working tree equally easily. Would git stash -i allow that? For example, if I moved a couple of files into the stash, and then realized I needed one hunk back, could I easily retrieve just that from the stash? Thanks, Bob --
Hi, This is too narrow-minded a view for me. No longer interested, Dscho --
On Fri, Jun 27, 2008 at 10:45 AM, Johannes Schindelin Here's a patch to match the local culture: "It is incredible how stupid the idea of the index is." Clearly you should now be interested. Thanks, Bob --
I guess I'm not interested in the over-generalizations. ;-) But the ability to use e.g. some stash-like feature (as suggested above) to easily make the index-state (about to be committed) fully available for compiling/processing/testing without losing edits not yet ready for commit is an extra feature we would use here at least some of the time. I will admit it's currently not the "itch" at the top of my list. Thanks, -- Dana L. How danahow@gmail.com +1 650 804 5991 cell --
Hello, I just thought I'd throw in my $0.02 here. There's something fundamental I think I'm not getting about this argument: it seems to be based on the premise that partial commits allow untested tree states to enter the repository. However, having gotten used to the git way of things, I personally don't see the problem with allowing bad commits, as long as they are not pushed to public.

That is, I use the git add -p command all the time when I realize I've just done two things that should be committed separately. Then I'll git commit everything else, and go back and test, like so:

git checkout master
[hack hack..]
git add -p
git commit
git commit -a
[test..]
git checkout master^
[test..]
git checkout master

See? All tested. If I find a problem during testing, I'll probably commit it and then rebase master off my new commit, fixing any conflicts I just introduced.

Frankly I hardly even use the stash; since history can be edited, I feel like commits are all I need when working with git. Anything I do doesn't matter until I push to public anyways. So it's up to me to make sure I test everything before pushing, but otherwise I'm very happy with the ability to commit half-baked ideas and then go back to make sure they are usable (and tested!) before pushing. This is what local branches are for, isn't it?

Steve --
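The "commit first, test afterwards" sequence above generalizes to a loop that checks out each unpushed commit in turn. This is a sketch only: the repository is a made-up scratch one, "$base" plays the role of the last published commit, and "test -s a.txt" is a hypothetical placeholder for the real test suite.

```shell
#!/bin/sh
set -e

# Scratch repository: one "published" base commit plus two local commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo base > a.txt
git add a.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
echo one >> a.txt
git commit -q -a -m "first"
echo two >> a.txt
git commit -q -a -m "second"

# Remember the branch so we can return to it after testing.
branch=$(git symbolic-ref --short HEAD)

tested=0
for c in $(git rev-list --reverse "$base"..HEAD); do
    git checkout -q "$c"     # detached HEAD at the commit under test
    test -s a.txt            # placeholder test; set -e aborts on failure
    tested=$((tested + 1))
done

git checkout -q "$branch"    # back to the tip, as in "git checkout master"
```

Every local commit gets instantiated in the working tree once, which is the property the thread is arguing about; the difference from the proposal is only that it happens after committing rather than before.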
Of course developers need to be able to make bad-commits. If they can't, we quickly end up back in the "don't checkin" mentality of CVS. I want to be able to do bad-commits. Further, I recognize that even with the best of intentions bad-commits will enter the mainline.

What I think is missing is a mechanism for re-summarizing a set of commits that does not rely on the fact that they've never been published. See my other email on "supercede".

Consider a big public tree that is using bisect, but discovers that there is a bad commit in the mainline. I have no idea what you would do with current git to fix this. However, if I could pick two endpoints and "supercede" them, I would have a bunch of options:

- I could supercede 2-commits with 1, effectively making the bad-commit disappear in the linear history. Users who already have the history, however, would be unaffected, because the start/end endpoints are the same.

- I could supercede 2-commits with two different commits, one that makes the test pass correctly, and one that trues-it-up with the next patch, that somehow made all the tests pass correctly again. (possibly by hoisting a diff out of the 2nd commit that actually fixed the first issue)

While repairing an old test build problem may seem like an infrequent issue, this mechanism has a much more frequent and more important benefit... which is that people can share changes and then later rebase them -- and others who have a copy of that branch will receive automatic merge/rebase support. (assuming there are not reasons for conflict) --
You can't do that. The start might be the same but the end point would not, even if the file contents happened to be the same as they were before. This is because each commit is identified by its SHA-1 hash, and changing any ancestor anywhere in the DAG will have a trickle-down effect to all subsequent commits which stem from it, changing their SHA-1 hashes too. So you wind up with a different end point. This is by design (see below). It has to be this way because it is the only way to do distributed SCM. Imagine for a minute that one developer has a commit "feedface", fetched from another repo. If the commit or any parent (grand parent, great grandparent etc) is changed then the hash _has_ to be different. Otherwise you would have the untenable situation that two developers, two repos, could refer to a particular commit "feedface" and it wouldn't actually be the same thing. Distributed development would be unworkably flakey if you couldn't rely on the stability of each commit's SHA-1 hash, and that fact that it embodies not just that particular state in the history, but all prior states too. It's a totally basic, fundamental, central, non-negotiable concept to Git's design. And not just an idiosyncrasy either; if you do a review of other distributed SCMs out there you'll find that _all_ of them have a similar underlying precept, as it's literally the only way to do working distributed SCM. Wincent --
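The trickle-down effect can be observed directly with plumbing commands. The sketch below uses a scratch repository, a made-up identity, and fixed dates so that the only thing differing between the two child commits is their parent; the tree, message, author, and timestamps are identical.

```shell
#!/bin/sh
set -e

# Fixed identity and dates, so commit IDs depend only on what we vary below.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
export GIT_AUTHOR_DATE="1214568000 +0000" GIT_COMMITTER_DATE="1214568000 +0000"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo content > a.txt
git add a.txt
tree=$(git write-tree)       # one tree, reused by every commit below

# Two parents with the same file contents but different commit messages
parent1=$(git commit-tree -m "parent, original message" "$tree")
parent2=$(git commit-tree -m "parent, reworded message" "$tree")

# Two children that are identical except for which parent they name
child1=$(git commit-tree -p "$parent1" -m "child" "$tree")
child2=$(git commit-tree -p "$parent2" -m "child" "$tree")

echo "child on original parent: $child1"
echo "child on reworded parent: $child2"
```

The two children print different IDs even though their file contents are the same, which is why replacing commits in the middle of published history cannot leave the end point unchanged.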
And why is that? It is like saying that any editor that does not allow you to compile the file without saving it first has a deep flaw. In Git, commits are not the same as in CVS. You can commit any changes and amend them any time later before you publish your changes. So, what you normally do is to commit your changes and then run tests on them. The advantage of this approach is that you can continue your normal work while the tests are running; besides, the tests are running in a clean and controlled environment, not in the developer's working directory where always

Those who care about quality should have a review process, and the review process works best when all changes are split in a small logical

You can verify it, but you do that _after_ you have committed changes but before you publish them. BTW, policy may include that it should be compiled and tested on a few platforms, so you cannot do that in your working directory anyway.

I think the source of your confusion is that you consider git commit as cvs commit, while git commit in some sense may be closer to saving files, and a better analogue for cvs commit would be git push to a public repository.

Dmitry --
I don't believe it is like that. It would be like that if you intended for your on-tree disk to have a policy to always compile (for This is enforcing a two-step process where there only need be one the vast majority of the time, to require that commit and publish be I don't understand the question. The entire point of the facility I am proposing is to facilitate creating small clean changesets. Go back and read my original proposal, or Junio's statement of more or Again, this is requiring two steps when it is otherwise not required, No. I have experience with a wide array of source control systems. CVS fits my mental model the least well. git is pretty close, but it is not there yet. The current partial commit facility is the biggest misfire, in my view. I like that git's philosophy does not include a draconian policy of not changing history. That's fine, it's practical, and it's useful, to cover common scenarios in which you'd like to quickly recover from a mistake. However, I am afraid that these facilities have been abused and turned into something that they are not well-suited for, i.e., the use of lines of development as both keepers of history and of scratch spaces where you scribble around with temporary things, all the while git having no clue which is intended. That these ideas are conflated is a mistake. That's my opinion. These activities ought to occur in separate, logically distinct spaces in which they occur, because they have different requirements and common use-cases. Bob --
Do you think committing only tested changes is a common policy among Git users? I seriously doubt that. And as I said before, git commit

I don't see that it is a problem. In fact, it saves time for me, because I can work while tests are running, and considering that the whole cycle with different configurations may take 3 hours, I really want to do something useful while it is running... Besides, I really believe

If you use a CVS-like workflow, you may have a policy of no commit until your patches are reviewed. In the case of Git, you do commit but only push to a fast-forward-only branch after receiving an okay (or the maintainer pulls your changes into it). With Git, commit is more like saving a file, except that you save not a single file but the whole

You have your working directory, let's say, on Linux, and you have to test your changes on Windows. So what do you do? With Git, it is simple: you commit your changes and then run testing automatically on different platforms, and they use exactly what you put into the

Do you have personal experience of using it, or is it just some abstract consideration? Certainly, any feature may be misused but, in general, it is handy and I have not had any problem with it. And, yes, if I split a very complex patch then I will use stash to facilitate

I don't see why git should know it. The policy depends on workflow and is usually enforced by hooks. I don't see why Git should care about it deeply. It is like a word processor that can be used for writing a draft of a document or the final version for publication. Sure, you can mark something as work-in-progress (use tags or comments), but it is not something Git should care about deep inside.

Dmitry --
I've always said that I am not in favor of any form of partial commits, exactly for the reason Robert states, namely that you are not committing what you had in your work tree as a whole. I said so back when the only form of partial commits were "git commit [-o] this-file". I said it again even when I introduced "add -i", that the interface goes backwards and does not fix the issues associated with partial commits. But I agree with you that calling the index half-baked is missing the point. The index is merely the lowest level of facility to stage what is to be committed, and there is no half nor full bakedness to it. The way the current Porcelain layer uses it however could be improved and Robert is allowed to call _that_ half-baked when he is in a foul mood (even then I would rather prefer people to be civil on this list). So I would welcome constructive proposals to make things better. But before going into the discussion, to be fair, I would mention that people who are used to partial commits (perhaps inherited from their CVS/SVN habit) defend the practice by saying that they will want to make commit series first (with unproven separation between commit boundaries that is inherent to the practice of making partial commits) and it is not problem for them that their commits are not tested at commit time, because they will test each step afterwards after they are done committing. They can fix things up later with "rebase -i" if they find glitches. The defense makes sense from the workflow point of view, in that batching things up tends to make people more productive. You think of the logical separation first and make commits without having to wait for each step to build and test (otherwise your train of thought would be interrupted), and then you test the final resulting sequence as a separate phase. 
Although I imagine I would personally not be able to work that way comfortably, I consider this a personal preference issue, and if some people are more productive to ...
What I said is that the index is a half-baked version of _what I want_, which is the ability to do partial commits from testable states. Clearly the index is a way to do partial commits, but it does not address the "untested state" issue, so it is only halfway there, Exactly. That was a good summary of the workflow I proposed in my Bingo. Time permitting, I will propose a more well thought through UI for this workflow. If you like it, we can talk about how it might be implemented. Thanks, Bob --
I do partial commits all the time. I used to use the "go back and clean up and test each" method. But now with stash, I use the workflow mentioned elsewhere in the thread:

1. hack hack hack
2. add -i; commit -m tmp
3. stash
4. reset HEAD^

which is really kind of awkward, since I didn't want to make that commit in the first place.

I think of it as three ordered buckets: index, working tree, and stash. You can move changes from any bucket to an adjacent bucket, but you can only test what's in the working tree. So we have too many changes in the working tree. Right now we can put some of them in the index bucket. But that doesn't help with testing. So what I do now is move good changes to the index bucket (and then commit -- more on that in a second), and then everything else to the stash bucket, and then reset the commit.

The extra commit / reset is annoying. We could do better than that with the proposed "stash --keep-index". Then I just have to move my good changes into the index and say "ok, now stash everything else." But that still involves two bucket moves. I think what would be much more natural is to simply say "I don't want these changes right now; move them into the stash bucket." And Dscho mentioned "git stash -i", which does that (and theoretically, it could even be based on the "git add -i" code).

What you propose below accomplishes the same thing, but seems mentally

Here we say "OK, I don't want any of these changes; stash them" and then selectively bring them back. And maybe that _is_ what you want sometimes, or maybe how certain people think. But personally, given two sets of changes, what makes sense to me is to directly say "these are the changes I _don't_ want". And at any point after ditching some changes, you can re-run your tests.

So I fundamentally think that all of these contortions are because moving things to the stash bucket is not as featureful as moving them to the index bucket. And there's no reason for it, since we can ...
Are all features for moving changes to stash bi-directional in your implementation? Can we move a hunk out of stash, just as easily as we can move one in? I think this is an essential property of a good implementation of this workflow. As to which way people like to think and work, in terms of re-applying changes from a fresh state one at a time, or just pushing off changes they don't want, I think ensuring bi-directionality in the tools for moving changes back and forth will ensure that all such scenarios will be equally well supported. It does seem to me at this point that extending stash functionality is a reasonable way to approach supporting this type of workflow. Thanks, Bob --
No, they're not bi-directional. You get the full power of "add -i" when moving things into the stash, but not out. It is the reverse of the situation with the index; we have good tools for staging things, but not for unstaging them. But I agree that they _should_ be. Given the patch I posted already, probably the simplest way to do both at once would be to give "add -i" Actually, I oversimplified a little bit in my "buckets" description. Stash actually stashes two things: the current index and the current worktree. So in that sense, it is not just another bucket as I described. For the purposes of the workflow we're discussing, I think that is how we want it to function. But the implementation will be a bit trickier than it might otherwise be because of this. I just didn't want to have to introduce another, slightly different type of stash. -Peff --
You do *not* have to use stash that has different index and work tree
components and then your buckets description makes sense.
The index is to hold good changes verified to be fine to make commits, the
work tree is to test and verify outstanding changes, and the stash is to
hold the remainder (i.e. further changes that do not belong to what you
are currently looking at in your work tree).
When your workflow is to "verify and immediately commit", then the need
for buckets in your particular workflow can degenerate to one that does
not need the index. That is essentially Robert's workflow (but it does
not mean it is the only valid one).
What I do these days is this:
 * Fork a new topic from the commit I pushed out the last time to the
   public ("git checkout -b jc/topic ko/master").

 * Think, hack, commit, think, hack, commit, lather, rinse, repeat.

 * Make sure everything is worthy for the final state. There can be (and
   need to be to use the current set of tools) some uncommitted changes.
   Make a stash, so that the work tree component records the final tree,
   and mentally name it the "commit goal".

 * I have never grown comfortable operating "edit" insn in "rebase -i", so
   the workflow from this point does not use it. Instead, I detach the
   HEAD to the root of the series ("git checkout ko/master^0") at this
   point. Now, none of my change is in the work tree.

 * Repeat the following:

   * Recreate the work tree state for the next round to be built on HEAD
     and make a commit, after verifying what I have in the commit is good.
     Examples of the tools at my disposal are:

     * "cherry-pick jc/topic~$N" to get the necessary changes from my
       earlier "snapshots", which can possibly be followed by a "git
       commit --amend". This "going forward" is easiest especially in the
       early part of the sequence.

"format-patch --stdout jc/topic~$N..jc/topic~$M | git am" is a
slight variant of the above when I ...

Hi. I'll squash in "|-e" here. Note that some time ago, I was working on "git stash apply -i", which I never got around to finishing. Yesterday, I was briefly considering working on it, but I have other things to tend to now. Ciao, Dscho --
Thinking about this some more, it seems to me to lack one really important feature that "git add -i" has: you must stash all in one go. That is, I may do some of the adds as "git add <file1> <file2>" and then pick out the rest of the changes with "git add -p". And traditionally, stashing has been about dumping all changes, so everything happened at once. But I think what I would really like here is to say "now I don't want to stage for a commit; I want to stage into some bucket, so that I can clear my workspace for making the commit". And then proceed to use "add" or "add -i" in the usual way, except that they go into my bucket. And at the end, I switch back to staging for a commit, make the commit, and then start picking things out of my bucket. And that workflow is not too hard to imagine by just pointing GIT_INDEX_FILE to the bucket. But I am still thinking on this, so I'll let it percolate and then maybe try to implement something once I have a better sense of exactly what workflow I want. I just thought I would throw it out there for others to ponder. -Peff --
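A rough sketch of the GIT_INDEX_FILE idea, under loud assumptions: the bucket file name "bucket-index" is invented here, "git add b.txt" stands in for any mix of "add" and "add -i", and retrieving the change with "git show" is just one possible way to pick things back out. None of this is an existing git feature; it only strings together commands that already exist.

```shell
#!/bin/sh
set -e

# Scratch repository with one base commit (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m "base"

echo "change one" >> a.txt     # wanted in the next commit
echo "change two" >> b.txt     # not wanted yet

# Stage the unwanted change into a separate "bucket" index instead of the
# real one. The bucket starts out as a copy of HEAD.
bucket="$repo/.git/bucket-index"
GIT_INDEX_FILE="$bucket" git read-tree HEAD
GIT_INDEX_FILE="$bucket" git add b.txt
in_bucket=$(GIT_INDEX_FILE="$bucket" git diff --cached --name-only)
bucket_tree=$(GIT_INDEX_FILE="$bucket" git write-tree)

# Clear the bucketed change out of the workspace, then test and commit.
git checkout -- b.txt
git commit -q -a -m "first change"

# Pick the change back out of the bucket for the next round.
git show "$bucket_tree:b.txt" > b.txt
```

Because the bucket is an ordinary index file, changes could in principle be staged into it file by file or hunk by hunk, which is the piece a one-shot stash lacks.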
Robert, I'm new to git, but I understand where you are going. Why limit it only to working tree changes? For me, the stash machinery is of no help here, because I commit super-often. What I end up with is 30 commits in a topic branch, but where not every point passes 100% of tests. I want to go back through and order them properly, and decide which points are sensible as upstream commits (especially as I read more about bisect). git has all the concepts I want except one. However, it makes the process pretty manual. Here is an idea about automating it. I'll talk about that one new concept at the bottom. I think of this as reorder/merge/split...

reorder: Picture that a list of commits on this branch opens in an editor. You are free to rearrange the lines in any order you want, but you have to keep all the lines. When you are done reordering the lines, the tool creates a new topic branch and applies the changes (probably with cherry-pick) to the new topic branch. If there are no conflicts, you're done.

merge: Picture now that in your editor you can create groupings of those individual commits that should make up separate topic-branches. The operation can still be performed automatically, and at the end, it can compose those topic branches into a single branch just like your original. At this point, you can "isolate" any one of those topic branches and test/push that topic branch.

split: Picture now that you can list the same commit in more than one of the topic-branches. This is a little more tricky, and there is no way to do it automatically. It drops you into an editor and asks you to select the lines of the diff for the first topic. The remaining lines are put in the next topic. This can continue for multiple topics.

This seems like something that could be assembled pretty easily on top of the current git mechanisms, except for one thing. If you use merge, the history will be a mess.
If you use rebase, anyone else who pulled your topic branch will be in a world ...
