[
I think the proposal below is original, and more correctly captures
the essence of the "commit interface wart" than any previous
proposal I've made. This proposal is also based entirely on what is
useful for all git users, and what I perceive git's conceptual
models to be. That is, this proposal concerns what _I_, (as a fairly
experienced git user), actually want, without any bias for any
assumptions about what an imagined "new user" might want. Notably,
it does not try to satisfy naive (and likely incorrect) assumptions
about git's model.
Finally, this proposal intentionally uses ludicrously long command
names. This is because a discussion of realistically short names
triggers the two loaded issues of "muscle memory" and which concepts
get blessed as "defaults". In previous threads, those issues have
muddied the conceptual issues I'd like to focus on here. Let's talk
about the concepts first, and save discussions of naming for later
if necessary.
]
Proposal
-------
Here are the two commit commands I would like to see in git:
commit-index-content [paths...]
Commits the content of the index for the given paths, (or all
paths in the index). The index content can be manipulated with
"git add", "git rm", "git mv", and "git update-index".
commit-working-tree-content [paths...]
Commits the content of the working tree for the given paths, (or
all tracked paths). Untracked files can be committed for the first
time by specifying their names on the command-line or by using
"git add" to add them just prior to the commit. Any rename or
removal of a tracked file will be detected and committed
automatically.
Rationale summary
-----------------
These two commands capture a distinct conceptual split that is useful
for what users want to do with git. The split is necessary and
sufficient to provide access to four different useful pieces of commit
machinery. This is more functionality than in current ...BTW, the current "apply --index" doesn't allow what I imagined in the scenario above. It notices that the affected file is different in the working tree compared to the index and just refuses to do anything. Given that safety-valve in git-apply, the current behavior of "git commit paths" would allow for splitting a submitted patch into two commits. The difference is that it only works if the local modifications do not affect any of the same paths as the patch. The user is freed from worrying about this somewhat, since if it's not the case, then git-apply just complains and doesn't do anything. But what might be very interesting is a modified "git-apply --index" that would not fail in this case, but would instead do the following: 1. Apply the patch to the working tree 2. Apply the patch to the index And of course, if either fails then the entire apply operation fails, leaving no changes to working tree or to the index. With that new git-apply behavior, then the scenario I outlined above would work, and would work in spite of any changes to the same file in both the working tree and the index. It would also require the separate commands for commit-index-content vs. commit-working-tree-content as described in my original message above. -Carl
Carl Worth <cworth@cworth.org> wrote: Edit somefile with, e.g, emacs: Get backup called somefile~ Realize that somefile is nonsense, delete it(s edited version) commit-working-tree-contents: Now you have the undesirable somefile~ saved Edit somefile, utterly changing it: Get backup called somefile~ mv somefile newfile commit-working-tree-contents: somefile~ saved, newfile lost Edit somefile a bit, move it to newfile. Make sure no backups left over. commit-working-tree-contents: somefile deleted, newfile lost This is /not/ easy to get right, as it depends on what the user wants, and the random programs run in between git commands. You need to tell git somehow what files you want saved, and which ones are junk. I.e., just the first command (unfortunately). -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 2654431 Universidad Tecnica Federico Santa Maria +56 32 2654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513 -
The semantics I intended to describe for commit-working-tree-content would not add this file. That's a "new file" so would have to be mentioned either explicitly on the command-line or in a git-add OK, you've found a bug in my description above, (though not in the intended semantics). By "rename...detected automatically" I meant only that the fact that a file has "disappeared" as part of a rename need not be mentioned to git. The fact that the contents are made available as a new file name still would need to be told to git with "git add", Perhaps I was too oblique in calling this thing commit-working-tree-contents. This isn't some fabricated-from-scratch command. The intent of my message was that readers would recognize the description as matching what the current "commit -a" and "commit files..." commands do. So I really wasn't trying to invent anything really different than those. So almost any problems of unexpected behavior you can find almost surely apply to "commit -a" already. I did throw one new thing into the description, (that does not exist in current git). That's the mention that new files could be added by mentioning them explicitly on the command-line. This was intended as a way to allow a tutorial to sidestep the details of how "git add" interacts with the index. If this one feature is a bad idea, it could be dropped with no impact on the rest of the proposal nor my discussion of it. Similarly, I worded the mention of "git add" to suggest it be done "just prior to the commit". Again, I did this just to avoid having to mention anything about the need to "git add" again if the file was edited between the time of add and the time of commit. That language is already proposed for the git-add documentation, so there's no need to repeat it all here. -Carl
How do you distinguish a "new file, same contents as old file" from "old file, renamed"? What is the difference between: mv somefile newfile and cp somefine newfile rm somefile ? How should cp somefile newfile vi somefile be handled? How about cp somefile oldfile vi somefile or just mv somefile oldfile ? Or cp somefile somefile.my-own-bakup vi somefile ? The whole problem is your description based on "file renaming" and such. AFAIU git has a list of file names it is tracking, and for those names it keeps track of what the contents for each are at each commit. That the name somefile had some contents that later show up as newfile (both names tracked) is recorded just as that. You could /interpret/ this as a "rename" if somefile is then gone, but it could well be something else. Besides, you'd have to search for the old somefile contents among /all/ newfiles just to find out it was renamed. Better don't mix facts with interpretation (== guesses on what operations came in between the snapshots git takes). Note that it should never matter what strange ideas a random user gets for naming her temporary backup files, or their git configuration. -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 2654431 Universidad Tecnica Federico Santa Maria +56 32 2654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513 -
There is no difference. This is git, a content tracker. OK. Strike the words "or rename" from the description, leaving just: Any removal of a tracked file will be detected and committed automatically. The rest of my analysis still stands, I believe. And I'd be glad to accept further suggestions on documenting these. The goal is simply to have a user-oriented description of the semantics that are consistent between the current "git commit -a" and "git commit files..." commands. -Carl
I think this is a very interesting proposal, although I think I
disagree with the last part:
Any [rename or] removal of a tracked file will be detected and
committed automatically.
If adds aren't going done automatically (because otherwise you have
problems with foo.c~ accidentally getting checked it), then it's
non-symmetric to expect that deletes will also happen automatically.
It's relatively rare that files are removed or renamed, and sometimes
files accidentally disappear.
So in the case where there are no pathnames given to "git
commit-working-tree-content", I would argue that it does not do any
implicit "git add" on new files NOR any implicit "git rm" on missing
files unless the user actually specifies an --implicit-add or
--implicit-delete option, respectively. If users want to make
--implicit-add and/or --implicit-delete the default, that could be a
configuration option, but I don't think it should be a default.
A second issue which you left unspecified is what should
commit-working-tree-content do if the index != HEAD. In particular,
in this case:
edit foo.c
git update-index
edit foo.c
git commit-working-tree-content foo.c
What should happen to foo.c in the index? Should it be stay the same?
Should the contents be replaced with version of foo.c that has just
been commited? The latter seems to make sense, but runs the risk of
losing the data (what was in the index). The former has the downside
that the index might have a version of foo.c which is older than what
has been just commited, which could be confusing. Or should git
commit-working-tree abort with an error message if index != HEAD?
- Ted
-
That is exactly the "'commit --only' jumps the index" issue. Updating the index with what is committed makes sense because the commit after this --only commit happens builds on top of it, and not doing so would mean the change to foo.c would be reverted. As you mentioned above, updating the index with the committed version of foo.c means information loss of what was staged earliser, and the traditional behaviour has been to "abort with an error if index != HEAD" at that path, which was a safety valve. However, In the recent discussion, everybody (Linus, Nico, and I included) seems to think this information loss is acceptable and in fact is even useful. I've sent a patch to remove the obsolete safety valve for comments today, but haven't applied it to any of my public branches yet, but most likely I will, and it will happen sooner with encouragement from the list. -
It's non-symmetric, yes, but it's what I would personally like. It's not an essential aspect of the proposal, so it could go either way as the git crowd decides. To explain my personal preference, I like the notion of all files being "untracked" until I inform the system about their existence. After that, I'd like the system to take care of them and The ability to configure --implicit-delete and --implicit-add to This case is already under debate in a separate thread. There "git commit files", (which really is commit-working-tree-content already), currently errors out in this case, but the proposal is to allow it to proceed with the commit, (thereby "losing" the intermediate staged content). -Carl
(This is offtopic)
I often faced situations like that during git.git history. One
patch to expose the bug in the existing code, and another to fix
it. And there are three ways to make that commit.
(1) one commit exposes, then another fixes.
(2) one commit fixes, then another verifies the bug is no more.
(3) one commit to include both.
In my experience, (1) is only useful during the time I am coming
up with the fix (if I am fixing it myself) or during the time I
am reviewing and committing the fix (if I am applying somebody
else's patch). Committing in that order lets me validate the
brokenness after making the first commit, and then lets me feel
good by not seeing that problem after the second commit. But
this means I deliberately record a state that is known not to
pass the test, which means it is a problem for somebody else in
the future when the history needs to be bisected to hunt for an
unrelated bug. If the "test" is just an optional test in the
test suite, then it is easy to work around (the person who is
bisecting can ignore that bug by not running that particular
test), but if it is an assert somewhere deep inside the code,
ignoring it is not very easy, especially if the person who is
bisecting is not familiar with that part of the code.
What I recommend people to do these days is either (2) or (3),
but do so _after_ verifying the fix in the reverse order. The
criteria to choose between (2) or (3) is fairly simple: if the
"test" is easily separable (e.g. changes to a test script file
that does not overlap with the "fix" patch), roll both in one
commit. Then it would not later cause problems for bisection.
Enough of offtopic.
The sequence to split a patch in place would be (I'll speak in
the present tense and pretend Nico's "git add" does not exist
yet):
git apply
git update-index <files for the first batch>
git commit
git commit -a ;# the remainder
so you do not necessarily need a new "concept". It ...Granted, something like an assert that breaks the library would not be a useful thing to have in the history. I'm certainly not in favor of something like that. I'm talking about tests that demonstrate pre-existing broken-ness in the code. In the case of cairo, our test suite is entirely optional, and each test is out-of-process, so even if a test totally crashes the suite continues and it's easy to ignore that. But it's not just the correctness test suite. We also have a performance test suite, and I encourage the same "add test case, then performance fix" pattern there. This shows an even more obvious example of why it's useful to have separate commits in the history, (since someone may want to verify the performance impact on a separate system at any point in the future, and the two commits makes it easy to get "before" and "after" for the performance fix results from the Yes, I can use this today, and I do, (as I mentioned in my mail). The only requirement is that I start with a non-dirty working tree. I can arrange that, but it would be just a bit less inconvenient if I didn't No, I don't need it. And this "commit-index-content [paths...]" was the least significant part of my proposal. As I said, originally I was just going to say this "might be useful in some cases", but then someone just happened to request this feature on the list at the same time I was considering the proposal. Anyway, it spite of this being an accidental feature of y proposal, it seems to be the only part you commented on. Even if this functionality weren't made available at all, I'd still be interested in your comments on the main thrust of my proposal. I think that consists of: 1. Unifying the two current commands that provide commit-working-tree-content semantics into a single, use-oriented description. 2. Avoiding a change of semantics triggered by merely applying pathname arguments without any command-line option or Indeed. I like that discipline very much. And ...
Hi, Note that (1) maybe would reflect history better, but (2) and (3) are way nicer to bisecting. I fell very strongly that I want (3) in the history. (Though I am guilty of many instances of (2)...) Ciao, Dscho -
By the way, the original command-line convention I used in the proposal was that the omission of an optional argument should be equivalent to supplying some default argument. Here's another convention that is also useful to examine: Adding path-name arguments limits the behavior of the command, (and does not otherwise change the semantics). I don't know that this is as universal a convention outside of git, but it's quite strong within git. The path name limiting exists in deep parts of the machinery and allows for things like: git log -- paths... # path-limited version of "git log" git diff -- paths... # path-limited version of "git diff" etc. It's interesting to look at how the various commit commands fit (or do not fit) this convention: git commit paths... git commit --only paths... This command cannot be explained in terms of the semantics of "git commit" (without command-line options). This command _can_ be explained as a path-limited version of "git commit -a". git commit --include paths... This command does something _extra_ to the given paths before executing the equivalent of "git commit". I think this is a fairly unique violation of the path-limiting convention. The proposal I made with commit-working-tree-content and commit-index-content consistently follow the path-limiting convention. I think consistency of command-line conventions like this are important for making the tool usable. And there have been notable improvements to consistency of convention in git recently, (for example, using <since>..[<until>] in git-format-patch rather than <his> <mine>). -Carl
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
