[
I think the proposal below is original, and more correctly captures
the essence of the "commit interface wart" than any previous
proposal I've made. This proposal is also based entirely on what is
useful for all git users, and what I perceive git's conceptual
models to be. That is, this proposal concerns what _I_, (as a fairly
experienced git user), actually want, without any bias for any
assumptions about what an imagined "new user" might want. Notably,
it does not try to satisfy naive (and likely incorrect) assumptions
about git's model.
Finally, this proposal intentionally uses ludicrously long command
names. This is because a discussion of realistically short names
triggers the two loaded issues of "muscle memory" and which concepts
get blessed as "defaults". In previous threads, those issues have
muddied the conceptual issues I'd like to focus on here. Let's talk
about the concepts first, and save discussions of naming for later
if necessary.
]
Proposal
--------
Here are the two commit commands I would like to see in git:

    commit-index-content [paths...]

        Commits the content of the index for the given paths, (or all
        paths in the index). The index content can be manipulated with
        "git add", "git rm", "git mv", and "git update-index".

    commit-working-tree-content [paths...]

        Commits the content of the working tree for the given paths, (or
        all tracked paths). Untracked files can be committed for the first
        time by specifying their names on the command-line or by using
        "git add" to add them just prior to the commit. Any rename or
        removal of a tracked file will be detected and committed
        automatically.
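
As a rough illustration only, the two commands could be approximated
today with shell wrappers. The function names and the "message first"
calling convention are hypothetical. The mapping assumes the current
behavior of "git commit", "git commit -a", and "git commit <paths>",
and it does not cover committing untracked paths by name or automatic
rename detection. The one form with no equivalent in current git is
left as an error.

```shell
#!/bin/sh
# Hypothetical wrappers; neither command exists in git.
# Usage: <function> <message> [paths...]

commit_index_content() {
    msg=$1; shift
    if [ "$#" -eq 0 ]; then
        # All paths in the index: plain "git commit".
        git commit -q -m "$msg"
    else
        # No current "git commit" form commits a path-limited
        # subset of the index, so this sketch refuses.
        echo "commit-index-content paths...: no equivalent in git today" >&2
        return 1
    fi
}

commit_working_tree_content() {
    msg=$1; shift
    if [ "$#" -eq 0 ]; then
        # All tracked paths: "git commit -a".
        git commit -q -a -m "$msg"
    else
        # Given (already tracked) paths: "git commit <paths>".
        git commit -q -m "$msg" -- "$@"
    fi
}
```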
Rationale summary
-----------------
These two commands capture a distinct conceptual split that is useful
for what users want to do with git. The split is necessary and
sufficient to provide access to four different useful pieces of commit
machinery. This is more functionality than in current git, and is
provided with more clarity.
The semantics of the two commands above are distinct enough that any
given tutorial introduction to git could outline a complete work-flow
by using only one or the other of the two commands, (or by presenting
one first and then expanding to the other).
The conceptual split here is necessary. In general, neither of the two
commands can be defined in terms of the other. This is independent of
the fact that commit-index-content is more core and provides shared
machinery for commit-working-tree-content. It is also independent of
the fact that commit-working-tree-content _can_ be defined in terms of
commit-index-content in the special case of the "all tracked paths"
form.
The two-way split here is also sufficient. It provides access to four
different, and useful, pieces of commit machinery. Of the four, only
three of these pieces currently exist in git. The new behavior is that
of "commit-index-content paths..." and is actually quite useful as
described in the detailed rationale below.
Finally, the two-way split here is simpler and more clear than the
three different commit commands currently provided by git, ("commit",
"commit paths...", and "commit -a"). The improved clarity comes from
taking advantage of the following standard command-line convention:

    If optional arguments are omitted from a command, the command
    is semantically equivalent to some default argument being
    provided.

This convention is standard across many unix commands and is prevalent
in git itself, (such as commands like git-log defaulting to HEAD when
no revision specifier is provided). Note that this convention is not
followed by the current git-commit. The behaviors of "git commit" and
"git commit paths..." involve distinct semantics. It is not the case
that "git commit" is equivalent to "git commit paths..." with some
default argument supplied. Violating this command-line convention is
unkind in general, but it also steals "space" from the command-line
for implementing the semantics of "git commit" with the application of
a <paths...> limit. This is discussed in more detail below.
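
The difference can be seen with a small experiment in a throwaway
repository. This is only a sketch: the file name and contents are made
up, and it assumes a git in which "git commit <path>" commits the
working-tree content of that path. The same file is staged and then
edited further, once before a plain "git commit" and once before
"git commit file.txt".

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo base > file.txt
git add file.txt
git commit -q -m "base"

# Stage one version, then keep editing the working tree.
echo staged > file.txt
git add file.txt
echo unstaged > file.txt

# No paths: the staged (index) content is committed.
git commit -q -m "no paths"
no_paths_content=$(git show HEAD:file.txt)

# Same situation again, but this time name the path.
echo staged-again > file.txt
git add file.txt
echo unstaged-again > file.txt

# With a path: the working-tree content is committed instead.
git commit -q -m "with path" -- file.txt
with_path_content=$(git show HEAD:file.txt)

echo "git commit          committed: $no_paths_content"
echo "git commit file.txt committed: $with_path_content"
```

If "git commit" were simply "git commit paths..." with a default
argument, both commits would have taken their content from the same
place.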
So, by cleanly separating the two different useful git-commit
behaviors, and applying a standard command-line convention, we end up
with more functionality and less to teach. What's not to love? All
that would be missing is to come up with names for the two
commands. As I promised above, I'm going to avoid proposing any
binding of the concepts to realistic names here, but I will point out
that one of the "names" might very well be a command-line option
alteration of the other command.
Rationale details
-----------------
Although the conceptual split is only two commands, the actual
implementation of this functionality breaks down into four separate
internal behaviors, (based on whether doing "given paths" or "all
tracked paths"). Three of the four exist in git already, while the
fourth is new, (and also useful). Let's review each of the four along
with the names that git currently provides for them:
1. commit-index-content # all paths in the index
This functionality currently exists as "git commit" and is the
oldest and definitely the "most core" git commit command. Until
fairly recently, all other git commit commands could easily be
described as a variation of this functionality.
2. commit-index-content paths...
This functionality does not currently exist in any git commit
command, as far as I know. The behavior is to commit only a
(path-based) subset of the content that has been staged into the
index.
I was originally just going to say that this functionality "might
be useful in some cases", but coincidentally Alan Chandler
happened to request it just yesterday on the list:

    I have been editing a set of files to make a commit, and after editing each
    one had done a git update-index.

    At this point I am just about to commit when I realise that one of the files
    has changes in it that really ought to be a separate commit.

    So effectively, I want to do one of three things

    a) git-commit <that-file>

It's interesting to note that either of the two solutions
suggested in response to Alan might not work in general. For
example, "git reset" would not be a satisfactory solution if the
user had dirty content in any of the affected files compared to
what was staged in the index. Similarly, just removing the
safety-valve on the existing "git commit <that-file>" would commit
the wrong content if the working-tree contents of <that-file> were
dirty with respect to the index.
Now, it might still sound far-fetched to imagine wanting to commit
a subset of something staged in the index while also having dirty
content, but it occurs to me that I would actually _love_ to have
this capability. The case I would use it for is fairly common,
(and something that I think will speak to Junio who often brings
up a similar scenario).
Here's where I would like this functionality:
I receive a patch while I'm in the middle of doing other work,
(but with a clean index compared to HEAD, which is what I usually
have). The patch looks good, so I want to commit it right
away, but I do want to separate it into two or more pieces,
(commonly this is because I want to separate the "add a test
case demonstrating a bug" part from the "fix the bug"
part). So, if I could do:

    git apply --index
    git commit-index-content <files that add the test case>
    git commit-index-content

Then this would do exactly what I want. I wouldn't even have
to think about whether my local modifications are to any of
the same paths as touched by the patch.
Today, in this scenario, what I have to do is to create a
temporary branch with a clean working tree, and then use the index
to stage the commit there. That process involves a few annoyances,
(stashing my dirty work, inventing a free name for the temporary
branch (which usually involves "git branch -D tmp"), switching back
when I'm done, and trying to remember to clean up the branch). The
new capability would let me skip _all_ of that overhead and
instead I could just delight in the beauty and power of the
index. Woo-hoo!
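
For what it's worth, the effect can be approximated today with
plumbing and a scratch index, which suggests the semantics are well
defined. The following is only a sketch under stated assumptions: the
helper name commit_index_path is hypothetical, it handles a single
path that is present in the index, it requires an existing HEAD, and
it bypasses commit hooks. The file names in the demonstration are made
up.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: commit the staged content of one path only.
# Usage: commit_index_path <message> <path>
commit_index_path() {
    msg=$1
    path=$2
    # Staged entry looks like "<mode> <blob> <stage><TAB><path>".
    set -- $(git ls-files -s -- "$path")
    mode=$1
    blob=$2
    # Scratch index: HEAD's tree plus the one staged entry.
    gitdir=$(cd "$(git rev-parse --git-dir)" && pwd)
    scratch=$gitdir/index.scratch
    rm -f "$scratch"
    GIT_INDEX_FILE=$scratch git read-tree HEAD
    GIT_INDEX_FILE=$scratch git update-index --add --cacheinfo "$mode,$blob,$path"
    tree=$(GIT_INDEX_FILE=$scratch git write-tree)
    rm -f "$scratch"
    # Record the commit and advance the current branch. The real
    # index and the working tree are never touched.
    commit=$(git commit-tree "$tree" -p HEAD -m "$msg")
    git update-ref -m "commit-index-path: $msg" HEAD "$commit"
}

# Demonstration in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "code v1" > fix.c
echo "test v1" > test.sh
git add fix.c test.sh
git commit -q -m "base"

# Stage changes to both files, then dirty the working tree further.
echo "code v2" > fix.c
echo "test v2" > test.sh
git add fix.c test.sh
echo "work in progress" >> test.sh

commit_index_path "add test case" test.sh

committed_test=$(git show HEAD:test.sh)        # staged content, not the dirty file
committed_fix=$(git show HEAD:fix.c)           # still the base version
still_staged=$(git diff --cached --name-only)  # paths that remain staged
still_dirty=$(git diff --name-only)            # paths with local edits
```

After the call, a plain "git commit" would commit whatever remains
staged, and the local modification is still sitting in the working
tree.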
3. commit-working-tree-content # all tracked files
This functionality currently exists as "git commit -a" and, while
not _really_ old in git's history, its invention predates my
initial exposure to git. It has almost always been described in
terms of its implementation, ("first update the index for all
paths in the index, then commit that index").
One benefit of this description is that it forces the user to
learn about the index up front, (and gain a better understanding
of git's model). One cost is that the user is forced to learn a
two-stage implementation for a single-step process, (commit my
changes). I won't try to weigh the costs/benefits here, but
compare this to the description in (4) below.
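
The two-stage description can be checked with a small experiment.
This is a sketch with made-up file names, and it uses "git add -u" on
a copy of the index as a stand-in for "update the index for all paths
in the index". The tree produced by the hand-made two-stage process is
compared with the tree that "git commit -a" actually records.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo one > a.txt
echo one > b.txt
git add a.txt b.txt
git commit -q -m "base"

# Modify one tracked file, delete another, create an untracked one.
echo two >> a.txt
rm b.txt
echo new > untracked.txt

# Stage 1 by hand, on a copy of the index so the real one is untouched:
# update every tracked path from the working tree.
scratch=$repo/.git/index.scratch
cp .git/index "$scratch"
GIT_INDEX_FILE=$scratch git add -u
expected_tree=$(GIT_INDEX_FILE=$scratch git write-tree)
rm -f "$scratch"

# The single-step spelling.
git commit -q -a -m "commit -a"
actual_tree=$(git rev-parse 'HEAD^{tree}')
committed_paths=$(git ls-tree --name-only HEAD)

echo "expected tree: $expected_tree"
echo "actual tree:   $actual_tree"
echo "paths in HEAD: $committed_paths"
```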
4. commit-working-tree-content paths...
This functionality currently exists as "git commit paths..." and
is the newest variant of any git-commit command described here.
I think the evolution of the semantics of the "git commit
paths..." command line is very instructive. There was a
time when this command could be described in terms of a two-stage
manipulation of "the" index just like "commit -a" is described
today. That is:
Old: First update the index for all specified paths, then
commit the index.
But then the semantics were changed and the new description does
not involve the index at all:
New: Commit only the files specified on the command line.
The old behavior is still available with the --include option, but
nobody has ever come out in favor of that being a useful command,
(I agree it is not useful at all). Meanwhile, the new (default)
behavior has been strongly identified by Linus as extremely
useful. Junio has recently noticed that the old --include behavior
is more conceptually consistent with the classic, commit-the-index
definition of the core "git commit", but that's not sufficient
justification for promoting functionality that would never be
useful.
So the evolution of the current "commit paths..." shows utility of
functionality being a primary concern in defining the semantics of
git commands. And that's wonderful.
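
The old and new behaviors can be compared side by side. This is a
sketch with made-up file names, using the -i (--include) option of
"git commit" for the old behavior and the default for the new one. One
file has a staged change and another has only a working-tree change,
and the path named on the command line is the second one.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m "base"

# a.txt: staged change. b.txt: working-tree change only.
echo staged > a.txt
git add a.txt
echo edited > b.txt

# New (default) behavior: only the named path is committed.
git commit -q -m "only b" -- b.txt
only_paths=$(git diff-tree --no-commit-id --name-only -r HEAD)
staged_after_only=$(git diff --cached --name-only)

# Old behavior: the named path is added to the index, and then the
# whole index is committed, including the change staged for a.txt.
echo edited-again > b.txt
git commit -q -i -m "include b" -- b.txt
include_paths=$(git diff-tree --no-commit-id --name-only -r HEAD)

echo "default commit touched:   $only_paths"
echo "still staged afterwards:  $staged_after_only"
echo "--include commit touched: $include_paths"
```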
In my opinion, what has happened with the evolution of "commit paths"
and "commit -a" is that a new conceptual commit behavior has been
invented, (what I've termed commit-working-tree-content), but it
hasn't been recognized yet as separate from the core
commit-index-content nature of "git commit". And there's some muddling
in that simply adding a <paths...> argument to "git commit" completely
changes its semantics, (which violates the command-line convention I
described above).
-Carl