I'm going to gripe a lot in this mail, possibly verging on flaming.
Therefore I want to start by making clear that I am not here to
complain without pitching in to help fix the problems. If I can get
responsive answers to my questions, I will take responsibility for
editing them into the relevant git documentation.
Short version: "git status --porcelain" is horribly badly documented
and appears to be seriously maldesigned. Both these problems need to
be fixed before git causes a lot of unnecessary grief for people
trying to use it.
Here is the entire documentation on this feature in HEAD:
=============================================================================
In short-format, the status of each path is shown as
XY PATH1 -> PATH2
where `PATH1` is the path in the `HEAD`, and ` -> PATH2` part is
shown only when `PATH1` corresponds to a different path in the
index/worktree (i.e. renamed).
For unmerged entries, `X` shows the status of stage #2 (i.e. ours) and `Y`
shows the status of stage #3 (i.e. theirs).
For entries that do not have conflicts, `X` shows the status of the index,
and `Y` shows the status of the work tree. For untracked paths, `XY` are
`??`.
    X          Y     Meaning
    -------------------------------------------------
              [MD]   not updated
    M        [ MD]   updated in index
    A        [ MD]   added to index
    D        [ MD]   deleted from index
    R        [ MD]   renamed in index
    C        [ MD]   copied in index
    [MARC]           index and work tree matches
    [ MARC]     M    work tree changed since index
    [ MARC]     D    deleted in work tree
    -------------------------------------------------
    D           D    unmerged, both deleted
    A           U    unmerged, added by us
    U           D    unmerged, deleted by them
    U           A    unmerged, added by them
    D           U    unmerged, deleted by us
    A           A    unmerged, both added
    U           U    unmerged, both ...

These all fall within "Patches welcome" category (meaning: I agree the
Is that DD really "illustrative", or did you mean to say "only/sole"?
You should never get "DD" in non-conflicting case. I think I was fairly
careful not to make them ambiguous when I did that code, but apparently I
wasn't so careful about the documentation.
Thanks for going through this area with a fine comb.
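
For readers writing a parser, the XY table quoted above can be read mechanically. The following is a minimal sketch, not code from git: the function name `classify` and the three labels it returns are made up for illustration. It relies on the statement above that "DD" never appears in the non-conflicting case, so the seven unmerged pairs can be matched first.

```python
# The seven unmerged XY pairs listed in the quoted table.
UNMERGED = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def classify(xy):
    """Split a two-character status code into (kind, x, y).

    kind is one of the hypothetical labels "untracked", "unmerged"
    or "tracked"; x and y are the raw status characters.
    """
    if len(xy) != 2:
        raise ValueError("status code must be exactly two characters")
    if xy == "??":
        return ("untracked", "?", "?")
    if xy in UNMERGED:
        # X is stage #2 (ours), Y is stage #3 (theirs).
        return ("unmerged", xy[0], xy[1])
    # Otherwise X is the index status and Y the work tree status.
    return ("tracked", xy[0], xy[1])
```

For example, `classify(" M")` reports a tracked path whose work tree copy changed since the index, while `classify("UD")` reports an unmerged path deleted by them.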
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index 1cab91b..313dd04 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -86,7 +86,7 @@ and `Y` shows the status of the work tree. For untracked paths, `XY` are
               [MD]   not updated
     M        [ MD]   updated in index
     A        [ MD]   added to index
-    D        [ MD]   deleted from index
+    D         [ M]   deleted from index
     R        [ MD]   renamed in index
     C        [ MD]   copied in index
     [MARC]           index and work tree matches
--
Note that "status --porcelain" is brand new in v1.7.0, so you may be
among the first to be seriously reading the documentation. As Junio
said, I think patches in this area are very welcome.

My answers below are meant to help you understand. I omitted the "...and
yes, this should be documented better" from the end of each, but you can

It's a space. But more importantly, the path columns are actually
C-quoted. E.g.:

  $ perl -e 'open foo, ">", "foo\n"'
  $ git add .
  $ git status --porcelain
  A  "foo\n"

If your parser supports it, it will almost certainly be easier to use
"-z":

  $ git status --porcelain -z | cat -A
  A  foo$
  ^@

Do note that for the 'R'ename status, you will get _two_ NUL-terminated
entries, and they will be in the order of "to\0from\0", whereas the
non-NUL form is "from -> to" (and no, I doubt this is adequately

They are the same as in "git diff --name-status", which in turn has kind

The terms "us / ours" and "them / theirs" are frequently used in the git
documentation. I'm not sure if they are ever defined rigorously. They
are only meaningful in a merging context, and basically refer to the two
sides of a merge. If I am on branch "master" and do "git merge foo",
then "us" refers to the master branch and the contents of index stage 2
(bear with me a moment, I'll define that in a second). "Them" refers to
branch "foo" and index stage 3.

Git's "index" is where it keeps uncommitted state about files it tracks
(sort of like CVS/Entries, if that helps, except that git exposes the
concept much more). Most of the time, you use it for building a commit
incrementally. You "git add" files to the index, and then "git commit"
creates a new commit from the contents of your index.

But the index actually has several different slots for each file entry,
which are called stages, and each has a number. "Stage #0" is the
"normal" stage, which you use as described in the last paragraph. During
a merge, entries with conflicts use the other ...
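
The "-z" layout described in this message can be consumed with a plain split on NUL. What follows is a minimal sketch under stated assumptions, not git's own code: the function name `parse_porcelain_z` is hypothetical, paths are kept as raw bytes, and treating 'C' (copied) the same way as 'R' is an assumption that the message does not state.

```python
def parse_porcelain_z(data):
    """Parse the bytes printed by "git status --porcelain -z".

    Returns a list of (xy, path, orig_path) tuples. orig_path is None
    unless the entry is a rename (or, by assumption, a copy).
    """
    fields = data.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # the final NUL leaves one empty trailing field

    entries = []
    i = 0
    while i < len(fields):
        field = fields[i]
        i += 1
        # Each entry is "XY", one space, then the unquoted path.
        if len(field) < 4 or field[2:3] != b" ":
            raise ValueError("malformed entry: %r" % field)
        xy = field[:2].decode("ascii")
        path = field[3:]
        orig_path = None
        if xy[0] in ("R", "C"):
            # Order is "to\0from\0": the next field is the old path.
            if i >= len(fields):
                raise ValueError("missing source path for %r" % path)
            orig_path = fields[i]
            i += 1
        entries.append((xy, path, orig_path))
    return entries
```

Because nothing is C-quoted in this mode, a path containing a newline (as in the "foo\n" example) comes through as-is.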
"git clean -n -d" may help. Just my 2¢, Jonathan --
err, "git clean -n -d -X". I am also not sure how stable the "Would remove " output format is, or how stable we want it to be. Probably not stable at all, so sorry about that. Jonathan --
That's the same information, isn't it? You do "git clean -ndX" to see _everything_ that is untracked, and "git clean -nd" to see things that are untracked but not ignored. So I think it is just as painful to use as ls-files, but as you noted, it is not really plumbing. -Peff --
No, the capital X tells clean to only list excluded files. The standard use is as a poor man’s “make maintainer-clean”, leaving unrelated files alone. I only learned about it just now. I’m glad I did (I often use the lowercase version for this because I just didn’t know about -X), but as you mentioned, it is not so applicable here because not plumbing. Jonathan --
Ah, I read it as "-x" (probably because I had never heard of "-X" either...). So yes, it would do the right thing. I still think a --show-ignored option to git-status would probably be better (in addition to being sanctioned plumbing, it means we only have to traverse the tree once

The "-X" mode seems much safer to me, as you are less likely to blow away things you actually wanted to keep while cleaning the tree of crufty build products. It seems like it should have been the easier-to-type "-x", but it is far too late for such bikeshedding at this point. Thanks for the pointer. -Peff --
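
Until something like the --show-ignored option suggested here exists, a script can ask ls-files for the ignored paths instead of scraping the "Would remove " lines from git clean. This is a minimal sketch, assuming the standard exclude files are what the caller wants; the helper names `split_nul` and `list_ignored` are made up for illustration.

```python
import subprocess


def split_nul(data):
    """Split NUL-terminated output into a list of raw byte paths."""
    return [p for p in data.split(b"\0") if p]


def list_ignored(repo="."):
    """Return untracked paths that the standard ignore rules exclude."""
    result = subprocess.run(
        ["git", "ls-files", "--others", "--ignored",
         "--exclude-standard", "-z"],
        cwd=repo,
        stdout=subprocess.PIPE,
        check=True,  # raise if git exits with an error
    )
    return split_nul(result.stdout)
```

This still costs a second traversal of the tree on top of "git status", which is the drawback noted above.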
They do that quite well. Thank you. I've got a couple of other things on my plate, including prepping for a GPSD point release early next week, so I can't respond immediately. Expect a response and some patches Tuesday, Wednesday, or Thursday of next week. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> --
Not true. If the second form was used, then you _can_ split on \0. It will tokenise the data for you, and then you consume either two or three tokens depending on the status flags. So it would make the parsing simpler. But to make it even easier, how about adding a -Z that makes the output format "XY\0file1\0[file2]\0" (i.e. always three tokens per record, with the third token being empty if there is no second filename)? Though if future expandability was wanted you could end each record with \0\0, and then parsing would be two stages: split on \0\0 for records and then split on \0 for entries? There is already precedent for the -z option changing the output format, so a second similar switch should be ok? Then the updated documentation could recommend --porcelain -Z for new users without affecting old ones. -- Julian --
+1 -Z could fix some of the other issues, as well, like use of space as a flag character. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> --
On Sat, Apr 10, 2010 at 11:35 PM, Julian Phillips
Surely that won't work - if file2 can be empty, \0[file2]\0 reduces to \0\0 which would be confused with the \0\0 proposed as a record separator. jon. --
On Sun, 11 Apr 2010 00:56:47 +1000, Jon Seymour <jon.seymour@gmail.com>
Yes. But they were alternative suggestions, so if using \0\0 as the record marker you would omit the second filename when empty as is currently done. -- Julian --
On Sun, Apr 11, 2010 at 1:50 AM, Julian Phillips
Ah, apologies. I appear to have failed to parse a necessary disjunctive :-) jon. --
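
To make the second alternative concrete: with \0\0 ending each record and an absent second filename simply omitted, the two-stage split looks like the sketch below. It is only an illustration of a format that was proposed in this thread and never implemented; the function name `parse_double_nul` is hypothetical.

```python
def parse_double_nul(data):
    """Parse records of the form XY\\0file1[\\0file2]\\0\\0.

    Returns a list of (xy, file1, file2) tuples, with file2 set to
    None when the record carries only one filename.
    """
    records = []
    # Stage one: split the stream into records on the double NUL.
    for raw in data.split(b"\0\0"):
        if not raw:
            continue  # remainder after the final record terminator
        # Stage two: split each record into its fields on a single NUL.
        parts = raw.split(b"\0")
        if len(parts) not in (2, 3):
            raise ValueError("malformed record: %r" % raw)
        xy = parts[0].decode("ascii")
        file2 = parts[2] if len(parts) == 3 else None
        records.append((xy, parts[1], file2))
    return records
```

The split is unambiguous only because no field is ever empty. That is why this variant has to omit an absent file2 rather than emit it as an empty token.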
Add a new output format option to git-status that is a more extreme
form of the -z output that places a NUL between all parts of the
record, and always has three entries per record, even when only two
are relevant. This makes the parsing of --porcelain output much
simpler for the consumer.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
---
Something like this for the first variant (fixed three entries per record)
perhaps ... (though a proper patch would probably want some tests too)
builtin/commit.c | 6 ++++--
wt-status.c | 19 ++++++++++++++-----
2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index c5ab683..acbcefc 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1025,8 +1025,10 @@ int cmd_status(int argc, const char **argv, const char *prefix)
OPT_SET_INT(0, "porcelain", &status_format,
"show porcelain output format",
STATUS_FORMAT_PORCELAIN),
- OPT_BOOLEAN('z', "null", &null_termination,
- "terminate entries with NUL"),
+ OPT_SET_INT('z', "null", &null_termination,
+ "terminate entries with NUL", 1),
+ OPT_SET_INT('Z', "intense-null", &null_termination,
+ "use NUL for all separators, including absent values", 2),
{ OPTION_STRING, 'u', "untracked-files", &untracked_files_arg,
"mode",
"show untracked files, optional modes: all, normal, no. (Default: all)",
diff --git a/wt-status.c b/wt-status.c
index 8ca59a2..9f23ec6 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -663,7 +663,9 @@ static void wt_shortstatus_unmerged(int null_termination, struct string_list_ite
case 7: how = "UU"; break; /* both modified */
}
color_fprintf(s->fp, color(WT_STATUS_UNMERGED, s), "%s", how);
- if (null_termination) {
+ if (null_termination == 2) {
+ fprintf(stdout, "%c%s%c%c", 0, it->string, 0, 0);
+ } else if (null_termination) {
fprintf(stdout, " %s%c", it->string, 0);
} else {
struct strbuf onebuf = STRBUF_INIT;
@@ -687,14 ...

If you're open to changing this to lose the exiguous "-> " and use "-" instead of " " as a status character, that would make me happy and fix the rest of the design problems with the format. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> --
If you use "--porcelain -Z" then you don't get the "->"; the format is always XY<NUL><file1><NUL><file2><NUL>, with <file2> being an empty string if only file1 is relevant. I didn't use "-" instead of " " as that seemed out of scope for an output formatting option. Though I don't personally have an objection to it, I also don't see a particularly strong need for it, as with the -Z format there is no ambiguity. If you're talking about the output without -Z, then changing the format raises compatibility issues, and we're talking about something more like --porcelain2 or --porcelain=new, and I don't know if that would be considered acceptable. -- Julian --
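
With exactly three NUL-terminated tokens per record, a consumer of the proposed -Z output needs no knowledge of the status letters at all. The following is a minimal sketch of such a consumer, assuming the layout stated above; -Z exists only in the patch under discussion, and the function name `parse_intense_null` is made up for illustration.

```python
def parse_intense_null(data):
    """Parse XY<NUL>file1<NUL>file2<NUL> records (proposed -Z format).

    Returns a list of (xy, file1, file2) tuples; an empty file2 token
    is reported as None.
    """
    fields = data.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # the last NUL leaves one empty trailing field
    if len(fields) % 3 != 0:
        raise ValueError("expected three tokens per record")
    records = []
    for i in range(0, len(fields), 3):
        xy = fields[i].decode("ascii")
        file1 = fields[i + 1]
        file2 = fields[i + 2] or None  # empty token means "absent"
        records.append((xy, file1, file2))
    return records
```

Compared with plain -z, the record boundary no longer depends on whether X is 'R', which is the simplification the option is meant to buy.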
Good point. OK, the combination of -Z and a switch to list ignored files should solve Emacs VC's problem. Having some sort of JSON dump might still not be a bad idea. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> --
This adds a --json switch to status, which enables a json output
format. This provides a standard output format that should be easily
parsed by scripts using any of the large number of readily available
json libraries.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
---
Starter for 10 ...
builtin/commit.c | 10 ++++
wt-status.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
wt-status.h | 1 +
3 files changed, 143 insertions(+), 0 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index c5ab683..f2b5cfa 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -91,6 +91,7 @@ static enum {
STATUS_FORMAT_LONG,
STATUS_FORMAT_SHORT,
STATUS_FORMAT_PORCELAIN,
+ STATUS_FORMAT_JSON,
} status_format = STATUS_FORMAT_LONG;
static int opt_parse_m(const struct option *opt, const char *arg, int unset)
@@ -422,6 +423,9 @@ static int run_status(FILE *fp, const char *index_file, const char *prefix, int
case STATUS_FORMAT_PORCELAIN:
wt_porcelain_print(s, null_termination);
break;
+ case STATUS_FORMAT_JSON:
+ wt_json_print(s);
+ break;
case STATUS_FORMAT_LONG:
wt_status_print(s);
break;
@@ -1025,6 +1029,9 @@ int cmd_status(int argc, const char **argv, const char *prefix)
OPT_SET_INT(0, "porcelain", &status_format,
"show porcelain output format",
STATUS_FORMAT_PORCELAIN),
+ OPT_SET_INT(0, "json", &status_format,
+ "show json output format",
+ STATUS_FORMAT_JSON),
OPT_BOOLEAN('z', "null", &null_termination,
"terminate entries with NUL"),
{ OPTION_STRING, 'u', "untracked-files", &untracked_files_arg,
@@ -1068,6 +1075,9 @@ int cmd_status(int argc, const char **argv, const char *prefix)
case STATUS_FORMAT_PORCELAIN:
wt_porcelain_print(&s, null_termination);
break;
+ case STATUS_FORMAT_JSON:
+ wt_json_print(&s);
+ break;
case STATUS_FORMAT_LONG:
s.verbose = verbose;
wt_status_print(&s);
diff --git a/wt-status.c b/wt-status.c
index ...