I'm planning some work on Emacs VC mode.
I need a command I can run on a path to tell if it's ignored by git.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
--
What about a variant of:
git ls-files -i -o --exclude-standard
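
One way a front end might wrap that command, as a rough Python sketch rather than anything VC actually does. The helper names (ignored_paths, is_ignored, run_git) are made up for illustration, and it assumes that passing a path after "--" restricts the listing to that path; the -z flag just makes the output NUL-separated so file names need no unquoting.

```python
import subprocess


def ignored_paths(path, run_git=None):
    """Return the ignored, untracked files git reports at or under `path`.

    `run_git` is a hypothetical hook (not part of git or VC) that takes a
    list of git arguments and returns stdout as text; it exists so the
    parsing can be exercised without a real repository.
    """
    if run_git is None:
        def run_git(args):
            # Runs git in the current directory; raises if git exits non-zero.
            return subprocess.run(
                ["git", *args], check=True, capture_output=True, text=True
            ).stdout
    out = run_git(["ls-files", "-i", "-o", "--exclude-standard", "-z", "--", path])
    # With -z, entries are NUL-terminated and never quoted.
    return [p for p in out.split("\0") if p]


def is_ignored(path, run_git=None):
    """True if git lists at least one ignored file for `path`."""
    return bool(ignored_paths(path, run_git))
```

Called as is_ignored("build/a.o") inside a work tree, this answers the yes/no question for one path; called on a directory it reports whether anything under it is ignored, which may or may not be what a front end wants.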
--
That will do nicely, thank you.
There could be something better. Emacs VC mode, and other similar
front ends, would be greatly aided by a command that lists all files,
each with a status code it can understand. Our canonical list
(omitting two that apply only to locking systems) is:
  'up-to-date    The working file is unmodified with respect to the
                 latest version on the current branch, and not locked.
  'edited        The working file has been edited by the user.
  'needs-update  The file has not been edited by the user, but there is
                 a more recent version on the current branch stored
                 in the master file.
  'needs-merge   The file has been edited by the user, and there is also
                 a more recent version on the current branch stored in
                 the master file. This state can only occur if locking
                 is not used for the file.
  'added         Scheduled to go into the repository on the next commit.
  'removed       Scheduled to be deleted from the repository on next commit.
  'conflict      The file contains conflicts as the result of a merge.
  'missing       The file is not present in the file system, but the VC
                 system still tracks it.
  'ignored       The file showed up in a dir-status listing with a flag
                 indicating the version-control system is ignoring it,
  'unregistered  The file is not under version control.
The -t mode of ls-files appears to be almost what is wanted, but not quite.
(Among other things, it does not list ignored files.) I request comment
on some related questions:
1. How do these statuses map to git terminology? My tentative map, in terms
of git ls-files -t codes, is
up-to-date = H?
edited = C
needs-update = no equivalent
needs-merge = no equivalent
added = no equivalent
removed = K
conflict = no ...

>>>>> "Eric" == Eric Raymond <esr@thyrsus.com> writes:

Eric> There could be something better. Emacs VC mode, and other similar
Eric> front ends, would be greatly aided by a command that lists all files,
Eric> each with a status code it can understand. Our canonical list
Eric> (omitting two that apply only to locking systems) is:

A lot of these don't make sense for git and other DVCS. How have hg and
bzr interpreted these "canonical" states?

For example:

Eric> 'needs-update The file has not been edited by the user, but there is
Eric> a more recent version on the current branch stored
Eric> in the master file.

This makes sense only with a file-based VCS, not a tree-based VCS like git.

Eric> 'needs-merge The file has been edited by the user, and there is also
Eric> a more recent version on the current branch stored in
Eric> the master file. This state can only occur if locking
Eric> is not used for the file.

Ditto.

Eric> 'removed Scheduled to be deleted from the repository
Eric> on next commit.

Not useful in git.

Eric> 'missing The file is not present in the file system, but the VC
Eric> system still tracks it.

Not available in git. (If it's not a real file, it can't be tracked. :)

Eric> 'ignored The file showed up in a dir-status listing with a flag
Eric> indicating the version-control system is ignoring it,
Eric> 'unregistered The file is not under version control.

These two would be identical in git.

-- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
--
That asks the question the wrong way around. These state codes
are used to change how VC *itself* performs when you fire various
commands; the VCSes called by the VC back ends never have to
'interpret' them.
It is not expected that every VCS will report all of them; in
particular, as you say, some only make sense in locking systems.
When VC knows it's dealing with a merging system, it will never go
down a logic path where a locking-related state is checked for.
I deleted two of the locking-system-only states from what you saw, but
may have missed others; I don't completely understand all the states,
because at least eleven other people hacked on VC during the 15 years
I was doing other things and added several that were not in my
original design.
(There is some excuse for this. Emacs VC is probably unique in that
its ontology has to be rich enough to accommodate *every VCS there
is*. Nothing else even attempts that, AFAIK.)
But to answer your question at least in part, here is a piece of code
mapping status codes from Mercurial's hg status -A command to Emacs
state codes.
;; Excerpt: `status' and `out' are bound by the enclosing function (not shown).
(when (eq 0 status)
  (when (null (string-match ".*: No such file or directory$" out))
    (let ((state (aref out 0)))         ; first character of the output
      (cond
       ((eq state ?=) 'up-to-date)
       ((eq state ?A) 'added)
       ((eq state ?M) 'edited)
       ((eq state ?I) 'ignored)
       ((eq state ?R) 'removed)
       ((eq state ?!) 'missing)
       ((eq state ??) 'unregistered)
       ((eq state ?C) 'up-to-date) ;; Older mercurials use this
       (t 'up-to-date)))))
This is failing to report at least one interesting state,
which is 'conflict. But otherwise it looks pretty complete.
What I'm really looking for is a git functional equivalent of hg status -A.
The git backend presently uses diff-index and interprets the output in
a way that I fear is rather brittle.
I'm inclined to think you are right that 'need-update and ...

This isn't about file vs tree, but more about centralized vs distributed.

In DVCS workflows "needs-update" as a concept does not even exist when
you are working on a topic branch to perfect one thing and one thing
only. You do not want to update only because somebody else did some
work that may be totally unrelated to what you wanted to achieve on the
current branch.

I presume that many people use git in centralized workflow where they
use only 'master' branch and "git pull ; work ; git commit; git push"
are the only things they do. In that setting, "needs-update" may make
sense. The VC backend implementation has to do "git fetch" to see if
the origin has advanced.

Almost the same comment applies to 'needs-merge', but the VC backend
not only needs to worry about "file has been edited", but also "commits
that

Ignored is a subset of Unregistered, no? Neither exists in the index
(i.e. not tracked); ignored ones are covered by .gitignore and you need
to force "git add" to start tracking them.
--
There is also

In Git you don't have locking, but you have three versions: in the
working area (the working file), in the index, and latest version on
the current branch (the HEAD version). So 'up-to-date in Git would
probably mean working tree = cached = HEAD

Does this include stat-dirty files, i.e. if file has been modified
(mtime), but the contents are the same in working file and in HEAD?

Needs *update* looks like it came from centralized VCS like CVS and
Subversion, where you use update-then-commit method. You can't say
that HEAD version is more recent than working file...

The rough equivalent would be that upstream branch for current branch
(e.g. 'origin/master' can be upstream for 'master' branch) is in
fast-forward state i.e. current branch is direct ancestor of

This, like 'needs-update, looks like it is relevant only in

Note that with Git you can have other merge conflict than simple
CONFLICT(contents). With CONFLICT(rename/rename) for example the file
would not contain textual conflict, so e.g. it won't have conflict

Note that file might be missing only in working directory, and can be

Probably 'conflict.

-- Jakub Narebski
Poland
ShadeHawk on #git
--
Not documented in my installed version, 1.6.3.3. Where can I go in the
Yes, that was what I thought. Is this what ls-files is reporting as 'H'?
(The ls-files -t codes need better documentation. If I get detailed enough
No, it does not. Thank you for asking that question, I have just
added a note about this to the VC code exactly where it will do the
Agreed. But there's no way to tell that this is the case without
doing a pull operation or otherwise querying origin, and I'm
not going to do that.
Explanation: My general rule for DVCS back ends is that the status commands
aren't allowed to do network operations, and it's OK for them not to
report a state code if that would be required. This is so VC will fully
support disconnected operation when the VCS does.
I have, however, added a note to vc-git.el explaining that this is
possible if we ever teach the mode front end to behave differently when
Following your previous logic, I think it would make sense to set this if
we could detect that the upstream of the current branch has forward commits
touching this file. Again, this would require a network operation in the
It is unclear what Emacs wants in this situation; I will try to find out.
The documentation says this:
For now the conflicts are text conflicts. In the
future this might be extended to deal with metadata
conflicts too.
That was my best guess too. Can anyone say more definitely?
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
--
http://thread.gmane.org/gmane.comp.version-control.git/126516

In short, "git ls-files -t" was written long ago, never tested, and
probably mostly used by no one. It has a very strange behavior, it's
not just the doc. I'd advise against using it.

"git status --porcelain" is probably what you want:

    --porcelain
        Give the output in a stable, easy-to-parse format for scripts.
        Currently this is identical to --short output, but is guaranteed
        not to change in the future, making it safe for scripts.

-- Matthieu Moy http://www-verimag.imag.fr/~moy/
--
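
For anyone wanting to see what parsing that format could look like, here is a small Python sketch that turns "git status --porcelain" lines into the VC state names listed earlier in the thread. The mapping is a guess made for illustration, not what vc-git.el does: the function names are invented, renames and copies are lumped into 'edited, and the "!!" code only shows up when ignored files are requested (e.g. with --ignored, in git versions that support it). Files git does not list at all would be 'up-to-date. Paths with unusual characters are quoted in this format, which the sketch does not undo; the -z variant avoids that.

```python
# Two-letter codes that the short/porcelain format uses for unmerged paths.
CONFLICT_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def vc_state(xy):
    """Map a two-character porcelain status to a VC state name (assumed mapping)."""
    if xy in CONFLICT_CODES:
        return "conflict"
    if xy == "??":
        return "unregistered"
    if xy == "!!":
        return "ignored"
    index, worktree = xy[0], xy[1]
    if worktree == "D":
        return "missing"      # gone from the work tree, deletion not staged
    if index == "D":
        return "removed"      # deletion staged for the next commit
    if index == "A":
        return "added"
    return "edited"           # M, R, C and anything else that is listed


def parse_porcelain(text):
    """Return {path: state} for every line of porcelain output."""
    states = {}
    for line in text.splitlines():
        if len(line) < 4:
            continue
        xy, path = line[:2], line[3:]
        # Renames and copies are shown as "ORIG -> NEW"; keep the new name.
        if xy[0] in "RC" and " -> " in path:
            path = path.split(" -> ", 1)[1]
        states[path] = vc_state(xy)
    return states
```

Whether a file that is staged as added and then deleted from disk should count as 'added or 'missing is a judgment call; the sketch picks 'missing because the work tree check comes first.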
It sounds very much to me as though this feature should be scheduled

Yes, this looks like what I would want, all right - if the status codes
were actually *comprehensible*!

We should tackle this right now, because VC is not the last front end
that will need to parse the format and at least I am willing to patch
your docs based on what I learn. Most of your other customers won't do
that.

I'm going to start a separate thread about this.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
--
Eric,

I am working on a similar program (not ready for announcing yet). I
have not gotten to the part that would need this, but I would be happy
to start planning that stage and work with you to make sure that this
feature met both of our needs, and help write the documentation if need
be.

(Sorry for the double everyone in To/Cc, gmail defaulted to HTML email
and it was rejected from the list. I had to To/Cc you all again so
that Reply All from list members would work as expected.)

Daniel
http://www.doomstick.com
--
I'm willing to cooperate.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
--
It was added primarily for Cogito, which is presumably dead by now.
--
It was *documented* in git version 1.7.0 in
7c9f703 (commit: support alternate status formats, 2009-09-05)
I am running git version 1.7.0.1.
BTW. it is only since git 1.7.0 that "git status" is no longer
"git commit --dry-run"... and has sane behaviour wrt. specifying paths.
Actually it would not require network access, but it would require extra
work, and equivalents of 'needs-update and 'needs-merge would not exist
in all cases (in all situations).
In Git you have remote-tracking branches, which track where the
branches in a remote repository point. For quite some time they have
by default resided in the 'refs/remotes/<remote>/' namespace, while
ordinary local branches live in the 'refs/heads/' namespace. For
example the remote-tracking branch 'refs/remotes/origin/master',
usually referred to in short as 'origin/master', tracks (follows)
branch 'master' ('refs/heads/master') in remote 'origin'. Those
branches might be out of date with respect to the remote repository,
and to update them you need a network connection.
Local branches can be created to "track" other branches, to base work
on the other branches. In particular you need to create local branch
which "tracks", or in other words has as 'upstream' some remote-tracking
branch, as you cannot work on non-local branch (outside 'refs/heads/'
namespace).
Now, *if* you are on a branch with some upstream, you can check without
any network operation what "git pull" would do if there were no new
changes in the remote, which means what "git merge <upstream>" would
do (pull = fetch + merge).
We can check whether the remote-tracking branch that is upstream of the
current branch modified the current file. We can also check whether the
remote-tracking branch is in a fast-forwardable state wrt. the current
branch (the equivalent of the 'needs-update state, I guess), or whether
it has diverged from the current branch (the equivalent of the
'needs-merge state, I guess).
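
As a rough illustration of the branch-level half of that check, here is a Python sketch. It assumes the two numbers come from something like "git rev-list --left-right --count HEAD...@{upstream}" (in git versions that support --count and @{upstream}), which prints the commits only on the current branch and the commits only on the upstream, separated by a tab. The function names are invented, the verdict is per branch rather than per file, and it reflects the remote only as of the last fetch.

```python
def parse_left_right_count(output):
    """Parse the "<left>\t<right>" line into (ahead, behind) integers."""
    ahead, behind = output.split()
    return int(ahead), int(behind)


def branch_state(ahead, behind):
    """Classify the current branch relative to its upstream.

    ahead  = commits only on the current branch
    behind = commits only on the remote-tracking (upstream) branch
    """
    if behind == 0:
        # Nothing new upstream; local-only commits do not need a merge.
        return "up-to-date"
    if ahead == 0:
        # Upstream has new commits and we have none: fast-forwardable.
        return "needs-update"
    # Both sides have commits the other lacks: the branches have diverged.
    return "needs-merge"
```

A per-file answer would additionally need to ask whether the upstream-only commits touch the file in question.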
All this without need for network operation... but all this based on
current information that ...

You can query the origin _as it was on the last fetch_.
If you are on branch X, the logic is as follows:
- Let R be the value of configuration key branch.X.remote,
- let M be the value of configuration key branch.X.merge,
- for all values S of configuration key remote.R.fetch,
- strip an initial +
- if S is M:N, return N
- if S is P/*:Q/* where P is a prefix of M, take M, replace this
prefix with Q and return the result
In the most common case you will have:
- X = master
- R = origin
- M = refs/heads/master
- one key S = +refs/heads/*:refs/remotes/origin/*
so the prefix "refs/heads/" is replaced with "refs/remotes/origin/" and
the result is refs/remotes/origin/master.
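
The lookup above can be written down in a few lines of Python. This is only a reading of the steps as listed, not code taken from git; resolve_upstream is an invented name, and the caller is assumed to have already fetched M (branch.X.merge) and the list of remote.R.fetch values from the configuration.

```python
def resolve_upstream(merge_ref, fetch_refspecs):
    """Return the remote-tracking ref for `merge_ref`, or None if no refspec matches.

    merge_ref      -- value of branch.X.merge, e.g. "refs/heads/master"
    fetch_refspecs -- all values of remote.R.fetch
    """
    for spec in fetch_refspecs:
        # Strip an initial "+" (the force flag).
        if spec.startswith("+"):
            spec = spec[1:]
        src, sep, dst = spec.partition(":")
        if not sep:
            continue  # no destination side, nothing to map to
        # Exact form M:N.
        if src == merge_ref:
            return dst
        # Wildcard form P/*:Q/* where P/ is a prefix of M.
        if src.endswith("/*") and dst.endswith("/*"):
            prefix = src[:-1]            # keep the trailing slash
            if merge_ref.startswith(prefix):
                return dst[:-1] + merge_ref[len(prefix):]
    return None
```

With the common-case values given above, resolve_upstream("refs/heads/master", ["+refs/heads/*:refs/remotes/origin/*"]) gives "refs/remotes/origin/master".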
Paolo
--
BTW, this procedure is complex enough that we have exposed it via a
plumbing interface:

  $ git for-each-ref --format='%(upstream)' refs/heads/master
  refs/remotes/origin/master

which does all of the correct magic internally.

-Peff
--
I personally use Magit [1]. Just thought you might want to look at it.

-- Ram

[1] http://zagadka.vm.bytemark.co.uk/magit/
--
Eric might be a bit too personally invested in vc.el at this point :)

But yeah, magit is great; unlike vc-dir and vc, it makes really good use
of Git's index & stash features. Instead of staging individual files
for commit you stage chunks; the quality and granularity of my commits
have gone up since I switched to it from vc due to that.

But to help with the original question: magit has an ignore feature,
but it doesn't check whether something is ignored; it just counts on
you not ignoring already ignored stuff because it isn't displayed to
you. Depending on how you're planning to implement .gitignore support
you might want to go this route.
--
Well, there's that, and then there's the fact that I really do use
multiple VCSes. Consistent interface for all of them -> win.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
--
