Re: ghost refs

From: John Dlugosz
Subject: ghost refs
Date: Wednesday, April 7, 2010 - 9:38 am

A couple times I've seen people who have some reference remotes/origin/foo after foo has been removed from origin.  What is the proper way to address that, other than removing the file directly?  It appears to not go away with a "fetch" even though it was deleted from the origin.  So what is the proper way to delete something on the origin so the deletion propagates?  I normally use "git push origin :foo".

--John
(sorry about the footer)

TradeStation Group, Inc. is a publicly-traded holding company (NASDAQ GS: TRAD) of three operating subsidiaries, TradeStation Securities, Inc. (Member NYSE, FINRA, SIPC and NFA), TradeStation Technologies, Inc., a trading software and subscription company, and TradeStation Europe Limited, a United Kingdom, FSA-authorized introducing brokerage firm. None of these companies provides trading or investment advice, recommendations or endorsements of any kind. The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.
--

From: Avery Pennarun
Date: Wednesday, April 7, 2010 - 9:58 am

This is on purpose, based on the theory that you don't want to lose
data from your local repo just because someone (accidentally?) deletes
a branch on the remote server.  Unfortunately, this theory is a bit
flawed, since someone could just as easily overwrite the remote branch
with a totally different commit, and you'd still lose it in *that*
case.  So mostly it's just confusing.

Anyway, what you want is "git remote prune origin".
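
The situation can be reproduced in a throwaway directory. What follows is only a sketch: the repository names `origin` and `clone`, the branch `foo`, and the helper `ghost_ref_demo` are made up for illustration. It assumes a git new enough to support `git -C` and a configuration in which a plain fetch does not prune.

```shell
#!/bin/sh
# Sketch: reproduce a stale ("ghost") remote-tracking ref, then prune it.
# All names here (origin, clone, foo, ghost_ref_demo) are illustrative.
ghost_ref_demo() (
    set -e
    cd "$(mktemp -d)"

    # A stand-in "origin" with one commit and a branch named foo.
    git init -q origin
    git -C origin -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "initial commit"
    git -C origin branch foo

    # The clone gets a tracking ref refs/remotes/origin/foo.
    git clone -q origin clone

    # Delete foo on the origin side, then fetch without pruning.
    git -C origin branch -q -D foo
    git -C clone fetch -q origin
    echo "after fetch:"
    git -C clone for-each-ref --format='%(refname)' refs/remotes/origin/foo

    # Prune tracking refs whose branch no longer exists on origin.
    git -C clone remote prune origin >/dev/null
    echo "after prune:"
    git -C clone for-each-ref --format='%(refname)' refs/remotes/origin/foo
)

ghost_ref_demo
```

Under those assumptions the ref should still be listed after the fetch and no longer listed after the prune.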

Have fun,

Avery
--

From: Jeff King
Date: Wednesday, April 7, 2010 - 2:00 pm

You do have a reflog in the case of overwrite. Delete kills off any
associated reflog (it would be cool if we had a "graveyard" reflog that

Yep. I think there is "git fetch --prune" these days, too. We could
perhaps add a config option if there isn't one already (I didn't look)
so this happens automatically.

-Peff
--

From: John Dlugosz
Date: Wednesday, April 7, 2010 - 3:00 pm

[Empty message]
From: Avery Pennarun
Date: Wednesday, April 7, 2010 - 3:03 pm

This used to be true, but I have confirmed that with the latest
version of git, remote refs have reflogs (as they should for safety).

Have fun,

Avery
--

From: John Dlugosz
Date: Wednesday, April 7, 2010 - 3:10 pm

--

From: Avery Pennarun
Date: Wednesday, April 7, 2010 - 3:11 pm

Why not try it and find out?

I've never asked for a reflog explicitly and I seem to get them.

Avery
--

From: Jeff King
Date: Wednesday, April 7, 2010 - 9:30 pm

We create logs for remote branches when core.logallrefupdates is set
since e19b9dd (core.logallrefupdates: log remotes/ tracking branches.,
2006-12-28).

We turned on logallrefupdates by default in non-bare repositories in
0bee591 (Enable reflogs by default in any repository with a working
directory., 2006-12-14).

Both were in v1.5.0. So it used to not be the case that we created such
reflogs, but it has been for quite some time.
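
One way to check this on a given installation is sketched below. The repository names `upstream` and `work` and the helper `remote_reflog_demo` are hypothetical. The sketch assumes a default, non-bare clone and a git that supports `git -C`.

```shell
#!/bin/sh
# Sketch: check that a default non-bare clone logs remote-tracking refs.
# Names (upstream, work, remote_reflog_demo) are illustrative.
remote_reflog_demo() (
    set -e
    cd "$(mktemp -d)"

    git init -q upstream
    git -C upstream -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "first"
    git clone -q upstream work

    # Move the upstream branch forward, then fetch the update.
    git -C upstream -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "second"
    git -C work fetch -q origin

    # Setting recorded in the clone's own configuration.
    git -C work config --bool core.logallrefupdates

    # Reflog of the remote-tracking branch that the fetch just updated.
    branch=$(git -C work symbolic-ref --short HEAD)
    git -C work reflog show "origin/$branch"
)

remote_reflog_demo
```

If reflogs are enabled as described, the first line printed is the config value and the remaining lines are the reflog entries written by the fetch.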

-Peff
--

From: John Dlugosz
Date: Thursday, April 8, 2010 - 9:07 am

Thanks, that's good to know, and reassuring since I was worried that some front-end tools were not using the flags as described in the tutorials.

Who maintains the documentation?

In git-branch,

	-l

	    Create the branch's reflog. This activates recording of all changes made to the branch ref, enabling use of date based sha1 expressions such as "<branchname>@{yesterday}".

the implication is that if you don't use this flag, the feature is not enabled or activated.


I do see upon reviewing the User's Manual etc. that it no longer tells you to use the -l flag.  That must have been revised since I learned it, a year or two ago.

--John







From: Junio C Hamano
Date: Thursday, April 8, 2010 - 9:55 am

That is how you selectively enable reflog for that particular branch when
you have explicitly disabled "reflog by default" with the configuration.
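
A rough illustration of that use follows, with the caveat that the option's spelling has changed over time: the thread is about `-l`, but later Git releases use `-l` for `--list` and spell this option `--create-reflog`, which is what the sketch uses. The branch names `plain` and `logged` and the helper `branch_reflog_demo` are made up for the example, and `git reflog exists` is assumed to be available.

```shell
#!/bin/sh
# Sketch: enable a reflog for one branch while reflogs are off by default.
# Names (branch_reflog_demo, plain, logged) are illustrative.
branch_reflog_demo() (
    set -e
    cd "$(mktemp -d)"
    git init -q .
    git config core.logallrefupdates false
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "initial commit"

    git branch plain                    # no reflog is created
    git branch --create-reflog logged   # reflog requested explicitly

    for b in plain logged; do
        if git reflog exists "refs/heads/$b"; then
            echo "$b: reflog"
        else
            echo "$b: no reflog"
        fi
    done
)

branch_reflog_demo
```

With reflogs disabled in the configuration, only the branch created with the explicit option should report a reflog.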
--

From: Jeff King
Date: Thursday, April 8, 2010 - 12:49 pm

Maybe:

-- >8 --
Subject: [PATCH] docs: clarify "branch -l"

This option is mostly useless these days because we turn on
reflogs by default in non-bare repos.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-branch.txt |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 903a690..d78f4c7 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -72,6 +72,8 @@ OPTIONS
 	Create the branch's reflog.  This activates recording of
 	all changes made to the branch ref, enabling use of date
 	based sha1 expressions such as "<branchname>@\{yesterday}".
+	Note that in non-bare repositories, reflogs are usually
+	enabled by default by the `core.logallrefupdates` config option.
 
 -f::
 --force::
-- 
1.7.1.rc0.248.g055378.dirty

--

From: Junio C Hamano
Date: Thursday, April 8, 2010 - 1:42 pm

That certainly is an improvement, but I've been wondering if it makes
sense to also have a section in each command's manual page listing the
configuration variables that affect the behaviour of the command.
core.logallrefupdates surely is not the only variable that affects how
"git branch" behaves.

We might want to have a general consensus on what we want to have in the
documentation.  As you noted, some have too sparse a SYNOPSIS, while others
have a full list of options.  Some mention configuration variables, while
others don't.  Some have extensive examples, while others lack any.
Once we know the general direction in which we are going, we can hand off
the actual documentation updates to the crowd ;-)

I'll list my preference off the top of my head as a firestarter.

NAME::

The name followed by what it is used for

SYNOPSIS::

I prefer to have an (almost) complete set of options in SYNOPSIS, rather than
"command [<options>] <args>..." which is next to useless.  This is
especially true for commands whose one set of options is incompatible with
other set of options and arguments (e.g. there is no place for "-b" to
"checkout" that checks out paths out of the index or a tree-ish).

I also prefer not to list "purely for backward compatibility" options in
SYNOPSIS section.

DESCRIPTION::

The description section should first state what the command is used for,
iow, in which situation the user might want to use that command.

OPTIONS::

Full list of options.  Some existing pages list them alphabetically, while
others list them in functional groups.  I prefer the latter, which tends to
make the page more concise, and is more suited for people who got used to
the system (and remember, nobody stays a newbie forever, and people
who stay newbies forever are not our primary audience).

Detailed discussion of concepts::

Some manual pages need to have discussion of basic concepts that would not
be a good fit for the DESCRIPTION section (e.g. "Detached HEAD" section in
"checkout" ...
From: Avery Pennarun
Date: Thursday, April 8, 2010 - 3:14 pm

I agree that you bring up a good point here.  I just hope you don't

The length of the synopsis section doesn't affect me much.  Mentioning
the equivalent config variable next to a command-line option, where
one exists, would probably be nice.

It might be okay to not actually describe in each manpage how the
relevant config options work; just referring people to git-config is
probably okay.  Having them all in git-config is useful in itself.

As for examples, well, people seem to really love examples.  So if
someone sends a patch to add more examples, I'm hoping there's no

I almost agree with you, except that nowadays there are *so* many
options that it doesn't really help much to have them all listed up
there.  It might be better to list only the most common ones.

When the same command has multiple modes, I agree that it makes sense


I actually get mildly annoyed when man pages don't list the options in
alphabetical order, because I naturally start looking for them in that
order.  But I can just as easily do a search for the option, so that's
probably just me being pointless.  In contrast, my pager can't help me
sort out the options by functional group, so that's probably a more
useful way to do it.

That said, I don't think consistency here is much benefit.  It's okay
if for some pages, functional groups aren't needed so alphabetical

I think some pages have a DISCUSSION section right at the bottom,
after the description, options, and examples.  This seems like a good
way to do it.  man pages should have concise stuff so you can find the
information quickly, but there's nothing wrong with having detailed

To be honest, I've often wished that the plumbing pages would also
have such detailed examples. :)

Which reminds me, it would be really great if somehow each command's
manual would describe a) whether it's plumbing or porcelain, and b)
the alternative to look for if what you *need* is plumbing or
porcelain and the command is the wrong one.  But I don't know ...
From: Nicolas Sebrecht
Date: Thursday, April 8, 2010 - 4:04 pm

Nice. If a consensus is found for writing consistent documentation, I
think it's worth saving in a file like Documentation/README or
Documentation/CONTRIBUTE.

-- 
Nicolas Sebrecht
--

From: Jeff King
Date: Saturday, April 17, 2010 - 4:51 am

I would like to have consensus on this, too. But it seems like it
gets bikeshedded to death every time it comes up.  But hey, why not try


I much prefer to have the "major modes of operation". So yes, "command
[<options>] <args>" is useless. But

  git log [<option>] [<since>..<until>] [[--] <path>...]

is sparse but useful. You immediately get a sense of how to invoke the
command, and it is very readable. If you were to put in the dozens of
possible options, it would become hard to see what it is saying. If you
want a complete list of options (IMHO), they should be in list form.

As another example, for git-branch, I would suggest:

  git branch [<options>]
  git branch [<options>] <branchname> <start-point>
  git branch -m [<oldbranch>] <newbranch>
  git branch -d [<options>] <branchname>

From that I can quickly see that there are four major modes: listing,
creating a new branch, moving a branch, and deleting a branch. I would
also be happy if each mode was explicitly described. Some of my favorite
synopses are those of perl modules, which tend to give you a very short
and readable code snippet of how you might use the module, along with
comments showing anything non-obvious.

In the case of branch, enumerating the options in the synopsis doesn't
bother me much, because there are few enough that it remains fairly
readable. But something like "git format-patch" or "git apply"
is getting pretty long.

I know that others disagree, though. When this came up last time, some
people said they really like having that giant clump of options. Our


Yes. Also, it should probably discuss the different modes of operation

I also prefer sorting by functionality. The only reason to prefer
alphabetical is for people finding a specific option. Presumably their
pager has a search function (whereas for grouping by functionality, I
agree with your conciseness argument, and it means you are more likely

I would really prefer most of this material to be pushed out ...
From: Junio C Hamano
Date: Saturday, April 17, 2010 - 9:32 am

I like the suggested outcome.

One way of doing this is to strip the description from pretty-format.txt
and move the description to gitpretty.txt (and anything that supports
pretty format will continue to include pretty-format.txt).

But we will need to list _all_ the options twice if we go this route;
pretty-format.txt for the heading, and the descriptions in gitpretty.txt.
Perhaps pretty-format.txt can be autogenerated from gitpretty.txt to keep


Yes, I like it.

--

From: Jakub Narebski
Date: Saturday, April 17, 2010 - 9:57 am

Well, there are some variables, like advice.*, or core.*, or alias.*, or
color.*, or browser.<tool>.path, or i18n.*, or interactive.singlekey,
or notes.*, or user.* that do not really belong to single git command
(well, perhaps they could be put in git(1) manpage), or belong to more
than one command.

-- 
Jakub Narebski
Poland
ShadeHawk on #git
--

From: Junio C Hamano
Date: Saturday, April 17, 2010 - 5:28 pm

So?

Naturally they will be listed like:

    alias.*		git(1)
    color.diff.*	git-diff(1)
    browser.*.path      git(1)
    ...

and I don't see a problem in the general structure Jeff suggested.

    
--

From: John Dlugosz
Date: Monday, April 19, 2010 - 8:33 am

[Empty message]
From: Yann Dirson
Date: Tuesday, April 20, 2010 - 12:02 am

Wouldn't it just be sufficient to keep reflogs on branch deletion, and let the
reflog entries be expired by gc just like for any branch, so that we
only need to gc the reflog itself when it becomes empty?

-- 
Yann

--

From: Jeff King
Date: Tuesday, April 20, 2010 - 4:51 am

Almost. The complication is that a branch "foo" prevents any branch
"foo/bar" from being created. So if you leave the reflog in place, you
are blocking the creation of the reflog for a new branch.

So you need some solution to that problem. Things I thought of are:

  1. Leave the reflog in place until such a foo/bar branch is created.
     But that means branch creation unexpectedly kills off old unrelated
     reflog entries. Combining user surprise and destruction of data is
     probably bad.

  2. Make a refs/dead hierarchy so that the reflogs don't interfere with
     new branches. This just pushes off the problem, though, for when
     you try to delete "foo/bar" and see that "refs/dead/foo" is already
     blocking its spot in the reflog graveyard.

  3. Stick everything in a big "graveyard" reflog. I think there are
     some complications here with the reflog format, though. Namely:

       - reflog entries don't actually name the ref they're on. We could
         munge the comment field to add the name of the ref as we put
         them in the graveyard ref.

       - entries just have a timestamp, and I think we assume they're in
         order. So I guess we can merge-sort the old graveyard ref with
         what we're adding to keep things in order. But it means you
         will have entries from various refs interspersed. I guess that
         is OK, though, as it's not unlike the HEAD reflog.

So (3) seems like the only viable option to me, but I would be happy to
hear alternatives.
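
To make option 3 concrete, here is a rough sketch of the two steps it describes: tagging each entry with the name of its ref, then merging by timestamp. It deliberately uses a simplified entry format of "<timestamp><TAB><message>" rather than git's real reflog format. The function `bury_reflog` and the file names are hypothetical, and nothing like this exists in git.

```shell
#!/bin/sh
# Sketch of option 3: fold a deleted ref's reflog into one "graveyard" log.
# Assumption: a simplified entry format "<timestamp><TAB><message>", not
# git's real reflog format. bury_reflog and the file names are hypothetical.
bury_reflog() {
    ref=$1 reflog=$2 graveyard=$3
    tagged=$(mktemp) merged=$(mktemp)

    # Munge the comment field so each entry names the ref it came from.
    awk -F '\t' -v ref="$ref" 'BEGIN { OFS = "\t" } { $2 = ref ": " $2; print }' \
        "$reflog" > "$tagged"

    # Both inputs are already in timestamp order, so a merge pass suffices.
    touch "$graveyard"
    sort -m -n -k 1,1 "$graveyard" "$tagged" > "$merged"
    mv "$merged" "$graveyard"
    rm -f "$tagged"
}

# Usage: bury the logs of two deleted branches, foo and bar.
demo=$(mktemp -d)
printf '100\tcommit: a\n300\tcommit: c\n' > "$demo/foo.log"
printf '200\tcommit: b\n' > "$demo/bar.log"
bury_reflog foo "$demo/foo.log" "$demo/graveyard.log"
bury_reflog bar "$demo/bar.log" "$demo/graveyard.log"
cat "$demo/graveyard.log"
```

The resulting graveyard file has entries from both refs interleaved in timestamp order, each message prefixed with the ref it belonged to.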

-Peff
--

From: Zefram
Date: Tuesday, April 20, 2010 - 5:02 am

This is easily solved by tweaking the name for dead reflogs.
logs/dead_refs/foo~ doesn't clash with logs/dead_refs/foo/bar~.

You might also want to stick a sequence number into the filename, for
when you delete more than one foo/bar branch.
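
The naming point can be checked with plain files, without involving git at all. This is only a sketch of the proposal: the logs/dead_refs layout is not something git implements, and `dead_ref_names_demo` is a made-up helper. The sequence number idea is left out.

```shell
#!/bin/sh
# Sketch: a "~" suffix keeps a dead reflog for foo from blocking foo/bar.
# The logs/dead_refs layout is the proposal above, not something git implements.
dead_ref_names_demo() (
    set -e
    cd "$(mktemp -d)"

    mkdir -p logs/dead_refs
    : > 'logs/dead_refs/foo~'          # dead reflog for branch foo (a file)

    mkdir -p logs/dead_refs/foo        # directory needed for foo/bar
    : > 'logs/dead_refs/foo/bar~'      # dead reflog for branch foo/bar

    find logs -type f | LC_ALL=C sort
)

dead_ref_names_demo
```

Both files can exist side by side because the file is named "foo~" while the directory is named "foo". Without the suffix, "foo" would have to be a file and a directory at once.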

-zefram
--

From: Yann Dirson
Date: Tuesday, April 20, 2010 - 6:00 am

That sounds cool.  A logs/dead_refs/ namespace of some sort seems to be
unavoidable, to avoid the clash between old "logs/refs/foo/bar~"
and new "logs/refs/foo".

We would also need a syntax for accessing those.  Maybe something
reminiscent of Debian "epochs" in version numbers.  That would
give a syntax like "foo@{1:1}" and "foo@{2:1}" to access the dead and
long-dead refs' logs, respectively looking into foo~<largest> and
foo~<largest-1>.

Going that way, we would probably want to add a "delete" entry to the
reflog when deleting a ref - but that would make "foo@{1:0}"
nonsense, so we could just reject it.

Another option than adding a sequence number would be to move the
dead_refs/ log back to refs/ when the branch is created again.  That
way just after resurrection we have:

	foo@{0}	: now
	foo@{1} : invalid (deleted state)
	foo@{2} : the ref as it was 2 operations before

That would kinda make sense too, but then if the new "foo" is something
completely unrelated, we may rather want to refer to foo@{1:1}
(which is stable until next deletion of foo) rather than foo@{2}, which
varies with current foo.  But the 1st solution could give us that too,
by considering logs/dead_refs/foo~ the logical continuation of
logs/refs/foo.

Would that make sense ?
-- 
Yann
--

From: Zefram
Date: Tuesday, April 20, 2010 - 6:14 am

Yes, that also makes sense.  Pick one model or the other.  A mixture
would *not* make sense.

-zefram
--

From: Jay Soffian
Date: Tuesday, April 20, 2010 - 6:33 am

4. Just append to the existing reflog? Given:

$ git checkout -b topic origin/master # 1
$ git add; git commit ...
$ git checkout master
$ git merge topic
$ git branch -d topic
$ git checkout -b topic origin/master # 2

Who's to say that the branch named topic from (1) and the branch named
topic from (2) are unrelated? Isn't the fact that they have the same
name an indication that they are likely to be related? And even if
they are unrelated, what's wrong with re-using the same reflog?

Wouldn't it be obvious what happened? e.g.:

64c7587 topic@{0}: branch: Created from HEAD
abcdef3 topic@{1}: branch: deleted topic    <---- I made this one up
3568c4b topic@{2}: commit: turned the knob to 11
707d9fb topic@{3}: branch: Created from HEAD

j.
--

From: Jeff King
Date: Tuesday, April 20, 2010 - 7:24 am

I like how the user would interact with that, but what happens with:

  git checkout -b topic/subtopic

The reflog of the deleted branch is in the way.

-Peff
--

From: Yann Dirson
Date: Tuesday, April 20, 2010 - 7:42 am

That would be addressed by considering logs/dead_refs/* contents
*logical* continuations of logs/refs/* (bottom-most suggestion in my
other email)

-- 
Yann
--

From: Jay Soffian
Date: Tuesday, April 20, 2010 - 7:52 am

Handle it just as gracefully as we do today. This is what happens when
you try to create a branch with a similar collision:

$ git branch foo/bar
$ git branch foo
error: there are still refs under 'refs/heads/foo'
fatal: Failed to lock ref for update: Is a directory

So the reflog analog would be:

$ git branch topic/subtopic
error: there are still logs under 'logs/refs/heads/topic'
fatal: Failed to lock log for update: Is a directory

I think it's an edge case; thus I think it's okay to fail as long as
we give a reasonable error and a way to rename it.

j.
--

From: Alex Riesen
Date: Tuesday, April 20, 2010 - 8:03 am

No it is not. Creation of the reflog is not the purpose of
the git branch operation (creation of the branch itself is).
It will just be an annoyance, especially if the user
has to do a rename which could be done automatically.
--

From: Jeff King
Date: Tuesday, April 20, 2010 - 8:10 am

Yeah, but my next step would be "branch -d foo/bar"; under your proposal
that no longer works. Now I have to do "branch -m foo/bar foobar" where
"foobar" is some name that I know means "the old reflog for foo/bar".

So I think it makes more sense to come up with that naming scheme

It is an edge-case, but I'd rather just have a scheme that works nicely
in the normal case and "degrades" only in the error case. Like if
creating "foo/bar" we see that we have "foo", but that the last reflog
entry is deletion, we move "foo" to "foo-1" or something. It's ugly, but
it just doesn't come up that much.

-Peff
--
