Re: [PATCH 2/2] git-gc: skip stashes when expiring reflogs

From: Eric Raible
Date: Wednesday, June 11, 2008 - 9:32 pm

This unfortunately goes against the recommended usage in John Wiegley's
otherwise excellent "Git From the Bottom Up".  I've contacted him separately to
make him aware of the collective wisdom of not relying on stashes for long-term
storage.

- Eric


--

From: Wincent Colaiuta
Date: Wednesday, June 11, 2008 - 10:35 pm

Yes, we shouldn't _encourage_ people to use stashes as a long-term  
storage mechanism, but neither should we allow old stashes to silently  
disappear as a result of reflog expiry, especially as part of  
automatic garbage collection. There are two reasons:

(1) Normal reflogs accumulate cruft automatically through normal use  
and if not cleaned up they'll just grow and grow and grow. On the  
other hand, for "git stash" to accumulate cruft over the long term the  
user actually has to take action and _abuse_ them. Abuse is less  
likely because it requires this conscious action, and as the output of  
"git stash list" gets bigger and more unwieldy this will serve to  
encourage people to clean out their stashes themselves, or not let the  
list grow out of control in the first place. In other words, the size  
of the stash reflog is unlikely to be a problem.

(2) Automatically expiring normal reflogs is a service to the user,  
because it's cleaning up something that is automatically generated.  
Stashes are the result of a conscious user decision to create them, so
automatically "cleaning them up" is _not_ going to help the user.

So yes, branches _are_ better and more appropriate for long term  
storage than stashes, but even so I don't think it's right for us to  
risk throwing away information that the user explicitly stashed and  
expected Git to look after for them.

Wincent



--

From: Nicolas Pitre
Date: Thursday, June 12, 2008 - 7:14 am

Fair enough.


Nicolas
--

From: Junio C Hamano
Date: Thursday, June 12, 2008 - 1:13 pm

Yes, but for a limited amount of time.


--

From: Eric Raible
Date: Thursday, June 12, 2008 - 1:35 pm

A limited amount of time?  Why is that?  Can you give a rationale which
at least addresses Wincent's points?

It's bad enough that 'git stash clear' drops all pending stashes w/out
at least echoing them (the way git stash drop does), but what is the
rationale for having git _ever_ forget information which it was specifically
requested to remember?

I know that stash is implemented in terms of reflogs, but that seems
to me an implementation detail which ought not leak out.  Especially
if that leakage ends up forgetting potentially important data.

- Eric
--

From: Junio C Hamano
Date: Thursday, June 12, 2008 - 1:51 pm

Perhaps

 http://thread.gmane.org/gmane.comp.version-control.git/84665/focus=84670

The user explicitly asks to stash it for a while, where the definition of
the "while" comes from reflog's retention period.

--

From: Eric Raible
Date: Thursday, June 12, 2008 - 2:36 pm

But that doesn't answer the basic question as to why it's ok
to trash data that the user explicitly asked git to save?

The fact that stash is implemented in terms of reflogs seems irrelevant to me.
If stash were implemented via branches and cherry picking would it still be
natural to automatically expire them?

At the very least the man page might want to mention the temporary nature
of stashes.  Better yet, 'git stash' could print that out when creating
the stash, eh?

- Eric
--

From: Johannes Schindelin
Date: Thursday, June 12, 2008 - 9:52 pm

Hi,


If the user really asked git to save the changes, she would have 
_committed_ them.

"git stash" really is only about shelving quickly and dirtily something 
you'll need (or maybe need) in a moment.

If you need something from the stash a day after stashing it, you have a 
serious problem with understanding what branches are for.

Ciao,
Dscho
--

From: Wincent Colaiuta
Date: Friday, June 13, 2008 - 1:43 am

While this may be true for codebases which move forward quickly, what
about one which is basically finished and tends not to get touched for
a long time? A situation arises, you stash something, the phone rings,
and for whatever reason the stash gets forgotten and you don't revisit  
the project at all for days, weeks, months. It wouldn't be nice to  
eventually come back and discover that your in-progress work had been  
"garbage" collected for you.

Wincent



--

From: Jeff King
Date: Friday, June 13, 2008 - 2:13 am

I think this argues more for increasing the expiration period on
reflogs, if your project moves very slowly.
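
As a rough sketch of what raising the retention period could look like: the
"1 year" value and the scratch repository below are assumptions chosen for
illustration, not something proposed in the thread. The two configuration
keys are the standard reflog expiry settings, whose defaults are 90 days
and, for entries no longer reachable from the tip of their ref, 30 days.

```shell
# Sketch only: the "1 year" value and the scratch repository are
# illustrative assumptions, not recommendations from the thread.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Keep reflog entries (and therefore stash entries) for a year
# instead of the default 90 days.
git config gc.reflogExpire "1 year"

# Entries unreachable from the tip of their ref expire sooner by
# default (30 days); raise that limit as well.
git config gc.reflogExpireUnreachable "1 year"

# Show the values now in effect for this repository.
git config --get gc.reflogExpire
git config --get gc.reflogExpireUnreachable
```

Both settings apply to every reflog in the repository, so this lengthens
retention for ordinary branch reflogs too, not just for the stash.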

-Peff
--

From: Johannes Schindelin
Date: Friday, June 13, 2008 - 2:41 pm

Hi,

On Fri, 13 Jun 2008, Wincent Colaiuta wrote:

> On 13/6/2008, at 6:52, Johannes Schindelin wrote:
From: Christian Jaeger
Date: Friday, June 13, 2008 - 4:33 pm

[Empty message]
From: Wincent Colaiuta
Date: Saturday, June 14, 2008 - 1:58 am

Your arguments are _totally_ spurious.

With respect to your first point, I didn't even mention repository  
duplication and backups so I don't know why you tried to inject it  
into the discussion. I am _completely_ serious about preserving my  
data in the repos I work on, and that's why I do automated backups of  
the home directory which contains all of my local working repos every  
two hours (ie 12 times per day), plus an automated whole-disk backup  
once every 24 hours. However, my point about "git stash" is completely  
independent of my backup regimen; the backup regimen exists to protect  
me against disk and system failures, not to protect me from my SCM.

And on your second point, you are arguing that anything you can't  
remember isn't worth keeping, which just isn't a sustainable argument.  
Can you remember the thousands of commits in Git's complete history?  
Would it be okay to just throw away the changes you've forgotten about?

So I don't think I've made your point for you at all; it remains for  
you to make it for yourself. But it seems to me that you are arguing  
against the sizeable majority of participants on this thread which has  
already dragged on too long.

So, let's recap:

(1) It is reasonable to expect, and in fact it is clear that most  
users _do_ expect, that Git should indefinitely remember changes that  
they told it to remember with "git stash save"

(2) The people largely responsible for the implementation of "git  
stash" never envisaged its use for long term storage (either  
intentional abuse of "stash", or inadvertent misuse) and architected  
it in such a way that it can "forget" stashes after a period of time

So far you've only attacked the first point, and your defense of the  
second point has consisted of nothing more than an affirmation that  
the status quo should remain in place because that's the way it's  
always been. I haven't yet heard any explanation of _why_ it's such a  
big deal to adjust the behaviour of the ...
From: しらいしななこ
Date: Saturday, June 14, 2008 - 4:59 pm

I apologize for my lack of perfect foresight as the original
author of the command.  As I already said, I think a reflog
expiration period that is configurable for each ref, as suggested
earlier by Junio, makes sense.

But changing the default expiration to "never" for stash has its
own problem and I think we need to modify the way a stash entry
is created to solve it.

You will find in the DISCUSSION section of git-stash manual page
that a stash entry is represented as a commit that is a merge
between the commit the stash was made on (H), and a fake commit
whose tree records the contents of the index (I).  The merge
commit itself records the state of the files in the working
directory as its tree (W):

           .----W
          /    /
    -----H----I

If you never expire stash entries, you will keep the history
behind the commit H forever.  This is unnecessary and is problematic
particularly if you rebase your branches frequently.

In order to apply a stash, all you need are the trees of the three
commits contained in this structure.  You do not need the
history behind commit H.
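
The H/I/W structure described above can be inspected directly. The
following is a minimal sketch in a throwaway repository; the file name,
file contents and commit message are made up for illustration.

```shell
# Sketch only: scratch repository with hypothetical file contents.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "H: base commit"

echo staged >> file.txt
git add file.txt              # this state goes into the index (I)
echo worktree >> file.txt     # this state exists only in the work tree (W)

git stash -q

# First parent of the stash commit is H, the commit the stash was made on.
git rev-parse HEAD
git rev-parse 'stash^1'

# Second parent is the fake commit whose tree records the index (I).
git show 'stash^2:file.txt'

# The stash commit itself records the work tree state (W).
git show 'stash:file.txt'
```

Because the first parent is H, keeping the stash entry also keeps
everything reachable from H, which is the concern raised above.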

The following is a trial patch to change how a stash is recorded.
With this patch I do not think we will keep unnecessary commits
behind H in the repository even when a stash is kept forever.

diff --git a/git-stash.sh b/git-stash.sh
index 4938ade..f4c4236 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -54,6 +54,9 @@ create_stash () {
 	fi
 	msg=$(printf '%s: %s' "$branch" "$head")
 
+	# create the base commit that is parentless
+	b_commit=$(printf 'base of %s\n' "$msg" | git commit-tree "HEAD:")
+
 	# state of the index
 	i_tree=$(git write-tree) &&
 	i_commit=$(printf 'index on %s\n' "$msg" |


-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/


--

From: Mikael Magnusson
Date: Friday, June 13, 2008 - 5:05 am

Even if everyone were to eventually agree with this, I think it's a bit rude
to teach people who stored something important in a stash for a month that
this is the case by deleting their work.

Even with the documentation update, if you're a newbie, you might not know
that git in some places automatically calls git gc when you git pull, as in
the example Brandon gave.

-- 
Mikael Magnusson
--

From: Brandon Casey
Date: Thursday, June 12, 2008 - 2:27 pm

The fact that this caveat is not mentioned anywhere in the stash
documentation or anywhere in the commit log related to git-stash.sh makes
me think that this idea of 'a limited amount of time' was possibly not a
design decision but merely a side effect of stashes being implemented using the
reflog. Of course I didn't pay any attention to the discussions about stash
back when it was implemented, so I may definitely be wrong.

I'm not sure what the drawback is for persistent stashes though. This is
what I can think of:

  - enlarges repository size by retaining cruft referenced by old stashes
  - encourages bad workflows
  - behaves in a way that is not expected or preferred by the user
  - overly complicates code

The first item I think is somewhat irrelevant. There are many ways that a user
could cause repository size growth, and as Wincent suggested, the increase in
size of the list of stashes is an incentive to clean it up. And in the case of
user generated data, the definition of cruft should be left to the user.

I don't think the second item is true. I don't think any particular work flow
is being encouraged here.

The third item is the one I think is the most important. I think this is a user
interface issue. "Does git do what the user _expects_ git to do?". I offered one
example where the current behavior would produce a result that was likely not
expected by the user and possibly not desired by the user. A counter
example (one that would argue against the suggested change in behavior)
would be if I were to create a stash today and then be surprised 30
days from now when I do a 'stash list' and find the stash is still there.
Something along the lines of:

   $ git stash save my work
   # wait 30 days
   $ git stash list
   stash@{0}: On master: my work

   # and if my reaction were something like:
   # hmm, that's strange, what is that stash still doing there? It's been 30 days,
   # it should be gone.

btw, that _is_ the current ...
From: Junio C Hamano
Date: Thursday, June 12, 2008 - 2:46 pm

I do not deeply care either way, but perhaps

 http://thread.gmane.org/gmane.comp.version-control.git/50737/focus=50863 

and yes, use of reflog was more or less a conscious thing and the mechanism
is very much temporary in nature (see the use case stated in the starting

We could prune before running "git stash list", but why bother?  The fact
you can see it is like a bonus.


--

From: Brandon Casey
Date: Thursday, June 12, 2008 - 3:10 pm

Ahh, you reveal that you were not always a supporter of "stash per branch" in


Yes, I understand the use case. I am just not convinced that a persistent

hmph :)

-brandon

--

From: しらいしななこ
Date: Thursday, June 12, 2008 - 8:45 pm

After reading the thread, I reread the manual page.

I agree with some people that it would be nicer if the documentation
spelled out the temporary nature of the stashed states better.

-- 8< -- (^_^) -- >8 --

Improved git-stash documentation

The examples in the manual talk about temporary operation but it is not a strong
enough hint that the stashed states are temporary in nature and designed
to be automatically pruned away when gc runs.

Signed-off-by: Nanako Shiraishi <nanako3@bluebottle.com>
---

 Documentation/git-stash.txt |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index baa4f55..65f6af7 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -14,10 +14,11 @@ SYNOPSIS
 DESCRIPTION
 -----------
 
-Use 'git-stash' when you want to record the current state of the
-working directory and the index, but want to go back to a clean
-working directory.  The command saves your local modifications away
-and reverts the working directory to match the `HEAD` commit.
+`git-stash` records the current state of the work tree and the index, and
+reverts them to the state after freshly checking out the current commit.
+You can temporarily switch to work on something different, and once you
+finished handling the emergency, you can come back to the state before you
+were interrupted.
 
 The modifications stashed away by this command can be listed with
 `git-stash list`, inspected with `git-stash show`, and restored
@@ -84,6 +85,9 @@ longer apply the changes as they were originally).
 clear::
 	Remove all the stashed states. Note that those states will then
 	be subject to pruning, and may be difficult or impossible to recover.
+	Old stashed states are also pruned automatically when
+	linkgit:git-gc[1] prunes old reflog (see linkgit:git-reflog[1])
+	entries.
 
 drop [<stash>]::
 

-- 
Nanako ...
From: Andreas Ericsson
Date: Thursday, June 12, 2008 - 9:26 pm

Why are branches better and more appropriate?
Is it because the developer who first thought of stashes didn't think they'd
be used for any halflong period of time?
Is it because there are actions you can do on a branch that you can't do on
a stash?

Who's to say what's appropriate and not? If I explicitly tell a system to
save something for me I damn well expect it to be around when I ask that
same system to load it for me too.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
--

From: Jeff King
Date: Thursday, June 12, 2008 - 10:58 pm

I think we are getting into circular reasoning here (on both sides):

Branches are better, because they don't expire. Stashes expire, because
branches are a better way to do what you want.

Stashes shouldn't expire, because the user told the stash to save
information. The user considers it a "save" because stashes hold things
forever. Stashes hold things forever because they shouldn't expire.

In other words, yes, the developer who thought of stashes didn't think
they'd be used for a long period of time. That's _why_ they were
designed as they were. The status quo argument says "this is what a
stash is, because that is how it is implemented."

So I would expect people in favor of the change to say "here is why
long-term stashes are useful." And I would expect such an argument to
address the fact that we don't simply want to recreate branches (badly).
In other words, what is the compelling use case that makes people want
to stash for months at a time?

-Peff
--

From: Andreas Ericsson
Date: Friday, June 13, 2008 - 12:16 am

Ah right. Thanks for clarifying and putting me back on a useful track.

To me, long-living stashes are useful because I can all of a sudden be
pulled away from something I'm working on and set to work on something
entirely different for up to 6 months (so far we haven't had a single
emergency project run longer than that). It doesn't happen a lot, but
it *does* happen.

When that happens, I just leave everything as-is, because that's the
most useful state for me to find it in and serves as a nice bump to jog
my memory as to precisely what it was I was working on. When I get back it's
possible that someone else has committed design changes or some minor
bugfixes, so naturally I always fetch to make sure I inspect the latest
changes.

I sometimes have stashes around for a day or two if it turns out I
absolutely have to fix some bug or add something to an API before I
can finish the feature I just started working on and the minor change
turns out to be not-so-minor (or if it requires a day or two of testing
to verify).

Sure, I could probably benefit from starting a topic-branch immediately
and then rebase it later, but I also have a git-daemon running so my
co-workers can fetch the latest from me (I work with back-end stuff
usually, and sometimes they need mockups of soon-to-be-real API stuff
which we'd prefer not to get into the central repository), and I don't
want them to get the stuff I *know* is incomplete. It leads to confusion
and unnecessary work. Stashes are handy there.

This workflow works fine for me, but I'd be appalled if I all of a sudden
got back from a period of being away, did a git-fetch and had git-gc
remove my stash(es). I rarely have more than one or two.

Come to think of it, I think it has actually happened once, and I spent
two days trying to find the changes I knew I had made before I gave up
and put it down to the changes having been done on a testing system
and overwritten at a later time.

I think these are the options we're faced with:
1. Never ...
From: Jeff King
Date: Friday, June 13, 2008 - 12:42 am

So of course my first question is "then why didn't you use a branch?" :)

I'm not, by the way, trying to say "there is no good reason not to use a
branch." I am trying to figure out what the reasons are, because I
wonder if there is a more useful abstraction we can come up with for handling
this situation.

Reading your (and others') responses, it seems like there are two
things:

  1. Stashing is about saying "save everything about where I am now with
     no hassle". IOW, it's one command, you don't have to decide what
     goes and what stays, and you can pull it back out with one command.
     And maybe there is a psychological component that you are not ready
     to "commit" such a work-in-progress (I am extrapolating here, but I
     know that when I first started with git, I was hesitant to commit
     because of my experience with other systems).

  2. Branches tend to get shared, and you don't want people to see your
     stashes, because they are messy works in progress.

To deal with '2', I wonder if it would be worth making some branches
inaccessible to pushing/pulling (either via config, or through a special
naming convention).

For '1', I guess the only solution would be some way of making a topic
branch easy to stash and restore. And that would end up looking quite a
bit like stash; I would think the interface would be "git notstash
branchname". Which we sort of have with "git stash save <message>"
already. We could say "if you didn't provide a message, then it will be
gc'ed eventually, but if you did, then it lives forever". That would
work fine for my workflow (I don't bother naming stashes, since I
generally unstash them immediately). But it seems like an accident

I am tempted by #3, which again matches my workflow. But again, it seems
like an accident waiting to happen for unsuspecting users.

So I think either #1 or #4 is reasonable. #4 probably isn't worth the
effort. If the stash reflog gets too cluttered, one can always expire or
clean it ...
From: Andreas Ericsson
Date: Friday, June 13, 2008 - 1:11 am

Because stashes are convenient, never get propagated anywhere by accident,
are easy to apply, and mean you won't have the hassle of creating "topic-bugs"
and later merging it into "topic" when you find something you need to fix

Right. If #1 gets dropped, I'll most likely hack up #4 though. I'd hate
for one part of git to be able to silently drop work when every other
aspect of it makes damn sure that never, ever happens.

I can imagine lots of people complaining if the merge logic suddenly
starts clobbering dirty work-tree files with an mtime 90 days in the past,
even though the user hasn't explicitly asked git to take care of those at
all.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
--

From: Jakub Narebski
Date: Friday, June 13, 2008 - 1:51 am

[...]

There is one thing that is easy to do with a stash (due to the way it
is implemented, even if it complicates it a bit), and you CANNOT do
(without much hassle) with branches, namely saving state where _index_
state matters, either partial commit (or just added files), or
conflict resolve in progress.
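
A small sketch of the index-preserving case, assuming a scratch repository
with made-up file contents: one change is staged, another is left unstaged
on top of it, and `git stash pop --index` brings both back as they were.

```shell
# Sketch only: scratch repository with hypothetical file contents.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "initial"

echo two >> file.txt
git add file.txt              # staged change
echo three >> file.txt        # further, unstaged change

git stash -q                  # work tree and index are clean again

# --index asks git to restore the index state as well as the work tree.
git stash pop -q --index

git diff --cached --name-only # lists the file with the staged change
git diff --name-only          # lists the file with the unstaged change
```

Without `--index`, the changes still come back in the work tree, but the
distinction between what was staged and what was not is lost.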

I'm not sure how useful such a thing can be after a month, but if
project has slow rate of development (and developer can deal with such
"halfway" state decently when restored), it can happen...

-- 
Jakub Narebski
Poland
ShadeHawk on #git
--

From: Sverre Rabbelier
Date: Friday, June 13, 2008 - 1:56 am

Because nobody / not everybody has perfect foresight, sometimes you
don't know in advance that what you thought was going to be a
temporary stash will turn into a long lived stash. What you are saying
is that really you should always create a branch, just in case your
temporary stash proved to be more long-lived than thought?


-- 
Cheers,

Sverre Rabbelier
--

From: Jeff King
Date: Friday, June 13, 2008 - 2:10 am

Well, two things here:

  1. I was being somewhat tongue in cheek with that comment. If you read
     the rest of the email, I was trying to figure out reasons why
     people are using "git stash" for long-term storage, to see if we
     could improve the branch interface or find a middle ground between
     temporary stashes and branches.

  2. You don't need perfect foresight. Sometime in the thirty days (but
     probably about 5 minutes later) you realize "oh, this is some
     stashed work that I'm not going to deal with for a while" and you
     promote it to a topic branch.

     But then, I have good reason to want works-in-progress to become
     topic branches: I can then push them to a location which is backed
     up, and from which I can retrieve them if I want to access them
     from a different machine. Not everybody uses the same workflow.
     If you don't see any other benefits to topic branches, then the
     promotion is just a pain.
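
For reference, the promotion in point 2 can be sketched with the
`git stash branch` subcommand that later Git releases provide; it is not
mentioned in this thread, and the branch name `topic/wip` and the scratch
repository are assumptions for illustration.

```shell
# Sketch only: scratch repository; "topic/wip" is a hypothetical name.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "initial"

echo wip >> file.txt
git stash -q

# Create a branch at the commit the stash was made on, apply the stash
# there, and drop the stash entry once it has applied cleanly.
git stash branch topic/wip 'stash@{0}'

# Record the work in progress as an ordinary commit on the new branch.
git commit -q -a -m "WIP: promoted from stash"

git stash list                # prints nothing: the entry is gone
```

From this point the work is a normal branch and is no longer subject to
reflog expiry.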

-Peff
--

From: Miles Bader
Date: Friday, June 13, 2008 - 4:14 am

That's exactly the sort of thing that people end up forgetting to do,
even if they know about and understand the issue -- and of
course many (most?) people will be using stashes _without_ knowing about
this issue.  I suspect both types will be pretty annoyed when they
realize their work disappeared...

-Miles

-- 
`To alcohol!  The cause of, and solution to,
 all of life's problems' --Homer J. Simpson
--

From: Junio C Hamano
Date: Friday, June 13, 2008 - 2:47 am

I personally do not find the example that Andreas gave convincing, not
because I doubt it happens in practice, but because I think it shows a bad
inter-developer communication.

It is natural to have branches that are private and/or not meant to be
built on top of by others.  We all have them --- heck, I have one that is
called 'pu' (not private but it is meant to be "only look, never touch"
and advertised as such).  But if we need a strong mechanism to enforce
that "never touch" policy by not allowing fetch, there is something wrong
with the inter-developer communication.

While digging the original thread earlier today (eh, it is already
yesterday here), I was thinking about what other alternative design and
implementation would have been sensible.  Here are some thoughts.

 * We _did not have to_ make stashes into refs/stash@{$N}.  We could have
   implemented them as individual refs under "refs/stashes/$N" hierarchy.
   E.g. refs/stashes/1, refs/stashes/2, etc.

   As a side note, we also could have implemented per-branch stash as
   refs/stashes/master@{$N} or refs/stashes/$branch/$N (and we still can.
   Perhaps we can have "git stash save -B" option that tells the command
   to send the resulting stash to the per-branch namespace).

 * We however chose to take advantage of the auto reclamation behaviour of
   reflog, and for most practical purposes, it is a good thing.

 * We later introduced "drop" because even as a volatile and short-lived
   collection of local modifications, you can tell that some stashes are
   utter crap immediately while deciding that some are worth keeping, even
   for a short term.

   This mechanism was however meant for uncluttering the set of stashes.
   "drop" names what you want to discard right now, and by doing so,
   implicitly names what you want to keep for a bit longer (by not naming
   them).  It's a reverse operation -- to make your gems easier to find,
   you discard garbage stash entries.  It is a useful work element.

 * We ...
From: Jakub Narebski
Date: Friday, June 13, 2008 - 3:05 am

By the way, this makes stashes a bit similar to using $TMPDIR to store
files with 'tmpwatch' (or equivalent) enabled.  Is this a good analogy?


This looks nice, although I'd rather not use any magic.  I'm only
afraid that people would notice that some stash / stash entry should
have been "kept" when it is too late.

-- 
Jakub Narebski
Poland
ShadeHawk on #git
--

From: Sverre Rabbelier
Date: Friday, June 13, 2008 - 3:33 am

On Fri, Jun 13, 2008 at 11:47 AM, Junio C Hamano <gitster@pobox.com> wrote:


I'm divided on this:
 OOH: I like the idea of having a keep command to mark stashes as
valuable, making them not expire until dropped explicitly. Such a
feature would also encourage users to go through their stashes every
now and then and decide which ones are valuable, and which ones were
indeed not that valuable and may be dropped.

 OTOH: I dislike the idea of 'forcing' the users to go through their
stashes lest they lose their work. I don't see why anybody would want
to do some work, stash it, and then "for no apparent reason" (the
reason being not touching it for some time) lose it later. What if
their system borks up and gives a wrong value as current time (say, 10
years in the future), all of a sudden their stashes are gone, and they
might not even find out till it was too late. Sure, they'd lose some
stale objects too, but that I can live with, those they did not ask
git to take care of explicitly!

The per-branch stashes sounds very nice, especially if you can get a
'git stash list --all' feature, that shows all stashes, regardless of
what branch they are on. I myself would use such a per-branch feature
most of the time, it would be nice to have a config option that
defaults to that (making 'git stash' create a per-branch stash by
default that is).

-- 
Cheers,

Sverre Rabbelier
--

From: Olivier Marin
Date: Friday, June 13, 2008 - 10:31 am

it seems pretty strange to ask the user for a confirmation: are you sure

I think the same and would prefer per-branch stash by default because I
don't see a real use of a "global" one but maybe I'm wrong. Perhaps, a
config option could make everyone happy. :-)

Olivier.
--

From: Junio C Hamano
Date: Friday, June 13, 2008 - 12:21 pm

The latter argument is somewhat misguided.

To stash is like putting something in /tmp on a system that runs a cron
job to clean out cruft from there once in a while.  Another analogy is to
spitting an information out to syslog, so that it is kept until logs are
rotated.

If you want permanent storage, you do not store it in somewhere that is
designed to have automated rotation or pruning.  Instead, you would create
a file somewhere in your $HOME or use a branch.  It is natural that you do
not have perfect foresight --- so after putting something in /tmp, you may
wish that you can somehow say retroactively that some things you placed
earlier in /tmp are more valuable than others.  "keep" was an example of
how you _could_ express that wish.  In other words, you are not _forced_
to, but you are merely given an opportunity to do so.

I do not personally care too deeply about the "keep" approach.  An easier
to explain (and perhaps easier to implement, too) alternative would be to
have a per-ref configuration variable that specifies the reflog retention
period per ref, e.g. "git config reflog.refs/stash.expire never".

I however mildly suspect that the stash configured as such would end up to
be a lot worse than the current behaviour in practice.  It would make
crufts easily accumulate in the stash, making it harder to find gems, and
as a consequence of that, encouraging you to say "stash clear" or "stash
drop" more often, risking accidental removal of what you did not intend
to (for this exact reason I earlier  -- much earlier than the current
thread -- even thought about suggesting to make the reflog expiry period
much shorter than the usual ref).

But at that point it is the user shooting his foot off ;-)
--

From: Wincent Colaiuta
Date: Friday, June 13, 2008 - 12:35 pm

The number of people who have chimed in on this thread
saying "I expect Git to remember what I told it to remember" indicates
that all too few think of the stash as being like "/tmp" or a logfile.
There are two problems:

1) the name of the command, "stash" (and worse still "stash _save_")  
is giving people a misleading impression about what it does

2) the documentation isn't clear enough about what the command does,  
or people don't read it

I suspect that no matter how good the documentation is, a command  
called "git stash save" that remembers stuff only temporarily will  
continue to evoke reactions of puzzlement for as long as it works that  
way. Seeing as Git is usually extremely conservative about throwing  
away people's stuff, the ephemeralness of "git stash" backing store is  
likely to be quite surprising for many.

What are the options?

- improve the documentation

- rename the command (can't see that happening)

- change the behaviour to "keep" until popped/cleared

I think the latter's the best, and a patch for it was already posted  
at the beginning of this thread.

Cheers,
Wincent



--

From: Brandon Casey
Date: Friday, June 13, 2008 - 12:42 pm

I like my analogy better. :)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index baa4f55..119117e 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -17,7 +17,11 @@ DESCRIPTION
 Use 'git-stash' when you want to record the current state of the
 working directory and the index, but want to go back to a clean
 working directory.  The command saves your local modifications away
-and reverts the working directory to match the `HEAD` commit.
+and reverts the working directory to match the `HEAD` commit. You should
+think of the stash like a garbage can in which you place your temporary
+work and then bring out to the curb. If you are quick to use your stashed
+changes you can get them before garbage collection occurs (it's every
+Tuesday where I live, but git does it every 30 days).
 
 The modifications stashed away by this command can be listed with
 `git-stash list`, inspected with `git-stash show`, and restored
--

From: Olivier Marin
Date: Friday, June 13, 2008 - 12:49 pm

Are you sure about the "temporary" thing? I'm not a native speaker but all
the dictionaries I tried define stash as:

  "to put or hide away (money, valuables, etc.) in a secret or safe place,
   as for future use."


Why not encourage the use of "pop" then?

Olivier.
--

From: しらいしななこ
Date: Friday, June 13, 2008 - 6:16 pm

I think I am primarily responsible for this auto expiration
behavior of the git-stash command.  The original use case that led
to my git-save script was only about very short-term use.  To
tell you the truth, I did not even realize myself that I can use
stash@{<<number>>} to refer to more than one state until
Johannes Schindelin pointed it out to me.  It was only about
saving the current state once, and that proves it was about very
short-term use and nothing else.

But I think git-stash may have outgrown the original motivation.

Configurable expiration period per reflog like you suggested
sounds like the most sane solution to the issue to me.  I think
that approach is especially attractive because it can kill a few
birds with the same stone.  You can configure remote tracking
branches' logs to expire much sooner than your own branches'
ones, for example.

But I think "reflog.refs/stash.expire" you mentioned above is a
bad name.  Because the default expiration period is configured
with "gc.reflogexpire", it would be better to make the variable
name start with "gc".
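
To make the per-reflog expiry idea concrete, here is a toy model in
Python. It is only a sketch and not how git implements anything: the
"gc.<ref>.reflogexpire" key layout, the helper names and the periods
are all assumptions made up for illustration, following the suggestion
that the variable name start with "gc".

```python
from datetime import datetime, timedelta

# Hypothetical settings; None means "never expire".
# The per-ref key layout is an assumption for illustration only.
config = {
    "gc.reflogexpire": timedelta(days=90),  # global default (placeholder value)
    "gc.refs/stash.reflogexpire": None,     # keep stashes until dropped
    "gc.refs/remotes/origin/master.reflogexpire": timedelta(days=7),
}

def expiry_for(ref, config):
    """Return the expiry period for ref, falling back to the global default."""
    key = "gc.%s.reflogexpire" % ref
    if key in config:
        return config[key]
    return config["gc.reflogexpire"]

def surviving_entries(ref, entries, now, config):
    """Return the reflog entries of ref that have not expired yet.

    entries is a list of (timestamp, description) tuples.
    """
    period = expiry_for(ref, config)
    if period is None:
        return list(entries)
    cutoff = now - period
    return [entry for entry in entries if entry[0] >= cutoff]

now = datetime(2008, 6, 13)
entries = [
    (datetime(2008, 1, 1), "old"),
    (datetime(2008, 5, 1), "middle"),
    (datetime(2008, 6, 10), "recent"),
]

stash_kept = surviving_entries("refs/stash", entries, now, config)
master_kept = surviving_entries("refs/heads/master", entries, now, config)
remote_kept = surviving_entries("refs/remotes/origin/master", entries, now, config)
```

With these made-up settings the stash reflog keeps everything, the
local branch falls back to the global default, and the remote tracking
branch expires soonest, which is the "few birds with the same stone"
effect.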

By the way, I sent a documentation patch to git-stash but did
not hear any response.  Was there anything wrong with it?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/


--

From: Wincent Colaiuta
Date: Friday, June 13, 2008 - 5:40 am

Sounds a little bit over-engineered to me.

So, "stash" is intended for short-term storage, but by adding a "keep"  
option you're officially blessing it for long-term storage as well.  
And the interface that you propose, explicitly marking stuff as "for  
keeps" and being able to move stuff from "temp" to "keep" sounds quite  
complicated.

I honestly think that the simplest solution from both an  
implementation and a usage perspective is just to keep everything that  
is stashed until the user clears it out. If you use a push/pop model  
then your stash will never get cluttered up with garbage, and if you  
do abuse it for long-term storage you'll start to notice that the  
stash list is inconveniently large, thus hinting that perhaps you are  
abusing stash in ways that the designers never intended.

Cheers,
Wincent

--

From: Jeff King
Date: Friday, June 13, 2008 - 6:11 am

I agree. I like the expiration of stashes, but if it is a choice between
"just don't expire them" and "here is a complex set of rules and
obligations for preventing them from expiring" I think we are better off
just leaving them.

-Peff
--

From: Olivier Marin
Date: Friday, June 13, 2008 - 10:03 am

I really like your refs/stashes/$branch/$N idea because it seems easier to
list and clean with git stash list/drop/clear.
But I think stash should stay a per-branch thing by default. What about a


I don't like it at all. Why not just have "keep" by default? The users can
already use "pop", "drop" and "clear" if they want to trash their stash.

Olivier.
--

From: Jon Loeliger
Date: Friday, June 13, 2008 - 6:54 am

There are additional choices too, I think, with config-driven
variations as well.

At git-gc time, notice a reflog entry for a stash that
is about to expire and either convert it to a branch or
interactively offer to convert it or delete it.

Provide a command that converts stash entries to branches.
Maybe even take over refs/stash/ name-space or so?

All with various config options to do that quietly, interactively,
always, never, etc.

jdl
--

From: Brandon Casey
Date: Friday, June 13, 2008 - 9:54 am

Maybe the command name is the problem. We know that 'git stash' is short
for 'git stash save', so we think we are directing git to "save" something.
In the physical world when I "save" something or "stash" it away, even
temporarily, I don't stick it in a garbage can and put it out by the curb. :)

Sorry, that's just what popped into my head :)

-brandon
--
