Hi,

I have a repository with a working tree on machine A. I cloned it to another machine B and committed there locally. I want my changes to get back into the first repository, so I did a "push". The new commit is in the history (I can see it with "git log"), but the modifications are not in the working tree.

This time, it's OK: I didn't have any uncommitted modifications on A, so I just did a "git reset --hard HEAD" there. But if I had had some uncommitted changes, "git reset --hard HEAD" would mean data loss, which is precisely what I want to avoid by using a VCS.

It seems a solution is to do:

$ git reset --soft <commit-id-before-the-push>
$ git merge <commit-id-after-the-push>

But it means I have to remember <commit-id-before-the-push>.

I don't understand the design choice here. Git had two options to avoid this scenario:

1) Update the working tree while doing the push. That's feasible with good performance since git is present on the server, but leaves the problem of possible conflicts.

2) Let git remember what the local tree points to (not just the branch name, but the commit id itself, stored in a place that "git push" won't modify). Then, provide me a way to "update" to the latest revision.

FYI, bzr does this. Indeed, in bzr, a branch (let's say "repository" in the git vocabulary) with a working tree just means a working tree (AKA lightweight checkout) located in the same directory as a branch. The working tree knows which revision it corresponds to, and where to find its branch. There's a "bzr update" command to get my working tree to the head of the branch, keeping the uncommitted changes.

I believe this idea is very much linked to the "Lightweight Checkout" idea (listed on the SoC ideas), since, in the case of multiple working directories sharing the same .git, you don't want a commit in one tree to affect the others.

So, did I miss something? Is there anything on the todo-list?

Thanks,
-- Matthieu
-
The general answer (which you've already received) is to tell folks
simply not to use "git push" to remote trees; basically, if you ever
push into a non-bare repository, it doesn't do what you expect, and it
will leave the novice user horribly confused. A much better answer is
to simply go back to machine A, and pull from machine B.
I was exploring though to see if there was anything we could do
better, and so I used my standard test repository of the GNU Hello,
world program, and did the following:
git clone hello r1
git clone r1 r2
cd r1
<edit hello.c's headers to be GPL v2 only>
git commit -a -m "GPL v2 only"
cd ../r2
<edit hello.c so that the message printed is "Hello, world!"
instead of "hello, world">
cd ..
OK, so this sets up the standard test setup of repositories r1 and r2.
r1 contains a committed change so that hello.c is GPLv2 only. r2
contains an uncommitted change to the actual text printed by hello.c.
The changes are nicely separated by over 100 lines, so
there should be no problems with merges. Let's play...
Experiment #1. Let's try pushing from r1 to r2.
cd r1
git push ../r2
This pushes the GPLv2 change from r1 to r2. However, it leaves
the working tree and the index untouched, which leads to some very
unexpected and surprising behavior:
a) If you do a "git commit" you will commit the current contents of
the index, which is usually the contents of the head of r2 before
the push.
b) If you do a "git commit -a" you will commit the modified changes to
the working directory --- based off of the state of r2 before the
push. What will therefore show up in the revision log is something
which appears to be based off of the more recent change in r1, but
which is really based off of the old history as of r2 before the push.
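The stale index described in (a) and (b) can be reproduced with a small throwaway script. This is only a sketch under a few assumptions: the file contents and the temporary directory are made up for illustration, and the `receive.denyCurrentBranch ignore` setting is needed because newer git releases refuse a push into the checked-out branch unless the receiving repository opts in.

```sh
#!/bin/sh
# Sketch: reproduce the stale index left by a push into a checked-out branch.
set -eu
# Throwaway identity so commits work on a machine without git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

git init -q r1
printf 'license: GPL\n' > r1/hello.c
git -C r1 add hello.c
git -C r1 commit -q -m "initial"
git clone -q r1 r2

# Assumption: newer git refuses this kind of push unless the receiver opts in.
git -C r2 config receive.denyCurrentBranch ignore

printf 'license: GPL v2 only\n' > r1/hello.c
git -C r1 commit -q -a -m "GPL v2 only"
branch=$(git -C r1 symbolic-ref --short HEAD)
git -C r1 push -q ../r2 "$branch"

# HEAD in r2 now names the pushed commit, but the index and the files are
# still the old ones, so the pushed change shows up as a staged revert.
staged=$(git -C r2 diff --cached --name-only)
content=$(cat r2/hello.c)
echo "staged in r2: $staged"
echo "file in r2:   $content"
```

A plain "git commit" in r2 at this point would record the old contents of hello.c on top of the pushed commit, which is the surprise described above.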
All of this is bad, which is why "git push" to a non-bare repository
is extremely surprising. (As an aside, what Bitkeeper would do is to
update...

It's not really an option in my case. A is a fixed-IP/fixed-DNS machine, while B is my home machine, behind a NAT modem-router. So, I'd have to figure out my home IP, port-forward the ssh port from the modem to my machine, ...

If I understand the other answers correctly, I have two options:

* Git doesn't manage this case, and doesn't care about me losing data if it's not committed; I'll have to do it myself with hooks.
* Create a bare repository on machine A, and clone it to a non-bare repo on which I'll work. But that means duplicating the repository on the same filesystem of the same machine. Not really satisfactory either. The "light checkout" feature would make it better, but I'm still worried about what will happen to my light checkout when someone pushes to the repository.

-- Matthieu
-
Doesn't update hook get pre- and post- commit object name? -
Yes, and the same is true in the new post-receive hook. -- Shawn. -
In my comments, I was observing that *after* the push had succeeded, there was no way to find the commit-id-before-the-push, since neither the reflog nor ORIG_HEAD is getting updated. Is there a good reason why not? Would you accept a patch which caused the reflog and possibly ORIG_HEAD to be updated on the remote side of the push?

When I was talking about a hook to enforce the BitKeeper semantics, the question is whether we have enough to enforce the following:

* Only accept the push if it will result in a fast-forward merge (and if not, tell the user to do a git pull, merge locally, and then redo the git push)
* Only accept the push if there are no locally modified files that would be affected when the working directory is updated to reflect the new HEAD

I don't think there's any easy way to determine if these two criteria would be met besides trying to actually do the merge, and if it fails, atomically backing out to the original starting point, right? Or am I missing something painfully obvious?

Since one of the applications where I might want to do something like this is pushing to a web site being maintained by git (where I don't want any result of the interim attempted merge to accidentally get seen by the web server), probably in order to do this right I'd have to have the hook script do a cp -rl of the repository+working tree to some scratch space, try to do the merge and update of the working tree, and if it succeeds, allow it to happen for real in the "live" tree, and if not, fail the merge. This seems awfully kludgy; is there some other way?

- Ted
-
The reflog does update if the log file exists during a push (err, actually during receive-pack). Or if core.logAllRefUpdates is set to true. Now this isn't the default in a bare repository, but it should be the default in a repository with a working directory.

Yes, the update hook can detect this. Actually receive-pack by default rejects *all* non-fast-forward pushes, even if the client

The update hook could also perform this check: test if the ref being updated is the current branch, and if so, verify the index and working directory are clean. That's a simple run of git-symbolic-ref (to get the current branch) and git-runstatus (to check the index and working directory), is it not?

If git-runstatus exits to indicate the tree is clean (nothing to commit) then a simple `read-tree -m -u HEAD $new` should update the working directory and index, right?

-- Shawn.
-
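The checks suggested here could look roughly like the following, run by hand after a push rather than from inside a hook. It is a sketch, not a finished hook: the helper names are hypothetical, git-runstatus is swapped for `git diff-index` and `git diff-files` (I don't believe current releases still ship git-runstatus), the old and new commit ids are captured by the script itself where a hook would receive them from receive-pack, and the `receive.denyCurrentBranch ignore` setting is an assumption about newer git releases.

```sh
#!/bin/sh
# Sketch only: checks a hook might make, exercised by hand in a scratch area.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# True if ref $2 is the branch that HEAD points at in repository $1.
is_current_branch() {
    [ "$(git -C "$1" symbolic-ref -q HEAD)" = "$2" ]
}

# True if the index still matches commit $2 and no tracked file is modified.
tree_is_clean() {
    git -C "$1" update-index -q --refresh >/dev/null 2>&1 || true
    git -C "$1" diff-index --quiet --cached "$2" &&
        git -C "$1" diff-files --quiet
}

# Hypothetical helper: update_if_safe <repo> <old> <new> <ref>
update_if_safe() {
    is_current_branch "$1" "$4" || return 0   # another branch: nothing to do
    tree_is_clean "$1" "$2" || return 1       # dirty tree: leave it alone
    git -C "$1" read-tree -m -u "$2" "$3"     # two-way merge, as in a checkout
}

work=$(mktemp -d)
cd "$work"
git init -q r1
printf 'license: GPL\n' > r1/hello.c
git -C r1 add hello.c
git -C r1 commit -q -m "initial"
git clone -q r1 r2
git -C r2 config receive.denyCurrentBranch ignore   # opt-in on newer git
ref=$(git -C r2 symbolic-ref HEAD)
old=$(git -C r2 rev-parse HEAD)

printf 'license: GPL v2 only\n' > r1/hello.c
git -C r1 commit -q -a -m "GPL v2 only"
git -C r1 push -q ../r2 "$ref"
new=$(git -C r2 rev-parse HEAD)

update_if_safe r2 "$old" "$new" "$ref"
cat r2/hello.c
```

Inside a real hook, git sets GIT_DIR and runs from the repository directory, so the working tree location would have to be handled explicitly; that part is left out here.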
Ah, so that's controlled by receive.denyNonFastForwards, right? Cool, I missed that. Thanks!! Documentation/config.txt doesn't say it defaults to true, but from

What git-runstatus will allow me to do is to abort if there are any local modifications, regardless of whether or not they would conflict with the working tree update. The key phrase in my criteria was no locally modified files "THAT WOULD BE AFFECTED".

What I could do with BitKeeper is that I could modify some file like schedule.html on my webserver, and then push a changeset from my laptop that would update sermons.html, and it would allow the push --- since it would change the file sermons.html, and not touch schedule.html. But if I modified schedule.html on my laptop and then committed it, and *then* tried to push that changeset to the webserver, it would abort, since in order to accept the changeset it would have to update the working tree, and that would clash with the locally modified schedule.html file. At that point I'd have to log in to the webserver, revert the local modification, bring it back down to my laptop, and include it in a proper changeset.

Yeah, I probably shouldn't have ever modified the file locally on the webserver, but that would sometimes happen when I was in a rush, and it was nice when it Just Worked.

- Ted
-
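For what it's worth, the two-tree form of `git read-tree -m -u` (the same machinery a branch switch uses) seems to give close to this "only files that would be affected" behavior. The sketch below replays the schedule.html / sermons.html scenario in a scratch directory. The repository names `laptop` and `web`, the file contents, and the helper `try_update` are all made up for illustration, and the update is run by hand after each push; by then the ref has already moved, so a real hook would need to make its decision earlier.

```sh
#!/bin/sh
# Sketch: local edits in untouched files survive; clashing edits block the update.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Hypothetical helper: try_update <repo> <old> <new>, prints updated/refused.
# <old> is the commit the index in <repo> is currently based on.
try_update() {
    git -C "$1" update-index -q --refresh >/dev/null 2>&1 || true
    if git -C "$1" read-tree -m -u "$2" "$3" 2>/dev/null; then
        echo updated
    else
        echo refused
    fi
}

work=$(mktemp -d)
cd "$work"
git init -q laptop
printf 'schedule v1\n' > laptop/schedule.html
printf 'sermons v1\n' > laptop/sermons.html
git -C laptop add .
git -C laptop commit -q -m "initial"
git clone -q laptop web
git -C web config receive.denyCurrentBranch ignore   # opt-in on newer git
ref=$(git -C web symbolic-ref HEAD)

# Uncommitted edit made directly on the "webserver".
printf 'schedule, edited in place\n' > web/schedule.html

# Push 1 touches only sermons.html.
base=$(git -C web rev-parse HEAD)
printf 'sermons v2, updated\n' > laptop/sermons.html
git -C laptop commit -q -a -m "update sermons"
git -C laptop push -q ../web "$ref"
first=$(try_update web "$base" "$(git -C web rev-parse HEAD)")

# Push 2 touches schedule.html, which clashes with the local edit.
base=$(git -C web rev-parse HEAD)
printf 'schedule v2 from the laptop\n' > laptop/schedule.html
git -C laptop commit -q -a -m "update schedule"
git -C laptop push -q ../web "$ref"
second=$(try_update web "$base" "$(git -C web rev-parse HEAD)")

echo "first push:  $first"
echo "second push: $second"
```

When the second update is refused, read-tree leaves both the index and the working tree alone, so the local edit to schedule.html is still there afterwards.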
Ah, my bad, it defaults to false:

static int deny_non_fast_forwards = 0;

I should have known better, as I run a 1.5.x (aka 'next') server for a workgroup and I never have that set, but use instead a complex

git-diff $old $new | git-apply --index ?

If the patch does not apply, nothing gets updated. If it does apply, the index is also updated and stat data updated. OK, it doesn't quite handle every case, as sometimes a patch will reject but the internal 3-way merge from xdiff that is called by merge-recursive will succeed, but this does protect your working tree and doesn't require making a temporary copy.

Of course another possible approach is to stuff the entire working directory into a temporary tree, and then merge. If the merge doesn't work, you can reset to the temporary tree. Unfortunately the working directory is "in flux" during that process... it's not atomic.

-- Shawn.
-
And if you are only checking then perhaps instead of --index use --check, so that the actual updates are deferred? -
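Putting the two suggestions together (check first, then apply) might look like this. Again only a sketch: `apply_pushed_changes` is a hypothetical helper, the file names and contents are invented, and the old and new commit ids are captured by the script where a hook would get them from receive-pack. The `receive.denyCurrentBranch ignore` line is an assumption about newer git releases.

```sh
#!/bin/sh
# Sketch: dry-run the pushed diff with --check, and only then apply it.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Hypothetical helper: apply_pushed_changes <repo> <old> <new>
apply_pushed_changes() {
    patch=$(mktemp)
    git -C "$1" diff --binary "$2" "$3" > "$patch"
    # --check only verifies; nothing is written until the second call.
    if git -C "$1" apply --check --index "$patch" 2>/dev/null; then
        git -C "$1" apply --index "$patch"
        status=0
    else
        status=1
    fi
    rm -f "$patch"
    return "$status"
}

work=$(mktemp -d)
cd "$work"
git init -q r1
printf 'license: GPL\n' > r1/hello.c
printf 'notes\n' > r1/notes.txt
git -C r1 add .
git -C r1 commit -q -m "initial"
git clone -q r1 r2
git -C r2 config receive.denyCurrentBranch ignore   # opt-in on newer git
ref=$(git -C r2 symbolic-ref HEAD)
old=$(git -C r2 rev-parse HEAD)

# Local, uncommitted edit in r2 to a file the push will not touch.
printf 'notes, edited locally\n' > r2/notes.txt

printf 'license: GPL v2 only\n' > r1/hello.c
git -C r1 commit -q -a -m "GPL v2 only"
git -C r1 push -q ../r2 "$ref"
new=$(git -C r2 rev-parse HEAD)

if apply_pushed_changes r2 "$old" "$new"; then
    result=applied
else
    result=rejected
fi
echo "$result"
```

Since `git apply --index` only looks at the paths named in the patch, the local edit to notes.txt does not get in the way here.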
So I dug a little more deeply, and the problem is that the reflog for
master was getting updated, but not the reflog for HEAD, and that's
what "git reflog" was showing --- hence my confusion.
What are the rules for when HEAD's reflog should get updated, and is
this documented anywhere in the man pages?
- Ted
Script started on Sun 18 Mar 2007 11:11:59 PM EDT
Top-level shell (parent script)
Using ssh-agent pid 7679
<tytso@candygram> {/home/tytso/talks/dscm/git}
1% cp -r test1 test2 ; cd test2
<tytso@candygram> {/home/tytso/talks/dscm/git/test2}
2% (cd r2; git-config core.logallrefupdates)
true
<tytso@candygram> {/home/tytso/talks/dscm/git/test2}
3% cat r2/.git/refs/heads/master
f2e3cc0bb64c8c94b89ba07bfbdd1653584586f2
<tytso@candygram> {/home/tytso/talks/dscm/git/test2}
4% cat r2/.git/logs/HEAD
0000000000000000000000000000000000000000 f2e3cc0bb64c8c94b89ba07bfbdd1653584586f2 Theodore Ts'o <tytso@mit.edu> 1174266825 -0400
<tytso@candygram> {/home/tytso/talks/dscm/git/test2}
5% cat r2/.git/logs/refs/heads/master
0000000000000000000000000000000000000000 f2e3cc0bb64c8c94b89ba07bfbdd1653584586f2 Theodore Ts'o <tytso@mit.edu> 1174266825 -0400
<tytso@candygram> {/home/tytso/talks/dscm/git/test2}
6% (cd r1 ; git push ../r2)
updating 'refs/heads/master'
from f2e3cc0bb64c8c94b89ba07bfbdd1653584586f2
to 37508dc11dbe274d021124057fd2d027f6ce9d17
Generating pack...
Done counting 5 objects.
Result has 3 objects.
Deltifying 3 objects.
100% (3/3) done
Writing 3 objects.
100% (3/3) done
Total 3 (delta 2), reused 0 (delta 0)
Unpacking 3 objects
100% (3/3) done
refs/heads/master: f2e3cc0bb64c8c94b89ba07bfbdd1653584586f2 -> 37508dc11dbe274d021124057fd2d027f6ce9d17
<tytso@candygram> {/home/tytso/talks/dscm/git/test2}
7% cd r2
<tytso@candygram> {/home/tytso/talks/dscm/git/test2/r2} [master]
8% cat .git/refs/heads/master
37508dc11dbe274d021124057fd2d027f6ce9d17
<tytso@candygram> {/home/tyt...

It is buried down in write_ref_sha1 (in refs.c). The rule is: if the name of the ref given to us for update does not match the actual ref we are about to change, we log to both the original ref name given and the actual ref name. This handles the case of HEAD being a symref to some actual branch; we update the HEAD reflog and the actual branch reflog whenever someone updates HEAD, which is what we are usually doing from tools like git-checkout.

receive-pack isn't updating the HEAD reflog as it's updating the actual branch, not HEAD. If you pushed instead to HEAD you should see the HEAD reflog entry too.

It's a little ugly here as I'm not sure we should always update HEAD's reflog if HEAD points at a branch we are actually updating. Maybe we should though in receive-pack?

-- Shawn.
-
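To see what a given git installation does in this situation, one can update the branch directly (not through HEAD) and count the entries in both reflogs. This is a rough sketch with a made-up scratch repository; it makes no claim about what the HEAD count will be, since releases newer than the one discussed here may behave differently from the rule described above.

```sh
#!/bin/sh
# Sketch: update a branch directly and compare the branch and HEAD reflogs.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config core.logAllRefUpdates true

# First commit goes through HEAD, so both reflogs get an entry.
git commit -q --allow-empty -m "first"
branch=$(git symbolic-ref HEAD)
first=$(git rev-parse HEAD)
tree=$(git rev-parse 'HEAD^{tree}')

# Second commit is created with plumbing and installed on the branch by
# name, which is roughly what receive-pack does on the receiving side.
second=$(git commit-tree -p "$first" -m "second" "$tree")
git update-ref -m "direct branch update" "$branch" "$second" "$first"

count_entries() {
    git reflog show "$1" | wc -l | tr -d ' '
}
branch_entries=$(count_entries "$branch")
head_entries=$(count_entries HEAD)
echo "entries in $branch reflog: $branch_entries"
echo "entries in HEAD reflog:    $head_entries"
```

If the two counts differ, "git reflog" (which shows HEAD's log by default) will not mention the direct update, which is the confusion described earlier in the thread.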
What about splitting HEAD when you push to the underlying branch, and making HEAD a non-symref? Sam. -
I do not think any of the complication is needed, and I think
somebody mentioned a good example, which is a firewalled host
that can only be pushed into. In that example, even though he
knows he could fetch in reverse direction in the ideal world,
the network configuration does not let him do so, hence need for
a push.
To deal with that sanely, people who push between non bare
repositories can just forget about pushing into branch heads.
Instead, they can arrange their pushes to be a true mirror image
of their fetch that they wish could do. To illustrate:
On repo A that can only be pushed into, if you _could_ fetch
from repo B, you would:
$ git fetch B
with something like this:
	[remote "B"]
		fetch = refs/heads/*:refs/remotes/B/*
But unfortunately because you can only push into A from B, you would
run this on B instead:
$ git push A
with:
	[remote "A"]
		push = refs/heads/*:refs/remotes/B/*
And after you perform your push, you come to the machine with
repo A on it, and remembering that what you did was a mirror
image of "git fetch B", you would:
$ git merge remotes/B/master
and you are done.
In other words, don't think of refs/remotes/B as something that
is for the use of "git fetch". Its purpose is to track the
remote repository B's heads. You maintain that hierarchy by
issuing fetch in repository A. You can issue push in repository
B to do so as well.
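The whole round trip can be tried out locally with two scratch repositories. This is just a sketch of the arrangement described above: the directory names A and B stand in for the two machines, the file contents are invented, and the branch name is looked up rather than assumed to be 'master'.

```sh
#!/bin/sh
# Sketch: push from B into A's refs/remotes/B/*, then merge on A.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

git init -q A
printf 'hello\n' > A/file.txt
git -C A add file.txt
git -C A commit -q -m "initial"
git clone -q A B
branch=$(git -C B symbolic-ref --short HEAD)

# On B: a remote named A whose push refspec mirrors "git fetch B" on A.
git -C B remote add A "$work/A"
git -C B config remote.A.push 'refs/heads/*:refs/remotes/B/*'

# Work on B, then push. No branch head on A is touched by this.
printf 'hello, world\n' > B/file.txt
git -C B commit -q -a -m "change made on B"
git -C B push -q A

# Later, on A: merge what the push left under refs/remotes/B/.
git -C A merge -q "remotes/B/$branch"
cat A/file.txt
```

Because the push only writes under refs/remotes/B/ on A, A's checked-out branch, index, and working tree stay in step until the merge is run there.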
I push into a live repository almost every day. My typical day
concludes like this:
gitster$ git push kernel-org-private
gitster$ ssh kernel.org
kernel.org$ git merge origin
kernel.org$ Meta/Doit -pedantic &
kernel.org$ exit
... go drink my tea ...
where
(1) gitster is my private development machine
(2) kernel.org is a machine made available to me by friendly
k.org folks
(3) Meta is a checkout of my 'todo' branch and
(4) Doit is a script to build all four public branches.
I always leave 'master' checked out on my kernel.o...-
This is indeed a corner case. And it was never considered before as great care was made at the time to be sure pushes wouldn't create any reflogs on the remote side, which is effectively done by not

If the meaning of HEAD changed (although indirectly) because HEAD happens to point to the branch that just got updated then logically the HEAD reflog should be updated too.

On the other hand the HEAD reflog should reflect operations performed on HEAD. Since the push updates the branch directly it is not exactly performing some operation on HEAD, since HEAD could point anywhere and that wouldn't change the push at all.

Meaning that for the discussion of pushing to a non-bare repository with a dirty working tree... If the branch being pushed into is not pointed to by HEAD then no consideration whatsoever about the working tree should be made, and no update to the HEAD reflog made of course.

Nicolas
-
Right, but if the branch being pushed to is pointed to by HEAD, I would argue that the reflog for HEAD should be updated, since operations that reference HEAD will see a new commit, and it will be confusing when "git reflog" shows no hint of the change.

Of course, if the branch being pushed to isn't one which is pointed to by HEAD, HEAD's reflog shouldn't be updated.

- Ted
-
I think we're saying the exact same thing. Nicolas -
If we were to do this properly, we probably would need to restructure the reflog update code for the HEAD in a major way. "git-update-ref refs/heads/foo $newvalue" when HEAD points at branch 'foo' currently does not update the HEAD reflog, because the current definition of the HEAD reflog is (as Nico mentioned) a log of changes made through the HEAD symref. Instead, we would need a reverse lookup every time any ref is updated to see if that ref is pointed to by any symbolic refs, and update the reflogs of those symbolic refs. This is expensive to do in general, though, because there is no backpointer to the list of symbolic refs that point at a non-symbolic ref. -
But practically speaking... are there that many cases where a branch is updated directly instead of the operation being performed through HEAD? We identified one case, which is a push to a non-bare repo. If those cases are very few (and they _should_ be very few) then we might simply cheat a little and update HEAD separately in those cases.

Nicolas
-
Actually, there are no such "design choices". It's entirely up to the repository owners to arrange a post-update hook, which allows you to do anything you want. The default is not to encourage people (who do not know what they are doing anyway) to push into a non-bare repository. -
Maybe it's worth making it an error (that can be forced) if you're pushing to the head that's checked out in a non-bare repository ? It's pretty nasty behaviour for people used to darcs / bzr et al. Sam. -
We talked about that in the past on the list. No. -
