I often get this (running git 1.5.6.rc0 presently):

y:/usr/src/git26> git-checkout linux-next
error: Untracked working tree file 'arch/x86/kernel/apic.c' would be overwritten by merge.

which screws things up. I fix it by removing the offending file, which gets irritating because git bails out after the first such instance, so I need to rerun git-checkout once per file (there are sometimes tens of them).

Should this be happening? I don't know what causes it, really. All I've been doing in that directory is running `git-checkout' against various maintainers' trees. 95% of the time this works OK but eventually git seems to get all confused and the above happens.

Is there some way in which I can work around this with a single command rather than having to run git-checkout once per offending file? I suppose a good old `rm -rf *' would do it...

Thanks.
--
what I do when I run into this is "git reset --hard HEAD" which makes all files in the working directory match HEAD, and then I can do the other checkout. --
I think you can also do

git checkout -f HEAD

to force the checkout to overwrite all files.

The fact that git will happily leave modified things in the working directory appears to be very helpful for some developers, but it's also a big land mine for others. Is there a way to disable this?

David Lang
--
On Wed, 15 Oct 2008 12:14:34 -0700 (PDT) These files weren't modified. By me, at least. git might have "modified" them, but it has all the info necessary to know that the --
On Wed, 15 Oct 2008 12:14:34 -0700 (PDT)

I do

git-reset --hard HEAD
git-reset --hard linux-next
git-checkout linux-next

and get

error: Untracked working tree file 'Next/SHA1s' would be overwritten by merge.
y
--
What about simply: git-checkout -f linux-next Nicolas --
Never mind -- you apparently did that already with success. Nicolas --
Hmm. It doesn't actually do that normally. If you switch between trees, git will (or _should_) remove the old files that it knows about. If you get a lot of left-over turds, there's something wrong. It could be a git bug, of course. That said, especially considering the source of this, I wonder if it's just that Andrew ends up using all those non-git scripts on top of a git tree, and then that can result in git *not* knowing about a certain file, and then when switching between trees (with either git checkout or with git reset), the data that was created with non-git tools gets left behind and now git will be afraid to overwrite it. So yes, there are ways to force it (both "git checkout -f" and "git reset --hard" having already been mentioned), but the need for that - especially if it's common - is a bit discouraging. Especially since it's still possible that it's some particular mode of git usage that leaves those things around. Andrew - have you any clue what it is that triggers the behavior? (By the filename, I realize it's a file that doesn't exist in one tree or the other, and which doesn't get removed at some point. But have you had merge failures, for example? Is it perhaps a file that was created during a non-clean merge, and then got left behind due to the merge being aborted? It would be interesting to know what led up to this..) Linus --
I see it fairly frequently when switching between different branches of a project. I also see it when I try applying a patch to a tree, then want to get up to date with that tree (in this case it really is different) It could be that git is looking to see if the file is the same as the old tree had it before checking out the new tree. if it isn't for any reason it sounds the alert. --
So, at least for any normal switch, assuming file 'a' doesn't exist in the other branch, you really should have a few different cases:

 - you have a dirty file, and git should say something like

	error: You have local changes to 'file'; cannot switch branches.

   because it refuses to modify the file to match the other branch (which includes removing it) if it doesn't match the index. So this case shouldn't leave anything behind.

 - You have that extra file, but it's not in the index. If it's in your current HEAD, we should still notice it with something like:

	error: Untracked working tree file 'tree' would be removed by merge.

   because now it's untracked (not in the index), but the switching between branches tries to essentially "apply" the difference between your current HEAD and the new branch, and finds that the difference involves removing a file that git isn't tracking.

See?

HOWEVER. If you're used to doing "git checkout -f" or "git reset --hard", both of those checks are just ignored. After all, you asked for a forced switch.

And at least in the second case, what I think happens is that git won't remove the file it doesn't know about, so you'll have a "turd" left around.

So yes, you can certainly get these kinds of left-overs, but they really should be only happening if you "force" something. Do you do that often?

Linus
--
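The refusal described above can be tried out in a throwaway repository. The following is only a sketch: the branch names (base, other), the file name 'a' and the scratch directory are made up for illustration, and the exact wording of the error differs between git versions, so the script only records whether the checkout was refused.

```shell
#!/bin/sh
# Sketch: an untracked file blocks a plain (unforced) branch switch.
# Everything here lives in a scratch directory; all names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

git checkout -q -b base
echo hello > README
git add README
git commit -q -m "base"

# Branch "other" tracks a file named 'a'; branch "base" does not.
git checkout -q -b other
echo tracked > a
git add a
git commit -q -m "add a"
git checkout -q base

# Create 'a' behind git's back, the way a non-git script might.
echo stray > a

# A plain checkout should refuse rather than overwrite the untracked file.
if git checkout other >/dev/null 2>&1; then
	blocked=no
else
	blocked=yes
fi
echo "checkout blocked: $blocked; 'a' still contains: $(cat a)"
```

Replacing the plain checkout with "git checkout -f other" is the forced variant discussed in this thread: it overwrites the stray file instead of stopping.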
one place that I know I've run into it frequently is in an internal project where I did not properly set up .gitignore and did "git add ." and "git commit -a". that project's repository contains the compiled binaries and I frequently get these errors when switching trees. that sounds like the first case. I've seen discussion of a new sequencer functionality, would it allow me to define a .gitignore file and re-create the repository as if that file had existed all along? --
On Wed, 15 Oct 2008 12:31:40 -0700 (PDT) I treat my git directory as a read-only thing. I only ever modify it That's certainly a possibility - I get a lot of merge failures. A real lot. And then quite a bit of rebasing goes on, especially in linux-next. And then there's all the other stuff which Stephen does on top of the underlying trees to get something releasable happening. --
Is "git checkout -f" part of the scripting? Or "git reset --hard"?

So what I could imagine is happening is:

 - you have a lot of automated merging

 - a merge goes south with a data conflict, and since it's all automated, you just want to throw it away. So you do "git reset --force" to do that.

 - but what "git reset --hard" means is to basically ignore all error cases, including any unmerged entries that it just basically ignores.

 - so it did set the tree back, but the whole point of "--hard" is that it ignores error cases, and doesn't really touch them.

Now, I don't think we ever really deeply thought about what the error cases should do when they are ignored. Should the file that is in some state we don't like be removed? Or should we just ignore the error and return without removing the file?

Generally git tries to avoid touching things it doesn't understand, but I do think this may explain some pain for you, and it may not be the right thing in this case.

(And when I say "this case", I don't really know whether you use "git checkout -f" or "git reset --hard" or something else, so I'm not even going to say I'm sure exactly _which_ case "this case" actually is :)

Of course, the cheesy way for you to fix this may be to just add a

	git clean -dqfx

to directly after whatever point where you decide to reset and revert to an earlier stage. That just says "force remove all files I don't know about, including any I might ignore".

IOW, "git reset --hard" will guarantee that all _tracked_ files are reset, but if you worry about some other crud that could have happened due to a failed merge, that additional "git clean" may be called for.

Of course, it's going to read the whole directory tree and that's not really cheap, but especially if you only do this for error cases, it's probably not going to be any worse. And I'm assuming you're not compiling in that tree, so you probably don't want to save object files (you can remove ...
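
The reset-then-clean combination suggested above can be sketched in a scratch repository. This is an illustration only: the file names, the ignore pattern and the temporary directory are assumptions, not anything from the scripts under discussion. Note that "git clean -dqfx" deletes untracked and ignored files without asking, so it only makes sense in a tree that holds nothing you want to keep.

```shell
#!/bin/sh
# Sketch: "git reset --hard" restores tracked files, "git clean -dqfx"
# then removes everything git does not track (including ignored files).
# Scratch repository; all names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo tracked > file.c
git add file.c
git commit -q -m "initial"

# Make a mess: a modified tracked file, an untracked file,
# and an ignored object file inside an untracked directory.
mkdir -p .git/info
echo '*.o' > .git/info/exclude
echo scribble >> file.c
echo stray > leftover.c
mkdir junk
echo obj > junk/obj.o

git reset -q --hard HEAD   # tracked files back to HEAD
git clean -dqfx            # -d directories, -q quiet, -f force, -x ignored files too

# Anything still listed here survived both commands.
leftovers=$(git status --porcelain --ignored)
echo "left over: [$leftovers]"
```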
On Wed, 15 Oct 2008 13:08:36 -0700 (PDT)
well, this script has been hacked on so many times I'm not sure what
it does any more.
Presently the main generate-a-diff function is
doit()
{
	tree=$1
	upstream=$2
	cd $GIT_TREE
	git checkout "$upstream"
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1
	git merge --no-commit 'test merge' HEAD FETCH_HEAD > /dev/null
	{
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
		git diff --patch-with-stat ORIG_HEAD
	} >$PULL/$tree.patch
	{
		echo DESC
		echo $tree.patch
		echo EDESC
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
	} >$PULL/$tree.txt
	git reset --hard "$upstream"
}
usually invoked as
doit origin v2.6.27
doit origin linux-next
etc.
the above seemed fairly busted, so I'm now using
	git checkout -f "$upstream"
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1
which seems a bit more sensible. Perhaps I should do the reset before
the checkout, dunno.
That function has been through sooooooo many revisions and each time
some scenario gets fixed (more like "improved"), some other scenario
gets busted (more like "worsened"). The above sorta mostly works,
although it presently generates thirty-odd rejects against
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip.git#auto-latest,
which is way above my fix-it-manually threshold. linux-next is still
dead because it's taking Stephen over two days to fix the mess he's
Yeah, there's no easy solution here, and I suspect the real solution is
"read programmer's mind". Providing a reliable override (like -f) is a
--
On Wed, Oct 15, 2008 at 10:23 PM, Andrew Morton Hi Andrew, I was wondering whether you could share the scripts you built on top of git, you might get some useful suggestions from this list and they could be inspiration for further improvement in GIT (it just happened with this thread ;-) Thanks. Ciao, -- Paolo http://paolo.ciarrocchi.googlepages.com/ --
oh gee, you don't want to look. It should all be in
http://userweb.kernel.org/~akpm/stuff/patch-scripts.tar.gz

But really it's just the one script, pull-git-patches, below. That thing's been hacked around so much that I daren't breathe on it. Fortunately as long as Stephen Rothwell is producing linux-next I don't have much need for it any more.

#!/bin/sh

GIT_TREE=/usr/src/git26
PULL=/usr/src/pull

git_header()
{
	tree="$1"
	echo GIT $(cat .git/refs/heads/$tree) $(cat .git/branches/$tree)
	echo
}

# maybe use git clean -dqfx

doit()
{
	tree=$1
	upstream=$2
	cd $GIT_TREE
	git checkout -f "$upstream"
	git reset --hard "$upstream"
	git fetch "$tree" || exit 1
	git merge --no-commit 'test merge' HEAD FETCH_HEAD > /dev/null
	{
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
		git diff --patch-with-stat ORIG_HEAD
	} >$PULL/$tree.patch
	{
		echo DESC
		echo $tree.patch
		echo EDESC
		git_header "$tree"
		git log --no-merges ORIG_HEAD..FETCH_HEAD
	} >$PULL/$tree.txt
	git reset --hard "$upstream"
}

do_one()
{
	tree=$1
	upstream=$2
	if [ ! -e $PULL/$tree.patch ]
	then
		echo "*** doing $tree, based on $upstream"
		git branch -D $tree
		doit $tree $upstream
	else
		echo skipping $tree
	fi
}

mkdir -p $PULL

if [ $1"x" = "-x" ]
then
	exit
fi

cd $GIT_TREE
git checkout -f master
cd /usr/src

if [ $# -eq 0 ]
then
	trees=/usr/src/git-trees
else
	trees="$1"
fi

if [ $# -eq 2 ]
then
	do_one $1 $2
else
	while read x
	do
		if echo $x | grep '^#.*' > /dev/null
		then
			true
		else
			do_one $x
		fi
	done < $trees
fi
--
Actually, with your filename, I suspect the conflict would be not a real file content, but more of a "delete" conflicting with a modification to that file.

IOW, I'm guessing that the thing you hit with arch/x86/kernel/apic.c was that some branch you pulled:

 - created that file

 - deleted arch/x86/kernel/apic_[32|64].c

 - the old file got marked as a rename source for the new apic.c

and there was a data conflict when trying to apply the changes.

as a result, your working tree would have that "apic.c" file in it, but with conflict markers, and marked as unmerged. When you then do "git reset --hard", it will just ignore unmerged entries, and since the original tree (and the destination tree) match, and neither of them contain apic.c either, git will totally ignore that file and not

It's "--hard", not "--force".

Yeah, the git reset flags are insane. As is the default action, for that matter. It's one of the earliest interfaces, and it's stupid and reflects git internal implementations rather than what we ended up learning about using git later. Oh, well.

But 'git checkout -f' (which is nicer from a user interface standpoint) has the exact same logic and I think shares all the implementation. I think they both end up just calling "git read-tree --reset -u".

It's quite possible that we should remove unmerged entries. Except that's not how our internal 'read_cache_unmerged()' function works. It really just ignores them, and throws them on the floor. We _could_ try to just turn them into a (since) stage-0 entry.

Junio?

Linus
--
On Wed, 15 Oct 2008 13:23:50 -0700 (PDT) That sounds likely. I suspect things were especially bad today because I accidentally pulled four-week-old linux-next, which had over 500 rejects in it. --
I'd agree that dropping unmerged entries to stage-0 when we can would make sense.

A conflicted existing path would get a stage-0 entry in the index, which is compared with the switched-to HEAD (which could be the same as the current one when "git reset --hard" is run without a rev), we notice that they are different and the index entry and the work tree path is overwritten by the version from the switched-to HEAD.

For a new path that a failed merge tried to bring in, we notice that the switched-to HEAD does not have that path and happily remove it from the index and from the work tree. All will go a lot smoother than the current code.

I am not sure what should happen when we can't drop the unmerged entry down to stage-0 due to D/F conflicts, though. IIRC, read-tree proper would not touch the work tree in such a case, but merge-recursive creates our and their versions with funny suffixes, which will not be known to the index and will be left in the working tree.
--
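For readers who have not looked at index stages before, the unmerged entries being discussed can be listed with "git ls-files -u". The sketch below sets up an ordinary content conflict in a scratch repository and prints them. The branch and file names are invented for the illustration, and it shows a plain content conflict only, not the D/F case raised above.

```shell
#!/bin/sh
# Sketch: a conflicted path shows up in the index at stages 1 (common
# ancestor), 2 (ours) and 3 (theirs) instead of the usual stage 0.
# Scratch repository; all names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

git checkout -q -b base
echo base > f
git add f
git commit -q -m "base"

git checkout -q -b theirs
echo theirs > f
git commit -q -a -m "theirs"

git checkout -q -b ours base
echo ours > f
git commit -q -a -m "ours"

# Both sides changed the same line, so the merge stops with a conflict.
git merge theirs >/dev/null 2>&1 || true

# One line per stage: "<mode> <object> <stage><TAB><path>".
unmerged=$(git ls-files -u)
echo "$unmerged"

# Abandon the merge; the existing path goes back to the HEAD version.
git reset -q --hard
after=$(git ls-files -u)
```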
When aborting a failed merge that has brought in a new path using "git
reset --hard" or "git read-tree --reset -u", we used to first forget about
the new path (via read_cache_unmerged) and then matched the working tree
to what is recorded in the index, thus ending up leaving the new path in
the work tree.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
Junio C Hamano <gitster@pobox.com> writes:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> On Wed, 15 Oct 2008, Linus Torvalds wrote:
>>>
>> It's quite possible that we should remove unmerged entries. Except that's
>> not how our internal 'read_cache_unmerged()' function works. It really
>> just ignores them, and throws them on the floor. We _could_ try to just
>> turn them into a (since) stage-0 entry.
>>
>> Junio?
>
> I am not sure what should happen when we can't drop the unmerged entry
> down to stage-0 due to D/F conflicts, though. IIRC, read-tree proper
> would not touch the work tree in such a case, but merge-recursive creates
> our and their versions with funny suffixes, which will not be known to the
> index and will be left in the working tree.
I am still unsure what we should do when we hit D/F conflicts; this one
simply replaces but it may be safer to drop ADD_CACHE_OK_TO_REPLACE from
the options to trigger an error in such a case. I dunno.
read-cache.c | 32 +++++++++++++++++++-------------
t/t1005-read-tree-reset.sh | 30 ++++++++++++++++++++++++++++++
2 files changed, 49 insertions(+), 13 deletions(-)
diff --git a/read-cache.c b/read-cache.c
index c229fd4..efbab6a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1489,25 +1489,31 @@ int write_index(const struct index_state *istate, int newfd)
int read_index_unmerged(struct index_state *istate)
{
int i;
- struct cache_entry **dst;
- struct cache_entry *last = NULL;
+ int unmerged = 0;
read_index(istate);
- dst = istate->cache;
for (i = 0; i < istate->cache_nr; i++) {
...

Looks good to me. And from my tests, I think "git checkout -f" didn't have this problem at all, because it ends up using not git read-tree, but doing its own "reset_tree()" that uses unpack_trees().

I do wonder if "git reset" should perhaps be written in those terms, instead of just being a wrapper around git read-tree. But the patch looks fine.

Linus
--
Let's do this for 'maint' and I'll let others think about possible improvements, then ;-). --
i've met this problem in various variants in the past few months, and i
always assumed that it's "as designed" - as Git's policy is to never
lose information unless forced to do so. (which i find very nice in
general, and which saved modifications from getting lost a couple of
times in the past)
the situations where i end up with a messed up working tree [using
git-c427559 right now]:
- doing a conflicted Octopus merge will leave the tree in some weird
half-merged state, with lots of untracked working tree files that not
even a hard reset will recover from. The routine thing i do to clean
up is:
git reset --hard HEAD
git checkout HEAD .
git ls-files --others | xargs rm # DANGEROUS
doing git checkout -f alone is not enough, as there might be various
dangling files left around.
- git auto-gc thinking that it needs to do another pass in the middle
of a random git operation, but i don't have 10 minutes to wait so i
decide to Ctrl-C it.
- doing the wrong "git checkout" and then Ctrl-C-ing it can leave the
working tree in limbo as well, needing fixups. If i'm stuck between
two branches that rename/remove files it might need the full fixup
sequence above.
- if a testbox has a corrupted system clock, its git repo and the
kernel build can get confused. This is to be expected i think - but
the full sequence above will recover the corrupted tree. Not much Git
can do about this i guess.
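
A somewhat less dangerous variant of the "ls-files --others | xargs rm" step is to let "git clean" do the removal, previewing it first with -n. The sketch below is an illustration under assumptions: the file names and the scratch directory are invented. Two differences from the pipeline are worth noting: "git clean" copes with file names containing spaces, and without -x it leaves ignored files (such as build output) alone, whereas plain "git ls-files --others" lists ignored files too.

```shell
#!/bin/sh
# Sketch: preview, then remove, untracked files with "git clean"
# instead of piping "git ls-files --others" into rm.
# Scratch repository; all names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo tracked > kept.c
echo '*.o' > .gitignore
git add kept.c .gitignore
git commit -q -m "initial"

echo stray > "dangling file.c"   # a space in the name would trip up plain xargs rm
echo obj > build.o               # ignored via .gitignore

# Dry run: lists what would be removed without touching anything.
preview=$(git clean -n -d)
echo "$preview"

# Real run: removes untracked files and directories, keeps ignored ones.
git clean -q -f -d
```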
Does your fix mean that all i have to do in the future is a hard reset
back to HEAD, and that dangling files are not supposed to stay around?
Ingo
--
As long as the index *somehow* knows about these new files, they are
removed.
The situation is:
(0) you start from a HEAD that does not have path xyzzy;
(1) you attempt to merge a rev that has path xyzzy;
(2) the merge conflicts, leaving higher staged index entries for the
path.
(3) you decide not to conclude the merge by saying "reset --hard".
The old logic for "reset" was to remove paths that exist in the index at
stage #0 (i.e. cleanly merged) and not in HEAD. The patch changes the
rule to remove paths that exist in the index at any stage (i.e. including
the ones that have conflicted and not resolved yet) and not in HEAD.
--
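Steps (0) to (3) above can be replayed in a scratch repository. The sketch below uses a modify/delete conflict to get a path that is absent from HEAD but present in the index at higher stages. That choice, the branch names and the temporary directory are assumptions made for the illustration. With a git that includes this fix, the final hard reset is expected to remove the path from the work tree as well.

```shell
#!/bin/sh
# Sketch of the (0)-(3) scenario: a failed merge brings in a path that
# HEAD does not have, and "git reset --hard" is used to abandon it.
# Scratch repository; all names other than xyzzy are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

git checkout -q -b base
echo v1 > xyzzy
echo keep > other
git add xyzzy other
git commit -q -m "base"

# (0) a HEAD that does not have path xyzzy
git checkout -q -b mine
git rm -q xyzzy
git commit -q -m "drop xyzzy"

# a rev that has (a modified) xyzzy
git checkout -q -b theirs base
echo v2 > xyzzy
git commit -q -a -m "modify xyzzy"
git checkout -q mine

# (1) + (2) the merge conflicts and leaves higher-stage entries for xyzzy
git merge theirs >/dev/null 2>&1 || true
stages=$(git ls-files -u -- xyzzy)
echo "$stages"

# (3) decide not to conclude the merge
git reset -q --hard

if [ -e xyzzy ]; then
	echo "xyzzy was left behind"
else
	echo "xyzzy was removed"
fi
```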
