Re: [PATCH] fetch: refuse to fetch into the current branch in a non-bare repository

Previous thread: Re: [PATCH] "git diff <tree>{3,}": do not reverse order of arguments by Junio C Hamano on Saturday, October 11, 2008 - 4:29 am. (2 messages)

Next thread: [PATCH] Introduce core.keepHardLinks by Johannes Schindelin on Saturday, October 11, 2008 - 4:45 am. (12 messages)
From: Johannes Schindelin
Date: Saturday, October 11, 2008 - 4:38 am

Some confusing tutorials suggest that it would be a good idea to call
something like this:

	git pull origin master:master

While it might make sense to store what you want to merge, it typically
is plain wrong.  Especially so when the current branch is &quot;master&quot;.

Be at least a little bit helpful by refusing to fetch something into
the current branch.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
---
 builtin-fetch.c   |   20 ++++++++++++++++++++
 t/t5505-remote.sh |    2 +-
 t/t5510-fetch.sh  |    6 ++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index ee93d3a..d701550 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -534,6 +534,25 @@ static void find_non_local_tags(struct transport *transport,
 	string_list_clear(&amp;new_refs, 0);
 }
 
+static void check_ref_map(struct ref *ref_map)
+{
+	int flag;
+	unsigned char sha1[20];
+	const char *HEAD;
+
+	if (is_bare_repository())
+		return;
+
+	HEAD = resolve_ref(&quot;HEAD&quot;, sha1, 1, &amp;flag);
+
+	if (!HEAD || !(flag &amp; REF_ISSYMREF))
+		return;
+
+	for (; ref_map; ref_map = ref_map-&gt;next)
+		if (ref_map-&gt;peer_ref &amp;&amp; !strcmp(HEAD, ref_map-&gt;peer_ref-&gt;name))
+			die(&quot;Refusing to fetch into current branch&quot;);
+}
+
 static int do_fetch(struct transport *transport,
 		    struct refspec *refs, int ref_count)
 {
@@ -558,6 +577,7 @@ static int do_fetch(struct transport *transport,
 	}
 
 	ref_map = get_ref_map(transport, refs, ref_count, tags, &amp;autotags);
+	check_ref_map(ref_map);
 
 	for (rm = ref_map; rm; rm = rm-&gt;next) {
 		if (rm-&gt;peer_ref)
diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index c449663..0103e1a 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -188,7 +188,7 @@ test_expect_success 'prune --dry-run' '
 test_expect_success 'add --mirror &amp;&amp; prune' '
 	(mkdir mirror &amp;&amp;
 	 cd mirror &amp;&amp;
-	 git init &amp;&amp;
+	 git init --bare &amp;&amp;
 	 git remote add --mirror -f origin ../one) &amp;&amp;
 	(cd one &amp;&amp;
 	 git branch -m side2 ...
From: Junio C Hamano
Date: Saturday, October 11, 2008 - 2:44 pm

I am somewhat confused.

This &quot;confusion&quot; has been there for very long time and (at least the
scripted version of) git-pull/git-fetch pair has supported a workaround in
the form of --update-head-ok option.

Have we broken the workaround ll.127..147 in git-pull.sh recently, or are
you trying to address something else?
--

From: Shawn O. Pearce
Date: Sunday, October 12, 2008 - 11:47 am

The description is confusing, yes.  It should be about git fetch,

I think &quot;git fetch url side:master&quot; when master is the current branch
and we have omitted --update-head-ok is broken.  Specifically Dscho's
last hunk which adds this test.  The test fails on current master.

Looking at the code in builtin-fetch.c, the only usage of
update_head_ok is for output about the current branch.  I think
it should have been used in at least one other spot, to decide if
the RHS of a refspec is valid for storage.  Dscho's patch tries to
address that.

diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 9aae496..cd8b550 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -323,4 +323,10 @@ test_expect_success 'auto tag following fetches minimum' '
        )
 '

+test_expect_success 'refuse to fetch into the current branch' '
+
+	test_must_fail git fetch . side:master
+
+'
+
 test_done


-- 
Shawn.
--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 2:28 am

Hi,


However, it was a &quot;git pull origin master:master&quot; that was the issue.  


Thanks for testing; I ran out of my Git time budget yesterday.  Will send 
an updated patch as a reply to Daniel's reply.

Ciao,
Dscho
--

From: Shawn O. Pearce
Date: Sunday, October 12, 2008 - 11:52 am

I'd rather see local variables named lowercase.  Constants should

This should only be called if update_head_ok is false.  So maybe:

	if (!update_head_ok)

Repeat this test, but with --update-head-ok and expect success,
since the check_ref_map logic is conditional on that?

-- 
Shawn.
--

From: Daniel Barkalow
Date: Sunday, October 12, 2008 - 1:37 pm

remote.h has a function for getting &quot;the current branch&quot;, which would save 
5 lines here:

	struct branch *current_branch = branch_get(NULL);
	if (!current_branch || is_bare_repository())

		!strcmp(current_branch-&gt;ref_name, ref_map-&gt;peer_ref-&gt;name)

--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 2:36 am

Some confusing tutorials suggested that it would be a good idea to fetch
into the current branch with something like this:

	git fetch origin master:master

(or even worse: the same command line with &quot;pull&quot; instead of &quot;fetch&quot;).
While it might make sense to store what you want to pull, it typically
is plain wrong when the current branch is &quot;master&quot;.

As noticed by Junio, this behavior should be triggered by _not_ passing
the --update-head-ok option, but somewhere along the lines we lost that
behavior.

NOTE: this patch does not completely resurrect the original behavior
without --update-head-ok: the check for the current branch is now _only_
performed in non-bare repositories.

Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
---

	On Sun, 12 Oct 2008, Daniel Barkalow and Shawn Pearce sent some 
	good suggestions to fix the first version of the patch.

	This is the updated version.  In particular, the test is only 
	performed without the option &quot;--update-head-ok&quot;, the function 
	branch_get(NULL) is called to get the current branch, a test was 
	added to verify that --update-head-ok works as expected, and the 
	commit message was modified in the hope that it confuses less now.

	Strangely, some more tests refused to pass this time, because they
	did not use --update-head-ok; this was fixed, too.

 builtin-fetch.c             |   15 +++++++++++++++
 t/t5405-send-pack-rewind.sh |    2 +-
 t/t5505-remote.sh           |    2 +-
 t/t5510-fetch.sh            |   12 ++++++++++++
 t/t9300-fast-import.sh      |    2 +-
 5 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index ee93d3a..57c161d 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -534,6 +534,19 @@ static void find_non_local_tags(struct transport *transport,
 	string_list_clear(&amp;new_refs, 0);
 }
 
+static void check_not_current_branch(struct ref *ref_map)
+{
+	struct branch *current_branch = branch_get(NULL);
+
+	if (is_bare_repository() || ...
From: Shawn O. Pearce
Date: Monday, October 13, 2008 - 7:09 am

Not strange, --update-head-ok was busted and the tests took advantage
of it.  :-\
 
-- 
Shawn.
--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 10:57 am

Hi,


Heh.  My &quot;this time&quot; was meant as &quot;since the first revision of the patch&quot;.  
I really tested the first revision, promise!  And those tests did not fail 
back then.  Or I am even stupider than I am prepared to admit.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Monday, October 13, 2008 - 7:23 am

Do you mean, by &quot;this behavior should be triggered&quot;, &quot;we should allow
updating the current branch head only when --update-head-ok is given&quot;, in
other words, &quot;we should error out if the user tries to update the current


We need to look at these changes a bit carefully, as changes to existing
tests can be either (1) fixing those that depended on broken behaviour of
the command, or (2) trying to hide regressions introduced by the patch

I suspect all of these offending tests came after b888d61 (Make fetch a
builtin, 2007-09-10) which lacked the necessary check in do_fetch() to
cause the regression you are fixing (iow, I am suspecting that the
brokenness of the tests were hidden by the breakage you are fixing).  The
parts of the tests you fixed came from these:

    6738c81 (send-pack: segfault fix on forced push, 2007-11-08)
    4ebc914 (builtin-remote: prune remotes correctly ..., 2008-02-29)
    4942025 (t5510: test &quot;git fetch&quot; following tags minimally, 2008-09-21)
    03db452 (Support gitlinks in fast-import., 2008-07-19)
    
all of which are indeed descendants of b888d61.

With these verified, I think I should move the &quot;Strangely&quot; comment to the
commit log message proper (after stripping &quot;Strangely&quot; part -- it is not
strange anymore after we understand why).
--

From: Junio C Hamano
Date: Monday, October 13, 2008 - 10:30 am

Proposed amendment to the commit log message is:

    Fix fetch/pull when run without --update-head-ok
    
    Some confusing tutorials suggested that it would be a good idea to fetch
    into the current branch with something like this:
    
    	git fetch origin master:master
    
    (or even worse: the same command line with &quot;pull&quot; instead of &quot;fetch&quot;).
    While it might make sense to store what you want to pull, it typically is
    plain wrong when the current branch is &quot;master&quot;.  This should only be
    allowed when (an incorrect) &quot;git pull origin master:master&quot; tries to work
    around by giving --update-head-ok to underlying &quot;git fetch&quot;, and otherwise
    we should refuse it, but somewhere along the lines we lost that behavior.
    
    The check for the current branch is now _only_ performed in non-bare
    repositories, which is an improvement from the original behaviour.
    
    Some newer tests were depending on the broken behaviour of &quot;git fetch&quot;
    this patch fixes, and have been adjusted.
    
    Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
    Acked-by: Shawn O. Pearce &lt;spearce@spearce.org&gt;
    Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;

Thanks for the fix.
--

From: Johannes Schindelin
Date: Monday, October 13, 2008 - 11:12 am

Hi,

thanks for doing the due diligence that I should have done (but I ran out 
of time).


This really wants to make sure that no objects are shared or hard-linked 
between the repository in &quot;trash directory/&quot; and the one in its 
subdirectory &quot;another/&quot;.  It predates &quot;test_must_fail&quot;, too, it seems.

It never touches the working directory &quot;another/&quot;, so using 

This tests &quot;git remote add&quot;'s --mirror option.


This test is actually not fixed, but contains two test cases for the issue 

This creates an empty repository for tests to fast-import, and fetches 
into the current (not yet existing) branch.

I actually understand now why the tests started failing: the change from 
resolve_ref() to get_branch() as requested by Daniel are at fault: 
get_branch() does not check if the branch has an initial commit.

I am actually regretting making this change.  Daniel, do you agree that it 
might be better to change back to resolve_ref(), so that the initial 
complaint (IIRC Han-Wen git pull'ed into a freshly initialized repository 
with that utterly bogus &quot;git pull origin master:master&quot; line) is not 

The only test that would need fixing after reverting back to resolve_ref() 
would be the &quot;remote add --mirror&quot; thing, which I think should be fine.

Ciao,
Dscho

--

From: Daniel Barkalow
Date: Monday, October 13, 2008 - 1:05 pm

Is it, in fact, okay to fetch into the current branch if it's &quot;yet to be 
born&quot;? I feel like it shouldn't be, since you'll get exactly the same 
problem that you would if the branch already existed: the index reflects 
the previous state (in this case, it's empty), so git will show that 
you've staged removing all of the files, right? So this would make the 
check for --update-head-ok more strict than before, but I think the 
behavior change is correct.

	-Daniel
*This .sig left intentionally blank*
--

From: Johannes Schindelin
Date: Tuesday, October 14, 2008 - 2:49 am

Hi,


I think 
http://thread.gmane.org/gmane.comp.version-control.git/31351/focus=31544 
is the best link to see what Han-Wen said.  Granted, it was a 
misunderstanding on his part, but there have been quite a few people with 
the same misunderstanding.

So what they did was

	$ mkdir just-one-branch
	$ cd just-one-branch
	$ git init
	$ git remote add origin &lt;url&gt;
	$ git pull origin master:master

And this _will_ work correctly.  Except when using get_branch(NULL) 
instead of the validating resolve_ref().

When we talk about not breaking existing behavior, we have to talk about 
this behavior, too.

So, my vote is to revert back to resolve_ref(), even if it needs more 
lines.

Thoughts of others?

Ciao,
Dscho

--

From: Shawn O. Pearce
Date: Tuesday, October 14, 2008 - 8:02 am

Yes, I agree, resolve_ref() is the best thing to be using here,
even if it is more code.  get_branch() validates the commit and we
don't want that.  We really just want to know if the current branch
is going to be updated, we don't care to what/why.

-- 
Shawn.
--

From: Daniel Barkalow
Date: Tuesday, October 14, 2008 - 9:04 am

It doesn't validate the commit; it doesn't even validate the symref. The 
resolve_ref()-using code validates the symref, and I think that's an 
error; we also don't care what state we'd update the current branch from.

	-Daniel
*This .sig left intentionally blank*
--

From: Johannes Schindelin
Date: Tuesday, October 14, 2008 - 9:15 am

Hi,


Actually, get_branch() does _not_ validate.  Which is why it thinks that 
the current branch is &quot;master&quot; even if there is no commit yet.

OTOH, resolve_ref() reads the SHA-1 of the ref, which fails in the case 
that there has not been any commit yet.

Still, I think that it would be nice to allow &quot;git pull origin 
master:master&quot; in a freshly initied non-bare repository, so I like 
resolve_ref() better.

Ciao,
Dscho

--

From: Daniel Barkalow
Date: Tuesday, October 14, 2008 - 8:57 am

&quot;git pull origin master:master&quot; invokes &quot;git fetch&quot; with --update-head-ok, 
so it doesn't matter for this case what &quot;git fetch&quot; does without 
--update-head-ok.

The check would block:

$ git fetch origin master:master

but this also would fail to update the working directory and would leave 
the index as if you're removing everything.

	-Daniel
*This .sig left intentionally blank*
--

From: Johannes Schindelin
Date: Tuesday, October 14, 2008 - 9:17 am

Hi,


Does it?  You're correct.  I do not like it.

Ciao,
Dscho

--

From: Daniel Barkalow
Date: Tuesday, October 14, 2008 - 9:52 am

The reason that it runs with --update-head-ok (and the reason that 
--update-head-ok exists in the first place) is that, when you're doing a 
pull, if you fetch into the current branch, pull will identify that you've 
actually fast-forwarded the current branch and will update the working 
tree and index accordingly (which it's allowed to do because it's expected 
to perform a merge in the working tree and index).

That is, it uses --update-head-ok because &quot;git pull origin master:master&quot; 
will work correctly, regardless of whether the local master is 
yet-to-be-born or not.

	-Daniel
*This .sig left intentionally blank*
--

From: Daniel Barkalow
Date: Tuesday, October 14, 2008 - 10:02 am

In particular, the --update-head-ok is from b10ac50f, which is what added 
the check to prohibit fetching into the current branch otherwise, back in 
August 2005. There's never been anything preventing updating the current 
branch using &quot;pull&quot;.

	-Daniel
*This .sig left intentionally blank*
--

From: Junio C Hamano
Date: Tuesday, October 14, 2008 - 3:07 pm

What you wrote above is correct, and it is all that is necessary to make
&quot;pull master:master&quot; (when 'master' is the current branch) work correctly,
in normal situations.  Dscho's patch is in itself is a good thing to do,
but is not necessary when the caller gives --update-head-ok.

Nor is it sufficient.  Have you tested the current &quot;git-pull&quot; with or
without Dscho's patch applied to &quot;git-fetch&quot; when 'master' is an unborn
branch?  &quot;git-pull&quot; has this piece:

        curr_head=$(git rev-parse --verify HEAD 2&gt;/dev/null)
        if test &quot;$curr_head&quot; != &quot;$orig_head&quot;
        then
                # The fetch involved updating the current branch.

                # The working tree and the index file is still based on the
                # $orig_head commit, but we are merging into $curr_head.
                # First update the working tree to match $curr_head.

                echo &gt;&amp;2 &quot;Warning: fetch updated the current branch head.&quot;
                ...
        fi

The above part (written by yours truly) did not care what happens when the
current branch is unborn.  Back then when git-pull was originally written,
nobody was crazy enough to pull into an empty branch and expect it to
work.  The code to allow that craziness (look for &quot;initial pull&quot; in the
same script to find a 6-line block near the end) came much later, and the
commit that added the &quot;initial pull&quot; support should have adjusted the
above codeblock if it really wanted to support pulling into an unborn
branch, but it forgot to do so.  It also forgot to handle the case where
the originally unborn current branch was fetched into.

If we really care about is &quot;git pull origin master:master&quot; into an unborn
branch, I think we need the attached patch on top, and this is regardless
of Dscho's patch that tightens the check when -update-head-ok is not given.

 git-pull.sh |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git c/git-pull.sh w/git-pull.sh
index 75c3610..664fe34 100755
--- c/git-pull.sh
+++ ...
From: Daniel Barkalow
Date: Monday, October 13, 2008 - 10:08 am

Acked-by: Daniel Barkalow &lt;barkalow@iabervon.org&gt;
--

Previous thread: Re: [PATCH] "git diff <tree>{3,}": do not reverse order of arguments by Junio C Hamano on Saturday, October 11, 2008 - 4:29 am. (2 messages)

Next thread: [PATCH] Introduce core.keepHardLinks by Johannes Schindelin on Saturday, October 11, 2008 - 4:45 am. (12 messages)