Re: [PATCH] Fix "unpack-objects --strict"

Previous thread: [PATCH 3/6 (v3)] non-commit object support for rev-cache by Nick Edelen on Thursday, August 13, 2009 - 3:24 am. (1 message)

Next thread: git-blame missing output format documentation by Ori Avtalion on Thursday, August 13, 2009 - 5:18 am. (1 message)
From: Frank Lichtenheld
Date: Thursday, August 13, 2009 - 3:32 am

Hi.

I have a git repository where I include several submodules. This seemed to
work fine until the server I push to got (finally) updated from 1.5.something
to 1.6.4. Now I get an error if I try to push.

The issue is easily reproducible with a minimal repository for me:

Creating an empty repository on server:

flichtenheld@git-test:~$ git version
git version 1.6.4    <---- Directly compiled from git
flichtenheld@git-test:~$ mkdir test.git
flichtenheld@git-test:~$ cd test.git/
flichtenheld@git-test:~/test.git$ git init --bare
Initialized empty Git repository in /home/flichtenheld/test.git/

Creating repository on client:

frl@dhcp-rnd-054:~/tmp$ git version
git version 1.6.3.3   <--- From Debian Package
frl@dhcp-rnd-054:~/tmp$ mkdir test
frl@dhcp-rnd-054:~/tmp$ cd test/
frl@dhcp-rnd-054:~/tmp/test$ git init
Initialized empty Git repository in /home/frl/tmp/test/.git/

Add a random submodule:

frl@dhcp-rnd-054:~/tmp/test$ git submodule add git://repo.or.cz/git-browser.git subdir/git-browser
Initialized empty Git repository in /home/frl/tmp/test/subdir/git-browser/.git/
remote: Counting objects: 131, done.
remote: Compressing objects: 100% (67/67), done.
remote: Total 131 (delta 63), reused 131 (delta 63)
Receiving objects: 100% (131/131), 105.98 KiB, done.
Resolving deltas: 100% (63/63), done.
frl@dhcp-rnd-054:~/tmp/test$ git commit -a
[master (root-commit) 388b975] Add submodule
 2 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 subdir/git-browser

Try to push:

frl@dhcp-rnd-054:~/tmp/test$ git remote add origin ssh://gitadm/home/flichtenheld/test.git
frl@dhcp-rnd-054:~/tmp/test$ git push origin master
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 372 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
fatal: Error on reachable objects of 9664402120f411181d05a2f51ee06a475fb73d9b
error: unpack-objects exited with error ...
From: Frank Lichtenheld
Date: Thursday, August 13, 2009 - 4:19 am

Here is a "git config receive.fsckObjects true" missing. I have
this in my default config, and without it the error will not be
triggered.

Gruesse,
-- 
Frank Lichtenheld <frank@lichtenheld.de>
www: http://www.djpig.de/
--

From: Junio C Hamano
Date: Thursday, August 13, 2009 - 12:33 pm

When unpack-objects is run under the --strict option, objects that have
pointers to other objects are verified for the reachability at the end, by
calling check_object() on each of them, and letting check_object to walk
the reachable objects from them using fsck_walk() recursively.

The function however misunderstands the semantics of fsck_walk() function
when it makes a call to it, setting itself as the callback.  fsck_walk()
expects the callback function to return a non-zero value to signal an
error (negative value causes an immediate abort, positive value is still
an error but allows further checks on sibling objects) and return zero to
signal a success.  The function however returned 1 on some non error
cases, and to cover up this mistake, complained only when fsck_walk() did
not detect any error.

To fix this double-bug, make the function return zero on all success
cases, and also check for non-zero return from fsck_walk() for an error.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Caused by b41860b (unpack-objects: prevent writing of inconsistent
objects, 2008-02-25), which introduced these checks and also the code to
keep unverified objects in core until check_objects() verifies their
reachability.  While I think it is a good idea to check for incomplete
pack data, I do not think it is necessary to keep them in core.  We can
simply error out to signal the caller not to update the refs.

We probably should write everything as they become unpackable (i.e. as
their delta bases becomes available) while keeping track of object names
(but not data) of structured objects that we received, and running only
one level of reachability check on them at the end.  That would certainly
reduce the memory consumption and may simplify the complexity of the code
at the same time.

But I'll leave that to other people.  Hint, hint...

 builtin-unpack-objects.c       |    8 ++++----
 t/t5531-deep-submodule-push.sh |   32 ++++++++++++++++++++++++++++++++
 2 files changed, 36 ...
From: Martin Koegler
Date: Thursday, August 13, 2009 - 11:03 pm

This is neccessary to skip already written objects (eg. blobs,
obj_list[i].obj == NULL). The return code is not important in this
case.

I'm not sure, if fsck_walk can call check_object with obj == NULL
under some (rare) conditions. If yes, the return code should be

This would defeat the whole idea of the this check: If the
precondition (fsck returns OK for a repository) is met, unpack-objects
(and index-pack) with --strict should gurantee, that this is still
true after receiving the objects (even if somebody intentionally tries
to corrupt the repository).

As we assume, that all objects (and all their ancestors) already
present in the repository are OK, we only have to check new objects
and verify, that linked objects in the repository are of the correct
type.

As soon as we start to write objects without all linked objects
already present in the repository, the repository can get inconsistant.

Lets assume, unpack-objects/receive-pack is changed according to your proposal:
* writeout all objects, as received
* checking reachability of new objects
* update refs, if everything is OK

Then a corruption HOWTO would be:

To introduce a object with one of its linked objects missing, left it
out of the pack and push it into the repository. unpack-objects will
unpack all objects and fail updating the ref (but leave all objects in
the repository). As second step, simply send a ref update request,
which should succed, as the object is present in the repository.

A variant: set the SHA1 of the missing object to the SHA1 of a object
of another type. In the second step, you can sent this object to
unpack-objects, which will unpack it, as it does not know, that the
object is already referenced in the repository.

Deleting objects in the case of an error is also not an option, as a
parallel push operation could have already used the object.

mfg Martin Köger




--

From: Junio C Hamano
Date: Thursday, August 13, 2009 - 11:32 pm

Your "ref update request" exploit does not work because your understanding
of how we decide to allow updating a ref is flawed.

We do not blindly update a ref to a commit only because we happen to have
that commit.  We require that commit to reach existing tips of refs
without break.  The logic is in quickfetch() in builtin-fetch.c.

This stronger validation is necessary to deal with any failed transfer by
http walkers, so it is nothing unusual nor new.  They walk from the latest
commits that exist on the other side, and their object transfer can be
interrupted before they transfer enough older commits to make the history
connected with what we already had.  In such a case we obviously do not
update any ref.  And when we re-run the same request, we do not say "Ah,
we have the tip in the object database, so we will update the ref to it".

That is why I said requiring connectivity is a good idea, but keeping them
in core is a misguided waste of memory.


--

From: Martin Koegler
Date: Friday, August 14, 2009 - 12:19 am

I'm talking on the server side of a push operation (receive-pack), not
the client side. The patchset should prevent invalid data from
entering the repository, thereby preventing upload-pack (during further
fetch operation) and other git programs (eg. called from gitweb) from
failing/segfaulting.

mfg Martin Kögler
--

From: Junio C Hamano
Date: Friday, August 14, 2009 - 12:31 am

upload-pack won't feed starting from a random object for a reason, so your
worry is unfounded.

I'd agree gitweb may be a problem.  Ideally it should restrict itself from
anything reachable from the refs of he repository---patches welcome.
--

From: Junio C Hamano
Date: Friday, August 14, 2009 - 12:41 am

You can fix that issue by teaching write_rest() to check what it feeds

I think that is a sensible change to signal an error regardless.  For
example, fsck_walk_tree() will make a callback to you (meaning, walk()
function pointer points at your check_object() function) like this:

	while (tree_entry(&desc, &entry)) {
		int result;

		if (S_ISGITLINK(entry.mode))
			continue;
		if (S_ISDIR(entry.mode))
			result = walk(&lookup_tree(entry.sha1)->object, OBJ_TREE, data);

so while you are checking a tree object you received, upon hitting a
subtree of that tree, it will lookup_tree() it, and if that tree is
missing, you will be called with NULL.

On top of the previous patch, a fix would look like this, I think, but
please double check.

Thanks.

 builtin-unpack-objects.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index 2522c2d..bae00ea 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -181,7 +181,7 @@ static void write_cached_object(struct object *obj)
 static int check_object(struct object *obj, int type, void *data)
 {
 	if (!obj)
-		return 0;
+		return 1;
 
 	if (obj->flags & FLAG_WRITTEN)
 		return 0;
@@ -209,8 +209,10 @@ static int check_object(struct object *obj, int type, void *data)
 static void write_rest(void)
 {
 	unsigned i;
-	for (i = 0; i < nr_objects; i++)
-		check_object(obj_list[i].obj, OBJ_ANY, 0);
+	for (i = 0; i < nr_objects; i++) {
+		if (obj_list[i].obj)
+			check_object(obj_list[i].obj, OBJ_ANY, 0);
+	}
 }
 
 static void added_object(unsigned nr, enum object_type type,


--

From: Frank Lichtenheld
Date: Friday, August 14, 2009 - 2:30 am

I've applied this patch and your small follow-up patch here and the error
indeed disappears. I can't comment on the semantical correctness of the
patch.

Thanks,
-- 
Frank Lichtenheld <frank@lichtenheld.de>
www: http://www.djpig.de/
--

Previous thread: [PATCH 3/6 (v3)] non-commit object support for rev-cache by Nick Edelen on Thursday, August 13, 2009 - 3:24 am. (1 message)

Next thread: git-blame missing output format documentation by Ori Avtalion on Thursday, August 13, 2009 - 5:18 am. (1 message)