Re: format-patch broken [Was: fetch and bundle don't work in (semi-)broken repo]

Previous thread: Re: Converting to Git using svn-fe (Was: Speeding up the initial git-svn fetch) by Stephen Bash on Tuesday, October 19, 2010 - 7:57 am. (1 message)

Next thread: For your interest by Srood Sherif on Tuesday, October 19, 2010 - 11:00 am. (1 message)
From: Uwe Kleine-König
Date: Tuesday, October 19, 2010 - 9:09 am

Hi,

I have a repo that got broken somehow (don't know the exact details,
probably because it is shared with another repo and I rewrote history).
Now I want to fetch one branch to a different repo (that happens to be
the alternate of the first one, but I think this is unrelated):

	ukl@hostname:~/path1/linux-2.6$ git fetch ~/path2/linux-2.6 sectionmismatches
	remote: Counting objects: 118, done.
	remote: error: unable to find 40aaeb204dc04d3cf15c060133f65538b43b13b0
	remote: Compressing objects: 100% (83/83), done.
	remote: fatal: unable to read 40aaeb204dc04d3cf15c060133f65538b43b13b0
	error: git upload-pack: git-pack-objects died with error.
	fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
	remote: aborting due to possible repository corruption on the remote side.
	fatal: protocol error: bad pack header

I don't know what 40aaeb204dc04d3cf15c060133f65538b43b13b0 is, but I
think it's not necessary for the sectionmismatches branch:

	ukl@hostname:~/path2/linux-2.6$ git format-patch linus/master..sectionmismatches
	0001-wip-enable-DEBUG_SECTION_MISMATCH.patch
	0002-ARM-sa1111-move-__sa1111_probe-to-.devinit.text.patch
	0003-ARM-omap1-nokia770-mark-some-functions-__init.patch
	0004-ARM-omap-fb-move-omap_init_fb-to-.init.text.patch
	0005-ARM-omap-fb-move-omapfb_reserve_sram-to-.init.text.patch
	0006-ARM-omap-fb-move-get_fbmem_region-to-.init.text.patch
	0007-ARM-omap-move-omap_get_config-et-al.-to-.init.text.patch
	0008-wip-ARM-omap-move-omap_board_config_kernel-to-.init..patch
	0009-ARM-omap-ams-delta-move-config-to-.init.data.patch
	0010-MTD-pxa2xx-move-pxa2xx_flash_probe-to-.devinit.text.patch
	0011-VIDEO-sa1100fb-register-driver-using-platform_driver.patch
	0012-ARM-s3c64xx-don-t-put-smartq_bl_init-in-.init.text.patch
	0013-ARM-s3c64xx-don-t-put-smartq7_leds-in-.init.data.patch
	0014-ARM-s3c64xx-don-t-put-smartq5_leds-in-.init.data.patch
	0015-ARM-nomadik-move-nmk_gpio_probe-to-.devinit.text.patch

and linus/master is ...
From: Jonathan Nieder
Date: Tuesday, October 19, 2010 - 11:39 am

Hi,


Sounds like alternates or workdir allowed gc to be overzealous, indeed.

Could you:

 1. Make a copy of the corrupted repo, just in case.
 2. Explode all packs with "git unpack-objects"
 3. Identify the missing object, as explained in
    Documentation/howto/recover-corrupted-blob-object.txt?

With that information, it would be easier to examine whether and how

Cc-ing Nico, pack-objects wizard.

Thanks for reporting.
Jonathan
--

From: Uwe Kleine-König
Date: Tuesday, October 19, 2010 - 1:11 pm

Hi Jonathan,

I did:

	mv .git/objects/pack .git/objects/pack.bak
	rm .git/objects/info/alternates
	for p in .git/objects/pack.bak/*.pack ~/path1/linux-2.6/.git/objects/pack/*.pack; do
		git unpack-objects < $p
	done

Thanks for helping

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
--

From: Nicolas Pitre
Date: Tuesday, October 19, 2010 - 1:48 pm

That's usually unnecessary.  If the pack itself is corrupted, trying to 

Ouch!  You will end up with a multi-gigabyte repository, which will be 

That's useful when you have only one corrupted object and you want to 
recreate it from raw material.  But ideally you should simply find a 
pack that contains the problematic object in another repository and copy 
it with its index 

Given that you exploded your repo into loose objects, it'll take _time_.


Nicolas
From: Jonathan Nieder
Date: Tuesday, October 19, 2010 - 2:02 pm

Hi,

[out of order for convenience]

Yep, I gave bad advice. :(  Especially because I forgot that a fsck
would be useful at all.

Better advice would be:

 1. Use "git rev-list --objects" to find out what 40aaeb204dc was.

And if that doesn't work:

 2. Run "git fsck", with packs intact.  This will take a while.  The
    result would include a list of missing objects (like 40aaeb204dc),
    and, most importantly, their type.
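
A minimal sketch of step 1, assuming the abbreviated ID is matched
against the start of each line of "git rev-list --objects" output (one
"<id> <path>" pair per line for trees and blobs).  The helper name
find_object is made up for this illustration, and in a repository with
missing objects the rev-list walk itself may abort before it reaches
the entry:

```shell
# find_object is a hypothetical helper, not a git command.
# It reads "git rev-list --objects" output on stdin and prints the
# lines whose object ID starts with the given (possibly abbreviated) ID.
find_object() {
	grep -- "^$1"
}

# Example use inside a repository (not run here):
#   git rev-list --objects --all | find_object 40aaeb204dc
```

A matching line would name the path the object was recorded under,
which tells you which file to look for elsewhere.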

Following howto/recover-corrupted-blob-object.txt would be useful for
identifying a corrupt loose object, but iiuc no corrupt objects are

I assume the object is gone for good, but if you have it in another
repo that would be interesting, too.

To be clear: I think the important data has been recovered from the
broken repo already in the form of patches (right?) so the question
at hand is whether it would be possible to teach git to do better at
recovering automatically.  Which might depend on the nature of the
missing objects.

Ciao,
Jonathan
--

From: Nicolas Pitre
Date: Tuesday, October 19, 2010 - 8:06 pm

Sure.  Given that it is possible to create a patch series, that means 
that all the important objects are still available.  Therefore Git 
should be able to produce the pack for the equivalent fetch/bundle as 
well.

So the following patch should help.  I hope that Uwe still has a copy of 
the broken repo to test this patch with.

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f8eba53..691c2f1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1299,6 +1299,16 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 		src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz);
 		read_unlock();
-		if (!src->data)
+		if (!src->data) {
+			if (src_entry->preferred_base) {
+				/*
+				 * Those objects are not included in the
+				 * resulting pack.  Be resilient and ignore
+				 * them, in case the pack could be created
+				 * nevertheless.
+				 */
+				return 0;
+			}
 			die("object %s cannot be read",
 			    sha1_to_hex(src_entry->idx.sha1));
+		}
 		if (sz != src_size)


Nicolas
--

From: Uwe Kleine-König
Date: Wednesday, October 20, 2010 - 12:41 am

Hello Nico,

Doesn't help :-(  I added a warning(...) before your return 0, and I
don't see it.  Probably this means this is not the problematic code
path.

The output with your patch applied is:
	user@hostname:~/path/linux-2.6$ ~/gsrc/git/bin-wrappers/git bundle create tra linus/master..sectionmismatches
	Counting objects: 118, done.
	error: unable to find 40aaeb204dc04d3cf15c060133f65538b43b13b0
	Delta compression using up to 8 threads.
	fatal: object 3cf4fa25ab3d078a49e9488effaebf571fa128da cannot be read
	error: pack-objects died

If you want I can provide you the broken repo.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
--

From: Nicolas Pitre
Date: Wednesday, October 20, 2010 - 6:38 am

Yes, please.


Nicolas
From: Uwe Kleine-König
Date: Thursday, October 21, 2010 - 12:11 am

Hmm, I just unpacked the archive in a separate directory, removed
.git/objects/info/alternates and then git format-patch
linus/master..sectionmismatches fails in a different way:

	fatal: Invalid revision range linus/master..sectionmismatches

I guess adding a pristine copy of Linus' tree should do the trick.

[ ... some time later ... creating a fresh clone takes quite some time ... ]

No, that's not enough, I will handpick some objects from the original.

Ah, you only need 16edb8381f2f2dabec9cc59f4a3d8c9ead899668 to make
format-patch work, but still 09b3f464a50111071f7740056b98fa8f36133347 is
missing for this tree.  This doesn't hurt format-patch as it's enough
for it to know that this entry didn't change.  So format-patch needs
less information than bundle/fetch and it's OK that the former succeeds
and the latter fails.

[...]

No, that's not the (only) problem,
40aaeb204dc04d3cf15c060133f65538b43b13b0 is needed, git format-patch is
just ignorant enough and invents something different:

	username@hostname:~/path/linux-2.6$ git rev-list linus/master..sectionmismatches
	eb84720860a90769473b42215a4cb67ee5efe7a7
	2e14a5c831032fa489384763087f4a03d88607cb
	00b18e8058e98927e2e4eae32deae7e58f47467c
	1ad328f663128b5c6e6b4af1ac2da1b443dba530
	2a0e4c23a34c78891db685b2b4851705fd36d656
	089d061c26b00a5b8dbb9e70b81d36a97e1daded
	b7ce4ec88f1bdfbe49fa7ef12df8f985d705605a
	b40acb01793933cd6baaaf826f3fef6dd734f72b
	780e3d47d067b54b17bcac3794d62825e8e60422
	ce06129cf7bbf85afe4fc127afc957d36ba4e9e4
	c2172d687578e7eb037a232802a4a8c6de1b0eea
	0c23684f39714a72f54036ca2be36e8894794b66
	cea2a0668ee1a9dc3617a810954a41c7701a08e9
	2bd6ff604ac3aa4c96636dda1ad80a289205ccba
	7591700d538d08f2e8327bb439b6cb0488e13f3e

	username@hostname:~/path/linux-2.6$ git diff-tree -r 7591700d538d08f2e8327bb439b6cb0488e13f3e
	7591700d538d08f2e8327bb439b6cb0488e13f3e
	:100644 100644 1b4afd2e6ca089de0babdacc5781426ef118da5c 40aaeb204dc04d3cf15c060133f65538b43b13b0 M	lib/Kconfig.debug

	commit ...
From: Uwe Kleine-König
Date: Thursday, October 21, 2010 - 1:12 am

Hello,

That was easy:

	git hash-object -w lib/Kconfig.debug

Now git bundle works again.
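
Why this works: an object's ID is the SHA-1 of a short header ("blob",
the size in bytes, a NUL byte) followed by the file content, so a
working-tree file that still has the original content hashes back to
the missing ID.  A small sketch of that computation, useful for
checking the ID before writing anything.  blob_id is a name invented
here, sha1sum from coreutils is assumed to be installed, and the sketch
ignores any content conversion (such as line-ending filters) that
git hash-object may apply:

```shell
# blob_id is a hypothetical helper: print the ID git assigns to a blob
# holding the raw bytes of file $1 (nothing is written to the repository).
blob_id() {
	size=$(wc -c < "$1")
	size=$((size))		# strip any padding some wc versions print
	{ printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# Example use (not run here): compare with the ID the fetch complained about
#   [ "$(blob_id lib/Kconfig.debug)" = 40aaeb204dc04d3cf15c060133f65538b43b13b0 ]
```

If the IDs match, "git hash-object -w" on the same file recreates
exactly the object that was lost.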

Nicolas: I forgot to say that I needed a pristine clone of Linus' repo
as an alternate to get it running.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
--

From: Nicolas Pitre
Date: Thursday, October 21, 2010 - 8:53 pm

No it is not.  In theory both format-patch and fetch/bundle should 

Or rather the low-level diff code.

diff --git a/diff.c b/diff.c
index 4732b32..b2839f9 100644
--- a/diff.c
+++ b/diff.c
@@ -2386,10 +2386,14 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 	}
 	else {
 		enum object_type type;
-		if (size_only)
+		if (size_only) {
 			type = sha1_object_info(s->sha1, &s->size);
-		else {
+			if (type < 0)
+				die("unable to read %s", sha1_to_hex(s->sha1));
+		} else {
 			s->data = read_sha1_file(s->sha1, &type, &s->size);
+			if (!s->data)
+				die("unable to read %s", sha1_to_hex(s->sha1));
 			s->should_free = 1;
 		}
 	}


Nicolas
From: Uwe Kleine-König
Date: Wednesday, October 20, 2010 - 12:59 am

Hello,

Well it took 34 minutes, which is OK I guess.

I will study the output a bit now.

For the interested (all lines matching "dangling" removed):

	22:10:57 I: Started git fsck --full
	22:44:14 O: broken link from    tree 519af383e181399db929823299bbd14c04b4229a
	22:44:14 O: to    tree d58c333c44672cb933df5a353dfb63ac571964e8
	22:44:14 O: broken link from  commit e8f7f6a23979c398249a15fb71b3e52dae933fa3
	22:44:14 O: to    tree 7f22979d86cf00c8bd3487feb973353ab5a1beee
	22:44:14 O: broken link from  commit 3164f6598ae44703a89822ced9746c1876ba7fab
	22:44:14 O: to    tree 1017bb1f45b8527ee3c7cfc30288b8098bcf0915
	22:44:14 O: broken link from  commit 124dde2ea387dc9509b0a5574c6f44f7d348a65d
	22:44:14 O: to    tree e4d0ac236995847e4e1d15c6d0afb47787255703
	22:44:14 O: broken link from  commit 60deff2fffd90b217d90284295d5a910f21fe98e
	22:44:14 O: to    tree 18bb32cfd08228820f929d62e63933fe2896b424
	22:44:14 O: broken link from  commit 0b84e651b84dba73772fda15a8a66de8cc274af0
	22:44:14 O: to    tree f8939a09d73b78459381b7991423529592e66324
	22:44:14 O: broken link from  commit e0de1d3c3355f9b1e3474417f05657a1041e7c8a
	22:44:14 O: to    tree 776ad9ac45dab11f2644151a690e1035789a49b6
	22:44:14 O: broken link from  commit 76d1acb95eef413a2501a63cb7f7f4036b71ed37
	22:44:14 O: to  commit f6b6cb2336198913371e66664f28c135df01aea5
	22:44:14 O: broken link from    tree bb473ad85c260b6a1659aa2059cac23b337842e3
	22:44:14 O: to    tree e035bc14698cc3e9abfca1a174feacb25e7e262a
	22:44:14 O: broken link from    tree bb473ad85c260b6a1659aa2059cac23b337842e3
	22:44:14 O: to    tree 8908b2458c1a2c6a6db81e88d96a01aa9a89abe5
	22:44:14 O: broken link from    tree ee35b3a549f45830ed50eb1032836a71ab2b7886
	22:44:14 O: to    tree f2f33722af4b5e32ac17f914cf24cc96c6e80077
	22:44:14 O: broken link from    tree ee35b3a549f45830ed50eb1032836a71ab2b7886
	22:44:14 O: to    tree 70f0188991b8406ec6ec75a504cf50c778fc1001
	22:44:14 O: broken link from    tree 1772732da7d4751d3c0febd7b0ceee61a84702f0
	22:44:14 O: ...
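
The targets of the "broken link" pairs are the objects that are
actually missing.  A rough sketch for pulling them out of a log like
the one above, assuming every target line ends in "to <type> <id>"
(with or without the timestamp prefix).  missing_from_fsck is a name
made up for this example:

```shell
# missing_from_fsck is a hypothetical helper: read fsck output on stdin
# and print each distinct "<type> <id>" that a broken link points to.
missing_from_fsck() {
	awk 'NF >= 3 && $(NF-2) == "to" { print $(NF-1), $NF }' | sort -u
}

# Example use (not run here):
#   git fsck --full 2>&1 | missing_from_fsck
```

Several links can point at the same object, so the list is
de-duplicated before printing.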