Re: tree corrupted on disk quota full

Previous thread: [PATCH] Teach the revision walker to walk by reflogs with --walk-reflogs by Johannes Schindelin on Thursday, January 11, 2007 - 3:47 am. (2 messages)

Next thread: Re: MinGW port (was: Re: [ANNOUNCE] qgit4 aka qgit ported to Windows) by Johannes Sixt on Thursday, January 11, 2007 - 6:24 am. (1 message)
From: Michael S. Tsirkin
Date: Thursday, January 11, 2007 - 5:57 am

I hit a quota limit on a disk my tree was on, but did not notice.
Doing git pull on a tree seems to have corrupted it.
Now I have:

$ git-fsck-objects
error: 4d4d30be967d3284cbf59afd4fba6ab536e295f5: object not found
error: c03590b581d51d5fa43adbef9415e935d0229412: object not found
missing tree 10147d79b2418168d9433067b6439971bd4f1261
broken link from  commit 322a6c93ad86d2a151dd97a4c6b0e014a4893437
              to    tree 10147d79b2418168d9433067b6439971bd4f1261
dangling commit 322a6c93ad86d2a151dd97a4c6b0e014a4893437

The tree can not be pulled into, or from.

-- 
MST
-

From: Andreas Ericsson
Date: Thursday, January 11, 2007 - 6:40 am

Can you do a "git rev-list" on the commit pointing to this tree? If so, 
you should be able to do "git reset HEAD~1" and re-do the fetch.

Otoh, this is curious. Aren't tree- and blob-objects written to disk 
before the commit object pointing to them? If not, how can we claim to 
support atomic commits? I imagined things were written in this order

	blob -> tree -> commit

seeing as the dependency chain goes the other way.

"git repair" could easily be cooked up by finding the first complete 
commit and resetting current HEAD and all branches pointing to it to 
that first complete commit. Issuing a fresh "fetch" after that should 
automagically fix everything for you.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
-

From: Andy Whitcroft
Date: Thursday, January 11, 2007 - 6:43 am

Well we do have some dubious code which can fail in the face of short
writes when the disk is full.  Those I believe are now plugged following
my short i/o series which is in master as far as I know.

I wonder if we can cook up a test with a loop back mount on a tiny ramfs
that will fail in this way.

-apw
-

From: Michael S. Tsirkin
Date: Thursday, January 11, 2007 - 6:59 am

$ git reset --hard HEAD~1
HEAD is now at 2d41bf8... Remove svn keywords
$ git-fsck-objects
error: 4d4d30be967d3284cbf59afd4fba6ab536e295f5: object not found
error: c03590b581d51d5fa43adbef9415e935d0229412: object not found
missing tree 10147d79b2418168d9433067b6439971bd4f1261
broken link from  commit 322a6c93ad86d2a151dd97a4c6b0e014a4893437
              to    tree 10147d79b2418168d9433067b6439971bd4f1261
dangling commit 322a6c93ad86d2a151dd97a4c6b0e014a4893437

-- 
MST
-

From: Andreas Ericsson
Date: Thursday, January 11, 2007 - 7:28 am

Humm... Seems there are other problems here.

Try

$ git reset --hard 322a6c93ad86d2a151dd97a4c6b0e014a4893437~1
$ git fsck-objects HEAD

If this tells you your repo is clean, you should be able to redo 
whatever fetches you did on top of that to get a working repository 
again. If you have objects that got half-written due to end-of-disk 
errors, you'd have to do "git prune" after resetting.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 11:31 am

That's nasty, but something as simple a "git reset --hard ORIG_HEAD" 
should fix it (of course, if the disk-full happened earlier, or you've 
done some other reset or something else that over-write your ORIG_HEAD, 
you'd need to find the most recent commit that wasn't broken).

The good news is that there's no way your old data got corrupted. You just 
need to _find_ it (and normally ORIG_HEAD points to it, so it's trivial to 
find).

The suggestion to use "HEAD~1" is *not* a good one, simply because the 
pull (if it was a fast-forward) will not have HEAD~1 as your previous 
head: it will depend on how many commits you pulled. If you had reflog 
enabled, a "HEAD@{1}" might have worked (or "HEAD@{2}" as you already did 
the "git reset").

That said, clearly something didn't check the error return of a write() 
call. Some of that got fixed up recently, so it might even be fixed in 
current git already.

The most likely case (for a pull) is "git-unpack-objects", I guess.

		Linus

-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 12:19 pm

I'm not convinced.

The "write_in_full()" logic is supposed to help people avoid problems, but 
it *still* returns a success for a partial write.

Example of extreme breakage:

	static int write_buffer(int fd, const void *buf, size_t len)
	{
	        ssize_t size;
	
	        size = write_in_full(fd, buf, len);
	        if (!size)
	                return error("file write: disk full");
	        if (size < 0)
	                return error("file write error (%s)", strerror(errno));
	        return 0;
	}

which is TOTAL GARBAGE, because the disk-full event might have happened in 
the middle of the write, so "write_in_full()" might have returned 4096, 
for example, even though the buffer length was much bigger.

I personally think write_in_full() is totally mis-designed. If you are 
ready to handle partial writes, you should use "xwrite()". If you're not 
ready to handle partial writes, you should either use "write_or_die()", 
_or_ you should expect a partial write to at least return an error code 
(which is how "write_buffer()" works).

But that's not how write_in_full() actually works. Write-in-full does not 
return an error for a partial write, it returns the partial size.

Which is idiotic. It makes the function pointless. Just use xwrite() for 
that.

I would suggest perhaps a patch like this..  And then you _really_ can 
just do

	if (write_in_full(fd, buf, len) < 0)
		die("Not going to work: %s", strerror(errno));

and be happy.

		Linus

---
diff --git a/write_or_die.c b/write_or_die.c
index a119e1d..f95299a 100644
--- a/write_or_die.c
+++ b/write_or_die.c
@@ -37,15 +37,14 @@ int write_in_full(int fd, const void *buf, size_t count)
 {
 	const char *p = buf;
 	ssize_t total = 0;
-	ssize_t written = 0;
 
 	while (count > 0) {
-		written = xwrite(fd, p, count);
-		if (written <= 0) {
-			if (total)
-				return total;
-			else
-				return written;
+		size_t written = xwrite(fd, p, count);
+		if (written < 0)
+			return -1;
+		if (!written) ...
From: Linus Torvalds
Date: Thursday, January 11, 2007 - 1:03 pm

otherwise it looks fine. But I obviously didn't even compile-test the 
patch I sent out, so caveat emptor.

		Linus
-

From: Junio C Hamano
Date: Thursday, January 11, 2007 - 2:02 pm

I agree 100% with the above reasoning.

-

From: Andy Whitcroft
Date: Thursday, January 11, 2007 - 2:17 pm

The call was intended to replace a common idiom:

if (xwrite(fd, buf, size) != size)
	error

write_in_full() is intended as a drop in replacement for that.  On a
short write we will return a short count and that will fail such a test.
 The call basically implements the standard write() call semantic with
maximum attempts to complete.

That said returning -1 would have the same effect.

-apw
-

From: Junio C Hamano
Date: Thursday, January 11, 2007 - 2:27 pm

You are both right ;-).  I would have these fixups on top of
Linus's.

---

diff --git a/config.c b/config.c
index 2cd0263..733fb1a 100644
--- a/config.c
+++ b/config.c
@@ -694,11 +694,9 @@ int git_config_set_multivar(const char* key, const char* value,
 
 		store.key = (char*)key;
 		if (!store_write_section(fd, key) ||
-		    !store_write_pair(fd, key, value)) {
-			ret = write_error();
-			goto out_free;
-		}
-	} else{
+		    !store_write_pair(fd, key, value))
+			goto write_err_out;
+	} else {
 		struct stat st;
 		char* contents;
 		int i, copy_begin, copy_end, new_line = 0;
@@ -777,31 +775,33 @@ int git_config_set_multivar(const char* key, const char* value,
 
 			/* write the first part of the config */
 			if (copy_end > copy_begin) {
-				write_in_full(fd, contents + copy_begin,
-				copy_end - copy_begin);
-				if (new_line)
-					write_in_full(fd, "\n", 1);
+				if (write_in_full(fd, contents + copy_begin,
+						  copy_end - copy_begin) <
+				    copy_end - copy_begin)
+					goto write_err_out;
+				if (new_line &&
+				    write_in_full(fd, "\n", 1) != 1)
+					goto write_err_out;
 			}
 			copy_begin = store.offset[i];
 		}
 
 		/* write the pair (value == NULL means unset) */
 		if (value != NULL) {
-			if (store.state == START)
-				if (!store_write_section(fd, key)) {
-					ret = write_error();
-					goto out_free;
-				}
-			if (!store_write_pair(fd, key, value)) {
-				ret = write_error();
-				goto out_free;
+			if (store.state == START) {
+				if (!store_write_section(fd, key))
+					goto write_err_out;
 			}
+			if (!store_write_pair(fd, key, value))
+				goto write_err_out;
 		}
 
 		/* write the rest of the config */
 		if (copy_begin < st.st_size)
-			write_in_full(fd, contents + copy_begin,
-				st.st_size - copy_begin);
+			if (write_in_full(fd, contents + copy_begin,
+					  st.st_size - copy_begin) <
+			    st.st_size - copy_begin)
+				goto write_err_out;
 
 		munmap(contents, st.st_size);
 ...
From: Linus Torvalds
Date: Thursday, January 11, 2007 - 3:09 pm

This fixes another problem that Andy's case showed: git-fsck-objects 
reports nonsensical results for corrupt objects.

There were actually two independent and confusing problems:

 - when we had a zero-sized file and used map_sha1_file, mmap() would 
   return EINVAL, and git-fsck-objects would report that as an insane and 
   confusing error. I don't know when this was introduced, it might have 
   been there forever.

 - when "parse_object()" returned NULL, fsck would say "object not found", 
   which can be very confusing, since obviously the object might "exist", 
   it's just unparseable because it's totally corrupt.

So this just makes "xmmap()" return NULL for a zero-sized object (which is 
a valid thing pointer, exactly the same way "malloc()" can return NULL for 
a zero-sized allocation). That fixes the first problem (but we could have 
fixed it in the caller too - I don't personally much care whichever way it 
goes, but maybe somebody should check that the NO_MMAP case does 
something sane in this case too?).

And the second problem is solved by just making the error message slightly 
clearer - the failure to parse an object may be because it's missing or 
corrupt, not necessarily because it's not "found".

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---

This would hopefully have made the error case Andy had a bit more readable 
and understandable.

diff --git a/fsck-objects.c b/fsck-objects.c
index 0d8a8eb..81f00db 100644
--- a/fsck-objects.c
+++ b/fsck-objects.c
@@ -290,7 +290,7 @@ static int fsck_sha1(unsigned char *sha1)
 {
 	struct object *obj = parse_object(sha1);
 	if (!obj)
-		return error("%s: object not found", sha1_to_hex(sha1));
+		return error("%s: object corrupt or missing", sha1_to_hex(sha1));
 	if (obj->flags & SEEN)
 		return 0;
 	obj->flags |= SEEN;
diff --git a/git-compat-util.h b/git-compat-util.h
index f8d46d5..8781e8e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -202,6 +202,8 @@ static inline void ...
From: Shawn O. Pearce
Date: Thursday, January 11, 2007 - 5:40 pm

All calls to perform any sort of mmap go through xmmap().  If NO_MMAP
was defined at compile time than the mmap() call within xmmap() is
redefined to invoke git_mmap() via a preprocessor define slightly
earlier in git-compat-util.h.

So your patch will also automatically do the right thing in the
NO_MMAP case.

BTW, I like your solution of just fixing it here in xmmap().  Good idea.

-- 
Shawn.
-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 2:59 pm

I really don't agree.

You should use "write_or_die()" for this idiom.

There really is two cases:
 - the complex and robust one: "we will keep writing even in the presense 
   of partial writes".

   This is "xwrite()". 

 - the simple case: "write everything", anything else is an error. This is 
   "write_or_die()", or "write_in_full()".

And I claim, that for the "write_in_full()" case, if you EVER get anything 
but the full length back, that's an error.

And it should be treated as such. There is NO POINT in write_in_full() 
ever returning a partial result. It's unrecoverable by design - and if it 
wasn't, you wouldn't use "write_in_full()" in the first place, you'd just 
use "xwrite()".

And returning a partial result in that case is just a recipe for disaster. 
I already pointed to one real bug due to this: write_buffer() was (and 
currently is) simply buggy. And it's buggy EXACTLY becaue 
"write_in_full()" was doing the wrong thing, and just made it easy to 
write buggy code - it was no easier to use than xwrite(), and thus there 
was no point to it. Might as well have just used xwrite() in the first 
place.

So I repeat: either you use "xwrite()" (and handle the partial case), or 
you use "write_in_full()" (and the partial case is an *ERROR*). There is 
no sane middle ground in between those cases.

			Linus
-

From: Andy Whitcroft
Date: Thursday, January 11, 2007 - 3:10 pm

Things should be safe in general with the code as it is as we are
checking the write length.  But it is clear that converting to an error
would simplify and clean up the code.  Should have done that the first
time.

If its not done in the morning, I'll fix it up.

-apw
-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 3:32 pm

NO WE ARE NOT.

I already pointed you to write_buffer(). It used to do the right thing. It 
doesn't any more. And it doesn't, exactly because you converted it away 
from a loop that did it right, to doing "write_in_full()" and NOT checking 
the return value properly.

The thing is, if you support partial writes (ie xwrite()), you need to do 
it in a loop, and then the correct thing to check for is "zero or error".

Once you don't do a loop (ie "write_in_full()" - the whole _point_ is to 
not do the loop, after all), you need to either expand that check to "zero 
or error or partial" (which just makes the code _more_ complex), or you 
need to make "write_in_full()" just return an error for the partial case.

Which is why I'm harping on this issue: either we use "xwrite()", or we 
fix "write_in_full()" to return errors on partials. Because the "middle 
ground", where write_in_full() emulates the partial case of "xwrite()" is 
actually MORE complex than "xwrite()" itself.

		Linus
-

From: Michael S. Tsirkin
Date: Thursday, January 11, 2007 - 2:11 pm

I did lots of resets and they did not seems to help.

Then, on a hunch, I just did git prune and
it cleaned the tree so I can pull/push fine.

How come?

mst@mst-lt:~/tmp/libmthca_bad$ git --version
git version 1.5.0.rc0.g244a7
mst@mst-lt:~/tmp/libmthca_bad$ git pull ~/scm/libmthca/
remote: Generating pack...
remote: Done counting 18 objects.
Result has 11 objects.
remote: Deltifying 11 objects.
remote:  100% (11/11) done
remote: Total 11 (delta 8), reused 1 (delta 1)
Unpacking 11 objects
error: failed to read delta-pack base object 4d4d30be967d3284cbf59afd4fba6ab536e295f5
fatal: unpack-objects died with error code 1
Fetch failure: /home/mst/scm/libmthca/
mst@mst-lt:~/tmp/libmthca_bad$ git prune
mst@mst-lt:~/tmp/libmthca_bad$ git pull ~/scm/libmthca/
remote: Generating pack...
remote: Done counting 18 objects.
Result has 11 objects.
Deltifying 11 objects.
 10remote: 0% (11/11) done
Total 11 (delta 8), reused 1 (delta 1)
Unpacking 11 objects
 100% (11/11) done
Updating 732a507..322a6c9
Fast forward
 Makefile.am      |   25 +++++++++++++++++--------
 configure.in     |   16 ++++++++++++++--
 debian/changelog |    4 +---
 mthca.driver     |    1 +
 src/mthca.c      |   23 ++++++++++++-----------
 5 files changed, 45 insertions(+), 24 deletions(-)
 create mode 100644 mthca.driver

So it seems the tree was more or less fine (it just has some useless objects
because of the disk full failure), the bug seems to be that pull/push fail.

I made a backup of the bad tree so that if someone wants to play with it, here it is:
http://www.openfabrics.org/~mst/libmthca_bad.tgz

-- 
MST
-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 2:50 pm

NOTE! If "nothing seems to help" means "git-fsck-objects" still complains, 
then that is very much expected.

"git reset" does NOT fix a corrupted database. Never has, never will. It 
should only fix the case where a branch head points to a corrupt or 
incomplete tree.

To actually fix the database, you need to remove the broken objects. Which 

If so, your branch heads had _already_ been fixed (by one of the resets), 
and "git prune" just got rid of the (unreachable) objects that were 

What happens is:

 - you had gotten some corrupt objects due to the disk filling up 
   (probably during the pull that thus populated the object database with
   partially written objects)

   In particular, the 4d4d30be967d3284cbf59afd4fba6ab536e295f5 object was 
   corrupt. fsck gave a confusing error message:

	error: 4d4d30be967d3284cbf59afd4fba6ab536e295f5: object not found
	error: c03590b581d51d5fa43adbef9415e935d0229412: object not found

   which is really because the _file_ for that object does exist, but the 
   file doesn't actually contain the object it expects (due to 
   corruption), so the object wasn't "found". 

 - git-fsck-objects complained about them, but it doesn't actually do 
   anything about it (which you maybe expected it to do - it doesn't 
   really act as a filesystem fsck which tries to _fix_ things, it really 
   just does the "check" part, since "fixing" things is almost always a 
   manual operation)

 - "git prune" actually removed all the unreachable objects, and since 
   none of the _reachable_ objects were broken, that makes 
   git-fsck-objects shut up too.

 - your "git pull" failed, because it was fetching objects that were 
   corrupt in your local database - and the rule is that local objects 
   ALWAYS override remote objects. That's an important security thing (a 
   "pull" is _never_ allowed to overwrite something you already have, and 
   it doesn't matter if it's corrupt or not, you're not ever going to have 

Not a bug. See ...
From: Michael S. Tsirkin
Date: Thursday, January 11, 2007 - 2:58 pm

OK, now that I know what happened, this makes sense to me.

is something expected?

-- 
MST
-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 3:24 pm

Well, it's a bug, and it's not "expected", but I think it's understood. 

What happens is that you pulled objects that were perfectly fine, and when 
you ran out of diskspace git could create the _filenames_ and inodes for 
them, but when it actually wrote the data itself, it failed.

And it failed in a very unfortunate place: when unpacking the pack-file 
(which you got when doing the "git pull"), it would call 
"write_sha1_file()" to actually write the unpacked object. That one does 
everything correctly and is actually very careful, but then it does

	if (write_buffer(fd, compressed, size) < 0)
		die("unable to write sha1 file");

which _should_ have caught the case that the buffer couldn't be written, 
and died cleanly with an error message, and it would neve rhave moved the 
temporary file that it wrote to into the real database.

In other words, it was doing everything right. A partial file would never 
have actually been allowed to be moved into the database at all, and you'd 
have gotten a nice error message, and the "git pull" would have failed 
gracefully.

However, "write_buffer()" itself was buggy. It did:

	static int write_buffer(int fd, const void *buf, size_t len)
	{
	        ssize_t size;
	
	        size = write_in_full(fd, buf, len);
	        if (!size)
	                return error("file write: disk full");
	        if (size < 0)
	                return error("file write error (%s)", strerror(errno));
	        return 0;
	}

and the problem here is that if the write was _partial_ (not totally 
empty), it would see a positive size, but not a full write, and it would 
incorrectly return zero for "everything is fine". Even though it wasn't.

So my patch to make "write_in_full()" have better semantics (and make it 
simpler too) should have fixed the fundamental problem that caused your 
failure in the first place. The other patch I just sent out should just 
make the error messages be a bit nicer, and isn't important in itself.

Btw, that ...
From: Michael S. Tsirkin
Date: Thursday, January 11, 2007 - 3:39 pm

Weird. I think the system where the tree got corrupted had git 1.4.4.4.

-- 
MST
-

From: Linus Torvalds
Date: Thursday, January 11, 2007 - 5:42 pm

Ok, you had an email with "git version" that said 1.5.0-rc0-g<something>, 
but that may have been after you already upgraded for other reasons.

It would definitely be good to reproduce this - there are multiple places 
where we could screw up in the presense of a short write. It's also 
entirely possible that the core code doesn't screw up, but if some of the 
higher-level scripts don't always check the error code, an error at 
unpacking time (or some other time) could just go unnoticed.

That said, even though I complain about the "write_in_full()" bug, I think 
that with the "return error even for partial writes" change it actually 
_does_ end up being a safer way to do things, so it's entirely possible 
that while it introduced one bug, it could have fixed another one. The 
call chain I looked at was just one particular one (it's the most 
fundamental one for writing out individual objects, so in that sense it's 
the most likely one, but it's certainly possible that something else 
happened)

		Linus
-

From: Shawn O. Pearce
Date: Thursday, January 11, 2007 - 5:51 pm

Or maybe it was 1.5.0-rc0g<something>.  git-describe has a bug (but
is patched in pu) that caused it to pick up 1.4.4.4 on master/next/pu
right now due to the 1.4.4.4 maint release being merged back into
master/next/pu after the 1.5.0rc-0 tag was created.

-- 
Shawn.
-

Previous thread: [PATCH] Teach the revision walker to walk by reflogs with --walk-reflogs by Johannes Schindelin on Thursday, January 11, 2007 - 3:47 am. (2 messages)

Next thread: Re: MinGW port (was: Re: [ANNOUNCE] qgit4 aka qgit ported to Windows) by Johannes Sixt on Thursday, January 11, 2007 - 6:24 am. (1 message)