Re: Recovering from repository corruption

From: Linus Torvalds
Date: Tuesday, June 10, 2008 - 3:45 pm

On Tue, 10 Jun 2008, Denis Bueno wrote:

Yeah, it's almost certainly not the disk. Disks do go bad, but the 
behavior tends to be rather different when they do (usually you will get 
read errors with uncorrectable CRC failures, and you'd know that _very_ 
clearly).

Sure, I could imagine something like the sector remapping could be flaking 
out on you, but that sounds really unlikely. Especially since:


Oh, ok. If so, then this is much less worrisome, and is in fact almost 
"normal" HFS+ behaviour. It is a journaling filesystem, but it only 
journals metadata, so the filenames and inodes will be fine after a crash, 
but the contents will be random.

[ Yeah, yeah, I know - it sounds rather stupid, but it's a common kind of 
  stupidity. The journaling essentially protects the only thing that fsck 
  can find. Ext3 does similar things in "writeback" mode - but you should 
  use "data=ordered" which writes out the data before metadata.

  Basically, such journaling doesn't help data integrity per se, but it 
  does mean that the metadata is ok, and that in turn means that while the 
  file contents won't be dependable, at least things like free block 
  bitmaps etc hopefully are.

  That in turn hopefully means that new file allocations won't be 
  crapping out all over old ones etc due to bad resource allocations, so 
  while it doesn't mean that the data is trustworthy, it at least means 
  that you can trust _some_ things ]

If your machine crashes often, you could trivially add a "sync" to your 
commit hook. That would make things better. And maybe we should have a 
"safe mode" that does these things more carefully. You would definitely 
want to turn it on on that machine.

Are you doing something special to make the machine crash so much? Or do 
OS X machines always crash, and Apple PR is just so good that people 
aren't aware of it?

Anyway, I'll think about sane ways to add a "safe" mode without making it 
_too_ painful. In the meantime, here's a trial patch that you should 
probably use. It does slow things down, but hopefully not too much.

(I really don't much like it - but I think this is a good change, and I 
just need to come up with a better way to do the fsync() than to be 
totally synchronous about it.)

It's going to make big "git add" calls *much* slower, so I'm not very 
happy about it (especially since we don't actually care that deeply about 
the files really being there until much later, so doing something 
asynchronous would be perfectly acceptable), but for you this is 
definitely worthwhile.
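
[ For illustration only: a minimal standalone sketch of the
  write-then-fsync-then-close ordering that the patch enforces, assuming a
  POSIX system. The name write_file_durably() is hypothetical and is not
  part of git; the patch itself goes through git's own fsync_or_die()
  helper and dies on error instead of returning one. ]

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not git code: create `path`, write `len` bytes
 * from `buf`, flush them to stable storage, mark the file read-only,
 * and close it. Returns 0 on success, -1 on failure.
 */
static int write_file_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	/* O_EXCL: refuse to overwrite an existing file. */
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

	if (fd < 0)
		return -1;

	/* write() may be short or interrupted, so loop until done. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Ask the kernel to push the file data out before we call it done. */
	if (fsync(fd) != 0)
		goto fail;
	if (fchmod(fd, 0444) != 0)
		goto fail;
	if (close(fd) != 0) {
		unlink(path);
		return -1;
	}
	return 0;

fail:
	/* Do not leave a partially written file behind. */
	close(fd);
	unlink(path);
	return -1;
}
```

[ A caller would use it as write_file_durably("some-file", data, size) and
  treat a -1 return as "the file may not be on disk". The fsync() is the
  step that costs time: every file written this way waits for the disk,
  which is why doing it synchronously makes large "git add" runs slower. ]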

			Linus

---
 sha1_file.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index adcf37c..86a653b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2105,6 +2105,15 @@ int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 	return 0;
 }
 
+/* Finalize a file on disk, and close it. */
+static void close_sha1_file(int fd)
+{
+	fsync_or_die(fd, "sha1 file");
+	fchmod(fd, 0444);
+	if (close(fd) != 0)
+		die("unable to write sha1 file");
+}
+
 static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 			      void *buf, unsigned long len, time_t mtime)
 {
@@ -2170,9 +2179,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 
 	if (write_buffer(fd, compressed, size) < 0)
 		die("unable to write sha1 file");
-	fchmod(fd, 0444);
-	if (close(fd))
-		die("unable to write sha1 file");
+	close_sha1_file(fd);
 	free(compressed);
 
 	if (mtime) {
@@ -2350,9 +2357,7 @@ int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
 	} while (1);
 	inflateEnd(&stream);
 
-	fchmod(local, 0444);
-	if (close(local) != 0)
-		die("unable to write sha1 file");
+	close_sha1_file(local);
 	SHA1_Final(real_sha1, &c);
 	if (ret != Z_STREAM_END) {
 		unlink(tmpfile);
