Yeah, it's almost certainly not the disk. Disks do go bad, but the
behavior tends to be rather different when they do (usually you will get
read errors with uncorrectable CRC failures, and you'd know that _very_
clearly).
Sure, I could imagine something like the sector remapping could be flaking
out on you, but that sounds really unlikely. Especially since:
Oh, ok. If so, then this is much less worrisome, and is in fact almost
"normal" HFS+ behaviour. It is a journaling filesystem, but it only
journals metadata, so the filenames and inodes will be fine after a crash,
but the contents will be random.
[ Yeah, yeah, I know - it sounds rather stupid, but it's a common kind of
stupidity. The journaling essentially protects the only thing that fsck
can find. Ext3 does similar things in "writeback" mode - but you should
use "data=ordered" which writes out the data before metadata.
Basically, such journaling doesn't help data integrity per se, but it
does mean that the metadata is ok, and that in turn means that while the
file contents won't be dependable, at least things like free block
bitmaps etc hopefully are.
That in turn hopefully means that new file allocations won't be
crapping out all over old ones etc due to bad resource allocations, so
while it doesn't mean that the data is trustworthy, it at least means
that you can trust _some_ things ]
If your machine crashes often, you could trivially add a "sync" to your
commit hook. That would make things better. And maybe we should have a
"safe mode" that does these things more carefully. You would definitely
want to turn it on on that machine.
Are you doing something special to make the machine crash so much? Or do
OS X machines always crash, and Apple PR is just so good that people
aren't aware of it?
Anyway, I'll think about sane ways to add a "safe" mode without making it
_too_ painful. In the meantime, here's a trial patch that you should
probably use. It does slow things down, but hopefully not too much.
(I really don't much like it - but I think this is a good change, and I
just need to come up with a better way to do the fsync() than to be
totally synchronous about it.)
It's going to make big "git add" calls *much* slower, so I'm not very
happy about it (especially since we don't actually care that deeply about
the files really being there until much later, so doing something
asynchronous would be perfectly acceptable), but for you this is
definitely worthwhile.
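
As a standalone illustration of the "write the data, fsync, then close" ordering described above, here is a minimal sketch. It is not the git code: write_all() and write_file_durably() are hypothetical helper names made up for this example, and it assumes plain POSIX calls (open, write, fsync, fchmod, close). Unlike the patch, it returns an error instead of dying.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write all of buf, retrying on short writes and EINTR. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper (not part of git): create `path`, write the data,
 * and ask the kernel to push it to the disk before the file is treated
 * as finished. Returns 0 on success, -1 on failure with errno set.
 */
static int write_file_durably(const char *path, const void *buf, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;

	if (write_all(fd, buf, len) < 0 ||
	    fsync(fd) < 0 ||          /* file contents go out first */
	    fchmod(fd, 0444) < 0) {   /* then mark the file read-only */
		int saved = errno;
		close(fd);
		unlink(path);         /* don't leave a half-written file behind */
		errno = saved;
		return -1;
	}

	/* close() can still report a deferred write error */
	if (close(fd) != 0)
		return -1;
	return 0;
}
```

The fsync() sits before the close() so that a write error is reported while we can still clean up. It is also exactly the synchronous wait that makes big "git add" calls slow.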
Linus
---
sha1_file.c | 17 +++++++++++------
1 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/sha1_file.c b/sha1_file.c
index adcf37c..86a653b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2105,6 +2105,15 @@ int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 	return 0;
 }
 
+/* Finalize a file on disk, and close it. */
+static void close_sha1_file(int fd)
+{
+	fsync_or_die(fd, "sha1 file");
+	fchmod(fd, 0444);
+	if (close(fd) != 0)
+		die("unable to write sha1 file");
+}
+
 static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 			      void *buf, unsigned long len, time_t mtime)
 {
@@ -2170,9 +2179,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	if (write_buffer(fd, compressed, size) < 0)
 		die("unable to write sha1 file");
-	fchmod(fd, 0444);
-	if (close(fd))
-		die("unable to write sha1 file");
+	close_sha1_file(fd);
 	free(compressed);
 
 	if (mtime) {
@@ -2350,9 +2357,7 @@ int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
 	} while (1);
 	inflateEnd(&stream);
 
-	fchmod(local, 0444);
-	if (close(local) != 0)
-		die("unable to write sha1 file");
+	close_sha1_file(local);
 	SHA1_Final(real_sha1, &c);
 	if (ret != Z_STREAM_END) {
 		unlink(tmpfile);