On Mon, 11 Oct 2010 16:28:13 +0200
Jan Kara <jack@suse.cz> wrote:
I thought we'd actually fixed this. I guess we didn't. I think what
we did do was to ensure that a subsequent fsync()/msync() would
reliably report the data loss (has anyone tested this in the past few
years??). This is something, but it's quite lame.
Yes, people do this. With a 64-bit address space they create a
gargantuan mmap of the entire database and just populate teeny bits of
it simply with CPU stores. They'd be unhappy if the kernel started
instantiating every block within the mmap()!
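To make that usage pattern concrete, here is a rough sketch: map a large sparse file, touch a handful of bytes with plain stores, and see how little of it gets allocated. The function name, the temp-file handling and the sizes are made up for illustration, and it assumes st_blocks counts 512-byte units, as it does on Linux.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a sparse file of `size` bytes, store one
 * byte every `stride` bytes through a shared mapping, and return the
 * number of bytes the filesystem reports as allocated (-1 on error).
 */
static long long sparse_mmap_usage(char *path_template, size_t size,
                                   size_t stride)
{
    struct stat st;
    long long used = -1;
    char *map;
    int fd;

    if (size == 0 || stride == 0)
        return -1;

    fd = mkstemp(path_template);
    if (fd < 0)
        return -1;
    unlink(path_template);          /* file goes away when fd is closed */

    /* Extends the file without allocating any data blocks. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Populate teeny bits of the mapping with CPU stores. */
    for (size_t off = 0; off < size; off += stride)
        map[off] = 1;

    msync(map, size, MS_SYNC);      /* push the dirty pages out */

    /* Assumption: st_blocks is in 512-byte units (true on Linux). */
    if (fstat(fd, &st) == 0)
        used = (long long)st.st_blocks * 512;

    munmap(map, size);
    close(fd);
    return used;
}
```

With a 64MB mapping and a 16MB stride only four pages are ever dirtied, so the allocated size should come out far below the file size.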
ouch.
Can we fix the layout problem? Are reservation windows of no use here?
When I did ext2 delayed allocation back in, err, 2001 I had
considerable trouble working out how many blocks to actually reserve
for a file block, because it also had to reserve the indirect blocks.
One file block allocation can result in reserving four disk blocks!
And iirc it was not possible with existing in-core data structures to
work out whether all four blocks needed reserving until the actual
block allocation had occurred. So I ended up reserving the worst-case
number of indirects, based upon the file offset. If the disk ran out
of "space" I'd do a forced writeback to empty all the reservations and
would then take a look to see if the disk was _really_ out of space.
Is all of this an issue with this work? If so, what approach did you
take?
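For reference, the worst-case reservation described above can be sketched roughly as follows. This is not the 2001 code: it assumes the classic ext2 layout of 12 direct pointers followed by single, double and triple indirect blocks with 32-bit block pointers, and the function and constant names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NDIR_BLOCKS 12  /* direct block pointers in the inode (assumed) */

/*
 * Worst-case number of disk blocks to reserve for one file block,
 * based purely on the file offset: the data block itself plus every
 * indirect block that might have to be allocated on the way to it.
 */
static unsigned worst_case_reservation(uint64_t file_block,
                                       unsigned block_size)
{
    uint64_t ptrs, limit;

    assert(block_size >= 4);
    ptrs = block_size / 4;      /* 32-bit pointers per indirect block */

    limit = NDIR_BLOCKS;
    if (file_block < limit)
        return 1;               /* direct: data block only */

    limit += ptrs;
    if (file_block < limit)
        return 2;               /* data + single indirect */

    limit += ptrs * ptrs;
    if (file_block < limit)
        return 3;               /* data + double + single indirect */

    return 4;                   /* data + triple + double + single */
}
```

With 1k blocks this reserves one block for file blocks 0-11, two up to block 267, three up to block 65803 and four beyond that, which is where the "four disk blocks" figure comes from.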
Gee. I remember people having issues with forcing the SEGV at
pagefault time. It _is_ a behaviour change: the application might be
about to free up some disk space, so the msync() would have succeeded
anyway.
iirc another issue was that the standards (POSIX?) don't anticipate
getting a SEGV in response to ENOSPC. There might have been other
concerns - it's all foggy now.
Our general answer to this overall problem is: "run msync() and check
the result". That's a bit weaselly, but it's not a _bad_ answer.
After all, there might be an EIO as well! So a good application should
be checking for both ENOSPC and EIO. Your patches only address
ENOSPC.
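A minimal sketch of what "run msync() and check the result" looks like from the application side. The helper names are hypothetical, and whether a given kernel reliably reports the error here is exactly the open question above, so treat this as the intended pattern rather than a guarantee.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Flush a shared file mapping synchronously.
 * Returns 0 on success or a negative errno; -ENOSPC and -EIO both
 * mean that some of the stored data did not reach the disk.
 */
static int sync_mapping(void *addr, size_t len)
{
    if (msync(addr, len, MS_SYNC) == 0)
        return 0;
    return -errno;
}

/*
 * Hypothetical usage: map `len` bytes of `fd`, modify the first byte
 * with a plain store, then make sure the store actually stuck.
 */
static int store_and_sync(int fd, size_t len)
{
    char *map;
    int err;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -errno;

    map[0] = 'x';                   /* dirty the page with a CPU store */

    err = sync_mapping(map, len);   /* may be -ENOSPC or -EIO */

    munmap(map, len);
    return err;
}
```

A caller would treat any non-zero return as data loss and handle -ENOSPC and -EIO explicitly instead of assuming the store succeeded.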