On Mon, 7 May 2007 05:37:54 -0600
Andreas Dilger <adilger@clusterfs.com> wrote:
I think the use of ext4_journal_extend() (as Amit has proposed) will help
here, but it is not sufficient.
Under some circumstances, a journal_extend() failure could mean that we
fail to allocate all the required disk space. If such failures are
infrequent enough, that is acceptable when the caller is using fallocate()
for performance reasons.
But it is very much not acceptable if the caller is using fallocate() for
space-reservation reasons. If you used fallocate() to reserve 1GB of disk,
fallocate() "succeeded", and you later got ENOSPC, then you'd have a
right to get a bit upset.
So I think the ext3/4 fallocate() implementation will need to be
structured as a loop:
	while (len) {
		journal_start();
		len -= do_fallocate(len, ...);
		journal_stop();
	}
Now the interesting question is: what do we do if we get halfway through
this loop and then run out of space? We could leave the disk all filled up
and then return failure to the caller, but that's pretty poor behaviour,
IMO.
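One possible answer, sketched below as a toy user-space model rather than real ext3/4 code: remember how much has been allocated so far, and if a pass allocates nothing, free it all again (also in bounded transactions) before returning -ENOSPC. Every name here (toy_fs, toy_journal_start(), toy_do_fallocate(), and so on) is hypothetical, and the "disk" is just a free-block counter. A real implementation would also have to cope with other writers racing for the freed space and with a crash in the middle of the rollback, neither of which is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: none of these names are the real ext3/4 or jbd API. */
struct toy_fs {
	size_t free_blocks;	/* blocks still available on the "disk" */
	size_t txn_credits;	/* most blocks one transaction may touch */
	int txn_open;		/* nonzero while a transaction is running */
};

static void toy_journal_start(struct toy_fs *fs)
{
	assert(!fs->txn_open);
	fs->txn_open = 1;
}

static void toy_journal_stop(struct toy_fs *fs)
{
	assert(fs->txn_open);
	fs->txn_open = 0;
}

/*
 * Allocate up to `want` blocks inside the current transaction.
 * Returns the number of blocks actually allocated; 0 means no progress.
 */
static size_t toy_do_fallocate(struct toy_fs *fs, size_t want)
{
	size_t n = want;

	assert(fs->txn_open);
	if (n > fs->txn_credits)
		n = fs->txn_credits;
	if (n > fs->free_blocks)
		n = fs->free_blocks;
	fs->free_blocks -= n;
	return n;
}

/* Give `n` blocks back inside the current transaction. */
static void toy_free_blocks(struct toy_fs *fs, size_t n)
{
	assert(fs->txn_open);
	fs->free_blocks += n;
}

/*
 * Reserve `len` blocks, one bounded transaction per pass.
 * Returns 0 on success, or -ENOSPC after undoing the partial allocation.
 */
int toy_fallocate(struct toy_fs *fs, size_t len)
{
	size_t done = 0;

	while (done < len) {
		size_t got;

		toy_journal_start(fs);
		got = toy_do_fallocate(fs, len - done);
		toy_journal_stop(fs);

		if (got == 0) {
			/* No progress: undo what we did, again in bounded steps. */
			while (done > 0) {
				size_t n = done < fs->txn_credits ?
						done : fs->txn_credits;

				toy_journal_start(fs);
				toy_free_blocks(fs, n);
				toy_journal_stop(fs);
				done -= n;
			}
			return -ENOSPC;
		}
		done += got;
	}
	return 0;
}
```

With 100 free blocks and 8 blocks per transaction, toy_fallocate(&fs, 40) takes five passes and leaves 60 blocks free. A later request for 61 blocks then fails with -ENOSPC and leaves those 60 blocks free again, instead of leaving the disk full.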
Does the proposed implementation handle quotas correctly, btw? Has that
been tested?
Final point: it's fairly disappointing that the present implementation is
ext4-only, and extent-only. I do think we should be aiming at an ext4
bitmap-based implementation and an ext3 implementation.