It's been a major advantage for HAMMER. The key change I made to
preserve throughput was to create a space reservation subsystem
built around a structure called hammer_reserve. This allows
the frontend to reserve space on the media without having to modify
any meta-data or commit to the allocation.
This in turn allows the frontend to reserve space for bulk data writes
and actually perform the writes directly to the media, and then queue
just the meta-data, along with the reservation, to the backend. The
backend then finalizes the reservation (doing the actual allocation)
at the same time it lays down the related meta-data.
The result is that you get nice pipelining when writing lots of bulk
data. It is possible to literally write gigabytes of bulk data
before the backend needs to flush any of the related meta-data.
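The reserve / write / finalize split described above can be sketched roughly as follows. This is a toy model in Python, not HAMMER's kernel code: the Reservation, Media and Frontend classes and the backend_flush() function are hypothetical names invented for illustration, and only the overall flow (frontend reserves and writes bulk data, backend later finalizes the allocation together with the meta-data) comes from the description.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Reservation:
    """Hypothetical stand-in for a hammer_reserve-style record."""
    offset: int
    length: int


class Media:
    """Toy model of the media: bulk data, allocation state, meta-data."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0        # next offset not yet reserved
        self.data = {}            # offset -> bulk data written by the frontend
        self.allocated = set()    # offsets whose allocation has been finalized
        self.metadata = []        # records laid down by the backend

    def reserve(self, length):
        """Frontend side: claim space without modifying any meta-data."""
        if self.next_free + length > self.size:
            raise OSError("no space left to reserve")
        resv = Reservation(self.next_free, length)
        self.next_free += length
        return resv


class Frontend:
    def __init__(self, media, backend_queue):
        self.media = media
        self.backend_queue = backend_queue

    def write(self, name, payload):
        resv = self.media.reserve(len(payload))
        self.media.data[resv.offset] = payload     # bulk data goes straight to media
        self.backend_queue.append((name, resv))    # queue only meta-data + reservation
        return resv


def backend_flush(media, backend_queue):
    """Backend side: finalize each reservation along with its meta-data."""
    flushed = 0
    while backend_queue:
        name, resv = backend_queue.popleft()
        media.allocated.add(resv.offset)           # the actual allocation
        media.metadata.append((name, resv.offset, resv.length))
        flushed += 1
    return flushed
```

In this sketch the frontend can call write() any number of times before backend_flush() runs, which is the pipelining effect: bulk data is already on the (toy) media while the meta-data queue simply grows until the backend gets to it.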
Separating the two out also made testing the filesystem a lot easier.
Yah, I hit up against similar issues which I resolved with the
'flush group' abstraction, which groups together dependent operations.
If the group would become too large (e.g. you have an arbitrarily long
chain of dependencies), it can break the chain by flushing out the
inode at the break point with an adjusted nlinks count. That way
Yah. fsync() just queues the inode to the flusher and the flusher
bangs away at it until it's all out, then wakes up the waiting fsync
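As a rough illustration of that queue-and-wait pattern, here is a minimal user-space sketch in Python. It is not how HAMMER implements it in the kernel: the Inode class, the flusher() loop, the fsync() helper, and the use of a threading.Event as the wakeup are all assumptions made for the example.

```python
import queue
import threading


class Inode:
    """Hypothetical in-memory inode with dirty records awaiting flush."""

    def __init__(self, ino, dirty_records):
        self.ino = ino
        self.dirty = list(dirty_records)
        self.flushed = threading.Event()   # set once nothing is left dirty


def flusher(work_queue, media_log):
    """Backend thread: drain each queued inode completely, then wake the waiter."""
    while True:
        inode = work_queue.get()
        if inode is None:                  # sentinel: shut the flusher down
            break
        while inode.dirty:                 # keep going until it is all out
            media_log.append((inode.ino, inode.dirty.pop(0)))
        inode.flushed.set()                # wake up the waiting fsync


def fsync(inode, work_queue, timeout=5.0):
    """Frontend: queue the inode to the flusher and block until it is clean."""
    inode.flushed.clear()
    work_queue.put(inode)
    return inode.flushed.wait(timeout)     # True if the flusher finished in time
```

A caller would start flusher() in its own thread, then call fsync(inode, work_queue) from the frontend; the call returns only after the flusher has written out every dirty record for that inode.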
The issue HAMMER had is that the flusher has to perform multiple media
B-Tree operations, which can block on reads. I wound up going with
None of the standards require directory stability within the cached
dirent buffer. That is, if libc caches just the base seek point for
a set of directory entries, there is no requirement that the positioning
of elements beyond the first element at that base seek point be
consistent.
The standards only require stability at explicit seek points.
In fact, the standards explicitly sa...