HAMMER is progressing very well with only 3-4 big-ticket items left
to do:
* On-the-fly Recovery (<--- current WIP)
* Balancing
* Refactoring of the spike code
* Retention policy scan
Everything else is now working and reasonably stable. Of the remaining
items, only the spike code has any real algorithmic complexity. Recovery
and balancing just require brute force, and the physical record deletion
that the retention policy scan needs is already coded and working (only
the scan itself remains to be written).
I'm really happy with the progress I'm making on HAMMER.
--
I'm going to talk about the spike and balancing code for a moment. What
is a spike? Basically, when a cluster (a 64MB block of the disk) fills
up, a 'spike' needs to be driven into that cluster's B-Tree in order to
expand it into a new cluster. The spike basically forwards a portion of
the B-Tree's key space to a new cluster.
At the moment the spike takes the B-Tree cursor at a leaf after a failed
insertion (due to running out of space) and copies that whole leaf to a
new cluster, then replaces the reference in the internal node pointing
to that leaf with a pointer to the new cluster (the 'spike').
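To make the current behavior concrete, here is a minimal sketch in C of
spiking a single leaf. Every name in it (toy_leaf, toy_cluster, toy_elm,
toy_internal, toy_spike_leaf) is made up for illustration and is not
HAMMER's actual on-disk or in-memory layout. It also assumes leaves are
heap-allocated, which stands in for space inside a cluster.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy structures, NOT HAMMER's real ones. */
#define LEAF_MAX_RECORDS 8
#define NODE_MAX_ELMS    8

struct toy_leaf {
    int      count;
    uint64_t keys[LEAF_MAX_RECORDS];
};

struct toy_cluster {
    int              id;
    struct toy_leaf *root_leaf;  /* new cluster's B-Tree starts as one leaf */
};

enum toy_elm_type { TOY_ELM_LEAF, TOY_ELM_SPIKE };

struct toy_elm {
    uint64_t          key_beg;   /* inclusive start of the key range */
    uint64_t          key_end;   /* exclusive end of the key range */
    enum toy_elm_type type;
    union {
        struct toy_leaf    *leaf;     /* TOY_ELM_LEAF: leaf in this cluster */
        struct toy_cluster *cluster;  /* TOY_ELM_SPIKE: forward to a cluster */
    } u;
};

struct toy_internal {
    int            count;
    struct toy_elm elms[NODE_MAX_ELMS];
};

/*
 * Copy the leaf referenced by parent->elms[index] into a freshly
 * allocated cluster, then turn that element into a spike pointing at
 * the new cluster.  The element's key range is left untouched, so the
 * spike forwards only the key space the single leaf used to cover.
 * Returns the new cluster, or NULL if allocation fails.
 */
struct toy_cluster *
toy_spike_leaf(struct toy_internal *parent, int index, int new_cluster_id)
{
    struct toy_elm *elm;
    struct toy_cluster *ncl;
    struct toy_leaf *nleaf;

    assert(index >= 0 && index < parent->count);
    elm = &parent->elms[index];
    assert(elm->type == TOY_ELM_LEAF);

    ncl = malloc(sizeof(*ncl));
    nleaf = malloc(sizeof(*nleaf));
    if (ncl == NULL || nleaf == NULL) {
        free(ncl);
        free(nleaf);
        return NULL;
    }

    /* Copy the whole leaf into the new cluster. */
    memcpy(nleaf, elm->u.leaf, sizeof(*nleaf));
    ncl->id = new_cluster_id;
    ncl->root_leaf = nleaf;

    /* Release the old leaf and replace the reference with the spike. */
    free(elm->u.leaf);
    elm->type = TOY_ELM_SPIKE;
    elm->u.cluster = ncl;
    return ncl;
}
```

Note that key_beg and key_end never change in this sketch: the spike
inherits exactly the range of the one leaf it replaced, and nothing
more.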
This is extremely inefficient because the key space covered by the spike
is usually fairly small, so the newly spiked cluster might not come close
to filling up before another spike is needed.
Refactoring the spike code means doing a better job selecting the amount
of key space the spike can represent.
--
The balancing code is responsible for slowly cleaning up any
inefficiencies that build up in the filesystem. As its name
implies, the idea is to 'balance' the overall B-Tree representing
the filesystem. We want to slowly move physical data records
from higher level clusters to lower level clusters, eventually
winding up with a situation where the higher level clusters contain
only spi...