Re: HAMMER filesystem update - design document

Previous thread: Re: HAMMER filesystem update - design document by Gergo Szakal on Wednesday, October 10, 2007 - 6:00 pm. (1 message)

Next thread: Re: HAMMER filesystem update - design document by Matthew Dillon on Wednesday, October 10, 2007 - 9:14 pm. (2 messages)
To: <kernel@...>
Date: Wednesday, October 10, 2007 - 6:38 pm

No, it isn't a volume manager. It's simply that the filesystem
can be made up of multiple volumes. Each cluster (say, a 256M chunk)
is integrated into the filesystem-wide B-Tree and can only be addressed
by its parent or by the parent pointers of its children. This means
that clusters can be migrated with minimal work and thus can be migrated
while the filesystem is live. We don't have the situation that UFS
has, where random inodes in the filesystem directly reference
random data blocks elsewhere in the filesystem.
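
To make the pointer argument concrete, here is a minimal sketch of why moving a cluster touches so few references. It is a toy model, not HAMMER's on-disk format: the `Cluster` class, the `(volume, cluster)` location tuple, and the `migrate` helper are all names assumed for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical location type: (volume number, cluster number within that volume).
Location = Tuple[int, int]


@dataclass
class Cluster:
    location: Location
    parent: Optional[Location] = None                        # where our parent lives
    children: List[Location] = field(default_factory=list)   # where our children live


def migrate(clusters: Dict[Location, Cluster], old: Location, new: Location) -> int:
    """Move one cluster to a new location; return how many references were rewritten."""
    cluster = clusters.pop(old)
    cluster.location = new
    clusters[new] = cluster

    rewritten = 0
    # First kind of reference to us: the parent's child entry.
    if cluster.parent is not None:
        parent = clusters[cluster.parent]
        parent.children = [new if c == old else c for c in parent.children]
        rewritten += 1
    # Second kind: the parent pointer stored in each of our children.
    for child_location in cluster.children:
        clusters[child_location].parent = new
        rewritten += 1
    return rewritten
```

In this model the cost of a move is one update in the parent plus one per child, regardless of how large the rest of the filesystem is. Nothing else can hold a reference to the moved cluster, which is the property the UFS comparison is about.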

For example, if you had a HAMMER filesystem backed by two volumes you
could add a third volume, migrate all the data from the first volume
to the new volume, and then remove the first volume (make it not part
of the filesystem any more). Similarly you could migrate the clusters
at the end of a volume elsewhere and then contract that volume, or
you could expand a volume and tell HAMMER to use the new space.
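
The add / migrate / remove sequence described above can be sketched as follows. This is only a bookkeeping model under stated assumptions: the `Filesystem` class and its method names are hypothetical and do not correspond to any HAMMER command or API, and a volume is reduced to the list of cluster ids it holds.

```python
from typing import Dict, List


class Filesystem:
    """Toy model: each volume is just the list of cluster ids it currently holds."""

    def __init__(self) -> None:
        self.volumes: Dict[str, List[int]] = {}

    def add_volume(self, name: str) -> None:
        self.volumes[name] = []

    def migrate_all(self, src: str, dst: str) -> None:
        # Move clusters one at a time; in this model each move is independent,
        # which is what would let a live filesystem keep running between moves.
        while self.volumes[src]:
            self.volumes[dst].append(self.volumes[src].pop())

    def remove_volume(self, name: str) -> None:
        # Only an empty volume can be detached from the filesystem.
        if self.volumes[name]:
            raise ValueError(f"volume {name} still holds clusters")
        del self.volumes[name]


# Two-volume filesystem: add a third volume, evacuate the first, then drop it.
fs = Filesystem()
fs.add_volume("vol0")
fs.add_volume("vol1")
fs.volumes["vol0"] = [0, 1, 2]
fs.volumes["vol1"] = [3, 4]

fs.add_volume("vol2")
fs.migrate_all("vol0", "vol2")
fs.remove_volume("vol0")
```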

I am not going to try to implement RAID inside HAMMER when RAID can be
done with a software or hardware solution in another layer.

HAMMER will do what hardware and software storage solutions can't
easily or efficiently do, which is logical replication of the entire
filesystem. A logical replication allows the different replication
targets to retain varying amounts of filesystem history. For
example, your production filesystem might retain 30-second snapshots
for an hour and hourly snapshots for a day, while one of your
replication targets might retain hourly snapshots for a day and daily
snapshots for a month, etc.
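
As an illustration of per-target retention, the sketch below expresses the two example policies as tiers and filters a list of snapshot timestamps against them. The tier representation, the `retained` function, and the choice of 30 days for "a month" are assumptions made for this example; nothing here reflects how HAMMER actually stores or prunes history.

```python
from typing import Iterable, List, Tuple

HOUR = 3600
DAY = 24 * HOUR

# One tier = (spacing between snapshots, how far back that spacing is kept), in seconds.
Tier = Tuple[int, int]

PRODUCTION: List[Tier] = [(30, HOUR), (HOUR, DAY)]
REPLICA: List[Tier] = [(HOUR, DAY), (DAY, 30 * DAY)]  # "a month" assumed to be 30 days


def retained(snapshots: Iterable[int], now: int, policy: List[Tier]) -> List[int]:
    """Return the snapshot timestamps that at least one tier of the policy keeps."""
    keep = []
    for ts in snapshots:
        age = now - ts
        # A tier keeps a snapshot if it is recent enough for that tier
        # and falls on the tier's spacing boundary.
        if any(0 <= age <= span and ts % spacing == 0 for spacing, span in policy):
            keep.append(ts)
    return keep
```

Feeding the same stream of 30-second snapshots through `PRODUCTION` and `REPLICA` yields different histories from identical input, which is the point of replicating logically rather than block-for-block.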

Ultimately we will have a multi-master environment which will silently
handle whole or partial filesystem failures. In this case the type
of redundancy you need at the storage layer will depend on the number
of physical disks you need to use for each copy of the filesystem. If
your filesystem fits on one or two physical disks th...

To: <kernel@...>
Date: Wednesday, October 10, 2007 - 7:45 pm

This is the functional equivalent of a RAID1, and that is all HAMMER
provides; the point of RAIDZ (and RAID3,4,5,6,etc) is that you don't
need 2n bytes worth of disk for n bytes worth of usable storage, while
still keeping some level of resilience. There is something to be said for this
kind of scheme, namely not wasting as much disk space, but in the case
of RAID1,0,10,01, moving that to a different layer (e.g. Vinum) is good
enough.
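
The space argument can be put in numbers with a small calculation. This is a simplified sketch: the function names are made up for the example, and it ignores metadata, spares, and stripe padding. It assumes equal-sized disks, with one disk's worth of parity for RAID5-style layouts and two for RAID6-style layouts.

```python
def raw_bytes_mirror(usable: int, copies: int = 2) -> int:
    """Raw capacity needed to mirror `usable` bytes across `copies` full copies."""
    return usable * copies


def raw_bytes_parity(usable: int, disks: int, parity_disks: int = 1) -> float:
    """Raw capacity needed for `usable` bytes striped over `disks` equal disks,
    of which `parity_disks` worth of space holds parity."""
    data_disks = disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return usable * disks / data_disks
```

For 1000 bytes of usable storage, a two-way mirror needs `raw_bytes_mirror(1000)` raw bytes, while single parity over five disks needs `raw_bytes_parity(1000, 5)` and double parity over six disks needs `raw_bytes_parity(1000, 6, 2)`.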

In a clustering environment, it's not likely that you'll want anything
other than full replication, but at least on single-node storage
systems, using storage more efficiently has its uses, even though it
means longer recovery times.

Cheers,
--
Thomas E. Spanjaard
tgen@netphreax.net