No, it isn't a volume manager; it's simply that the filesystem
can be made up of multiple volumes. Each cluster (say, a 256M chunk)
is integrated into the filesystem-wide B-Tree and can only be addressed
by its parent or by the parent pointers of its children. This means
that clusters can be migrated with minimal work and thus can be migrated
while the filesystem is live. We don't have the situation found in
UFS, where random inodes in the filesystem directly reference
random data blocks elsewhere in the filesystem.

For example, if you had a HAMMER filesystem backed by two volumes you
could add a third volume, migrate all the data from the first volume
to the new volume, and then remove the first volume (make it not part
of the filesystem any more). Similarly you could migrate the clusters
at the end of a volume elsewhere and then contract that volume, or
you could expand a volume and tell HAMMER to use the new space.
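To illustrate why a cluster that is only referenced by its parent and its
children is cheap to move, here is a minimal toy sketch in Python. It is not
HAMMER's on-disk format or code: the ToyFS class, the (volume, slot)
addressing, and the volume names are all assumptions made up for the
example. It walks through the add-a-third-volume, migrate, remove-the-first
sequence described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical addressing: (volume name, slot within that volume).
Location = Tuple[str, int]


@dataclass
class Cluster:
    parent: Optional[Location] = None                       # where the parent lives
    children: List[Location] = field(default_factory=list)  # where the children live


class ToyFS:
    """Toy model: clusters are reachable only via parent/child references."""

    def __init__(self) -> None:
        self.volumes: Dict[str, Dict[int, Cluster]] = {}

    def add_volume(self, name: str) -> None:
        self.volumes[name] = {}

    def get(self, loc: Location) -> Cluster:
        return self.volumes[loc[0]][loc[1]]

    def add_cluster(self, loc: Location, parent: Optional[Location] = None) -> None:
        self.volumes[loc[0]][loc[1]] = Cluster(parent=parent)
        if parent is not None:
            self.get(parent).children.append(loc)

    def migrate(self, old: Location, new: Location) -> int:
        """Move one cluster and return how many references were rewritten."""
        cluster = self.volumes[old[0]].pop(old[1])
        self.volumes[new[0]][new[1]] = cluster
        touched = 0
        if cluster.parent is not None:
            # Only the parent's reference to this cluster changes...
            refs = self.get(cluster.parent).children
            refs[refs.index(old)] = new
            touched += 1
        for child in cluster.children:
            # ...plus the parent pointer held by each child.
            self.get(child).parent = new
            touched += 1
        return touched

    def remove_volume(self, name: str) -> None:
        if self.volumes[name]:
            raise ValueError("volume still holds clusters")
        del self.volumes[name]


fs = ToyFS()
for name in ("vol0", "vol1"):
    fs.add_volume(name)
fs.add_cluster(("vol0", 0))                       # root cluster
fs.add_cluster(("vol0", 1), parent=("vol0", 0))
fs.add_cluster(("vol1", 0), parent=("vol0", 1))

fs.add_volume("vol2")                             # add a third volume
for slot in sorted(fs.volumes["vol0"]):           # drain the first volume
    fs.migrate(("vol0", slot), ("vol2", slot))
fs.remove_volume("vol0")                          # vol0 is no longer part of the fs
```

In this model each move rewrites one reference in the parent plus one per
child, and nothing else in the tree needs to be scanned. That is the
property that would make migrating while the filesystem is live plausible.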
I am not going to try to implement RAID inside HAMMER when RAID can be
done with a software or hardware solution in another layer.

HAMMER will do what hardware and software storage solutions can't
easily or efficiently do, which is logical replication of the entire
filesystem. A logical replication allows the different replication
targets to retain varying amounts of filesystem history. For
example, your production filesystem might retain 30 second snapshots
for an hour and hourly for the day, while one of your replication
targets might retain hourly snapshots for a day and daily snapshots
for a month, etc.
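As a rough illustration of per-target retention, the sketch below applies
two different policies to the same snapshot history. Everything in it is an
assumption for the sake of the example: the tier representation, the
function name, treating "a month" as 30 days, and keeping a snapshot when
its timestamp is aligned to a tier's spacing. It says nothing about how
HAMMER itself prunes history.

```python
from typing import Iterable, List, Tuple

# A tier is (spacing in seconds, maximum age in seconds); hypothetical format.
Tier = Tuple[int, int]

HOUR, DAY = 3600, 86400

# The two policies from the example above; "a month" is assumed to be 30 days.
PRODUCTION: List[Tier] = [(30, HOUR), (HOUR, DAY)]
REPLICA: List[Tier] = [(HOUR, DAY), (DAY, 30 * DAY)]


def retained(snapshots: Iterable[int], now: int, policy: List[Tier]) -> List[int]:
    """Return the snapshot timestamps that at least one tier wants to keep."""
    kept = []
    for ts in snapshots:
        age = now - ts
        if any(0 <= age <= max_age and ts % spacing == 0
               for spacing, max_age in policy):
            kept.append(ts)
    return kept


now = 40 * DAY
# Two days of history, one snapshot every 30 seconds.
history = range(now - 2 * DAY, now + 1, 30)

prod = retained(history, now, PRODUCTION)
repl = retained(history, now, REPLICA)
print(len(prod), len(repl))
```

The same stream of snapshots ends up pruned differently on each target,
which is what a block-level mirror cannot offer, since both sides of a
mirror must hold identical blocks.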
Ultimately we will have a multi-master environment which will silently
handle whole or partial filesystem failures. In this case the type
of redundancy you need at the storage layer will depend on the number
of physical disks you need to use for each copy of the filesystem. If
your filesystem fits on one or two physical disks th...

This is the functional equivalent of a RAID1, and that is all HAMMER
provides; the point of RAIDZ (and RAID3,4,5,6,etc) is that you don't
need 2n bytes worth of disk for n bytes worth of usable storage, while
still keeping some level of resilience. There is something to be said for this
kind of scheme, namely not wasting as much disk space, but in the case
of RAID1,0,10,01, moving that to a different layer (e.g. Vinum) is good
enough.
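The space argument can be put into numbers with a small sketch. The disk
counts below (a 2-disk mirror, a 5-disk single-parity set, a 6-disk
double-parity set) are assumptions picked for illustration, and the helper
names are hypothetical; the arithmetic is the usual "data disks over total
disks" ratio.

```python
from fractions import Fraction


def usable_fraction(disks: int, redundant: int) -> Fraction:
    """Fraction of raw capacity left for data when `redundant` disks' worth
    of space is spent on mirror copies or parity."""
    if not 0 <= redundant < disks:
        raise ValueError("need at least one disk's worth of data space")
    return Fraction(disks - redundant, disks)


def raw_needed(usable_bytes: int, disks: int, redundant: int) -> Fraction:
    """Raw bytes required to end up with `usable_bytes` of usable storage."""
    return usable_bytes / usable_fraction(disks, redundant)


n = 1000  # usable bytes wanted

mirror = raw_needed(n, disks=2, redundant=1)         # RAID1-style full copy
single_parity = raw_needed(n, disks=5, redundant=1)  # RAID5-style, assumed 5 disks
double_parity = raw_needed(n, disks=6, redundant=2)  # RAID6-style, assumed 6 disks

print(mirror, single_parity, double_parity)
```

A full copy always costs 2n raw bytes for n usable bytes, while the parity
layouts get closer to n as the number of disks grows.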
In a clustering environment, it's not likely that you'll want anything
other than full replication, but at least on single-node storage
systems, using storage more efficiently has its uses, even though it
means longer recovery times.
Cheers,
--
Thomas E. Spanjaard
tgen@netphreax.net
