You are missing the broader point of both papers. They (and people like
me, back when I was at EMC) look at large numbers of machines and try to
fix what actually breaks in the real world and causes data loss.
The motherboards, S-ATA controllers, and disk types are the same class of
parts that I have in my desktop box today.
The advantage of Google, national labs, etc. is that they have large
numbers of systems and can draw conclusions that are meaningful to our
broad user base.
Specifically, in using S-ATA drives (just like ours, maybe slightly more
reliable) they see up to 7% of those drives fail each year. All users
have "soft" drive failures like single remapped sectors.
These errors are extremely common and are what RAID deals with well.
What is not common is that, during the RAID rebuild (started only after
a drive is kicked out), you push the power button or have a second
failure (power outage).
We will have more users lose data if they decide to use ext2 instead of
ext3 and use only single disk storage.
We have real numbers that show that is true. Injecting double faults
into a system that handles single faults is frankly not that interesting.
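
A rough back-of-the-envelope sketch of that comparison, in Python. Only
the 7% annual failure figure comes from the numbers above. The rebuild
time, the outage rate, and the constant-rate (Poisson) model are made-up
assumptions for illustration, so treat the result as an order-of-magnitude
guess and not as measured data.

```python
import math

HOURS_PER_YEAR = 24 * 365


def prob_at_least_one(rate_per_year: float, window_hours: float) -> float:
    """Chance of at least one event inside the window, assuming events
    arrive at a constant average rate (a simplifying assumption)."""
    expected_events = rate_per_year * window_hours / HOURS_PER_YEAR
    return 1.0 - math.exp(-expected_events)


AFR = 0.07               # up to 7% of drives fail per year (figure quoted above)
REBUILD_HOURS = 12.0     # ASSUMED rebuild time, hypothetical
OUTAGES_PER_YEAR = 2.0   # ASSUMED power-loss events per year, hypothetical

# Single fault: a given drive fails at some point during the year.
p_single = AFR

# Double fault: the drive fails AND power is lost while the rebuild runs.
p_outage_in_rebuild = prob_at_least_one(OUTAGES_PER_YEAR, REBUILD_HOURS)
p_double = AFR * p_outage_in_rebuild

print(f"single fault per drive-year: {p_single:.4%}")
print(f"double fault per drive-year: {p_double:.4%}")
print(f"single faults are roughly {p_single / p_double:.0f}x more likely")
```

Under these assumed inputs the double fault comes out a couple of orders
of magnitude rarer than the single fault. Changing REBUILD_HOURS or
OUTAGES_PER_YEAR shifts the ratio, but it takes fairly extreme values to
close the gap.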
You can get better protection from these double faults if you move to
"cloud" like storage configs where each box is fault tolerant, but you
also spread your data over multiple boxes in multiple locations.
Regards,
Ric