Hmm. What does "not being able to handle failed writes" actually
mean? AFAICS, there are two possible answers: "all bets are off", or
"we'll tell you about the problem, and all bets are off".
Right. And a lot of database systems make the same assumption.
Oracle Berkeley DB cannot deal with partial page writes at all, and
PostgreSQL assumes that it's safe to flip a few bits in a sector
without proper WAL (it doesn't care if the changes actually hit the
disk, but the write shouldn't make the sector unreadable or put random
bytes there).
The DMA transaction should fail due to ECC errors, though.
I think the general idea is to protect valuable data with WAL. You
overwrite pages on disk only after you've made a backup copy in the WAL.
After a power loss event, you replay the log and overwrite any garbage
that might be there. For the WAL itself, you rely on checksums and
sequence numbers. This still doesn't help against write failures where
the system continues running (because the fsync() during checkpointing
isn't guaranteed to report errors), but it should deal with the power
failure case. But this assumes that the file system protects its own
data structures in a similar way. Is this really too much to demand?
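
To make the checksum-and-sequence-number idea concrete, here is a rough
sketch of how a log reader can tell valid records from garbage after a
crash. It is not how PostgreSQL or Berkeley DB actually lay out their
logs; the record format (HEADER) and the names encode_record() and
replay() are made up for illustration, and the log is just a byte
string in memory, so no real disk I/O or fsync() is involved.

```python
import struct
import zlib

# Hypothetical on-disk record layout (an assumption for this sketch):
#   8-byte sequence number, 4-byte payload length, 4-byte CRC32, payload.
HEADER = struct.Struct("<QII")


def _checksum(seq, payload):
    # The CRC covers the sequence number and length as well as the payload,
    # so a stale record left over from an older log cycle cannot pass as new.
    return zlib.crc32(struct.pack("<QI", seq, len(payload)) + payload)


def encode_record(seq, payload):
    """Serialize one log record (for example, a backup copy of a page)."""
    return HEADER.pack(seq, len(payload), _checksum(seq, payload)) + payload


def replay(log, first_seq=0):
    """Return the payloads of all valid records, in order.

    Replay stops at the first record that is truncated, has an unexpected
    sequence number, or fails its checksum. Everything from that point on
    is treated as garbage from an interrupted write.
    """
    payloads = []
    offset = 0
    expected_seq = first_seq
    while offset + HEADER.size <= len(log):
        seq, length, crc = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(log):
            break  # record body was cut off
        payload = log[start:end]
        if seq != expected_seq:
            break  # stale or out-of-order record
        if _checksum(seq, payload) != crc:
            break  # torn or corrupted record
        payloads.append(payload)
        offset = end
        expected_seq += 1
    return payloads
```

A reader built like this only covers the power failure case: it can
reject a torn tail of the log, but it cannot notice a write that was
silently dropped while the system kept running.
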
Partial failures are extremely difficult to deal with because of their
asynchronous nature. I've come to accept that, but it's still
disappointing.
--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99