On Thu, 27 May 2010 00:31:44 +0900, Minchan Kim said:
I'm pretty sure that what Cesar meant was that the following could happen:
1) Write block 11983 on the disk, checksum 34FE9B72.
(... time passes.. maybe weeks)
2) Attempt to write block 11983 on disk with checksum AE9F3581. The write fails
due to a power failure or something.
(... more time passes...)
3) Read block 11983, get back data with checksum 34FE9B72. Checksum matches,
and there's no indication that the write in (2) ever failed. The program
proceeds thinking it's just read back the most recently written data, when in
fact it's just read an older version of that block. Problems can ensue if the
data just read is now out of sync with *other* blocks of data - instant data
corruption.
To be fair, we currently have the "read a stale block" problem after crashes
already. The issue is that BLK_DEV_INTEGRITY can't provide a solution here,
but most users will form a mental image that it *is* in fact giving them
that guarantee. The resulting mismatch between reality and expectations
cannot end well.