On Wed, Aug 25 2010 at 4:00am -0400,
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> wrote:
Right, there are hardware configurations where FLUSH retries matter,
namely:
1) a SAS drive with 2 ports and a writeback cache
2) theoretically possible: SCSI array that is mpath capable but
advertises cache as writeback (WCE=1)
The SAS case is obviously a more concrete example of why FLUSH retries
are worthwhile in mpath.
But I understand (and agree) that we'd be better off if mpath could
differentiate between failures rather than blindly retrying on failures
like it does today (it fails the path and retries if additional paths
are available).
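To make that distinction concrete, here is a rough userspace sketch of
the two retry policies. It is not dm-mpath code: the enum, the function
names and the two failure classes are made up for illustration, and a
real implementation would have to derive the class from whatever error
information the lower layers actually report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical failure classes; not taken from the dm-mpath source. */
enum flush_failure {
	FLUSH_FAIL_PATH,	/* path/transport problem: another path may work */
	FLUSH_FAIL_TARGET,	/* the target itself failed the request */
};

/*
 * Blind policy, as described above: fail the path and retry whenever
 * another path is available, regardless of why the FLUSH failed.
 */
static bool retry_blind(enum flush_failure why, int paths_left)
{
	(void)why;		/* the cause is deliberately ignored */
	assert(paths_left >= 0);
	return paths_left > 0;
}

/*
 * Differentiated policy: retry only when the failure is path-related
 * and there is still a path left to try.
 */
static bool retry_differentiated(enum flush_failure why, int paths_left)
{
	assert(paths_left >= 0);
	return why == FLUSH_FAIL_PATH && paths_left > 0;
}
```

With one path left, retry_blind() says "retry" for a target failure
while retry_differentiated() does not. The two policies only disagree
in that case.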
I'm not seeing where anything is broken with current mpath. If a
multipathed LUN is WCE=1 then it should be fair to assume the cache is
mirrored or shared across ports. Therefore retrying the SYNCHRONIZE
CACHE on another path is needed.
Do we still fear that SYNCHRONIZE CACHE can silently drop data?
Seems unlikely, especially given what Tejun shared from SBC.
It seems that at worst, with current mpath, we retry when it doesn't
make sense (e.g. target failure).
Mike