Re: [RFC][PATCH] md: avoid fullsync if a faulty member missed a dirty transition

From: Mike Snitzer
Date: Tuesday, May 6, 2008 - 4:58 am

On Tue, May 6, 2008 at 2:53 AM, Neil Brown <neilb@suse.de> wrote:

Hi Neil,

I definitely could be misinterpreting something.  However, I did
determine that if the write-mostly NBD member of the raid1 becomes
faulty while the raid1 is being written to, its 'events' count is
frequently one less than the 'events_cleared' of the local raid1
member that the array gets reassembled with first.  The event counts
indicate that the NBD member is clean and the local member is dirty.

I'm using internal bitmaps.  I've focused on the even->odd
(clean->dirty) transition to rationalize the safety of allowing the
NBD member to be off by one _and_ clean.  That could easily be
superficial but it seems significant.

It looks like bitmap_update_sb()'s incrementing of events_cleared (on
behalf of the local member) could be racing with the NBD member
becoming faulty (thereby making the array degraded).  This allows
events_cleared to reflect a clean->dirty transition that last occurred
before the array became degraded.  My reasoning is: if it was a
clean->dirty transition, the associated dirty bit is still set in the
local member's bitmap, so using the bitmap to resync is valid.
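
To make the condition concrete, here is a rough standalone sketch of
the check I have in mind.  It is not the patch itself and not the md
code: the helper names (events_is_clean, bitmap_resync_ok) and their
parameters are made up for illustration, and it assumes an even event
count means clean and an odd one means dirty, as described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: an even event count is taken to mean "clean". */
static bool events_is_clean(uint64_t events)
{
	return (events & 1) == 0;
}

/*
 * Hypothetical helper: decide whether a returning member may be
 * resynced from the bitmap instead of forcing a full sync.
 *
 *  - If the member's event count has not fallen behind events_cleared,
 *    the bitmap covers everything the member missed.
 *  - Otherwise, tolerate the member being exactly one behind, but only
 *    if its count is even, i.e. the step it missed was clean->dirty.
 */
static bool bitmap_resync_ok(uint64_t member_events, uint64_t events_cleared)
{
	if (member_events >= events_cleared)
		return true;

	/* member_events < events_cleared here, so +1 cannot overflow */
	return events_is_clean(member_events) &&
	       member_events + 1 == events_cleared;
}
```

With this, a member at events 10 against events_cleared 11 would take
the bitmap path, while a member at 9 against 10 (a missed dirty->clean
step) or at 8 against 11 would still get a full sync.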

thanks,
Mike
--