Re: Suggestion needed for fixing RAID6

From: MRK
Date: Tuesday, April 27, 2010 - 4:02 pm

On 04/27/2010 05:50 PM, Janos Haar wrote:

I think I can see a problem here:
You had 11 active devices out of 12 when you received the read error.
With 11 of 12 devices your array is singly-degraded, which should still be
enough for raid6 to recompute the block from parity and perform the
rewrite, correcting the read error. Instead, MD declared the error
impossible to correct and dropped one more device (going to
doubly-degraded).

I think this is an MD bug, and I think I know where it is:


--- linux-2.6.33-vanilla/drivers/md/raid5.c     2010-02-24 19:52:17.000000000 +0100
+++ linux-2.6.33/drivers/md/raid5.c     2010-04-27 23:58:31.000000000 +0200
@@ -1526,7 +1526,7 @@ static void raid5_end_read_request(struc

                 clear_bit(R5_UPTODATE, &sh->dev[i].flags);
                 atomic_inc(&rdev->read_errors);
-               if (conf->mddev->degraded)
+               if (conf->mddev->degraded == conf->max_degraded)
                         printk_rl(KERN_WARNING
                                   "raid5:%s: read error not correctable "
                                   "(sector %llu on %s).\n",

------------------------------------------------------
(This is only compile-tested, so try it at your own risk)

I'd like to hear what Neil thinks of this...

The problem here (apart from the erroneous error message) is that if
execution enters that "if" clause, it will eventually reach the
md_error() call some 30 lines below, which drops one more device and
worsens the situation instead of recovering from it. As far as I
understand, that is not the correct behaviour in this case.
As it stands, raid6 behaves as if it were raid5, effectively
tolerating only one failed disk.
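
To make the difference between the two conditions concrete, here is a
small userspace sketch. It is not kernel code: struct array_state and
both helper functions are hypothetical names made up for illustration,
and they model only the one comparison the patch changes, assuming
max_degraded is 1 for raid5 and 2 for raid6.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two values the patch compares.
 * This is NOT the kernel's data layout, just an illustration. */
struct array_state {
    int degraded;      /* number of member devices currently missing */
    int max_degraded;  /* failures the level tolerates (assumed: raid5 = 1, raid6 = 2) */
};

/* Condition before the patch: any degradation at all makes a
 * read error count as "not correctable". */
static bool uncorrectable_old(const struct array_state *a)
{
    assert(a->degraded >= 0 && a->max_degraded >= 0);
    return a->degraded != 0;
}

/* Condition proposed in the patch: a read error is "not correctable"
 * only once all redundancy has been used up. */
static bool uncorrectable_new(const struct array_state *a)
{
    assert(a->degraded >= 0 && a->max_degraded >= 0);
    return a->degraded == a->max_degraded;
}
```

For a singly-degraded raid6 (degraded = 1, max_degraded = 2) the old
condition is true while the proposed one is false, which is exactly the
11-of-12 case above. For raid5 (max_degraded = 1) both conditions agree
whenever the array is clean or singly-degraded, so in this model the
change only affects raid6.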
