Re: 2.6.23.1: mdadm/raid5 hung/d-state

To: Justin Piszcz <jpiszcz@...>
Cc: Carlos Carvalho <carlos@...>, Jeff Lessem <Jeff@...>, <root@...>, Dan Williams <dan.j.williams@...>, BERTRAND Joël <joel.bertrand@...>, Neil Brown <neilb@...>, <linux-kernel@...>, <linux-raid@...>, <xfs@...>
Date: Friday, November 9, 2007 - 10:09 am

On Nov 9, 2007 7:14 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:

In our case, all processes using md4, including md4_resync, stay in D state.
Call Trace:
  [<ffffffff803615ac>] __generic_unplug_device+0x13/0x24
  [<ffffffff803622cf>] generic_unplug_device+0x18/0x28
  [<ffffffff803f2cf7>] get_active_stripe+0x22b/0x472
...
see dmesg (sysrq t) attached.
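
For anyone trying to confirm the same symptom, here is a minimal sketch for listing tasks stuck in D state (uninterruptible sleep). The helper name filter_d_state is hypothetical and only for illustration; it assumes input lines of the form "PID STAT COMMAND", as printed by ps -eo pid,stat,comm.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): keep only the lines
# whose second column (process state) starts with "D".
# Expects "PID STAT COMMAND" lines on stdin.
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# Example use (left commented out so the helper can be sourced safely):
#   ps -eo pid,stat,comm | filter_d_state
```

A task dump like the attached one can be requested as root with "echo t > /proc/sysrq-trigger", provided sysrq is enabled on the machine; the traces then show up in dmesg.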

We can reproduce this problem on two machines with the same configuration:
  - 2 x Dual-Core Opteron 2.8GHz
  - 8GB memory
  - 3ware 9000 with 10 x 750GB sata disks
  - Debian Etch x86_64
  - raid5 + xfs (/dev/md4)
with all of these stock kernels:
  - 2.6.22.11, 2.6.22.12, 2.6.23.1, 2.6.24-rc2
running:
  - for i in f{0..7}; do (dd bs=1M count=100000 if=/dev/zero of=$i &); done

If we increase /sys/block/md4/md/stripe_cache_size, the device and the
processes go back to work.
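
The workaround above could be applied roughly as follows. This is only a sketch: the function name set_stripe_cache and the value 4096 are assumptions for illustration, since the report does not say which value was used. Writing to the sysfs file requires root.

```shell
#!/bin/sh
# Hypothetical helper: show the current stripe_cache_size and write a new one.
#   $1 = path to the stripe_cache_size file
#   $2 = new value (number of stripe cache entries)
set_stripe_cache() {
    file=$1
    new=$2
    old=$(cat "$file") || return 1      # fail early if the file is unreadable
    echo "stripe_cache_size: $old -> $new"
    echo "$new" > "$file"
}

# Example use as root (4096 is an illustrative value, not from the report):
#   set_stripe_cache /sys/block/md4/md/stripe_cache_size 4096
```

The setting is not persistent, so it has to be reapplied after a reboot or after the array is reassembled.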

Messages in current thread:
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Neil Brown, (Sun Nov 4, 5:49 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Mon Nov 5, 4:36 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Chuck Ebbert, (Wed Nov 7, 12:39 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Wed Nov 7, 12:48 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Thu Nov 8, 7:42 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Justin Piszcz, (Thu Nov 8, 8:44 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Justin Piszcz, (Sun Nov 4, 5:51 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Dan Williams, (Mon Nov 5, 2:35 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Jeff Lessem, (Tue Nov 6, 7:18 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Justin Piszcz, (Mon Nov 5, 2:35 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Dan Williams, (Mon Nov 5, 8:19 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Tue Nov 6, 6:19 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Dan Williams, (Tue Nov 6, 9:25 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Wed Nov 7, 7:20 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Jeff Lessem, (Wed Nov 7, 1:00 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Carlos Carvalho, (Thu Nov 8, 5:40 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Justin Piszcz, (Fri Nov 9, 5:14 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Fabiano Silva, (Fri Nov 9, 10:09 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Bill Davidsen, (Thu Nov 8, 1:45 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Dan Williams, (Thu Nov 8, 2:02 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Jeff Lessem, (Fri Nov 9, 4:36 pm)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Justin Piszcz, (Tue Nov 6, 7:29 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Tue Nov 6, 7:39 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, Justin Piszcz, (Tue Nov 6, 7:42 am)
Re: 2.6.23.1: mdadm/raid5 hung/d-state, BERTRAND Joël, (Tue Nov 6, 8:20 am)