Re: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi

To: Andrew Morton <akpm@...>
Cc: Chuck Ebbert <cebbert@...>, Greg KH <gregkh@...>, Chakri n <chakriin5@...>, Peter Zijlstra <a.p.zijlstra@...>, Krzysztof Oledzki <olel@...>, linux-pm <linux-pm@...>, lkml <linux-kernel@...>, richard kennedy <richard@...>, Ingo Molnar <mingo@...>
Date: Tuesday, October 2, 2007 - 8:13 am

On Mon, Oct 01, 2007 at 07:14:57PM -0700, Andrew Morton wrote:

There are only two 'break' conditions in the loop:
1. nr_dirty + nr_unstable + nr_writeback < dirty_limit
   => *mostly* FALSE for a busy system
   => *always* FALSE in Chakri's stuck-NFS case
2. nr_written >= 6MB
   for a light-load bdi:
   => *never* TRUE until many new writers arrive, contributing
      more dirty pages to sync
   => worse, those new writers will also get stuck here...
      the obvious imbalance here is:
           each writer contributes only 32KB of new dirty pages, but
           wants to consume (not necessarily available) 6MB

So loooong = min(global-less-busy-time, bdi-many-new-writers-arrival-time).
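The two exit tests can be played through with a toy model. This is only a
sketch under loud assumptions, not the kernel code: pages are 4KB, the
light-load bdi's pages leave the global count as soon as they are written,
the other device's pages never drain, and passes_until_exit() and its
parameters are made-up names.

```c
#include <assert.h>

#define PAGE_KB      4L                      /* assumed page size */
#define WRITE_CHUNK  (6L * 1024 / PAGE_KB)   /* the 6MB target = 1536 pages */

/*
 * Hypothetical model of one writer throttled on a light-load bdi.
 *   other_pages: dirty/unstable/writeback pages held by the busy or stuck
 *                device (assumed constant for the whole run)
 *   dirty_limit: global limit, in pages
 *   bdi_dirty:   dirty pages queued on the light-load bdi
 * Returns the number of write-out passes made before the writer may leave,
 * or -1 if it is still looping after max_passes.
 */
long passes_until_exit(long other_pages, long dirty_limit,
                       long bdi_dirty, long max_passes)
{
    long nr_written = 0;

    assert(other_pages >= 0 && bdi_dirty >= 0 && max_passes >= 0);

    for (long pass = 0; pass < max_passes; pass++) {
        /* break condition 1: global counters are below the limit */
        if (other_pages + bdi_dirty < dirty_limit)
            return pass;

        /* write out whatever this bdi has, up to the remaining target */
        long room = WRITE_CHUNK - nr_written;
        long wrote = bdi_dirty < room ? bdi_dirty : room;
        nr_written += wrote;
        bdi_dirty -= wrote;

        /* break condition 2: this writer has written its 6MB */
        if (nr_written >= WRITE_CHUNK)
            return pass + 1;
    }
    return -1;   /* neither condition ever became true */
}
```

With the other device far over the limit, a writer that brought only 8 pages
(32KB) never leaves: passes_until_exit(100000, 50000, 8, 1000) returns -1.
It gets out only if the bdi happens to hold 6MB of dirty pages, or if the
global count drops below the limit.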


You are right in the reasoning. The exact consequence is:
        the light-load sdb is made as _unresponsive_ as the busy sda

Hence Chakri's case: whenever NFS is stuck, every device gets stuck.


In theory, every CPU/parallel writer could contribute 8 pages of error.
Hence we get 1MB/32KB = 32 (CPUs/writers).
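The arithmetic, spelled out as a small helper. The 4KB page size is an
assumption (it is what makes 8 pages equal 32KB), and writers_to_fill() is a
made-up name used only for illustration.

```c
#include <assert.h>

/*
 * How many writers, each contributing pages_per_writer pages of
 * page_kb kilobytes, does it take to fill budget_kb kilobytes?
 */
long writers_to_fill(long budget_kb, long pages_per_writer, long page_kb)
{
    long per_writer_kb = pages_per_writer * page_kb;

    assert(per_writer_kb > 0);   /* avoid dividing by zero */
    return budget_kb / per_writer_kb;
}
```

writers_to_fill(1024, 8, 4) gives the 32 CPUs/writers above. The same helper
applied to the 6MB write target, writers_to_fill(6 * 1024, 8, 4), gives 192
writers at 32KB each.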

One more serious problem: a busy writer could also drain all the
dirty pages and make (nr_writeback == dirty_limit+1MB). In that case,
I suspect the light-load sdb writer still has a good chance of
making progress (needs confirmation).


Not well tested so far. My system becomes unusable soon after
starting the NFS write (even before unplugging the network). I'm seeing
large latencies in try_to_wake_up(). Hopefully Ingo can help with that.


Yeah, Peter and I were both aware of the timing.
This patch is only meant for 2.6.23 and 2.6.22.10.

Fengguang
