Thanks.
Last night's run went well: that patch does indeed seem to have fixed it.
Looking at the timings (some variance but _very_ much less than the night
before), there does appear to be some other occasional slight slowdown -
but I've no reason to suspect your patch for it, nor to suppose it's
something new: it may just be an artifact of my heavy swap thrashing.
[PATCH mm] mm per-device dirty threshold fix
Fix occasional hang when a task couldn't get out of balance_dirty_pages:
mm-per-device-dirty-threshold.patch needs to reevaluate bdi_nr_writeback
across all cpus when bdi_thresh is low, even in the case when there was
no bdi_nr_reclaimable.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
---
mm/page-writeback.c | 53 +++++++++++++++++++-----------------------
1 file changed, 24 insertions(+), 29 deletions(-)
--- 2.6.23-rc6-mm1/mm/page-writeback.c 2007-09-18 12:28:25.000000000 +0100
+++ linux/mm/page-writeback.c 2007-09-19 20:00:46.000000000 +0100
@@ -379,7 +379,7 @@ static void balance_dirty_pages(struct a
bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
- break;
+ break;
if (!bdi->dirty_exceeded)
bdi->dirty_exceeded = 1;
@@ -392,39 +392,34 @@ static void balance_dirty_pages(struct a
*/
if (bdi_nr_reclaimable) {
writeback_inodes(&wbc);
-
+ pages_written += write_chunk - wbc.nr_to_write;
get_dirty_limits(&background_thresh, &dirty_thresh,
&bdi_thresh, bdi);
+ }
- /*
- * In order to avoid the stacked BDI deadlock we need
- * to ensure we accurately count the 'dirty' pages when
- * the threshold is low.
- *
- * Otherwise it would be possible to get thresh+n pages
- * reported dirty, even though there are thresh-m pages
- * actually dirty; with m+n sitting in the percpu
- * deltas.
- */
- if (bdi_thresh < 2*bdi_stat_error(bdi)) {
- bdi_nr_reclaimable =
- bdi_stat_sum(bdi, BDI_RECLAIMABLE);
- bdi_nr_writeback =
- bdi_stat_sum(bdi, BDI_WRITEBACK);
- } else {
- bdi_nr_reclaimable =
- bdi_stat(bdi, BDI_RECLAIMABLE);
- bdi_nr_writeback =
- bdi_stat(bdi, BDI_WRITEBACK);
- }
+ /*
+ * In order to avoid the stacked BDI deadlock we need
+ * to ensure we accurately count the 'dirty' pages when
+ * the threshold is low.
+ *
+ * Otherwise it would be possible to get thresh+n pages
+ * reported dirty, even though there are thresh-m pages
+ * actually dirty; with m+n sitting in the percpu
+ * deltas.
+ */
+ if (bdi_thresh < 2*bdi_stat_error(bdi)) {
+ bdi_nr_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
+ bdi_nr_writeback = bdi_stat_sum(bdi, BDI_WRITEBACK);
+ } else if (bdi_nr_reclaimable) {
+ bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
+ bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
+ }
- if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
- break;
+ if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
+ break;
+ if (pages_written >= write_chunk)
+ break; /* We've done our duty */
- pages_written += write_chunk - wbc.nr_to_write;
- if (pages_written >= write_chunk)
- break; /* We've done our duty */
- }
congestion_wait(WRITE, HZ/10);
}
-