On Thu, Aug 26, 2010 at 04:14:13PM +0100, Mel Gorman wrote:
Suddenly, many processes could enter into the direct reclaim path by another
reason(ex, fork bomb) regradless of congestion. backing dev congestion is
just one of them.
I think if congestion_wait returns without calling io_schedule_timeout
by your patch, too_many_isolated can schedule_timeout to wait for the system's
calm to preventing OOM killing.
How about this?
If you don't mind, I will send the patch based on this patch series
after your patch settle down or Could you add this to your patch series?
But I admit this doesn't almost affect your experiment.
From 70d6584e125c3954d74a69bfcb72de17244635d2 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan.kim@gmail.com>
Date: Fri, 27 Aug 2010 02:06:45 +0900
Subject: [PATCH] Wait regardless of congestion if too many pages are isolated
Suddenly, many processes could enter into the direct reclaim path
regradless of congestion. backing dev congestion is just one of them.
But current implementation calls congestion_wait if too many pages are isolated.
if congestion_wait returns without calling io_schedule_timeout,
too_many_isolated can schedule_timeout to wait for the system's calm
to preventing OOM killing.
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/backing-dev.c | 5 ++---
mm/compaction.c | 6 +++++-
mm/vmscan.c | 6 +++++-
3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6abe860..9431bca 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -756,8 +756,7 @@ EXPORT_SYMBOL(set_bdi_congested);
* @timeout: timeout in jiffies
*
* Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit
- * write congestion. If no backing_devs are congested then just wait for the
- * next write to be completed.
+ * write congestion. If no backing_devs are congested then just returns.
*/
long congestion_wait(int sync, long timeout)
{
@@ -776,7 +775,7 @@ long congestion_wait(int sync, long timeout)
if (atomic_read(&nr_bdi_congested[sync]) == 0) {
unnecessary = true;
cond_resched();
- ret = 0;
+ ret = timeout;
} else {
prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
ret = io_schedule_timeout(timeout);
diff --git a/mm/compaction.c b/mm/compaction.c
index 94cce51..7370683 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -253,7 +253,11 @@ static unsigned long isolate_migratepages(struct zone *zone,
* delay for some time until fewer pages are isolated
*/
while (unlikely(too_many_isolated(zone))) {
- congestion_wait(BLK_RW_ASYNC, HZ/10);
+ long timeout = HZ/10;
+ if (timeout == congestion_wait(BLK_RW_ASYNC, timeout)) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout(timeout);
+ }
if (fatal_signal_pending(current))
return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3109ff7..f5e3e28 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1337,7 +1337,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
unsigned long nr_dirty;
while (unlikely(too_many_isolated(zone, file, sc))) {
- congestion_wait(BLK_RW_ASYNC, HZ/10);
+ long timeout = HZ/10;
+ if (timeout == congestion_wait(BLK_RW_ASYNC, timeout)) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout(timeout);
+ }
/* We are about to die and free our memory. Return now. */
if (fatal_signal_pending(current))
--
1.7.0.5
--
Kind regards,
Minchan Kim
--