Re: [RFC PATCH 0/3] Do not wait the full timeout on congestion_wait when there is no congestion

Previous thread: [PATCH] Alpha: Fix a missing comma in sys_osf_statfs() by David Howells on Thursday, August 26, 2010 - 9:44 am. (2 messages)

Next thread: [HELP] Not getting new mails from LKML any more. by U Feng on Thursday, August 26, 2010 - 10:32 am. (2 messages)
From: Minchan Kim
Date: Thursday, August 26, 2010 - 10:20 am

Suddenly, many processes could enter into the direct reclaim path by another
reason(ex, fork bomb) regradless of congestion. backing dev congestion is 
just one of them. 

I think if congestion_wait returns without calling io_schedule_timeout 
by your patch, too_many_isolated can schedule_timeout to wait for the system's 
calm to preventing OOM killing.

How about this?

If you don't mind, I will send the patch based on this patch series 
after your patch settle down or Could you add this to your patch series?
But I admit this doesn't almost affect your experiment. 

From 70d6584e125c3954d74a69bfcb72de17244635d2 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan.kim@gmail.com>
Date: Fri, 27 Aug 2010 02:06:45 +0900
Subject: [PATCH] Wait regardless of congestion if too many pages are isolated

Suddenly, many processes could enter into the direct reclaim path
regradless of congestion. backing dev congestion is just one of them.
But current implementation calls congestion_wait if too many pages are isolated.

if congestion_wait returns without calling io_schedule_timeout,
too_many_isolated can schedule_timeout to wait for the system's calm
to preventing OOM killing.

Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
---
 mm/backing-dev.c |    5 ++---
 mm/compaction.c  |    6 +++++-
 mm/vmscan.c      |    6 +++++-
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6abe860..9431bca 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -756,8 +756,7 @@ EXPORT_SYMBOL(set_bdi_congested);
  * @timeout: timeout in jiffies
  *
  * Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit
- * write congestion.  If no backing_devs are congested then just wait for the
- * next write to be completed.
+ * write congestion.  If no backing_devs are congested then just returns.
  */  
 long congestion_wait(int sync, long timeout)
 {
@@ -776,7 +775,7 @@ long congestion_wait(int sync, long timeout)
     ...
From: Mel Gorman
Date: Thursday, August 26, 2010 - 10:31 am

More likely, to stop a loop in too_many_isolated() consuming CPU time it


I think the reasoning here might be a little off. How about;

If many processes enter direct reclaim or memory compaction, too many pages
can get isolated. In this situation, too_many_isolated() can call
congestion_wait() but if there is no congestion, it fails to go to sleep
and instead spins until it's quota expires.

This patch checks if congestion_wait() returned without sleeping. If it
did because there was no congestion, it unconditionally goes to sleep

We don't really need the timeout variable here but I see what you are
at. It's unfortunate to just go to sleep for HZ/10 but if it's not
congestion, we do not have any other event to wake up on at the moment.
We'd have to introduce a too_many_isolated waitqueue that is kicked if
pages are put back on the LRU.


This seems very reasonable. I'll review it more carefully tomorrow and if I
spot nothing horrible, I'll add it onto the series. I'm not sure I'm hitting
the too_many_isolated() case but I cannot think of a better alternative
without adding more waitqueues.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--

From: Minchan Kim
Date: Thursday, August 26, 2010 - 10:50 am

I think the situation applys with this series. That's because old behavior was calling
schedule regardless of I/O congested as seeing io_schedule_timeout.


I thought it firstly but first of all, let's make sure how often this situation happens 


-- 
Kind regards,
Minchan Kim
--

From: Wu Fengguang
Date: Thursday, August 26, 2010 - 6:21 pm

Minchan,

It's much cleaner to keep the unchanged congestion_wait() and add a
congestion_wait_check() for converting problematic wait sites. The
too_many_isolated() wait is merely a protective mechanism, I won't
bother to improve it at the cost of more code.

Thanks,
--

From: Minchan Kim
Date: Thursday, August 26, 2010 - 6:41 pm

Hi, Wu.


You means following as?

        while (unlikely(too_many_isolated(zone, file, sc))) {
                congestion_wait_check(BLK_RW_ASYNC, HZ/10);

                /* We are about to die and free our memory. Return now. */
                if (fatal_signal_pending(current))
                        return SWAP_CLUSTER_MAX;

-- 
Kind regards,
Minchan Kim
--

From: Wu Fengguang
Date: Thursday, August 26, 2010 - 6:50 pm

No, I mean do not change the too_many_isolated() related code at all :)
And to use congestion_wait_check() in other places that we can prove
there is a problem that can be rightly fixed by changing to

Thanks,
Fengguang
--

From: Minchan Kim
Date: Thursday, August 26, 2010 - 7:02 pm

I always suffer from understanding your comment.
Apparently, my eyes have a problem. ;(

This patch is dependent of Mel's series.
With changing congestion_wait with just return when no congestion, it
would have CPU hogging in too_many_isolated. I think it would apply in
Li's congestion_wait_check, too.
If no change is current congestion_wait, we doesn't need this patch.

Still, maybe I can't understand your comment. Sorry.
-- 
Kind regards,
Minchan Kim
--

From: Wu Fengguang
Date: Thursday, August 26, 2010 - 9:34 pm

Sorry! The confusion must come from the modified congestion_wait() by
Mel. My proposal is to _not_ modify congestion_wait(), but add another
congestion_wait_check() which won't sleep 100ms when no IO. In this
way, the following chunks become unnecessary.

--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -253,7 +253,11 @@ static unsigned long isolate_migratepages(struct zone *zone,
         * delay for some time until fewer pages are isolated
         */
        while (unlikely(too_many_isolated(zone))) {
-               congestion_wait(BLK_RW_ASYNC, HZ/10);
+               long timeout = HZ/10;
+               if (timeout == congestion_wait(BLK_RW_ASYNC, timeout)) {
+                       set_current_state(TASK_INTERRUPTIBLE);
+                       schedule_timeout(timeout);
+               }
               
                if (fatal_signal_pending(current))
                        return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3109ff7..f5e3e28 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1337,7 +1337,11 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
        unsigned long nr_dirty;
        while (unlikely(too_many_isolated(zone, file, sc))) {
-               congestion_wait(BLK_RW_ASYNC, HZ/10);
+               long timeout = HZ/10;
+               if (timeout == congestion_wait(BLK_RW_ASYNC, timeout)) {
+                       set_current_state(TASK_INTERRUPTIBLE);
+                       schedule_timeout(timeout);
+               }
               
                /* We are about to die and free our memory. Return now. */
                if (fatal_signal_pending(current))

Thanks,
Fengguang
--

From: Mel Gorman
Date: Friday, August 27, 2010 - 2:38 am

This is what I've done. I dropped the patch again and am using
wait_iff_congested(). I left the too_many_isolated() callers as
congestion_wait().

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--

Previous thread: [PATCH] Alpha: Fix a missing comma in sys_osf_statfs() by David Howells on Thursday, August 26, 2010 - 9:44 am. (2 messages)

Next thread: [HELP] Not getting new mails from LKML any more. by U Feng on Thursday, August 26, 2010 - 10:32 am. (2 messages)