Re: [PATCH -mm 04/25] free swap space on swap-in/activation

Previous thread: [PATCH] firewire: fill_bus_reset_event needs lock protection by Stefan Richter on Friday, June 6, 2008 - 4:11 pm. (2 messages)

Next thread: [PATCH -mm 12/25] pageflag helpers for configed-out flags by Rik van Riel on Friday, June 6, 2008 - 4:28 pm. (1 message)
To: <linux-kernel@...>
Cc: Andrew Morton <akpm@...>, Lee Schermerhorn <lee.schermerhorn@...>, Kosaki Motohiro <kosaki.motohiro@...>, MinChan Kim <minchan.kim@...>
Date: Friday, June 6, 2008 - 4:28 pm

From: Rik van Riel <riel@redhat.com>

Free swap cache entries when swapping in pages if vm_swap_full()
[swap space > 1/2 used]. Uses new pagevec to reduce pressure
on locks.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: MinChan Kim <minchan.kim@gmail.com>

---
include/linux/pagevec.h | 1 +
include/linux/swap.h | 6 ++++++
mm/swap.c | 18 ++++++++++++++++++
mm/swapfile.c | 25 ++++++++++++++++++++++---
mm/vmscan.c | 7 +++++++
5 files changed, 54 insertions(+), 3 deletions(-)

Index: linux-2.6.26-rc2-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.26-rc2-mm1.orig/mm/vmscan.c 2008-05-23 14:21:33.000000000 -0400
+++ linux-2.6.26-rc2-mm1/mm/vmscan.c 2008-05-23 14:21:33.000000000 -0400
@@ -619,6 +619,9 @@ free_it:
continue;

activate_locked:
+ /* Not a candidate for swapping, so reclaim swap space. */
+ if (PageSwapCache(page) && vm_swap_full())
+ remove_exclusive_swap_page_ref(page);
SetPageActive(page);
pgactivate++;
keep_locked:
@@ -1203,6 +1206,8 @@ static void shrink_active_list(unsigned
__mod_zone_page_state(zone, NR_ACTIVE, pgmoved);
pgmoved = 0;
spin_unlock_irq(&zone->lru_lock);
+ if (vm_swap_full())
+ pagevec_swap_free(&pvec);
__pagevec_release(&pvec);
spin_lock_irq(&zone->lru_lock);
}
@@ -1212,6 +1217,8 @@ static void shrink_active_list(unsigned
__count_zone_vm_events(PGREFILL, zone, pgscanned);
__count_vm_events(PGDEACTIVATE, pgdeactivate);
spin_unlock_irq(&zone->lru_lock);
+ if (vm_swap_full())
+ pagevec_swap_free(&pvec);

pagevec_release(&pvec);
}
Index: linux-2.6.26-rc2-mm1/mm/swap.c
===================================================================
--- linux-2.6.26-rc2-mm1.orig/mm/swap.c 2008-05-23 14:21:33.000000000 -0400
+++ linux-2.6.26-rc2-mm1/...

To: Rik van Riel <riel@...>
Cc: <linux-kernel@...>, <lee.schermerhorn@...>, <kosaki.motohiro@...>, <minchan.kim@...>, Hugh Dickins <hugh@...>
Date: Friday, June 6, 2008 - 9:04 pm

On Fri, 06 Jun 2008 16:28:42 -0400

The patch puts rather a lot of pressure onto vm_swap_full(). We might
want to look into optimising that.

- Is the 50% thing optimum? Could go higher and perhaps should be
based on amount-of-memory.

- Can precalculate the fraction rather than doing it inline all the time.

- Can make total_swap_pages __read_mostly and have a think about
nr_swap_pages too.

- Can completely optimise the thing away if !CONFIG_SWAP.

What's going on here.

Normally we'll bump a page's refcount to account for its presence in a
pagevec. This code doesn't do that.

Is it safe? If so, how come?

Suitable code comments should be added which explain this unusual and
dangerous optimisation. Or fix the bug :)

--

To: Rik van Riel <riel@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <lee.schermerhorn@...>, <kosaki.motohiro@...>, Hugh Dickins <hugh@...>
Date: Sunday, June 8, 2008 - 10:14 pm

I think we can optimize more and more in case of if !CONFIG_SWAP.
If system are !CONFIG_SWAP, we can never swap out anonymous pages.

Don't we manage anonymous pages with list ?

Such system don't need to insert anonymous pages into lru list when fault occur.
It is just needless overhead.

Also, If we can't reclaim anonymous pages, don't we need anon rmap facility ?
I don't know well which subsystems used rmap.

If pageout only use anonymous rmap for pageout,
we can remove anonymous rmapping code in case of !CONFIG_SWAP

How about your opinion ?

--
Kinds regards,
MinChan Kim
--

To: MinChan Kim <minchan.kim@...>
Cc: Rik van Riel <riel@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <lee.schermerhorn@...>, Hugh Dickins <hugh@...>
Date: Monday, June 9, 2008 - 9:38 am

You are right.
but I think rmap isn't bottle neck in embedded system.
Why do you want remove that?
--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Rik van Riel <riel@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <lee.schermerhorn@...>, Hugh Dickins <hugh@...>
Date: Monday, June 9, 2008 - 10:30 pm

On Mon, Jun 9, 2008 at 10:38 PM, KOSAKI Motohiro

I think unnecessary code execution may increase power consumption and
cache footprint.
But I am not sure pageout just only one user in rmap.
If another subsystem (ex, xen, kvm) use rmap for managing memory, we
cannot remove rmap.

--
Kinds regards,
MinChan Kim
--

To: MinChan Kim <minchan.kim@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <lee.schermerhorn@...>, <kosaki.motohiro@...>, Hugh Dickins <hugh@...>
Date: Sunday, June 8, 2008 - 10:42 pm

On Mon, 9 Jun 2008 11:14:55 +0900

You are right that it can be done. However, doing all of those
!CONFIG_SWAP optimizations are not my priority right now. Most
of them would also apply separately to the current upstream VM.

Patches are welcome, though.

--
All rights reversed.
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <lee.schermerhorn@...>, <kosaki.motohiro@...>, <minchan.kim@...>, Hugh Dickins <hugh@...>
Date: Saturday, June 7, 2008 - 3:56 pm

On Fri, 6 Jun 2008 18:04:30 -0700

I do not know if 50% is optimum. It is just what the upstream kernel
has had since 2.4.10 or so. Before that it used to be 75%. This same

I wonder if we wouldn't be off best simply placing the two on their
own cache line. After all, they are often handled together.

I believe that should be a separate patch though, since I am not
changing that situation from what is already upstream and this

With CONFIG_SWAP=n the macro PageSwapCache(page) will always be
declared false due to the declarations in page-flags.h. That
means that vm_swap_full() will be evaluated and the compiler
should leave it out.

#ifdef CONFIG_SWAP
PAGEFLAG(SwapCache, swapcache)
#else
PAGEFLAG_FALSE(SwapCache)

It is safe because the callers already hold an extra reference to
each page. I have added a full kerneldoc comment to this function
to explain that.

--
All rights reversed.
--

Previous thread: [PATCH] firewire: fill_bus_reset_event needs lock protection by Stefan Richter on Friday, June 6, 2008 - 4:11 pm. (2 messages)

Next thread: [PATCH -mm 12/25] pageflag helpers for configed-out flags by Rik van Riel on Friday, June 6, 2008 - 4:28 pm. (1 message)