[PATCH][RFC] evict streaming IO cache first

Previous thread: Re: 2.6.26-rc9-git9 doesn't boot on Macintel by Justin Mattock on Tuesday, July 15, 2008 - 3:52 pm. (1 message)

Next thread: Re: [stable] Linux 2.6.25.10 by Linus Torvalds on Tuesday, July 15, 2008 - 4:18 pm. (56 messages)
To: <linux-kernel@...>
Cc: <akpm@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, KOSAKI Motohiro <kosaki.motohiro@...>
Date: Tuesday, July 15, 2008 - 4:09 pm

This patch still needs some testing under various workloads
on different hardware - the approach should work but the
threshold may need tweaking.

When there is a lot of streaming IO going on, we do not want
to scan or evict pages from the working set. The old VM used
to skip any mapped page, but still evict indirect blocks and
other data that is useful to cache.

This patch adds logic to skip scanning the anon lists and
the active file list if most of the file pages are on the
inactive file list (where streaming IO pages live), while
at the lowest scanning priority.

If the system is not doing a lot of streaming IO, e.g. the
system is running a database workload, then more often used
file pages will be on the active file list and this logic
is automatically disabled.
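
As a rough illustration of the check described above, here is a
minimal user-space sketch. It is not the patch itself: the enum,
the helper names and the "inactive file pages outnumber active
file pages" threshold are all assumptions made for the example,
and the real threshold may well need tweaking.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-zone LRU lists (not the kernel enum). */
enum lru_kind { INACTIVE_ANON, ACTIVE_ANON, INACTIVE_FILE, ACTIVE_FILE, NR_LRU };

#define DEF_PRIORITY    12
/* Lowest scanning priority: one step below the default starting point. */
#define PRIO_CACHE_ONLY (DEF_PRIORITY + 1)

/*
 * Assumed threshold: "most file pages are on the inactive file list"
 * is modelled as inactive_file > active_file.
 */
static bool cache_only_pass(int priority, const unsigned long pages[NR_LRU])
{
    return priority == PRIO_CACHE_ONLY &&
           pages[INACTIVE_FILE] > pages[ACTIVE_FILE];
}

/*
 * On a cache-only pass, zero the scan target of every list except the
 * inactive file list, so the working set is left alone. Otherwise the
 * scan targets are left untouched.
 */
static void limit_scan_to_streaming_cache(int priority,
                                          const unsigned long pages[NR_LRU],
                                          unsigned long nr_to_scan[NR_LRU])
{
    if (!cache_only_pass(priority, pages))
        return;

    for (int l = 0; l < NR_LRU; l++) {
        if (l != INACTIVE_FILE)
            nr_to_scan[l] = 0;
    }
}
```

With a database-like workload, where the active file list holds
most of the file pages, cache_only_pass() returns false and all
lists are scanned as usual.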

Signed-off-by: Rik van Riel <riel@redhat.com>
---
include/linux/mmzone.h | 1 +
mm/vmscan.c | 18 ++++++++++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)

Index: linux-2.6.26-rc8-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.26-rc8-mm1.orig/include/linux/mmzone.h 2008-07-07 15:41:32.000000000 -0400
+++ linux-2.6.26-rc8-mm1/include/linux/mmzone.h 2008-07-15 14:58:50.000000000 -0400
@@ -453,6 +453,7 @@ static inline int zone_is_oom_locked(con
* queues ("queue_length >> 12") during an aging round.
*/
#define DEF_PRIORITY 12
+#define PRIO_CACHE_ONLY (DEF_PRIORITY + 1)

/* Maximum number of zones on a zonelist */
#define MAX_ZONES_PER_ZONELIST (MAX_NUMNODES * MAX_NR_ZONES)
Index: linux-2.6.26-rc8-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.26-rc8-mm1.orig/mm/vmscan.c 2008-07-07 15:41:33.000000000 -0400
+++ linux-2.6.26-rc8-mm1/mm/vmscan.c 2008-07-15 15:10:05.000000000 -0400
@@ -1481,6 +1481,20 @@ static unsigned long shrink_zone(int pri
}
}

+ /*
+ * If there is a lot of sequential IO going on, most of the
+ * file pages will be on the i...

To: Rik van Riel <riel@...>
Cc: <linux-kernel@...>, <Lee.Schermerhorn@...>, <kosaki.motohiro@...>
Date: Tuesday, July 15, 2008 - 4:48 pm

On Tue, 15 Jul 2008 16:09:48 -0400

I'd be surprised if indirect blocks are getting kicked - they tend to
be awfully sticky due to frequent touch_buffer()s or equivalent.

inode blocks tend to be pretty sticky too - this is affected a lot by
whether or not atime updates are enabled.

directory blocks might be less sticky, but that might be what we want
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <Lee.Schermerhorn@...>, <kosaki.motohiro@...>, <lwoodman@...>
Date: Tuesday, July 15, 2008 - 5:52 pm

On Tue, 15 Jul 2008 13:48:48 -0700

Agreed. In my initial testing this patch seems to bring the
behaviour of the kernel closer to the behaviour the old VM had.

If you rewrite a large enough file, they get kicked. This
has become noticeable some time between 2.6.9 and 2.6.18, but
I don't think we can point to any particular changeset that
caused it - and even if we do, chances are it does more good
than harm :)

--
All Rights Reversed
--
