Re: [PATCH -mm] vmscan: bail out of page reclaim after swap_cluster_max pages

From: Mel Gorman
Date: Wednesday, November 19, 2008 - 9:54 am

On Thu, Nov 13, 2008 at 05:12:08PM -0500, Rik van Riel wrote:

Is this not strictly true as this is used as a running count?

/* Number of pages freed so far during a call to shrink_zones() */ 


nit, spurious whitespace change there.


This triggered alarm bells for me because I thought it would affect lumpy
reclaim. However, lumpy reclaim happens at a lower level and what I'd
expect to happen is that nr_reclaimed be at least the number of base pages
making up a high-order block for example. Thinking about it, this should be
safe but I ran it through the old anti-frag tests for hugepage allocations
(basically allocating hugepages under compile-load).

On some machines the situation improved very slightly but on one, the
success rates under load were severely impaired. On all machines at rest,
a one-shot attempt to resize the hugepage pool is resulting in much lower
success figures. However, multiple attempts eventually succeed and aggressive
resizing of the hugepage pool is resulting in higher success rates on all
but one machine.

Bottom line, hugepage pool resizing is taking more attempts but still
succeeding. While I'm not happy about the one-attempt hugepage pool resizing
being worse, I strongly suspect it's because the current reclaim algorithm
reclaims aggressively as a percentage of total memory, and the new behaviour
seems to make more sense overall. I'll re-examine how dynamic pool resizing
is done and look at making it better if this change goes through.

With that out of the way, I also tried thinking about what this change really
means and have a few questions. This is all hand-wavy on my part and possibly
clueless so take it with a grain of salt. Basically the change comes down to:

o A process doing direct reclaim is not reclaiming a number of pages
  based on memory size and reclaim priority any more. Instead, it'll reclaim
  a bit and then check to see what the situation is.
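
To make that concrete for myself, here is a rough userspace sketch of the
difference. It is not the kernel code: every toy_* name is made up, the zone
is reduced to a single counter, every scanned page is assumed to be
reclaimable, and SWAP_CLUSTER_MAX is simply set to 32 for illustration. With
bail_out clear, the whole target derived from list size and priority is
reclaimed; with it set, the loop stops once a cluster's worth has been freed.

```c
#include <stddef.h>

/* Batch size, set to 32 here purely for illustration. */
#define SWAP_CLUSTER_MAX 32UL

/* Hypothetical stand-in for a zone: just a count of reclaimable pages. */
struct toy_zone {
	unsigned long reclaimable;
};

/* Scan target derived from list size and reclaim priority. */
static unsigned long toy_scan_target(const struct toy_zone *z, int priority)
{
	return z->reclaimable >> priority;
}

/*
 * Reclaim from the zone in batches of SWAP_CLUSTER_MAX pages.
 * bail_out == 0: keep going until the whole scan target is consumed.
 * bail_out != 0: stop as soon as SWAP_CLUSTER_MAX pages have been reclaimed.
 * Assumes every scanned page is successfully reclaimed.
 */
static unsigned long toy_shrink_zone(struct toy_zone *z, int priority,
				     int bail_out)
{
	unsigned long nr_to_scan = toy_scan_target(z, priority);
	unsigned long nr_reclaimed = 0;

	while (nr_to_scan > 0) {
		unsigned long batch = nr_to_scan < SWAP_CLUSTER_MAX ?
				      nr_to_scan : SWAP_CLUSTER_MAX;

		z->reclaimable -= batch;
		nr_reclaimed += batch;
		nr_to_scan -= batch;

		if (bail_out && nr_reclaimed >= SWAP_CLUSTER_MAX)
			break;
	}

	return nr_reclaimed;
}
```

For a toy zone with 1 << 20 reclaimable pages at priority 6, the first form
reclaims the full 16384-page target while the second stops after 32 pages and
leaves the caller to re-check the situation.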

Q1. There is already a check higher up to bail out when more than
swap_cluster_max pages are reclaimed. Should that check now be eliminated
as redundant, since it takes place "too late", when a lot of memory may
already have been unnecessarily reclaimed?

Q2. In low-memory situations, it would appear that one process entering
direct reclaim (e.g. a process doing all the dirtying) might also have ended
up doing all of the cleaning. Is there a danger that a heavy-dirtying process
is now going to offload its cleaning work in small bits and pieces to every
other process entering direct reclaim?

Q3. Related to Q2, would it make sense to exclude kswapd from the check? On
the plus side, it may get to be the sucker process that does the scanning
and reclaim. On the downside, it may reclaim way more memory than is needed
to bring the free pages above the high watermark and become a variation of
the problem you are trying to solve here.
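
A hand-wavy sketch of what I mean in Q3, again in userspace C with made-up
toy_* names and an is_kswapd flag standing in for however the caller would
really be identified. The first helper exempts kswapd from the bail-out; the
second puts a number on the downside by counting how many pages end up
reclaimed beyond what was needed to reach the high watermark.

```c
#include <stdbool.h>

/*
 * Hypothetical bail-out test: direct reclaimers stop once they have
 * reclaimed cluster_max pages, kswapd never bails out early.
 */
static bool toy_should_bail(unsigned long nr_reclaimed,
			    unsigned long cluster_max, bool is_kswapd)
{
	if (is_kswapd)
		return false;

	return nr_reclaimed >= cluster_max;
}

/*
 * Pages reclaimed beyond what was needed to lift the free page count
 * to the high watermark. Returns 0 if the watermark was not exceeded.
 */
static unsigned long toy_overshoot(unsigned long nr_free,
				   unsigned long high_wmark,
				   unsigned long nr_reclaimed)
{
	unsigned long after = nr_free + nr_reclaimed;

	return after > high_wmark ? after - high_wmark : 0;
}
```

With 1000 pages free, a high watermark of 1200 and an exempted kswapd
reclaiming a 16384-page target in full, toy_overshoot() reports 16184 pages
more than were needed, which is the variation of the original problem I am
worried about.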

Q4. Less reclaim also means less scanning which means the aging information
of the pages on the lists is that bit older too. Do we care?

I was going to ask if it was easier to go OOM now, but even under very high
stress, we should be making forward progress. It's just in smaller steps so
I can't see it causing a problem.


-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab