[PATCH] prevent kswapd from freeing excessive amounts of lowmem

Previous thread: [PATCH 7/7] cxgb3 - Update engine microcode version by Divy Le Ray on Wednesday, September 5, 2007 - 3:58 pm. (1 message)

Next thread: Why do so many machines need "noapic"? by Chuck Ebbert on Wednesday, September 5, 2007 - 4:30 pm. (16 messages)
From: Rik van Riel
Date: Wednesday, September 5, 2007 - 4:01 pm

The current VM can get itself into trouble fairly easily on systems
with a small ZONE_HIGHMEM, which is common on i686 computers with
1GB of memory.

On one side, page_alloc() will allocate down to zone->pages_low,
while on the other side, kswapd() and balance_pgdat() will try
to free memory from every zone, until every zone has more free
pages than zone->pages_high.

Highmem can be filled up to zone->pages_low with page tables,
ramfs, vmalloc allocations and other unswappable things quite
easily and without many bad side effects, since we still have
a huge ZONE_NORMAL to do future allocations from.

However, as long as the number of free pages in the highmem
zone is below zone->pages_high, kswapd will continue swapping
things out from ZONE_NORMAL, too!
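
The runaway behaviour described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `struct sim_zone`, `old_pass()`, `all_balanced()` and `old_kswapd()` are hypothetical names, reclaim is modelled as moving a fixed batch of pages per pass, and the pass cap exists only so the model terminates. It is not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a memory zone. */
struct sim_zone {
    unsigned long free_pages;   /* pages currently free */
    unsigned long pages_high;   /* kswapd's target watermark */
    unsigned long reclaimable;  /* pages that could still be paged out */
};

/* One pass in the pre-patch model: every zone is shrunk, needed or not. */
static void old_pass(struct sim_zone *z, size_t n, unsigned long batch)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long freed = z[i].reclaimable < batch ? z[i].reclaimable : batch;
        z[i].reclaimable -= freed;
        z[i].free_pages += freed;
    }
}

/* kswapd is done only when every zone has more free pages than pages_high. */
static bool all_balanced(const struct sim_zone *z, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (z[i].free_pages <= z[i].pages_high)
            return false;
    return true;
}

/* Runs passes until balanced or max_passes is hit; returns passes run. */
static unsigned old_kswapd(struct sim_zone *z, size_t n,
                           unsigned long batch, unsigned max_passes)
{
    unsigned passes = 0;
    while (passes < max_passes && !all_balanced(z, n)) {
        old_pass(z, n, batch);
        passes++;
    }
    return passes;
}
```

With a highmem zone that has nothing reclaimable and sits below pages_high, the loop in this model never becomes balanced, so the normal zone keeps losing pages on every pass until the cap is reached.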

Sami Farin managed to get his system into a state where kswapd
had freed about 700MB of low memory and was still "going strong".

The attached patch will make kswapd stop paging out data from
zones when there is more than enough memory free.  We do go above
zone->pages_high in order to keep pressure between zones equal
in normal circumstances, but the patch should prevent the kind
of excesses that made Sami's computer totally unusable.
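
A minimal sketch of the two thresholds involved, assuming a simplified zone structure. `struct zone_sketch` and both helper names are hypothetical, and the factor 8 is an assumption borrowed from the "8*zone->pages_high" expression quoted later in this thread; the patch itself is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_sketch {
    unsigned long free_pages;
    unsigned long pages_low;
    unsigned long pages_high;
};

/* kswapd's usual goal: more free pages than pages_high. */
static bool zone_is_balanced(const struct zone_sketch *z)
{
    return z->free_pages > z->pages_high;
}

/*
 * "More than enough memory free": well above pages_high, so that zones
 * normally still receive equal pressure. The multiplier 8 is assumed.
 */
static bool zone_has_plenty_free(const struct zone_sketch *z)
{
    return z->free_pages > 8 * z->pages_high;
}
```

A zone can be balanced without having plenty free; only the second condition would exempt it from further paging in this sketch.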

Please merge this into -mm.

Signed-off-by: Rik van Riel <riel@redhat.com>
From: Andrew Morton
Date: Wednesday, September 5, 2007 - 6:23 pm

hm.  Did highmem's all_unreclaimable get set?  If so perhaps we could use

I guess for a very small upper zone and a very large lower zone this could
still put the scan balancing out of whack, fixable by a smarter version of
"8*zone->pages_high" but it doesn't seem very likely that this will affect
things much.

Why doesn't direct reclaim need similar treatment?

-

From: Rik van Riel
Date: Thursday, September 6, 2007 - 9:38 am

Because we only go into the direct reclaim path once
every zone is at or below zone->pages_low, and the
direct reclaim path will exit once we have freed more
than swap_cluster_max pages.
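
The two conditions above can be sketched as a pair of predicates. This is an illustration only: `struct dr_zone`, `should_enter_direct_reclaim()` and `direct_reclaim_done()` are hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-zone state. */
struct dr_zone {
    unsigned long free_pages;
    unsigned long pages_low;
};

/* Direct reclaim starts only once every zone is at or below pages_low. */
static bool should_enter_direct_reclaim(const struct dr_zone *z, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (z[i].free_pages > z[i].pages_low)
            return false;
    return true;
}

/* Direct reclaim stops once more than swap_cluster_max pages are freed. */
static bool direct_reclaim_done(unsigned long nr_reclaimed,
                                unsigned long swap_cluster_max)
{
    return nr_reclaimed > swap_cluster_max;
}
```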

-- 
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
-

From: Andrew Morton
Date: Thursday, September 6, 2007 - 3:34 pm

hm.  Now I need to remember why direct-reclaim does that :(
-

From: Rik van Riel
Date: Thursday, September 6, 2007 - 3:47 pm

Mlock can cause the problem too.  As for all_unreclaimable,
it is ignored when priority == DEF_PRIORITY, balance_pgdat

This is done so the system does not end up with the first
process that goes into page reclaim staying there forever,
while the other processes in the system happily consume
the pages freed by that poor first process.

There may be other reasons, too.

-

From: Pavel Machek
Date: Friday, September 7, 2007 - 5:24 am

That does not seem right. Having empty HIGHMEM and full LOWMEM would
be very bad, right? We may stop freeing when there's enough LOWMEM
free, but not if there's only HIGHMEM free.

							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Rik van Riel
Date: Saturday, September 8, 2007 - 1:20 pm

Please read the code this patch applies to.

The check I add conditionalizes the individual
calls to shrink_zone(), so we do not call
shrink_zone() for a zone that has a ton of free
pages.  We still call shrink_zone() for the
other zones.
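
A sketch of that per-zone structure, assuming a simplified zone type. `struct pgdat_zone`, `shrink_zone_stub()` and `balance_pass()` are hypothetical, the stub only counts calls, and the factor 8 is assumed from the "8*zone->pages_high" expression quoted earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical, simplified zone; shrink_calls only records activity. */
struct pgdat_zone {
    const char *name;
    unsigned long free_pages;
    unsigned long pages_high;
    unsigned long shrink_calls;
};

/* Stand-in for shrink_zone(): counts how often a zone was shrunk. */
static void shrink_zone_stub(struct pgdat_zone *z)
{
    z->shrink_calls++;
}

/* One balancing pass: the check guards each call individually. */
static void balance_pass(struct pgdat_zone *zones, size_t nr_zones)
{
    for (size_t i = 0; i < nr_zones; i++) {
        /* Skip only this zone if it already has a ton of free pages. */
        if (zones[i].free_pages > 8 * zones[i].pages_high)
            continue;
        shrink_zone_stub(&zones[i]);
    }
}
```

In this model an almost empty highmem zone is skipped while a full lowmem zone is still shrunk, because the decision is made per zone rather than for the node as a whole.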

-
