The current VM can get itself into trouble fairly easily on systems with a small ZONE_HIGHMEM, which is common on i686 computers with 1GB of memory.

On one side, page_alloc() will allocate down to zone->pages_low, while on the other side, kswapd() and balance_pgdat() will try to free memory from every zone, until every zone has more free pages than zone->pages_high.

Highmem can be filled up to zone->pages_low with page tables, ramfs, vmalloc allocations and other unswappable things quite easily and without many bad side effects, since we still have a huge ZONE_NORMAL to do future allocations from. However, as long as the number of free pages in the highmem zone is below zone->pages_high, kswapd will continue swapping things out from ZONE_NORMAL, too!

Sami Farin managed to get his system into a state where kswapd had freed about 700MB of low memory and was still "going strong".

The attached patch will make kswapd stop paging out data from zones when there is more than enough memory free. We do go above zone->pages_high in order to keep pressure between zones equal in normal circumstances, but the patch should prevent the kind of excesses that made Sami's computer totally unusable.

Please merge this into -mm.

Signed-off-by: Rik van Riel <riel@redhat.com>
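To make the failure mode concrete, here is a small user-space model of the balancing loop described above. It is only a sketch, not the kernel code: `struct zone_model`, `balance_model()`, `batch`, `max_passes` and `skip_plentiful` are hypothetical names invented for illustration, and the "more than enough" threshold of 8 * pages_high is taken from the figure mentioned later in this thread. The model assumes each pass can free at most `batch` pages per zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct zone. */
struct zone_model {
    unsigned long free_pages;   /* pages currently free */
    unsigned long reclaimable;  /* pages that reclaim could still free */
    unsigned long pages_high;   /* kswapd's per-zone free target */
};

/* True once the zone has "more than enough" free memory (assumed 8x). */
static bool zone_has_plenty_free(const struct zone_model *z)
{
    return z->free_pages > 8 * z->pages_high;
}

/* True while at least one zone does not yet exceed its pages_high. */
static bool any_zone_below_high(const struct zone_model *zones, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (zones[i].free_pages <= zones[i].pages_high)
            return true;
    return false;
}

/*
 * Model of the balancing loop: every pass tries to free up to `batch`
 * pages from each zone, until all zones exceed pages_high or
 * `max_passes` is reached. With `skip_plentiful` set, zones that
 * already have plenty of free pages are left alone.
 * Returns the total number of pages reclaimed.
 */
static unsigned long balance_model(struct zone_model *zones, size_t n,
                                   unsigned long batch, unsigned max_passes,
                                   bool skip_plentiful)
{
    unsigned long reclaimed = 0;

    assert(n > 0);
    for (unsigned pass = 0;
         pass < max_passes && any_zone_below_high(zones, n); pass++) {
        for (size_t i = 0; i < n; i++) {
            struct zone_model *z = &zones[i];

            if (skip_plentiful && zone_has_plenty_free(z))
                continue;   /* do not shrink a zone with plenty free */

            unsigned long got =
                z->reclaimable < batch ? z->reclaimable : batch;
            z->reclaimable -= got;
            z->free_pages += got;
            reclaimed += got;
        }
    }
    return reclaimed;
}
```

With a large reclaimable "normal" zone and a small "highmem" zone that is below pages_high but has nothing reclaimable, the unmodified loop keeps draining the normal zone on every pass, while the capped variant stops touching it once it crosses the threshold.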
Hm. Did highmem's all_unreclaimable get set? If so perhaps we could use

I guess for a very small upper zone and a very large lower zone this could still put the scan balancing out of whack, fixable by a smarter version of "8*zone->pages_high", but it doesn't seem very likely that this will affect things much.

Why doesn't direct reclaim need similar treatment?
Because we only go into the direct reclaim path once every zone is at or below zone->pages_low, and the direct reclaim path will exit once we have freed more than swap_cluster_max pages.

--
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
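The two conditions in this reply can be written down as a small user-space sketch. This is an illustration under stated assumptions, not the kernel's direct reclaim path: `struct zone_state`, `needs_direct_reclaim()`, `direct_reclaim_model()` and the `step` parameter (pages freed per iteration) are hypothetical; only pages_low and swap_cluster_max come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal view of a zone for this sketch. */
struct zone_state {
    unsigned long free_pages;
    unsigned long pages_low;
};

/* Entry condition: every zone is at or below its pages_low. */
static bool needs_direct_reclaim(const struct zone_state *zones, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (zones[i].free_pages > zones[i].pages_low)
            return false;   /* this zone can still serve allocations */
    return true;
}

/*
 * Exit condition: stop once more than swap_cluster_max pages have been
 * freed, or when nothing reclaimable is left. Frees `step` pages per
 * iteration and returns the total freed.
 */
static unsigned long direct_reclaim_model(unsigned long available,
                                          unsigned long step,
                                          unsigned long swap_cluster_max)
{
    unsigned long freed = 0;

    assert(step > 0);
    while (freed <= swap_cluster_max && available > 0) {
        unsigned long got = available < step ? available : step;
        available -= got;
        freed += got;
    }
    return freed;
}
```

Because the work done per call is bounded by roughly swap_cluster_max pages, a direct reclaimer in this model cannot run away the way an unbounded balancing loop can.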
Hm. Now I need to remember why direct-reclaim does that :(
Mlock can cause the problem too. As for all_unreclaimable, it is ignored when priority == DEF_PRIORITY in balance_pgdat().

As for why direct reclaim exits early: this is done so the system does not end up with the first process that goes into page reclaim staying there forever, while the other processes in the system happily consume the pages freed by that poor first process. There may be other reasons, too.
That does not seem right. Having empty HIGHMEM and full LOWMEM would be very bad, right? We may stop freeing when there's enough LOWMEM free, but not if there's only HIGHMEM free.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Please read the code this patch applies to. The check I add conditionalizes the individual calls to shrink_zone(), so we do not call shrink_zone() for a zone that has a ton of free pages. We still call shrink_zone() for the other zones.
