Hi, is zone->lru_lock really the right lock to take in setup_per_zone_pages_min()? All other functions in mm/page_alloc.c take zone->lock instead, for working with page->lru free-list or PageBuddy(). setup_per_zone_pages_min() eventually calls move_freepages(), which is also manipulating the page->lru free-list and checking for PageBuddy(). Both should be protected by zone->lock instead of zone->lru_lock, if I understood that right, or else there could be a race with the other functions in mm/page_alloc.c. We ran into a list corruption bug in free_pages_bulk() once, during memory hotplug stress test, but cannot reproduce it easily. So I cannot verify if using zone->lock instead of zone->lru_lock would fix it, but to me it looks like this may be the problem. Any thoughts? BTW, I also wonder if a spin_lock_irq() would be enough, instead of spin_lock_irqsave(), because this function should never be called from interrupt context, right? Thanks, Gerald --
The allocator protects it freelists using zone->lock (as we can see in rmqueue_bulk), so anything which manipulates those should also be using that lock. move_freepages() is scanning the cmap and picking up free pages directly off the free lists, it is expecting those lists to be stable; it would appear to need zone->lock. It does look like setup_per_zone_pages_min() is holding the wrong thing at first look. -apw --
I just noticed that the spin_lock in that function is much older than the call to setup_zone_migrate_reserve(), which then calls move_freepages(). So it seems that the zone->lru_lock there does (did?) have another purpose, maybe protecting zone->present_pages/pages_min/etc. Looks like the need for a zone->lock (if any) was added later, but I'm not sure if makes sense to take both locks together, or if the lru_lock is still needed at all. Thanks, Gerald --
On Mon, 29 Sep 2008 23:20:05 +0200 At first look, replacing zone->lru_lock with zone->lock is enough... This function is an only one function which use zone->lru_lock in page_alloc.c And zone_watermark_ok() which access zone->pages_min/low/high is not under any locks. So, taking zone->lru_lock here doesn't seem to be necessary... Thanks, -Kame --
The zone->lru_lock() have been used before memory hotplug code was I agree. Bye. -- Yasunori Goto --
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
This replaces zone->lru_lock in setup_per_zone_pages_min() with zone->lock.
There seems to be no need for the lru_lock anymore, but there is a need for
zone->lock instead, because that function may call move_freepages() via
setup_zone_migrate_reserve().
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -4207,7 +4207,7 @@ void setup_per_zone_pages_min(void)
for_each_zone(zone) {
u64 tmp;
- spin_lock_irqsave(&zone->lru_lock, flags);
+ spin_lock_irqsave(&zone->lock, flags);
tmp = (u64)pages_min * zone->present_pages;
do_div(tmp, lowmem_pages);
if (is_highmem(zone)) {
@@ -4239,7 +4239,7 @@ void setup_per_zone_pages_min(void)
zone->pages_low = zone->pages_min + (tmp >> 2);
zone->pages_high = zone->pages_min + (tmp >> 1);
setup_zone_migrate_reserve(zone);
- spin_unlock_irqrestore(&zone->lru_lock, flags);
+ spin_unlock_irqrestore(&zone->lock, flags);
}
/* update totalreserve_pages */
--
On Wed, 01 Oct 2008 19:39:32 +0200 Thank you!. --
