2.6.xxx race condition in x86_64's global_flush_tlb???

To: <linux-kernel@...>
Cc: <dreiland@...>
Date: Wednesday, October 24, 2007 - 4:39 pm

I have seen some hangs in 2.6-x86_64 in flush_kernel_map(). The tests
cause a lot of ioremap/iounmap calls to occur concurrently across many
processor threads.

Looking at the hung processors, they are looping in
flush_kernel_map(), and the list they get from smp_call_function()
appears to be corrupt. In fact, I see deferred_pages as an entry, and
that isn't supposed to happen.

I am questioning the locking in global_flush_tlb(), listed below. The
down_read()/up_read() protection doesn't seem safe: a read lock can be
held by several threads at once. If several threads are rushing
through here, deferred_pages could be getting changed as they look at
it. I don't think there is any protection when list_replace_init()
calls INIT_LIST_HEAD().
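
To make the concern above concrete, here is a minimal userspace sketch
that replays one possible interleaving by hand. It is not kernel code:
struct list_head and list_replace() are stand-ins that mirror the
pointer updates in the kernel's list.h, and replay() and walk_length()
are hypothetical helpers made up for this illustration. It assumes two
callers both get past down_read() and both run the list_replace() half
of list_replace_init() before either reaches INIT_LIST_HEAD().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
        struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
        h->next = h;
        h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
        entry->prev = head->prev;
        entry->next = head;
        head->prev->next = entry;
        head->prev = entry;
}

/* Same pointer updates as the kernel's list_replace(). */
static void list_replace(struct list_head *old, struct list_head *repl)
{
        repl->next = old->next;
        repl->next->prev = repl;
        repl->prev = old->prev;
        repl->prev->next = repl;
}

/*
 * Walk forward from head. Returns the number of entries seen before
 * getting back to head, or -1 if head is not reached within limit
 * steps (a real list walk would spin forever).
 */
static int walk_length(const struct list_head *head, int limit)
{
        const struct list_head *p = head->next;
        int n = 0;

        assert(limit > 0);
        while (p != head) {
                if (++n > limit)
                        return -1;
                p = p->next;
        }
        return n;
}

/*
 * Two callers, A and B, each do list_replace_init(&deferred, &l).
 * serialized != 0: A finishes completely before B starts.
 * serialized == 0: B's list_replace() runs before A's INIT_LIST_HEAD().
 * Returns the length of A's private list as A would walk it.
 */
static int replay(int serialized)
{
        struct list_head deferred, pages[3], la, lb;
        int i;

        init_list_head(&deferred);
        for (i = 0; i < 3; i++)
                list_add_tail(&pages[i], &deferred);

        list_replace(&deferred, &la);           /* A: first half */
        if (serialized)
                init_list_head(&deferred);      /* A: second half */
        list_replace(&deferred, &lb);           /* B: first half */
        if (!serialized)
                init_list_head(&deferred);      /* A: second half, too late */
        init_list_head(&deferred);              /* B: second half */

        return walk_length(&la, 16);
}
```

In this sketch replay(1) returns 3, while replay(0) returns -1: B
re-links the last page to its own list head, so A's walk never gets
back to &la. That is consistent with the looping described above, but
it is only one possible interleaving and not a confirmed diagnosis.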

I changed the down_read()/up_read() around list_replace_init() to
down_write()/up_write() and my test runs fine.


void global_flush_tlb(void)
{
        struct page *pg, *next;
        struct list_head l;

        down_read(&init_mm.mmap_sem); // XXX should be down_write()???
        list_replace_init(&deferred_pages, &l);
        up_read(&init_mm.mmap_sem); // XXX should be up_write()????
        flush_map(&l);

        list_for_each_entry_safe(pg, next, &l, lru) {
                ClearPagePrivate(pg);
                __free_page(pg);
        }
}

Messages in current thread:
2.6.xxx race condition in x86_64's global_flush_tlb???, Doug Reiland, (Wed Oct 24, 4:39 pm)