login
Header Space

 
 

Re: [PATCH 08 of 11] anon-vma-rwsem

Score:
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Nick Piggin <npiggin@...>
Cc: Robin Holt <holt@...>, Nick Piggin <nickpiggin@...>, Linus Torvalds <torvalds@...>, Andrea Arcangeli <andrea@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Jack Steiner <steiner@...>, Peter Zijlstra <a.p.zijlstra@...>, <kvm-devel@...>, Kanoj Sarcar <kanojsarcar@...>, Roland Dreier <rdreier@...>, Steve Wise <swise@...>, <linux-kernel@...>, Avi Kivity <avi@...>, <linux-mm@...>, <general@...>, Hugh Dickins <hugh@...>, Rusty Russell <rusty@...>, Anthony Liguori <aliguori@...>, Chris Wright <chrisw@...>, Marcelo Tosatti <marcelo@...>, Eric Dumazet <dada1@...>, Paul E. McKenney <paulmck@...>
Date: Wednesday, May 14, 2008 - 7:26 am

On Wed, May 14, 2008 at 06:11:22AM +0200, Nick Piggin wrote:

I assume by coherent domains, your are actually talking about system
images.  Our memory coherence domain on the 3700 family is 512 processors
on 128 nodes.  On the 4700 family, it is 16,384 processors on 4096 nodes.
We extend a "Read-Exclusive" mode beyond the coherence domain so any
processor is able to read any cacheline on the system.  We also provide
uncached access for certain types of memory beyond the coherence domain.

For the other partitions, the exporting partition does not know what
virtual address the imported pages are mapped.  The pages are frequently
mapped in a different order by the MPI library to help with MPI collective
operations.

For the exporting side to do those TLB flushes, we would need to replicate
all that importing information back to the exporting side.

Additionally, the hardware that does the TLB flushing is protected
by a spinlock on each system image.  We would need to change that
simple spinlock into a type of hardware lock that would work (on 3700)
outside the processors coherence domain.  The only way to do that is to
use uncached addresses with our Atomic Memory Operations which do the
cmpxchg at the memory controller.  The uncached accesses are an order
of magnitude or more slower.


But it isn't that we are having a problem adapting to just the hardware.
One of the limiting factors is Linux on the other partition.


zap_page_range calls unmap_vmas which walks to vma->next.  Are you saying
that can be walked without grabbing the mmap_sem at least readably?
I feel my understanding of list management and locking completely
shifting.


Are you suggesting the sending side would not need to sleep or the
receiving side?  Assuming you meant the sender, it spins waiting for the
remote side to acknowledge the invalidate request?  We place the data
into a previously agreed upon buffer and send an interrupt.  At this
point, we would need to start spinning and waiting for completion.
Let's assume we never run out of buffer space.

The receiving side receives an interrupt.  The interrupt currently wakes
an XPC thread to do the work of transfering and delivering the message
to XPMEM.  The transfer of the data which XPC does uses the BTE engine
which takes up to 28 seconds to timeout (hardware timeout before raising
and error) and the BTE code automatically does a retry for certain
types of failure.  We currently need to grab semaphores which _MAY_
be able to be reworked into other types of locks.


Thanks,
Robin
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH 00 of 11] mmu notifier #v16, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 10 of 11] export zap_page_range for XPMEM, Andrea Arcangeli, (Wed May 7, 10:36 am)
[PATCH 11 of 11] mmap sems, Andrea Arcangeli, (Wed May 7, 10:36 am)
[PATCH 09 of 11] mm_lock-rwsem, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 4:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 5:26 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Jack Steiner, (Wed May 7, 6:42 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 5:36 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 7, 8:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 8:06 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 13, 11:32 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Wed May 14, 12:11 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 14, 7:26 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Thu May 15, 3:57 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Thu May 15, 1:33 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Thu May 15, 7:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Fri May 16, 7:23 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Fri May 16, 7:50 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 1:31 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 6:01 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 6:50 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 7:05 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 7:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 7:26 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Thu May 15, 7:01 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Avi Kivity, (Thu May 15, 7:12 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 11:18 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 14, 1:57 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 2:27 pm)
mm notifier: Notifications when pages are unmapped., Christoph Lameter, (Fri May 16, 9:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 14, 12:22 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 12:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 8:55 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:22 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:44 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:58 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 7:09 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:02 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrew Morton, (Wed May 7, 6:31 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:44 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Benjamin Herrenschmidt, (Wed May 7, 7:28 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:45 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 9:34 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 8:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Benjamin Herrenschmidt, (Wed May 14, 1:43 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Jack Steiner, (Wed May 14, 9:15 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Wed May 14, 2:06 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrew Morton, (Wed May 7, 6:59 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:02 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 9:26 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 9:12 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 11:10 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 11:41 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 12:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 1:20 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 11:03 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 12:11 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Peter Zijlstra, (Fri May 9, 2:37 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Fri May 9, 2:55 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Peter Zijlstra, (Fri May 9, 3:04 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 6:01 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Pekka Enberg, (Thu May 8, 1:27 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Pekka Enberg, (Thu May 8, 1:30 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 1:49 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:32 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 7:19 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 7:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 8:03 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 8:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 9:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:57 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:24 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 10:32 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:07 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 7, 8:52 pm)
[PATCH 07 of 11] i_mmap_rwsem, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 06 of 11] rwsem contended, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 05 of 11] unmap vmas tlb flushing, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 05 of 11] unmap vmas tlb flushing, Rik van Riel, (Wed May 7, 1:46 pm)
[PATCH 04 of 11] free-pgtables, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 04 of 11] free-pgtables, Rik van Riel, (Wed May 7, 1:41 pm)
[PATCH 03 of 11] invalidate_page outside PT lock, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 03 of 11] invalidate_page outside PT lock, Rik van Riel, (Wed May 7, 1:39 pm)
Re: [PATCH 03 of 11] invalidate_page outside PT lock, Andrea Arcangeli, (Wed May 7, 1:57 pm)
[PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrew Morton, (Wed May 7, 4:05 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 4:30 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 5:58 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 6:11 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 6:27 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 7:00 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 6:37 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 7:38 pm)
Re: [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 6:39 pm)
Re: [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 7:03 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrew Morton, (Wed May 7, 4:02 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Rik van Riel, (Wed May 7, 1:35 pm)
[PATCH 02 of 11] get_task_mm, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 02 of 11] get_task_mm, Robin Holt, (Wed May 7, 11:59 am)
Re: [PATCH 02 of 11] get_task_mm, Andrea Arcangeli, (Wed May 7, 12:20 pm)
speck-geostationary