From: Cliff Wickman <cpw@sgi.com>
TLB shootdown for SGI UV.
v1: 6/2 original
v2: 6/3 corrections/improvements per Ingo's review
v3: 6/4 split atomic operations off to a separate patch (Jeremy's review)
v4: 6/12 include <mach_apic.h> rather than <asm/mach-bigsmp/mach_apic.h>
(fixes a !SMP build problem that Ingo found)
fix the index on uv_table_bases[blade]
Now depends on patch:
x86 atomic operations: atomic_or_long atomic_inc_short
which was split off (and improved) at the suggestion of
Jeremy Fitzhardinge <jeremy@goop.org>
This patch provides the ability to flush TLB's in cpu's that are not on
the local node. The hardware mechanism for distributing the flush
messages is the UV's "broadcast assist unit".
The hook to intercept TLB shootdown requests is a 2-line change to
native_flush_tlb_others() (arch/x86/kernel/tlb_64.c).
This code has been tested on a hardware simulator. The real hardware
is not yet available.
The shootdown statistics are provided through /proc/sgi_uv/ptc_statistics.
The use of /sys was considered, but would have required the use of
many /sys files. The debugfs was also considered, but these statistics
should be available on an ongoing basis, not just for debugging.
Issues to be fixed later:
- The IRQ for the messaging interrupt is currently hardcoded as 200
(see UV_BAU_MESSAGE). It should be dynamically assigned in the future.
- The use of appropriate udelay()'s is untested, as they are a problem
in the simulator.
Diffed against 2.6.26-rc4
Signed-off-by: Cliff Wickman <cpw@sgi.com>
---
arch/x86/kernel/Makefile | 2
arch/x86/kernel/entry_64.S | 4
arch/x86/kernel/tlb_64.c | 5
arch/x86/kernel/tlb_uv.c | 786 ++++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/uv/uv_bau.h | 336 ++++++++++++++++++
5 files changed, 1132 insertions(+), 1 deletion(-)
Index: 080602.ingo/arch/x86/kernel/entry_64.S
===================================================================
--- ...For someone not too familiar with low level x86 (or UV) code, can you explain why you are hooking at this point? I mean, what it looks like is either a performance improvement, or for some reason UV does not support send_IPI_mask out to CPUs "not on the local node". If the former, what sort of improvement to you expect / see? If the latter, then why aren't you hooking at send_IPI_mask? If possible that would obviously be preferable so you aren't making the tlb flushing harder to follow... Can you set me straight? :) Thanks, Nick --
Hi Nick, Yes, a performance improvement. The UV machine has hardware for broadcasting messages to a set of nodes (represented in a bit mask). The messages will raise interrupts at each of the target nodes and provide the message - all in one step. (IPI is supported. In fact this patch falls back to the IPI method Good question. The hardware does not exist yet. But using IPI there would be one set of packets exchanged to deliver the interrupts and another set to pull over the flush address, just to start the operation. I expect the improvement to be significant. -Cliff -- Cliff Wickman Silicon Graphics, Inc. cpw@sgi.com (651) 683-3824 --
Thanks, that makes it perfectly clear to me now (the intent, not the details of the code :)) So long as this raises a maskable interrupt on each target CPU, it Ah, so you can send a small message with the IPI, and that can be decoded and used by the target without invoking the cc protocol. Seems like pretty sweet functionality. I guess TLB flushing is an obvious candidate, but it could be quite useful for other operations as well. I wonder if it couldn't be used to create a slightly more advanced API (than send_IPI) which other platforms can just implement using cache coherency for the payload... For example, some classes of smp_call_function could use this too. But for now I don't see anything wrong with getting this patch upstream and looking to generalise it later. --
Hi Nick, Jack Steiner's thought as well. But I haven't considered any yet. If you care to nominate any other such uses for this hardware mechanism -Cliff -- Cliff Wickman Silicon Graphics, Inc. cpw@sgi.com (651) 683-3824 --
style: we try to add a single space after commas, to make it visually since you basically hook into this function and replace it, wouldnt it make more sense instead to override the flush_tlb_others callback in return is not needed at the end of void functions. (there are many other why not split the iterator into a helper function? That would avoid all these line length artifacts as well. Also, decrease constant name 'struct bau_activation_descriptor' is too long of a name (and that many repetitive __get_cpu_var(bau_control) instances - put that into a small nit, we try to keep vertical alignment consistent across files, this iteration could avoid the linebreak via intelligent shrinking of (similar things in other places as well) Ingo --
