Re: [PATCH 0/5] Generic smp_call_function(), improvements, and smp_call_function_single()

Previous thread: none

Next thread: [patch 04/13] x86: PAT make ioremap_change_attr non-static by venkatesh.pallipadi on Tuesday, March 18, 2008 - 5:00 pm. (1 message)
From: Jens Axboe
Date: Wednesday, March 19, 2008 - 4:56 am

Hi,

While working on the IO CPU affinity bits, I had to convert some more
archs to support smp_call_function_single(). Nicks original patch
for x86-64 also included improvements for smp_call_function(), so
it isn't serialized by call_lock anymore (but still hits call_lock
on every call). I've made further improvements and changes upon that
while converting x86, going further and updating IA64 and PPC made me
realize that we have a lot of shared code for the same functionality.

So I came up with kernel/smp.c that basically just holds the new
and abstracted code for issuing smp_call_function() and
smp_call_function_single() along with the helpers and interrupt
handlers associated. Perhaps a better place would be lib/smp.c, I'm
open to suggestions.

Even with half of this being new functionality, the diffstat still looks
pretty nice:

 arch/ia64/kernel/smp.c                     |  171 ++---------
 arch/powerpc/kernel/smp.c                  |  223 +-------------
 arch/powerpc/platforms/cell/interrupt.c    |    1 
 arch/powerpc/platforms/ps3/smp.c           |    7 
 arch/powerpc/platforms/pseries/xics.c      |    6 
 arch/powerpc/sysdev/mpic.c                 |    2 
 arch/x86/kernel/entry_64.S                 |    3 
 arch/x86/kernel/i8259_64.c                 |    1 
 arch/x86/kernel/smp_32.c                   |  165 +++-------
 arch/x86/kernel/smp_64.c                   |  175 ++---------
 arch/x86/kernel/smpboot_32.c               |    4 
 arch/x86/kernel/smpcommon_32.c             |   34 --
 include/asm-powerpc/smp.h                  |    5 
 include/asm-x86/hw_irq_32.h                |    1 
 include/asm-x86/hw_irq_64.h                |    4 
 include/asm-x86/mach-default/entry_arch.h  |    1 
 include/asm-x86/mach-default/irq_vectors.h |    1 
 include/linux/smp.h                        |   30 +
 kernel/Makefile                            |    2 
 kernel/smp.c                               |  316 +++++++++++++++++++++
 20 files changed, 533 insertions(+), 619 ...
From: Jens Axboe
Date: Wednesday, March 19, 2008 - 4:56 am

This adds kernel/smp.c which contains helpers for IPI function calls. In
addition to supporting the existing smp_call_function() in a more efficient
manner, it also adds a more scalable variant called smp_call_function_single()
for calling a given function on a single CPU only.

The core of this is based on the x86-64 patch from Nick Piggin.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 include/linux/smp.h |   30 +++++-
 kernel/Makefile     |    2 +-
 kernel/smp.c        |  316 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 345 insertions(+), 3 deletions(-)
 create mode 100644 kernel/smp.c

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 55232cc..19b217e 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -7,9 +7,19 @@
  */
 
 #include <linux/errno.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/cpumask.h>
 
 extern void cpu_idle(void);
 
+struct call_single_data {
+	struct list_head list;
+	void (*func) (void *info);
+	void *info;
+	unsigned int flags;
+};
+
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
@@ -50,12 +60,27 @@ extern int __cpu_up(unsigned int cpunum);
 extern void smp_cpus_done(unsigned int max_cpus);
 
 /*
- * Call a function on all other processors
+ * Call a function on all other processors - arch exported functions
  */
 int smp_call_function(void(*func)(void *info), void *info, int retry, int wait);
-
 int smp_call_function_single(int cpuid, void (*func) (void *info), void *info,
 				int retry, int wait);
+void __smp_call_function_single(int cpuid, struct call_single_data *data);
+
+/*
+ * Generic helpers
+ */
+int generic_smp_call_function_single(int cpuid, void (*func) (void *info),
+					void *info, int wait,
+					void (*send_ipi)(int cpu));
+int generic_smp_call_function(void (*func) (void *info), void *info, int wait,
+				cpumask_t mask, void (*send_ipi)(cpumask_t));
+void generic_exec_single(int cpu, struct call_single_data *data,
+			 ...
From: Jens Axboe
Date: Wednesday, March 19, 2008 - 4:56 am

This converts x86 to use the new helpers for smp_call_function() and
frieds, and adds support for smp_call_function_single().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 arch/x86/kernel/smp_32.c                   |  165 +++++++++-------------------
 arch/x86/kernel/smpboot_32.c               |    4 +
 arch/x86/kernel/smpcommon_32.c             |   34 ------
 include/asm-x86/hw_irq_32.h                |    1 +
 include/asm-x86/mach-default/entry_arch.h  |    1 +
 include/asm-x86/mach-default/irq_vectors.h |    1 +
 6 files changed, 61 insertions(+), 145 deletions(-)

diff --git a/arch/x86/kernel/smp_32.c b/arch/x86/kernel/smp_32.c
index dc0cde9..8d12130 100644
--- a/arch/x86/kernel/smp_32.c
+++ b/arch/x86/kernel/smp_32.c
@@ -476,64 +476,29 @@ static void native_smp_send_reschedule(int cpu)
 	send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
 }
 
-/*
- * Structure and data for smp_call_function(). This is designed to minimise
- * static memory requirements. It also looks cleaner.
- */
-static DEFINE_SPINLOCK(call_lock);
-
-struct call_data_struct {
-	void (*func) (void *info);
-	void *info;
-	atomic_t started;
-	atomic_t finished;
-	int wait;
-};
-
 void lock_ipi_call_lock(void)
 {
-	spin_lock_irq(&call_lock);
+	spin_lock_irq(&call_function_lock);
 }
 
 void unlock_ipi_call_lock(void)
 {
-	spin_unlock_irq(&call_lock);
+	spin_unlock_irq(&call_function_lock);
 }
 
-static struct call_data_struct *call_data;
-
-static void __smp_call_function(void (*func) (void *info), void *info,
-				int nonatomic, int wait)
+static void send_call_function_ipi(cpumask_t mask)
 {
-	struct call_data_struct data;
-	int cpus = num_online_cpus() - 1;
-
-	if (!cpus)
-		return;
-
-	data.func = func;
-	data.info = info;
-	atomic_set(&data.started, 0);
-	data.wait = wait;
-	if (wait)
-		atomic_set(&data.finished, 0);
-
-	call_data = &data;
-	mb();
-	
-	/* Send a message to all other CPUs and wait for them to respond ...
From: Jens Axboe
Date: Wednesday, March 19, 2008 - 4:56 am

This converts x86-64 to use the new helpers for smp_call_function() and
frieds, and adds support for smp_call_function_single().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 arch/x86/kernel/entry_64.S  |    3 +
 arch/x86/kernel/i8259_64.c  |    1 +
 arch/x86/kernel/smp_64.c    |  175 ++++++++++---------------------------------
 include/asm-x86/hw_irq_64.h |    4 +-
 4 files changed, 46 insertions(+), 137 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c20c9e7..22caf56 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -713,6 +713,9 @@ END(invalidate_interrupt\num)
 ENTRY(call_function_interrupt)
 	apicinterrupt CALL_FUNCTION_VECTOR,smp_call_function_interrupt
 END(call_function_interrupt)
+ENTRY(call_function_single_interrupt)
+	apicinterrupt CALL_FUNCTION_SINGLE_VECTOR,smp_call_function_single_interrupt
+END(call_function_single_interrupt)
 ENTRY(irq_move_cleanup_interrupt)
 	apicinterrupt IRQ_MOVE_CLEANUP_VECTOR,smp_irq_move_cleanup_interrupt
 END(irq_move_cleanup_interrupt)
diff --git a/arch/x86/kernel/i8259_64.c b/arch/x86/kernel/i8259_64.c
index fa57a15..2b0b6d2 100644
--- a/arch/x86/kernel/i8259_64.c
+++ b/arch/x86/kernel/i8259_64.c
@@ -493,6 +493,7 @@ void __init native_init_IRQ(void)
 
 	/* IPI for generic function call */
 	set_intr_gate(CALL_FUNCTION_VECTOR, call_function_interrupt);
+	set_intr_gate(CALL_FUNCTION_SINGLE_VECTOR, call_function_single_interrupt);
 
 	/* Low priority IPI to cleanup after moving an irq */
 	set_intr_gate(IRQ_MOVE_CLEANUP_VECTOR, irq_move_cleanup_interrupt);
diff --git a/arch/x86/kernel/smp_64.c b/arch/x86/kernel/smp_64.c
index 2fd74b0..6002274 100644
--- a/arch/x86/kernel/smp_64.c
+++ b/arch/x86/kernel/smp_64.c
@@ -295,111 +295,27 @@ void smp_send_reschedule(int cpu)
 	send_IPI_mask(cpumask_of_cpu(cpu), RESCHEDULE_VECTOR);
 }
 
-/*
- * Structure and data for smp_call_function(). This is designed to minimise
- * static memory requirements. It also ...
From: Jens Axboe
Date: Wednesday, March 19, 2008 - 4:56 am

This converts ia64 to use the new helpers for smp_call_function() and
frieds, and adds support for smp_call_function_single().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 arch/ia64/kernel/smp.c |  171 +++++++++++++-----------------------------------
 1 files changed, 45 insertions(+), 126 deletions(-)

diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c
index 4e446aa..fa26528 100644
--- a/arch/ia64/kernel/smp.c
+++ b/arch/ia64/kernel/smp.c
@@ -61,24 +61,9 @@ static struct local_tlb_flush_counts {
 static DEFINE_PER_CPU(unsigned int, shadow_flush_counts[NR_CPUS]) ____cacheline_aligned;
 
 
-/*
- * Structure and data for smp_call_function(). This is designed to minimise static memory
- * requirements. It also looks cleaner.
- */
-static  __cacheline_aligned DEFINE_SPINLOCK(call_lock);
-
-struct call_data_struct {
-	void (*func) (void *info);
-	void *info;
-	long wait;
-	atomic_t started;
-	atomic_t finished;
-};
-
-static volatile struct call_data_struct *call_data;
-
 #define IPI_CALL_FUNC		0
 #define IPI_CPU_STOP		1
+#define IPI_CALL_FUNC_SINGLE	2
 #define IPI_KDUMP_CPU_STOP	3
 
 /* This needs to be cacheline aligned because it is written to by *other* CPUs.  */
@@ -89,13 +74,13 @@ extern void cpu_halt (void);
 void
 lock_ipi_calllock(void)
 {
-	spin_lock_irq(&call_lock);
+	spin_lock_irq(&call_function_lock);
 }
 
 void
 unlock_ipi_calllock(void)
 {
-	spin_unlock_irq(&call_lock);
+	spin_unlock_irq(&call_function_lock);
 }
 
 static void
@@ -139,32 +124,11 @@ handle_IPI (int irq, void *dev_id)
 
 			switch (which) {
 			      case IPI_CALL_FUNC:
-			      {
-				      struct call_data_struct *data;
-				      void (*func)(void *info);
-				      void *info;
-				      int wait;
-
-				      /* release the 'pointer lock' */
-				      data = (struct call_data_struct *) call_data;
-				      func = data->func;
-				      info = data->info;
-				      wait = data->wait;
-
-				      mb();
-				      ...
From: Jens Axboe
Date: Wednesday, March 19, 2008 - 4:56 am

This converts ppc to use the new helpers for smp_call_function() and
frieds, and adds support for smp_call_function_single().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 arch/powerpc/kernel/smp.c               |  223 ++++---------------------------
 arch/powerpc/platforms/cell/interrupt.c |    1 +
 arch/powerpc/platforms/ps3/smp.c        |    7 +-
 arch/powerpc/platforms/pseries/xics.c   |    6 +-
 arch/powerpc/sysdev/mpic.c              |    2 +-
 include/asm-powerpc/smp.h               |    5 +-
 6 files changed, 36 insertions(+), 208 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index be35ffa..c94c80c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -72,12 +72,8 @@ struct smp_ops_t *smp_ops;
 
 static volatile unsigned int cpu_callin_map[NR_CPUS];
 
-void smp_call_function_interrupt(void);
-
 int smt_enabled_at_boot = 1;
 
-static int ipi_fail_ok;
-
 static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL;
 
 #ifdef CONFIG_PPC64
@@ -99,12 +95,15 @@ void smp_message_recv(int msg)
 {
 	switch(msg) {
 	case PPC_MSG_CALL_FUNCTION:
-		smp_call_function_interrupt();
+		generic_smp_call_function_interrupt();
 		break;
 	case PPC_MSG_RESCHEDULE:
 		/* XXX Do we have to do this? */
 		set_need_resched();
 		break;
+	case PPC_MSG_CALL_FUNC_SINGLE:
+		generic_smp_call_function_single_interrupt();
+		break;
 	case PPC_MSG_DEBUGGER_BREAK:
 		if (crash_ipi_function_ptr) {
 			crash_ipi_function_ptr(get_irq_regs());
@@ -154,215 +153,47 @@ static void stop_this_cpu(void *dummy)
 		;
 }
 
-/*
- * Structure and data for smp_call_function(). This is designed to minimise
- * static memory requirements. It also looks cleaner.
- * Stolen from the i386 version.
- */
-static  __cacheline_aligned_in_smp DEFINE_SPINLOCK(call_lock);
-
-static struct call_data_struct {
-	void (*func) (void *info);
-	void *info;
-	atomic_t started;
-	atomic_t finished;
-	int wait;
-} *call_data;
-
-/* delay ...
From: Ingo Molnar
Date: Friday, March 21, 2008 - 2:53 am

i'm still wondering about the following fundamental bit: why not use per 
CPU kernel threads? That way you get a fast (lockless) IPI "for free" as 
SMP wakeups already do this.

smp_call_function() is quirky and has deep limitations on atomicity, 
etc., so we are moving away from it and should not base more 
functionality on it.

	Ingo
--

From: Jens Axboe
Date: Friday, March 21, 2008 - 6:15 am

The kernel thread variant wont be any more lockless than the
smp_call_function_single() approach, they both have to grab the
destination queue lock. If you recall, I pushed forward on the kernel
thread variant and even still have it online here:

http://git.kernel.dk/?p=linux-2.6-block.git;a=shortlog;h=io-cpu-affinity-kthread

which is pretty much identical to io-cpu-affinity, except it uses kernel
threads for completion.

The reason why I dropped the kthread approach is that it was slower.
Time from signal to run was about 33% faster with IPI than with
wake_up_process(). Doing benchmark runs, and the IPI approach won hands

The patchset does not build on smp_call_function(), it merely cleans
that stuff up instead of having essentially the same code in each arch.
As more archs are converted, it'll remove lots more code.

The block stuff builds on smp_call_function_single(), which doesn't
suffer from any of the badness that smp_call_function() does.

-- 
Jens Axboe

--

From: Nick Piggin
Date: Tuesday, March 25, 2008 - 1:00 am

It's obviously going to be faster. The kthread approach needs an IPI
*and* a context switch on the destination CPU.

For broadcast situations, it is also going to be faster, because it
can use platform specific broadcast IPIs 

The only weird atomicity requirements of smp_call_function is due to
its terribly unscalable design. Once you have the producer provide its
own data (or accept -ENOMEM failures), then there is no problem. It's
then even possible to use it from interrupt context with not too much

Also there is nothing wrong with improving the speed of the existing
implementation. The current smp_call_function is more or less unusable
for any real work because it is totally serialised.

Thanks for taking this patchset under your wing, Jens.

--

From: Ingo Molnar
Date: Thursday, March 27, 2008 - 3:08 am

with irq threads we'll have all irq context run in kthread context 
again. Could you show me how you measured the performance of the kthread 
approach versus the raw-IPI approach?

we can do a million kthread context switches per CPU per second, so 
kthread context-switch cost cannot be a true performance limit, unless 
you micro-benchmarked this.

	Ingo
--

From: Jens Axboe
Date: Thursday, March 27, 2008 - 3:37 am

There were 3 different indicators that the irq thread approach was
slower:

- Time from signal to actual run of the trigger was ~2usec vs ~3usec for
  IPI vs kthread. That was a microbenchmark.
- Cache misses were higher with the kthread approach.
- Actual performance in non-micro benchmarks was lower with the kthread
  approach.

I'll defer to Alan for the actual numbers, most of this was done in
private mails back and forth doing performance analysis. The initial
testing was done with the IPI hack, then we moved to the kthread
approach. Later the two were pitted against each other and the kthread
part was definitely slower. It ended up using more system time than the
IPI approach. So the kthread approach was than abandoned and all testing
has been on the smp_call_function_single() branch since then.

I very much wanted the kthread approach to work, since it's easier to
work with. It's not for lack of will or trying... I'll be happy to
supply you otherwise identical patches for this, the only difference

At which point you wont be doing much else, so a cs microbenchmark is
not really that interesting.

-- 
Jens Axboe

--

From: Ingo Molnar
Date: Thursday, March 27, 2008 - 3:43 am

i'd love to be able to run/pull something simple that enables me to 
replicate the measurements you did on a generic PC [without having to 
hit any real IO hardware which would put any context switching effects 
down into the noise category].

Obviously since the kthread approach embedds IPI sending it can never be 
as fast as a pure IPI approach - but it should still be reasonably fast. 
3 usecs versus 2 usecs in a microbenchmark sounds about right to me, but 
it would be better to make it a better-sounding 2.5 usecs versus 2 usecs 
or so :-)

	Ingo
--

From: Jens Axboe
Date: Thursday, March 27, 2008 - 5:02 am

You can pull io-cpu-affinity or io-cpu-affinity-kthread from
git://git.kernel.dk/linux-2.6-block.git - or just see the two attached

If you have ideas to speedup the kthread approach, fire away :-)

-- 
Jens Axboe

From: Ingo Molnar
Date: Thursday, March 27, 2008 - 5:32 am

another stupid question: what should i run in user-space to replicate 
your "3 usecs versus 2 usecs" result? io-affinity-ipi.patch seems to 
have no self-benchmarking capability at first sight. (I'd rather not try 
and cook up anything myself - i'd like to reproduce the workload you 
think is relevant for your IO affinity purposes.) Best would be to have 
a Kconfig based self-test that just runs during bootup if i boot a 
bzImage. (laziness rules - and this way i could also track performance 
regressions more easily, by looking at historic serial logs.)

	Ingo
--

From: Jens Axboe
Date: Thursday, March 27, 2008 - 5:35 am

I didn't do those numbers - Alan, when you timed the kthread vs ipi
"wake up", what did you use? I'm guessing some hacked in test, perhaps
you can pass that to Ingo?

-- 
Jens Axboe

--

From: Luck, Tony
Date: Friday, March 21, 2008 - 11:22 am

Can you give me a clue as to which line needs the one-liner ... although
it builds cleanly for IA64 it panics with a dereference NULL fault. It
looks like we were in the list_add_tail(&data->list, &dst->list) code.

Console log looks like this:

ACPI: Core revision 20070126
Boot processor id 0x0/0xc018
Fixed BSP b0 value from CPU 1
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
swapper[0]: Oops 8804682956800 [1]
Modules linked in:

Pid: 0, CPU 1, comm:              swapper
psr : 00001010084a2010 ifs : 800000000000060e ip  : [<a0000001000be1c0>]    Not tainted (2.6.25-rc6-tiger-smp)
ip is at generic_exec_single+0xc0/0x200
unat: 0000000000000000 pfs : 000000000000060e rsc : 0000000000000003
rnat: 5555555555555555 bsps: 5555555555555555 pr  : 0000000000002655
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270433f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001000be180 b6  : e00000007fe131b0 b7  : e00000007fb1bde0
f6  : 000000000000000000000 f7  : 1003e6db6db6db6db6db7
f8  : 1003e00000000000c410a f9  : 1000fffff800000000000
f10 : 1003e000000000001ffff f11 : 000000000000000000000
r1  : a000000100c42f20 r2  : 0000000000000000 r3  : ffffffffffff5968
r8  : 00000010084a6010 r9  : a000000100a617c0 r10 : 0000000000000000
r11 : a000000100a617c0 r12 : e0000001c025fdf0 r13 : e0000001c0250000
r14 : e000000180016058 r15 : e000000180005970 r16 : 0000000000000002
r17 : a000000100a617c0 r18 : 0000000000004000 r19 : a00000010099ffb0
r20 : e0000001c01550c0 r21 : e0000001c01501c0 r22 : 0000000000000000
r23 : e000000180016064 r24 : a07ffffffff20498 r25 : 00000000000000c0
r26 : 00000000000000c1 r27 : 00000010084a6010 r28 : 0000000000000001
r29 : a00000010099ff80 r30 : 0000000000000000 r31 : e000000180005978

Call Trace:
 [<a000000100011bb0>] show_stack+0x50/0xa0
                                sp=e0000001c025f9c0 bsp=e0000001c0250e68
 [<a000000100012490>] show_regs+0x830/0x860
                                sp=e0000001c025fb90 ...
From: Jens Axboe
Date: Friday, March 21, 2008 - 11:31 am

My thinking was that it may need a bit of help to ensure that the init
function for single call data works,
kernel/smp.c:init_call_single_data() specifically. It may be run before

Is it data or dst being NULL? Looking at your trace, it must be dst.
Which makes me believe that it is indeed a missing call of the above
function at the right time. So some help would be accepted there :)

-- 
Jens Axboe

--

From: Luck, Tony
Date: Friday, March 21, 2008 - 11:56 am

The problem may be worse than that.  I see that you have a per-cpu
"call_single_queue" on which you do some list operations.  On ia64
the per-cpu variables live at the same virtual address on each processor
with a per-cpu TLB mapping directing them to a different physical
address on each.  This makes Linux lists very confused.

-Tony
--

From: Jens Axboe
Date: Saturday, March 22, 2008 - 5:29 am

Funky, how does accessing other CPU's per-cpu variables work on ia64
then? Perhaps I made some false assumptions. Unfortunately I have no
access to any IA64 machines, so I either have to yank the ia64 bits
(which is unfortunate, since Alan tests on those :-) or rely on a bit of
help from you and/or others in getting that bit right.

-- 
Jens Axboe

--

From: Luck, Tony
Date: Monday, March 24, 2008 - 9:45 am

The per-cpu memory is mapped at two different spots in the kernel
virtual memory.  When cpuA wants to access per-cpu memory that belongs
to cpuB it can use the mappings that allow access to every percpu
area (which may just be indexing by cpu number into a big block of
memory that has all the per-cpu spaces ... or some more complex
arithmetic and pointers for NUMA systems where the per-cpu memory
ought to be allocated out of memory on the right node for the cpu
that it refers to).

When any cpu wants to access its own per-cpu data, it can do so
via the per-cpu mapping (which is much more efficient from a code
generation perspective at the per-cpu virtual area is the top 64K
of virtual address space, so can be accessed with a small negative
offset from register r0).

This is why lists can be a problem ... since the same memory can
be accessed via two different virtual addresses, things can get
badly knotted when the two different addresses get used in different
parts of the code.  Then operations like "list_empty()" may give
the wrong answer because the virtual address used for head->next
isn't the same as that used for head ... but they both refer to the

I'll see if I can figure out what is going wrong.

-Tony
--

Previous thread: none

Next thread: [patch 04/13] x86: PAT make ioremap_change_attr non-static by venkatesh.pallipadi on Tuesday, March 18, 2008 - 5:00 pm. (1 message)