[PATCH, RFC] hacks to allow -rt to run kernbench on POWER

To: <linux-kernel@...>
Cc: <tony@...>, <paulus@...>, <benh@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>, <rostedt@...>
Date: Monday, October 29, 2007 - 2:50 pm

Hello!

A few random patches that permit POWER to pass kernbench on -rt.
Many of these focus more on expediency than on correctness, so they
are best thought of as workarounds rather than as complete solutions.
There are still issues not addressed by this patch, including:

o kmem_cache_alloc() from non-preemptible context during
bootup (xics_startup() building the irq_radix_revmap()).

o unmap_vmas() freeing pages with preemption disabled.
Might be able to address this by linking the pages together,
then freeing them en masse after preemption has been re-enabled,
but there is likely a better approach.
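
The "link the pages together, then free them en masse" idea can be modeled in user space roughly as follows. This is only a sketch under loud assumptions: struct deferred_page, the sketch_preempt_*() helpers, and preempt_depth are hypothetical stand-ins, not kernel interfaces, and malloc()/free() stand in for real page allocation and freeing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page that is waiting to be freed. */
struct deferred_page {
    struct deferred_page *next;
};

/* Models a preemption-disable nesting count; not the kernel's counter. */
static int preempt_depth;

static void sketch_preempt_disable(void) { preempt_depth++; }

static void sketch_preempt_enable(void)
{
    assert(preempt_depth > 0);
    preempt_depth--;
}

/* Only links the page onto the pending list, so it can never sleep. */
static void defer_free(struct deferred_page **pending,
                       struct deferred_page *page)
{
    page->next = *pending;
    *pending = page;
}

/* Frees the whole list; only legal once preemption is enabled again. */
static int free_deferred(struct deferred_page **pending)
{
    int n = 0;

    assert(preempt_depth == 0);
    while (*pending) {
        struct deferred_page *page = *pending;

        *pending = page->next;
        free(page);
        n++;
    }
    return n;
}

/* Simulates unmapping nr pages; returns how many were freed. */
static int unmap_and_free(int nr)
{
    struct deferred_page *mapped = NULL;
    struct deferred_page *pending = NULL;
    int i;

    /* Build nr fake "mapped" pages before the critical section. */
    for (i = 0; i < nr; i++) {
        struct deferred_page *page = malloc(sizeof(*page));

        if (!page)
            break;
        page->next = mapped;
        mapped = page;
    }

    sketch_preempt_disable();
    while (mapped) {
        struct deferred_page *page = mapped;

        mapped = page->next;
        defer_free(&pending, page);     /* link only, no freeing here */
    }
    sketch_preempt_enable();

    return free_deferred(&pending);     /* the en-masse free */
}
```

Calling unmap_and_free(3) walks three fake pages inside the non-preemptible region and releases them only afterwards. The cost of this shape is that the pending list can grow without bound while preemption is off, which may be one reason a better approach is likely to exist.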

Thoughts?

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

arch/powerpc/kernel/prom.c | 2 +-
arch/powerpc/mm/fault.c | 3 +++
arch/powerpc/mm/tlb_64.c | 8 ++++++--
arch/powerpc/platforms/pseries/eeh.c | 2 +-
drivers/of/base.c | 2 +-
include/asm-powerpc/tlb.h | 5 ++++-
include/asm-powerpc/tlbflush.h | 5 ++++-
mm/memory.c | 2 ++
8 files changed, 22 insertions(+), 7 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23.1-rt4/arch/powerpc/kernel/prom.c linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/prom.c
--- linux-2.6.23.1-rt4/arch/powerpc/kernel/prom.c 2007-10-12 09:43:44.000000000 -0700
+++ linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/prom.c 2007-10-28 13:37:23.000000000 -0700
@@ -80,7 +80,7 @@ struct boot_param_header *initial_boot_p

extern struct device_node *allnodes; /* temporary while merging */

-extern rwlock_t devtree_lock; /* temporary while merging */
+extern raw_rwlock_t devtree_lock; /* temporary while merging */

/* export that to outside world */
struct device_node *of_chosen;
diff -urpNa -X dontdiff linux-2.6.23.1-rt4/arch/powerpc/mm/fault.c linux-2.6.23.1-rt4-fix/arch/powerpc/mm/fault.c
--- linux-2.6.23.1-rt4/arch/powerpc/mm/fault.c 2007-10-27 22:20:57.000000000 -0700
+++ linux-2.6.23.1-rt4-fix/arch/p...

To: Paul E. McKenney <paulmck@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <benh@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>
Date: Wednesday, December 12, 2007 - 11:56 pm

I'm pulling your patch for the above added code. It took me a few hours to
find the culprit, but I was getting "scheduling while atomic" bugs. It turns
out that the code you wrapped in preempt_disable() calls sleeping spinlocks.

Might want to run with DEBUG_PREEMPT.
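
For readers who have not hit this class of bug, here is a small user-space model of the pattern. Everything in it is an assumption made for illustration: preempt_depth, atomic_sleep_reports, and the sketch_*() helpers are hypothetical, and sketch_might_sleep() only imitates the spirit of the kernel's might_sleep() check.

```c
#include <assert.h>

/* Models a preemption-disable nesting count; not the kernel's counter. */
static int preempt_depth;

/* Number of "sleeping while atomic" reports raised by the model. */
static int atomic_sleep_reports;

static void sketch_preempt_disable(void) { preempt_depth++; }

static void sketch_preempt_enable(void)
{
    assert(preempt_depth > 0);
    preempt_depth--;
}

/* Complains if the caller could sleep while preemption is disabled. */
static void sketch_might_sleep(void)
{
    if (preempt_depth > 0)
        atomic_sleep_reports++;
}

/* Models an -rt style spinlock: acquiring it may sleep. */
static void sketch_sleeping_lock(void) { sketch_might_sleep(); }

static void sketch_sleeping_unlock(void) { }

/* The problematic shape: a sleeping lock taken inside the atomic region. */
static int broken_pattern(void)
{
    atomic_sleep_reports = 0;
    sketch_preempt_disable();
    sketch_sleeping_lock();
    sketch_sleeping_unlock();
    sketch_preempt_enable();
    return atomic_sleep_reports;
}

/* One safe ordering: take the sleeping lock first, then go atomic. */
static int safe_pattern(void)
{
    atomic_sleep_reports = 0;
    sketch_sleeping_lock();
    sketch_preempt_disable();
    /* short, non-sleeping work would go here */
    sketch_preempt_enable();
    sketch_sleeping_unlock();
    return atomic_sleep_reports;
}
```

In this model broken_pattern() raises one report and safe_pattern() raises none. Whether reordering is actually possible in the code in question is a separate matter that the model says nothing about.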

Thanks,

-- Steve

--

To: Steven Rostedt <rostedt@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <benh@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>
Date: Thursday, December 13, 2007 - 2:10 am

I thought that you had already pulled the above version...

Here is the replacement that I posted on November 9th (with much help
from Ben H):

http://lkml.org/lkml/2007/11/9/114

Thanx, Paul

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

diff -urpNa -X dontdiff linux-2.6.23.1-rt4/arch/powerpc/kernel/process.c linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/process.c
--- linux-2.6.23.1-rt4/arch/powerpc/kernel/process.c 2007-10-12 09:43:44.000000000 -0700
+++ linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/process.c 2007-11-12 09:18:55.000000000 -0800
@@ -245,6 +245,10 @@ struct task_struct *__switch_to(struct t
struct thread_struct *new_thread, *old_thread;
unsigned long flags;
struct task_struct *last;
+#ifdef CONFIG_PREEMPT_RT
+ struct ppc64_tlb_batch *batch;
+ int hadbatch = 0;
+#endif /* #ifdef CONFIG_PREEMPT_RT */

#ifdef CONFIG_SMP
/* avoid complexity of lazy save/restore of fpu
@@ -325,6 +329,17 @@ struct task_struct *__switch_to(struct t
}
#endif

+#ifdef CONFIG_PREEMPT_RT
+ batch = &__get_cpu_var(ppc64_tlb_batch);
+ if (batch->active) {
+ hadbatch = 1;
+ if (batch->index) {
+ __flush_tlb_pending(batch);
+ }
+ batch->active = 0;
+ }
+#endif /* #ifdef CONFIG_PREEMPT_RT */
+
local_irq_save(flags);

account_system_vtime(current);
@@ -335,6 +350,13 @@ struct task_struct *__switch_to(struct t

local_irq_restore(flags);

+#ifdef CONFIG_PREEMPT_RT
+ if (hadbatch) {
+ batch = &__get_cpu_var(ppc64_tlb_batch);
+ batch->active = 1;
+ }
+#endif /* #ifdef CONFIG_PREEMPT_RT */
+
return last;
}

diff -urpNa -X dontdiff linux-2.6.23.1-rt4/arch/powerpc/kernel/prom.c linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/prom.c
--- linux-2.6.23.1-rt4/arch/powerpc/kernel/prom.c 2007-10-12 09:43:44.000000000 -0700
+++ linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/prom.c 2007-10-28 13:37:23.000000000 -0700
@@ -80,7 +80,7 @@ struct boot_param_header *initial_boot_p

extern struct device_node *allnodes; ...
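
The __switch_to() change in the process.c hunk above can be read as "flush and deactivate the TLB batch before switching away, reactivate it when the task resumes". The following user-space model is only a sketch of that reading: struct sketch_tlb_batch, SKETCH_BATCH_MAX, flushed_entries, and the sketch_*() functions are hypothetical, and the real ppc64_tlb_batch has more fields than are shown here. Note that hadbatch is initialized to zero in this sketch.

```c
#include <assert.h>

#define SKETCH_BATCH_MAX 8      /* hypothetical batch capacity */

/* Minimal model of a per-CPU TLB invalidation batch. */
struct sketch_tlb_batch {
    int active;                 /* nonzero while batching is enabled */
    int index;                  /* number of queued invalidations */
    unsigned long vaddr[SKETCH_BATCH_MAX];
};

/* Counts entries that have been "flushed" by the model. */
static int flushed_entries;

/* Models flushing all queued invalidations and emptying the batch. */
static void sketch_flush_pending(struct sketch_tlb_batch *batch)
{
    flushed_entries += batch->index;
    batch->index = 0;
}

/* Queues one invalidation, or performs it at once if batching is off. */
static void sketch_batch_add(struct sketch_tlb_batch *batch, unsigned long va)
{
    if (!batch->active) {
        flushed_entries++;
        return;
    }
    batch->vaddr[batch->index++] = va;
    if (batch->index == SKETCH_BATCH_MAX)
        sketch_flush_pending(batch);
}

/* Before the switch: flush anything pending and stop batching. */
static int sketch_switch_out(struct sketch_tlb_batch *batch)
{
    int hadbatch = 0;

    if (batch->active) {
        hadbatch = 1;
        if (batch->index)
            sketch_flush_pending(batch);
        batch->active = 0;
    }
    return hadbatch;
}

/* After the switch: resume batching only if it was on beforehand. */
static void sketch_switch_in(struct sketch_tlb_batch *batch, int hadbatch)
{
    if (hadbatch)
        batch->active = 1;
}

/* Queues two entries, "switches", and checks the resulting state. */
static int demo_switch(void)
{
    struct sketch_tlb_batch batch = { .active = 1 };
    int hadbatch;

    flushed_entries = 0;
    sketch_batch_add(&batch, 0x1000UL);
    sketch_batch_add(&batch, 0x2000UL);

    hadbatch = sketch_switch_out(&batch);
    assert(batch.index == 0 && !batch.active);

    sketch_switch_in(&batch, hadbatch);
    return hadbatch && batch.active && flushed_entries == 2;
}
```

The model uses a single batch object for both halves, whereas the patch looks up the per-CPU batch again after the switch, presumably because the resuming task may be running on a different CPU.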

To: Paul E. McKenney <paulmck@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <benh@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>
Date: Thursday, December 13, 2007 - 8:52 am

OK, sorry, I somehow got the two reversed, and I think I replaced the new
one with the old one :-(

OK, will apply to -rt14

Thanks,

-- Steve
--

To: Steven Rostedt <rostedt@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <benh@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>
Date: Thursday, December 13, 2007 - 2:25 pm

If you give -me- espresso, you also have to give me a putty knife so that

Thank you!

Thanx, Paul
--

To: <paulmck@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>, <rostedt@...>
Date: Monday, October 29, 2007 - 4:07 pm

I see a lot of cases where you add preempt_disable/enable around areas
that have the PTE lock held...

So in -rt, spin_lock doesn't disable preemption? I'm a bit worried...
There are some strong requirements that anything within that lock is not
preempted. zap_pte_ranges() is the obvious one, but all of them would
need to be addressed.

Ben.

-

To: <benh@...>
Cc: <paulmck@...>, <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <antonb@...>, <rostedt@...>
Date: Wednesday, October 31, 2007 - 4:54 pm

So as Paul mentioned, spin_lock is now a mutex. There is a new
raw_spinlock however (simply change the way it is declared, calling
conventions are the same) which is used in a very few areas where a
traditional spin_lock is truly necessary. This may or may not be one of
those times, but I wanted to point it out.
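
As a rough illustration of "change the declaration, keep the calls", the following user-space sketch picks the lock behavior from the declared type. It is not the -rt implementation: the struct names, preempt_depth, and the sketch_lock()/sketch_unlock() macros are all hypothetical, and C11 _Generic is used here simply as one way to model type-based dispatch.

```c
#include <assert.h>

/* Hypothetical model of a lock that may sleep (the -rt default). */
struct sketch_sleeping_lock { int held; };

/* Hypothetical model of a traditional, non-sleeping ("raw") lock. */
struct sketch_raw_lock { int held; };

/* Models a preemption-disable nesting count; not the kernel's counter. */
static int preempt_depth;

static void sleeping_lock_acquire(struct sketch_sleeping_lock *l)
{
    /* A real sleeping lock could block here; preemption stays enabled. */
    assert(!l->held);
    l->held = 1;
}

static void sleeping_lock_release(struct sketch_sleeping_lock *l)
{
    assert(l->held);
    l->held = 0;
}

static void raw_lock_acquire(struct sketch_raw_lock *l)
{
    preempt_depth++;            /* the holder must not be preempted */
    assert(!l->held);
    l->held = 1;
}

static void raw_lock_release(struct sketch_raw_lock *l)
{
    assert(l->held);
    l->held = 0;
    preempt_depth--;
}

/* Same call syntax for both kinds; the declared type picks the behavior. */
#define sketch_lock(l) _Generic((l), \
        struct sketch_sleeping_lock *: sleeping_lock_acquire, \
        struct sketch_raw_lock *: raw_lock_acquire)(l)

#define sketch_unlock(l) _Generic((l), \
        struct sketch_sleeping_lock *: sleeping_lock_release, \
        struct sketch_raw_lock *: raw_lock_release)(l)

/* Returns the preemption depth observed while holding a sleeping lock. */
static int depth_under_sleeping_lock(void)
{
    struct sketch_sleeping_lock lock = { 0 };
    int depth;

    sketch_lock(&lock);
    depth = preempt_depth;
    sketch_unlock(&lock);
    return depth;
}

/* Returns the preemption depth observed while holding a raw lock. */
static int depth_under_raw_lock(void)
{
    struct sketch_raw_lock lock = { 0 };
    int depth;

    sketch_lock(&lock);
    depth = preempt_depth;
    sketch_unlock(&lock);
    return depth;
}
```

In this model only the raw variant raises the preemption depth, which corresponds to the guarantee the PTE-lock sections discussed in this thread appear to rely on.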

-

To: Darren Hart <dvhltc@...>
Cc: <paulmck@...>, <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <antonb@...>, <rostedt@...>
Date: Wednesday, October 31, 2007 - 5:15 pm

Yeah, I figured that. My main worry has more to do with some fishy
assumptions the powerpc VM code makes regarding what can and cannot
happen in those locked sections, among other things. I'll have to sit
and think about it for a little while to convince myself we are ok ...
or not. Plus we do keep track of various MM related things in per-CPU
data structures but it looks like Paul already spotted that.

Cheers,
Ben.

-

To: Benjamin Herrenschmidt <benh@...>
Cc: Darren Hart <dvhltc@...>, <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <antonb@...>, <rostedt@...>
Date: Thursday, November 1, 2007 - 11:50 am

My concern would be that I failed to spot all of them. ;-)

Thanx, Paul
-

To: Benjamin Herrenschmidt <benh@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>, <rostedt@...>, <niv@...>
Date: Monday, October 29, 2007 - 4:26 pm

Right in one! One of the big changes in -rt is that spinlock critical
sections (and RCU read-side critical sections, for that matter) are
preemptible under CONFIG_PREEMPT_RT.

And I agree that this patchset will have missed quite a few places where
additional changes are required. Hence the word "including" above, rather
than something like "specifically". ;-)

Thanx, Paul
-

To: <paulmck@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>, <rostedt@...>, <niv@...>
Date: Monday, October 29, 2007 - 4:37 pm

Ok, well, I'm pretty familiar with that MM code since I wrote a good
deal of the current version, so I'll try to spend some time with your
patch and have a look. It may have to wait for next week though, but feel
free to ping me if you don't hear back, in case it falls through the
hole in my brain :-)

Ben.

-

To: Benjamin Herrenschmidt <benh@...>
Cc: <linux-kernel@...>, <tony@...>, <paulus@...>, <dino@...>, <tytso@...>, <dvhltc@...>, <antonb@...>, <rostedt@...>, <niv@...>
Date: Monday, October 29, 2007 - 5:16 pm

Works for me!!!

Thanx, Paul
-
