Re: 2.6.33 GP fault only when built with tracing

Previous thread: [PATCH] move tty_kref_put() outside of __cleanup_signal() by Oleg Nesterov on Thursday, March 18, 2010 - 3:37 pm. (6 messages)

Next thread: [PATCH 1/2] firewire: core: fix Model_ID in modalias by Stefan Richter on Thursday, March 18, 2010 - 4:38 pm. (2 messages)
From: Randy Dunlap
Date: Thursday, March 18, 2010 - 4:26 pm

I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully,
but when I enable lots of tracing config options and then boot with
ftrace=nop on the kernel command line, I see a GP fault when the parport &
parport_pc modules are loading/initializing.

It happens in drivers/parport/share.c::parport_register_device(), when that
function calls try_module_get().

If I comment out the trace_module_get() calls in include/linux/module.h,
the kernel boots with no problems.

[   21.852829] general protection fault: 0000 [#1] SMP 
[   21.856321] last sysfs file: /sys/module/parport/initstate
[   21.856321] CPU 0 
[   21.856321] Pid: 2089, comm: modprobe Not tainted 2.6.33 #11 0HH807/OptiPlex GX620               
[   21.856321] RIP: 0010:[<ffffffffa0437671>]  [<ffffffffa0437671>] parport_register_device+0xe4/0x48c [parport]
[   21.856321] RSP: 0018:ffff8800765cba78  EFLAGS: 00010283
[   21.856321] RAX: ffff10007b04a3d0 RBX: ffff88007a6a5e30 RCX: 0000000000000000
[   21.856321] RDX: 0000000000000000 RSI: ffffffffa043d1de RDI: ffff88007a6a5e30
[   21.856321] RBP: ffff8800765cbad8 R08: 0000000000000000 R09: 0000000000000000
[   21.856321] R10: ffffffffa043dff8 R11: 0000000000000000 R12: ffffffffa043d1de
[   21.856321] R13: ffffffffa043d1de R14: ffffffffa045c940 R15: 0000000000000000
[   21.856321] FS:  00007f09cc3fb6f0(0000) GS:ffff880004a00000(0000) knlGS:0000000000000000
[   21.856321] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   21.856321] CR2: 0000003fb5ad62c0 CR3: 00000000764f6000 CR4: 00000000000006f0
[   21.856321] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   21.856321] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   21.856321] Process modprobe (pid: 2089, threadinfo ffff8800765ca000, task ffff88007664a3d0)
[   21.856321] Stack:
[   21.856321]  ffff8800765cbab8 0000000000000206 0000000000000000 ffffffff812abaf2
[   21.856321] <0> 0000000000000000 0000000000000000 ffffffffa043d1de 00000000ffffffff
[   21.856321] <0> ...
From: Steven Rostedt
Date: Thursday, March 18, 2010 - 4:55 pm

Do you see it without adding the "ftrace=nop"? The only thing that


Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But
should be disabled here. It may be something to do with DEFINE_TRACE. 


I've never seen it. Do you have a config you can send me. I can try it
out.

Thanks,

-- Steve


--

From: Randy Dunlap
Date: Thursday, March 18, 2010 - 5:08 pm

It's attached.

-- 
~Randy
From: Mathieu Desnoyers
Date: Thursday, March 18, 2010 - 5:59 pm

can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument
with "0" ?

Arguments with side-effects are not skipped by the jump over disabled
instrumentation. This is why we should do that part within the probe declaration
in the TRACE_EVENT macros.

But if we find out that the problem really is this argument, then it should be
fixed, because something would be wrong with it (just moving it to TRACE_EVENT
is not a proper solution).

Thanks,


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--

From: Randy Dunlap
Date: Friday, March 19, 2010 - 11:22 am

-- 
~Randy
--

From: Mathieu Desnoyers
Date: Friday, March 19, 2010 - 11:46 am

clickety-clicketa... git blame include/linux/module.h :

commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc
Author: Li Zefan <lizf@cn.fujitsu.com>
Date:   Mon Aug 17 16:56:28 2009 +0800

    tracing/events: Add module tracepoints

(Adding Li Zefan in CC)

Two things:

1) In this commit, most of the tracepoints contain argument with side-effects.
   These do not belong there; they should be moved into TRACE_EVENT macros.

2) There seem to be a null-pointer bug with
   local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should
   be investigated even if we move the argument to TRACE_EVENT.

Thanks,


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--

From: Randy Dunlap
Date: Tuesday, March 23, 2010 - 8:26 am

Hi Li,



---
~Randy
--

From: Mathieu Desnoyers
Date: Tuesday, March 23, 2010 - 6:20 pm

While we wait for the sun to move to other time zones, can you check if the
following patch fixes your problem ?


module: fix __module_ref_addr()

__module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer
(RELOC_HIDE is needed for per cpu pointers).

This non-standard per-cpu pointer use has been introduced by commit
720eba31f47aeade8ec130ca7f4353223c49170f

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Eric Dumazet <dada1@cosmosbay.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
---
 include/linux/module.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/include/linux/module.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/module.h	2010-03-23 18:11:14.000000000 -0400
+++ linux-2.6-lttng/include/linux/module.h	2010-03-23 18:14:07.000000000 -0400
@@ -467,7 +467,7 @@ void symbol_put_addr(void *addr);
 static inline local_t *__module_ref_addr(struct module *mod, int cpu)
 {
 #ifdef CONFIG_SMP
-	return (local_t *) (mod->refptr + per_cpu_offset(cpu));
+	return (local_t *) per_cpu_ptr(mod->refptr, cpu);
 #else
 	return &mod->ref;

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--

From: Li Zefan
Date: Tuesday, March 23, 2010 - 6:42 pm

Sorry, I overlooked this mail thread..

I'll make a patch to move side-effects arguments from trace_module_xxx()
to the definition of TRACE_EVENT().

But it's for reducing overhead when tracepoints are disabled, this should

So the uptream kernel is free from this bug, because __module_ref_addr()
--

From: Randy Dunlap
Date: Wednesday, March 24, 2010 - 1:21 pm

Hi Mathieu,
Yes, this works.  Thanks.



-- 
~Randy
--

From: Steven Rostedt
Date: Wednesday, March 24, 2010 - 1:31 pm

If it works, can you send me your "Tested-by". I have it queued to go.

Thanks,

-- Steve


--

From: Randy Dunlap
Date: Friday, March 19, 2010 - 5:12 pm

-- 
~Randy
--

From: Frederic Weisbecker
Date: Thursday, March 18, 2010 - 5:01 pm

Thanks for this report.

May be you still have the config that can trigger this, so
that we can test?



--

From: Steven Rostedt
Date: Thursday, March 18, 2010 - 7:15 pm

Is this fully reproducible at every boot up? I just booted your config
(no changes) and loaded the parport_pc module with no issues. I even
added your command line (just modifying root=.. )

-- Steve


--

From: Randy Dunlap
Date: Friday, March 19, 2010 - 9:10 am

Yes, it has failed for me 100% of the time (about 10 boots).

I'm rebuilding to test with Mathieu's request now.

-- 
~Randy
--

Previous thread: [PATCH] move tty_kref_put() outside of __cleanup_signal() by Oleg Nesterov on Thursday, March 18, 2010 - 3:37 pm. (6 messages)

Next thread: [PATCH 1/2] firewire: core: fix Model_ID in modalias by Stefan Richter on Thursday, March 18, 2010 - 4:38 pm. (2 messages)