unregister bug:
codes using makers are typically calling marker_probe_unregister()
and then destroying the data that marker_probe_func needs(or
unloading this module). This is bug when the corresponding
marker_probe_func is still running(on other cpus),
it is using the destroying/ed data.
we should call synchronize_sched() after marker_update_probes().
reenter bug:
marker_probe_register(), marker_probe_unregister() and
marker_probe_unregister_private_data() are not reentrant safe
functions. these 3 functions release markers_mutex and then
require it again and do "entry->oldptr = old; ...", but entry->oldptr
maybe is using now for these 3 functions may reenter when markers_mutex
is released.
we use synchronize_sched() instead of call_rcu_sched() to fix
this bug. actually we can do:
"
if (entry->rcu_pending)
rcu_barrier_sched();
"
after require markers_mutex again. but synchronize_sched()
is better and simpler. For these 3 functions are not critical path.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/marker.c b/kernel/marker.c
index 7d1faec..9f76c4a 100644
--- a/kernel/marker.c
+++ b/kernel/marker.c
@@ -60,9 +60,6 @@ struct marker_entry {
struct marker_probe_closure single;
struct marker_probe_closure *multi;
int refcount; /* Number of times armed. 0 if disarmed. */
- struct rcu_head rcu;
- void *oldptr;
- unsigned char rcu_pending:1;
unsigned char ptype:1;
char name[0]; /* Contains name'\0'format'\0' */
};
@@ -199,16 +196,6 @@ void marker_probe_cb_noarg(const struct marker *mdata, void *call_private, ...)
}
EXPORT_SYMBOL_GPL(marker_probe_cb_noarg);
-static void free_old_closure(struct rcu_head *head)
-{
- struct marker_entry *entry = container_of(head,
- struct marker_entry, rcu);
- kfree(entry->oldptr);
- /* Make sure we free the data before setting the pending flag to 0 */
- smp_wmb();
- entry->rcu_pending = 0;
-}
-
static void debug_print_probes(struct marker_entry *entry)
{
int i;
@@ ...applied to tip/tracing/markers, thanks! Mathieu, i've reactivated tip/tracing/markers for linux-next integration, it propagates into auto-ftrace-next. Tracepoints are what we use in the scheduler/etc. in tip/master, but until there's marker use elsewhere there's no reason not to apply fix patches. Ingo --
nack for this patch. I'll my fix patchset. -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
(probably already in -tip) Add rcu_read_lock_sched() and rcu_read_unlock_sched() to rcupdate.h to match the recently added write-side call_rcu_sched() and rcu_barrier_sched(). They also match the no-so-recently-added synchronize_sched(). It will help following matching use of the update/read lock primitives. Those new read lock will replace preempt_disable()/enable() used in pair with RCU-classic synchronization. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Peter Zijlstra <peterz@infradead.org> CC: Paul E McKenney <paulmck@linux.vnet.ibm.com> CC: akpm@linux-foundation.org --- include/linux/rcupdate.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) Index: linux-2.6-lttng/include/linux/rcupdate.h =================================================================== --- linux-2.6-lttng.orig/include/linux/rcupdate.h 2008-07-15 15:18:20.000000000 -0400 +++ linux-2.6-lttng/include/linux/rcupdate.h 2008-07-15 15:18:46.000000000 -0400 @@ -132,6 +132,26 @@ struct rcu_head { #define rcu_read_unlock_bh() __rcu_read_unlock_bh() /** + * rcu_read_lock_sched - mark the beginning of a RCU-classic critical section + * + * Should be used with either + * - synchronize_sched() + * or + * - call_rcu_sched() and rcu_barrier_sched() + * on the write-side to insure proper synchronization. + */ +#define rcu_read_lock_sched() preempt_disable() + +/* + * rcu_read_unlock_sched - marks the end of a RCU-classic critical section + * + * See rcu_read_lock_sched for more information. + */ +#define rcu_read_unlock_sched() preempt_enable() + + + +/** * rcu_dereference - fetch an RCU-protected pointer in an * RCU read-side critical section. This pointer may later * be safely dereferenced. -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
Please do! Thanx, Paul --
Looks like a very good addition to me! Thanx, Paul --
Use the new rcu_read_lock_sched/unlock_sched() in marker code around the call site instead of preempt_disable/enable(). It helps reviewing the code more easily. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Ingo Molnar <mingo@elte.hu> CC: Peter Zijlstra <peterz@infradead.org> CC: Paul E McKenney <paulmck@linux.vnet.ibm.com> CC: akpm@linux-foundation.org --- kernel/marker.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) Index: linux-2.6-lttng/kernel/marker.c =================================================================== --- linux-2.6-lttng.orig/kernel/marker.c 2008-07-15 15:21:04.000000000 -0400 +++ linux-2.6-lttng/kernel/marker.c 2008-07-15 15:21:12.000000000 -0400 @@ -105,11 +105,11 @@ void marker_probe_cb(const struct marker char ptype; /* - * preempt_disable does two things : disabling preemption to make sure - * the teardown of the callbacks can be done correctly when they are in - * modules and they insure RCU read coherency. + * rcu_read_lock_sched does two things : disabling preemption to make + * sure the teardown of the callbacks can be done correctly when they + * are in modules and they insure RCU read coherency. */ - preempt_disable(); + rcu_read_lock_sched(); ptype = mdata->ptype; if (likely(!ptype)) { marker_probe_func *func; @@ -146,7 +146,7 @@ void marker_probe_cb(const struct marker va_end(args); } } - preempt_enable(); + rcu_read_unlock_sched(); } EXPORT_SYMBOL_GPL(marker_probe_cb); @@ -165,7 +165,7 @@ void marker_probe_cb_noarg(const struct va_list args; /* not initialized */ char ptype; - preempt_disable(); + rcu_read_lock_sched(); ptype = mdata->ptype; if (likely(!ptype)) { marker_probe_func *func; @@ -197,7 +197,7 @@ void marker_probe_cb_noarg(const struct multi[i].func(multi[i].probe_private, call_private, fmt, &args); } - preempt_enable(); + rcu_read_unlock_sched(); } EXPORT_SYMBOL_GPL(marker_probe_cb_noarg); ...
applied this cleanup on top of the fix from Lai Jiangshan, thanks Matthieu! here are the new commits in tip/tracing/markers: 2396ac3: sputrace: use marker_synchronize_unregister() a16801c: markers: documentation fix for teardown b05d9cf: markers: probe example, fix teardown 4b0b16f: markers: fix unregister bug and reenter bug, cleanup 1638da5: markers: marker_synchronize_unregister() b211730: Merge branch 'core/rcu' into tracing/markers 1c50b72: rcu: add rcu_read_lock_sched() / rcu_read_unlock_sched() d587284: markers: fix unregister bug and reenter bug Ingo --
Need a marker_synchronize_unregister() before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Ingo Molnar <mingo@elte.hu> CC: Rusty Russell <rusty@rustcorp.com.au> CC: akpm@linux-foundation.org CC: "Frank Ch. Eigler" <fche@redhat.com> --- samples/markers/probe-example.c | 1 + 1 file changed, 1 insertion(+) Index: linux-2.6-lttng/samples/markers/probe-example.c =================================================================== --- linux-2.6-lttng.orig/samples/markers/probe-example.c 2008-07-31 09:11:13.000000000 -0400 +++ linux-2.6-lttng/samples/markers/probe-example.c 2008-07-31 09:18:54.000000000 -0400 @@ -81,6 +81,7 @@ static void __exit probe_fini(void) probe_array[i].probe_func, &probe_array[i]); printk(KERN_INFO "Number of event b : %u\n", atomic_read(&eventb_count)); + marker_synchronize_unregister(); } module_init(probe_init); -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
Create marker_synchronize_unregister() which must be called before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Ingo Molnar <mingo@elte.hu> CC: Rusty Russell <rusty@rustcorp.com.au> CC: akpm@linux-foundation.org CC: "Frank Ch. Eigler" <fche@redhat.com> --- include/linux/marker.h | 7 +++++++ 1 file changed, 7 insertions(+) Index: linux-2.6-lttng/include/linux/marker.h =================================================================== --- linux-2.6-lttng.orig/include/linux/marker.h 2008-07-31 09:12:52.000000000 -0400 +++ linux-2.6-lttng/include/linux/marker.h 2008-07-31 09:19:31.000000000 -0400 @@ -142,4 +142,11 @@ extern int marker_probe_unregister_priva extern void *marker_get_private_data(const char *name, marker_probe_func *probe, int num); +/* + * marker_synchronize_unregister must be called between the last marker probe + * unregistration and the end of module exit to make sure there is no caller + * executing a probe when it is freed. + */ +#define marker_synchronize_unregister() synchronize_sched() + #endif -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
marker_synchronize_unregister must be called _also_ between unregistration and destruction the data that unregistration-ed probes need to make sure --
Document the need for a marker_synchronize_unregister() before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Ingo Molnar <mingo@elte.hu> CC: Rusty Russell <rusty@rustcorp.com.au> CC: akpm@linux-foundation.org CC: "Frank Ch. Eigler" <fche@redhat.com> --- Documentation/markers.txt | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) Index: linux-2.6-lttng/Documentation/markers.txt =================================================================== --- linux-2.6-lttng.orig/Documentation/markers.txt 2008-07-31 09:11:13.000000000 -0400 +++ linux-2.6-lttng/Documentation/markers.txt 2008-07-31 09:20:57.000000000 -0400 @@ -50,10 +50,12 @@ Connecting a function (probe) to a marke to call) for the specific marker through marker_probe_register() and can be activated by calling marker_arm(). Marker deactivation can be done by calling marker_disarm() as many times as marker_arm() has been called. Removing a probe -is done through marker_probe_unregister(); it will disarm the probe and make -sure there is no caller left using the probe when it returns. Probe removal is -preempt-safe because preemption is disabled around the probe call. See the -"Probe example" section below for a sample probe module. +is done through marker_probe_unregister(); it will disarm the probe. +marker_synchronize_unregister() must be called before the end of the module exit +function to make sure there is no caller left using the probe. This, and the +fact that preemption is disabled around the probe call, make sure that probe +removal and module unload are safe. See the "Probe example" section below for a +sample probe module. The marker mechanism supports inserting multiple instances of the same marker. Markers can be put in inline functions, inlined static functions, and -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 ...
We need a marker_synchronize_unregister() before the end of exit() to make sure every probe callers have exited the non preemptible section and thus are not executing the probe code anymore. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Ingo Molnar <mingo@elte.hu> CC: Jeremy Kerr <jk@ozlabs.org> CC: linuxppc-dev@ozlabs.org CC: Christoph Hellwig <hch@infradead.org> --- arch/powerpc/platforms/cell/spufs/sputrace.c | 1 + 1 file changed, 1 insertion(+) Index: linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/sputrace.c =================================================================== --- linux-2.6-lttng.orig/arch/powerpc/platforms/cell/spufs/sputrace.c 2008-07-31 09:34:58.000000000 -0400 +++ linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/sputrace.c 2008-07-31 09:35:15.000000000 -0400 @@ -233,6 +233,7 @@ static void __exit sputrace_exit(void) remove_proc_entry("sputrace", NULL); kfree(sputrace_log); + marker_synchronize_unregister(); } module_init(sputrace_init); -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
Looks good - added to spufs.git. Cheers, Jeremy --
that wont work very well as the patch relies on the new marker_synchronize_unregister() facility. Ingo --
d'oh, right you are. Should I leave this in your hands to merge? Cheers, Jeremy --
would be nice if you could give your Acked-by for the sputrace bits, then we can merge it. It's a oneliner so it shouldnt cause merging trouble in linux-next. Ingo --
Sure! Acked-by: Jeremy Kerr <jk@ozlabs.org> Jeremy --
Hi Lai, I'll have to nack this fix : One fix I already posted makes sure every marker unregister callers call synchronize_sched() _at some point_ before module unload. It thus makes sure we can do batch unregistration without doing multiple synchronize_sched calls. Also, there is no need to do the synchronize_sched with the marker mutex held. call_rcu_sched takes care of making sure the previous quiescent state is over before calling kfree. This means that when we return from the register/unregister functions, there may still be markers "in flight" using the old markers. Again, why would it be a problem ? Thanks, Mathieu P.S. : I'll send along the patches I am referring to. Ingo, those should probably be merged if they are not in -tip already. -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
1) the new API marker_synchronize_unregister() is ugly, it separate one thing into two steps. user have to remember to call marker_synchronize_unregister() and have to know what he can do and what he can't do before marker_synchronize_unregister(). 2) IMO, synchronous code is better than asynchronous in non-critical-path. we need synchronize_sched() for free(old). you fixes haven't fix the reenter bug. --
Hum, yes it does separate unregistration and synchronization in two steps for a very precise purpose : I don't want unregistration of 100 free(old) is only done in call_rcu. the rcu callback is forced by a rcu_barrier() if two consecutive operations are done on the same I don't see any reentrancy bug here. Have you actually experienced such an issue ? Can you give me an example of a bogus execution trace (step-by-step operations) ? Thanks, -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
