[PATCH] Markers : probe example fix teardown

Previous thread: [REV2: PATCH 0/2] workqueue: Two API's to update delayed works quickly by Krishna Kumar on Monday, September 29, 2008 - 12:03 am. (7 messages)

Next thread: [PATCH]Real-time kernel compile-fix for 32-bit when disabling CONFIG_PREEMPT_RT by John Kacur on Monday, September 29, 2008 - 1:10 am. (2 messages)
From: Lai Jiangshan
Date: Monday, September 29, 2008 - 1:00 am

unregister bug:

codes using makers are typically calling marker_probe_unregister()
and then destroying the data that marker_probe_func needs(or
unloading this module). This is bug when the corresponding
marker_probe_func is still running(on other cpus),
it is using the destroying/ed data.

we should call synchronize_sched() after marker_update_probes().

reenter bug:

marker_probe_register(), marker_probe_unregister() and
marker_probe_unregister_private_data() are not reentrant safe
functions. these 3 functions release markers_mutex and then
require it again and do "entry->oldptr = old; ...", but entry->oldptr
maybe is using now for these 3 functions may reenter when markers_mutex
is released.

we use synchronize_sched() instead of call_rcu_sched() to fix
this bug. actually we can do:
"
if (entry->rcu_pending)
		rcu_barrier_sched();
"
after require markers_mutex again. but synchronize_sched()
is better and simpler. For these 3 functions are not critical path.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/marker.c b/kernel/marker.c
index 7d1faec..9f76c4a 100644
--- a/kernel/marker.c
+++ b/kernel/marker.c
@@ -60,9 +60,6 @@ struct marker_entry {
 	struct marker_probe_closure single;
 	struct marker_probe_closure *multi;
 	int refcount;	/* Number of times armed. 0 if disarmed. */
-	struct rcu_head rcu;
-	void *oldptr;
-	unsigned char rcu_pending:1;
 	unsigned char ptype:1;
 	char name[0];	/* Contains name'\0'format'\0' */
 };
@@ -199,16 +196,6 @@ void marker_probe_cb_noarg(const struct marker *mdata, void *call_private, ...)
 }
 EXPORT_SYMBOL_GPL(marker_probe_cb_noarg);
 
-static void free_old_closure(struct rcu_head *head)
-{
-	struct marker_entry *entry = container_of(head,
-		struct marker_entry, rcu);
-	kfree(entry->oldptr);
-	/* Make sure we free the data before setting the pending flag to 0 */
-	smp_wmb();
-	entry->rcu_pending = 0;
-}
-
 static void debug_print_probes(struct marker_entry *entry)
 {
 	int i;
@@ ...
From: Ingo Molnar
Date: Monday, September 29, 2008 - 1:27 am

applied to tip/tracing/markers, thanks!

Mathieu, i've reactivated tip/tracing/markers for linux-next 
integration, it propagates into auto-ftrace-next. Tracepoints are what 
we use in the scheduler/etc. in tip/master, but until there's marker use 
elsewhere there's no reason not to apply fix patches.

	Ingo
--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:03 am

nack for this patch. I'll my fix patchset.


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:06 am

(probably already in -tip)

Add rcu_read_lock_sched() and rcu_read_unlock_sched() to rcupdate.h to match the
recently added write-side call_rcu_sched() and rcu_barrier_sched(). They also
match the no-so-recently-added synchronize_sched().

It will help following matching use of the update/read lock primitives. Those
new read lock will replace preempt_disable()/enable() used in pair with
RCU-classic synchronization.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Peter Zijlstra <peterz@infradead.org>
CC: Paul E McKenney <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
---
 include/linux/rcupdate.h |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Index: linux-2.6-lttng/include/linux/rcupdate.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/rcupdate.h	2008-07-15 15:18:20.000000000 -0400
+++ linux-2.6-lttng/include/linux/rcupdate.h	2008-07-15 15:18:46.000000000 -0400
@@ -132,6 +132,26 @@ struct rcu_head {
 #define rcu_read_unlock_bh() __rcu_read_unlock_bh()
 
 /**
+ * rcu_read_lock_sched - mark the beginning of a RCU-classic critical section
+ *
+ * Should be used with either
+ * - synchronize_sched()
+ * or
+ * - call_rcu_sched() and rcu_barrier_sched()
+ * on the write-side to insure proper synchronization.
+ */
+#define rcu_read_lock_sched() preempt_disable()
+
+/*
+ * rcu_read_unlock_sched - marks the end of a RCU-classic critical section
+ *
+ * See rcu_read_lock_sched for more information.
+ */
+#define rcu_read_unlock_sched() preempt_enable()
+
+
+
+/**
  * rcu_dereference - fetch an RCU-protected pointer in an
  * RCU read-side critical section.  This pointer may later
  * be safely dereferenced.
-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Ingo Molnar
Date: Tuesday, September 30, 2008 - 3:08 am

Paul, does this API extension look good to you?

i'll apply it to tip/core/rcu if yes.

	Ingo
--

From: Paul E. McKenney
Date: Tuesday, September 30, 2008 - 6:10 am

Please do!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, September 30, 2008 - 6:10 am

Looks like a very good addition to me!

							Thanx, Paul

--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:08 am

Use the new rcu_read_lock_sched/unlock_sched() in marker code around the call
site instead of preempt_disable/enable(). It helps reviewing the code more
easily.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E McKenney <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
---
 kernel/marker.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: linux-2.6-lttng/kernel/marker.c
===================================================================
--- linux-2.6-lttng.orig/kernel/marker.c	2008-07-15 15:21:04.000000000 -0400
+++ linux-2.6-lttng/kernel/marker.c	2008-07-15 15:21:12.000000000 -0400
@@ -105,11 +105,11 @@ void marker_probe_cb(const struct marker
 	char ptype;
 
 	/*
-	 * preempt_disable does two things : disabling preemption to make sure
-	 * the teardown of the callbacks can be done correctly when they are in
-	 * modules and they insure RCU read coherency.
+	 * rcu_read_lock_sched does two things : disabling preemption to make
+	 * sure the teardown of the callbacks can be done correctly when they
+	 * are in modules and they insure RCU read coherency.
 	 */
-	preempt_disable();
+	rcu_read_lock_sched();
 	ptype = mdata->ptype;
 	if (likely(!ptype)) {
 		marker_probe_func *func;
@@ -146,7 +146,7 @@ void marker_probe_cb(const struct marker
 			va_end(args);
 		}
 	}
-	preempt_enable();
+	rcu_read_unlock_sched();
 }
 EXPORT_SYMBOL_GPL(marker_probe_cb);
 
@@ -165,7 +165,7 @@ void marker_probe_cb_noarg(const struct 
 	va_list args;	/* not initialized */
 	char ptype;
 
-	preempt_disable();
+	rcu_read_lock_sched();
 	ptype = mdata->ptype;
 	if (likely(!ptype)) {
 		marker_probe_func *func;
@@ -197,7 +197,7 @@ void marker_probe_cb_noarg(const struct 
 			multi[i].func(multi[i].probe_private, call_private, fmt,
 				&args);
 	}
-	preempt_enable();
+	rcu_read_unlock_sched();
 }
 EXPORT_SYMBOL_GPL(marker_probe_cb_noarg);
 ...
From: Ingo Molnar
Date: Tuesday, September 30, 2008 - 3:13 am

applied this cleanup on top of the fix from Lai Jiangshan, thanks 
Matthieu!

here are the new commits in tip/tracing/markers:

 2396ac3: sputrace: use marker_synchronize_unregister()
 a16801c: markers: documentation fix for teardown
 b05d9cf: markers: probe example, fix teardown
 4b0b16f: markers: fix unregister bug and reenter bug, cleanup
 1638da5: markers: marker_synchronize_unregister()
 b211730: Merge branch 'core/rcu' into tracing/markers
 1c50b72: rcu: add rcu_read_lock_sched() / rcu_read_unlock_sched()
 d587284: markers: fix unregister bug and reenter bug

	Ingo
--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:09 am

Need a marker_synchronize_unregister() before the end of exit() to make sure
every probe callers have exited the non preemptible section and thus are not
executing the probe code anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: akpm@linux-foundation.org
CC: "Frank Ch. Eigler" <fche@redhat.com>
---
 samples/markers/probe-example.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6-lttng/samples/markers/probe-example.c
===================================================================
--- linux-2.6-lttng.orig/samples/markers/probe-example.c	2008-07-31 09:11:13.000000000 -0400
+++ linux-2.6-lttng/samples/markers/probe-example.c	2008-07-31 09:18:54.000000000 -0400
@@ -81,6 +81,7 @@ static void __exit probe_fini(void)
 			probe_array[i].probe_func, &probe_array[i]);
 	printk(KERN_INFO "Number of event b : %u\n",
 			atomic_read(&eventb_count));
+	marker_synchronize_unregister();
 }
 
 module_init(probe_init);
-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:05 am

Create marker_synchronize_unregister() which must be called before the end of
exit() to make sure every probe callers have exited the non preemptible section
and thus are not executing the probe code anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: akpm@linux-foundation.org
CC: "Frank Ch. Eigler" <fche@redhat.com>
---
 include/linux/marker.h |    7 +++++++
 1 file changed, 7 insertions(+)

Index: linux-2.6-lttng/include/linux/marker.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/marker.h	2008-07-31 09:12:52.000000000 -0400
+++ linux-2.6-lttng/include/linux/marker.h	2008-07-31 09:19:31.000000000 -0400
@@ -142,4 +142,11 @@ extern int marker_probe_unregister_priva
 extern void *marker_get_private_data(const char *name, marker_probe_func *probe,
 	int num);
 
+/*
+ * marker_synchronize_unregister must be called between the last marker probe
+ * unregistration and the end of module exit to make sure there is no caller
+ * executing a probe when it is freed.
+ */
+#define marker_synchronize_unregister() synchronize_sched()
+
 #endif
-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Lai Jiangshan
Date: Monday, September 29, 2008 - 6:47 pm

marker_synchronize_unregister must be called _also_ between unregistration
and destruction the data that unregistration-ed probes need to make sure


--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:10 am

Document the need for a marker_synchronize_unregister() before the end of exit()
to make sure every probe callers have exited the non preemptible section and
thus are not executing the probe code anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: akpm@linux-foundation.org
CC: "Frank Ch. Eigler" <fche@redhat.com>
---
 Documentation/markers.txt |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6-lttng/Documentation/markers.txt
===================================================================
--- linux-2.6-lttng.orig/Documentation/markers.txt	2008-07-31 09:11:13.000000000 -0400
+++ linux-2.6-lttng/Documentation/markers.txt	2008-07-31 09:20:57.000000000 -0400
@@ -50,10 +50,12 @@ Connecting a function (probe) to a marke
 to call) for the specific marker through marker_probe_register() and can be
 activated by calling marker_arm(). Marker deactivation can be done by calling
 marker_disarm() as many times as marker_arm() has been called. Removing a probe
-is done through marker_probe_unregister(); it will disarm the probe and make
-sure there is no caller left using the probe when it returns. Probe removal is
-preempt-safe because preemption is disabled around the probe call. See the
-"Probe example" section below for a sample probe module.
+is done through marker_probe_unregister(); it will disarm the probe.
+marker_synchronize_unregister() must be called before the end of the module exit
+function to make sure there is no caller left using the probe. This, and the
+fact that preemption is disabled around the probe call, make sure that probe
+removal and module unload are safe. See the "Probe example" section below for a
+sample probe module.
 
 The marker mechanism supports inserting multiple instances of the same marker.
 Markers can be put in inline functions, inlined static functions, and
-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 ...
From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:11 am

We need a marker_synchronize_unregister() before the end of exit() to make sure
every probe callers have exited the non preemptible section and thus are not
executing the probe code anymore.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@elte.hu>
CC: Jeremy Kerr <jk@ozlabs.org>
CC: linuxppc-dev@ozlabs.org
CC: Christoph Hellwig <hch@infradead.org>
---
 arch/powerpc/platforms/cell/spufs/sputrace.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/sputrace.c
===================================================================
--- linux-2.6-lttng.orig/arch/powerpc/platforms/cell/spufs/sputrace.c	2008-07-31 09:34:58.000000000 -0400
+++ linux-2.6-lttng/arch/powerpc/platforms/cell/spufs/sputrace.c	2008-07-31 09:35:15.000000000 -0400
@@ -233,6 +233,7 @@ static void __exit sputrace_exit(void)
 
 	remove_proc_entry("sputrace", NULL);
 	kfree(sputrace_log);
+	marker_synchronize_unregister();
 }
 
 module_init(sputrace_init);
-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Christoph Hellwig
Date: Monday, September 29, 2008 - 8:13 am

Looks good.
--

From: Jeremy Kerr
Date: Monday, September 29, 2008 - 5:28 pm

Looks good - added to spufs.git.

Cheers,


Jeremy
--

From: Ingo Molnar
Date: Tuesday, September 30, 2008 - 2:55 am

that wont work very well as the patch relies on the new 
marker_synchronize_unregister() facility.

	Ingo
--

From: Jeremy Kerr
Date: Tuesday, September 30, 2008 - 4:22 am

d'oh, right you are. Should I leave this in your hands to merge?

Cheers,


Jeremy
--

From: Ingo Molnar
Date: Tuesday, September 30, 2008 - 4:30 am

would be nice if you could give your Acked-by for the sputrace bits, 
then we can merge it. It's a oneliner so it shouldnt cause merging 
trouble in linux-next.

	Ingo
--

From: Jeremy Kerr
Date: Tuesday, September 30, 2008 - 4:34 am

Sure!

Acked-by: Jeremy Kerr <jk@ozlabs.org>


Jeremy
--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:03 am

Hi Lai,

I'll have to nack this fix :

One fix I already posted makes sure every marker unregister callers call
synchronize_sched() _at some point_ before module unload. It thus makes
sure we can do batch unregistration without doing multiple
synchronize_sched calls.

Also, there is no need to do the synchronize_sched with the marker mutex
held. call_rcu_sched takes care of making sure the previous quiescent
state is over before calling kfree. This means that when we return from
the register/unregister functions, there may still be markers "in
flight" using the old markers. Again, why would it be a problem ?

Thanks,

Mathieu

P.S. : I'll send along the patches I am referring to. Ingo, those should
probably be merged if they are not in -tip already.

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Lai Jiangshan
Date: Monday, September 29, 2008 - 6:40 pm

1) the new API marker_synchronize_unregister() is ugly, it separate one thing
   into two steps.
   user have to remember to call marker_synchronize_unregister() and have
   to know what he can do and what he can't do before
   marker_synchronize_unregister().

2) IMO, synchronous code is better than asynchronous in non-critical-path.
   we need synchronize_sched() for free(old).

   you fixes haven't fix the reenter bug.



--

From: Mathieu Desnoyers
Date: Monday, September 29, 2008 - 8:38 pm

Hum, yes it does separate unregistration and synchronization in two
steps for a very precise purpose : I don't want unregistration of 100

free(old) is only done in call_rcu. the rcu callback is forced by a
rcu_barrier() if two consecutive operations are done on the same

I don't see any reentrancy bug here. Have you actually experienced such
an issue ? Can you give me an example of a bogus execution trace
(step-by-step operations) ?

Thanks,


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

Previous thread: [REV2: PATCH 0/2] workqueue: Two API's to update delayed works quickly by Krishna Kumar on Monday, September 29, 2008 - 12:03 am. (7 messages)

Next thread: [PATCH]Real-time kernel compile-fix for 32-bit when disabling CONFIG_PREEMPT_RT by John Kacur on Monday, September 29, 2008 - 1:10 am. (2 messages)