Hi, This patchset introduces a debugobjects-based rcu list head debugging infrastructure. The first patch adds the ability to keep track of a "state machine" into debugobjects. The state machine logic is all contained in the type-aware caller. Debugobjects only keep track of the current state (a single integer). A state validation/transition API is provided to debugobjects users. This patchset is based on 2.6.33.1. Comments are welcome, Mathieu -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
Helps finding racy users of call_rcu(), which results in hangs because list entries are overwritten and/or skipped. This new patch version is based on the debugobjects with the newly introduced "active state" tracker. Non-initialized entries are all considered as "statically initialized". An activation fixup (triggered by call_rcu()) takes care of performing the debug object initialization without issuing any warning. Since we cannot increase the size of struct rcu_head, I don't see much room to put an identifier for statically initialized rcu_head structures. So for now, we have to live without "activation without explicit init" detection. But the main purpose of this debug option is to detect double-activations (double call_rcu() use of a rcu_head before the callback is executed), which is correctly addressed here. This also detects potential internal RCU callback corruption, which would cause the callbacks to be executed twice. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: akpm@linux-foundation.org CC: mingo@elte.hu CC: laijs@cn.fujitsu.com CC: dipankar@in.ibm.com CC: josh@joshtriplett.org CC: dvhltc@us.ibm.com CC: niv@us.ibm.com CC: tglx@linutronix.de CC: peterz@infradead.org CC: rostedt@goodmis.org CC: Valdis.Kletnieks@vt.edu CC: dhowells@redhat.com CC: eric.dumazet@gmail.com CC: Alexey Dobriyan <adobriyan@gmail.com> --- include/linux/rcupdate.h | 61 ++++++++++++++++++- kernel/rcupdate.c | 148 +++++++++++++++++++++++++++++++++++++++++++++++ kernel/rcutiny.c | 2 kernel/rcutree.c | 2 lib/Kconfig.debug | 6 + 5 files changed, 215 insertions(+), 4 deletions(-) Index: linux-2.6-lttng/include/linux/rcupdate.h =================================================================== --- linux-2.6-lttng.orig/include/linux/rcupdate.h 2010-03-19 15:26:02.000000000 -0400 +++ linux-2.6-lttng/include/linux/rcupdate.h 2010-03-19 15:26:06.000000000 ...
Is this useful? Basic usage is so there no double call_rcu(): if (atomic_dec_and_test()) call_rcu() --
I believe that it is. There have been a few cases of call_rcu() being invoked twice without a grace period between the two invocations. Mathieu's patch would catch this sort of misbehavior. That said, I do agree that if everyone followed the rules, there would be no need for Mathieu's patch -- and there would be no need for much else, besides. ;-) Thanx, Paul --
Very nice you do not touch struct rcu_head. I'm a little foolish. I don't know why we need STATE_RCU_HEAD_READY/QUEUED. 1) obj->astate does not be set to STATE_RCU_HEAD_READY in debug_object_activate/init(). Maybe you need to add a function: debug_object_activate_with_astate(head, descr, init_astate); (the same as debug_object_activate(), but init obj->astate also) Or you need: @debugobjects.h # define DEBUG_OBJECT_ASTATE_INIT 0 @rcupdate.c # define STATE_RCU_HEAD_READY DEBUG_OBJECT_ASTATE_INIT # define STATE_RCU_HEAD_QUEUED (STATE_RCU_HEAD_READY + 1) 2) In debug_rcu_head_queue(), debug_object_active_state() is always success when debug_object_activate() is success. In debug_rcu_head_unqueue(), debug_object_active_state() is always success if RCU subsystem is running correctly. (Correct me if I'm wrong) So we don't need debug_object_active_state() We may need call rcu_barrier(), rcu_barrier_sched(), rcu_barrier_bh() together? --
Yes it is. astate is set to 0 upon activation, and STATE_RCU_HEAD_READY is 0.
This is the initial state of the state machine (and also the accepting state).
Maybe I could document it a little bit more. Will write:
+/*
+ * Active state:
+ * - Set at 0 upon initialization.
+ * - Must return to 0 before deactivation.
+ */
+extern void
+debug_object_active_state(void *addr, struct debug_obj_descr *descr,
+ unsigned int expect, unsigned int next);
Yes, but this debug infrastructure also detects problems that may arise in the
RCU subsystem itself: e.g. a rcu head that would be executed on two CPUs, which
could be caused by races between cpu hotplug and the RCU subsystem. I'm not
saying that we have these races currently, but that the RCU subsystems is now so
tied to complex parts of the kernel (scheduler, cpu hotplug) that it's bound to
have nasty bugs eventually.
So without the state machine, what we don't have in debugobjects is the ability
to detect a second "deactivation": debugobjects considers that an object can be
deactivated more than once without error. But in this case, just one
deactivation should be allowed. I thought that this state machine scheme would
allow following an object life more closely than just the
init/activate/deactivate/destroy/free phases. Maybe there could be ways to
combine the calls ? But.. given that it's for debugging purposes, do we care
that much about performance that it's worth creating new API members, making the
Will change for:
#ifndef CONFIG_PREEMPT
WARN_ON(1);
return 0;
#else
if (rcu_preempt_depth() != 0 || preempt_count() != 0 ||
irqs_disabled()) {
WARN_ON(1);
return 0;
}
rcu_barrier();
rcu_barrier_sched();
rcu_barrier_bh();
debug_object_activate(head, &rcuhead_debug_descr);
return ...Implement a basic state machine checker in the debugobjects. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: akpm@linux-foundation.org CC: mingo@elte.hu CC: laijs@cn.fujitsu.com CC: dipankar@in.ibm.com CC: josh@joshtriplett.org CC: dvhltc@us.ibm.com CC: niv@us.ibm.com CC: tglx@linutronix.de CC: peterz@infradead.org CC: rostedt@goodmis.org CC: Valdis.Kletnieks@vt.edu CC: dhowells@redhat.com CC: eric.dumazet@gmail.com CC: Alexey Dobriyan <adobriyan@gmail.com> --- include/linux/debugobjects.h | 9 ++++++ lib/debugobjects.c | 59 ++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 65 insertions(+), 3 deletions(-) Index: linux-2.6-lttng/include/linux/debugobjects.h =================================================================== --- linux-2.6-lttng.orig/include/linux/debugobjects.h 2010-03-19 15:26:02.000000000 -0400 +++ linux-2.6-lttng/include/linux/debugobjects.h 2010-03-19 15:26:06.000000000 -0400 @@ -20,12 +20,14 @@ struct debug_obj_descr; * struct debug_obj - representaion of an tracked object * @node: hlist node to link the object into the tracker list * @state: tracked object state + * @astate: current active state * @object: pointer to the real object * @descr: pointer to an object type specific debug description structure */ struct debug_obj { struct hlist_node node; enum debug_obj_state state; + unsigned int astate; void *object; struct debug_obj_descr *descr; }; @@ -60,6 +62,13 @@ extern void debug_object_deactivate(void extern void debug_object_destroy (void *addr, struct debug_obj_descr *descr); extern void debug_object_free (void *addr, struct debug_obj_descr *descr); +/* + * Active state must return to 0 before deactivation. + */ +extern void +debug_object_active_state(void *addr, struct debug_obj_descr *descr, + unsigned int expect, unsigned int next); + extern void debug_objects_early_init(void); ...
