Re: [PATCH,RFC] RCU-based detection of stalled CPUs for Classic RCU

Previous thread: IT8720: it87.c update by Jean-Marc Spaggiari on Wednesday, October 1, 2008 - 5:32 pm. (3 messages)

Next thread: VERIFY YOUR NETVISION.NET.IL EMAIL ACCOUNT IMMEDIATELY! by NETVISION.NET.IL on Wednesday, October 1, 2008 - 5:34 pm. (1 message)
From: Paul E. McKenney
Date: Wednesday, October 1, 2008 - 5:36 pm

Hello!

This patch adds stalled-CPU detection to Classic RCU.  This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR,
which defaults disabled.  This is a debugging feature, not something
that non-kernel-hackers would be expected to care about.  This feature
can detect looping CPUs in !PREEMPT builds and looping CPUs with
preemption disabled in PREEMPT builds.  This is essentially a port of
this functionality from the treercu patch.

One current shortcoming: on some systems, stalls are detected during
early boot, when we normally would not care about them.  My thought is
to add a call from late initialization to suppress stall detection until
the system is well along its way to being booted, but thought I should
check to see if there might already be something for this purpose.

(Currently against 2.6.27-rc8, FYI.)

Thoughts?

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 include/linux/rcuclassic.h |    9 ++++
 kernel/rcuclassic.c        |   88 +++++++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug          |   13 ++++++
 3 files changed, 110 insertions(+)

diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 4ab8436..9b62e9a 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -40,6 +40,10 @@
 #include <linux/cpumask.h>
 #include <linux/seqlock.h>
 
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+#define RCU_SECONDS_TILL_STALL_CHECK	3	/* for rcp->seconds_stall */
+#define RCU_SECONDS_TILL_STALL_RECHECK	30	/* for rcp->seconds_stall */
+#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /* Global control variables for rcupdate callback mechanism. */
 struct rcu_ctrlblk {
@@ -52,6 +56,11 @@ struct rcu_ctrlblk {
 	spinlock_t	lock	____cacheline_internodealigned_in_smp;
 	cpumask_t	cpumask; /* CPUs that need to switch in order    */
 				 /* for current batch to proceed.        */
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+	unsigned long gp_start;	 /* Time at which GP started ...
From: Ingo Molnar
Date: Thursday, October 2, 2008 - 1:07 am

could you be a bit more specific, why do those warnings show up and why 
dont we care about them? There are things like networking that 

i think this is a very good idea in general - often the question comes 
up whether a hang seen in the RCU code is indeed caused by RCU or other 
factors. Could you perhaps rebase it against tip/core/rcu ? [or 
tip/master for convenience]

	Ingo
--

From: Paul E. McKenney
Date: Thursday, October 2, 2008 - 7:05 am

Hmmm...

Good point, I was just falling back on my old "we don't care about RCU
stalls in boot-time code" rule from long ago.  It is entirely possible
(in fact reasonably likely) that this rule no longer applies to Linux as
it exists today.  So please see below for the console output.  Not all
systems report this stall.  I have recently been running only on Power
systems, will fire off on some x86s.  My kneejerk reaction was that the
"stall" was really due to the clock-setting operation -- RCU stalls are
based on get_seconds().

So maybe I need to change the stall-detection code to use jiffies
instead.


Will do!

							Thanx, Paul

------------------------------------------------------------------------
Console output: search for "RCU detected CPU 5 stall" near the end.
------------------------------------------------------------------------

Elapsed time since release of system processors: 7742 mins 58 secs

Config file read, 2048 bytes
Welcome
Welcome to yaboot version 1.3.13-cas (Red Hat 1.3.13.cas-5)
Enter "help" to get some basic usage information
boot: autobench
Please wait, loading kernel...
   Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02b00000, size: 2331 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
    -\|/-\|/-\|/-\|
Elapsed time since release of system processors: 7743 mins 29 secs

Config file read, 2048 bytes
Welcome
Welcome to yaboot version 1.3.13-cas (Red Hat 1.3.13.cas-5)
Enter "help" to get some basic usage information
boot: autobench 
Please wait, loading kernel...
   Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02b00000, size: 2331 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro console=hvc0 autobench_args: root=/dev/sda6
ABAT:1222922465 
memory layout at init:
  alloc_bottom : 0000000002d47000
  alloc_top    : 0000000008000000
  alloc_top_hi : 0000000008000000
  rmo_top      : 0000000008000000
  ram_top  ...
From: Mathieu Desnoyers
Date: Thursday, October 2, 2008 - 8:10 am

You could possibly use get monotonic time, which should hopefully be a
bit better than its NTP-corrected alternatives.

Also, do you have all the tools you need to poinpoint the precise source
of RCU stalls ? I haven't looked at RCU trace, but I should say adding a
new tracepoint to instrument RCU callback execution at boot time and to
trace it with LTTng is close to trivial. (actually, very early kernel
boot is not so trivial as we would have to create a small kernel module
to activate lttng tracing, but tracing the early phases of userspace
bootup is trivial given we can use the userspace API).


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Paul E. McKenney
Date: Thursday, October 2, 2008 - 10:43 am

OK.  I thought jiffies were monotonic (though not perfectly correlated
to the passage of time).  This is a warning timeout, so does not need
better than a few percent accuracy, as long as time never jumps too far.


I just dump stack, which historically worked pretty well.

Could you please tell me more about what you think should be added?

--

From: Mathieu Desnoyers
Date: Thursday, October 2, 2008 - 11:51 am

Well, jiffies should also be monotonic, but I don't know their exact

Something like do_posix_clock_monotonic_gettime() in linux/time.h. I

I just created two patches which instrument RCU callbacks and call_rcu
in classic and preempt RCU to the -lttng tree to show how simple it is
once the LTTng infrastructure is in place.

Basically, all you need to do is to pull this tree (branch
2.6.27-rc8-lttng-0.31) :

git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git

And follow :
http://ltt.polymtl.ca/svn/trunk/lttv/QUICKSTART

Which will basically tell you to enable markers/tracepoints and leave
most ltt-related options to default (all in general setup menu).

wget http://ltt.polymtl.ca/lttng/ltt-control-0.53-02102008.tar.gz
extract, ./configure, make, make install

wget http://ltt.polymtl.ca/packages/lttv-0.10.0-pre15-12082008.tar.gz
extract, ./configure, make, make install
(see QUICKSTART for package dependencies. Will be automatically checked
by configure)

You may have to add a mount point for debugfs and load some kernel
modules (as explained in the QUICKSTART). Don't forget rcu-trace.ko.

To take a trace, once you are booted in the kernel :

ltt-armall
(check that the rcu_* markers are enabled by looking at the "state"
field of cat /proc/ltt | grep rcu)
lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace
[Reproduce behavior to trace]
lttctl -n trace -R
# Produce a text dump
lttv -m textDump -t /tmp/trace

# Produce a text dump filtering only rcu events :
lttv -m textDump -t /tmp/trace \
  -e "event.name=rcu_classic_callback|event.name=rcu_classic_call_rcu|\
event.name=rcu_classic_call_rcu_bh|event.name=rcu_preempt_callback|\
event.name=rcu_preempt_call_rcu_sched|event.name=rcu_preempt_call_rcu"

Then some scripting could be required to turn the function pointers into
symbols. Keep you kernel vmlinux around to perform symbol resolution.
A quick trick is to use objdump -S --start-address=0x.... vmlinux|less
but I am certain there are much ...
From: Paul E. McKenney
Date: Thursday, October 2, 2008 - 3:55 pm

There will be at least one non-dyntick CPU if there are RCU callbacks
pending.  If there are no RCU callbacks pending, then RCU won't be


OK, I think I see the disconnection here...

When RCU detects a stall, that often indicates that some CPU is spinning
in the kernel (for !CONFIG_PREEMPT) or that some CPU is spinning in the
kernel with preemption disabled (for CONFIG_PREEMPT).  The stack trace
will normally show what is spinning.

Tracing the RCU activity in this case will not help -- it is not RCU that
is spinning, but rather that RCU is stalled waiting for some other part
of the kernel that is spinning.  (In a past life, RCU once complained
about a specific CPU that had managed to fail in a way that left the
system running!  Except that RCU grace periods never ended...)

FWIW, what I normally use for debugging RCU itself is counters rather
than log messages.  The reason I avoid log messages is that the race
conditions I often find myself chasing are reasonably narrow, so that
I would have a very large number of log messages to pile through.
Nevertheless, I will keep track of these patches in case I come across
some situation that is more friendly to log-based debugging -- and thank
you for putting them together!!!

--

From: Paul E. McKenney
Date: Thursday, October 2, 2008 - 4:06 pm

Hello!

This patch adds stalled-CPU detection to Classic RCU.  This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
defaults disabled.  This is a debugging feature to detect infinite loops
in kernel code, not something that non-kernel-hackers would be expected
to care about.  This feature can detect looping CPUs in !PREEMPT builds
and looping CPUs with preemption disabled in PREEMPT builds.  This is
essentially a port of this functionality from the treercu patch, replacing
the stall debug patch that is already in tip/core/rcu (commit 67182ae1c4).

The changes from the patch in tip/core/rcu include making the config
variable name match that in treercu, changing from seconds to jiffies to
avoid spurious warnings, and printing a boot message when this feature
is enabled.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 include/linux/rcuclassic.h |   12 ++-
 kernel/rcuclassic.c        |  166 +++++++++++++++++++++++----------------------
 lib/Kconfig.debug          |    2 
 3 files changed, 96 insertions(+), 84 deletions(-)

diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 29bf528..2d72d20 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -40,15 +40,21 @@
 #include <linux/cpumask.h>
 #include <linux/seqlock.h>
 
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+#define RCU_SECONDS_TILL_STALL_CHECK	3 * HZ	/* for rcp->jiffies_stall */
+#define RCU_SECONDS_TILL_STALL_RECHECK	30 * HZ	/* for rcp->jiffies_stall */
+#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /* Global control variables for rcupdate callback mechanism. */
 struct rcu_ctrlblk {
 	long	cur;		/* Current batch number.                      */
 	long	completed;	/* Number of the last completed batch         */
 	long	pending;	/* Number of the last pending batch           */
-#ifdef CONFIG_DEBUG_RCU_STALL
-	unsigned long gp_check;	/* Time grace period should end, in seconds.  */
-#endif /* #ifdef ...
From: Mathieu Desnoyers
Date: Thursday, October 2, 2008 - 9:41 pm

Hi Paul,

Thanks for the previous explanations. Out of curiosity, what can this
patch do that the nmi watchdog can't do ?


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--

From: Paul E. McKenney
Date: Thursday, October 2, 2008 - 10:02 pm

Hello, Mathieu,

Operate on architectures that don't have NMIs.  Possibly allow shorter
timeouts than the NMI watchdog.  Operate on a per-CPU basis (or does
the NMI watchdog do that these days?).

--

From: Ingo Molnar
Date: Friday, October 3, 2008 - 1:39 am

applied to tip/core/rcu, thanks Paul!



and have removed the two tailing tabs from this line.

	Ingo
--

From: Ingo Molnar
Date: Friday, October 3, 2008 - 1:42 am

-tip testing found a minor build bug - find the fix below.

	Ingo

-------------------->
From 2ec2b482b10a1ed3493c224f1893cddd3d33833b Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Oct 2008 10:41:00 +0200
Subject: [PATCH] rcu: RCU-based detection of stalled CPUs for Classic RCU, fix

fix the !CONFIG_RCU_CPU_STALL_DETECTOR path:

 kernel/rcuclassic.c: In function '__rcu_pending':
 kernel/rcuclassic.c:609: error: too few arguments to function 'check_cpu_stall'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/rcuclassic.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
index 0d07e6e..37f72e5 100644
--- a/kernel/rcuclassic.c
+++ b/kernel/rcuclassic.c
@@ -239,7 +239,7 @@ static void record_gp_stall_check_time(struct rcu_ctrlblk *rcp)
 {
 }
 
-static void check_cpu_stall(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
+static inline void check_cpu_stall(struct rcu_ctrlblk *rcp)
 {
 }
 
--

From: Paul E. McKenney
Date: Friday, October 3, 2008 - 8:14 am

I should have learned by now to test with all relevant config parameter
combinations...  My apologies for the hassle!!!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Friday, October 3, 2008 - 8:13 am

<red face>  checkpatch.pl...  I have heard of it!  :-)

							Thanx, Paul
--

From: Paul E. McKenney
Date: Thursday, October 2, 2008 - 10:39 am

Hello again!

This patch adds stalled-CPU detection to Classic RCU.  This capability 
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, 
which defaults disabled.  This is a debugging feature, not something 
that non-kernel-hackers would be expected to care about.  This feature 
can detect looping CPUs in !PREEMPT builds and looping CPUs with 
preemption disabled in PREEMPT builds.  This is essentially a port of 
this functionality from the treercu patch.

This version uses jiffies rather than get_seconds(), which eliminates
the spurious boot-time CPU stall warnings seen on some systems with
the previous patch.

This is still against v2.6.27-rc8 -- I will do a version against tip
this evening (Pacific Time) when I get back to the combination of better
bandwidth and AC power.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 include/linux/rcuclassic.h |    9 ++++
 kernel/rcuclassic.c        |   88 +++++++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig.debug          |   13 ++++++
 3 files changed, 110 insertions(+)

diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 4ab8436..cab055b 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -40,6 +40,10 @@
 #include <linux/cpumask.h>
 #include <linux/seqlock.h>
 
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+#define RCU_SECONDS_TILL_STALL_CHECK	3 * HZ	/* for rcp->jiffies_stall */
+#define RCU_SECONDS_TILL_STALL_RECHECK	30 * HZ	/* for rcp->jiffies_stall */
+#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /* Global control variables for rcupdate callback mechanism. */
 struct rcu_ctrlblk {
@@ -52,6 +56,11 @@ struct rcu_ctrlblk {
 	spinlock_t	lock	____cacheline_internodealigned_in_smp;
 	cpumask_t	cpumask; /* CPUs that need to switch in order    */
 				 /* for current batch to proceed.        */
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+	unsigned long gp_start;	 /* Time at which GP started in jiffies. */
+	unsigned long jiffies_stall;
+				 ...
From: Lai Jiangshan
Date: Friday, October 3, 2008 - 3:12 am

I found a magic number in it.


Is it (2 * HZ)?



--

From: Paul E. McKenney
Date: Friday, October 3, 2008 - 8:12 am

Good catch, Jiangshan!!!  See below for responses, and please see the

It should be "2", though a macro might be good.  The reason for "2" is
that it guarantees that the other CPU had a full period to respond, so


Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 include/linux/rcuclassic.h |    4 ++++
 kernel/rcuclassic.c        |    6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 8388647..69e1929 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -43,6 +43,10 @@
 #ifdef CONFIG_RCU_CPU_STALL_DETECTOR
 #define RCU_SECONDS_TILL_STALL_CHECK	(3 * HZ)  /* for rcp->jiffies_stall */
 #define RCU_SECONDS_TILL_STALL_RECHECK	(30 * HZ) /* for rcp->jiffies_stall */
+#define RCU_STALL_RAT_DELAY		2	  /* Allow other CPUs time */
+						  /*  to take at least one */
+						  /*  scheduling clock irq */
+						  /*  before ratting on them. */
 #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /* Global control variables for rcupdate callback mechanism. */
diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
index 0ce0802..8c2d15d 100644
--- a/kernel/rcuclassic.c
+++ b/kernel/rcuclassic.c
@@ -136,7 +136,7 @@ static void print_other_cpu_stall(struct rcu_ctrlblk *rcp)
 
 	spin_lock_irqsave(&rcp->lock, flags);
 	delta = jiffies - rcp->seconds_stall;
-	if (delta < 2 || rcp->cur != rcp->completed) {
+	if (delta < RCU_STALL_RAT_DELAY || rcp->cur != rcp->completed) {
 		spin_unlock_irqrestore(&rcp->lock, flags);
 		return;
 	}
@@ -180,9 +180,9 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall(rcp);
 
-	} else if (rcp->cur != rcp->completed && delta >= 2) {
+	} else if (rcp->cur != rcp->completed && delta >= RCU_STALL_RAT_DELAY) {
 
-		/* They had two seconds to dump stack, so complain. */
+		/* They had two time units to dump stack, so complain. */
 ...
Previous thread: IT8720: it87.c update by Jean-Marc Spaggiari on Wednesday, October 1, 2008 - 5:32 pm. (3 messages)

Next thread: VERIFY YOUR NETVISION.NET.IL EMAIL ACCOUNT IMMEDIATELY! by NETVISION.NET.IL on Wednesday, October 1, 2008 - 5:34 pm. (1 message)