Hello!
This patch adds stalled-CPU detection to Classic RCU. This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR,
which defaults disabled. This is a debugging feature, not something
that non-kernel-hackers would be expected to care about. This feature
can detect looping CPUs in !PREEMPT builds and looping CPUs with
preemption disabled in PREEMPT builds. This is essentially a port of
this functionality from the treercu patch.
One current shortcoming: on some systems, stalls are detected during
early boot, when we normally would not care about them. My thought is
to add a call from late initialization to suppress stall detection until
the system is well along its way to being booted, but thought I should
check to see if there might already be something for this purpose.
(Currently against 2.6.27-rc8, FYI.)
Thoughts?
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rcuclassic.h | 9 ++++
kernel/rcuclassic.c | 88 +++++++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 13 ++++++
3 files changed, 110 insertions(+)
diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 4ab8436..9b62e9a 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -40,6 +40,10 @@
#include <linux/cpumask.h>
#include <linux/seqlock.h>
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+#define RCU_SECONDS_TILL_STALL_CHECK 3 /* for rcp->seconds_stall */
+#define RCU_SECONDS_TILL_STALL_RECHECK 30 /* for rcp->seconds_stall */
+#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
/* Global control variables for rcupdate callback mechanism. */
struct rcu_ctrlblk {
@@ -52,6 +56,11 @@ struct rcu_ctrlblk {
spinlock_t lock ____cacheline_internodealigned_in_smp;
cpumask_t cpumask; /* CPUs that need to switch in order */
/* for current batch to proceed. */
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+ unsigned long gp_start; /* Time at which GP started ...could you be a bit more specific, why do those warnings show up and why dont we care about them? There are things like networking that i think this is a very good idea in general - often the question comes up whether a hang seen in the RCU code is indeed caused by RCU or other factors. Could you perhaps rebase it against tip/core/rcu ? [or tip/master for convenience] Ingo --
Hmmm...
Good point, I was just falling back on my old "we don't care about RCU
stalls in boot-time code" rule from long ago. It is entirely possible
(in fact reasonably likely) that this rule no longer applies to Linux as
it exists today. So please see below for the console output. Not all
systems report this stall. I have recently been running only on Power
systems, will fire off on some x86s. My kneejerk reaction was that the
"stall" was really due to the clock-setting operation -- RCU stalls are
based on get_seconds().
So maybe I need to change the stall-detection code to use jiffies
instead.
Will do!
Thanx, Paul
------------------------------------------------------------------------
Console output: search for "RCU detected CPU 5 stall" near the end.
------------------------------------------------------------------------
Elapsed time since release of system processors: 7742 mins 58 secs
Config file read, 2048 bytes
Welcome
Welcome to yaboot version 1.3.13-cas (Red Hat 1.3.13.cas-5)
Enter "help" to get some basic usage information
boot: autobench
Please wait, loading kernel...
Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02b00000, size: 2331 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
-\|/-\|/-\|/-\|
Elapsed time since release of system processors: 7743 mins 29 secs
Config file read, 2048 bytes
Welcome
Welcome to yaboot version 1.3.13-cas (Red Hat 1.3.13.cas-5)
Enter "help" to get some basic usage information
boot: autobench
Please wait, loading kernel...
Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02b00000, size: 2331 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro console=hvc0 autobench_args: root=/dev/sda6
ABAT:1222922465
memory layout at init:
alloc_bottom : 0000000002d47000
alloc_top : 0000000008000000
alloc_top_hi : 0000000008000000
rmo_top : 0000000008000000
ram_top ...You could possibly use get monotonic time, which should hopefully be a bit better than its NTP-corrected alternatives. Also, do you have all the tools you need to poinpoint the precise source of RCU stalls ? I haven't looked at RCU trace, but I should say adding a new tracepoint to instrument RCU callback execution at boot time and to trace it with LTTng is close to trivial. (actually, very early kernel boot is not so trivial as we would have to create a small kernel module to activate lttng tracing, but tracing the early phases of userspace bootup is trivial given we can use the userspace API). -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
OK. I thought jiffies were monotonic (though not perfectly correlated to the passage of time). This is a warning timeout, so does not need better than a few percent accuracy, as long as time never jumps too far. I just dump stack, which historically worked pretty well. Could you please tell me more about what you think should be added? --
Well, jiffies should also be monotonic, but I don't know their exact Something like do_posix_clock_monotonic_gettime() in linux/time.h. I I just created two patches which instrument RCU callbacks and call_rcu in classic and preempt RCU to the -lttng tree to show how simple it is once the LTTng infrastructure is in place. Basically, all you need to do is to pull this tree (branch 2.6.27-rc8-lttng-0.31) : git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git And follow : http://ltt.polymtl.ca/svn/trunk/lttv/QUICKSTART Which will basically tell you to enable markers/tracepoints and leave most ltt-related options to default (all in general setup menu). wget http://ltt.polymtl.ca/lttng/ltt-control-0.53-02102008.tar.gz extract, ./configure, make, make install wget http://ltt.polymtl.ca/packages/lttv-0.10.0-pre15-12082008.tar.gz extract, ./configure, make, make install (see QUICKSTART for package dependencies. Will be automatically checked by configure) You may have to add a mount point for debugfs and load some kernel modules (as explained in the QUICKSTART). Don't forget rcu-trace.ko. To take a trace, once you are booted in the kernel : ltt-armall (check that the rcu_* markers are enabled by looking at the "state" field of cat /proc/ltt | grep rcu) lttctl -n trace -d -l /mnt/debugfs/ltt -t /tmp/trace [Reproduce behavior to trace] lttctl -n trace -R # Produce a text dump lttv -m textDump -t /tmp/trace # Produce a text dump filtering only rcu events : lttv -m textDump -t /tmp/trace \ -e "event.name=rcu_classic_callback|event.name=rcu_classic_call_rcu|\ event.name=rcu_classic_call_rcu_bh|event.name=rcu_preempt_callback|\ event.name=rcu_preempt_call_rcu_sched|event.name=rcu_preempt_call_rcu" Then some scripting could be required to turn the function pointers into symbols. Keep you kernel vmlinux around to perform symbol resolution. A quick trick is to use objdump -S --start-address=0x.... vmlinux|less but I am certain there are much ...
There will be at least one non-dyntick CPU if there are RCU callbacks pending. If there are no RCU callbacks pending, then RCU won't be OK, I think I see the disconnection here... When RCU detects a stall, that often indicates that some CPU is spinning in the kernel (for !CONFIG_PREEMPT) or that some CPU is spinning in the kernel with preemption disabled (for CONFIG_PREEMPT). The stack trace will normally show what is spinning. Tracing the RCU activity in this case will not help -- it is not RCU that is spinning, but rather that RCU is stalled waiting for some other part of the kernel that is spinning. (In a past life, RCU once complained about a specific CPU that had managed to fail in a way that left the system running! Except that RCU grace periods never ended...) FWIW, what I normally use for debugging RCU itself is counters rather than log messages. The reason I avoid log messages is that the race conditions I often find myself chasing are reasonably narrow, so that I would have a very large number of log messages to pile through. Nevertheless, I will keep track of these patches in case I come across some situation that is more friendly to log-based debugging -- and thank you for putting them together!!! --
Hello!
This patch adds stalled-CPU detection to Classic RCU. This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
defaults disabled. This is a debugging feature to detect infinite loops
in kernel code, not something that non-kernel-hackers would be expected
to care about. This feature can detect looping CPUs in !PREEMPT builds
and looping CPUs with preemption disabled in PREEMPT builds. This is
essentially a port of this functionality from the treercu patch, replacing
the stall debug patch that is already in tip/core/rcu (commit 67182ae1c4).
The changes from the patch in tip/core/rcu include making the config
variable name match that in treercu, changing from seconds to jiffies to
avoid spurious warnings, and printing a boot message when this feature
is enabled.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rcuclassic.h | 12 ++-
kernel/rcuclassic.c | 166 +++++++++++++++++++++++----------------------
lib/Kconfig.debug | 2
3 files changed, 96 insertions(+), 84 deletions(-)
diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 29bf528..2d72d20 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -40,15 +40,21 @@
#include <linux/cpumask.h>
#include <linux/seqlock.h>
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+#define RCU_SECONDS_TILL_STALL_CHECK 3 * HZ /* for rcp->jiffies_stall */
+#define RCU_SECONDS_TILL_STALL_RECHECK 30 * HZ /* for rcp->jiffies_stall */
+#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
/* Global control variables for rcupdate callback mechanism. */
struct rcu_ctrlblk {
long cur; /* Current batch number. */
long completed; /* Number of the last completed batch */
long pending; /* Number of the last pending batch */
-#ifdef CONFIG_DEBUG_RCU_STALL
- unsigned long gp_check; /* Time grace period should end, in seconds. */
-#endif /* #ifdef ...Hi Paul, Thanks for the previous explanations. Out of curiosity, what can this patch do that the nmi watchdog can't do ? -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 --
Hello, Mathieu, Operate on architectures that don't have NMIs. Possibly allow shorter timeouts than the NMI watchdog. Operate on a per-CPU basis (or does the NMI watchdog do that these days?). --
applied to tip/core/rcu, thanks Paul! and have removed the two tailing tabs from this line. Ingo --
-tip testing found a minor build bug - find the fix below.
Ingo
-------------------->
From 2ec2b482b10a1ed3493c224f1893cddd3d33833b Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Fri, 3 Oct 2008 10:41:00 +0200
Subject: [PATCH] rcu: RCU-based detection of stalled CPUs for Classic RCU, fix
fix the !CONFIG_RCU_CPU_STALL_DETECTOR path:
kernel/rcuclassic.c: In function '__rcu_pending':
kernel/rcuclassic.c:609: error: too few arguments to function 'check_cpu_stall'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/rcuclassic.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
index 0d07e6e..37f72e5 100644
--- a/kernel/rcuclassic.c
+++ b/kernel/rcuclassic.c
@@ -239,7 +239,7 @@ static void record_gp_stall_check_time(struct rcu_ctrlblk *rcp)
{
}
-static void check_cpu_stall(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)
+static inline void check_cpu_stall(struct rcu_ctrlblk *rcp)
{
}
--
I should have learned by now to test with all relevant config parameter combinations... My apologies for the hassle!!! Thanx, Paul --
<red face> checkpatch.pl... I have heard of it! :-) Thanx, Paul --
Hello again!
This patch adds stalled-CPU detection to Classic RCU. This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR,
which defaults disabled. This is a debugging feature, not something
that non-kernel-hackers would be expected to care about. This feature
can detect looping CPUs in !PREEMPT builds and looping CPUs with
preemption disabled in PREEMPT builds. This is essentially a port of
this functionality from the treercu patch.
This version uses jiffies rather than get_seconds(), which eliminates
the spurious boot-time CPU stall warnings seen on some systems with
the previous patch.
This is still against v2.6.27-rc8 -- I will do a version against tip
this evening (Pacific Time) when I get back to the combination of better
bandwidth and AC power.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rcuclassic.h | 9 ++++
kernel/rcuclassic.c | 88 +++++++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 13 ++++++
3 files changed, 110 insertions(+)
diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 4ab8436..cab055b 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -40,6 +40,10 @@
#include <linux/cpumask.h>
#include <linux/seqlock.h>
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+#define RCU_SECONDS_TILL_STALL_CHECK 3 * HZ /* for rcp->jiffies_stall */
+#define RCU_SECONDS_TILL_STALL_RECHECK 30 * HZ /* for rcp->jiffies_stall */
+#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
/* Global control variables for rcupdate callback mechanism. */
struct rcu_ctrlblk {
@@ -52,6 +56,11 @@ struct rcu_ctrlblk {
spinlock_t lock ____cacheline_internodealigned_in_smp;
cpumask_t cpumask; /* CPUs that need to switch in order */
/* for current batch to proceed. */
+#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
+ unsigned long gp_start; /* Time at which GP started in jiffies. */
+ unsigned long jiffies_stall;
+ ...I found a magic number in it. Is it (2 * HZ)? --
Good catch, Jiangshan!!! See below for responses, and please see the
It should be "2", though a macro might be good. The reason for "2" is
that it guarantees that the other CPU had a full period to respond, so
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
include/linux/rcuclassic.h | 4 ++++
kernel/rcuclassic.c | 6 +++---
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 8388647..69e1929 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -43,6 +43,10 @@
#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
#define RCU_SECONDS_TILL_STALL_CHECK (3 * HZ) /* for rcp->jiffies_stall */
#define RCU_SECONDS_TILL_STALL_RECHECK (30 * HZ) /* for rcp->jiffies_stall */
+#define RCU_STALL_RAT_DELAY 2 /* Allow other CPUs time */
+ /* to take at least one */
+ /* scheduling clock irq */
+ /* before ratting on them. */
#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
/* Global control variables for rcupdate callback mechanism. */
diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
index 0ce0802..8c2d15d 100644
--- a/kernel/rcuclassic.c
+++ b/kernel/rcuclassic.c
@@ -136,7 +136,7 @@ static void print_other_cpu_stall(struct rcu_ctrlblk *rcp)
spin_lock_irqsave(&rcp->lock, flags);
delta = jiffies - rcp->seconds_stall;
- if (delta < 2 || rcp->cur != rcp->completed) {
+ if (delta < RCU_STALL_RAT_DELAY || rcp->cur != rcp->completed) {
spin_unlock_irqrestore(&rcp->lock, flags);
return;
}
@@ -180,9 +180,9 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
/* We haven't checked in, so go dump stack. */
print_cpu_stall(rcp);
- } else if (rcp->cur != rcp->completed && delta >= 2) {
+ } else if (rcp->cur != rcp->completed && delta >= RCU_STALL_RAT_DELAY) {
- /* They had two seconds to dump stack, so complain. */
+ /* They had two time units to dump stack, so complain. */
...| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
| Linux Kernel Mailing List | iSeries: fix section mismatch in iseries_veth |
| Linux Kernel Mailing List | ixbge: remove TX lock and redo TX accounting. |
| Linux Kernel Mailing List | ixgbe: fix several counter register errata |
| Linux Kernel Mailing List | b43: fix build with CONFIG_SSB_PCIHOST=n |
| Linux Kernel Mailing List | 9p: block-based virtio client |
