[PATCH tip/core/rcu 26/48] rcu: RCU_FAST_NO_HZ must check RCU dyntick state

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

Hello!

RFC preview of RCU patches queued for 2.6.35, take 4.
Take 3 is at http://lkml.org/lkml/2010/4/20/321.
New patches flagged with "New".

 1.	New: optionally leave lockdep enabled after RCU lockdep splat
	This patch, based on one from Lai Jiangshan, allows multiple
	RCU-lockdep splats per boot.  This was proposed for 2.6.34,
	but it does not qualify as a regression, so it is now at the
	head of the 2.6.35 queue.

 2.	substitute set_need_resched for sending resched IPIs
 	This reduces OS jitter.

 3.	make dead code really dead.
 4.	move some code from macro to function
 	Cleanups from Lai Jiangshan.

 5.	ignore offline CPUs in last non dyntick idle CPU check
 	Fix to my CONFIG_RCU_FAST_NO_HZ code to handle offline and
	non-existent CPUs, also from Lai Jiangshan.

 6.	fix bogus CONFIG_PROVE_LOCKING in comments to reality
 7.	fix now bogus rcu_scheduler_active comments
 	Comment fixups.

 8.	shrink rcutiny by making synchronize_rcu_bh be inline
 	Shrink TINY_RCU some more.

 9.	rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk
 	First step towards TINY_PREEMPTIBLE_RCU.

10.	refactor RCU's context switch handling
 	Reduce the number of needless softirqs.

11.	slim down rcutiny by removing rcu_scheduler_active and friends
	More shrinkage for TINY_RCU

12.	enable CPU_STALL_VERBOSE by default.  It will have been in one
	release, so time to enable it.

13.	disable CPU stall warnings upon panic

14.	print boot-time console messages if RCU configs out of ordinary

15.	improve RCU CPU stall-warning messages

16.	permit discontiguous cpu_possible_mask CPU numbering

17.	v2: reduce the number of spurious RCU_SOFTIRQ invocations
	[Original from Lai Jiangshan]

18.	improve the RCU CPU-stall-warning documentation

19.	debugobjects transition check
20.	introduce rcu_head_init_on_stack
21.	remove all non-on-stack rcu_head initializations
22.	remove rcu_head initializers
23.	Add debug RCU head objects
	Debug-objects-based rcu_head checking from Mathieu ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Lai Jiangshan <laijs@cn.fujitsu.com>

cleanup: make dead code really dead

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e54c123..6042fb8 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1236,11 +1236,11 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed)
 		break; /* grace period idle or initializing, ignore. */
 
 	case RCU_SAVE_DYNTICK:
-
-		raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */
 		if (RCU_SIGNAL_INIT != RCU_SAVE_DYNTICK)
 			break; /* So gcc recognizes the dead code. */
 
+		raw_spin_unlock(&rnp->lock);  /* irqs remain disabled */
+
 		/* Record dyntick-idle state. */
 		force_qs_rnp(rsp, dyntick_save_progress_counter);
 		raw_spin_lock(&rnp->lock);  /* irqs already disabled */
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The CPU_STALL_VERBOSE kernel configuration parameter was added to
2.6.34 to identify any preempted/blocked tasks that were preventing
the current grace period from completing when running preemptible
RCU.  As is conventional for new configuration parameters, it
defaulted to disabled.  It is now time to enable it by default.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 lib/Kconfig.debug |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 94090b4..930a9e5 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -805,7 +805,7 @@ config RCU_CPU_STALL_DETECTOR
 config RCU_CPU_STALL_VERBOSE
 	bool "Print additional per-task information for RCU_CPU_STALL_DETECTOR"
 	depends on RCU_CPU_STALL_DETECTOR && TREE_PREEMPT_RCU
-	default n
+	default y
 	help
 	  This option causes RCU to printk detailed per-task information
 	  for any tasks that are stalling the current RCU grace period.
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
---
 include/linux/sched.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7307c74..74b3125 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1301,9 +1301,9 @@ struct task_struct {
 	struct list_head cpu_timers[3];
 
 /* process credentials */
-	const struct cred *real_cred;	/* objective and real subjective task
+	const struct cred __rcu *real_cred; /* objective and real subjective task
 					 * credentials (COW) */
-	const struct cred *cred;	/* effective (overridable) subjective task
+	const struct cred __rcu *cred;	/* effective (overridable) subjective task
 					 * credentials (COW) */
 	struct mutex cred_guard_mutex;	/* guard against foreign influences on
 					 * credential calculations
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The sparse RCU-pointer checking relies on type magic that dereferences
the pointer in question.  This does not work if the pointer is in fact
an array index.  This commit therefore supplies a new RCU API that
omits the sparse checking to continue to support rcu_dereference()
on integers.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   33 +++++++++++++++++++++++++++++++++
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 2f9e56c..3be0ad7 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -560,4 +560,37 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head)
 }
 #endif	/* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
 
+#ifndef CONFIG_PROVE_RCU
+#define __do_rcu_dereference_check(c) do { } while (0)
+#endif /* #ifdef CONFIG_PROVE_RCU */
+
+#define __rcu_dereference_index_check(p, c) \
+	({ \
+		typeof(p) _________p1 = ACCESS_ONCE(p); \
+		__do_rcu_dereference_check(c); \
+		smp_read_barrier_depends(); \
+		(_________p1); \
+	})
+
+/**
+ * rcu_dereference_index_check() - rcu_dereference for indices with debug checking
+ * @p: The pointer to read, prior to dereferencing
+ * @c: The conditions under which the dereference will take place
+ *
+ * Similar to rcu_dereference_check(), but omits the sparse checking.
+ * This allows rcu_dereference_index_check() to be used on integers,
+ * which can then be used as array indices.  Attempting to use
+ * rcu_dereference_check() on an integer will give compiler warnings
+ * because the sparse address-space mechanism relies on dereferencing
+ * the RCU-protected pointer.  Dereferencing integers is not something
+ * that even gcc will put up with.
+ *
+ * Note that this function does not implicitly check for RCU read-side
+ * critical sections.  If this function gains lots of uses, it might
+ * make sense to provide versions for each flavor of RCU, but it does
+ * not make sense as of ...
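
The idea of RCU-protecting an array index rather than a pointer can be sketched in userspace. This is only an illustration, not the kernel macro: the names toy_table, toy_cur_idx, toy_publish() and toy_read_current() are hypothetical, and C11 acquire/release atomics stand in for the kernel's ACCESS_ONCE() plus smp_read_barrier_depends() (acquire is stronger than the dependency ordering the kernel relies on).

```c
#include <stdatomic.h>

#define TOY_NSLOTS 4

/* Hypothetical data: a table of slots plus a published "current" index. */
static int toy_table[TOY_NSLOTS];
static atomic_int toy_cur_idx;

/* Updater: fill in the slot first, then publish its index. */
void toy_publish(int idx, int value)
{
	toy_table[idx] = value;
	/* Release store orders the slot write before the index becomes visible. */
	atomic_store_explicit(&toy_cur_idx, idx, memory_order_release);
}

/* Reader: fetch the index exactly once, then use it as an array index. */
int toy_read_current(void)
{
	/* Acquire load stands in for ACCESS_ONCE() + smp_read_barrier_depends(). */
	int idx = atomic_load_explicit(&toy_cur_idx, memory_order_acquire);

	return toy_table[idx];
}
```

Because the protected value is an integer, there is nothing for a pointer address-space check to dereference, which is why the patch provides a separate accessor.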
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

This patch defines an __rcu annotation that permits sparse to check for
correct use of RCU-protected pointers.  If a pointer that is annotated
with __rcu is accessed directly (as opposed to via rcu_dereference(),
rcu_assign_pointer(), or one of their variants), sparse can be made
to complain.  To enable such complaints, use the new default-disabled
CONFIG_SPARSE_RCU_POINTER kernel configuration option.  Please note that
these sparse complaints are intended to be a debugging aid, -not- a
code-style-enforcement mechanism.

There are special rcu_dereference_protected() and rcu_access_pointer()
accessors for use when RCU read-side protection is not required, for
example, when no other CPU has access to the data structure in question
or while the current CPU holds the update-side lock.

This patch also updates a number of docbook comments that were showing
their age.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Christopher Li <sparse@chrisli.org>
Cc: Josh Triplett <josh@joshtriplett.org>
---
 include/linux/compiler.h |    6 +
 include/linux/rcupdate.h |  354 ++++++++++++++++++++++++++++------------------
 include/linux/srcu.h     |   27 +++-
 kernel/rcupdate.c        |    6 +-
 lib/Kconfig.debug        |   13 ++
 5 files changed, 260 insertions(+), 146 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index a5a472b..320d6c9 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -16,6 +16,11 @@
 # define __release(x)	__context__(x,-1)
 # define __cond_lock(x,c)	((c) ? ({ __acquire(x); 1; }) : 0)
 # define __percpu	__attribute__((noderef, address_space(3)))
+#ifdef CONFIG_SPARSE_RCU_POINTER
+# define __rcu		__attribute__((noderef, address_space(4)))
+#else
+# define __rcu
+#endif
 extern void __chk_user_ptr(const volatile void __user *);
 extern void __chk_io_ptr(const volatile void __iomem *);
 #else
@@ -34,6 +39,7 @@ extern void __chk_io_ptr(const volatile void ...
From: Arnd Bergmann
Date: Tuesday, May 4, 2010 - 1:58 pm

To add more background, I was thinking that it might make sense to
always leave the address space attribute in place but to make
part of the checking optional.

The idea would be that we always make sure that an __rcu annotated
pointer cannot be dereferenced or cast directly, while we would
only complain about non-annotated pointers being passed to rcu_dereference
and rcu_assign_pointer if CONFIG_SPARSE_RCU_POINTER is set.

Most of the work I had spent on my tree was about fixing all the
false positives from that, but more work would be needed to get
a clean build from it even with the modified CONFIG_SPARSE_RCU_POINTER
disabled. Since you managed to find the real bugs and fix them,
your series by itself is probably more useful than the full set.

Do you have specific plans to add these (__rcu_bh etc) back in the future,
or do you just want to leave the options open?

Anyway, good to see that you found your way through my patches and got them
into shape. 

	Arnd
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 4:07 pm

No specific plans at the moment.  But having the underlying plumbing all

Thank -you- for making this happen!!!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

TREE_RCU assumes that CPU numbering is contiguous, but some users need
large holes in the numbering to better map to hardware layout.  This patch
makes TREE_RCU (and TREE_PREEMPT_RCU) tolerate large holes in the CPU
numbering.  However, NR_CPUS must still be greater than the largest
CPU number.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f391886..c60fd74 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1913,7 +1913,7 @@ static void __init rcu_init_one(struct rcu_state *rsp)
 
 	rnp = rsp->level[NUM_RCU_LVLS - 1];
 	for_each_possible_cpu(i) {
-		if (i > rnp->grphi)
+		while (i > rnp->grphi)
 			rnp++;
 		rsp->rda[i]->mynode = rnp;
 		rcu_boot_init_percpu_data(i, rsp);
-- 
1.7.0

--
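
A userspace sketch of why the one-word change from "if" to "while" matters in the patch above. The structure and function names (toy_node, toy_map_cpus) are hypothetical; only grphi is borrowed from the patch. It assumes CPU numbers are visited in ascending order and that the last leaf covers the largest CPU number, mirroring the NR_CPUS constraint in the commit message.

```c
#include <stddef.h>

/* Toy leaf node covering CPU numbers grplo..grphi inclusive. */
struct toy_node {
	int grplo;
	int grphi;
};

/*
 * Record, for each possible CPU (given in ascending order), the index
 * of the leaf node it gets attached to.  use_while != 0 models the
 * patched loop, use_while == 0 models the original "if".
 */
void toy_map_cpus(const struct toy_node *leaves,
		  const int *cpus, size_t ncpus,
		  int use_while, size_t *leaf_of)
{
	size_t n = 0;
	size_t k;

	for (k = 0; k < ncpus; k++) {
		if (use_while) {
			/* Skip every leaf that lies wholly below this CPU. */
			while (cpus[k] > leaves[n].grphi)
				n++;
		} else {
			/* Advances at most one leaf, so a large hole is missed. */
			if (cpus[k] > leaves[n].grphi)
				n++;
		}
		leaf_of[k] = n;
	}
}
```

With leaves covering 0-3, 4-7 and 8-11 and possible CPUs 0, 1 and 9, the "if" form attaches CPU 9 to the 4-7 leaf, while the "while" form attaches it to the 8-11 leaf.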

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

PEM:
o     Would it be possible to make this bisectable as follows?

      a.      Insert a new patch after current patch 4/6 that
              defines destroy_rcu_head_on_stack(),
              init_rcu_head_on_stack(), and init_rcu_head() with
              their !CONFIG_DEBUG_OBJECTS_RCU_HEAD definitions.

This patch performs this transition.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: David S. Miller <davem@davemloft.net>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 23be3a7..b653b4a 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -79,6 +79,14 @@ extern void rcu_init(void);
        (ptr)->next = NULL; (ptr)->func = NULL; \
 } while (0)
 
+static inline void init_rcu_head_on_stack(struct rcu_head *head)
+{
+}
+
+static inline void destroy_rcu_head_on_stack(struct rcu_head *head)
+{
+}
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 extern struct lockdep_map rcu_lock_map;
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@relay.de.ibm.com>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/netfilter/nf_conntrack.h |    2 +-
 net/ipv4/netfilter/nf_nat_core.c     |    2 +-
 net/netfilter/core.c                 |    2 +-
 net/netfilter/nf_conntrack_ecache.c  |    4 ++--
 net/netfilter/nf_conntrack_expect.c  |   12 ++++++------
 net/netfilter/nf_conntrack_extend.c  |    2 +-
 net/netfilter/nf_conntrack_proto.c   |    4 ++--
 net/netfilter/nf_log.c               |    2 +-
 net/netfilter/nf_queue.c             |    2 +-
 9 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index bde095f..92229d1 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -75,7 +75,7 @@ struct nf_conntrack_helper;
 /* nf_conn feature for connections that have a helper */
 struct nf_conn_help {
 	/* Helper. if any */
-	struct nf_conntrack_helper *helper;
+	struct nf_conntrack_helper __rcu *helper;
 
 	union nf_conntrack_help help;
 
diff --git a/net/ipv4/netfilter/nf_nat_core.c b/net/ipv4/netfilter/nf_nat_core.c
index 4f8bddb..1263f2a 100644
--- a/net/ipv4/netfilter/nf_nat_core.c
+++ b/net/ipv4/netfilter/nf_nat_core.c
@@ -38,7 +38,7 @@ static DEFINE_SPINLOCK(nf_nat_lock);
 static struct nf_conntrack_l3proto *l3proto __read_mostly;
 
 #define MAX_IP_NAT_PROTO 256
-static const struct nf_nat_protocol *nf_nat_protos[MAX_IP_NAT_PROTO]
+static const struct nf_nat_protocol __rcu *nf_nat_protos[MAX_IP_NAT_PROTO]
 						__read_mostly;
 
 static inline const struct nf_nat_protocol *
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 78b505d..fdaec7d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -27,7 +27,7 @@
 
 static ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@relay.de.ibm.com>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
---
 drivers/vhost/net.c   |    6 +++---
 drivers/vhost/vhost.c |   12 ++++++------
 drivers/vhost/vhost.h |    4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 9777583..36e8dec 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -364,7 +364,7 @@ static void vhost_net_disable_vq(struct vhost_net *n,
 static void vhost_net_enable_vq(struct vhost_net *n,
 				struct vhost_virtqueue *vq)
 {
-	struct socket *sock = vq->private_data;
+	struct socket *sock = rcu_dereference(vq->private_data);
 	if (!sock)
 		return;
 	if (vq == n->vqs + VHOST_NET_VQ_TX) {
@@ -380,7 +380,7 @@ static struct socket *vhost_net_stop_vq(struct vhost_net *n,
 	struct socket *sock;
 
 	mutex_lock(&vq->mutex);
-	sock = vq->private_data;
+	sock = rcu_dereference_const(vq->private_data);
 	vhost_net_disable_vq(n, vq);
 	rcu_assign_pointer(vq->private_data, NULL);
 	mutex_unlock(&vq->mutex);
@@ -518,7 +518,7 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
 	}
 
 	/* start polling new socket */
-	oldsock = vq->private_data;
+	oldsock = rcu_dereference_const(vq->private_data);
 	if (sock == oldsock)
 		goto done;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e69d238..fc9bde2 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -180,7 +180,7 @@ long vhost_dev_reset_owner(struct vhost_dev *dev)
 	vhost_dev_cleanup(dev);
 
 	memory->nregions = 0;
-	dev->memory = memory;
+	rcu_assign_pointer(dev->memory, memory);
 	return 0;
 }
 
@@ -212,8 +212,8 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
 		fput(dev->log_file);
 	dev->log_file = NULL;
 	/* No one will access memory at this point */
-	kfree(dev->memory);
-	dev->memory = ...
From: Michael S. Tsirkin
Date: Tuesday, May 4, 2010 - 2:39 pm

This should be rcu_dereference_const as well: it is called

This is called when there can be no active readers, so the smp_wmb
inside rcu_assign_pointer isn't really needed.


rcu_dereference_const. This is called under vq mutex and the comment
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 4:57 pm

How about the following?

	struct socket *sock;
	
	sock = rcu_dereference_protected(vq->private_data,
					 lockdep_is_held(&vq->mutex));

This could be used for some (though not all) of these situations.

And just so you know...  The fact that this is here in the first
place is actually my mistake -- my intention was to include the __rcu
annotations and nothing else, then follow up with bug fixes.  In fact,
the alert reader will have noted that there is in fact no such thing
as rcu_dereference_const().  And have concluded that none of my test
machines use vhost.  :-/

But as long as we are here, might as well complete the annotation...

So I have inserted guesses for the lockdep_is_held() expressions below

	sock = rcu_dereference_protected(vq->private_data,

	oldsock = rcu_dereference_protected(vq->private_data,
					    lockdep_is_held(&vq->mutex));

Though I can't say I see where this lock is actually acquired in this


	kfree(rcu_dereference_protected(dev->memory,

Fixed -- and in any case, we can always use RCU_INIT_POINTER() when

	return memory_access_ok(dev, rcu_dereference_protected(dev->memory, lockdep_is_held(&dev->mutex)), 1);

And yes, we do need an rcu_dereference_vqdev() wrapper function, but just
want to identify the mutexes for the moment.



	oldmem = rcu_dereference_protected(d->memory,
--
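
For readers following along, here is a single-threaded userspace model of the rcu_dereference_protected() pattern suggested above. Everything prefixed toy_ is hypothetical: the "mutex" is just a flag, toy_lockdep_is_held() merely reads that flag, and the macro asserts the condition before doing a plain load. It is a sketch of the idea (state the lock that makes the access safe), not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy mutex that only remembers whether it is held. */
struct toy_mutex {
	bool held;
};

void toy_mutex_lock(struct toy_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

void toy_mutex_unlock(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
}

bool toy_lockdep_is_held(const struct toy_mutex *m)
{
	return m->held;
}

/* Plain load, but only after checking the stated protection condition. */
#define toy_dereference_protected(p, cond) (assert(cond), (p))

struct toy_vq {
	struct toy_mutex mutex;
	void *private_data;	/* would carry __rcu in the kernel */
};

/* Modeled loosely on vhost_net_stop_vq(): fetch and clear under the mutex. */
void *toy_stop_vq(struct toy_vq *vq)
{
	void *sock;

	toy_mutex_lock(&vq->mutex);
	sock = toy_dereference_protected(vq->private_data,
					 toy_lockdep_is_held(&vq->mutex));
	vq->private_data = NULL;	/* kernel code uses rcu_assign_pointer() */
	toy_mutex_unlock(&vq->mutex);
	return sock;
}
```

Calling toy_dereference_protected() without holding the mutex trips the assertion, which is the userspace analogue of the lockdep complaint.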

From: Michael S. Tsirkin
Date: Tuesday, May 4, 2010 - 4:59 pm

I'll go over it. Could you point me to documentation to the API

Just above:

       mutex_lock(&vq->mutex);

        /* Verify that ring has been setup correctly. */
        if (!vhost_vq_access_ok(vq)) {
                r = -EFAULT;
                goto err_vq;
        }
        sock = get_socket(fd);
        if (IS_ERR(sock)) {
                r = PTR_ERR(sock);
                goto err_vq;
        }

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 5:39 pm

Documentation/RCU/whatisRCU.txt in current mainline lists the APIs.
This patchset, especially 31/48, gets the in-kernel docbook into

Ah!  Color me blind, as usual!  ;-)

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Lai Jiangshan <laijs@cn.fujitsu.com>

Shrink the RCU_INIT_FLAVOR() macro by moving all but the initialization
of the ->rda[] array to rcu_init_one().  The call to rcu_init_one()
can then be moved to the end of the RCU_INIT_FLAVOR() macro, which is
required because rcu_boot_init_percpu_data(), which is now called from
rcu_init_one(), depends on the initialization of the ->rda[] array.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 6042fb8..86bb949 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1859,6 +1859,14 @@ static void __init rcu_init_one(struct rcu_state *rsp)
 			INIT_LIST_HEAD(&rnp->blocked_tasks[3]);
 		}
 	}
+
+	rnp = rsp->level[NUM_RCU_LVLS - 1];
+	for_each_possible_cpu(i) {
+		if (i > rnp->grphi)
+			rnp++;
+		rsp->rda[i]->mynode = rnp;
+		rcu_boot_init_percpu_data(i, rsp);
+	}
 }
 
 /*
@@ -1869,19 +1877,11 @@ static void __init rcu_init_one(struct rcu_state *rsp)
 #define RCU_INIT_FLAVOR(rsp, rcu_data) \
 do { \
 	int i; \
-	int j; \
-	struct rcu_node *rnp; \
 	\
-	rcu_init_one(rsp); \
-	rnp = (rsp)->level[NUM_RCU_LVLS - 1]; \
-	j = 0; \
 	for_each_possible_cpu(i) { \
-		if (i > rnp[j].grphi) \
-			j++; \
-		per_cpu(rcu_data, i).mynode = &rnp[j]; \
 		(rsp)->rda[i] = &per_cpu(rcu_data, i); \
-		rcu_boot_init_percpu_data(i, rsp); \
 	} \
+	rcu_init_one(rsp); \
 } while (0)
 
 void __init rcu_init(void)
-- 
1.7.0

--
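
The ordering constraint described in the commit message above (fill in ->rda[] first, then call rcu_init_one()) can be modeled in a few lines. All toy_ names are hypothetical, and the -1 return is an illustrative stand-in for what would be a NULL dereference in the kernel.

```c
#include <stddef.h>

#define TOY_NR_CPUS 4

struct toy_rcu_data {
	int initialized;
};

struct toy_rcu_state {
	struct toy_rcu_data *rda[TOY_NR_CPUS];
};

/* Stand-in for the per-CPU rcu_data variables. */
static struct toy_rcu_data toy_percpu[TOY_NR_CPUS];

/*
 * Models rcu_init_one() after the patch: it walks rda[], so every
 * entry must already point at valid per-CPU data.  Returns the number
 * of CPUs initialized, or -1 if an rda[] entry is still NULL.
 */
int toy_init_one(struct toy_rcu_state *rsp)
{
	int i;

	for (i = 0; i < TOY_NR_CPUS; i++) {
		if (rsp->rda[i] == NULL)
			return -1;
		rsp->rda[i]->initialized = 1;
	}
	return TOY_NR_CPUS;
}

/* Models the slimmed-down RCU_INIT_FLAVOR(): set up rda[], then init. */
int toy_init_flavor(struct toy_rcu_state *rsp)
{
	int i;

	for (i = 0; i < TOY_NR_CPUS; i++)
		rsp->rda[i] = &toy_percpu[i];
	return toy_init_one(rsp);
}
```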

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@relay.de.ibm.com>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |    2 +-
 include/linux/kvm_host.h        |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 06d9e79..9c65faf 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -384,7 +384,7 @@ struct kvm_mem_aliases {
 };
 
 struct kvm_arch {
-	struct kvm_mem_aliases *aliases;
+	struct kvm_mem_aliases __rcu *aliases;
 
 	unsigned int n_free_mmu_pages;
 	unsigned int n_requested_mmu_pages;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 08fe794..282b041 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -169,7 +169,7 @@ struct kvm {
 	raw_spinlock_t requests_lock;
 	struct mutex slots_lock;
 	struct mm_struct *mm; /* userspace tied to this vm */
-	struct kvm_memslots *memslots;
+	struct kvm_memslots __rcu *memslots;
 	struct srcu_struct srcu;
 #ifdef CONFIG_KVM_APIC_ARCHITECTURE
 	u32 bsp_vcpu_id;
@@ -179,7 +179,7 @@ struct kvm {
 	atomic_t online_vcpus;
 	struct list_head vm_list;
 	struct mutex lock;
-	struct kvm_io_bus *buses[KVM_NR_BUSES];
+	struct kvm_io_bus __rcu *buses[KVM_NR_BUSES];
 #ifdef CONFIG_HAVE_KVM_EVENTFD
 	struct {
 		spinlock_t        lock;
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

This avoids warnings from missing __rcu annotations
in the rculist implementation, making it possible to
use the same lists in both RCU and non-RCU cases.

We can add rculist annotations later, together with
lockdep support for rculist, which is missing as well,
but that may involve changing all the users.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
---
 include/linux/rculist.h       |   52 ++++++++++++++++++++++++++---------------
 include/linux/rculist_nulls.h |   16 ++++++++----
 kernel/pid.c                  |    2 +-
 3 files changed, 45 insertions(+), 25 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 2c9b46c..2bf07e4 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -10,6 +10,12 @@
 #include <linux/rcupdate.h>
 
 /*
+ * return the ->next pointer of a list_head in an rcu safe
+ * way, we must not access it directly
+ */
+#define list_next_rcu(list)	(*((struct list_head __rcu **)(&(list)->next)))
+
+/*
  * Insert a new entry between two known consecutive entries.
  *
  * This is only for internal list manipulation where we know
@@ -20,7 +26,7 @@ static inline void __list_add_rcu(struct list_head *new,
 {
 	new->next = next;
 	new->prev = prev;
-	rcu_assign_pointer(prev->next, new);
+	rcu_assign_pointer(list_next_rcu(prev), new);
 	next->prev = new;
 }
 
@@ -138,7 +144,7 @@ static inline void list_replace_rcu(struct list_head *old,
 {
 	new->next = old->next;
 	new->prev = old->prev;
-	rcu_assign_pointer(new->prev->next, new);
+	rcu_assign_pointer(list_next_rcu(new->prev), new);
 	new->next->prev = new;
 	old->prev = LIST_POISON2;
 }
@@ -193,7 +199,7 @@ static inline void list_splice_init_rcu(struct list_head *list,
 	 */
 
 	last->next = at;
-	rcu_assign_pointer(head->next, ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Lai Jiangshan <laijs@cn.fujitsu.com>

Offline CPUs are not in nohz_cpu_mask, but can be ignored when checking
for the last non-dyntick-idle CPU.  This patch therefore only checks
online CPUs for not being dyntick idle, allowing fast entry into
full-system dyntick-idle state even when there are some offline CPUs.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_plugin.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 79b53bd..687c4e9 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1016,7 +1016,7 @@ int rcu_needs_cpu(int cpu)
 
 	/* Don't bother unless we are the last non-dyntick-idle CPU. */
 	for_each_cpu_not(thatcpu, nohz_cpu_mask)
-		if (thatcpu != cpu) {
+		if (cpu_online(thatcpu) && thatcpu != cpu) {
 			per_cpu(rcu_dyntick_drain, cpu) = 0;
 			per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1;
 			return rcu_needs_cpu_quick_check(cpu);
-- 
1.7.0

--
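
A small userspace model of the check changed by the patch above. The bitmask type and the function toy_is_last_busy_cpu() are hypothetical, and the sketch assumes at most 64 CPUs so that one 64-bit word can serve as a CPU mask. The check_online flag selects between the old and the patched behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit n set means CPU n is in the mask.  Assumes nr_cpus <= 64. */
typedef uint64_t toy_cpumask;

/*
 * Return true if "cpu" is the only CPU that is neither dyntick-idle
 * nor (when check_online is set) offline.
 */
bool toy_is_last_busy_cpu(int cpu, int nr_cpus,
			  toy_cpumask nohz_mask, toy_cpumask online_mask,
			  bool check_online)
{
	int thatcpu;

	for (thatcpu = 0; thatcpu < nr_cpus; thatcpu++) {
		toy_cpumask bit = UINT64_C(1) << thatcpu;

		if (nohz_mask & bit)
			continue;	/* already dyntick-idle */
		if (check_online && !(online_mask & bit))
			continue;	/* offline: ignore, as in the patch */
		if (thatcpu != cpu)
			return false;	/* some other CPU is still busy */
	}
	return true;
}
```

With four CPUs where CPU 0 is the caller, CPUs 1 and 3 are dyntick-idle and CPU 2 is offline, the old check reports that another CPU is still busy, while the patched check recognizes CPU 0 as the last one.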

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
---
 include/linux/kvm_host.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 169d077..08fe794 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -197,7 +197,7 @@ struct kvm {
 
 	struct mutex irq_lock;
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
-	struct kvm_irq_routing_table *irq_routing;
+	struct kvm_irq_routing_table __rcu *irq_routing;
 	struct hlist_head mask_notifier_list;
 	struct hlist_head irq_ack_notifier_list;
 #endif
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The existing Documentation/RCU/stallwarn.txt has proven unhelpful, so
rework it a bit.  In particular, show how to interpret the stall-warning
messages.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/stallwarn.txt |   94 +++++++++++++++++++++++++++++---------
 1 files changed, 71 insertions(+), 23 deletions(-)

diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 1423d25..44c6dcc 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -3,35 +3,79 @@ Using RCU's CPU Stall Detector
 The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables
 RCU's CPU stall detector, which detects conditions that unduly delay
 RCU grace periods.  The stall detector's idea of what constitutes
-"unduly delayed" is controlled by a pair of C preprocessor macros:
+"unduly delayed" is controlled by a set of C preprocessor macros:
 
 RCU_SECONDS_TILL_STALL_CHECK
 
 	This macro defines the period of time that RCU will wait from
 	the beginning of a grace period until it issues an RCU CPU
-	stall warning.	It is normally ten seconds.
+	stall warning.	This time period is normally ten seconds.
 
 RCU_SECONDS_TILL_STALL_RECHECK
 
 	This macro defines the period of time that RCU will wait after
-	issuing a stall warning until it issues another stall warning.
-	It is normally set to thirty seconds.
+	issuing a stall warning until it issues another stall warning
+	for the same stall.  This time period is normally set to thirty
+	seconds.
 
 RCU_STALL_RAT_DELAY
 
-	The CPU stall detector tries to make the offending CPU rat on itself,
-	as this often gives better-quality stack traces.  However, if
-	the offending CPU does not detect its own stall in the number
-	of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will
-	complain.  This is normally set to two jiffies.
+	The CPU stall detector tries to make the offending CPU print its
+	own warnings, as this often gives better-quality stack ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
---
 drivers/input/evdev.c |    2 +-
 include/linux/input.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 2ee6c7a..73b1208 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -28,7 +28,7 @@ struct evdev {
 	int minor;
 	struct input_handle handle;
 	wait_queue_head_t wait;
-	struct evdev_client *grab;
+	struct evdev_client __rcu *grab;
 	struct list_head client_list;
 	spinlock_t client_lock; /* protects client_list */
 	struct mutex mutex;
diff --git a/include/linux/input.h b/include/linux/input.h
index 7ed2251..850b6b7 100644
--- a/include/linux/input.h
+++ b/include/linux/input.h
@@ -1173,7 +1173,7 @@ struct input_dev {
 	int (*flush)(struct input_dev *dev, struct file *file);
 	int (*event)(struct input_dev *dev, unsigned int type, unsigned int code, int value);
 
-	struct input_handle *grab;
+	struct input_handle __rcu *grab;
 
 	spinlock_t event_lock;
 	struct mutex mutex;
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
---
 include/linux/idr.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/idr.h b/include/linux/idr.h
index e968db7..cdb715e 100644
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -50,14 +50,14 @@
 
 struct idr_layer {
 	unsigned long		 bitmap; /* A zero bit means "space here" */
-	struct idr_layer	*ary[1<<IDR_BITS];
+	struct idr_layer __rcu	*ary[1<<IDR_BITS];
 	int			 count;	 /* When zero, we can release it */
 	int			 layer;	 /* distance from leaf */
 	struct rcu_head		 rcu_head;
 };
 
 struct idr {
-	struct idr_layer *top;
+	struct idr_layer __rcu *top;
 	struct idr_layer *id_free;
 	int		  layers; /* only valid without concurrent changes */
 	int		  id_free_cnt;
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
---
 fs/nfs/delegation.h             |    2 +-
 include/linux/nfs_fs.h          |    2 +-
 include/linux/sunrpc/auth_gss.h |    4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index 69e7b81..2978814 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -14,7 +14,7 @@
  */
 struct nfs_delegation {
 	struct list_head super_list;
-	struct rpc_cred *cred;
+	struct rpc_cred __rcu *cred;
 	struct inode *inode;
 	nfs4_stateid stateid;
 	fmode_t type;
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 1a0b85a..3cc4fb6 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -178,7 +178,7 @@ struct nfs_inode {
 	struct nfs4_cached_acl	*nfs4_acl;
         /* NFSv4 state */
 	struct list_head	open_states;
-	struct nfs_delegation	*delegation;
+	struct nfs_delegation __rcu *delegation;
 	fmode_t			 delegation_state;
 	struct rw_semaphore	rwsem;
 #endif /* CONFIG_NFS_V4*/
diff --git a/include/linux/sunrpc/auth_gss.h b/include/linux/sunrpc/auth_gss.h
index d48d4e6..994db5a 100644
--- a/include/linux/sunrpc/auth_gss.h
+++ b/include/linux/sunrpc/auth_gss.h
@@ -69,7 +69,7 @@ struct gss_cl_ctx {
 	enum rpc_gss_proc	gc_proc;
 	u32			gc_seq;
 	spinlock_t		gc_seq_lock;
-	struct gss_ctx		*gc_gss_ctx;
+	struct gss_ctx __rcu	*gc_gss_ctx;
 	struct xdr_netobj	gc_wire_ctx;
 	u32			gc_win;
 	unsigned long		gc_expiry;
@@ -80,7 +80,7 @@ struct gss_upcall_msg;
 struct gss_cred {
 	struct rpc_cred		gc_base;
 	enum rpc_gss_svc	gc_service;
-	struct gss_cl_ctx	*gc_ctx;
+	struct gss_cl_ctx __rcu	*gc_ctx;
 	struct gss_upcall_msg	*gc_upcall;
 	unsigned char		gc_machine_cred : 1;
 };
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Implement a basic state machine checker in the debugobjects.

This state machine checker detects races and inconsistencies within the "active"
life of a debugobject. The checker only keeps track of the current state; all
the state machine logic is kept at the object instance level.

The checker works by adding a supplementary "unsigned int astate" field to the
debug_obj structure. It keeps track of the current "active state" of the object.

The only constraints that the debugobjects system imposes on the states
are:

- activation of an object sets the current active state to 0,
- deactivation of an object expects the current active state to be 0.

For the rest of the states, the state mapping is determined by the specific
object instance. Therefore, the logic keeping track of the state machine is
within the specialized instance, without any need to know about it at the
debugobject level.

The current object active state is changed by calling:

debug_object_active_state(addr, descr, expect, next)

where "expect" is the expected state and "next" is the next state to move to if
the expected state is found. A warning is generated if the expected state is
not found.
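
The expect/next rule described above can be illustrated with a minimal
userspace sketch. It is not the debugobjects implementation: struct
tracked_obj and the obj_*() helpers are hypothetical names, and leaving the
state unchanged on a mismatch is an assumption made for this illustration.
Only the "astate" field name and the two constraints (activation sets state
0, deactivation expects state 0) come from the description.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a tracked object; only "astate" is from the text. */
struct tracked_obj {
	int active;		/* nonzero while the object is active */
	unsigned int astate;	/* current active state */
};

/* Activation sets the current active state to 0. */
static void obj_activate(struct tracked_obj *obj)
{
	obj->active = 1;
	obj->astate = 0;
}

/* Deactivation expects active state 0; returns -1 and warns otherwise. */
static int obj_deactivate(struct tracked_obj *obj)
{
	if (!obj->active || obj->astate != 0) {
		fprintf(stderr, "deactivate: unexpected state %u\n",
			obj->astate);
		return -1;
	}
	obj->active = 0;
	return 0;
}

/*
 * Move from "expect" to "next".  On a mismatch, warn and leave the state
 * unchanged (an assumption of this sketch).
 */
static int obj_active_state(struct tracked_obj *obj,
			    unsigned int expect, unsigned int next)
{
	if (!obj->active || obj->astate != expect) {
		fprintf(stderr, "active_state: expected %u, found %u\n",
			expect, obj->astate);
		return -1;
	}
	obj->astate = next;
	return 0;
}
```

A caller would define its own state numbers (for example 1 for "queued") and
would have to return the object to state 0 before deactivating it.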

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/debugobjects.h |   11 ++++++++
 lib/debugobjects.c           |   59 +++++++++++++++++++++++++++++++++++++++--
 2 files ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
---
 include/linux/notifier.h |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index fee6c2f..f05f5e4 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -49,28 +49,28 @@
 
 struct notifier_block {
 	int (*notifier_call)(struct notifier_block *, unsigned long, void *);
-	struct notifier_block *next;
+	struct notifier_block __rcu *next;
 	int priority;
 };
 
 struct atomic_notifier_head {
 	spinlock_t lock;
-	struct notifier_block *head;
+	struct notifier_block __rcu *head;
 };
 
 struct blocking_notifier_head {
 	struct rw_semaphore rwsem;
-	struct notifier_block *head;
+	struct notifier_block __rcu *head;
 };
 
 struct raw_notifier_head {
-	struct notifier_block *head;
+	struct notifier_block __rcu *head;
 };
 
 struct srcu_notifier_head {
 	struct mutex mutex;
 	struct srcu_struct srcu;
-	struct notifier_block *head;
+	struct notifier_block __rcu *head;
 };
 
 #define ATOMIC_INIT_NOTIFIER_HEAD(name) do {	\
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The sparse RCU-pointer annotations require definition of the
underlying type of any pointer passed to rcu_dereference() and friends.
So fcheck_files() needs "struct file" to be defined, so include fs.h.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 include/linux/fdtable.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index d147461..f59ed29 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -11,6 +11,7 @@
 #include <linux/rcupdate.h>
 #include <linux/types.h>
 #include <linux/init.h>
+#include <linux/fs.h>
 
 #include <asm/atomic.h>
 
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Testing resulted in the following debugobjects splat
[...]
ODEBUG: object is on stack, but not annotated
------------[ cut here ]------------
Badness at lib/debugobjects.c:294
NIP: c0000000002c76f0 LR: c0000000002c76ec CTR: c00000000041ecd8
REGS: c0000001de71b280 TRAP: 0700   Tainted: G        W   (2.6.34-rc3-autokern1)
MSR: 8000000000029032 <EE,ME,CE,IR,DR>  CR: 24000424  XER: 0000000f
TASK = c0000001de7dca00[3695] 'arping' THREAD: c0000001de718000 CPU: 1
GPR00: c0000000002c76ec c0000001de71b500 c00000000096c048 0000000000000034
GPR04: 0000000000000001 c000000000063918 0000000000000000 0000000000000002
GPR08: 0000000000000003 0000000000000000 c000000000086f68 c0000001de7dca00
GPR12: 000000000000256d c0000000074e4200 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000201b8f60
GPR20: 00000000201b8f70 00000000201b8f48 0000000000000000 c0000000008766b8
GPR24: c0000001de71b800 0000000000000001 c0000000008ad400 c000000001247478
GPR28: c0000000e6abb8c0 c0000000e6abb8c0 c000000000904570 c000000001247470
NIP [c0000000002c76f0] .__debug_object_init+0x314/0x40c
LR [c0000000002c76ec] .__debug_object_init+0x310/0x40c
Call Trace:
[c0000001de71b500] [c0000000002c76ec] .__debug_object_init+0x310/0x40c (unreliable)
[c0000001de71b5d0] [c00000000007d990] .rcuhead_fixup_activate+0x40/0xdc
[c0000001de71b660] [c0000000002c6a7c] .debug_object_fixup+0x4c/0x74
[c0000001de71b6f0] [c0000000000c5e54] .__call_rcu+0x3c/0x1d4
[c0000001de71b790] [c0000000000c6050] .synchronize_rcu+0x4c/0x6c
[c0000001de71b870] [c0000000004be218] .synchronize_net+0x10/0x24
[c0000001de71b8e0] [c0000000005498c8] .packet_release+0x1d4/0x274
[c0000001de71b990] [c0000000004ac1f0] .sock_release+0x54/0x124
[c0000001de71ba20] [c0000000004ac9e4] .sock_close+0x34/0x4c
[c0000001de71baa0] [c00000000012469c] .__fput+0x174/0x264
[c0000001de71bb40] [c000000000120c54] .filp_close+0xb0/0xd8
[c0000001de71bbd0] [c000000000065e70] ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 include/linux/perf_event.h |    6 +++---
 include/linux/sched.h      |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8e3754..48132eb 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -582,7 +582,7 @@ struct perf_event {
 	int				nr_siblings;
 	int				group_flags;
 	struct perf_event		*group_leader;
-	struct perf_event		*output;
+	struct perf_event __rcu		*output;
 	const struct pmu		*pmu;
 
 	enum perf_event_active_state	state;
@@ -643,7 +643,7 @@ struct perf_event {
 	/* mmap bits */
 	struct mutex			mmap_mutex;
 	atomic_t			mmap_count;
-	struct perf_mmap_data		*data;
+	struct perf_mmap_data __rcu	*data;
 
 	/* poll related */
 	wait_queue_head_t		waitq;
@@ -710,7 +710,7 @@ struct perf_event_context {
 	 * These fields let us detect when two contexts have both
 	 * been cloned (inherited) from a common ancestor.
 	 */
-	struct perf_event_context	*parent_ctx;
+	struct perf_event_context __rcu *parent_ctx;
 	u64				parent_gen;
 	u64				generation;
 	int				pin_count;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 74b3125..34d28f7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1442,7 +1442,7 @@ struct task_struct {
 	struct futex_pi_state *pi_state_cache;
 #endif
 #ifdef CONFIG_PERF_EVENTS
-	struct perf_event_context *perf_event_ctxp;
+	struct perf_event_context __rcu *perf_event_ctxp;
 	struct mutex perf_event_mutex;
 	struct list_head perf_event_list;
 #endif
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/bnx2.h         |    2 +-
 drivers/net/bnx2x.h        |    2 +-
 drivers/net/cnic.h         |    2 +-
 drivers/net/macvtap.c      |    2 +-
 include/linux/if_macvlan.h |    2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bnx2.h b/drivers/net/bnx2.h
index cd4b0e4..7bdb1cb 100644
--- a/drivers/net/bnx2.h
+++ b/drivers/net/bnx2.h
@@ -6746,7 +6746,7 @@ struct bnx2 {
 	u32		tx_wake_thresh;
 
 #ifdef BCM_CNIC
-	struct cnic_ops		*cnic_ops;
+	struct cnic_ops	__rcu	*cnic_ops;
 	void			*cnic_data;
 #endif
 
diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index 3c48a7a..9dfb57b 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -1007,7 +1007,7 @@ struct bnx2x {
 	dma_addr_t		timers_mapping;
 	void			*qm;
 	dma_addr_t		qm_mapping;
-	struct cnic_ops		*cnic_ops;
+	struct cnic_ops __rcu	*cnic_ops;
 	void			*cnic_data;
 	u32			cnic_tag;
 	struct cnic_eth_dev	cnic_eth_dev;
diff --git a/drivers/net/cnic.h b/drivers/net/cnic.h
index a0d853d..9852375 100644
--- a/drivers/net/cnic.h
+++ b/drivers/net/cnic.h
@@ -177,7 +177,7 @@ struct cnic_local {
 #define ULP_F_INIT	0
 #define ULP_F_START	1
 #define ULP_F_CALL_PENDING	2
-	struct cnic_ulp_ops *ulp_ops[MAX_CNIC_ULP_TYPE];
+	struct cnic_ulp_ops __rcu *ulp_ops[MAX_CNIC_ULP_TYPE];
 
 	/* protected by ulp_lock */
 	u32 cnic_local_flags;
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index abba3cc..adf0145 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -37,7 +37,7 @@
 struct macvtap_queue {
 	struct sock sk;
 	struct socket sock;
-	struct macvlan_dev *vlan;
+	struct macvlan_dev __rcu *vlan;
 	struct file *file;
 	unsigned int flags;
 };
diff --git a/include/linux/if_macvlan.h ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The new versions of the rcu_dereference() APIs require that the target type
of any pointer passed to them be fully defined.  The ->br_port field
in struct net_device points to a struct net_bridge_port, which is an
incomplete type.  This commit therefore changes ->br_port to be a void*,
and introduces a br_port() helper function to convert the type to struct
net_bridge_port, and applies this new helper function where required.
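
The void* plus helper pattern described above can be sketched in plain
userspace C. This is an illustration only: struct device_model and struct
bridge_port are hypothetical stand-ins, and the real br_port() in this patch
may do more (its definition is not visible in this part of the thread).

```c
#include <assert.h>
#include <stddef.h>

/* Generic code only ever sees an opaque pointer. */
struct device_model {
	void *br_port;	/* hypothetical stand-in for dev->br_port */
};

/* Full definition, visible only to the "bridge" side of the code. */
struct bridge_port {
	int port_no;
};

/* Convert the opaque pointer back to its real type in one place. */
static struct bridge_port *br_port(const struct device_model *dev)
{
	return dev->br_port;
}
```

Keeping the conversion in a single helper means callers that want to look
inside the port go through br_port(dev), while code that only passes the
pointer along never needs the full structure definition.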

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/if_bridge.h           |    1 +
 net/bridge/br_fdb.c                 |    2 +-
 net/bridge/br_private.h             |    8 ++++++++
 net/bridge/netfilter/ebt_redirect.c |    2 +-
 net/bridge/netfilter/ebt_ulog.c     |    4 ++--
 net/bridge/netfilter/ebtables.c     |    4 ++--
 net/netfilter/nfnetlink_log.c       |    4 ++--
 net/netfilter/nfnetlink_queue.c     |    4 ++--
 8 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 938b7e8..28a4559 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -101,6 +101,7 @@ struct __fdb_entry {
 
 #include <linux/netdevice.h>
 
+struct net_bridge_port;
 extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __user *));
 extern struct sk_buff *(*br_handle_frame_hook)(struct net_bridge_port *p,
 					       struct sk_buff *skb);
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 9101a4e..3f66cd1 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -246,7 +246,7 @@ int br_fdb_test_addr(struct net_device *dev, unsigned char *addr)
 		return 0;
 
 	rcu_read_lock();
-	fdb = __br_fdb_get(dev->br_port->br, addr);
+	fdb = __br_fdb_get(br_port(dev)->br, addr);
 	ret = fdb && fdb->dst->dev != dev &&
 		fdb->dst->state == ...
From: Stephen Hemminger
Date: Tuesday, May 4, 2010 - 2:26 pm

On Tue,  4 May 2010 13:19:38 -0700

I would rather make the bridge hook generic and not take a type argument.
--

From: Arnd Bergmann
Date: Tuesday, May 4, 2010 - 2:41 pm

Not sure if you were confused by the comment in the same way that I was.

The bridge hook is not impacted by this at all, since we can either pass
a void* or a struct net_bridge_port* to it. The br_port() helper
is used for all the places where we actually want to dereference 
dev->br_port and access its contents.

	Arnd
--

From: Paul E. McKenney
Date: Wednesday, May 5, 2010 - 3:03 pm

What should I change in the commit message to clear this up?

Of course, if the code needs to change, please let me know what should
change there as well.

							Thanx, Paul
--

From: Arnd Bergmann
Date: Thursday, May 6, 2010 - 7:09 am

I think both are ok; I was mostly confused by the discussion we had earlier.
Maybe add a sentence like:

 The br_handle_frame_hook now needs a forward declaration of struct net_bridge_port.

Or you just change br_handle_frame_hook to take a void* to avoid the forward
declaration. Not sure what Stephen was referring to really.

	Arnd
--

From: Paul E. McKenney
Date: Thursday, May 6, 2010 - 4:12 pm

This sounds like a way to make things quite a bit more intrusive, so
holding off on this.

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

This patch adds a check to __rcu_pending() that does a local
set_need_resched() if the current CPU is holding up the current grace
period and if force_quiescent_state() will be called soon.  The goal is
to reduce the probability that force_quiescent_state() will need to do
smp_send_reschedule(), which sends an IPI and is therefore more expensive
on most architectures.
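
The "will be called soon" test in the diff below is a wraparound-safe
comparison against jiffies.  A minimal userspace sketch of that comparison
follows; the ULONG_CMP_LT() definition is the usual kernel idiom as far as I
know, and force_qs_due() is a hypothetical helper name used only here.

```c
#include <assert.h>
#include <limits.h>

/* Wraparound-safe "a is before b" for free-running unsigned long counters. */
#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/*
 * Hypothetical helper: nonzero once "now" has reached "deadline".
 * Comparing (deadline - 1) < now is the same as deadline <= now,
 * and stays correct when the counter wraps around.
 */
static int force_qs_due(unsigned long deadline, unsigned long now)
{
	return ULONG_CMP_LT(deadline - 1, now);
}
```

A plain "deadline <= now" would give the wrong answer for a short window
around counter wraparound, which is why the subtraction-based form is used.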

Signed-off-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 3ec8160..e54c123 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1499,6 +1499,16 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
 	if (rdp->qs_pending) {
+
+		/*
+		 * If force_quiescent_state() coming soon and this CPU
+		 * needs a quiescent state, and this is either RCU-sched
+		 * or RCU-bh, force a local reschedule.
+		 */
+		if (!rdp->preemptable &&
+		    ULONG_CMP_LT(ACCESS_ONCE(rsp->jiffies_force_qs) - 1,
+				 jiffies))
+			set_need_resched();
 		rdp->n_rp_qs_pending++;
 		return 1;
 	}
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The existing RCU CPU stall-warning messages can be confusing, especially
in the case where one CPU detects a single other stalled CPU.  In addition,
the console messages did not say which flavor of RCU detected the stall,
which can make it difficult to work out exactly what is causing the stall.
This commit improves these messages.
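
A minimal userspace sketch of the naming technique used in the diff below:
the initializer macro stringifies its argument so each state structure
carries its own name for diagnostics.  struct state_model, the variable
names, and the message text are hypothetical.  The macro parameter cannot be
called "name" once a ".name" field is initialized, since the preprocessor
would substitute the argument into the field designator as well; that is
presumably why the patch renames the parameter to "structname".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for struct rcu_state. */
struct state_model {
	int signaled;
	const char *name;	/* used only in diagnostics */
};

/* "#structname" turns the macro argument into a string literal. */
#define STATE_INITIALIZER(structname) {	\
	.signaled = 0,			\
	.name = #structname,		\
}

static struct state_model sched_state_model =
	STATE_INITIALIZER(sched_state_model);
static struct state_model bh_state_model =
	STATE_INITIALIZER(bh_state_model);

/* Hypothetical message format; says which flavor detected the stall. */
static int format_stall(char *buf, size_t len,
			const struct state_model *sp, int cpu)
{
	return snprintf(buf, len, "INFO: %s detected stall on CPU %d",
			sp->name, cpu);
}
```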

Requested-by: Dhaval Giani <dhaval.giani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   20 +++++++++++---------
 kernel/rcutree.h |    1 +
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index ec6196f..f391886 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -54,8 +54,8 @@
 
 static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
 
-#define RCU_STATE_INITIALIZER(name) { \
-	.level = { &name.node[0] }, \
+#define RCU_STATE_INITIALIZER(structname) { \
+	.level = { &structname.node[0] }, \
 	.levelcnt = { \
 		NUM_RCU_LVL_0,  /* root of hierarchy. */ \
 		NUM_RCU_LVL_1, \
@@ -66,13 +66,14 @@ static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
 	.signaled = RCU_GP_IDLE, \
 	.gpnum = -300, \
 	.completed = -300, \
-	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&name.onofflock), \
+	.onofflock = __RAW_SPIN_LOCK_UNLOCKED(&structname.onofflock), \
 	.orphan_cbs_list = NULL, \
-	.orphan_cbs_tail = &name.orphan_cbs_list, \
+	.orphan_cbs_tail = &structname.orphan_cbs_list, \
 	.orphan_qlen = 0, \
-	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&name.fqslock), \
+	.fqslock = __RAW_SPIN_LOCK_UNLOCKED(&structname.fqslock), \
 	.n_force_qs = 0, \
 	.n_force_qs_ngp = 0, \
+	.name = #structname, \
 }
 
 struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched_state);
@@ -483,7 +484,8 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 
 	/* OK, time to rat on our buddy... */
 
-	printk(KERN_ERR "INFO: RCU detected CPU stalls:");
+	printk(KERN_ERR "INFO: %s detected stalls on CPUs/tasks: {",
+	       rsp->name);
 ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@relay.de.ibm.com>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/igmp.h               |    4 ++--
 include/linux/netdevice.h          |   12 ++++++------
 include/net/dst.h                  |    2 +-
 include/net/fib_rules.h            |    2 +-
 include/net/garp.h                 |    2 +-
 include/net/inet_sock.h            |    2 +-
 include/net/ip6_tunnel.h           |    2 +-
 include/net/ipip.h                 |    6 +++---
 include/net/net_namespace.h        |    2 +-
 include/net/netns/xfrm.h           |    2 +-
 include/net/sock.h                 |    4 ++--
 kernel/sched.c                     |    2 +-
 net/802/stp.c                      |    4 ++--
 net/ipv4/ip_gre.c                  |    2 +-
 net/ipv4/ipip.c                    |   10 +++++-----
 net/ipv4/protocol.c                |    2 +-
 net/ipv4/route.c                   |    2 +-
 net/ipv4/tcp.c                     |    4 ++--
 net/ipv6/ip6_tunnel.c              |    6 +++---
 net/ipv6/protocol.c                |    2 +-
 net/ipv6/sit.c                     |   10 +++++-----
 net/mac80211/ieee80211_i.h         |   15 ++++++++-------
 net/mac80211/sta_info.h            |    4 ++--
 net/netlabel/netlabel_domainhash.c |    4 ++--
 net/netlabel/netlabel_unlabeled.c  |    4 ++--
 net/netlink/af_netlink.c           |    2 +-
 net/phonet/af_phonet.c             |    2 +-
 net/phonet/pn_dev.c                |    2 +-
 net/socket.c                       |    2 +-
 29 files changed, 60 insertions(+), 59 deletions(-)

diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 93fc244..39dd315 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -167,10 +167,10 @@ struct ip_sf_socklist {
  */
 
 struct ip_mc_socklist {
-	struct ip_mc_socklist	*next;
+	struct ip_mc_socklist __rcu *next;
 	struct ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

Make naming line up in preparation for CONFIG_TINY_PREEMPT_RCU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutiny.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 272c6d2..d9f8a62 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -44,9 +44,9 @@ struct rcu_ctrlblk {
 };
 
 /* Definition for rcupdate control block. */
-static struct rcu_ctrlblk rcu_ctrlblk = {
-	.donetail	= &rcu_ctrlblk.rcucblist,
-	.curtail	= &rcu_ctrlblk.rcucblist,
+static struct rcu_ctrlblk rcu_sched_ctrlblk = {
+	.donetail	= &rcu_sched_ctrlblk.rcucblist,
+	.curtail	= &rcu_sched_ctrlblk.rcucblist,
 };
 
 static struct rcu_ctrlblk rcu_bh_ctrlblk = {
@@ -108,7 +108,8 @@ static int rcu_qsctr_help(struct rcu_ctrlblk *rcp)
  */
 void rcu_sched_qs(int cpu)
 {
-	if (rcu_qsctr_help(&rcu_ctrlblk) + rcu_qsctr_help(&rcu_bh_ctrlblk))
+	if (rcu_qsctr_help(&rcu_sched_ctrlblk) +
+	    rcu_qsctr_help(&rcu_bh_ctrlblk))
 		raise_softirq(RCU_SOFTIRQ);
 }
 
@@ -173,7 +174,7 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
  */
 static void rcu_process_callbacks(struct softirq_action *unused)
 {
-	__rcu_process_callbacks(&rcu_ctrlblk);
+	__rcu_process_callbacks(&rcu_sched_ctrlblk);
 	__rcu_process_callbacks(&rcu_bh_ctrlblk);
 }
 
@@ -221,7 +222,7 @@ static void __call_rcu(struct rcu_head *head,
  */
 void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 {
-	__call_rcu(head, func, &rcu_ctrlblk);
+	__call_rcu(head, func, &rcu_sched_ctrlblk);
 }
 EXPORT_SYMBOL_GPL(call_rcu);
 
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Lai Jiangshan <laijs@cn.fujitsu.com>

There is no need to disable lockdep after an RCU lockdep splat,
so remove the debug_locks_off() from lockdep_rcu_dereference().
To avoid repeated lockdep splats, use a static variable in the inlined
rcu_dereference_check() and rcu_dereference_protected() macros so that
a given instance splats only once, but so that multiple instances can
be detected per boot.

This is controlled by a new config variable CONFIG_PROVE_RCU_REPEATEDLY,
which is disabled by default.  This provides the normal lockdep behavior
by default, but permits people who want to find multiple RCU-lockdep
splats per boot to easily do so.
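
The "static variable in the macro" idea can be sketched in userspace as
follows.  This is not the macro from the patch: CHECK_ONCE(), report_splat()
and splat_count are hypothetical names, and the sketch omits the lockdep
enable check.  It only shows why each call site reports at most once while
different call sites can still report independently.

```c
#include <assert.h>
#include <stdio.h>

static int splat_count;	/* hypothetical counter, for demonstration only */

static void report_splat(const char *file, int line)
{
	splat_count++;
	fprintf(stderr, "suspicious usage at %s:%d\n", file, line);
}

/*
 * Each expansion of the macro gets its own static flag, so a given
 * call site reports a failed condition only the first time.
 */
#define CHECK_ONCE(cond)					\
	do {							\
		static int warned_once;				\
		if (!(cond) && !warned_once) {			\
			warned_once = 1;			\
			report_splat(__FILE__, __LINE__);	\
		}						\
	} while (0)

/* Two distinct call sites, each with an independent flag. */
static void site_a(int ok) { CHECK_ONCE(ok); }
static void site_b(int ok) { CHECK_ONCE(ok); }
```

Note that the flag update in this sketch is not atomic, so two threads
hitting the same site at the same moment could both report.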

Requested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    6 ++----
 kernel/lockdep.c         |    2 ++
 lib/Kconfig.debug        |   12 ++++++++++++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index a8b2e03..4dca275 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -230,8 +230,7 @@ extern int rcu_my_thread_group_empty(void);
  */
 #define rcu_dereference_check(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		rcu_dereference_raw(p); \
 	})
 
@@ -248,8 +247,7 @@ extern int rcu_my_thread_group_empty(void);
  */
 #define rcu_dereference_protected(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		(p); \
 	})
 
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 2594e1c..73747b7 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3801,8 +3801,10 @@ void lockdep_rcu_dereference(const char *file, const int line)
 {
 	struct task_struct *curr = ...
From: Mathieu Desnoyers
Date: Wednesday, May 5, 2010 - 3:46 pm

I'll play the devil's advocate here. (just because that's so much fun)
;-)

If we look at:

include/linux/debug_locks.h:

static inline int __debug_locks_off(void)
{
        return xchg(&debug_locks, 0);
}

We see that all code following a call to "debug_locks_off()" can assume
that it cannot possibly run concurrently with other code following
"debug_locks_off()". Now, I'm not saying that the code we currently have
will necessarily break, but I think it is worth asking if there is any
assumption in lockdep (or RCU lockdep more specifically) about mutual
exclusion after debug_locks_off() ?

Because if there is, then this patch is definitely breaking something by
not protecting lockdep against multiple concurrent executions.
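
The exchange-based guarantee can be illustrated with C11 atomics in
userspace.  This is a sketch, not the kernel code: debug_flag and
debug_flag_off() are hypothetical stand-ins for debug_locks and
__debug_locks_off(), and atomic_exchange() stands in for xchg().

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int debug_flag = 1;	/* hypothetical stand-in for debug_locks */

/*
 * Atomically clear the flag and return its previous value.  However many
 * callers race here, exactly one of them sees the nonzero old value.
 */
static int debug_flag_off(void)
{
	return atomic_exchange(&debug_flag, 0);
}
```

Code that runs only when debug_flag_off() returns nonzero is therefore
executed by a single caller, which is the mutual-exclusion property in
question.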

Thanks,


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--

From: Paul E. McKenney
Date: Wednesday, May 5, 2010 - 4:05 pm

So what in lockdep_rcu_dereference() needs to be protected from concurrent
execution?  All that happens beyond that point is a bunch of printk()s,
printing the locks held by this task, and dumping this task's stack.

							Thanx, Paul
--

From: Mathieu Desnoyers
Date: Wednesday, May 5, 2010 - 4:24 pm

I agree with you that printing the current task information should be safe.
However, I am not sure that concurrent updates to the lock_class while printk()
is showing its information will end up doing what we expect it to do.

It could be acceptable to have unreliable information in these rare cases, but
the important thing would be to ensure that the kernel does not OOPS.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--

From: Paul E. McKenney
Date: Wednesday, May 5, 2010 - 4:36 pm

But any races other than the printk()s can already happen as follows:

o	CPU 0 needs to update some information about the lock.  It
	checks debug_locks and finds that it is non-zero.

o	CPU 1 detects a deadlock, and invokes __debug_locks_off(),
	which atomically sets debug_locks to zero.

o	CPU 1 then proceeds to printk() information that CPU 0
	is concurrently modifying.  Which looks to be OK in any case.

Or is there some other race that cannot already happen that I am
introducing?

							Thanx, Paul
--

From: Mathieu Desnoyers
Date: Wednesday, May 5, 2010 - 7:05 pm

Nope, I don't think so. Although it's probably worth putting a comment in
lockdep_rcu_dereference() to state that lockdep can be used by multiple
concurrent instances here, just in case someone ever considers adding code
to this splat handler thinking lockdep is always only used by a single "splat"
at a time.

Thanks,


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--

From: Paul E. McKenney
Date: Thursday, May 6, 2010 - 4:09 pm

Done!

							Thanx, Paul
--

From: Matt Mackall
Date: Tuesday, May 4, 2010 - 1:27 pm

Please document that distinction at the definition of INIT_RCU_HEADS,
otherwise we're sure to see creep.

-- 
Mathematics is the supreme nostalgia of our time.


--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:36 pm

Hello, Matt,

RCU_HEAD_INIT() is no more.  Or will be no more after this patchset.
However, now that you mention it, we do need docbook comments for
init_rcu_head_on_stack() and friends.  I will fix that.

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
---
 include/linux/radix-tree.h |    4 +++-
 lib/radix-tree.c           |    2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index 55ca73c..d801044 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -47,6 +47,8 @@ static inline void *radix_tree_indirect_to_ptr(void *ptr)
 {
 	return (void *)((unsigned long)ptr & ~RADIX_TREE_INDIRECT_PTR);
 }
+#define radix_tree_indirect_to_ptr(ptr) \
+	radix_tree_indirect_to_ptr((void __force *)(ptr))
 
 static inline int radix_tree_is_indirect_ptr(void *ptr)
 {
@@ -61,7 +63,7 @@ static inline int radix_tree_is_indirect_ptr(void *ptr)
 struct radix_tree_root {
 	unsigned int		height;
 	gfp_t			gfp_mask;
-	struct radix_tree_node	*rnode;
+	struct radix_tree_node	__rcu *rnode;
 };
 
 #define RADIX_TREE_INIT(mask)	{					\
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 2a087e0..08f86cc 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -49,7 +49,7 @@ struct radix_tree_node {
 	unsigned int	height;		/* Height from the bottom */
 	unsigned int	count;
 	struct rcu_head	rcu_head;
-	void		*slots[RADIX_TREE_MAP_SIZE];
+	void __rcu	*slots[RADIX_TREE_MAP_SIZE];
 	unsigned long	tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
 };
 
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Remove all rcu head inits. We don't care about the RCU head state before passing
it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can
keep track of objects on stack.

This patch applies to current -tip based on 2.6.34-rc2.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
---
 arch/powerpc/mm/pgtable.c                   |    1 -
 block/cfq-iosched.c                         |    1 -
 block/genhd.c                               |    1 -
 drivers/staging/batman-adv/hard-interface.c |    1 -
 fs/file.c                                   |    3 --
 fs/fs-writeback.c                           |   31 ++++++++++++++++++++++----
 fs/partitions/check.c                       |    1 -
 include/linux/init_task.h                   |    1 -
 kernel/rcutiny.c                            |    6 +++++
 kernel/rcutorture.c                         |    2 +
 kernel/rcutree.c                            |    4 +++
 ...
From: James Morris
Date: Tuesday, May 4, 2010 - 4:44 pm

Reviewed-by: James Morris <jmorris@namei.org>

-- 
James Morris
<jmorris@namei.org>
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 5:03 pm

Thank you, James!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The current version of RCU_FAST_NO_HZ reproduces the old CLASSIC_RCU
dyntick-idle bug, as it fails to detect CPUs that have interrupted
or NMIed out of dyntick-idle mode.  Fix this by making rcu_needs_cpu()
check the state in the per-CPU rcu_dynticks variables, thus correctly
detecting the dyntick-idle state from an RCU perspective.
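
The check added by the diff below can be modeled in userspace as follows.
This is a sketch under stated assumptions: struct dynticks_model and both
helper names are hypothetical, the per-CPU variables are replaced by a plain
array, and the memory barrier from the patch is omitted.  The one thing taken
from the diff is the test itself: a counter with its low bit set marks a CPU
that is not in dyntick-idle mode from RCU's point of view.

```c
#include <assert.h>

/* Hypothetical model of the two per-CPU counters sampled by the patch. */
struct dynticks_model {
	int dynticks;		/* odd: CPU is not dyntick-idle */
	int dynticks_nmi;	/* odd: CPU is in an NMI taken from dyntick-idle */
};

/* Same low-bit test as in the diff, applied to one CPU's snapshot. */
static int cpu_is_rcu_active(const struct dynticks_model *d)
{
	return ((d->dynticks & 0x1) != 0) || ((d->dynticks_nmi & 0x1) != 0);
}

/* Nonzero if every CPU other than "self" appears dyntick-idle. */
static int is_last_active_cpu(const struct dynticks_model *cpus,
			      int ncpus, int self)
{
	int i;

	for (i = 0; i < ncpus; i++) {
		if (i == self)
			continue;
		if (cpu_is_rcu_active(&cpus[i]))
			return 0;
	}
	return 1;
}
```

Because interrupts and NMIs change the counters, a CPU that has left
dyntick-idle mode through either path is seen as active by this check.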

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_plugin.h |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 5c599e8..92e5c9b 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1053,6 +1053,8 @@ static DEFINE_PER_CPU(unsigned long, rcu_dyntick_holdoff);
 int rcu_needs_cpu(int cpu)
 {
 	int c = 0;
+	int snap;
+	int snap_nmi;
 	int thatcpu;
 
 	/* Check for being in the holdoff period. */
@@ -1060,12 +1062,18 @@ int rcu_needs_cpu(int cpu)
 		return rcu_needs_cpu_quick_check(cpu);
 
 	/* Don't bother unless we are the last non-dyntick-idle CPU. */
-	for_each_cpu_not(thatcpu, nohz_cpu_mask)
-		if (cpu_online(thatcpu) && thatcpu != cpu) {
+	for_each_online_cpu(thatcpu) {
+		if (thatcpu == cpu)
+			continue;
+		snap = per_cpu(rcu_dynticks, thatcpu).dynticks;
+		snap_nmi = per_cpu(rcu_dynticks, thatcpu).dynticks_nmi;
+		smp_mb(); /* Order sampling of snap with end of grace period. */
+		if (((snap & 0x1) != 0) || ((snap_nmi & 0x1) != 0)) {
 			per_cpu(rcu_dyntick_drain, cpu) = 0;
 			per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1;
 			return rcu_needs_cpu_quick_check(cpu);
 		}
+	}
 
 	/* Check and update the rcu_dyntick_drain sequencing. */
 	if (per_cpu(rcu_dyntick_drain, cpu) <= 0) {
-- 
1.7.0

--
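
[Editorial sketch, not part of the posted patch.]  The parity test that
the patch adds to rcu_needs_cpu() can be modeled in user space roughly as
follows.  This is a minimal sketch under stated assumptions: NR_CPUS,
struct dynticks_model, the cpus[] array, and other_cpu_non_idle() are
hypothetical names invented for illustration, the "online" flag stands in
for for_each_online_cpu(), and the smp_mb() ordering from the patch is
deliberately omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for the per-CPU rcu_dynticks state. */
struct dynticks_model {
	int dynticks;		/* odd: CPU is not dyntick-idle */
	int dynticks_nmi;	/* odd: CPU took an NMI out of dyntick-idle */
	bool online;		/* models for_each_online_cpu() membership */
};

static struct dynticks_model cpus[NR_CPUS];

/*
 * Return true if some online CPU other than "cpu" is non-idle from
 * RCU's point of view, using the same low-order-bit test as the patch.
 * No memory barriers here: this sketch is single-threaded.
 */
static bool other_cpu_non_idle(int cpu)
{
	int thatcpu;

	assert(cpu >= 0 && cpu < NR_CPUS);
	for (thatcpu = 0; thatcpu < NR_CPUS; thatcpu++) {
		if (!cpus[thatcpu].online || thatcpu == cpu)
			continue;
		if ((cpus[thatcpu].dynticks & 0x1) != 0 ||
		    (cpus[thatcpu].dynticks_nmi & 0x1) != 0)
			return true;
	}
	return false;
}
```

In this model, a CPU whose dynticks counter is even but whose
dynticks_nmi counter is odd still counts as non-idle, which is the case
that a check of nohz_cpu_mask alone would miss.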

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

RCU heads really don't need to be initialized. Their state before call_rcu()
really does not matter.

We need to keep init/destroy_rcu_head_on_stack() though, since we want
debugobjects to be able to keep track of these objects.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index b653b4a..3a1a70d 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -73,12 +73,6 @@ extern void rcu_init(void);
 #error "Unknown RCU implementation specified to kernel configuration"
 #endif
 
-#define RCU_HEAD_INIT	{ .next = NULL, .func = NULL }
-#define RCU_HEAD(head) struct rcu_head head = RCU_HEAD_INIT
-#define INIT_RCU_HEAD(ptr) do { \
-       (ptr)->next = NULL; (ptr)->func = NULL; \
-} while (0)
-
 static inline void init_rcu_head_on_stack(struct rcu_head *head)
 {
 }
-- 
1.7.0

--
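
[Editorial sketch, not part of the posted patch.]  The claim that the
state of an RCU head before call_rcu() does not matter can be illustrated
with a small user-space model.  Everything below is an assumption made
for illustration (struct rcu_head_model, call_rcu_model(),
run_callbacks_model(), and the simple singly linked queue); it is not the
kernel's implementation, only a sketch of why unconditional overwriting
of ->next and ->func makes prior initialization redundant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-field callback header, shaped like struct rcu_head. */
struct rcu_head_model {
	struct rcu_head_model *next;
	void (*func)(struct rcu_head_model *head);
};

static struct rcu_head_model *cb_list;
static struct rcu_head_model **cb_tail = &cb_list;
static int invoked;

static void count_cb(struct rcu_head_model *head)
{
	(void)head;
	invoked++;
}

/* Enqueue a callback; both fields are overwritten unconditionally. */
static void call_rcu_model(struct rcu_head_model *head,
			   void (*func)(struct rcu_head_model *head))
{
	assert(head != NULL && func != NULL);
	head->func = func;
	head->next = NULL;
	*cb_tail = head;
	cb_tail = &head->next;
}

/* Stand-in for "a grace period has elapsed": run all queued callbacks. */
static void run_callbacks_model(void)
{
	struct rcu_head_model *list = cb_list;

	cb_list = NULL;
	cb_tail = &cb_list;
	while (list != NULL) {
		struct rcu_head_model *next = list->next;

		list->func(list);
		list = next;
	}
}
```

A head whose fields hold arbitrary values before call_rcu_model() is
queued and invoked exactly like a zero-initialized one in this model.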

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The addition of preemptible RCU to treercu resulted in a bit of
confusion and inefficiency surrounding the handling of context switches
for RCU-sched and for RCU-preempt.  For RCU-sched, a context switch
is a quiescent state, pure and simple, just like it always has been.
For RCU-preempt, a context switch is in no way a quiescent state, but
special handling is required when a task blocks in an RCU read-side
critical section.

However, the callout from the scheduler and the outer loop in ksoftirqd
still call something named rcu_sched_qs(), whose name is no longer
accurate.  Furthermore, when rcu_check_callbacks() notes an RCU-sched
quiescent state, it ends up unnecessarily (though harmlessly, aside
from the performance hit) enqueuing the current task if it happens to
be running in an RCU-preempt read-side critical section.  This not only
increases the maximum latency of scheduler_tick(), it also needlessly
increases the overhead of the next outermost rcu_read_unlock() invocation.

This patch addresses this situation by separating the notion of RCU's
context-switch handling from that of RCU-sched's quiescent states.
The context-switch handling is covered by rcu_note_context_switch() in
general and by rcu_preempt_note_context_switch() for preemptible RCU.
This permits rcu_sched_qs() to handle quiescent states and only quiescent
states.  It also reduces the maximum latency of scheduler_tick(), though
probably by much less than a microsecond.  Finally, it means that tasks
within preemptible-RCU read-side critical sections avoid incurring the
overhead of queuing unless there really is a context switch.

Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 include/linux/rcutiny.h |    4 ++++
 include/linux/rcutree.h |    1 +
 kernel/rcutree.c        |   17 ++++++++++++-----
 kernel/rcutree_plugin.h ...
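
[Editorial sketch, not part of the posted patch.]  The separation that
the commit log describes can be modeled in user space as below.  The
function names carry a _model suffix because they are illustrative
stand-ins, and struct task_model with its two fields is an assumption;
the sketch only shows the intended division of labor: the tick path
records an RCU-sched quiescent state and nothing else, while the
context-switch path additionally queues a task that blocks inside a
preemptible-RCU read-side critical section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slice of per-task state used by preemptible RCU. */
struct task_model {
	int rcu_read_lock_nesting;	/* > 0: inside a read-side critical section */
	bool queued_blocked;		/* task was queued as a blocked reader */
};

static int sched_qs_count;

/* Record an RCU-sched quiescent state, and only that. */
static void rcu_sched_qs_model(void)
{
	sched_qs_count++;
}

/* Preemptible-RCU handling: queue the task only if it is a reader. */
static void rcu_preempt_note_context_switch_model(struct task_model *t)
{
	assert(t != NULL);
	if (t->rcu_read_lock_nesting > 0)
		t->queued_blocked = true;
}

/* Called from the scheduler on a real context switch. */
static void rcu_note_context_switch_model(struct task_model *t)
{
	rcu_sched_qs_model();
	rcu_preempt_note_context_switch_model(t);
}

/* Called from the tick path: a quiescent state, but no queuing. */
static void rcu_tick_qs_model(struct task_model *t)
{
	(void)t;
	rcu_sched_qs_model();
}
```

In this model a reader that is merely interrupted by the tick is never
queued; it is queued only when rcu_note_context_switch_model() runs.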
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Helps find racy users of call_rcu(), which result in hangs because list
entries are overwritten and/or skipped.

Changelog since v4:
- Bisectability is now OK
- Now generate a WARN_ON_ONCE() for non-initialized rcu_head passed to
  call_rcu(). Statically initialized objects are detected with
  object_is_static().
- Rename rcu_head_init_on_stack to init_rcu_head_on_stack.
- Remove init_rcu_head() completely.

Changelog since v3:
- Include comments from Lai Jiangshan

This new patch version is based on the debugobjects with the newly introduced
"active state" tracker.

Non-initialized entries are all considered as "statically initialized". An
activation fixup (triggered by call_rcu()) takes care of performing the debug
object initialization without issuing any warning. Since we cannot increase the
size of struct rcu_head, I don't see much room to put an identifier for
statically initialized rcu_head structures. So for now, we have to live without
"activation without explicit init" detection. But the main purpose of this debug
option is to detect double-activations (double call_rcu() use of a rcu_head
before the callback is executed), which is correctly addressed here.

This also detects potential internal RCU callback corruption, which would cause
the callbacks to be executed twice.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan ...
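
[Editorial sketch, not part of the posted patch.]  The double-activation
check described above can be approximated in user space as follows.  This
is a sketch under loud assumptions: the real patch relies on debugobjects
and its "active state" tracker, whereas the fixed-size queued[] table and
the debug_rcu_head_queue_model()/debug_rcu_head_unqueue_model() helpers
here are hypothetical.  What the sketch does share with the description
is that the tracking state lives outside the callback header, so struct
rcu_head need not grow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 8	/* hypothetical table size for this sketch */

struct rcu_head_model {
	struct rcu_head_model *next;
	void (*func)(struct rcu_head_model *head);
};

/* Addresses of heads that are queued and whose callback has not yet run. */
static const struct rcu_head_model *queued[MAX_TRACKED];

/*
 * Mark "head" as queued.  Returns false if it is already queued, which
 * models a second call_rcu() before the first callback has executed.
 */
static bool debug_rcu_head_queue_model(const struct rcu_head_model *head)
{
	size_t i;
	size_t free_slot = MAX_TRACKED;

	assert(head != NULL);
	for (i = 0; i < MAX_TRACKED; i++) {
		if (queued[i] == head)
			return false;	/* double activation detected */
		if (queued[i] == NULL && free_slot == MAX_TRACKED)
			free_slot = i;
	}
	assert(free_slot < MAX_TRACKED);	/* sketch does not handle overflow */
	queued[free_slot] = head;
	return true;
}

/* Mark "head" as no longer queued, just before its callback runs. */
static void debug_rcu_head_unqueue_model(const struct rcu_head_model *head)
{
	size_t i;

	for (i = 0; i < MAX_TRACKED; i++)
		if (queued[i] == head)
			queued[i] = NULL;
}
```

Note that a never-initialized head is accepted on first use in this
model, matching the statement above that "activation without explicit
init" is not detected.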
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

TINY_RCU does not need rcu_scheduler_active unless CONFIG_DEBUG_LOCK_ALLOC.
So conditionally compile rcu_scheduler_active in order to slim down
rcutiny a bit more.  Also gets rid of an EXPORT_SYMBOL_GPL, which is
responsible for most of the slimming.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    4 +---
 include/linux/rcutiny.h  |   13 +++++++++++++
 include/linux/rcutree.h  |    3 +++
 kernel/rcupdate.c        |   19 -------------------
 kernel/rcutiny.c         |    7 +++++++
 kernel/rcutiny_plugin.h  |   39 +++++++++++++++++++++++++++++++++++++++
 kernel/rcutree.c         |   19 +++++++++++++++++++
 7 files changed, 82 insertions(+), 22 deletions(-)
 create mode 100644 kernel/rcutiny_plugin.h

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index d8fb2ab..23be3a7 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -64,8 +64,6 @@ extern int sched_expedited_torture_stats(char *page);
 
 /* Internal to kernel */
 extern void rcu_init(void);
-extern int rcu_scheduler_active;
-extern void rcu_scheduler_starting(void);
 
 #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
 #include <linux/rcutree.h>
@@ -178,7 +176,7 @@ static inline int rcu_read_lock_bh_held(void)
 #ifdef CONFIG_PREEMPT
 static inline int rcu_read_lock_sched_held(void)
 {
-	return !rcu_scheduler_active || preempt_count() != 0 || irqs_disabled();
+	return preempt_count() != 0 || irqs_disabled();
 }
 #else /* #ifdef CONFIG_PREEMPT */
 static inline int rcu_read_lock_sched_held(void)
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index ff22b97..14e5a76 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -128,4 +128,17 @@ static inline int rcu_preempt_depth(void)
 	return 0;
 }
 
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+extern int rcu_scheduler_active __read_mostly;
+extern void rcu_scheduler_starting(void);
+
+#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+static inline ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
---
 include/linux/cred.h |    2 +-
 include/linux/key.h  |    3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/cred.h b/include/linux/cred.h
index 52507c3..413f98a 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -84,7 +84,7 @@ struct thread_group_cred {
 	atomic_t	usage;
 	pid_t		tgid;			/* thread group process ID */
 	spinlock_t	lock;
-	struct key	*session_keyring;	/* keyring inherited over fork */
+	struct key __rcu *session_keyring;	/* keyring inherited over fork */
 	struct key	*process_keyring;	/* keyring private to this process */
 	struct rcu_head	rcu;			/* RCU deletion hook */
 };
diff --git a/include/linux/key.h b/include/linux/key.h
index cd50dfa..3db0adc 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -178,8 +178,9 @@ struct key {
 	 */
 	union {
 		unsigned long		value;
+		void __rcu		*rcudata;
 		void			*data;
-		struct keyring_list	*subscriptions;
+		struct keyring_list __rcu *subscriptions;
 	} payload;
 };
 
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

Print boot-time messages if tracing is enabled, if fanout is set
to non-default values, if exact fanout is specified, if accelerated
dyntick-idle grace periods have been enabled, if RCU-lockdep is enabled,
if rcutorture has been boot-time enabled, if the CPU stall detector has
been disabled, or if four-level hierarchy has been enabled.

This is all for TREE_RCU and TREE_PREEMPT_RCU.  TINY_RCU will be handled
separately, if at all.

Suggested-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        |    6 ------
 kernel/rcutree_plugin.h |   44 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 595fb83..ec6196f 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1938,12 +1938,6 @@ void __init rcu_init(void)
 	int cpu;
 
 	rcu_bootup_announce();
-#ifdef CONFIG_RCU_CPU_STALL_DETECTOR
-	printk(KERN_INFO "RCU-based detection of stalled CPUs is enabled.\n");
-#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
-#if NUM_RCU_LVL_4 != 0
-	printk(KERN_INFO "Experimental four-level hierarchy is enabled.\n");
-#endif /* #if NUM_RCU_LVL_4 != 0 */
 	RCU_INIT_FLAVOR(&rcu_sched_state, rcu_sched_data);
 	RCU_INIT_FLAVOR(&rcu_bh_state, rcu_bh_data);
 	__rcu_init_preempt();
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index f9bc83a..0ae2339 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -26,6 +26,45 @@
 
 #include <linux/delay.h>
 
+/*
+ * Check the RCU kernel configuration parameters and print informative
+ * messages about anything out of the ordinary.  If you like #ifdef, you
+ * will love this function.
+ */
+static void __init rcu_bootup_announce_oddness(void)
+{
+#ifdef CONFIG_RCU_TRACE
+	printk(KERN_INFO "\tRCU debugfs-based tracing is enabled.\n");
+#endif
+#if (defined(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 64) || (!defined(CONFIG_64BIT) && ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

Lai Jiangshan noted that up to 10% of RCU_SOFTIRQ invocations are
spurious, and traced this down to the fact that the current grace-period machinery
will uselessly raise RCU_SOFTIRQ when a given CPU needs to go through
a quiescent state, but has not yet done so.  In this situation, there
might well be nothing that RCU_SOFTIRQ can do, and the overhead can be
worth worrying about in the ksoftirqd case.  This patch therefore avoids
raising RCU_SOFTIRQ in this situation.

Changes since v1 (http://lkml.org/lkml/2010/3/30/122 from Lai Jiangshan):

o	Omit the rcu_qs_pending() prechecks, as they aren't that
	much less expensive than the quiescent-state checks.

o	Merge with the set_need_resched() patch that reduces IPIs.

o	Add the new n_rp_report_qs field to the rcu_pending tracing output.

o	Update the tracing documentation accordingly.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/RCU/trace.txt |   35 +++++++++++++++++++----------------
 kernel/rcutree.c            |   11 ++++++-----
 kernel/rcutree.h            |    1 +
 kernel/rcutree_trace.c      |    4 +++-
 4 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index 8608fd8..efd8cc9 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -256,23 +256,23 @@ o	Each element of the form "1/1 0:127 ^0" represents one struct
 The output of "cat rcu/rcu_pending" looks as follows:
 
 rcu_sched:
-  0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
-  1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
-  2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
-  3 np=236249 qsp=48766 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
-  4 np=221310 qsp=46850 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
-  5 np=237332 qsp=48449 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
-  6 np=219995 ...
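
[Editorial sketch, not part of the posted patch.]  The decision that the
commit log describes can be modeled as below.  Only n_rp_report_qs is
named in the changelog; the other field names, struct rcu_data_model, and
rcu_pending_model() are assumptions made for illustration, and the real
pending check tests further conditions (callbacks ready, grace period
started or completed, and so on) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slice of per-CPU RCU state. */
struct rcu_data_model {
	bool qs_pending;		/* RCU core wants a quiescent state from this CPU */
	bool passed_quiesc;		/* this CPU has passed through one */
	unsigned long n_rp_qs_pending;	/* times we were still waiting */
	unsigned long n_rp_report_qs;	/* times there was a QS to report */
};

/* Return nonzero if raising RCU_SOFTIRQ would have work to do. */
static int rcu_pending_model(struct rcu_data_model *rdp)
{
	assert(rdp != NULL);
	if (rdp->qs_pending && !rdp->passed_quiesc) {
		/* Still waiting for the CPU: the softirq could do nothing. */
		rdp->n_rp_qs_pending++;
		return 0;
	}
	if (rdp->qs_pending && rdp->passed_quiesc) {
		/* A quiescent state is ready to be reported. */
		rdp->n_rp_report_qs++;
		return 1;
	}
	return 0;
}
```

The first branch is the formerly spurious case: in this model it is
counted for tracing but no longer requests the softirq.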
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

This adds __rcu annotations to RCU-protected pointers in core kernel components.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/fdtable.h   |    6 +++---
 include/linux/fs.h        |    2 +-
 include/linux/genhd.h     |    6 +++---
 include/linux/init_task.h |    4 ++--
 include/linux/iocontext.h |    2 +-
 include/linux/mm_types.h  |    2 +-
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index f59ed29..b156bd1 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -31,7 +31,7 @@ struct embedded_fd_set {
 
 struct fdtable {
 	unsigned int max_fds;
-	struct file ** fd;      /* current fd array */
+	struct file __rcu ** fd;      /* current fd array */
 	fd_set *close_on_exec;
 	fd_set *open_fds;
 	struct rcu_head rcu;
@@ -46,7 +46,7 @@ struct files_struct {
    * read mostly part
    */
 	atomic_t count;
-	struct fdtable *fdt;
+	struct fdtable __rcu *fdt;
 	struct fdtable fdtab;
   /*
    * written part on a separate cache line in SMP
@@ -55,7 +55,7 @@ struct files_struct {
 	int next_fd;
 	struct embedded_fd_set close_on_exec_init;
 	struct embedded_fd_set open_fds_init;
-	struct file * fd_array[NR_OPEN_DEFAULT];
+	struct file __rcu * fd_array[NR_OPEN_DEFAULT];
 };
 
 #define rcu_dereference_check_fdtable(files, fdtfd) \
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 39d57bc..4449669 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1382,7 +1382,7 @@ struct super_block {
 	 * Saved mount options for lazy filesystems using
 	 * generic_show_options()
 	 */
-	char *s_options;
+	char __rcu *s_options;
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
diff --git a/include/linux/genhd.h ...
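
[Editorial sketch, not part of the posted patch.]  The effect of an
__rcu-style annotation can be sketched in user space as below.  The
macro names carry a _model suffix because they are stand-ins; the
attribute spellings follow my understanding of how sparse address-space
annotations are usually written and should be treated as an assumption,
as should struct fdtable_model, struct files_model, and
files_fdtable_model().  Under an ordinary compiler the annotations
expand to nothing, so the code builds and runs unchanged; under sparse
(which defines __CHECKER__) a direct dereference of the annotated
pointer would be flagged.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __CHECKER__
# define __rcu_model	__attribute__((noderef, address_space(4)))
# define __force_model	__attribute__((force))
#else
# define __rcu_model
# define __force_model
#endif

struct fdtable_model {
	unsigned int max_fds;
};

struct files_model {
	struct fdtable_model __rcu_model *fdt;	/* annotated like the patch */
};

/*
 * Stand-in for an rcu_dereference()-style accessor: load the pointer
 * once through a volatile access, then strip the annotation.  No memory
 * ordering is provided; this only models the type-checking aspect.
 */
static struct fdtable_model *files_fdtable_model(struct files_model *files)
{
	struct fdtable_model __rcu_model *p;

	assert(files != NULL);
	p = *(struct fdtable_model __rcu_model * volatile *)&files->fdt;
	return (struct fdtable_model __force_model *)p;
}
```

The design point is that all accesses are funneled through one accessor,
so the checker can tell annotated pointers from ordinary ones.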
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The current RCU CPU stall warnings remain enabled even after a panic
occurs, which some people have found to be a bit counterproductive.
This patch therefore uses a notifier to disable stall warnings once a
panic occurs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   24 ++++++++++++++++++++++++
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 3623f8e..595fb83 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -449,6 +449,8 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 
 #ifdef CONFIG_RCU_CPU_STALL_DETECTOR
 
+int rcu_cpu_stall_panicking __read_mostly;
+
 static void record_gp_stall_check_time(struct rcu_state *rsp)
 {
 	rsp->gp_start = jiffies;
@@ -526,6 +528,8 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	long delta;
 	struct rcu_node *rnp;
 
+	if (rcu_cpu_stall_panicking)
+		return;
 	delta = jiffies - rsp->jiffies_stall;
 	rnp = rdp->mynode;
 	if ((rnp->qsmask & rdp->grpmask) && delta >= 0) {
@@ -540,6 +544,21 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	}
 }
 
+static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
+{
+	rcu_cpu_stall_panicking = 1;
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block rcu_panic_block = {
+	.notifier_call = rcu_panic,
+};
+
+static void __init check_cpu_stall_init(void)
+{
+	atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
+}
+
 #else /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 static void record_gp_stall_check_time(struct rcu_state *rsp)
@@ -550,6 +569,10 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 {
 }
 
+static void __init check_cpu_stall_init(void)
+{
+}
+
 #endif /* #else #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */
 
 /*
@@ -1934,6 +1957,7 @@ void __init rcu_init(void)
 	cpu_notifier(rcu_cpu_notify, 0);
 	for_each_online_cpu(cpu)
 ...
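
[Editorial sketch, not part of the posted patch.]  The flag-plus-notifier
pattern in the diff above can be exercised in user space with a toy
notifier chain.  All names ending in _model, the fixed-size chain, and
the warning counter are assumptions for illustration; the kernel's
atomic notifier chain and panic_notifier_list are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NOTIFIERS 4	/* hypothetical chain capacity */

typedef int (*panic_notifier_fn)(unsigned long ev, void *ptr);

static panic_notifier_fn panic_chain_model[MAX_NOTIFIERS];
static size_t panic_chain_len;

static int rcu_cpu_stall_panicking_model;
static int stall_warnings_model;

/* Notifier callback: remember that a panic is in progress. */
static int rcu_panic_model(unsigned long ev, void *ptr)
{
	(void)ev;
	(void)ptr;
	rcu_cpu_stall_panicking_model = 1;
	return 0;
}

static void panic_notifier_register_model(panic_notifier_fn fn)
{
	assert(fn != NULL && panic_chain_len < MAX_NOTIFIERS);
	panic_chain_model[panic_chain_len++] = fn;
}

/* Stand-in for panic(): invoke every registered notifier. */
static void panic_model(void)
{
	size_t i;

	for (i = 0; i < panic_chain_len; i++)
		panic_chain_model[i](0, NULL);
}

/* Stall check: bail out early once a panic has been seen. */
static void check_cpu_stall_model(bool stalled)
{
	if (rcu_cpu_stall_panicking_model)
		return;
	if (stalled)
		stall_warnings_model++;
}
```

Once panic_model() has run, later stalls are silently ignored, which is
the behavior the commit log asks for.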
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

Because synchronize_rcu_bh() is identical to synchronize_sched(),
make the former a static inline invoking the latter, saving the
overhead of an EXPORT_SYMBOL_GPL() and the duplicate code.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    2 --
 include/linux/rcutiny.h  |   12 +++++++++++-
 include/linux/rcutree.h  |    2 ++
 kernel/rcutiny.c         |    9 ++-------
 4 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 02537a7..d8fb2ab 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -56,8 +56,6 @@ struct rcu_head {
 };
 
 /* Exported common interfaces */
-extern void synchronize_rcu_bh(void);
-extern void synchronize_sched(void);
 extern void rcu_barrier(void);
 extern void rcu_barrier_bh(void);
 extern void rcu_barrier_sched(void);
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index a519587..bbeb55b 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -74,7 +74,17 @@ static inline void rcu_sched_force_quiescent_state(void)
 {
 }
 
-#define synchronize_rcu synchronize_sched
+extern void synchronize_sched(void);
+
+static inline void synchronize_rcu(void)
+{
+	synchronize_sched();
+}
+
+static inline void synchronize_rcu_bh(void)
+{
+	synchronize_sched();
+}
 
 static inline void synchronize_rcu_expedited(void)
 {
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 42cc3a0..7484fe6 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -86,6 +86,8 @@ static inline void __rcu_read_unlock_bh(void)
 
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));
+extern void synchronize_rcu_bh(void);
+extern void synchronize_sched(void);
 extern void synchronize_rcu_expedited(void);
 
 static inline void synchronize_rcu_bh_expedited(void)
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 9f6d9ff..272c6d2 ...
From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

From: Arnd Bergmann <arnd@arndb.de>

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
 include/linux/cgroup.h |    4 ++--
 include/linux/sched.h  |    2 +-
 kernel/cgroup.c        |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 8f78073..b147fd5 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -75,7 +75,7 @@ struct cgroup_subsys_state {
 
 	unsigned long flags;
 	/* ID for this css, if possible */
-	struct css_id *id;
+	struct css_id __rcu *id;
 };
 
 /* bits in struct cgroup_subsys_state flags field */
@@ -205,7 +205,7 @@ struct cgroup {
 	struct list_head children;	/* my children */
 
 	struct cgroup *parent;		/* my parent */
-	struct dentry *dentry;	  	/* cgroup fs entry, RCU protected */
+	struct dentry __rcu *dentry;	/* cgroup fs entry, RCU protected */
 
 	/* Private pointers for each registered subsystem */
 	struct cgroup_subsys_state *subsys[CGROUP_SUBSYS_COUNT];
diff --git a/include/linux/sched.h b/include/linux/sched.h
index dad7f66..7307c74 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1429,7 +1429,7 @@ struct task_struct {
 #endif
 #ifdef CONFIG_CGROUPS
 	/* Control Group info protected by css_set_lock */
-	struct css_set *cgroups;
+	struct css_set __rcu *cgroups;
 	/* cg_list protected by css_set_lock and tsk->alloc_lock */
 	struct list_head cg_list;
 #endif
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3a53c77..5cfbc93 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -138,7 +138,7 @@ struct css_id {
 	 * is called after synchronize_rcu(). But for safe use, css_is_removed()
 	 * css_tryget() should be used for avoiding race.
 	 */
-	struct cgroup_subsys_state *css;
+	struct cgroup_subsys_state __rcu *css;
 	/*
 	 * ID of this css.
 	 */
-- 
1.7.0

--

From: Paul Menage
Date: Tuesday, May 4, 2010 - 1:48 pm

On Tue, May 4, 2010 at 1:19 PM, Paul E. McKenney

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 5:04 pm

Thank you, Paul!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The mce processing applies rcu_dereference_check() to integers used as
array indices.  This patch therefore moves mce to the new RCU API
rcu_dereference_index_check() that avoids the sparse processing that
would otherwise result in compiler errors.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
 arch/x86/kernel/cpu/mcheck/mce.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8a6f0af..f00a9f2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -50,7 +50,7 @@
 static DEFINE_MUTEX(mce_read_mutex);
 
 #define rcu_dereference_check_mce(p) \
-	rcu_dereference_check((p), \
+	rcu_dereference_index_check((p), \
 			      rcu_read_lock_sched_held() || \
 			      lockdep_is_held(&mce_read_mutex))
 
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The rcu_scheduler_active check has been wrapped into the new
debug_lockdep_rcu_enabled() function, so update the comments to
reflect this new reality.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index a150af0..02537a7 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -111,7 +111,8 @@ extern int debug_lockdep_rcu_enabled(void);
  * this assumes we are in an RCU read-side critical section unless it can
  * prove otherwise.
  *
- * Check rcu_scheduler_active to prevent false positives during boot.
+ * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
+ * and while lockdep is disabled.
  */
 static inline int rcu_read_lock_held(void)
 {
@@ -136,7 +137,8 @@ extern int rcu_read_lock_bh_held(void);
  * of preemption (including disabling irqs) counts as an RCU-sched
  * read-side critical section.
  *
- * Check rcu_scheduler_active to prevent false positives during boot.
+ * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
+ * and while lockdep is disabled.
  */
 #ifdef CONFIG_PREEMPT
 static inline int rcu_read_lock_sched_held(void)
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

Add a #include for mutex.h to allow SRCU to be more easily used in
kernel modules.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/srcu.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 9c01f10..4d5d2f5 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -27,6 +27,8 @@
 #ifndef _LINUX_SRCU_H
 #define _LINUX_SRCU_H
 
+#include <linux/mutex.h>
+
 struct srcu_struct_array {
 	int c[2];
 };
-- 
1.7.0

--

From: Paul E. McKenney
Date: Tuesday, May 4, 2010 - 1:19 pm

The comments refer to CONFIG_PROVE_LOCKING, but the option that matters
is CONFIG_DEBUG_LOCK_ALLOC, so fix them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   15 ++++++++-------
 include/linux/srcu.h     |    4 ++--
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 4dca275..a150af0 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -106,8 +106,8 @@ extern int debug_lockdep_rcu_enabled(void);
 /**
  * rcu_read_lock_held - might we be in RCU read-side critical section?
  *
- * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in
- * an RCU read-side critical section.  In absence of CONFIG_PROVE_LOCKING,
+ * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an RCU
+ * read-side critical section.  In absence of CONFIG_DEBUG_LOCK_ALLOC,
  * this assumes we are in an RCU read-side critical section unless it can
  * prove otherwise.
  *
@@ -129,11 +129,12 @@ extern int rcu_read_lock_bh_held(void);
 /**
  * rcu_read_lock_sched_held - might we be in RCU-sched read-side critical section?
  *
- * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in an
- * RCU-sched read-side critical section.  In absence of CONFIG_PROVE_LOCKING,
- * this assumes we are in an RCU-sched read-side critical section unless it
- * can prove otherwise.  Note that disabling of preemption (including
- * disabling irqs) counts as an RCU-sched read-side critical section.
+ * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
+ * RCU-sched read-side critical section.  In absence of
+ * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
+ * critical section unless it can prove otherwise.  Note that disabling
+ * of preemption (including disabling irqs) counts as an RCU-sched
+ * read-side critical section.
  *
  * Check rcu_scheduler_active to prevent false positives during boot.
  */
diff --git a/include/linux/srcu.h ...
From: David Howells
Date: Wednesday, May 5, 2010 - 3:14 am

That's unnecessary.  I posted a patch that removed the usage of RCU on that
pointer:

	Date: Tue, 20 Apr 2010 11:26:13 +0100
	Subject: NFS: Fix RCU issues in the NFSv4 delegation code

Did Trond take it?

David
--

From: Trond Myklebust
Date: Wednesday, May 5, 2010 - 5:44 am

Yes. I haven't pushed it to Linus yet, but I'm planning to do so in the
next 2 days.

Cheers
  Trond
--

From: Paul E. McKenney
Date: Wednesday, May 5, 2010 - 2:01 pm

I have dropped this change from this commit.  Are the rest of the changes
in this commit (nfs_fs.h and auth_gss.h) OK?

							Thanx, Paul
--
