[PATCH v2.6.34-rc5 07/12] KEYS: Fix an RCU warning in the reading of user keys


[    0.000000] Memory: 4004344k/5242880k available (5147k kernel code, 1084452k absent, 154084k reserved, 5943k data, 520k init)
[    0.000000] SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.000000]
[    0.000000] ===================================================
[    0.000000] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.000000] ---------------------------------------------------
[    0.000000] include/linux/cgroup.h:492 invoked rcu_dereference_check() without protection!
[    0.000000]
[    0.000000] other info that might help us debug this:
[    0.000000]
[    0.000000] 1 lock held by swapper/0:
[    0.000000]  #0:  (&rq->lock){......}, at: [<ffffffff814f9b4c>] init_idle+0x2b/0xb8
[    0.000000]
[    0.000000] stack backtrace:
[    0.000000] Pid: 0, comm: swapper Not tainted 2.6.33-git11 #1
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff81065e36>] lockdep_rcu_dereference+0x8c/0x94
[    0.000000]  [<ffffffff8102d147>] task_subsys_state+0x38/0x50
[    0.000000]  [<ffffffff8102d179>] set_task_rq+0x1a/0x4a
[    0.000000]  [<ffffffff814f9b96>] init_idle+0x75/0xb8
[    0.000000]  [<ffffffff81afe36a>] sched_init+0x49c/0x4ea
[    0.000000]  [<ffffffff81aebb85>] start_kernel+0x233/0x40c
[    0.000000]  [<ffffffff81aeb297>] x86_64_start_reservations+0xa7/0xab
[    0.000000]  [<ffffffff81aeb37f>] x86_64_start_kernel+0xe4/0xeb

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.33-git11
# Sat Mar  6 07:14:10 2010
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not ...

Hello, Miles,

This one is a false positive that is addressed by a set of boot-time
fixes that have recently gone into mainline, this series of commits:

	c598a070..71da8132


--


On Wed, Mar 10, 2010 at 11:28 PM, Paul E. McKenney wrote:

I know you indicated this was fixed in mainline, and I see that set of
commit objects, but I'm seeing the below spew from linux-next today.

tree: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
tag: next-20100412
commit: bbeecf185fe464ccd7ee97ce6d3646ad686995b4

[    0.035602] ===================================================
[    0.036003] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.037006] ---------------------------------------------------
[    0.038004] include/linux/cgroup.h:533 invoked rcu_dereference_check() without protection!
[    0.039003]
[    0.039004] other info that might help us debug this:
[    0.039004]
[    0.040003]
[    0.040004] rcu_scheduler_active = 1, debug_locks = 0
[    0.041004] no locks held by swapper/0.
[    0.042003]
[    0.042004] stack backtrace:
[    0.043005] Pid: 0, comm: swapper Not tainted 2.6.34-rc3-next-20100412+ #65
[    0.044003] Call Trace:
[    0.045015]  [<ffffffff8108584f>] lockdep_rcu_dereference+0xaf/0xc0
[    0.046010]  [<ffffffff81044812>] set_task_cpu+0x2d2/0x370
[    0.047009]  [<ffffffff814dfef5>] ? _raw_spin_unlock_irqrestore+0x65/0x80
[    0.048006]  [<ffffffff81087aa0>] ? trace_hardirqs_on_caller+0x120/0x1a0
[    0.049006]  [<ffffffff81087b2d>] ? trace_hardirqs_on+0xd/0x10
[    0.050006]  [<ffffffff814dfed9>] ? _raw_spin_unlock_irqrestore+0x49/0x80
[    0.051005]  [<ffffffff8104a7a6>] ? task_fork_fair+0xc6/0x390
[    0.052005]  [<ffffffff810497b4>] sched_fork+0x74/0x170
[    0.053006]  [<ffffffff81054a3f>] copy_process+0x62f/0x11e0
[    0.054006]  [<ffffffff810882bd>] ? validate_chain+0x4fd/0x1360
[    0.055005]  [<ffffffff810556ae>] do_fork+0xbe/0x3e0
[    0.056008]  [<ffffffff81012519>] ? sched_clock+0x9/0x10
[    0.057008]  [<ffffffff81077485>] ? sched_clock_local+0x15/0x80
[    0.058005]  [<ffffffff810775ab>] ? sched_clock_cpu+0xbb/0xf0
[    0.059005]  [<ffffffff81076415>] ? up+0x35/0x50
[    0.060005]  [<ffffffff81083623>] ? ...

Oh, right, I still have to sort that out.

I need to figure out how all that scheduler and cgroup muck interact to
fix this.
--


I think the below should cure this..


Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 3acf694..2e06d87 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -323,6 +323,15 @@ static inline struct task_group *task_group(struct task_struct *p)
 /* Change a task's cfs_rq and parent entity if it moves across CPUs/groups */
 static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 {
+	/*
+	 * Strictly speaking this rcu_read_lock() is not needed since the
+	 * task_group is tied to the cgroup, which in turn can never go away
+	 * as long as there are tasks attached to it.
+	 *
+	 * However since task_group() uses task_subsys_state() which is an
+	 * rcu_dereference() user, this quiets CONFIG_PROVE_RCU.
+	 */
+	rcu_read_lock();
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
 	p->se.parent = task_group(p)->se[cpu];
@@ -332,6 +341,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 	p->rt.rt_rq  = task_group(p)->rt_rq[cpu];
 	p->rt.parent = task_group(p)->rt_se[cpu];
 #endif
+	rcu_read_unlock();
 }
 
 #else


--


Should be [tip:core/urgent]

Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>


--


So I'm back with another one even with this patch.  Would people prefer
another thread?

[    0.037175] ===================================================
[    0.038003] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.039003] ---------------------------------------------------
[    0.040004] include/linux/cgroup.h:533 invoked rcu_dereference_check() without protection!
[    0.041003]
[    0.041004] other info that might help us debug this:
[    0.041005]
[    0.042004]
[    0.042004] rcu_scheduler_active = 1, debug_locks = 0
[    0.043004] no locks held by swapper/0.
[    0.044003]
[    0.044004] stack backtrace:
[    0.045005] Pid: 0, comm: swapper Not tainted 2.6.34-rc4-next-20100415+ #94
[    0.046004] Call Trace:
[    0.047014]  [<ffffffff8108652f>] lockdep_rcu_dereference+0xaf/0xc0
[    0.048013]  [<ffffffff810a3453>] freezer_fork+0xb3/0xd0
[    0.049007]  [<ffffffff8109d61c>] cgroup_fork_callbacks+0x2c/0x40
[    0.050007]  [<ffffffff81055e4a>] copy_process+0xb6a/0x11e0
[    0.051006]  [<ffffffff8105657e>] do_fork+0xbe/0x3e0
[    0.052007]  [<ffffffff81012519>] ? sched_clock+0x9/0x10
[    0.053008]  [<ffffffff81077d45>] ? sched_clock_local+0x15/0x80
[    0.054006]  [<ffffffff81077e69>] ? sched_clock_cpu+0xb9/0xf0
[    0.055006]  [<ffffffff81076cd5>] ? up+0x35/0x50
[    0.056006]  [<ffffffff81084073>] ? get_lock_stats+0x23/0x70
[    0.057006]  [<ffffffff810840ce>] ? put_lock_stats+0xe/0x30
[    0.058010]  [<ffffffff81cade20>] ? kernel_init+0x0/0x2e0
[    0.059006]  [<ffffffff810136dd>] kernel_thread+0x8d/0xa0
[    0.060006]  [<ffffffff81cade20>] ? kernel_init+0x0/0x2e0
[    0.061007]  [<ffffffff8100bc20>] ? kernel_thread_helper+0x0/0x10
[    0.062006]  [<ffffffff81cad140>] ? early_idt_handler+0x0/0x71
[    0.063011]  [<ffffffff814e40c1>] rest_init+0x21/0x110
[    0.064005]  [<ffffffff81cadd3f>] start_kernel+0x3af/0x490
[    0.065006]  [<ffffffff81cad29c>] x86_64_start_reservations+0x7c/0xd0
[    0.066005]  [<ffffffff81cad000>] ? ...

Yep, different code path to the same location.  Does the following
patch help?

							Thanx, Paul

------------------------------------------------------------------------

commit 2836f18139267ea918ed2cf39023fb0eb38c4361
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Apr 19 15:59:50 2010 -0700

    rcu: fix RCU lockdep splat on freezer_fork path
    
    Add an RCU read-side critical section to suppress this false positive.
    
    Located-by: Eric Paris <eparis@parisplace.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index da5e139..e5c0244 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -205,9 +205,12 @@ static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
 	 * No lock is needed, since the task isn't on tasklist yet,
 	 * so it can't be moved to another cgroup, which means the
 	 * freezer won't be removed and will be valid during this
-	 * function call.
+	 * function call.  Nevertheless, apply RCU read-side critical
+	 * section to suppress RCU lockdep false positives.
 	 */
+	rcu_read_lock();
 	freezer = task_freezer(task);
+	rcu_read_unlock();
 
 	/*
 	 * The root cgroup is non-freezable, so we can skip the
--


That one is also fixed, so feel free to add a Tested-by or something from
me.  But we've got another, weeeee!  Is there some way I could get all
of these at once?

[    0.045164] ===================================================
[    0.047002] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.048002] ---------------------------------------------------
[    0.049012] include/linux/cgroup.h:533 invoked rcu_dereference_check() without protection!
[    0.051087] 
[    0.051088] other info that might help us debug this:
[    0.051088] 
[    0.057011] 
[    0.057011] rcu_scheduler_active = 1, debug_locks = 0
[    0.059046] no locks held by watchdog/0/5.
[    0.060002] 
[    0.060003] stack backtrace:
[    0.061003] Pid: 5, comm: watchdog/0 Not tainted 2.6.34-rc4-next-20100415+ #104
[    0.063011] Call Trace:
[    0.063654]  [<ffffffff8108652f>] lockdep_rcu_dereference+0xaf/0xc0
[    0.065009]  [<ffffffff810b53f0>] ? watchdog+0x0/0xa0
[    0.066006]  [<ffffffff8104d01b>] __sched_setscheduler+0x3ab/0x430
[    0.067005]  [<ffffffff810b53f0>] ? watchdog+0x0/0xa0
[    0.068004]  [<ffffffff8104d0be>] sched_setscheduler+0xe/0x10
[    0.070009]  [<ffffffff810b541a>] watchdog+0x2a/0xa0
[    0.071004]  [<ffffffff810b53f0>] ? watchdog+0x0/0xa0
[    0.072007]  [<ffffffff810714ec>] kthread+0xac/0xc0
[    0.073005]  [<ffffffff81087ff0>] ? trace_hardirqs_on_caller+0xc0/0x240
[    0.074006]  [<ffffffff8100bc24>] kernel_thread_helper+0x4/0x10
[    0.075005]  [<ffffffff814fee54>] ? restore_args+0x0/0x30
[    0.076004]  [<ffffffff81071440>] ? kthread+0x0/0xc0
[    0.077004]  [<ffffffff8100bc20>] ? kernel_thread_helper+0x0/0x10



--


Sure!  I -think- that if you remove the first "if" statement in
lockdep_rcu_dereference() in kernel/lockdep.c, you will get lots of them
all at once.  Maybe more than your console log is able to hold...

So another approach would be to print only the first 100 or some such.

It -looks- to me that you could make __debug_locks_off() atomically
decrement a counter rather than just setting it to zero, see
include/linux/debug_locks.h.  I suspect that atomic_dec_not_zero()
would work very well for you here.

Peter probably knows a better approach, but those would work.

--
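
As a rough userspace illustration of the "print only the first 100" idea
above: this is a sketch, not kernel code.  The names warn_budget,
dec_if_positive() and report_suspicious() are hypothetical, the budget of
3 is arbitrary, and C11 atomics stand in for the kernel's atomic_t.  The
counter is decremented only while it is still positive, so reports stop
once the budget is used up:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical budget: at most this many reports are printed. */
static atomic_int warn_budget = 3;

/*
 * Decrement *v only if it is currently positive.
 * Returns true if one unit of budget was consumed.
 */
static bool dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Returns true if the report was printed, false if it was suppressed. */
static bool report_suspicious(const char *file, int line)
{
	if (!dec_if_positive(&warn_budget))
		return false;
	printf("%s:%d invoked a checked dereference without protection!\n",
	       file, line);
	return true;
}
```

Unlike a plain decrement, the compare-and-swap loop never drives the
counter below zero, so concurrent callers cannot re-enable reporting by
wrapping the value around.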


Right, so because the RCU checking stuff doesn't poke at the lockdep
class chains, but simply validates the current task state we can delay
or altogether not disable lockdep.

So something like the below, or simply remove that debug_locks_off()
thing altogether.

---
 kernel/lockdep.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 78325f8..31c8738 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -3784,12 +3784,16 @@ void lockdep_sys_exit(void)
 	}
 }
 
+static atomic_t rcu_warns = ATOMIC_INIT(100);
+
 void lockdep_rcu_dereference(const char *file, const int line)
 {
 	struct task_struct *curr = current;
 
-	if (!debug_locks_off())
-		return;
+	if (!debug_locks || atomic_dec_and_test(&rcu_warns)) {
+		if (!debug_locks_off())
+			return;
+	}
 	printk("\n===================================================\n");
 	printk(  "[ INFO: suspicious rcu_dereference_check() usage. ]\n");
 	printk(  "---------------------------------------------------\n");


--

From: Lai Jiangshan
Date: Tuesday, April 20, 2010 - 1:23 am

[PATCH] RCU: don't turn off lockdep when find suspicious rcu_dereference_check() usage

When suspicious rcu_dereference_check() usage is detected, lockdep is
still usable, so we should not call debug_locks_off() in
lockdep_rcu_dereference().

To avoid excessive "suspicious rcu_dereference_check() usage" output
once the "if (!debug_locks_off())" statement is removed, this patch adds
a static '__warned' variable for every use of "rcu_dereference*()".

Because there is one variable per use, we now get one complaint for each
distinct suspicious rcu_dereference_check() usage that is detected.

Requested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 9f1ddfe..30b8d20 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -193,6 +193,15 @@ static inline int rcu_read_lock_sched_held(void)
 
 #ifdef CONFIG_PROVE_RCU
 
+#define __do_rcu_dereference_check(c)					\
+	do {								\
+		static bool __warned;					\
+		if (debug_lockdep_rcu_enabled() && !__warned && !(c)) {	\
+			__warned = true;				\
+			lockdep_rcu_dereference(__FILE__, __LINE__);	\
+		}							\
+	} while (0)
+
 /**
  * rcu_dereference_check - rcu_dereference with debug checking
  * @p: The pointer to read, prior to dereferencing
@@ -222,8 +231,7 @@ static inline int rcu_read_lock_sched_held(void)
  */
 #define rcu_dereference_check(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		rcu_dereference_raw(p); \
 	})
 
@@ -240,8 +248,7 @@ static inline int rcu_read_lock_sched_held(void)
  */
 #define rcu_dereference_protected(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		(p); \
 	})
 
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 78325f8..cc52ffe ...
From: Peter Zijlstra
Date: Tuesday, April 20, 2010 - 1:36 am

Ah indeed, very nice.

--
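
The once-per-call-site behaviour of the __do_rcu_dereference_check()
macro in Lai's patch can be tried out in userspace with a small sketch.
Everything here is an illustrative stand-in: CHECK_ONCE(), report(),
site_a() and site_b() are hypothetical names, and the
debug_lockdep_rcu_enabled() test is left out.  The point is only that
each macro expansion gets its own static flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int reports;	/* number of reports printed so far */

static void report(const char *file, int line)
{
	reports++;
	printf("%s:%d: check failed\n", file, line);
}

/*
 * Each expansion of this macro declares its own static flag, so every
 * call site reports at most once, independently of other call sites.
 */
#define CHECK_ONCE(cond)						\
	do {								\
		static bool warned_;					\
		if (!warned_ && !(cond)) {				\
			warned_ = true;					\
			report(__FILE__, __LINE__);			\
		}							\
	} while (0)

/* Two distinct call sites, each with a separate warned_ flag. */
static void site_a(bool held) { CHECK_ONCE(held); }
static void site_b(bool held) { CHECK_ONCE(held); }
```

Calling site_a(false) twice prints one report; a later site_b(false)
still prints its own, which matches Eric's goal of seeing every distinct
complaint in a single boot.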

From: Eric Paris
Date: Tuesday, April 20, 2010 - 5:31 am

Although mine was a linux-next kernel and it doesn't appear that I have
rcu_dereference_protected() at all, so I dropped that bit of the patch,
it worked great!  I got 4 more complaints to harass people with.  Feel
free to add my tested by if you care to.

Tested-by: Eric Paris <eparis@redhat.com>

--

From: Paul E. McKenney
Date: Tuesday, April 20, 2010 - 6:28 am

Very nice!!!  Queued for urgent, thank you Lai and Eric!!!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, April 20, 2010 - 6:52 am

I will be sending a patchset out later today after testing, but
please see below for a sneak preview collapsed into a single patch.


diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 07db2fe..ec9ab49 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -190,6 +190,15 @@ static inline int rcu_read_lock_sched_held(void)
 
 #ifdef CONFIG_PROVE_RCU
 
+#define __do_rcu_dereference_check(c)					\
+	do {								\
+		static bool __warned;					\
+		if (debug_lockdep_rcu_enabled() && !__warned && !(c)) {	\
+			__warned = true;				\
+			lockdep_rcu_dereference(__FILE__, __LINE__);	\
+		}							\
+	} while (0)
+
 /**
  * rcu_dereference_check - rcu_dereference with debug checking
  * @p: The pointer to read, prior to dereferencing
@@ -219,8 +228,7 @@ static inline int rcu_read_lock_sched_held(void)
  */
 #define rcu_dereference_check(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		rcu_dereference_raw(p); \
 	})
 
@@ -237,8 +245,7 @@ static inline int rcu_read_lock_sched_held(void)
  */
 #define rcu_dereference_protected(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		(p); \
 	})
 
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index da5e139..e5c0244 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -205,9 +205,12 @@ static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
 	 * No lock is needed, since the task isn't on tasklist yet,
 	 * so it can't be moved to another cgroup, which means the
 	 * freezer won't be removed and will be valid during this
-	 * function call.
+	 * function call.  Nevertheless, apply RCU read-side critical
+	 * section to suppress RCU lockdep false positives.
 	 */
+	rcu_read_lock();
 	freezer = task_freezer(task);
+	rcu_read_unlock();
 ...
From: Miles Lane
Date: Tuesday, April 20, 2010 - 8:38 am

Excellent.  Here are the results on my machine.  .config appended.

[    0.177300] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.177428] ---------------------------------------------------
[    0.177557] include/linux/cgroup.h:533 invoked rcu_dereference_check() without protection!
[    0.177760]
[    0.177761] other info that might help us debug this:
[    0.177762]
[    0.178123]
[    0.178124] rcu_scheduler_active = 1, debug_locks = 1
[    0.178369] no locks held by watchdog/0/5.
[    0.178493]
[    0.178494] stack backtrace:
[    0.178735] Pid: 5, comm: watchdog/0 Not tainted 2.6.34-rc5 #18
[    0.178863] Call Trace:
[    0.178994]  [<ffffffff81067fc2>] lockdep_rcu_dereference+0x9d/0xa5
[    0.179127]  [<ffffffff8102d667>] task_subsys_state+0x48/0x60
[    0.179259]  [<ffffffff810328e5>] __sched_setscheduler+0x19d/0x300
[    0.179392]  [<ffffffff8102b477>] ? need_resched+0x1e/0x28
[    0.179523]  [<ffffffff813cd501>] ? schedule+0x643/0x66e
[    0.179653]  [<ffffffff81091903>] ? watchdog+0x0/0x8c
[    0.179783]  [<ffffffff81032a63>] sched_setscheduler+0xe/0x10
[    0.179913]  [<ffffffff8109192d>] watchdog+0x2a/0x8c
[    0.180010]  [<ffffffff81091903>] ? watchdog+0x0/0x8c
[    0.180142]  [<ffffffff8105713e>] kthread+0x89/0x91
[    0.180272]  [<ffffffff81068922>] ? trace_hardirqs_on_caller+0x114/0x13f
[    0.180405]  [<ffffffff81003994>] kernel_thread_helper+0x4/0x10
[    0.180537]  [<ffffffff813cfcc0>] ? restore_args+0x0/0x30
[    0.180667]  [<ffffffff810570b5>] ? kthread+0x0/0x91
[    0.180796]  [<ffffffff81003990>] ? kernel_thread_helper+0x0/0x10

[    3.116754] [ INFO: suspicious rcu_dereference_check() usage. ]
[    3.116754] ---------------------------------------------------
[    3.116754] kernel/cgroup.c:4432 invoked rcu_dereference_check() without protection!
[    3.116754]
[    3.116754] other info that might help us debug this:
[    3.116754]
[    3.116754]
[    3.116754] rcu_scheduler_active = 1, debug_locks = 1
[    3.116754] 2 locks held by ...
From: Borislav Petkov
Date: Tuesday, April 20, 2010 - 11:04 pm

Hi,

a plain -rc5 triggers at net/core/dev.c:1993 here too:

[   12.889090] ===================================================
[   12.889387] [ INFO: suspicious rcu_dereference_check() usage. ]
[   12.889533] ---------------------------------------------------
[   12.889679] net/core/dev.c:1993 invoked rcu_dereference_check() without protection!
[   12.889929] 
[   12.889929] other info that might help us debug this:
[   12.889930] 
[   12.890368] 
[   12.890369] rcu_scheduler_active = 1, debug_locks = 0
[   12.890659] 2 locks held by swapper/0:
[   12.890803]  #0:  (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff81045f4a>] run_timer_softirq+0x266/0x503
[   12.891227]  #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81397eb4>] dev_queue_xmit+0x153/0x512
[   12.891647] 
[   12.891648] stack backtrace:
[   12.891934] Pid: 0, comm: swapper Not tainted 2.6.34-rc5 #1
[   12.892085] Call Trace:
[   12.892231]  <IRQ>  [<ffffffff81065d8f>] lockdep_rcu_dereference+0xaa/0xb2
[   12.892430]  [<ffffffff81397fbf>] dev_queue_xmit+0x25e/0x512
[   12.892576]  [<ffffffff81397eb4>] ? dev_queue_xmit+0x153/0x512
[   12.892723]  [<ffffffff81066a4a>] ? trace_hardirqs_on+0xd/0xf
[   12.892871]  [<ffffffff8103f4fb>] ? local_bh_enable_ip+0xbc/0xda
[   12.893024]  [<ffffffff8139ea67>] neigh_resolve_output+0x323/0x36a
[   12.893183]  [<ffffffffa00ae6b7>] ? ipv6_chk_mcast_addr+0x0/0x1fa [ipv6]
[   12.893338]  [<ffffffffa0094860>] ip6_output_finish+0x81/0xb9 [ipv6]
[   12.893492]  [<ffffffffa0096067>] ip6_output2+0x2a9/0x2b4 [ipv6]
[   12.893644]  [<ffffffffa0096c33>] ip6_output+0xbc1/0xbd0 [ipv6]
[   12.893797]  [<ffffffffa00a2a06>] ? fib6_force_start_gc+0x30/0x32 [ipv6]
[   12.893951]  [<ffffffffa00b04e8>] mld_sendpack+0x30b/0x435 [ipv6]
[   12.894109]  [<ffffffffa00b01dd>] ? mld_sendpack+0x0/0x435 [ipv6]
[   12.894264]  [<ffffffff8106676d>] ? mark_held_locks+0x52/0x70
[   12.894418]  [<ffffffffa00b0d2d>] mld_ifc_timer_expire+0x254/0x28d [ipv6]
[   12.894570]  [<ffffffff81046065>] ...
From: Paul E. McKenney
Date: Wednesday, April 21, 2010 - 2:45 pm

Thank you for testing this, Borislav!  A prototype patch may be
found below, CCing the netdev guys for their thoughts on this.
(And if this patch is valid, it is probably best for it to go up with
the networking patches, since this part of the code looks to be under
active development.)




commit df3f17af2d26d1451a3d23d5c7b7a6423a38569e
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Apr 21 14:40:37 2010 -0700

    net: fix dev_pick_tx() to use rcu_dereference_bh()
    
    dev_pick_tx() is called in an RCU-bh read-side critical section, but
    uses rcu_dereference().  This patch changes to rcu_dereference_bh() in
    order to suppress the RCU lockdep splat.
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/net/core/dev.c b/net/core/dev.c
index 92584bf..f769098 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1990,7 +1990,7 @@ static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 				queue_index = skb_tx_hash(dev, skb);
 
 			if (sk) {
-				struct dst_entry *dst = rcu_dereference(sk->sk_dst_cache);
+				struct dst_entry *dst = rcu_dereference_bh(sk->sk_dst_cache);
 
 				if (dst && skb_dst(skb) == dst)
 					sk_tx_queue_set(sk, queue_index);
--
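
The commit message above describes a flavor mismatch: the caller is in
an RCU-bh read-side critical section, but the accessor checks for the
other flavor.  A toy userspace model of that mismatch might look like
the following.  All names are hypothetical, and simple depth counters
stand in for the state that lockdep tracks in the kernel:

```c
#include <assert.h>

static int rcu_depth;		/* nesting of "plain" read-side sections */
static int rcu_bh_depth;	/* nesting of "bh" read-side sections */
static int splats;		/* number of failed checks */

static void toy_lock(void)      { rcu_depth++; }
static void toy_unlock(void)    { assert(rcu_depth > 0); rcu_depth--; }
static void toy_lock_bh(void)   { rcu_bh_depth++; }
static void toy_unlock_bh(void) { assert(rcu_bh_depth > 0); rcu_bh_depth--; }

/* Count a splat when cond is false, then yield the pointer either way. */
#define TOY_DEREF_CHECK(p, cond) \
	(((cond) ? (void)0 : (void)splats++), (p))

#define toy_deref(p)	TOY_DEREF_CHECK(p, rcu_depth > 0)
#define toy_deref_bh(p)	TOY_DEREF_CHECK(p, rcu_bh_depth > 0)

/* Mismatched: bh section held, plain accessor used. */
static int *toy_pick_tx_wrong(int **slot)
{
	int *p;

	toy_lock_bh();
	p = toy_deref(*slot);
	toy_unlock_bh();
	return p;
}

/* Matched: bh section held, bh accessor used. */
static int *toy_pick_tx_right(int **slot)
{
	int *p;

	toy_lock_bh();
	p = toy_deref_bh(*slot);
	toy_unlock_bh();
	return p;
}
```

Both functions return the same pointer; only the mismatched one
increments splats, which mirrors how the one-word change in the patch
alters the check rather than the value that is read.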

From: Paul E. McKenney
Date: Wednesday, April 21, 2010 - 2:35 pm

I have a prototype patch for this way down below, but someone who knows
more about CONFIG_RT_GROUP_SCHED than I do should look it over.  In the

I cannot convince myself that the above access is safe.  Vivek, Nauman,

This looks like an rcu_dereference() needs to instead be
rcu_dereference_bh(), but the line numbering in my version of
net/core/dev.c does not match yours.  CCing netdev, hopefully

This one looks to be an update-side reference protected by dev->struct_mutex,
but there is no obvious way to get that information to the pair
of rcu_dereference() calls in for_each_sta_info().  Besides which,
I am not 100% certain that this one is really only a false positive.
Especially given that the next one looks similar, but uses a different
lock.



This one appears to be a case of missing rcu_read_lock(), but it is
not clear to me at what level it needs to go.


commit d3b8ba1bde9afb7d50cf0712f9d95317ea66c06f
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Apr 21 14:04:56 2010 -0700

    sched: protect __sched_setscheduler() access to cgroups
    
    A given task's cgroups structures must remain while that task is running
    due to reference counting, so this is presumably a false positive.
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/sched.c b/kernel/sched.c
index 14c44ec..1d43c1a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4575,9 +4575,11 @@ recheck:
 		 * Do not allow realtime tasks into groups that have no runtime
 		 * assigned.
 		 */
+		rcu_read_lock();
 		if (rt_bandwidth_enabled() && rt_policy(policy) &&
 				task_group(p)->rt_bandwidth.rt_runtime == 0)
 			return -EPERM;
+		rcu_read_unlock();
 #endif
 
 		retval = security_task_setscheduler(p, policy, param);
--

From: Paul E. McKenney
Date: Wednesday, April 21, 2010 - 2:48 pm

And as Tetsuo Handa pointed out privately, my patch was way broken.

Here is an updated version.

							Thanx, Paul

commit b15e561ed91b7a366c3cc635026f3b9ce6483070
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Apr 21 14:04:56 2010 -0700

    sched: protect __sched_setscheduler() access to cgroups
    
    A given task's cgroups structures must remain while that task is running
    due to reference counting, so this is presumably a false positive.
    Updated to reflect feedback from Tetsuo Handa.
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/sched.c b/kernel/sched.c
index 14c44ec..f425a2b 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4575,9 +4575,13 @@ recheck:
 		 * Do not allow realtime tasks into groups that have no runtime
 		 * assigned.
 		 */
+		rcu_read_lock();
 		if (rt_bandwidth_enabled() && rt_policy(policy) &&
-				task_group(p)->rt_bandwidth.rt_runtime == 0)
+				task_group(p)->rt_bandwidth.rt_runtime == 0) {
+			rcu_read_unlock();
 			return -EPERM;
+		}
+		rcu_read_unlock();
 #endif
 
 		retval = security_task_setscheduler(p, policy, param);
--
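
The difference between the two versions of the patch above is the early
return.  A minimal userspace sketch of that control flow, with a
hypothetical depth counter standing in for the RCU read-side nesting
level and hypothetical function names throughout, shows why the first
version leaves the critical section open:

```c
#include <assert.h>
#include <errno.h>

static int read_depth;	/* stand-in for read-side nesting depth */

static void toy_read_lock(void)   { read_depth++; }
static void toy_read_unlock(void) { assert(read_depth > 0); read_depth--; }

/* Shape of the first patch: the early return skips the unlock. */
static int check_broken(int rt_runtime)
{
	toy_read_lock();
	if (rt_runtime == 0)
		return -EPERM;	/* read_depth is still 1 here */
	toy_read_unlock();
	return 0;
}

/* Shape of the updated patch: every exit path unlocks first. */
static int check_fixed(int rt_runtime)
{
	toy_read_lock();
	if (rt_runtime == 0) {
		toy_read_unlock();
		return -EPERM;
	}
	toy_read_unlock();
	return 0;
}
```

After check_fixed() returns, read_depth is back to zero on both paths;
after check_broken(0), it stays at one.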

From: Paul E. McKenney
Date: Wednesday, April 21, 2010 - 3:14 pm

Thank you for chasing this down, Eric Dumazet!

Eric Biederman, any enlightenment?

--

From: Eric W. Biederman
Date: Wednesday, April 21, 2010 - 4:26 pm

That change to twsk_net probably should have come in
575f4cd5a5b639457747434dbe18d175fa767db4.  The point was to make
twsk_net usable in an rcu context, instead of requiring a lock. 

Should it become rcu_dereference_raw now that we have lockdep support?

commit 575f4cd5a5b639457747434dbe18d175fa767db4
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Dec 3 02:29:08 2009 +0000

    net: Use rcu lookups in inet_twsk_purge.
    
    While we are looking up entries to free there is no reason to take
    the lock in inet_twsk_purge.  We have to drop locks and restart
    occassionally anyway so adding a few more in case we get on the
    wrong list because of a timewait move is no big deal.  At the
    same time not taking the lock for long periods of time is much
    more polite to the rest of the users of the hash table.
    
    In my test configuration of killing 4k network namespaces
    this change causes 4k back to back runs of inet_twsk_purge on an
    empty hash table to go from roughly 20.7s to 3.3s, and the total
    time to destroy 4k network namespaces goes from roughly 44s to
    3.3s.
    
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>



Eric
--

From: Vivek Goyal
Date: Thursday, April 22, 2010 - 7:56 am

On Wed, Apr 21, 2010 at 02:35:43PM -0700, Paul E. McKenney wrote:


Hi Paul,

blkiocg_add_blkio_group() is called from two paths.

First one is following. This path should be safe as it takes rcu read
lock.

cfq_get_cfqg()
	rcu_read_lock()
	cfq_find_alloc_cfqg()
		blkiocg_add_blkio_group()
	rcu_read_unlock()

Second one is as shown in above backtrace.

cfq_init_queue()
	blkiocg_add_blkio_group().

This path is called at request queue and cfq initialization time and
we access only root cgroup (root blkio_cgroup). As root cgroup can't
go away, do we have to protect that call also using rcu_read_lock()?

So I guess it is not unsafe, but we probably need to fix the warning.
Should I wrap the second call to blkiocg_add_blkio_group() in an
rcu_read_lock()/rcu_read_unlock() pair?

Thanks
Vivek
--

From: Paul E. McKenney
Date: Thursday, April 22, 2010 - 9:01 am

You are correct, if the root cgroup cannot go away and if we only access

That would work very well!

							Thanx, Paul
--

From: Miles Lane
Date: Friday, April 23, 2010 - 5:50 am

Hi Paul,
There has been a bit of back and forth, and I am not sure what patches
I should test now.
Could you send me a bundle of whatever needs testing now?

I currently have a build of 2.6.34-rc5-git3 with the same patch I
tested before applied.
I notice a few minor differences in the warnings given.  I suspect
these do not indicate
new issues, since the trace from <IRQ> through <EOI> is the same as before.

[   60.174809] [ INFO: suspicious rcu_dereference_check() usage. ]
[   60.174812] ---------------------------------------------------
[   60.174816] net/mac80211/sta_info.c:886 invoked rcu_dereference_check() without protection!
[   60.174820]
[   60.174821] other info that might help us debug this:
[   60.174822]
[   60.174825]
[   60.174826] rcu_scheduler_active = 1, debug_locks = 1
[   60.174829] no locks held by wpa_supplicant/3973.
[   60.174832]
[   60.174833] stack backtrace:
[   60.174838] Pid: 3973, comm: wpa_supplicant Not tainted 2.6.34-rc5-git3 #19
[   60.174841] Call Trace:
[   60.174844]  <IRQ>  [<ffffffff81067faa>] lockdep_rcu_dereference+0x9d/0xa5
[   60.174873]  [<ffffffffa014e9ae>] ieee80211_find_sta_by_hw+0x46/0x10f [mac80211]
[   60.174886]  [<ffffffffa014ea8e>] ieee80211_find_sta+0x17/0x19 [mac80211]
[   60.174902]  [<ffffffffa01a60f2>] iwl_tx_queue_reclaim+0xdb/0x1b1 [iwlcore]
[   60.174909]  [<ffffffff81068417>] ? mark_lock+0x2d/0x235
[   60.174920]  [<ffffffffa01d5f1c>] iwl5000_rx_reply_tx+0x4a9/0x556 [iwlagn]
[   60.174927]  [<ffffffff8120a2d3>] ? is_swiotlb_buffer+0x2e/0x3b
[   60.174936]  [<ffffffffa01cebf4>] iwl_rx_handle+0x163/0x2b5 [iwlagn]
[   60.174943]  [<ffffffff810688f0>] ? trace_hardirqs_on_caller+0xfa/0x13f
[   60.174952]  [<ffffffffa01cf3ac>] iwl_irq_tasklet+0x2bb/0x3c0 [iwlagn]
[   60.174959]  [<ffffffff810411df>] tasklet_action+0xa7/0x10f
[   60.174965]  [<ffffffff810421f1>] __do_softirq+0x144/0x252
[   60.174972]  [<ffffffff81003a8c>] call_softirq+0x1c/0x34
[   60.174977]  [<ffffffff810050e4>] do_softirq+0x38/0x80
[   ...
From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:42 pm

Hello, Miles,

I am posting my set as replies to this message.  There are a couple
of KVM fixes that are going up via Avi's tree, and a number of networking
fixes that are going up via Dave Miller's tree -- a number of these
are against quickly changing code, so it didn't make sense for me to
keep them separately.

I believe that the two splats below are addressed by this patch set
carried in the networking tree:

	https://patchwork.kernel.org/patch/90754/

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Li Zefan <lizf@cn.fujitsu.com>

Expand task_subsys_state()'s rcu_dereference_check() to include the full
locking rule as documented in Documentation/cgroups/cgroups.txt by adding
a check for task->alloc_lock being held.

This fixes an RCU false positive when resuming from suspend. The warning
comes from freezer cgroup in cgroup_freezing_or_frozen().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/cgroup.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b8ad1ea..8f78073 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -530,6 +530,7 @@ static inline struct cgroup_subsys_state *task_subsys_state(
 {
 	return rcu_dereference_check(task->cgroups->subsys[subsys_id],
 				     rcu_read_lock_held() ||
+				     lockdep_is_held(&task->alloc_lock) ||
 				     cgroup_lock_is_held());
 }
 
-- 
1.7.0

--
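
The effect of adding one more alternative to the check condition in the
patch above can be modelled in userspace.  This is only a sketch:
struct toy_ctx and both predicates are hypothetical, and plain booleans
replace the lockdep queries (rcu_read_lock_held(), lockdep_is_held(),
cgroup_lock_is_held()) that the kernel code uses:

```c
#include <assert.h>
#include <stdbool.h>

/* Which forms of protection the caller currently holds (toy model). */
struct toy_ctx {
	bool rcu_read_held;
	bool alloc_lock_held;
	bool cgroup_lock_held;
};

/* Condition before the patch: two accepted forms of protection. */
static bool access_ok_old(const struct toy_ctx *c)
{
	return c->rcu_read_held || c->cgroup_lock_held;
}

/* Condition after the patch: the task's alloc_lock is accepted too. */
static bool access_ok_new(const struct toy_ctx *c)
{
	return c->rcu_read_held ||
	       c->alloc_lock_held ||
	       c->cgroup_lock_held;
}
```

A caller that holds only the task's alloc_lock fails the old condition
and passes the new one, so the warning disappears without any change to
the caller.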

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

This patch fixes task_in_mem_cgroup(), mem_cgroup_uncharge_swapcache(),
mem_cgroup_move_swap_account(), and is_target_pte_for_mc() to protect
calls to css_id().  An additional RCU lockdep splat was reported for
memcg_oom_wake_function(), however, this function is not yet in
mainline as of 2.6.34-rc5.

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 mm/memcontrol.c |   21 ++++++++++++++++-----
 1 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f4ede99..e06490d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -811,10 +811,12 @@ int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem)
 	 * enabled in "curr" and "curr" is a child of "mem" in *cgroup*
 	 * hierarchy(even if use_hierarchy is disabled in "mem").
 	 */
+	rcu_read_lock();
 	if (mem->use_hierarchy)
 		ret = css_is_ancestor(&curr->css, &mem->css);
 	else
 		ret = (curr == mem);
+	rcu_read_unlock();
 	css_put(&curr->css);
 	return ret;
 }
@@ -2312,7 +2314,9 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
 
 	/* record memcg information */
 	if (do_swap_account && swapout && memcg) {
+		rcu_read_lock();
 		swap_cgroup_record(ent, css_id(&memcg->css));
+		rcu_read_unlock();
 		mem_cgroup_get(memcg);
 	}
 	if (swapout && memcg)
@@ -2369,8 +2373,10 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
 {
 	unsigned short old_id, new_id;
 
+	rcu_read_lock();
 	old_id = css_id(&from->css);
 	new_id = css_id(&to->css);
+	rcu_read_unlock();
 
 	if (swap_cgroup_cmpxchg(entry, old_id, new_id) == old_id) {
 		mem_cgroup_swap_statistics(from, false);
@@ -4038,11 +4044,16 @@ static int is_target_pte_for_mc(struct vm_area_struct ...

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

Add an RCU read-side critical section to suppress this false positive.

Located-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
---
 kernel/cgroup_freezer.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index da5e139..e5c0244 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -205,9 +205,12 @@ static void freezer_fork(struct cgroup_subsys *ss, struct task_struct *task)
 	 * No lock is needed, since the task isn't on tasklist yet,
 	 * so it can't be moved to another cgroup, which means the
 	 * freezer won't be removed and will be valid during this
-	 * function call.
+	 * function call.  Nevertheless, apply RCU read-side critical
+	 * section to suppress RCU lockdep false positives.
 	 */
+	rcu_read_lock();
 	freezer = task_freezer(task);
+	rcu_read_unlock();
 
 	/*
 	 * The root cgroup is non-freezable, so we can skip the
-- 
1.7.0

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Li Zefan <lizf@cn.fujitsu.com>

With CONFIG_PROVE_RCU=y, a warning can be triggered:

  # mount -t cgroup -o debug xxx /mnt
  # cat /proc/$$/cgroup

...
kernel/cgroup.c:1649 invoked rcu_dereference_check() without protection!
...

This is a false positive, because cgroup_path() can be called
with either rcu_read_lock() held or cgroup_mutex held.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/cgroup.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e2769e1..4ca928d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1646,7 +1646,9 @@ static inline struct cftype *__d_cft(struct dentry *dentry)
 int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 {
 	char *start;
-	struct dentry *dentry = rcu_dereference(cgrp->dentry);
+	struct dentry *dentry = rcu_dereference_check(cgrp->dentry,
+						      rcu_read_lock_held() ||
+						      cgroup_lock_is_held());
 
 	if (!dentry || cgrp == dummytop) {
 		/*
@@ -1662,13 +1664,17 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen)
 	*--start = '\0';
 	for (;;) {
 		int len = dentry->d_name.len;
+
 		if ((start -= len) < buf)
 			return -ENAMETOOLONG;
-		memcpy(start, cgrp->dentry->d_name.name, len);
+		memcpy(start, dentry->d_name.name, len);
 		cgrp = cgrp->parent;
 		if (!cgrp)
 			break;
-		dentry = rcu_dereference(cgrp->dentry);
+
+		dentry = rcu_dereference_check(cgrp->dentry,
+					       rcu_read_lock_held() ||
+					       cgroup_lock_is_held());
 		if (!cgrp->parent)
 			continue;
 		if (--start < buf)
-- 
1.7.0
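
Besides the lockdep annotation, the patch above changes the copy to use
the local dentry snapshot instead of re-reading cgrp->dentry.  The
back-to-front path construction can be sketched in user space as shown
here.  This is a simplified model under stated assumptions: struct
model_node and model_path() are hypothetical, names are plain C strings,
and index arithmetic replaces the kernel's pointer arithmetic.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

struct model_node {
	const char *name;		/* models dentry->d_name.name */
	struct model_node *parent;	/* NULL for the root */
};

/*
 * Write the path of "node" (for example "/a/b") into buf by walking
 * towards the root and filling the buffer from its end.
 * Returns 0 on success or -ENAMETOOLONG if buf is too small.
 */
static int model_path(const struct model_node *node, char *buf, size_t buflen)
{
	size_t pos = buflen;	/* index of the first byte already used */

	if (buflen < 2)
		return -ENAMETOOLONG;
	if (!node->parent) {	/* the root is reported as "/" */
		memcpy(buf, "/", 2);
		return 0;
	}

	buf[--pos] = '\0';
	for (; node->parent; node = node->parent) {
		/* Snapshot the name once, then use only the snapshot. */
		const char *name = node->name;
		size_t len = strlen(name);

		if (pos < len + 1)	/* room for the name plus '/' */
			return -ENAMETOOLONG;
		pos -= len;
		memcpy(buf + pos, name, len);
		buf[--pos] = '/';
	}
	memmove(buf, buf + pos, buflen - pos);
	return 0;
}
```

Reading the name through one local variable per iteration mirrors the
intent of the fix: the length check and the copy refer to the same
object even if the shared pointer were to change in between.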

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Lai Jiangshan <laijs@cn.fujitsu.com>

There is no need to disable lockdep after an RCU lockdep splat, so
remove the debug_locks_off() call from lockdep_rcu_dereference().
To avoid repeated lockdep splats, use a static variable in the
inlined rcu_dereference_check() and rcu_dereference_protected()
macros so that a given instance splats only once, while multiple
distinct instances can still be detected per boot.

Requested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcupdate.h |   15 +++++++++++----
 kernel/lockdep.c         |    2 --
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 07db2fe..ec9ab49 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -190,6 +190,15 @@ static inline int rcu_read_lock_sched_held(void)
 
 #ifdef CONFIG_PROVE_RCU
 
+#define __do_rcu_dereference_check(c)					\
+	do {								\
+		static bool __warned;					\
+		if (debug_lockdep_rcu_enabled() && !__warned && !(c)) {	\
+			__warned = true;				\
+			lockdep_rcu_dereference(__FILE__, __LINE__);	\
+		}							\
+	} while (0)
+
 /**
  * rcu_dereference_check - rcu_dereference with debug checking
  * @p: The pointer to read, prior to dereferencing
@@ -219,8 +228,7 @@ static inline int rcu_read_lock_sched_held(void)
  */
 #define rcu_dereference_check(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		rcu_dereference_raw(p); \
 	})
 
@@ -237,8 +245,7 @@ static inline int rcu_read_lock_sched_held(void)
  */
 #define rcu_dereference_protected(p, c) \
 	({ \
-		if (debug_lockdep_rcu_enabled() && !(c)) \
-			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		__do_rcu_dereference_check(c); \
 		(p); \
 	})
 
diff --git a/kernel/lockdep.c ...
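
The warn-once-per-call-site behavior introduced by the macro above can be
reproduced in user space.  The following is a minimal sketch, not the
kernel implementation: MODEL_CHECK_ONCE(), model_report(), site_a() and
site_b() are hypothetical names.  Because the static flag lives inside
the macro expansion, every expansion gets its own flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int model_splat_count;	/* total number of reports */

/* Stand-in for lockdep_rcu_dereference(): just report and count. */
static void model_report(const char *file, int line)
{
	model_splat_count++;
	fprintf(stderr, "%s:%d: suspicious usage\n", file, line);
}

/*
 * Each expansion of this macro owns a separate static flag, so a
 * failing condition is reported at most once per call site.
 */
#define MODEL_CHECK_ONCE(c)					\
	do {							\
		static bool warned_;				\
		if (!warned_ && !(c)) {				\
			warned_ = true;				\
			model_report(__FILE__, __LINE__);	\
		}						\
	} while (0)

/* Two independent call sites. */
static void site_a(bool lock_held)
{
	MODEL_CHECK_ONCE(lock_held);
}

static void site_b(bool lock_held)
{
	MODEL_CHECK_ONCE(lock_held);
}
```

Calling site_a(false) three times yields one report, and a later
site_b(false) still yields a second one, which matches the stated goal
of one splat per instance but several instances per boot.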
From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Li Zefan <lizf@cn.fujitsu.com>

With CONFIG_PROVE_RCU=y, a warning can be triggered:

  # mount -t cgroup -o memory xxx /mnt
  # mkdir /mnt/0

...
kernel/cgroup.c:4442 invoked rcu_dereference_check() without protection!
...

This is a false positive. It's safe to directly access parent_css->id.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/cgroup.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 4ca928d..3a53c77 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4561,13 +4561,13 @@ static int alloc_css_id(struct cgroup_subsys *ss, struct cgroup *parent,
 {
 	int subsys_id, i, depth = 0;
 	struct cgroup_subsys_state *parent_css, *child_css;
-	struct css_id *child_id, *parent_id = NULL;
+	struct css_id *child_id, *parent_id;
 
 	subsys_id = ss->subsys_id;
 	parent_css = parent->subsys[subsys_id];
 	child_css = child->subsys[subsys_id];
-	depth = css_depth(parent_css) + 1;
 	parent_id = parent_css->id;
+	depth = parent_id->depth;
 
 	child_id = get_new_cssid(ss, depth);
 	if (IS_ERR(child_id))
-- 
1.7.0

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: David Howells <dhowells@redhat.com>

Fix a number of RCU issues in the NFSv4 delegation code.

 (1) delegation->cred doesn't need to be RCU protected as it's essentially an
     invariant refcounted structure.

     By the time we get to nfs_free_delegation(), the delegation is being
     released, so no one else should be attempting to use the saved
     credentials, and they can be cleared.

     However, since the list of delegations could still be under traversal at
     this point by, for example, nfs_client_return_marked_delegations(), the
     cred should be released in nfs_do_free_delegation() rather than in
     nfs_free_delegation().  Simply using rcu_assign_pointer() to clear it is
     insufficient as that doesn't stop the cred from being destroyed, nor
     does calling put_rpccred() after call_rcu(), given that the latter is
     asynchronous.

 (2) nfs_detach_delegation_locked() and nfs_inode_set_delegation() should use
     rcu_dereference_protected() because they can only be called if
     nfs_client::cl_lock is held, and that guards against anyone changing
     nfsi->delegation under it.  Furthermore, the barrier imposed by
     rcu_dereference() is superfluous, given that the spin_lock() is also a
     barrier.

 (3) nfs_detach_delegation_locked() is now passed a pointer to the nfs_client
     struct so that it can issue lockdep advice based on clp->cl_lock for (2).

 (4) nfs_inode_return_delegation_noreclaim() and nfs_inode_return_delegation()
     should use rcu_access_pointer() outside the spinlocked region as they
     merely examine the pointer and don't follow it, so there is no need to
     impose a partial ordering over the one item of interest.

     These result in an RCU warning like the following:

[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
fs/nfs/delegation.c:332 invoked rcu_dereference_check() without protection!

other info that might help us debug ...
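
The distinction drawn in points (2) and (4) above, between testing a
pointer, following it, and reading it under the update-side lock, can be
illustrated with C11 atomics.  This is only a user-space analogy under
stated assumptions: the model_* names are hypothetical, a plain boolean
stands in for nfs_client::cl_lock, and memory_order_acquire is used as a
conservative stand-in for the dependency ordering that rcu_dereference()
provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_delegation {
	int type;
};

struct model_inode {
	_Atomic(struct model_delegation *) delegation;
	bool cl_lock_held;	/* stand-in for nfs_client::cl_lock */
};

/* Only tests the pointer value, never follows it (cf. rcu_access_pointer()). */
static bool model_has_delegation(struct model_inode *inode)
{
	return atomic_load_explicit(&inode->delegation,
				    memory_order_relaxed) != NULL;
}

/* Follows the pointer, so the load must be ordered (cf. rcu_dereference()). */
static int model_delegation_type(struct model_inode *inode)
{
	struct model_delegation *d =
		atomic_load_explicit(&inode->delegation, memory_order_acquire);

	return d ? d->type : -1;
}

/*
 * Update side: the lock already excludes other updaters, so a relaxed
 * load suffices (cf. rcu_dereference_protected()).
 */
static struct model_delegation *model_detach_locked(struct model_inode *inode)
{
	struct model_delegation *d;

	assert(inode->cl_lock_held);
	d = atomic_load_explicit(&inode->delegation, memory_order_relaxed);
	atomic_store_explicit(&inode->delegation, NULL, memory_order_release);
	return d;
}
```

The assert() in model_detach_locked() plays the role of the lockdep
condition passed to rcu_dereference_protected(): it documents and checks
the locking rule without adding ordering cost to the load.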
From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: David Howells <dhowells@redhat.com>

Fix an RCU warning in the reading of user keys:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/keys/user_defined.c:202 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by keyctl/3637:
 #0:  (&key->sem){+++++.}, at: [<ffffffff811a80ae>] keyctl_read_key+0x9c/0xcf

stack backtrace:
Pid: 3637, comm: keyctl Not tainted 2.6.34-rc5-cachefs #18
Call Trace:
 [<ffffffff81051f6c>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffff811aa55f>] user_read+0x47/0x91
 [<ffffffff811a80be>] keyctl_read_key+0xac/0xcf
 [<ffffffff811a8a06>] sys_keyctl+0x75/0xb7
 [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 security/keys/user_defined.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/security/keys/user_defined.c b/security/keys/user_defined.c
index 7c687d5..e9aa079 100644
--- a/security/keys/user_defined.c
+++ b/security/keys/user_defined.c
@@ -199,7 +199,8 @@ long user_read(const struct key *key, char __user *buffer, size_t buflen)
 	struct user_key_payload *upayload;
 	long ret;
 
-	upayload = rcu_dereference(key->payload.data);
+	upayload = rcu_dereference_protected(
+		key->payload.data, rwsem_is_locked(&((struct key *)key)->sem));
 	ret = upayload->datalen;
 
 	/* we can return the data as is */
-- 
1.7.0

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Peter Zijlstra <peterz@infradead.org>

Add an RCU read-side critical section to suppress this false positive.

Located-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/sched.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index de0bd26..3c2a54f 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -323,6 +323,15 @@ static inline struct task_group *task_group(struct task_struct *p)
 /* Change a task's cfs_rq and parent entity if it moves across CPUs/groups */
 static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 {
+	/*
+	 * Strictly speaking this rcu_read_lock() is not needed since the
+	 * task_group is tied to the cgroup, which in turn can never go away
+	 * as long as there are tasks attached to it.
+	 *
+	 * However since task_group() uses task_subsys_state() which is an
+	 * rcu_dereference() user, this quiets CONFIG_PROVE_RCU.
+	 */
+	rcu_read_lock();
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
 	p->se.parent = task_group(p)->se[cpu];
@@ -332,6 +341,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 	p->rt.rt_rq  = task_group(p)->rt_rq[cpu];
 	p->rt.parent = task_group(p)->rt_se[cpu];
 #endif
+	rcu_read_unlock();
 }
 
 #else
-- 
1.7.0

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Li Zefan <lizf@cn.fujitsu.com>

With CONFIG_PROVE_RCU=y, a warning can be triggered:

  $ cat /proc/sched_debug

...
kernel/cgroup.c:1649 invoked rcu_dereference_check() without protection!
...

Both cgroup_path() and task_group() should be called with either
rcu_read_lock() or cgroup_mutex held.

The rcu_dereference_check() does include cgroup_lock_is_held(), so we
know that this lock is not held.  Therefore, in a CONFIG_PREEMPT kernel,
to say nothing of a CONFIG_PREEMPT_RT kernel, the original code could
have ended up copying a string out of the freelist.

This patch inserts RCU read-side primitives needed to prevent this
scenario.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/sched_debug.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index 9b49db1..19be00b 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -114,7 +114,9 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 	{
 		char path[64];
 
+		rcu_read_lock();
 		cgroup_path(task_group(p)->css.cgroup, path, sizeof(path));
+		rcu_read_unlock();
 		SEQ_printf(m, " %s", path);
 	}
 #endif
-- 
1.7.0
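
Unlike most splats in this series, this one points at a real hazard: the
group could be freed while its path is being copied.  The toy model below
illustrates why the read-side critical section matters.  It is a
single-threaded sketch with hypothetical names (model_group,
model_read_lock() and so on); the call to model_remove_group() in the
middle of the reader stands in for preemption by an updater, and the
"freed" flag stands in for the memory having been returned to the
freelist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct model_group {
	char name[32];
	bool freed;	/* true once the memory would be on the freelist */
};

static int model_nesting;			/* read-side nesting depth */
static struct model_group *model_deferred;	/* at most one pending free */

static void model_read_lock(void)
{
	model_nesting++;
}

static void model_read_unlock(void)
{
	assert(model_nesting > 0);
	if (--model_nesting == 0 && model_deferred) {
		model_deferred->freed = true;	/* grace period has ended */
		model_deferred = NULL;
	}
}

/* Updater: free at once if no reader is active, otherwise defer. */
static void model_remove_group(struct model_group *g)
{
	if (model_nesting == 0)
		g->freed = true;
	else
		model_deferred = g;
}

/*
 * Reader: copy the group name into path.  Returns true if the copy
 * was made from live memory, false if the group was already freed.
 */
static bool model_print_task(struct model_group *g, char *path, size_t len,
			     bool use_read_lock)
{
	bool live;

	if (use_read_lock)
		model_read_lock();
	model_remove_group(g);	/* models preemption by an updater */
	live = !g->freed;
	if (live)
		snprintf(path, len, "%s", g->name);
	if (use_read_lock)
		model_read_unlock();
	return live;
}
```

With the read lock the free is deferred until after the copy, and the
local buffer remains usable afterwards, which is the shape of the fixed
print_task().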

--

From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Ensure that we correctly rcu-dereference the delegation itself, and that we
protect against removal while we're changing the contents.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 fs/nfs/delegation.c |   42 ++++++++++++++++++++++++++++--------------
 1 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 1567124..8d9ec49 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -129,21 +129,35 @@ again:
  */
 void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res)
 {
-	struct nfs_delegation *delegation = NFS_I(inode)->delegation;
-	struct rpc_cred *oldcred;
+	struct nfs_delegation *delegation;
+	struct rpc_cred *oldcred = NULL;
 
-	if (delegation == NULL)
-		return;
-	memcpy(delegation->stateid.data, res->delegation.data,
-			sizeof(delegation->stateid.data));
-	delegation->type = res->delegation_type;
-	delegation->maxsize = res->maxsize;
-	oldcred = delegation->cred;
-	delegation->cred = get_rpccred(cred);
-	clear_bit(NFS_DELEGATION_NEED_RECLAIM, &delegation->flags);
-	NFS_I(inode)->delegation_state = delegation->type;
-	smp_wmb();
-	put_rpccred(oldcred);
+	rcu_read_lock();
+	delegation = rcu_dereference(NFS_I(inode)->delegation);
+	if (delegation != NULL) {
+		spin_lock(&delegation->lock);
+		if (delegation->inode != NULL) {
+			memcpy(delegation->stateid.data, res->delegation.data,
+			       sizeof(delegation->stateid.data));
+			delegation->type = res->delegation_type;
+			delegation->maxsize = res->maxsize;
+			oldcred = delegation->cred;
+			delegation->cred = get_rpccred(cred);
+			clear_bit(NFS_DELEGATION_NEED_RECLAIM,
+				  &delegation->flags);
+			NFS_I(inode)->delegation_state = ...
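
The shape of the change above is "look up, lock, then re-check before
modifying".  A minimal user-space sketch of that pattern follows.  It is
an illustration under stated assumptions only: struct model_delegation
and model_reclaim() are hypothetical, and a boolean stands in for
delegation->lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_delegation {
	bool lock_held;		/* stand-in for delegation->lock */
	const void *inode;	/* NULL once the delegation is detached */
	int type;
};

/*
 * Update the delegation only if it is still attached to an inode.
 * Returns true if the update was applied.
 */
static bool model_reclaim(struct model_delegation *d, int new_type)
{
	bool updated = false;

	if (d == NULL)
		return false;

	assert(!d->lock_held);
	d->lock_held = true;		/* models spin_lock() */
	if (d->inode != NULL) {		/* re-check under the lock */
		d->type = new_type;
		updated = true;
	}
	d->lock_held = false;		/* models spin_unlock() */
	return updated;
}
```

The re-check is what protects against a concurrent detach: a delegation
found through the pointer may already have been removed by the time the
lock is acquired.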
From: Paul E. McKenney
Date: Friday, April 23, 2010 - 12:43 pm

From: David Howells <dhowells@redhat.com>

Fix the following RCU warning:

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/keys/request_key.c:116 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by keyctl/5372:
 #0:  (key_types_sem){.+.+.+}, at: [<ffffffff811a4e3d>] key_type_lookup+0x1c/0x70

stack backtrace:
Pid: 5372, comm: keyctl Not tainted 2.6.34-rc3-cachefs #150
Call Trace:
 [<ffffffff810515f8>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffff811a9220>] call_sbin_request_key+0x156/0x2b6
 [<ffffffff811a4c66>] ? __key_instantiate_and_link+0xb1/0xdc
 [<ffffffff811a4cd3>] ? key_instantiate_and_link+0x42/0x5f
 [<ffffffff811a96b8>] ? request_key_auth_new+0x17b/0x1f3
 [<ffffffff811a8e00>] ? request_key_and_link+0x271/0x400
 [<ffffffff810aba6f>] ? kmem_cache_alloc+0xe1/0x118
 [<ffffffff811a8f1a>] request_key_and_link+0x38b/0x400
 [<ffffffff811a7b72>] sys_request_key+0xf7/0x14a
 [<ffffffff81052227>] ? trace_hardirqs_on_caller+0x10c/0x130
 [<ffffffff81393f5c>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b

This was caused by doing:

	[root@andromeda ~]# keyctl newring fred @s
	539196288
	[root@andromeda ~]# keyctl request2 user a a 539196288
	request_key: Required key not available

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 security/keys/request_key.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 03fe63e..ea97c31 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -68,7 +68,8 @@ static int call_sbin_request_key(struct key_construction *cons,
 {
 	const struct ...

--

From: Miles Lane
Date: Friday, April 23, 2010 - 3:59 pm

On Fri, Apr 23, 2010 at 3:42 PM, Paul E. McKenney

With your twelve patches and the one linked to above applied to
2.6.34-rc5-git3, here are the warnings I see:

[    0.173969] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.174097] ---------------------------------------------------
[    0.174226] include/linux/cgroup.h:534 invoked
rcu_dereference_check() without protection!
[    0.174429]
[    0.174430] other info that might help us debug this:
[    0.174431]
[    0.174792]
[    0.174793] rcu_scheduler_active = 1, debug_locks = 1
[    0.175037] no locks held by watchdog/0/5.
[    0.175162]
[    0.175163] stack backtrace:
[    0.175405] Pid: 5, comm: watchdog/0 Not tainted 2.6.34-rc5-git3 #22
[    0.175534] Call Trace:
[    0.175666]  [<ffffffff81067fbe>] lockdep_rcu_dereference+0x9d/0xa5
[    0.175799]  [<ffffffff8102d678>] task_subsys_state+0x59/0x70
[    0.175931]  [<ffffffff810328fa>] __sched_setscheduler+0x19d/0x300
[    0.176064]  [<ffffffff8102b477>] ? need_resched+0x1e/0x28
[    0.176196]  [<ffffffff813cd401>] ? schedule+0x5c3/0x66e
[    0.176327]  [<ffffffff81091943>] ? watchdog+0x0/0x8c
[    0.176457]  [<ffffffff81032a78>] sched_setscheduler+0xe/0x10
[    0.176587]  [<ffffffff8109196d>] watchdog+0x2a/0x8c
[    0.176677]  [<ffffffff81091943>] ? watchdog+0x0/0x8c
[    0.176808]  [<ffffffff81057152>] kthread+0x89/0x91
[    0.176939]  [<ffffffff8106891e>] ? trace_hardirqs_on_caller+0x114/0x13f
[    0.177073]  [<ffffffff81003994>] kernel_thread_helper+0x4/0x10
[    0.177204]  [<ffffffff813cfc40>] ? restore_args+0x0/0x30
[    0.177334]  [<ffffffff810570c9>] ? kthread+0x0/0x91
[    0.177463]  [<ffffffff81003990>] ? kernel_thread_helper+0x0/0x10

[    3.173419] [ INFO: suspicious rcu_dereference_check() usage. ]
[    3.173419] ---------------------------------------------------
[    3.173419] kernel/cgroup.c:4438 invoked rcu_dereference_check()
without protection!
[    3.173419]
[    3.173419] other info that might help us debug this:
[    3.173419]
[    ...

--

From: Miles Lane
Date: Friday, April 23, 2010 - 10:35 pm

2.6.34-rc5-git5 with all of your patches applied.

I reconfigured my kernel build options and got the following new issue:

[    2.686515] [ INFO: suspicious rcu_dereference_check() usage. ]
[    2.686519] ---------------------------------------------------
[    2.686523] kernel/cgroup.c:4438 invoked rcu_dereference_check()
without protection!
[    2.686526]
[    2.686527] other info that might help us debug this:
[    2.686529]
[    2.686532]
[    2.686533] rcu_scheduler_active = 1, debug_locks = 1
[    2.686537] 2 locks held by swapper/1:
[    2.686540]  #0:  (mtd_table_mutex){+.+.+.}, at:
[<ffffffff812d7714>] register_mtd_blktrans+0xa2/0x25e
[    2.686555]  #1:  (&(&blkcg->lock)->rlock){......}, at:
[<ffffffff811ca7bd>] blkiocg_add_blkio_group+0x29/0x7f
[    2.686566]
[    2.686567] stack backtrace:
[    2.686572] Pid: 1, comm: swapper Not tainted 2.6.34-rc5-git5 #25
[    2.686576] Call Trace:
[    2.686584]  [<ffffffff810642da>] lockdep_rcu_dereference+0x9d/0xa5
[    2.686591]  [<ffffffff8107af54>] css_id+0x3f/0x52
[    2.686597]  [<ffffffff811ca7cc>] blkiocg_add_blkio_group+0x38/0x7f
[    2.686603]  [<ffffffff811cc593>] cfq_init_queue+0xdf/0x2dc
[    2.686609]  [<ffffffff811bb858>] elevator_init+0xba/0xf5
[    2.686616]  [<ffffffff812d7046>] ? mtd_blktrans_request+0x0/0x1c
[    2.686623]  [<ffffffff811c0b62>] blk_init_queue_node+0x12f/0x135
[    2.686629]  [<ffffffff811c0b74>] blk_init_queue+0xc/0xe
[    2.686635]  [<ffffffff812d7777>] register_mtd_blktrans+0x105/0x25e
[    2.686642]  [<ffffffff818c0de9>] ? init_mtdblock+0x0/0x2c
[    2.686648]  [<ffffffff818c0e13>] init_mtdblock+0x2a/0x2c
[    2.686656]  [<ffffffff810001ef>] do_one_initcall+0x59/0x14e
[    2.686663]  [<ffffffff818986a6>] kernel_init+0x160/0x1ea
[    2.686669]  [<ffffffff81003814>] kernel_thread_helper+0x4/0x10
[    2.686677]  [<ffffffff8140d77c>] ? restore_args+0x0/0x30
[    2.686683]  [<ffffffff81898546>] ? kernel_init+0x0/0x1ea
[    2.686688]  [<ffffffff81003810>] ? ...

--

From: Paul E. McKenney
Date: Saturday, April 24, 2010 - 7:36 pm

This should be covered by the patch I sent with my previous email.

And thank you again, Miles, for all the testing!!!

							Thanx, Paul
--

From: Paul E. McKenney
Date: Saturday, April 24, 2010 - 7:34 pm

According to Documentation/cgroups/cgroups.txt, we must hold cgroup_mutex,
hold the task's alloc_lock, or be in an RCU read-side critical section.
None of these is the case here.

I would argue that sched_setscheduler() should take care of
synchronization, but am not sure which of these three is appropriate

Please see below for a patch for this based on my earlier conversation
with Vivek Goyal.  (Vivek, if you are already pushing a fix elsewhere,


This is a repeat from last time that confused me at the time.  I could
do a hacky "fix" by putting an RCU read-side critical section around
the for_each_sta_info() in ieee80211_find_sta_by_hw(), but I do not
understand this code well enough to feel comfortable doing so.


Ditto.

						Thanx, Paul

------------------------------------------------------------------------

commit 0868dd631def762ba00c2f0f397a53c5cdf24ae2
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Sat Apr 24 19:23:30 2010 -0700

    block-cgroup: fix RCU-lockdep splat in blkiocg_add_blkio_group()
    
    It is necessary to be in an RCU read-side critical section when invoking
    css_id(), so this patch adds one to blkiocg_add_blkio_group().  This is
    actually a false positive, because this is called at initialization time,
    and hence always refers to the root cgroup, which cannot go away.
    
    Located-by: Miles Lane <miles.lane@gmail.com>
    Suggested-by: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 5fe03de..55c8c73 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -71,7 +71,9 @@ void blkiocg_add_blkio_group(struct blkio_cgroup *blkcg,
 
 	spin_lock_irqsave(&blkcg->lock, flags);
 	rcu_assign_pointer(blkg->key, key);
+	rcu_read_lock();
 	blkg->blkcg_id = css_id(&blkcg->css);
+	rcu_read_unlock();
 	hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list);
 	spin_unlock_irqrestore(&blkcg->lock, flags);
 ...

--

From: Johannes Berg
Date: Sunday, April 25, 2010 - 12:45 am

The station locking is a tad confusing, but I've added the right
annotations already, should be coming to a kernel near you soon (i.e.
are in net-2.6 right now).

johannes

--

From: David Miller
Date: Sunday, April 25, 2010 - 12:49 am

From: Johannes Berg <johannes@sipsolutions.net>

Linus took in everything I have so it should be in Linus's tree
by now.
--

From: Paul E. McKenney
Date: Sunday, April 25, 2010 - 7:07 pm

Thank you both, Dave and Johannes!

							Thanx, Paul
--

From: Miles Lane
Date: Sunday, April 25, 2010 - 8:49 am

On Sat, Apr 24, 2010 at 10:34 PM, Paul E. McKenney

I am down to seeing three suspicious rcu_dereference_check traces when
I apply this patch and all the previous patches to 2.6.34-rc5-git6.

1. The "__sched_setscheduler+0x19d/0x300" trace.
2. The two "is_swiotlb_buffer+0x2e/0x3b" traces (waiting to see
Johannes' patch show up in a Linux snapshot)

Did I miss a patch for the setscheduler issue?

Thanks!
        Miles
--

From: Miles Lane
Date: Sunday, April 25, 2010 - 1:20 pm

Hmm.  I am still seeing these two messages as well.

[   83.363146] [ INFO: suspicious rcu_dereference_check() usage. ]
[   83.363148] ---------------------------------------------------
[   83.363151] include/net/inet_timewait_sock.h:227 invoked
rcu_dereference_check() without protection!
[   83.363154]
[   83.363155] other info that might help us debug this:
[   83.363156]
[   83.363158]
[   83.363159] rcu_scheduler_active = 1, debug_locks = 1
[   83.363162] 2 locks held by gwibber-service/5076:
[   83.363164]  #0:  (&p->lock){+.+.+.}, at: [<ffffffff8110534a>]
seq_read+0x37/0x381
[   83.363176]  #1:  (&(&hashinfo->ehash_locks[i])->rlock){+.-...},
at: [<ffffffff813ddcd5>] established_get_next+0xc4/0x132
[   83.363186]
[   83.363187] stack backtrace:
[   83.363191] Pid: 5076, comm: gwibber-service Not tainted 2.6.34-rc5-git6 #27
[   83.363194] Call Trace:
[   83.363202]  [<ffffffff81068086>] lockdep_rcu_dereference+0x9d/0xa5
[   83.363207]  [<ffffffff813dc998>] twsk_net+0x4f/0x57
[   83.363212]  [<ffffffff813ddc65>] established_get_next+0x54/0x132
[   83.363216]  [<ffffffff813dde47>] tcp_seq_next+0x5d/0x6a
[   83.363221]  [<ffffffff81105599>] seq_read+0x286/0x381
[   83.363226]  [<ffffffff81105313>] ? seq_read+0x0/0x381
[   83.363231]  [<ffffffff8113503c>] proc_reg_read+0x8d/0xac
[   83.363236]  [<ffffffff810ebf14>] vfs_read+0xa6/0x103
[   83.363241]  [<ffffffff810ec027>] sys_read+0x45/0x69
[   83.363246]  [<ffffffff81002b6b>] system_call_fastpath+0x16/0x1b

[   84.660302] [ INFO: suspicious rcu_dereference_check() usage. ]
[   84.660304] ---------------------------------------------------
[   84.660308] include/net/inet_timewait_sock.h:227 invoked
rcu_dereference_check() without protection!
[   84.660311]
[   84.660312] other info that might help us debug this:
[   84.660313]
[   84.660315]
[   84.660316] rcu_scheduler_active = 1, debug_locks = 1
[   84.660319] no locks held by gwibber-service/5081.
[   84.660321]
[   84.660322] stack backtrace:
[   84.660325] ...

--

From: Paul E. McKenney
Date: Monday, April 26, 2010 - 9:09 am

Eric Dumazet traced these down to a commit from Eric Biederman.

If I don't hear from Eric Biederman in a few days, I will attempt a
patch, but it would be more likely to be correct coming from someone
with a better understanding of the code.  ;-)

							Thanx, Paul
--

From: Paul E. McKenney
Date: Monday, April 26, 2010 - 9:27 pm

You did indeed!!!  This experience is giving me an even better appreciation
of the maintainers' ability to keep all their patches straight!

I will put together something based on your suggestion.

							Thanx, Paul
--

From: Paul E. McKenney
Date: Tuesday, April 27, 2010 - 9:22 am

How about the following?

							Thanx, Paul

------------------------------------------------------------------------

commit 85fa42bd568ab99c375f018761ae6345249942cd
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Apr 26 21:40:05 2010 -0700

    net: suppress RCU lockdep false positive in twsk_net()
    
    Calls to twsk_net() are in some cases protected by reference counting
    as an alternative to RCU protection.  Cases covered by reference counts
    include __inet_twsk_kill(), inet_twsk_free(), inet_twdr_do_twkill_work(),
    inet_twdr_twcal_tick(), and tcp_timewait_state_process().  RCU is used
    by inet_twsk_purge().  Locking is used by established_get_first()
    and established_get_next().  Finally, __inet_twsk_hashdance() is an
    initialization case.
    
    It appears to be non-trivial to locate the appropriate locks and
    reference counts from within twsk_net(), so this patch uses
    rcu_dereference_raw().
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index 79f67ea..a066fdd 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -224,7 +224,9 @@ static inline
 struct net *twsk_net(const struct inet_timewait_sock *twsk)
 {
 #ifdef CONFIG_NET_NS
-	return rcu_dereference(twsk->tw_net);
+	return rcu_dereference_raw(twsk->tw_net); /* protected by locking, */
+						  /* reference counting, */
+						  /* initialization, or RCU. */
 #else
 	return &init_net;
 #endif
--

From: Eric Dumazet
Date: Tuesday, April 27, 2010 - 9:33 am

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>


--

From: Miles Lane
Date: Tuesday, April 27, 2010 - 10:58 am

On Tue, Apr 27, 2010 at 12:22 PM, Paul E. McKenney

Worked for me.  Thanks!

           Miles
--

From: Paul E. McKenney
Date: Tuesday, April 27, 2010 - 4:31 pm

Thank you both!  I have added Eric's Acked-by and Miles's Tested-by.

							Thanx, Paul
--

From: David Miller
Date: Tuesday, April 27, 2010 - 4:42 pm

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

How can you?  It's in net-next-2.6 already :-)
--

From: Paul E. McKenney
Date: Tuesday, April 27, 2010 - 4:52 pm

OK, OK, I will drop my copy.  ;-)

							Thanx, Paul
--

From: Paul E. McKenney
Date: Wednesday, April 28, 2010 - 2:37 pm

I replaced the above with an improved patch from Vivek Goyal, which has
not yet reached mainline.  I will resend my patch stack.

--

From: Miles Lane
Date: Saturday, May 1, 2010 - 10:26 am

On Tue, Apr 20, 2010 at 9:52 AM, Paul E. McKenney

Hi Paul.

Has this patch made it into the Linus tree?
Thanks!

          Miles
--

From: Paul E. McKenney
Date: Saturday, May 1, 2010 - 2:55 pm

Hello, Miles,

Not yet -- working with Ingo to get a variant of it into -tip on
its way to Linus's tree.  The latest patch stack may be found at
http://lkml.org/lkml/2010/4/30/500.

						Thanx, Paul
--

From: Miles Lane
Date: Saturday, May 1, 2010 - 7:00 pm

On Sat, May 1, 2010 at 5:55 PM, Paul E. McKenney

What is the rationale for defaulting to showing only one RCU splat?
That setting seems likely to reduce the rate at which things get
cleaned up.
Ciao,
      Miles
--

From: Paul E. McKenney
Date: Saturday, May 1, 2010 - 9:11 pm

Hello, Miles,

The discussion is at http://lkml.org/lkml/2010/4/21/304.  Showing only
one splat might reduce the cleanup rate, or it might even increase it.
The increase would come from people who would otherwise disable
CONFIG_PROVE_RCU completely if they kept getting too many splats.  This
way people can choose how much they want to contribute to cleaning up.

And regardless of how this is eventually settled, let me say again
that I very much appreciate your testing efforts!!!

							Thanx, Paul
--


Acked-by: Li Zefan <lizf@cn.fujitsu.com>
--


Thank you, Li!  I have added your Acked-by.

							Thanx, Paul
--


Thank you for checking on this, Eric.  I am finally breaking down and
cloning linux-next, and will look into this.

--
