From: Corey Minyard <cminyard@mvista.com>
Convert access to the udp_hash table to use RCU.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
---
include/linux/rculist.h | 19 +++++++++++++++++++
include/net/sock.h | 39 +++++++++++++++++++++++++++++++++++++++
include/net/udp.h | 9 +++++----
net/ipv4/udp.c | 36 +++++++++++++++++++++---------------
net/ipv6/udp.c | 13 +++++++------
5 files changed, 91 insertions(+), 25 deletions(-)
This patch is pretty straightforward; I've tested it a while and it
seems to work properly with a test program that constantly creates and
destroys UDP sockets while sending and receiving large numbers of
packets on an SMP box. I think I've covered all the bases, though RCU
is subtle.
This doesn't make much difference when using no preempt or desktop,
but it makes a huge difference when used with PREEMPT_RT. More than
10 times more UDP throughput on a 16-way machine.
So I'm not sure if this belongs in the RT patch or in the mainstream
kernel.
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index eb4443c..4d3cc58 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -397,5 +397,24 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev,
({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \
pos = rcu_dereference(pos->next))
+
+/**
+ * hlist_for_each_entry_from_rcu - iterate over rcu list starting from pos
+ * @tpos: the type * to use as a loop cursor.
+ * @pos: the &struct hlist_node to use as a loop cursor.
+ * @head: the head for your list.
+ * @member: the name of the hlist_node within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as hlist_add_head_rcu()
+ * as long as the traversal is guarded by rcu_read_lock().
+ */
+#define hlist_for_each_entry_from_rcu(tpos, pos, member) \
+ for (; ...On Wed, 24 Sep 2008 12:28:27 -0500 Could this be synchronize_rcu? You are using rcu_read_lock() protected sections. --
I meant to comment on that. I wasn't sure which to use, so I chose the more conservative approach. synchronize_rcu() might be appropriate. -corey --
You do indeed need to match the update-side and read-side primitives: Update-side Read-side synchronize_rcu() rcu_read_lock() call_rcu() rcu_read_unlock() call_rcu_bh() rcu_read_lock_bh() rcu_read_unlock_bh() synchronize_sched() preempt_disable() preempt_enable() [and anything else that disables either preemption or irqs] synchronize_srcu() srcu_read_lock() srcu_read_unlock() Mixing RCU or RCU-SCHED with RCU-BH will fail in Classic RCU systems, while mixing RCU or RCU-BH with RCU-SCHED will fail in preemptable RCU systems. Mixing SRCU with any of the other flavors of RCU will fail on any system. So please match them up correctly! Thanx, Paul --
On Thu, 25 Sep 2008 08:29:36 -0700 Also, for consistency with other parts of networking code, don't introduce the synchronize_sched() or synchronize_srcu() pattern to network protocols unless there is a no other way to achieve the desired result. --
Ok, will do. I read more on this, and I think I understand the issues Do you mean synchronize_rcu(), too? It seems to be used in the net code. To avoid that I'd need to add a struct rcu_head to struct sock. Would that be preferable? Thanks, -corey --
On Thu, 25 Sep 2008 14:21:55 -0500 synchhonize_rcu or call_rcu_bh is fine. But I worry that if the other stricter types are used, then we would have to audit all the other RCU usage in networking to make sure nesting was correct. --
On 24-09-2008 19:28, Corey Minyard wrote: ... Aren't other functions like sk_next() or sk_unhashed() used on the read side and need _rcu versions? Jarek P. --
It also needs sk_next_rcu(). sk_unhashed() is only used on the update side, so an rcu version is not needed there. Thanks for pointing this out. -corey --
