Re: [PATCH 1/1] Use RCU for the UDP hash lock

Previous thread: [PATCH v3 0/4] FHCI USB Host support patches by Anton Vorontsov on Wednesday, September 24, 2008 - 11:20 am. (6 messages)

Next thread: [PATCH] x86_64: be less annoying on boot by Bill Nottingham on Wednesday, September 24, 2008 - 11:35 am. (10 messages)
From: Corey Minyard
Date: Wednesday, September 24, 2008 - 10:28 am

From: Corey Minyard <cminyard@mvista.com>

Convert access to the udp_hash table to use RCU.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
---
 include/linux/rculist.h |   19 +++++++++++++++++++
 include/net/sock.h      |   39 +++++++++++++++++++++++++++++++++++++++
 include/net/udp.h       |    9 +++++----
 net/ipv4/udp.c          |   36 +++++++++++++++++++++---------------
 net/ipv6/udp.c          |   13 +++++++------
 5 files changed, 91 insertions(+), 25 deletions(-)

This patch is pretty straightforward; I've tested it a while and it
seems to work properly with a test program that constantly creates and
destroys UDP sockets while sending and receiving large numbers of
packets on an SMP box.  I think I've covered all the bases, though RCU
is subtle.

This doesn't make much difference when using no preempt or desktop,
but it makes a huge difference when used with PREEMPT_RT.  More than
10 times more UDP throughput on a 16-way machine.

So I'm not sure if this belongs in the RT patch or in the mainstream
kernel.

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index eb4443c..4d3cc58 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -397,5 +397,24 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev,
 		({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \
 		pos = rcu_dereference(pos->next))
 
+
+/**
+ * hlist_for_each_entry_from_rcu - iterate over rcu list starting from pos
+ * @tpos:      the type * to use as a loop cursor.
+ * @pos:       the &struct hlist_node to use as a loop cursor.
+ * @head:      the head for your list.
+ * @member:    the name of the hlist_node within the struct.
+ *
+ * This list-traversal primitive may safely run concurrently with
+ * the _rcu list-mutation primitives such as hlist_add_head_rcu()
+ * as long as the traversal is guarded by rcu_read_lock().
+ */
+#define hlist_for_each_entry_from_rcu(tpos, pos, member)                \
+	for (;                                   ...
From: Stephen Hemminger
Date: Wednesday, September 24, 2008 - 12:40 pm

On Wed, 24 Sep 2008 12:28:27 -0500

Could this be synchronize_rcu? You are using rcu_read_lock() protected sections.
--

From: Corey Minyard
Date: Wednesday, September 24, 2008 - 1:46 pm

I meant to comment on that.  I wasn't sure which to use, so I chose the 
more conservative approach.  synchronize_rcu() might be appropriate.

-corey

--

From: Paul E. McKenney
Date: Thursday, September 25, 2008 - 8:29 am

You do indeed need to match the update-side and read-side primitives:

	Update-side				Read-side

	synchronize_rcu()			rcu_read_lock()
	call_rcu()				rcu_read_unlock()

	call_rcu_bh()				rcu_read_lock_bh()
						rcu_read_unlock_bh()

	synchronize_sched()			preempt_disable()
						preempt_enable()
						[and anything else
						 that disables either
						 preemption or irqs]

	synchronize_srcu()			srcu_read_lock()
						srcu_read_unlock()


Mixing RCU or RCU-SCHED with RCU-BH will fail in Classic RCU systems,
while mixing RCU or RCU-BH with RCU-SCHED will fail in preemptable RCU
systems.  Mixing SRCU with any of the other flavors of RCU will fail
on any system.

So please match them up correctly!

							Thanx, Paul
--

From: Stephen Hemminger
Date: Thursday, September 25, 2008 - 8:34 am

On Thu, 25 Sep 2008 08:29:36 -0700

Also, for consistency with other parts of networking code, don't introduce
the synchronize_sched() or synchronize_srcu() pattern to network protocols
unless there is a no other way to achieve the desired result.

--

From: Corey Minyard
Date: Thursday, September 25, 2008 - 12:21 pm

Ok, will do.  I read more on this, and I think I understand the issues 
Do you mean synchronize_rcu(), too?  It seems to be used in the net 
code.  To avoid that I'd need to add a struct rcu_head to struct sock.  
Would that be preferable?

Thanks,

-corey
--

From: Stephen Hemminger
Date: Thursday, September 25, 2008 - 1:34 pm

On Thu, 25 Sep 2008 14:21:55 -0500

synchhonize_rcu or call_rcu_bh is fine. But I worry that if the other
stricter types are used, then we would have to audit all the other RCU
usage in networking to make sure nesting was correct.
--

From: Jarek Poplawski
Date: Thursday, September 25, 2008 - 1:45 am

On 24-09-2008 19:28, Corey Minyard wrote:


...

Aren't other functions like sk_next() or sk_unhashed() used on the
read side and need _rcu versions?

Jarek P.
--

From: Corey Minyard
Date: Thursday, September 25, 2008 - 12:14 pm

It also needs sk_next_rcu().  sk_unhashed() is only used on the update 
side, so an rcu version is not needed there.

Thanks for pointing this out.

-corey

--

Previous thread: [PATCH v3 0/4] FHCI USB Host support patches by Anton Vorontsov on Wednesday, September 24, 2008 - 11:20 am. (6 messages)

Next thread: [PATCH] x86_64: be less annoying on boot by Bill Nottingham on Wednesday, September 24, 2008 - 11:35 am. (10 messages)