Re: [PATCH 5/5] netfilter: convert x_tables to use RCU

Previous thread: [PATCH 0/5] iptables lockless receive (v0.3) by Stephen Hemminger on Wednesday, January 28, 2009 - 11:25 pm. (2 messages)

Next thread: [PATCH] decnet: incorrect optlen size by Roel Kluin on Thursday, January 29, 2009 - 1:21 am. (3 messages)
From: Stephen Hemminger
Date: Wednesday, January 28, 2009 - 11:25 pm

Replace the existing reader/writer lock with Read-Copy-Update to
eliminate the overhead of a read lock on each incoming packet.
This should reduce the overhead of iptables, especially on SMP
systems.

The previous code used a reader-writer lock for two purposes.
The first was to ensure that the xt_table_info reference was not in
the process of being changed. Since xt_table_info is only freed via
one routine, this was a direct conversion to RCU.

The other use of the reader-writer lock was to block changes
to counters while they were being read. This synchronization was
fixed by the previous patch, but we still need to make sure the
table info isn't going away.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


---
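The conversion described in the changelog above can be illustrated with a
small userspace analogy. This is only a sketch, not the code in this patch:
struct table_info, table_private, active_readers, reader_lookup() and
replace_table() are hypothetical names, C11 atomics stand in for
rcu_dereference() and rcu_assign_pointer(), and a plain reader count stands
in for a real RCU grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xt_table_info. */
struct table_info {
	unsigned int number;	/* number of rules in this table */
};

/* Stand-in for xt_table.private: readers load it, the writer swaps it. */
static _Atomic(struct table_info *) table_private;

/* Crude grace-period substitute: readers currently in a read section. */
static atomic_int active_readers;

/*
 * Reader side. In the kernel this would be rcu_read_lock(),
 * rcu_dereference() and rcu_read_unlock(); no lock is taken.
 * Must not be called before the first replace_table().
 */
static unsigned int reader_lookup(void)
{
	unsigned int n;
	struct table_info *info;

	atomic_fetch_add(&active_readers, 1);
	info = atomic_load(&table_private);
	n = info->number;
	atomic_fetch_sub(&active_readers, 1);
	return n;
}

/*
 * Writer side: publish the new table, wait until no reader can still
 * hold the old pointer, then free it. A real RCU implementation waits
 * for a grace period instead of spinning on a counter.
 */
static void replace_table(unsigned int number)
{
	struct table_info *newinfo = malloc(sizeof(*newinfo));
	struct table_info *old;

	assert(newinfo != NULL);
	newinfo->number = number;

	old = atomic_exchange(&table_private, newinfo);
	while (atomic_load(&active_readers) != 0)
		;	/* wait for readers that may still see the old table */
	free(old);	/* free(NULL) is a no-op on the first call */
}
```

Calling replace_table(3) and then reader_lookup() returns 3. The point of the
pattern is that the reader never blocks the writer from publishing a new
table, and the writer never frees a table a reader may still be walking.
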
 include/linux/netfilter/x_tables.h |   10 ++++++-
 net/ipv4/netfilter/arp_tables.c    |   12 ++++-----
 net/ipv4/netfilter/ip_tables.c     |   12 ++++-----
 net/ipv6/netfilter/ip6_tables.c    |   12 ++++-----
 net/netfilter/x_tables.c           |   48 ++++++++++++++++++++++++++-----------
 5 files changed, 60 insertions(+), 34 deletions(-)

--- a/include/linux/netfilter/x_tables.h	2009-01-28 22:04:39.316517913 -0800
+++ b/include/linux/netfilter/x_tables.h	2009-01-28 22:14:54.648490491 -0800
@@ -352,8 +352,8 @@ struct xt_table
 	/* What hooks you will enter on */
 	unsigned int valid_hooks;
 
-	/* Lock for the curtain */
-	rwlock_t lock;
+	/* Lock for curtain */
+	spinlock_t lock;
 
 	/* Man behind the curtain... */
 	struct xt_table_info *private;
@@ -386,6 +386,12 @@ struct xt_table_info
 	/* Secret compartment */
 	seqcount_t *seq;
 
+	/* For the dustman... */
+	union {
+		struct rcu_head rcu;
+		struct work_struct work;
+	};
+
 	/* ipt_entry tables: one per CPU */
 	/* Note : this field MUST be the last one, see XT_TABLE_INFO_SZ */
 	char *entries[1];
--- a/net/ipv4/netfilter/arp_tables.c	2009-01-28 22:13:16.423490077 -0800
+++ b/net/ipv4/netfilter/arp_tables.c	2009-01-28 22:14:54.648490491 -0800
@@ -238,8 +238,8 @@ unsigned int ...
From: Eric Dumazet
Date: Thursday, January 29, 2009 - 4:04 pm

I feel a little bit nervous seeing a write_lock_bh() changed to an rcu_read_lock()

Also, add_counter_to_entry() is not using seqcount protection, so another thread
doing an iptables -L in parallel with this thread may get corrupted counters.


(With write_lock_bh(), this corruption could not occur)


--

From: Stephen Hemminger
Date: Thursday, January 29, 2009 - 4:16 pm

On Fri, 30 Jan 2009 00:04:16 +0100

--

From: Eric Dumazet
Date: Thursday, January 29, 2009 - 11:53 pm

[Empty message]
From: Eric Dumazet
Date: Friday, January 30, 2009 - 12:02 am

Hum, I just checked and indeed there is a problem...

#define SUM_COUNTER(s,c)  do { (s).bcnt += (c).bcnt; (s).pcnt += (c).pcnt; } while(0)

needs to be changed to use

#define SUM_COUNTER(s, c)  do { xt_incr_counter(s, (c).cnt, (c).pcnt);} while (0)



--

From: Eric Dumazet
Date: Friday, January 30, 2009 - 12:05 am

Oops

#define SUM_COUNTER(s, c)  xt_incr_counter(s, (c).bcnt, (c).pcnt)
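
To show why routing SUM_COUNTER through a sequence-protected helper matters,
here is a minimal userspace sketch of the idea. It is an assumption-laden
illustration, not the kernel code: xt_incr_counter() comes from an earlier
patch in the series that is not shown in this thread, so sketch_incr_counter(),
struct seq_counters and read_counters() below are hypothetical stand-ins built
on C11 atomics, and the sketch assumes only one writer at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Mirrors the bcnt/pcnt pair that SUM_COUNTER adds up. */
struct counters {
	uint64_t bcnt;	/* bytes */
	uint64_t pcnt;	/* packets */
};

/* Hypothetical: a counter pair guarded by a sequence number. */
struct seq_counters {
	atomic_uint seq;	/* odd while an update is in progress */
	_Atomic uint64_t bcnt;
	_Atomic uint64_t pcnt;
};

/* Hypothetical stand-in for xt_incr_counter(); single writer assumed. */
static void sketch_incr_counter(struct seq_counters *s,
				uint64_t bytes, uint64_t packets)
{
	atomic_fetch_add(&s->seq, 1);	/* sequence becomes odd */
	atomic_fetch_add(&s->bcnt, bytes);
	atomic_fetch_add(&s->pcnt, packets);
	atomic_fetch_add(&s->seq, 1);	/* sequence is even again */
}

/* Same shape as the corrected macro: every sum goes through the helper. */
#define SUM_COUNTER(s, c)  sketch_incr_counter(&(s), (c).bcnt, (c).pcnt)

/* Reader: retry until a snapshot is taken with no update in between. */
static struct counters read_counters(struct seq_counters *s)
{
	struct counters snap;
	unsigned int start;

	do {
		start = atomic_load(&s->seq);
		snap.bcnt = atomic_load(&s->bcnt);
		snap.pcnt = atomic_load(&s->pcnt);
	} while ((start & 1) || atomic_load(&s->seq) != start);

	return snap;
}
```

With the plain "+=" version of the macro, a reader could observe the new bcnt
together with the old pcnt. With the helper, the reader sees the sequence
number change and retries, so the byte and packet counts it returns always
belong to the same update.
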

--
