[RFT 3/4] netfilter: use sequence number synchronization for counters

Previous thread: [RFT 2/4] netfilter: remove unneeded initializations by Stephen Hemminger on Tuesday, January 27, 2009 - 4:53 pm. (2 messages)

Next thread: [RFT 4/4] netfilter: convert x_tables to use RCU by Stephen Hemminger on Tuesday, January 27, 2009 - 4:53 pm. (2 messages)
From: Stephen Hemminger
Date: Tuesday, January 27, 2009 - 4:53 pm

Change how synchronization is done on the iptables counters. Use a seqcount
wrapper instead of depending on the reader/writer lock.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


---
 include/linux/netfilter/x_tables.h        |    2 +-
 include/linux/netfilter_arp/arp_tables.h  |    3 +++
 include/linux/netfilter_ipv4/ip_tables.h  |    3 +++
 include/linux/netfilter_ipv6/ip6_tables.h |    3 +++
 net/ipv4/netfilter/arp_tables.c           |   20 +++++++++++++-------
 net/ipv4/netfilter/ip_tables.c            |   20 +++++++++++++-------
 net/ipv6/netfilter/ip6_tables.c           |   20 +++++++++++++-------
 net/netfilter/x_tables.c                  |    1 +
 8 files changed, 50 insertions(+), 22 deletions(-)
--- a/include/linux/netfilter_ipv6/ip6_tables.h	2009-01-27 15:03:02.843376881 -0800
+++ b/include/linux/netfilter_ipv6/ip6_tables.h	2009-01-27 15:37:38.935377810 -0800
@@ -103,6 +103,9 @@ struct ip6t_entry
 	/* Back pointer */
 	unsigned int comefrom;
 
+	/* Update of counter synchronization */
+	seqcount_t seq;
+
 	/* Packet and byte counters. */
 	struct xt_counters counters;
 
--- a/net/ipv4/netfilter/arp_tables.c	2009-01-27 14:48:41.579877551 -0800
+++ b/net/ipv4/netfilter/arp_tables.c	2009-01-27 15:45:34.566650540 -0800
@@ -256,7 +256,9 @@ unsigned int arpt_do_table(struct sk_buf
 
 			hdr_len = sizeof(*arp) + (2 * sizeof(struct in_addr)) +
 				(2 * skb->dev->addr_len);
+			write_seqcount_begin(&e->seq);
 			ADD_COUNTER(e->counters, hdr_len, 1);
+			write_seqcount_end(&e->seq);
 
 			t = arpt_get_target(e);
 
@@ -549,6 +551,7 @@ static inline int check_entry_size_and_h
 	   < 0 (not ARPT_RETURN). --RR */
 
 	/* Clear counters and comefrom */
+	seqcount_init(&e->seq);
 	e->counters = ((struct xt_counters) { 0, 0 });
 	e->comefrom = 0;
 
@@ -703,14 +706,17 @@ static void get_counters(const struct xt
 			   &i);
 
 	for_each_possible_cpu(cpu) {
+		struct arpt_entry *e = t->entries[cpu];
+		unsigned int start;
+
 		if (cpu == curcpu)
 ...
From: Eric Dumazet
Date: Tuesday, January 27, 2009 - 11:17 pm

This will never complete on a loaded machine with a big set of rules.
When we reach the end of IPT_ENTRY_ITERATE, we notice that many packets
arrived during the iteration and restart, with wrong accumulated values
(there is no rollback of what was already added to the accumulator).

You want to do the seqcount_begin/end in the leaf function
(add_entry_to_counter()), and accumulate a value pair (bytes/packets)
only once you are sure the pair is correct.

Using one seqcount_t per rule (struct ipt_entry) is very expensive.
(This is 4 bytes per rule x num_possible_cpus().)

You need one seqcount_t per cpu.


--

From: Stephen Hemminger
Date: Tuesday, January 27, 2009 - 11:28 pm

On Wed, 28 Jan 2009 07:17:04 +0100

If we use one count per table, that solves it, but it becomes a hot

The other option would be swapping counters and using rcu, but that adds lots of
RCU synchronization, and RCU sync overhead only seems to be growing.
--

From: Eric Dumazet
Date: Tuesday, January 27, 2009 - 11:35 pm

[Empty message]
From: Patrick McHardy
Date: Wednesday, January 28, 2009 - 9:15 am

Indeed.
--
