netfilter 02/12: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize()

From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

Hi Dave,

following are a few late netfilter patches and fixes for 2.6.30, containing:

- Eric's patch to use SLAB_DESTROY_BY_RCU in conntrack, which reduces
  the conntrack size and avoids temporarily exceeding the configured
  maximum number of entries before the RCU threshold kicks in.

- another patch from Eric to factorize the optimized ifname comparisons

- a fix from Eric to use hlist_add_head_rcu in nf_conntrack_set_hashsize()
  to avoid a race condition

- a number of patches from Holger Eitzenberger to perform approximately
  correct allocation (might overshoot by a bit) for ctnetlink event
  messages to avoid reallocation in netlink_trim(). According to some
  benchmarks by Pablo, this increases throughput by about 10% in a
  connection-intensive workload.

- a patch fixing a build-failure in the new LED target

- a patch from Francis Dupont to fix an old regression in the *tables
  loop detection. Slightly modified and ported to ip6_tables and
  arp_tables by myself.

Please apply or pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6.git

Thanks!


 include/linux/netfilter/x_tables.h                 |   23 ++++
 include/net/netfilter/nf_conntrack.h               |   14 ++-
 include/net/netfilter/nf_conntrack_helper.h        |    2 +
 include/net/netfilter/nf_conntrack_l3proto.h       |    7 +
 include/net/netfilter/nf_conntrack_l4proto.h       |    7 +
 include/net/netfilter/nf_conntrack_tuple.h         |    6 +-
 include/net/netlink.h                              |    1 +
 include/net/netns/conntrack.h                      |    5 +-
 net/ipv4/netfilter/arp_tables.c                    |   18 +--
 net/ipv4/netfilter/ip_tables.c                     |   27 +----
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c     |    6 +
 .../netfilter/nf_conntrack_l3proto_ipv4_compat.c   |   63 ++++++----
 net/ipv4/netfilter/nf_conntrack_proto_icmp.c       |    6 +
 net/ipv4/netfilter/nf_nat_core.c                   |    2 +-
 ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit d0dba7255b541f1651a88e75ebdb20dd45509c2f
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Wed Mar 25 18:24:48 2009 +0100

    netfilter: ctnetlink: add callbacks to the per-proto nlattrs
    
    A single callback is added for the l3 proto helper.  The two
    callbacks for the l4 protos are necessary because of the general
    structure of a ctnetlink event, which is in short:
    
     CTA_TUPLE_ORIG
       <l3/l4-proto-attributes>
     CTA_TUPLE_REPLY
       <l3/l4-proto-attributes>
     CTA_ID
     ...
     CTA_PROTOINFO
       <l4-proto-attributes>
     CTA_TUPLE_MASTER
       <l3/l4-proto-attributes>
    
    Therefore the formula is
    
     size := sizeof(generic-nlas) + 3 * sizeof(tuple_nlas) + sizeof(protoinfo_nlas)
    
    Some of the NLAs are optional, e.g. CTA_TUPLE_MASTER, which is only
    set if it's an expected connection.  But the number of optional NLAs is
    small enough to prevent netlink_trim() from reallocating if calculated
    properly.
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/net/netfilter/nf_conntrack_l3proto.h b/include/net/netfilter/nf_conntrack_l3proto.h
index 0378676..9f99d36 100644
--- a/include/net/netfilter/nf_conntrack_l3proto.h
+++ b/include/net/netfilter/nf_conntrack_l3proto.h
@@ -53,10 +53,17 @@ struct nf_conntrack_l3proto
 	int (*tuple_to_nlattr)(struct sk_buff *skb,
 			       const struct nf_conntrack_tuple *t);
 
+	/*
+	 * Calculate size of tuple nlattr
+	 */
+	int (*nlattr_tuple_size)(void);
+
 	int (*nlattr_to_tuple)(struct nlattr *tb[],
 			       struct nf_conntrack_tuple *t);
 	const struct nla_policy *nla_policy;
 
+	size_t nla_size;
+
 #ifdef CONFIG_SYSCTL
 	struct ctl_table_header	*ctl_table_header;
 	struct ctl_path		*ctl_table_path;
diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index b01070b..a120990 100644
--- ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit 2732c4e45bb67006fdc9ae6669be866762711ab5
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Wed Mar 25 21:50:59 2009 +0100

    netfilter: ctnetlink: allocate right-sized ctnetlink skb
    
    Try to allocate a Netlink skb roughly the size of the actual
    message, with the help from the l3 and l4 protocol helpers.
    This is all to prevent a reallocation in netlink_trim() later.
    
    The overhead of allocating the right-sized skb is rather small, with
    ctnetlink_alloc_skb() actually being inlined away on my x86_64 box.
    The size of the per-proto space is determined at registration time of
    the protocol helper.
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
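
The registration-time caching described in this commit message can be sketched in userspace C. This is only an illustration under stated assumptions: every `sk_`-prefixed name, the `protoinfo_size` callback name, and the byte counts returned by the example callbacks are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a protocol helper; not the kernel structure. */
struct sk_proto {
	size_t (*tuple_size)(void);     /* bytes of tuple attributes, may be NULL */
	size_t (*protoinfo_size)(void); /* bytes of protoinfo attributes, may be NULL */
	size_t nla_size;                /* cached once at registration time */
};

/* Made-up sizes, used only to exercise the arithmetic. */
static size_t sk_example_tuple_size(void)
{
	return 16;
}

static size_t sk_example_protoinfo_size(void)
{
	return 24;
}

/*
 * Compute the per-protocol share once. A tuple can appear up to three
 * times in an event (ORIG, REPLY, MASTER), protoinfo at most once.
 */
static void sk_proto_register(struct sk_proto *p)
{
	assert(p != NULL);
	p->nla_size = 0;
	if (p->tuple_size)
		p->nla_size += 3 * p->tuple_size();
	if (p->protoinfo_size)
		p->nla_size += p->protoinfo_size();
}

/* Per-event estimate: generic part plus the cached l3 and l4 shares. */
static size_t sk_event_len(size_t generic, const struct sk_proto *l3,
			   const struct sk_proto *l4)
{
	return generic + l3->nla_size + l4->nla_size;
}
```

With these made-up numbers, an l3 helper that only has a tuple callback caches 48 bytes, an l4 helper with both callbacks caches 72 bytes, and `sk_event_len(100, &l3, &l4)` yields 220. The per-event path then costs only two loads and two additions.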

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 349bbef..03547c6 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -405,6 +405,69 @@ nla_put_failure:
 }
 
 #ifdef CONFIG_NF_CONNTRACK_EVENTS
+/*
+ * The general structure of a ctnetlink event is
+ *
+ *  CTA_TUPLE_ORIG
+ *    <l3/l4-proto-attributes>
+ *  CTA_TUPLE_REPLY
+ *    <l3/l4-proto-attributes>
+ *  CTA_ID
+ *  ...
+ *  CTA_PROTOINFO
+ *    <l4-proto-attributes>
+ *  CTA_TUPLE_MASTER
+ *    <l3/l4-proto-attributes>
+ *
+ * Therefore the formula is
+ *
+ *   size = sizeof(headers) + sizeof(generic_nlas) + 3 * sizeof(tuple_nlas)
+ *		+ sizeof(protoinfo_nlas)
+ */
+static struct sk_buff *
+ctnetlink_alloc_skb(const struct nf_conntrack_tuple *tuple, gfp_t gfp)
+{
+	struct nf_conntrack_l3proto *l3proto;
+	struct nf_conntrack_l4proto *l4proto;
+	int len;
+
+#define NLA_TYPE_SIZE(type)		nla_total_size(sizeof(type))
+
+	/* proto independent part */
+	len = NLMSG_SPACE(sizeof(struct nfgenmsg))
+		+ 3 * nla_total_size(0)		/* CTA_TUPLE_ORIG|REPL|MASTER */
+		+ 3 * nla_total_size(0)		/* CTA_TUPLE_IP */
+		+ 3 * nla_total_size(0)		/* CTA_TUPLE_PROTO ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit b8dfe498775de912116f275680ddb57c8799d9ef
Author: Eric Dumazet <dada1@cosmosbay.com>
Date:   Wed Mar 25 17:31:52 2009 +0100

    netfilter: factorize ifname_compare()
    
    We use the same non-trivial helper function in four places, so factor it out.
    
    Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
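
For readers unfamiliar with the technique, the masked word-wise comparison can be tried in userspace with the sketch below. It is not the patch's code: the `sk_` names and the fixed size of 16 are assumptions for illustration, and it copies words with `memcpy()` instead of casting pointers, so it makes no alignment assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_IFNAMSIZ 16 /* assumed buffer size for this sketch */

/*
 * Returns 0 when a and b agree on every bit selected by mask.
 * All three buffers must be SK_IFNAMSIZ bytes long.
 */
static unsigned long sk_ifname_compare(const char *a, const char *b,
				       const char *mask)
{
	unsigned long wa, wb, wm, ret = 0;
	size_t off;

	/* The word loop only covers the buffer if the size divides evenly. */
	assert(SK_IFNAMSIZ % sizeof(unsigned long) == 0);

	for (off = 0; off < SK_IFNAMSIZ; off += sizeof(unsigned long)) {
		/* memcpy sidesteps alignment and aliasing concerns. */
		memcpy(&wa, a + off, sizeof(wa));
		memcpy(&wb, b + off, sizeof(wb));
		memcpy(&wm, mask + off, sizeof(wm));
		/* Accumulate differing bits that the mask cares about. */
		ret |= (wa ^ wb) & wm;
	}
	return ret;
}
```

A mask of all ones demands an exact match, while a mask covering only the first three bytes makes "eth0" and "eth1" compare equal, which is how a wildcard prefix match can be expressed.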

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index e8e08d0..72918b7 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -435,6 +435,29 @@ extern void xt_free_table_info(struct xt_table_info *info);
 extern void xt_table_entry_swap_rcu(struct xt_table_info *old,
 				    struct xt_table_info *new);
 
+/*
+ * This helper is performance critical and must be inlined
+ */
+static inline unsigned long ifname_compare_aligned(const char *_a,
+						   const char *_b,
+						   const char *_mask)
+{
+	const unsigned long *a = (const unsigned long *)_a;
+	const unsigned long *b = (const unsigned long *)_b;
+	const unsigned long *mask = (const unsigned long *)_mask;
+	unsigned long ret;
+
+	ret = (a[0] ^ b[0]) & mask[0];
+	if (IFNAMSIZ > sizeof(unsigned long))
+		ret |= (a[1] ^ b[1]) & mask[1];
+	if (IFNAMSIZ > 2 * sizeof(unsigned long))
+		ret |= (a[2] ^ b[2]) & mask[2];
+	if (IFNAMSIZ > 3 * sizeof(unsigned long))
+		ret |= (a[3] ^ b[3]) & mask[3];
+	BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+	return ret;
+}
+
 #ifdef CONFIG_COMPAT
 #include <net/compat.h>
 
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 64a7c6c..4b35dba 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -80,19 +80,7 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
 static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
 {
 #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-	const unsigned long *a = (const unsigned long ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit a9a9adfe2f99ddadfb574a098392a007970a1577
Author: Patrick McHardy <kaber@trash.net>
Date:   Wed Mar 25 17:21:34 2009 +0100

    netfilter: fix xt_LED build failure
    
    net/netfilter/xt_LED.c:40: error: field netfilter_led_trigger has incomplete type
    net/netfilter/xt_LED.c: In function led_timeout_callback:
    net/netfilter/xt_LED.c:78: warning: unused variable ledinternal
    net/netfilter/xt_LED.c: In function led_tg_check:
    net/netfilter/xt_LED.c:102: error: implicit declaration of function led_trigger_register
    net/netfilter/xt_LED.c: In function led_tg_destroy:
    net/netfilter/xt_LED.c:135: error: implicit declaration of function led_trigger_unregister
    
    Fix by adding a dependency on LED_TRIGGERS.
    
    Reported-by: Sachin Sant <sachinp@in.ibm.com>
    Tested-by: Subrata Modak <tosubrata@gmail.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 2562d05..2c967e4 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -374,7 +374,7 @@ config NETFILTER_XT_TARGET_HL
 
 config NETFILTER_XT_TARGET_LED
 	tristate '"LED" target support'
-	depends on LEDS_CLASS
+	depends on LEDS_CLASS && LED_TRIGGERS
 	depends on NETFILTER_ADVANCED
 	help
 	  This option adds a `LED' target, which allows you to blink LEDs in
--

From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit 78f3648601fdc7a8166748bbd6d0555a88efa24a
Author: Eric Dumazet <dada1@cosmosbay.com>
Date:   Wed Mar 25 17:24:34 2009 +0100

    netfilter: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize()
    
    Using hlist_add_head() in nf_conntrack_set_hashsize() is quite dangerous.
    Without any barrier, one CPU could see a loop while doing its lookup.
    It's true that the new table cannot be seen by another CPU, but the
    previous table is still readable.
    
    Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 55befe5..54e983f 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1121,7 +1121,7 @@ int nf_conntrack_set_hashsize(const char *val, struct kernel_param *kp)
 					struct nf_conntrack_tuple_hash, hnode);
 			hlist_del_rcu(&h->hnode);
 			bucket = __hash_conntrack(&h->tuple, hashsize, rnd);
-			hlist_add_head(&h->hnode, &hash[bucket]);
+			hlist_add_head_rcu(&h->hnode, &hash[bucket]);
 		}
 	}
 	old_size = nf_conntrack_htable_size;
--
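
The ordering that an RCU-style head insertion relies on (fill in the node's next pointer first, then publish the node with a barrier) can be illustrated with C11 atomics. This is an analogy only, assuming a single writer: the `sk_` names are hypothetical, the `pprev` back pointer of the kernel's hlist is omitted, and release/acquire ordering stands in for the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sk_node {
	int key;
	struct sk_node *_Atomic next;
};

struct sk_head {
	struct sk_node *_Atomic first;
};

static void sk_head_init(struct sk_head *h)
{
	atomic_init(&h->first, NULL);
}

static void sk_node_init(struct sk_node *n, int key)
{
	n->key = key;
	atomic_init(&n->next, NULL);
}

/* Writer side; callers must serialize writers themselves. */
static void sk_add_head_rcu(struct sk_node *n, struct sk_head *h)
{
	struct sk_node *first;

	assert(n != NULL && h != NULL);
	first = atomic_load_explicit(&h->first, memory_order_relaxed);
	/* Step 1: make the node point at the current chain. */
	atomic_store_explicit(&n->next, first, memory_order_relaxed);
	/* Step 2: publish; the release store orders step 1 before it. */
	atomic_store_explicit(&h->first, n, memory_order_release);
}

/* Reader side: acquire loads pair with the writer's release store. */
static struct sk_node *sk_lookup(struct sk_head *h, int key)
{
	struct sk_node *n = atomic_load_explicit(&h->first,
						 memory_order_acquire);

	while (n != NULL && n->key != key)
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	return n;
}
```

Without the ordering in step 2, a concurrent reader could observe the two stores in the opposite order and follow a stale next pointer.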

From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit af9d32ad6718b9a80fa89f557cc1fbb63a93ec15
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Wed Mar 25 18:44:01 2009 +0100

    netfilter: limit the length of the helper name
    
    This is necessary in order to have an upper bound for Netlink
    message calculation, which is not a problem at all, as there
    are no helpers with a longer name.
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 66d65a7..ee2a4b3 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -14,6 +14,8 @@
 
 struct module;
 
+#define NF_CT_HELPER_NAME_LEN	16
+
 struct nf_conntrack_helper
 {
 	struct hlist_node hnode;	/* Internal use. */
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index a51bdac..805cfdd 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -142,6 +142,7 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
 
 	BUG_ON(me->expect_policy == NULL);
 	BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
+	BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
 
 	mutex_lock(&nf_ct_helper_mutex);
 	hlist_add_head_rcu(&me->hnode, &nf_ct_helper_hash[h]);
--

From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit d271e8bd8c60ce059ee36d836ba063cfc61c3e21
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Thu Mar 26 13:37:14 2009 +0100

    ctnetlink: compute generic part of event more accurately
    
    On a box with most of the optional Netfilter switches turned off some
    of the NLAs are never sent, e.g. secmark, mark or the conntrack
    byte/packet counters.  As a worst case scenario this may possibly
    still lead to ctnetlink skbs being reallocated in netlink_trim()
    later, losing all the nice effects from the previous patches.
    
    I try to solve that (at least partly) by correctly #ifdef'ing the
    NLAs in the computation.
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 03547c6..2fb833b 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -441,19 +441,28 @@ ctnetlink_alloc_skb(const struct nf_conntrack_tuple *tuple, gfp_t gfp)
 		+ 3 * NLA_TYPE_SIZE(u_int8_t)	/* CTA_PROTO_NUM */
 		+ NLA_TYPE_SIZE(u_int32_t)	/* CTA_ID */
 		+ NLA_TYPE_SIZE(u_int32_t)	/* CTA_STATUS */
+#ifdef CONFIG_NF_CT_ACCT
 		+ 2 * nla_total_size(0)		/* CTA_COUNTERS_ORIG|REPL */
 		+ 2 * NLA_TYPE_SIZE(uint64_t)	/* CTA_COUNTERS_PACKETS */
 		+ 2 * NLA_TYPE_SIZE(uint64_t)	/* CTA_COUNTERS_BYTES */
+#endif
 		+ NLA_TYPE_SIZE(u_int32_t)	/* CTA_TIMEOUT */
 		+ nla_total_size(0)		/* CTA_PROTOINFO */
 		+ nla_total_size(0)		/* CTA_HELP */
 		+ nla_total_size(NF_CT_HELPER_NAME_LEN)	/* CTA_HELP_NAME */
+#ifdef CONFIG_NF_CONNTRACK_SECMARK
 		+ NLA_TYPE_SIZE(u_int32_t)	/* CTA_SECMARK */
+#endif
+#ifdef CONFIG_NF_NAT_NEEDED
 		+ 2 * nla_total_size(0)		/* CTA_NAT_SEQ_ADJ_ORIG|REPL */
 		+ 2 * NLA_TYPE_SIZE(u_int32_t)	/* CTA_NAT_SEQ_CORRECTION_POS */
 		+ 2 * NLA_TYPE_SIZE(u_int32_t)	/* CTA_NAT_SEQ_CORRECTION_BEFORE */
 		+ 2 * NLA_TYPE_SIZE(u_int32_t)	/* ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit 5c0de29d06318ec8f6e3ba0d17d62529dbbdc1e8
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Wed Mar 25 21:52:17 2009 +0100

    netfilter: nf_conntrack: add generic function to get len of generic policy
    
    Useful for all protocols which do not add additional data, such
    as GRE or UDPlite.
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/net/netfilter/nf_conntrack_l4proto.h b/include/net/netfilter/nf_conntrack_l4proto.h
index a120990..ba32ed7 100644
--- a/include/net/netfilter/nf_conntrack_l4proto.h
+++ b/include/net/netfilter/nf_conntrack_l4proto.h
@@ -113,6 +113,7 @@ extern int nf_ct_port_tuple_to_nlattr(struct sk_buff *skb,
 				      const struct nf_conntrack_tuple *tuple);
 extern int nf_ct_port_nlattr_to_tuple(struct nlattr *tb[],
 				      struct nf_conntrack_tuple *t);
+extern int nf_ct_port_nlattr_tuple_size(void);
 extern const struct nla_policy nf_ct_port_nla_policy[];
 
 #ifdef CONFIG_SYSCTL
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index c55bbdc..b182b30 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -921,6 +921,12 @@ int nf_ct_port_nlattr_to_tuple(struct nlattr *tb[],
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nf_ct_port_nlattr_to_tuple);
+
+int nf_ct_port_nlattr_tuple_size(void)
+{
+	return nla_policy_len(nf_ct_port_nla_policy, CTA_PROTO_MAX + 1);
+}
+EXPORT_SYMBOL_GPL(nf_ct_port_nlattr_tuple_size);
 #endif
 
 /* Used by ipt_REJECT and ip6t_REJECT. */
--

From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit a400c30edb1958ceb53c4b8ce78989189b36df47
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Wed Mar 25 21:53:39 2009 +0100

    netfilter: nf_conntrack: calculate per-protocol nlattr size
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 8b681f2..7d2ead7 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -328,6 +328,11 @@ static int ipv4_nlattr_to_tuple(struct nlattr *tb[],
 
 	return 0;
 }
+
+static int ipv4_nlattr_tuple_size(void)
+{
+	return nla_policy_len(ipv4_nla_policy, CTA_IP_MAX + 1);
+}
 #endif
 
 static struct nf_sockopt_ops so_getorigdst = {
@@ -347,6 +352,7 @@ struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv4 __read_mostly = {
 	.get_l4proto	 = ipv4_get_l4proto,
 #if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
 	.tuple_to_nlattr = ipv4_tuple_to_nlattr,
+	.nlattr_tuple_size = ipv4_nlattr_tuple_size,
 	.nlattr_to_tuple = ipv4_nlattr_to_tuple,
 	.nla_policy	 = ipv4_nla_policy,
 #endif
diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index 2a8bee2..23b2c2e 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -262,6 +262,11 @@ static int icmp_nlattr_to_tuple(struct nlattr *tb[],
 
 	return 0;
 }
+
+static int icmp_nlattr_tuple_size(void)
+{
+	return nla_policy_len(icmp_nla_policy, CTA_PROTO_MAX + 1);
+}
 #endif
 
 #ifdef CONFIG_SYSCTL
@@ -309,6 +314,7 @@ struct nf_conntrack_l4proto nf_conntrack_l4proto_icmp __read_mostly =
 	.me			= NULL,
 #if defined(CONFIG_NF_CT_NETLINK) || defined(CONFIG_NF_CT_NETLINK_MODULE)
 	.tuple_to_nlattr	= icmp_tuple_to_nlattr,
+	.nlattr_tuple_size	= icmp_nlattr_tuple_size,
 	.nlattr_to_tuple	= ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit e487eb99cf9381a4f8254fa01747a85818da612b
Author: Holger Eitzenberger <holger@eitzenberger.org>
Date:   Wed Mar 25 18:26:30 2009 +0100

    netlink: add nla_policy_len()
    
    It calculates the max. length of a Netlink policy, which is useful
    for allocating Netlink buffers roughly the size of the actual
    message.
    
    Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
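
The arithmetic behind such an upper bound can be reproduced in userspace with the sketch below. It assumes the usual Netlink attribute layout (a 4-byte attribute header, payload padded to a multiple of 4); the `sk_` types and the minimum-length table are simplified, hypothetical stand-ins rather than the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NLA_ALIGNTO 4u /* attributes are padded to 4 bytes */
#define SK_NLA_HDRLEN  4u /* 16-bit length plus 16-bit type */

enum sk_type { SK_UNSPEC, SK_U8, SK_U16, SK_U32, SK_U64, SK_TYPE_MAX };

struct sk_policy {
	unsigned short type; /* one of enum sk_type */
	unsigned short len;  /* explicit maximum payload length, 0 if unset */
};

/* Simplified minimum payload length per type (assumption of this sketch). */
static const size_t sk_minlen[SK_TYPE_MAX] = {
	[SK_U8] = 1, [SK_U16] = 2, [SK_U32] = 4, [SK_U64] = 8,
};

/* Header plus payload, rounded up to the attribute alignment. */
static size_t sk_nla_total_size(size_t payload)
{
	return (SK_NLA_HDRLEN + payload + SK_NLA_ALIGNTO - 1)
		& ~(size_t)(SK_NLA_ALIGNTO - 1);
}

/* Sum the padded size of every policy entry that has a known length. */
static size_t sk_policy_len(const struct sk_policy *p, int n)
{
	size_t len = 0;
	int i;

	assert(p != NULL || n == 0);
	/* Advance p together with i so that each entry is inspected. */
	for (i = 0; i < n; i++, p++) {
		if (p->len)
			len += sk_nla_total_size(p->len);
		else if (sk_minlen[p->type])
			len += sk_nla_total_size(sk_minlen[p->type]);
	}
	return len;
}
```

For a four-entry policy of one unspecified slot, two 16-bit values and one 8-bit value, each sized attribute pads to 8 bytes and the total comes to 24 bytes.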

diff --git a/include/net/netlink.h b/include/net/netlink.h
index 8a6150a..eddb502 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -230,6 +230,7 @@ extern int		nla_validate(struct nlattr *head, int len, int maxtype,
 extern int		nla_parse(struct nlattr *tb[], int maxtype,
 				  struct nlattr *head, int len,
 				  const struct nla_policy *policy);
+extern int		nla_policy_len(const struct nla_policy *, int);
 extern struct nlattr *	nla_find(struct nlattr *head, int len, int attrtype);
 extern size_t		nla_strlcpy(char *dst, const struct nlattr *nla,
 				    size_t dstsize);
diff --git a/net/netlink/attr.c b/net/netlink/attr.c
index 56c3ce7..ae32c57 100644
--- a/net/netlink/attr.c
+++ b/net/netlink/attr.c
@@ -133,6 +133,32 @@ errout:
 }
 
 /**
+ * nla_policy_len - Determine the max. length of a policy
+ * @p: policy to use
+ * @n: number of policies
+ *
+ * Determines the max. length of the policy.  It is currently used
+ * to allocate Netlink buffers roughly the size of the actual
+ * message.
+ *
+ * Returns the max. length in bytes.
+ */
+int
+nla_policy_len(const struct nla_policy *p, int n)
+{
+	int i, len = 0;
+
+	for (i = 0; i < n; i++, p++) {
+		if (p->len)
+			len += nla_total_size(p->len);
+		else if (nla_attr_minlen[p->type])
+			len += nla_total_size(nla_attr_minlen[p->type]);
+	}
+
+	return len;
+}
+
+/**
  * nla_parse - Parse a stream of attributes into a tb buffer
  * @tb: destination array with maxtype+1 elements
  * @maxtype: maximum attribute type to be ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit 1f9352ae2253a97b07b34dcf16ffa3b4ca12c558
Author: Patrick McHardy <kaber@trash.net>
Date:   Wed Mar 25 19:26:35 2009 +0100

    netfilter: {ip,ip6,arp}_tables: fix incorrect loop detection
    
    Commit e1b4b9f ([NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case
    search for loops) introduced a regression in the loop detection algorithm,
    causing sporadic incorrectly detected loops.
    
    When a chain has already been visited during the check, it is treated as
    having a standard target containing a RETURN verdict directly at the
    beginning in order to not check it again. The real target of the first
    rule is then incorrectly treated as STANDARD target and checked not to
    contain invalid verdicts.
    
    Fix by making sure the rule does actually contain a standard target.
    
    Based on patch by Francis Dupont <Francis_Dupont@isc.org>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
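
The shape of the fix, checking the verdict only when the rule really carries a standard target, can be sketched as a small predicate. The `sk_` names are hypothetical, and the empty-string target name and the value 5 for the largest verdict are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SK_STANDARD_TARGET ""  /* assumed name of the standard target */
#define SK_NF_MAX_VERDICT  5   /* assumed largest verdict number */

/*
 * A verdict is only meaningful for a standard target. For any other
 * target the same bytes hold target-specific data, so the range check
 * must not be applied to them.
 */
static bool sk_bad_verdict(const char *target_name, int verdict)
{
	assert(target_name != NULL);
	return strcmp(target_name, SK_STANDARD_TARGET) == 0 &&
	       verdict < -SK_NF_MAX_VERDICT - 1;
}
```

Under these assumptions `sk_bad_verdict("", -10)` is true, while `sk_bad_verdict("LOG", -10)` is false because the value is not a verdict at all.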

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 4b35dba..4f454ce 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -388,7 +388,9 @@ static int mark_source_chains(struct xt_table_info *newinfo,
 			    && unconditional(&e->arp)) || visited) {
 				unsigned int oldpos, size;
 
-				if (t->verdict < -NF_MAX_VERDICT - 1) {
+				if ((strcmp(t->target.u.user.name,
+					    ARPT_STANDARD_TARGET) == 0) &&
+				    t->verdict < -NF_MAX_VERDICT - 1) {
 					duprintf("mark_source_chains: bad "
 						"negative verdict (%i)\n",
 								t->verdict);
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 41c59e3..82ee7c9 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -488,7 +488,9 @@ mark_source_chains(struct xt_table_info *newinfo,
 			    && unconditional(&e->ip)) || visited) {
 				unsigned int oldpos, size;
 
-				if (t->verdict < -NF_MAX_VERDICT - 1) {
+				if ((strcmp(t->target.u.user.name,
+			    		    ...
From: Patrick McHardy
Date: Thursday, March 26, 2009 - 12:02 pm

commit ea781f197d6a835cbb93a0bf88ee1696296ed8aa
Author: Eric Dumazet <dada1@cosmosbay.com>
Date:   Wed Mar 25 21:05:46 2009 +0100

    netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()
    
    Use "hlist_nulls" infrastructure we added in 2.6.29 for RCUification of UDP & TCP.
    
    This permits an easy conversion from call_rcu() based hash lists to a
    SLAB_DESTROY_BY_RCU one.
    
    Avoiding call_rcu() delay at nf_conn freeing time has numerous gains.
    
    - It doesn't fill RCU queues (up to 10000 elements per cpu).
      This reduces OOM possibility, if queued elements are not taken into
      account, and reduces latency problems when RCU queue size hits
      hilimit and triggers emergency mode.
    
    - It allows fast reuse of just freed elements, permitting better use of
      CPU cache.
    
    - We delete rcu_head from "struct nf_conn", shrinking size of this structure
      by 8 or 16 bytes.
    
    This patch only takes care of "struct nf_conn".
    call_rcu() is still used for less critical conntrack parts, that may
    be converted later if necessary.
    
    Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>
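
Because objects from such a cache may be freed and reused without waiting for a grace period, a lockless lookup generally has to take a reference and then confirm that it still holds the object it was looking for. The sketch below shows that general pattern with C11 atomics. It is not the patch's code: the `sk_` names are hypothetical and an integer key stands in for the connection tuple.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_conn {
	atomic_int refcnt; /* 0 means the object is free or being freed */
	atomic_int key;    /* stands in for the connection tuple */
};

static void sk_conn_init(struct sk_conn *c, int key, int refcnt)
{
	atomic_init(&c->refcnt, refcnt);
	atomic_init(&c->key, key);
}

/* Take a reference unless the count has already dropped to zero. */
static bool sk_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

static void sk_put(struct sk_conn *c)
{
	atomic_fetch_sub(&c->refcnt, 1);
}

/* Validate a candidate found by a lockless hash walk. */
static struct sk_conn *sk_find_get(struct sk_conn *candidate, int key)
{
	if (candidate == NULL || atomic_load(&candidate->key) != key)
		return NULL;
	if (!sk_get_unless_zero(&candidate->refcnt))
		return NULL; /* object is on its way out */
	/* Recheck: the slot may have been reused before we pinned it. */
	if (atomic_load(&candidate->key) != key) {
		sk_put(candidate);
		return NULL;
	}
	return candidate;
}
```

A caller that gets a non-NULL result owns one reference and must eventually drop it with `sk_put()`. A caller that gets NULL after a failed recheck would normally restart its lookup.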

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 4dfb793..6c3f964 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -91,8 +91,7 @@ struct nf_conn_help {
 #include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
 #include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
 
-struct nf_conn
-{
+struct nf_conn {
 	/* Usage count in here is 1 for hash table/destruct timer, 1 per skb,
            plus 1 for any connection(s) we are `master' for */
 	struct nf_conntrack ct_general;
@@ -126,7 +125,6 @@ struct nf_conn
 #ifdef CONFIG_NET_NS
 	struct net *ct_net;
 #endif
-	struct rcu_head rcu;
 };
 
 static inline struct nf_conn *
@@ -190,9 +188,13 @@ static inline void nf_ct_put(struct ...
From: David Miller
Date: Thursday, March 26, 2009 - 10:46 pm

From: Patrick McHardy <kaber@trash.net>

Pulled, thanks.
--
