Re: [PATCH] netfilter: factorize ifname_compare()

Previous thread: netfilter 10/41: xt_physdev fixes by Patrick McHardy on Tuesday, March 24, 2009 - 7:03 am. (1 message)

Next thread: netfilter 12/41: xt_physdev: unfold two loops in physdev_mt() by Patrick McHardy on Tuesday, March 24, 2009 - 7:03 am. (1 message)
From: Patrick McHardy
Date: Tuesday, March 24, 2009 - 7:03 am

commit ddc214c43a923e89741e04da2f10e3037a64e222
Author: Eric Dumazet <dada1@cosmosbay.com>
Date:   Wed Feb 18 17:47:50 2009 +0100

    netfilter: arp_tables: unfold two critical loops in arp_packet_match()
    
    x86 and powerpc can perform long word accesses in an efficient maner.
    We can use this to unroll two loops in arp_packet_match(), to
    perform arithmetic on long words instead of bytes. This is a win
    on x86_64 for example.
    
    Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 7ea88b6..b5db463 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -73,6 +73,36 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
 	return (ret != 0);
 }
 
+/*
+ * Unfortunatly, _b and _mask are not aligned to an int (or long int)
+ * Some arches dont care, unrolling the loop is a win on them.
+ */
+static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
+{
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+	const unsigned long *a = (const unsigned long *)_a;
+	const unsigned long *b = (const unsigned long *)_b;
+	const unsigned long *mask = (const unsigned long *)_mask;
+	unsigned long ret;
+
+	ret = (a[0] ^ b[0]) & mask[0];
+	if (IFNAMSIZ > sizeof(unsigned long))
+		ret |= (a[1] ^ b[1]) & mask[1];
+	if (IFNAMSIZ > 2 * sizeof(unsigned long))
+		ret |= (a[2] ^ b[2]) & mask[2];
+	if (IFNAMSIZ > 3 * sizeof(unsigned long))
+		ret |= (a[3] ^ b[3]) & mask[3];
+	BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+#else
+	unsigned long ret = 0;
+	int i;
+
+	for (i = 0; i < IFNAMSIZ; i++)
+		ret |= (_a[i] ^ _b[i]) & _mask[i];
+#endif
+	return ret;
+}
+
 /* Returns whether packet matches rule or not. */
 static inline int arp_packet_match(const struct arphdr *arphdr,
 				   struct net_device *dev,
@@ -83,7 +113,7 @@ static inline int ...
From: David Miller
Date: Tuesday, March 24, 2009 - 1:29 pm

From: Patrick McHardy <kaber@trash.net>

I think we can at least give some help for the platforms which
require alignment.

We can, for example, assume 16-bit alignment and thus loop
over u16's
--

From: Eric Dumazet
Date: Tuesday, March 24, 2009 - 2:06 pm

Right. How about this incremental patch ?

Thanks

[PATCH] arp_tables: ifname_compare() can assume 16bit alignment

Arches without efficient unaligned access can still perform a loop
assuming 16bit alignment in ifname_compare()

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>


diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 64a7c6c..84b9c17 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -76,6 +76,7 @@ static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap,
 /*
  * Unfortunatly, _b and _mask are not aligned to an int (or long int)
  * Some arches dont care, unrolling the loop is a win on them.
+ * For other arches, we only have a 16bit alignement.
  */
 static unsigned long ifname_compare(const char *_a, const char *_b, const char *_mask)
 {
@@ -95,10 +96,13 @@ static unsigned long ifname_compare(const char *_a, const char *_b, const char *
 	BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
 #else
 	unsigned long ret = 0;
+	const u16 *a = (const u16 *)_a;
+	const u16 *b = (const u16 *)_b;
+	const u16 *mask = (const u16 *)_mask;
 	int i;
 
-	for (i = 0; i < IFNAMSIZ; i++)
-		ret |= (_a[i] ^ _b[i]) & _mask[i];
+	for (i = 0; i < IFNAMSIZ/sizeof(u16); i++)
+		ret |= (a[i] ^ b[i]) & mask[i];
 #endif
 	return ret;
 }

--

From: David Miller
Date: Tuesday, March 24, 2009 - 2:16 pm

From: Eric Dumazet <dada1@cosmosbay.com>

Looks great, applied, thanks Eric!
--

From: Jan Engelhardt
Date: Tuesday, March 24, 2009 - 2:17 pm

Allow me some skepticism, but the code looks pretty much like a
--

From: David Miller
Date: Tuesday, March 24, 2009 - 2:18 pm

From: Jan Engelhardt <jengelh@medozas.de>

memcmp() can't make any assumptions about alignment.
Whereas we _know_ this thing is exactly 16-bit aligned.

All of the optimized memcmp() implementations look for
32-bit alignment and punt to byte at a time comparison
loops if things are not aligned enough.
--

From: Jan Engelhardt
Date: Tuesday, March 24, 2009 - 2:23 pm

Yes, I seem to remember glibc doing something like

 if ((addr & 0x03) != 0) {
     // process single bytes (increment addr as you go)
     // until addr & 0x03 == 0.
 }

 /* optimized loop here. also increases addr */

 if ((addr & 0x03) != 0)
     // still bytes left after loop - process on a per-byte basis

Is the cost of testing for non-4-divisibility expensive enough
to warrant not usnig memcmp?

Irrespective of all that, I think putting the interface comparison
code should be agglomerated in a function/header so that it is
replicated across iptables, ip6tables, ebtables, arptables, etc.
--

From: David Miller
Date: Tuesday, March 24, 2009 - 2:25 pm

From: Jan Engelhardt <jengelh@medozas.de>


Agreed.
--

From: Eric Dumazet
Date: Tuesday, March 24, 2009 - 2:39 pm

memcmp() is fine, but how is it solving the masking problem we have ?

Also in the case of arp_tables, _a is long word aligned, while _b and _mask are not.

memcmp() in this case is slower, (and dont handle mask thing)

If you look various ifname_compare(), we have two different implementations.

So yes, a factorization is possible for three ip_tables.c, ip6_tables.c and xt_physdev.c




--

From: Jan Engelhardt
Date: Tuesday, March 24, 2009 - 2:52 pm

You are right; we would have to calculate the prefix length of the
mask first. (And I think we can assume that there will not be any 1s

Very well.
--

From: Eric Dumazet
Date: Wednesday, March 25, 2009 - 4:27 am

Here is a patch to factorize these functions, as suggested by Jan

Thank you

[PATCH] netfilter: factorize ifname_compare()

We use same not trivial helper function in four places. We can factorize it.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
 include/linux/netfilter/x_tables.h |   23 +++++++++++++++++++++++
 net/ipv4/netfilter/arp_tables.c    |   14 +-------------
 net/ipv4/netfilter/ip_tables.c     |   23 ++---------------------
 net/ipv6/netfilter/ip6_tables.c    |   23 ++---------------------
 net/netfilter/xt_physdev.c         |   21 ++-------------------
 5 files changed, 30 insertions(+), 74 deletions(-)

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index e8e08d0..72918b7 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -435,6 +435,29 @@ extern void xt_free_table_info(struct xt_table_info *info);
 extern void xt_table_entry_swap_rcu(struct xt_table_info *old,
 				    struct xt_table_info *new);
 
+/*
+ * This helper is performance critical and must be inlined
+ */
+static inline unsigned long ifname_compare_aligned(const char *_a,
+						   const char *_b,
+						   const char *_mask)
+{
+	const unsigned long *a = (const unsigned long *)_a;
+	const unsigned long *b = (const unsigned long *)_b;
+	const unsigned long *mask = (const unsigned long *)_mask;
+	unsigned long ret;
+
+	ret = (a[0] ^ b[0]) & mask[0];
+	if (IFNAMSIZ > sizeof(unsigned long))
+		ret |= (a[1] ^ b[1]) & mask[1];
+	if (IFNAMSIZ > 2 * sizeof(unsigned long))
+		ret |= (a[2] ^ b[2]) & mask[2];
+	if (IFNAMSIZ > 3 * sizeof(unsigned long))
+		ret |= (a[3] ^ b[3]) & mask[3];
+	BUILD_BUG_ON(IFNAMSIZ > 4 * sizeof(unsigned long));
+	return ret;
+}
+
 #ifdef CONFIG_COMPAT
 #include <net/compat.h>
 
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 84b9c17..ee1133c 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -81,19 ...
From: Patrick McHardy
Date: Wednesday, March 25, 2009 - 9:32 am

Applied, thanks a lot Eric.
--

From: Andi Kleen
Date: Wednesday, March 25, 2009 - 3:33 am

Newer gcc often[1] knows how to generate specialized aligned memcmp()
based on the type alignment.

However you need to make sure the input types are right.

So in theories some casts might be enough given a new enough
compiler.

[1] not always unfortunately, sometimes it loses this information.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--

Previous thread: netfilter 10/41: xt_physdev fixes by Patrick McHardy on Tuesday, March 24, 2009 - 7:03 am. (1 message)

Next thread: netfilter 12/41: xt_physdev: unfold two loops in physdev_mt() by Patrick McHardy on Tuesday, March 24, 2009 - 7:03 am. (1 message)