Re: [Bugme-new] [Bug 12327] New: Intermittent TCP issues with => 2.6.27

Previous thread: [PATCH] e1000e: Add process name to WARN message when detecting Mutex contention by Jeff Kirsher on Thursday, January 8, 2009 - 7:03 pm. (2 messages)

Next thread: Re: [PATCH] Make possible speeds known to ethtool by Herbert Xu on Thursday, January 8, 2009 - 8:48 pm. (2 messages)
From: Herbert Xu
Date: Thursday, January 8, 2009 - 8:14 pm

You've hit the nail on the head :)

The bridge netfilter is just one huge pile of crap that should
be deleted.

The least we should do is delete the VLAN/PPPOE parts of it because
it's simply bogus.  What if two VLANs/PPPOE sessions use the same
IP address pairs? They'll be treated as a single flow which is just
insane.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--

From: Herbert Xu
Date: Friday, January 9, 2009 - 4:55 am

Hi:

It turns out that even though we have sysctl's that's supposed
to control pppoe/vlan processing, they don't actually work.

This patch should make them work.

bridge: Fix handling of non-IP packets in FORWARD/POST_ROUTING

Currently the bridge FORWARD/POST_ROUTING chains treats all
non-IPv4 packets as IPv6.  This packet fixes that by returning
NF_ACCEPT on non-IP packets instead, just as is done in PRE_ROUTING.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index a65e43a..9a1cd75 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -686,8 +686,11 @@ static unsigned int br_nf_forward_ip(unsigned int hook, struct sk_buff *skb,
 	if (skb->protocol == htons(ETH_P_IP) || IS_VLAN_IP(skb) ||
 	    IS_PPPOE_IP(skb))
 		pf = PF_INET;
-	else
+	else if (skb->protocol == htons(ETH_P_IPV6) || IS_VLAN_IPV6(skb) ||
+		 IS_PPPOE_IPV6(skb))
 		pf = PF_INET6;
+	else
+		return NF_ACCEPT;
 
 	nf_bridge_pull_encap_header(skb);
 
@@ -828,8 +831,11 @@ static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff *skb,
 	if (skb->protocol == htons(ETH_P_IP) || IS_VLAN_IP(skb) ||
 	    IS_PPPOE_IP(skb))
 		pf = PF_INET;
-	else
+	else if (skb->protocol == htons(ETH_P_IPV6) || IS_VLAN_IPV6(skb) ||
+		 IS_PPPOE_IPV6(skb))
 		pf = PF_INET6;
+	else
+		return NF_ACCEPT;
 
 #ifdef CONFIG_NETFILTER_DEBUG
 	if (skb->dst == NULL) {

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--

From: Herbert Xu
Date: Friday, January 9, 2009 - 5:04 am

With that we can actually turn them off.

bridge: Disable PPPOE/VLAN processing by default

The PPPOE/VLAN processing code in the bridge netfilter is broken
by design.  The VLAN tag and the PPPOE session ID are an integral
part of the packet flow information, yet they're completely
ignored by the bridge netfilter.  This is potentially a security
hole as it treats all VLANs and PPPOE sessions as the same.

What's more, it's actually broken for PPPOE as the bridge netfilter
tries to trim the packets to the IP length without adjusting the
PPPOE header (and adjusting the PPPOE header isn't much better
since the PPPOE peer may require the padding to be present).

Therefore we should disable this by default.

It does mean that people relying on this feature may lose networking
depending on how their bridge netfilter rules are configured.
However, IMHO the problems this code causes are serious enough to
warrant this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 9a1cd75..cf754ac 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -58,11 +58,11 @@ static struct ctl_table_header *brnf_sysctl_header;
 static int brnf_call_iptables __read_mostly = 1;
 static int brnf_call_ip6tables __read_mostly = 1;
 static int brnf_call_arptables __read_mostly = 1;
-static int brnf_filter_vlan_tagged __read_mostly = 1;
-static int brnf_filter_pppoe_tagged __read_mostly = 1;
+static int brnf_filter_vlan_tagged __read_mostly = 0;
+static int brnf_filter_pppoe_tagged __read_mostly = 0;
 #else
-#define brnf_filter_vlan_tagged 1
-#define brnf_filter_pppoe_tagged 1
+#define brnf_filter_vlan_tagged 0
+#define brnf_filter_pppoe_tagged 0
 #endif
 
 static inline __be16 vlan_proto(const struct sk_buff *skb)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: ...
From: Patrick McHardy
Date: Sunday, January 11, 2009 - 10:30 pm

A good reason to disable this crap :)

Applied, thanks.
--

From: Speedster
Date: Tuesday, January 13, 2009 - 3:50 am

I can confirm this resolves the issue. I have compiled a 2.6.28 kernel 
with Herbert's patches and I can now use the packaged 2.6.27-9 kernel in 
the Ubuntu guest.

Cheers
Dean
--

From: Dean Holland
Date: Friday, March 6, 2009 - 3:39 am

Have these commits made it into a kernel release yet? I haven't seen 
them in any of the Changelogs and am keen to get back to a packaged 
distribution kernel :)

Sorry if if I have missed it, but if not is there any indication of when 
it should make it in?

Cheers
Dean
--

From: Ilpo Järvinen
Date: Wednesday, March 25, 2009 - 6:26 am

They are included in, commits:

a2bd40ad3151d4d346fd167e01fb84b06f7247fc
47e0e1ca13d64eeeb687995fbe4e239e743d7544

I think they are in 2.9.29-rc2 already and thus in 2.6.29 (if I read 
gitk's output correctly).

-- 
 i.
--

From: Patrick McHardy
Date: Sunday, January 11, 2009 - 10:27 pm

Which reminds me - I think we should change them to default to off,
they regulary break things for unsuspecting users. I'm not sure
though what the best way to warn people for a while would be,

Applied, thanks.
--

From: Patrick McHardy
Date: Sunday, January 11, 2009 - 10:25 pm

Fully agreed. So far unfortunately nobody had the stomach to
get rid of this piece of junk.
--

Previous thread: [PATCH] e1000e: Add process name to WARN message when detecting Mutex contention by Jeff Kirsher on Thursday, January 8, 2009 - 7:03 pm. (2 messages)

Next thread: Re: [PATCH] Make possible speeds known to ethtool by Herbert Xu on Thursday, January 8, 2009 - 8:48 pm. (2 messages)