Re: [PATCH 2/2] [e1000 VLAN] Disable vlan hw accel when promiscuous mode

Previous thread: [PATCH 1/2] [VLAN] Don't drop an unclassfied vlan packet as PACKET_OTHERHOST by Joonwoo Park on Saturday, November 10, 2007 - 5:51 pm. (1 message)

Next thread: Re : Oops preceded by WARNING: at net/ipv4/tcp_input.c:1571 tcp_remove_reno_sacks() by Chazarain Guillaume on Saturday, November 10, 2007 - 6:39 pm. (9 messages)
From: Joonwoo Park
Date: Saturday, November 10, 2007 - 5:51 pm

IMHO even though netdevice is in the promiscuous mode, we should receive all of ingress packets.
This disable the vlan filtering feature when a vlan hw accel configured e1000 device goes into promiscuous mode.
This make packets visible to sniffers though it's not vlan id of itself.
Any check, comments will be appreciated.
Thanks.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
---
 drivers/net/e1000/e1000_main.c |   26 ++++++++++++++++++++------
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 72deff0..cdd5c84 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2424,7 +2424,7 @@ e1000_set_multi(struct net_device *netdev)
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	struct dev_mc_list *mc_ptr;
-	uint32_t rctl;
+	uint32_t rctl, ctrl;
 	uint32_t hash_value;
 	int i, rar_entries = E1000_RAR_ENTRIES;
 	int mta_reg_count = (hw->mac_type == e1000_ich8lan) ?
@@ -2441,14 +2441,25 @@ e1000_set_multi(struct net_device *netdev)
 	/* Check for Promiscuous and All Multicast modes */
 
 	rctl = E1000_READ_REG(hw, RCTL);
+	ctrl = E1000_READ_REG(&adapter->hw, CTRL);
 
 	if (netdev->flags & IFF_PROMISC) {
 		rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE);
-	} else if (netdev->flags & IFF_ALLMULTI) {
-		rctl |= E1000_RCTL_MPE;
-		rctl &= ~E1000_RCTL_UPE;
+		if (adapter->hw.mac_type != e1000_ich8lan) {
+			if (ctrl & E1000_CTRL_VME)
+				rctl &= ~E1000_RCTL_VFE;
+		}
 	} else {
-		rctl &= ~(E1000_RCTL_UPE | E1000_RCTL_MPE);
+		if (adapter->hw.mac_type != e1000_ich8lan) {
+			if (ctrl & E1000_CTRL_VME)
+				rctl |= E1000_RCTL_VFE;
+		}
+		if (netdev->flags & IFF_ALLMULTI) {
+			rctl |= E1000_RCTL_MPE;
+			rctl &= ~E1000_RCTL_UPE;
+		} else {
+			rctl &= ~(E1000_RCTL_UPE | E1000_RCTL_MPE);
+		}
 	}
 
 	E1000_WRITE_REG(hw, RCTL, rctl);
@@ -4952,7 +4963,10 @@ e1000_vlan_rx_register(struct net_device *netdev, struct ...
From: Kok, Auke
Date: Monday, November 12, 2007 - 10:12 am

Actually I think this patch removes a choice from the user.

Before this patch, the user can sniff all traffic by disabling vlans, or a
specific vlan only by leaving vlans on when going into promisc mode.

After this patch, the user has no choice but to sniff all vlans at all times.

I don't think that that is such a good improvement.

-

From: Patrick McHardy
Date: Monday, November 12, 2007 - 10:21 am

Do you really consider that a realistic choice? Who is going to
remove interfaces that are in use just to see traffic for other
VLANs? Sniffing specific VLANs can always be done on the VLAN
device itself.

IMO its more a question of what we want promiscous mode to mean,
and I tend to agree with Joonwoo that it should receive all packets.
-

From: Kok, Auke
Date: Monday, November 12, 2007 - 11:01 am

OK, Joonwoo: can you submit a patch against e1000e as well?

Auke
-

From: David Miller
Date: Monday, November 12, 2007 - 3:33 pm

From: Patrick McHardy <kaber@trash.net>

Change the example to wanting to see traffic on another
physical switch.


On what LAN?

When you select VLAN, you by definition are asking for non-VLAN
traffic to be elided.  It is like plugging the ethernet cable
into one switch or another.
-

From: Chris Friesen
Date: Monday, November 12, 2007 - 3:43 pm

For max functionality it seems like the raw eth device should show 
everything on the wire in promiscuous mode.

If we want to sniff only the traffic for a specific vlan, we can sniff 
the vlan device.

Chris
-

From: Kok, Auke
Date: Monday, November 12, 2007 - 3:54 pm

actually the impact can be quite negative, imagine doing a tcpdump on a 10gig
interface with vlan's enabled - all of a sudden you might accidentally flood the
system with a 100-fold increase in traffic and force the stack to dump all those
packets for you.

I'm still very reluctant about this patch, I think the current situation is OK for
everyone and offers everyone the possibility to do what they need, without hidden
consequences.

Auke
-

From: David Miller
Date: Monday, November 12, 2007 - 3:57 pm

From: "Chris Friesen" <cfriesen@nortel.com>

VLAN settings are a filter of sorts, much like plugging into
one switch or another filters traffic physically.

If you don't want that filter, turn the VLAN settings off.
-

From: Willy Tarreau
Date: Monday, November 12, 2007 - 4:15 pm

I don't really agree with that view. Having spent a lot of time with
tcpdump on production systems, I can say that sometimes you'd like to
be aware that one of your VLANs is wrong and you'd simply like to
sniff the wire to guess the correct tag. And on production, you simply
cannot remove other VLANs, otherwise you disrupt the service.

Basically, what generally happens is that the guy responsible for the
switch tells you "it's OK now", but for you it isn't and you cannot
access the switch.

If the solution is to disable VLAN hardware acceleration, I agree that
it is very risky to do that without the user being aware of it. But at
least we should be able to do this by any means (eg: ethtool) without
disabling what's running.

And since you made the parallel with a switch, when you receive tagged
traffic on a switch port, you generally can mirror that port to another
one and catch all VLANs at once. A new feature that is starting to appear
is the ability to mirror tagged traffic to a VLAN on another port (which
means you get a double 802.1q tag). This is useful for inter-site links
between data-centers for instance.

Regards,
Willy

-

From: David Miller
Date: Monday, November 12, 2007 - 4:19 pm

From: Willy Tarreau <w@1wt.eu>

If you were plugged into the wrong physical switch, how might
you debug that problem in production? :-)

Really, it's the same issue, just virtualized, as in the name
for the feature, VLAN.
-

From: Willy Tarreau
Date: Monday, November 12, 2007 - 4:32 pm

tcpdump on an unfiltered RJ45 tells you a lot of things. Some switches
for instance send keepalive packets (ether proto 9000) with their own
MAC as source+dest, others send CDP packets, etc... I've very rarely
stayed plugged to the wrong switch for too long before discovering it.

No, it's not the same, because as soon as you pass through a switch
which is not configured for VLANs, it does not make any distinction,
and the same DMAC is considered on a single port whatever the VLAN.
This is not true with multiple switches. Also, sometimes, being able
to see that your port gets flooded with traffic for a VLAN on which
you're not configured helps you understand why you cannot fill the
port.

I'd like that we can use the hardware correctly without having to
buy TAPs. It reminds me the discussions about TOE. You claim
yourself that TOE is bad in part because you have little if any
control on the bugs on the TCP stack on the chip. This is the same
here. If I know that the chip does not send me what it receives
when configured in promiscuous mode, I can I swear it never received
the missing packet ? There's always a doubt. Maybe sometimes the
filter is buggy, maybe it only passes tags when the remaining bits
are all zero, etc... Promiscuous mode generally means that you're
the closest possible to the wire. We already do not receive pause
frames and sometimes that's annoying. But having no control over
what we want to see is frustrating.

At least, being able to disable the feature at module load time
would be acceptable. Many people who often need to sniff on decent
machines would always keep it disabled.

Regards,
Willy

-

From: David Miller
Date: Monday, November 12, 2007 - 4:40 pm

From: Willy Tarreau <w@1wt.eu>

I'm willing to accept the feature, in whatever form, as long
as the performance issue is dealt with properly.
-

From: Joonwoo Park
Date: Monday, November 12, 2007 - 6:21 pm

I agree with disabling the hw acceleration feature by manually would
be a non-negative solution.
IMO implementation in the ethtool seems better than module param.
As like Auke mentioned.
If you guys confirm it, I'll try it.

Thanks.
Joonwoo
-

From: Patrick McHardy
Date: Tuesday, November 13, 2007 - 3:21 am

I still think promiscous mode should disable all filters (which would
also provide a consistent view between accerlated and non-accerlated
devices), but an ethtool option is better than nothing :)

-

From: Herbert Xu
Date: Tuesday, November 13, 2007 - 4:09 am

I agree.  People doing a tcpdump don't have to turn on promiscuous
mode, that's what the -p option is for.  In other words, having
promiscuous mode disable VLAN filtering does not take away the
user's options at all.

In fact, the very definition of promiscuous is to turn off hardware
filtering, albeit the filtering of MAC addresses rather than VLAN
tags.  So it would seem logical to have it turn off VLAN filtering
too.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-

From: David Miller
Date: Tuesday, November 13, 2007 - 4:36 am

From: Herbert Xu <herbert@gondor.apana.org.au>

Ok.

The performance implications can be pretty severe however.
I wish we could address this somehow.

-

From: Herbert Xu
Date: Tuesday, November 13, 2007 - 5:03 am

Or perhaps we should just teach everyone to always run tcpdump
with -p, like me :)

Of course this would still have a negative impact on those who
have to be in promiscuous mode all the time (heh) due to multiple
unicast MAC addresses and such.  However, we should able to
communicate that fact to the driver and the driver can then elect
to not disable VLAN acceleration unless we really want to be in
promiscuous mode.

In other words we can make it so that nobody is in promiscuous
mode and therefore have to disable VLAN acceleration *unless*
they really want to be in that state.  In which case it would
imply that they wish to see everything and therefore we should
disable VLAN acceleration.

So that means I'd like to see our current core/driver interface
enhanced so that whether the user has requested us to be in
promiscuous mode can be differentiated from whether the network
stack wants us to be.

Once we have that then I would think that such a patch would be
less controversial.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-

From: David Miller
Date: Tuesday, November 13, 2007 - 5:06 am

From: Herbert Xu <herbert@gondor.apana.org.au>

We already do with the code Patrick added a while ago so
that drivers can support multiple MAC addresses in hardware.

Now just to get the virtualization technologies and all the

This is too complicated, we have multiple unicast MAC support
in the driver API already, let's simply use it.
-

From: Herbert Xu
Date: Tuesday, November 13, 2007 - 5:16 am

Yes I agree.  People not using Patrick's new API deserves to
get poor performance so they can switch over sooner :)

What I was trying to say above is that e1000 currently uses
the old set_multicast_list interface (rather than dev_set_rx_mode)
so it's not immediately obvious why we're in promiscuous mode.
We could look at dev->promiscuity - !!uc_count but that feels a
bit fragile.

Perhaps those who want to push this patch should be encouraged
to convert e1000 to the new interface :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-

From: Patrick McHardy
Date: Tuesday, November 13, 2007 - 5:18 am

I already posted a patch for this, not sure what happened to it.
Auke, any news on merging the secondary unicast address support?

-

From: Kok, Auke
Date: Tuesday, November 13, 2007 - 9:41 am

I dropped the ball on that one. Care to resend it and send me one for e1000e as well?

Auke
-

From: Patrick McHardy
Date: Tuesday, November 13, 2007 - 10:26 am

Patch for e1000 attached.

Does e1000e also work with PCI cards if I add the proper IDs?
Otherwise I could only send an untested patch.

From: Kok, Auke
Date: Tuesday, November 13, 2007 - 10:30 am

no, e1000e will only work with pci-e adapters. The code however is largely the
same, so if you can dish me out (off-list) some test cases I can have our labs do
the testing and add this to our test repertoire.

Cheers,

Auke
-

From: Patrick McHardy
Date: Wednesday, November 14, 2007 - 2:42 am

Actually that part of the code is quite different. I'm not
sure I understand the entire intent behind the differences,
why does it use a packed array of addresses instead of simply
passing the netdev pointer to the callback? That would be
necessary for secondary unicast support since we have to
disable unicast filtering if either the address count exceeds
the number of filter slots or the device is in promiscous mode.

-

From: Kok, Auke
Date: Wednesday, November 14, 2007 - 4:30 pm

the underlying code was not specifically developed with linux only in mind and is
therefore not aware of the net_device struct, hence the specific netdev agnostic
implementation. if this implementation is posing issues then we need to look at
how to improve it or just code it specifically for linux in a better way.


Auke
-

From: Kok, Auke
Date: Tuesday, November 13, 2007 - 12:59 pm

Johnwoo,

your patch unfortunately does not apply after patrick's unicast patch,

also, ich8lan support is removed from e1000 in the e1000 version in
jgarzik/netdev-2.6 #upstream as planned (moved over to e1000e!).

Can you resend your patch so that it applies to jgarzik/netdev-2.6 #upstream with
Patrick's patch applied? That would help a lot. And possibly do the e1000e patch
as well :)

Thanks,

Auke




---
[E1000]: Secondary unicast address support

Add support for configuring secondary unicast addresses. Unicast
addresses take precendece over multicast addresses when filling
the exact address filters to avoid going to promiscous mode.
When more unicast addresses are present than filter slots,
unicast filtering is disabled and all slots can be used for
multicast addresses.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 5d2e80a9c326ca529d278da823c8e4a4da91f612
tree 97a8ac20070b101c250e79912636124167a6dd07
parent 325d22df7b19e0116aff3391d3a03f73d0634ded
author Patrick McHardy <kaber@trash.net> Tue, 13 Nov 2007 18:23:34 +0100
committer Patrick McHardy <kaber@trash.net> Tue, 13 Nov 2007 18:23:34 +0100

 drivers/net/e1000/e1000_main.c |   47 ++++++++++++++++++++++++++--------------
 1 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 72deff0..5fd5f51 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -153,7 +153,7 @@ static void e1000_clean_tx_ring(struct e1000_adapter *adapter,
                                 struct e1000_tx_ring *tx_ring);
 static void e1000_clean_rx_ring(struct e1000_adapter *adapter,
                                 struct e1000_rx_ring *rx_ring);
-static void e1000_set_multi(struct net_device *netdev);
+static void e1000_set_rx_mode(struct net_device *netdev);
 static void e1000_update_phy_info(unsigned long data);
 static void e1000_watchdog(unsigned long data);
 static void ...
From: David Miller
Date: Tuesday, November 13, 2007 - 5:32 am

From: Herbert Xu <herbert@gondor.apana.org.au>

That is my feeling as well :-)
-

From: Kok, Auke
Date: Monday, November 12, 2007 - 4:38 pm

I really do not want to add another module parameter to the driver for this. It
seems bogus as well: if we can agree on one sane choice it's much better, and if
we don't then we should use ethtool to provide a uniform way of implementing this
for all drivers sanely.

Auke
-

From: Joonwoo Park
Date: Monday, November 12, 2007 - 6:21 pm

I agree.
If I had a mis-plugged cable, I can guess it with tcpdump.
Because I cannot see the packets. It means no such packets on the wire 100%.
But if I had a incorrect vlan configuration, it's hard to sure.
In a case both of mis-plugged and mis-configured situation, I cannot
see anything.
Moreover, if I am configuring a machine via vlan interface which is
mis-configured partially, I cannot disable vlan hw acceleration
feature.

Thanks.
Joonwoo
-

From: David Miller
Date: Monday, November 12, 2007 - 3:28 pm

From: "Kok, Auke" <auke-jan.h.kok@intel.com>

I agree.

Look at the VLAN setting as being just like moving the
ethernet cable from one switch to another.

There is no logical difference.
-

From: Stephen Hemminger
Date: Tuesday, November 13, 2007 - 1:43 pm

On Sun, 11 Nov 2007 09:51:20 +0900

Promiscuous has other uses such as bridging. Would this patch break any
existing users startup scripts?
-

Previous thread: [PATCH 1/2] [VLAN] Don't drop an unclassfied vlan packet as PACKET_OTHERHOST by Joonwoo Park on Saturday, November 10, 2007 - 5:51 pm. (1 message)

Next thread: Re : Oops preceded by WARNING: at net/ipv4/tcp_input.c:1571 tcp_remove_reno_sacks() by Chazarain Guillaume on Saturday, November 10, 2007 - 6:39 pm. (9 messages)