Re: [PATCH,RFC] skb->network_header and __vlan_put_tag()

Previous thread: [PATCH 5/5] xen: Avoid allocations causing swap activity on the resume path by Jeremy Fitzhardinge on Monday, June 16, 2008 - 2:45 pm. (1 message)

Next thread: RE: [E1000-devel] [regression?] e1000e breaks IPMI by Brandeburg, Jesse on Monday, June 16, 2008 - 3:30 pm. (1 message)
From: Lennert Buytenhek
Date: Monday, June 16, 2008 - 3:15 pm

__vlan_put_tag() is not only substracting 4 from the skb's
->mac_header pointer (which kind of makes sense[*]), but it is
substracting 4 from the ->network_header pointer as well.

Conceptually, sure, VLAN is another layer of encapsulation between
ethernet and IP, but doesn't it make more sense to just consider the
VLAN tag part of the MAC header?  What good does it do to point
->network_header to the _middle_ of the VLAN tag when a VLAN tag is
inserted?

I'm adding VLAN support to mv643xx_eth (which does not support HW
insertion of a VLAN tag, but it does support HW checksumming when
a VLAN tag is present, you just have to tell the HW how many bytes
there are between the start of the packet and the IP header), and
I'm ending up with code like this:

	if (skb->protocol == htons(ETH_P_8021Q))
		ip_header = ip_hdr(skb) + 4;
	else
		ip_header = ip_hdr(skb);

whereas it'd be nicer if you could just have the same code for the
VLAN and non-VLAN case:

	ip_header = ip_hdr(skb);


[*] But skb->mac_header seems to be mostly NULL when this function
    is called, causing it to end up being 0xfffffffc by the time the
    packet is given to the device's ->hard_start_xmit().


Index: linux-2.6.26-rc5/include/linux/if_vlan.h
===================================================================
--- linux-2.6.26-rc5.orig/include/linux/if_vlan.h
+++ linux-2.6.26-rc5/include/linux/if_vlan.h
@@ -280,7 +280,6 @@ static inline struct sk_buff *__vlan_put
 
 	skb->protocol = htons(ETH_P_8021Q);
 	skb->mac_header -= VLAN_HLEN;
-	skb->network_header -= VLAN_HLEN;
 
 	return skb;
 }
--

From: Patrick McHardy
Date: Tuesday, June 17, 2008 - 5:18 am

I agree that your patch makes sense, it seems some drivers
(niu, gianfar) even assume this. Regarding the invalid
mac_header pointer, that looks like a bug and it should
probably call skb_reset_mac_header() instead of manually

There is another spot where VLAN headers are built with an also
incorrect looking network_header adjustment in vlan_dev_hard_header()
that should also be changed for consisteny.

Could you try if changing both these spots and adding the
skb_reset_mac_header() works for you and still displays
the packets properly in tcpdump?
--

From: Patrick McHardy
Date: Saturday, July 5, 2008 - 11:33 am

I've queued this patch for testing. I'll send it upstream with
my next VLAN update if everything works properly.
From: Lennert Buytenhek
Date: Monday, July 7, 2008 - 5:05 am

This works -- I can now just calculate
'(void *)skb->data - (void *)ip_hdr(skb)' to find out how many bytes
there are in the virtual ethernet header (i.e. ethernet header plus
VLAN tag(s)) and program the TX descriptor accordingly.  (But in the
VLAN case, I only ever get packets that have already been checksummed

I'm not sure whether there is much point to doing this.  With this
change, VLAN-tagged packets are given to ->hard_start_xmit() with
ethhdr pointing to skb->data instead of NULL, but non-VLAN-tagged
packets are still passed in with ethhdr being NULL.

I.e. this hacky code in ->hard_start_xmit():

	if (1) {
		void *ip = ip_hdr(skb);
		void *data = skb->data;

		printk(KERN_INFO "tx %p ", skb);
		printk("proto:%.4x ", ntohs(skb->protocol));
		printk("data:%p ", skb->data);
		printk("ethhdr:%p ", eth_hdr(skb));
		printk("iphdr:%p ", ip);
		printk("macbytes:%d\n", ip - data);
	}

Gives me for VLAN-tagged packets:

	tx c1d63520 proto:8100 data:c1d6240e ethhdr:c1d6240e iphdr:c1d62420 macbytes:18

and for non-VLAN-tagged packets:

	tx c1f90ee0 proto:0800 data:c0b1e8b6 ethhdr:00000000 iphdr:c0b1e8c4 macbytes:14
--

From: Patrick McHardy
Date: Monday, July 7, 2008 - 5:17 am

You're right, its not strictly necessary. I'll remove that part
again.
--

Previous thread: [PATCH 5/5] xen: Avoid allocations causing swap activity on the resume path by Jeremy Fitzhardinge on Monday, June 16, 2008 - 2:45 pm. (1 message)

Next thread: RE: [E1000-devel] [regression?] e1000e breaks IPMI by Brandeburg, Jesse on Monday, June 16, 2008 - 3:30 pm. (1 message)