Hi all, 2.6.29-rc* introduces a bug that causes a crash when a vlan is put into another vlan, so called vlan trunking or QinQ. I can reproduce it reliably with: $ modprobe 8021q $ vconfig add eth1 5 $ ifconfig eth1.5 up $ vconfig add eth1.5 4 I have seen crashes on all 2.6.29-rc kernels, but it does work for me on 2.6.28 (and prior). All my testing was done on ia32 and amd64 systems. I've reproduced it with various configs, but if you need my config let me know. I was tired of crashing my devel box, so this Ooops came from a kvm VM (with the default 8139cp driver) where I attempted a bisect to find the source of the bug. Unfortunately I was unable to bisect it because of other unrelated crashes in the history that made it too time consuming. I have discovered that by doing: $ git reset --hard origin/master # to HEAD of torvalds/linux-2.6.git $ git revert cc883d16c3b7434c7da2c45b54a49c2a99e83db7 $ git revert f7d1b9f5aafa371d7f51f644aa3c38bc914e9205 $ git revert 656299f706e52e0409733d704c2761f1b12d6954 ... the crash goes away. I just validated the procedure with Linus' 778ef1e6cbb049c9bcbf405936ee6f2b6e451892. And other than seeing... [ 154.094561] eth1.5 (): not using net_device_ops yet [ 154.220840] eth1.5.4 (): not using net_device_ops yet ... there is no trace of this bug. I suspect that only 656299f need to be reverted/fixed, but the other two patches are prerequisites for it to apply cleanly. Hope that makes the source of the bug apparent to someone. Cheers, -Bart PS: I see another problem with my KVM setup, but I think that has something to do with my KVM host kernel, not the guest. More specifically, when I do QinQ I only see the inner VLAN tags on the underlying bridge device (under the KVM), but the outer most VLAN tag is missing. But that's an exercise for another day. --- 8< --- [ 1201.822546] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com> [ 1201.830944] All bugs added by David ...
From: Bart Trojanowski <bart@jukie.net> Stephen please fix this. --
I'm maintaining vlan :) I haven't been able to look into this yet, but I should be later today. --
From: Patrick McHardy <kaber@trash.net> Ok, I don't actually care who fixes it :-) I asked Stephen because this appears like it might be netdev_ops fallout. --
Good point. I think I know whats happening. The VLAN devices are registered with vlan_netdev_ops, the ->init function then potentially replaces them based on whether the underlying device supports HW acceleration. At this point the register_netdev() function already has used the first netdev_ops structure to initialize the compat pointers, meaning the new assignment is largely without effect and causes incorrect ops to be used with HW acceleration This probably causes the slab corruption since the HW accelerated VLAN device doesn't increase the headroom, but the non-accelerated functions try to add a hard header anyways. This is a bit tricky to fix since we actually need some valid ops before invoking ->init(). One way would be to move the compat ops initialization to a seperate function and have VLAN use it to switch its ops. --
From: Patrick McHardy <kaber@trash.net> Mind if I push this into net-2.6? vlan: Fix vlan-in-vlan crashes. As analyzed by Patrick McHardy, vlan needs to reset it's netdev_ops pointer in it's ->init() function but this leaves the compat method pointers stale. Add a netdev_resync_ops() and call it from the vlan code. Signed-off-by: David S. Miller <davem@davemloft.net> --- include/linux/netdevice.h | 1 + net/8021q/vlan_dev.c | 1 + net/core/dev.c | 54 +++++++++++++++++++++++++++----------------- 3 files changed, 35 insertions(+), 21 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index ec54785..6593667 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1079,6 +1079,7 @@ extern void synchronize_net(void); extern int register_netdevice_notifier(struct notifier_block *nb); extern int unregister_netdevice_notifier(struct notifier_block *nb); extern int init_dummy_netdev(struct net_device *dev); +extern void netdev_resync_ops(struct net_device *dev); extern int call_netdevice_notifiers(unsigned long val, struct net_device *dev); extern struct net_device *dev_get_by_index(struct net *net, int ifindex); diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 4a19acd..b6e1f9e 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -639,6 +639,7 @@ static int vlan_dev_init(struct net_device *dev) dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN; dev->netdev_ops = &vlan_netdev_ops; } + netdev_resync_ops(dev); if (is_vlan_dev(real_dev)) subclass = 1; diff --git a/net/core/dev.c b/net/core/dev.c index 9e4afe6..a06c6fa 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4282,6 +4282,38 @@ unsigned long netdev_fix_features(unsigned long features, const char *name) } EXPORT_SYMBOL(netdev_fix_features); +/* Some devices need to (re-)set their netdev_ops inside + * ->init() or similar. If that happens, we have to setup + ...
David, I tried this patch onto v2.6.29-rc7-3-g559595a, but I still get a crash. I assume that this worked for you, so I am not putting much faith in my results at this late hour. I'll confirm tomorrow morning that it's not something else. Cheers, -Bart -- WebSig: http://www.jukie.net/~bart/sig/ --
David, ... if you're interested, here is the Oops. And like I said, I'll retest tomorrow. -Bart [ 231.748126] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com> [ 231.751563] All bugs added by David S. Miller <davem@redhat.com> [ 231.876164] PANIC: double fault, gdt at c364c000 [255 bytes] [ 231.876271] double fault, tss at c364fae0 [ 231.876271] eip = f8163c65, esp = f3045000 [ 231.876271] eax = f3d78800, ebx = f3d78800, ecx = f8163c62, edx = f3e4ce40 [ 231.876271] esi = f3e4ce40, edi = c05228fc [ 232.156110] BUG: unable to handle kernel paging request at a0ad0eb4 [ 232.159226] IP: [<c011fd49>] hrtick_start_fair+0x1f/0x17d [ 232.160050] *pde = 00000000 [ 232.160050] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 232.160050] last sysfs file: /sys/class/net/lo/operstate [ 232.160050] Modules linked in: 8021q virtio_balloon virtio_pci thermal_sys [ 232.160050] [ 232.160050] Pid: 0, comm: swapper Tainted: G S (2.6.29-rc7-bisect-00004-g39fc204 #1) [ 232.160050] EIP: 0060:[<c011fd49>] EFLAGS: 00010092 CPU: 0 [ 232.160050] EIP is at hrtick_start_fair+0x1f/0x17d [ 232.160050] EAX: c0599d80 EBX: c364e200 ECX: c3651dd4 EDX: f8163c85 [ 232.160050] ESI: f52f6198 EDI: f622b708 EBP: c0545ddc ESP: c0545dc4 [ 232.160050] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 232.160050] Process swapper (pid: 0, ti=c0544000 task=c04fd380 task.ti=c0544000) [ 232.160050] Stack: [ 232.160050] c3651d80 5f684afd 00000002 c3651dd4 f52f6198 c3651d80 c0545df0 c011ff82 [ 232.160050] f4796708 c03d6ad8 00000000 c0545e14 c011da63 0d94c928 00000036 00000000 [ 232.160050] c3651d80 c3651d80 c3648d80 00000522 c0545e20 c011daaa f4796708 c0545e34 [ 232.160050] Call Trace: [ 232.160050] [<c011ff82>] ? dequeue_task_fair+0x51/0x56 [ 232.160050] [<c011da63>] ? dequeue_task+0xd5/0xe4 [ 232.160050] [<c011daaa>] ? deactivate_task+0x19/0x1f [ 232.160050] [<c011e504>] ? pull_task+0x11/0x3d [ 232.160050] [<c011e6fc>] ? load_balance_fair+0x134/0x1ce [ 232.160050] ...
Thanks, I'll try to reproduce it here. --
From: Bart Trojanowski <bart@jukie.net> I didn't test, so you the bug must be a different problem or my patch is wrong :) --
This looks fine, thanks. Even if it doesn't fix this particular report, I think its appropriate for net-2.6. --
From: Patrick McHardy <kaber@trash.net> Thanks for looking at it, but I don't want to push it out until we fully resolve this bug. --
I think this will fix it. diff --git a/net/core/dev.c b/net/core/dev.c index 9e4afe6..2dd484e 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4339,6 +4339,7 @@ int register_netdevice(struct net_device *dev) dev->do_ioctl = ops->ndo_do_ioctl; dev->set_config = ops->ndo_set_config; dev->change_mtu = ops->ndo_change_mtu; + dev->neigh_setup = ops->ndo_neigh_setup; dev->tx_timeout = ops->ndo_tx_timeout; dev->get_stats = ops->ndo_get_stats; dev->vlan_rx_register = ops->ndo_vlan_rx_register; --
Good catch, that looks like a likely cause. Will test in a minute ... --
From: Patrick McHardy <kaber@trash.net> We probably need both fixes to cover everything. --
Yes, just the second one still crashes. I'm about to retry using both. --
From: Patrick McHardy <kaber@trash.net> Here is the updated version just for the record: vlan: Fix vlan-in-vlan crashes. As analyzed by Patrick McHardy, vlan needs to reset it's netdev_ops pointer in it's ->init() function but this leaves the compat method pointers stale. Add a netdev_resync_ops() and call it from the vlan code. Any other driver which changes ->netdev_ops after register_netdevice() will need to call this new function after doing so too. Signed-off-by: David S. Miller <davem@davemloft.net> --- include/linux/netdevice.h | 1 + net/8021q/vlan_dev.c | 1 + net/core/dev.c | 56 +++++++++++++++++++++++++++----------------- 3 files changed, 36 insertions(+), 22 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index ec54785..6593667 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1079,6 +1079,7 @@ extern void synchronize_net(void); extern int register_netdevice_notifier(struct notifier_block *nb); extern int unregister_netdevice_notifier(struct notifier_block *nb); extern int init_dummy_netdev(struct net_device *dev); +extern void netdev_resync_ops(struct net_device *dev); extern int call_netdevice_notifiers(unsigned long val, struct net_device *dev); extern struct net_device *dev_get_by_index(struct net *net, int ifindex); diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 4a19acd..b6e1f9e 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -639,6 +639,7 @@ static int vlan_dev_init(struct net_device *dev) dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN; dev->netdev_ops = &vlan_netdev_ops; } + netdev_resync_ops(dev); if (is_vlan_dev(real_dev)) subclass = 1; diff --git a/net/core/dev.c b/net/core/dev.c index 2dd484e..f112970 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4282,6 +4282,39 @@ unsigned long netdev_fix_features(unsigned long features, const char *name) } ...
This still crashes. I'll have another look at the code. --
This one combined with your patch fixes the crash. The code was calling vlan_dev_neigh_setup recursively. Signed-off-by: Patrick McHardy <kaber@trash.net> (or Tested-by: in case you want to roll it into your patch).
From: Patrick McHardy <kaber@trash.net> Great work, thanks Patrick! I'll get this all integrated and push it out to Linus. --
Hi Dave, Patrick, sorry I could not follow the complete discussion of the fixes done for this problem but does if (netif_device_present(real_dev) && ops->ndo_neigh_setup) - err = ops->ndo_neigh_setup(dev, pa); + err = ops->ndo_neigh_setup(real_dev, pa); not change the idea of the neigh_setup? Remind we want the neigh_setup of the real device as the neigh setup function for the vlan device. Frank --
An we still use it. The only difference is that we pass it the correct device reference, which not only fixes the recursion, but is also expected by the callbacks. Look at bonding or simply vlan itself. The setup itself is still done using the neigh_params passed to VLAN, which appears to be what was originally intended. --
From: Patrick McHardy <kaber@trash.net> Then bond_neigh_setup() has the same bug, doesn't it? --
From: David Miller <davem@davemloft.net>
Looking at the bond_main.c changes in:
commit 008298231abbeb91bc7be9e8b078607b816d1a4a
Author: Stephen Hemminger <shemminger@vyatta.com>
Date: Thu Nov 20 20:14:53 2008 -0800
netdev: add more functions to netdevice ops
shows that it always behaved that way.
--
Yes, but that patch introduced the requirement to pass the correct device down since now the handlers need it to get to the ops of the underlying device. Previously they all relied on the handlers not using their private data. Signed-off-by: Patrick McHardy <kaber@trash.net>
From: Patrick McHardy <kaber@trash.net> Applied, thanks! --
On Wed, 2009-03-04 at 19:53 -0800, David Miller wrote: Any reason dev->stop is not in this list ? Regards, -- Maxime --
From: Maxime Bizon <mbizon@freebox.fr> dev->stop is only called from net/core/dev.c and it is netdev_ops aware :-) --
