On Dec 30, 2007 2:30 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:

That's why I wrote that I do not know much about the network core...

I did not know that there should not have been a dst. It's just that this warning was the first good clue about the networking-related memory corruption that I have been seeing since 2.6.24-rc3-mm2. The time of the patch (Mon, 26 Nov 2007 15:11:19) even fits into the window between -rc3-mm1 and -rc3-mm2.

I doubt that the memory corruption is a hardware problem, because the system in question uses ECC RAM and I did not see any messages about corrected or detected errors.

I looked into the log in question, and the only other warning was a circular locking dependency that lockdep detected around 1.5 hours before this warning.

As reported in my original mail, immediately after the warning the system OOPSed and hung:

[93436.947241] general protection fault: 0000 [1] SMP -> first OOPS
[93436.947243] last sysfs file: /sys/devices/pci0000:00/0000:00:0f.0/0000:01:00.1/irq
[93436.947245] CPU 1
[93436.947246] Modules linked in: radeon drm nfsd exportfs w83792d ipv6 tuner tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx tea5761 tvaudio msp3400 bttv ir_common compat_ioctl32 videobuf_dma_sg videobuf_core btcx_risc tveeprom usbhid videodev v4l2_common hid v4l1_compat pata_amd sg i2c_nforce2
[93436.947257] Pid: 8079, comm: konqueror Not tainted 2.6.24-rc6-mm1 #11 -> not tainted by a previous OOPS
[93436.947259] RIP: 0010:[<ffffffff80531438>] [<ffffffff80531438>] skb_drop_list+0x18/0x30
[93436.947262] RSP: 0018:ffff810005f4fda8 EFLAGS: 00010286
[93436.947263] RAX: ab1ed5ca5b74e7de RBX: ab1ed5ca5b74e7de RCX: 000000000000d135
[93436.947265] RDX: ffff81011d089a80 RSI: 0000000000000001 RDI: ffff81011d089a88
[93436.947266] RBP: ffff810005f4fdb8 R08: 0000000000000001 R09: 0000000000000006
[93436.947268] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100de02c500
[93436.947269] R13: ffff81011c188a00 R14: 0000000000000001 R15: ffff81011c189198
[93436.947271] FS: 00007fb5bde0d700(0000) GS:ffff81007ff22000(0000) knlGS:0000000000000000
[93436.947273] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[93436.947274] CR2: 00007fb5bdd76000 CR3: 00000000664d5000 CR4: 00000000000006e0
[93436.947276] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[93436.947277] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[93436.947279] Process konqueror (pid: 8079, threadinfo ffff810005f4e000, task ffff8100a1dec000)
[93436.947281] Stack: ffff810005f4fdd8 ffff810116c86140 ffff810005f4fdd8 ffffffff805314ae
[93436.947284] ffff810116c86140 ffff8100de02c500 ffff810005f4fdf8 ffffffff80531cf0
[93436.947286] ffff8100de02c500 ffff81011c188b48 ffff810005f4fe18 ffffffff80531311
[93436.947288] Call Trace:
[93436.947290] [<ffffffff805314ae>] skb_release_data+0x5e/0xa0
[93436.947293] [<ffffffff80531cf0>] skb_release_all+0xa0/0x110
[93436.947295] [<ffffffff80531311>] __kfree_skb+0x11/0xa0
[93436.947297] [<ffffffff805313b7>] kfree_skb+0x17/0x30
[93436.947299] [<ffffffff805a0b48>] unix_release_sock+0x128/0x250
[93436.947302] [<ffffffff805a0c91>] unix_release+0x21/0x30
[93436.947304] [<ffffffff8052b144>] sock_release+0x24/0x90
[93436.947307] [<ffffffff8052b656>] sock_close+0x26/0x50
[93436.947309] [<ffffffff8029f921>] __fput+0xc1/0x230
[93436.947312] [<ffffffff8029fe46>] fput+0x16/0x20
[93436.947314] [<ffffffff8029c576>] filp_close+0x56/0x90
[93436.947316] [<ffffffff8029de46>] sys_close+0xa6/0x110
[93436.947319] [<ffffffff8020b57b>] system_call_after_swapgs+0x7b/0x80
[93436.947322]
[93436.947322]
[93436.947323] Code: 48 8b 18 48 89 c7 e8 5d ff ff ff 48 85 db 75 ed 48 83 c4 08
[93436.947328] RIP [<ffffffff80531438>] skb_drop_list+0x18/0x30
[93436.947330] RSP <ffff810005f4fda8>
[93436.947332] ---[ end trace befb7cc3528ab3b1 ]---

Your patch just fit my problems so "well":
* it had the correct time frame for 2.6.24-rc3-mm2
* it looked guilty of changing the refcounting of __refcnt because of the added dst_release()
* it added other release / freeing operations, so that a use-after-free memory corruption seemed possible

I just have no better idea of what caused this OOPS and the other hangs in -rc3-mm2.

Torsten
