While stress testing the NetXen driver, I found that my machine was dying in all sorts of weird ways. I saw several different crashes, plus BUG messages and assert messages in the TCP stack. I really didn't think that there could be six different bugs all at once in the TCP/IP stack, so I started looking at possible memory corruption.

I first saw this on 2.6.16-rt22 with a netxen driver backported from 2.6.22. I figured I should try the latest kernel, so I tried 2.6.23-rc6-git7, but I could not trigger the slab corruption messages with CONFIG_DEBUG_SLAB, so I figured the race must only exist in the -RT kernels.

Next I tried 2.6.23-rc6-git7-rt1 (I applied patch-2.6.23-rc4-rt1 to 2.6.23-rc6-git7 and fixed the 5 failing hunks). After an hour or so, lots of slab corruption messages showed up:

Slab corruption: size-2048 start=f40c4670, len=2048
Slab corruption: size-2048 start=f313cf48, len=2048
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<c0166be4>](kfree+0x80/0x95)
010: 6b 6b 00 0e 1e 00 16 13 00 0e 1e 00 19 3d 08 00
020: 45 00 05 dc 92 ab 40 00 40 11 8a 5b 0a 02 02 03
030: 0a 02 02 04 80 0c 80 0d 05 c8 dc 39 00 00 00 00
040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Prev obj: start=f313c730, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<f8f06186>](netxen_post_rx_buffers_nodb+0x62/0x1f0 [netxen_nic])
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 00 0e 1e 00 16 13 00 0e 1e 00 19 3d 08 00
Next obj: start=f313d760, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<f8f06186>](netxen_post_rx_buffers_nodb+0x62/0x1f0 [netxen_nic])
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Slab corruption: size-2048 start=f395a6f0, len=2048
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<c0166be4>](kfree+0x80/0x95)
010: 6b 6b 00 0e 1e 00 16 13 00 0e 1e 00 19 3d 08 00
020: 45 00 05 dc 92 ac 40 00 40 11 8a 5a 0a 02 02 03
030: 0a 02 02 04 80 0c 80 0d 05 c8 dc 39 00 00 00 00
040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Next obj: start=f395af08, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<f8f06186>](netxen_post_rx_buffers_nodb+0x62/0x1f0 [netxen_nic])
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [<c0166be4>](kfree+0x80/0x95)
010: 6b 6b 00 0e 1e 00 16 13 00 0e 1e 00 19 3d 08 00
020: 45 00 05 dc 92 aa 40 00 40 11 8a 5c 0a 02 02 03
030: 0a 02 02 04 80 0c 80 0d 05 c8 dc 39 00 00 00 00
040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Next obj: start=f40c4e88, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [<f8f06186>](netxen_post_rx_buffers_nodb+0x62/0x1f0 [netxen_nic])
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 00 0e 1e 00 16 13 00 0e 1e 00 19 3d 08 00

The stress test that I am running is basically a mixed bag of stuff I threw together. It runs eight concurrent netperf TCP streams and two concurrent UDP streams in both directions (and on both 10GbE interfaces), ping -f in both directions, some more disk/CPU load in the background, and a little bit of NFS traffic thrown in for good measure.

--Vernon
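For readers who want to interpret the hex rows in the dump above, here is a minimal sketch, not part of the original report, that decodes the bytes following the 0x6b free-poison pattern in the first "Last user: kfree" record. It assumes those bytes are an Ethernet II frame carrying IPv4 without options followed by a UDP header; the function names (parse_rows, decode_frame) are hypothetical helpers invented for illustration.

```python
import struct

# First three hex rows of the first "Last user: kfree" record in the dump.
DUMP = """\
010: 6b 6b 00 0e 1e 00 16 13 00 0e 1e 00 19 3d 08 00
020: 45 00 05 dc 92 ab 40 00 40 11 8a 5b 0a 02 02 03
030: 0a 02 02 04 80 0c 80 0d 05 c8 dc 39 00 00 00 00
"""

POISON_FREE = 0x6B  # byte the slab debug code writes into freed objects


def parse_rows(text):
    """Return (start_offset, data) for contiguous 'off: xx xx ...' rows."""
    start = None
    data = bytearray()
    for line in text.splitlines():
        off, _, hexpart = line.partition(":")
        if start is None:
            start = int(off, 16)
        data += bytes.fromhex(hexpart.strip())
    return start, bytes(data)


def decode_frame(data):
    """Decode Ethernet/IPv4/UDP headers starting at the first non-poison byte.

    Assumption (not stated in the report): the overwritten bytes are an
    Ethernet II frame with a 20-byte IPv4 header and a UDP header.
    """
    skip = next(i for i, b in enumerate(data) if b != POISON_FREE)
    frame = data[skip:]
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    (_ver_ihl, _tos, total_len, _ident, _frag, ttl, proto, _csum,
     saddr, daddr) = struct.unpack("!BBHHHBBH4s4s", frame[14:34])
    sport, dport, udp_len, _udp_csum = struct.unpack("!HHHH", frame[34:42])
    return {
        "poison_bytes_intact": skip,
        "dst_mac": dst.hex(":"),
        "src_mac": src.hex(":"),
        "ethertype": ethertype,
        "ip_total_len": total_len,
        "ttl": ttl,
        "proto": proto,
        "src_ip": ".".join(str(b) for b in saddr),
        "dst_ip": ".".join(str(b) for b in daddr),
        "udp_sport": sport,
        "udp_dport": dport,
        "udp_len": udp_len,
    }


if __name__ == "__main__":
    start, data = parse_rows(DUMP)
    for key, value in decode_frame(data).items():
        print(f"{key}: {value}")
```

Under that assumption, the bytes decode as a 1500-byte IPv4 datagram (protocol 17, UDP) from 10.2.2.3 to 10.2.2.4, which would be consistent with received UDP traffic being written into a 2048-byte buffer after it had been passed to kfree. This is only a reading of the dump, not a diagnosis of where the race is.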
