Need help debugging memory corruption

To: <linux-kernel@...>, <netdev@...>
Cc: Chris Snook <csnook@...>
Date: Saturday, May 3, 2008 - 2:09 pm

I'm trying to track down a memory corruption bug in the atl1 network
device driver that is exposed only when operating with 4GB or more of
memory.  (NB: The driver uses a 32-bit DMA mask.)  The bug is hit under
certain conditions whenever the network interface is commanded down.
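For readers unfamiliar with the DMA mask remark: with 4GB or more installed, some RAM typically sits at addresses that a 32-bit mask cannot express. The following is only a sketch of that boundary, written in Python for illustration. The function name and the sample addresses are hypothetical and are not taken from the atl1 driver.

```python
# Illustrative model of a 32-bit DMA mask; this is NOT the atl1 driver's code.
DMA_MASK_32 = (1 << 32) - 1   # 0xffffffff, the highest address a 32-bit mask allows
FOUR_GIB = 4 * 1024 ** 3      # first address that needs more than 32 bits


def dma_reachable(bus_addr, length, mask=DMA_MASK_32):
    """Return True if every byte of the buffer lies at or below the mask."""
    return bus_addr + length - 1 <= mask


# Hypothetical buffer addresses, chosen only to show both sides of the boundary.
below = dma_reachable(0x7FFF0000, 2048)            # entirely under 4GB
straddling = dma_reachable(FOUR_GIB - 1024, 2048)  # crosses the 4GB line
above = dma_reachable(FOUR_GIB + 0x1000, 2048)     # entirely over 4GB
print(below, straddling, above)
```

A buffer that fails this kind of check has to be remapped or bounced before a device limited to 32-bit addressing can reach it.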

The information provided here is against recent -git, but this problem
also afflicts the atl1 driver in kernels 2.6.2[345].y.  Dmesg and
config attached.

I can reproduce the bug at will by simply using scp to copy a few
hundred megabytes from a remote host, then by executing 'ifconfig eth0
down'.

Here is the relevant console output when the bug is hit.  I note that
the apparent corrupting data beginning at address 0xffff81010fcff402 is
actually a received ethernet frame.  (The value 00:17:31:4e:9d:41 is
the MAC address of the local host, corresponding to the destination
address of a received frame.)
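As a sanity check on that reading, the first 34 overwritten bytes can be decoded as an Ethernet header followed by an IPv4 header. This is a rough sketch in Python: the bytes are copied by hand from the hex dump in the console output, starting at 0xffff81010fcff402, and the helper names are made up for illustration.

```python
import struct

# Bytes copied from the hex dump in this report, starting at 0xffff81010fcff402.
raw = bytes.fromhex(
    "0017314e9d41"                              # destination MAC
    "001bfc9553da"                              # source MAC
    "0800"                                      # EtherType
    "4508022c831f4000400631e1c0a80103c0a80170"  # 20-byte IPv4 header
)


def mac(b):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{x:02x}" for x in b)


def dotted(b):
    """Format 4 raw bytes as a dotted-quad IPv4 address."""
    return ".".join(str(x) for x in b)


def ones_complement_sum(data):
    """16-bit ones' complement sum; 0xffff means the IPv4 header checksum verifies."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


dst, src = mac(raw[0:6]), mac(raw[6:12])
(ethertype,) = struct.unpack("!H", raw[12:14])

ip = raw[14:34]
ver_ihl, tos, total_len, ident, frag, ttl, proto, csum, saddr, daddr = struct.unpack(
    "!BBHHHBBH4s4s", ip
)

print("dst", dst, "src", src, "ethertype", hex(ethertype))
print("IPv%d ihl=%d len=%d ttl=%d proto=%d" % (ver_ihl >> 4, ver_ihl & 0xF, total_len, ttl, proto))
print(dotted(saddr), "->", dotted(daddr), "header checksum ok:", ones_complement_sum(ip) == 0xFFFF)
```

The destination MAC decodes to the local host's address, the EtherType is IPv4, and the protocol field is 6 (TCP), which fits data received during the scp transfer.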

Can someone with more experience than me please take a look and give me
some advice or explain what might be happening here?  (What may be
obvious to you is probably not obvious to me.)

=============================================================================
BUG kmalloc-2048: Poison overwritten
-----------------------------------------------------------------------------

INFO: 0xffff81010fcff402-0xffff81010fcff9f9. First byte 0x0 instead of 0x6b
INFO: Allocated in dev_alloc_skb+0x16/0x2c age=9459 cpu=0 pid=2754
INFO: Freed in skb_release_data+0xa8/0xad age=9434 cpu=0 pid=2754
INFO: Slab 0xffffe20005d6f540 objects=15 used=0 fp=0xffff81010fcfb1b0 flags=0x8000000000002082
INFO: Object 0xffff81010fcff3f0 @offset=29680 fp=0xffff81010fcfd2d0

Bytes b4 0xffff81010fcff3e0:  3b bd 00 00 01 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ;.......ZZZZZZZZ
  Object 0xffff81010fcff3f0:  6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
  Object 0xffff81010fcff400:  6b 6b 00 17 31 4e 9d 41 00 1b fc 95 53 da 08 00 kk..1N.A....S...
  Object 0xffff81010fcff410:  45 08 02 2c 83 1f 40 00 40 06 31 e1 c0 a8 01 03 E..,..@.@.1.....
  Object 0xffff81010fcff420:  c0 a8 01 70 58 b9 c0 18 93 5f c2 41 51 d2 a7 5d ...pX...._.AQ..]
  Object 0xffff81010fcff430:  80 18 00 e8 ee ef 00 00 01 01 08 0a 00 e2 58 16 ..............X.
  Object 0xffff81010fcff440:  00 00 bd 38 4b 93 e4 7f 3e 8d 8c 2b 41 dc 9b 36 ...8K...>..+A..6
  Object 0xffff81010fcff450:  4d 9f b7 cf 2a 2c 07 06 d8 2f 23 de 5a 34 90 cb M...*,.../#.Z4..
  Object 0xffff81010fcff460:  6d d7 36 5b 2c 04 19 06 74 95 3f c5 3c c8 a5 9a m.6[,...t.?.<...
 Redzone 0xffff81010fcffbf0:  bb bb bb bb bb bb bb bb                         ........
 Padding 0xffff81010fcffc30:  5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ
Pid: 2757, comm: ifconfig Not tainted 2.6.25 #1

Call Trace:
 [<ffffffff8108cf02>] print_trailer+0x123/0x12c
 [<ffffffff8108cfaf>] check_bytes_and_report+0xa4/0xcb
 [<ffffffff8108d2de>] check_object+0xca/0x212
 [<ffffffff8108d66d>] __free_slab+0x85/0xfd
 [<ffffffff811e5cf2>] ? skb_release_data+0xa8/0xad
 [<ffffffff8108d71d>] discard_slab+0x38/0x3a
 [<ffffffff8108e0f2>] __slab_free+0xdb/0x2ac
 [<ffffffff8108e3fa>] kfree+0xbc/0xcc
 [<ffffffff811e5cf2>] ? skb_release_data+0xa8/0xad
 [<ffffffff811e5cf2>] skb_release_data+0xa8/0xad
 [<ffffffff811e63b3>] skb_release_all+0xc9/0xce
 [<ffffffff811e5b4d>] __kfree_skb+0x11/0x78
 [<ffffffff811e5bdb>] kfree_skb+0x27/0x29
 [<ffffffffa00e63aa>] :atl1:atl1_clean_rx_ring+0x7e/0xe2
 [<ffffffffa00e64d7>] :atl1:atl1_down+0xc9/0xce
 [<ffffffffa00e8dcd>] :atl1:atl1_close+0x18/0x27
 [<ffffffff811ebd43>] dev_close+0x57/0x72
 [<ffffffff811eba47>] dev_change_flags+0xa8/0x164
 [<ffffffff8122f380>] devinet_ioctl+0x26a/0x5f6
 [<ffffffff8122fbad>] inet_ioctl+0x92/0xaa
 [<ffffffff811df5f4>] sock_ioctl+0x1da/0x202
 [<ffffffff8109f186>] vfs_ioctl+0x2a/0x77
 [<ffffffff8109f435>] do_vfs_ioctl+0x262/0x27f
 [<ffffffff8109f4a9>] sys_ioctl+0x57/0x7a
 [<ffffffff8100bff7>] tracesys+0xd5/0xda

FIX kmalloc-2048: Restoring 0xffff81010fcff402-0xffff81010fcff9f9=0x6b
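The addresses in the report can be cross-checked with a little arithmetic. The sketch below, in Python for illustration only, works out how far into the freed object the overwrite starts and how long it is, then models the kind of poison scan that would report such a range. The function name and the simulated buffer are hypothetical, and the buffer is a simplification of a real poisoned object.

```python
POISON_FREE = 0x6B  # fill byte for freed objects ('k' in the dump), per the BUG line

# Addresses taken from the console output above.
obj = 0xFFFF81010FCFF3F0
start, end = 0xFFFF81010FCFF402, 0xFFFF81010FCFF9F9
redzone = 0xFFFF81010FCFFBF0

offset = start - obj       # bytes of intact poison before the overwrite
length = end - start + 1   # size of the overwritten range
obj_size = redzone - obj   # matches the kmalloc-2048 cache name


def overwritten_span(buf, poison=POISON_FREE):
    """Return (first, last) index of bytes that differ from the poison, or None."""
    bad = [i for i, b in enumerate(buf) if b != poison]
    return (bad[0], bad[-1]) if bad else None


# Simulated freed object. Simplification: every byte is 0x6b here, whereas
# real SLUB poisoning ends the object with a distinct 0xa5 end marker.
buf = bytearray([POISON_FREE]) * obj_size
clean = overwritten_span(buf)
buf[offset:offset + length] = bytes(length)  # stand-in for the stray write
first, last = overwritten_span(buf)

print(offset, length, obj_size, clean)
print(hex(obj + first), hex(obj + last))
```

With the numbers from this report, the first 18 bytes of the object still hold poison and the following 1528 bytes were overwritten after the buffer had been freed.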

Thanks,
Jay
Messages in current thread:
Need help debugging memory corruption, Jay Cliburn, (Sat May 3, 2:09 pm)
Re: Need help debugging memory corruption, Bart Van Assche, (Sun May 4, 11:02 am)
Re: Need help debugging memory corruption, Jay Cliburn, (Sun May 4, 3:52 pm)
Re: Need help debugging memory corruption, Jarek Poplawski, (Sun May 4, 10:24 am)
Re: Need help debugging memory corruption, Jarek Poplawski, (Sun May 4, 10:55 am)
Re: Need help debugging memory corruption, Jay Cliburn, (Sun May 4, 3:55 pm)
Re: Need help debugging memory corruption, Jarek Poplawski, (Mon May 5, 3:27 am)
Re: Need help debugging memory corruption, Jarek Poplawski, (Mon May 5, 3:50 am)