Kernel crash in 10 minutes flat

Submitted by birchoff
on November 16, 2005 - 1:10pm

Well, it's not always in ten minutes flat, but my kernel dies anywhere from during boot to 10-20 minutes afterwards. I am thinking about doing a full reinstall of Linux to see if that solves my problem, but this is my last-ditch effort to salvage my Linux install.

The first time it started acting up, I ran memtest on my RAM and found that one of the modules was bad, so I removed it. That left me with 256MB of PC133 RAM, which I stress tested in memtest86 v3.2 for 10 hours. 20 passes later there were no errors to be found. I thought that was the end of my troubles, but I cold booted into my 2.6.14 kernel and about 6-8 hours later what do I have? A nicely locked-up system.

Thinking that maybe the kernel was bad since it was compiled on faulty memory, I recompiled 2.6.14.2 in debug mode so that I could get more logging information if it too decided to go belly up. I have everything under kernel hacking that isn't experimental checked. I also altered syslogd.conf so that it would hopefully catch all the generated logs (it wasn't catching anything before). Luckily I was so prudent, because like clockwork the new debug kernel died after I booted into it.

I have no clue why it is dying, since my efforts to log the errors have been only partially successful: a lot of the stuff that gets dumped to the console doesn't show up in my logs when I reboot. I have, however, captured some of the errors in my log files, and they are listed below.
------------------------------------------------------------------------
Nov 15 18:19:59 pandora kernel: Unable to handle kernel paging request at virtual address bc880ee8
Nov 15 18:19:59 pandora kernel: printing eip:
Nov 15 18:19:59 pandora kernel: c012e3a9
Nov 15 18:19:59 pandora kernel: *pde = 00000000
Nov 15 18:19:59 pandora kernel: Oops: 0000
Nov 15 18:19:59 pandora kernel: CPU: 0
Nov 15 18:19:59 pandora kernel: EIP: 0010:[] Not tainted
Nov 15 18:19:59 pandora kernel: EFLAGS: 00010803
Nov 15 18:19:59 pandora kernel: eax: 3e083370 ebx: 6edd3b00 ecx: c4674110 edx: 00000000
Nov 15 18:19:59 pandora kernel: esi: c12c70c0 edi: 00000246 ebp: 000001f0 esp: ca109e84
Nov 15 18:19:59 pandora kernel: ds: 0018 es: 0018 ss: 0018
Nov 15 18:19:59 pandora kernel: Process tar (pid: 3268, stackpage=ca109000)
Nov 15 18:19:59 pandora kernel: Stack: cf6da000 00000000 cffd52e0 cf6da000 c0149991 c12c70c0 000001f0 cf6da000
Nov 15 18:19:59 pandora kernel: cffd52e0 cffd52e0 00065c68 c014ab76 cf6da000 00000000 cffd52e0 00065c68
Nov 15 18:19:59 pandora kernel: cf6da000 c014aee3 cf6da000 00065c68 cffd52e0 00000000 00000000 00065c68
Nov 15 18:19:59 pandora kernel: Call Trace: [] [] [] [] []
Nov 15 18:19:59 pandora kernel: [] [] [] [] []
Nov 15 18:19:59 pandora kernel:
Nov 15 18:19:59 pandora kernel: Code: 8b 44 81 18 03 59 0c 89 41 14 40 74 09 57 9d 89 d8 5b 5e 5f
Nov 15 18:20:00 pandora kernel: kernel BUG at inode.c:335!
Nov 16 11:27:29 pandora kernel: slab error in cache_alloc_debugcheck_after(): cache `size-2048': double free, or memory outside object was overwritten
Nov 16 11:27:29 pandora kernel: [] dump_stack+0x1e/0x20
Nov 16 11:27:29 pandora kernel: [] __slab_error+0x2f/0x40
Nov 16 11:27:29 pandora kernel: [] cache_alloc_debugcheck_after+0xdc/0x1a0
Nov 16 11:27:29 pandora kernel: [] __kmalloc+0xa5/0x100
Nov 16 11:27:29 pandora kernel: [] __alloc_skb+0x53/0x130
Nov 16 11:27:29 pandora kernel: [] boomerang_rx+0x162/0x450
Nov 16 11:27:29 pandora kernel: [] boomerang_interrupt+0xc5/0x3d0
Nov 16 11:27:29 pandora kernel: [] handle_IRQ_event+0x33/0x70
Nov 16 11:27:29 pandora kernel: [] __do_IRQ+0x76/0x110
Nov 16 11:27:29 pandora kernel: [] do_IRQ+0x63/0xa0
Nov 16 11:27:29 pandora kernel: =======================
Nov 16 11:27:29 pandora kernel: [] common_interrupt+0x1a/0x20
Nov 16 11:27:29 pandora kernel: [] cpu_idle+0x57/0x60
Nov 16 11:27:29 pandora kernel: [] rest_init+0x3b/0x40
Nov 16 11:27:29 pandora kernel: [] start_kernel+0x178/0x1c0
Nov 16 11:27:29 pandora kernel: [] 0xc0100199
Nov 16 11:27:29 pandora kernel: cc5de000: redzone 1: 0x5a2cf071, redzone 2: 0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Slab corruption: start=c83e7404, len=32
Nov 16 11:27:29 pandora kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Last user: [](alloc_slabmgmt+0x52/0x60)
Nov 16 11:27:29 pandora kernel: 000: 30 7f 3e c8 24 ea fe cf 00 00 00 00 00 60 50 c7
Nov 16 11:27:29 pandora kernel: 010: 01 00 00 00 ff ff ff ff 00 00 5a 5a fe ff ff ff
Nov 16 11:27:29 pandora kernel: Prev obj: start=c83e73d8, len=32
Nov 16 11:27:29 pandora kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Last user: [](alloc_slabmgmt+0x52/0x60)
Nov 16 11:27:29 pandora kernel: 000: 30 2f 37 c8 80 73 3e c8 00 00 00 00 00 80 3e c8
Nov 16 11:27:29 pandora kernel: 010: 01 00 00 00 ff ff ff ff 00 00 5a 5a fe ff ff ff
Nov 16 11:27:29 pandora kernel: Next obj: start=c83e7430, len=32
Nov 16 11:27:29 pandora kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Last user: [](alloc_slabmgmt+0x52/0x60)
Nov 16 11:27:29 pandora kernel: 000: 88 74 3e c8 58 79 3e c8 00 00 00 00 00 60 c0 c7
Nov 16 11:27:29 pandora kernel: 010: 01 00 00 00 ff ff ff ff 00 00 5a 5a fe ff ff ff
Nov 16 11:27:29 pandora kernel: slab error in cache_alloc_debugcheck_after(): cache `size-32': double free, or memory outside object was overwritten
Nov 16 11:27:29 pandora kernel: [] dump_stack+0x1e/0x20
Nov 16 11:27:29 pandora kernel: [] __slab_error+0x2f/0x40
Nov 16 11:27:29 pandora kernel: [] cache_alloc_debugcheck_after+0xdc/0x1a0
Nov 16 11:27:29 pandora kernel: [] kmem_cache_alloc+0x6e/0xb0
Nov 16 11:27:29 pandora kernel: [] alloc_slabmgmt+0x52/0x60
Nov 16 11:27:29 pandora kernel: [] cache_grow+0xdd/0x190
Nov 16 11:27:29 pandora kernel: [] cache_alloc_refill+0x1b3/0x280
Nov 16 11:27:29 pandora kernel: [] kmem_cache_alloc+0x8f/0xb0
Nov 16 11:27:29 pandora kernel: [] dst_alloc+0x2e/0x90
Nov 16 11:27:29 pandora kernel: [] ip_route_input_slow+0x216/0x910
Nov 16 11:27:29 pandora kernel: [] ip_route_input+0xca/0x240
Nov 16 11:27:29 pandora kernel: [] ip_rcv+0x1ac/0x500
Nov 16 11:27:29 pandora kernel: [] netif_receive_skb+0x155/0x1f0
Nov 16 11:27:29 pandora kernel: [] process_backlog+0x79/0xf0
Nov 16 11:27:29 pandora kernel: [] net_rx_action+0x6a/0xf0
Nov 16 11:27:29 pandora kernel: [] __do_softirq+0x8f/0xa0
Nov 16 11:27:29 pandora kernel: [] do_softirq+0x64/0x70
Nov 16 11:27:29 pandora kernel: slab error in cache_alloc_debugcheck_after(): cache `size-2048': double free, or memory outside object was overwritten
Nov 16 11:27:29 pandora kernel: cc5de000: redzone 1: 0x5a2cf071, redzone 2: 0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Slab corruption: start=c83e7404, len=32
Nov 16 11:27:29 pandora kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Last user: [](alloc_slabmgmt+0x52/0x60)
Nov 16 11:27:29 pandora kernel: 000: 30 7f 3e c8 24 ea fe cf 00 00 00 00 00 60 50 c7
Nov 16 11:27:29 pandora kernel: 010: 01 00 00 00 ff ff ff ff 00 00 5a 5a fe ff ff ff
Nov 16 11:27:29 pandora kernel: Prev obj: start=c83e73d8, len=32
Nov 16 11:27:29 pandora kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Last user: [](alloc_slabmgmt+0x52/0x60)
Nov 16 11:27:29 pandora kernel: 000: 30 2f 37 c8 80 73 3e c8 00 00 00 00 00 80 3e c8
Nov 16 11:27:29 pandora kernel: 010: 01 00 00 00 ff ff ff ff 00 00 5a 5a fe ff ff ff
Nov 16 11:27:29 pandora kernel: Next obj: start=c83e7430, len=32
Nov 16 11:27:29 pandora kernel: Redzone: 0x170fc2a5/0x170fc2a5.
Nov 16 11:27:29 pandora kernel: Last user: [](alloc_slabmgmt+0x52/0x60)
Nov 16 11:27:29 pandora kernel: 000: 88 74 3e c8 58 79 3e c8 00 00 00 00 00 60 c0 c7
Nov 16 11:27:29 pandora kernel: 010: 01 00 00 00 ff ff ff ff 00 00 5a 5a fe ff ff ff
Nov 16 11:27:29 pandora kernel: slab error in cache_alloc_debugcheck_after(): cache `size-32': double free, or memory outside object was overwritten
Nov 16 11:27:30 pandora kernel: =======================
Nov 16 11:27:30 pandora kernel: [] irq_exit+0x3d/0x40
Nov 16 11:27:30 pandora kernel: [] do_IRQ+0x6a/0xa0
Nov 16 11:27:30 pandora kernel: [] common_interrupt+0x1a/0x20
Nov 16 11:27:30 pandora kernel: [] cpu_idle+0x57/0x60
Nov 16 11:27:30 pandora kernel: [] rest_init+0x3b/0x40
Nov 16 11:27:30 pandora kernel: [] start_kernel+0x178/0x1c0
Nov 16 11:27:30 pandora kernel: [] 0xc0100199
Nov 16 11:27:30 pandora kernel: c83e7400: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5.
Nov 16 11:27:30 pandora kernel: c83e7400: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5.
Nov 16 11:52:05 pandora kernel: Unable to handle kernel paging request at virtual address c7b9d018
Nov 16 11:52:05 pandora kernel: printing eip:
Nov 16 11:52:05 pandora kernel: *pde = 0001d067
Nov 16 11:52:05 pandora kernel: *pte = 07b9d000
Nov 16 11:52:05 pandora kernel: Oops: 0000 [#1]
Nov 16 11:52:14 pandora kernel: Unable to handle kernel paging request at virtual address c7b9d004
Nov 16 11:52:14 pandora kernel: printing eip:
Nov 16 12:08:04 pandora kernel: slab error in cache_alloc_debugcheck_after(): cache `skbuff_head_cache': double free, or memory outside object was overwritten
------------------------------------------------------------------------
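For anyone trying to read the "redzone" lines in the log above, here is a minimal sketch of how they could be decoded. It assumes the guard-word values that 2.6-era kernels define in mm/slab.c (RED_INACTIVE = 0x5A2CF071 for a free object, RED_ACTIVE = 0x170FC2A5 for an allocated one). The function name and the regular expression are made up for illustration.

```python
import re

# Assumed red-zone magic numbers from mm/slab.c in 2.6-era kernels.
RED_INACTIVE = 0x5A2CF071  # guard word of a free object
RED_ACTIVE = 0x170FC2A5    # guard word of an allocated object

NAMES = {RED_INACTIVE: "inactive", RED_ACTIVE: "active"}

# Matches e.g. "cc5de000: redzone 1: 0x5a2cf071, redzone 2: 0x170fc2a5."
REDZONE_RE = re.compile(
    r"([0-9a-f]{8}): redzone 1: 0x([0-9a-f]+), redzone 2: 0x([0-9a-f]+)"
)


def decode_redzones(line):
    """Return (address, state1, state2, words_agree), or None if no match."""
    match = REDZONE_RE.search(line)
    if match is None:
        return None
    address = match.group(1)
    first = int(match.group(2), 16)
    second = int(match.group(3), 16)
    # Any value that is not one of the two known magics is reported as corrupt.
    state1 = NAMES.get(first, "corrupt")
    state2 = NAMES.get(second, "corrupt")
    words_agree = first == second and first in NAMES
    return address, state1, state2, words_agree


if __name__ == "__main__":
    samples = [
        "Nov 16 11:27:29 pandora kernel: cc5de000: redzone 1: 0x5a2cf071, redzone 2: 0x170fc2a5.",
        "Nov 16 11:27:30 pandora kernel: c83e7400: redzone 1: 0x170fc2a5, redzone 2: 0x170fc2a5.",
    ]
    for sample in samples:
        print(decode_redzones(sample))
```

This only reports what the two guard words around an object say. It cannot tell whether a driver bug or faulty hardware wrote over them.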

The computer is 4 years old. It has:
Duron 800MHz
256MB RAM (currently)
GeForce Ti 4200 (nvidia driver not installed)
Sunix USB 2.0/FireWire card
3Com (3c590/3c595) network card

Any help that anyone can give me with this problem would be greatly appreciated. I don't have my kernel config available, but as soon as my computer lives long enough for me to copy it off to my external hard drive I will edit the post with it.

Filesystem problems

Anonymous (not verified)
on November 16, 2005 - 4:11pm

Judging from this line:
Nov 15 18:20:00 pandora kernel: kernel BUG at inode.c:335!

which is pretty close to the top of your debug log, it appears that your filesystem has become so corrupted that the kernel is having a hard time working with it.
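To see where a line like that falls relative to the other failures, the log can be boiled down to one entry per event. This is a rough sketch that assumes the syslog line format shown in the original post; the pattern list and the function name are hypothetical and only cover the three kinds of message that appear there.

```python
import re

# Hypothetical patterns for the message types seen in the posted log.
EVENT_PATTERNS = [
    ("oops", re.compile(r"Unable to handle kernel paging request")),
    ("bug", re.compile(r"kernel BUG at (\S+?)!")),
    ("slab", re.compile(r"slab error in .*cache `([^']+)'")),
]

# Assumed syslog prefix: "Nov 15 18:19:59 pandora kernel: message"
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: (.*)$")


def summarize(lines):
    """Return (timestamp, kind, detail) tuples in the order they were logged."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a kernel message in the assumed format
        stamp, message = match.groups()
        for kind, pattern in EVENT_PATTERNS:
            hit = pattern.search(message)
            if hit:
                # Only some patterns capture a detail (file:line or cache name).
                detail = hit.group(1) if pattern.groups else ""
                events.append((stamp, kind, detail))
                break
    return events


if __name__ == "__main__":
    sample = [
        "Nov 15 18:19:59 pandora kernel: Unable to handle kernel paging request at virtual address bc880ee8",
        "Nov 15 18:19:59 pandora kernel: printing eip:",
        "Nov 15 18:20:00 pandora kernel: kernel BUG at inode.c:335!",
        "Nov 16 11:27:29 pandora kernel: slab error in cache_alloc_debugcheck_after(): cache `size-2048': double free, or memory outside object was overwritten",
    ]
    for event in summarize(sample):
        print(event)
```

Run over the full log file, this gives a short timeline that is easier to scan than the raw traces.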

I would recommend burning a rescue disk that has an fsck utility for your root filesystem and running a thorough repair on it.

Even better, I would recommend booting off a rescue disk, copying everything that matters to a different computer, and then reformatting and reinstalling that machine.

Bad hardware does very bad things to the kernel. XFS even has checksums on a lot of its structures to detect bad memory and shut down the filesystem (hopefully) before it does too much damage. XFS is the only Linux filesystem I know of that has those checks, because doing those checks too often would severely slow your system down.
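The general idea of checksumming a structure so that corruption gets noticed can be sketched as follows. This is only a toy illustration using CRC32 from Python's zlib module. The record layout and the function names are invented for the example and are not XFS's on-disk format or its actual checking code.

```python
import struct
import zlib


def pack_record(inode_number, size):
    """Serialize a toy metadata record and append a CRC32 of its contents."""
    payload = struct.pack("<QQ", inode_number, size)
    return payload + struct.pack("<I", zlib.crc32(payload))


def verify_record(blob):
    """Recompute the CRC32 over the payload and compare it to the stored one."""
    payload = blob[:-4]
    (stored,) = struct.unpack("<I", blob[-4:])
    return zlib.crc32(payload) == stored


if __name__ == "__main__":
    record = pack_record(335, 4096)
    print(verify_record(record))  # intact record

    # Simulate a single bit flipped by bad memory.
    damaged = bytearray(record)
    damaged[3] ^= 0x01
    print(verify_record(bytes(damaged)))  # damaged record
```

Every verification recomputes the checksum over the whole structure, which is the cost the paragraph above refers to when it says that checking too often would slow the system down.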

That's what I thought too but

birchoff
on November 16, 2005 - 4:56pm

That's what I thought too, but I have just tried reinstalling Slackware 10.2 twice: on the first try the system rebooted, and on the second it froze up after sputtering some stuff to the console.
