Hi
I get the following oops when trying to boot an arch/powerpc kernel with
preempt-rt installed (v2.6.25.4-rt1).
The board is using a Freescale 8280 as the main CPU and a Silicon Image
SII3124 SATA controller. The oops seems to happen on file access right
after init starts. I need ideas on what to look for.
Freeing unused kernel memory: 128k init
INIT: version 2.85 booting
Activating all swap files/partitions... [ OK ]
Mounting proc file system... [ OK ]
path=/bin:/usr/bin:/sbin:/usr/sbin
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c0249618 LR: c02495ec CTR: 00000000
REGS: ef29d550 TRAP: 0700 Not tainted (2.6.25.4-rt1)
MSR: 00021032 <ME,IR,DR> CR: 24044482 XER: 00000000
TASK = ef26d070[50] 'ldconfig' THREAD: ef29c000
GPR00: 00000001 ef29d600 ef26d070 00000000 11111111 00000000 ef29d64c
0000008c
GPR08: ef29d628 00000000 ef29d630 ef29c000 00000000 100b5eec 00000000
100b0000
GPR16: c00c3304 ef29dc48 0000000c 00000000 00000014 00000000 00000001
ef3818c0
GPR24: ef8a4000 00000011 00000000 c01bc3c4 ef29d698 ef381904 00009032
c0354700
NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
Call Trace:
[ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
[ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
[ef29d690] [c00285e8] del_timer+0x2c/0x78
[ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
[ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
[ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
[ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
[ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
[ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
[ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
[ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
[ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
[ef29d810] [c0006448] do_IRQ+0x68/0xa8
[ef29d820] [c0010388] ret_from_except+0x0/0x14
--- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
LR = __spin_unlock_irqre...
You're recursively entering lock_timer_base, which does a
spin_lock_irqsave(). Either interrupts are enabled when they should not
be, or an interrupt was supposed to be threaded that isn't.
-Scott
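To make the failure mode above concrete, here is a toy user-space model, not kernel code. It assumes (as I understand -rt) that the lock is a sleeping lock with an owner field, that taking it does not mask interrupts, and that a non-threaded hard-IRQ handler runs in the context of whatever task it interrupted. All names (toy_lock, toy_lock_acquire, toy_irq_handler) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_lock {
    const void *owner;          /* NULL when unlocked */
};

enum toy_result { TOY_ACQUIRED, TOY_CONTENDED, TOY_SELF_DEADLOCK };

/* Try to take the lock on behalf of 'task'. */
static enum toy_result toy_lock_acquire(struct toy_lock *l, const void *task)
{
    assert(l != NULL && task != NULL);
    if (l->owner == NULL) {
        l->owner = task;
        return TOY_ACQUIRED;
    }
    /* The same context already holds the lock, so it can never release
     * it while waiting for itself. This is the case the slow path in
     * the oops is modeled as treating as a bug. */
    if (l->owner == task)
        return TOY_SELF_DEADLOCK;
    return TOY_CONTENDED;       /* a different owner can still release it */
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l != NULL);
    l->owner = NULL;
}

/* Models a handler that needs the lock. 'ctx' is the task it runs as:
 * the interrupted task for a non-threaded handler, or a separate
 * IRQ thread for a threaded one. */
static enum toy_result toy_irq_handler(struct toy_lock *l, const void *ctx)
{
    enum toy_result r = toy_lock_acquire(l, ctx);

    if (r == TOY_ACQUIRED)
        toy_lock_release(l);
    return r;
}
```

In this model, a handler that runs as the task already holding the lock gets TOY_SELF_DEADLOCK, while the same handler run as a separate thread only sees ordinary contention and could simply wait.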
--
Sort of figured. How do I figure out which one, and how to fix it?
I've never gotten any -rt patchsets to work on this CPU, and it always
seems to be related to the disk driver.
I've tried since 2.6.16 ppc.... (2.6.16, 2.6.18 on ppc, 2.6.24 and 25 on
powerpc)
Even though this is a custom board, I'm pretty sure I can get it to fail
on a pq2fads board with the same disk controller.
--
Almost certainly the latter. Is the disk interrupt shared with any
other interrupts that are marked IRQF_NODELAY? The -rt patch doesn't
seem to handle mixing the two well.
Oh, and just to be sure: you do have CONFIG_PREEMPT_RT turned on, and
not just CONFIG_PREEMPT, right? The non-preempt-rt versions in the -rt
patch don't look like they disable interrupts, though I may just be
getting lost in a sea of underscores and ifdefs.
-Scott
--
Disk is on a muxed PCI interrupt. None of the other interrupts on the
mux is firing at the time.
Is it possible that the demuxer is not set up right? It is based loosely
Full CONFIG_PREEMPT_RT. I was actually going to try CONFIG_PREEMPT to
see if anything helped.
--
Regardless of whether they're firing, any request_irq with IRQF_NODELAY
Try calling set_irq_chip_and_handler() with handle_level_irq, rather
than set_irq_chip(). The -rt patch doesn't seem to have threadified the
__do_IRQ() path.
-Scott
--
The demuxer is setting itself up with set_irq_chained_handler(). Any
pointers on how to change to set_irq_chip_and_handler()?
--
No, I mean the call to set_irq_chip() in pci_pic_host_map() where it
sets up the IRQs it manages, not the cascade IRQ itself.
-Scott
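A rough user-space model of why that one call matters, assuming the 2.6.25-era behaviour where dispatch falls back to __do_IRQ() when a descriptor has no flow handler installed. The types and functions below are simplified stand-ins that only borrow the kernel's names; they are not the real definitions, and the real handlers take different arguments.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_IRQS 4

/* Which dispatch path an interrupt ends up on in this model. */
enum toy_path { TOY_PATH_DO_IRQ, TOY_PATH_LEVEL_FLOW };

/* Simplified stand-ins for kernel types; NOT the real definitions. */
struct irq_chip { const char *name; };
typedef enum toy_path (*flow_handler_t)(unsigned int irq);

struct irq_desc {
    struct irq_chip *chip;
    flow_handler_t handle_irq;  /* NULL: no flow handler installed */
};

static struct irq_desc irq_desc[TOY_NR_IRQS];

/* Stand-in flow handler: only reports that the level flow path ran. */
static enum toy_path handle_level_irq(unsigned int irq)
{
    (void)irq;
    return TOY_PATH_LEVEL_FLOW;
}

/* Installs only the chip; the flow handler is left untouched. */
static void set_irq_chip(unsigned int irq, struct irq_chip *chip)
{
    assert(irq < TOY_NR_IRQS);
    irq_desc[irq].chip = chip;
}

/* Installs the chip and the flow handler together. */
static void set_irq_chip_and_handler(unsigned int irq, struct irq_chip *chip,
                                     flow_handler_t handle)
{
    set_irq_chip(irq, chip);
    irq_desc[irq].handle_irq = handle;
}

/* Models the dispatch decision: use the flow handler if there is one,
 * otherwise take the legacy __do_IRQ() path. */
static enum toy_path generic_handle_irq(unsigned int irq)
{
    assert(irq < TOY_NR_IRQS);
    if (irq_desc[irq].handle_irq)
        return irq_desc[irq].handle_irq(irq);
    return TOY_PATH_DO_IRQ;
}
```

In this model an IRQ mapped with set_irq_chip() alone still dispatches through the __do_IRQ() path seen in the first trace, while one mapped with set_irq_chip_and_handler(..., handle_level_irq) goes through the flow handler instead.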
--
Thanks!!! That fixed that particular problem.
Of course I then ran headfirst into another one....
This one seems to happen when I attempt to read flash through an mtd
driver.
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c005e780 LR: c005e758 CTR: 00000000
REGS: ef2c1d20 TRAP: 0700 Not tainted (2.6.25.4-rt1)
MSR: 00029032 <EE,ME,IR,DR> CR: 48222482 XER: 20000000
TASK = ef2a1a90[98] 'S21initenv' THREAD: ef2c0000
GPR00: 3ffcf581 ef2c1dd0 ef2a1a90 00000000 c02cae8c 00000002 00000001
3ffcf580
GPR08: 00000000 c0370000 00000000 c037000c 28222484 1009ecc0 00000000
100a5838
GPR16: 10090000 10090000 00000000 00000000 10090000 ef2a7100 3ee38385
100955ac
GPR24: 00000000 ef2a4ef0 00000000 c27dc700 c27dcb80 00000003 ef359040
c0334f88
NIP [c005e780] do_wp_page+0x650/0xc2c
LR [c005e758] do_wp_page+0x628/0xc2c
Call Trace:
[ef2c1dd0] [c005e758] do_wp_page+0x628/0xc2c (unreliable)
[ef2c1e10] [c0012420] do_page_fault+0x338/0x4b4
[ef2c1f40] [c0010120] handle_page_fault+0xc/0x80
--- Exception: 301 at 0x100322e0
LR = 0x100322dc
Instruction dump:
409e0594 4810f2e1 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
7d6b4a14 540007fa 30e0ffff 7cc70110 <0f060000> 3d20c035 3d40c035
8129b480
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c005db0c LR: c005dae4 CTR: 00000000
REGS: ef29fd00 TRAP: 0700 Tainted: G D (2.6.25.4-rt1)
MSR: 00029032 <EE,ME,IR,DR> CR: 48002482 XER: 20000000
TASK = ef26e070[102] 'sed' THREAD: ef29e000
GPR00: 3ee2d581 ef29fdb0 ef26e070 00000000 c02cae8c 00000000 00000001
3ee2d580
GPR08: 00000000 c0370000 00000000 c037000c 00000722 1009ecc0 4802dfa4
4802d878
GPR16: 0014f73c 00000000 00000000 00000003 4802cce0 00000000 00000001
02000000
GPR24: ef3592e0 0ffece1c ef3592e0 ef2a50fc ef2a4df4 00000003 c27dcfe0
c27ff4c0
NIP [c005db0c] __do_fault+0x1e0/0x804
LR [c005dae4] __do_fault+0x1b8/0x804
Call Trace:
[ef29fdb0] [c005dae4]...
Both of these are hitting a BUG_ON in kmap_atomic.
--
I'm not sure if LOCKDEP is available for that architecture. Have you
tried it? It's pretty good at flushing these kinds of issues out
(assuming it's available).
-Greg
--
