Re: 2.6.23-rc7 - _random_ IRQ23 : nobody cared

Previous thread: [PATCH] fix modules oopsing in lguest guests by Rusty Russell on Monday, September 24, 2007 - 1:57 am. (1 message)

Next thread: [PATCH] Move kasprintf.o to obj-y by Alexey Dobriyan on Monday, September 24, 2007 - 3:18 am. (6 messages)
To: Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>
Cc: <rol@...>
Date: Monday, September 24, 2007 - 2:33 am

Hello,

I already reported the "irq X: nobody cared" warning on 2.6.23-rcX kernels,
and it seemed to have been fixed in 2.6.23-rc6. Unfortunately, after simply
rebooting into my 2.6.23-rc7, it appeared again, although the previous boot
was fine and I did not change or recompile my kernel in between.

So, what changed? I compiled two modules, qc-usb-messenger and hsf-modem,
to make sure all my hardware is fully supported.

And now I have:
....
scsi 3:0:1:0: Direct-Access ATA ST3500641AS 3.AA PQ: 0 ANSI: 5
sd 3:0:1:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:1:0: [sdd] Write Protect is off
sd 3:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:1:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
sd 3:0:1:0: [sdd] Write Protect is off
sd 3:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
irq 23: nobody cared (try booting with the "irqpoll" option)

Call Trace:
<IRQ> [<ffffffff8105d21b>] __report_bad_irq+0x30/0x72
[<ffffffff8105d46c>] note_interrupt+0x20f/0x253
[<ffffffff8105dd38>] handle_fasteoi_irq+0xa9/0xd1
[<ffffffff8100ec65>] do_IRQ+0xf1/0x160
[<ffffffff8100b25b>] mwait_idle+0x0/0x45
[<ffffffff8100c431>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff8100b29d>] mwait_idle+0x42/0x45
[<ffffffff8100b1f3>] cpu_idle+0xbd/0xe0
[<ffffffff8175ca8e>] start_kernel+0x2bb/0x2c7
[<ffffffff8175c140>] _sinittext+0x140/0x144

handlers:
[<ffffffff81307485>] (ata_interrupt+0x0/0x1d3)
Disabling IRQ #23
sdd:<3>ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata4.01: cmd c8/00:08:00:00:00/00:00:00:00:00/f0 tag 0 cdb 0x0 data 4096 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4: soft resetting port
ata4.01: qc timeout (cmd 0x27)
ata4.01: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors
(976773168) ata4.0...

To: <rol@...>
Cc: Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>
Date: Monday, September 24, 2007 - 10:26 am

Tried using the modem?
-

To: David Newall <david@...>
Cc: Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Tuesday, September 25, 2007 - 3:00 am

Hi David,

On Mon, 24 Sep 2007 23:56:59 +0930

When no problem is reported, both libata and the modem work fine.
When the problem is reported, only libata is handling IRQ23 at that point
(the modem is a WinModem, and its driver is an out-of-tree module). This
is still kernel boot time, and disabling the IRQ leaves my machine
unable to complete the boot process (too many disk timeouts).

It would be good to be able to delay disabling an IRQ for long enough to
allow all the modules to be loaded...

Paul

-

To: Paul Rolland <rol@...>
Cc: David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Wednesday, September 26, 2007 - 8:55 pm

Can you change driver load order such that the driver for the modem is
loaded first?

--
tejun

-

To: Tejun Heo <htejun@...>
Cc: David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Thursday, September 27, 2007 - 2:05 am

Hi Tejun,

On Thu, 27 Sep 2007 09:55:22 +0900

As I said, it's not possible, because:
- the modem driver is an out-of-tree one, so I have to wait for the end of
the boot process before it can be loaded,
- libata on IRQ23 is the one taking care of my disks, and I suspect it
would be quite hard to install a modem driver before the disk driver is
installed.

I was thinking of delaying the disabling of the IRQ, which is basically the
other part of the problem (the first part being that spurious IRQ from the
modem). If it is possible to do that long enough for the modem driver to be
loaded, then the "IRQ xx : nobody cared" becomes an informational message
during the boot process, and then it vanishes, leaving a perfectly working
machine.
I am thinking of something like this in note_interrupt (totally
untested, just thinking aloud):

	/* Allow some delay to complete the boot process before
	 * killing an IRQ. This allows some modules to be
	 * loaded before we decide the IRQ will not be handled.
	 * jiffies starts at INITIAL_JIFFIES rather than 0, so
	 * compare against that, using the wrap-safe time_after().
	 */
	if (time_after(jiffies, INITIAL_JIFFIES + 120*HZ)) {
		/*
		 * Now kill the IRQ
		 */
		printk(KERN_EMERG "Disabling IRQ #%d\n", irq);
		desc->status |= IRQ_DISABLED;
		desc->depth = 1;
		desc->chip->disable(irq);
	}

I'll try that this week-end, but if someone has an opinion about it, I'll
be glad to know :)
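
As a rough illustration of the proposal, here is a small userspace C model of
the accounting note_interrupt appears to do, with the grace period added. It
is only a sketch under assumptions: the 100000/99900 thresholds are what
kernel/irq/spurious.c seems to use around this release, and the names
note_interrupt_model, tick_after, storm and GRACE_SECONDS are made up for this
example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed thresholds: judge the line every 100000 interrupts and kill it
 * when more than 99900 of them went unhandled. */
#define IRQ_WINDOW      100000u
#define UNHANDLED_LIMIT  99900u

/* Hypothetical grace period, taken from the 120 seconds proposed above. */
#define GRACE_SECONDS      120u

struct irq_state {
    unsigned int irq_count;      /* interrupts seen in the current window */
    unsigned int irqs_unhandled; /* of those, how many nobody claimed */
    unsigned int warnings;       /* "nobody cared" reports so far */
    bool disabled;               /* true once the line has been killed */
};

/* Wrap-safe "a is later than b", the same idea as the kernel's time_after(). */
static bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Model of the spurious-IRQ check with a boot-time grace period.
 * now and boot are tick counters, hz is the number of ticks per second. */
static void note_interrupt_model(struct irq_state *st, bool handled,
                                 uint32_t now, uint32_t boot, uint32_t hz)
{
    assert(st != NULL);
    if (st->disabled)
        return;
    if (!handled)
        st->irqs_unhandled++;
    if (++st->irq_count < IRQ_WINDOW)
        return;
    st->irq_count = 0;
    if (st->irqs_unhandled > UNHANDLED_LIMIT) {
        st->warnings++; /* always report */
        /* ...but only kill the line once the grace period is over. */
        if (tick_after(now, boot + GRACE_SECONDS * hz))
            st->disabled = true;
    }
    st->irqs_unhandled = 0;
}

/* Feed n unhandled interrupts at tick `now`; returns whether the line
 * ended up disabled. */
static bool storm(struct irq_state *st, unsigned int n,
                  uint32_t now, uint32_t boot, uint32_t hz)
{
    for (unsigned int i = 0; i < n; i++)
        note_interrupt_model(st, false, now, boot, hz);
    return st->disabled;
}
```

With a boot tick that sits just below the 32-bit wrap (as the kernel's initial
jiffies value does), a storm 10 seconds after boot only produces a warning,
while the same storm 400 seconds after boot disables the line even though the
counter has wrapped in between. That wrap is the reason the comparison goes
through tick_after rather than a plain ">".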

Regards,
Paul

-

To: Paul Rolland <rol@...>
Cc: David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Friday, September 28, 2007 - 5:55 am

You can do both by...

1. Build the modem driver into the kernel. char drivers are linked in
before ATA ones, so it will attach first.

2. Use a custom initrd with an emergency shell. The initrd is loaded by
BIOS so no driver is involved. I don't actually know how to do this tho.

3. Put in an extra disk controller and boot from it with both drivers
compiled as modules.

--
tejun
-

To: Paul Rolland <rol@...>
Cc: Tejun Heo <htejun@...>, David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Thursday, September 27, 2007 - 5:04 am

Let me guess... this is a T61 or X61 ?

There's a problem with these that we don't fully understand yet, we're
getting those stale interrupts all over the range.

I wonder if it could be a bug with the ICH8 chipset...

If yours is one of these, it's being dealt with (or at least we are
attempting to deal with it) at

http://bugzilla.kernel.org/show_bug.cgi?id=8853

Ben.

-

To: <benh@...>
Cc: Tejun Heo <htejun@...>, David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Thursday, September 27, 2007 - 6:05 am

Hello,

On Thu, 27 Sep 2007 19:04:11 +1000
Bad luck ;)

This is an Asus P5W-DH Deluxe motherboard, with a Core2 6400 CPU,
a bunch of disks (2 IDE, 3 SATA, 1 CDRW and 1 DVDRW-DL), and a damned
Olitec PCI V92 V2 modem.

Paul

-

To: Paul Rolland <rol@...>
Cc: Tejun Heo <htejun@...>, David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Thursday, September 27, 2007 - 6:27 pm

What chipset ? 965gm ?

Ben.

-

To: <benh@...>
Cc: Tejun Heo <htejun@...>, David Newall <david@...>, Linux Kernel <linux-kernel@...>, IDE/ATA development list <linux-ide@...>, <rol@...>
Date: Friday, September 28, 2007 - 2:30 am

Hi,

On Fri, 28 Sep 2007 08:27:58 +1000

975x

Paul

--
Paul Rolland E-Mail : rol(at)witbe.net
Witbe.net SA Tel. +33 (0)1 47 67 77 77
Les Collines de l'Arche Fax. +33 (0)1 47 67 77 99
F-92057 Paris La Defense RIPE : PR12-RIPE

Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un navigateur
"Some people dream of success... while others wake up and work hard at it"

"I worry about my child and the Internet all the time, even though she's too
young to have logged on yet. Here's what I worry about. I worry that 10 or 15
years from now, she will come to me and say 'Daddy, where were you when they
took freedom of the press away from the Internet?'"
--Mike Godwin, Electronic Frontier Foundation
-
