Hi,

With 2.6.21-rc2-git1 I have a problem with my ps/2 port keyboard - it only
works with one of the following on the command-line:

- nolapic
- irqfixup
- pci=noacpi

Otherwise it gets stuck with the numlock on. The following options have no
effect:

- nohz=off (who knows eh?)
- pci=nomsi

Here is my lspci:

00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
00:0a.0 Network controller: RaLink RT2500 802.11g Cardbus/mini-PCI (rev 01)
00:0b.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
00:0b.1 Input device controller: Creative Labs SB Audigy Game Port (rev 04)
00:0b.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire Port (rev 04)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev a1)

And here is a boot log without any of those parameters (keyboard fails):

Mar 3 14:38:24 joker syslog-ng[4183]: syslog-ng starting up; version='2.0.0'
Mar 3 14:38:24 joker a0000 (reserved)
Mar 3 14:38:24 joker limit_regions endfor: 00000000000f0000 - ...
On Sat, 3 Mar 2007 15:14:24 +0000 > Mar 3 14:43:13 joker pnp: Device 00:
On Sun, 4 Mar 2007 14:23:50 +0000

Any thoughts on this? It still occurs with 2.6.21-rc3.

Here's my config in case that helps. You'll see that I have swap-prefetch
patched in (I also have RDSL and some VM changes in there), but I have
confirmed that the problem occurs with no extra patches. By the way, I
tested mm1 with a rather different config (I used my distro package) and
still saw the problem.

Also, you should probably ignore that bit above where I suggest the keyboard
driver is being loaded as a module, because of course it isn't.. Yet it does
start responding (with pci=noacpi) at about that time that udev does its
thing.

Anyhoo:

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-beyond
# Tue Mar 6 15:07:17 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION="ash"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# ...
Your config looks fine so it must be some ACPI change that affected IRQ
routing. If IRQs are not being delivered, the AT keyboard probe will time
out. You said that it broke between 2.6.20 and 2.6.21-rc2.. Have you tried
-rc1?

That happens because actual keyboard/mouse probing is offloaded to the
kseriod thread, so nothing happens until it actually gets scheduled.

--
Dmitry
-
On Wed, 7 Mar 2007 12:00:04 +0000

schnip

So, I tracked this down to 2.6.21-git7, the first snapshot that gives me
this problem. Tellingly it does contain an input tree merge.

I would git bisect but I don't have a local copy of the tree - I tried to
get one, but it stopped halfway through the clone, probably because I had to
use http...

So, I hope that helps.

Ash

PS: I should have said I'm not subscribed, so please CC me on reply.
PPS: That almost rhymes. Almost.
-
Hmm. There is no "2.6.21-git7" (that would be the seventh nightly snapshot
after 2.6.21 is released, which hasn't happened yet!).

Do you mean that it happens between 2.6.20-git6 and 2.6.20-git7? That would
be git commits (the way to get them is to look at the "*.id" file that is
associated with a snapshot):

	66efc5a7e3061c3597ac43a8bb1026488d57e66b	-git6
	509cb37e173d4e39cec47238397e91b718730794	-git7

and yes, doing a

	gitk 66efc5a7..509cb37e

Can you try "rsync"? It's not a great protocol in general, but it's
perfectly fine for an initial clone..

After that, since you have already narrowed it down to a particular nightly
snapshot, you could do the bisection starting from the known commits
already:

	git bisect start
	git bisect good 66efc5a7	# 2.6.20-git6 was good
	git bisect bad 509cb37e		# 2.6.20-git7 was bad

and you'll have less than 500 commits to test (which is quite fast to
bisect).

If you want to do some manual checking first (ie guessing that the bad
behaviour came from that particular input merge), you could first try out
commit 2a598df5, which is the head commit of the merged input tree (this is
all trivial to see with the above "gitk" - the SHA1's may sound scary and
esoteric, but they're really easy to look up).

That manual check (*if* it turns out that 2a598df5 is indeed the bad one)
would cut down the range from good to bad to just 18 commits (the range
would be 66efc5a7..2a598df5), and then you should be able to pinpoint the
exact bad one from just a few reboots..

(If you have to bisect all 500 commits, it would be ~10 reboots rather than
four or five).

		Linus
-
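The reboot counts quoted above follow from the fact that each bisection step roughly halves the remaining range. A minimal sketch of that arithmetic, assuming an idealised linear history (real git histories with merges can need a step more or less); the function name `bisect_steps` is made up for illustration and is not part of git:

```c
/*
 * Hypothetical helper: worst-case number of kernels to build and boot
 * when bisecting `n` candidate commits, assuming every test cuts the
 * remaining range in half (rounding up).
 */
unsigned bisect_steps(unsigned n)
{
	unsigned steps = 0;

	while (n > 1) {
		n -= n / 2;	/* keep the larger half: ceil(n / 2) */
		steps++;
	}
	return steps;
}
```

Under this assumption, a range of about 500 commits needs 9 tests and a range of 18 commits needs 5, which is in line with the "~10 reboots rather than four or five" estimate.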
Hm, that is strange... 2.6.20-rc7 has i8042 AUX IRQ delivery test fix and
fix for panic blink, both should not really affect your keyboard.

Can I please get full dmesg of boot with "i8042.debug log_buf_len=131072"?

--
Dmitry
-
Argh, I can't believe I forgot to get this into my tree. Could you please
tell me if the patch below fixes your issue?
--
Dmitry

Input: i8042 - another attempt to fix AUX delivery checks

Do not assume that AUX_LOOP command is broken unless it
completes successfully but returns wrong (unexpected) data.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
---

 drivers/input/serio/i8042.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux/drivers/input/serio/i8042.c
===================================================================
--- linux.orig/drivers/input/serio/i8042.c
+++ linux/drivers/input/serio/i8042.c
@@ -553,7 +553,8 @@ static int __devinit i8042_check_aux(voi
  */

 	param = 0x5a;
-	if (i8042_command(&param, I8042_CMD_AUX_LOOP) || param != 0x5a) {
+	retval = i8042_command(&param, I8042_CMD_AUX_LOOP);
+	if (retval || param != 0x5a) {

 /*
  * External connection test - filters out AT-soldered PS/2 i8042's
@@ -567,7 +568,12 @@ static int __devinit i8042_check_aux(voi
 		    (param && param != 0xfa && param != 0xff))
 			return -1;

-		aux_loop_broken = 1;
+/*
+ * If AUX_LOOP completed without error but returned unexpected data
+ * mark it as broken
+ */
+		if (!retval)
+			aux_loop_broken = 1;
 	}

 /*
-
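The behavioural change in the patch above can be summarised as a pair of pure functions. This is only an illustrative sketch outside the kernel: the function names are invented here, and the model leaves out the AUX_TEST step (which can return early and overwrites `param`) that sits between the two checks in the real driver.

```c
#include <stdbool.h>

/*
 * Hypothetical model, not kernel code.
 *   loop_err  - nonzero if the AUX_LOOP command itself failed (e.g. timed out)
 *   loop_data - byte read back after sending 0x5a through the loop
 */

/* Before the patch: any failure, or wrong data, marked AUX_LOOP broken. */
bool aux_loop_broken_before(int loop_err, unsigned char loop_data)
{
	return loop_err != 0 || loop_data != 0x5a;
}

/* After the patch: only a command that completed but echoed wrong data. */
bool aux_loop_broken_after(int loop_err, unsigned char loop_data)
{
	return loop_err == 0 && loop_data != 0x5a;
}
```

The two functions differ only when the command fails outright: the old logic flagged the loop as broken in that case, while the patched logic does not.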
On Wed, 7 Mar 2007 21:25:34 +0000

Apologies for only replying to my own mails, but I need to be CC'd if any
alternative is to be convenient :)

Linus, thanks for your detailed messages. I will try to get a bisect done,
but the university firewall is likely to put up a fight against rsync as
much as it does with the git protocol. We will see. And yeah, that
2.6.21-git7 business was a typo, should've been 2.6.20-git7, natch.

Anyway, here's the bootlog for Dmitry from a boot with broken keyboard
(2.6.21-rc3):

Mar 7 23:16:41 joker syslog-ng[4349]: syslog-ng starting up; version='2.0.0'
Mar 7 23:16:41 joker Linux version 2.6.21-beyondash (root@joker) (gcc version 4.1.2) #1 Wed Mar 7 11:39:45 GMT 2007
Mar 7 23:16:41 joker BIOS-provided physical RAM map:
Mar 7 23:16:41 joker sanitize start
Mar 7 23:16:41 joker sanitize end
Mar 7 23:16:41 joker copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
Mar 7 23:16:41 joker copy_e820_map() type is E820_RAM
Mar 7 23:16:41 joker copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
Mar 7 23:16:41 joker copy_e820_map() start: 00000000000f0000 size: 0000000000010000 end: 0000000000100000 type: 2
Mar 7 23:16:41 joker copy_e820_map() start: 0000000000100000 size: 000000001fef0000 end: 000000001fff0000 type: 1
Mar 7 23:16:41 joker copy_e820_map() type is E820_RAM
Mar 7 23:16:41 joker copy_e820_map() start: 000000001fff0000 size: 0000000000008000 end: 000000001fff8000 type: 3
Mar 7 23:16:41 joker copy_e820_map() start: 000000001fff8000 size: 0000000000008000 end: 0000000020000000 type: 4
Mar 7 23:16:41 joker copy_e820_map() start: 00000000fec00000 size: 0000000000001000 end: 00000000fec01000 type: 2
Mar 7 23:16:41 joker copy_e820_map() start: 00000000fee00000 size: 0000000000001000 end: 00000000fee01000 type: 2
Mar 7 23:16:41 joker copy_e820_map() start: 00000000fff80000 size: 0000000000080000 end: 0000000100000000 type: 2
Mar 7 23:16:41 joker BIOS-e820: ...
The non-working setup doesn't get any interrupts back, and thus doesn't see
the ACK for the "\xd4\xed" command.

It really looks interrupt-related (especially considering that it goes away
when you ask ACPI to not do certain things), but at the same time, the
differences between -git6 and -git7 really don't seem to have *any* ACPI or
PCI irq routing changes, so I think this really is related to the
input-layer, and perhaps the real difference between ACPI irq routing and
not is just the timing or IO access patterns that you get when you use the
local apic vs the i8259 legacy irq controller.

For example, if there is an edge-triggered interrupt involved (and both
keyboard *and* mouse are edge-triggered), the io-apic and the i8259 work
differently: temporarily disabling the interrupt will reset the edge trigger
logic on the i8259, but not on an IO-APIC.

So the lack of interrupts could be due to the input layer not clearing the
interrupt source during setup, so some *old* interrupt just stays around,
and because it's always set, on an IO-APIC it will never show as an edge at
all - but on the i8259 the very action of registering the irq routine will
create an edge.

There's some reason to believe that you may have a pending interrupt so
there was certainly *something* unexpected there.

I'm not saying that's it, but it could explain why something that looks
interrupt-related and that changes depending on whether you use ACPI to set
up interrupts or not can have these kinds of reasons, that just depend on
which interrupt controller the kernel happens to use, even though it's not
"really" about lost interrupts at all, but just a driver that doesn't
acknowledge a pending one.

Or something. Doing the git bisect would really help.

		Linus
-
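The stale-interrupt hypothesis above can be illustrated with a toy model. This is a sketch built only from the description in the message, not from hardware documentation or kernel code: all type and function names are invented, and the only difference modelled between the two controllers is that unmasking a line that is already high counts as an edge on the i8259 but not on the IO-APIC.

```c
#include <stdbool.h>

enum ctrl_kind { CTRL_I8259, CTRL_IOAPIC };	/* hypothetical toy types */

struct irq_line {
	enum ctrl_kind kind;
	bool level;		/* device's interrupt output right now */
	bool enabled;		/* line unmasked at the controller */
	unsigned delivered;	/* interrupts that reached the CPU */
};

/* Device drives its interrupt output high or low. */
void line_set(struct irq_line *l, bool level)
{
	if (l->enabled && !l->level && level)
		l->delivered++;	/* rising edge seen while unmasked */
	l->level = level;
}

/* Driver registers its handler, which unmasks the line. */
void line_enable(struct irq_line *l)
{
	/* Toy i8259 only: unmasking re-arms edge detection, so a line
	 * that is already high is seen as a fresh edge. */
	if (!l->enabled && l->level && l->kind == CTRL_I8259)
		l->delivered++;
	l->enabled = true;
}

/* A stale byte is pending before the handler is registered, then
 * three more bytes arrive. Returns the interrupts delivered. */
unsigned stale_pending_scenario(enum ctrl_kind kind)
{
	struct irq_line l = { kind, true, false, 0 };
	unsigned handled = 0;
	int i;

	line_enable(&l);
	for (i = 0; i < 3; i++) {
		if (l.delivered > handled) {	/* handler ran and read the data */
			handled = l.delivered;
			line_set(&l, false);	/* reading drops the output */
		}
		line_set(&l, true);		/* next byte arrives */
	}
	return l.delivered;
}
```

In this model the i8259 case delivers an interrupt for the stale byte and for every later one, while the IO-APIC case never sees an edge because nothing ever reads the stale byte and the output stays high. Whether this is what actually happened on the reporter's machine is not established by the thread.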
On Wed, 7 Mar 2007 23:49:14 +0000

Yup, that patch really hit the spot. Now I don't have to bisect :)

Thanks for your help (both of you),

Ash
-
