2.6.26-rc3 is OK

top - 20:11:08 up 5 min,  1 user,  load average: 7.05, 4.50, 1.93
Tasks: 103 total,   3 running, 100 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.3%us, 91.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    494172k total,   278192k used,   215980k free,    18264k buffers
Swap:   500432k total,        0k used,   500432k free,   159376k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  851 root      15  -5     0    0    0 R 65.6  0.0   3:21.16 khubd
 3847 bor       20   0 97960  26m  20m R 32.3  5.6   0:06.27 kontact

I am unable to reboot - khubd is apparently spinning on CPU preventing it. System is using ohci_hcd.

00:00.0 Host bridge : ALi Corporation M1644/M1644T Northbridge+Trident [10b9:1644] (rev 01)
	Flags: bus master, medium devsel, latency 0
	Memory at f0000000 (32-bit, prefetchable) [size=64M]
	Capabilities: <access denied>
	Kernel driver in use: agpgart-ali
	Kernel modules: ali-agp

00:01.0 PCI bridge : ALi Corporation PCI to AGP Controller [10b9:5247] (prog-if 00 [Normal decode])
	Flags: bus master, slow devsel, latency 0
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	Memory behind bridge: f7f00000-fdffffff
	Prefetchable memory behind bridge: 48000000-480fffff

00:02.0 USB Controller [0c03]: ALi Corporation USB 1.1 Controller [10b9:5237] (rev 03) (prog-if 10 [OHCI])
	Subsystem: Toshiba America Info Systems Device [1179:0004]
	Flags: bus master, medium devsel, latency 64, IRQ 11
	Memory at f7eff000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <access denied>
	Kernel driver in use: ohci_hcd
	Kernel modules: ohci-hcd

00:04.0 IDE interface : ALi Corporation M5229 IDE [10b9:5229] (rev c3) (prog-if f0)
	Subsystem: Toshiba America Info Systems Device [1179:0004]
	Flags: bus master, medium devsel, latency 64, IRQ 255
	[virtual] Memory at ...
reverting 38b375d9610e2467cb793a84d17c6f65e44cdb39 fixed it
... that is:

commit 38b375d9610e2467cb793a84d17c6f65e44cdb39
Author: Alan Stern <email@example.com>
Date:   Mon Jul 21 09:56:26 2008 -0400

    USB: OHCI: fix system hang caused by earlier patch

    Signed-off-by: Alan Stern <firstname.lastname@example.org>
    Tested by: Andrey Borzenkov <email@example.com>
    Signed-off-by: Greg Kroah-Hartman <firstname.lastname@example.org>

so it apparently used to work for you at that time. What gives?

Rafael
Well, you should not commit a fix without first committing the code that it fixes :)
Actually the code to be fixed _was_ committed first -- but then it was reverted before the fix was accepted, so the fix was merged without it.

My advice is not to worry about it. That code has been sent once again to Linus -- it's not merged yet but presumably it will be soon. Certainly before 2.6.27-rc5 appears.

On the other hand, I still have to wonder how the fix could have caused your problem without the original patch in place. The fix itself should have been totally innocuous.

Alan Stern
It looks even funnier. Right now I am running with commits 38b375d9610e2467cb793a84d17c6f65e44cdb39 *and* e872154921a6b5256a3c412dd69158ac0b135176 reverted. I.e. this should be the state which hopelessly failed in 2.6.26-rc. It seems to be doing quite well now in 2.6.27-rc.

"git revert e872154921a6b5256a3c412dd69158ac0b135176" gives me this one-liner patch:

commit f3cf9ad86ee76077d1c6be9af7d197aa13ccdff9
Author: Andrey Borzenkov <email@example.com>
Date:   Fri Aug 22 21:15:26 2008 +0400

    Revert "USB: don't explicitly reenable root-hub status interrupts"

    This reverts commit e872154921a6b5256a3c412dd69158ac0b135176.

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 107e1d2..d30f822 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -3086,6 +3086,11 @@ static void hub_events(void)
 		if (!hdev->parent && !hub->busy_bits)
 			usb_enable_root_hub_irq(hdev->bus);
 
+		/* If this is a root hub, tell the HCD it's okay to
+		 * re-enable port-change interrupts now. */
+		if (!hdev->parent && !hub->busy_bits)
+			usb_enable_root_hub_irq(hdev->bus);
+
 loop_autopm:
 		/* Allow autosuspend if we're not going to run again */
 		if (list_empty(&hub->event_list))

Either my git tree is completely botched or most parts were already reverted before. So the problem seems to have cured itself between 2.6.26 and 2.6.27?
_Something_ is completely botched. e872154 is much bigger than what you quoted above.

The commit you really want to revert is 09ca8adbe9f724a7e96f512c0039c4c4a1c5dcc0.

Alan Stern
Sure. Mouse slipped doing copy'n'paste :) If you are still interested in this strange effect of lone 38b375d9, I could run some tests; just tell me what is needed.
I'm not really concerned with theoretical intermediate states. So long as your system is okay with the final state and the actual intermediate versions of the kernel (other than the 2.6.26-rc form which we already know causes problems), then I'm happy.

Alan Stern
Well, such things happened in the past.

I won't add this to the list of regressions for now, but please monitor the status of -rc5 and let me know if the problem reappears in there.

Thanks,
Rafael