> On Fri 9.May'08 at 12:32:51 -0400, Mark Lord wrote:
>> Pallipadi, Venkatesh wrote:
>>>
>>>> -----Original Message-----
>>>> From:
linux-kernel-owner@vger.kernel.org
>>>> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Carlos R. Mafra
>>>> Sent: Friday, May 02, 2008 10:16 AM
>>>> To: Linus Torvalds
>>>> Cc: Adrian Bunk; Paul Mackerras; Josh Boyer; Arjan van de Ven; Andrew
>>>> Morton; Rafael J. Wysocki; davem@davemloft.net;
>>>> linux-kernel@vger.kernel.org; jirislaby@gmail.com; Steven Rostedt;
>>>> Pallipadi, Venkatesh
>>>> Subject: Re: RFC: starting a kernel-testers group for newbies
>>>>
>>>> On Fri 2.May'08 at 9:28:08 -0700, Linus Torvalds wrote:
>>>>
>>>>> Quite frankly, it does sound like the hang happens somewhere
>>>> around the
>>>>> hpet_init
>>>>> hpet_acpi_add
>>>>> hpet_resources
>>>>> hpet_resources: 0xfed00000 is busy
>>>>>
>>>>> printk's you added (correct?) and we've had tons of issues
>>>> with NO_HZ, so
>>>>> at a guess it is timer-related.
>>>> It happens a bit before that because when it hangs it doesn't print the
>>>> above lines, and when it does not hang these lines are
>>>> the ones right after the point where it hangs.
>>>>> (And I assume it's stable if/once it gets past that boot hang issue?
>>>> Yes you are right. When I have luck and the boot succeeds my Sony laptop
>>>> is rock solid and the kernel is wonderful (even the card reader works!).
>>>>
>>>>> That
>>>>> tends to mean that it's not some hardware instability, it's
>>>> literally our
>>>>> init code).
>>>> A few days ago I found this message in lkml in reply to a hpet patch
>>>>
http://lkml.org/lkml/2007/5/7/361 in which the reporter also had a
>>>> similar hang, which was cured by hpet=disable.
>>>> So it is in my TODO list to try to check out if that patch is in the
>>>> current -git and whether it can be reverted somehow (I added Venki to the
>>>> Cc: now)
>>>>
>>>> Thanks a lot for the answer!
>>> It depends on whether we are HPET is being force detected based on the
>>> chipset or whether it was exported by the BIOS in ACPI table.
>>>
>>> If it was force enabled and above patch is having any effect, then you
>>> should see a message like
>>>> Force enabled HPET at base address 0xfed00000
>>> In any case, off late there seems to be quite a few breakages that are
>>> related to HPET/timer interrupts. One of them was on a system which has
>>> HPET being exported by BIOS
>>>
http://bugzilla.kernel.org/show_bug.cgi?id=10409
>>> And the other one where we are force enabling based on chipset
>>>
http://bugzilla.kernel.org/show_bug.cgi?id=10561
>>>
>>> And then we have hangs once in a while reports by you, Roman and Mark
>>> here
>>>
http://bugzilla.kernel.org/show_bug.cgi?id=10377
>>>
http://bugzilla.kernel.org/show_bug.cgi?id=10117
>> ..
>>
>> Yeah. This particular bug first appeared when NOHZ & HPET were added.
>> Somebody once suggested it had something to do with an SMI interrupt
>> happening in the midst of HPET calibration or some such thing.
>>
>
> I said I was waiting for -rc1 to be released to send another email
> about my HPET problem, but curiously with v2.6.26-rc1-6-gafa26be
> my laptop did not hang after 30+ boots and counting.
>
> Somewhere between 2.6.25-07000-(something) and the above kernel
> something happened which changed significantly the probability
> of hanging during boot.
>
> I could not boot more than 3 times in
> a row without hanging with kernels up to 2.6.25-07000 (approximately),
> and now I am still booting v2.6.26-rc1-6-gafa26be a few times a day
> and no hangs yet.
>
> Yesterday I started a "reverse" bisection, trying to find which
> commit "fixed" it, but I still didn't finish (but it is past
> -7200).
>
> Of course I am not sure if after the 100th boot the latest -git
> won't hang but it definitely improved.
>
>> But nobody who works on the HPET code has ever shown more than a casual
>> interest in helping to track down and fix whatever the problem is.
>
> Well, I would like to thank Venki for his effort because he even
> answered some private emails from me about this issue and is
> tracking the bugzillas about it.