Hello,
Kernel 2.6.26-rc1 completely fails to boot on my laptop. The last line
appearing on the console is:

ACPI: Processor [CPU1] (supports 8 throttling states)

Kernel 2.6.25 fails to boot in an estimated 1 in 2 attempts. The last
line on the console if this one fails is:

ACPI: LNXTHERM:01 is registered as thermal_zone0

I did a git bisection with "fails to boot in the first three attempts"
as the definition of "bad" between 2.6.26-rc1 and 2.6.25-rc3 which I
believed to be free of this problem. The many reboots in the
bisection process however showed me that 2.6.25-rc3 also has an
estimated 1 in 10 chance of failing. Anyway, the bisection ended
with:

8a3227268877b81096d7b7a841aaf51099ad2068 is first bad commit

The bisection log and dmesg are attached.
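Because the hang is intermittent, a "bad" kernel can still pass three boots in a row and be marked "good" during the bisection. A minimal sketch of that arithmetic follows. It assumes each boot hangs independently with a fixed probability, which is a simplifying assumption and not something measured here, and the helper name is made up for illustration.

```python
def chance_bad_kernel_passes(p_hang: float, boots: int) -> float:
    """Probability that a kernel which hangs with probability p_hang per boot
    nevertheless boots cleanly `boots` times in a row.

    Assumes independent boots with a constant hang probability (a modeling
    assumption, not a measured property of these kernels).
    """
    if not 0.0 <= p_hang <= 1.0:
        raise ValueError("p_hang must be between 0 and 1")
    if boots < 0:
        raise ValueError("boots must be non-negative")
    return (1.0 - p_hang) ** boots


if __name__ == "__main__":
    # Estimated rates from the report: about 1 in 2 for 2.6.25,
    # about 1 in 10 for 2.6.25-rc3.
    for label, p in (("1 in 2", 0.5), ("1 in 10", 0.1)):
        print(f"{label}: {chance_bad_kernel_passes(p, 3):.3f}")
```

Under these assumptions a kernel that hangs once in 10 boots survives three attempts roughly 73% of the time, so a three-boot test will frequently label it good and send the bisection in the wrong direction.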
On Sat 10.May'08 at 22:21:28 +0200, Henny Wilbrink wrote:
I have had this boot problem since before 2.6.25-rc1, but it is
very difficult to do a bisection. Sometimes the probability
of hanging was 1:30, and what I thought was a good kernel
was in fact bad.

In fact the commit you got is a "merge commit" which does not
change the source code, so it is clearly bogus.

Mark Lord points out that this bug comes and goes with slight
modifications in the .config, and in fact very recently
the probability of hanging changed dramatically for me.

I have already booted the kernel afa26be86b65 (six commits
after 2.6.26-rc1) more than 30 times and it did not hang.

You can also take a look at some discussion here:
http://bugzilla.kernel.org/show_bug.cgi?id=10117

And good luck with this!
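How many clean boots are needed before a kernel can reasonably be called good? A rough sketch follows. It reads "1:30" as a hang probability of 1/30 per boot and assumes boots are independent; both are assumptions, and the function name is hypothetical.

```python
import math


def clean_boots_needed(p_hang: float, confidence: float) -> int:
    """Smallest n such that a kernel hanging with probability p_hang per boot
    would show at least one hang within n boots with probability >= confidence.

    Solves (1 - p_hang) ** n <= 1 - confidence for n, assuming independent boots.
    """
    if not 0.0 < p_hang < 1.0:
        raise ValueError("p_hang must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_hang))


if __name__ == "__main__":
    p = 1 / 30
    print(clean_boots_needed(p, 0.95))
    # Chance that a kernel hanging once in 30 boots still survives 30 in a row.
    print(f"{(1 - p) ** 30:.2f}")
```

With these numbers, 30 clean boots still leave about a 0.36 chance that a kernel hanging once in 30 boots simply got lucky, and about 89 clean boots would be needed for 95% confidence.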
--
No, 2.6.26-rc1 with hpet=disable now stops at
ata_piix 0000:00:1f.2: MAP [ P0 P2 IDE IDE ]
However, with idle=mwait as suggested by Venkatesh Pallipadi it did
Thanks,
Henny
--
I think so too, but it is difficult to really find where
the problem is (hpet?, nohz?, cpuidle? etc).

So today I tried to humbly hack a bit to get a trace, and I think
I managed to do it.

The observation in my notes was that with 2.6.25-rc9 the
kernel printed two lines when I pressed the power button
when the boot hung:

evmisc-0145 [00] ev_queue_notify_reques: Dispatching Notify(80) on node ffff81007f04ee30
evmisc-0154 [00] ev_queue_notify_reques: Notify value: 0x80 **Device Specific*

so the kernel was not completely hung. Sometimes lines similar to the above (I
don't remember the numbers anymore) were being printed continuously after the
hang for more than 15 minutes at a rate of more or less 1 per minute.

So my idea was to insert a WARN_ON(1) in a few places, and in particular the
function which had those evmisc printk's.

Then I got some traces before and after the hang point, and another trace when
I pressed the power button! I don't know if they will reveal something
interesting to kernel hackers, but I will transcribe them here (I lost my
digital camera, so it took me a long time to write them).

This is what the screen looked like when the boot hung (caveat lector: it
was all copied by hand, so there may be typos):

[<ffffffff8022a909>] ? update_rq_clock+0x19/0x20
[<ffffffff8023fbfa>] run_timer_softirq+0x2a/0x230
[<ffffffff8022a81a>] ? __update_rq_clock+0x2a/0x100
[<ffffffff8023bc44>] __do_softirq+0x74/0xf0
[<ffffffff8020fcca>] ? profile_pc+0x3a/0x70
[<ffffffff8020d45c>] call_softirq+0x1c/0x30
[<ffffffff8020faed>] do_softirq+0x3d/0x80
[<ffffffff8023bbc5>] irq_exit+0x85/0x90
[<ffffffff8021f4de>] smp_apic_timer_interrupt+0x7e/0xc0
[<ffffffff8020b260>] ? mwait_idle+0x0/0x50
[<ffffffff8020b0e0>] ? default_idle+0x0/0x70
[<ffffffff8020cf06>] apic_timer_interrupt+0x66/0x70
<EOI> [<ffffffff8020b2a0>] ? mwait_idle+0x40/0x50
[<ffffffff8020a882>] ? enter_idle+0x22/0x30
[<ffff...
Well, for what it is worth, I just finished building 2.6.26-rc2 and it
booted ok three times in a row.

Regards,
Henny
--
