Hi
I found a regression in version 2.6.25-rc7 which causes my computer not to
shut down or reboot. I get a complete lockup (no keyboard and no Sys-Rq) just
before it normally shuts down the alsa service.

I did a git bisect and it points to this revision:
[266c2e0abeca649fa6667a1a427ad1da507c6375] Make printk() console semaphore
accesses sensible

00:00.0 Host bridge: VIA Technologies, Inc. VT8385 [K8T800 AGP] Host Bridge
(rev 01)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890
South]
00:05.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture
(rev 11)
00:05.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev
11)
00:08.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
00:08.1 Input device controller: Creative Labs SB Audigy Game Port (rev 04)
00:08.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire Port (rev 04)
00:0b.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5705 Gigabit
Ethernet (rev 03)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID
Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge
[KT600/K8T800/K8T890 South]
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron]
HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron]
Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM
Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] ...
--
Is it 100% reproducible? I mean, sometimes I also get a complete lockup right
after one of the eth interfaces is shut down during the shutdown sequence. And
it's also recent.

But it triggers ~1 time in 10, subjectively..
--
It is 100% reproducible with rc7 and rc8. I haven't come across the bug with
rc9 though..
--
Argh, I think I see what's going on. I switched the order of some
operations without thinking about it, and moved a test outside of the
console lock.

Since you cannot see this on -rc9, can you go back to -rc8 and test this
patch (instead of the revert) on top of that? I'm pretty sure this is it,
but it would be good to have it confirmed.

Linus
--
Oh, maybe I should actually include the patch too ;)
Linus
---
 kernel/printk.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index c46a20a..4476879 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -643,8 +643,20 @@ static int acquire_console_semaphore_for_printk(unsigned int cpu)
 {
 	int retval = 0;
 
-	if (can_use_console(cpu))
-		retval = !try_acquire_console_sem();
+	if (!try_acquire_console_sem()) {
+		retval = 1;
+
+		/*
+		 * If we can't use the console, we need to release
+		 * the console semaphore by hand to avoid flushing
+		 * the buffer
+		 */
+		if (!can_use_console(cpu)) {
+			console_locked = 0;
+			up(&console_sem);
+			retval = 0;
+		}
+	}
 	printk_cpu = UINT_MAX;
 	spin_unlock(&logbuf_lock);
 	return retval;
--
Sorry for the late response, we have an electricity situation in the country.
The patch fixes the problem for me (tested on rc8).

Thanx
--
Interesting. That commit _should_ have just moved code around with no
actual semantic changes.

Can you verify that undoing just that one commit makes current git (-rc9)
work for you? Ie just try a

	git revert 266c2e0abeca649fa6667a1a427ad1da507c6375

on top of the current tree. I just want to check, because even after
looking at that diff again, I'm not seeing what it could actually change.

Linus
--
After about 4 restarts and 4 reboots, 2.6.25-rc9 seems to work fine with and
without the revert. I'll do some more testing. The assembly output files for
kernel/printk.c don't seem that different between rc9 and rc7. I'll see what
else I can test.

Thanx
--
Ok, I suspect it may be timing-dependent and slightly random.

Sadly, that is absolutely the case where "git bisect" works the worst. The
end result of bisection will basically be _totally_ random if even one of
the "git bisect bad/good" choices were wrong - doing a binary search is
a very efficient way to find the buggy commit, but it also means that a
single wrong turn will efficiently find a commit that is somewhere totally
different.

So if your shutdown/reboot regression is even slightly non-deterministic,
it makes "git bisect" rather less powerful (you can still do bisection,
but you just need to be extra careful and probably try multiple shutdowns
with a suspect kernel just to be absolutely sure you mark a kernel good
only if you're *really* sure it's good. Marking a kernel bad is much
safer, since you'd presumably only do that when you have actually seen the
bug in action).

Of course, even when you're really really careful, if it really is
timing-dependent, the bug may show up with an unrelated commit just because
it changes timing. Those kinds of bugs tend to be fairly rare, but they do
occasionally happen.

Linus
--
git revert 266c2e0abeca649fa6667a1a427ad1da507c6375 on rc8 makes rc8 work
again.
--
