Re: [Bug #11035] System hangs on 2.6.26-rc8

From: Vegard Nossum
Date: Friday, July 18, 2008 - 12:28 am

On Fri, Jul 18, 2008 at 9:11 AM, Ingo Molnar <mingo@elte.hu> wrote:

Hm.. Yes, we could do it in a similar fashion using single-stepping.
It should take little effort; we already have most of the code to do
it; mmiotrace does the same thing too, after all.

These are some considerations:

1. If the page is kernel space but currently unmapped, does it point
to a valid page of RAM even though it is non-present?
2. Should we allow reading/writing of the underlying physical page (if
it exists), or should we prevent writes (i.e. allow the instruction to
proceed, but don't really write anything) and reads (i.e. have the
instruction read 0 or another magic number)?
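
To make the two options in point 2 concrete, here is a rough user-space
sketch in C. It is not kernel code; the policy names, the helper names and
the choice of 0 as the magic read value are all made up for illustration.
A NULL page stands for "no underlying physical page exists".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policies for an access that hit a non-present page. */
enum access_policy {
    POLICY_PASS_THROUGH, /* touch the underlying page, if there is one */
    POLICY_NEUTRALIZE    /* drop writes, make reads return a magic value */
};

/* Placeholder magic value; the mail only says "0 or another magic number". */
#define MAGIC_READ_VALUE ((uint8_t)0)

/* Value the faulting instruction would see for a one-byte read. */
uint8_t emulated_read(enum access_policy policy, const uint8_t *page,
                      size_t offset)
{
    if (policy == POLICY_PASS_THROUGH && page != NULL)
        return page[offset];
    return MAGIC_READ_VALUE;
}

/* Effect of a one-byte write: stored only under the pass-through policy. */
void emulated_write(enum access_policy policy, uint8_t *page,
                    size_t offset, uint8_t value)
{
    if (policy == POLICY_PASS_THROUGH && page != NULL)
        page[offset] = value;
    /* Otherwise the instruction "completes" but nothing is written. */
}
```

Under POLICY_NEUTRALIZE the page contents never change and never leak, so
a use-after-free cannot corrupt anything further; under POLICY_PASS_THROUGH
the buggy code behaves as it would without the debugging option.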

For the filter you mentioned, we could perhaps use one more bit in the
PTE. This is what we do for kmemcheck, and IIRC DEBUG_PAGEALLOC is
incompatible with kmemcheck anyway (I don't remember why exactly), so
we could reuse the same bit.
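
One possible shape for such a filter, sketched as plain C rather than real
fault-handler code. On x86, bit 0 of a PTE is the Present bit and bits 9-11
are ignored by the hardware and left to the OS. The name PTE_TRACKED, the
choice of bit 11 and the three-way result are assumptions for this sketch,
not taken from any existing patch.

```c
#include <stdint.h>

/* x86 PTE: bit 0 = Present; bits 9-11 are available to software. */
#define PTE_PRESENT (UINT64_C(1) << 0)
/* Hypothetical marker: "this page was unmapped on purpose by the debug code". */
#define PTE_TRACKED (UINT64_C(1) << 11)

enum fault_action {
    FAULT_NOT_OURS,    /* page is present: some other kind of fault */
    FAULT_SINGLE_STEP, /* marked page: report, single-step, then hide it again */
    FAULT_OOPS         /* unmarked non-present page: a genuine bad access */
};

/* Decide what to do with a page fault, given the PTE of the faulting address. */
enum fault_action classify_fault(uint64_t pte)
{
    if (pte & PTE_PRESENT)
        return FAULT_NOT_OURS;
    if (pte & PTE_TRACKED)
        return FAULT_SINGLE_STEP;
    return FAULT_OOPS;
}
```

Only the FAULT_SINGLE_STEP case would take the "log it and keep going"
path; everything else is left to the normal fault handling.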

BTW, I didn't consider that argument (of continuing as far as
possible) before, but it's a good one; if we don't crash completely,
the user can still copy the log and we get a better report of it. I guess
kerneloops.org is currently missing out on a great many reports from
crashes that shut down the machine immediately, without a chance for
anything to go into the log.


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
