Re: [BUG] 2.6.25-rc3 hangs in early boot on Sun Ultra5

Previous thread: printk_ratelimit and net_ratelimit conflict and tunable behavior by Steven Hawkes on Monday, February 25, 2008 - 4:36 pm. (4 messages)

Next thread: [PATCH] video: limit stack usage of ir-kbd-i2c.c by Marcin Slusarz on Monday, February 25, 2008 - 4:51 pm. (7 messages)
To: <davem@...>
Cc: <sparclinux@...>, <linux-kernel@...>
Date: Monday, February 25, 2008 - 4:41 pm

Booting 2.6.25-rc3 on my Ultra5 causes a hang before or as
the console is switched over to the framebuffer. The console
output is (extrapolated from dmesg in -rc2 and handwritten
notes, as I don't have a serial cable to my U5):

PROMLIB: Sun IEEE Boot Prom 'OBP 3.25.3 2000/06/29 14:12'
PROMLIB: Root node compatible: 
*** the following line can't be seen in dmesg after rc2 has booted
console [earlyprom0] enabled
Linux version 2.6.25-rc3 (mikpe@sparge) (gcc version 4.2.3) #1 Mon Feb 25 18:49:41 CET 2008
ARCH: SUN4U
Ethernet address: 08:00:20:fd:ec:1f
[0000000200000000-fffff80000400000] page_structs=262144 node=0 entry=0/0
[0000000200000000-fffff80000800000] page_structs=262144 node=0 entry=1/0
[0000000200000000-fffff80000c00000] page_structs=262144 node=0 entry=2/0
[0000000200000000-fffff80001000000] page_structs=262144 node=0 entry=3/0
OF stdout device is: /pci@1f,0/pci@1,1/SUNW,m64B@2
PROM: Built device tree with 46617 bytes of memory.
On node 0 totalpages: 32299
  Normal zone: 335 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 31964 pages, LIFO batch:7
  Movable zone: 0 pages used for memmap
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 31964
Kernel command line: ro root=/dev/sda5
PID hash table entries: 1024 (order: 10, 8192 bytes)
clocksource: mult[28000] shift[16]
clockevent: mult[66666666] shift[32]
Console: colour dummy device 80x25
*** the following line can't be seen in dmesg after rc2 has booted
console handover: boot [earlyprom0] -> real [tty0]

At this point rc3 hangs hard and won't even respond to sysrq.

Another difference is that with rc2 the first few lines of kernel
output while the console is still in OF mode either aren't shown
or disappear quickly since the switch to the framebuffer occurs
within a fraction of a second after the kernel has been loaded.
With rc3 the kernel output (the text shown above) in the OF-mode
console is very very slow.

(I should have quoted my .config here but I forgot to bring it.
...
To: <mikpe@...>
Cc: <sparclinux@...>, <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 4:46 pm

From: Mikael Pettersson <mikpe@it.uu.se>

Yes that's a new feature.  Until we switch over to the "real"
console we print the log messages using the firmware console
routines.

This way if an early crash or similar happens, you'll see it
and be able to report it instead of having to report with
"-p" on the command line.

I'll fire up my ultra5 and try to figure out what's wrong
with the atyfb framebuffer driver, that's where it's dying.
--
To: Mikael Pettersson <mikpe@...>
Cc: <davem@...>, <sparclinux@...>, <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 4:55 am

Mikael Pettersson writes:
 > Booting 2.6.25-rc3 on my Ultra5 causes a hang before or as
 > the console is switched over to the framebuffer. The console
 > output is (extrapolated from dmesg in -rc2 and handwritten
 > notes, as I don't have a serial cable to my U5):
 > 
 > PROMLIB: Sun IEEE Boot Prom 'OBP 3.25.3 2000/06/29 14:12'
 > PROMLIB: Root node compatible: 
 > *** the following line can't be seen in dmesg after rc2 has booted
 > console [earlyprom0] enabled
 > Linux version 2.6.25-rc3 (mikpe@sparge) (gcc version 4.2.3) #1 Mon Feb 25 18:49:41 CET 2008
 > ARCH: SUN4U
 > Ethernet address: 08:00:20:fd:ec:1f
 > [0000000200000000-fffff80000400000] page_structs=262144 node=0 entry=0/0
 > [0000000200000000-fffff80000800000] page_structs=262144 node=0 entry=1/0
 > [0000000200000000-fffff80000c00000] page_structs=262144 node=0 entry=2/0
 > [0000000200000000-fffff80001000000] page_structs=262144 node=0 entry=3/0
 > OF stdout device is: /pci@1f,0/pci@1,1/SUNW,m64B@2
 > PROM: Built device tree with 46617 bytes of memory.
 > On node 0 totalpages: 32299
 >   Normal zone: 335 pages used for memmap
 >   Normal zone: 0 pages reserved
 >   Normal zone: 31964 pages, LIFO batch:7
 >   Movable zone: 0 pages used for memmap
 > Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 31964
 > Kernel command line: ro root=/dev/sda5
 > PID hash table entries: 1024 (order: 10, 8192 bytes)
 > clocksource: mult[28000] shift[16]
 > clockevent: mult[66666666] shift[32]
 > Console: colour dummy device 80x25
 > *** the following line can't be seen in dmesg after rc2 has booted
 > console handover: boot [earlyprom0] -> real [tty0]
 > 
 > At this point rc3 hangs hard and won't even respond to sysrq.
 > 
 > Another difference is that with rc2 the first few lines of kernel
 > output while the console is still in OF mode either aren't shown
 > or disappear quickly since the switch to ...
To: <mikpe@...>
Cc: <sparclinux@...>, <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 8:49 pm

From: Mikael Pettersson <mikpe@it.uu.se>

Between the VT layer registering its console and the atyfb
driver initializing we get a crash, and it happens on all
sparc64 systems.  It is caused by this commit and I am working
on a fix:

commit a0c1e9073ef7428a14309cba010633a6cd6719ea
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Sat Feb 23 15:23:57 2008 -0800

    futex: runtime enable pi and robust functionality
    
    Not all architectures implement futex_atomic_cmpxchg_inatomic().  The default
    implementation returns -ENOSYS, which is currently not handled inside of the
    futex guts.
    
    Futex PI calls and robust list exits with a held futex result in an endless
    loop in the futex code on architectures which have no support.
    
    Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would
    add a fair amount of extra if/else constructs to the already complex code.  It
    is also not possible to disable the robust feature before user space tries to
    register robust lists.
    
    Compile time disabling is not a good idea either, as there are already
    architectures with runtime detection of futex_atomic_cmpxchg_inatomic support.
    
    Detect the functionality at runtime instead by calling
    cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization
    code.  This is guaranteed to fail, but the call of
    futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled.
    
    On architectures, which use the asm-generic implementation or have a runtime
    CPU feature detection, a -ENOSYS return value disables the PI/robust features.
    
    On architectures with a working implementation the call returns -EFAULT and
    the PI/robust features are enabled.
    
    The relevant syscalls return -ENOSYS and the robust list exit code is blocked,
    when the detection fails.
    
    Fixes http://lkml.org/lkml/2008/2/11/149
    Originally reported by: Lennart Buytenhek
    
    Signe...
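
The detection scheme described in the commit message above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `cmpxchg_fn`, `cmpxchg_unsupported`, `cmpxchg_supported` and `futex_cmpxchg_detect` are hypothetical names invented here to stand in for an architecture's futex_atomic_cmpxchg_inatomic() and for the probe done at futex initialization. The only behaviour taken from the commit message is the rule itself: probing with a NULL pointer yields -EFAULT on a working implementation and -ENOSYS otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an architecture's cmpxchg primitive. */
typedef int (*cmpxchg_fn)(int *uaddr, int oldval, int newval);

/* Models an architecture with no support (asm-generic style fallback). */
static int cmpxchg_unsupported(int *uaddr, int oldval, int newval)
{
	(void)uaddr;
	(void)oldval;
	(void)newval;
	return -ENOSYS;
}

/* Models a working implementation: a NULL "user" pointer faults. */
static int cmpxchg_supported(int *uaddr, int oldval, int newval)
{
	int prev;

	if (uaddr == NULL)
		return -EFAULT;
	prev = *uaddr;
	if (prev == oldval)
		*uaddr = newval;
	return prev;
}

/*
 * Probe with a NULL pointer, as the commit message describes.
 * Returns 1 if the PI/robust features may be enabled, 0 if not.
 */
static int futex_cmpxchg_detect(cmpxchg_fn probe)
{
	assert(probe != NULL);
	return probe(NULL, 0, 0) == -EFAULT;
}
```

In this model, `futex_cmpxchg_detect(cmpxchg_supported)` enables the features and `futex_cmpxchg_detect(cmpxchg_unsupported)` leaves them disabled. The sparc64 hang discussed in this thread comes from the fault taken by that probe, which the model deliberately does not reproduce.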
To: David Miller <davem@...>
Cc: <mikpe@...>, <sparclinux@...>, <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 4:27 am

David Miller writes:
 > From: Mikael Pettersson <mikpe@it.uu.se>
 > Date: Tue, 26 Feb 2008 09:55:50 +0100
 > 
 > > Minor update: rc2-git7 has the slow initial console behaviour,
 > > but successfully switches to the framebuffer. rc2-git8 however
 > > hangs in the console handover. So I'll bisect git7->git8 next.
 > 
 > Between the VT layer registering its console and the atyfb
 > driver initializing we get a crash, and it happens on all
 > sparc64 systems.  It is caused by this commit and I am working
 > on a fix:
 > 
 > commit a0c1e9073ef7428a14309cba010633a6cd6719ea
 > Author: Thomas Gleixner <tglx@linutronix.de>
 > Date:   Sat Feb 23 15:23:57 2008 -0800
 > 
 >     futex: runtime enable pi and robust functionality

My git7->git8 bisection yesterday independently also arrived
at that specific commit as being the culprit.

Bracketing the offending cmpxchg_futex_value_locked(NULL, 0, 0)
call with #if 0 .. #endif was enough to make my kernel boot.

I'll try your do_kernel_fault() patch later today.

/Mikael
--
To: <mikpe@...>
Cc: <sparclinux@...>, <linux-kernel@...>, <tglx@...>
Date: Tuesday, February 26, 2008 - 9:06 pm

From: David Miller <davem@davemloft.net>
Date: Tue, 26 Feb 2008 16:49:00 -0800 (PST)

[ Thomas, forgot to CC: you earlier, changeset
  a0c1e9073ef7428a14309cba010633a6cd6719ea ("futex: runtime enable pi
  and robust functionality") broke sparc64. ]

The following patch will let things "work" but the trick being used
here by the FUTEX layer is borderline valid in my opinion.

Basically for 10+ years on sparc64 we've had this check here in the
fault path, which makes sure that if we're processing an exception
table entry we really, truly, are doing an access to userspace from
the kernel.  Otherwise we OOPS.

What the FUTEX checking code is doing now is doing a "user" access
with set_fs(KERNEL_DS) since it runs from the kernel bootup early init
sequence.  And this is illegal according to the existing checks.

When we do set_fs(KERNEL_DS) then pass a "user" pointer down
into a system call or something like that, we give it a pointer
that "cannot fault".  So if we get into the fault handling
path here for a case like that we really do want to scream and
print out an OOPS message in my opinion.

I realize that not many platforms other than sparc64 can check
for things this precisely, but it's something to consider.

Did this FUTEX change go into -stable too?

diff --git a/arch/sparc64/mm/fault.c b/arch/sparc64/mm/fault.c
index e2027f2..9183633 100644
--- a/arch/sparc64/mm/fault.c
+++ b/arch/sparc64/mm/fault.c
@@ -244,16 +244,8 @@ static void do_kernel_fault(struct pt_regs *regs, int si_code, int fault_code,
 	if (regs->tstate & TSTATE_PRIV) {
 		const struct exception_table_entry *entry;
 
-		if (asi == ASI_P && (insn & 0xc0800000) == 0xc0800000) {
-			if (insn & 0x2000)
-				asi = (regs->tstate >> 24);
-			else
-				asi = (insn >> 5);
-		}
-	
-		/* Look in asi.h: All _S asis have LS bit set */
-		if ((asi & 0x1) &&
-		    (entry = search_exception_tables(regs->tpc))) {
+		entry = search_exception_tables(regs->tpc);
+		if (entry) {
 			regs->tpc = entr...
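
To make the effect of the patch above easier to follow, here is a stand-alone user-space sketch of the check it removes. It is an illustration under assumptions, not the kernel function: `effective_asi`, `old_fixup_allowed` and `new_fixup_allowed` are hypothetical names, the `has_entry` flag stands in for the result of search_exception_tables(), and the ASI values (0x80 for ASI_P, 0x11 for ASI_AIUS) are quoted from memory of the sparc64 asi.h header and should be checked against it.

```c
#include <stdint.h>

/* Assumed values; verify against the sparc64 asi.h header. */
#define ASI_AIUS 0x11	/* as-if-user secondary: a real user access */
#define ASI_P    0x80	/* primary: an ordinary kernel access */

/*
 * Mirrors the removed lines: for an alternate-space load/store, take
 * the ASI from the %asi field of TSTATE (i bit set) or from the
 * immediate field of the instruction (i bit clear).
 */
static uint8_t effective_asi(uint8_t asi, uint32_t insn, uint64_t tstate)
{
	if (asi == ASI_P && (insn & 0xc0800000u) == 0xc0800000u) {
		if (insn & 0x2000)
			asi = (uint8_t)(tstate >> 24);
		else
			asi = (uint8_t)(insn >> 5);
	}
	return asi;
}

/* Old rule: fix up only if the access used a secondary ("_S") ASI. */
static int old_fixup_allowed(uint8_t asi, uint32_t insn, uint64_t tstate,
			     int has_entry)
{
	return (effective_asi(asi, insn, tstate) & 0x1) && has_entry;
}

/* New rule after the patch: any exception table entry is enough. */
static int new_fixup_allowed(int has_entry)
{
	return has_entry != 0;
}
```

With the `%asi` field holding ASI_P, which is what set_fs(KERNEL_DS) is assumed to produce here, the old rule refuses the fixup and the kernel OOPSes, while the new rule accepts it. With ASI_AIUS both rules accept.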
To: David Miller <davem@...>
Cc: <mikpe@...>, <sparclinux@...>, <linux-kernel@...>, <tglx@...>
Date: Wednesday, February 27, 2008 - 3:16 pm

David Miller writes:
 > From: David Miller <davem@davemloft.net>
 > Date: Tue, 26 Feb 2008 16:49:00 -0800 (PST)
 > 
 > [ Thomas, forgot to CC: you earlier, changeset
 >   a0c1e9073ef7428a14309cba010633a6cd6719ea ("futex: runtime enable pi
 >   and robust functionality") broke sparc64. ]
 > 
 > > From: Mikael Pettersson <mikpe@it.uu.se>
 > > Date: Tue, 26 Feb 2008 09:55:50 +0100
 > > 
 > > > Minor update: rc2-git7 has the slow initial console behaviour,
 > > > but successfully switches to the framebuffer. rc2-git8 however
 > > > hangs in the console handover. So I'll bisect git7->git8 next.
 > > 
 > > Between the VT layer registering its console and the atyfb
 > > driver initializing we get a crash, and it happens on all
 > > sparc64 systems.  It is caused by this commit and I am working
 > > on a fix:
 > 
 > The following patch will let things "work" but the trick being used
 > here by the FUTEX layer is borderline valid in my opinion.
 > 
 > Basically for 10+ years on sparc64 we've had this check here in the
 > fault path, which makes sure that if we're processing an exception
 > table entry we really, truly, are doing an access to userspace from
 > the kernel.  Otherwise we OOPS.
 > 
 > What the FUTEX checking code is doing now is doing a "user" access
 > with set_fs(KERNEL_DS) since it runs from the kernel bootup early init
 > sequence.  And this is illegal according to the existing checks.
 > 
 > When we do set_fs(KERNEL_DS) then pass a "user" pointer down
 > into a system call or something like that, we give it a pointer
 > that "cannot fault".  So if we get into the fault handling
 > path here for a case like that we really do want to scream and
 > print out an OOPS message in my opinion.
 > 
 > I realize that not many platforms other than sparc64 can check
 > for things this precisely, but it's s...
To: <mikpe@...>
Cc: <sparclinux@...>, <linux-kernel@...>, <tglx@...>
Date: Wednesday, February 27, 2008 - 3:37 pm

From: Mikael Pettersson <mikpe@it.uu.se>

Thank you for testing.
--
To: David Miller <davem@...>
Cc: <mikpe@...>, <sparclinux@...>, <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 4:02 am

So it would be correct to set_fs(USER_DS) then do the check and switch


It's queued, AFAIK

Thanks,
	tglx
--
To: <tglx@...>
Cc: <mikpe@...>, <sparclinux@...>, <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 3:05 pm

From: Thomas Gleixner <tglx@linutronix.de>

No, I'm saying it would be better not to take faults purposefully in
the kernel address space.  We don't have a usable user address space
setup at this point in the boot, so using USER_DS would be even worse.

I think I'll just add a different version of the sanity check to this
sparc64 code later on, one that will take into consideration this
KERNEL_DS case because I can see how it could be useful in other

Crap, I'll need to push my fix there too.
--
To: David Miller <davem@...>
Cc: <mikpe@...>, <sparclinux@...>, <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 3:55 pm

I would have preferred not to. The hassle is that we need to figure
out, whether it works or not _before_ any user space program can use
the interfaces. We could omit the check for archs where the


Ok.
 
Thanks,
	tglx
--
To: <mikpe@...>
Cc: <sparclinux@...>, <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 5:32 pm

From: Mikael Pettersson <mikpe@it.uu.se>

Thanks for doing this research.
--