Re: [bug] SLOB crash, 2.6.24-rc2

To: Nick Piggin <nickpiggin@...>
Cc: David Miller <davem@...>, <mpm@...>, <rjw@...>, <linux-kernel@...>, <akpm@...>, <torvalds@...>, Thomas Gleixner <tglx@...>
Date: Thursday, November 15, 2007 - 7:28 am

* Nick Piggin <nickpiggin@yahoo.com.au> wrote:


thx, i'll try your fix in a minute.


i sometimes test SLOB for -rt, but this time it's the result of my 
"automated random QA" effort, as part of arch/x86 maintenance/QA.

the main trick is to build and boot random "make randconfig" 
bzImages. That finds build bugs and a good deal of boot hang and crash 
bugs as well. (it also found a compiler bug already) I can build and 
boot about 1000 random kernels in 24 hours, and it's all fully 
automated. I usually run it overnight - when a kernel does not come up 
due to a bootup hang or crash (or the kernel log signals any exception 
condition) then the script stops and i can fix it in the morning.
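
A minimal sketch of such a stop-on-first-failure loop, not the actual script. The names build_random_kernel and qa_loop are hypothetical, and the boot check is passed in as a callback because how the test box is rebooted and reports back is specific to the setup. Only "make randconfig" and "make bzImage" come from the description above.

```python
import subprocess
from typing import Callable, Optional, Tuple


def build_random_kernel(tree: str) -> bool:
    """Generate a random .config and build a bzImage in the given kernel tree.

    Hypothetical helper: returns False as soon as either make target fails.
    """
    for target in ("randconfig", "bzImage"):
        if subprocess.run(["make", "-C", tree, target]).returncode != 0:
            return False
    return True


def qa_loop(build: Callable[[], bool],
            boot_ok: Callable[[], bool],
            max_runs: int) -> Optional[Tuple[int, str]]:
    """Build and boot kernels until one fails.

    Returns (iteration, "build" or "boot") for the first failure so the
    failing kernel can be inspected later, or None if all runs passed.
    """
    for i in range(max_runs):
        if not build():
            return (i, "build")
        if not boot_ok():
            return (i, "boot")
    return None
```

Usage would be something like qa_loop(lambda: build_random_kernel("/path/to/linux"), check_boot, 1000), where check_boot is whatever function waits for the test box to report back and the tree path is a placeholder.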

The first step towards this was to get allyesconfig bzImage kernels to 
build and boot fine. That effort took months (we had many problems in 
this area) - i think you saw bugreports and fixes from me about that on 
lkml.

Once that worked reasonably well i made a small Kconfig patch that 
forcibly selects a "minimum set" of drivers and kernel subsystems that 
are needed to boot up a testsystem. Once a "make allnoconfig" and a 
"make allyesconfig" bzImage kernel boots up fine on the testbox all 
randconfig configs "inbetween" are supposed to build and boot fine as 
well.

I also have a patch that adds all the x86 boot options like nosmp, 
maxcpus=1, nohz=off, hpet=disable to be selectable as .config options - 
so those boot options are randomized as well.

I also have a small patch that disables half a dozen drivers/features 
that are not expected to work out of the box in a bzImage kernel. (such as 
ISA drivers that assume the presence of hardware, or root filesystem 
features such as NFSROOT)

the resulting make randconfig kernel still has 99% of the degrees of 
freedom that a stock make randconfig kernel has, so for all practical 
purposes it's a fully random kernel - it just happens to boot on my 
testsystem all the time.

A successful bootup means the test system is able to boot up into a 
stock Fedora 8 userspace and is able to bring up its network interfaces 
and ssh out (automatically) to the build box to signal the completion of 
a successful test cycle. The logs are also analyzed for lockdep 
assertions (if lockdep is enabled - which it is in about 20% of the 
randconfig kernels) and other kernel bugs.
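
The log analysis step could be sketched roughly as follows. This is an illustration under assumptions, not the real checker: the function name find_kernel_problems and the pattern list are hypothetical, chosen to match message prefixes the kernel commonly prints for lockdep reports, oopses, panics and warnings.

```python
import re

# Assumed markers of trouble in a kernel log; a real setup would tune this list.
BUG_PATTERNS = [
    r"possible circular locking dependency detected",  # lockdep report
    r"\bBUG:",
    r"\bOops\b",
    r"Kernel panic",
    r"\bWARNING:",
]
_BUG_RE = re.compile("|".join(BUG_PATTERNS))


def find_kernel_problems(log_text: str) -> list:
    """Return every log line that matches one of the assumed bug markers."""
    return [line for line in log_text.splitlines() if _BUG_RE.search(line)]
```

An empty result together with the ssh callback arriving would count as a successful cycle; any hit would stop the run.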

(just in case you were wondering about one of the reasons why the 
arch/x86 unification merge went so smoothly, with nary a regression ;-) 
Thomas is doing other types of automated QA of the x86 queue as well.)

this method found the SG-list corruption bugs the night after 
Linus committed Jens's SG-list changes, so it's pretty good at finding 
regressions as early as possible.

	Ingo
