Cc: Herbert Xu <herbert@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, Neil Brown <neilb@...>, J. Bruce Fields <bfields@...>, <netdev@...>, Tom Tucker <tom@...>
On Jan 5, 2008 1:07 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
vanilla -rc6 is fine without these fixes.
The raid-bugs from -rc6-mm1 are probably introduced by
md-allow-devices-to-be-shared-between-md-arrays.patch and that patch
is new in this mm-release.
OK, I will try this...
Aha, I had forgotten about that one.
Looking at all the crashlogs, I do not find another one of this lockdep warning.
The only other lockdep related output was the bootup problem in vanilla -rc6.
I had hoped that I could catch use-after-freeing by using
slub_debug=FZP, but that did not help.
(first oops in http://lkml.org/lkml/2007/12/28/159 )
I think that the main skb structs come from slub and should be
poisoned by this, so it might be some other data structure that is
allocated differently...
For my setup: It's a Gentoo system, so compiling packages is the
normal way of installing something.
The compile itself is done on a tmpfs so a filesystem corruption there
should be rather impossible. ;)
(The system has 4 GB of RAM, so it doesn't even need to swap)
The sources are taken from an NFSv4 share that is served from a
different system. Also, Gentoo checksums all sources it will use.
After the crashes I also did a checksum of the last installed
packages. Only in one instance was there corruption: all new files
were completely empty. Obviously XFS did not have the time to write
them back to disk before the system crashed.
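A post-crash check of this kind can be sketched roughly as follows. This
is only an illustration, not the commands actually used: the function
name find_damaged_files and the manifest format (relative path mapped to
a SHA-1 hex digest recorded at install time) are assumptions made for
the example.

```python
import hashlib
import os


def find_damaged_files(root, expected_sha1):
    """Return (empty, mismatched) lists of relative paths under root.

    expected_sha1 is a hypothetical manifest {relative_path: sha1_hex}
    recorded when the package was installed.
    """
    empty, mismatched = [], []
    for rel_path, digest in sorted(expected_sha1.items()):
        path = os.path.join(root, rel_path)
        if not os.path.isfile(path):
            # A file that vanished entirely is reported as a mismatch.
            mismatched.append(rel_path)
            continue
        with open(path, "rb") as fh:
            actual = hashlib.sha1(fh.read()).hexdigest()
        if actual == digest:
            continue
        if os.path.getsize(path) == 0:
            # Zero-length file where content was expected: typical for
            # data that never reached the disk before the crash.
            empty.append(rel_path)
        else:
            mismatched.append(rel_path)
    return empty, mismatched
```

Reporting zero-length files separately from other mismatches helps tell
"never written back before the crash" apart from content that was
actually corrupted.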
Also as all crashes show network related traces and the system is
working fine otherwise, I doubt any permanent filesystem problems.
For the raid problems: I was just unable to even start the raid that
has / on it, because of a wrong check in the raid-autostart code.
( http://lkml.org/lkml/2007/12/27/45 )
DEBUG_SLAB is off, because of:
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
But I currently do not have the slub_debug option on my kernel
command line, because:
a) slub_debug=FZP did not prevent the bug in -rc3-mm2
b) but it took a much longer time to trigger it
c) it's a serious slowdown for these compiles
If you think some other slub_debug might catch it, I would try this...
Torsten
--