Re: [Bugme-new] [Bug 15709] New: swapper page allocation failure

From: Michael S. Tsirkin
Date: Thursday, April 22, 2010 - 3:03 am

I think you did the right thing. We'll have to
figure out soft lockup thing, then if page allocation failure

I'm not sure why the lockup backtrace does not show function names -

Well, so the soft lockup issue seems NFS-related?
Trond, commit cf8d2c11cb77f129675478792122f50827e5b0ae seems to
--

From: Robert Wimmer
Date: Thursday, April 22, 2010 - 10:26 pm

I always build the kernels with "genkernel", a Gentoo
helper program for kernel building. I've looked into
the genkernel log file and it mentions nothing about
stripping the kernel. A future release of genkernel
will support this, but that is currently not the case. Since
I haven't stripped the kernel I would answer no. Maybe there is a
kernel option which should be enabled?

Thanks!
Robert





--

From: Michael S. Tsirkin
Date: Sunday, April 25, 2010 - 2:18 am

Hmm. I have these
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
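
One way to confirm which of these options a given build actually has is to
grep its config file. This is only a sketch: the helper name check_kallsyms
is made up for illustration, and the config path you pass to it is an
assumption about your own setup.

```shell
#!/bin/sh
# check_kallsyms: hypothetical helper, not part of any kernel tooling.
# Prints, for each symbol-table option, whether it is set to "y" in the
# given kernel config file.
check_kallsyms() {
    config_file="$1"
    for opt in CONFIG_KALLSYMS CONFIG_KALLSYMS_ALL CONFIG_KALLSYMS_EXTRA_PASS; do
        # Anchor both ends so CONFIG_KALLSYMS does not match CONFIG_KALLSYMS_ALL.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt=y"
        else
            echo "$opt is not set"
        fi
    done
}
```

For example, "check_kallsyms /usr/src/linux/.config", assuming the kernel
tree lives at that path.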
--

From: Robert Wimmer
Date: Sunday, April 25, 2010 - 1:41 pm

I've added CONFIG_KALLSYMS and CONFIG_KALLSYMS_ALL
to my .config. I've uploaded the dmesg output. Maybe it
helps a little bit:

https://bugzilla.kernel.org/attachment.cgi?id=26138

- Robert



--

From: Michael S. Tsirkin
Date: Sunday, April 25, 2010 - 1:49 pm

So, it's an NFS-related regression, which is consistent with the bisect
results. I guess someone who knows about NFS will have to look at it...
BTW, you probably want to label the bug as regression.

--

From: Trond Myklebust
Date: Monday, April 26, 2010 - 5:15 am

That last trace is just saying that the NFSv4 reboot recovery code is
crashing (which is hardly surprising if the memory management is hosed).

The initial bisection makes little sense to me: it is basically blaming
a page allocation problem on a change to the NFSv4 mount code. The only
way I can see that possibly happen is if you are hitting a stack
overflow.
So 2 questions:

  - Are you able to reproduce the bug when using NFSv3 instead?
  - Have you tried running with stack tracing enabled?

Cheers
  Trond
--

From: Robert Wimmer
Date: Monday, April 26, 2010 - 1:25 pm

I've tried with NFSv3 now. With v4 the error normally occurs
within 5 minutes. The VM has now been running for one hour with no
soft lockup so far. So I would say it can't be reproduced with

Can you explain this a little bit more please? CONFIG_STACKTRACE=y
was already enabled. I've now enabled

CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FTRACE_NMI_ENTER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_FTRACE_SYSCALLS=y
CONFIG_FTRACE_NMI_ENTER=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_GENERIC_TRACER=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_STACK_TRACER=y
CONFIG_KMEMTRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y

and run

echo 1 > /proc/sys/kernel/stack_tracer_enabled

But the output in dmesg and /var/log/messages is
mostly the same. Can you please guide me on how to
enable the stack tracing you need?

Thanks!
Robert

--

From: Trond Myklebust
Date: Monday, April 26, 2010 - 2:04 pm

Sure. In addition to what you did above, please do

mount -t debugfs none /sys/kernel/debug

and then cat the contents of the pseudofile at

/sys/kernel/debug/tracing/stack_trace

Please do this more or less immediately after you've finished mounting
the NFSv4 client.
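
Put together, the steps so far might look like the sketch below. It only
defines helper functions. The function names, the TRACING_DIR variable and
the server/export names in the usage comment are placeholders rather than
anything from this thread, and the sysctl and mount steps need root.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the steps described above.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

# Enable the stack tracer and make sure debugfs is mounted (needs root).
enable_stack_tracer() {
    echo 1 > /proc/sys/kernel/stack_tracer_enabled
    if [ ! -e "$TRACING_DIR/stack_trace" ]; then
        mount -t debugfs none /sys/kernel/debug
    fi
}

# Copy the current deepest-stack record to a file, e.g. for attaching
# to the bug report.
save_stack_trace() {
    out_file="$1"
    cat "$TRACING_DIR/stack_trace" > "$out_file"
}

# Usage (server:/export and /mnt/nfs are placeholders):
#   enable_stack_tracer
#   mount -t nfs4 server:/export /mnt/nfs
#   save_stack_trace /tmp/stack_trace.after-mount
```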

Does your server have the 'crossmnt' or 'nohide' flags set, or does it
use the 'refer' export option anywhere? If so, then we might have to
test further, since those may trigger the NFSv4 submount feature.
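
A quick way to answer this on the server is to search the exports file for
those options. The helper below is a sketch with a made-up name. It assumes
the exports live in /etc/exports unless another file is given, and it does
not look at exports added at runtime.

```shell
#!/bin/sh
# find_submount_exports: hypothetical helper. Prints every non-comment
# line of an exports file that uses crossmnt, nohide or refer=.
# Returns non-zero if no such line is found.
find_submount_exports() {
    exports_file="${1:-/etc/exports}"
    grep -v '^[[:space:]]*#' "$exports_file" | grep -E '(crossmnt|nohide|refer=)'
}
```

Running "exportfs -v" on the server should also list the options of the
currently active exports.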

Cheers
  Trond
--

From: Robert Wimmer
Date: Monday, April 26, 2010 - 3:18 pm

I've uploaded the stack trace. It was generated
directly after mounting. Here are the stacks:

After mounting:
https://bugzilla.kernel.org/attachment.cgi?id=26153
After the soft lockup:
https://bugzilla.kernel.org/attachment.cgi?id=26154
The dmesg output of the soft lockup:
The server has the following settings:
rw,nohide,insecure,async,no_subtree_check,no_root_squash

Thanks!
Robert


--

From: Trond Myklebust
Date: Monday, April 26, 2010 - 4:28 pm

That second trace is more than 5.5K deep, more than half of which is
socket overhead :-(((.

The process stack does not appear to have overflowed, however that trace
doesn't include any IRQ stack overhead.
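
For reference, numbers like these can be pulled out of a saved stack_trace
file with a little awk. This is a rough sketch: the function name is made
up, and the default pattern for what counts as "socket overhead" is a guess
at likely function-name prefixes, not the list used for the figures above.

```shell
#!/bin/sh
# summarize_stack_trace: hypothetical helper. Reads a file in the
# "index) depth size location" layout of the ftrace stack_trace file and
# prints the maximum depth plus the total size of the frames whose
# function name matches a pattern.
summarize_stack_trace() {
    trace_file="$1"
    # Assumed prefixes for socket/network frames; adjust as needed.
    pattern="${2:-^(sock_|tcp_|inet_|ip_|__tcp_|__ip_)}"
    awk -v pat="$pattern" '
        $1 ~ /^[0-9]+\)$/ {
            # Skip the two header lines; only numbered entries match.
            if (($2 + 0) > max) max = $2 + 0
            if ($4 ~ pat) matched += $3
        }
        END {
            printf "max depth: %d bytes, matching frames: %d bytes\n", max, matched
        }
    ' "$trace_file"
}
```

For example, "summarize_stack_trace /tmp/stack_trace.after-lockup", where
the file name is whatever the trace was saved as.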

OK... So what happens if we get rid of half of that trace by forcing
asynchronous tasks such as this to run entirely in rpciod instead of
first trying to run in the process context?

See the attachment...
--

From: Robert Wimmer
Date: Tuesday, April 27, 2010 - 3:56 pm

I've applied the patch to the kernel which I got
from "git clone ....", which resulted in a 2.6.34-rc5 kernel.

The stack trace after mounting NFS is here:
https://bugzilla.kernel.org/attachment.cgi?id=26166
/var/log/messages after soft lockup:
https://bugzilla.kernel.org/attachment.cgi?id=26167

I hope that there is some useful information in there.

Thanks!
Robert


--

From: kernel
Date: Monday, May 3, 2010 - 1:11 am

Anything we can do to investigate this further?

Thanks!
Robert


On Wed, 28 Apr 2010 00:56:01 +0200, Robert Wimmer <kernel@tauceti.net>
--

From: Robert Wimmer
Date: Thursday, May 6, 2010 - 2:19 pm

I don't know if anyone is still interested in this,
but I think Trond is no longer interested because
the last error was of course a "page allocation
failure" and not the "soft lockup" which Trond was
trying to solve. The patch was for 2.6.34, but
the "soft lockup" only comes up with some 2.6.30 and
maybe some 2.6.31 kernel versions. The first error
I reported was a "page allocation failure", which
all kernels >= 2.6.32 produce with the configuration
I use (NFSv4).

Michael suggested solving the "soft lockup" first,
before further investigating the "page allocation
failure". We know that the "soft lockup" only
pops up with NFSv4 and not v3. I really want to
use v4, but since I'm not a kernel hacker someone
must guide me on what to try next.

I know that you all have a lot of other work to
do, but if there are no ideas left about what to do next
it's maybe best to close the bug for now. I'll stay with
kernel 2.6.30 for the time being, or go back to NFS v3 if I
upgrade to a newer kernel. Maybe the error will
be fixed "by accident" in >= 2.6.35 ;-)

Thanks!
Robert




--

From: Trond Myklebust
Date: Thursday, May 6, 2010 - 2:30 pm

Sorry. I've been caught up in work in the past few days.

I can certainly help with the soft lockup if you are able to supply
either a dump that includes all threads stuck in the NFS, or a (binary)
wireshark dump that shows the NFSv4 traffic between the client and
server around the time of the hang.

Cheers
  Trond



--

From: Robert Wimmer
Date: Thursday, May 13, 2010 - 2:08 pm

Finally I've had some time to do the next test.
Here is a wireshark dump (~750 MByte):
http://213.252.12.93/2.6.34-rc5.cap.gz

dmesg output after page allocation failure:
https://bugzilla.kernel.org/attachment.cgi?id=26371

stack trace before page allocation failure:
https://bugzilla.kernel.org/attachment.cgi?id=26369

stack trace after page allocation failure:
https://bugzilla.kernel.org/attachment.cgi?id=26370

I hope the wireshark dump is not too big to download.
It was created with
tshark -f "tcp port 2049" -i eth0 -w 2.6.34-rc5.cap

Thanks!
Robert




--

From: Trond Myklebust
Date: Thursday, May 13, 2010 - 2:13 pm

Hi Robert,

I tried the above wireshark dump URL, but it appears to point to an
empty file.

Cheers
  Trond
--

From: Robert Wimmer
Date: Thursday, May 13, 2010 - 10:42 pm

Hi Trond,

I'm sorry. There was a Varnish in front of that
webserver which doesn't like such big files ;-)
Please try this url: http://213.252.12.34/2.6.34-rc5.cap.gz
It works for me.

Thanks!
Robert



--

From: kernel
Date: Thursday, May 20, 2010 - 12:39 am

Hi Trond,

have you had some time to download the wireshark dump?

Thanks!
Robert

On Thu, 13 May 2010 17:13:54 -0400, Trond Myklebust
--

From: Robert Wimmer
Date: Tuesday, May 25, 2010 - 1:01 pm

Hi Trond,

just a little reminder ;-)

Thanks!
Robert


--

From: kernel
Date: Wednesday, June 2, 2010 - 4:56 am

Hi Trond,

currently it seems that the problem was
fixed by accident... ;-) Since 2.6.34 is now
in Gentoo portage I thought I should give
it a try. Using my 2.6.35-r5 .config,
the 2.6.34 release has now been running for 4 hours
(instead of 5-10 minutes before). Hmmm...
Hopefully it will run for some more hours
and days now. Since I've definitely changed
nothing besides the kernel, it must have been
fixed (hopefully) in one of the 2.6.34-rc's.

If it's still running tomorrow I'll close
the bug.

Greetings
Robert

On Tue, 25 May 2010 22:01:54 +0200, Robert Wimmer <kernel@tauceti.net>
--
