Re: Xen kernel 2.6.23-rc7 bug at xen_mc_flush (arch/i386/xen/multicalls.c:68)

Previous thread: Linux 2.4.35.3 by Willy Tarreau on Sunday, September 23, 2007 - 3:20 pm. (1 message)

Next thread: [patch 1/3] new timerfd API - new timerfd API by Davide Libenzi on Sunday, September 23, 2007 - 3:49 pm. (4 messages)
From: osth
Date: Sunday, September 23, 2007 - 2:55 pm

Using kernel 2.6.23-rc7 as a Xen domU guest, I observe a kernel bug that
occurs reproducibly when calling a shell from the Midnight Commander F2
context menu, or with the test case given below. (Most other programs seem
to be well behaved and do not trigger this bug.) A kernel compiled with
debug info gives:

Kernel BUG at c01037dc [verbose debug info unavailable]
invalid opcode: 0000 [#5]
PREEMPT SMP
...
Call Trace:
[<c0103de9>] <0> [<c015d1d1>] <0> [<c0190078>] <0> [<c012633e>] <0> [<c016fa54>]
<0> [<c0106547>] <0> [<c01080d2>] <0> =======================
...
(gdb) l *0xc01037dc
0xc01037dc is in xen_mc_flush (arch/i386/xen/multicalls.c:68).
63              } else
64                      BUG_ON(b->argidx != 0);
65
66              local_irq_restore(flags);
67
68              BUG_ON(ret);
69      }
0xc0103de9 is in xen_exit_mmap (arch/i386/xen/multicalls.h:42).
0xc015d1d1 is in exit_mmap (include/asm/paravirt.h:722).
0xc0190078 is in load_script (fs/binfmt_script.c:19).
0xc012633e is in mmput (kernel/fork.c:395).
0xc016fa54 is in do_execve (fs/exec.c:1421).
0xc0106547 is in sys_execve (arch/i386/kernel/process.c:793).
No source file for address 0xc01080d2.
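
The address-to-source lines above look like the output of gdb's "list *ADDR"
command. A minimal sketch of how the same lookups could be scripted follows.
The vmlinux path and the VMLINUX variable are assumptions for illustration,
not part of the report; without a readable image or without gdb the script
only prints the commands it would run.

```shell
#!/bin/sh
# Return addresses copied from the call trace above.
addrs="c0103de9 c015d1d1 c0190078 c012633e c016fa54 c0106547 c01080d2"

# Assumption: path to an uncompressed kernel image built with debug info.
vmlinux=${VMLINUX:-./vmlinux}

count=0
for a in $addrs; do
    if [ -r "$vmlinux" ] && command -v gdb >/dev/null 2>&1; then
        # -batch runs the -ex command and exits without an interactive prompt.
        gdb -batch -ex "list *0x$a" "$vmlinux"
    else
        # Dry run: only show the command that would be executed.
        echo "gdb -batch -ex 'list *0x$a' $vmlinux"
    fi
    count=$((count + 1))
done
echo "looked up $count addresses"
```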

/proc/cpuinfo: ...AMD Athlon(tm) X2 Dual Core Processor BE-2350 ...

full info is at http://spblinux.de/xen/20070923/

Same bug if preempt is disabled; same bug if vcpus is reduced to 1 in xen
domU.

Please cc to osth at freesurf.ch because I am not on the list.

Christian Ostheimer

testcase which triggers the bug:

#!/bin/bash
#
# modified configure script: max commandline length test
CONFIG_SHELL=/bin/bash
i=0
export teststring=ABCD
    while (test "X"`$CONFIG_SHELL -c "echo X$teststring" 2>/dev/null` \
	       = "XX$teststring") >/dev/null 2>&1 &&
	    new_result=`expr "X$teststring" : ".*" 2>&1` &&
	    lt_cv_sys_max_cmd_len=$new_result &&
	    test $i != 17 # 1/2 MB should be enough
    do 
      i=`expr $i + 1`
      teststring=$teststring$teststring
    done
    teststring=
    # Add ...
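
Reading the loop above: teststring starts at 4 bytes and is doubled once per
iteration, at most 17 times, so it can grow to 4 * 2^17 = 524288 bytes, the
"1/2 MB" of the comment. Because teststring is exported and also passed on
the command line, each doubling makes the next exec larger; presumably one of
those calls eventually fails, which would put execve on an error path (an
inference, not something stated in the report). The sketch below reproduces
only the size arithmetic and runs no commands.

```shell
#!/bin/sh
# Model the growth of teststring without building the string itself.
len=4        # length of the initial value "ABCD"
i=0
while [ "$i" -ne 17 ]; do
    i=$((i + 1))
    len=$((len * 2))    # teststring=$teststring$teststring
done
echo "iterations: $i, final length: $len bytes"
```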
From: Jeremy Fitzhardinge
Date: Monday, September 24, 2007 - 12:47 am

OK, I think I've seen this before, and need to track it down.  Could you
try again with a kernel with debug info, and does anything relevant
appear in "xm dmesg"?

Thanks,
    J
-

From: Jeremy Fitzhardinge
Date: Monday, September 24, 2007 - 5:43 pm

Hm, it just seems that it's trying to unpin an mm on the error path of
execve, where it hasn't been pinned.  The simplest way to reproduce is:

$ echo foo > foo
$ chmod +x foo
$ ./foo
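
The three commands above can be wrapped in a script that cleans up after
itself. The file has no "#!" line and is not a binary format the kernel
recognises, so the kernel's execve is expected to fail with ENOEXEC; the
shell then falls back to interpreting the file itself, and the exit status
should be nonzero. This is a sketch only: the temporary directory is an
addition for tidiness, and on an affected domU the failed execve is what
would trigger the BUG.

```shell
#!/bin/sh
# Build the reproducer in a scratch directory.
dir=$(mktemp -d)
printf 'foo\n' > "$dir/foo"    # same content as "echo foo > foo"
chmod +x "$dir/foo"

# The kernel rejects the file; the shell's fallback then tries to run "foo".
"$dir/foo" 2>/dev/null
status=$?

rm -rf "$dir"
echo "exit status: $status"
```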

Anyway, try this patch.

    J

---
 arch/i386/xen/mmu.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

===================================================================
--- a/arch/i386/xen/mmu.c
+++ b/arch/i386/xen/mmu.c
@@ -558,6 +558,9 @@ void xen_exit_mmap(struct mm_struct *mm)
 	put_cpu();
 
 	spin_lock(&mm->page_table_lock);
-	xen_pgd_unpin(mm->pgd);
+
+	/* pgd may not be pinned in the error exit path of execve */
+	if (PagePinned(virt_to_page(mm->pgd)))
+		xen_pgd_unpin(mm->pgd);
 	spin_unlock(&mm->page_table_lock);
 }


-

From: osth
Date: Tuesday, September 25, 2007 - 2:27 am

The bug is solved by this patch. Thanks! Maybe this patch can make it into
2.6.23 final?

Christian Ostheimer


-

From: Jeremy Fitzhardinge
Date: Tuesday, September 25, 2007 - 9:48 am

Yes, I'll send it out today.

    J
-
