Not sure if this is related to the recent mm/vma fixes - got this while rebooting (kexec) latest git -

[ 0.000000] Linux version 2.6.34-rc4 (paragw@parag-laptop) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #19 SMP PREEMPT Tue Apr 13 20:59:37 EDT 2010
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-2.6.34-rc4 root=UUID=0a0bb1b9-978c-4e16-8e43-aae24e172e12 ro quiet splash
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000100 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000ef000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000b8f2f000 (usable)
[ 0.000000] BIOS-e820: 00000000b8f2f000 - 00000000b8f31000 (reserved)
[ 0.000000] BIOS-e820: 00000000b8f31000 - 00000000b9d70000 (usable)
[ 0.000000] BIOS-e820: 00000000b9d70000 - 00000000b9d80000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000b9d80000 - 00000000bc4e0000 (usable)
[ 0.000000] BIOS-e820: 00000000bc4e0000 - 00000000bc6e0000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000bc6e0000 - 00000000bde92000 (usable)
[ 0.000000] BIOS-e820: 00000000bde92000 - 00000000bde9a000 (reserved)
[ 0.000000] BIOS-e820: 00000000bde9a000 - 00000000bdebf000 (usable)
[ 0.000000] BIOS-e820: 00000000bdebf000 - 00000000bdecf000 (reserved)
[ 0.000000] BIOS-e820: 00000000bdecf000 - 00000000bdfcf000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000bdfcf000 - 00000000bdfff000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000bdfff000 - 00000000be000000 (usable)
[ 0.000000] BIOS-e820: 00000000be000000 - 00000000c0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[ 0.000000] BIOS-e820: 00000000fed10000 - 00000000fed14000 (reserved)
[ 0.000000] BIOS-e820: 00000000fed18000 - 00000000fed1a000 (reserved)
[ 0.000000] BIOS-e820: 00000000fed1c000 - ...
From: Parag Warudkar <parag.lkml@gmail.com> Date: Tue, Apr 13, 2010 at 09:53:46PM -0400

hmm, it doesn't look like it. Your code translates to something like

0: b8 00 00 00 00 mov $0x0,%eax
5: 80 ff ff cmp $0xff,%bh
8: ff 48 21 decl 0x21(%rax)
b: 45 80 48 8b 45 rex.RB orb $0x45,-0x75(%r8)
10: 80 48 ff c8 orb $0xc8,-0x1(%rax)
14: 48 3b 85 40 ff ff ff cmp -0xc0(%rbp),%rax
1b: 48 8b 85 50 ff ff ff mov -0xb0(%rbp),%rax
22: 48 0f 42 7d 80 cmovb -0x80(%rbp),%rdi
27: 48 89 7d 80 mov %rdi,-0x80(%rbp)
2b:* 48 8b 38 mov (%rax),%rdi <-- trapping instruction
2e: 48 85 ff test %rdi,%rdi
31: 0f 84 f5 04 00 00 je 0x52c
37: 48 rex.W
38: b8 fb 0f 00 00 mov $0xffb,%eax
3d: 00 c0 add %al,%al
3f: ff .byte 0xff

which I could correlate with what I get here (comments added):

.loc 1 1051 0
movabsq $549755813888, %rax #, tmp158 PGDIR_SIZE
.LVL392:
leaq (%r12,%rax), %rax #,
movq %rax, -88(%rbp) #, %sfp
movabsq $-549755813888, %rax #, tmp159 PGDIR_MASK
andq %rax, -88(%rbp) # tmp159, %sfp
movq -88(%rbp), %rdx # %sfp, tmp160
movq -72(%rbp), %rax # %sfp, tmp161
decq %rdx # tmp160 __boundary
decq %rax # tmp161 __end
cmpq %rax, %rdx # tmp161, tmp160 rFLAGS
movq -72(%rbp), %rax # %sfp,
cmovb -88(%rbp), %rax # %sfp,,
movq -112(%rbp), %rdx # %sfp, pgd
movq %rax, -88(%rbp) #, %sfp
movq (%rdx), %rax # <variable>.pgd, pgd$pgd

and if this output is correct and if you scroll back a little in your assembly output, you should probably see that the value computed in pgd_offset() is being saved in -0x80(%rbp) and reloaded again for use. So you oops when dereferencing that pgd value in %rax (%rdx in my case), *pgd in pgd_none_or_clear_bad(pgd) which is called in the below ...
There's a large constant (0xffffff8000000000) in there at the beginning,
and the disassembly hasn't found the start of the next instruction very
cleanly. The same is true at the end: another large constant is cut off in
the middle.
The byte just before the dumped instruction stream is almost certainly
'48h', and the last byte of the last constant is 0xff, and the disassembly
ends up being:
0: 48 b8 00 00 00 00 80 mov $0xffffff8000000000,%rax
7: ff ff ff
a: 48 21 45 80 and %rax,-0x80(%rbp)
e: 48 8b 45 80 mov -0x80(%rbp),%rax
12: 48 ff c8 dec %rax
15: 48 3b 85 40 ff ff ff cmp -0xc0(%rbp),%rax
1c: 48 8b 85 50 ff ff ff mov -0xb0(%rbp),%rax
23: 48 0f 42 7d 80 cmovb -0x80(%rbp),%rdi
28: 48 89 7d 80 mov %rdi,-0x80(%rbp)
2c:* 48 8b 38 mov (%rax),%rdi <-- trapping instruction
2f: 48 85 ff test %rdi,%rdi
32: 0f 84 f5 04 00 00 je 0x52d
38: 48 b8 fb 0f 00 00 00 mov $0xffffc00000000ffb,%rax
3f: c0 ff ff
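The realigned listing above can be checked by hand. After a `48 b8` opcode (REX.W plus mov-immediate to %rax) come eight immediate bytes in little-endian order, and a near `0f 84` je carries a 32-bit displacement relative to the end of its own six bytes. The following is only a sketch of that arithmetic in C; the helper names `imm64_le` and `je_rel32_target` are made up for illustration and are not part of decodecode or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble the 8-byte little-endian immediate that
 * follows a "48 b8" (REX.W + mov imm64,%rax) opcode in the byte stream. */
static uint64_t imm64_le(const uint8_t b[8])
{
    uint64_t v = 0;

    for (size_t i = 0; i < 8; i++)
        v |= (uint64_t)b[i] << (8 * i);
    return v;
}

/* Hypothetical helper: target of a near "0f 84 rel32" je. The displacement
 * is signed and counted from the end of the 6-byte instruction. */
static uint64_t je_rel32_target(uint64_t insn_off, const uint8_t rel[4])
{
    uint32_t raw = (uint32_t)rel[0] |
                   (uint32_t)rel[1] << 8 |
                   (uint32_t)rel[2] << 16 |
                   (uint32_t)rel[3] << 24;

    return insn_off + 6 + (uint64_t)(int64_t)(int32_t)raw;
}
```

Feeding `imm64_le` the bytes `00 00 00 00 80 ff ff ff` gives the PGDIR_MASK-sized constant from offset 0, and `fb 0f 00 00 00 c0 ff ff` gives the constant at offset 0x38. The one-byte shift between the two listings in this thread also explains why the je target reads 0x52c in one and 0x52d in the other.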
But yes, you found the right spot (that 0xffffff8000000000 constant is
Yup. Close enough. Btw, it's often good to look at both the *.s code _and_
the *.lst code. If you do "make mm/memory.lst", you'll find those big
constants easily, and then you'll see the code this way:
do {
next = pgd_addr_end(addr, end);
ffffffff81b2aa45: 48 b8 00 00 00 00 80 mov $0x8000000000,%rax
ffffffff81b2aa4c: 00 00 00
ffffffff81b2aa4f: 49 8d 04 04 lea (%r12,%rax,1),%rax
ffffffff81b2aa53: 48 89 45 a8 mov %rax,-0x58(%rbp)
ffffffff81b2aa57: 48 b8 00 00 00 00 80 mov $0xffffff8000000000,%rax
ffffffff81b2aa5e: ff ff ff
ffffffff81b2aa61: 48 21 45 a8 and %rax,-0x58(%rbp)
ffffffff81b2aa65: 48 8b 45 b8 mov -0x48(%rbp),%rax
ffffffff81b2aa69: 48 8b 55 a8 mov ...
From: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed, Apr 14, 2010 at 07:32:08AM -0700

Right, the decodecode output looked kinda strange to me and I tried to match the instruction order and find the location. But yeah, now that I'm looking at show_registers(), we don't start dumping on a precise instruction boundary but simply 64 bytes in the default case. No time [..]

ok, I can't say that I'm a linux newbie but the .lst code is new to me.

Well, Parag said something about kexec kernel so it is definitely interesting what he means there - a kexec-enabled kernel or is this the "second" kernel his machine kexec'd into after a previous failure. I think this could clarify the situation a bit.

Thanks for looking over the asm.

-- Regards/Gruss, Boris. --
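For readers following the asm, the arithmetic behind the `pgd_addr_end(addr, end)` listing quoted earlier in the thread can be sketched in user-space C. This is a hedged reconstruction from the constants and the dec/cmp/cmovb sequence visible in the listings (PGDIR_SIZE 0x8000000000, PGDIR_MASK 0xffffff8000000000, the `__boundary` and `__end` comments), not a copy of the kernel header; the function name `pgd_addr_end_sketch` and the use of `uint64_t` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values as they appear in the listings: one PGD entry spans 2^39 bytes. */
#define PGDIR_SHIFT 39
#define PGDIR_SIZE  (UINT64_C(1) << PGDIR_SHIFT) /* 0x8000000000 */
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))          /* 0xffffff8000000000 */

/* Illustrative stand-in for the macro: return the next PGD boundary after
 * addr, or end if that comes first. */
static uint64_t pgd_addr_end_sketch(uint64_t addr, uint64_t end)
{
    /* lea + and in the listing: round up to the next PGD boundary */
    uint64_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;

    /* The two decrements before the compare keep the result correct when
     * boundary wraps around to 0 at the top of the address space. */
    return (boundary - 1 < end - 1) ? boundary : end;
}
```

Note that the trapping instruction sits after this computation: it is the load through the pgd pointer, so the boundary arithmetic itself is only context for locating the spot in the source.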
FWIW, Just a data point. I pulled in latest kernel and I can boot it through BIOS as well as kexec boot on my x86_64 box. Vivek --
Hi Borislav,

It was the kexec'ed kernel that oopsed - the first kernel had no issues. It was kexec'ing from 2.6.34-rc4 to the same kernel. After that I tried to reboot via kexec to reproduce the issue, but it either hung completely or resulted in corrupted X and a non-moving cursor. Kexec from the distro kernel to itself works just fine (Ubuntu 2.6.32-20), however. I will start a bisect as soon as I find time.

Parag --
I created a Bugzilla entry at https://bugzilla.kernel.org/show_bug.cgi?id=15795 for your bug report, please add your address to the CC list in there, thanks! -- Maciej Rutecki http://www.maciek.unixy.pl --
