[PATCH 1/1] Allow i386 crash kernels to handle x86_64 dumps

Previous thread: [BUG: kernel/irq/proc.c] unprotected iteration over the IRQ action list in name_unique() by Dmitry Adamushko on Wednesday, March 14, 2007 - 9:26 am. (2 messages)

Next thread: 2.6.20-rt8 patch tweaked for 2.6.20.3 by John on Wednesday, March 14, 2007 - 10:12 am. (2 messages)
From: Ian Campbell
Date: Wednesday, March 14, 2007 - 10:00 am

The specific case I am encountering is kdump under Xen with a 64 bit
hypervisor and 32 bit kernel/userspace. The dump created is 64 bit due
to the hypervisor but the dump kernel is 32 bit to match the domain 0
kernel.

It's possibly less likely to be useful in a purely native scenario but I
see no reason to disallow it.

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>

--- pristine-linux-2.6.18/include/asm-i386/elf.h	2006-09-20 04:42:06.000000000 +0100
+++ linux-2.6.18-xen/include/asm-i386/elf.h	2007-03-14 16:42:30.000000000 +0000
@@ -36,7 +36,7 @@
  * This is used to ensure we don't load something for the wrong architecture.
  */
 #define elf_check_arch(x) \
-	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
+	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || ((x)->e_machine == EM_X86_64))
 
 /*
  * These are used to set parameters in the core dumps.


-

From: Horms
Date: Wednesday, March 14, 2007 - 6:46 pm

For native Linux, would this cover the case where the pre-crash kernel

I think it would be a bit nicer if this was < 80col wide,
though obviously this doesn't affect the functionality.

diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
index 8d33c9b..cd894dd 100644
--- a/include/asm-i386/elf.h
+++ b/include/asm-i386/elf.h
@@ -36,7 +36,8 @@ typedef struct user_fxsr_struct elf_fpxregset_t;
  * This is used to ensure we don't load something for the wrong architecture.
  */
 #define elf_check_arch(x) \
-	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
+	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || \
+	 ((x)->e_machine == EM_X86_64))
 
 /*
  * These are used to set parameters in the core dumps.
-

From: Vivek Goyal
Date: Wednesday, March 14, 2007 - 9:55 pm

But I think changing this macro might run into issues. It is used in a
few places in the kernel, for example when loading a module. Would this
essentially mean that we allow loading 64bit x86_64 modules on 32bit i386
systems?

Similarly, load_elf_interp() uses it. Would we then allow loading an
interp written for X86_64 on a 32bit i386 machine?

Should we create a separate macro, something like elf_check_allowed_arch(),
to take care of such corner cases?

Thanks
Vivek
-

From: Horms
Date: Wednesday, March 14, 2007 - 10:07 pm

That sounds reasonable to me. Though perhaps it could just be
kexec_elf_check_arch() for now, as I don't think there are any
other consumers of it.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

-

From: Vivek Goyal
Date: Wednesday, March 14, 2007 - 10:47 pm

Kexec will also not allow loading an x86_64 kernel on a 32bit machine.
So how about something like vmcore_elf_allowed_cross_arch()? Vmcore code
can continue to check elf_check_arch() and if that fails it can invoke
vmcore_elf_allowed_cross_arch() to find out what cross arch are allowed
for vmcore.
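
The two-step check described above could be sketched roughly as follows.
This is a userspace illustration only, not the kernel code: the struct,
the SK_EM_* constants and all three function names are made up for the
example. The numeric values are the ELF machine numbers that Linux uses
for EM_386, EM_486 and EM_X86_64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the ELF machine numbers (hypothetical names). */
enum {
	SK_EM_386    = 3,
	SK_EM_486    = 6,
	SK_EM_X86_64 = 62,
};

/* Only the one header field that matters for this sketch. */
struct sketch_ehdr {
	uint16_t e_machine;
};

/* Strict native check: what module loading and the ELF loader rely on. */
static bool native_check_arch(const struct sketch_ehdr *h)
{
	return h->e_machine == SK_EM_386 || h->e_machine == SK_EM_486;
}

/* Extra machine types that are acceptable for crash dumps only. */
static bool vmcore_allowed_cross_arch(const struct sketch_ehdr *h)
{
	return h->e_machine == SK_EM_X86_64;
}

/* vmcore tries the native check first, then the cross-arch list. */
static bool vmcore_check_arch(const struct sketch_ehdr *h)
{
	return native_check_arch(h) || vmcore_allowed_cross_arch(h);
}
```

The point of the split is that native_check_arch() stays as strict as
before for every other caller, while only the vmcore path accepts the
additional machine type.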

Thanks
Vivek
-

From: Horms
Date: Thursday, March 15, 2007 - 1:00 am

That sounds a little messy, though perhaps it is a good solution anyway.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

-

From: Ian Campbell
Date: Thursday, March 15, 2007 - 5:22 am

Something like this?

Ian.

---  

Allow i386 crash kernels to handle x86_64 dumps.

The specific case I am encountering is kdump under Xen with a 64 bit
hypervisor and 32 bit kernel/userspace. The dump created is 64 bit
due to the hypervisor but the dump kernel is 32 bit for maximum
compatibility.

It's possibly less likely to be useful in a purely native scenario but
I see no reason to disallow it.

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index d960507..523e109 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -514,7 +514,7 @@ static int __init parse_crash_elf64_headers(void)
 	/* Do some basic Verification. */
 	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
 		(ehdr.e_type != ET_CORE) ||
-		!elf_check_arch(&ehdr) ||
+		!vmcore_elf_check_arch(&ehdr) ||
 		ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
 		ehdr.e_ident[EI_VERSION] != EV_CURRENT ||
 		ehdr.e_version != EV_CURRENT ||
diff --git a/include/asm-i386/kexec.h b/include/asm-i386/kexec.h
index 4dfc9f5..c76737e 100644
--- a/include/asm-i386/kexec.h
+++ b/include/asm-i386/kexec.h
@@ -47,6 +47,9 @@
 /* The native architecture */
 #define KEXEC_ARCH KEXEC_ARCH_386
 
+/* We can also handle crash dumps from 64 bit kernel. */
+#define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
+
 #define MAX_NOTE_BYTES 1024
 
 /* CPU does not save ss and esp on stack if execution is already
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 3250365..db60dac 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -14,5 +14,13 @@ extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
 
+/* Architecture code defines this if there are other possible ELF
+ * machine types, e.g. on bi-arch capable hardware. */
+#ifndef vmcore_elf_check_arch_cross
+#define ...
From: Vivek Goyal
Date: Thursday, March 15, 2007 - 6:26 am

I think for both. One possible reason is that one never knows whether
the underlying machine has 64bit extensions or not. So even if we load
the kernel it will never boot. Secondly, we might not be able to

The ideal place for this would probably have been an arch dependent
crash_dump.h file. But we don't have one, and there is no point
introducing one just for this macro.

This change looks good to me.

Thanks
Vivek
-

From: Ian Campbell
Date: Thursday, March 15, 2007 - 6:42 am

Is there a kdump tree which you'll apply it to, or shall I resend CCing
akpm? (I'll add an Acked-by if that's ok).

Ian.


-

From: Horms
Date: Thursday, March 15, 2007 - 4:46 pm

There isn't a kexec tree at this time (though I am happy to entertain
creating one). For now most patches go in either through Andrew or the
relevant architecture maintainers.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

-

From: Vivek Goyal
Date: Thursday, March 15, 2007 - 7:27 pm

There is no separate kdump tree. Generally Andrew picks up these changes.
I guess just resend it copying Andrew. Yes you can add Acked-by me.

Thanks
Vivek
-

From: Horms
Date: Thursday, March 15, 2007 - 4:48 pm

Perhaps I am misunderstanding what you are saying, but I do
recall kexecing from 32->64 and 64->32 bit kernels on x86_64 hardware.

Won't the above change break non-i386 architectures, as
vmcore_elf_check_arch_cross isn't defined for them?

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

-

From: Magnus Damm
Date: Thursday, March 15, 2007 - 7:40 pm

I recall kexecing a bzImage for x86_64 on i386, but I'm not 100% sure.
I think it worked because the bzImage loader code was regular 32 bit

Right. And maybe it's a good idea to make sure that this feature is
actually supported by kexec-tools before adding code to the kernel?

My gut feeling about this is that you are begging for trouble. The
kexec/kdump solution is fragile just by itself, and trying to go
between architectures is just going to be painful.

/ magnus
-

From: Vivek Goyal
Date: Thursday, March 15, 2007 - 8:22 pm

I stand corrected. I can kexec a bzImage 32->64bit. It ran into some
initrd issues later, but that is a separate matter: fundamentally kexec
could load a 64bit kernel bzImage and make the transition successfully.

So it will now be left to the user. If he tries to kexec to a 64bit kernel
on a machine not supporting 64bit extensions, then kexec will not give
any advance warning.

Thanks
Vivek
-

From: Horms
Date: Friday, March 16, 2007 - 12:10 am

I feel comfortable with that. Well for now anyway.
But I think that Magnus has other ideas.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

-

From: Magnus Damm
Date: Friday, March 16, 2007 - 12:50 am

I don't mind switching back and forth between 32-bit and 64-bit for
plain kexec, especially if we can validate that the kernel we load
will use an instruction set that is supported. But for kdump,
switching between 32-bit and 64-bit kernels is just another new
dimension in the already too complex kdump matrix IMO.

I think more focus should be put on fixing up bugs in kexec-tools than
adding new features.

/ magnus
-

From: Ian Campbell
Date: Friday, March 16, 2007 - 12:28 am

I sent patches to the fastboot list at the same time I sent these ones
to support differences in the underlying hypervisor architecture in the
tools.

They haven't appeared in the archives yet so I fear they have gone
astray. I'll resend when I get to the office in a bit.

The tools already have support for introducing a SHIM when kexecing
between different architectures (at least in the 64->32 direction if I
understand kexec-tools-testing/purgatory/arch/i386/compat_x86_64.S and
k-t-t.../kexec/arch/i386/compat_x86_64.S correctly). This is really just

It works fine under Xen and I think going from 64Xen+32Kernel->32Kernel
makes more sense than going from 64Xen+32Kernel->64Kernel. As I said
originally I'm not so convinced it makes sense in the native case but I
see no reason to outlaw it (people get to keep both pieces etc...)

Ian.

-

From: Magnus Damm
Date: Friday, March 16, 2007 - 12:59 am

... so please resend.

We've just frozen the kexec-tools-testing tree for an upcoming
release, but if you resend soon and your patches are trivial you may


For kexec I think it is just fine. But for kdump, are you sure things
will work out ok? There are some differences between the i386 and
x86_64 kexec-tools code and I wonder if feeding i386 info into an
x86_64 kernel will work properly.

Thanks!

/ magnus
-

From: Ian Campbell
Date: Friday, March 16, 2007 - 1:50 am

It seems to work fine with Xen. A 32 bit kernel handles the 64 bit dump
just fine. My pre-kdump kernel is 32 bit, but I don't think it has much
to do in this case.

I don't know about native. My gut feeling is that if the mechanism of
actually kexecing between 64 and 32 bit works then there is no problem
with the crash dump part of the equation.

The crash dump is pretty much opaque to the kernel -- it finds the
headers in memory and exports the relevant pieces via /proc/vmcore. The
crash kernel doesn't really care what those pieces are so long as they
constitute a valid ELF image.
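
As an illustration of how little the reader needs to inspect, the sanity
checks can be sketched in userspace like this. It mirrors the kind of
tests visible in the parse_crash_elf64_headers() hunk earlier in the
thread, but it is only a sketch: it assumes a Linux <elf.h>, and
core_header_ok() and make_core_header() are hypothetical helper names.

```c
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/* Return true if hdr looks like a 64-bit ELF core header whose machine
 * type an i386 dump reader would accept under the proposed change. */
static bool core_header_ok(const Elf64_Ehdr *hdr)
{
	if (memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0)
		return false;                 /* not an ELF image at all */
	if (hdr->e_type != ET_CORE)
		return false;                 /* not a core dump */
	if (hdr->e_ident[EI_CLASS] != ELFCLASS64)
		return false;                 /* not the 64-bit layout */
	if (hdr->e_ident[EI_VERSION] != EV_CURRENT ||
	    hdr->e_version != EV_CURRENT)
		return false;                 /* unknown ELF version */
	/* Native machine type, or the one cross type being discussed. */
	return hdr->e_machine == EM_386 || hdr->e_machine == EM_X86_64;
}

/* Build a minimal header that passes every check above. */
static void make_core_header(Elf64_Ehdr *hdr, Elf64_Half machine)
{
	memset(hdr, 0, sizeof(*hdr));
	memcpy(hdr->e_ident, ELFMAG, SELFMAG);
	hdr->e_ident[EI_CLASS] = ELFCLASS64;
	hdr->e_ident[EI_VERSION] = EV_CURRENT;
	hdr->e_type = ET_CORE;
	hdr->e_version = EV_CURRENT;
	hdr->e_machine = machine;
}
```

Nothing beyond the header fields is interpreted here, which matches the
description of the dump as opaque to the crash kernel.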

My patch to kexec-tools-testing just changes e_machine to match the type
of the pre-crash system. The dump kernel pays no attention to this field
apart from the one sanity check which my patch from this thread
you would see if you had done a 64->64 dump instead of a 64->32 one. I
don't think it is any different to copying /proc/vmcore to a different
system for analysis so any userspace tools should be able to cope.

Ian.

-

From: Magnus Damm
Date: Friday, March 16, 2007 - 2:20 am

Right, that's how it is supposed to work. But there are unfortunately
differences between the architecture-specific implementations in the
kexec-tools code. So the x86_64 version of kexec-tools behaves
differently from the i386 version.

For instance, x86_64 passes some acpi parameter on the command line to
the crash kernel, but i386 does not. This may or may not be needed.
There are differences like that or smaller sprinkled all over the
place.

/ magnus
-

From: Vivek Goyal
Date: Friday, March 16, 2007 - 2:35 am

I think passing those acpi parameters to a 32bit kernel should do no harm.

Got a question. When running 32bit dom0 on 64bit hypervisor, which
kexec-tools elf loader will kick in? 32bit or 64bit? Looks like in this
case 64bit one. But shouldn't it be 32bit as 32bit OS is running and we
must be using the kexec-tools binary compiled for 32bit OS? And if 32bit
loader kicks in we will not be passing any acpi parameters.

Thanks
Vivek
-

From: Magnus Damm
Date: Friday, March 16, 2007 - 3:05 am

There is no check present today to see whether the hypervisor is 32 or
64 bits. So the 32-bit version of kexec-tools will support loading
images like any other 32-bit kexec-tools.

Thanks,

/ magnus
-

From: Vivek Goyal
Date: Friday, March 16, 2007 - 4:38 am

If that is the case then in the prepared ELF headers, the machine type
should be EM_386 or similar and not EM_X86_64, and Ian shouldn't have run
into the problem at all with vmcore. Am I missing something?

Thanks
Vivek
-

From: Ian Campbell
Date: Friday, March 16, 2007 - 4:40 am

The PRSTATUS ELF notes are generated by the hypervisor not by the kernel
so they are in 64 bit format in this scenario. The machine type should
reflect this.

Ian.

-

From: Vivek Goyal
Date: Friday, March 16, 2007 - 5:25 am

But the ELF header is created in the 32bit OS and pre-loaded. At creation
time, 32bit kexec-tools will put the machine info as EM_386. Who changes
it to EM_X86_64 before the vmcore code does a sanity check on it using
elf_check_arch()? Does the hypervisor fiddle around with this field?

Thanks
Vivek

-

From: Vivek Goyal
Date: Friday, March 16, 2007 - 5:31 am

Ok. I just looked at your kexec-tools patch, which determines the Xen
capabilities and then fills in the machine type accordingly. That explains
why the machine type will appear as EM_X86_64.

Thanks
Vivek
-

From: Vivek Goyal
Date: Friday, March 16, 2007 - 2:26 am

I also think so. If kexec works then kdump should work too. There might
be small issues here and there but can't think of any major one.

Thanks
Vivek
-

From: Vivek Goyal
Date: Thursday, March 15, 2007 - 7:42 pm

Yesterday I tested it. I could kexec from 64->32bit but not vice versa.
kexec-tools itself gave error message.

"Cannot determine the file type of ../x86_64-vmlinux/vmlinux"

I did not investigate deeper but I have a basic question. How will kexec
know whether the underlying 32bit machine supports 64bit extensions or
not? Do we allow loading a 64bit kernel even though the underlying machine
might not support it?


In the original patch he has put an arch independent definition in
include/linux/crash_dump.h which will make sure it is not broken on
other architectures.

Thanks
Vivek

-

From: Ian Campbell
Date: Friday, March 16, 2007 - 12:31 am

It looks like /proc/cpuinfo flags contains "lm" (which is long mode,
right?) even if the machine is running 32 bit mode.
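
A userspace check for that flag might look like the sketch below. It
assumes the "flags" line has already been read from /proc/cpuinfo into a
string, and cpu_flags_contain() is a hypothetical helper name. The flag
is matched as a whole token so that something like "lahf_lm" does not
count as "lm".

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated token
 * in `line` (for example the "flags" line of /proc/cpuinfo). */
static bool cpu_flags_contain(const char *line, const char *flag)
{
	size_t want = strlen(flag);
	const char *p = line;

	if (want == 0)
		return false;

	while (*p) {
		p += strspn(p, " \t\n");           /* skip separators */
		size_t len = strcspn(p, " \t\n");  /* length of this token */

		if (len == want && strncmp(p, flag, len) == 0)
			return true;
		p += len;
	}
	return false;
}
```

Calling cpu_flags_contain(line, "lm") on the flags line would then tell
a loader whether the CPU reports long mode, independent of the mode the
running kernel uses.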

Ian.

-

From: Ian Campbell
Date: Friday, March 16, 2007 - 12:17 am

No, because of this hunk:

diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 3250365..db60dac 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -14,5 +14,13 @@ extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
 
+/* Architecture code defines this if there are other possible ELF
+ * machine types, e.g. on bi-arch capable hardware. */
+#ifndef vmcore_elf_check_arch_cross
+#define vmcore_elf_check_arch_cross(x) 0
+#endif
[snip]
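
The override-with-fallback pattern in that hunk can be tried outside the
kernel with a small sketch like this one. The struct and the
sketch_elf_check_arch() stand-in are hypothetical, and the combined
vmcore_elf_check_arch() shown here is only a guess at how the two checks
might be joined, since that part of the patch is snipped above. Note
that #ifndef takes a bare macro name without a parameter list.

```c
/* Only the one header field that matters for this sketch. */
struct sketch_hdr {
	unsigned short e_machine;
};

/* An architecture header may define the hook before this point, as
 * include/asm-i386/kexec.h does in the patch. To try that case, move
 * the following line out of the comment (62 is EM_X86_64):
 *
 * #define vmcore_elf_check_arch_cross(x) ((x)->e_machine == 62)
 */

/* Generic fallback: an architecture that defines nothing accepts no
 * cross-architecture dumps. */
#ifndef vmcore_elf_check_arch_cross
#define vmcore_elf_check_arch_cross(x) 0
#endif

/* Stand-in for the native elf_check_arch(); 3 is EM_386. */
#define sketch_elf_check_arch(x) ((x)->e_machine == 3)

/* Assumed combination of the native check and the cross-arch hook. */
#define vmcore_elf_check_arch(x) \
	(sketch_elf_check_arch(x) || vmcore_elf_check_arch_cross(x))
```

With only the fallback in effect, a header with e_machine 62 is
rejected, which is the behaviour other architectures would keep.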

-
