The specific case I am encountering is kdump under Xen with a 64 bit hypervisor and 32 bit kernel/userspace. The dump created is 64 bit because of the hypervisor, but the dump kernel is 32 bit to match the domain 0 kernel. It's possibly less likely to be useful in a purely native scenario but I see no reason to disallow it.

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>

--- pristine-linux-2.6.18/include/asm-i386/elf.h 2006-09-20 04:42:06.000000000 +0100
+++ linux-2.6.18-xen/include/asm-i386/elf.h 2007-03-14 16:42:30.000000000 +0000
@@ -36,7 +36,7 @@
  * This is used to ensure we don't load something for the wrong architecture.
  */
 #define elf_check_arch(x) \
-	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
+	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || ((x)->e_machine == EM_X86_64))
 /*
  * These are used to set parameters in the core dumps.
-
For native Linux, would this cover the case where the pre-crash kernel

I think it would be a bit nicer if this was < 80col wide, though obviously this doesn't affect the functionality.

diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
index 8d33c9b..cd894dd 100644
--- a/include/asm-i386/elf.h
+++ b/include/asm-i386/elf.h
@@ -36,7 +36,8 @@ typedef struct user_fxsr_struct elf_fpxregset_t;
  * This is used to ensure we don't load something for the wrong architecture.
  */
 #define elf_check_arch(x) \
-	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486))
+	(((x)->e_machine == EM_386) || ((x)->e_machine == EM_486) || \
+	((x)->e_machine == EM_X86_64))
 /*
  * These are used to set parameters in the core dumps.
-
But I think changing this macro might run into issues. It is used in a few places in the kernel, for example while loading a module. Will this essentially mean that we allow loading 64bit x86_64 modules on 32bit i386 systems? Similarly, load_elf_interp() uses it, so will we allow loading an interp written for X86_64 on a 32bit i386 machine? Should we create a separate macro, something like elf_check_allowed_arch(), to take care of such corner cases? Thanks Vivek -
That sounds reasonable to me. Though perhaps it could just be kexec_elf_check_arch() for now, as I don't think there are any other consumers of it. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ -
Kexec will also not allow loading an x86_64 kernel on a 32bit machine. So how about something like vmcore_elf_allowed_cross_arch()? Vmcore code can continue to check elf_check_arch() and if that fails it can invoke vmcore_elf_allowed_cross_arch() to find out what cross arch are allowed for vmcore. Thanks Vivek -
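The two-stage check proposed above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the `sketch_` names and the cut-down header struct are hypothetical and not kernel code, and only the ELF machine numbers (3, 6, 62) come from the ELF specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ELF machine numbers for i386, i486 and x86_64. */
enum { SK_EM_386 = 3, SK_EM_486 = 6, SK_EM_X86_64 = 62 };

/* Hypothetical cut-down ELF header: only the field the check reads. */
struct sketch_ehdr {
	uint16_t e_machine;
};

/* Native check, mirroring the unmodified i386 elf_check_arch(). */
static bool sketch_elf_check_arch(const struct sketch_ehdr *hdr)
{
	assert(hdr != NULL);
	return hdr->e_machine == SK_EM_386 || hdr->e_machine == SK_EM_486;
}

/* Cross-arch allowance that only the vmcore code consults. */
static bool sketch_vmcore_allowed_cross_arch(const struct sketch_ehdr *hdr)
{
	assert(hdr != NULL);
	return hdr->e_machine == SK_EM_X86_64;
}

/* vmcore tries the native check first and falls back to the cross check. */
static bool sketch_vmcore_check_arch(const struct sketch_ehdr *hdr)
{
	return sketch_elf_check_arch(hdr) ||
	       sketch_vmcore_allowed_cross_arch(hdr);
}
```

Because the module loader and load_elf_interp() would keep calling the native check only, an x86_64 header is still rejected there while the vmcore path accepts it.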
That sounds a little messy, though perhaps it is a good solution anyway. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ -
Something like this?

Ian.

---

Allow i386 crash kernels to handle x86_64 dumps.

The specific case I am encountering is kdump under Xen with a 64 bit hypervisor and 32 bit kernel/userspace. The dump created is 64 bit because of the hypervisor, but the dump kernel is 32 bit for maximum compatibility. It's possibly less likely to be useful in a purely native scenario but I see no reason to disallow it.

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index d960507..523e109 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -514,7 +514,7 @@ static int __init parse_crash_elf64_headers(void)
 	/* Do some basic Verification. */
 	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
 		(ehdr.e_type != ET_CORE) ||
-		!elf_check_arch(&ehdr) ||
+		!vmcore_elf_check_arch(&ehdr) ||
 		ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
 		ehdr.e_ident[EI_VERSION] != EV_CURRENT ||
 		ehdr.e_version != EV_CURRENT ||
diff --git a/include/asm-i386/kexec.h b/include/asm-i386/kexec.h
index 4dfc9f5..c76737e 100644
--- a/include/asm-i386/kexec.h
+++ b/include/asm-i386/kexec.h
@@ -47,6 +47,9 @@
 /* The native architecture */
 #define KEXEC_ARCH KEXEC_ARCH_386
+/* We can also handle crash dumps from 64 bit kernel. */
+#define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
+
 #define MAX_NOTE_BYTES 1024
 /* CPU does not save ss and esp on stack if execution is already
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 3250365..db60dac 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -14,5 +14,13 @@ extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
+/* Architecture code defines this if there are other possible ELF
+ * machine types, e.g. on bi-arch capable hardware. */
+#ifndef vmcore_elf_check_arch_cross(x)
+#define ...
I think for both. One possible reason, I think, is that one never knows whether the underlying machine has 64bit extensions or not. So even if we load the kernel it will never boot. Secondly, we might not be able to

The ideal place for this would probably have been an arch dependent crash_dump.h file. But we don't have one, and there is no point introducing one just for this macro. This change looks good to me. Thanks Vivek -
Is there a kdump tree which you'll apply this to, or shall I resend CCing akpm? (I'll add an Acked-by if that's ok). Ian. -
There isn't a kexec tree at this time (though I am happy to entertain creating one). For now most patches go in either through Andrew or the relevant architecture maintainers. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ -
There is no separate kdump tree. Generally Andrew picks up these changes. I guess just resend it copying Andrew. Yes you can add Acked-by me. Thanks Vivek -
Perhaps I am misunderstanding what you are saying, but I do recall kexecing from 32->64 and 64->32 bit kernels on x86_64 hardware.

Won't the above change break non-i386 architectures, as vmcore_elf_check_arch_cross isn't defined for them? -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ -
I recall kexecing a bzImage for x86_64 on i386, but I'm not 100% sure. I think it worked because the bzImage loader code was regular 32 bit Right. And maybe it's a good idea to make sure that this feature is actually supported by kexec-tools before adding code to the kernel? My gut feeling about this is that you are begging for trouble. The kexec/kdump solution is fragile just by itself, and trying to go between architectures is just going to be painful. / magnus -
I stand corrected. I can kexec a bzImage 32->64bit. It ran into some initrd issues later, but that is a separate matter: fundamentally kexec could load a 64bit kernel bzImage and make the transition successfully. So it will now be left to the user. If he tries to kexec to a 64bit kernel on a machine not supporting 64bit extensions, then kexec will not give any advance warning. Thanks Vivek -
I feel comfortable with that. Well for now anyway. But I think that Magnus has other ideas. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ -
I don't mind switching back and forth between 32-bit and 64-bit for plain kexec, especially if we can validate that the kernel we load will use an instruction set that is supported. But for kdump, switching between 32-bit and 64-bit kernels is just another new dimension in the already too complex kdump matrix IMO. I think more focus should be put on fixing up bugs in kexec-tools than adding new features. / magnus -
I sent patches to the fastboot list at the same time I sent these ones to support differences in the underlying hypervisor architecture in the tools. They haven't appeared in the archives yet so I fear they have gone astray. I'll resend when I get to the office in a bit. The tools already have support for introducing a SHIM when kexecing between different architectures (at least in the 64->32 direction if I understand kexec-tools-testing/purgatory/arch/i386/compat_x86_64.S and k-t-t.../kexec/arch/i386/compat_x86_64.S correctly). This is really just It works fine under Xen and I think going from 64Xen+32Kernel->32Kernel makes more sense than going from 64Xen+32Kernel->64Kernel. As I said originally I'm not so convinced it makes sense in the native case but I see no reason to outlaw it (people get to keep both pieces etc...) Ian. -
... so please resend. We've just frozen the kexec-tools-testing tree for an upcoming release, but if you resend soon and your patches are trivial you may For kexec I think it is just fine. But for kdump, are you sure things will work out ok? There are some differences between the i386 and x86_64 kexec-tools code and I wonder if feeding i386 info into an x86_64 kernel will work properly. Thanks! / magnus -
It seems to work fine with Xen. A 32 bit kernel handles the 64 bit dump just fine, my pre-kdump kernel is 32 bit but it doesn't have much to do in this case I think. I don't know about native. My gut feeling is that if the mechanism of actually kexecing between 64 and 32 bit works then there is no problem with the crash dump part of the equation. The crash dump is pretty much opaque to the kernel -- it finds the headers in memory and exports the relevant pieces via /proc/vmcore. The crash kernel doesn't really care what those pieces are so long as they constitute a valid ELF image. My patch to kexec-tools-testing just changes e_machine to match the type of the pre-crash system. The dump kernel pays no attention to this field apart from the one sanity check which my patch from this thread you would see if you had done a 64->64 dump instead of a 64->32 one. I don't think it is any different to copying /proc/vmcore to a different system for analysis so any userspace tools should be able to cope. Ian. -
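To illustrate how little of the dump a consumer needs to interpret, here is a minimal userspace sketch that validates the start of an ELF64 core header and reports its e_machine value. The function name is hypothetical, the field offsets and constants follow the ELF specification, and little-endian byte order is assumed because both architectures discussed here are x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Inspect the first bytes of an ELF64 core file held in `buf`.
 * On success store e_machine in *machine and return 0; return -1 if
 * the buffer does not look like an ELF64 core header.
 */
static int sketch_core_machine(const unsigned char *buf, size_t len,
			       uint16_t *machine)
{
	assert(buf != NULL && machine != NULL);

	if (len < 20)				/* need bytes up to e_machine */
		return -1;
	if (memcmp(buf, "\177ELF", 4) != 0)	/* ELF magic */
		return -1;
	if (buf[4] != 2)			/* EI_CLASS must be ELFCLASS64 */
		return -1;
	if (buf[6] != 1)			/* EI_VERSION must be EV_CURRENT */
		return -1;

	/* e_type lives at offset 16, e_machine at offset 18 (little endian). */
	uint16_t type = (uint16_t)(buf[16] | (buf[17] << 8));
	if (type != 4)				/* ET_CORE */
		return -1;

	*machine = (uint16_t)(buf[18] | (buf[19] << 8));
	return 0;
}
```

A tool on any host could apply the same check to a copied /proc/vmcore image; a result of 62 (EM_X86_64) corresponds to the 64 bit case described above.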
Right, that's how it is supposed to work. But there are unfortunately differences between the architecture-specific implementations in the kexec-tools code. So the x86_64 version of kexec-tools behaves differently from the i386 version. For instance, x86_64 passes some acpi parameters on the command line to the crash kernel, but i386 does not. This may or may not be needed. There are differences like that, or smaller ones, sprinkled all over the place. / magnus -
I think passing those acpi parameters to a 32bit kernel should do no harm. Got a question. When running a 32bit dom0 on a 64bit hypervisor, which kexec-tools elf loader will kick in? 32bit or 64bit? Looks like in this case the 64bit one. But shouldn't it be 32bit, as a 32bit OS is running and we must be using the kexec-tools binary compiled for a 32bit OS? And if the 32bit loader kicks in we will not be passing any acpi parameters. Thanks Vivek -
There is no check today to see whether the hypervisor is 32 or 64 bit. So the 32-bit version of kexec-tools will support loading images like any other 32-bit kexec-tools. Thanks, / magnus -
If that is the case then in prepared elf headers, machine type should be EM_386 or similar and not EM_X86_64 and Ian shouldn't have run into the problem at all with vmcore. Am I missing something? Thanks Vivek -
The PRSTATUS ELF notes are generated by the hypervisor not by the kernel so they are in 64 bit format in this scenario. The machine type should reflect this. Ian. -
But the ELF header is created in the 32bit OS and pre-loaded. At creation time, 32bit kexec-tools will put the machine info as EM_386. Who changes it to EM_X86_64 before the vmcore code does a sanity check on it using elf_check_arch()? Does the hypervisor fiddle around with this field? Thanks Vivek -
Ok. Just now looked at your kexec-tools patch to determine the Xen capabilities and then filling the machine type accordingly. That explains why machine type will appear as EM_X86_64. Thanks Vivek -
I also think so. If kexec works then kdump should work too. There might be small issues here and there but can't think of any major one. Thanks Vivek -
Yesterday I tested it. I could kexec from 64->32bit but not vice versa. kexec-tools itself gave an error message: "Cannot determine the file type of ../x86_64-vmlinux/vmlinux". I did not investigate deeper, but I have a basic question. How will kexec know whether the underlying 32bit machine supports 64bit extensions or not? Do we allow loading a 64bit kernel even if the underlying machine might not support it?

In the original patch he has put an arch independent definition in include/linux/crash_dump.h, which will make sure it is not broken on other architectures. Thanks Vivek -
It looks like /proc/cpuinfo flags contains "lm" (which is long mode, right?) even if the machine is running 32 bit mode. Ian. -
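A tool that wanted to give an advance warning could look for that flag before loading a 64 bit kernel. The helper below is a hypothetical sketch, not something kexec-tools is known to contain: it checks whether a flag appears as a whole word in a flags line such as the one /proc/cpuinfo prints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `c` may legally sit next to a flag name as a separator. */
static bool sketch_is_sep(char c)
{
	return c == ' ' || c == '\t' || c == '\n' || c == '\0';
}

/*
 * Return true if `flag` occurs as a whole whitespace-separated word in
 * `flags`, so that "lm" does not match inside "lahf_lm".
 */
static bool sketch_has_cpu_flag(const char *flags, const char *flag)
{
	assert(flags != NULL && flag != NULL);

	size_t len = strlen(flag);
	if (len == 0)
		return false;

	for (const char *p = strstr(flags, flag); p != NULL;
	     p = strstr(p + 1, flag)) {
		bool starts_word = (p == flags) || sketch_is_sep(p[-1]);
		bool ends_word = sketch_is_sep(p[len]);

		if (starts_word && ends_word)
			return true;
	}
	return false;
}
```

Matching whole words matters here because other flag names contain "lm" as a substring.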
No, because of this hunk:

diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 3250365..db60dac 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -14,5 +14,13 @@ extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
+/* Architecture code defines this if there are other possible ELF
+ * machine types, e.g. on bi-arch capable hardware. */
+#ifndef vmcore_elf_check_arch_cross(x)
+#define vmcore_elf_check_arch_cross(x) 0
+#endif
[snip]
-
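The fallback pattern in that hunk can be tried outside the kernel with the sketch below. All `sketch_` and `SK_` names are hypothetical, and SKETCH_ARCH_I386 is an assumed build flag standing in for the i386 header. The sketch writes `#ifndef` with the bare macro name, since the directive tests a name only and compilers typically warn about extra tokens after it.

```c
#include <assert.h>
#include <stddef.h>

/* ELF machine numbers for PowerPC and x86_64. */
enum { SK_EM_PPC = 20, SK_EM_X86_64 = 62 };

/* Hypothetical cut-down ELF header. */
struct sketch_ehdr {
	unsigned short e_machine;
};

/* An architecture header may provide a cross check before this point.
 * Building with -DSKETCH_ARCH_I386 simulates that case. */
#ifdef SKETCH_ARCH_I386
#define sketch_check_arch_cross(x) ((x)->e_machine == SK_EM_X86_64)
#endif

/* Generic default: architectures that define nothing allow no cross type. */
#ifndef sketch_check_arch_cross
#define sketch_check_arch_cross(x) 0
#endif

/* Stand-in for a non-i386 architecture whose native type is PowerPC. */
static int sketch_vmcore_check_arch(const struct sketch_ehdr *hdr)
{
	assert(hdr != NULL);
	return hdr->e_machine == SK_EM_PPC || sketch_check_arch_cross(hdr);
}
```

Built without the flag, the cross check expands to 0 and only the native machine type is accepted, which is why other architectures keep their existing behaviour.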
Thanks, silly me :( -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ -
