Re: [PATCH][EFI] Run EFI in physical mode

Previous thread: [PATCH 1/1] kfifo: fix DMA sample driver by Ira W. Snyder on Friday, August 13, 2010 - 11:32 am. (2 messages)

Next thread: BISECTED: 2.6.35 (and -git) fail to boot: APIC problems by Ira W. Snyder on Friday, August 13, 2010 - 12:34 pm. (3 messages)
From: Takao Indoh
Date: Friday, August 13, 2010 - 12:18 pm

Hi all,

The attached patch enables EFI to run in physical mode.

Basically, EFI starts in physical mode and is switched to virtual mode
by calling SetVirtualAddressMap. With this patch applied, EFI always
runs in physical mode. You can still specify "virtefi" as a kernel boot
parameter to run EFI in virtual mode as before. Note that this patch
supports only x86_64.
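
As an illustration of the boot-parameter behaviour described above, here is a
minimal user-space sketch. It is not code from the patch: the helper names
(cmdline_has_flag, efi_mode_from_cmdline) and the enum are hypothetical, and a
real kernel would register the option through its own boot-parameter
machinery rather than scan the string by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; only the "virtefi" string comes from the patch text. */
enum efi_mode { EFI_MODE_PHYSICAL, EFI_MODE_VIRTUAL };

/* Return true if `token` appears as a whole space-separated word. */
static bool cmdline_has_flag(const char *cmdline, const char *token)
{
    size_t len;
    const char *p;

    assert(cmdline != NULL && token != NULL);
    len = strlen(token);
    if (len == 0)
        return false;

    p = cmdline;
    while ((p = strstr(p, token)) != NULL) {
        bool starts_word = (p == cmdline) || (p[-1] == ' ');
        bool ends_word = (p[len] == '\0') || (p[len] == ' ');

        if (starts_word && ends_word)
            return true;
        p++; /* keep scanning after a partial match such as "novirtefi" */
    }
    return false;
}

/* Physical mode is the default; "virtefi" restores the old behaviour. */
static enum efi_mode efi_mode_from_cmdline(const char *cmdline)
{
    return cmdline_has_flag(cmdline, "virtefi") ? EFI_MODE_VIRTUAL
                                                : EFI_MODE_PHYSICAL;
}
```

Calling efi_mode_from_cmdline("root=/dev/sda1 virtefi") selects virtual mode
in this sketch, while a command line without the word keeps the new physical
default.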

This is needed to run kexec/kdump on an EFI-booted system. The following
is the original discussion. In that thread, I explained that kdump does
not work because the EFI system table is modified by
SetVirtualAddressMap, and the idea of running EFI in physical mode was
proposed. This patch implements it.


The basic idea of this patch is to give EFI its own page table. This
page table maps each physical address of the EFI runtime to the
identical virtual address so that we can call it directly. For example,
physical address 0x800000 is mapped to virtual address 0x800000. Before
calling the EFI runtime, the cr3 register is switched to this page
table, and it is restored when we return from EFI.
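
The identity mapping and the cr3 switch described above can be modelled in
user space roughly as follows. This is only a toy sketch, not the code in the
patch: the "page table" is a flat array, toy_cr3 is a plain pointer standing in
for the real register, and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_MAX_PAGES 64

/* Toy page table: a flat list of page-sized virt -> phys mappings. */
struct toy_pgd {
    uint64_t virt[TOY_MAX_PAGES];
    uint64_t phys[TOY_MAX_PAGES];
    size_t count;
};

/* Stand-in for the cr3 register: the currently active page table. */
static const struct toy_pgd *toy_cr3;

/* Map [start, start + len) so that each virtual page equals its physical page. */
static int toy_identity_map(struct toy_pgd *pgd, uint64_t start, uint64_t len)
{
    uint64_t addr = start & ~(uint64_t)(TOY_PAGE_SIZE - 1);

    for (; addr < start + len; addr += TOY_PAGE_SIZE) {
        if (pgd->count == TOY_MAX_PAGES)
            return -1;
        pgd->virt[pgd->count] = addr;
        pgd->phys[pgd->count] = addr;
        pgd->count++;
    }
    return 0;
}

/* Translate through the active table; UINT64_MAX means "not mapped". */
static uint64_t toy_translate(uint64_t vaddr)
{
    uint64_t page = vaddr & ~(uint64_t)(TOY_PAGE_SIZE - 1);
    size_t i;

    if (toy_cr3 == NULL)
        return UINT64_MAX;
    for (i = 0; i < toy_cr3->count; i++)
        if (toy_cr3->virt[i] == page)
            return toy_cr3->phys[i] + (vaddr - page);
    return UINT64_MAX;
}

/* Switch to the EFI table, run the call, then restore the previous table. */
static uint64_t toy_call_phys(const struct toy_pgd *efi_pgd,
                              uint64_t (*fn)(uint64_t), uint64_t arg)
{
    const struct toy_pgd *saved = toy_cr3;
    uint64_t ret;

    toy_cr3 = efi_pgd;
    ret = fn(arg);
    toy_cr3 = saved;
    assert(toy_cr3 == saved);
    return ret;
}
```

In this model, an address such as 0x800010 resolves to itself only while the
EFI table is active; once the saved table is restored, the same address is no
longer mapped.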

Any comments would be appreciated.

Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
---
 arch/x86/include/asm/efi.h |    3
 arch/x86/kernel/efi.c      |  142 ++++++++++++++++++++++++++++++++++-
 arch/x86/kernel/efi_32.c   |    4
 arch/x86/kernel/efi_64.c   |   92 ++++++++++++++++++++++
 include/linux/efi.h        |    1
 include/linux/init.h       |    1
 init/main.c                |   16 +++
 7 files changed, 254 insertions(+), 5 deletions(-)

diff -Nurp linux-2.6.35.org/arch/x86/include/asm/efi.h linux-2.6.35/arch/x86/include/asm/efi.h
--- linux-2.6.35.org/arch/x86/include/asm/efi.h	2010-08-01 18:11:14.000000000 -0400
+++ linux-2.6.35/arch/x86/include/asm/efi.h	2010-08-13 14:39:25.817104994 -0400
@@ -93,6 +93,9 @@ extern int add_efi_memmap;
 extern void efi_reserve_early(void);
 extern void efi_call_phys_prelog(void);
 extern void efi_call_phys_epilog(void);
+extern void ...
From: Eric W. Biederman
Date: Friday, August 13, 2010 - 3:19 pm

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

There is what appears to be unneeded redundancy (do we need two
implementations of physical calls into efi?), but that is confined to
the weird efi state.

It is a shame you haven't done the little bit extra to get
efi_pagetable_init working on x86_32.

Overall this seems sane and confined to the x86 efi, and it looks
like further improvements could easily be layered on top of this one.

Eric
--

From: Takao Indoh
Date: Monday, August 16, 2010 - 12:30 pm

Unfortunately I don't have a machine to test on. The machine I'm using
does not support EFI on x86_32 :-( I'd appreciate it if anyone could try it...

Thanks,
--

From: H. Peter Anvin
Date: Friday, August 13, 2010 - 3:24 pm

Any hope to get a followon patch for i386 as well?  That would make it
largely a no-brainer.

Tony, does this affect ia64 in any way?

	-hpa
--

From: Luck, Tony
Date: Friday, August 13, 2010 - 3:33 pm

> does this affect ia64 in any way?

I remember Eric complaining that set_virtual_address_map() was a one
way trap door with no way to get back to physical mode ... and thus
this was a big problem to support kexec on ia64. And yet we still call
it, and ia64 can do kexec. So some other work around must have been
found. Can't immediately remember what it was though.

-Tony
--

From: Eric W. Biederman
Date: Friday, August 13, 2010 - 4:11 pm

There is a hack in the code someplace on ia64 to pass the virtual
address efi was mapped at to the next kernel, and have the kernel make
certain to use efi there, without calling set_virtual_address_map().
For similar kernels that is fine, but at some point I expect kernel
divergence will make that scheme unworkable.  Essentially this is the
same as using physical addresses but starting with the virtual addresses.

For ia64 I seem to recall some weird floating point fixup routines that
benefited from the speed set_virtual_address_map() provided.  For x86_64
where the primary (sole?) reason for enabling EFI handling is to set efi
variables from linux, I don't see a case where enabling virtual mode
makes sense.  If EFI stays around on x86, always running the calls in
physical mode and in other ways slowly decreasing our dependence on
perfect efi implementations seems necessary.

As to Peter's question I did not see any of that code that affected
anything that ia64 used.

Eric
--

From: H. Peter Anvin
Date: Friday, August 13, 2010 - 4:16 pm

I guess my real question was "is this something IA64 could benefit from
and/or could we make the IA64 code more similar to the x86 bits"?

	-hpa

--

From: Tony Luck
Date: Friday, August 13, 2010 - 4:36 pm

If Eric's recollection about the "weird floating point fixup routines"[1]
performance issues is correct - then ia64 won't want to do this.

-Tony

[1] more usually called FPSWA - floating point software assist - which
handles a bunch of corner cases in denormalized floating point values
that the h/w doesn't cover.
--

From: Simon Horman
Date: Sunday, August 15, 2010 - 6:31 pm

I proposed something similar to this for ia64 at one point to solve the
problem of kexecing to Xen - which at that time mapped EFI to a different
location to Linux.

As I recall, the idea was shot down by SGI Altix people on the basis of
potential performance problems. I don't recall any reasons more specific
than that being given (and to be honest I was less than happy about
it at the time).

In the end I moved EFI in Xen to match Linux and have been able to ignore
the problem ever since. Though as Eric pointed out elsewhere in this
thread, there is ample scope for incompatibilities with future/other
kernels.
--

From: H. Peter Anvin
Date: Friday, August 13, 2010 - 3:28 pm

Another aspect of this... this plays well into the already-outstanding
proposal to keep an identity-mapped set of page tables around at all
times.  Right now we do it ad hoc for 64 bits and not really for 32
bits, but that is being changed, see the thread starting at:

http://marc.info/?i=1280940316-7966-1-git-send-email-bp@amd64.org

This would definitely be better than keeping yet another private page table.

	-hpa
--

From: huang ying
Date: Sunday, August 15, 2010 - 6:43 pm

efi_flags and save_cr3 should be per-CPU, because they now will be
used after SMP is enabled.
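
To illustrate the per-CPU point above, here is a small user-space sketch in
which each CPU has its own slot for the saved flags and cr3 value, so calls
in flight on different CPUs cannot overwrite each other's state. The CPU
count, struct layout, and toy_* function names are hypothetical and are not
taken from the patch or from the kernel's per-CPU API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* One private save area per CPU instead of a single global pair. */
struct toy_efi_cpu_state {
    unsigned long flags;   /* saved interrupt flags */
    uint64_t saved_cr3;    /* page table active before the EFI call */
    int in_call;           /* guards against unbalanced prelog/epilog */
};

static struct toy_efi_cpu_state toy_efi_state[TOY_NR_CPUS];

/* Record this CPU's state before switching to the EFI page table. */
static void toy_efi_prelog(unsigned int cpu, unsigned long flags, uint64_t cr3)
{
    assert(cpu < TOY_NR_CPUS);
    assert(!toy_efi_state[cpu].in_call);
    toy_efi_state[cpu].flags = flags;
    toy_efi_state[cpu].saved_cr3 = cr3;
    toy_efi_state[cpu].in_call = 1;
}

/* Hand back this CPU's saved cr3 and flags after the EFI call returns. */
static uint64_t toy_efi_epilog(unsigned int cpu, unsigned long *flags)
{
    assert(cpu < TOY_NR_CPUS);
    assert(toy_efi_state[cpu].in_call);
    toy_efi_state[cpu].in_call = 0;
    *flags = toy_efi_state[cpu].flags;
    return toy_efi_state[cpu].saved_cr3;
}
```

With a single global save area, the second prelog in an interleaved sequence
would clobber the first CPU's values; with one slot per CPU, each epilog gets
back exactly what that CPU saved.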

efi_pgd should be dynamically allocated instead of statically
allocated, because EFI may not be enabled on some platforms.

And I think it is better to unify early physical mode with run-time
physical mode. Just allocate the page table with early page allocator
(lmb?).

Best Regards,
Huang Ying
--

From: H. Peter Anvin
Date: Sunday, August 15, 2010 - 8:27 pm

No, it should not be dynamic; rather we should unify all the users who need a 1:1 map and just keep that page table set around.


-- 
Sent from my mobile phone.  Please pardon any lack of formatting.
--

From: huang ying
Date: Sunday, August 15, 2010 - 9:58 pm

Agreed. One known issue with a global 1:1 map is that we need to make at
least part of the page table PAGE_KERNEL_EXEC for EFI runtime code, and
change_page_attr cannot be used before the page allocator is available.

Best Regards,
Huang Ying
--

From: H. Peter Anvin
Date: Sunday, August 15, 2010 - 10:08 pm

For the 1:1 map we probably should make all pages executable; other things need it too, but we shouldn't have it mapped in except when needed.


-- 
Sent from my mobile phone.  Please pardon any lack of formatting.
--

From: Eric W. Biederman
Date: Monday, August 16, 2010 - 4:39 pm

We still want to restore cr3 from the local task structure as soon
as is reasonable, as an identity mapped page table will have page 0

We need to be careful in the setup of the global page table so that
we are in sync with the PAT structure for the attributes pages
are mapped with, so that we don't map a page as cached and uncached
at the same time.  Otherwise we could accidentally get cache
corruption.  That would seem to mean change_page_attr
is relevant at least after we switch from our default set of
page permissions.
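
The aliasing hazard described above can be shown with a toy bookkeeping
sketch: each physical range is recorded with one cache type, and a second
mapping of an overlapping range is refused if it asks for a different type.
This is a user-space model only; the enum, table size, and toy_* names are
hypothetical and do not reproduce the kernel's PAT code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RANGES 16

enum toy_cache_type { TOY_CACHE_WB, TOY_CACHE_UC };

/* A physical range [start, end) and the cache type it is mapped with. */
struct toy_range {
    uint64_t start;
    uint64_t end;
    enum toy_cache_type type;
};

static struct toy_range toy_ranges[TOY_MAX_RANGES];
static size_t toy_nr_ranges;

/*
 * Record a mapping. Returns 0 on success, -1 if the range overlaps an
 * existing one with a different cache type (or the table is full).
 */
static int toy_reserve_memtype(uint64_t start, uint64_t end,
                               enum toy_cache_type type)
{
    size_t i;

    assert(start < end);
    for (i = 0; i < toy_nr_ranges; i++) {
        const struct toy_range *r = &toy_ranges[i];
        int overlaps = start < r->end && r->start < end;

        if (overlaps && r->type != type)
            return -1; /* same page would be cached and uncached */
    }
    if (toy_nr_ranges == TOY_MAX_RANGES)
        return -1;

    toy_ranges[toy_nr_ranges].start = start;
    toy_ranges[toy_nr_ranges].end = end;
    toy_ranges[toy_nr_ranges].type = type;
    toy_nr_ranges++;
    return 0;
}
```

A 1:1 map built through a check like this would reject an uncached alias of a
range that is already mapped write-back, which is the conflict the paragraph
above warns about.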

Eric
--

From: H. Peter Anvin
Date: Monday, August 16, 2010 - 4:54 pm

Quite, which is yet another reason to have a common global page table
for all the 1:1 users... right now this is all ad hoc.

	-hpa
--
