Re: [PATCH 1/4 -mm] kexec based hibernation -v7 : kexec jump

Previous thread: [PATCH 0/4 -mm] kexec based hibernation -v7 by Huang, Ying on Friday, December 7, 2007 - 11:53 am. (1 message)

Next thread: [PATCH 3/4 -mm] kexec based hibernation -v7 : kexec hibernate/resume by Huang, Ying on Friday, December 7, 2007 - 11:53 am. (3 messages)
To: Eric W. Biederman <ebiederm@...>, Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>
Cc: <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Friday, December 7, 2007 - 11:53 am

This patch implements the functionality of jumping between the kexeced
kernel and the original kernel.

To support jumping between two kernels, the devices are put into a
quiescent state and the state of the devices and CPU is saved before
jumping to (executing) the new kernel and before jumping back to the
original kernel. After jumping back from the kexeced kernel and after
jumping to the new kernel, the state of the devices and CPU is restored
accordingly. The device/CPU state save/restore code of software suspend
is called to implement the corresponding functions.

To support jumping without reserving memory, one shadow backup page
(source page) is allocated for each page used by the new (kexeced)
kernel (destination page). When kexec_load is called, the image of the
new kernel is loaded into the source pages, and before executing, the
destination pages and the source pages are swapped, so the contents of
the destination pages are backed up. Before jumping to the new
(kexeced) kernel and after jumping back to the original kernel, the
destination pages and the source pages are swapped too.
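
As a rough illustration of the shadow-page idea described above, the
following user-space sketch swaps the contents of each destination page
with its source page. It is not code from the patch: the names
sketch_swap_page, sketch_swap_all and SKETCH_PAGE_SIZE are hypothetical,
and a 4096-byte page is assumed. Swapping twice restores the original
contents, which is what lets the original kernel get its pages back
after a jump back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */

/* Hypothetical helper: exchange the contents of one destination page
 * and its shadow (source) page through a temporary buffer. */
static void sketch_swap_page(unsigned char *dst, unsigned char *src)
{
	unsigned char tmp[SKETCH_PAGE_SIZE];

	memcpy(tmp, dst, SKETCH_PAGE_SIZE);
	memcpy(dst, src, SKETCH_PAGE_SIZE);
	memcpy(src, tmp, SKETCH_PAGE_SIZE);
}

/* Swap every destination page with its shadow page.  After one call the
 * shadow pages hold the backup of the destination pages; a second call
 * puts everything back where it was. */
static void sketch_swap_all(unsigned char (*dst)[SKETCH_PAGE_SIZE],
			    unsigned char (*src)[SKETCH_PAGE_SIZE],
			    size_t npages)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < npages; i++)
		sketch_swap_page(dst[i], src[i]);
}
```

The real implementation works on physical pages from the relocation
code rather than on arrays, so this only models the data movement.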

A jump back protocol for kexec is defined and documented. It is an
extension to the ordinary function calling protocol, so the facility
provided by this patch can be used to call an ordinary C function in
real mode.

A set of flags for sys_kexec_load is added to control which states are
saved/restored before/after the real mode code executes. For example,
you can specify that the device state and FPU state are saved/restored
before/after the real mode code executes.

The state save/restore code (excluding CPU state) can be overridden
based on the "command" parameter of kexec jump, because more state
needs to be saved/restored when hibernating/resuming.

Signed-off-by: Huang Ying <ying.huang@intel.com>

---
 Documentation/i386/jump_back_protocol.txt |  103 ++++++++++++++
 arch/powerpc/kernel/machine_kexec.c       |    2 
 arch/ppc/kernel/machine_kexec.c           |    2 
 arch/sh/kernel/machine_kexec.c            |    2 
 arch/x86/kernel/machine_ke...
To: Huang, Ying <ying.huang@...>
Cc: Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Monday, December 10, 2007 - 10:25 pm

Why do we need var arg support?

We can't keep the same idt and gdt as the pages they are on will be
overwritten/reused.  So explicitly stomping on them sounds better.

Why rename relocate_kernel?
Ah.  I see.  You need to make it into a pointer again.  The crazy don't
stop the pgd support strikes again.  It used to be named rnk.

More later.

Eric

--
To: Eric W. Biederman <ebiederm@...>
Cc: Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Tuesday, December 11, 2007 - 11:50 am

If all parameters are provided in user space, the usage model may be as
follows:

- sys_kexec_load() /* with executable/data/parameters(A) loaded */
- sys_reboot(,,LINUX_REBOOT_CMD_KEXEC,) /* execute physical mode code with parameters(A)*/
- /* jump back */
- sys_kexec_load() /* with executable/data/parameters(B) loaded */
- sys_reboot(,,LINUX_REBOOT_CMD_KEXEC,) /* execute physical mode code with parameters(B)*/
- /* jump back */

That is, the kexec image must be re-loaded if the parameters are
different, and no state can be preserved in the kexec image. This is OK
for the original kexec implementation, because there is no jumping back.
But for kexec with jumping back, another usage model may be useful too.

- sys_kexec_load() /* with executable/data loaded */
- sys_reboot(,,LINUX_REBOOT_CMD_KEXEC,parameters(A)) /* execute physical mode code with parameters(A)*/
- sys_reboot(,,LINUX_REBOOT_CMD_KEXEC,parameters(B)) /* execute physical mode code with parameters(B)*/

This way the kexec image need not be re-loaded, and the state of the
kexec image can be preserved across several invocations.


Another usage model that may be useful is invoking the kexec image
(such as firmware) from kernel space.

- kmalloc the needed memory and load the firmware image (if needed)
- sys_kexec_load() with a fake image (one segment with size 0), the
entry point of the fake image is the entry point of the firmware image.
- kexec_call(fake_image, ...) /* maybe change entry point if needed */

This way, some kernel code can invoke the firmware in physical mode just
like invoking an ordinary function.


The original idea about this code is:

If the kexec image claims that it does not need "preserving extensive
CPU state" (such as FPU/MMX/GDT/LDT/IDT/CS/DS/ES/FS/GS/SS etc), the
IDT/GDT/CS/DS/ES/FS/GS/SS are not touched in the kexec image code. So
the segment registers need not be set.

But this is not clear. At least more description should be provided for

You mean I should change the function pointer...
To: Huang, Ying <ying.huang@...>
Cc: Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Tuesday, December 11, 2007 - 5:27 am

Interesting.  We wind up preserving the code in between invocations.

I don't know about your particular issue, but I can see that clearly
we need a way to read values back from our target image.

And if we can read everything back, one way to proceed is to read
everything out, modify it, and then write it back.

Amending a kexec image that is already stored may also make sense.

I'm not convinced that the var arg parameters make sense, but you
added them because of a real need.

The kexec function is split into two separate calls so that we can
unmount the filesystem the kexec image comes from before actually
doing the kexec.

If extensive user space shutdown or startup is needed I will argue
that doing the work in the sys_reboot call is the wrong place to
do it.  Although if a jump back is happening we should not need
much restart.

Can you generate a minimal patch with just the minimal necessary

That certainly seems interesting.  But that doesn't justify the vararg


You were changing something that used to be a pointer back to a pointer
and I found that confusing.    See the last one or two commits to
machine_kexec_32.c for when this happened.  I get the feeling that we
need to put the page table creation logic into machine_kexec_prepare,
instead of in assembly.

Eric
--
To: Eric W. Biederman <ebiederm@...>
Cc: Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Wednesday, December 12, 2007 - 2:27 am

Yes. Reading/Modifying the loaded kexec image is another way to do
necessary communication between the first kernel and the second kernel.
In fact, the patch [4/4] of this series with title:

[PATCH 4/4 -mm] kexec based hibernation -v7 : kimgcore

provides an ELF core file in /proc (/proc/kimgcore) to read the loaded
kexec image. The writing function can be added easily.

But I think communication between the first kernel and the second kernel
via reading/modifying the loaded kernel image is not a very convenient
way. The usage model may be as follows:

- sys_kexec_load() /* with executable/data loaded */
- modify the loaded kexec image to set the parameters (A)
- sys_reboot(,,LINUX_REBOOT_CMD_KEXEC,) /* execute physical mode code with parameters(A)*/
- In physical mode code, check the parameters (A) and execute accordingly
- modify the loaded kexec image to set the parameters (B)
- sys_reboot(,,LINUX_REBOOT_CMD_KEXEC,) /* execute physical mode code with parameters(B)*/
- In physical mode code, check the parameters (B) and execute accordingly

There are some issues with this usage model:

- Some parameters in the kernel need to be exported (such as
kimage->head, to let the second kernel read the contents of the
backed-up memory).

- The physical mode code invoker (the first kernel) needs to know where
to write the parameters. A common protocol or a case-by-case protocol
should be defined. For example, the memory address after the entry point
of the kexec image is a good candidate. But for the Linux kernel, there
are two types of entry point, the "jump back entry" or "purgatory". Maybe
a different protocol should be defined for each type of entry point.

- For the user space of the second kernel to get the parameters, an
interface (maybe a file in /proc or /sys) should be provided to export
the parameters to user space.

So I think the current parameter passing mechanism may be simpler and
more convenient (defined in Documentation/i386/jump_back_protocol.txt
in the patch).

The...
To: Huang, Ying <ying.huang@...>
Cc: Eric W. Biederman <ebiederm@...>, Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>, <linux-kernel@...>
Date: Monday, December 10, 2007 - 6:31 pm

Hi,

Why do we need so many different flags for preserving different types
of state (CPU, CPU_EXT, Device, console)? To keep things simple,
can't we create just one flag KEXEC_PRESERVE_CONTEXT, which will
indicate any special action required for preserving the previous kernel's
context so that one can switch back to the old kernel?

Thanks
Vivek
--
To: Vivek Goyal <vgoyal@...>
Cc: Eric W. Biederman <ebiederm@...>, Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>, <linux-kernel@...>
Date: Tuesday, December 11, 2007 - 4:55 am

Yes. There are too many flags, especially when we have no users of these
flags now. It is better to use one flag such as KEXEC_PRESERVE_CONTEXT
now, and create the other required flags when really needed.
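
A minimal sketch of what a single flag could look like on the kernel
side, assuming one bit gates all of the save steps. This is not code
from the patch: the SKETCH_ names, the numeric values and the helper
functions are hypothetical, and the four kinds of state are simply the
ones listed in the question above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the thread does not give numeric values. */
#define SKETCH_KEXEC_ON_CRASH		0x1UL
#define SKETCH_KEXEC_PRESERVE_CONTEXT	0x2UL
#define SKETCH_KEXEC_KNOWN_FLAGS \
	(SKETCH_KEXEC_ON_CRASH | SKETCH_KEXEC_PRESERVE_CONTEXT)

/* Hypothetical helper: does the caller want to be able to jump back? */
static bool sketch_preserve_context(unsigned long flags)
{
	return (flags & SKETCH_KEXEC_PRESERVE_CONTEXT) != 0;
}

/* Hypothetical stand-in for the save path.  Returns how many kinds of
 * state would be saved before the jump. */
static int sketch_prepare_jump(unsigned long flags)
{
	int saved = 0;

	/* Unknown bits are treated as a caller bug in this sketch. */
	assert((flags & ~SKETCH_KEXEC_KNOWN_FLAGS) == 0);

	if (sketch_preserve_context(flags)) {
		saved++;	/* device state */
		saved++;	/* CPU state */
		saved++;	/* extended CPU state (FPU etc.) */
		saved++;	/* console */
	}
	return saved;
}
```

With one bit the caller cannot ask for a partial save, which is the
simplification being agreed to here; finer-grained flags could still be
added later without changing the meaning of this one.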

Best Regards,
Huang Ying
--
To: Huang, Ying <ying.huang@...>
Cc: Eric W. Biederman <ebiederm@...>, Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>, <linux-kernel@...>
Date: Monday, December 10, 2007 - 3:55 pm

Hi,

I am just going through your patches and trying to understand it. Don't

I need jumping back to restore an already hibernated kernel image? Can

Ok, so due to swapping of source and destination pages first kernel's data
is still preserved.  How do I get the dynamic memory required for second

Is 2K sufficient for all the code in relocate_kernel_32.S? What's the

Who fills the entry point at offset 0x200?



Who is using kexec_call()? I can't seem to locate the caller of it.


Thanks
Vivek
--
To: Vivek Goyal <vgoyal@...>
Cc: Eric W. Biederman <ebiederm@...>, Pavel Machek <pavel@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>, <linux-kernel@...>
Date: Tuesday, December 11, 2007 - 4:51 am

Now, the jumping back is used to implement "kexec based hibernation",
which uses kexec/kdump to save the memory image of the hibernated kernel
during hibernation, and uses /dev/oldmem to restore the memory image of
the hibernated kernel and jump back to the hibernated kernel to continue
running.

Other usage models may include:

- Dump the system memory image then continue to run, that is, take a
memory snapshot of the system while it is running.
- Cooperative multi-tasking of different OSes. You can load another OS
(B) from the current OS (A), and jump between the two OSes as needed.

All dynamic memory required for the second kernel should be "loaded" by
sys_kexec_load in the first kernel. For example, not only should the
Linux kernel be loaded at 1M, the memory 0~16M (excluding the kernel) should be

The current size is 0x2d7 (727). I got it through objdump,

The entry point is filled by assembler code in relocate_kernel_32.S upon

There is no user of kexec_call() now. But I think it may be useful as a
physical mode caller for some firmware code.

Best Regards,
Huang Ying
--
To: Huang, Ying <ying.huang@...>
Cc: Eric W. Biederman <ebiederm@...>, <nigel@...>, Rafael J. Wysocki <rjw@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Saturday, December 8, 2007 - 7:53 pm

I'm not kexec hacker... but maybe this is in good enough state to be
merged? It is useful on its own: kexec jump and back means we can dump
system then continue running, for example...

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To: Pavel Machek <pavel@...>
Cc: Huang, Ying <ying.huang@...>, Eric W. Biederman <ebiederm@...>, <nigel@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Saturday, December 8, 2007 - 8:19 pm

As far as I'm concerned, patches [1/4] and [2/4] can go.

The other two are not in that shape yet (especially the [3/4] patch).

Greetings,
Rafael
--
To: Rafael J. Wysocki <rjw@...>
Cc: Pavel Machek <pavel@...>, Huang, Ying <ying.huang@...>, <nigel@...>, Andrew Morton <akpm@...>, Jeremy Maitin-Shepard <jbms@...>, <linux-kernel@...>, <linux-pm@...>, Kexec Mailing List <kexec@...>
Date: Saturday, December 8, 2007 - 9:06 pm

Ok.  Then I will see if I can review these in the next couple days
and give some feedback.

At a quick skim through the code it appears there is some more infrastructure
than we need and things can still be simplified.

Since this applies in particular to the user space interface I'm not comfortable
with these patches going in just yet.

The unused KEXEC_PRESERVE_ flags especially give me pause.  Having something
like that, that isn't currently wired up sounds like a bad place to start.

Eric
--