Cc: Benjamin Herrenschmidt <benh@...>, Pavel Machek <pavel@...>, Rafael J. Wysocki <rjw@...>, Matthew Garrett <mjg59@...>, <linux-kernel@...>, <linux-pm@...>, Alan Stern <stern@...>
Hi.
Sorry for the long delay. Busy weekend and my motivation for working on
programming is almost zero at the moment...
On Friday 06 July 2007 15:01:48 Kyle Moffett wrote:
Ok. First, I'll ignore the specification that userspace does this - I
don't think it matters whether it's userspace or kernel that does the
suspending and I'm yet to see a good reason for it to be [required to
be] done from userspace.
In this first step, you've reinvented the first part of the current
freezer implementation. The reason we don't use a real signal is
precisely so we can have an untrappable SIGSTOP. In this regard, I
particularly remember Win4Lin from a few years ago. It would die if you
sent it a real signal, so we had to do it this way. No doubt there are
other instances I'm not aware of.

How do you determine which ones are needed? Why stop them in the first
place?

Ok. So now you also need processes running that are needed for swapping,
because freeing that memory might involve swapping. Fully agree with the
logic though (not really surprising - this is what I do in
Suspend2^wTuxOnIce).
Hotplugging cpus (when all those locking issues are taken care of) is
simpler. Prior to cpu hotplugging, I used IPIs to put secondary cpus
into a tight loop, so I know it's possible to do it this way too. That
way, though, you have less flexibility. What if a cpu really is plugged
in between hibernate and resume? With cpu hotplugging, it's handled
properly and transparently. Without cpu hotplugging, you could be using
uninitialised data after the atomic restore.
Marking userspace as COW makes things more complicated, too. You then
have to add code to the COW handling to update the list of pages that
need to be saved, and you reduce the reliability of the whole process.
You can't predict beforehand how many of these COW pages are going to be
needed, and therefore can't know how much memory to free earlier on in
the process. If you run out of memory, what will be the effect?
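To make the sizing problem concrete, here is a toy model in Python (not
kernel code). The function name, the page numbers and the idea of a
fixed reserve are assumptions for illustration only. Each first write to
a protected page consumes one page from the reserve, and how many such
writes happen depends entirely on what runs while the image is saved.

```python
def cow_snapshot_survives(write_sequence, reserve_pages):
    """Return True if a reserve of spare pages covers every COW fault.

    write_sequence: page numbers written while the snapshot is active.
    reserve_pages:  spare pages set aside before the snapshot started.
    """
    copied = set()  # pages that have already been duplicated
    for page in write_sequence:
        if page in copied:
            continue  # later writes to the same page cost nothing extra
        if len(copied) == reserve_pages:
            return False  # reserve exhausted: no page left for the copy
        copied.add(page)
    return True


if __name__ == "__main__":
    writes = [1, 2, 1, 3]
    print(cow_snapshot_survives(writes, reserve_pages=3))
    print(cow_snapshot_survives(writes, reserve_pages=2))
```

With a reserve of 3 pages the write pattern above fits; with a reserve
of 2 it runs out on the third distinct page. Nothing known before the
snapshot tells you which of the two cases you will be in.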
You still need to remember what swap you're going to use to write the
image. You'll probably want to get this information (and allocate the
swap) sooner rather than later so that you're not racing against the
memory freeing earlier, and don't run into issues with bmapping the
pages or having enough memory to record the bdevs & sector numbers (not
usually an issue, but if swap is highly fragmented...).
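The fragmentation remark can be illustrated with a small Python sketch.
It assumes, purely for illustration, that allocated swap locations are
recorded as (first, last) ranges rather than one entry per location;
the helper name and the numbers are hypothetical and not taken from any
existing implementation.

```python
def to_extents(sectors):
    """Collapse a collection of sector numbers into (first, last) ranges."""
    extents = []
    for sector in sorted(sectors):
        if extents and extents[-1][1] + 1 == sector:
            extents[-1][1] = sector  # extends the current contiguous run
        else:
            extents.append([sector, sector])  # gap found: start a new run
    return [tuple(extent) for extent in extents]


if __name__ == "__main__":
    contiguous = range(100, 108)        # eight adjacent locations
    fragmented = [100, 102, 104, 106]   # four isolated locations
    print(len(to_extents(contiguous)))
    print(len(to_extents(fragmented)))
```

Eight adjacent locations need a single record, while four isolated ones
need four. The bookkeeping memory therefore grows with fragmentation
rather than with image size, which is why it only matters when swap is
badly fragmented.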
Readonly halves? I don't get that, sorry.

Mmm, but you still don't know how many.

Are you thinking the changed filesystem pages are caught by COW? (AFAIUI,
kernel writes aren't). If (as I expect), you're thinking about filesystem
writes to DM based storage, what about non DM-based filesystem pages?

FWIW, let me note an important variation from how Suspend2 works; it
might provide food for thought. In Suspend2, we treat the processes that
remain stopped throughout the whole process specially. We write their
data to disk before the atomic copy (usually 70 or 80% of memory), and
then use the memory they occupy for the destination of the atomic copy.
This further reduces the amount of memory that has to be freed, almost
always to zero.
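A rough back-of-the-envelope model of that saving, in Python. It is a
sketch under stated assumptions, not Suspend2 code: every in-use page
either belongs to a stopped process or does not, freeing a page both
removes it from the data to copy and adds a free destination page, and
kernel overhead is ignored. The function names and page counts are
hypothetical.

```python
def single_pass_pages_to_free(in_use, free):
    """Pages to free when every in-use page goes through the atomic copy."""
    # Need in_use - x <= free + x, so x >= (in_use - free) / 2, rounded up.
    return max(0, (in_use - free + 1) // 2)


def two_pass_pages_to_free(stopped, other, free):
    """Pages to free when stopped-process pages are written to disk first.

    Only 'other' pages are copied atomically, and the memory holding the
    already-written 'stopped' pages can be reused as the destination.
    """
    # Need other - x <= free + stopped + x.
    return max(0, (other - free - stopped + 1) // 2)


if __name__ == "__main__":
    # 1000 pages of RAM: 750 owned by stopped processes, 200 other, 50 free.
    print(single_pass_pages_to_free(in_use=950, free=50))
    print(two_pass_pages_to_free(stopped=750, other=200, free=50))
```

With these made-up numbers the single-pass approach has to free 450
pages, while writing the stopped processes' pages out first leaves
nothing to free, in line with the "almost always to zero" observation.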
Regards,
Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.