Hi.
On Tue, 2007-05-29 at 14:15 +0200, Rafael J. Wysocki wrote:
> Please have a look at the current version of the patch (appended).
>
> I have followed the Nigel's suggestion not to change the current behavior
> in this patch (I'll add a couple of patches removing the freezability from
> some kernel threads), with one exception: I couldn't figure out any reason
> to have try_to_freeze() called in net/sunrpc/svcsock.c:svc_recv() .
Thanks. IIRC, svcsock is related to the NFS server code.
> I've also added a piece of documentation, freezing-of-tasks.txt . Please
> see if it's not missing anything (I'd like it to be quite complete).
[...]
Mostly just grammar and the odd typo. On the whole, it's really well
written and perfectly readable - great job!
> Index: linux-2.6.22-rc3/Documentation/power/freezing-of-tasks.txt
> ===================================================================
> --- /dev/null
> +++ linux-2.6.22-rc3/Documentation/power/freezing-of-tasks.txt
> @@ -0,0 +1,160 @@
> +Freezing of tasks
> + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL
> +
> +I. What is the freezing of tasks?
> +
> +The freezing of tasks is a mechanism by which user space processes and some
> +kernel threads are controlled during hibernation or system-wide suspend (on some
> +architectures).
> +
> +II. How it works?
How does it work?
> +
> +There are four per-task flags used for that, PF_NOFREEZE, PF_FROZEN, TIF_FREEZE
> +and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have
> +PF_NOFREEZE unset (all user space processes and some kernel threads) are
> +regarded as 'freezable' and treated in a special way before the system enters a
> +suspend state as well as before a hibernation image is created (in what follows
> +we only consider hibernation, but the description also applies to suspend).
> +
> +Namely, as the first step of the hibernation procedure the function
> +freeze_processes() (defined in kernel/power/process.c) is called. It executes
> +try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and
> +sends a fake signal to each of them. A task that receives such a signal and has
> +TIF_FREEZE set, should react to it by calling the refrigerator() function
> +(defined in kernel/power/process.c), which sets the task's PF_FROZEN flag,
> +changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is
> +cleared for it. Then, we say that the task is 'frozen' and therefore the set of
> +functions handling this mechanism is called 'the freezer' (these functions are
> +defined in kernel/power/process.c and include/linux/freezer.h). User space
> +processes are generally frozen before kernel threads.
> +
> +It is not recommended to call refrigerator() directly. Instead, it is
> +recommended to use the try_to_freeze() function (defined in
> +include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the
> +task enter refrigerator() if the flag is set.
> +
> +For user space processes try_to_freeze() is called automatically from the
> +signal-handling code, but the freezable kernel threads need to call it
> +explicitly in suitable places. The code to do this may look like the following:
> +
> +	do {
> +		hub_events();
> +		wait_event_interruptible(khubd_wait,
> +					!list_empty(&hub_event_list));
> +		try_to_freeze();
> +	} while (!signal_pending(current));
> +
> +(from drivers/usb/core/hub.c::hub_thread()).
> +
> +If a freezable kernel thread fails to call try_to_freeze() after the freezer has
> +set TIF_FREEZE for it, the freezing of tasks will fail and the entire
> +hibernation operation will be cancelled. For this reason, freezable kernel
> +threads must call try_to_freeze() somewhere.
> +
> +After the system memory state has been restored from a hibernation image and
> +devices have been reinitialized, the function thaw_processes() is called in
> +order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that
> +have been frozen leave refrigerator() and continue running.
> +
> +III. Which kernel threads are freezable?
> +
> +Kernel threads are not freezable by default. However, a kernel thread may clear
> +PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE
> +directly is strongly discouraged). From this point it is regarded as freezable
> +and must call try_to_freeze() in a suitable place.
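For what it's worth, the flag handling described in sections II and III can be modelled in user space. The following is only a rough sketch under stated assumptions, not kernel code: the flag names and function names are taken from the text above, but struct task, the flag values, the calls_try_to_freeze field, the init_*() helpers, is_frozen() and the -1 return value are all made up for illustration. In particular, the real refrigerator() loops until PF_FROZEN is cleared, whereas this model merely records the flag, and it assumes TIF_FREEZE is dropped once the task is frozen.

```c
/* User-space model of the freezer flags. NOT kernel code: the struct,
 * flag values and helper bodies below are illustrative assumptions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	PF_NOFREEZE = 1 << 0,	/* task is exempt from freezing */
	PF_FROZEN   = 1 << 1,	/* task is sitting in the refrigerator */
	TIF_FREEZE  = 1 << 2,	/* freezer has asked the task to freeze */
};

struct task {
	unsigned int flags;
	bool calls_try_to_freeze;	/* hypothetical: models a well-behaved task */
};

/* User space processes are always freezable. */
static void init_user_task(struct task *t)
{
	t->flags = 0;
	t->calls_try_to_freeze = true;	/* done by the signal-handling code */
}

/* Kernel threads start out with PF_NOFREEZE set. */
static void init_kthread(struct task *t, bool calls_try_to_freeze)
{
	t->flags = PF_NOFREEZE;
	t->calls_try_to_freeze = calls_try_to_freeze;
}

static void set_freezable(struct task *t)
{
	t->flags &= ~PF_NOFREEZE;
}

static bool is_frozen(const struct task *t)
{
	return (t->flags & PF_FROZEN) != 0;
}

/* Enter the (modelled) refrigerator only if the freezer asked for it. */
static bool try_to_freeze(struct task *t)
{
	if (!(t->flags & TIF_FREEZE))
		return false;
	t->flags &= ~TIF_FREEZE;	/* assumption of this model */
	t->flags |= PF_FROZEN;
	return true;
}

static void thaw_processes(struct task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		tasks[i].flags &= ~(PF_FROZEN | TIF_FREEZE);
}

/* Returns 0 on success, -1 if some freezable task never froze. */
static int freeze_processes(struct task *tasks, size_t n)
{
	/* Step 1: request freezing of every freezable task. */
	for (size_t i = 0; i < n; i++)
		if (!(tasks[i].flags & PF_NOFREEZE))
			tasks[i].flags |= TIF_FREEZE;

	/* Step 2: well-behaved tasks reach try_to_freeze(). */
	for (size_t i = 0; i < n; i++)
		if (tasks[i].calls_try_to_freeze)
			try_to_freeze(&tasks[i]);

	/* Step 3: one freezable task that did not freeze fails everything. */
	for (size_t i = 0; i < n; i++) {
		if (!(tasks[i].flags & PF_NOFREEZE) && !is_frozen(&tasks[i])) {
			thaw_processes(tasks, n);	/* give up and undo */
			return -1;
		}
	}
	return 0;
}
```

In this model a kernel thread is skipped until set_freezable() has been called for it, and a single freezable thread that never reaches try_to_freeze() makes freeze_processes() fail and thaws everything else, which is the failure mode the document warns about.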
> +
> +IV. Why do we do that?
> +
> +Generally speaking, there is a couple of reasons to use the freezing of tasks:
> +
> +1. The principal reason is to prevent filesystems from being damaged after
> +hibernation. Namely, for now we have no simple means of checkpointing
s/Namely, for now/At the moment/
No simple means or no means at all? Are you thinking of bdev freezing?
> +filesystems, so if there are any modifications made to filesystem data and/or
> +metadata on disks, we usually cannot bring them back to the state from before
If the above is changed, I'd remove 'usually' here.
> +the modifications. At the same time each hibernation image contains some
> +filesystem-related information that must be consistent with the state of the
> +on-disk data and metadata after the system memory state has been restored from
> +the image (otherwise the filesystems will be damaged in a nasty way, usually
> +making them almost impossible to repair). Therefore we freeze tasks that might
s/Therefore we/We therefore/
> +cause the on-disk filesystems' data and metadata to be modified after the
> +hibernation image has been created and before the system is finally powered off.
> +The majority of them is user space processes, but if any of kernel threads may
s/them is/these are/
s/of kernel/of the kernel/
> +cause something like this to happen, they have to be freezable.
> +
> +2. The second reason is to prevent user space processes and some kernel threads
> +from interfering with the suspending and resuming of devices. For example, a
> +user space process running on a second CPU while we are suspending devices may
I'd shift the "For example" to after "may", giving "...may, for example,
be troublesome..."
> +be troublesome and without the freezing of tasks we would need some safeguards
> +against race conditions that might occur in such a case.
> +
> +Although Linus Torvalds doesn't like the freezing of tasks, he said this in one
> +of the discussions on LKML (http://lkml.org/lkml/2007/4/27/608):
> +
> +'> Why we freeze tasks at all or why we freeze kernel threads?
> +
> +In many ways, "at all".
I found these first two lines confusing - I thought the "Why we
freeze..." was Linus, rather than a quotation he was responding to. I'd
suggest starting the quote at what follows this point... but then as I
read further, I can see the quote is necessary to make sense of the
second paragraph below. Perhaps the best way would be to put a line before
the "Why we freeze..." indicating that you're being quoted there.
> +I _do_ realize the IO request queue issues, and that we cannot actually do
> +s2ram with some devices in the middle of a DMA. So we want to be able to
> +avoid *that*, there's no question about that. And I suspect that stopping
> +user threads and then waiting for a sync is practically one of the easier
> +ways to do so.
> +
> +So in practice, the "at all" may become a "why freeze kernel threads?" and
> +freezing user threads I don't find really objectionable.'
Oh, and double quotes should surround the whole quote, with single
quotes replacing the double quotes in the quotation. Hope all those
'quote's aren't confusing! :)
> +Still, there are kernel threads that may want to be freezable. For example, if
> +a kernel that belongs to a device driver accesses the device directly, it in
> +principle needs to know when the device is suspended, so that it doesn't try to
> +access it at that time. However, if the kernel thread is freezable, it will be
> +frozen before the driver's .suspend() callback is executed and it will be
> +thawed after the driver's .resume() callback has run, so it won't be accessing
> +the device while it's suspended.
> +
> +3. Another reason for freezing tasks is to prevent user space processes from
> +realizing that hibernation (or suspend) operation takes place. Ideally, user
> +space processes should not notice that such a system-wide operation has occured
s/occured/occurred/. That word gets me too.
> +and should continue running without any problems after the restore (or resume
> +from suspend). Unfortunately, in the most general case this is quite difficult
> +to achieve without the freezing of tasks. Consider, for example, a process
> +that depends on the number of CPUs being online while it's running. Since we
s/the number of/all/ (or secondary)
> +need to disable nonboot CPUs during the hibernation, if this process is not
> +frozen, it may notice that the number of CPUs has changed and may start to work
> +incorrectly because of that.
> +
> +V. Are there any problems related to the freezing of tasks?
> +
> +Yes, there are.
> +
> +First of all, the freezing of kernel threads may be tricky if they depend one
> +on another. For example, if kernel thread A waits for a completion (in the
> +TASK_UNINTERRUPTIBLE state) that needs to be done by freezable kernel thread B
> +and B is frozen in the meantime, then A will be blocked until B is thawed, which
> +may be undesirable. That's why kernel threads are not freezable by default.
> +
> +Second, there are the following two problems related to the freezing of user
> +space processes:
> +1. Putting processes into an uninterruptible sleep stuffs up the load average.
s/stuffs up/distorts/ ('Stuffs up' is accurate as a colloquialism, but
I'm suggesting the change because the language in the remainder of the
file is more formal - this seems out of place).
> +2. Now that we have FUSE, plus the framework for doing device drivers in
> +userspace, it gets even more complicated because some userspace processes are
> +[...]tml).
Death to them all, I say! :)
> +The problem 1. seems to be fixable, although it hasn't been fixed so far. The
> +other one is more serious, but it seems that we can work around it by using
> +hibernation (and suspend) notifiers (in that case, though, we won't be able to
> +avoid the realization by the user space processes that the hibernation is taking
> +place).
> +
> +There also are problems that the freezing of tasks tends to expose, although
s/also are/are also/
> +they are not directly related to it. For example, if request_firmware() is
> +called from a device driver's .resume() routine, it will timeout and eventually
> +fail, because the user land process that should respond to the request is frozen
> +at this point. So, seemingly, the failure is due to the freezing of tasks.
> +Suppose, however, that the firmware file is located on a filesystem accessible
> +only through the device that needs the firmware. In that case, the system won't
> +be able to work normally after the restore regardless of whether or not the
> +freezing of tasks is used. Consequently, the problem is not really related to
> +the freezing of tasks, since it generally exists regardless. [The solution to
> +this particular problem is to keep the firmware in memory after it's loaded for
> +the first time and upload if from memory to the device whenever necessary.]
I understand the logic and agree with what you're trying to say in this
last example, but I think the example is faulty. If the firmware is on a
filesystem accessible only through the device that needs the firmware,
then you wouldn't be able to bring it up in the first place.
Regards,
Nigel