Linux: Discussing Hibernation Requirements

Submitted by Jeremy on July 16, 2007 - 8:15pm.
Linux news

"Since many alternative approaches to hibernation are now being considered and discussed," Rafael Wysocki began on the lkml, "I thought it might be a good idea to list some things that in my not so humble opinion should be taken care of by any hibernation framework. They are listed below, not in any particular order, because I think they all are important." He continued with the following list, including a paragraph discussing each: "filesystems mounted before the hibernation are untouchable; swap space in use before the hibernation must be handled with care; there are memory regions that must not be saved or restored; the user should be able to limit the size of a hibernation image; hibernation should be transparent from the applications' point of view; state of devices from before hibernation should be restored, if possible; on ACPI systems special platform-related actions have to be carried out at the right points, so that the platform works correctly after the restore; hibernation and restore should not be too slow; hibernation framework should not be too difficult to set up." Rafael went on to summarize:

"In my opinion any hibernation framework that doesn't take the above requirements into account in any way will be a failure. Moreover, the existing frameworks fail to follow some of them too, so I consider all of these frameworks as a work in progress. For this reason, I will much more appreciate ideas allowing us to improve the existing frameworks in a more or less evolutionary way, than attempts to replace them all with something entirely new."


From:	Rafael J. Wysocki [email blocked]
To:	LKML [email blocked]
Subject: Hibernation considerations
Date:	Sun, 15 Jul 2007 14:33:32 +0200

Hi,

Since many alternative approaches to hibernation are now being considered and
discussed, I thought it might be a good idea to list some things that in my not
so humble opinion should be taken care of by any hibernation framework.  They
are listed below, not in any particular order, because I think they all are
important.  Still, I might have forgotten something, so everyone with
experience in implementing hibernation, especially Pavel and Nigel, please
check if the list is complete.

(1) Filesystems mounted before the hibernation are untouchable

    When there's a memory snapshot, either in the form of a hibernation image,
    or in the form of the "old" kernel and processes available to the "new"
    kexeced kernel responsible for saving their memory, the filesystems mounted
    before the hibernation should not be accessed, even for reading, because
    that would cause their on-disk state to be inconsistent with the snapshot
    and might lead to a filesystem corruption.

(2) Swap space in use before the hibernation must be handled with care

    If swap space is used for saving the memory snapshot, the snapshot-saving
    application (or kernel) must be careful enough not to overwrite swap pages
    that contain valid memory contents stored in there before the hibernation.

(3) There are memory regions that must not be saved or restored

    Some memory regions contain data that shouldn't be overwritten during the
    restore, because that might lead to the system not working correctly
    afterwards.  Also, on some systems there are valid 'struct pages'
    structures that in fact correspond to memory holes and we should not attempt
    to save those pages.

(4) The user should be able to limit the size of a hibernation image

    There are a couple of reasons for that.  For example, the storage space
    used for saving the image may be smaller than the entire RAM or the user
    may want the image to be saved more quickly.

(5) Hibernation should be transparent from the applications' point of view

    Generally, applications should not notice that hibernation took place.
    [Note that I don't regard all processes as applications and I think that
    there may be processes which need to handle the hibernation in a special
    way.]  Ideally, for example, if some audio is being played when a
    hibernation starts, the audio player should be able to continue playing the
    same audio after the restore from the point in which it has been
    interrupted by the hibernation.  Also, the CPU affinities and similar
    settings requested by the applications before a hibernation should be
    binding after the restore.

(6) State of devices from before hibernation should be restored, if possible

    If possible, during a restore devices should be brought back to the same
    state in which they were before the corresponding hibernation.  Of course
    in some situations it might be impossible to do that (eg. the user
    connected the hibernated system to a different IP subnet and then
    restored), but as a general rule, we should do our best to restore the
    state of devices, which is directly related to point (5) above.

(7) On ACPI systems special platform-related actions have to be carried out at
    the right points, so that the platform works correctly after the restore

    The ACPI specification requires us to invoke some global ACPI methods
    during the hibernation and during the restore.  Moreover, the ordering of
    code related to these ACPI methods may not be arbitrary (eg. some of
    them have to be executed after devices are put into low power states etc.).

(8) Hibernation and restore should not be too slow

    In my opinion, if more than one minute is needed to hibernate the system
    with the help of certain hibernation framework, then this framework is not
    very useful in practice.  It might be useful to perform some special tasks
    (eg. moving a server to another place without taking it down), but it is
    not very useful, for example, to notebook users.

(9) Hibernation framework should not be too difficult to set up

    It follows from my experience that if the users are required to do too much
    work to set up a hibernation framework, they will not use it as long as
    there are simpler alternatives (some of them will not use hibernation at
    all if it's too difficult to get to work).  On the other hand, if the users
    are provided with a working hibernation framework by their distribution
    and they find it useful, they are not likely to use kernel.org kernels if
    it's too difficult to replace the distribution kernel with a generic one due
    to the hibernation framework's requirements.

All of the existing hibernation frameworks have been written with the above
points in mind and that's why they are what they are.  In particular, the
existence of the tasks freezer, hated by some people to the point of insanity,
follows directly from points (1), (4) and (5).

In my opinion any hibernation framework that doesn't take the above
requirements into account in any way will be a failure.  Moreover, the existing
frameworks fail to follow some of them too, so I consider all of these
frameworks as a work in progress.  For this reason, I will much more appreciate
ideas allowing us to improve the existing frameworks in a more or less
evolutionary way, than attempts to replace them all with something entirely
new.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth
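
Requirements (2) and (4) from the mail above can be illustrated with a small toy model. The sketch below is not kernel code: the function name `plan_image_slots`, its parameters, and the idea of numbering swap slots from zero are all assumptions made for illustration. It only shows the two constraints interacting: slots that already hold swapped-out pages are never chosen, and a user-supplied limit caps the image size.

```python
from typing import Iterable, List, Optional


def plan_image_slots(total_slots: int,
                     in_use: Iterable[int],
                     image_pages: int,
                     image_limit: Optional[int] = None) -> List[int]:
    """Pick swap slots for a hibernation image (toy model, not kernel code).

    total_slots: number of page-sized slots in the swap area
    in_use:      slots already holding swapped-out pages (must survive)
    image_pages: number of pages the snapshot would like to save
    image_limit: optional user-set cap on the image size, in pages
    """
    used = set(in_use)
    # Requirement (4): honour the user's size limit, if there is one.
    # A real framework would have to free memory to fit under the cap.
    wanted = image_pages if image_limit is None else min(image_pages, image_limit)
    # Requirement (2): only slots that are not already in use are candidates.
    free = [slot for slot in range(total_slots) if slot not in used]
    if wanted > len(free):
        raise RuntimeError("not enough free swap for the image")
    return free[:wanted]
```

With eight slots of which 1 and 3 are in use, a four-page image would land in slots 0, 2, 4 and 5 under this model, and a limit of two pages would shrink that to slots 0 and 2.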

Nope

July 17, 2007 - 1:21am
Joseph Fannin (not verified)

(2) is not a valid requirement. There are no real reasons to write the hibernate image to swapspace, and several good reasons not to do so.

The big advantage of swap

July 17, 2007 - 3:51am
Anonymous (not verified)

The big advantage of swap space is that it is there, backs up memory state anyway, and that it can be unmounted (or treated specially) quite easily. Storing the image on a mounted regular file system is very tricky.

These problems can be avoided by using a separate hibernate partition, but then you waste space.

Sure there is, the main one

July 17, 2007 - 4:27am

Sure there is, the main one being that writing to a swap partition is a lot safer than a file since it requires no filesystem access. And filesystem access is going to be a fairly large problem if the kexec hibernation implementation takes off.

read (2) more carefully

July 17, 2007 - 8:14am
Anonymous (not verified)

Requirement (2) doesn't say that you have to use swap to store the image, it says that if you do use swap, then you should be very careful about what you do and not leave swap corrupt, inconsistent or unusable in any way.

(2) swap space

July 18, 2007 - 4:13am
Anonymous (not verified)

It was mentioned that IF the swapspace is used to store the memory image - it doesn't say to use the swap for that purpose. In fact I'd say anyone who uses swap space to store the memory image before hibernating doesn't know what they are doing. Among other things, there is no guarantee that the user set up a swap space.

Personally I'd expect 'sleep' to leave everything in RAM as is and the computer goes into a low-power state where some energy is expended on maintaining RAM. It is only necessary to save used memory pages (including pages swapped out) if you plan to completely shut down the system but want a fast restart. I guess some people would want an even lower-power state than a simple 'sleep' in which case they would write out all used memory pages - but writing out memory as a matter of course is unbelievably stupid in my opinion because it takes time and really doesn't drop power consumption that much - the only advantage is that if power were to drop below the critical point you don't suddenly lose everything, but for me that just means that you need a second level to the suspend which writes out memory and shuts down when your battery level drops below a given setting. Forget the "suspend to RAM" vs "suspend to disk" wars - that is plain stupid. It should always be suspend to RAM with a write-to-disk+poweroff on demand or when the battery runs low, but it must always pass through the 'suspended to RAM' state.

Device drivers are the trickiest and I don't believe they will support proper power management until there is a single suspend API to use. Just one example of a problem: you have an ethernet device which needs firmware loaded before operating - if you completely power down that subsystem then you need to reload the firmware before the driver attempts to do anything more.

I very much disagree about

July 18, 2007 - 3:43pm
Anonymous (not verified)

I very much disagree about suspend-to-disk being worthless. Most of the time, I know my laptop is going to be off for a while. The battery WILL die before I turn it back on. I'd rather just turn it off, but have it suspended. Please don't take away that functionality.

Swapspace is a requirement

July 18, 2007 - 6:13pm

Swapspace is a requirement for swsusp and uswsusp, but Suspend2/TuxOnIce can use regular files if you want. And it's been common knowledge that if you want to hibernate on Linux you need swapspace for that, so if you set up a box without swap and want to hibernate, it's your fault.

> Personally I'd expect 'sleep' to leave everything in RAM as is and the computer goes into a low-power state where some energy is expended on maintaining RAM

Yes, we have that, it's called suspend-to-ram.

> but writing out memory as a matter of course is unbelievably stupid in my opinion

It doesn't happen in the suspend-to-ram case so it's not stupid at all.

> the only advantage is that if power were to drop below the critical point you don't suddenly lose everything, but for me that just means that you need a second level to the suspend which writes out memory and shuts down when your battery level drops below a given setting.

We have that, it's called suspend-to-both.

> Forget the "suspend to RAM" vs "suspend to disk" wars - that is plain stupid. It should always be suspend to RAM with a write-to-disk+poweroff on demand or when the battery runs low, but it must always pass through the 'suspended to RAM' state.

There is no war and limiting people to your "one true way" is what's plain stupid. I hibernate my laptop because I don't want it running hot in the case after I've suspended it. And having it go through the suspend-to-ram paths just to ignore what they did and hibernate is equally stupid.

But the point is moot since Linus has decreed that suspend and hibernate are separate operations, so work will continue to separate the code paths and the option of one, the other or both will always be there.

> Device drivers are the trickiest and I don't believe they will support proper power management until there is a single suspend API to use. Just one example of a problem: you have an ethernet device which needs firmware loaded before operating - if you completely power down that subsystem then you need to reload the firmware before the driver attempts to do anything more.

There is currently a single API and most drivers still get it wrong or don't try at all, with firmware loading being one of the most problematic of the current issues. So even once the operations are separate, things won't be any worse off than they are now.

complex !

July 17, 2007 - 4:34am
Anonymous (not verified)

This feature seems very complex, with many tricky parts like locked pages or security-related information (crypto keys...).

Dumping the kernel memory correctly looks hard. Why not separate process information from driver state? Stopping a process, dumping it to a file and starting it again looks simple. But couldn't the kernel simply be "stopped" and every driver restarted afterwards, instead of reloading its memory?

program hibernation?

July 17, 2007 - 5:58pm
Anonymous (not verified)

Would it make sense to let programs sleep and just kill off the kernel/driver part?

Another operating system

July 17, 2007 - 11:23am
Fred Flinta (not verified)

Another popular operating system has hibernation and has had so for quite a while.
It writes it to a file on the hard disk, and it's very easy to use and it works well.

Re: Another operating system

July 17, 2007 - 3:02pm
Anonymous (not verified)

> It writes it to a file on the hard disk

It's all hard disk. You mean a file on a mounted filesystem.

And the key there is that it pre-allocates the disk space (which I think needs to be contiguous), and the hibernation and restore code bypasses the filesystem layer and directly writes to the blocks in question.

This is an approach that could be used for Linux, too.
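
The approach described in this comment (pre-allocate the file, then write straight to its blocks without going through the filesystem) might be sketched roughly as below. This is a toy model under stated assumptions: the function name `image_block_map`, the 4096-byte `BLOCK_SIZE`, and the representation of the file as a list of (first block, block count) extents are all invented for illustration, and unlike the comment's guess the sketch does not require the file to be contiguous.

```python
from typing import List, Tuple

BLOCK_SIZE = 4096  # assumed block size in bytes (hypothetical)


def image_block_map(extents: List[Tuple[int, int]], image_bytes: int) -> List[int]:
    """Return the on-disk block numbers that would hold the image.

    extents: (first_block, block_count) runs of the preallocated file,
             recorded while the filesystem could still be consulted.
    """
    needed = -(-image_bytes // BLOCK_SIZE)  # ceiling division
    blocks: List[int] = []
    for first, count in extents:
        for block in range(first, first + count):
            if len(blocks) == needed:
                return blocks
            blocks.append(block)
    if len(blocks) < needed:
        raise ValueError("preallocated file is too small for the image")
    return blocks
```

Because the block list is computed before the snapshot is taken, the code that later writes or reads the image in this model only needs raw block access, which is the point the comment makes about bypassing the filesystem layer.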

Works well is subjective,

July 17, 2007 - 4:14pm

"Works well" is subjective. Back when I used that other OS on my notebook, just having FF open would cause the hibernation to fail.

Possibly in userspace

July 17, 2007 - 11:44am
xSTEFANØx (not verified)

Another important feature is that the system should work entirely in user space, if possible (like uswsusp).

Excuse me, but why should I

July 17, 2007 - 2:31pm
Anonymous (not verified)

Excuse me, but why should I care whether the code runs in kernel space or not, as long as it works?

Sounds like religious bullshit to me.

That's a retarded

July 17, 2007 - 4:17pm

That's a retarded requirement and even Pavel Machek, the kernel dev who keeps screaming the same thing, can't come up with a good reason why other than "If it doesn't have to be in the kernel it shouldn't be." which in general isn't a bad rule but it shouldn't be a hard requirement. uswsusp still can't do everything that suspend2/tuxonice does and according to Pavel development would be a lot faster since so much could be done in userspace.

uswsusp is a long way from

July 19, 2007 - 4:29pm
Anonymous (not verified)

uswsusp is a long way from being *entirely* in user space ;-)

If you don't mind me asking

July 18, 2007 - 8:56am
Anonymous (not verified)

Who is Rafael Wysocki to dictate hibernation requirements?

One of the few people

July 18, 2007 - 5:58pm

One of the few people actually willing to do the work on the hibernation implementation. He's been working on the core PM support and uswsusp for a while now.

And anyway, most of his requirements are just that. They're either required by the hardware or by kernel assumptions, or they're features of the current implementations whose loss would be considered a regression. The list looks reasonable to me.

'Since many alternative

July 19, 2007 - 8:29am
Anonymous (not verified)

'Since many alternative approaches to hibernation are now being considered and discussed,'

Why are new approaches being discussed? Aren't the current software suspend mechanisms good enough? If not, can they not be improved, instead of re-inventing...

No, the in-kernel swsusp is

July 19, 2007 - 9:44am

No, the in-kernel swsusp is slow as hell, and uswsusp is better but still not as fast or functional as Suspend2/TuxOnIce.

But the big deal right now is the process freezer. It's used to stop all userland processes (and some kernel threads) so that there's no chance of anything in memory changing while the snapshot is being taken. The main problem everyone's focusing on is that it has problems with FUSE. Since a FUSE filesystem is served by a userspace process, it's possible to freeze that process before another one that is waiting on it. Since the first won't continue until thawed, the second will never freeze and hibernation will time out and abort.

Some people are convinced that a kexec-based approach will be simpler, since you can kexec a new kernel, thus freezing the whole system at once with no chance of deadlock by crap like FUSE. If it works out it should be a good bit more reliable and flexible than the current implementations, but I don't know if they'll be able to sort it all out or if the extra complexity will be worth the tradeoff.
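
The freezer ordering problem described in this comment can be modelled in a few lines. The sketch below is a deliberately simplified toy, not the kernel's freezer: the function `freeze_all`, the task names, and the `max_passes` timeout are all hypothetical. It assumes that a task blocked on an already-frozen task can never reach its freeze point, while a task blocked on a still-running task gets served and can then freeze.

```python
from typing import Dict, List, Optional


def freeze_all(order: List[str],
               waits_on: Dict[str, Optional[str]],
               max_passes: int = 3) -> bool:
    """Toy freezer: return True if every task froze, False on timeout.

    order:    the order in which the freezer visits tasks
    waits_on: task -> task it is blocked on (missing or None if not blocked)
    """
    frozen = set()
    for _ in range(max_passes):
        for task in order:
            if task in frozen:
                continue
            blocker = waits_on.get(task)
            if blocker is not None and blocker in frozen:
                continue  # stuck: the task it waits on will not run again
            frozen.add(task)
        if len(frozen) == len(order):
            return True
    return False  # gave up, like a hibernation attempt that times out
```

In this model, `freeze_all(["server", "client"], {"client": "server"})` fails because the server is frozen first, while visiting the client first succeeds. Freezing everything at once, as the kexec idea aims to, would remove the dependence on ordering.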
