"Since many alternative approaches to hibernation are now being considered and discussed," Rafael Wysocki began on the lkml, "I thought it might be a good idea to list some things that in my not so humble opinion should be taken care of by any hibernation framework. They are listed below, not in any particular order, because I think they all are important." He continued with the following list, including a paragraph discussing each: "filesystems mounted before the hibernation are untouchable; swap space in use before the hibernation must be handled with care; there are memory regions that must not be saved or restored; the user should be able to limit the size of a hibernation image; hibernation should be transparent from the applications' point of view; state of devices from before hibernation should be restored, if possible; on ACPI systems special platform-related actions have to be carried out at the right points, so that the platform works correctly after the restore; hibernation and restore should not be too slow; hibernation framework should not be too difficult to set up." Rafael went on to summarize:
"In my opinion any hibernation framework that doesn't take the above requirements into account in any way will be a failure. Moreover, the existing frameworks fail to follow some of them too, so I consider all of these frameworks as a work in progress. For this reason, I will much more appreciate ideas allowing us to improve the existing frameworks in a more or less evolutionary way, then attempts to replace them all with something entirely new."
From: Rafael J. Wysocki [email blocked]
To: LKML [email blocked]
Subject: Hibernation considerations
Date: Sun, 15 Jul 2007 14:33:32 +0200
Hi,
Since many alternative approaches to hibernation are now being considered and
discussed, I thought it might be a good idea to list some things that in my not
so humble opinion should be taken care of by any hibernation framework. They
are listed below, not in any particular order, because I think they all are
important. Still, I might have forgotten something, so everyone with
experience in implementing hibernation, especially Pavel and Nigel, please
check if the list is complete.
(1) Filesystems mounted before the hibernation are untouchable
When there's a memory snapshot, either in the form of a hibernation image,
or in the form of the "old" kernel and processes available to the "new"
kexeced kernel responsible for saving their memory, the filesystems mounted
before the hibernation should not be accessed, even for reading, because
that would cause their on-disk state to be inconsistent with the snapshot
and might lead to a filesystem corruption.
(2) Swap space in use before the hibernation must be handled with care
If swap space is used for saving the memory snapshot, the snapshot-saving
application (or kernel) must be careful enough not to overwrite swap pages
that contain valid memory contents stored in there before the hibernation.
(3) There are memory regions that must not be saved or restored
Some memory regions contain data that shouldn't be overwritten during the
restore, because that might lead to the system not working correctly
afterwards. Also, on some systems there are valid 'struct page'
structures that in fact correspond to memory holes and we should not attempt
to save those pages.
(4) The user should be able to limit the size of a hibernation image
There are a couple of reasons for that. For example, the storage space
used for saving the image may be smaller than the entire RAM or the user
may want the image to be saved more quickly.
(5) Hibernation should be transparent from the applications' point of view
Generally, applications should not notice that hibernation took place.
[Note that I don't regard all processes as applications and I think that
there may be processes which need to handle the hibernation in a special
way.] Ideally, for example, if some audio is being played when a
hibernation starts, the audio player should be able to continue playing the
same audio after the restore from the point in which it has been
interrupted by the hibernation. Also, the CPU affinities and similar
settings requested by the applications before a hibernation should be
binding after the restore.
(6) State of devices from before hibernation should be restored, if possible
If possible, during a restore devices should be brought back to the same
state in which they were before the corresponding hibernation. Of course
in some situations it might be impossible to do that (eg. the user
connected the hibernated system to a different IP subnet and then
restored), but as a general rule, we should do our best to restore the
state of devices, which is directly related to point (5) above.
(7) On ACPI systems special platform-related actions have to be carried out at
the right points, so that the platform works correctly after the restore
The ACPI specification requires us to invoke some global ACPI methods
during the hibernation and during the restore. Moreover, the ordering of
code related to these ACPI methods may not be arbitrary (eg. some of
them have to be executed after devices are put into low power states etc.).
(8) Hibernation and restore should not be too slow
In my opinion, if more than one minute is needed to hibernate the system
with the help of a certain hibernation framework, then this framework is not
very useful in practice. It might be useful to perform some special tasks
(eg. moving a server to another place without taking it down), but it is
not very useful, for example, to notebook users.
(9) Hibernation framework should not be too difficult to set up
It follows from my experience that if the users are required to do too much
work to set up a hibernation framework, they will not use it as long as
there are simpler alternatives (some of them will not use hibernation at
all if it's too difficult to get to work). On the other hand, if the users
are provided with a working hibernation framework by their distribution
and they find it useful, they are not likely to use kernel.org kernels if
it's too difficult to replace the distribution kernel with a generic one due
to the hibernation framework's requirements.
All of the existing hibernation frameworks have been written with the above
points in mind and that's why they are what they are. In particular, the
existence of the tasks freezer, hated by some people to the point of insanity,
follows directly from points (1), (4) and (5).
In my opinion any hibernation framework that doesn't take the above
requirements into account in any way will be a failure. Moreover, the existing
frameworks fail to follow some of them too, so I consider all of these
frameworks as a work in progress. For this reason, I will much more appreciate
ideas allowing us to improve the existing frameworks in a more or less
evolutionary way, than attempts to replace them all with something entirely
new.
Greetings,
Rafael
--
"Premature optimization is the root of all evil." - Donald Knuth
Nope
(2) is not a valid requirement. There are no real reasons to write the hibernate image to swapspace, and several good reasons not to do so.
The big advantage of swap
The big advantage of swap space is that it is there, backs up memory state anyway, and that it can be unmounted (or treated specially) quite easily. Storing the image on a mounted regular file system is very tricky.
These problems can be avoided by using a separate hibernate partition, but then you waste space.
Sure there is, the main one
Sure there is, the main one being that writing to a swap partition is a lot safer than a file since it requires no filesystem access. And filesystem access is going to be a fairly large problem if the kexec hibernation implementation takes off.
read (2) more carefully
Requirement (2) doesn't say that you have to use swap to store the image, it says that if you do use swap, then you should be very careful about what you do and not leave swap corrupt, inconsistent or unusable in any way.
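To make the "be careful with swap" point concrete, here is a minimal toy sketch in Python. It is not how swsusp or any real framework allocates swap; the function name `pick_image_slots` and the idea of modelling swap as numbered slots with a set of in-use indices are assumptions made purely for illustration. The only point it shows is that image pages go exclusively into slots that held no valid data before the hibernation.

```python
def pick_image_slots(in_use, total_slots, pages_needed):
    """Choose swap slots for the hibernation image (toy model).

    in_use       -- set of slot indices holding swapped-out pages that
                    were valid before the hibernation (must be preserved)
    total_slots  -- number of slots in the hypothetical swap area
    pages_needed -- number of image pages to store
    """
    # Only slots that hold no valid pre-hibernation data are candidates.
    free = [slot for slot in range(total_slots) if slot not in in_use]
    if len(free) < pages_needed:
        # Refuse rather than overwrite valid swap contents.
        raise RuntimeError("not enough free swap for the image")
    return free[:pages_needed]


if __name__ == "__main__":
    print(pick_image_slots({0, 1, 4}, total_slots=8, pages_needed=3))
```

If there is not enough free swap, the sketch aborts instead of overwriting anything, which is the behaviour requirement (2) asks for.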
(2) swap space
It was mentioned that IF the swapspace is used to store the memory image - it doesn't say to use the swap for that purpose. In fact I'd say anyone who uses swap space to store the memory image before hibernating doesn't know what they are doing. Among other things, there is no guarantee that the user set up a swap space.
Personally I'd expect 'sleep' to leave everything in RAM as is and the computer goes into a low-power state where some energy is expended on maintaining RAM. It is only necessary to save used memory pages (including pages swapped out) if you plan to completely shut down the system but want a fast restart. I guess some people would want an even lower-power state than a simple 'sleep' in which case they would write out all used memory pages - but writing out memory as a matter of course is unbelievably stupid in my opinion because it takes time and really doesn't drop power consumption that much - the only advantage is that if power were to drop below the critical point you don't suddenly lose everything, but for me that just means that you need a second level to the suspend which writes out memory and shuts down when your battery level drops below a given setting. Forget the "suspend to RAM" vs "suspend to disk" wars - that is plain stupid. It should always be suspend to RAM with a write-to-disk+poweroff on demand or when the battery runs low, but it must always pass through the 'suspended to RAM' state.
Device drivers are the trickiest and I don't believe they will support proper power management until there is a single suspend API to use. Just one example of a problem: you have an ethernet device which needs firmware loaded before operating - if you completely power down that subsystem then you need to reload the firmware before the driver attempts to do anything more.
I very much disagree about
I very much disagree about suspend-to-disk being worthless. Most of the time, I know my laptop is going to be off for a while. The battery WILL die before I turn it back on. I'd rather just turn it off, but have it suspended. Please don't take away that functionality.
Swapspace is a requirement
Swapspace is a requirement for swsusp and uswsusp, but Suspend2/TuxOnIce can use regular files if you want. And it's been common knowledge that if you want to hibernate on Linux you need swapspace for that, so if you set up a box without swap and want to hibernate, it's your fault.
Yes, we have that, it's called suspend-to-ram.
It doesn't happen in the suspend-to-ram case so it's not stupid at all.
We have that, it's called suspend-to-both.
There is no war and limiting people to your "one true way" is what's plain stupid. I hibernate my laptop because I don't want it running hot in the case after I've suspended it. And having it go through the suspend-to-ram paths just to ignore what they did and hibernate is equally stupid.
But the point is moot since Linus has decreed that suspend and hibernate are separate operations, so work will continue to separate the code paths and the option of one, the other or both will always be there.
There is currently a single API and most drivers still get it wrong or don't try at all, with firmware loading being one of the most problematic of the current issues so even once the operations are separate things won't be any worse off than they are now.
complex !
This feature seems very complex, with many tricky parts like locked pages or security-related information (crypto keys...).
Dumping the kernel memory correctly looks hard. Why not split the process information from the driver information? Starting and stopping a process with a dump to a file looks simple. But couldn't the kernel simply be "stopped" and then every driver restarted, instead of reloading memory?
program hibernation?
Would it make sense to let programs sleep and just kill off the kernel/driver part?
Another operating system
Another popular operating system has hibernation and has had so for quite a while.
It writes the image to a file on the hard disk, and it's very easy to use and it works well.
Re: Another operating system
It's all hard disk. You mean a file on a mounted filesystem..
And the key there is that it pre-allocates the disk space (which I think needs to be contiguous), and the hibernation and restore code bypasses the filesystem layer and directly writes to the blocks in question.
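The "pre-allocate, then write straight to the blocks" idea can be sketched as a lookup from image page number to disk block. The following Python is only an illustration under stated assumptions: the extent list, the one-page-per-block simplification, and the names `build_offsets` and `page_to_block` are all hypothetical, and nothing here describes how that other operating system or any Linux framework actually does it. The sketch also allows several extents, whereas the comment above suggests the real file may need to be contiguous (which would be the single-extent case).

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(extents):
    """Cumulative page offsets for a list of (start_block, length) extents."""
    return [0] + list(accumulate(length for _, length in extents))


def page_to_block(extents, offsets, page):
    """Map an image page index to a disk block without any filesystem call."""
    if not 0 <= page < offsets[-1]:
        raise IndexError("page outside the pre-allocated area")
    # Find the extent whose page range contains `page`.
    i = bisect_right(offsets, page) - 1
    start_block, _ = extents[i]
    return start_block + (page - offsets[i])


if __name__ == "__main__":
    extents = [(1000, 4), (5000, 2)]  # hypothetical layout recorded in advance
    offsets = build_offsets(extents)
    print([page_to_block(extents, offsets, p) for p in range(6)])
```

Because the layout is recorded before the snapshot is taken, the saving and restoring code only needs this table and raw block I/O, so the mounted filesystem is never touched.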
This is an approach that could be used for Linux, too..
Works well is subjective,
Works well is subjective, back when I used that other OS on my notebook just having FF open would cause the hibernation to fail.
Possibly in userspace
Another important feature is that the system should work entirely in user space, if possible (like uswsusp).
Excuse me, but why should I
Excuse me, but why should I care whether the code runs in kernel space or not, as long as it works?
Sounds like religious bullshit to me.
That's a retarded
That's a retarded requirement and even Pavel Machek, the kernel dev who keeps screaming the same thing, can't come up with a good reason why other than "If it doesn't have to be in the kernel it shouldn't be." which in general isn't a bad rule but it shouldn't be a hard requirement. uswsusp still can't do everything that suspend2/tuxonice does and according to Pavel development would be a lot faster since so much could be done in userspace.
uswsusp is a long way from
uswsusp is a long way from being *entirely* in user space ;-)
If you don't mind me asking
Who is Rafael Wysocki to dictate hibernation requirements?
One of the few people
One of the few people actually willing to do the work on the hibernation implementation. He's been working on the core PM support and uswsusp for a while now.
And anyway, most of his requirements are just that. They're either required by the hardware, kernel assumptions or features of the current implementations that losing would be considered a regression. The list looks reasonable to me.
'Since many alternative
'Since many alternative approaches to hibernation are now being considered and discussed,'
Why are new approaches being discussed? Aren't the current software suspend mechanisms good enough? If not, can they not be improved, instead of re-inventing...
No, the in-kernel swsusp is
No, the in-kernel swsusp is slow as hell and uswsusp is better but it's still not as fast or functional as Suspend2/TuxOnIce.
But the big deal right now is the process freezer; it's used to stop all userland processes (and some kernel threads) so that there's no chance of anything in memory changing while the snapshot is being taken. The main problem everyone's focusing on is that it has problems with FUSE. Since a FUSE filesystem is served by a user space process, it's possible to freeze that process before another one that is waiting on it. Since the first won't continue until thawed, the second will never freeze and hibernation will time out and abort.
Some people are convinced that a kexec-based approach will be simpler, since you can kexec a new kernel, thus freezing the whole system at once with no chance of deadlock by crap like FUSE. If it works out it should be a good bit more reliable and flexible than the current implementations, but I don't know if they'll be able to sort it all out or if the extra complexity will be worth the tradeoff.
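The ordering problem described above can be shown with a deliberately simplified model in Python. This is a toy, not the kernel's freezer: the function `try_freeze`, the task names, and the rule "a task blocked on an already frozen task cannot reach its freeze point" are assumptions chosen to mirror the description, and the fixed number of passes stands in for the real timeout.

```python
def try_freeze(order, waits_on, max_passes=3):
    """Toy freezer: try to freeze every task, visiting them in `order`.

    waits_on maps a task to the task it is blocked on (if any). A task
    blocked on an already frozen task can never reach its freeze point.
    Returns True if all tasks froze, False if the attempt "timed out".
    """
    frozen = set()
    for _ in range(max_passes):
        for task in order:
            if task in frozen:
                continue
            if waits_on.get(task) in frozen:
                continue  # stuck: the task it waits on will not answer
            frozen.add(task)
        if len(frozen) == len(order):
            return True
    return False


if __name__ == "__main__":
    deps = {"app": "fuse_daemon"}  # app waits for a reply from the daemon
    print(try_freeze(["fuse_daemon", "app"], deps))  # daemon frozen first
    print(try_freeze(["app", "fuse_daemon"], deps))  # app frozen first
```

In this model the outcome depends only on the order in which tasks are frozen, which is why freezing everything at once, as the kexec proposals aim to do, sidesteps the problem.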