Hi.
On Wed, 2007-04-25 at 20:03 -0700, Linus Torvalds wrote:
>
> On Thu, 26 Apr 2007, Nigel Cunningham wrote:
> >
> > Sorry. I wasn't clear. I wasn't saying that suspend to ram has a
> > snapshot point. I was trying to say it has a point where you're seeking
> > to save information (PCI state / SCSI transaction number or whatever)
> > that you'll need to get the hardware into the same state at a later
> > stage. That (saving information) is the point of similarity.
>
> Yes, they do both save information, but I'm not actually convinced they
> would necessarily even save the *same* information.
>
> Let's just take an example of USB, and to make things more interesting,
> say that the disk you want to suspend to is itself over USB (not
> necessarily something you _want_ to do, but I think we can all agree that
> it's something that should potentially work, no?)
Agreed - it would be nice.
> Now, USB devices actually have per-connection state (at a minimum, the
> "toggle" bit or whatever), and that's obviously something that will
> inevitably *change* as a result of the device being used after
> snapshotting (and even if not used, by the rediscovery by the first kernel
> to boot), and we fundamentally cannot put the final toggle state in the
> snapshot.
>
> So in the snapshot-to-disk scenario, there are some pieces of data that
> simply fundamentally *cannot* be snapshotted, because they are not
> controller state, they are "connection" state.
>
> So in that case, you basically know that you *have* to rebuild the
> connection when you do the "snapshot_resume()" thing. So there's no point
> in even keeping these kinds of connection states (the same is true of
> keyboards, mice, anything else - it's how USB works).
Sort of agree - you might want to record some serial number that might
let you recognise it as the same thing at resume time when everything is
re-hotplugged (assuming it's even there then). Nevertheless, I don't
think that diminishes what you're saying.
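To make the controller-state vs. connection-state split concrete, here is a
rough userspace sketch. Everything in it (struct fake_usb_dev, struct
fake_snapshot, the fake_* functions) is made up for illustration and is not a
kernel or USB-core interface. The assumption is that a snapshot keeps only a
serial number plus restorable controller state, and that restore rebuilds the
connection from scratch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a USB-like device; not a real kernel structure. */
struct fake_usb_dev {
	char serial[16];	/* stable identity, survives re-plugging */
	unsigned config;	/* controller-side state, can be restored */
	unsigned toggle;	/* per-connection state, flips on every transfer */
	int connected;		/* is the link currently established? */
};

/* What goes into the image: deliberately no toggle bit. */
struct fake_snapshot {
	char serial[16];
	unsigned config;
};

static void fake_transfer(struct fake_usb_dev *d)
{
	assert(d->connected);	/* transfers need an established link */
	d->toggle ^= 1;
}

static void fake_snapshot_save(const struct fake_usb_dev *d,
			       struct fake_snapshot *s)
{
	memcpy(s->serial, d->serial, sizeof(s->serial));
	s->config = d->config;
}

/* Returns 0 on success, -1 if a different device is plugged in now. */
static int fake_snapshot_restore(struct fake_usb_dev *d,
				 const struct fake_snapshot *s)
{
	if (strncmp(d->serial, s->serial, sizeof(s->serial)) != 0)
		return -1;
	d->config = s->config;
	d->toggle = 0;		/* connection is rebuilt, not restored */
	d->connected = 1;
	return 0;
}

/*
 * Suspend to RAM: memory stays powered, so in this model the link and its
 * toggle bit are simply left alone.
 */
static void fake_s2ram_suspend(struct fake_usb_dev *d) { (void)d; }
static void fake_s2ram_resume(struct fake_usb_dev *d) { (void)d; }
```

In this toy model a transfer made after fake_snapshot_save() (for instance
while writing the image) changes the toggle bit, which is why the restore path
resets it rather than trusting a saved value, while the s2ram pair leaves it
untouched.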
> In contrast, in suspend-to-RAM, USB connections might just be things you
> actually want to keep open and active, and you *can* do so, in ways you
> simply cannot do with "snapshot to disk". In fact, if you are something
> like an OLPC and actually go to s2ram very aggressively, you might well
> want to keep the connection established, because it's conceivable that you
> might otherwise lose keypresses etc issues)
>
> See? There are real *technical* reasons to believe that the two "save
> state" operations are really fundamentally different. There are reasons to
> believe that a s2ram can actually happen while keeping some connections
> open that cannot be kept open over a disk snapshot.
>
> Do they *have* to be different? Of course not. For many devices the "save"
> and "freeze" operations will likely all be no-ops, and there would be
> absolutely no difference between suspending and snapshotting, because the
> driver state already natively contains all the information needed to get
> the device going again.
>
> Equally, I don't doubt that in many drivers you'll have very similar "save
> state" logic, but in fact I believe that in many cases that "save state"
> logic will often just be a simple
>
> 	pci_save_state(dev);
>
> call, so it's literally the case that they will not be just shared between
> the "suspend" and "snapshot" case, they'll be shared across all simple PCI
> devices too!
>
> But that doesn't mean that the functions to do so should be the same. You
> might have
>
> 	static int mypcidevice_suspend(struct pci_dev *dev)
> 	{
> 		pci_save_state(dev);
> 		pci_set_power_state(dev, PCI_D3);
> 		return 0;
> 	}
>
> 	static int mypcidevice_snapshot(struct pci_dev *dev)
> 	{
> 		pci_save_state(dev);
> 		return 0;
> 	}
>
> and who cares if they both have that same call to a shared "save state"
> function? They're still totally different operations, and the fact that
> *some* devices may save the same things doesn't make them any more
> similar! See above why some devices might save totally *different* things
> for a "snapshot" vs a "suspend" event.
No disagreement here.
> > I suppose that's another point of similarity - for snapshotting, the
> > same ordering is probably needed?
>
> I agree that you're likely to walk the device list in the same order. The
> whole "shut down leaf devices first", "start up root devices first" is
> pretty fundamental.
>
> But that's true of reboot and device discovery too. Should that ordering
> mean that we should use the "discovery()" function and pass it a flag and
> say "you shouldn't discover, you should snapshot or suspend now"? No.
> Everybody agrees that device discovery is something different from device
> suspend. The fact that it's done in a topological order and thus they bear
> some kind of inverse relationship to each other doesn't make them "the
> same".
>
> > > And yes, the _individual_ "save-and-suspend" events obviously needs to be
> > > "atomic", but it's purely about that particular individual device, so
> > > there's never any cross-device issues about that.
> >
> > No interdependencies? I'm not sure.
>
> Well, we pretty much count on it, since we will *suspend* the devices at
> the same time. So if they had interdependencies that aren't described by
> the ordering we enforce, they are pretty much screwed anyway ;)
>
> So yes, the device list needs to be topologically sorted (and you need to
> walk it in the right direction), but apart from that we'd *better* not
> have any interdependencies, or we simply cannot suspend at all.
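For what it's worth, the ordering rule can be sketched in a few lines of
userspace C. This is only an illustration under my own assumptions: struct
fake_dev, NO_PARENT and the three functions are hypothetical names, and the
device list is assumed to be stored with every parent ahead of its children.
Suspend then walks the list backwards (leaves first) and resume walks it
forwards (roots first):

```c
#include <assert.h>
#include <stddef.h>

#define NO_PARENT (-1)

/* Hypothetical device entry; not the kernel's struct device. */
struct fake_dev {
	const char *name;
	int parent;	/* index of the parent in the list, or NO_PARENT */
	int suspended;
};

/* The list is usable only if every parent appears before its children. */
static int list_is_topological(const struct fake_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].parent != NO_PARENT &&
		    (size_t)devs[i].parent >= i)
			return 0;
	return 1;
}

/* Leaf devices first: walk backwards, recording the visit order. */
static void suspend_all(struct fake_dev *devs, size_t n, size_t *order)
{
	size_t k = 0;

	assert(list_is_topological(devs, n));
	for (size_t i = n; i-- > 0; ) {
		devs[i].suspended = 1;
		order[k++] = i;
	}
}

/* Root devices first: walk forwards, recording the visit order. */
static void resume_all(struct fake_dev *devs, size_t n, size_t *order)
{
	size_t k = 0;

	assert(list_is_topological(devs, n));
	for (size_t i = 0; i < n; i++) {
		/* a parent is always awake before its child is resumed */
		assert(devs[i].parent == NO_PARENT ||
		       !devs[devs[i].parent].suspended);
		devs[i].suspended = 0;
		order[k++] = i;
	}
}
```

Any dependency that isn't expressed through the parent links is invisible to
these walks, which is the "pretty much screwed anyway" case above.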
Thanks for your reply.
Nigel