True. My remarks apply only to frontswap-to-hypervisor; for internally
consumed frontswap the situation is different.
So it seems a bare-metal hypervisor has less access to the bare metal
than a non-bare-metal hypervisor?
Seriously, leave the bare-metal FUD to Simon. People on this list know
that kvm and Xen have exactly the same access to the hardware (well,
actually Xen needs to use privileged guests to access some of its hardware).
There's still an exit. It's much faster than a vmx/svm vmexit but still
nontrivial.
But why are we optimizing for 5-year-old hardware?
It's determined by the hypervisor, same as with tmem. The guest swaps
to a virtual disk, the hypervisor places the data in RAM if it's
available, or on disk if it isn't. Write-back caching in all its glory.
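To make the placement rule concrete, here is a toy sketch. It assumes a fixed number of free RAM slots on the host; the class and method names are made up for illustration and are not any hypervisor's interface. It only models where a guest swap write lands, not the later flushing a real write-back cache would do.

```python
class ToyHostSwapBackend:
    """Hypothetical model of a host backing a guest's virtual swap disk."""

    def __init__(self, ram_slots):
        self.ram_slots = ram_slots  # assumed: free host pages available for caching
        self.ram = {}               # sector -> data held in host RAM
        self.disk = {}              # sector -> data written to host disk

    def write(self, sector, data):
        # Drop any stale copy of this sector before placing the new one.
        self.ram.pop(sector, None)
        self.disk.pop(sector, None)
        if len(self.ram) < self.ram_slots:
            self.ram[sector] = data   # RAM is available: keep it in memory
        else:
            self.disk[sector] = data  # no RAM left: fall back to disk

    def read(self, sector):
        # The guest cannot tell which tier served the read.
        if sector in self.ram:
            return self.ram[sector]
        return self.disk[sector]

    def where(self, sector):
        if sector in self.ram:
            return "ram"
        if sector in self.disk:
            return "disk"
        return None


backend = ToyHostSwapBackend(ram_slots=2)
for sector, data in enumerate([b"a", b"b", b"c"]):
    backend.write(sector, data)
print([backend.where(s) for s in range(3)])
```

The guest issues the same block writes either way; the decision stays entirely on the host side.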
You can have multiple swap devices.
Wrt SR-IOV, you'll see synchronous frontswap reduce throughput. SR-IOV
will swap with <1 exit/page and DMA guest pages, while frontswap/tmem
will carry a 1 exit/page hit (even if no swap actually happens) and the
copy cost (if it does).
The API really, really wants to be asynchronous.
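A back-of-the-envelope model of the exit counts, as a sketch only. It assumes an asynchronous interface can queue requests and notify the hypervisor once per batch; the batch size and the function names are hypothetical, and this is arithmetic rather than a measurement.

```python
import math


def sync_exits(pages):
    """Synchronous put: one guest-to-hypervisor transition per page."""
    return pages


def async_exits(pages, batch):
    """Asynchronous put: pages are queued and the hypervisor is notified
    once per batch (the batch size is an assumed parameter)."""
    return math.ceil(pages / batch)


pages = 1024
for batch in (1, 8, 32):
    print(batch, sync_exits(pages), async_exits(pages, batch))
```

With a batch size of 1 the two are identical; any larger batch drops the asynchronous path below one exit per page, which a call that must complete before returning cannot do.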
In-kernel compressed swap does seem to be a good match for a synchronous
API. For future memory devices, or even bare-metal buzzword-compliant
hypervisors, I disagree. An asynchronous API is required for
efficiency, and they'll all have swap capability sooner or later (kvm,
vmware, and I believe xen 4 already do).
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.