It's low, but why introduce an inefficiency when you can avoid it for
the same effort?
"Measure before you optimize" is good advice for code, but not for
protocols. Protocols have to be robust against future changes. Virtio is
warty enough already; we can't keep doing local optimizations.
A bounce is a bounce.
Virtio is already way too bouncy due to the indirection between the
avail/used rings and the descriptor pool. A device with out-of-order
completion (like virtio-blk) will quickly randomize the free
descriptor indexes, so every descriptor fetch will require a bounce.
In contrast, if the rings hold the descriptors themselves instead of
pointers, we bounce (sizeof(descriptor)/cache_line_size) cache lines for
every descriptor, amortized.
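
To put rough numbers on that, here is a minimal sketch of the
accounting. It is a toy model, not virtio code: the helper names are
made up for illustration, the 64-byte cache line is an assumption, and
the 16-byte descriptor matches the current vring descriptor layout. It
counts a bounce whenever a descriptor fetch lands on a different line
than the previous fetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: 16-byte descriptors, 64-byte cache lines. */
enum { DESC_SIZE = 16, LINE_SIZE = 64 };

/*
 * Descriptors stored inline in the ring: n consecutive entries starting
 * on a line boundary touch ceil(n * DESC_SIZE / LINE_SIZE) lines.
 */
static size_t lines_inline(size_t n)
{
    return (n * DESC_SIZE + LINE_SIZE - 1) / LINE_SIZE;
}

/*
 * Ring of 16-bit indexes into a separate descriptor pool: count the
 * lines of the index ring itself, plus one line for every descriptor
 * fetch that lands on a different line than the previous fetch.
 */
static size_t lines_indirect(const uint16_t *idx, size_t n)
{
    size_t ring_lines = (n * sizeof(uint16_t) + LINE_SIZE - 1) / LINE_SIZE;
    size_t desc_lines = 0;
    size_t prev = (size_t)-1;   /* sentinel: no line fetched yet */

    assert(idx != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        size_t line = (size_t)idx[i] * DESC_SIZE / LINE_SIZE;

        if (line != prev)
            desc_lines++;
        prev = line;
    }
    return ring_lines + desc_lines;
}
```

With eight requests, lines_inline(8) is 2. The index ring costs 3 lines
while the indexes are still sequential (0..7), and 9 once they are
scattered one per line (0, 4, 8, ... 28), which is the state
out-of-order completion pushes the pool toward.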
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.