Virtualization is about not doing that. Sometimes it's necessary (when
you have made unfixable design mistakes), but doing it just to replace a
bus, with no advantage to the guest that has to be changed, is not (other
hypervisors or hypervisorless deployment scenarios aren't advantages to
that guest).
Well, Xen requires pre-translation (since the guest has to give the host
(which is just another guest) permissions to access the data). So
neither is a superset of the other, they're just different.
It doesn't really matter since Xen is unlikely to adopt virtio.
You can simply use the same vector for both rx and tx and poll both at
every interrupt.
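A minimal sketch of the shared-vector idea, as a toy model rather than
driver code. The Queue class and shared_irq_handler are hypothetical names
made up for illustration; a real driver would do this in its interrupt
handler or NAPI poll routine.

```python
from collections import deque


class Queue:
    """Toy stand-in for an rx or tx ring (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()

    def poll(self):
        """Drain and return everything currently pending on this queue."""
        done = list(self.pending)
        self.pending.clear()
        return done


def shared_irq_handler(rx, tx):
    # One vector serves both queues, so the handler cannot tell which
    # queue raised the interrupt. It simply polls both every time.
    return {"rx": rx.poll(), "tx": tx.poll()}


rx = Queue("rx")
tx = Queue("tx")
rx.pending.extend(["pkt0", "pkt1"])   # two received packets
tx.pending.append("txdone0")          # one tx completion

result = shared_irq_handler(rx, tx)
print(result)
```

The cost is one wasted poll whenever only one direction has work; the
benefit is a single vector per device.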
(irq window exits should only be required on a small percentage of
interrupt injections, since the guest will try to disable interrupts for
short periods only)
Can you please stop comparing userspace-based virtio hosts to
kernel-based venet hosts? We know the userspace implementation sucks.
Requiring all three exits means the guest is spending most of its time
with interrupts disabled; that's unlikely.
Thanks for the numbers. Are those 11% attributable to rx/tx
piggybacking from the same interface?
Also, 170K interrupts -> 17K interrupts/sec -> 55kbit/interrupt ->
6.8kB/interrupt. Ignoring interrupt merging and assuming equal rx/tx
distribution, that's about 13kB/interrupt. Seems rather low for a
saturated link.
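The back-of-envelope arithmetic above can be reproduced as follows. This
is only a sketch: the 10-second interval is an assumption inferred from
"170K interrupts -> 17K interrupts/sec", and the variable names are mine.

```python
# Figures quoted above; the interval is inferred, not stated.
total_interrupts = 170_000
interval_sec = 10             # assumption: 170K / 17K per second
kbit_per_interrupt = 55       # as quoted

interrupts_per_sec = total_interrupts / interval_sec
kbytes_per_interrupt = kbit_per_interrupt / 8

# Throughput implied by the two quoted figures taken together.
implied_mbit_per_sec = interrupts_per_sec * kbit_per_interrupt / 1000

# If rx and tx each account for half of the interrupts, the direction
# carrying the bulk data gets half as many interrupts, so each one
# covers twice the payload.
kbytes_per_data_interrupt = kbytes_per_interrupt * 2

print(f"{interrupts_per_sec:.0f} interrupts/sec")
print(f"{kbytes_per_interrupt:.3f} kB/interrupt")
print(f"{implied_mbit_per_sec:.0f} Mbit/s implied")
print(f"{kbytes_per_data_interrupt:.2f} kB/interrupt with equal rx/tx split")
```

The exact values are a little above the rounded figures in the text, and
the implied throughput is consistent with a saturated gigabit link.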
With standard PCI, they do not. But all modern host adapters support
MSI and they will happily give you one interrupt per queue.
Look at the vmxnet3 submission (recently posted on virtualization@).
It's a perfectly ordinary PCI NIC driver, apart from having so many 'V's
in the code. 16 rx queues, 8 tx queues, 25 MSIs, BARs for the
registers. So while the industry as a whole might disagree with me, it
seems VMware does not.
Let's do that then. Please reserve the corresponding comparisons from
your side as well.
What are scheduler coordination and non-802.x fabrics?
(avoiding infinite loop)
I think Ira said he can make vhost work?
virtio-net over pci is deployed. Replacing the backend with vhost-net
will require no guest modifications. Replacing the frontend with venet
or virt-net/vbus-pci will require guest modifications.
Obviously virtio-net isn't deployed in non-virt. But if we adopt vbus,
we have to migrate guests.
But we have to implement vbus for each guest we want to support. That
includes Windows and older Linux which has a different internal API, so
we have to port the code multiple times, to get existing functionality.
virtio-net doesn't use any pv layer.
virtio-net doesn't modify the PCI model. And if you look at vmxnet3,
they mention that it conforms to something called UPT, which allows
hardware vendors to implement parts of their NIC model. So vmxnet3 is
apparently suitable to both hardware and software implementations.
You can have dynamic MSI/queue routing with virtio, and each MSI can be
routed to a vcpu at will.
Do you mean interrupt priority? Well, apic allows interrupt priorities
and Windows uses them; Linux doesn't. I don't see a reason to provide
more than native hardware.
N:1 breaks down on large guests since one vcpu will have to process all
events. You could do N:M, with commands to change routings, but where's
your userspace interface? You can't tell from /proc/interrupts which
vbus interrupts are active, and irqbalance can't steer them towards less
busy cpus since they're invisible to the interrupt controller.
The larger your installed base, the more difficult it is. Of course
it's doable, but I prefer not doing it and instead improving things in a
binary backwards compatible manner. If there is no choice we will bow
to the inevitable and make our users upgrade. But at this point there
is a choice, and I prefer to stick with vhost-net until it is proven
that it won't work.
One of the benefits of virtualization is that the guest model is
stable. You can live-migrate guests and upgrade the hardware
underneath. You can have a single guest image that you clone to
provision new guests. If you switch to a new model, you give up those
benefits, or you support both models indefinitely.
Note even hardware nowadays is binary compatible. One e1000 driver
supports a ton of different cards, and I think (not sure) newer cards
will work with older drivers, just without all their features.
For a new install, sure. I'm talking about existing deployments (and
those that will exist by the time vbus is ready for roll out).
virtio was certainly not pain free, needing Windows drivers, updates to
management tools (you can't enable it by default, so you have to offer
it as a choice), mkinitrd, etc. I'd rather not have to go through that
again.
No, you have to update the driver in your initrd (for Linux) or properly
install the new driver (for Windows). It's especially difficult for
Windows.
I don't want to support both virtio and vbus in parallel. There's
enough work already. If we adopt vbus, we'll have to deprecate and
eventually kill off virtio.
PCI is continuously updated, with MSI, MSI-X, and IOMMU support being
some recent updates. I'd like to ride on top of that instead of having
to clone it for every guest I support.
Right, it means you can hand off those eventfds to other qemus or other
pure userspace servers. It's more flexible.
No kvm feature will ever be exposed to a guest without userspace
intervention. It's a basic requirement. If it causes complexity (and
it does) we have to live with it.
Ah, you have a Windows venet driver?
It's the compare venet-in-kernel to virtio-in-userspace thing again.
Let's defer that until mst completes vhost-net mergeable buffers, at which
time we can compare vhost-net to venet and see how much vbus contributes
to performance and how much of it comes from being in-kernel.
Since this is getting confusing to me, I'll start from scratch looking
at the vbus layers, top to bottom:
Guest side:
1. venet guest kernel driver - AFAICT, duplicates the virtio-net guest
driver functionality
2. vbus guest driver (config and hotplug) - duplicates pci, or if you
need non-pci support, virtio config and its pci bindings; needs
reimplementation for all supported guests
3. vbus guest driver (interrupt coalescing, priority) - if needed,
should be implemented as an irqchip (and be totally orthogonal to the
driver); needs reimplementation for all supported guests
4. vbus guest driver (shm/ioq) - finer-grained layering than virtio
(which only supports the combination, due to the need for Xen support);
can be retrofitted to virtio at some cost
Host side:
1. venet host kernel driver - is duplicated by vhost-net; doesn't
support live migration, unprivileged users, or slirp
2. vbus host driver (config and hotplug) - duplicates pci support in
userspace (which will need to be kept in any case); already has two
userspace interfaces
3. vbus host driver (interrupt coalescing, priority) - if we think we
need it (and I don't), should be part of kvm core, not a bus
4. vbus host driver (shm) - partially duplicated by vhost memory slots
5. vbus host driver (ioq) - duplicates userspace virtio, duplicated by vhost
To me, compatible means I can live migrate an image to a new system
without the user knowing about the change. You'll be able to do that
with vhost-net.
You'll probably need to change that as you start running smp guests.
Please post your issues. I see ioeventfd/irqfd as critical kvm interfaces.
I'm missing something. Where's the pv layer for virtio-net?
Linux drivers have an abstraction layer to deal with non-pci. But the
Windows drivers are ordinary pci drivers with nothing that looks
pv-ish. You could implement virtio-net hardware if you wanted to.
(and selinux label)
It always begins with a 119-line patch and then grows, that's life.
For virt uses, I don't see the need. For non-virt, I have no opinion.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.