On Thu, Feb 19, 2009 at 09:48:04PM +1030, Rusty Russell wrote:
Ok. There currently isn't an interface to access more than 32 bits
anyway.
I'm not connecting two guests directly. My eventual setup will have a
single x86 computer (the host) and many guest systems. I don't care if
the guests cannot communicate between each other, just that they can
communicate with the host.
I wanted to avoid the extra trip to userspace, so I just connected two
instances of virtio_net together. This way you just recv packets in the
kernel, rather than jumping to userspace and then using TAP/TUN to drive
packets back into the kernel. Plus, I have no idea how I would do a
userspace interface. I'd definitely need help.
I have no need to use virtio_blk, so I pretty much ignored it. In fact, I
didn't make any attempt to support RO and WO buffers in the same queue.
Virtio_net only uses queues this way, and it was much easier for me to
wrap my head around.
I don't think that virtio_console is symmetric either, but I haven't
really studied it. I was thinking about implementing a virtio_uart which
would be symmetric. That would be plenty for my needs.
It's not that virtio_net doesn't understand chained buffers, it just
doesn't write them. Grep for uses of the num_buffers field in
virtio_net. It uses them in recv, it just doesn't write them in xmit.
It assumes that add_buf() can accept something like:
idx address len flags next
0 XXXXXXX 12 N 1
1 XXXXXXX 8000 - 2
That would mean it can shove an 8000 byte packet into the virtqueue. It
doesn't have any way of knowing to split packets up into chunks, nor how
many chunks are available. It assumes that the receiver can read from
any address on the sender.
I think that this is a perfectly reasonable assumption in a shared
memory system, but it breaks down in my case. I cannot just tell the
host "the packet data is at this address" because it cannot do DMA. I
have to use the guest system to do DMA. The host has to have
pre-allocated the recv memory so the DMA engine has somewhere to copy
the data to.
Maybe I'm explaining this poorly, but try to think about it this way:
1) Unlike a virtual machine, both systems are NOT sharing memory
2) Both systems have some limited access to each other's memory
3) Both systems can write descriptors equally fast
4) Copying payload data is extremely slow for the host
5) Copying payload data is extremely fast for the guest
It would be possible to just alter virtio_net's headers in-flight to set
the number of buffers actually used. This would split the 8000 byte
packet up into two chunks, 4096 byte and 3904 byte, then set num_buffers
to 2. This would add some complexity, but I think it is probably
reasonable.
Ira
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html