Re: [PATCH v2] Add Mergeable RX buffer feature to vhost_net

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Michael S. Tsirkin
Date: Thursday, April 1, 2010 - 3:54 am

On Wed, Mar 31, 2010 at 03:04:43PM -0700, David Stevens wrote:

Yes but please note that in practice if this is what we do,
then vhost header size is 0 and then there is no actual copy.


Yes. However, we also have an ioctl that sets vhost header size, and it
allows supporting simple backends which do not support an (equivalent
of) TUNSETVNETHDRSZ. We have two of these now: packet and macvtap.
So I thought that unless there are gains in code simplicity and/or
performance we can keep supporting these just as well.


Guest could be buggy so we'll get EFAULT.
If skb is taken off the rx queue (as below), we might get EAGAIN.


In practice, when the first iovec includes the header what
we will copy is iov[0].iov_len + iov[0].iov_base.

We also still have the problem that sendmsg might modify the iovec,
(for tap it doesn't) so using iov_len after sendmsg isn't legal,
you need to copy it anyway.


Seems more code, doesn't it? Do you really believe this is a performance gain?
It looks a bit like prepature optimization ...


I can dig up the demo, but there's also the macvtap backend. Try using that.


I just don't want to require that backends support GSO/vnet header,
so vhost needs an ability to skip the header.


Note that vhost doesn't really *know* about header, per se,
it just skips the first X bytes in iovec.


The advantage of using int is that you'll be able to return
an error code. Good for debugging.


OK.


The main reason is that long term, we might want to
detach the arrays from vq - they are there simply to
avoid sticking too much data on stack - and the
fixed size becomes limiting for vhost block.
The advantage of passing them would be that
vhost won't have to change.


Well, vhost_get_desc_n is just like vhost_get_desc but returns
multiple heads. In C you can return a single int by value
but many ints must be returned in an array, so I don't
think it is that dissimilar.


Sorry, will have to re-read.


This is the idiom I use all over this code.


I think there's some macro in kernel to shut the compiler up.
Or just ignore the warning. Let's not add extra initializers
just to shut up compiler, I think this might hide real bugs.


So why isn't it static?


Right. Sorry about the noise.


You see, receive is really a net concept. virtio has in and out,
and a single descriptor can do both.


My question is, guests can not rely on the 'available is not
enough but still not 0' to work because it relies on a heuristic
to figure out how much is 'not enough'. So why do we need it at all.


There are ordering rules that make 3 impossible.
Let's try to rethink the concept of redefining NOTIFY_ON_EMPTY.
Is it really necessary?


Well, we do not need to rerun translate_desc, we can keep
the results around, however ...


I don't understand completely. We looked into the skb, so
we know the required length?


Which guest kernel or spec are you looking at? As far as I know
F_NOTIFY_ON_EMPTY was only used for TX at some point.
I am not aware of this bit having the meaning you assign it.
I suspect that what you outline above simply means guest
can not combine F_NOTIFY_ON_EMPTY with mergeable buffers for
virtio net RX VQ.


I am kind of confused. We did a peek and have the skb length,
why do we need the maxheads?

Hmm, the iovec consumed is not just skb length, it is skb length + vnet
header. Is this it? Maybe we can just export this from socket somehow?


Note it doesn't manipluate the header, just skips it.

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Re: [PATCH v2] Add Mergeable RX buffer feature to vhost_net, Michael S. Tsirkin, (Thu Apr 1, 3:54 am)