I looked at this stuff and found it would be difficult to avoid grabbing a
reference (and, more importantly, to avoid writing to dst) in the input path.
ip_rcv_finish() calls ip_route_input()
and ip_route_input() calls dst_use(&rth->u.dst, jiffies);
static inline void dst_use(struct dst_entry *dst, unsigned long time)
{
	dst_hold(dst);
	dst->__use++;
	dst->lastuse = time;
}
Even if we avoid the refcount increment, I guess we still need the lastuse
assignment in order to keep the dst in the cache. I am not sure about the role
of the __use field. Hmm... for a tcp connection, the dst refcount should already
be pinned by sk->sk_dst_cache. Maybe we could test the refcount value and, if
it is > 1, not take a reference (given that rcu_read_lock() is done
before calling ip_rcv_finish()).
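To make the idea concrete, here is a rough userspace sketch, not kernel code.
struct dst_model and both function names are made up for illustration, plain
ints stand in for atomic_t, and it ignores the races a real implementation
would have to handle. Note that the variant still writes to the dst (__use and
lastuse), so it would only avoid the refcount write.

```c
#include <stdbool.h>

/* Userspace stand-in for struct dst_entry; only the three fields
 * touched by dst_use() are modelled (hypothetical type).
 */
struct dst_model {
	int refcnt;		/* atomic_t __refcnt in the kernel */
	int use;		/* __use */
	unsigned long lastuse;
};

/* Same three steps as dst_use() quoted above. */
static inline void dst_use_model(struct dst_model *dst, unsigned long time)
{
	dst->refcnt++;
	dst->use++;
	dst->lastuse = time;
}

/* Hypothetical variant: skip the refcount increment when the entry is
 * already pinned by someone else (refcnt > 1, e.g. sk->sk_dst_cache).
 * Returns true if a reference was taken, so that the caller knows
 * whether it has to release one later.
 */
static inline bool dst_use_noref_model(struct dst_model *dst,
				       unsigned long time)
{
	bool took_ref = false;

	if (dst->refcnt <= 1) {
		dst->refcnt++;
		took_ref = true;
	}
	dst->use++;
	dst->lastuse = time;
	return took_ref;
}
```

The return value is there because the skb would need to remember whether it
owns a reference, otherwise the later release would unbalance the count.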
In the meantime, what do you think of the following patch?
[PATCH] net: release skb->dst in sock_queue_rcv_skb()
When queuing a skb to sk->sk_receive_queue, we can release its dst, which
is no longer needed.
Since the current cpu did the dst_hold(), the refcount is probably still hot
in this cpu's caches.
This spares readers from accessing the original dst to decrement its refcount,
possibly a long time after packet reception. This should speed up the UDP
and RAW receive paths.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>