Bug here, if bind() returns -1 (all ports are in use)
Hello Vitaly, thanks for this excellent report.
Yes, the current code behaves really badly when all ports are in use:
We now have to scan long chains of about 220 sockets, 28232 [1] times.
That's very long (but at least the thread is preemptible).
In the past (before the patches), only one thread was allowed to run in the kernel while scanning
the UDP port table (we had a single global lock, udp_hash_lock, protecting the whole UDP table).
That thread was faster because it was not slowed down by other threads.
(But the rwlock we used caused writer starvation if many UDP frames
were received.)
One way to solve the problem could be the following:
1) Raise UDP_HTABLE_SIZE from 128 to 1024 to reduce average chain lengths.
2) In the bind(0) algorithm, use RCU locking to find a possibly usable port. All CPUs can run in parallel,
without dirtying locks. Then lock the chain that was found and recheck that the port is still available before using it.
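Step 2 can be sketched in user space as an optimistic lockless check followed by a lock and a recheck. This is only an illustration, not the kernel patch: C11 atomic loads stand in for the RCU read-side scan, a per-slot spinlock stands in for the hslot lock, and all names (port_used, slot_lock, pick_port, NSLOTS) and the `port % NSLOTS` mapping are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS 128	/* hypothetical; mirrors the current UDP_HTABLE_SIZE */
#define NPORTS 65536

static _Atomic unsigned char port_used[NPORTS];	/* 1 if the port is bound */
static atomic_bool slot_lock[NSLOTS];		/* one spinlock per hash slot */

static void slot_lock_acquire(unsigned int slot)
{
	/* Spin until we are the one who flips the flag from false to true. */
	while (atomic_exchange_explicit(&slot_lock[slot], true,
					memory_order_acquire))
		;
}

static void slot_lock_release(unsigned int slot)
{
	atomic_store_explicit(&slot_lock[slot], false, memory_order_release);
}

/*
 * Try every port in [low, high], starting at 'first' and wrapping around.
 * Returns the port that was claimed, or -1 if all of them are in use.
 */
static int pick_port(unsigned int low, unsigned int high, unsigned int first)
{
	unsigned int remaining = high - low + 1;
	unsigned int port = first;
	unsigned int i;

	assert(low <= first && first <= high && high < NPORTS);

	for (i = 0; i < remaining; i++) {
		unsigned int slot = port % NSLOTS;

		/* Pass 1: lockless peek, no lock is dirtied for busy ports. */
		if (!atomic_load_explicit(&port_used[port],
					  memory_order_acquire)) {
			/* Pass 2: lock the slot and recheck before claiming. */
			slot_lock_acquire(slot);
			if (!atomic_load_explicit(&port_used[port],
						  memory_order_relaxed)) {
				atomic_store_explicit(&port_used[port], 1,
						      memory_order_release);
				slot_lock_release(slot);
				return (int)port;
			}
			slot_lock_release(slot);
		}
		port = (port == high) ? low : port + 1;
	}
	return -1;
}
```

With an empty table, repeated calls such as pick_port(40000, 40002, 40001) hand out 40001, 40002 and 40000 in turn, then return -1 once the range is exhausted. The point of the two passes is that busy ports are skipped without ever taking a lock; the lock is only taken for a port that looked free, and the recheck covers the case where another thread claimed it in between.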
[1] Replace 28232 with the span of your actual /proc/sys/net/ipv4/ip_local_port_range values.
Here: 61000 - 32768 = 28232
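The figures used above can be checked with a couple of small helpers. This is a rough sketch that assumes sockets are spread evenly over the hash slots; the function names port_span and avg_chain_len are made up for illustration.

```c
#include <assert.h>

/* Size of the port range as computed in the footnote: high - low. */
static unsigned int port_span(unsigned int low, unsigned int high)
{
	assert(high >= low);
	return high - low;
}

/*
 * Average chain length when 'nsockets' sockets are spread evenly over
 * 'nslots' hash slots (integer division, so the result is rounded down).
 */
static unsigned int avg_chain_len(unsigned int nsockets, unsigned int nslots)
{
	assert(nslots > 0);
	return nsockets / nslots;
}
```

With the values from this mail, port_span(32768, 61000) gives 28232, avg_chain_len(28232, 128) gives 220, and avg_chain_len(28232, 1024) gives 27, which is why raising UDP_HTABLE_SIZE should shorten each scan considerably.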
I will try to code a patch before this weekend.
Thanks
Note: I tried using a mutex to allow only one thread at a time in the bind(0) code but got no real speedup.
But it should help if you have an SMP machine, since only one CPU will be busy in bind(0).
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cf5ab05..a572407 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -155,6 +155,8 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
 	struct udp_hslot *hslot;
 	struct udp_table *udptable = sk->sk_prot->h.udp_table;
 	int error = 1;
+	static DEFINE_MUTEX(bind0_mutex);
+	int mutex_acquired = 0;
 	struct net *net = sock_net(sk);
 	if (!snum) {
@@ -162,6 +164,8 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
 		unsigned rand;
 		unsigned short first;
+		mutex_lock(&bind0_mutex);
+		mutex_acquired = 1;
 		inet_get_local_port_range(&low, &high);
 		remaining = (high - low) + 1;
@@ -196,6 +200,8 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
 fail_unlock:
 	spin_unlock_bh(&hslot->lock);
 fail:
+	if (mutex_acquired)
+		mutex_unlock(&bind0_mutex);
 	return error;
 }