[patch] sunrpc: make closing of old temporary sockets work (was: problems with lockd in 2.6.22.6)

To: <trond.myklebust@...>, <bfields@...>
Cc: <netdev@...>, <nfs@...>, <linux-kernel@...>
Date: Wednesday, September 12, 2007 - 8:07 am

Hello,

As already described, old temporary sockets of lockd (those whose client is
gone) are not closed, even after some time. So with enough clients and enough
time, there are 80 dangling open sockets and you start getting messages of
the form:

lockd: too many open TCP sockets, consider increasing the number of nfsd threads.

If I understand the code correctly, the intention was that the server closes
temporary sockets after about 6 to 12 minutes:

	a timer is started which calls svc_age_temp_sockets every 6 minutes.

	svc_age_temp_sockets:
		if a socket is marked OLD it gets closed.
		sockets which are not marked as OLD are marked OLD

	every time a socket receives something, OLD is cleared.
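
The intended scheme above can be sketched in plain userspace C. This is only
an illustration of the mark-then-close idea, not the kernel code: struct
temp_sock, temp_sock_received and age_temp_socks are hypothetical names, and
reference counting and locking are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a temporary socket; NOT the kernel's svc_sock. */
struct temp_sock {
	bool old;	/* set by the ager, cleared when data is received */
	bool closed;	/* set once the ager has closed the socket */
};

/* Called whenever the socket receives something: clears the OLD mark. */
static void temp_sock_received(struct temp_sock *s)
{
	s->old = false;
}

/*
 * One timer tick: close sockets that are already marked OLD and mark
 * all other open sockets OLD. Returns the number of sockets closed.
 */
static size_t age_temp_socks(struct temp_sock *socks, size_t n)
{
	size_t closed = 0;

	for (size_t i = 0; i < n; i++) {
		struct temp_sock *s = &socks[i];

		if (s->closed)
			continue;
		if (!s->old) {
			s->old = true;	/* first idle tick: mark only */
			continue;
		}
		s->closed = true;	/* second idle tick in a row: close */
		closed++;
	}
	return closed;
}
```

With a 6 minute tick, a socket that goes idle just before a tick is closed
about 6 minutes later, and one that goes idle just after a tick is closed
about 12 minutes later, hence the 6 to 12 minute window.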

But svc_age_temp_sockets never actually closes any socket, because it only
closes sockets with svsk->sk_inuse == 0. This seems to be a bug.

Here is a patch against 2.6.22.6 which changes the test to
svsk->sk_inuse <= 0, which is probably what was meant. The patched kernel
runs fine here: unused sockets get closed (after 6 to 12 minutes).

Signed-off-by: Wolfgang Walter <wolfgang.walter@studentenwerk.mhn.de>

--- ../linux-2.6.22.6/net/sunrpc/svcsock.c	2007-08-27 18:10:14.000000000 +0200
+++ net/sunrpc/svcsock.c	2007-09-11 11:07:13.000000000 +0200
@@ -1572,7 +1575,7 @@
 
 		if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
 			continue;
-		if (atomic_read(&svsk->sk_inuse) || test_bit(SK_BUSY, &svsk->sk_flags))
+		if (atomic_read(&svsk->sk_inuse) <= 0 || test_bit(SK_BUSY, &svsk->sk_flags))
 			continue;
 		atomic_inc(&svsk->sk_inuse);
 		list_move(le, &to_be_aged);


As svc_age_temp_sockets did not do anything before, this change may trigger
hidden bugs.

To be honest, I don't see why this check

(atomic_read(&svsk->sk_inuse) <= 0 || test_bit(SK_BUSY, &svsk->sk_flags))

is needed at all (it can only be an optimization), as these fields can change
after the check. In svc_tcp_accept there is no such check when a temporary
socket is closed.


Regards,
-- 
Wolfgang Walter
Studentenwerk M