Re: [PATCH] [Bug 16494] NFS client over TCP hangs due to packet loss

From: Andy Chittenden
Date: Tuesday, August 3, 2010 - 1:14 am

I don't know whether this patch is the correct fix or not but it enables the
NFS client to recover.

Kernel version: 2.6.34.1 and 2.6.32.

Fixes <https://bugzilla.kernel.org/show_bug.cgi?id=16494>. It clears down
any previous shutdown attempts so that reconnects on a socket that's been
shutdown leave the socket in a usable state (otherwise tcp_sendmsg() returns
-EPIPE).
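
The -EPIPE symptom can also be observed from user space. The following is only a sketch of the symptom, not of the RPC code path: it assumes a Linux host with a loopback interface, and the helper names loopback_listener() and send_errno_after_shutdown() are made up for the illustration. It shuts down the write side of a connected TCP socket and then tries to send on it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listener on 127.0.0.1 with a kernel-chosen port and
 * report the bound address in *addr. Setup failures trip assert()
 * to keep the sketch short. (Hypothetical helper.) */
static int loopback_listener(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int rc;

	assert(fd >= 0);
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	rc = bind(fd, (struct sockaddr *)addr, sizeof(*addr));
	assert(rc == 0);
	rc = listen(fd, 4);
	assert(rc == 0);
	rc = getsockname(fd, (struct sockaddr *)addr, &len);
	assert(rc == 0);
	return fd;
}

/* Returns the errno reported by send() after shutdown(SHUT_WR),
 * or 0 if the send unexpectedly succeeded. (Hypothetical helper.) */
int send_errno_after_shutdown(void)
{
	struct sockaddr_in addr;
	int lfd = loopback_listener(&addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int rc, err = 0;

	assert(fd >= 0);
	rc = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
	assert(rc == 0);
	rc = shutdown(fd, SHUT_WR);	/* marks the send side as shut down */
	assert(rc == 0);

	/* MSG_NOSIGNAL: report the error in errno instead of raising SIGPIPE */
	if (send(fd, "x", 1, MSG_NOSIGNAL) < 0)
		err = errno;

	close(fd);
	close(lfd);
	return err;
}
```

Calling send_errno_after_shutdown() is expected to return EPIPE, the same error value the RPC client sees from tcp_sendmsg() once the shutdown flag is left set.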

# diff -up /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c net/ipv4
--- /home/company/software/src/linux-2.6.34.1/net/ipv4/tcp_output.c     2010-07-27 08:46:46.917000000 +0100
+++ net/ipv4/tcp_output.c       2010-07-27 09:19:16.000000000 +0100
@@ -2522,6 +2522,13 @@ static void tcp_connect_init(struct sock
        struct tcp_sock *tp = tcp_sk(sk);
        __u8 rcv_wscale;

+       /* clear down any previous shutdown attempts so that
+        * reconnects on a socket that's been shutdown leave the
+        * socket in a usable state (otherwise tcp_sendmsg() returns
+        * -EPIPE).
+        */
+       sk->sk_shutdown = 0;
+
        /* We'll fix this up when we get a response from the other end.
         * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
         */

Signed-off-by: Andy Chittenden <andyc.bluearc@gmail.com>



--

From: David Miller
Date: Tuesday, August 3, 2010 - 1:21 am

From: "Andy Chittenden" <andyc.bluearc@gmail.com>

If the SunRPC code wants to close a TCP socket then use it again,
it should disconnect by doing a connect() with sa_family == AF_UNSPEC.
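
For readers unfamiliar with the AF_UNSPEC idiom, here is a user-space sketch of it. It assumes Linux semantics (how a stream socket reacts to an AF_UNSPEC connect() differs between systems) and a loopback interface; disconnect_socket(), loopback_listener() and reuse_after_shutdown() are hypothetical names used only for this illustration, not anything from the SunRPC code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* TCP listener on 127.0.0.1, kernel-chosen port, address returned in
 * *addr. Setup failures trip assert() to keep the sketch short. */
static int loopback_listener(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int rc;

	assert(fd >= 0);
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	rc = bind(fd, (struct sockaddr *)addr, sizeof(*addr));
	assert(rc == 0);
	rc = listen(fd, 4);
	assert(rc == 0);
	rc = getsockname(fd, (struct sockaddr *)addr, &len);
	assert(rc == 0);
	return fd;
}

/* Dissolve the socket's association by connecting to an AF_UNSPEC
 * address; the descriptor stays open and can be connected again. */
int disconnect_socket(int fd)
{
	struct sockaddr sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC;
	return connect(fd, &sa, sizeof(sa));
}

/* Returns 1 if send() fails with EPIPE after shutdown(SHUT_WR) but
 * works again after an AF_UNSPEC disconnect and a fresh connect(). */
int reuse_after_shutdown(void)
{
	struct sockaddr_in addr;
	int lfd = loopback_listener(&addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int rc, broken, reusable = 0;

	assert(fd >= 0);
	rc = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
	assert(rc == 0);
	rc = shutdown(fd, SHUT_WR);
	assert(rc == 0);
	broken = send(fd, "x", 1, MSG_NOSIGNAL) < 0 && errno == EPIPE;

	/* Disconnect, then reconnect the same descriptor. */
	if (disconnect_socket(fd) == 0 &&
	    connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
		reusable = send(fd, "x", 1, MSG_NOSIGNAL) == 1;

	close(fd);
	close(lfd);
	return broken && reusable;
}
```

On a Linux host reuse_after_shutdown() is expected to return 1: the AF_UNSPEC connect() resets the socket, including its shutdown state, so the second connect() starts from a clean slate.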
--

From: Andrew Morton
Date: Tuesday, August 3, 2010 - 2:11 am

(cc linux-nfs)

--

From: Andy Chittenden
Date: Tuesday, August 3, 2010 - 3:25 am

There is code to do that in the SunRPC code in xs_abort_connection() but 
that's conditionally called from xs_tcp_reuse_connection():

static void xs_tcp_reuse_connection(struct rpc_xprt *xprt, struct sock_xprt *transport)
{
	unsigned int state = transport->inet->sk_state;

	if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED)
		return;
	if ((1 << state) & (TCPF_ESTABLISHED|TCPF_SYN_SENT))
		return;
	xs_abort_connection(xprt, transport);
}

That's changed since 2.6.26 where it unconditionally did the connect() 
with sa_family == AF_UNSPEC. FWIW we cannot reproduce this problem with 
2.6.26.


--

From: Andy Chittenden
Date: Thursday, August 5, 2010 - 7:55 am

The problem is fixed with this patch which also prints out that sk_shutdown
can be non-zero on entry to xs_tcp_reuse_connection:

# diff -up /home/company/software/src/linux-2.6.34.2/net/sunrpc/xprtsock.c net/sunrpc/xprtsock.c
--- /home/company/software/src/linux-2.6.34.2/net/sunrpc/xprtsock.c     2010-08-02 18:30:51.000000000 +0100
+++ net/sunrpc/xprtsock.c       2010-08-05 12:21:11.000000000 +0100
@@ -1322,10 +1322,11 @@ static void xs_tcp_state_change(struct s
        if (!(xprt = xprt_from_sock(sk)))
                goto out;
        dprintk("RPC:       xs_tcp_state_change client %p...\n", xprt);
-       dprintk("RPC:       state %x conn %d dead %d zapped %d\n",
+       dprintk("RPC:       state %x conn %d dead %d zapped %d sk_shutdown %d\n",
                        sk->sk_state, xprt_connected(xprt),
                        sock_flag(sk, SOCK_DEAD),
-                       sock_flag(sk, SOCK_ZAPPED));
+                       sock_flag(sk, SOCK_ZAPPED),
+                       sk->sk_shutdown);
 
        switch (sk->sk_state) {
        case TCP_ESTABLISHED:
@@ -1796,10 +1797,18 @@ static void xs_tcp_reuse_connection(stru
 {
        unsigned int state = transport->inet->sk_state;
 
-       if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED)
-               return;
-       if ((1 << state) & (TCPF_ESTABLISHED|TCPF_SYN_SENT))
-               return;
+       if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED) {
+               if (transport->inet->sk_shutdown == 0)
+                       return;
+               printk("%s: TCP_CLOSEd and sk_shutdown set to %d\n",
+                       __func__, transport->inet->sk_shutdown);
+       }
+       if ((1 << state) & (TCPF_ESTABLISHED|TCPF_SYN_SENT)) {
+               if (transport->inet->sk_shutdown == 0)
+                       return;
+               printk("%s: sk_shutdown set to %d\n",
+                       __func__, transport->inet->sk_shutdown);
+       }
        ...
--

From: Trond Myklebust
Date: Thursday, August 5, 2010 - 12:50 pm

Hi Andy,

I note that you are adding in two new printk()s. Why should they be
printk(), and not dprintk()? Are you trying to report an exception that
the user needs to be aware of, or is this only debugging info that we'll
want to turn off under normal operation?

Also, it might be useful to add a comment to the code here to remind us
what the 'sk_shutdown == 0' case corresponds to as far as the socket
state is concerned, so that the casual reader can see why we shouldn't
reset the connection.

Cheers
  Trond
--

From: Andy Chittenden
Date: Friday, August 6, 2010 - 2:30 am

Hi Trond


If I knew what sk_shutdown == 0 really corresponded to, I could well add a comment! :-). I just knew that in 2.6.26 we didn't see this problem and that in later kernels the connection abort sequence was being done conditionally and that the sk_shutdown flag being left set was making tcp_sendmsg return an error. So, putting two and two together, I've effectively just added another condition in which to abort the connection.
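
The extra condition can be written out as a small stand-alone model. This is a sketch only: must_abort_old() and must_abort_new() are hypothetical names, the sock_unconnected parameter stands in for transport->sock->state == SS_UNCONNECTED, the state and shutdown constants are copied from the kernel headers, and it assumes the part of the patched function elided above still ends in the xs_abort_connection() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as in the kernel's include/net/tcp_states.h and
 * include/net/sock.h (only the ones needed here). */
enum { TCP_ESTABLISHED = 1, TCP_SYN_SENT = 2, TCP_CLOSE = 7 };
#define TCPF_ESTABLISHED	(1 << TCP_ESTABLISHED)
#define TCPF_SYN_SENT		(1 << TCP_SYN_SENT)
#define RCV_SHUTDOWN		1
#define SEND_SHUTDOWN		2

/* Each TCPF_* flag is 1 << state, so "(1 << state) & mask" tests
 * membership of state in a set of states. */
static_assert(TCPF_SYN_SENT == 4, "flag is 1 << state");

/* Unpatched logic: skip the abort if the socket is cleanly closed or
 * is (still) connecting/connected. */
static bool must_abort_old(unsigned int state, bool sock_unconnected)
{
	if (state == TCP_CLOSE && sock_unconnected)
		return false;
	if ((1 << state) & (TCPF_ESTABLISHED | TCPF_SYN_SENT))
		return false;
	return true;
}

/* Patched logic: the two early returns only apply while no shutdown
 * has been recorded on the socket. */
static bool must_abort_new(unsigned int state, bool sock_unconnected,
			   unsigned int sk_shutdown)
{
	if (state == TCP_CLOSE && sock_unconnected && sk_shutdown == 0)
		return false;
	if (((1 << state) & (TCPF_ESTABLISHED | TCPF_SYN_SENT)) &&
	    sk_shutdown == 0)
		return false;
	return true;
}
```

In this model must_abort_new() is equivalent to "sk_shutdown != 0 || must_abort_old()": a closed, unconnected socket with SEND_SHUTDOWN still set is left alone by the old check but aborted by the new one.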

As nobody has objected to the essence of my patch, I'll attempt a new patch that changes those printk()s into dprintk()s and drops in what I think are appropriate comments. So here's a revised patch:

# diff -up /home/company/software/src/linux-2.6.34.2/net/sunrpc/xprtsock.c net/sunrpc/xprtsock.c 
--- /home/company/software/src/linux-2.6.34.2/net/sunrpc/xprtsock.c     2010-08-02 18:30:51.000000000 +0100
+++ net/sunrpc/xprtsock.c       2010-08-06 08:09:08.000000000 +0100
@@ -1322,10 +1322,11 @@ static void xs_tcp_state_change(struct s
        if (!(xprt = xprt_from_sock(sk)))
                goto out;
        dprintk("RPC:       xs_tcp_state_change client %p...\n", xprt);
-       dprintk("RPC:       state %x conn %d dead %d zapped %d\n",
+       dprintk("RPC:       state %x conn %d dead %d zapped %d sk_shutdown %d\n",
                        sk->sk_state, xprt_connected(xprt),
                        sock_flag(sk, SOCK_DEAD),
-                       sock_flag(sk, SOCK_ZAPPED));
+                       sock_flag(sk, SOCK_ZAPPED),
+                       sk->sk_shutdown);
 
        switch (sk->sk_state) {
        case TCP_ESTABLISHED:
@@ -1796,10 +1797,25 @@ static void xs_tcp_reuse_connection(stru
 {
        unsigned int state = transport->inet->sk_state;
 
-       if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED)
-               return;
-       if ((1 << state) & (TCPF_ESTABLISHED|TCPF_SYN_SENT))
-               return;
+       if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED) {
+               /* we don't need ...
--

From: Andy Chittenden
Date: Monday, August 9, 2010 - 2:27 am

A weekend run with that patch applied to 2.6.34.2 was successful. As nobody has objected, what's the next step to getting it applied to the official source trees?

--

From: Trond Myklebust
Date: Monday, August 9, 2010 - 9:55 am

Please resend me a version with a cleaned up changelog entry. I can then
push it as a bugfix.

Cheers
  Trond
--

From: Andy Chittenden
Date: Tuesday, August 10, 2010 - 1:40 am

Thanks. I think this sums it up:

SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)

When reusing a TCP connection, ensure that it's aborted if a previous shutdown attempt has been made on that connection so that the RPC over TCP recovery mechanism succeeds.

# diff -up /home/company/software/src/linux-2.6.34.2/net/sunrpc/xprtsock.c net/sunrpc/xprtsock.c 
--- /home/company/software/src/linux-2.6.34.2/net/sunrpc/xprtsock.c     2010-08-02 18:30:51.000000000 +0100
+++ net/sunrpc/xprtsock.c       2010-08-06 08:09:08.000000000 +0100
@@ -1322,10 +1322,11 @@ static void xs_tcp_state_change(struct s
        if (!(xprt = xprt_from_sock(sk)))
                goto out;
        dprintk("RPC:       xs_tcp_state_change client %p...\n", xprt);
-       dprintk("RPC:       state %x conn %d dead %d zapped %d\n",
+       dprintk("RPC:       state %x conn %d dead %d zapped %d sk_shutdown %d\n",
                        sk->sk_state, xprt_connected(xprt),
                        sock_flag(sk, SOCK_DEAD),
-                       sock_flag(sk, SOCK_ZAPPED));
+                       sock_flag(sk, SOCK_ZAPPED),
+                       sk->sk_shutdown);
 
        switch (sk->sk_state) {
        case TCP_ESTABLISHED:
@@ -1796,10 +1797,25 @@ static void xs_tcp_reuse_connection(stru
 {
        unsigned int state = transport->inet->sk_state;
 
-       if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED)
-               return;
-       if ((1 << state) & (TCPF_ESTABLISHED|TCPF_SYN_SENT))
-               return;
+       if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED) {
+               /* we don't need to abort the connection if the socket
+                * hasn't undergone a shutdown
+                */
+               if (transport->inet->sk_shutdown == 0)
+                       return;
+               dprintk("RPC:       %s: TCP_CLOSEd and sk_shutdown set to %d\n",
+                       __func__, transport->inet->sk_shutdown);
+       }
+    ...