[TCP bug] stuck distcc connections in latest -git

To: Linus Torvalds <torvalds@...>
Cc: David Miller <davem@...>, <akpm@...>, <netdev@...>, <linux-kernel@...>, Stefan Richter <stefanr@...>
Date: Tuesday, July 22, 2008 - 7:21 am

* Ingo Molnar <mingo@elte.hu> wrote:


hm, the distcc TCP hangs are back:

Distcc client box (quad, 10.0.1.16) running v2.6.24:

 dione:~> netstat -nt | grep -vw TIME_WAIT | grep 3632
 tcp        0 250455 10.0.1.16:55559             10.0.1.19:3632              ESTABLISHED
 tcp        0 254743 10.0.1.16:56096             10.0.1.19:3632              ESTABLISHED
 tcp        0 219617 10.0.1.16:55674             10.0.1.19:3632              ESTABLISHED

              [ ^--- note the stuck send-queue ]

Distcc server box (16-way, 10.0.1.19) running very-latest:

 phoenix:~> netstat -nt | grep 10.0.1.16 | grep 3632 

 tcp        0      0 10.0.1.19:3632              10.0.1.16:55559             ESTABLISHED 
 tcp        0      0 10.0.1.19:3632              10.0.1.16:56096             ESTABLISHED 
 tcp        0      0 10.0.1.19:3632              10.0.1.16:55674             ESTABLISHED 

 tcp        0      0 10.0.1.19:3632              10.0.1.16:34411             ESTABLISHED 
 tcp        0      0 10.0.1.19:3632              10.0.1.16:51094             ESTABLISHED 
 tcp        0      0 10.0.1.19:3632              10.0.1.16:60787             ESTABLISHED 
 tcp        0      0 10.0.1.19:3632              10.0.1.16:50874             ESTABLISHED 

I.e. the client-side send-queue is stuck in ESTABLISHED state, while the 
server side thinks it's a proper established connection with nothing 
queued. Nobody makes any progress.
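
A minimal sketch of how the send-queue check above could be scripted, 
assuming the six-column net-tools `netstat -nt` layout shown in this 
report (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). 
The function name `stuck_send_queues` and its parameters are 
hypothetical, not part of distcc or netstat:

```python
def stuck_send_queues(netstat_output, port=3632, min_sendq=1):
    """Return (local, remote, send_q) for ESTABLISHED TCP connections
    to remote `port` whose Send-Q is at least `min_sendq` bytes.

    Assumes six whitespace-separated columns per connection line:
    Proto Recv-Q Send-Q Local Foreign State.
    """
    stuck = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a TCP row.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        _proto, _recv_q, send_q, local, remote, state = fields
        if state != "ESTABLISHED" or not send_q.isdigit():
            continue
        remote_port = remote.rsplit(":", 1)[-1]
        if remote_port == str(port) and int(send_q) >= min_sendq:
            stuck.append((local, remote, int(send_q)))
    return stuck


# Client-side sample copied from the netstat output in this report.
sample = """\
tcp        0 250455 10.0.1.16:55559             10.0.1.19:3632              ESTABLISHED
tcp        0 254743 10.0.1.16:56096             10.0.1.19:3632              ESTABLISHED
tcp        0 219617 10.0.1.16:55674             10.0.1.19:3632              ESTABLISHED
"""

for local, remote, send_q in stuck_send_queues(sample):
    print(f"{local} -> {remote}: {send_q} bytes in send-queue")
```

A non-zero Send-Q in a single snapshot is normal for a busy connection; 
it only suggests a hang if the same value shows up unchanged in 
snapshots taken some time apart.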

Also note the final 4 connections on the server side - those are not 
present on the client box.
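
The client/server comparison above can be sketched as a set difference 
over the two netstat listings. This is only an illustration under the 
same column-layout assumption as the output shown here; `endpoints` and 
`server_only` are hypothetical helper names, and only ESTABLISHED rows 
are compared:

```python
def endpoints(netstat_output):
    """Parse `netstat -nt` text into a set of (local, remote) address
    pairs for ESTABLISHED TCP connections."""
    pairs = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if (len(fields) == 6 and fields[0].startswith("tcp")
                and fields[5] == "ESTABLISHED"):
            pairs.add((fields[3], fields[4]))
    return pairs


def server_only(client_output, server_output):
    """Return server-side connections with no matching client-side entry.

    A server row (local, remote) matches a client row (remote, local).
    """
    client_view = {(remote, local)
                   for local, remote in endpoints(client_output)}
    return sorted(endpoints(server_output) - client_view)


# Samples copied from the two netstat listings in this report.
client = """\
tcp        0 250455 10.0.1.16:55559             10.0.1.19:3632              ESTABLISHED
tcp        0 254743 10.0.1.16:56096             10.0.1.19:3632              ESTABLISHED
tcp        0 219617 10.0.1.16:55674             10.0.1.19:3632              ESTABLISHED
"""
server = """\
tcp        0      0 10.0.1.19:3632              10.0.1.16:55559             ESTABLISHED
tcp        0      0 10.0.1.19:3632              10.0.1.16:56096             ESTABLISHED
tcp        0      0 10.0.1.19:3632              10.0.1.16:55674             ESTABLISHED
tcp        0      0 10.0.1.19:3632              10.0.1.16:34411             ESTABLISHED
tcp        0      0 10.0.1.19:3632              10.0.1.16:51094             ESTABLISHED
tcp        0      0 10.0.1.19:3632              10.0.1.16:60787             ESTABLISHED
tcp        0      0 10.0.1.19:3632              10.0.1.16:50874             ESTABLISHED
"""

for local, remote in server_only(client, server):
    print(f"server-only: {local} <- {remote}")
```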

The hung condition seemed permanent (I waited a couple of minutes).

Then I shut down distccd on the server side, which propagated to the 
client:

 distcc[18496] (dcc_pump_sendfile) ERROR: sendfile failed: Broken pipe
 distcc[18496] (dcc_readx) ERROR: unexpected eof on fd4
 distcc[18496] (dcc_r_token_int) ERROR: read failed while waiting for token "DONE"
 distcc[18496] Warning: failed to distribute kernel/futex.c to ph/20, running locally instead

Server side lingered in FIN_WAIT2 a bit:

Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 10.0.1.19:3632              10.0.1.16:56096             FIN_WAIT2
tcp        0      0 10.0.1.19:3632              10.0.1.16:55559             FIN_WAIT2

I retried the same build 10 times and it did not reproduce, so this 
is again a hard-to-reproduce condition (and there's no chance of getting 
a proper tcpdump either, at these traffic levels).

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
