Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+

From: Ingo Molnar
Date: Monday, June 2, 2008 - 2:23 am

* Eric Dumazet <dada1@cosmosbay.com> wrote:


I turned off localhost distcc two days ago and there has not been a 
single hung socket since then, so we now know for sure that without 
localhost distcc connections, -tip's QA will not produce any hung 
sockets in about 1000 random-kernel-build+boot iterations.

i've added those reverts this morning and added back the localhost 
distcc rules - we'll see whether the hung sockets are back.


i'm wondering whether your suspicion on broken TCP timers is consistent 
with the symptoms i've seen: the hung sockets clearly produced periodic 
packet activity every 180 seconds, up to 8 hours, without ever changing 
their receive or send queue. So at least a part of the TCP timer 
mechanism for that specific stuck socket was working fine.

is there no sysctl or other debug mechanism to somehow get its full TCP 
state and the reasons for why it is stuck? I'm wondering how you debug 
broken TCP state machines without enabling testers to be able to dump 
all state and passing it to developers.
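
For reference, a partial view of per-socket state is already exported in 
/proc/net/tcp (connection state, send/receive queue sizes, pending timer, 
retransmit count). It does not say why a socket is stuck, but it is cheap 
to capture. Below is a minimal sketch of decoding it, assuming the IPv4 
field layout described in the kernel's proc_net_tcp documentation and a 
little-endian host; the helper names (decode_addr, parse_line, dump) and 
the sample line are made up for illustration.

```python
import socket
import struct

# State and timer codes as documented for /proc/net/tcp.
TCP_STATES = {
    1: "ESTABLISHED", 2: "SYN_SENT", 3: "SYN_RECV", 4: "FIN_WAIT1",
    5: "FIN_WAIT2", 6: "TIME_WAIT", 7: "CLOSE", 8: "CLOSE_WAIT",
    9: "LAST_ACK", 10: "LISTEN", 11: "CLOSING",
}
TIMERS = {
    0: "none", 1: "retransmit", 2: "other (delack/keepalive)",
    3: "time_wait", 4: "zero-window probe",
}


def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into (dotted_ip, port).

    Assumption: little-endian host, so the printed 32-bit value has to be
    byte-swapped to get the address in network order.
    """
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_line(line):
    """Parse one data line of /proc/net/tcp into a dict."""
    f = line.split()
    tx, rx = (int(x, 16) for x in f[4].split(":"))
    timer, when = f[5].split(":")
    return {
        "local": decode_addr(f[1]),
        "remote": decode_addr(f[2]),
        "state": TCP_STATES.get(int(f[3], 16), f[3]),
        "tx_queue": tx,
        "rx_queue": rx,
        "timer": TIMERS.get(int(timer, 16), timer),
        "timer_jiffies": int(when, 16),
        "retransmits": int(f[6], 16),
    }


def dump(path="/proc/net/tcp"):
    """Return parsed entries for every IPv4 TCP socket (Linux only)."""
    with open(path) as fh:
        next(fh)  # skip the column header line
        return [parse_line(l) for l in fh if l.strip()]


# Hypothetical sample line: an established localhost socket on the distcc
# port (3632) with data sitting in the receive queue.
SAMPLE = ("   1: 0100007F:0E30 0100007F:C350 01 00000000:0000A000 "
          "02:000AFC80 00000000  1000        0 12345 1 ffff 20 4 30 10 -1")

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Taking two dumps a few minutes apart and comparing tx_queue/rx_queue for 
the same address pair is one way to confirm that a socket is not making 
progress even though its timer is still firing.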

I have a clearly reproducible testcase and i'd like to help out, but the 
whole effort appears to be stalled on 'not enough information'. Doing 
random reverts might help in truly helpless situations where a bug has 
no debuggable state - but this situation seems really routine to me: 
the bug is very difficult to trigger, but once it triggers, the bug 
scenario is stable and analyzable. I'd be glad to test any 
instrumentation patch that makes similar scenarios more analyzable.

	Ingo
--