On Sat, 22 Dec 2007 23:30:56 PST, Andrew Morton said:

I've bisected it down this far:

  kvm-ist-kaput.patch
  GOOD
  git-lblnet.patch
  git-lblnet-fixup.patch
  git-leds.patch
  git-libata-all.patch
  git-libata-all-fix-pata_winbond-borkage.patch
  git-libata-all-wtf.patch
  BAD

and somehow, I doubt the leds or libata trees horked up networking. ;)

Symptoms - semi-sporadic failures in making network connections. The test case that tripped it up was the 'make test' from the Tcl 8.5 - several of the test cases will create a listening socket, and then try to connect to it. Under 2.6.24-rc5-mm1, it works just fine, but I'm seeing hangs under -rc6-mm1. Doing a 'netstat -n -a -A inet -p' while it's hung shows me this:

  Active Internet connections (servers and established)
  Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
  tcp        0      0 127.0.0.1:34118         0.0.0.0:*               LISTEN      2236/tcltest
  tcp        0      1 127.0.0.1:59460         127.0.0.1:34118         SYN_SENT    2236/tcltest

  Active Internet connections (servers and established)
  Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
  tcp        0      0 127.0.0.1:47842         0.0.0.0:*               LISTEN      2352/tcltest
  tcp        0      1 127.0.0.1:46510         127.0.0.1:47842         SYN_SENT    2352/tcltest

  Active Internet connections (servers and established)
  Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
  tcp        0      0 127.0.0.1:47842         0.0.0.0:*               LISTEN      2352/tcltest
  tcp        0      1 127.0.0.1:46510         127.0.0.1:47842         SYN_SENT    2352/tcltest

Pretty consistent failure mode - a socket is in 'listen', and the connection gets hung in 'SYN_SENT'. There's 3 outputs listed - the first one from one run of the test case, the second 2 are some 20 seconds apart on the same run. It's pretty obvious that if you can't complete a 3-packet handshake to loopback in 20 seconds, something is hosed. However, it's apparently some sort of race/timing issue, as many *other* test cases in the Tcl test tree do in fact work OK.

I already checked, it's not a slam-dunk to just 'patch -R' as there's 3 or 4 conflicts where later patches need massaging/reverting as well. It's a problem with both 'classic RCU' and 'preempt RCU' (that was my *first* guess as to the cause).

Any clues/hints/advice/patches?
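For anybody who wants to poke at this without building Tcl, the listen-then-connect pattern described above can be approximated with a small standalone script. This is only a rough sketch and not the actual Tcl test: the function names (loopback_connect, count_failures), the timeout, and the attempt count are all made up for illustration, and since the failure looks timing-dependent there's no guarantee this trips it on an affected kernel.

```python
import socket


def loopback_connect(timeout=5.0):
    """Listen on an ephemeral loopback port, then connect to it.

    Returns True if the TCP handshake completes within `timeout`
    seconds, False if the connect is still pending (SYN_SENT) by then.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the kernel to pick a free port for us.
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.settimeout(timeout)
        try:
            # The kernel completes the handshake into the listen backlog,
            # so no accept() is needed for connect() to succeed.
            client.connect(("127.0.0.1", port))
            return True
        except socket.timeout:
            return False
        finally:
            client.close()
    finally:
        listener.close()


def count_failures(attempts=200, timeout=5.0):
    """Repeat the loopback connect and count how many attempts hang."""
    return sum(1 for _ in range(attempts) if not loopback_connect(timeout))
```

On a healthy kernel count_failures() should come back as 0; any non-zero count would match the SYN_SENT hang in the netstat output above.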
