Re: 2.6.26-rc6-git2: Reported regressions from 2.6.25

From: David Miller
Date: Saturday, June 14, 2008 - 6:07 pm

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sat, 14 Jun 2008 17:41:24 -0700 (PDT)


I agree with the gist of your analysis.

And it seems that Apache does try to use the deferred accept socket
option.  So we may indeed have a hit on this IA64 bug.

The wording in the report about versions is a little confusing:

	With kernel 2.6.26-rc5 and a git kernel just between rc4
	and rc5, my kernel panic...

Does this mean that the problem appeared between rc4 and rc5?  Or
that all 2.6.26-rcX releases have the problem?  That's an important
fact because the change in question showed up in 2.6.26-rc1, as it
came in the initial networking merge for the 2.6.26 merge window.


Because of the requirements to trigger the new code, this case is
not likely to match the revert.  SSH absolutely does not use the
deferred accept socket option.

Let's look at the change in question.

Every single code path touched in the data paths is guarded
with "tp->defer_tcp_accept.request", which will be NULL unless
1) the defer-accept socket option is enabled and 2) a new connection
got queued up there.
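
For reference, here is a minimal userspace sketch of what "defer-accept socket option enabled" looks like from a server's side. It is not taken from Apache or distcc; the function name, the default address, and the timeout value are assumptions chosen for illustration. The only real interface used is the Linux TCP_DEFER_ACCEPT option, which Python exposes as socket.TCP_DEFER_ACCEPT on Linux builds.

```python
import socket


def make_deferred_listener(host="127.0.0.1", port=0, timeout_s=5):
    """Return a listening TCP socket, with TCP_DEFER_ACCEPT set if available.

    Hypothetical helper for illustration only. With the option set, the
    kernel holds a new connection back from accept() until data arrives
    (or the timeout, in seconds, runs out).
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    # The constant only exists on Linux; skip the option elsewhere.
    opt = getattr(socket, "TCP_DEFER_ACCEPT", None)
    if opt is not None:
        srv.setsockopt(socket.IPPROTO_TCP, opt, timeout_s)

    srv.bind((host, port))
    srv.listen(16)
    return srv
```

A socket that only ever calls connect(), such as an SSH client's, never goes through this setup.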

Nothing about the normal accept queue handling got modified by those
changes which were reverted.

And note that this means the behavior change only hits listening
sockets.  So if we have a report that client outgoing SSH
connections hang with the current kernel, that report cannot
reasonably match this revert.
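
The argument can be restated as a toy predicate. This is not kernel code and makes no claim about the kernel's data structures; the function and parameter names are hypothetical and simply encode the conditions described in this message.

```python
def deferred_request_present(listener_side, defer_accept_enabled, connection_queued):
    """Toy stand-in for the 'tp->defer_tcp_accept.request != NULL' guard.

    Assumption taken from the text: the pointer is non-NULL only for a
    connection that came in through a listening socket with defer-accept
    enabled and that was queued there.
    """
    return listener_side and defer_accept_enabled and connection_queued


# A server that sets the option and has a queued connection: guarded code can run.
server_case = deferred_request_present(True, True, True)

# An outgoing client connection: not on the listener side, so the guard
# is false regardless of the other two conditions.
client_case = deferred_request_present(False, True, True)
```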

I also anticipate that if this change could trigger problems for
non-deferred-accept cases, we'd see a ton more reports than we have.

And we did some research, and about the only major servers that use
this obscure defer-accept feature are distcc and Apache.  It is this
element of Ingo's bug report (that he uses distcc heavily and it was a
distcc socket which hung) that helped us narrow things down.

The SSH report clearly states "With kernel 2.6.26-rc5, ssh connections
to _remote_ servers randomly hang".  So this is a report about SSH
client connections under 2.6.26-rc5, not SSH server connections and
therefore not listening sockets.

So right now I'd say that the IA64 case could definitely be a match
but the SSH case very much is not.