Adrian Bunk posted a list of known regressions in the latest 2.6.20-rc4 Linux kernel compared to the previous 2.6.19 stable release. In two emails, he listed six regressions that don't have fixes yet and six regressions with fixes that haven't been merged yet.
In another email thread, Linux creator Linus Torvalds noted that his goal for 2.6.20 is to focus primarily on stability. He also noted that he intends to release the stable kernel at some point after linux.conf.au, which takes place this year in Sydney, Australia, between January 15th and 20th. He explains, "hopefully 'final -rc' before LCA, but I'll do the actual 2.6.20 release afterwards. I don't want to have a merge window during LCA, as I and many others will all be out anyway. So it's much better to have LCA happen during the end of the stabilization phase when there's hopefully not a lot going on. (Of course, often at the end of the stabilization phase there is all the 'ok, what about regression XyZ?' panic)"
From: Adrian Bunk [email blocked]
To: Linus Torvalds [email blocked], Andrew Morton [email blocked]
Subject: 2.6.20-rc4: known unfixed regressions (v2)
Date: Tue, 9 Jan 2007 06:25:10 +0100

This email lists some known regressions in 2.6.20-rc4 compared to 2.6.19
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch of
you caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : BUG: at mm/truncate.c:60 cancel_dirty_page() (reiserfs)
References : http://lkml.org/lkml/2007/1/7/117
Submitter  : Malte Schröder [email blocked]
Status     : unknown


Subject    : BUG: at fs/inotify.c:172 set_dentry_child_flags()
References : http://bugzilla.kernel.org/show_bug.cgi?id=7785
Submitter  : Cijoml Cijomlovic Cijomlov [email blocked]
Status     : unknown


Subject    : BUG: scheduling while atomic: hald-addon-stor/...
             cdrom_{open,release,ioctl} in trace
References : http://lkml.org/lkml/2006/12/26/105
             http://lkml.org/lkml/2006/12/29/22
             http://lkml.org/lkml/2006/12/31/133
Submitter  : Jon Smirl [email blocked]
             Damien Wyart <damien.wyart@free.fr>
             Aaron Sethman [email blocked]
Status     : unknown


Subject    : problems with CD burning
References : http://www.spinics.net/lists/linux-ide/msg06545.html
Submitter  : Uwe Bugla <uwe.bugla@gmx.de>
Status     : unknown


Subject    : USB keyboard unresponsive after some time
References : http://lkml.org/lkml/2006/12/25/35
             http://lkml.org/lkml/2006/12/26/106
Submitter  : Florin Iucha [email blocked]
Handled-By : Jiri Kosina [email blocked]
Status     : problem is being debugged


Subject    : Acer Extensa 3002 WLMi: 'shutdown -h now' reboots the system
References : http://lkml.org/lkml/2006/12/25/40
Submitter  : Berthold Cogel [email blocked]-koeln.de>
Handled-By : Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Status     : problem is being debugged
From: Linus Torvalds [email blocked]
Subject: Re: 2.6.20-rc4: known unfixed regressions (v2)
Date: Tue, 9 Jan 2007 09:58:19 -0800 (PST)

On Tue, 9 Jan 2007, Adrian Bunk wrote:
>
> Subject    : BUG: at mm/truncate.c:60 cancel_dirty_page() (reiserfs)
> References : http://lkml.org/lkml/2007/1/7/117
> Submitter  : Malte Schröder [email blocked]
> Status     : unknown

Adrian, this is also available as

    http://lkml.org/lkml/2007/1/5/308

But, at worst, I don't think this is a show-stopper (oh, well: I actually
liked it better when "WARN_ON()" said _warning_, not BUG, since it
separates out the two cases visually much better, but others disagreed.
Crud).

It does show that something is wrong in reiserfs-land, although probably
not any worse than it ever was before, so in that sense this is not a
"regression", it's actually an _improvement_. Now it warns about reiserfs
trying to clear the dirty bit on a page cache that is still mapped (and
that _may_ be dirty in the page tables, although it almost certainly isn't
in practice). That warning just didn't exist before.

Now, that said, the call stack is interesting:

    BUG: at mm/truncate.c:60 cancel_dirty_page()
     [<c0137371>] cancel_dirty_page+0x45/0x7b
     [<df944b18>] reiserfs_cut_from_item+0x7cc/0x7fd [reiserfs]
     [<c01e5eba>] __kfree_skb+0x9b/0xf7
     [<df9316a0>] make_cpu_key+0x3f/0x46 [reiserfs]
     [<df944efa>] reiserfs_do_truncate+0x3b1/0x515 [reiserfs]
     [<df949901>] journal_begin+0x3f/0xd0 [reiserfs]
     [<df9322fc>] reiserfs_truncate_file+0x1c1/0x2ad [reiserfs]
     [<df938172>] reiserfs_file_release+0x35f/0x379 [reiserfs]
     [<c013be42>] free_pgtables+0x70/0x7c
     [<c01491f1>] __fput+0xa5/0x14d
     [<c0146e7a>] filp_close+0x51/0x58
     [<c0147de8>] sys_close+0x55/0x8a
     [<c0102ab2>] sysenter_past_esp+0x5f/0x85

in that a final "sys_close()" that releases the file and causes it to be
truncated (which is apparently what is going on) should NOT have any
mappings of that file active any more! If there are mappings active, the
reiserfs_truncate_file() thing should have been delayed until the mappings
are gone!

So something interesting is definitely going on, but I don't know exactly
what it is. Why does reiserfs do the truncate as part of a close, if the
same inode is actually mapped somewhere else? And if it's a race with two
different CPU's (one doing a "munmap()" and the other doing a "close()"),
then the unmap should _still_ have actually unmapped the pages before it
actually did _its_ "release()" call.

In general, a filesystem should never do a truncate at "release()" time
_anyway_. It should do it at "drop_inode" time.

So I think this does show some confusion in reiserfs, but it's not
anything new. The only new thing is that the _message_ happens.

So I don't personally consider this a regression. Just a sign of old and
preexisting confusion that is now uncovered by new code (and it will print
out the scary message at most four times, and then stop complaining about
it. So apart from the scary message, nothing new and bad has really
happened).

        Linus
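Linus's remark that the message is printed "at most four times" describes a capped warning: the check runs on every call, but only the first few hits are reported. The following is a minimal userspace sketch of that idea, not the kernel's code. The names fake_page, cancel_dirty_page_sketch and warnings_printed are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a page-cache page; not the kernel's struct page. */
struct fake_page {
    bool mapped; /* still mapped into some process's page tables */
    bool dirty;  /* dirty bit that the caller wants to clear */
};

/* Counts messages actually printed, so the cap can be checked. */
static unsigned int warnings_printed;

/* Illustrative only: warn about a still-mapped page, but at most four times. */
void cancel_dirty_page_sketch(struct fake_page *page)
{
    static unsigned int warncount; /* persists across calls */

    if (page->mapped && warncount < 4) {
        warncount++;
        warnings_printed++;
        fprintf(stderr,
                "warning: clearing dirty bit on a mapped page (%u of 4)\n",
                warncount);
    }
    page->dirty = false; /* the operation itself proceeds either way */
}
```

Calling cancel_dirty_page_sketch() repeatedly on a mapped page reports the first four occurrences and stays silent afterwards, which is why the submitter would see a handful of messages rather than a flood.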
From: Malte Schröder [email blocked]
Subject: Re: 2.6.20-rc4: known unfixed regressions (v2)
Date: Tue, 9 Jan 2007 19:08:40 +0100

On Tuesday 09 January 2007 18:58, Linus Torvalds wrote:
> On Tue, 9 Jan 2007, Adrian Bunk wrote:
> > Subject    : BUG: at mm/truncate.c:60 cancel_dirty_page() (reiserfs)
> > References : http://lkml.org/lkml/2007/1/7/117
> > Submitter  : Malte Schröder [email blocked]
> > Status     : unknown
>
> Adrian, this is also available as
>
>     http://lkml.org/lkml/2007/1/5/308
>
> But, at worst, I don't think this is a show-stopper (oh, well: I actually
> liked it better when "WARN_ON()" said _warning_, not BUG, since it
> separates out the two cases visually much better, but others disagreed.
> Crud).

--8<--

> So something interesting is definitely going on, but I don't know exactly
> what it is. Why does reiserfs do the truncate as part of a close, if the
> same inode is actually mapped somewhere else? And if it's a race with two
> different CPU's (one doing a "munmap()" and the other doing a "close()"),
> then the unmap should _still_ have actually unmapped the pages before it
> actually did _its_ "release()" call.

This was on a single core. But with CONFIG_PREEMPT_VOLUNTARY=y.
It didn't happen again since then.

>
> In general, a filesystem should never do a truncate at "release()" time
> _anyway_. It should do it at "drop_inode" time.
>
> So I think this does show some confusion in reiserfs, but it's not
> anything new. The only new thing is that the _message_ happens.
>
> So I don't personally consider this a regression. Just a sign of old and
> preexisting confusion that is now uncovered by new code (and it will print
> out the scary message at most four times, and then stop complaining about
> it. So apart from the scary message, nothing new and bad has really
> happened).

I also didn't reboot the machine afterwards and did not notice any problems
beside that one message.

--
---------------------------------------
Malte Schröder
MalteSch@gmx.de
ICQ# 68121508
---------------------------------------
From: Linus Torvalds [email blocked]
Subject: Re: 2.6.20-rc4: known unfixed regressions (v2)
Date: Tue, 9 Jan 2007 10:30:43 -0800 (PST)

On Tue, 9 Jan 2007, Malte Schröder wrote:
>
> > So something interesting is definitely going on, but I don't know exactly
> > what it is. Why does reiserfs do the truncate as part of a close, if the
> > same inode is actually mapped somewhere else? And if it's a race with two
> > different CPU's (one doing a "munmap()" and the other doing a "close()"),
> > then the unmap should _still_ have actually unmapped the pages before it
> > actually did _its_ "release()" call.
>
> This was on a single core. But with CONFIG_PREEMPT_VOLUNTARY=y.
> It didn't happen again since then.

Yeah, PREEMPT would be able to show most races like this too. In fact,
some races show up much better with preemption than they do with real SMP.

But I haven't looked at what exactly reiserfs does. I did check that the
VM layer definitely does the remove_vma() stuff (that actually closes the
files) _after_ it has unmapped everything. It would have surprised me if
we had had that kind of bug, but still..

        Linus
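The ordering Linus checked (unmap everything first, only then drop the file) and his earlier point that the truncate belongs at "drop_inode" time rather than "release()" time can be illustrated with a toy reference-counting model. This is a deliberately simplified sketch under stated assumptions: every name below (toy_inode, toy_open, toy_mmap, toy_munmap, toy_close, toy_put) is hypothetical, and the real VFS and VM structures are considerably more involved.

```c
#include <stdbool.h>

/* Toy model; these names are hypothetical and do not match kernel structures. */
struct toy_inode {
    int refs;                    /* holders: open descriptors and mappings */
    int mapped_pages;            /* pages currently mapped */
    bool truncated;              /* deferred truncate has run */
    bool truncated_while_mapped; /* the situation the warning flags */
};

/* Drop one reference; only the final drop performs the deferred truncate. */
static void toy_put(struct toy_inode *ino)
{
    if (--ino->refs == 0) {
        /* Last holder gone: the analogue of doing the work at drop time. */
        ino->truncated_while_mapped = ino->mapped_pages > 0;
        ino->truncated = true;
    }
}

/* An open descriptor holds one reference. */
void toy_open(struct toy_inode *ino) { ino->refs++; }

/* A mapping holds its own reference and accounts for its pages. */
void toy_mmap(struct toy_inode *ino, int pages)
{
    ino->refs++;
    ino->mapped_pages += pages;
}

/* Unmap first, then drop the reference, mirroring the order Linus checked. */
void toy_munmap(struct toy_inode *ino, int pages)
{
    ino->mapped_pages -= pages;
    toy_put(ino);
}

/* Closing a descriptor drops a reference but does not truncate by itself. */
void toy_close(struct toy_inode *ino) { toy_put(ino); }
```

In this model, closing the descriptor while a mapping still exists leaves the data alone, and the truncate runs only once the last holder is gone, by which point no pages remain mapped. A design that truncates inside the per-descriptor close path instead would be able to hit a still-mapped page, which is the kind of confusion the new warning exposes.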
From: Adrian Bunk [email blocked]
Subject: 2.6.20-rc4: known regressions with patches (v2)
Date: Tue, 9 Jan 2007 06:51:01 +0100

This email lists some known regressions in 2.6.20-rc4 compared to 2.6.19
with patches available.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch of
you caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : BUG: at mm/truncate.c:60 cancel_dirty_page() (XFS)
References : http://lkml.org/lkml/2007/1/5/308
Submitter  : Sami Farin [email blocked]
Handled-By : David Chinner [email blocked]
Patch      : http://lkml.org/lkml/2007/1/7/201
Status     : patch available


Subject    : bluetooth oopses because of multiple kobject_add()
References : http://lkml.org/lkml/2007/1/2/101
Submitter  : Pavel Machek [email blocked]
Handled-By : Marcel Holtmann [email blocked]
Patch      : http://lkml.org/lkml/2007/1/2/147
Status     : patch available


Subject    : ftp: get or put stops during file-transfer
References : http://lkml.org/lkml/2006/12/16/174
Submitter  : Komuro [email blocked]
Caused-By  : YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
             commit cfb6eeb4c860592edd123fdea908d23c6ad1c7dc
Handled-By : Craig Schlenter [email blocked]
             YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Patch      : http://lkml.org/lkml/2007/1/9/5
Status     : patch available


Subject    : nf_conntrack_netbios_ns.c causes Oops
References : http://lkml.org/lkml/2007/1/7/188
Submitter  : Peter Osterlund [email blocked]
Caused-By  : Patrick McHardy [email blocked]
             commit 92703eee4ccde3c55ee067a89c373e8a51a8adf9
Handled-By : Patrick McHardy [email blocked]
Patch      : http://lkml.org/lkml/2007/1/8/290
Status     : patch available


Subject    : forcedeth.c 0.59: problem with sideband management
References : http://bugzilla.kernel.org/show_bug.cgi?id=7684
Submitter  : Michael Reske [email blocked]
Handled-By : Ayaz Abdulla [email blocked]
Patch      : http://bugzilla.kernel.org/show_bug.cgi?id=7684
Status     : patch available


Subject    : nVidia CK804 chipset: not detecting HT MSI capabilities
References : http://lkml.org/lkml/2007/1/5/215
Submitter  : Brice Goglin [email blocked]
             Robert Hancock [email blocked]
Handled-By : Brice Goglin [email blocked]
Patch      : http://lkml.org/lkml/2007/1/5/215
Status     : patch available
email address bleed through
jeremy,
looks like some email addresses are sneaking past your script/filter.
and something interesting happened with "Berthold Cogel [email blocked]-koeln.de>"
just thought i would point it out as you have always tried hard in the past to take care of that.
thanks for kerneltrap!