This message contains a list of some regressions from 2.6.26, for which there are no fixes in the mainline I know of. If any of them have been fixed already, please let me know.

If you know of any other unresolved regressions from 2.6.26, please let me know as well and I'll add them to the list. Also, please let me know if any of the entries below are invalid.

Each entry from the list will additionally be sent in an automatic reply to this message, with CCs to the people involved in reporting and handling the issue.

Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2008-08-16      103       47          37
  2008-08-10       80       52          31
  2008-08-02       47       31          20

Unresolved regressions
----------------------

Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=11356
Subject     : Linux 2.6.27-rc3 - build failure: undefined reference to `.lockdep_count_forward_deps'
Submitter   : Frans Pop <elendil@planet.nl>
Date        : 2008-08-16 19:11 (1 days old)
References  : http://marc.info/?l=linux-kernel&m=121891396320127&w=4

Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=11355
Subject     : Regression in 2.6.27-rc2 when cross-building the kernel
Submitter   : Larry Finger <Larry.Finger@lwfinger.net>
Date        : 2008-08-16 2:38 (1 days old)
References  : http://marc.info/?l=linux-kernel&m=121885432118368&w=4

Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=11354
Subject     : AMD Elan regression with 2.6.27-rc3
Submitter   : Sean Young <sean@mess.org>
Date        : 2008-08-15 18:37 (2 days old)
References  : http://marc.info/?l=linux-kernel&m=121882578430056&w=4

Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=11344
Subject     : lockdep link failed
Submitter   : Ming Lei <tom.leiming@gmail.com>
Date        : 2008-08-14 9:58 (3 days old)
References  : http://marc.info/?l=linux-kernel&m=121870792715847&w=4

Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=11343
Subject     : ...
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11141 Subject : no battery or DC status - Dell i1501 Submitter : Gu Rui <chaos.proton@gmail.com> Date : 2008-07-21 19:43 (27 days old) Handled-By : Zhao Yakui <yakui.zhao@intel.com> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11207 Subject : VolanoMark regression with 2.6.27-rc1 Submitter : Zhang, Yanmin <yanmin_zhang@linux.intel.com> Date : 2008-07-31 3:20 (17 days old) References : http://marc.info/?l=linux-kernel&m=121747464114335&w=4 Handled-By : Zhang, Yanmin <yanmin_zhang@linux.intel.com> Peter Zijlstra <a.p.zijlstra@chello.nl> Dhaval Giani <dhaval@linux.vnet.ibm.com> Miao Xie <miaox@cn.fujitsu.com> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11191 Subject : 2.6.26-git8: spinlock lockup in c1e_idle() Submitter : Mikhail Kshevetskiy <mikhail.kshevetskiy@gmail.com> Date : 2008-07-24 03:22 (24 days old) References : http://lkml.org/lkml/2008/7/23/317 Handled-By : Thomas Gleixner <tglx@linutronix.de> --
As of 2.6.26-rc3-git3 the bug still exists. It affects both the i386 and x86_64 architectures. Mikhail On Sat, 16 Aug 2008 21:02:46 +0200 (CEST) --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11189 Subject : sky2 WOL broken Submitter : Rafael J. Wysocki <rjw@sisk.pl> Date : 2008-07-20 0:20:10 (28 days old) References : http://marc.info/?l=linux-next&m=121651311115104&w=4 Handled-By : Stephen Hemminger <shemminger@vyatta.com> Rafael J. Wysocki <rjw@sisk.pl> Patch : http://marc.info/?l=linux-kernel&m=121838931923267&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11205 Subject : x86: 2.6.27-rc1 does not build with gcc-3.2.3 any more Submitter : Mikael Pettersson <mikpe@it.uu.se> Date : 2008-07-30 11:02 (18 days old) References : http://marc.info/?l=linux-kernel&m=121741584608240&w=4 Handled-By : Mikael Pettersson <mikpe@it.uu.se> Patch : http://marc.info/?l=linux-kernel&m=121742199419686&w=2 --
The fix is now in Linus' tree. Commit 1c5b0eb66d74683e2be5da0c53e33c1f4ca982fd. --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11209 Subject : 2.6.27-rc1 process time accounting Submitter : Lukas Hejtmanek <xhejtman@ics.muni.cz> Date : 2008-07-31 10:43 (17 days old) References : http://marc.info/?l=linux-kernel&m=121750102917490&w=4 Handled-By : Peter Zijlstra <a.p.zijlstra@chello.nl> --
Should be fixed by: e26b33e9552c29c1d3fe67dc602c6264c29f5dc7 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11210 Subject : libata badness Submitter : Kumar Gala <galak@kernel.crashing.org> Date : 2008-07-31 18:53 (17 days old) References : http://marc.info/?l=linux-ide&m=121753059307310&w=4 Handled-By : Ben Dooks <ben-linux@fluff.org> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11228 Subject : p54usb broken by commit b19fa1f Submitter : Larry Finger <Larry.Finger@lwfinger.net> Date : 2008-08-02 3:06 (15 days old) References : http://marc.info/?l=linux-kernel&m=121764647801783&w=4 Handled-By : Larry Finger <Larry.Finger@lwfinger.net> Patch : http://marc.info/?l=linux-kernel&m=121779445431434&w=4 --
The fix was pushed from wireless (Linville) to networks (davem) on 8/17. --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11224 Subject : Only three cores found on quad-core machine. Submitter : Dave Jones <davej@redhat.com> Date : 2008-08-01 18:15 (16 days old) References : http://marc.info/?l=linux-kernel&m=121761475224719&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11230 Subject : Kconfig no longer outputs a .config with freshly updated defconfigs Submitter : Josh Boyer <jwboyer@linux.vnet.ibm.com> Date : 2008-08-02 16:03 (15 days old) References : http://marc.info/?l=linux-kernel&m=121769306319391&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11219 Subject : KVM modules break emergency reboot Submitter : Zdenek Kabelac <zdenek.kabelac@gmail.com> Date : 2008-08-01 20:25 (16 days old) References : http://marc.info/?l=linux-kernel&m=121762241105336&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11220 Subject : Heavy suspend and io problems in 2.6.27-rc1-00156-g94ad374 Submitter : Nico Schottelius <nico@schottelius.org> Date : 2008-07-31 21:05 (17 days old) References : http://marc.info/?l=linux-kernel&m=121753882422899&w=4 Handled-By : Rafael J. Wysocki <rjw@sisk.pl> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11215 Subject : INFO: possible recursive locking detected ps2_command Submitter : Zdenek Kabelac <zdenek.kabelac@gmail.com> Date : 2008-07-31 9:41 (17 days old) References : http://marc.info/?l=linux-kernel&m=121749737011637&w=4 Handled-By : Peter Zijlstra <a.p.zijlstra@chello.nl> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11237 Subject : corrupt PMD after resume Submitter : Alan Jenkins <alan-jenkins@tuffmail.co.uk> Date : 2008-08-02 9:51 (15 days old) References : http://marc.info/?l=linux-kernel&m=121767073424952&w=4 Handled-By : Hugh Dickins <hugh@veritas.com> --
Definitely should still be listed: Alan has verified it still happens with -rc3. I keep on going back to look at the info he's sent, to try and work out what might be happening and what to try next. Hugh --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11245 Subject : acpi error on 2.6.27-rc1+ (ACPI Error (dsobject-0501)) Submitter : Marcin Slusarz <marcin.slusarz@gmail.com> Date : 2008-08-03 18:29 (14 days old) References : http://marc.info/?l=linux-kernel&m=121778823123488&w=4 Handled-By : Zhang Rui <rui.zhang@intel.com> Zhao Yakui <yakui.zhao@intel.com> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11254 Subject : KVM: fix userspace ABI breakage Submitter : Adrian Bunk <bunk@kernel.org> Date : 21 Jul 2008 17:58:26 (0 days old) References : http://lkml.org/lkml/2008/7/21/197 Handled-By : Adrian Bunk <bunk@kernel.org> Patch : http://lkml.org/lkml/2008/7/21/197 --
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
--
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11260 Subject : Regression: USB memory stick triggers several USB resets before settling with bogus capacity Submitter : Alex Villacis Lasso <avillaci@ceibo.fiec.espol.edu.ec> Date : 2008-08-06 13:33 (11 days old) Handled-By : Hugh Dickins <hugh@veritas.com> Patch : http://marc.info/?l=linux-kernel&m=121804333614405&w=2 --
James has this fix queued in his scsi-rc-fixes for 2.6.27 http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-rc-fixes-2.6.git;a=commit;h=d211f0... but hasn't asked Linus to pull for a while: I'm hoping it'll get into -rc4. Hugh --
Yes ... sure. linux-next has slowed my push to rcs because it's in there as soon as it's in my git tree. However, give it a couple of days to test out the rest of the fixes in the tree and I'll send a push request. James --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11271 Subject : BUG: fealnx in 2.6.27-rc1 Submitter : Jaswinder Singh <jaswinderlinux@gmail.com> Date : 2008-08-05 14:58 (12 days old) References : http://marc.info/?l=linux-netdev&m=121794762016830&w=4 http://lkml.org/lkml/2008/8/10/98 Handled-By : Francois Romieu <romieu@fr.zoreil.com> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11264 Subject : Invalid op opcode in kernel/workqueue Submitter : Jean-Luc Coulon <jean.luc.coulon@gmail.com> Date : 2008-08-07 04:18 (10 days old) --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11263 Subject : Re: 2.6.27-rc2: uvcvideo WARNING after suspend to ram Submitter : Alan Jenkins <alan-jenkins@tuffmail.co.uk> Date : 2008-08-07 04:02 (10 days old) References : http://comments.gmane.org/gmane.linux.kernel/717552 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11276 Subject : build error: CONFIG_OPTIMIZE_INLINING=y causes gcc 4.2 to do stupid things Submitter : Randy Dunlap <randy.dunlap@oracle.com> Date : 2008-08-06 17:18 (11 days old) References : http://marc.info/?l=linux-kernel&m=121804329014332&w=4 http://lkml.org/lkml/2008/7/22/353 Handled-By : Bjorn Helgaas <bjorn.helgaas@hp.com> Patch : http://lkml.org/lkml/2008/7/22/364 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11272 Subject : BUG: parport_serial in 2.6.27-rc1 for NetMos Technology PCI 9835 Submitter : Jaswinder Singh <jaswinderlinux@gmail.com> Date : 2008-08-05 15:12 (12 days old) References : http://marc.info/?l=linux-kernel&m=121794900319776&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11282 Subject : Please fix x86 defconfig regression Submitter : Andi Kleen <andi@firstfloor.org> Date : 2008-08-07 20:46 (10 days old) References : http://marc.info/?l=linux-kernel&m=121814188805666&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11278 Subject : 2.6.27-rc2: Very odd top: '5124095h kthreadd' display Submitter : Grant Coady <grant_lkml@dodo.com.au> Date : 2008-08-07 7:03 (10 days old) References : http://marc.info/?l=linux-kernel&m=121809267318795&w=4 Handled-By : Peter Zijlstra <peterz@infradead.org> --
The problem is not evident in 2.6.27-rc3. Grant. --
Should be fixed by: e26b33e9552c29c1d3fe67dc602c6264c29f5dc7 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11279 Subject : 2.6.27-rc0 Power Bugs with HP/Compaq Laptops Submitter : Matt Parnell <mparnell@gmail.com> Date : 2008-08-07 14:57 (10 days old) References : http://marc.info/?l=linux-kernel&m=121812108031685&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11293 Subject : 2.6.27-rc2: suspend regression on EeePC Submitter : Alan Jenkins <alan-jenkins@tuffmail.co.uk> Date : 2008-08-06 18:59 (11 days old) References : http://thread.gmane.org/gmane.linux.kernel.kernel-testers/701 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11296 Subject : 2.6.27-rc2-git4: suspend and power off fails on Asus M3A32-MVP Submitter : Rafael J. Wysocki <rjw@sisk.pl> Date : 2008-08-09 21:21 (8 days old) References : http://marc.info/?l=linux-kernel&m=121831675111794&w=4 Handled-By : Langsdorf, Mark <mark.langsdorf@amd.com> --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11308 Subject : tbench regression on each kernel release from 2.6.22 -> 2.6.28 Submitter : Christoph Lameter <cl@linux-foundation.org> Date : 2008-08-11 18:36 (6 days old) References : http://marc.info/?l=linux-kernel&m=121847986119495&w=4 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11313 Subject : Plugging HDMI causes "unable to handle kernel paging request" Submitter : Rafał Miłecki <zajec5@gmail.com> Date : 2008-08-12 14:30 (5 days old) --
Bug still exists in current git (tested 15 minutes ago). -- Rafał Miłecki --
What's your .config on this kernel, BTW?
J
--
Could you apply this patch and post the output of dmesg from booting (no
need to crash it again).
Thanks,
J
diff -r 3f465c361b3c arch/x86/mm/init_64.c
--- a/arch/x86/mm/init_64.c Wed Aug 13 20:50:10 2008 -0700
+++ b/arch/x86/mm/init_64.c Tue Aug 19 16:50:10 2008 -0700
@@ -331,6 +331,8 @@
}
if (pmd_val(*pmd)) {
+ printk("addr %lx reusing pmd %lx %016lx\n",
+ address, __pa(pmd), pmd_val(*pmd));
if (!pmd_large(*pmd))
last_map_addr = phys_pte_update(pmd, address,
end);
@@ -392,6 +394,8 @@
}
if (pud_val(*pud)) {
+ printk("addr %lx reusing pud %lx %016lx\n",
+ addr, __pa(pud), pud_val(*pud));
if (!pud_large(*pud))
last_map_addr = phys_pmd_update(pud, addr, end,
page_size_mask);
@@ -500,6 +504,8 @@
next = end;
if (pgd_val(*pgd)) {
+ printk("addr %lx reusing pgd %lx %016lx\n",
+ __pa(start), __pa(pgd), pgd_val(*pgd));
last_map_addr = phys_pud_update(pgd, __pa(start),
__pa(end), page_size_mask);
continue;
--
I mostly used openSUSE's kernel configuration. I just disabled paravirt.
http://bugzilla.kernel.org/attachment.cgi?id=17329
Sure, anything to help debug this :)
http://bugzilla.kernel.org/attachment.cgi?id=17330
-- Rafał Miłecki
--
Thanks, but could you post the *full* dmesg output, including the added
lines? I want to see the other things it prints around there.
That said, I don't see anything unexpected in here. It would be
interesting to compare to the E820 map.
Also, what kind of machine is this? Oh, Vaio. Hm. Have you checked to
see whether there's an updated BIOS? How much memory does it have
installed?
J
--
2008/8/20 Jeremy Fitzhardinge <jeremy@goop.org>:
> Thanks, but could you post the *full* dmesg output, including the added
> lines? I want to see the other things it prints around there.
I added full dmesg 5 minutes after adding grepped:
http://bugzilla.kernel.org/attachment.cgi?id=17331

> That said, I don't see anything unexpected in here. It would be
> interesting to compare to the E820 map.
OK, I'll compare that tomorrow.

> Also, what kind of machine is this? Oh, Vaio. Hm. Have you checked to
> see whether there's an updated BIOS? How much memory does it have
> installed?
It's Sony Vaio FW11 with 4GB of RAM. Will check for BIOS update tomorrow.
-- Rafał Miłecki
--
Some notes as I pick through all the evidence so far:
- the crash is specifically because there are reserved bits set in the pmd
- the pmd is b02a00043a6001a3 in both cases
- the vaddr is ffff88013a600000 in the first crash, and
ffff81013a6d1c00 in the second
corresponding to the same large-page pmd mapping of phys page 0x13a600000
- this maps to e820 entry
BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
- the corresponding boot-time mapping is
init_memory_mapping
0100000000 - 0140000000 page 2M
kernel direct mapping tables up to 140000000 @ b000-11000
^^^^^^^^^^
addr 100000000 reusing pgd 201880 0000000000202063
last_map_addr: 140000000 end: 140000000
!!! and the memory allocated for this pagetable is:
#5 [0000008000 - 000000b000] PGTABLE ==> [0000008000 - 000000b000]
#6 [000000b000 - 000000c000] PGTABLE ==> [000000b000 - 000000c000]
^^^^^^^^^^^^^^^^^^^^^^^
IOW, it's mapping using b000-11000, but it has only reserved b000 - c000
Also, this is right in the middle of the ISA area, which seems risky.
<<<<
Bug #11237 shows the same symptom, so I'm pretty confident they're dups now.
J
--
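The address arithmetic in the notes above can be checked with a small userspace sketch. It assumes, and the thread does not state this, that both faulting addresses are direct-map addresses and that the direct-map bases are 0xffff880000000000 for the first crash and 0xffff810000000000 for the second. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed direct-map bases; these values are not given in the thread. */
#define PAGE_OFFSET_NEW 0xffff880000000000ULL
#define PAGE_OFFSET_OLD 0xffff810000000000ULL
#define PMD_PAGE_SIZE   (1ULL << 21)    /* 2M large page */

/* Physical address behind a direct-map virtual address. */
static uint64_t direct_map_to_phys(uint64_t vaddr, uint64_t page_offset)
{
    assert(vaddr >= page_offset);
    return vaddr - page_offset;
}

/* Base of the 2M large page that contains phys. */
static uint64_t large_page_base(uint64_t phys)
{
    return phys & ~(PMD_PAGE_SIZE - 1);
}

/* Non-zero if phys lies in the half-open range [start, end). */
static int in_range(uint64_t phys, uint64_t start, uint64_t end)
{
    return phys >= start && phys < end;
}
```

Under those assumptions, both faulting addresses reduce to the large page at 0x13a600000, which lies inside the e820 range 0000000100000000 - 0000000140000000 quoted above.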
I have marked #11313 as a duplicate of #11237. Please use the latter one from now on. Thanks, Rafael --
Yes, it's corrupt, it should be [ Sorry, I'm replying to #11313 even though we think it's dup of #11237. ] Haven't you got that backwards? My reading is that find_early_table_space set aside b000-11000 for the worst case possible, but actually only b000-c000 was needed (because most of the tables were already there): no problem. --
Drat. I think you may be right.
J
--
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11316 Subject : severe performance regression for iptables nat routing Submitter : Alex Williamson <alex.williamson@hp.com> Date : 2008-08-12 22:04 (5 days old) Handled-By : Herbert Xu <herbert@gondor.apana.org.au> Patch : http://bugzilla.kernel.org/show_bug.cgi?id=11316#c15 http://bugzilla.kernel.org/show_bug.cgi?id=11316#c16 --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11323 Subject : /proc/diskstats does not contain all disk devices Submitter : Andy Ryan <genanr@emsphone.com> Date : 2008-08-13 12:12 (4 days old) Handled-By : Greg Kroah-Hartman <greg@kroah.com> Kay Sievers <kay.sievers@vrfy.org> Patch : http://bugzilla.kernel.org/attachment.cgi?id=17257&action=view --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11330 Subject : int3: 0000 in tsc_read_refs when using powernow_k7 Submitter : Mikko Vinni <mmvinni@yahoo.com> Date : 2008-08-14 04:21 (3 days old) Patch : http://bugzilla.kernel.org/show_bug.cgi?id=11330#c2 --
There is this fix
commit d554d9a4295dd0595d12eeccbc55d1f495b15176
Author: Marcin Slusarz <marcin.slusarz@gmail.com>
Date: Mon Aug 11 00:07:44 2008 +0200
x86, tsc: fix section mismatch warning
which is in x86/tip-master which fixes this issue.
I don't see the fix in the mainline tree yet.
Maybe Ingo has it queued for upstream?
Ingo, other than a section mismatch warning it also fixes a real bug.
Thanks,
--
yeah, hpa queued it up into x86/urgent as well earlier today, it will go out with the next pull request. Ingo --
FYI, commit d554d9a4295d is upstream now, and will be part of -rc4. Ingo --
Thanks, I closed the bug. Rafael --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11334 Subject : myri10ge: use ioremap_wc: compilation failure on ARM Submitter : Martin Michlmayr <tbm@cyrius.com> Date : 2008-08-10 11:25 (7 days old) References : http://marc.info/?l=linux-netdev&m=121836771727632&w=2 Handled-By : Brice Goglin <brice@myri.com> --
This is still there. -- Martin Michlmayr http://www.cyrius.com/ --
This message has been generated automatically as a part of a report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.26. Please verify if it still should be listed and let me know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11335 Subject : 2.6.27-rc2-git5 BUG: unable to handle kernel paging request Submitter : Randy Dunlap <randy.dunlap@oracle.com> Date : 2008-08-12 4:18 (5 days old) References : http://marc.info/?l=linux-kernel&m=121851477201960&w=4 Handled-By : Hugh Dickins <hugh@veritas.com> --
This should still be listed for now, it's interesting, but I doubt we'll make any progress unless it can be reproduced. Hugh --
zap_pte_range() overruns the page tables if the distance between the
start and end is not a multiple of the pagesize. Because then,
`start' will never be equal to `end' and we will keep looping.
To fix this, round the boundary addresses to exclude partial pages from
the range completely; we must not unmap them anyway.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
---
I think this patch fixes it. exit_mmap() even calls unmap_vmas() with
an ending address of -1UL which is not page-aligned in my book and on my
architecture :)
It is a similar problem to what we had with gup some weeks ago.
diff --git a/mm/memory.c b/mm/memory.c
index 1002f47..483c5d0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -896,11 +896,17 @@ unsigned long unmap_vmas(struct mmu_gather **tlbp,
long zap_work = ZAP_BLOCK_SIZE;
unsigned long tlb_start = 0; /* For tlb_finish_mmu */
int tlb_start_valid = 0;
- unsigned long start = start_addr;
+ unsigned long start;
spinlock_t *i_mmap_lock = details? details->i_mmap_lock: NULL;
int fullmm = (*tlbp)->fullmm;
struct mm_struct *mm = vma->vm_mm;
+ /* Preserve partial pages */
+ start_addr = PAGE_ALIGN(start_addr);
+ end_addr &= PAGE_MASK;
+
+ start = start_addr;
+
mmu_notifier_invalidate_range_start(mm, start_addr, end_addr);
for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) {
unsigned long end;
--
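The failure mode described in the patch above can be modelled in userspace. This is a simplified sketch, not the kernel's zap_pte_range(): walk_pages() and walk_pages_rounded() are hypothetical names, the page size is assumed to be 4K, and the model stops once it steps past `end` where a real `!=`-terminated loop would keep going.

```c
#include <assert.h>

#define PAGE_SHIFT 12                   /* assumed 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))
#define PAGE_ALIGN(addr) (((addr) + PAGE_SIZE - 1) & PAGE_MASK)

/*
 * Walk [addr, end) one page at a time with a `!=` exit test.
 * Returns the number of pages visited, or -1 if the walk stepped
 * past `end` without ever being equal to it.
 */
static long walk_pages(unsigned long addr, unsigned long end)
{
    long pages = 0;

    assert(addr <= end);    /* the model ignores address wrap-around */
    if (addr == end)
        return 0;
    do {
        pages++;
        addr += PAGE_SIZE;
        if (addr > end)
            return -1;      /* a real loop would keep running here */
    } while (addr != end);
    return pages;
}

/* Same walk after rounding the bounds the way the patch does. */
static long walk_pages_rounded(unsigned long addr, unsigned long end)
{
    addr = PAGE_ALIGN(addr);    /* round start up */
    end &= PAGE_MASK;           /* round end down */
    if (addr >= end)
        return 0;               /* no whole page left in the range */
    return walk_pages(addr, end);
}
```

With these definitions, walk_pages(0x1000, 0x4000) visits three pages, walk_pages(0x1000, 0x3800) steps past the end, and walk_pages_rounded(0x1000, 0x3800) visits only the two whole pages.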
You need to take into consideration that gazillions of calls to exit_mmap(), unmap_vmas() and zap_pte_range() have been succeeding since we reworked those loops three years ago.

exit_mmap() calls unmap_vmas() with a start_addr of 0 (so your patch won't help that), and the (unsigned long) end_addr of -1 is simply an upper bound on how far the vma loop goes; it doesn't need the alignment your patch enforces.

That's a great idea that overrunning a pagetable may account for Randy's apparent pagetable corruption: I (and please, you too) need to go back over the info he's given with that hypothesis in mind, it certainly fits well the fact that 6 out of 7 entries were found bad at the _start_ of a pagetable before collapsing - though OTOH I don't think it does fit with the two processes seeing similar but different corruption, or the general protection faults. But definitely worth pursuing, it hadn't crossed my mind.

But if a pagetable is being overrun in that way, doesn't that mean that a vma->vm_start (or vma->vm_end?) has got corrupted, and then we'll need to work that out. vm_start and vm_end (unless corrupted) are always page aligned, and there's lots of code which assumes that.

You're right that those pgd_addr_end() etc. loops have an implicit and fragile dependence on the page alignment of addr and end. They were written that way to maximize efficiency and be homogeneous across the levels, while handling the wrapped end 0 case. But both fast gup and pagewalk have stumbled on those assumptions recently.

Hugh --
Hi Hugh,

Now that you say it, yes, I don't see any way how the upper bound of -1UL could break it as vm_end is most probably lower than that :)

However:

	start = max(vma->vm_start, start_addr);
	end = min(vma->vm_end, end_addr);

The overrun *is* possible if the given ending address is lower than the vm_end.

Frankly, I didn't look too much at what Randy reported. I ran off a bit quick when I saw that the fault came on an empty PMD within this code as this overrun issue was still in the back of my head and I knew there were similar loops involved.

No, I have not. But an overrun condition also does not require broken VMA bounds.

Yeah, especially since they could cause silent page table corruption :(

In this respect, I still think that my patch has a point. Because yes, the looping depends on page aligned boundaries, but we don't check for this required dependency and values leading to overruns are able to pass

Hannes --
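The point about the max()/min() clamping can be shown with a short sketch. struct range, clamp_to_vma() and page_aligned() are hypothetical userspace stand-ins, and a 4K page size is assumed; this is not the code in mm/memory.c.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL    /* assumed 4K pages */

struct range {
    unsigned long start;
    unsigned long end;
};

/* Clamp a requested range to the VMA bounds, as max()/min() would. */
static struct range clamp_to_vma(unsigned long vm_start, unsigned long vm_end,
                                 unsigned long start_addr,
                                 unsigned long end_addr)
{
    struct range r;

    assert(vm_start <= vm_end);
    r.start = vm_start > start_addr ? vm_start : start_addr;
    r.end = vm_end < end_addr ? vm_end : end_addr;
    return r;
}

/* Non-zero if addr sits on a page boundary. */
static int page_aligned(unsigned long addr)
{
    return (addr & (PAGE_SIZE - 1)) == 0;
}
```

For a VMA covering 0x1000 to 0x5000, a request of 0 to ~0UL clamps to the aligned VMA bounds, while a request ending at 0x3800 leaves `end` at 0x3800: the clamp passes an unaligned bound through even though the VMA itself is intact.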
I don't think the patch you sent had a lot of point: if there is a problem, it extends way beyond just the entry to unmap_vmas(); and really it's not the well-established loops we have to worry about, it's where people add new ones without thinking about alignment.

If we put alignment BUG_ONs at the start of every such loop, yes, that would help the new ones to follow the same pattern. Or if we put alignment VM_BUG_ONs inside p?d_addr_next(), that might help too - I say VM_BUG_ONs because we don't really want to slow down the usual config, though that would then miss any cases of vma corruption in the wild.

But even if we did so, it looks like we go for a long while only testing the page-aligned cases anyway (which, barring corruption, is always the case coming from vm_start and vm_end: the exceptions are things like fault addresses or atypical I/O sizes), which would not BUG anyway. As soon as someone does try the unaligned, we veer off to an unbounded loop and hit something nasty quite noisily, don't we?

I do think there's a message about review and testing here, but not a great case for BUGs.

Well, you didn't BUG, you enforced alignment; but if the input is wrong, you cannot tell whether to round up or round down in there, so better to BUG or WARN.

Hugh
--
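As a user-space sketch of the difference between checking and silently rounding: check_range() below is a hypothetical stand-in for an alignment BUG_ON or WARN at loop entry (it returns an error code instead of crashing), and the two rounding helpers show why the callee cannot choose the right correction by itself. The 4 KiB PAGE_SIZE and all function names are assumptions made for illustration.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL		/* assumed page size */
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* The two possible "fixes" for a misaligned address. */
static unsigned long round_down_page(unsigned long addr)
{
	return addr & PAGE_MASK;
}

static unsigned long round_up_page(unsigned long addr)
{
	return (addr + PAGE_SIZE - 1) & PAGE_MASK;
}

/*
 * Hypothetical entry check: 0 if both bounds are page aligned,
 * -1 otherwise. A wrapped end of 0 counts as aligned.
 */
static int check_range(unsigned long addr, unsigned long end)
{
	if ((addr | end) & ~PAGE_MASK)
		return -1;
	return 0;
}
```

For an end of 0x2800 the two helpers give 0x2000 and 0x3000, so silently picking one either skips part of a page the caller meant to cover or covers a page it did not ask for. Rejecting the range leaves that decision with the callsite.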
Hi,

The loops might have been there for long but the usage and input are prone to change. For example remap_pfn_range is used by drivers and it has the same alignment requirements. Perhaps an explicit comment in the kerneldoc?

Iff there is even a problem with all these things, still looking through callsites, rereading your mails and thinking about it.. Hey, this thing

Agreed.

Well, in the unmap_vmas() case you cannot unmap partial pages, so you would probably be able to guess correctly. But I agree it should be up to the callsite.

Hannes
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11333 Subject : Rewrite SSB DMA API breaks compilation on ARM Submitter : Martin Michlmayr <tbm@cyrius.com> Date : 2008-08-10 12:16 (7 days old) References : http://marc.info/?l=linux-wireless&m=121837082431460&w=2 --
This just got fixed by "[ARM] dma-mapping: provide sync_range APIs": http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=9dd428... -- Martin Michlmayr http://www.cyrius.com/ --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11336 Subject : 2.6.27-rc2:stall while mounting root fs Submitter : Torsten Kaiser <just.for.lkml@googlemail.com> Date : 2008-08-12 12:37 (5 days old) References : http://marc.info/?l=linux-kernel&m=121854484015909&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11338 Subject : ia64 allmodconfig on current mainline Submitter : Andrew Morton <akpm@linux-foundation.org> Date : 2008-08-12 22:06 (5 days old) References : http://marc.info/?l=linux-ia64&m=121857881314455&w=4 Handled-By : Luck, Tony <tony.luck@intel.com> Robin Holt <holt@sgi.com> --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11337 Subject : Warning in during hotplug on 2.6.27-rc2-git5 Submitter : Mark Langsdorf <mark.langsdorf@amd.com> Date : 2008-08-12 21:56 (5 days old) References : http://marc.info/?l=linux-kernel&m=121857820413373&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11339 Subject : Only one of my cpus seems to powered down by cpufreq Submitter : Torsten Kaiser <just.for.lkml@googlemail.com> Date : 2008-08-13 20:18 (4 days old) References : http://marc.info/?l=linux-kernel&m=121865907511340&w=4 Handled-By : Langsdorf, Mark <mark.langsdorf@amd.com> --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11340 Subject : LTP overnight run resulted in unusable box Submitter : Alexey Dobriyan <adobriyan@gmail.com> Date : 2008-08-13 9:24 (4 days old) References : http://marc.info/?l=linux-kernel&m=121861951902949&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11343 Subject : SATA Cold Boot Problems with 2.6.27-rc[23] on nVidia 680i Submitter : Manny Maxwell <mannymax@mannymax.net> Date : 2008-08-14 4:16 (3 days old) References : http://marc.info/?l=linux-kernel&m=121868782917600&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11344 Subject : lockdep link failed Submitter : Ming Lei <tom.leiming@gmail.com> Date : 2008-08-14 9:58 (3 days old) References : http://marc.info/?l=linux-kernel&m=121870792715847&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11346 Subject : kernel BUG at arch/x86/mm/pat.c:233! Submitter : Jean Delvare <khali@linux-fr.org> Date : 2008-08-15 02:10 (2 days old) Handled-By : Andi Kleen <andi@firstfloor.org> Patch : http://bugzilla.kernel.org/attachment.cgi?id=17270&action=view --
Hi Rafael, Andi's patch still needs to be pushed to Linus. -- Jean Delvare --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11354 Subject : AMD Elan regression with 2.6.27-rc3 Submitter : Sean Young <sean@mess.org> Date : 2008-08-15 18:37 (2 days old) References : http://marc.info/?l=linux-kernel&m=121882578430056&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11341 Subject : 2.6.27-rc1 - ext4 e2fsck false prompting for fixing i_size of Inode Submitter : Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Date : 2008-08-13 6:56 (4 days old) References : http://marc.info/?l=linux-kernel&m=121861058720051&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11355 Subject : Regression in 2.6.27-rc2 when cross-building the kernel Submitter : Larry Finger <Larry.Finger@lwfinger.net> Date : 2008-08-16 2:38 (1 days old) References : http://marc.info/?l=linux-kernel&m=121885432118368&w=4 --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11356 Subject : Linux 2.6.27-rc3 - build failure: undefined reference to `.lockdep_count_forward_deps' Submitter : Frans Pop <elendil@planet.nl> Date : 2008-08-16 19:11 (1 days old) References : http://marc.info/?l=linux-kernel&m=121891396320127&w=4 --
This wasn't a regression because it wasn't a kernel bug (and so by definition it existed on prior kernel versions :-). I've just checked in a fix into the e2fsprogs repository, and I've included the e2fsprogs patch in the bugzilla record for the user's convenience. - Ted --
