This message contains a list of some regressions from 2.6.33, for which there
are no fixes in the mainline known to the tracking team. If any of them have
been fixed already, please let us know.

If you know of any other unresolved regressions from 2.6.33, please let us know
as well and we'll add them to the list. Also, please let us know if any of the
entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.

Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2010-04-20       64       35          34
  2010-04-07       48       35          33
  2010-03-21       15       13          10

Unresolved regressions
----------------------

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15812
Subject         : utsname.domainname not set in x86_32 processes (causing "YPBINDPROC_DOMAIN: domain not bound" errors)
Submitter       : <adi@hexapodia.org>
Date            : 2010-04-19 21:28 (1 days old)

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15805
Subject         : reiserfs locking
Submitter       : Alexander Beregalov <a.beregalov@gmail.com>
Date            : 2010-04-15 21:02 (5 days old)
Message-ID      : <t2ka4423d671004151402n7b2dc425mdc9c6bb9640d63fb@mail.gmail.com>
References      : http://marc.info/?l=linux-kernel&m=127136535323933&w=2

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15796
Subject         : [REGRESSION bisected] Sound goes too fast due to commit 7b3a177b0
Submitter       : Éric Piel <Eric.Piel@tremplin-utc.net>
Date            : 2010-04-13 21:54 (7 days old)
First-Bad-Commit: http://kernel.org/git/linus/7b3a177b0d4f92b3431b8dca777313a07533a710
Message-ID      : <4BC4E812.6050602@tremplin-utc.net>
References      : http://marc.info/?l=linux-kernel&m=127119569009790&w=2
Handled-By      : Takashi Iwai <tiwai@suse.de>

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=15795
Subject         : 2.6.34-rc4 : OOPS in ...
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15505 Subject : No more b43 wireless interface since 2.6.34-rc1 Submitter : Christian Casteyde <casteyde.christian@free.fr> Date : 2010-03-10 06:59 (41 days old) Handled-By : Yinghai Lu <yinghai@kernel.org> Patch : https://bugzilla.kernel.org/show_bug.cgi?id=15505#c11 --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15551 Subject : WARNING: at net/mac80211/work.c:811 ieee80211_work_work+0x7f/0xde8 [mac80211]() Submitter : Alex Zhavnerchik <alex.vizor@gmail.com> Date : 2010-03-16 22:03 (35 days old) --
I merged hch's fix for this twelve seconds ago. --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15589 Subject : 2.6.34-rc1: Badness at fs/proc/generic.c:316 Submitter : Christian Kujau <lists@nerdbynature.de> Date : 2010-03-13 23:53 (38 days old) Message-ID : <alpine.DEB.2.01.1003131544340.5493@bogon.housecafe.de> References : http://marc.info/?l=linux-kernel&m=126852442903680&w=2 --
Yes, unless something in this area has changed from -rc4 to -rc5, this is still printed during boot:

device-tree: Duplicate name in /cpus/PowerPC,G4@0, renamed to "l2-cache#1"
name 'pulses/rev'
------------[ cut here ]------------
Badness at fs/proc/generic.c:317
NIP: c00e14b8 LR: c00e14b8 CTR: c01fc2c0
REGS: c045bdc0 TRAP: 0700 Not tainted (2.6.34-rc4)
MSR: 00029032 <EE,ME,CE,IR,DR> CR: 22000022 XER: 20000000
TASK = c043b410[0] 'swapper' THREAD: c045a000
GPR00: c00e14b8 c045be70 c043b410 00000024 000012ff ffffffff ffffffff 00000000
GPR08: ef808320 c0458670 00000000 000012ff 42000028 00000000 00cb3ccc 00cb39b8
GPR16: 00cd682c 00cb3ca8 00cb38c8 00cb39ac 00240e18 00240e20 00cb3954 00240e24
GPR24: 00000000 0049b000 c045be98 c045bec8 c0da0a42 c0da0a42 00000006 00000000
NIP [c00e14b8] __xlate_proc_name+0xd0/0xf8
LR [c00e14b8] __xlate_proc_name+0xd0/0xf8
Call Trace:
[c045be70] [c00e14b8] __xlate_proc_name+0xd0/0xf8 (unreliable)
[c045be90] [c00e1a2c] __proc_create+0x60/0xf0
[c045bec0] [c00e2194] proc_create_data+0x54/0xc4
[c045bee0] [c00e6310] __proc_device_tree_add_prop+0x64/0xd4
[c045bf00] [c00e64b4] proc_device_tree_add_node+0x134/0x164
[c045bf20] [c00e6434] proc_device_tree_add_node+0xb4/0x164
[c045bf40] [c00e6434] proc_device_tree_add_node+0xb4/0x164
[c045bf60] [c00e6434] proc_device_tree_add_node+0xb4/0x164
[c045bf80] [c00e6434] proc_device_tree_add_node+0xb4/0x164
[c045bfa0] [c0421c30] proc_device_tree_init+0x4c/0x78
[c045bfb0] [c0421698] proc_root_init+0xcc/0xf0
[c045bfc0] [c040e798] start_kernel+0x230/0x284
[c045bff0] [00003444] 0x3444
Instruction dump:
93ba0000 38600000 93fb0000 80010024 bb410008 38210020 7c0803a6 4e800020
3c60c03c 7f84e378 386300a8 48273a45 <0fe00000> 80010024 3860fffe bb410008
--
BOFH excuse #37: heavy gravity fluctuation, move computer to floor rapidly
--
Try this 100% unbuilt, 100% untested patch.
cheers
diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
index f8650dc..9502b48 100644
--- a/fs/proc/proc_devtree.c
+++ b/fs/proc/proc_devtree.c
@@ -175,6 +175,24 @@ retry:
 	return fixed_name;
 }
 
+static const char *unslash_name(const char *name)
+{
+	char *p, *fixed_name;
+
+	fixed_name = kstrdup(name);
+	if (!fixed_name) {
+		printk(KERN_ERR "device-tree: Out of memory trying to unslash "
+				"name \"%s\"\n", name);
+		return name;
+	}
+
+	p = fixed_name;
+	while ((p = strstr(p, "/")))
+		*p++ = '_';
+
+	return fixed_name;
+}
+
 /*
  * Process a node, adding entries for its children and its properties.
  */
@@ -211,6 +229,9 @@ void proc_device_tree_add_node(struct device_node *np,
 		if (duplicate_name(de, p))
 			p = fixup_name(np, de, p);
 
+		if (strstr(p, "/"))
+			p = unslash_name(p);
+
 		ent = __proc_device_tree_add_prop(de, pp, p);
 		if (ent == NULL)
 			break;
--
This is wasteful. :-) Also, I hope we won't spit out a message every time an allocation fails. --
We do. Your system is mostly hosed anyway, but feel free to rate limit it or something.

The error handling in there is a bit dubious: if the alloc fails we just return the old name, which we know is bogus. It should probably return NULL and the calling code can check - same for fixup_name().

cheers
OK Is anyone going to post a clean patch for that with a sign-off? Rafael --
I added GFP_KERNEL to kstrdup to make the compile error go away:

fs/proc/proc_devtree.c: In function ‘unslash_name’:
fs/proc/proc_devtree.c:183: error: too few arguments to function ‘kstrdup’
make[2]: *** [fs/proc/proc_devtree.o] Error 1
make[1]: *** [fs/proc] Error 2
make: *** [fs] Error 2

And now 2.6.34-rc5 compiles and boots without the warning. Thanks!

New dmesg and /proc/device-tree on:
http://nerdbynature.de/bits/2.6.34-rc1/xlate_proc_name/

Alexey mentioned that this is "wasteful" - does it make the kernel slower? I have not done any performance tests, but I'd rather stick with the warning than make this Powerbook G4 any slower :-\

Thanks again,
Christian.

diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
index ce94801..019581d 100644
--- a/fs/proc/proc_devtree.c
+++ b/fs/proc/proc_devtree.c
@@ -176,6 +176,24 @@ retry:
 	return fixed_name;
 }
 
+static const char *unslash_name(const char *name)
+{
+	char *p, *fixed_name;
+
+	fixed_name = kstrdup(name, GFP_KERNEL);
+	if (!fixed_name) {
+		printk(KERN_ERR "device-tree: Out of memory trying to unslash "
+				"name \"%s\"\n", name);
+		return name;
+	}
+
+	p = fixed_name;
+	while ((p = strstr(p, "/")))
+		*p++ = '_';
+
+	return fixed_name;
+}
+
 /*
  * Process a node, adding entries for its children and its properties.
  */
@@ -212,6 +230,9 @@ void proc_device_tree_add_node(struct device_node *np,
 		if (duplicate_name(de, p))
 			p = fixup_name(np, de, p);
 
+		if (strstr(p, "/"))
+			p = unslash_name(p);
+
 		ent = __proc_device_tree_add_prop(de, pp, p);
 		if (ent == NULL)
 			break;
--
BOFH excuse #369: Virus transmitted from computer to sysadmins.
--
You want to use strchr. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." --
Cool, and we see:

./uni-n@f8000000/i2c@f8001000/i2c-bus@1/fan@5c/pulses_rev

Maybe a little. It has to check every string to see if it contains a "/". But then you save the cost of taking an exception for the WARN, which might make up the difference.

But it's a one time fixup at boot, so it's not going to be noticeable.

cheers
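For reference, the two review comments in this subthread (use strchr for a single-character search, and return NULL on allocation failure so the caller can check) could be combined roughly as follows. This is only a userspace sketch, not the patch that was posted: the name unslash_name_sketch() is made up for illustration, and malloc()/memcpy() stand in for the kernel's kstrdup(name, GFP_KERNEL).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace sketch (hypothetical helper, not kernel code): return a newly
 * allocated copy of name with every '/' replaced by '_', or NULL if the
 * allocation fails. The caller owns the result and must free() it.
 */
static char *unslash_name_sketch(const char *name)
{
	size_t len = strlen(name) + 1;
	char *fixed_name = malloc(len);
	char *p;

	if (!fixed_name)
		return NULL;	/* let the caller decide what to do */
	memcpy(fixed_name, name, len);

	/* strchr() looks for one character; no substring search is needed. */
	for (p = fixed_name; (p = strchr(p, '/')) != NULL; p++)
		*p = '_';

	return fixed_name;
}
```

Called with "pulses/rev", the sketch yields "pulses_rev", matching the /proc/device-tree entry shown above. A caller would check for NULL and skip the property instead of registering a name known to be invalid.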
Hi This also continues to be a problem with b2c2-flexcop and 2.6.34-rc5-git10: [ 10.119807] b2c2-flexcop: B2C2 FlexcopII/II(b)/III digital TV receiver chip loaded successfully [ 10.129183] flexcop-pci: will use the HW PID filter. [ 10.129187] flexcop-pci: card revision 2 [ 10.129195] b2c2_flexcop_pci 0000:06:01.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19 [ 10.129239] ------------[ cut here ]------------ [ 10.129244] WARNING: at /tmp/buildd/linux-sidux-2.6-2.6.34~rc5/debian/build/source_amd64_none/fs/proc/generic.c:317 __xlate_proc_name+0xb5/0xd0() [ 10.129246] Hardware name: EP45-DS3 [ 10.129247] name 'Technisat/B2C2 FlexCop II/IIb/III Digital TV PCI Driver' [ 10.129248] Modules linked in: b2c2_flexcop_pci(+) ath9k_common b2c2_flexcop v4l1_compat snd_timer radeon(+) dvb_core ar9170usb(+) ath9k_hw snd_seq_device ir_common tveeprom ttm v4l2_compat_ioctl32 snd drm_kms_helper ir_core ath mac80211 soundcore videobuf_dma_sg cx24123 drm i2c_i801 i2c_algo_bit snd_page_alloc videobuf_core cx24113 s5h1420 cfg80211 rfkill evdev i2c_core tpm_tis btcx_risc tpm led_class pcspkr tpm_bios rtc_cmos button rtc_core intel_agp rtc_lib processor ext4 mbcache jbd2 crc16 dm_mod sg sr_mod cdrom sd_mod usbhid hid uhci_hcd firewire_ohci firewire_core ahci r8169 ehci_hcd mii libata crc_itu_t scsi_mod thermal usbcore nls_base [last unloaded: scsi_wait_scan] [ 10.129279] Pid: 1124, comm: modprobe Not tainted 2.6.34-rc5-sidux-amd64 #1 [ 10.129281] Call Trace: [ 10.129285] [<ffffffff8104ba83>] ? warn_slowpath_common+0x73/0xb0 [ 10.129287] [<ffffffff8104bb20>] ? warn_slowpath_fmt+0x40/0x50 [ 10.129290] [<ffffffff8114f545>] ? __xlate_proc_name+0xb5/0xd0 [ 10.129292] [<ffffffff8114fb2e>] ? __proc_create+0x7e/0x150 [ 10.129294] [<ffffffff811504e7>] ? proc_mkdir_mode+0x27/0x60 [ 10.129297] [<ffffffff8109fb55>] ? register_handler_proc+0x115/0x130 [ 10.129300] [<ffffffff8109d4c1>] ? __setup_irq+0x1d1/0x330 [ 10.129303] [<ffffffffa011b160>] ? flexcop_pci_isr+0x0/0x190 ...
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15625 Subject : BUG: 2.6.34-rc1, RIP is (null) Submitter : Randy Dunlap <randy.dunlap@oracle.com> Date : 2010-03-18 22:22 (33 days old) Message-ID : <4BA2A7A9.4080503@oracle.com> References : http://marc.info/?l=linux-kernel&m=126895098217351&w=2 --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15659 Subject : [Regresion] [2.6.34-rc1] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung Submitter : Maciej Rutecki <maciej.rutecki@gmail.com> Date : 2010-03-25 20:04 (26 days old) Message-ID : <201003252104.24965.maciej.rutecki@gmail.com> References : http://marc.info/?l=linux-kernel&m=126954749618319&w=2 --
Bug still exists in 2.6.34-rc4 -- Maciej Rutecki http://www.maciek.unixy.pl --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15668 Subject : start_kernel(): bug: interrupts were enabled early Submitter : Rabin Vincent <rabin@rab.in> Date : 2010-03-25 19:53 (26 days old) First-Bad-Commit: http://kernel.org/git/linus/773e3eb7b81e5ba13b5155dfb3bb75b8ce37f8f9 Message-ID : <20100325194100.GA2364@debian> References : http://marc.info/?l=linux-kernel&m=126954607216519&w=2 --
This was fixed by 3eac4abaa69949af0e2f64e5c55ee8a22bbdd3e7 ("rwsem generic spinlock: use IRQ save/restore spinlocks"). Rabin --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15671 Subject : intel graphic card hanging (Hangcheck timer elapsed... GPU hung) Submitter : Norbert Preining <preining@logic.at> Date : 2010-03-27 16:11 (24 days old) Message-ID : <20100327161104.GA12043@gamma.logic.tuwien.ac.at> References : http://marc.info/?l=linux-kernel&m=126970883105262&w=2 --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15698 Subject : Freeze on power-off / suspend to ram Submitter : arond <hector1987@gmail.com> Date : 2010-04-05 13:53 (15 days old) --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15704 Subject : [r8169] WARNING: at net/sched/sch_generic.c Submitter : Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Date : 2010-03-31 10:21 (20 days old) Message-ID : <20100331102142.GA3294@swordfish.minsk.epam.com> References : http://marc.info/?l=linux-kernel&m=127003090406108&w=2 --
Hello, .34-rc5-git7 kernel: [12887.906682] pktgen 2.72: Packet Generator for packet performance testing. kernel: [12938.998730] ------------[ cut here ]------------ kernel: [12938.998741] WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0xc1/0x129() kernel: [12938.998745] Hardware name: F3JC kernel: [12938.998748] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out kernel: [12938.998751] Modules linked in: pktgen usb_storage ipv6 snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek asus_laptop sparse_keymap sdhci_pci sdhci snd_hda_intel mmc_core led_class psmouse snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc serio_raw rng_core sg i2c_i801 evdev r8169 mii usbhid hid uhci_hcd ehci_hcd sr_mod cdrom sd_mod usbcore ata_piix kernel: [12938.998808] Pid: 4617, comm: kpktgend_0 Not tainted 2.6.34-rc5-dbg-r8169-git7 #97 kernel: [12938.998811] Call Trace: kernel: [12938.998819] [<c102e1b2>] warn_slowpath_common+0x65/0x7c kernel: [12938.998824] [<c1268085>] ? dev_watchdog+0xc1/0x129 kernel: [12938.998829] [<c102e1fd>] warn_slowpath_fmt+0x24/0x27 kernel: [12938.998834] [<c1268085>] dev_watchdog+0xc1/0x129 kernel: [12938.998841] [<c1040039>] ? __kfifo_from_user_generic+0x30/0x5c kernel: [12938.998848] [<c1036ae3>] ? run_timer_softirq+0x136/0x203 kernel: [12938.998853] [<c1036b3c>] run_timer_softirq+0x18f/0x203 kernel: [12938.998858] [<c1036ae3>] ? run_timer_softirq+0x136/0x203 kernel: [12938.998864] [<c1267fc4>] ? dev_watchdog+0x0/0x129 kernel: [12938.998870] [<c1032a72>] __do_softirq+0x88/0x10c kernel: [12938.998875] [<c1032b25>] do_softirq+0x2f/0x47 kernel: [12938.998883] [<f8095488>] ? pktgen_xmit+0xd3e/0xe0b [pktgen] kernel: [12938.998888] [<c1032d08>] _local_bh_enable_ip+0x8b/0xb3 kernel: [12938.998894] [<c1032d38>] local_bh_enable_ip+0x8/0xa kernel: [12938.998900] [<c12c3818>] _raw_spin_unlock_bh+0x2f/0x32 kernel: [12938.998906] [<f8095488>] pktgen_xmit+0xd3e/0xe0b [pktgen] kernel: [12938.998913] [<c104463c>] ? ...
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15768 Subject : Incorrectly calculated free blocks result in ENOSPC from writepage Submitter : Dmitry Monakhov <dmonakhov@openvz.org> Date : 2010-04-12 11:24 (8 days old) --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15774 Subject : 2.6.34-rc3: eth0 (8139too): transmit queue 0 timed out Submitter : Németh Márton <nm127@freemail.hu> Date : 2010-04-10 12:33 (10 days old) Message-ID : <4BC07022.6000708@freemail.hu> References : http://marc.info/?l=linux-kernel&m=127090287021976&w=2 --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15744 Subject : [2.6.34-rc1 REGRESSION] ahci 0000:00:1f.2: controller reset failed (0xffffffff) Submitter : Andy Isaacson <adi@hexapodia.org> Date : 2010-04-06 22:54 (14 days old) Message-ID : <4BC51312.6080302@oracle.com> References : http://marc.info/?l=linux-kernel&m=127059449031511&w=2 --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15712 Subject : [regression] 2.6.34-rc1 to -rc3 on zaurus: no longer boots Submitter : Pavel Machek <pavel@ucw.cz> Date : 2010-04-01 6:06 (19 days old) Message-ID : <20100401060624.GA1329@ucw.cz> References : http://marc.info/?l=linux-kernel&m=127010200817402&w=2 --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15805 Subject : reiserfs locking Submitter : Alexander Beregalov <a.beregalov@gmail.com> Date : 2010-04-15 21:02 (5 days old) Message-ID : <t2ka4423d671004151402n7b2dc425mdc9c6bb9640d63fb@mail.gmail.com> References : http://marc.info/?l=linux-kernel&m=127136535323933&w=2 --
That doesn't look related to the BKL removal.
In fact, what I wonder is how we missed that before.

vfs_readdir() takes the directory inode mutex
	|
	------- copy_to_user() takes the mm->mmap_sem

sys_munmap() takes mm->mmap_sem
	|
	------- reiserfs_file_release() takes inode mutex

The lock inversion cannot actually happen, as sys_getdents() can't be called
after the directory is closed.

I'm not sure what to do. Adding more people in Cc.
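
To make the reported pattern concrete, here is a toy sketch of how an order-based checker arrives at such a warning. It is not how lockdep is implemented; the enum, the table and record_lock_order() are all hypothetical names, and it only handles inversions between two lock classes, not longer chains.

```c
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { I_MUTEX, MMAP_SEM, NR_LOCK_CLASSES };

/* held_before[a][b] is true once b has been taken while a was held. */
static bool held_before[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/*
 * Record "inner taken while outer held". Returns true if the opposite
 * order was recorded earlier, i.e. the two code paths form an AB-BA pair.
 */
static bool record_lock_order(enum lock_class outer, enum lock_class inner)
{
	held_before[outer][inner] = true;
	return held_before[inner][outer];
}
```

Recording the readdir path (I_MUTEX, then MMAP_SEM) returns false; recording the munmap/release path (MMAP_SEM, then I_MUTEX) afterwards returns true. A checker of this kind only looks at ordering, so it flags the pair even when, as argued above, the two paths cannot run against the same directory at the same time.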
--
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15790 Subject : Meta-Bug: Regressions Submitter : Florian Mickler <fmickler@gmx.de> Date : 2010-04-15 18:21 (5 days old) --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15788 Subject : external usb sound card doesn't work after resume Submitter : François Valenduc <francois.valenduc@tvcablenet.be> Date : 2010-04-15 10:16 (5 days old) --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15795 Subject : 2.6.34-rc4 : OOPS in unmap_vma Submitter : Parag Warudkar <parag.lkml@gmail.com> Date : 2010-04-14 (6 days old) Message-ID : <alpine.DEB.2.00.1004132147260.1881@parag-laptop> References : http://marc.info/?l=linux-kernel&m=127121006625429&w=2 --
Hasn't reproduced after many retries, and I am not sure it can be called a regression; maybe it's always been there, just not easily reproducible. Let's close this; I will reopen if needed. Thanks, Parag --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15812 Subject : utsname.domainname not set in x86_32 processes (causing "YPBINDPROC_DOMAIN: domain not bound" errors) Submitter : <adi@hexapodia.org> Date : 2010-04-19 21:28 (1 days old) --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15796 Subject : [REGRESSION bisected] Sound goes too fast due to commit 7b3a177b0 Submitter : Éric Piel <Eric.Piel@tremplin-utc.net> Date : 2010-04-13 21:54 (7 days old) First-Bad-Commit: http://kernel.org/git/linus/7b3a177b0d4f92b3431b8dca777313a07533a710 Message-ID : <4BC4E812.6050602@tremplin-utc.net> References : http://marc.info/?l=linux-kernel&m=127119569009790&w=2 Handled-By : Takashi Iwai <tiwai@suse.de> --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15717 Subject : bluetooth oops Submitter : Pavel Machek <pavel@ucw.cz> Date : 2010-03-14 20:14 (37 days old) Message-ID : <20100314201434.GE22059@elf.ucw.cz> References : http://marc.info/?l=linux-kernel&m=126859771528426&w=4 Handled-By : Marcel Holtmann <marcel@holtmann.org> --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15719 Subject : virtio_net causing kernel BUG when running under VirtualBox Submitter : Thomas Müller <thomas@mathtm.de> Date : 2010-03-27 14:32 (24 days old) First-Bad-Commit: http://kernel.org/git/linus/9ab86bbcf8be755256f0a5e994e0b38af6b4d399 Message-ID : <4BAE1707.2050803@mathtm.de> References : http://marc.info/?l=linux-kernel&m=126970039227740&w=4 Handled-By : Shirley Ma <mashirle@us.ibm.com> --
Fixed by commit 0e413f22e4c1cbfe12907e462a7d739a2e316f2b. Regards Thomas --
This message has been generated automatically as a part of a summary report of recent regressions. The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way). Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15713 Subject : hackbench regression due to commit 9dfc6e68bfe6e Submitter : Alex Shi <alex.shi@intel.com> Date : 2010-03-25 8:40 (26 days old) First-Bad-Commit: http://kernel.org/git/linus/9dfc6e68bfe6ee452efb1a4e9ca26a9007f2b864 Message-ID : <1269506457.4513.141.camel@alexs-hp.sh.intel.com> References : http://marc.info/?l=linux-kernel&m=126950632920682&w=4 Handled-By : Christoph Lameter <cl@linux-foundation.org> Pekka Enberg <penberg@cs.helsinki.fi> --
I have not been able to reproduce it so far. --
So what are our options? We can revert the SLUB conversion patch for now but I still can't see what's wrong with it... Pekka --
I haven't been able to reproduce this either on my Core 2 machine.
Yanmin, does something like this help on your machines? I'm thinking of false
sharing with some other per-CPU data structure that happens to be put in
the same percpu slot as struct kmem_cache_cpu...
Pekka
diff --git a/mm/slub.c b/mm/slub.c
index 7d6c8b1..d8159d6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2066,7 +2066,7 @@ init_kmem_cache_node(struct kmem_cache_node *n, struct kmem_cache *s)
 #endif
 }
 
-static DEFINE_PER_CPU(struct kmem_cache_cpu, kmalloc_percpu[KMALLOC_CACHES]);
+static DEFINE_PER_CPU_ALIGNED(struct kmem_cache_cpu, kmalloc_percpu[KMALLOC_CACHES]);
 
 static inline int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
 {
@@ -2077,7 +2077,7 @@ static inline int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
 		 */
 		s->cpu_slab = kmalloc_percpu + (s - kmalloc_caches);
 	else
-		s->cpu_slab = alloc_percpu(struct kmem_cache_cpu);
+		s->cpu_slab = __alloc_percpu(sizeof(struct kmem_cache_cpu), cache_line_size());
 
 	if (!s->cpu_slab)
 		return 0;
--
Mostly, the regression exists on Nehalem machines. I suspect it's related to

A quick test doesn't show any improvement.

I did a new test. After the machine boots, I hot-remove 8 hyper-threading CPUs, which means the last 8 are just cores. The regression between 2.6.33 and 2.6.34-rc becomes small.

My opinion is that we needn't revert the patch, but should still keep an eye on it when testing other new RC kernel releases. One reason is that volanoMark and netperf have no such regression. Is that OK?

Yanmin --
Hi Yanmin,
On Mon, Apr 26, 2010 at 9:59 AM, Zhang, Yanmin
OK, so does anyone know why hyper-threading would change things for
We need to get this fixed. In my experience, it's pretty common that
slab regressions pop up only in one or few benchmarks. The problem is
likely to pop up in some real-world workload where it's even more
difficult to track down because basic CPU profiles don't pin-point the
problem.
Do we have some Intel CPU expert hanging around here who could
enlighten me on the effects of hyper-threading on CPU caching? I also
wonder why it's showing up with the new per-CPU allocator and not with
the homebrewed one we had in SLUB previously.
Pekka
--
Hello, My wild speculation is that previously the cpu_slub structures of two neighboring threads ended up on the same cacheline by accident thanks to the back to back allocation. W/ the percpu allocator, this no longer would happen as the allocator groups percpu data together per-cpu. Thanks. -- tejun --
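Whether two small per-CPU objects land on the same cache line is easy to check in isolation. The following is a minimal userspace sketch under stated assumptions: a 64-byte line size (the real value is CPU-specific), C11 _Alignas, and a made-up struct percpu_small standing in for something like struct kmem_cache_cpu. None of these names come from the kernel.

```c
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption; the real value is CPU-specific */

/* Hypothetical stand-in for a small per-CPU structure. */
struct percpu_small {
	void **freelist;
	void *page;
	int node;
};

/* Wrapper that pads and aligns each entry to its own cache line. */
struct percpu_aligned {
	_Alignas(CACHE_LINE_SIZE) struct percpu_small data;
};

/* One entry per (pretend) CPU, each on a separate cache line. */
static struct percpu_aligned aligned[2];

/* Return nonzero if the two addresses fall within the same cache line. */
static int same_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE_SIZE) == ((uintptr_t)b / CACHE_LINE_SIZE);
}
```

With the aligned wrapper, same_cache_line(&aligned[0], &aligned[1]) is false by construction. For entries of the unpadded struct allocated back to back, the result depends on where the allocation happens to start, which is the "by accident" part of the speculation above.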
Hi,
On Mon, Apr 26, 2010 at 9:59 AM, Zhang, Yanmin
Yanmin, do we see a lot of remote frees for your hackbench run? IIRC,
it's the "deactivate_remote_frees" stat when CONFIG_SLAB_STATS is
enabled.
Pekka
--
I'm not familiar with the details or scales here so please take whatever I say with a grain of salt. For hyperthreading configuration I think operations don't have to be remote to be affected. If the data for cpu0 and cpu1 were on the same cache line, and cpu0 and cpu1 are occupying the same physical core thus sharing all the resources it would benefit from the sharing whether any operation was remote or not as it saves the physical processor one cache line. Thanks. -- tejun --
Even if the cacheline is dirtied like in the struct kmem_cache_cpu case?
If that's the case, don't we want the per-CPU allocator to support
back to back allocation for cores that are in the same package?
Btw, I focused on remote frees initially before I understood what you
actually meant and sketched the following untested patch to take advantage
of the fact that struct kmem_cache_cpu doesn't fill a whole cache line. It
tries to amortize remote free costs by "queuing" objects. It would be
interesting to see if it helps here (or in the other SLUB regressions like
netperf and the famous Intel one).
Pekka
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 0249d41..b554a67 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -34,10 +34,14 @@ enum stat_item {
ORDER_FALLBACK, /* Number of times fallback was necessary */
NR_SLUB_STAT_ITEMS };
+#define SLUB_MAX_NR_REMOTES 5
+
struct kmem_cache_cpu {
void **freelist; /* Pointer to first free per cpu object */
struct page *page; /* The slab from which we are allocating */
int node; /* The node of the page (or -1 for debug) */
+ int nr_remotes; /* Number of remotely free'd objects */
+ void *remotelist[SLUB_MAX_NR_REMOTES]; /* List of remotely free'd objects */
#ifdef CONFIG_SLUB_STATS
unsigned stat[NR_SLUB_STAT_ITEMS];
#endif
diff --git a/mm/slub.c b/mm/slub.c
index 7d6c8b1..e8e5523 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1480,6 +1480,24 @@ static void deactivate_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
unfreeze_slab(s, page, tail);
}
+static void __slab_free(struct kmem_cache *s, struct page *page, void *x, unsigned long addr);
+
+static void flush_remotelist(struct kmem_cache *s, struct kmem_cache_cpu *c)
+{
+ int i;
+
+ for (i = 0; i < c->nr_remotes; i++) {
+ struct page *page;
+ void *x;
+
+ x = c->remotelist[i];
+ page = virt_to_head_page(x);
+
+ __slab_free(s, page, x, _RET_IP_);
+ }
+ c->nr_remotes = ...

Hello, Pekka.

If my hypothesis is the case, I don't think dirtying or not would matter. It's about two cpus sharing a cache line which usually is a bad idea but in this case happens to be a good idea because the two

I think it's probably gonna be an over-engineering effort. W/ percpu allocator the rest of the cacheline would likely be occupied by another percpu item for the cpu, so it's not really wasted. It's just used differently. It would be good if we have a way to better pack small hot ones (for the same cpu) into the same cachelines but I don't think it would be wise to interleave stuff from different cpus. It's not like there's only single way to save a cacheline after all.

Thanks.

-- tejun --
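The patch above is cut off before the enqueue side, so how it fills remotelist is not visible here. As a rough illustration of the general "queue, then flush in a batch" idea only, a userspace sketch might look like the following. Everything in it is an assumption: struct remote_queue, queue_remote_free(), flush_remote_queue() and the nr_flushes counter are hypothetical, and the flush does no real freeing.

```c
#define MAX_NR_REMOTES 5	/* mirrors SLUB_MAX_NR_REMOTES in the patch */

/* Hypothetical per-CPU queue of remotely freed objects. */
struct remote_queue {
	int nr_remotes;
	void *remotelist[MAX_NR_REMOTES];
	int nr_flushes;		/* illustration only: counts batch flushes */
};

/* Stand-in for handing every queued object to the slow free path. */
static void flush_remote_queue(struct remote_queue *q)
{
	/* Real code would free q->remotelist[0 .. nr_remotes - 1] here. */
	q->nr_remotes = 0;
	q->nr_flushes++;
}

/* Queue one remotely freed object, flushing first if the queue is full. */
static void queue_remote_free(struct remote_queue *q, void *obj)
{
	if (q->nr_remotes == MAX_NR_REMOTES)
		flush_remote_queue(q);
	q->remotelist[q->nr_remotes++] = obj;
}
```

With a queue depth of 5, six queued objects cause exactly one flush, so the cost of the slow path is paid once per batch rather than once per object. Whether that helps in practice is exactly what the thread leaves open.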
After running the test with 2.6.34-rc5:

#slabinfo -AD
Name                   Objects     Alloc      Free   %Fast Fallb O
skbuff_head_cache         2518 800011810 800009770   95 19     0 1
kmalloc-512               1101 800009118 800008441   95 19     0 2
anon_vma_chain            2500    195878    194477   98 13     0 0
vm_area_struct            2487    160755    158908   97 20     0 1
anon_vma                  2645     88626     87637   99 12     0 0

[ymzhang@lkp-ne01 ~]$ cat /sys/kernel/slab/skbuff_head_cache/deactivate_remote_frees
1 C13=1
[ymzhang@lkp-ne01 ~]$ cat /sys/kernel/slab/kmalloc-512/deactivate_remote_frees
3 C8=2 C15=1

After running the test against the 2.6.33 kernel:

#slabinfo -AD
Name                   Objects     Alloc      Free   %Fast Fallb O
kmalloc-1024               961 800011628 800011167   93  1     0 3
skbuff_head_cache         2518 800012055 800010015   93  1     0 1
vm_area_struct            2892    162196    159987   97 19     0 1
names_cache                128     47139     47141   99 97     0 3
kmalloc-64                3612     40180     37287   99 89     0 0
Acpi-State                 816     36301     36301   99 98     0 0

As I remember, with 2.6.34-rc1 the fast alloc/free percentages were close to those of 2.6.33. --
This message has been generated automatically as a part of a summary report of recent regressions.

The following bug entry is on the current list of known regressions from 2.6.33. Please verify if it still should be listed and let the tracking team know (either way).

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15730
Subject : Ugly rmap NULL ptr deref oopsie on hibernate (was Linux 2.6.34-rc3)
Submitter : Borislav Petkov <bp@alien8.de>
Date : 2010-04-02 17:59 (18 days old)
Message-ID : <20100402175937.GA19690@liondog.tnic>
References : http://marc.info/?l=linux-kernel&m=127023173329741&w=2
--
From: "Rafael J. Wysocki" <rjw@sisk.pl>
Fixed by commit ea90002b0fa7bdee86ec22eba1d951f30bf043a6.
--
Regards/Gruss,
Boris.
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15711
Subject : 2.6.34-rc3, BUG at mm/slab.c:2989
Submitter : Heinz Diehl <htd@fancy-poultry.org>
Date : 2010-04-01 17:52 (19 days old)
Message-ID : <20100401175225.GA6581@fancy-poultry.org>
References : http://marc.info/?l=linux-kernel&m=127014437406250&w=2
--
I don't know if this is still present. After reporting it here on the list, I was advised by Chr. Lameter to switch from slab to slub. I did, and haven't seen this again. --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15729
Subject : BUG: physmap modprobe & rmmod
Submitter : Randy Dunlap <randy.dunlap@oracle.com>
Date : 2010-04-02 20:40 (18 days old)
Message-ID : <20100402134058.c4682716.randy.dunlap@oracle.com>
References : http://marc.info/?l=linux-kernel&m=127024096210230&w=2
--
Patch is here: https://patchwork.kernel.org/patch/90497/

I'd ack it if I still had the original mail :) CCing linux-mtd and some more people, so it can get picked up.

--
Pengutronix e.K. | Wolfram Sang |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15673
Subject : 2.6.34-rc2: "ima_dec_counts: open/free imbalance"?
Submitter : Thomas Meyer <thomas@m3y3r.de>
Date : 2010-03-28 11:31 (23 days old)
Message-ID : <1269775909.5301.4.camel@localhost.localdomain>
References : http://marc.info/?l=linux-kernel&m=126977593326800&w=2
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15672
Subject : KVM bug, git bisected
Submitter : Kent Overstreet <kent.overstreet@gmail.com>
Date : 2010-03-27 12:43 (24 days old)
First-Bad-Commit: http://kernel.org/git/linus/5beb49305251e5669852ed541e8e2f2f7696c53e
Message-ID : <4BADFD74.8060904@gmail.com>
References : http://marc.info/?l=linux-kernel&m=126969385121711&w=2
--
Should be fixed by commit ea90002b0fa7bdee86ec22eba1d951f30bf043a6 --
Never mind me - this is a harmless (but loud) overflow of PREEMPT_BITS in the preempt count. --
OK, what am I supposed to do with this entry, then? Close? Rafael --
From: "Rafael J. Wysocki" <rjw@sisk.pl>
FWIW, I hit that warning too when chasing the anon_vma regression. It
seems on certain workloads (for me it was several kvm guests) we're
close to max preemption depth.
Anyway, adding some more people to Cc.
--
Regards/Gruss,
Boris.
--
Right, so my proposed solution to this is to make those locks preemptible, but that's a large and unfinished patch-set. As it is, it's only a warning, nothing really serious should happen, but the situation does suck. --
I'm not sure if it's worth keeping that listed, though, as the problem is known and won't be solved before .34 final. OK to close as "will fix later"? Rafael --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15669
Subject : INFO: suspicious rcu_dereference_check()
Submitter : Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date : 2010-03-08 1:26 (43 days old)
Message-ID : <c4e36d111003250348q678eb2e6w4f3e8133e7fd6e58@mail.gmail.com>
References : http://marc.info/?l=linux-kernel&m=126801163107713&w=2
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15661
Subject : PROBLEM: crash on halt with 2.6.34-0.16.rc2.git0.fc14.x86_64
Submitter : Jon Masters <jonathan@jonmasters.org>
Date : 2010-03-26 15:29 (25 days old)
Message-ID : <1269617372.3779.234.camel@localhost>
References : http://marc.info/?l=linux-kernel&m=126961739803949&w=2
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15664
Subject : Graphics hang and kernel backtrace when starting Azureus with Compiz enabled
Submitter : Alex Villacis Lasso <avillaci@ceibo.fiec.espol.edu.ec>
Date : 2010-04-01 01:09 (19 days old)
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15610
Subject : fsck leads to swapper - BUG: unable to handle kernel NULL pointer dereference & panic
Submitter : Ozgur Yuksel <ozgur.yuksel@oracle.com>
Date : 2010-03-22 15:59 (29 days old)
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15611
Subject : Failure with the 2.6.34-rc1 kernel
Submitter : Rupjyoti Sarmah <rsarmah@amcc.com>
Date : 2010-03-16 15:45 (35 days old)
Message-ID : <AC311A8E81420D4EBC1F26E6479848FE065B7D3D@SDCEXCHANGE01.ad.amcc.com>
References : http://marc.info/?l=linux-kernel&m=126875435718396&w=2
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15553
Subject : Screen backlight doesn't come back on after lid was closed (GM45)
Submitter : <bugs@kaijauch.de>
Date : 2010-03-17 14:35 (34 days old)
--
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15601
Subject : [BUG] SLOB breaks Crypto
Submitter : michael-dev@fami-braun.de
Date : 2010-03-15 13:39 (36 days old)
Message-ID : <4B9E38AF.70309@fami-braun.de>
References : http://marc.info/?l=linux-kernel&m=126866044724539&w=2
--
This was last in a need-more-info state, if I recall correctly. I haven't reproduced it. -- http://selenic.com : development and support for Mercurial and Linux --
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15590
Subject : 2.6.34-rc1: regression: ^Z no longer stops sound
Submitter : Pavel Machek <pavel@ucw.cz>
Date : 2010-03-14 7:58 (37 days old)
Message-ID : <20100314075831.GA13457@elf.ucw.cz>
References : http://marc.info/?l=linux-kernel&m=126855353122623&w=2
--
Please list these two similar regressions from 2.6.33 in the r600 DRM:
* r600 CS checker rejects GL_DEPTH_TEST w/o depth buffer:
https://bugs.freedesktop.org/show_bug.cgi?id=27571
* r600 CS checker rejects narrow FBO renderbuffers:
https://bugs.freedesktop.org/show_bug.cgi?id=27609
Thanks.
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
--
The first one is a userspace bug; I need to look into the second one. I.e. we were lucky the hw didn't lock up with depth test enabled and no depth buffer. Cheers, Jerome --
As upstream doesn't consider the first to be a kernel issue, I guess you should just list the second. OK, if the failure is due to userspace doing Very Bad Things(tm), catching that seems reasonable. Nevertheless, even if it happened by luck, the result was (ostensibly) working programs that suddenly break once one "upgrades" to the latest kernel. If userspace can't be fixed before 2.6.34 is released, perhaps a less cryptic log message would be appropriate? -- Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/) --
I pushed a fix into mesa for the depth issue. I will look into the other one today and likely push a kernel fix. Cheers, Jerome --
Great, thanks. I'll try out the depth fix tonight. -- Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/) --
I have recently reported this suspend regression on my Dell laptop hardware. References: http://lkml.org/lkml/2010/4/18/20 Bug-report: https://bugzilla.kernel.org/show_bug.cgi?id=15820 Thanks, - Ben --
This has been added to the list now. Please check my comment in the Bugzilla entry. Rafael --
