2.6.26-rc2-mm1: sloooow mkfs.ext2

From: Andrew Morton
Date: Wednesday, May 14, 2008 - 1:01 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.26-rc2/2.6.26-rc2-mm1/


- The -mm tree is now based on linux-next.

  I will occasionally pick up later versions of trees which are already
  in linux-next, to catch material which was added after Stephen last
  pulled that tree.  That happened this time: git-net had a lot of driver
  changes which weren't in linux-next and which I wanted in
  2.6.26-rc2-mm1.

- A few more git trees were added: git-ubifs.patch, git-regulator.patch,
  git-logfs.patch, git-orion.patch.


Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.  These ...
From: Kamalesh Babulal
Date: Wednesday, May 14, 2008 - 4:24 am

Hi Andrew,

The 2.6.26-rc2-mm1 kernel panics during bootup on the x86_64 machine.


BUG: unable to handle kernel paging request at 0000000000001e08
IP: [<ffffffff8026ac60>] __alloc_pages_internal+0x80/0x470
PGD 0 
Oops: 0000 [1] SMP 
last sysfs file: 
CPU 31 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.26-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff8026ac60>]  [<ffffffff8026ac60>] __alloc_pages_internal+0x80/0x470
RSP: 0018:ffff810bf9dbdbc0  EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff810bef4786c0 RCX: 0000000000000001
RDX: 0000000000001e00 RSI: 0000000000000001 RDI: 0000000000001020
RBP: ffff810bf9dbb6d0 R08: 0000000000001020 R09: 0000000000000000
R10: 0000000000000008 R11: ffffffff8046d130 R12: 0000000000001020
R13: 0000000000000001 R14: 0000000000001e00 R15: ffff810bf8d29878
FS:  0000000000000000(0000) GS:ffff810bf916dec0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000001e08 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff810bf9dbc000, task ffff810bf9dbb6d0)
Stack:  0002102000000000 0000000000000002 0000000000000000 0000000200000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 ffff810bef4786c0 0000000000001020 ffffffffffffffff
Call Trace:
 [<ffffffff802112e9>] dma_alloc_coherent+0xa9/0x280
 [<ffffffff804e8c9e>] tg3_init_one+0xa3e/0x15e0
 [<ffffffff8028d0e4>] alternate_node_alloc+0x84/0xd0
 [<ffffffff802286fc>] task_rq_lock+0x4c/0x90
 [<ffffffff8022de62>] set_cpus_allowed_ptr+0x72/0xf0
 [<ffffffff802e12fb>] sysfs_addrm_finish+0x1b/0x210
 [<ffffffff802e0f99>] sysfs_find_dirent+0x29/0x40
 [<ffffffff8036cc34>] pci_device_probe+0xe4/0x130
 [<ffffffff803bfc26>] driver_probe_device+0x96/0x1a0
 [<ffffffff803bfdb9>] __driver_attach+0x89/0x90
 [<ffffffff803bfd30>] __driver_attach+0x0/0x90
 [<ffffffff803bf29d>] ...
From: Andrew Morton
Date: Wednesday, May 14, 2008 - 10:36 am

grumble.  why.  There are lots of patches already which changed the
page allocator.

config, please?

Is it NUMA?
--

From: Kamalesh Babulal
Date: Wednesday, May 14, 2008 - 11:21 am

It is a NUMA box, with 4 nodes.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
From: Andrew Morton
Date: Wednesday, May 14, 2008 - 12:44 pm

On Wed, 14 May 2008 23:51:36 +0530


Can you bisect it please?

Wrecking the page allocator is a fairly unusual thing to do.  I'd start
out by looking at *bootmem*.patch and perhaps
acpi-acpi_numa_init-build-fix.patch.
--

From: KAMEZAWA Hiroyuki
Date: Wednesday, May 14, 2008 - 6:54 pm

On Wed, 14 May 2008 12:44:55 -0700

From stack trace, it seems NODE_DATA(nid) is NULL.
There are 2 cases.
 - nid is bad.
 - NODE_DATA(nid) is not initialized...

Hmm..
Thanks,
-Kame
--
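
The register dump fits Kame's reading: CR2 is 0x1e08 while RDX and R14 hold 0x1e00, which is the pattern a field access through a NULL structure pointer produces (the fault address is just the field offset). A minimal userspace sketch of that arithmetic follows. The structures toy_pgdat and toy_zonelist are hypothetical, and the 0x1e00 padding is chosen only to match the oops, not taken from the real pg_data_t layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the kernel's struct pglist_data. */
struct toy_zonelist {
	uint64_t cache;      /* offset 0 */
	uint64_t first_ref;  /* offset 8 */
};

struct toy_pgdat {
	unsigned char zones[0x1e00];       /* padding picked to match RDX in the oops */
	struct toy_zonelist zonelists[2];  /* starts at offset 0x1e00 */
};

/* Address that would be touched when reading base->zonelists[0].first_ref. */
uintptr_t toy_fault_address(uintptr_t base)
{
	return base + offsetof(struct toy_pgdat, zonelists)
		    + offsetof(struct toy_zonelist, first_ref);
}
```

With base set to 0 (a NULL NODE_DATA(nid)), the computed address lands in the first few kilobytes of the address space, which is never mapped, hence "unable to handle kernel paging request" at a small address.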

From: Kamalesh Babulal
Date: Sunday, May 18, 2008 - 1:00 am

Bisection points to acpi-acpi_numa_init-build-fix.patch as the cause of
the kernel panic during bootup. Reverting the patch lets the machine
boot without the panic.

commit 5dc90c0b2d4bd0127624bab67cec159b2c6c4daf
Author: Ingo Molnar <mingo@elte.hu>
Date:   Thu May 1 09:51:47 2008 +0000

    acpi-acpi_numa_init-build-fix
    
    x86.git testing found the following build error on latest -git:
    
     drivers/acpi/numa.c: In function 'acpi_numa_init':
     drivers/acpi/numa.c:226: error: 'NR_NODE_MEMBLKS' undeclared (first use in this function)
     drivers/acpi/numa.c:226: error: (Each undeclared identifier is reported only once
     drivers/acpi/numa.c:226: error: for each function it appears in.)
    
    with this config:
    
     http://redhat.com/~mingo/misc/config-Wed_Apr_30_22_42_42_CEST_2008.bad
    
    i suspect we dont want SRAT parsing when CONFIG_HAVE_ARCH_PARSE_SRAT
    is unset - but the fix looks a bit ugly. Perhaps we should define
    NR_NODE_MEMBLKS even in this case and just let the code fall back
    to some sane behavior?
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 5d59cb3..8cab8c5 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -176,6 +176,7 @@ acpi_parse_processor_affinity(struct acpi_subtable_header * header,
 	return 0;
 }
 
+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
 static int __init
 acpi_parse_memory_affinity(struct acpi_subtable_header * header,
 			   const unsigned long end)
@@ -193,6 +194,7 @@ acpi_parse_memory_affinity(struct acpi_subtable_header * header,
 
 	return 0;
 }
+#endif
 
 static int __init acpi_parse_srat(struct acpi_table_header *table)
 {
@@ -221,9 +223,11 @@ int __init acpi_numa_init(void)
 	if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
 		acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
 				      acpi_parse_processor_affinity, ...
From: KOSAKI Motohiro
Date: Sunday, May 18, 2008 - 10:07 am

This patch breaks the Fujitsu ia64 NUMA box too.
After reverting it, my test environment works well.

Thanks.
--

From: Lee Schermerhorn
Date: Monday, May 19, 2008 - 7:49 am

On HP ia64 numa, that patch causes all memory to show up on node 0, but
otherwise the platform boots and runs.  Didn't notice it until I tried
to run some numa tests.

Reverting the patch restores numaness.

Lee

--

From: Kamalesh Babulal
Date: Wednesday, May 14, 2008 - 7:03 am

Hi Andrew,

While running the dbench benchmark on a reiserfs filesystem on the
x86_64 box booted with the 2.6.26-rc2-mm1 kernel, the following
kernel BUG() is seen on the console.

------------[ cut here ]------------
kernel BUG at fs/reiserfs/journal.c:1414!
invalid opcode: 0000 [1] SMP 
last sysfs file: /sys/devices/pci0000:20/0000:20:04.1/resource
CPU 3 
Modules linked in:
Pid: 5160, comm: umount Not tainted 2.6.26-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff802e47e5>]  [<ffffffff802e47e5>] flush_journal_list+0x78/0x575
RSP: 0000:ffff8101fec6fb18  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000003
RDX: 0000000000000000 RSI: ffff81007edf7c00 RDI: ffffc2000210d0a0
RBP: ffff8101fec6fb58 R08: ffff8101fec6e000 R09: 0000000000000001
R10: 0000000000000000 R11: ffffc2000212f1b0 R12: ffff81007edf7c00
R13: 000000000000000d R14: ffff8101fe5d2c00 R15: ffffc2000210d000
FS:  0000000000000000(0000) GS:ffff8101fff07f80(0063) knlGS:00000000f7fbdb20
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7f5d260 CR3: 00000001fb475000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 5160, threadinfo ffff8101fec6e000, task ffff8101fe642c80)
Stack:  00000000fec6fb10 ffffffff802a557b 00000000fc7593f0 ffffc2000210d000
 ffff81007f65f900 000000000000000d ffff8101fe5d2c00 ffffc2000210d000
 ffff8101fec6fba8 ffffffff802e4aff 000000000000000d 0000000100000001
Call Trace:
 [<ffffffff802a557b>] submit_bh+0x105/0x111
 [<ffffffff802e4aff>] flush_journal_list+0x392/0x575
 [<ffffffff802e7e76>] do_journal_end+0xb6d/0xe0c
 [<ffffffff80261f7f>] __writepage+0x0/0x2a
 [<ffffffff80263c1f>] pagevec_lookup_tag+0x20/0x2a
 [<ffffffff8025ad8e>] wait_on_page_writeback_range+0xeb/0x13e
 [<ffffffff802e8376>] do_journal_begin_r+0x261/0x2a2
 [<ffffffff802e8a13>] do_journal_release+0x4c/0x180
 [<ffffffff80242070>] bit_waitqueue+0x12/0xa4
 [<ffffffff802e359e>] ...
From: Andrew Morton
Date: Wednesday, May 14, 2008 - 11:01 am

This?

--- a/fs/reiserfs/journal.c~reiserfs-convert-j_flush_sem-to-mutex-fix
+++ a/fs/reiserfs/journal.c
@@ -1412,7 +1412,7 @@ static int flush_journal_list(struct sup
 	/* if flushall == 0, the lock is already held */
 	if (flushall) {
 		mutex_lock(&journal->j_flush_mutex);
-	} else if (!mutex_trylock(&journal->j_flush_mutex)) {
+	} else if (mutex_trylock(&journal->j_flush_mutex)) {
 		BUG();
 	}
 

--
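
The one-character fix above comes down to the two trylock calls having opposite return conventions: down_trylock() returns 0 when it gets the semaphore, while mutex_trylock() returns 1 when it gets the mutex. A semaphore-to-mutex conversion that keeps the '!' therefore inverts the check. A minimal userspace sketch, using a C11 atomic_flag as a stand-in lock (all toy_ names are hypothetical, not kernel APIs):

```c
#include <stdatomic.h>

/* Toy lock for illustration only; not the kernel's mutex or semaphore. */
struct toy_lock {
	atomic_flag held;
};

/* Semaphore-style convention: 0 means acquired, 1 means contended. */
int toy_down_trylock(struct toy_lock *l)
{
	return atomic_flag_test_and_set(&l->held) ? 1 : 0;
}

/* Mutex-style convention: 1 means acquired, 0 means contended. */
int toy_mutex_trylock(struct toy_lock *l)
{
	return atomic_flag_test_and_set(&l->held) ? 0 : 1;
}

/* "Caller must already hold the lock" check, semaphore version.
 * Returns 1 when the check would hit BUG(). */
int toy_check_sem(struct toy_lock *l)
{
	return !toy_down_trylock(l);   /* acquired => nobody held it => BUG */
}

/* The same check after conversion to a mutex: the '!' has to go. */
int toy_check_mutex(struct toy_lock *l)
{
	return toy_mutex_trylock(l);   /* acquired => nobody held it => BUG */
}

/* The unconverted test: fires exactly when the lock IS held. */
int toy_check_mutex_buggy(struct toy_lock *l)
{
	return !toy_mutex_trylock(l);
}
```

In the flushall == 0 path the lock is expected to be held, so the trylock is expected to fail; only the toy_check_mutex_buggy() form reports a problem in that normal case.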

From: Kamalesh Babulal
Date: Wednesday, May 14, 2008 - 8:34 am

Hi Andrew,

The 2.6.26-rc2-mm1 kernel panics on powerpc while running the LTP tests.
I have attached the gdb output for the pc and lr registers. The patch
list_for_each_rcu-must-die-networking.patch changes the same lines
that the gdb output points to.

 Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000481fa0
cpu 0x0: Vector: 300 (Data Access) at [c0000000eae37900]
    pc: c000000000481fa0: .inet_create+0xb4/0x330
    lr: c000000000413340: .__sock_create+0x190/0x280
    sp: c0000000eae37b80
   msr: 8000000000009032
   dar: 0
 dsisr: 40010000
  current = 0xc0000000cd201500
  paca    = 0xc0000000007c3480
    pid   = 6462, comm = socket01
enter ? for help
[c0000000eae37c30] c000000000413340 .__sock_create+0x190/0x280
[c0000000eae37cf0] c0000000004137e0 .sys_socket+0x40/0x98
[c0000000eae37d90] c000000000438e18 .compat_sys_socketcall+0xc0/0x234
[c0000000eae37e30] c0000000000086b4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff20484
SP (ffc8f770) is in userspace


0xc000000000481fa0 is in inet_create (net/ipv4/af_inet.c:290).
285             /* Look for the requested type/protocol pair. */
286             answer = NULL;
287     lookup_protocol:
288             err = -ESOCKTNOSUPPORT;
289             rcu_read_lock();
290             list_for_each_entry_rcu(answer, &inetsw[sock->type], list) {
291
292                     /* Check the non-wild match. */
293                     if (protocol == answer->protocol) {
294                             if (protocol != IPPROTO_IP)


0xc000000000413340 is in __sock_create (net/socket.c:1171).
1166                    goto out_release;
1167
1168            /* Now protected by module ref count */
1169            rcu_read_unlock();
1170
1171            err = pf->create(net, sock, protocol);
1172            if (err < 0)
1173                    goto out_module_put;
1174
1175            /*

-- 
Thanks & Regards,
Kamalesh ...
From: Paul E. McKenney
Date: Wednesday, May 14, 2008 - 9:07 am

Hmmm....  Does the panic go away when this patch is reverted?

--

From: Alexey Dobriyan
Date: Wednesday, May 14, 2008 - 1:05 pm

Yes.

--

From: Paul E. McKenney
Date: Wednesday, May 14, 2008 - 1:32 pm

OK, am awake now, apologies for my confusion.  Not sure -what- state
I was in when generating and validating the original...

							Thanx, Paul
--

From: Mariusz Kozlowski
Date: Wednesday, May 14, 2008 - 11:29 am

Hello,

	Got this on sparc64 startup:

=============================================
[ INFO: possible recursive locking detected ]
2.6.26-rc2-mm1 #2
---------------------------------------------
modprobe/514 is trying to acquire lock:
 (&cls->mutex){--..}, at: [<00000000005ff538>] device_add+0x3c0/0x5c0

but task is already holding lock:
 (&cls->mutex){--..}, at: [<000000000060287c>] class_interface_register+0x44/0xe0

other info that might help us debug this:
1 lock held by modprobe/514:
 #0:  (&cls->mutex){--..}, at: [<000000000060287c>] class_interface_register+0x44/0xe0

stack backtrace:
Call Trace:
 [000000000048cc64] __lock_acquire+0x104c/0x1400
 [000000000048d098] lock_acquire+0x80/0xa0
 [0000000000701898] mutex_lock_nested+0xc0/0x4e0
 [00000000005ff538] device_add+0x3c0/0x5c0
 [00000000005ff74c] device_register+0x14/0x20
 [00000000005ff808] device_create+0xb0/0xe0
 [0000000010012ef8] sg_add+0x160/0x380 [sg]
 [00000000006028d4] class_interface_register+0x9c/0xe0
 [0000000000617050] scsi_register_interface+0x18/0x40
 [000000001001c0a4] init_sg+0xac/0x180 [sg]
 [00000000004960c8] sys_init_module+0xb0/0x1c0
 [00000000004463cc] sys32_init_module+0x14/0x20
 [0000000000406294] linux_sparc_syscall32+0x3c/0x40
 [0000000000013698] 0x136a0
From: Andrew Morton
Date: Wednesday, May 14, 2008 - 11:41 am

Yeah, this is a bug which has always been there, afaik.  A
semaphore was converted to a mutex.  Semaphores don't have lockdep
checking, but mutexes do, so we just now got to find out about it. 
Some finger-pointing is occurring over on the scsi list ;)

I assume the machine otherwise works OK?
--
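
For readers unfamiliar with the report format: the checker complains when a task takes a lock of a class it already holds, which is what the trace shows for &cls->mutex (class_interface_register, then device_add). A heavily simplified sketch of that one rule follows. It assumes lock instances are grouped into integer class ids; the toy_ names are hypothetical, and the real lockdep tracks far more (dependency chains, subclasses, irq state).

```c
#include <stddef.h>

#define TOY_MAX_HELD 8

/* Per-task record of the lock classes currently held. */
struct toy_task {
	int held_class[TOY_MAX_HELD];
	size_t depth;
};

/* Returns 0 on success, -1 if a lock of this class is already held
 * (the "possible recursive locking" case), -2 if the table is full. */
int toy_lock_acquire(struct toy_task *t, int class_id)
{
	size_t i;

	for (i = 0; i < t->depth; i++)
		if (t->held_class[i] == class_id)
			return -1;
	if (t->depth == TOY_MAX_HELD)
		return -2;
	t->held_class[t->depth++] = class_id;
	return 0;
}

/* Drops the most recently taken lock of the given class, if any. */
void toy_lock_release(struct toy_task *t, int class_id)
{
	size_t i, j;

	for (i = t->depth; i > 0; i--) {
		if (t->held_class[i - 1] == class_id) {
			for (j = i; j < t->depth; j++)
				t->held_class[j - 1] = t->held_class[j];
			t->depth--;
			return;
		}
	}
}
```

A plain semaphore has no such bookkeeping attached, which is consistent with the explanation above that the nesting only became visible once the semaphore was converted to a mutex.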

From: Mariusz Kozlowski
Date: Wednesday, May 14, 2008 - 11:50 am

Yes - seems it's running fine. I'm doing some other tests now so if anything pops out
you'll know it.

	Mariusz
--

From: Torsten Kaiser
Date: Wednesday, May 14, 2008 - 12:12 pm

On Wed, May 14, 2008 at 10:01 AM, Andrew Morton

Nice! This one works for me again.

But somehow the NUMAness of my system is gone.

2.6.26-rc2-mm1:
[    0.000000] max_pfn_mapped = 1179648
[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] init_memory_mapping
[    0.000000] DMI present.
[    0.000000] ACPI: RSDP 000FB080, 0024 (r2 ACPIAM)
[    0.000000] ACPI: XSDT DFFD0100, 0064 (r1 A_M_I_ OEMXSDT   4000713 MSFT       97)
[    0.000000] ACPI: FACP DFFD0290, 00F4 (r3 A_M_I_ OEMFACP   4000713 MSFT       97)
[    0.000000] ACPI: DSDT DFFD0450, 4FC5 (r1  S0027 S0027000        0 INTL 20051117)
[    0.000000] ACPI: FACS DFFDE000, 0040
[    0.000000] ACPI: APIC DFFD0390, 0080 (r1 A_M_I_ OEMAPIC   4000713 MSFT       97)
[    0.000000] ACPI: MCFG DFFD0410, 003C (r1 A_M_I_ OEMMCFG   4000713 MSFT       97)
[    0.000000] ACPI: OEMB DFFDE040, 0060 (r1 A_M_I_ AMI_OEM   4000713 MSFT       97)
[    0.000000] ACPI: HPET DFFD5420, 0038 (r1 A_M_I_ OEMHPET0  4000713 MSFT       97)
[    0.000000] ACPI: MCFG DFFD5460, 003C (r1 A_M_I_ OEMMCFG   4000713 MSFT       97)
[    0.000000] ACPI: SRAT DFFD54A0, 0110 (r1 AMD    HAMMER          1 AMD         1)
[    0.000000] ACPI: SSDT DFFD55B0, 04F0 (r1 A_M_I_ POWERNOW        1 AMD         1)
[    0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[    0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 2 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 3 -> Node 1
[    0.000000] SRAT: PXMs only cover 0MB of your 4608MB e820 RAM. Not used.
[    0.000000] SRAT: SRAT not used.
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at 0000000000000000-0000000120000000
[    0.000000] Bootmem setup node 0 0000000000000000-0000000120000000
[    0.000000]   NODE_DATA [0000000000001000 - 0000000000004fff]
[    0.000000]   bootmap [000000000000e000 -  0000000000031fff] pages 24
[    0.000000]   early res: 0 [0-fff] BIOS data page
[    0.000000]   early res: 1 [6000-7fff] ...
From: Andrew Morton
Date: Wednesday, May 14, 2008 - 12:35 pm

On Wed, 14 May 2008 21:12:13 +0200


I suspect that this might be caused by the below.

That patch no longer seems to be necessary so I'll drop it.  Perhaps
you could try reverting it, please?



From: Ingo Molnar <mingo@elte.hu>

x86.git testing found the following build error on latest -git:

 drivers/acpi/numa.c: In function 'acpi_numa_init':
 drivers/acpi/numa.c:226: error: 'NR_NODE_MEMBLKS' undeclared (first use in this function)
 drivers/acpi/numa.c:226: error: (Each undeclared identifier is reported only once
 drivers/acpi/numa.c:226: error: for each function it appears in.)

with this config:

 http://redhat.com/~mingo/misc/config-Wed_Apr_30_22_42_42_CEST_2008.bad

i suspect we dont want SRAT parsing when CONFIG_HAVE_ARCH_PARSE_SRAT
is unset - but the fix looks a bit ugly. Perhaps we should define
NR_NODE_MEMBLKS even in this case and just let the code fall back
to some sane behavior?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/acpi/numa.c |    4 ++++
 1 file changed, 4 insertions(+)

diff -puN drivers/acpi/numa.c~acpi-acpi_numa_init-build-fix drivers/acpi/numa.c
--- a/drivers/acpi/numa.c~acpi-acpi_numa_init-build-fix
+++ a/drivers/acpi/numa.c
@@ -176,6 +176,7 @@ acpi_parse_processor_affinity(struct acp
 	return 0;
 }
 
+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
 static int __init
 acpi_parse_memory_affinity(struct acpi_subtable_header * header,
 			   const unsigned long end)
@@ -193,6 +194,7 @@ acpi_parse_memory_affinity(struct acpi_s
 
 	return 0;
 }
+#endif
 
 static int __init acpi_parse_srat(struct acpi_table_header *table)
 {
@@ -221,9 +223,11 @@ int __init acpi_numa_init(void)
 	if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
 		acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
 				      acpi_parse_processor_affinity, NR_CPUS);
+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
 		acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
 				      acpi_parse_memory_affinity,
 ...
From: Torsten Kaiser
Date: Thursday, May 15, 2008 - 10:44 am

On Wed, May 14, 2008 at 9:35 PM, Andrew Morton

Yes, reverting the patch below gets the system back to its normal state.

[    0.000000] ACPI: SSDT DFFD55B0, 04F0 (r1 A_M_I_ POWERNOW        1 AMD         1)
[    0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[    0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[    0.000000] SRAT: PXM 1 -> APIC 2 -> Node 1
[    0.000000] SRAT: PXM 1 -> APIC 3 -> Node 1
[    0.000000] SRAT: Node 0 PXM 0 0-a0000
[    0.000000] SRAT: Node 0 PXM 0 100000-80000000
[    0.000000] SRAT: Node 1 PXM 1 80000000-e0000000
[    0.000000] SRAT: Node 1 PXM 1 100000000-120000000
[    0.000000] NUMA: Allocated memnodemap from e000 - 10440
[    0.000000] NUMA: Using 20 for the hash shift.
[    0.000000] Bootmem setup node 0 0000000000000000-0000000080000000
[    0.000000]   NODE_DATA [0000000000001000 - 0000000000004fff]
[    0.000000]   bootmap [0000000000011000 -  0000000000020fff] pages 10
[    0.000000]   early res: 0 [0-fff] BIOS data page
[    0.000000]   early res: 1 [6000-7fff] TRAMPOLINE
[    0.000000]   early res: 2 [200000-9601db] TEXT DATA BSS
[    0.000000]   early res: 3 [37ec8000-37fefc27] RAMDISK
[    0.000000]   early res: 4 [9fc00-fffff] BIOS reserved
[    0.000000]   early res: 5 [8000-dfff] PGTABLE
[    0.000000]   early res: 6 [e000-1043f] MEMNODEMAP
[    0.000000] Bootmem setup node 1 0000000080000000-0000000120000000
[    0.000000]   NODE_DATA [0000000080000000 - 0000000080003fff]
[    0.000000]   bootmap [0000000080004000 -  0000000080017fff] pages 14
[    0.000000]  [ffffe20000000000-ffffe20001bfffff] PMD -> [ffff81000c200000-ffff81000ddfffff] on node 0
[    0.000000]  [ffffe20001c00000-ffffe20003ffffff] PMD -> [ffff810080200000-ffff810081ffffff] on node 1
[    0.000000] sizeof(struct page) = 56

Just for your information: I'm also using a 64bit Gentoo system with
gcc 4.3.0-alpha20080410 and I'm also seeing these strange time
outputs:

[    0.000000] NR_CPUS: 4, nr_cpu_ids: 4
[42949372.960000] Built 2 zonelists in Node order, mobility ...
From: Andrew Morton
Date: Thursday, May 15, 2008 - 11:49 am

Great, thanks for checking.
--
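
One way to read the two boot logs in this sub-thread: the patch compiles out the SRAT memory-affinity parsing when CONFIG_HAVE_ARCH_PARSE_SRAT is unset, so (assuming that option was unset in this configuration, which the thread does not confirm) no memory ranges get attached to any node and the coverage total is 0MB. The sketch below only reproduces that sum. toy_range and toy_covered_mb are hypothetical names, and the four ranges in the usage note are copied from the "SRAT: Node ... PXM ..." lines of the log taken after the revert.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end), in bytes. */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

/* Total memory described by the parsed ranges, in whole megabytes. */
uint64_t toy_covered_mb(const struct toy_range *r, size_t n)
{
	uint64_t bytes = 0;
	size_t i;

	for (i = 0; i < n; i++)
		bytes += r[i].end - r[i].start;
	return bytes >> 20;
}
```

Called with no ranges it returns 0, the figure in the "PXMs only cover 0MB" complaint. Called with 0-a0000, 100000-80000000, 80000000-e0000000 and 100000000-120000000 it returns 4095.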

From: Alexey Dobriyan
Date: Wednesday, May 14, 2008 - 2:16 pm

mkfs.ext2 became kick-ass slow:

+ sudo mkfs.ext2 -F 
mke2fs 1.40.6 (09-Feb-2008)
Warning: 256-byte inodes not usable on older systems
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
9773056 inodes, 39072726 blocks
1953636 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1193 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
	...

Writing inode tables:  193/1193
		       ^^^^
		       counter moves slowly,
		       occasional counting at what seems to be normal
		       speed occur.

160 GB SATA disk, no partitions.
According to sysfs, CFQ is in use, the rest is compiled out.
2.6.26-rc2 is fine, mkfs takes ~1 min.

Slowdown is totally reproducible.


CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_SATA_AHCI=y
CONFIG_ATA_SFF=y
CONFIG_ATA_PIIX=y
CONFIG_PATA_JMICRON=y


/sys/block/sdb/queue/iosched/back_seek_max
16384
/sys/block/sdb/queue/iosched/back_seek_penalty
2
/sys/block/sdb/queue/iosched/fifo_expire_async
250
/sys/block/sdb/queue/iosched/fifo_expire_sync
120
/sys/block/sdb/queue/iosched/quantum
4
/sys/block/sdb/queue/iosched/slice_async
40
/sys/block/sdb/queue/iosched/slice_async_rq
2
/sys/block/sdb/queue/iosched/slice_idle
10
/sys/block/sdb/queue/iosched/slice_sync
100

--

From: Alexey Dobriyan
Date: Wednesday, May 14, 2008 - 2:33 pm

Here is where it spends time (seems to be always the same):

mkfs.ext2     D 0000000000000000     0  4760   4759
 ffff81017ce93a58 0000000000000046 0000000000000000 0000000000000282
 ffff81017e14d640 ffffffff8056f4c0 ffff81017e14d880 ffffffff804679a2
 00000000ffffb5c4 000000007ce93a68 0000000000000003 ffffffff8023d504
Call Trace:
 [<ffffffff804679a2>] ? _spin_unlock_irqrestore+0x42/0x80
 [<ffffffff8023d504>] ? __mod_timer+0xc4/0x110
 [<ffffffff80465012>] schedule_timeout+0x62/0xe0
 [<ffffffff8023cee0>] ? process_timeout+0x0/0x10
 [<ffffffff80464ef8>] io_schedule_timeout+0x28/0x40
 [<ffffffff8027663a>] congestion_wait+0x8a/0xb0
 [<ffffffff80248720>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8026fe31>] balance_dirty_pages_ratelimited_nr+0x1a1/0x3f0
 [<ffffffff8026915f>] generic_file_buffered_write+0x1ff/0x740
 [<ffffffff80467870>] ? _spin_unlock+0x30/0x60
 [<ffffffff802acafb>] ? mnt_drop_write+0x7b/0x160
 [<ffffffff80269b30>] __generic_file_aio_write_nolock+0x2a0/0x460
 [<ffffffff802548ed>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff80269df7>] generic_file_aio_write_nolock+0x37/0xa0
 [<ffffffff80292be1>] do_sync_write+0xf1/0x130
 [<ffffffff80256485>] ? trace_hardirqs_on_caller+0xd5/0x160
 [<ffffffff80248720>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff80256485>] ? trace_hardirqs_on_caller+0xd5/0x160
 [<ffffffff8025651d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff8029339a>] vfs_write+0xaa/0xe0
 [<ffffffff80293940>] sys_write+0x50/0x90
 [<ffffffff8020b69b>] system_call_after_swapgs+0x7b/0x80

--

From: Jiri Slaby
Date: Thursday, May 15, 2008 - 2:41 pm

And not only mkfs, ld took ages to link vmlinux.o:
ld            D 0000000000000000     0 17340  17339
  ffff8100681819c8 0000000000000082 0000000000000000 ffff81006818198c
  ffffffff806c90c0 ffff81006b50d2e0 ffffffff80636360 ffff81006b50d558
  0000000068181978 0000000100a7523e ffff81006b50d558 0000000100a75269
Call Trace:
  [<ffffffff805056b2>] schedule_timeout+0x62/0xd0
  [<ffffffff802403b0>] ? process_timeout+0x0/0x10
  [<ffffffff805056ad>] ? schedule_timeout+0x5d/0xd0
  [<ffffffff80504956>] io_schedule_timeout+0x76/0xd0
  [<ffffffff80282cac>] congestion_wait+0x6c/0x90
  [<ffffffff8024c2c0>] ? autoremove_wake_function+0x0/0x40
  [<ffffffff8027c82f>] balance_dirty_pages_ratelimited_nr+0x13f/0x330
  [<ffffffff80275a3d>] generic_file_buffered_write+0x1dd/0x6d0
  [<ffffffff8027d0e7>] ? __do_page_cache_readahead+0x167/0x220
  [<ffffffff802763ae>] __generic_file_aio_write_nolock+0x25e/0x450
  [<ffffffff80276c75>] ? generic_file_aio_read+0x565/0x640
  [<ffffffff80276607>] generic_file_aio_write+0x67/0xd0
  [<ffffffff802f8bd6>] ext3_file_write+0x26/0xc0
  [<ffffffff8029ffa1>] do_sync_write+0xf1/0x140
  [<ffffffff8024c2c0>] ? autoremove_wake_function+0x0/0x40
  [<ffffffff80289703>] ? remove_vma+0x53/0x70
  [<ffffffff80505a01>] ? mutex_lock+0x11/0x30
  [<ffffffff802a0a2b>] vfs_write+0xcb/0x190
  [<ffffffff802a0be0>] sys_write+0x50/0x90
  [<ffffffff8020b82b>] system_call_after_swapgs+0x7b/0x80
--

From: Randy Dunlap
Date: Wednesday, May 14, 2008 - 1:39 pm

Using WARN() with CONFIG_BUG=n causes:

linux-2.6.26-rc2-mm1/lib/kobject.c: In function 'kobject_add_internal':
linux-2.6.26-rc2-mm1/lib/kobject.c:218: error: implicit declaration of function 'WARN'
make[2]: *** [lib/kobject.o] Error 1

---
~Randy
--
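
The error pattern here is a macro that only exists in one branch of a config #ifdef. The usual shape of a fix is to supply a quiet fallback in the other branch so callers compile either way. The following is a sketch of that shape only: TOY_CONFIG_BUG, TOY_WARN and toy_add_object are hypothetical stand-ins, and the thread does not show the change that was actually merged.

```c
#include <stdio.h>

/* TOY_CONFIG_BUG stands in for CONFIG_BUG; define it for the noisy version. */
#ifdef TOY_CONFIG_BUG
#define TOY_WARN(cond, ...) \
	((cond) ? (fprintf(stderr, __VA_ARGS__), 1) : 0)
#else
/* Fallback: still evaluates the condition, prints nothing. */
#define TOY_WARN(cond, ...) (!!(cond))
#endif

/* A caller that relies on the macro's value in both configurations. */
int toy_add_object(const char *name)
{
	if (TOY_WARN(name == NULL || name[0] == '\0',
		     "attempted to add an object with no name\n"))
		return -1;
	return 0;
}
```

Keeping the condition evaluated in the fallback matters when callers branch on the result, as toy_add_object() does.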

From: Randy Dunlap
Date: Wednesday, May 14, 2008 - 1:43 pm

With CONFIG_*FD=n:
# CONFIG_SIGNALFD is not set
# CONFIG_TIMERFD is not set
# CONFIG_EVENTFD is not set

the build fails with:

arch/x86/kernel/built-in.o: In function `sys_call_table':
(.rodata+0x89c): undefined reference to `sys_signalfd4'
arch/x86/kernel/built-in.o: In function `sys_call_table':
(.rodata+0x8a0): undefined reference to `sys_eventfd2'
make[1]: *** [.tmp_vmlinux1] Error 1

---
~Randy
--
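
The link error arises because the syscall table references the entry points unconditionally while their definitions are compiled only when the matching CONFIG_*FD option is set. The thread does not show the fix. The mechanism the kernel generally uses for optional syscalls is cond_syscall() in kernel/sys_ni.c, which provides a weak fallback returning -ENOSYS. A userspace sketch of the weak-symbol idea, using the GCC/Clang weak attribute and hypothetical toy_ names:

```c
#include <errno.h>

/* Weak default: used only if no strong toy_sys_signalfd4() is linked in. */
__attribute__((weak)) long toy_sys_signalfd4(void)
{
	return -ENOSYS;
}

typedef long (*toy_syscall_fn)(void);

/* A table that references the symbol unconditionally, like a syscall table. */
toy_syscall_fn toy_call_table[] = { toy_sys_signalfd4 };

/* Bounds-checked dispatch through the table. */
long toy_dispatch(unsigned int nr)
{
	if (nr >= sizeof(toy_call_table) / sizeof(toy_call_table[0]))
		return -ENOSYS;
	return toy_call_table[nr]();
}
```

If another object file supplies a non-weak toy_sys_signalfd4(), the linker picks that one and the table needs no change.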

From: Zan Lynx
Date: Wednesday, May 14, 2008 - 1:49 pm

No good on my first attempt.  Here is what I ran into:

The printk timestamps have gone wild.  I cannot paste a dmesg but here
is one line I wrote down:
[17180644.495790] Testing tracer ftrace: NMI watchdog ...

Which leads into the next problem: The kernel freezes after Testing
tracer ftrace.  Then I rebooted with my special testing command line
"kernel /bzImage-2.6.26-rc2-mm1 root=/dev/sda2 rootfstype=reiser4
rootflags=defaults,noatime i8042.nomux elevator=cfq resume=/dev/sda3
panic=5 nmi_watchdog=2,panic debug idle=poll nohz=off"

and I got the same freeze but then the NMI watchdog message.  Which is
the third problem.

Why did the NMI watchdog not panic and reboot the system?  It detected
the lock and printed the message.  It should have then panicked, waited
5 seconds, and rebooted.

System is a 64-bit Gentoo AMD-64 Compaq R3000 laptop.  Compiler is GCC
4.3.

Config follows:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.26-rc2-mm1
# Wed May 14 09:59:19 2008
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_DEFCONFIG_LIST="arch/x86/configs/x86_64_defconfig"
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not ...
From: Andrew Morton
Date: Wednesday, May 14, 2008 - 2:00 pm

On Wed, 14 May 2008 14:49:07 -0600

I've seen reports like this against mainline, but I'm not sure that


Thanks.
--

From: me
Date: Wednesday, May 14, 2008 - 2:14 pm

I've reported problems with -next and ftrace. The timestamps look very similar 
to what I've seen as well. I don't have those kernels available anymore - I 
decided to wipe my system and move to a distro where it's easier to test new 
kernels.

However, it stands to reason that it isn't ftrace actually causing the 

That is quite similar... I'm on a Core2Duo (Dell Inspiron 1420 laptop) and was 
seeing the problems with GCC4.3 and a pure 64bit userland.




-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
--

From: Zan Lynx
Date: Wednesday, May 14, 2008 - 3:06 pm

I disabled a bunch of trace and self test options, and I am now running
2.6.26-rc2-mm1.  So far, so good.

I am including some dmesg with the weird timestamps in case it is useful
to anyone.

[    0.000000] Linux version 2.6.26-rc2-mm1 (lynx@zephyr) (gcc version 4.3.0 (Gentoo 4.3.0 p1.0) ) #13 SMP Wed May 14 15:16:26 MDT 2008
[    0.000000] Command line: root=/dev/sda2 rootfstype=reiser4 rootflags=defaults,noatime i8042.nomux elevator=cfq resume=/dev/sda3 panic=5 debug
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[    0.000000]  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000d0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000003ff70000 (usable)
[    0.000000]  BIOS-e820: 000000003ff70000 - 000000003ff7f000 (ACPI data)
[    0.000000]  BIOS-e820: 000000003ff7f000 - 000000003ff80000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[    0.000000] max_pfn_mapped = 1048576
[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] init_memory_mapping
[    0.000000] DMI present.
[    0.000000] ACPI: RSDP 000F7240, 0014 (r0 PTLTD )
[    0.000000] ACPI: RSDT 3FF7A87E, 0034 (r1 PTLTD    RSDT    6040000  LTP        0)
[    0.000000] ACPI: FACP 3FF7EE13, 0074 (r1 NVIDIA CK8       6040000 PTL_    F4240)
[    0.000000] ACPI: DSDT 3FF7A8B2, 4561 (r1 NVIDIA      CK8  6040000 MSFT  100000E)
[    0.000000] ACPI: FACS 3FF7FFC0, 0040
[    0.000000] ACPI: APIC 3FF7EE87, 005A (r1 NVIDIA NV_APIC_  6040000  LTP        0)
[    0.000000] ACPI: BOOT 3FF7EEE1, 0028 (r1 PTLTD  $SBFTBL$  6040000  LTP        1)
[    0.000000] ACPI: SSDT 3FF7EF09, 00F7 (r1 PTLTD  POWERNOW  6040000  LTP        1)
[    0.000000] ACPI: DMI detected: Hewlett-Packard
[    0.000000]   ...
From: Randy Dunlap
Date: Wednesday, May 14, 2008 - 2:13 pm

SCSI_DH has some problems when CONFIG_SCSI=n:

drivers/built-in.o: In function `activate_path':
dm-mpath.c:(.text+0x18a292): undefined reference to `scsi_dh_activate'
drivers/built-in.o: In function `multipath_ctr':
dm-mpath.c:(.text+0x18a6f0): undefined reference to `scsi_dh_handler_exist'
make[1]: *** [.tmp_vmlinux1] Error 1


#
# SCSI device support
#
CONFIG_RAID_ATTRS=y
# CONFIG_SCSI is not set
# CONFIG_SCSI_DMA is not set
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y


---
~Randy
--

From: James Bottomley
Date: Thursday, May 15, 2008 - 7:46 am

This is one more of those annoying selects.  The SCSI_DH Kconfig file is
correctly dependent on SCSI:

menuconfig SCSI_DH
	tristate "SCSI Device Handlers"
	depends on SCSI
	default n
	help

but we've also got a select in md/Kconfig:

config DM_MULTIPATH
	tristate "Multipath target"
	depends on BLK_DEV_DM
	select SCSI_DH

Which ignores the dependency.

My best guess for fixing this is either to make the select a depends or
just drop it altogether (after all, it's possible to have multipath on
non-SCSI devices).

James


--

From: Chandra Seetharaman
Date: Thursday, May 15, 2008 - 12:56 pm

Hi James, Andrew,

Here is a patch to remove the automatic "select" of scsi_dh for
dm-multipath.

Sorry about the mishap.

chandra
From: Andrew Morton
Date: Thursday, May 22, 2008 - 8:25 pm

You obviously wanted `static inline' there, but it still fails i386
allmodconfig compilation.

--

From: Chandra Seetharaman
Date: Friday, May 23, 2008 - 12:39 pm

Yikes.... Sorry again... Hopefully the attached patch works properly.

chandra
-------------------------
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist, just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Hannes Reinecke <hare@suse.de>
---

Index: scsi-misc-2.6/drivers/md/Kconfig
===================================================================
--- scsi-misc-2.6.orig/drivers/md/Kconfig
+++ scsi-misc-2.6/drivers/md/Kconfig
@@ -252,7 +252,6 @@ config DM_ZERO
 config DM_MULTIPATH
 	tristate "Multipath target"
 	depends on BLK_DEV_DM
-	select SCSI_DH
 	---help---
 	  Allow volume managers to support multipath hardware.
 
Index: scsi-misc-2.6/drivers/md/dm-mpath.c
===================================================================
--- scsi-misc-2.6.orig/drivers/md/dm-mpath.c
+++ scsi-misc-2.6/drivers/md/dm-mpath.c
@@ -664,6 +664,8 @@ static int parse_hw_handler(struct arg_s
 	request_module("scsi_dh_%s", m->hw_handler_name);
 	if (scsi_dh_handler_exist(m->hw_handler_name) == 0) {
 		ti->error = "unknown hardware handler type";
+		kfree(m->hw_handler_name);
+		m->hw_handler_name = NULL;
 		return -EINVAL;
 	}
 	consume(as, hw_argc - 1);
Index: scsi-misc-2.6/include/scsi/scsi_dh.h
===================================================================
--- scsi-misc-2.6.orig/include/scsi/scsi_dh.h
+++ scsi-misc-2.6/include/scsi/scsi_dh.h
@@ -54,6 +54,16 @@ enum {
 	SCSI_DH_NOSYS,
 	SCSI_DH_DRIVER_MAX,
 };
-
+#if defined(CONFIG_SCSI_DH) || defined(CONFIG_SCSI_DH_MODULE)
 extern int scsi_dh_activate(struct request_queue *);
 extern int scsi_dh_handler_exist(const char *);
+#else
+static inline int ...
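
The header hunk above is cut off at the stub definitions, so what follows is
only a minimal sketch of the general technique it appears to use: declare the
real functions when the feature is configured, and otherwise fall back to
static inline stubs so that callers compile without #ifdefs of their own.
DEMO_HANDLERS_ENABLED and the demo_* names are hypothetical, and the stub
return values are assumptions, not the contents of the actual patch.

```c
#include <stddef.h>

/*
 * DEMO_HANDLERS_ENABLED is a hypothetical switch standing in for a
 * CONFIG_* symbol; the demo_* names are illustrative, not kernel APIs.
 */
#ifdef DEMO_HANDLERS_ENABLED
/* Real implementations would be provided by another object file. */
extern int demo_activate(void *queue);
extern int demo_handler_exist(const char *name);
#else
/* Feature compiled out: callers still build and simply see "no handler". */
static inline int demo_activate(void *queue)
{
	(void)queue;
	return 0;	/* assumed stub value */
}

static inline int demo_handler_exist(const char *name)
{
	(void)name;
	return 0;	/* assumed stub value: no handler is ever found */
}
#endif

/* Caller code needs no #ifdef of its own. */
int demo_parse_handler(const char *name)
{
	if (demo_handler_exist(name) == 0)
		return -1;	/* unknown handler: reject the request */
	return demo_activate(NULL);
}
```

With DEMO_HANDLERS_ENABLED undefined, demo_parse_handler("emc") returns -1,
because the stub reports that no handler exists.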
From: Randy Dunlap
Date: Friday, May 23, 2008 - 1:28 pm

Did it build cleanly for you?
Hint:


-- 
~Randy
--

From: Chandra Seetharaman
Date: Friday, May 23, 2008 - 6:16 pm

Oh, my... it is getting very tricky.

Here is a patch that compiles clean in different combinations. But, I
agree that the "depends" (under DM_MULTIPATH) sure looks weird.

-----------
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist, just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also. Make sure it doesn't allow DM_MULTIPATH
to be compiled in when SCSI_DH is a module.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Mike Anderson <andmike@us.ibm.com>
Cc: Hannes Reinecke <hare@suse.de>
---

Index: scsi-misc-2.6/drivers/md/Kconfig
===================================================================
--- scsi-misc-2.6.orig/drivers/md/Kconfig
+++ scsi-misc-2.6/drivers/md/Kconfig
@@ -252,7 +252,7 @@ config DM_ZERO
 config DM_MULTIPATH
 	tristate "Multipath target"
 	depends on BLK_DEV_DM
-	select SCSI_DH
+	depends on SCSI_DH || !SCSI_DH
 	---help---
 	  Allow volume managers to support multipath hardware.
 
Index: scsi-misc-2.6/drivers/md/dm-mpath.c
===================================================================
--- scsi-misc-2.6.orig/drivers/md/dm-mpath.c
+++ scsi-misc-2.6/drivers/md/dm-mpath.c
@@ -664,6 +664,8 @@ static int parse_hw_handler(struct arg_s
 	request_module("scsi_dh_%s", m->hw_handler_name);
 	if (scsi_dh_handler_exist(m->hw_handler_name) == 0) {
 		ti->error = "unknown hardware handler type";
+		kfree(m->hw_handler_name);
+		m->hw_handler_name = NULL;
 		return -EINVAL;
 	}
 	consume(as, hw_argc - 1);
Index: scsi-misc-2.6/include/scsi/scsi_dh.h
===================================================================
--- scsi-misc-2.6.orig/include/scsi/scsi_dh.h
+++ scsi-misc-2.6/include/scsi/scsi_dh.h
@@ -54,6 +54,16 @@ enum ...
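
The "depends on SCSI_DH || !SCSI_DH" line looks like a tautology, but it is
not one in Kconfig's three-valued logic, where n < m < y, '!' maps a value v
to 2 - v, '||' takes the maximum and '&&' the minimum. The sketch below, with
hypothetical helper names, evaluates the dependency for each setting of
SCSI_DH. The result is y except when SCSI_DH=m, where it is m, which keeps
DM_MULTIPATH from being built in while SCSI_DH is a module.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig '!': !n = y, !m = m, !y = n. */
static enum tristate tri_not(enum tristate a)
{
	return (enum tristate)(TRI_Y - a);
}

/* Kconfig '||' is the maximum of its operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Kconfig '&&' is the minimum; multiple "depends on" lines are ANDed. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/*
 * Upper limit on DM_MULTIPATH (hypothetical helper) given:
 *	depends on BLK_DEV_DM
 *	depends on SCSI_DH || !SCSI_DH
 */
enum tristate dm_multipath_limit(enum tristate blk_dev_dm,
				 enum tristate scsi_dh)
{
	return tri_and(blk_dev_dm, tri_or(scsi_dh, tri_not(scsi_dh)));
}
```

For BLK_DEV_DM=y this yields y when SCSI_DH is n or y, and m when SCSI_DH=m.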
From: Randy Dunlap
Date: Wednesday, May 14, 2008 - 2:16 pm

net/built-in.o: In function `init_p9':
mod.c:(.init.text+0x4b0d): undefined reference to `p9_trans_fd_init'
make[1]: *** [.tmp_vmlinux1] Error 1


CONFIG_NET_9P=y
CONFIG_NET_9P_FD=m
CONFIG_NET_9P_VIRTIO=m
CONFIG_NET_9P_DEBUG=y

# CONFIG_9P_FS is not set


---
~Randy
--

From: Eric Van Hensbergen
Date: Wednesday, May 14, 2008 - 5:00 pm

This is probably a side effect of the merge issue v9fs-devel and -mm
had last week.  It should no longer be possible with the code that's
been in my v9fs-devel tree on kernel.org for the past 5 days
(CONFIG_NET_9P_FD no longer exists).

             -eric


--

From: Andrew Morton
Date: Wednesday, May 14, 2008 - 5:05 pm

On Wed, 14 May 2008 19:00:12 -0500


But I'm still reverting the v9fs tree due to

git-v9fs is causing i386 allmodconfig failures:

net/9p/trans_fd.o: In function `init_module':
trans_fd.c:(.init.text+0x0): multiple definition of `init_module'
net/9p/mod.o:mod.c:(.init.text+0x0): first defined here
/opt/crosstool/gcc-4.1.0-glibc-2.3.6/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-ld: Warning: size of symbol `init_module' changed from 27 in net/9p/mod.o to 128 in net/9p/trans_fd.o


--

From: Eric Van Hensbergen
Date: Wednesday, May 14, 2008 - 7:29 pm

On Wed, May 14, 2008 at 7:05 PM, Andrew Morton

Okay, clearly I'm doing something wrong.  I've tried the allmodconfig
on my local sandbox and it's fine.  When I look to see if there is
still a module_init in net/9p/trans_fd on kernel.org via gitweb, I
can't find it. (http://git.kernel.org/?p=linux/kernel/git/ericvh/v9fs.git;a=blob;f=net/9p/trans_fd.c;h...)

Are you pulling from my v9fs-devel tree or is -mm switched over to
pull from linux-next or something?

           -eric
--

From: Andrew Morton
Date: Wednesday, May 14, 2008 - 8:04 pm

It has mysteriously gone away.  Perhaps it was triggered by some other

The algorithm to determine this is to look at the first line of -mm's
git-v9fs.patch:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.26-rc2/2.6.26-rc2-mm...
has:

GIT 38bfbd9f766f0b33de6bc16fd9ad1018b8fd3fe2 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git#v9fs-devel

Yes, -mm uses both linux-next and git-v9fs (aka #v9fs-devel)

linux-next uses #for-next and afaict that was empty as of a few hours
ago.  Nothing for 2.6.27?
--

From: Eric Van Hensbergen
Date: Wednesday, May 14, 2008 - 8:53 pm

On Wed, May 14, 2008 at 10:04 PM, Andrew Morton

Oh, there's stuff for 2.6.27, I'm still working on stabilizing it --
but I put it on hold while I tried to clear out my bugzilla backlog.
Trying to stick to a policy of increased testing and removing bugs
before potentially introducing new ones.  It helps that there have
been several additional groups starting to use 9p and hitting corner
cases my testing didn't cover before.

       -eric
--

From: Rafael J. Wysocki
Date: Wednesday, May 14, 2008 - 2:54 pm

My HP nx6325 doesn't resume from suspend.  It looks like the graphics doesn't
come up, so probably s2ram is busted.

I'll try to bisect over the weekend, if I have the time (not sure).

Thanks,
Rafael
--

From: Mariusz Kozlowski
Date: Thursday, May 15, 2008 - 10:58 am

Parenthesis fix in include/asm-arm/arch-omap/control.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

diff -upr linux-2.6.26-rc2-mm1-a/include/asm-arm/arch-omap/control.h linux-2.6.26-rc2-mm1-b/include/asm-arm/arch-omap/control.h
--- linux-2.6.26-rc2-mm1-a/include/asm-arm/arch-omap/control.h	2008-05-15 19:44:38.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-arm/arch-omap/control.h	2008-05-15 19:51:30.000000000 +0200
@@ -80,7 +80,7 @@
 #define OMAP24XX_CONTROL_SEC_TAP	(OMAP2_CONTROL_GENERAL + 0x0064)
 #define OMAP24XX_CONTROL_OCM_PUB_RAM_ADD	(OMAP2_CONTROL_GENERAL + 0x006c)
 #define OMAP24XX_CONTROL_EXT_SEC_RAM_START_ADD	(OMAP2_CONTROL_GENERAL + 0x0070)
-#define OMAP24XX_CONTROL_EXT_SEC_RAM_STOP_ADD	(OMAP2_CONTROL_GENERAL + 0x0074
+#define OMAP24XX_CONTROL_EXT_SEC_RAM_STOP_ADD	(OMAP2_CONTROL_GENERAL + 0x0074)
 #define OMAP24XX_CONTROL_SEC_STATUS		(OMAP2_CONTROL_GENERAL + 0x0080)
 #define OMAP24XX_CONTROL_SEC_ERR_STATUS		(OMAP2_CONTROL_GENERAL + 0x0084)
 #define OMAP24XX_CONTROL_STATUS			(OMAP2_CONTROL_GENERAL + 0x0088)
	
--

From: Mariusz Kozlowski
Date: Thursday, May 15, 2008 - 10:59 am

Parenthesis fix in include/asm-mips/gic.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

diff -upr linux-2.6.26-rc2-mm1-a/include/asm-mips/gic.h linux-2.6.26-rc2-mm1-b/include/asm-mips/gic.h
--- linux-2.6.26-rc2-mm1-a/include/asm-mips/gic.h	2008-05-15 19:44:48.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-mips/gic.h	2008-05-15 19:52:20.000000000 +0200
@@ -330,7 +330,7 @@
 
 #define GIC_SH_RMASK_OFS		0x0300
 #define GIC_CLR_INTR_MASK(intr, val) \
-	GICWRITE(GIC_REG_ADDR(SHARED, GIC_SH_RMASK_OFS + 4 + (((((intr) / 32) ^ 1) - 1) * 4)), ((val) << ((intr) % 32))
+	GICWRITE(GIC_REG_ADDR(SHARED, GIC_SH_RMASK_OFS + 4 + (((((intr) / 32) ^ 1) - 1) * 4)), ((val) << ((intr) % 32)))
 
 /* Register Map for Local Section */
 #define GIC_VPE_CTL_OFS			0x0000
--

From: Mariusz Kozlowski
Date: Thursday, May 15, 2008 - 11:01 am

Parenthesis fix in include/asm-mips/mach-au1x00/au1000.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

diff -upr linux-2.6.26-rc2-mm1-a/include/asm-mips/mach-au1x00/au1000.h linux-2.6.26-rc2-mm1-b/include/asm-mips/mach-au1x00/au1000.h
--- linux-2.6.26-rc2-mm1-a/include/asm-mips/mach-au1x00/au1000.h	2008-05-15 19:44:48.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-mips/mach-au1x00/au1000.h	2008-05-15 19:52:38.000000000 +0200
@@ -1036,7 +1036,7 @@ enum soc_au1200_ints {
 #define USBD_INTSTAT		0xB020001C
 #  define USBDEV_INT_SOF	(1 << 12)
 #  define USBDEV_INT_HF_BIT	6
-#  define USBDEV_INT_HF_MASK	0x3f << USBDEV_INT_HF_BIT)
+#  define USBDEV_INT_HF_MASK	(0x3f << USBDEV_INT_HF_BIT)
 #  define USBDEV_INT_CMPLT_BIT	0
 #  define USBDEV_INT_CMPLT_MASK (0x3f << USBDEV_INT_CMPLT_BIT)
 #define USBD_CONFIG		0xB0200020
--
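
The three macros fixed above had unbalanced parentheses, so they could not
expand to a well-formed expression on their own. A related reason to keep the
outer parentheses on such definitions is operator precedence at the point of
use. The sketch below uses hypothetical DEMO_* names (only the shift value
mirrors USBDEV_INT_HF_BIT) to show a bare macro body silently changing
meaning.

```c
/* Hypothetical names; the shift count mirrors USBDEV_INT_HF_BIT. */
#define DEMO_HF_BIT		6
#define DEMO_HF_MASK_BARE	0x3f << DEMO_HF_BIT	/* no outer parentheses */
#define DEMO_HF_MASK		(0x3f << DEMO_HF_BIT)	/* fully parenthesized */

/*
 * Expands to 0x3f << 6 * 2. Because '*' binds tighter than '<<',
 * this is 0x3f << 12, not twice the mask.
 */
unsigned int demo_bare_times_two(void)
{
	return DEMO_HF_MASK_BARE * 2;
}

/* Expands to (0x3f << 6) * 2, which is the intended value. */
unsigned int demo_paren_times_two(void)
{
	return DEMO_HF_MASK * 2;
}
```

Here demo_bare_times_two() returns 0x3f000 while demo_paren_times_two()
returns 0x1f80.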

From: Mariusz Kozlowski
Date: Thursday, May 15, 2008 - 11:21 am

Hello,

	To get this I simply modprobe wusbcore. modprobe itself ends with
SIGSEGV. This comes from x86_32.

UWB: workarounds enabled for bugs:445 514 543 548 010612024004
BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [<c01e0e4c>] scatterwalk_start+0xc/0x1f
*pde = 00000000 
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:05.0/resource
Modules linked in: cbc wusbcore(+) uwb radeon drm orinoco_cs orinoco hermes parport_pc parport floppy pcmcia firmware_class rtc psmouse pcspkr 8139too ide_cd_mod cdrom ehci_hcd uhci_hcd usbcore sony_laptop backlight snd_ali5451 snd_ac97_codec ac97_bus snd_pcm snd_timer snd snd_page_alloc yenta_socket rsrc_nonstatic ati_agp agpgart

Pid: 5423, comm: modprobe Not tainted (2.6.26-rc2-mm1 #1)
EIP: 0060:[<c01e0e4c>] EFLAGS: 00010296 CPU: 0
EIP is at scatterwalk_start+0xc/0x1f
EAX: da471c78 EBX: da471c78 ECX: da471c78 EDX: 00000000
ESI: da471dbb EDI: da4a5010 EBP: da471ba8 ESP: da471ba8
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process modprobe (pid: 5423, ti=da471000 task=dceb0000 task.ti=da471000)
Stack: da471bb4 c01e0ea9 00000000 da471bd4 c01e0f96 00000010 da471c78 da4a5010 
       00000010 fffffffc 00000000 da471c04 c01e21ca 00000000 da471c68 da471dc8 
       00000003 00000010 da471c84 da471c78 00000030 da4a5010 da4b8320 da471c34 
Call Trace:
 [<c01e0ea9>] ? scatterwalk_pagedone+0x4a/0x84
 [<c01e0f96>] ? scatterwalk_copychunks+0x2f/0xbb
 [<c01e21ca>] ? blkcipher_walk_next+0x311/0x38b
 [<c01e1cdf>] ? blkcipher_walk_done+0xb2/0x28c
 [<de86e308>] ? crypto_cbc_encrypt+0xc6/0x13b [cbc]
 [<c01e3ac6>] ? aes_encrypt+0x0/0x114d
 [<c02d1ba8>] ? _spin_unlock_irqrestore+0x3e/0x5f
 [<c01f8e34>] ? sg_init_one+0xb/0x66
 [<dedea2ba>] ? wusb_prf+0x2b0/0x3e2 [wusbcore]
 [<c013ec7e>] ? trace_hardirqs_on+0xb/0xd
 [<dedea445>] ? wusb_crypto_init+0x59/0x274 [wusbcore]
 [<c02d1ba8>] ? _spin_unlock_irqrestore+0x3e/0x5f
 [<de85f00b>] ? wusbcore_init+0xb/0x75 [wusbcore]
 [<c0152e69>] ? ...
From: Andrew Morton
Date: Thursday, May 15, 2008 - 11:58 am

From: Inaky Perez-Gonzalez
Date: Thursday, May 15, 2008 - 1:05 pm

This was fixed by David Vrabel recently, s/g arrays weren't
properly initialized (I am to blame for that).

David?

--

From: Alexey Dobriyan
Date: Friday, May 16, 2008 - 3:17 pm

>  linux-next.patch

That's terse. ;-)

Who is responsible for something called "Option High Speed Mobile
Devices"?

It's using create_proc_read_entry() interface, so should be switched
to seq_files before merging.

And "procfs" module parameter is plain stupid, sorry.

--

From: Andrew Morton
Date: Friday, May 16, 2008 - 2:31 pm

On Sat, 17 May 2008 02:17:49 +0400

well, it's a git tree, and all that this implies.  The git URL is

The full changelog is contained in linux-next.patch.  Searching it for
"Option" quickly leads to

commit a50a26ba350a5f32ec6481c85b938fc7fb476671
Author: Greg Kroah-Hartman <gregkh@suse.de>
Date:   Mon Apr 14 11:41:16 2008 -0700

    USB: add option hso driver
    
    This driver is for a number of different Option devices.  Originally
    written by Option and Andrew Bird, but cleaned up massively for
    acceptance into mainline by me (Greg).
    
    TODO:
    	- remove proc files and move to debugfs
    	- review network interfaces
    	- add better changelog information
    	- Use netif_msg_ for the message level rather than module parameter
    	- net_device_stats are now available in dev->stats
    
    Many thanks to the following for their help in cleaning up the driver by
    providing feedback and patches to it:
    	- Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
    	- Oliver Neukum <oliver@neukum.org>
    	- Alan Cox <alan@lxorguk.ukuu.org.uk>
    
    
    Cc: Andrew Bird <ajb@spheresystems.co.uk>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Cc: Filip Aben <f.aben@option.com>
    Cc: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
    Cc: Oliver Neukum <oliver@neukum.org>

stupid people cc'ed ;)
--

From: Greg KH
Date: Friday, May 16, 2008 - 3:00 pm

That parameter is gone, see the patches posted to lkml for an updated
version.

thanks,

greg "i'm stupid" k-h
--

From: Valdis.Kletnieks
Date: Saturday, May 17, 2008 - 3:28 am

Seen in a 'make silentoldconfig':

---
    LED Default ON Trigger (LEDS_TRIGGER_DEFAULT_ON) [N/m/y/?] (NEW) ?

This allows LEDs to be initialised in the ON state.
If unsure, say Y.
---

The default is N, but if unsure, say Y.  Some digging shows that it's because
there's a "depends on LEDS_TRIGGERS" that I had set to N.  I wonder if the
various 'config LEDS_TRIGGER_FOO' in drivers/leds/Kconfig should all be
wrapped in one 'if LEDS_TRIGGERS'?  Kind of like this totally untested patch:

If I'm actually right here, here's a:

Signed-Off-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>

--- linux-2.6.26-rc2-mm1/drivers/leds/Kconfig.before	2008-05-17 06:22:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/drivers/leds/Kconfig	2008-05-17 06:22:55.000000000 -0400
@@ -164,9 +164,9 @@ config LEDS_TRIGGERS
 	  These triggers allow kernel events to drive the LEDs and can
 	  be configured via sysfs. If unsure, say Y.
 
+if LEDS_TRIGGERS
 config LEDS_TRIGGER_TIMER
 	tristate "LED Timer Trigger"
-	depends on LEDS_TRIGGERS
 	help
 	  This allows LEDs to be controlled by a programmable timer
 	  via sysfs. Some LED hardware can be programmed to start
@@ -177,14 +177,13 @@ config LEDS_TRIGGER_TIMER
 
 config LEDS_TRIGGER_IDE_DISK
 	bool "LED IDE Disk Trigger"
-	depends on LEDS_TRIGGERS && BLK_DEV_IDEDISK
+	depends on BLK_DEV_IDEDISK
 	help
 	  This allows LEDs to be controlled by IDE disk activity.
 	  If unsure, say Y.
 
 config LEDS_TRIGGER_HEARTBEAT
 	tristate "LED Heartbeat Trigger"
-	depends on LEDS_TRIGGERS
 	help
 	  This allows LEDs to be controlled by a CPU load average.
 	  The flash frequency is a hyperbolic function of the 1-minute
@@ -193,9 +192,9 @@ config LEDS_TRIGGER_HEARTBEAT
 
 config LEDS_TRIGGER_DEFAULT_ON
 	tristate "LED Default ON Trigger"
-	depends on LEDS_TRIGGERS
 	help
 	  This allows LEDs to be initialised in the ON state.
 	  If unsure, say Y.
 
+endif # LEDS_TRIGGERS
 endif # NEW_LEDS


From: Kamalesh Babulal
Date: Monday, May 19, 2008 - 4:33 am

Hi Andrew,

The 2.6.26-rc2-mm1 kernel gets stuck while booting on an x86_64 machine
with CONFIG_FTRACE_STARTUP_TEST enabled. The following FTRACE-related
.config options are enabled:

CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
CONFIG_FTRACE=y
CONFIG_HAVE_FTRACE=y
CONFIG_DYNAMIC_FTRACE=y

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009dc00 (usable)
 BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000d7fcca00 (usable)
 BIOS-e820: 00000000d7fcca00 - 00000000d7fd0000 (ACPI data)
 BIOS-e820: 00000000d7fd0000 - 00000000d8000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000001e8000000 (usable)
max_pfn_mapped = 1998848
init_memory_mapping
DMI 2.3 present.
ACPI: RSDP 000FDFB0, 0024 (r2 IBM   )
ACPI: XSDT D7FCFF00, 0044 (r1 IBM    SERONYXP     1001 IBM  45444F43)
ACPI: FACP D7FCFE40, 0084 (r2 IBM    SERONYXP     1001 IBM  45444F43)
ACPI: DSDT D7FCCA00, 2AA0 (r2 IBM    SERTURQU     1000 INTL 20041203)
ACPI: FACS D7FCFD00, 0040
ACPI: APIC D7FCFD80, 00B4 (r1 IBM    SERONYXP     1001 IBM  45444F43)
ACPI: MCFG D7FCFD40, 003C (r1 IBM    SERONYXP     1001 IBM  45444F43)
ACPI: SSDT D7FCFA40, 02BD (r2 IBM    YETA0        1000 INTL 20041203)
No NUMA configuration found
Faking a node at 0000000000000000-00000001e8000000
Bootmem setup node 0 0000000000000000-00000001e8000000
  NODE_DATA [0000000000011000 - 0000000000016fff]
  bootmap [0000000000017000 -  0000000000053fff] pages 3d
  early res: 0 [0-fff] BIOS data page
  early res: 1 [6000-7fff] TRAMPOLINE
  early res: 2 [200000-b4e40b] TEXT DATA BSS
  early res: 3 [37e81000-37fefaa0] RAMDISK
  early res: 4 [9dc00-fffff] BIOS reserved
  early res: 5 [8000-10fff] PGTABLE
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1998848
Movable zone start PFN for each ...
From: Steven Rostedt
Date: Monday, May 19, 2008 - 6:02 am

Hi, could you do nmi_watchdog=1 and see if that gives you a stack dump?

Thanks.

-- Steve


--

From: Kamalesh Babulal
Date: Monday, May 19, 2008 - 7:08 am

Hi Steven,

Passing nmi_watchdog=1 did not help in getting any extra information, over the previous

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Steven Rostedt
Date: Monday, May 19, 2008 - 7:38 am

Thanks for trying.

Can you send your config privately to my goodmis account?

  rostedt@goodmis.org

Thanks,

-- Steve
--

From: Mariusz Kozlowski
Date: Tuesday, May 20, 2008 - 3:01 am

Hello,

	This lockdep warning is seen when I remove the PCMCIA wifi card
from the slot. It doesn't happen every time. This is x86_32.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.26-rc2-mm1 #2
-------------------------------------------------------
pccardd/1037 is trying to acquire lock:
 (rtnl_mutex){--..}, at: [<c02870f1>] rtnl_lock+0x14/0x16

but task is already holding lock:
 (&socket->skt_mutex){--..}, at: [<c02608ba>] pccardd+0x161/0x28c

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&socket->skt_mutex){--..}:
       [<c013fff0>] __lock_acquire+0xf3b/0x103b
       [<c0140169>] lock_acquire+0x79/0x92
       [<c02cfcd5>] mutex_lock_nested+0x90/0x290
       [<c02600a6>] pccard_register_pcmcia+0x22/0x78
       [<ded5af02>] pcmcia_bus_add_socket+0x9f/0xe0 [pcmcia]
       [<c0251c02>] class_interface_register+0x83/0xb2
       [<ded6003a>] 0xded6003a
       [<c0146115>] sys_init_module+0x11e/0x18e4
       [<c0103001>] sysenter_past_esp+0x6a/0xa5
       [<ffffffff>] 0xffffffff

-> #1 (&cls->mutex){--..}:
       [<c013fff0>] __lock_acquire+0xf3b/0x103b
       [<c0140169>] lock_acquire+0x79/0x92
       [<c02cfcd5>] mutex_lock_nested+0x90/0x290
       [<c024f4a0>] device_add+0x42f/0x557
       [<c02895a1>] netdev_register_kobject+0x76/0x7b
       [<c027e3f6>] register_netdevice+0x22e/0x39a
       [<c027e599>] register_netdev+0x37/0x44
       [<c03ce7fb>] loopback_net_init+0x38/0x7d
       [<c027bb59>] register_pernet_operations+0x18/0x1a
       [<c027bbd3>] register_pernet_device+0x24/0x51
       [<c03ce7c1>] loopback_init+0x12/0x14
       [<c03b9721>] kernel_init+0x80/0x227
       [<c0103c13>] kernel_thread_helper+0x7/0x10
       [<ffffffff>] 0xffffffff

-> #0 (rtnl_mutex){--..}:
       [<c013fb8e>] __lock_acquire+0xad9/0x103b
       [<c0140169>] lock_acquire+0x79/0x92
       [<c02cfcd5>] mutex_lock_nested+0x90/0x290
       [<c02870f1>] ...
From: Andrew Morton
Date: Tuesday, May 20, 2008 - 3:22 am

cls->mutex

	rtnl_lock

	cls->mutex

This bug has always been there, and is now exposed by the conversion
of cls->mutex from a semaphore to a mutex.  Because lockdep doesn't
check semaphores.

I don't know how to get this fixed, sorry.  I'll just push
struct-class-sem-to-mutex-converting.patch at Greg until it sticks,
then it will go into mainline, then we'll get a shower of bug reports,
including this one, then someone someday will do something about it.

Fun.
--
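
The report above boils down to a cycle in the lock-order graph: skt_mutex is
taken under cls->mutex (chain #2), cls->mutex is taken under rtnl_mutex
(chain #1), and pccardd now wants rtnl_mutex while holding skt_mutex. The
following is a toy model of that check, a simplified sketch with hypothetical
demo_* names and not lockdep's actual data structures or algorithm: each
"taken while held" pair is an edge, and a new edge is refused if the reverse
path already exists.

```c
#include <stdbool.h>

/* Toy lock-order graph; a simplified model, not lockdep's internals. */
enum demo_lock { SKT_MUTEX, RTNL_MUTEX, CLS_MUTEX, DEMO_NR_LOCKS };

struct demo_graph {
	/* edge[a][b] is true when lock b was acquired while a was held. */
	bool edge[DEMO_NR_LOCKS][DEMO_NR_LOCKS];
};

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool demo_reachable(const struct demo_graph *g, int from, int to,
			   bool *seen)
{
	int next;

	if (from == to)
		return true;
	seen[from] = true;
	for (next = 0; next < DEMO_NR_LOCKS; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    demo_reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquire 'taken' while holding 'held'". Returns false, without
 * adding the edge, if a path taken -> ... -> held already exists, since
 * the new edge would then close a cycle.
 */
bool demo_add_dependency(struct demo_graph *g, enum demo_lock held,
			 enum demo_lock taken)
{
	bool seen[DEMO_NR_LOCKS] = { false };

	if (demo_reachable(g, (int)taken, (int)held, seen))
		return false;
	g->edge[held][taken] = true;
	return true;
}
```

Starting from an empty graph, adding cls->mutex -> skt_mutex and then
rtnl_mutex -> cls->mutex both succeed, and the third dependency,
skt_mutex -> rtnl_mutex, is rejected because it would complete the cycle.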
