Re: 2.6.26-rc2-mm1: sparc64 - possible recursive locking detected

To: <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 4:01 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.26-rc2...

- The -mm tree is now based on linux-next.

I will occasionally pick up later versions of trees which are already
in linux-next, to catch material which was added after Stephen last
pulled that tree. That happened this time: git-net had a lot of driver
changes which weren't in linux-next and which I wanted in
2.6.26-rc2-mm1.

- A few more git trees were added: git-ubifs.patch, git-regulator.patch,
git-logfs.patch, git-orion.patch.

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
mm-commits mailing list.

echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
most valuable if you can perform a bisection search to identify which patch
introduced the bug. Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

But beware that this process takes some time (around ten rebuilds and
reboots), so consider reporting the bug first and if we cannot immediately
identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
email Subject: in some manner to reflect the nature of the bug. Some
developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
the mm-commits list. These probab...
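
The "around ten rebuilds" figure for a bisection search follows from halving the candidate patches on every build-and-boot cycle. A minimal sketch of that arithmetic, assuming a clean halving each round; the patch counts used below are illustrative assumptions, and `bisect_steps` is a hypothetical helper, not part of any kernel tool:

```c
#include <assert.h>

/*
 * Worst-case number of build/boot cycles needed to isolate one bad
 * patch out of n_patches, assuming each cycle halves the candidates.
 * This is ceil(log2(n_patches)).
 */
static unsigned int bisect_steps(unsigned int n_patches)
{
	unsigned int steps = 0;

	assert(n_patches > 0);
	while (n_patches > 1) {
		n_patches = (n_patches + 1) / 2;	/* keep the larger half */
		steps++;
	}
	return steps;
}
```

Under that assumption, ten cycles are enough for a series of up to 1024 patches, and each doubling of the series adds only one more rebuild.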

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Tuesday, May 20, 2008 - 6:01 am

Hello,

I see this lockdep warning when I remove a PCMCIA wifi card
from the slot. It doesn't happen every time. This is x86_32.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.26-rc2-mm1 #2
-------------------------------------------------------
pccardd/1037 is trying to acquire lock:
(rtnl_mutex){--..}, at: [<c02870f1>] rtnl_lock+0x14/0x16

but task is already holding lock:
(&socket->skt_mutex){--..}, at: [<c02608ba>] pccardd+0x161/0x28c

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&socket->skt_mutex){--..}:
[<c013fff0>] __lock_acquire+0xf3b/0x103b
[<c0140169>] lock_acquire+0x79/0x92
[<c02cfcd5>] mutex_lock_nested+0x90/0x290
[<c02600a6>] pccard_register_pcmcia+0x22/0x78
[<ded5af02>] pcmcia_bus_add_socket+0x9f/0xe0 [pcmcia]
[<c0251c02>] class_interface_register+0x83/0xb2
[<ded6003a>] 0xded6003a
[<c0146115>] sys_init_module+0x11e/0x18e4
[<c0103001>] sysenter_past_esp+0x6a/0xa5
[<ffffffff>] 0xffffffff

-> #1 (&cls->mutex){--..}:
[<c013fff0>] __lock_acquire+0xf3b/0x103b
[<c0140169>] lock_acquire+0x79/0x92
[<c02cfcd5>] mutex_lock_nested+0x90/0x290
[<c024f4a0>] device_add+0x42f/0x557
[<c02895a1>] netdev_register_kobject+0x76/0x7b
[<c027e3f6>] register_netdevice+0x22e/0x39a
[<c027e599>] register_netdev+0x37/0x44
[<c03ce7fb>] loopback_net_init+0x38/0x7d
[<c027bb59>] register_pernet_operations+0x18/0x1a
[<c027bbd3>] register_pernet_device+0x24/0x51
[<c03ce7c1>] loopback_init+0x12/0x14
[<c03b9721>] kernel_init+0x80/0x227
[<c0103c13>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff

-> #0 (rtnl_...

To: Mariusz Kozlowski <m.kozlowski@...>
Cc: <linux-kernel@...>
Date: Tuesday, May 20, 2008 - 6:22 am

cls->mutex

rtnl_lock

cls->mutex

This bug has always been there, and is now exposed by the conversion
of cls->mutex from a semaphore to a mutex: lockdep doesn't check
semaphores.

I don't know how to get this fixed, sorry. I'll just push
struct-class-sem-to-mutex-converting.patch at Greg until it sticks,
then it will go into mainline, then we'll get a shower of bug reports,
including this one, then someone someday will do something about it.

Fun.
--
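
The report above boils down to a cycle in the lock-ordering graph. Reading the traces, rtnl_mutex is held when cls->mutex is taken (#1), cls->mutex is held when skt_mutex is taken (#2), and pccardd holds skt_mutex when it asks for rtnl_mutex. The following is a minimal user-space sketch of that kind of check, not lockdep's actual implementation; the enum names and `record_dependency` are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* The three locks named in the report above. */
enum lock_id { LOCK_RTNL, LOCK_CLS_MUTEX, LOCK_SKT_MUTEX, NR_LOCKS };

/* edge[a][b] is true once lock b has been acquired while a was held. */
static bool edge[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (edge[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "wanted is acquired while held is held".
 * Returns false, without recording, if the new edge would close a cycle.
 */
static bool record_dependency(enum lock_id held, enum lock_id wanted)
{
	bool seen[NR_LOCKS] = { false };

	assert(held < NR_LOCKS && wanted < NR_LOCKS);
	if (reachable(wanted, held, seen))
		return false;
	edge[held][wanted] = true;
	return true;
}
```

Feeding it rtnl -> cls->mutex, then cls->mutex -> skt_mutex, then skt_mutex -> rtnl rejects the third edge, which is the situation the warning describes. A semaphore never enters such a graph, so the cycle stayed invisible until the conversion.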

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Ingo Molnar <mingo@...>, <srostedt@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Monday, May 19, 2008 - 7:33 am

Hi Andrew,

The 2.6.26-rc2-mm1 kernel hangs while booting on an x86_64 machine
with CONFIG_FTRACE_STARTUP_TEST enabled. The following ftrace-related
.config options are enabled:

CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
CONFIG_FTRACE=y
CONFIG_HAVE_FTRACE=y
CONFIG_DYNAMIC_FTRACE=y

BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009dc00 (usable)
BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000d7fcca00 (usable)
BIOS-e820: 00000000d7fcca00 - 00000000d7fd0000 (ACPI data)
BIOS-e820: 00000000d7fd0000 - 00000000d8000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 00000001e8000000 (usable)
max_pfn_mapped = 1998848
init_memory_mapping
DMI 2.3 present.
ACPI: RSDP 000FDFB0, 0024 (r2 IBM )
ACPI: XSDT D7FCFF00, 0044 (r1 IBM SERONYXP 1001 IBM 45444F43)
ACPI: FACP D7FCFE40, 0084 (r2 IBM SERONYXP 1001 IBM 45444F43)
ACPI: DSDT D7FCCA00, 2AA0 (r2 IBM SERTURQU 1000 INTL 20041203)
ACPI: FACS D7FCFD00, 0040
ACPI: APIC D7FCFD80, 00B4 (r1 IBM SERONYXP 1001 IBM 45444F43)
ACPI: MCFG D7FCFD40, 003C (r1 IBM SERONYXP 1001 IBM 45444F43)
ACPI: SSDT D7FCFA40, 02BD (r2 IBM YETA0 1000 INTL 20041203)
No NUMA configuration found
Faking a node at 0000000000000000-00000001e8000000
Bootmem setup node 0 0000000000000000-00000001e8000000
NODE_DATA [0000000000011000 - 0000000000016fff]
bootmap [0000000000017000 - 0000000000053fff] pages 3d
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] TRAMPOLINE
early res: 2 [200000-b4e40b] TEXT DATA BSS
early res: 3 [37e81000-37fefaa0] RAMDISK
early res: 4 [9dc00-fffff] BIOS reserved
early res: 5 [8000-10fff] PGTABLE
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 1998848
Movable zone start PFN for each node
e...

To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Ingo Molnar <mingo@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Monday, May 19, 2008 - 9:02 am

Hi, could you do nmi_watchdog=1 and see if that gives you a stack dump?

Thanks.

-- Steve

--

To: Steven Rostedt <srostedt@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Ingo Molnar <mingo@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Monday, May 19, 2008 - 10:08 am

Hi Steven,

Passing nmi_watchdog=1 did not help in getting any extra information, over the previous

--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Ingo Molnar <mingo@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>, <rostedt@...>
Date: Monday, May 19, 2008 - 10:38 am

Thanks for trying.

Can you send your config privately to my goodmis account?

rostedt@goodmis.org

Thanks,

-- Steve
--

To: Andrew Morton <akpm@...>, <rpurdie@...>
Cc: <linux-kernel@...>
Date: Saturday, May 17, 2008 - 6:28 am

Seen in a 'make silentoldconfig':

---
LED Default ON Trigger (LEDS_TRIGGER_DEFAULT_ON) [N/m/y/?] (NEW) ?

This allows LEDs to be initialised in the ON state.
If unsure, say Y.
---

The default is N, but if unsure, say Y. Some digging shows that it's because
there's a "depends on LEDS_TRIGGERS" that I had set to N. I wonder if the
various 'config LEDS_TRIGGER_FOO' in drivers/leds/Kconfig should all be
wrapped in one 'if LEDS_TRIGGERS'? Kind of like this totally untested patch:

If I'm actually right here, here's a:

Signed-Off-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>

--- linux-2.6.26-rc2-mm1/drivers/leds/Kconfig.before 2008-05-17 06:22:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/drivers/leds/Kconfig 2008-05-17 06:22:55.000000000 -0400
@@ -164,9 +164,9 @@ config LEDS_TRIGGERS
These triggers allow kernel events to drive the LEDs and can
be configured via sysfs. If unsure, say Y.

+if LEDS_TRIGGERS
config LEDS_TRIGGER_TIMER
tristate "LED Timer Trigger"
- depends on LEDS_TRIGGERS
help
This allows LEDs to be controlled by a programmable timer
via sysfs. Some LED hardware can be programmed to start
@@ -177,14 +177,13 @@ config LEDS_TRIGGER_TIMER

config LEDS_TRIGGER_IDE_DISK
bool "LED IDE Disk Trigger"
- depends on LEDS_TRIGGERS && BLK_DEV_IDEDISK
+ depends on BLK_DEV_IDEDISK
help
This allows LEDs to be controlled by IDE disk activity.
If unsure, say Y.

config LEDS_TRIGGER_HEARTBEAT
tristate "LED Heartbeat Trigger"
- depends on LEDS_TRIGGERS
help
This allows LEDs to be controlled by a CPU load average.
The flash frequency is a hyperbolic function of the 1-minute
@@ -193,9 +192,9 @@ config LEDS_TRIGGER_HEARTBEAT

config LEDS_TRIGGER_DEFAULT_ON
tristate "LED Default ON Trigger"
- depends on LEDS_TRIGGERS
help
This allows LEDs to be initialised in the ON state.
If unsure, say Y.

+endif # LEDS_TRIGGERS
endif # NEW_LEDS

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <linux-usb@...>
Date: Friday, May 16, 2008 - 6:17 pm

> linux-next.patch

That's terse. ;-)

Who is responsible for something called "Option High Speed Mobile
Devices"?

It's using create_proc_read_entry() interface, so should be switched
to seq_files before merging.

And "procfs" module parameter is plain stupid, sorry.

--

To: Alexey Dobriyan <adobriyan@...>
Cc: <linux-kernel@...>, <linux-usb@...>, Greg KH <greg@...>, Andrew Bird <ajb@...>
Date: Friday, May 16, 2008 - 5:31 pm

On Sat, 17 May 2008 02:17:49 +0400

well, it's a git tree, and all that this implies. The git URL is

The full changelog is contained in linux-next.patch. Searching it for
"Option" quickly leads to

commit a50a26ba350a5f32ec6481c85b938fc7fb476671
Author: Greg Kroah-Hartman <gregkh@suse.de>
Date: Mon Apr 14 11:41:16 2008 -0700

USB: add option hso driver

This driver is for a number of different Option devices. Originally
written by Option and Andrew Bird, but cleaned up massivly for
acceptance into mainline by me (Greg).

TODO:
- remove proc files and move to debugfs
- review network interfaces
- add better changelog information
- Use netif_msg_ for the message level rather than module parameter
- net_device_stats are now available in dev->stats

Many thanks to the following for their help in cleaning up the driver by
providing feedback and patches to it:
- Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
- Oliver Neukum <oliver@neukum.org>
- Alan Cox <alan@lxorguk.ukuu.org.uk>

Cc: Andrew Bird <ajb@spheresystems.co.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Filip Aben <f.aben@option.com>
Cc: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Cc: Oliver Neukum <oliver@neukum.org>

stupid people cc'ed ;)
--

To: Andrew Morton <akpm@...>
Cc: Alexey Dobriyan <adobriyan@...>, <linux-kernel@...>, <linux-usb@...>, Andrew Bird <ajb@...>
Date: Friday, May 16, 2008 - 6:00 pm

That parameter is gone, see the patches posted to lkml for an updated
version.

thanks,

greg "i'm stupid" k-h
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Thursday, May 15, 2008 - 2:21 pm

Hello,

To get this I simply modprobe wusbcore. modprobe itself ends with
SIGSEGV. This comes from x86_32.

UWB: workarounds enabled for bugs:445 514 543 548 010612024004
BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [<c01e0e4c>] scatterwalk_start+0xc/0x1f
*pde = 00000000
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:05.0/resource
Modules linked in: cbc wusbcore(+) uwb radeon drm orinoco_cs orinoco hermes parport_pc parport floppy pcmcia firmware_class rtc psmouse pcspkr 8139too ide_cd_mod cdrom ehci_hcd uhci_hcd usbcore sony_laptop backlight snd_ali5451 snd_ac97_codec ac97_bus snd_pcm snd_timer snd snd_page_alloc yenta_socket rsrc_nonstatic ati_agp agpgart

Pid: 5423, comm: modprobe Not tainted (2.6.26-rc2-mm1 #1)
EIP: 0060:[<c01e0e4c>] EFLAGS: 00010296 CPU: 0
EIP is at scatterwalk_start+0xc/0x1f
EAX: da471c78 EBX: da471c78 ECX: da471c78 EDX: 00000000
ESI: da471dbb EDI: da4a5010 EBP: da471ba8 ESP: da471ba8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process modprobe (pid: 5423, ti=da471000 task=dceb0000 task.ti=da471000)
Stack: da471bb4 c01e0ea9 00000000 da471bd4 c01e0f96 00000010 da471c78 da4a5010
00000010 fffffffc 00000000 da471c04 c01e21ca 00000000 da471c68 da471dc8
00000003 00000010 da471c84 da471c78 00000030 da4a5010 da4b8320 da471c34
Call Trace:
[<c01e0ea9>] ? scatterwalk_pagedone+0x4a/0x84
[<c01e0f96>] ? scatterwalk_copychunks+0x2f/0xbb
[<c01e21ca>] ? blkcipher_walk_next+0x311/0x38b
[<c01e1cdf>] ? blkcipher_walk_done+0xb2/0x28c
[<de86e308>] ? crypto_cbc_encrypt+0xc6/0x13b [cbc]
[<c01e3ac6>] ? aes_encrypt+0x0/0x114d
[<c02d1ba8>] ? _spin_unlock_irqrestore+0x3e/0x5f
[<c01f8e34>] ? sg_init_one+0xb/0x66
[<dedea2ba>] ? wusb_prf+0x2b0/0x3e2 [wusbcore]
[<c013ec7e>] ? trace_hardirqs_on+0xb/0xd
[<dedea445>] ? wusb_crypto_init+0x59/0x274 [wusbcore]
[<c02d1ba8>] ? _spin_unlock_irqrestore+0...

To: Mariusz Kozlowski <m.kozlowski@...>
Cc: <linux-kernel@...>, Greg KH <greg@...>, Inaky Perez-Gonzalez <inaky@...>
Date: Thursday, May 15, 2008 - 2:58 pm

--

To: Andrew Morton <akpm@...>, <david.vrabel@...>
Cc: Mariusz Kozlowski <m.kozlowski@...>, <linux-kernel@...>, Greg KH <greg@...>
Date: Thursday, May 15, 2008 - 4:05 pm

This was fixed by David Vrabel recently; the s/g arrays weren't
properly initialized (I am to blame for that).

David?

--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Thursday, May 15, 2008 - 2:01 pm

Parenthesis fix in include/asm-mips/mach-au1x00/au1000.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

diff -upr linux-2.6.26-rc2-mm1-a/include/asm-mips/mach-au1x00/au1000.h linux-2.6.26-rc2-mm1-b/include/asm-mips/mach-au1x00/au1000.h
--- linux-2.6.26-rc2-mm1-a/include/asm-mips/mach-au1x00/au1000.h 2008-05-15 19:44:48.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-mips/mach-au1x00/au1000.h 2008-05-15 19:52:38.000000000 +0200
@@ -1036,7 +1036,7 @@ enum soc_au1200_ints {
#define USBD_INTSTAT 0xB020001C
# define USBDEV_INT_SOF (1 << 12)
# define USBDEV_INT_HF_BIT 6
-# define USBDEV_INT_HF_MASK 0x3f << USBDEV_INT_HF_BIT)
+# define USBDEV_INT_HF_MASK (0x3f << USBDEV_INT_HF_BIT)
# define USBDEV_INT_CMPLT_BIT 0
# define USBDEV_INT_CMPLT_MASK (0x3f << USBDEV_INT_CMPLT_BIT)
#define USBD_CONFIG 0xB0200020
--
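
A macro body is only parsed where the macro is expanded, which is presumably how the unbalanced define above went unnoticed: nothing was using it. The following sketch shows the corrected mask in use. The two defines are copied from the patch; `usbdev_int_hf` is a hypothetical helper written only for this illustration:

```c
#include <assert.h>

#define USBDEV_INT_HF_BIT	6
#define USBDEV_INT_HF_MASK	(0x3f << USBDEV_INT_HF_BIT)

/* With the parentheses balanced, the mask covers bits 6..11. */
static_assert(USBDEV_INT_HF_MASK == 0xfc0, "mask covers bits 6..11");

/* Hypothetical helper: extract the HF field from an INTSTAT value. */
static unsigned int usbdev_int_hf(unsigned int intstat)
{
	return (intstat & USBDEV_INT_HF_MASK) >> USBDEV_INT_HF_BIT;
}
```

The outer parentheses also keep the shift from regrouping when the macro lands inside a larger expression, so the field comes out the same whatever bits surround it.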

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Thursday, May 15, 2008 - 1:59 pm

Parenthesis fix in include/asm-mips/gic.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

diff -upr linux-2.6.26-rc2-mm1-a/include/asm-mips/gic.h linux-2.6.26-rc2-mm1-b/include/asm-mips/gic.h
--- linux-2.6.26-rc2-mm1-a/include/asm-mips/gic.h 2008-05-15 19:44:48.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-mips/gic.h 2008-05-15 19:52:20.000000000 +0200
@@ -330,7 +330,7 @@

#define GIC_SH_RMASK_OFS 0x0300
#define GIC_CLR_INTR_MASK(intr, val) \
- GICWRITE(GIC_REG_ADDR(SHARED, GIC_SH_RMASK_OFS + 4 + (((((intr) / 32) ^ 1) - 1) * 4)), ((val) << ((intr) % 32))
+ GICWRITE(GIC_REG_ADDR(SHARED, GIC_SH_RMASK_OFS + 4 + (((((intr) / 32) ^ 1) - 1) * 4)), ((val) << ((intr) % 32)))

/* Register Map for Local Section */
#define GIC_VPE_CTL_OFS 0x0000
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Thursday, May 15, 2008 - 1:58 pm

Parenthesis fix in include/asm-arm/arch-omap/control.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

diff -upr linux-2.6.26-rc2-mm1-a/include/asm-arm/arch-omap/control.h linux-2.6.26-rc2-mm1-b/include/asm-arm/arch-omap/control.h
--- linux-2.6.26-rc2-mm1-a/include/asm-arm/arch-omap/control.h 2008-05-15 19:44:38.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-arm/arch-omap/control.h 2008-05-15 19:51:30.000000000 +0200
@@ -80,7 +80,7 @@
#define OMAP24XX_CONTROL_SEC_TAP (OMAP2_CONTROL_GENERAL + 0x0064)
#define OMAP24XX_CONTROL_OCM_PUB_RAM_ADD (OMAP2_CONTROL_GENERAL + 0x006c)
#define OMAP24XX_CONTROL_EXT_SEC_RAM_START_ADD (OMAP2_CONTROL_GENERAL + 0x0070)
-#define OMAP24XX_CONTROL_EXT_SEC_RAM_STOP_ADD (OMAP2_CONTROL_GENERAL + 0x0074
+#define OMAP24XX_CONTROL_EXT_SEC_RAM_STOP_ADD (OMAP2_CONTROL_GENERAL + 0x0074)
#define OMAP24XX_CONTROL_SEC_STATUS (OMAP2_CONTROL_GENERAL + 0x0080)
#define OMAP24XX_CONTROL_SEC_ERR_STATUS (OMAP2_CONTROL_GENERAL + 0x0084)
#define OMAP24XX_CONTROL_STATUS (OMAP2_CONTROL_GENERAL + 0x0088)

--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 5:54 pm

My HP nx6325 doesn't resume from suspend. It looks like the graphics don't
come up, so probably s2ram is busted.

I'll try to bisect over the weekend, if I have the time (not sure).

Thanks,
Rafael
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <ericvh@...>, <v9fs-developer@...>
Date: Wednesday, May 14, 2008 - 5:16 pm

net/built-in.o: In function `init_p9':
mod.c:(.init.text+0x4b0d): undefined reference to `p9_trans_fd_init'
make[1]: *** [.tmp_vmlinux1] Error 1

CONFIG_NET_9P=y
CONFIG_NET_9P_FD=m
CONFIG_NET_9P_VIRTIO=m
CONFIG_NET_9P_DEBUG=y

# CONFIG_9P_FS is not set

---
~Randy
--

To: Randy Dunlap <randy.dunlap@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <v9fs-developer@...>
Date: Wednesday, May 14, 2008 - 8:00 pm

This is probably a side effect of the merge issue v9fs-devel and -mm
had last week. It should no longer be possible with the code that's
been in my v9fs-devel tree on kernel.org for the past 5 days
(CONFIG_NET_9P_FD no longer exists).

-eric

--

To: Eric Van Hensbergen <ericvh@...>
Cc: <randy.dunlap@...>, <linux-kernel@...>, <v9fs-developer@...>
Date: Wednesday, May 14, 2008 - 8:05 pm

On Wed, 14 May 2008 19:00:12 -0500

But I'm still reverting the v9fs tree due to

git-v9fs is causing i386 allmodconfig failures:

net/9p/trans_fd.o: In function `init_module':
trans_fd.c:(.init.text+0x0): multiple definition of `init_module'
net/9p/mod.o:mod.c:(.init.text+0x0): first defined here
/opt/crosstool/gcc-4.1.0-glibc-2.3.6/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-ld: Warning: size of symbol `init_module' changed from 27 in net/9p/mod.o to 128 in net/9p/trans_fd.o

--

To: Andrew Morton <akpm@...>
Cc: <randy.dunlap@...>, <linux-kernel@...>, <v9fs-developer@...>
Date: Wednesday, May 14, 2008 - 10:29 pm

On Wed, May 14, 2008 at 7:05 PM, Andrew Morton

Okay, clearly I'm doing something wrong. I've tried the allmodconfig
on my local sandbox and it's fine. When I look to see if there is
still a module_init in net/9p/trans_fd on kernel.org via gitweb, I
can't find it. (http://git.kernel.org/?p=linux/kernel/git/ericvh/v9fs.git;a=blob;f=net/9...)

Are you pulling from my v9fs-devel tree or is -mm switched over to
pull from linux-next or something?

-eric
--

To: Eric Van Hensbergen <ericvh@...>
Cc: <randy.dunlap@...>, <linux-kernel@...>, <v9fs-developer@...>
Date: Wednesday, May 14, 2008 - 11:04 pm

It has mysteriously gone away. Perhaps it was triggered by some other

The algorithm to determine this is to look at the first line of -mm's
git-v9fs.patch:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.26-rc2...
has:

GIT 38bfbd9f766f0b33de6bc16fd9ad1018b8fd3fe2 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git#v9fs-devel

Yes, -mm uses both linux-next and git-v9fs (aka #v9fs-devel)

linux-next uses #for-next and afaict that was empty as of a few hours
ago. Nothing for 2.6.27?
--

To: Andrew Morton <akpm@...>
Cc: <randy.dunlap@...>, <linux-kernel@...>, <v9fs-developer@...>
Date: Wednesday, May 14, 2008 - 11:53 pm

On Wed, May 14, 2008 at 10:04 PM, Andrew Morton

Oh, there's stuff for 2.6.27, I'm still working on stabilizing it --
but I put it on hold while I tried to clear out my bugzilla backlog.
Trying to stick to a policy of increased testing and removing bugs
before potentially introducing new ones. It helps that there have
been several additional groups starting to use 9p and hitting corner
cases my testing didn't cover before.

-eric
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>
Date: Wednesday, May 14, 2008 - 5:13 pm

SCSI_DH has some problems when CONFIG_SCSI=n:

drivers/built-in.o: In function `activate_path':
dm-mpath.c:(.text+0x18a292): undefined reference to `scsi_dh_activate'
drivers/built-in.o: In function `multipath_ctr':
dm-mpath.c:(.text+0x18a6f0): undefined reference to `scsi_dh_handler_exist'
make[1]: *** [.tmp_vmlinux1] Error 1

#
# SCSI device support
#
CONFIG_RAID_ATTRS=y
# CONFIG_SCSI is not set
# CONFIG_SCSI_DMA is not set
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y

---
~Randy
--

To: Randy Dunlap <randy.dunlap@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>, Chandra Seetharaman <sekharan@...>
Date: Thursday, May 15, 2008 - 10:46 am

This is one more of those annoying selects. The SCSI_DH Kconfig file is
correctly dependent on SCSI:

menuconfig SCSI_DH
tristate "SCSI Device Handlers"
depends on SCSI
default n
help

but we've also got a select in md/Kconfig:

config DM_MULTIPATH
tristate "Multipath target"
depends on BLK_DEV_DM
select SCSI_DH

Which ignores the dependency.

My best guess for fixing this is either to make the select a depends or
just drop it altogether (after all, it's possible to have multipath on
non-SCSI devices).

James

--

To: James Bottomley <James.Bottomley@...>
Cc: Randy Dunlap <randy.dunlap@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>
Date: Thursday, May 15, 2008 - 3:56 pm

Hi James, Andrew,

Here is a patch to remove the automatic "select" of scsi_dh for
dm-multipath.

Sorry about the mishap.

chandra

To: <sekharan@...>
Cc: James Bottomley <James.Bottomley@...>, Randy Dunlap <randy.dunlap@...>, <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>
Date: Thursday, May 22, 2008 - 11:25 pm

You obviously wanted `static inline' there, but it still fails i386
allmodconfig compilation.

--
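
The `static inline' remark refers to the usual pattern for configured-out interfaces: the header declares the real function when the option is built in or modular, and otherwise supplies a stub. A non-static function defined in a header would be emitted in every file that includes it and collide at link time. A minimal user-space sketch follows; the `const char *` parameter is an assumption based on the call in the patch, and `hw_handler_usable` is a hypothetical stand-in for the dm-mpath caller:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* #define CONFIG_SCSI_DH 1 */	/* uncomment to model SCSI_DH=y */

#if defined(CONFIG_SCSI_DH) || defined(CONFIG_SCSI_DH_MODULE)
/* Real implementation lives elsewhere (built in or in a module). */
extern int scsi_dh_handler_exist(const char *name);
#else
/*
 * Stub for SCSI_DH=n. static inline gives each includer a private
 * copy, so no duplicate symbol reaches the linker.
 */
static inline int scsi_dh_handler_exist(const char *name)
{
	(void)name;
	return 0;	/* no handler can exist */
}
#endif

/* Hypothetical caller standing in for parse_hw_handler(). */
static bool hw_handler_usable(const char *name)
{
	assert(name != NULL);
	return scsi_dh_handler_exist(name) != 0;
}
```

With the option undefined, every handler name is reported as unknown, which matches the stated intent of not allowing hardware handlers when SCSI_DH is absent.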

To: Andrew Morton <akpm@...>
Cc: James Bottomley <James.Bottomley@...>, Randy Dunlap <randy.dunlap@...>, <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>
Date: Friday, May 23, 2008 - 3:39 pm

Yikes.... Sorry again... Hopefully this attached patch works properly.

chandra
-------------------------
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist, just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Hannes Reinecke <hare@suse.de>
---

Index: scsi-misc-2.6/drivers/md/Kconfig
===================================================================
--- scsi-misc-2.6.orig/drivers/md/Kconfig
+++ scsi-misc-2.6/drivers/md/Kconfig
@@ -252,7 +252,6 @@ config DM_ZERO
config DM_MULTIPATH
tristate "Multipath target"
depends on BLK_DEV_DM
- select SCSI_DH
---help---
Allow volume managers to support multipath hardware.

Index: scsi-misc-2.6/drivers/md/dm-mpath.c
===================================================================
--- scsi-misc-2.6.orig/drivers/md/dm-mpath.c
+++ scsi-misc-2.6/drivers/md/dm-mpath.c
@@ -664,6 +664,8 @@ static int parse_hw_handler(struct arg_s
request_module("scsi_dh_%s", m->hw_handler_name);
if (scsi_dh_handler_exist(m->hw_handler_name) == 0) {
ti->error = "unknown hardware handler type";
+ kfree(m->hw_handler_name);
+ m->hw_handler_name = NULL;
return -EINVAL;
}
consume(as, hw_argc - 1);
Index: scsi-misc-2.6/include/scsi/scsi_dh.h
===================================================================
--- scsi-misc-2.6.orig/include/scsi/scsi_dh.h
+++ scsi-misc-2.6/include/scsi/scsi_dh.h
@@ -54,6 +54,16 @@ enum {
SCSI_DH_NOSYS,
SCSI_DH_DRIVER_MAX,
};
-
+#if defined(CONFIG_SCSI_DH) || defined(CONFIG_SCSI_DH_MODULE)
extern int scsi_dh_activate(struct request_queue *);
extern int scsi_dh_han...

To: <sekharan@...>
Cc: Andrew Morton <akpm@...>, James Bottomley <James.Bottomley@...>, <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>
Date: Friday, May 23, 2008 - 4:28 pm

Did it build cleanly for you?
Hint:

--
~Randy
--

To: Randy Dunlap <randy.dunlap@...>
Cc: Andrew Morton <akpm@...>, James Bottomley <James.Bottomley@...>, <linux-kernel@...>, <hare@...>, scsi <linux-scsi@...>
Date: Friday, May 23, 2008 - 9:16 pm

Oh, my... it is getting very tricky.

Here is a patch that compiles clean in different combinations. But, I
agree that the "depends" (under DM_MULTIPATH) sure looks weird.
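
The odd-looking "depends on SCSI_DH || !SCSI_DH" makes sense once Kconfig's tristate arithmetic is spelled out: n, m and y evaluate to 0, 1 and 2, `||' is max, `&&' is min, `!' is 2 minus the value, and multiple "depends on" lines are ANDed into an upper limit for the symbol. A small sketch of that evaluation; the function names are hypothetical and this models only the two dependencies in the patch below:

```c
#include <assert.h>

enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

static enum tristate tri_not(enum tristate a)
{
	assert(a <= TRI_Y);
	return (enum tristate)(2 - a);
}

static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Upper limit on DM_MULTIPATH: BLK_DEV_DM && (SCSI_DH || !SCSI_DH). */
static enum tristate dm_multipath_limit(enum tristate blk_dev_dm,
					enum tristate scsi_dh)
{
	return tri_and(blk_dev_dm, tri_or(scsi_dh, tri_not(scsi_dh)));
}
```

The expression is y for SCSI_DH=n and SCSI_DH=y but only m for SCSI_DH=m, so DM_MULTIPATH cannot be built in while the functions it calls live in a module.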

-----------
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist, just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also. Make sure it doesn't allow DM_MULTIPATH
to be compiled in when SCSI_DH is a module.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Mike Anderson <andmike@us.ibm.com>
Cc: Hannes Reinecke <hare@suse.de>
---

Index: scsi-misc-2.6/drivers/md/Kconfig
===================================================================
--- scsi-misc-2.6.orig/drivers/md/Kconfig
+++ scsi-misc-2.6/drivers/md/Kconfig
@@ -252,7 +252,7 @@ config DM_ZERO
config DM_MULTIPATH
tristate "Multipath target"
depends on BLK_DEV_DM
- select SCSI_DH
+ depends on SCSI_DH || !SCSI_DH
---help---
Allow volume managers to support multipath hardware.

Index: scsi-misc-2.6/drivers/md/dm-mpath.c
===================================================================
--- scsi-misc-2.6.orig/drivers/md/dm-mpath.c
+++ scsi-misc-2.6/drivers/md/dm-mpath.c
@@ -664,6 +664,8 @@ static int parse_hw_handler(struct arg_s
request_module("scsi_dh_%s", m->hw_handler_name);
if (scsi_dh_handler_exist(m->hw_handler_name) == 0) {
ti->error = "unknown hardware handler type";
+ kfree(m->hw_handler_name);
+ m->hw_handler_name = NULL;
return -EINVAL;
}
consume(as, hw_argc - 1);
Index: scsi-misc-2.6/include/scsi/scsi_dh.h
===================================================================
--- scsi-misc-2.6.orig/include/scsi/scsi_dh.h
+++ scsi-...

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 4:49 pm

No good on my first attempt. Here is what I ran into:

The printk timestamps have gone wild. I cannot paste a dmesg but here
is one line I wrote down:
[17180644.495790] Testing tracer ftrace: NMI watchdog ...

Which leads into the next problem: The kernel freezes after Testing
tracer ftrace. Then I rebooted with my special testing command line
"kernel /bzImage-2.6.26-rc2-mm1 root=/dev/sda2 rootfstype=reiser4
rootflags=defaults,noatime i8042.nomux elevator=cfq resume=/dev/sda3
panic=5 nmi_watchdog=2,panic debug idle=poll nohz=off"

and I got the same freeze but then the NMI watchdog message. Which is
the third problem.

Why did the NMI watchdog not panic and reboot the system? It detected
the lock and printed the message. It should have then panicked, waited
5 seconds, and rebooted.

System is a 64-bit Gentoo AMD-64 Compaq R3000 laptop. Compiler is GCC
4.3.

Config follows:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.26-rc2-mm1
# Wed May 14 09:59:19 2008
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_DEFCONFIG_LIST="arch/x86/configs/x86_64_defconfig"
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARC...

To: Zan Lynx <zlynx@...>
Cc: <linux-kernel@...>, Ingo Molnar <mingo@...>, Thomas Gleixner <tglx@...>
Date: Wednesday, May 14, 2008 - 5:00 pm

On Wed, 14 May 2008 14:49:07 -0600

I've seen reports like this against mainline, but I'm not sure that

Thanks.
--

To: Andrew Morton <akpm@...>
Cc: Zan Lynx <zlynx@...>, <linux-kernel@...>, Ingo Molnar <mingo@...>, Thomas Gleixner <tglx@...>
Date: Wednesday, May 14, 2008 - 5:14 pm

I've reported problems with -next and ftrace. The timestamps look very similar
to what I've seen as well. I don't have those kernels available anymore - I
decided to wipe my system and move to a distro where it's easier to test new
kernels.

However, it stands to reason that it isn't ftrace actually causing the

That is quite similar... I'm on a Core2Duo (Dell Inspiron 1420 laptop) and was
seeing the problems with GCC4.3 and a pure 64bit userland.

--
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Ingo Molnar <mingo@...>, Thomas Gleixner <tglx@...>, me <dhazelton@...>
Date: Wednesday, May 14, 2008 - 6:06 pm

I disabled a bunch of trace and self test options, and I am now running
2.6.26-rc2-mm1. So far, so good.

I am including some dmesg with the weird timestamps in case it is useful
to anyone.

[ 0.000000] Linux version 2.6.26-rc2-mm1 (lynx@zephyr) (gcc version 4.3.0 (Gentoo 4.3.0 p1.0) ) #13 SMP Wed May 14 15:16:26 MDT 2008
[ 0.000000] Command line: root=/dev/sda2 rootfstype=reiser4 rootflags=defaults,noatime i8042.nomux elevator=cfq resume=/dev/sda3 panic=5 debug
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000d0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003ff70000 (usable)
[ 0.000000] BIOS-e820: 000000003ff70000 - 000000003ff7f000 (ACPI data)
[ 0.000000] BIOS-e820: 000000003ff7f000 - 000000003ff80000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[ 0.000000] max_pfn_mapped = 1048576
[ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[ 0.000000] init_memory_mapping
[ 0.000000] DMI present.
[ 0.000000] ACPI: RSDP 000F7240, 0014 (r0 PTLTD )
[ 0.000000] ACPI: RSDT 3FF7A87E, 0034 (r1 PTLTD RSDT 6040000 LTP 0)
[ 0.000000] ACPI: FACP 3FF7EE13, 0074 (r1 NVIDIA CK8 6040000 PTL_ F4240)
[ 0.000000] ACPI: DSDT 3FF7A8B2, 4561 (r1 NVIDIA CK8 6040000 MSFT 100000E)
[ 0.000000] ACPI: FACS 3FF7FFC0, 0040
[ 0.000000] ACPI: APIC 3FF7EE87, 005A (r1 NVIDIA NV_APIC_ 6040000 LTP 0)
[ 0.000000] ACPI: BOOT 3FF7EEE1, 0028 (r1 PTLTD $SBFTBL$ 6040000 LTP 1)
[ 0.000000] ACPI: SSDT 3FF7EF09, 00F7 (r1 PTLTD POWERNOW 6040000 LTP 1)
[ 0.000000] ACPI: DMI detected: Hewlett-Packard
[ 0.000000] early r...

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <davidel@...>, <drepper@...>
Date: Wednesday, May 14, 2008 - 4:43 pm

With CONFIG_*FD=n:
# CONFIG_SIGNALFD is not set
# CONFIG_TIMERFD is not set
# CONFIG_EVENTFD is not set

the build fails with:

arch/x86/kernel/built-in.o: In function `sys_call_table':
(.rodata+0x89c): undefined reference to `sys_signalfd4'
arch/x86/kernel/built-in.o: In function `sys_call_table':
(.rodata+0x8a0): undefined reference to `sys_eventfd2'
make[1]: *** [.tmp_vmlinux1] Error 1
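
Link errors of this kind are normally avoided by giving each optional syscall a weak fallback, which is what cond_syscall() in kernel/sys_ni.c does with sys_ni_syscall. A minimal userspace sketch of that idea, assuming GCC or Clang on an ELF target; the demo_ names are hypothetical and this is not the patch that was eventually applied:

```c
#include <assert.h>
#include <errno.h>

/* Fallback used for every syscall whose implementation is compiled out. */
long demo_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Weak alias: if no other object file provides a strong definition of
 * demo_sys_signalfd4, the linker resolves it to demo_ni_syscall
 * instead of failing with "undefined reference".
 */
long demo_sys_signalfd4(void) __attribute__((weak, alias("demo_ni_syscall")));

/* The table entry can now be filled in whatever the configuration is. */
typedef long (*demo_syscall_fn)(void);
static const demo_syscall_fn demo_call_table[] = { demo_sys_signalfd4 };

/* Dispatch through the table, as the syscall entry path would. */
long demo_invoke(unsigned int nr)
{
	assert(nr < sizeof(demo_call_table) / sizeof(demo_call_table[0]));
	return demo_call_table[nr]();
}
```

With the implementation absent, demo_invoke(0) ends up in the fallback and returns -ENOSYS rather than breaking the link.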

---
~Randy
--

To: lkml <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 4:39 pm

Using WARN() with CONFIG_BUG=n causes:

linux-2.6.26-rc2-mm1/lib/kobject.c: In function 'kobject_add_internal':
linux-2.6.26-rc2-mm1/lib/kobject.c:218: error: implicit declaration of function 'WARN'
make[2]: *** [lib/kobject.o] Error 1
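
An "implicit declaration" error like this usually means the macro is defined only in the branch of the header that is active when the option is enabled. A sketch of the pattern that avoids it, where both branches define the macro and the disabled one merely evaluates the condition; DEMO_CONFIG_BUG, DEMO_WARN and the other demo_ names are hypothetical, and this is not the fix that was merged:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#ifdef DEMO_CONFIG_BUG
/* Reporting variant: print the message when the condition is true. */
static int demo_warn(int cond, const char *fmt, ...)
{
	if (cond) {
		va_list ap;

		va_start(ap, fmt);
		vfprintf(stderr, fmt, ap);
		va_end(ap);
	}
	return cond;
}
#define DEMO_WARN(cond, ...) demo_warn(!!(cond), __VA_ARGS__)
#else
/* Stub variant: no output, but the condition is still evaluated. */
#define DEMO_WARN(cond, ...) (!!(cond))
#endif

/* A caller that compiles and behaves the same in both configurations. */
int demo_check_parent(const void *parent)
{
	if (DEMO_WARN(parent == NULL, "demo: missing parent\n"))
		return -1;
	return 0;
}

/* Both variants must agree on the value the macro evaluates to. */
void demo_selfcheck(void)
{
	int obj = 0;

	assert(DEMO_WARN(0, "demo: not printed\n") == 0);
	assert(demo_check_parent(&obj) == 0);
}
```

Building the same file with and without -DDEMO_CONFIG_BUG exercises both branches.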

---
~Randy
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 5:16 pm

mkfs.ext2 became kick-ass slow:

+ sudo mkfs.ext2 -F
mke2fs 1.40.6 (09-Feb-2008)
Warning: 256-byte inodes not usable on older systems
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
9773056 inodes, 39072726 blocks
1953636 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1193 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
...

Writing inode tables: 193/1193
^^^^
counter moves slowly, with
occasional bursts of counting at what seems
to be normal speed.

160 GB SATA disk, no partitions.
According to sysfs, CFQ is in use, the rest is compiled out.
2.6.26-rc2 is fine, mkfs takes ~1 min.

Slowdown is totally reproducible.

CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_SATA_AHCI=y
CONFIG_ATA_SFF=y
CONFIG_ATA_PIIX=y
CONFIG_PATA_JMICRON=y

/sys/block/sdb/queue/iosched/back_seek_max
16384
/sys/block/sdb/queue/iosched/back_seek_penalty
2
/sys/block/sdb/queue/iosched/fifo_expire_async
250
/sys/block/sdb/queue/iosched/fifo_expire_sync
120
/sys/block/sdb/queue/iosched/quantum
4
/sys/block/sdb/queue/iosched/slice_async
40
/sys/block/sdb/queue/iosched/slice_async_rq
2
/sys/block/sdb/queue/iosched/slice_idle
10
/sys/block/sdb/queue/iosched/slice_sync
100

--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 5:33 pm

Here is where it spends time (seems to be always the same):

mkfs.ext2 D 0000000000000000 0 4760 4759
ffff81017ce93a58 0000000000000046 0000000000000000 0000000000000282
ffff81017e14d640 ffffffff8056f4c0 ffff81017e14d880 ffffffff804679a2
00000000ffffb5c4 000000007ce93a68 0000000000000003 ffffffff8023d504
Call Trace:
[<ffffffff804679a2>] ? _spin_unlock_irqrestore+0x42/0x80
[<ffffffff8023d504>] ? __mod_timer+0xc4/0x110
[<ffffffff80465012>] schedule_timeout+0x62/0xe0
[<ffffffff8023cee0>] ? process_timeout+0x0/0x10
[<ffffffff80464ef8>] io_schedule_timeout+0x28/0x40
[<ffffffff8027663a>] congestion_wait+0x8a/0xb0
[<ffffffff80248720>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8026fe31>] balance_dirty_pages_ratelimited_nr+0x1a1/0x3f0
[<ffffffff8026915f>] generic_file_buffered_write+0x1ff/0x740
[<ffffffff80467870>] ? _spin_unlock+0x30/0x60
[<ffffffff802acafb>] ? mnt_drop_write+0x7b/0x160
[<ffffffff80269b30>] __generic_file_aio_write_nolock+0x2a0/0x460
[<ffffffff802548ed>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff80269df7>] generic_file_aio_write_nolock+0x37/0xa0
[<ffffffff80292be1>] do_sync_write+0xf1/0x130
[<ffffffff80256485>] ? trace_hardirqs_on_caller+0xd5/0x160
[<ffffffff80248720>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80256485>] ? trace_hardirqs_on_caller+0xd5/0x160
[<ffffffff8025651d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff8029339a>] vfs_write+0xaa/0xe0
[<ffffffff80293940>] sys_write+0x50/0x90
[<ffffffff8020b69b>] system_call_after_swapgs+0x7b/0x80

--

To: Alexey Dobriyan <adobriyan@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-ext4@...>, Al Viro <viro@...>, <linux-fsdevel@...>
Date: Thursday, May 15, 2008 - 5:41 pm

And not only mkfs, ld took ages to link vmlinux.o:
ld D 0000000000000000 0 17340 17339
ffff8100681819c8 0000000000000082 0000000000000000 ffff81006818198c
ffffffff806c90c0 ffff81006b50d2e0 ffffffff80636360 ffff81006b50d558
0000000068181978 0000000100a7523e ffff81006b50d558 0000000100a75269
Call Trace:
[<ffffffff805056b2>] schedule_timeout+0x62/0xd0
[<ffffffff802403b0>] ? process_timeout+0x0/0x10
[<ffffffff805056ad>] ? schedule_timeout+0x5d/0xd0
[<ffffffff80504956>] io_schedule_timeout+0x76/0xd0
[<ffffffff80282cac>] congestion_wait+0x6c/0x90
[<ffffffff8024c2c0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8027c82f>] balance_dirty_pages_ratelimited_nr+0x13f/0x330
[<ffffffff80275a3d>] generic_file_buffered_write+0x1dd/0x6d0
[<ffffffff8027d0e7>] ? __do_page_cache_readahead+0x167/0x220
[<ffffffff802763ae>] __generic_file_aio_write_nolock+0x25e/0x450
[<ffffffff80276c75>] ? generic_file_aio_read+0x565/0x640
[<ffffffff80276607>] generic_file_aio_write+0x67/0xd0
[<ffffffff802f8bd6>] ext3_file_write+0x26/0xc0
[<ffffffff8029ffa1>] do_sync_write+0xf1/0x140
[<ffffffff8024c2c0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80289703>] ? remove_vma+0x53/0x70
[<ffffffff80505a01>] ? mutex_lock+0x11/0x30
[<ffffffff802a0a2b>] vfs_write+0xcb/0x190
[<ffffffff802a0be0>] sys_write+0x50/0x90
[<ffffffff8020b82b>] system_call_after_swapgs+0x7b/0x80
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Wednesday, May 14, 2008 - 3:12 pm

On Wed, May 14, 2008 at 10:01 AM, Andrew Morton

Nice! This one works for me again.

But somehow the NUMAness of my system is gone.

2.6.26-rc2-mm1:
[ 0.000000] max_pfn_mapped = 1179648
[ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[ 0.000000] init_memory_mapping
[ 0.000000] DMI present.
[ 0.000000] ACPI: RSDP 000FB080, 0024 (r2 ACPIAM)
[ 0.000000] ACPI: XSDT DFFD0100, 0064 (r1 A_M_I_ OEMXSDT 4000713 MSFT 97)
[ 0.000000] ACPI: FACP DFFD0290, 00F4 (r3 A_M_I_ OEMFACP 4000713 MSFT 97)
[ 0.000000] ACPI: DSDT DFFD0450, 4FC5 (r1 S0027 S0027000 0 INTL 20051117)
[ 0.000000] ACPI: FACS DFFDE000, 0040
[ 0.000000] ACPI: APIC DFFD0390, 0080 (r1 A_M_I_ OEMAPIC 4000713 MSFT 97)
[ 0.000000] ACPI: MCFG DFFD0410, 003C (r1 A_M_I_ OEMMCFG 4000713 MSFT 97)
[ 0.000000] ACPI: OEMB DFFDE040, 0060 (r1 A_M_I_ AMI_OEM 4000713 MSFT 97)
[ 0.000000] ACPI: HPET DFFD5420, 0038 (r1 A_M_I_ OEMHPET0 4000713 MSFT 97)
[ 0.000000] ACPI: MCFG DFFD5460, 003C (r1 A_M_I_ OEMMCFG 4000713 MSFT 97)
[ 0.000000] ACPI: SRAT DFFD54A0, 0110 (r1 AMD HAMMER 1 AMD 1)
[ 0.000000] ACPI: SSDT DFFD55B0, 04F0 (r1 A_M_I_ POWERNOW 1 AMD 1)
[ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC 2 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 3 -> Node 1
[ 0.000000] SRAT: PXMs only cover 0MB of your 4608MB e820 RAM. Not used.
[ 0.000000] SRAT: SRAT not used.
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at 0000000000000000-0000000120000000
[ 0.000000] Bootmem setup node 0 0000000000000000-0000000120000000
[ 0.000000] NODE_DATA [0000000000001000 - 0000000000004fff]
[ 0.000000] bootmap [000000000000e000 - 0000000000031fff] pages 24
[ 0.000000] early res: 0 [0-fff] BIOS data page
[ 0.000000] early res: 1 [60...

To: Torsten Kaiser <just.for.lkml@...>
Cc: <linux-kernel@...>, Ingo Molnar <mingo@...>
Date: Wednesday, May 14, 2008 - 3:35 pm

On Wed, 14 May 2008 21:12:13 +0200

I suspect that this might be caused by the below.

That patch no longer seems to be necessary so I'll drop it. Perhaps
you could try reverting it, please?

From: Ingo Molnar <mingo@elte.hu>

x86.git testing found the following build error on latest -git:

drivers/acpi/numa.c: In function 'acpi_numa_init':
drivers/acpi/numa.c:226: error: 'NR_NODE_MEMBLKS' undeclared (first use in this function)
drivers/acpi/numa.c:226: error: (Each undeclared identifier is reported only once
drivers/acpi/numa.c:226: error: for each function it appears in.)

with this config:

http://redhat.com/~mingo/misc/config-Wed_Apr_30_22_42_42_CEST_2008.bad

i suspect we dont want SRAT parsing when CONFIG_HAVE_ARCH_PARSE_SRAT
is unset - but the fix looks a bit ugly. Perhaps we should define
NR_NODE_MEMBLKS even in this case and just let the code fall back
to some sane behavior?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

drivers/acpi/numa.c | 4 ++++
1 file changed, 4 insertions(+)

diff -puN drivers/acpi/numa.c~acpi-acpi_numa_init-build-fix drivers/acpi/numa.c
--- a/drivers/acpi/numa.c~acpi-acpi_numa_init-build-fix
+++ a/drivers/acpi/numa.c
@@ -176,6 +176,7 @@ acpi_parse_processor_affinity(struct acp
return 0;
}

+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
static int __init
acpi_parse_memory_affinity(struct acpi_subtable_header * header,
const unsigned long end)
@@ -193,6 +194,7 @@ acpi_parse_memory_affinity(struct acpi_s

return 0;
}
+#endif

static int __init acpi_parse_srat(struct acpi_table_header *table)
{
@@ -221,9 +223,11 @@ int __init acpi_numa_init(void)
if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
acpi_parse_processor_affinity, NR_CPUS);
+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
acpi_parse_memory...

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Ingo Molnar <mingo@...>
Date: Thursday, May 15, 2008 - 1:44 pm

On Wed, May 14, 2008 at 9:35 PM, Andrew Morton

Yes, reverting the patch below gets the system back to its normal state.

[ 0.000000] ACPI: SSDT DFFD55B0, 04F0 (r1 A_M_I_ POWERNOW 1 AMD 1)
[ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC 2 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 3 -> Node 1
[ 0.000000] SRAT: Node 0 PXM 0 0-a0000
[ 0.000000] SRAT: Node 0 PXM 0 100000-80000000
[ 0.000000] SRAT: Node 1 PXM 1 80000000-e0000000
[ 0.000000] SRAT: Node 1 PXM 1 100000000-120000000
[ 0.000000] NUMA: Allocated memnodemap from e000 - 10440
[ 0.000000] NUMA: Using 20 for the hash shift.
[ 0.000000] Bootmem setup node 0 0000000000000000-0000000080000000
[ 0.000000] NODE_DATA [0000000000001000 - 0000000000004fff]
[ 0.000000] bootmap [0000000000011000 - 0000000000020fff] pages 10
[ 0.000000] early res: 0 [0-fff] BIOS data page
[ 0.000000] early res: 1 [6000-7fff] TRAMPOLINE
[ 0.000000] early res: 2 [200000-9601db] TEXT DATA BSS
[ 0.000000] early res: 3 [37ec8000-37fefc27] RAMDISK
[ 0.000000] early res: 4 [9fc00-fffff] BIOS reserved
[ 0.000000] early res: 5 [8000-dfff] PGTABLE
[ 0.000000] early res: 6 [e000-1043f] MEMNODEMAP
[ 0.000000] Bootmem setup node 1 0000000080000000-0000000120000000
[ 0.000000] NODE_DATA [0000000080000000 - 0000000080003fff]
[ 0.000000] bootmap [0000000080004000 - 0000000080017fff] pages 14
[ 0.000000] [ffffe20000000000-ffffe20001bfffff] PMD -> [ffff81000c200000-ffff81000ddfffff] on node 0
[ 0.000000] [ffffe20001c00000-ffffe20003ffffff] PMD -> [ffff810080200000-ffff810081ffffff] on node 1
[ 0.000000] sizeof(struct page) = 56

Just for your information: I'm also using a 64bit Gentoo system with
gcc 4.3.0-alpha20080410 and I'm also seeing these strange time
outputs:

[ 0.000000] NR_CPUS: 4, nr_cpu_ids: 4
[42949372.960000] Built 2 zonelists in ...
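
The value 42949372.960000 is consistent with a timestamp taken from a 32-bit tick counter that starts at INITIAL_JIFFIES, which the kernel defines as -300*HZ so that wraparound bugs surface five minutes after boot. A short check of that arithmetic, assuming HZ=100; the thread neither states the HZ value nor confirms this as the cause, and the demo_ names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_HZ 100	/* assumption: not stated in the thread */

/* 32-bit start value of the tick counter: -300 seconds worth of ticks. */
uint32_t demo_initial_jiffies(void)
{
	return (uint32_t)(-300 * DEMO_HZ);
}

/* Whole seconds represented by a tick count. */
uint32_t demo_stamp_secs(uint32_t ticks)
{
	return ticks / DEMO_HZ;
}

/* Microsecond remainder represented by a tick count. */
uint32_t demo_stamp_usecs(uint32_t ticks)
{
	return (ticks % DEMO_HZ) * (1000000 / DEMO_HZ);
}

/* Format a tick count as "seconds.microseconds", like a printk stamp. */
void demo_format_stamp(uint32_t ticks, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%lu.%06lu",
		 (unsigned long)demo_stamp_secs(ticks),
		 (unsigned long)demo_stamp_usecs(ticks));
}
```

Under that assumption the start value is 4294937296 ticks, i.e. 42949372 seconds plus 960000 microseconds, which matches the first non-zero stamp in the log.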

To: Torsten Kaiser <just.for.lkml@...>
Cc: <linux-kernel@...>, Ingo Molnar <mingo@...>
Date: Thursday, May 15, 2008 - 2:49 pm

Great, thanks for checking.
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <davem@...>, <kernel-testers@...>
Date: Wednesday, May 14, 2008 - 2:29 pm

Hello,

Got this on sparc64 startup:

=============================================
[ INFO: possible recursive locking detected ]
2.6.26-rc2-mm1 #2
---------------------------------------------
modprobe/514 is trying to acquire lock:
(&cls->mutex){--..}, at: [<00000000005ff538>] device_add+0x3c0/0x5c0

but task is already holding lock:
(&cls->mutex){--..}, at: [<000000000060287c>] class_interface_register+0x44/0xe0

other info that might help us debug this:
1 lock held by modprobe/514:
#0: (&cls->mutex){--..}, at: [<000000000060287c>] class_interface_register+0x44/0xe0

stack backtrace:
Call Trace:
[000000000048cc64] __lock_acquire+0x104c/0x1400
[000000000048d098] lock_acquire+0x80/0xa0
[0000000000701898] mutex_lock_nested+0xc0/0x4e0
[00000000005ff538] device_add+0x3c0/0x5c0
[00000000005ff74c] device_register+0x14/0x20
[00000000005ff808] device_create+0xb0/0xe0
[0000000010012ef8] sg_add+0x160/0x380 [sg]
[00000000006028d4] class_interface_register+0x9c/0xe0
[0000000000617050] scsi_register_interface+0x18/0x40
[000000001001c0a4] init_sg+0xac/0x180 [sg]
[00000000004960c8] sys_init_module+0xb0/0x1c0
[00000000004463cc] sys32_init_module+0x14/0x20
[0000000000406294] linux_sparc_syscall32+0x3c/0x40
[0000000000013698] 0x136a0

To: Mariusz Kozlowski <m.kozlowski@...>
Cc: <linux-kernel@...>, <davem@...>, <kernel-testers@...>
Date: Wednesday, May 14, 2008 - 2:41 pm

Yeah, this is a bug which has always been there, afaik. A
semaphore was converted to a mutex. Semaphores don't have lockdep
checking, but mutexes do, so we just now got to find out about it.
Some finger-pointing is occurring over on the scsi list ;)
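
Lockdep validates by lock class rather than by lock instance, so taking a second lock of the same class while one is held gets reported even when the two are different objects, unless the acquisition is annotated as nested. A toy userspace sketch of that check; every demo_ name is hypothetical and this is far simpler than the real validator:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HELD 8

struct demo_lock {
	int class_id;	/* locks initialised at the same site share a class */
};

struct demo_task {
	const struct demo_lock *held[DEMO_MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns 1 if the task already holds a lock of
 * the same class (the "possible recursive locking" case), else 0.
 */
int demo_acquire(struct demo_task *t, const struct demo_lock *l)
{
	int recursive = 0;
	size_t i;

	for (i = 0; i < t->depth; i++)
		if (t->held[i]->class_id == l->class_id)
			recursive = 1;

	assert(t->depth < DEMO_MAX_HELD);
	t->held[t->depth++] = l;
	return recursive;
}

/* Drop the most recently acquired lock. */
void demo_release(struct demo_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

In the report, class_interface_register() holds a &cls->mutex while device_add() takes another lock of that class further down the same call chain, which is the pattern demo_acquire() flags.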

I assume the machine otherwise works OK?
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <davem@...>, <kernel-testers@...>
Date: Wednesday, May 14, 2008 - 2:50 pm

Yes - seems it's running fine. I'm doing some other tests now so if anything pops out
you'll know it.

Mariusz
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <linuxppc-dev@...>, <paulmck@...>, <davem@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Wednesday, May 14, 2008 - 11:34 am

Hi Andrew,

The 2.6.26-rc2-mm1 kernel panics on powerpc while running the LTP tests.
I have attached the gdb output for the pc and lr addresses. The patch
list_for_each_rcu-must-die-networking.patch changes the same lines
that the gdb output lists.

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000481fa0
cpu 0x0: Vector: 300 (Data Access) at [c0000000eae37900]
pc: c000000000481fa0: .inet_create+0xb4/0x330
lr: c000000000413340: .__sock_create+0x190/0x280
sp: c0000000eae37b80
msr: 8000000000009032
dar: 0
dsisr: 40010000
current = 0xc0000000cd201500
paca = 0xc0000000007c3480
pid = 6462, comm = socket01
enter ? for help
[c0000000eae37c30] c000000000413340 .__sock_create+0x190/0x280
[c0000000eae37cf0] c0000000004137e0 .sys_socket+0x40/0x98
[c0000000eae37d90] c000000000438e18 .compat_sys_socketcall+0xc0/0x234
[c0000000eae37e30] c0000000000086b4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff20484
SP (ffc8f770) is in userspace

0xc000000000481fa0 is in inet_create (net/ipv4/af_inet.c:290).
285 /* Look for the requested type/protocol pair. */
286 answer = NULL;
287 lookup_protocol:
288 err = -ESOCKTNOSUPPORT;
289 rcu_read_lock();
290 list_for_each_entry_rcu(answer, &inetsw[sock->type], list) {
291
292 /* Check the non-wild match. */
293 if (protocol == answer->protocol) {
294 if (protocol != IPPROTO_IP)

0xc000000000413340 is in __sock_create (net/socket.c:1171).
1166 goto out_release;
1167
1168 /* Now protected by module ref count */
1169 rcu_read_unlock();
1170
1171 err = pf->create(net, sock, protocol);
1172 if (err < 0)
1173 goto out_module_put;
1174
1175 /*
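
When a loop is converted from list_for_each_rcu() plus list_entry() to list_for_each_entry_rcu(), the loop cursor becomes the typed entry itself, so the loop body must not overwrite it: the macro needs it to reach the next element. Whether that is what went wrong here is not established in this thread. A userspace sketch of the safe shape, with the result kept in a separate pointer; the demo_ types and macros are simplified stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

struct demo_list { struct demo_list *next; };

struct demo_protosw {
	struct demo_list list;
	int protocol;
};

#define demo_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Typed iteration: pos is advanced through pos->member.next each round. */
#define demo_for_each_entry(pos, head, type, member)		\
	for (pos = demo_entry((head)->next, type, member);	\
	     &pos->member != (head);				\
	     pos = demo_entry(pos->member.next, type, member))

/* Insert an entry right after head (circular list, head points to itself). */
void demo_add(struct demo_list *head, struct demo_protosw *e)
{
	assert(head->next != NULL);	/* head must be initialised */
	e->list.next = head->next;
	head->next = &e->list;
}

/* Return the entry matching protocol, or NULL if none matches. */
struct demo_protosw *demo_lookup(struct demo_list *head, int protocol)
{
	struct demo_protosw *pos;
	struct demo_protosw *answer = NULL;	/* kept apart from the cursor */

	demo_for_each_entry(pos, head, struct demo_protosw, list) {
		if (pos->protocol == protocol) {
			answer = pos;
			break;
		}
	}
	return answer;
}
```

Assigning NULL to pos inside the body instead would make the step expression dereference a NULL pointer on the next round.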

--
Thanks & Regards,
Kamales...

To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <linuxppc-dev@...>, <davem@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Wednesday, May 14, 2008 - 12:07 pm

Hmmm.... Does the panic go away when this patch is reverted?

--

To: Paul E. McKenney <paulmck@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linuxppc-dev@...>, <davem@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Wednesday, May 14, 2008 - 4:05 pm

Yes.

--

To: Alexey Dobriyan <adobriyan@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linuxppc-dev@...>, <davem@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Wednesday, May 14, 2008 - 4:32 pm

OK, am awake now, apologies for my confusion. Not sure -what- state
I was in when generating and validating the original...

Thanx, Paul
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <reiserfs-dev@...>, <reiserfs-devel@...>, Andy Whitcroft <apw@...>
Date: Wednesday, May 14, 2008 - 10:03 am

Hi Andrew,

While running the dbench benchmark on a reiserfs filesystem, on an
x86_64 box booted with the 2.6.26-rc2-mm1 kernel, the following
kernel BUG() is seen on the console.

------------[ cut here ]------------
kernel BUG at fs/reiserfs/journal.c:1414!
invalid opcode: 0000 [1] SMP
last sysfs file: /sys/devices/pci0000:20/0000:20:04.1/resource
CPU 3
Modules linked in:
Pid: 5160, comm: umount Not tainted 2.6.26-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff802e47e5>] [<ffffffff802e47e5>] flush_journal_list+0x78/0x575
RSP: 0000:ffff8101fec6fb18 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000003
RDX: 0000000000000000 RSI: ffff81007edf7c00 RDI: ffffc2000210d0a0
RBP: ffff8101fec6fb58 R08: ffff8101fec6e000 R09: 0000000000000001
R10: 0000000000000000 R11: ffffc2000212f1b0 R12: ffff81007edf7c00
R13: 000000000000000d R14: ffff8101fe5d2c00 R15: ffffc2000210d000
FS: 0000000000000000(0000) GS:ffff8101fff07f80(0063) knlGS:00000000f7fbdb20
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7f5d260 CR3: 00000001fb475000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 5160, threadinfo ffff8101fec6e000, task ffff8101fe642c80)
Stack: 00000000fec6fb10 ffffffff802a557b 00000000fc7593f0 ffffc2000210d000
ffff81007f65f900 000000000000000d ffff8101fe5d2c00 ffffc2000210d000
ffff8101fec6fba8 ffffffff802e4aff 000000000000000d 0000000100000001
Call Trace:
[<ffffffff802a557b>] submit_bh+0x105/0x111
[<ffffffff802e4aff>] flush_journal_list+0x392/0x575
[<ffffffff802e7e76>] do_journal_end+0xb6d/0xe0c
[<ffffffff80261f7f>] __writepage+0x0/0x2a
[<ffffffff80263c1f>] pagevec_lookup_tag+0x20/0x2a
[<ffffffff8025ad8e>] wait_on_page_writeback_range+0xeb/0x13e
[<ffffffff802e8376>] do_journal_begin_r+0x261/0x2a2
[<ffffffff802e8a13>] do_journal_release+0x4c/0x180
[<ffffffff8024...

To: Kamalesh Babulal <kamalesh@...>
Cc: <linux-kernel@...>, <reiserfs-devel@...>, Andy Whitcroft <apw@...>, Jeff Mahoney <jeffm@...>
Date: Wednesday, May 14, 2008 - 2:01 pm

This?

--- a/fs/reiserfs/journal.c~reiserfs-convert-j_flush_sem-to-mutex-fix
+++ a/fs/reiserfs/journal.c
@@ -1412,7 +1412,7 @@ static int flush_journal_list(struct sup
/* if flushall == 0, the lock is already held */
if (flushall) {
mutex_lock(&journal->j_flush_mutex);
- } else if (!mutex_trylock(&journal->j_flush_mutex)) {
+ } else if (mutex_trylock(&journal->j_flush_mutex)) {
BUG();
}
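
The one-character change matters because the two primitives report success with opposite values: down_trylock() returns 0 when it gets the semaphore, while mutex_trylock() returns 1 when it gets the mutex. A conversion that keeps the `!` therefore inverts the test. A userspace sketch of the two conventions on top of pthread_mutex_trylock(); the demo_ wrappers are hypothetical:

```c
#include <assert.h>
#include <pthread.h>

/* Semaphore-style result: 0 means the lock was acquired. */
int demo_down_trylock(pthread_mutex_t *m)
{
	return pthread_mutex_trylock(m) == 0 ? 0 : 1;
}

/* Mutex-style result: 1 means the lock was acquired. */
int demo_mutex_trylock(pthread_mutex_t *m)
{
	return pthread_mutex_trylock(m) == 0 ? 1 : 0;
}

/*
 * "The lock must already be held" check, written for the mutex-style
 * convention: if trylock succeeds, nobody was holding the lock.
 * Returns 1 when the lock was held as required, 0 otherwise.
 */
int demo_lock_was_held(pthread_mutex_t *m)
{
	if (demo_mutex_trylock(m)) {
		int rc = pthread_mutex_unlock(m);	/* undo the probe */

		assert(rc == 0);
		(void)rc;
		return 0;
	}
	return 1;
}
```

The kernel code calls BUG() where this sketch returns 0, that is, when the trylock unexpectedly succeeds.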

--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>
Date: Wednesday, May 14, 2008 - 7:24 am

Hi Andrew,

The 2.6.26-rc2-mm1 kernel panics during bootup on an x86_64 machine.

BUG: unable to handle kernel paging request at 0000000000001e08
IP: [<ffffffff8026ac60>] __alloc_pages_internal+0x80/0x470
PGD 0
Oops: 0000 [1] SMP
last sysfs file:
CPU 31
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.26-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff8026ac60>] [<ffffffff8026ac60>] __alloc_pages_internal+0x80/0x470
RSP: 0018:ffff810bf9dbdbc0 EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff810bef4786c0 RCX: 0000000000000001
RDX: 0000000000001e00 RSI: 0000000000000001 RDI: 0000000000001020
RBP: ffff810bf9dbb6d0 R08: 0000000000001020 R09: 0000000000000000
R10: 0000000000000008 R11: ffffffff8046d130 R12: 0000000000001020
R13: 0000000000000001 R14: 0000000000001e00 R15: ffff810bf8d29878
FS: 0000000000000000(0000) GS:ffff810bf916dec0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000001e08 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff810bf9dbc000, task ffff810bf9dbb6d0)
Stack: 0002102000000000 0000000000000002 0000000000000000 0000000200000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 ffff810bef4786c0 0000000000001020 ffffffffffffffff
Call Trace:
[<ffffffff802112e9>] dma_alloc_coherent+0xa9/0x280
[<ffffffff804e8c9e>] tg3_init_one+0xa3e/0x15e0
[<ffffffff8028d0e4>] alternate_node_alloc+0x84/0xd0
[<ffffffff802286fc>] task_rq_lock+0x4c/0x90
[<ffffffff8022de62>] set_cpus_allowed_ptr+0x72/0xf0
[<ffffffff802e12fb>] sysfs_addrm_finish+0x1b/0x210
[<ffffffff802e0f99>] sysfs_find_dirent+0x29/0x40
[<ffffffff8036cc34>] pci_device_probe+0xe4/0x130
[<ffffffff803bfc26>] driver_probe_device+0x96/0x1a0
[<ffffffff803bfdb9>] __driver_attach+0x89/0x...

To: Kamalesh Babulal <kamalesh@...>
Cc: <linux-kernel@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>, <linux-mm@...>
Date: Wednesday, May 14, 2008 - 1:36 pm

grumble. why. There are lots of patches already which changed the
page allocator.

config, please?

Is it NUMA?
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Andy Whitcroft <apw@...>, Balbir Singh <balbir@...>, <linux-mm@...>
Date: Wednesday, May 14, 2008 - 2:21 pm

It is a NUMA box, with 4 nodes.

--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

To: Kamalesh Babulal <kamalesh@...>
Cc: <linux-kernel@...>, <apw@...>, <balbir@...>, <linux-mm@...>
Date: Wednesday, May 14, 2008 - 3:44 pm

On Wed, 14 May 2008 23:51:36 +0530

Can you bisect it please?

Wrecking the page allocator is a fairly unusual thing to do. I'd start
out by looking at *bootmem*.patch and perhaps
acpi-acpi_numa_init-build-fix.patch.
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <apw@...>, <balbir@...>, <linux-mm@...>, <mingo@...>
Date: Sunday, May 18, 2008 - 4:00 am

Bisection points to acpi-acpi_numa_init-build-fix.patch as the cause
of the kernel panic during bootup. With that patch reverted, the
machine boots without the panic.

commit 5dc90c0b2d4bd0127624bab67cec159b2c6c4daf
Author: Ingo Molnar <mingo@elte.hu>
Date: Thu May 1 09:51:47 2008 +0000

acpi-acpi_numa_init-build-fix

x86.git testing found the following build error on latest -git:

drivers/acpi/numa.c: In function 'acpi_numa_init':
drivers/acpi/numa.c:226: error: 'NR_NODE_MEMBLKS' undeclared (first use in this function)
drivers/acpi/numa.c:226: error: (Each undeclared identifier is reported only once
drivers/acpi/numa.c:226: error: for each function it appears in.)

with this config:

http://redhat.com/~mingo/misc/config-Wed_Apr_30_22_42_42_CEST_2008.bad

i suspect we dont want SRAT parsing when CONFIG_HAVE_ARCH_PARSE_SRAT
is unset - but the fix looks a bit ugly. Perhaps we should define
NR_NODE_MEMBLKS even in this case and just let the code fall back
to some sane behavior?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 5d59cb3..8cab8c5 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -176,6 +176,7 @@ acpi_parse_processor_affinity(struct acpi_subtable_header * header,
return 0;
}

+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
static int __init
acpi_parse_memory_affinity(struct acpi_subtable_header * header,
const unsigned long end)
@@ -193,6 +194,7 @@ acpi_parse_memory_affinity(struct acpi_subtable_header * header,

return 0;
}
+#endif

static int __init acpi_parse_srat(struct acpi_table_header *table)
{
@@ -221,9 +223,11 @@ int __init acpi_numa_init(void)
if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
acpi_parse_processo...

To: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>
Cc: LKML <linux-kernel@...>, <apw@...>, <balbir@...>, <linux-mm@...>, <mingo@...>, <kosaki.motohiro@...>, <kosaki.motohiro@...>
Date: Sunday, May 18, 2008 - 1:07 pm

This patch breaks my Fujitsu ia64 NUMA box too.
After reverting it, my test environment works well.

Thanks.
--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, <apw@...>, <balbir@...>, <linux-mm@...>, <mingo@...>, <kosaki.motohiro@...>
Date: Monday, May 19, 2008 - 10:49 am

On HP ia64 numa, that patch causes all memory to show up on node 0, but
otherwise the platform boots and runs. Didn't notice it until I tried
to run some numa tests.

Reverting the patch restores numaness.
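
The symptom reported in this thread (CPU affinity is parsed, but no memory ranges end up assigned to nodes) is what one would expect if the new #ifdef compiles out the memory-affinity pass on configurations where CONFIG_HAVE_ARCH_PARSE_SRAT is not defined. A stripped-down sketch of that effect; DEMO_HAVE_ARCH_PARSE_SRAT and the demo_ functions are hypothetical, and the thread itself does not spell out the mechanism:

```c
#include <assert.h>

struct demo_numa_state {
	int cpu_entries;	/* CPU affinity entries seen */
	int mem_entries;	/* memory affinity entries seen */
};

static void demo_parse_cpu_affinity(struct demo_numa_state *s, int n)
{
	s->cpu_entries += n;
}

#ifdef DEMO_HAVE_ARCH_PARSE_SRAT
static void demo_parse_memory_affinity(struct demo_numa_state *s, int n)
{
	s->mem_entries += n;
}
#endif

/* Same shape as the patched init path: the memory pass is optional. */
void demo_numa_init(struct demo_numa_state *s, int cpus, int mem_ranges)
{
	assert(cpus >= 0 && mem_ranges >= 0);
	demo_parse_cpu_affinity(s, cpus);
#ifdef DEMO_HAVE_ARCH_PARSE_SRAT
	demo_parse_memory_affinity(s, mem_ranges);
#else
	(void)mem_ranges;	/* silently skipped: no error, no warning */
#endif
}

/* With no memory ranges recorded, the caller must fall back to one node. */
int demo_numa_usable(const struct demo_numa_state *s)
{
	return s->mem_entries > 0;
}
```

Built without -DDEMO_HAVE_ARCH_PARSE_SRAT, the code compiles cleanly and simply never records a memory range.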

Lee

--

To: Andrew Morton <akpm@...>
Cc: Kamalesh Babulal <kamalesh@...>, <linux-kernel@...>, <apw@...>, <balbir@...>, <linux-mm@...>
Date: Wednesday, May 14, 2008 - 9:54 pm

On Wed, 14 May 2008 12:44:55 -0700

From the stack trace, it seems NODE_DATA(nid) is NULL.
There are two possible cases:
- nid is bad.
- NODE_DATA(nid) is not initialized...
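
The inference rests on the faulting address: CR2 is 0x1e08 and RDX/R14 hold 0x1e00, small values of the kind produced when a field offset is added to a NULL base pointer. A sketch of that arithmetic with a made-up layout; the demo_ structures and their sizes are hypothetical, chosen only to reproduce the numbers in the oops and not taken from the kernel's pg_data_t:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_zonelist {
	uint64_t cache;		/* offset 0 */
	uint64_t first_zone;	/* offset 8 */
};

struct demo_pgdat {
	char zones[0x1e00];			/* stand-in for earlier members */
	struct demo_zonelist zonelists[2];	/* starts at offset 0x1e00 */
};

/* Address touched when reading a nested field at base + both offsets. */
uintptr_t demo_field_address(uintptr_t base, size_t outer, size_t inner)
{
	return base + outer + inner;
}

/* Address of zonelists[0].first_zone for a given node-data base address. */
uintptr_t demo_first_zone_address(uintptr_t pgdat_base)
{
	assert(offsetof(struct demo_pgdat, zonelists) == 0x1e00);
	return demo_field_address(pgdat_base,
				  offsetof(struct demo_pgdat, zonelists),
				  offsetof(struct demo_zonelist, first_zone));
}
```

With a base of 0 the computed address is 0x1e08, the same shape as the CR2 value above.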

Hmm..
Thanks,
-Kame

--
