ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.26-rc2...
- The -mm tree is now based on linux-next.
I will occasionally pick up later versions of trees which are already
in linux-next, to catch material which was added after Stephen last
pulled that tree. That happened this time: git-net had a lot of driver
changes which weren't in linux-next and which I wanted in
2.6.26-rc2-mm1.

- A few more git trees were added: git-ubifs.patch, git-regulator.patch,
git-logfs.patch, git-orion.patch.

Boilerplate:
- See the `hot-fixes' directory for any important updates to this patchset.
- To fetch an -mm tree using git, use (for example)
git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
mm-commits mailing list.

echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
most valuable if you can perform a bisection search to identify which patch
introduced the bug. Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
But beware that this process takes some time (around ten rebuilds and
reboots), so consider reporting the bug first and if we cannot immediately
identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
email Subject: in some manner to reflect the nature of the bug. Some
developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
the mm-commits list. These probab...
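The "around ten rebuilds" estimate in the boilerplate above follows from binary search: each rebuild halves the range of candidate patches, so a series of roughly a thousand patches needs about ten test builds. The following is a minimal sketch of that search, not the procedure from bisecting-mm-trees.txt; `bisect_series`, `bad_fn` and `bad_from` are hypothetical names, and the predicate stands in for the manual "apply, rebuild, reboot, test" cycle.

```c
#include <stddef.h>

/* Hypothetical test hook: returns nonzero if the tree with the first
 * `applied` patches of the series shows the bug. */
typedef int (*bad_fn)(size_t applied, void *ctx);

/*
 * Return the 1-based index of the first bad patch in a series of `n`
 * patches, assuming that 0 patches applied is good and all `n` applied
 * is bad. If `steps` is non-NULL it receives the number of tests used.
 */
size_t bisect_series(size_t n, bad_fn is_bad, void *ctx, unsigned *steps)
{
    size_t good = 0;    /* largest prefix known to be good */
    size_t bad = n;     /* smallest prefix known to be bad */
    unsigned used = 0;

    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        used++;
        if (is_bad(mid, ctx))
            bad = mid;
        else
            good = mid;
    }
    if (steps)
        *steps = used;
    return bad;
}

/* Example predicate for experiments: the bug arrives with patch *ctx. */
int bad_from(size_t applied, void *ctx)
{
    return applied >= *(size_t *)ctx;
}
```

With a 1000-patch series the loop runs at most ten times, which matches the estimate quoted above.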
Hello,
This lockdep warning is seen when I remove a PCMCIA wifi card
from the slot. Doesn't happen every time. It's x86_32.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.26-rc2-mm1 #2
-------------------------------------------------------
pccardd/1037 is trying to acquire lock:
(rtnl_mutex){--..}, at: [<c02870f1>] rtnl_lock+0x14/0x16

but task is already holding lock:
(&socket->skt_mutex){--..}, at: [<c02608ba>] pccardd+0x161/0x28c

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&socket->skt_mutex){--..}:
[<c013fff0>] __lock_acquire+0xf3b/0x103b
[<c0140169>] lock_acquire+0x79/0x92
[<c02cfcd5>] mutex_lock_nested+0x90/0x290
[<c02600a6>] pccard_register_pcmcia+0x22/0x78
[<ded5af02>] pcmcia_bus_add_socket+0x9f/0xe0 [pcmcia]
[<c0251c02>] class_interface_register+0x83/0xb2
[<ded6003a>] 0xded6003a
[<c0146115>] sys_init_module+0x11e/0x18e4
[<c0103001>] sysenter_past_esp+0x6a/0xa5
[<ffffffff>] 0xffffffff

-> #1 (&cls->mutex){--..}:
[<c013fff0>] __lock_acquire+0xf3b/0x103b
[<c0140169>] lock_acquire+0x79/0x92
[<c02cfcd5>] mutex_lock_nested+0x90/0x290
[<c024f4a0>] device_add+0x42f/0x557
[<c02895a1>] netdev_register_kobject+0x76/0x7b
[<c027e3f6>] register_netdevice+0x22e/0x39a
[<c027e599>] register_netdev+0x37/0x44
[<c03ce7fb>] loopback_net_init+0x38/0x7d
[<c027bb59>] register_pernet_operations+0x18/0x1a
[<c027bbd3>] register_pernet_device+0x24/0x51
[<c03ce7c1>] loopback_init+0x12/0x14
[<c03b9721>] kernel_init+0x80/0x227
[<c0103c13>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff

-> #0 (rtnl_...
cls->mutex
rtnl_lock
cls->mutex
This bug has always been there, and is now exposed by the conversion
of cls->mutex from a semaphore to a mutex, because lockdep doesn't
check semaphores.

I don't know how to get this fixed, sorry. I'll just push
struct-class-sem-to-mutex-converting.patch at Greg until it sticks,
then it will go into mainline, then we'll get a shower of bug reports,
including this one, then someone someday will do something about it.

Fun.
--
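The report above boils down to three "acquired B while holding A" observations that together form a cycle: cls->mutex then skt_mutex, rtnl_mutex then cls->mutex, and finally skt_mutex then rtnl_mutex. As a rough illustration of the idea (not lockdep's actual implementation), the sketch below records such edges in a small graph and flags the edge that closes a loop. All names here (`lock_graph`, `graph_add`, the enum labels) are hypothetical.

```c
#include <string.h>

#define MAX_LOCKS 8

/* dep[a][b] != 0 means "b was acquired while a was held" (a -> b). */
struct lock_graph {
    int dep[MAX_LOCKS][MAX_LOCKS];
};

void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static int reachable(const struct lock_graph *g, int from, int to, int *seen)
{
    if (from == to)
        return 1;
    seen[from] = 1;
    for (int next = 0; next < MAX_LOCKS; next++)
        if (g->dep[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return 1;
    return 0;
}

/*
 * Record "acquired `taken` while holding `held`". Returns 1 if the new
 * edge closes a cycle (a possible deadlock), 0 otherwise.
 */
int graph_add(struct lock_graph *g, int held, int taken)
{
    int seen[MAX_LOCKS] = {0};
    int cycle = reachable(g, taken, held, seen);

    g->dep[held][taken] = 1;
    return cycle;
}

/* Labels for the three locks named in the report. */
enum { RTNL, CLS_MUTEX, SKT_MUTEX };
```

Feeding the three edges in the order of the report's chain (#2, #1, #0) makes only the last call return 1. A semaphore-based cls->mutex would never have contributed its edges, which is consistent with the explanation that the conversion to a mutex exposed the problem.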
Hi Andrew,
The 2.6.26-rc2-mm1 kernel gets stuck while booting on an x86_64 machine
with CONFIG_FTRACE_STARTUP_TEST enabled. The following FTRACE-related
.config options are enabled:

CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
CONFIG_FTRACE=y
CONFIG_HAVE_FTRACE=y
CONFIG_DYNAMIC_FTRACE=y

BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009dc00 (usable)
BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000d7fcca00 (usable)
BIOS-e820: 00000000d7fcca00 - 00000000d7fd0000 (ACPI data)
BIOS-e820: 00000000d7fd0000 - 00000000d8000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 00000001e8000000 (usable)
max_pfn_mapped = 1998848
init_memory_mapping
DMI 2.3 present.
ACPI: RSDP 000FDFB0, 0024 (r2 IBM )
ACPI: XSDT D7FCFF00, 0044 (r1 IBM SERONYXP 1001 IBM 45444F43)
ACPI: FACP D7FCFE40, 0084 (r2 IBM SERONYXP 1001 IBM 45444F43)
ACPI: DSDT D7FCCA00, 2AA0 (r2 IBM SERTURQU 1000 INTL 20041203)
ACPI: FACS D7FCFD00, 0040
ACPI: APIC D7FCFD80, 00B4 (r1 IBM SERONYXP 1001 IBM 45444F43)
ACPI: MCFG D7FCFD40, 003C (r1 IBM SERONYXP 1001 IBM 45444F43)
ACPI: SSDT D7FCFA40, 02BD (r2 IBM YETA0 1000 INTL 20041203)
No NUMA configuration found
Faking a node at 0000000000000000-00000001e8000000
Bootmem setup node 0 0000000000000000-00000001e8000000
NODE_DATA [0000000000011000 - 0000000000016fff]
bootmap [0000000000017000 - 0000000000053fff] pages 3d
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] TRAMPOLINE
early res: 2 [200000-b4e40b] TEXT DATA BSS
early res: 3 [37e81000-37fefaa0] RAMDISK
early res: 4 [9dc00-fffff] BIOS reserved
early res: 5 [8000-10fff] PGTABLE
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 1998848
Movable zone start PFN for each node
e...
Hi, could you do nmi_watchdog=1 and see if that gives you a stack dump?
Thanks.
-- Steve
--
Hi Steven,
Passing nmi_watchdog=1 did not help in getting any extra information, over the previous
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--
Thanks for trying.
Can you send your config privately to my goodmis account?
Thanks,
-- Steve
--
Seen in a 'make silentoldconfig':
---
LED Default ON Trigger (LEDS_TRIGGER_DEFAULT_ON) [N/m/y/?] (NEW) ?

This allows LEDs to be initialised in the ON state.
If unsure, say Y.
---

The default is N, but if unsure, say Y. Some digging shows that it's because
there's a "depends on LEDS_TRIGGERS" that I had set to N. I wonder if the
various 'config LEDS_TRIGGER_FOO' in drivers/leds/Kconfig should all be
wrapped in one 'if LEDS_TRIGGERS'? Kind of like this totally untested patch:

If I'm actually right here, here's a:
Signed-Off-By: Valdis Kletnieks <valdis.kletnieks@vt.edu>
--- linux-2.6.26-rc2-mm1/drivers/leds/Kconfig.before 2008-05-17 06:22:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/drivers/leds/Kconfig 2008-05-17 06:22:55.000000000 -0400
@@ -164,9 +164,9 @@ config LEDS_TRIGGERS
These triggers allow kernel events to drive the LEDs and can
be configured via sysfs. If unsure, say Y.

+if LEDS_TRIGGERS
config LEDS_TRIGGER_TIMER
tristate "LED Timer Trigger"
- depends on LEDS_TRIGGERS
help
This allows LEDs to be controlled by a programmable timer
via sysfs. Some LED hardware can be programmed to start
@@ -177,14 +177,13 @@ config LEDS_TRIGGER_TIMER

config LEDS_TRIGGER_IDE_DISK
bool "LED IDE Disk Trigger"
- depends on LEDS_TRIGGERS && BLK_DEV_IDEDISK
+ depends on BLK_DEV_IDEDISK
help
This allows LEDs to be controlled by IDE disk activity.
If unsure, say Y.

config LEDS_TRIGGER_HEARTBEAT
tristate "LED Heartbeat Trigger"
- depends on LEDS_TRIGGERS
help
This allows LEDs to be controlled by a CPU load average.
The flash frequency is a hyperbolic function of the 1-minute
@@ -193,9 +192,9 @@ config LEDS_TRIGGER_HEARTBEAT

config LEDS_TRIGGER_DEFAULT_ON
tristate "LED Default ON Trigger"
- depends on LEDS_TRIGGERS
help
This allows LEDs to be initialised in the ON state.
If unsure, say Y.

+endif # LEDS_TRIGGERS
endif # NEW_LEDS
> linux-next.patch
That's terse. ;-)
Who is responsible for something called "Option High Speed Mobile
Devices"?

It's using the create_proc_read_entry() interface, so should be switched
to seq_files before merging.

And the "procfs" module parameter is plain stupid, sorry.
--
On Sat, 17 May 2008 02:17:49 +0400
well, it's a git tree, and all that this implies. The git URL is
The full changelog is contained in linux-next.patch. Searching it for
"Option" quickly leads to

commit a50a26ba350a5f32ec6481c85b938fc7fb476671
Author: Greg Kroah-Hartman <gregkh@suse.de>
Date: Mon Apr 14 11:41:16 2008 -0700

USB: add option hso driver
This driver is for a number of different Option devices. Originally
written by Option and Andrew Bird, but cleaned up massively for
acceptance into mainline by me (Greg).

TODO:
- remove proc files and move to debugfs
- review network interfaces
- add better changelog information
- Use netif_msg_ for the message level rather than module parameter
- net_device_stats are now available in dev->stats

Many thanks to the following for their help in cleaning up the driver by
providing feedback and patches to it:
- Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
- Oliver Neukum <oliver@neukum.org>
- Alan Cox <alan@lxorguk.ukuu.org.uk>

Cc: Andrew Bird <ajb@spheresystems.co.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Filip Aben <f.aben@option.com>
Cc: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Cc: Oliver Neukum <oliver@neukum.org>

stupid people cc'ed ;)
--
That parameter is gone, see the patches posted to lkml for an updated
version.

thanks,
greg "i'm stupid" k-h
--
Hello,
To get this I simply modprobe wusbcore. modprobe itself ends with
SIGSEGV. This comes from x86_32.

UWB: workarounds enabled for bugs:445 514 543 548 010612024004
BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [<c01e0e4c>] scatterwalk_start+0xc/0x1f
*pde = 00000000
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:05.0/resource
Modules linked in: cbc wusbcore(+) uwb radeon drm orinoco_cs orinoco hermes parport_pc parport floppy pcmcia firmware_class rtc psmouse pcspkr 8139too ide_cd_mod cdrom ehci_hcd uhci_hcd usbcore sony_laptop backlight snd_ali5451 snd_ac97_codec ac97_bus snd_pcm snd_timer snd snd_page_alloc yenta_socket rsrc_nonstatic ati_agp agpgart

Pid: 5423, comm: modprobe Not tainted (2.6.26-rc2-mm1 #1)
EIP: 0060:[<c01e0e4c>] EFLAGS: 00010296 CPU: 0
EIP is at scatterwalk_start+0xc/0x1f
EAX: da471c78 EBX: da471c78 ECX: da471c78 EDX: 00000000
ESI: da471dbb EDI: da4a5010 EBP: da471ba8 ESP: da471ba8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process modprobe (pid: 5423, ti=da471000 task=dceb0000 task.ti=da471000)
Stack: da471bb4 c01e0ea9 00000000 da471bd4 c01e0f96 00000010 da471c78 da4a5010
00000010 fffffffc 00000000 da471c04 c01e21ca 00000000 da471c68 da471dc8
00000003 00000010 da471c84 da471c78 00000030 da4a5010 da4b8320 da471c34
Call Trace:
[<c01e0ea9>] ? scatterwalk_pagedone+0x4a/0x84
[<c01e0f96>] ? scatterwalk_copychunks+0x2f/0xbb
[<c01e21ca>] ? blkcipher_walk_next+0x311/0x38b
[<c01e1cdf>] ? blkcipher_walk_done+0xb2/0x28c
[<de86e308>] ? crypto_cbc_encrypt+0xc6/0x13b [cbc]
[<c01e3ac6>] ? aes_encrypt+0x0/0x114d
[<c02d1ba8>] ? _spin_unlock_irqrestore+0x3e/0x5f
[<c01f8e34>] ? sg_init_one+0xb/0x66
[<dedea2ba>] ? wusb_prf+0x2b0/0x3e2 [wusbcore]
[<c013ec7e>] ? trace_hardirqs_on+0xb/0xd
[<dedea445>] ? wusb_crypto_init+0x59/0x274 [wusbcore]
[<c02d1ba8>] ? _spin_unlock_irqrestore+0...
--
This was fixed by David Vrabel recently, s/g arrays weren't
properly initialized (I am to blame for that).

David?
--
Parenthesis fix in include/asm-mips/mach-au1x00/au1000.h
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
diff -upr linux-2.6.26-rc2-mm1-a/include/asm-mips/mach-au1x00/au1000.h linux-2.6.26-rc2-mm1-b/include/asm-mips/mach-au1x00/au1000.h
--- linux-2.6.26-rc2-mm1-a/include/asm-mips/mach-au1x00/au1000.h 2008-05-15 19:44:48.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-mips/mach-au1x00/au1000.h 2008-05-15 19:52:38.000000000 +0200
@@ -1036,7 +1036,7 @@ enum soc_au1200_ints {
#define USBD_INTSTAT 0xB020001C
# define USBDEV_INT_SOF (1 << 12)
# define USBDEV_INT_HF_BIT 6
-# define USBDEV_INT_HF_MASK 0x3f << USBDEV_INT_HF_BIT)
+# define USBDEV_INT_HF_MASK (0x3f << USBDEV_INT_HF_BIT)
# define USBDEV_INT_CMPLT_BIT 0
# define USBDEV_INT_CMPLT_MASK (0x3f << USBDEV_INT_CMPLT_BIT)
#define USBD_CONFIG 0xB0200020
--
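A note on why a definition like the old USBDEV_INT_HF_MASK can sit in a header unnoticed: the preprocessor does not check that parentheses balance in a `#define`, so the stray `)` only becomes a compile error where the macro is actually expanded. The sketch below shows the corrected mask in a typical field-extraction use. The constant values are copied from the hunk above; the helper `usbdev_int_hf` is hypothetical and not part of au1000.h.

```c
/* Values copied from the au1000.h hunk above. */
#define USBDEV_INT_SOF      (1 << 12)
#define USBDEV_INT_HF_BIT   6
#define USBDEV_INT_HF_MASK  (0x3f << USBDEV_INT_HF_BIT)

/*
 * Hypothetical helper: extract the 6-bit HF field from an interrupt
 * status word. The outer parentheses in the mask keep the shift grouped
 * regardless of the operators surrounding the macro at the call site.
 */
unsigned int usbdev_int_hf(unsigned int intstat)
{
    return (intstat & USBDEV_INT_HF_MASK) >> USBDEV_INT_HF_BIT;
}
```

For example, a status word with USBDEV_INT_SOF set and bit 6 set yields an HF field value of 1.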
Parenthesis fix in include/asm-mips/gic.h
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
diff -upr linux-2.6.26-rc2-mm1-a/include/asm-mips/gic.h linux-2.6.26-rc2-mm1-b/include/asm-mips/gic.h
--- linux-2.6.26-rc2-mm1-a/include/asm-mips/gic.h 2008-05-15 19:44:48.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-mips/gic.h 2008-05-15 19:52:20.000000000 +0200
@@ -330,7 +330,7 @@

#define GIC_SH_RMASK_OFS 0x0300
#define GIC_CLR_INTR_MASK(intr, val) \
- GICWRITE(GIC_REG_ADDR(SHARED, GIC_SH_RMASK_OFS + 4 + (((((intr) / 32) ^ 1) - 1) * 4)), ((val) << ((intr) % 32))
+ GICWRITE(GIC_REG_ADDR(SHARED, GIC_SH_RMASK_OFS + 4 + (((((intr) / 32) ^ 1) - 1) * 4)), ((val) << ((intr) % 32)))

/* Register Map for Local Section */
#define GIC_VPE_CTL_OFS 0x0000
--
Parenthesis fix in include/asm-arm/arch-omap/control.h
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
diff -upr linux-2.6.26-rc2-mm1-a/include/asm-arm/arch-omap/control.h linux-2.6.26-rc2-mm1-b/include/asm-arm/arch-omap/control.h
--- linux-2.6.26-rc2-mm1-a/include/asm-arm/arch-omap/control.h 2008-05-15 19:44:38.000000000 +0200
+++ linux-2.6.26-rc2-mm1-b/include/asm-arm/arch-omap/control.h 2008-05-15 19:51:30.000000000 +0200
@@ -80,7 +80,7 @@
#define OMAP24XX_CONTROL_SEC_TAP (OMAP2_CONTROL_GENERAL + 0x0064)
#define OMAP24XX_CONTROL_OCM_PUB_RAM_ADD (OMAP2_CONTROL_GENERAL + 0x006c)
#define OMAP24XX_CONTROL_EXT_SEC_RAM_START_ADD (OMAP2_CONTROL_GENERAL + 0x0070)
-#define OMAP24XX_CONTROL_EXT_SEC_RAM_STOP_ADD (OMAP2_CONTROL_GENERAL + 0x0074
+#define OMAP24XX_CONTROL_EXT_SEC_RAM_STOP_ADD (OMAP2_CONTROL_GENERAL + 0x0074)
#define OMAP24XX_CONTROL_SEC_STATUS (OMAP2_CONTROL_GENERAL + 0x0080)
#define OMAP24XX_CONTROL_SEC_ERR_STATUS (OMAP2_CONTROL_GENERAL + 0x0084)
#define OMAP24XX_CONTROL_STATUS (OMAP2_CONTROL_GENERAL + 0x0088)
--
My HP nx6325 doesn't resume from suspend. It looks like the graphics doesn't
come up, so probably s2ram is busted.

I'll try to bisect on the weekend, if I have the time (not sure).
Thanks,
Rafael
--
net/built-in.o: In function `init_p9':
mod.c:(.init.text+0x4b0d): undefined reference to `p9_trans_fd_init'
make[1]: *** [.tmp_vmlinux1] Error 1

CONFIG_NET_9P=y
CONFIG_NET_9P_FD=m
CONFIG_NET_9P_VIRTIO=m
CONFIG_NET_9P_DEBUG=y

# CONFIG_9P_FS is not set
---
~Randy
--
This is probably a side effect of the merge issue v9fs-devel and -mm
had last week. It should no longer be possible with the code that's
been in my v9fs-devel tree on kernel.org for the past 5 days
(CONFIG_NET_9P_FD no longer exists).

-eric
--
On Wed, 14 May 2008 19:00:12 -0500
But I'm still reverting the v9fs tree due to
git-v9fs is causing i386 allmodconfig failures:
net/9p/trans_fd.o: In function `init_module':
trans_fd.c:(.init.text+0x0): multiple definition of `init_module'
net/9p/mod.o:mod.c:(.init.text+0x0): first defined here
/opt/crosstool/gcc-4.1.0-glibc-2.3.6/i686-unknown-linux-gnu/bin/i686-unknown-linux-gnu-ld: Warning: size of symbol `init_module' changed from 27 in net/9p/mod.o to 128 in net/9p/trans_fd.o
--
On Wed, May 14, 2008 at 7:05 PM, Andrew Morton
Okay, clearly I'm doing something wrong. I've tried the allmodconfig
on my local sandbox and it's fine. When I look to see if there is
still a module_init in net/9p/trans_fd on kernel.org via gitweb, I
can't find it. (http://git.kernel.org/?p=linux/kernel/git/ericvh/v9fs.git;a=blob;f=net/9...)

Are you pulling from my v9fs-devel tree or is -mm switched over to
pull from linux-next or something?

-eric
--
It has mysteriously gone away. Perhaps it was triggered by some other
The algorithm to determine this is to look at the first line of -mm's
git-v9fs.patch:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.26-rc2...
has:

GIT 38bfbd9f766f0b33de6bc16fd9ad1018b8fd3fe2 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git#v9fs-devel
Yes, -mm uses both linux-next and git-v9fs (aka #v9fs-devel)
linux-next uses #for-next and afaict that was empty as of a few hours
ago. Nothing for 2.6.27?
--
On Wed, May 14, 2008 at 10:04 PM, Andrew Morton
Oh, there's stuff for 2.6.27, I'm still working on stabilizing it,
but I put it on hold while I tried to clear out my bugzilla backlog.
Trying to stick to a policy of increased testing and removing bugs
before potentially introducing new ones. It helps that there have
been several additional groups starting to use 9p and hitting corner
cases my testing didn't cover before.

-eric
--
SCSI_DH has some problems when CONFIG_SCSI=n:
drivers/built-in.o: In function `activate_path':
dm-mpath.c:(.text+0x18a292): undefined reference to `scsi_dh_activate'
drivers/built-in.o: In function `multipath_ctr':
dm-mpath.c:(.text+0x18a6f0): undefined reference to `scsi_dh_handler_exist'
make[1]: *** [.tmp_vmlinux1] Error 1

#
# SCSI device support
#
CONFIG_RAID_ATTRS=y
# CONFIG_SCSI is not set
# CONFIG_SCSI_DMA is not set
# CONFIG_SCSI_NETLINK is not set
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y

---
~Randy
--
This is one more of those annoying selects. The SCSI_DH Kconfig file is
correctly dependent on SCSI:

menuconfig SCSI_DH
tristate "SCSI Device Handlers"
depends on SCSI
default n
help

but we've also got a select in md/Kconfig:
config DM_MULTIPATH
tristate "Multipath target"
depends on BLK_DEV_DM
select SCSI_DH

Which ignores the dependency.
My best guess for fixing this is either to make the select a depends or
just drop it altogether (after all, it's possible to have multipath on
non-SCSI devices).

James
--
Hi James, Andrew,
Here is a patch to remove the automatic "select" of scsi_dh for
dm-multipath.

Sorry about the mishap.
chandra
You obviously wanted `static inline' there, but it still fails i386
allmodconfig compilation.
--
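For readers unfamiliar with the `static inline' remark: when a subsystem is configured out, its header usually supplies empty stand-ins for its functions, and those must be `static inline` so that each includer gets a private copy instead of a second global definition. The following is a minimal sketch of that pattern under stated assumptions, not the contents of scsi_dh.h. `HAVE_FEATURE` stands in for a config symbol such as CONFIG_SCSI_DH, and the function names and return values are hypothetical.

```c
/* HAVE_FEATURE is a hypothetical stand-in for a config symbol. */
#ifdef HAVE_FEATURE
extern int feature_activate(int id);
extern int feature_handler_exist(const char *name);
#else
/*
 * Stubs for the "feature not built" case. `static inline` gives every
 * file that includes this header its own copy, so several .c files can
 * include it without "multiple definition" link errors, and unused
 * copies do not produce warnings.
 */
static inline int feature_activate(int id)
{
    (void)id;
    return 0;   /* nothing to activate */
}

static inline int feature_handler_exist(const char *name)
{
    (void)name;
    return 0;   /* no handlers are available */
}
#endif
```

A plain non-static definition in the `#else` branch would compile in a single file but fail at link time as soon as two objects include the header.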
Yikes.... Sorry again... Hopefully this attached patch works properly.
chandra
-------------------------
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist, just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Hannes Reinecke <hare@suse.de>
---

Index: scsi-misc-2.6/drivers/md/Kconfig
===================================================================
--- scsi-misc-2.6.orig/drivers/md/Kconfig
+++ scsi-misc-2.6/drivers/md/Kconfig
@@ -252,7 +252,6 @@ config DM_ZERO
config DM_MULTIPATH
tristate "Multipath target"
depends on BLK_DEV_DM
- select SCSI_DH
---help---
Allow volume managers to support multipath hardware.

Index: scsi-misc-2.6/drivers/md/dm-mpath.c
===================================================================
--- scsi-misc-2.6.orig/drivers/md/dm-mpath.c
+++ scsi-misc-2.6/drivers/md/dm-mpath.c
@@ -664,6 +664,8 @@ static int parse_hw_handler(struct arg_s
request_module("scsi_dh_%s", m->hw_handler_name);
if (scsi_dh_handler_exist(m->hw_handler_name) == 0) {
ti->error = "unknown hardware handler type";
+ kfree(m->hw_handler_name);
+ m->hw_handler_name = NULL;
return -EINVAL;
}
consume(as, hw_argc - 1);
Index: scsi-misc-2.6/include/scsi/scsi_dh.h
===================================================================
--- scsi-misc-2.6.orig/include/scsi/scsi_dh.h
+++ scsi-misc-2.6/include/scsi/scsi_dh.h
@@ -54,6 +54,16 @@ enum {
SCSI_DH_NOSYS,
SCSI_DH_DRIVER_MAX,
};
-
+#if defined(CONFIG_SCSI_DH) || defined(CONFIG_SCSI_DH_MODULE)
extern int scsi_dh_activate(struct request_queue *);
extern int scsi_dh_han...
Did it build cleanly for you?
Hint:
--
~Randy
--
Oh, my... it is getting very tricky.
Here is a patch that compiles clean in different combinations. But, I
agree that the "depends" (under DM_MULTIPATH) sure looks weird.

-----------
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist, just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also. Make sure it doesn't allow DM_MULTIPATH
to be compiled in when SCSI_DH is a module.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Mike Anderson <andmike@us.ibm.com>
Cc: Hannes Reinecke <hare@suse.de>
---

Index: scsi-misc-2.6/drivers/md/Kconfig
===================================================================
--- scsi-misc-2.6.orig/drivers/md/Kconfig
+++ scsi-misc-2.6/drivers/md/Kconfig
@@ -252,7 +252,7 @@ config DM_ZERO
config DM_MULTIPATH
tristate "Multipath target"
depends on BLK_DEV_DM
- select SCSI_DH
+ depends on SCSI_DH || !SCSI_DH
---help---
Allow volume managers to support multipath hardware.

Index: scsi-misc-2.6/drivers/md/dm-mpath.c
===================================================================
--- scsi-misc-2.6.orig/drivers/md/dm-mpath.c
+++ scsi-misc-2.6/drivers/md/dm-mpath.c
@@ -664,6 +664,8 @@ static int parse_hw_handler(struct arg_s
request_module("scsi_dh_%s", m->hw_handler_name);
if (scsi_dh_handler_exist(m->hw_handler_name) == 0) {
ti->error = "unknown hardware handler type";
+ kfree(m->hw_handler_name);
+ m->hw_handler_name = NULL;
return -EINVAL;
}
consume(as, hw_argc - 1);
Index: scsi-misc-2.6/include/scsi/scsi_dh.h
===================================================================
--- scsi-misc-2.6.orig/include/scsi/scsi_dh.h
+++ scsi-...
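The odd-looking `depends on SCSI_DH || !SCSI_DH` in the Kconfig hunk above makes sense once Kconfig's tristate arithmetic is spelled out: symbols take the values n < m < y, `!a` evaluates to 2 - a, and `a || b` to the maximum of the two. The sketch below works through that arithmetic in C. The type and function names are hypothetical, and this is only an illustration of the expression semantics, not code from the kernel's Kconfig parser.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig expression semantics: !a is 2 - a. */
static enum tristate tri_not(enum tristate a)
{
    return (enum tristate)(2 - a);
}

/* Kconfig expression semantics: a || b is max(a, b). */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
    return a > b ? a : b;
}

/* Upper limit placed on a symbol by "depends on DEP || !DEP". */
enum tristate limit_dep_or_not_dep(enum tristate dep)
{
    return tri_or(dep, tri_not(dep));
}
```

With SCSI_DH set to n or y the limit is y, so DM_MULTIPATH is unrestricted; with SCSI_DH=m the limit is m, which is exactly the "do not build DM_MULTIPATH in when SCSI_DH is a module" rule the changelog describes.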
No good on my first attempt. Here is what I ran into:
The printk timestamps have gone wild. I cannot paste a dmesg but here
is one line I wrote down:
[17180644.495790] Testing tracer ftrace: NMI watchdog ...

Which leads into the next problem: The kernel freezes after Testing
tracer ftrace. Then I rebooted with my special testing command line
"kernel /bzImage-2.6.26-rc2-mm1 root=/dev/sda2 rootfstype=reiser4
rootflags=defaults,noatime i8042.nomux elevator=cfq resume=/dev/sda3
panic=5 nmi_watchdog=2,panic debug idle=poll nohz=off"

and I got the same freeze but then the NMI watchdog message. Which is
the third problem.

Why did the NMI watchdog not panic and reboot the system? It detected
the lock and printed the message. It should have then panicked, waited
5 seconds, and rebooted.

System is a 64-bit Gentoo AMD-64 Compaq R3000 laptop. Compiler is GCC
4.3.

Config follows:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.26-rc2-mm1
# Wed May 14 09:59:19 2008
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_DEFCONFIG_LIST="arch/x86/configs/x86_64_defconfig"
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARC...
On Wed, 14 May 2008 14:49:07 -0600
I've seen reports like this against mainline, but I'm not sure that
Thanks.
--
I've reported problems with -next and ftrace. The timestamps look very similar
to what I've seen as well. I don't have those kernels available anymore - I
decided to wipe my system and move to a distro where it's easier to test new
kernels.

However, it stands to reason that it isn't ftrace actually causing the
That is quite similar... I'm on a Core2Duo (Dell Inspiron 1420 laptop) and was
seeing the problems with GCC4.3 and a pure 64bit userland.

--
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
--
I disabled a bunch of trace and self test options, and I am now running
2.6.26-rc2-mm1. So far, so good.

I am including some dmesg with the weird timestamps in case it is useful
to anyone.

[ 0.000000] Linux version 2.6.26-rc2-mm1 (lynx@zephyr) (gcc version 4.3.0 (Gentoo 4.3.0 p1.0) ) #13 SMP Wed May 14 15:16:26 MDT 2008
[ 0.000000] Command line: root=/dev/sda2 rootfstype=reiser4 rootflags=defaults,noatime i8042.nomux elevator=cfq resume=/dev/sda3 panic=5 debug
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000d0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003ff70000 (usable)
[ 0.000000] BIOS-e820: 000000003ff70000 - 000000003ff7f000 (ACPI data)
[ 0.000000] BIOS-e820: 000000003ff7f000 - 000000003ff80000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[ 0.000000] max_pfn_mapped = 1048576
[ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[ 0.000000] init_memory_mapping
[ 0.000000] DMI present.
[ 0.000000] ACPI: RSDP 000F7240, 0014 (r0 PTLTD )
[ 0.000000] ACPI: RSDT 3FF7A87E, 0034 (r1 PTLTD RSDT 6040000 LTP 0)
[ 0.000000] ACPI: FACP 3FF7EE13, 0074 (r1 NVIDIA CK8 6040000 PTL_ F4240)
[ 0.000000] ACPI: DSDT 3FF7A8B2, 4561 (r1 NVIDIA CK8 6040000 MSFT 100000E)
[ 0.000000] ACPI: FACS 3FF7FFC0, 0040
[ 0.000000] ACPI: APIC 3FF7EE87, 005A (r1 NVIDIA NV_APIC_ 6040000 LTP 0)
[ 0.000000] ACPI: BOOT 3FF7EEE1, 0028 (r1 PTLTD $SBFTBL$ 6040000 LTP 1)
[ 0.000000] ACPI: SSDT 3FF7EF09, 00F7 (r1 PTLTD POWERNOW 6040000 LTP 1)
[ 0.000000] ACPI: DMI detected: Hewlett-Packard
[ 0.000000] early r...
With CONFIG_*FD=n:
# CONFIG_SIGNALFD is not set
# CONFIG_TIMERFD is not set
# CONFIG_EVENTFD is not set

the build fails with:
arch/x86/kernel/built-in.o: In function `sys_call_table':
(.rodata+0x89c): undefined reference to `sys_signalfd4'
arch/x86/kernel/built-in.o: In function `sys_call_table':
(.rodata+0x8a0): undefined reference to `sys_eventfd2'
make[1]: *** [.tmp_vmlinux1] Error 1

---
~Randy
--
Using WARN() with CONFIG_BUG=n causes:
linux-2.6.26-rc2-mm1/lib/kobject.c: In function 'kobject_add_internal':
linux-2.6.26-rc2-mm1/lib/kobject.c:218: error: implicit declaration of function 'WARN'
make[2]: *** [lib/kobject.o] Error 1

---
~Randy
--
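The WARN() failure above is an instance of a common pattern: a macro is defined only in the "feature enabled" branch of a header, so callers break when the feature is configured out. A rough userspace sketch of the usual remedy follows, assuming a hypothetical `MY_WARN` macro and a hypothetical `ENABLE_WARNINGS` switch standing in for CONFIG_BUG. It is not the kernel's definition of WARN(); it only shows both branches providing a macro that evaluates to the truth value of the condition.

```c
#include <stdio.h>

#ifdef ENABLE_WARNINGS  /* hypothetical stand-in for CONFIG_BUG */
/* Print the message when the condition holds; evaluate to 0 or 1. */
#define MY_WARN(cond, ...) \
    (!!(cond) && (fprintf(stderr, __VA_ARGS__), 1))
#else
/* Silent fallback: callers still compile and still get the condition. */
#define MY_WARN(cond, ...) (!!(cond))
#endif

/* Hypothetical caller that branches on the macro's result. */
int checked_div(int a, int b, int *out)
{
    if (MY_WARN(b == 0, "division by zero requested\n"))
        return -1;
    *out = a / b;
    return 0;
}
```

Because the fallback keeps the same call shape and result, code such as `if (MY_WARN(...)) return -1;` behaves identically with or without the switch, apart from the message.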
mkfs.ext2 became kick-ass slow:
+ sudo mkfs.ext2 -F
mke2fs 1.40.6 (09-Feb-2008)
Warning: 256-byte inodes not usable on older systems
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
9773056 inodes, 39072726 blocks
1953636 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1193 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
...

Writing inode tables: 193/1193
^^^^
counter moves slowly,
occasional bursts of counting at what seems to be normal
speed occur.

160 GB SATA disk, no partitions.
According to sysfs, CFQ is in use, the rest is compiled out.
2.6.26-rc2 is fine, mkfs takes ~1 min.

Slowdown is totally reproducible.
CONFIG_ATA=y
CONFIG_ATA_ACPI=y
CONFIG_SATA_AHCI=y
CONFIG_ATA_SFF=y
CONFIG_ATA_PIIX=y
CONFIG_PATA_JMICRON=y

/sys/block/sdb/queue/iosched/back_seek_max
16384
/sys/block/sdb/queue/iosched/back_seek_penalty
2
/sys/block/sdb/queue/iosched/fifo_expire_async
250
/sys/block/sdb/queue/iosched/fifo_expire_sync
120
/sys/block/sdb/queue/iosched/quantum
4
/sys/block/sdb/queue/iosched/slice_async
40
/sys/block/sdb/queue/iosched/slice_async_rq
2
/sys/block/sdb/queue/iosched/slice_idle
10
/sys/block/sdb/queue/iosched/slice_sync
100
--
Here is where it spends time (seems to be always the same):
mkfs.ext2 D 0000000000000000 0 4760 4759
ffff81017ce93a58 0000000000000046 0000000000000000 0000000000000282
ffff81017e14d640 ffffffff8056f4c0 ffff81017e14d880 ffffffff804679a2
00000000ffffb5c4 000000007ce93a68 0000000000000003 ffffffff8023d504
Call Trace:
[<ffffffff804679a2>] ? _spin_unlock_irqrestore+0x42/0x80
[<ffffffff8023d504>] ? __mod_timer+0xc4/0x110
[<ffffffff80465012>] schedule_timeout+0x62/0xe0
[<ffffffff8023cee0>] ? process_timeout+0x0/0x10
[<ffffffff80464ef8>] io_schedule_timeout+0x28/0x40
[<ffffffff8027663a>] congestion_wait+0x8a/0xb0
[<ffffffff80248720>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8026fe31>] balance_dirty_pages_ratelimited_nr+0x1a1/0x3f0
[<ffffffff8026915f>] generic_file_buffered_write+0x1ff/0x740
[<ffffffff80467870>] ? _spin_unlock+0x30/0x60
[<ffffffff802acafb>] ? mnt_drop_write+0x7b/0x160
[<ffffffff80269b30>] __generic_file_aio_write_nolock+0x2a0/0x460
[<ffffffff802548ed>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff80269df7>] generic_file_aio_write_nolock+0x37/0xa0
[<ffffffff80292be1>] do_sync_write+0xf1/0x130
[<ffffffff80256485>] ? trace_hardirqs_on_caller+0xd5/0x160
[<ffffffff80248720>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80256485>] ? trace_hardirqs_on_caller+0xd5/0x160
[<ffffffff8025651d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff8029339a>] vfs_write+0xaa/0xe0
[<ffffffff80293940>] sys_write+0x50/0x90
[<ffffffff8020b69b>] system_call_after_swapgs+0x7b/0x80
--
And not only mkfs, ld took ages to link vmlinux.o:
ld D 0000000000000000 0 17340 17339
ffff8100681819c8 0000000000000082 0000000000000000 ffff81006818198c
ffffffff806c90c0 ffff81006b50d2e0 ffffffff80636360 ffff81006b50d558
0000000068181978 0000000100a7523e ffff81006b50d558 0000000100a75269
Call Trace:
[<ffffffff805056b2>] schedule_timeout+0x62/0xd0
[<ffffffff802403b0>] ? process_timeout+0x0/0x10
[<ffffffff805056ad>] ? schedule_timeout+0x5d/0xd0
[<ffffffff80504956>] io_schedule_timeout+0x76/0xd0
[<ffffffff80282cac>] congestion_wait+0x6c/0x90
[<ffffffff8024c2c0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8027c82f>] balance_dirty_pages_ratelimited_nr+0x13f/0x330
[<ffffffff80275a3d>] generic_file_buffered_write+0x1dd/0x6d0
[<ffffffff8027d0e7>] ? __do_page_cache_readahead+0x167/0x220
[<ffffffff802763ae>] __generic_file_aio_write_nolock+0x25e/0x450
[<ffffffff80276c75>] ? generic_file_aio_read+0x565/0x640
[<ffffffff80276607>] generic_file_aio_write+0x67/0xd0
[<ffffffff802f8bd6>] ext3_file_write+0x26/0xc0
[<ffffffff8029ffa1>] do_sync_write+0xf1/0x140
[<ffffffff8024c2c0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80289703>] ? remove_vma+0x53/0x70
[<ffffffff80505a01>] ? mutex_lock+0x11/0x30
[<ffffffff802a0a2b>] vfs_write+0xcb/0x190
[<ffffffff802a0be0>] sys_write+0x50/0x90
[<ffffffff8020b82b>] system_call_after_swapgs+0x7b/0x80
--
On Wed, May 14, 2008 at 10:01 AM, Andrew Morton
Nice! This one works for me again.
But somehow the NUMAness of my system is gone.
2.6.26-rc2-mm1:
[ 0.000000] max_pfn_mapped = 1179648
[ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[ 0.000000] init_memory_mapping
[ 0.000000] DMI present.
[ 0.000000] ACPI: RSDP 000FB080, 0024 (r2 ACPIAM)
[ 0.000000] ACPI: XSDT DFFD0100, 0064 (r1 A_M_I_ OEMXSDT 4000713 MSFT 97)
[ 0.000000] ACPI: FACP DFFD0290, 00F4 (r3 A_M_I_ OEMFACP 4000713 MSFT 97)
[ 0.000000] ACPI: DSDT DFFD0450, 4FC5 (r1 S0027 S0027000 0 INTL 20051117)
[ 0.000000] ACPI: FACS DFFDE000, 0040
[ 0.000000] ACPI: APIC DFFD0390, 0080 (r1 A_M_I_ OEMAPIC 4000713 MSFT 97)
[ 0.000000] ACPI: MCFG DFFD0410, 003C (r1 A_M_I_ OEMMCFG 4000713 MSFT 97)
[ 0.000000] ACPI: OEMB DFFDE040, 0060 (r1 A_M_I_ AMI_OEM 4000713 MSFT 97)
[ 0.000000] ACPI: HPET DFFD5420, 0038 (r1 A_M_I_ OEMHPET0 4000713 MSFT 97)
[ 0.000000] ACPI: MCFG DFFD5460, 003C (r1 A_M_I_ OEMMCFG 4000713 MSFT 97)
[ 0.000000] ACPI: SRAT DFFD54A0, 0110 (r1 AMD HAMMER 1 AMD 1)
[ 0.000000] ACPI: SSDT DFFD55B0, 04F0 (r1 A_M_I_ POWERNOW 1 AMD 1)
[ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC 2 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 3 -> Node 1
[ 0.000000] SRAT: PXMs only cover 0MB of your 4608MB e820 RAM. Not used.
[ 0.000000] SRAT: SRAT not used.
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at 0000000000000000-0000000120000000
[ 0.000000] Bootmem setup node 0 0000000000000000-0000000120000000
[ 0.000000] NODE_DATA [0000000000001000 - 0000000000004fff]
[ 0.000000] bootmap [000000000000e000 - 0000000000031fff] pages 24
[ 0.000000] early res: 0 [0-fff] BIOS data page
[ 0.000000] early res: 1 [60...
On Wed, 14 May 2008 21:12:13 +0200
I suspect that this might be caused by the below.
That patch no longer seems to be necessary so I'll drop it. Perhaps
you could try reverting it, please?

From: Ingo Molnar <mingo@elte.hu>
x86.git testing found the following build error on latest -git:
drivers/acpi/numa.c: In function 'acpi_numa_init':
drivers/acpi/numa.c:226: error: 'NR_NODE_MEMBLKS' undeclared (first use in this function)
drivers/acpi/numa.c:226: error: (Each undeclared identifier is reported only once
drivers/acpi/numa.c:226: error: for each function it appears in.)

with this config:
http://redhat.com/~mingo/misc/config-Wed_Apr_30_22_42_42_CEST_2008.bad
i suspect we dont want SRAT parsing when CONFIG_HAVE_ARCH_PARSE_SRAT
is unset - but the fix looks a bit ugly. Perhaps we should define
NR_NODE_MEMBLKS even in this case and just let the code fall back
to some sane behavior?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

drivers/acpi/numa.c | 4 ++++
1 file changed, 4 insertions(+)

diff -puN drivers/acpi/numa.c~acpi-acpi_numa_init-build-fix drivers/acpi/numa.c
--- a/drivers/acpi/numa.c~acpi-acpi_numa_init-build-fix
+++ a/drivers/acpi/numa.c
@@ -176,6 +176,7 @@ acpi_parse_processor_affinity(struct acp
return 0;
}

+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
static int __init
acpi_parse_memory_affinity(struct acpi_subtable_header * header,
const unsigned long end)
@@ -193,6 +194,7 @@ acpi_parse_memory_affinity(struct acpi_s

return 0;
}
+#endif

static int __init acpi_parse_srat(struct acpi_table_header *table)
{
@@ -221,9 +223,11 @@ int __init acpi_numa_init(void)
if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
acpi_parse_processor_affinity, NR_CPUS);
+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
acpi_parse_memory...
On Wed, May 14, 2008 at 9:35 PM, Andrew Morton
Yes, reverting the patch below gets the system back to its normal state.
[ 0.000000] ACPI: SSDT DFFD55B0, 04F0 (r1 A_M_I_ POWERNOW 1 AMD 1)
[ 0.000000] SRAT: PXM 0 -> APIC 0 -> Node 0
[ 0.000000] SRAT: PXM 0 -> APIC 1 -> Node 0
[ 0.000000] SRAT: PXM 1 -> APIC 2 -> Node 1
[ 0.000000] SRAT: PXM 1 -> APIC 3 -> Node 1
[ 0.000000] SRAT: Node 0 PXM 0 0-a0000
[ 0.000000] SRAT: Node 0 PXM 0 100000-80000000
[ 0.000000] SRAT: Node 1 PXM 1 80000000-e0000000
[ 0.000000] SRAT: Node 1 PXM 1 100000000-120000000
[ 0.000000] NUMA: Allocated memnodemap from e000 - 10440
[ 0.000000] NUMA: Using 20 for the hash shift.
[ 0.000000] Bootmem setup node 0 0000000000000000-0000000080000000
[ 0.000000] NODE_DATA [0000000000001000 - 0000000000004fff]
[ 0.000000] bootmap [0000000000011000 - 0000000000020fff] pages 10
[ 0.000000] early res: 0 [0-fff] BIOS data page
[ 0.000000] early res: 1 [6000-7fff] TRAMPOLINE
[ 0.000000] early res: 2 [200000-9601db] TEXT DATA BSS
[ 0.000000] early res: 3 [37ec8000-37fefc27] RAMDISK
[ 0.000000] early res: 4 [9fc00-fffff] BIOS reserved
[ 0.000000] early res: 5 [8000-dfff] PGTABLE
[ 0.000000] early res: 6 [e000-1043f] MEMNODEMAP
[ 0.000000] Bootmem setup node 1 0000000080000000-0000000120000000
[ 0.000000] NODE_DATA [0000000080000000 - 0000000080003fff]
[ 0.000000] bootmap [0000000080004000 - 0000000080017fff] pages 14
[ 0.000000] [ffffe20000000000-ffffe20001bfffff] PMD -> [ffff81000c200000-ffff81000ddfffff] on node 0
[ 0.000000] [ffffe20001c00000-ffffe20003ffffff] PMD -> [ffff810080200000-ffff810081ffffff] on node 1
[ 0.000000] sizeof(struct page) = 56

Just for your information: I'm also using a 64-bit Gentoo system with
gcc 4.3.0-alpha20080410 and I'm also seeing these strange timestamps:

[ 0.000000] NR_CPUS: 4, nr_cpu_ids: 4
[42949372.960000] Built 2 zonelists in ...
Great, thanks for checking.
--
Hello,
Got this on sparc64 startup:
=============================================
[ INFO: possible recursive locking detected ]
2.6.26-rc2-mm1 #2
---------------------------------------------
modprobe/514 is trying to acquire lock:
(&cls->mutex){--..}, at: [<00000000005ff538>] device_add+0x3c0/0x5c0

but task is already holding lock:
(&cls->mutex){--..}, at: [<000000000060287c>] class_interface_register+0x44/0xe0

other info that might help us debug this:
1 lock held by modprobe/514:
#0: (&cls->mutex){--..}, at: [<000000000060287c>] class_interface_register+0x44/0xe0

stack backtrace:
Call Trace:
[000000000048cc64] __lock_acquire+0x104c/0x1400
[000000000048d098] lock_acquire+0x80/0xa0
[0000000000701898] mutex_lock_nested+0xc0/0x4e0
[00000000005ff538] device_add+0x3c0/0x5c0
[00000000005ff74c] device_register+0x14/0x20
[00000000005ff808] device_create+0xb0/0xe0
[0000000010012ef8] sg_add+0x160/0x380 [sg]
[00000000006028d4] class_interface_register+0x9c/0xe0
[0000000000617050] scsi_register_interface+0x18/0x40
[000000001001c0a4] init_sg+0xac/0x180 [sg]
[00000000004960c8] sys_init_module+0xb0/0x1c0
[00000000004463cc] sys32_init_module+0x14/0x20
[0000000000406294] linux_sparc_syscall32+0x3c/0x40
[0000000000013698] 0x136a0
Yeah, this is a bug which has always been there, afaik. A
semaphore was converted to a mutex. Semaphores don't have lockdep
checking, but mutexes do, so we just now got to find out about it.
Some finger-pointing is occurring over on the scsi list ;)

I assume the machine otherwise works OK?
--
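The report above follows from one rule: lockdep tracks lock classes rather than individual lock instances, and it warns when a task acquires a class it already holds. Below is a minimal userspace sketch of that rule, replaying the two acquisitions from the trace (class_interface_register() first, then device_add() reached through sg_add() and device_create()). The names `held_locks`, `toy_acquire`, `toy_release` and `replay_sg_probe` are hypothetical, and this is not how lockdep is implemented; it only illustrates the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical per-task stack of the lock classes currently held. */
struct held_locks {
	const char *class_name[MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns true when the class is already on the
 * held stack, which is the situation reported as "possible recursive
 * locking". Untracked primitives never reach a check like this.
 */
static bool toy_acquire(struct held_locks *h, const char *cls)
{
	bool already_held = false;

	for (size_t i = 0; i < h->depth; i++)
		if (strcmp(h->class_name[i], cls) == 0)
			already_held = true;

	assert(h->depth < MAX_HELD);
	h->class_name[h->depth++] = cls;
	return already_held;
}

/* Drop the most recently acquired class. */
static void toy_release(struct held_locks *h)
{
	assert(h->depth > 0);
	h->depth--;
}

/* Replay the two acquisitions from the trace; true means a warning fires. */
static bool replay_sg_probe(void)
{
	struct held_locks held = { .depth = 0 };
	bool warned;

	/* class_interface_register() takes the class mutex ... */
	warned = toy_acquire(&held, "&cls->mutex");
	/* ... and sg_add() -> device_create() -> device_add() takes it again. */
	warned = toy_acquire(&held, "&cls->mutex") || warned;

	toy_release(&held);
	toy_release(&held);
	return warned;
}
```

Because the check is per class, a warning of this kind does not by itself show whether the two acquisitions hit the same mutex instance; that has to be worked out from the code.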
Yes - seems it's running fine. I'm doing some other tests now so if anything pops out
you'll know it.

Mariusz
--
Hi Andrew,
The 2.6.26-rc2-mm1 kernel panics on powerpc while running the LTP tests on it.
I have attached the gdb output for the pc and lr addresses. The patch
list_for_each_rcu-must-die-networking.patch changes the same lines
that the gdb output points to.

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000481fa0
cpu 0x0: Vector: 300 (Data Access) at [c0000000eae37900]
pc: c000000000481fa0: .inet_create+0xb4/0x330
lr: c000000000413340: .__sock_create+0x190/0x280
sp: c0000000eae37b80
msr: 8000000000009032
dar: 0
dsisr: 40010000
current = 0xc0000000cd201500
paca = 0xc0000000007c3480
pid = 6462, comm = socket01
enter ? for help
[c0000000eae37c30] c000000000413340 .__sock_create+0x190/0x280
[c0000000eae37cf0] c0000000004137e0 .sys_socket+0x40/0x98
[c0000000eae37d90] c000000000438e18 .compat_sys_socketcall+0xc0/0x234
[c0000000eae37e30] c0000000000086b4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff20484
SP (ffc8f770) is in userspace

0xc000000000481fa0 is in inet_create (net/ipv4/af_inet.c:290).
285 /* Look for the requested type/protocol pair. */
286 answer = NULL;
287 lookup_protocol:
288 err = -ESOCKTNOSUPPORT;
289 rcu_read_lock();
290 list_for_each_entry_rcu(answer, &inetsw[sock->type], list) {
291
292 /* Check the non-wild match. */
293 if (protocol == answer->protocol) {
294 if (protocol != IPPROTO_IP)

0xc000000000413340 is in __sock_create (net/socket.c:1171).
1166 goto out_release;
1167
1168 /* Now protected by module ref count */
1169 rcu_read_unlock();
1170
1171 err = pf->create(net, sock, protocol);
1172 if (err < 0)
1173 goto out_module_put;
1174
1175 /*

--
Thanks & Regards,
Kamales...
Hmmm.... Does the panic go away when this patch is reverted?
--
Yes.
--
OK, am awake now, apologies for my confusion. Not sure -what- state
I was in when generating and validating the original...

Thanx, Paul
--
Hi Andrew,
While running the dbench benchmark on a reiserfs filesystem
on an x86_64 box booted with the 2.6.26-rc2-mm1 kernel, the following
kernel BUG() is seen on the console.

------------[ cut here ]------------
kernel BUG at fs/reiserfs/journal.c:1414!
invalid opcode: 0000 [1] SMP
last sysfs file: /sys/devices/pci0000:20/0000:20:04.1/resource
CPU 3
Modules linked in:
Pid: 5160, comm: umount Not tainted 2.6.26-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff802e47e5>] [<ffffffff802e47e5>] flush_journal_list+0x78/0x575
RSP: 0000:ffff8101fec6fb18 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000003
RDX: 0000000000000000 RSI: ffff81007edf7c00 RDI: ffffc2000210d0a0
RBP: ffff8101fec6fb58 R08: ffff8101fec6e000 R09: 0000000000000001
R10: 0000000000000000 R11: ffffc2000212f1b0 R12: ffff81007edf7c00
R13: 000000000000000d R14: ffff8101fe5d2c00 R15: ffffc2000210d000
FS: 0000000000000000(0000) GS:ffff8101fff07f80(0063) knlGS:00000000f7fbdb20
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f7f5d260 CR3: 00000001fb475000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 5160, threadinfo ffff8101fec6e000, task ffff8101fe642c80)
Stack: 00000000fec6fb10 ffffffff802a557b 00000000fc7593f0 ffffc2000210d000
ffff81007f65f900 000000000000000d ffff8101fe5d2c00 ffffc2000210d000
ffff8101fec6fba8 ffffffff802e4aff 000000000000000d 0000000100000001
Call Trace:
[<ffffffff802a557b>] submit_bh+0x105/0x111
[<ffffffff802e4aff>] flush_journal_list+0x392/0x575
[<ffffffff802e7e76>] do_journal_end+0xb6d/0xe0c
[<ffffffff80261f7f>] __writepage+0x0/0x2a
[<ffffffff80263c1f>] pagevec_lookup_tag+0x20/0x2a
[<ffffffff8025ad8e>] wait_on_page_writeback_range+0xeb/0x13e
[<ffffffff802e8376>] do_journal_begin_r+0x261/0x2a2
[<ffffffff802e8a13>] do_journal_release+0x4c/0x180
[<ffffffff8024...
This?
--- a/fs/reiserfs/journal.c~reiserfs-convert-j_flush_sem-to-mutex-fix
+++ a/fs/reiserfs/journal.c
@@ -1412,7 +1412,7 @@ static int flush_journal_list(struct sup
/* if flushall == 0, the lock is already held */
if (flushall) {
mutex_lock(&journal->j_flush_mutex);
- } else if (!mutex_trylock(&journal->j_flush_mutex)) {
+ } else if (mutex_trylock(&journal->j_flush_mutex)) {
BUG();
}
--
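The one-character fix above is easier to follow with the two return conventions side by side: the kernel's down_trylock() returns 0 when it gets the semaphore, while mutex_trylock() returns 1 when it gets the mutex. The sketch below models both with a toy flag (single-threaded, no real locking). It assumes, as the patch name suggests but the thread does not show, that the pre-conversion code tested `!down_trylock(...)`. All `toy_*` and `*_would_bug` names are made up for illustration.

```c
#include <stdbool.h>

/* Toy lock: a plain flag, single-threaded, only for comparing return codes. */
struct toy_lock {
	bool held;
};

/* Semaphore convention: returns 0 if acquired, 1 if it was already held. */
static int toy_down_trylock(struct toy_lock *l)
{
	if (l->held)
		return 1;
	l->held = true;
	return 0;
}

/* Mutex convention: returns 1 if acquired, 0 if it was already held. */
static int toy_mutex_trylock(struct toy_lock *l)
{
	if (l->held)
		return 0;
	l->held = true;
	return 1;
}

/* Each helper below returns true where the kernel code would hit BUG(). */

/* Assumed pre-conversion check: BUG if the semaphore could be taken. */
static bool sem_check_would_bug(struct toy_lock *l)
{
	return !toy_down_trylock(l);
}

/* Converted check before the fix: the '!' was carried over unchanged. */
static bool buggy_mutex_check_would_bug(struct toy_lock *l)
{
	return !toy_mutex_trylock(l);
}

/* Check after the fix: BUG only if the mutex could be taken. */
static bool fixed_mutex_check_would_bug(struct toy_lock *l)
{
	return toy_mutex_trylock(l);
}
```

With the lock already held (the flushall == 0 case), the semaphore check and the fixed mutex check stay quiet, while the unfixed mutex check fires, which matches the BUG at fs/reiserfs/journal.c:1414 in the report.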
Hi Andrew,
The 2.6.26-rc2-mm1 kernel panics during bootup on an x86_64 machine.
BUG: unable to handle kernel paging request at 0000000000001e08
IP: [<ffffffff8026ac60>] __alloc_pages_internal+0x80/0x470
PGD 0
Oops: 0000 [1] SMP
last sysfs file:
CPU 31
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.26-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff8026ac60>] [<ffffffff8026ac60>] __alloc_pages_internal+0x80/0x470
RSP: 0018:ffff810bf9dbdbc0 EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff810bef4786c0 RCX: 0000000000000001
RDX: 0000000000001e00 RSI: 0000000000000001 RDI: 0000000000001020
RBP: ffff810bf9dbb6d0 R08: 0000000000001020 R09: 0000000000000000
R10: 0000000000000008 R11: ffffffff8046d130 R12: 0000000000001020
R13: 0000000000000001 R14: 0000000000001e00 R15: ffff810bf8d29878
FS: 0000000000000000(0000) GS:ffff810bf916dec0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000001e08 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff810bf9dbc000, task ffff810bf9dbb6d0)
Stack: 0002102000000000 0000000000000002 0000000000000000 0000000200000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 ffff810bef4786c0 0000000000001020 ffffffffffffffff
Call Trace:
[<ffffffff802112e9>] dma_alloc_coherent+0xa9/0x280
[<ffffffff804e8c9e>] tg3_init_one+0xa3e/0x15e0
[<ffffffff8028d0e4>] alternate_node_alloc+0x84/0xd0
[<ffffffff802286fc>] task_rq_lock+0x4c/0x90
[<ffffffff8022de62>] set_cpus_allowed_ptr+0x72/0xf0
[<ffffffff802e12fb>] sysfs_addrm_finish+0x1b/0x210
[<ffffffff802e0f99>] sysfs_find_dirent+0x29/0x40
[<ffffffff8036cc34>] pci_device_probe+0xe4/0x130
[<ffffffff803bfc26>] driver_probe_device+0x96/0x1a0
[<ffffffff803bfdb9>] __driver_attach+0x89/0x...
grumble. why. There are lots of patches already which changed the
page allocator.

config, please?
Is it NUMA?
--
It is a NUMA box, with 4 nodes.
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
On Wed, 14 May 2008 23:51:36 +0530
Can you bisect it please?
Wrecking the page allocator is a fairly unusual thing to do. I'd start
out by looking at *bootmem*.patch and perhaps
acpi-acpi_numa_init-build-fix.patch.
--
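Bisecting a patch series such as -mm amounts to a binary search over the number of patches applied: the tree with none of the series applied is assumed good, the full series is known bad, and each rebuild halves the interval. A minimal sketch follows. `first_bad_patch`, `boots_fn` and the stand-in predicate `toy_boots` are hypothetical names; in practice the predicate is a rebuild and reboot, not a function call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the kernel boots with the first n_applied patches applied. */
typedef bool (*boots_fn)(size_t n_applied, void *ctx);

/*
 * Find the zero-based index of the first patch that breaks the boot.
 * Assumes n_patches > 0, boots(0) is true, boots(n_patches) is false, and
 * the breakage persists once introduced. *n_tests counts the rebuilds.
 */
static size_t first_bad_patch(size_t n_patches, boots_fn boots, void *ctx,
			      size_t *n_tests)
{
	size_t good = 0;		/* known-good number of applied patches */
	size_t bad = n_patches;		/* known-bad number of applied patches */

	*n_tests = 0;
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		(*n_tests)++;
		if (boots(mid, ctx))
			good = mid;
		else
			bad = mid;
	}
	return bad - 1;
}

/* Stand-in for "rebuild and reboot": ctx points at the culprit's index. */
static bool toy_boots(size_t n_applied, void *ctx)
{
	size_t culprit = *(size_t *)ctx;

	return n_applied <= culprit;
}
```

For a series of 1000 patches this needs at most 10 tests. Starting from likely suspects, as suggested above with *bootmem*.patch, can shorten the search further when the guess is right.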
After bisecting, acpi-acpi_numa_init-build-fix.patch seems
to be causing the kernel panic during bootup. Reverting the patch lets
the machine boot without the panic.

commit 5dc90c0b2d4bd0127624bab67cec159b2c6c4daf
Author: Ingo Molnar <mingo@elte.hu>
Date: Thu May 1 09:51:47 2008 +0000

acpi-acpi_numa_init-build-fix
x86.git testing found the following build error on latest -git:
drivers/acpi/numa.c: In function 'acpi_numa_init':
drivers/acpi/numa.c:226: error: 'NR_NODE_MEMBLKS' undeclared (first use in this function)
drivers/acpi/numa.c:226: error: (Each undeclared identifier is reported only once
drivers/acpi/numa.c:226: error: for each function it appears in.)

with this config:
http://redhat.com/~mingo/misc/config-Wed_Apr_30_22_42_42_CEST_2008.bad
i suspect we dont want SRAT parsing when CONFIG_HAVE_ARCH_PARSE_SRAT
is unset - but the fix looks a bit ugly. Perhaps we should define
NR_NODE_MEMBLKS even in this case and just let the code fall back
to some sane behavior?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 5d59cb3..8cab8c5 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -176,6 +176,7 @@ acpi_parse_processor_affinity(struct acpi_subtable_header * header,
return 0;
}

+#ifdef CONFIG_HAVE_ARCH_PARSE_SRAT
static int __init
acpi_parse_memory_affinity(struct acpi_subtable_header * header,
const unsigned long end)
@@ -193,6 +194,7 @@ acpi_parse_memory_affinity(struct acpi_subtable_header * header,

return 0;
}
+#endif

static int __init acpi_parse_srat(struct acpi_table_header *table)
{
@@ -221,9 +223,11 @@ int __init acpi_numa_init(void)
if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
acpi_parse_processo...
This patch breaks a Fujitsu ia64 NUMA box too.
After reverting it, my test environment works well.

Thanks.
--
On HP ia64 numa, that patch causes all memory to show up on node 0, but
otherwise the platform boots and runs. Didn't notice it until I tried
to run some numa tests.

Reverting the patch restores numaness.
Lee
--
On Wed, 14 May 2008 12:44:55 -0700
From the stack trace, it seems NODE_DATA(nid) is NULL.
There are 2 possible cases:
- nid is bad.
- NODE_DATA(nid) is not initialized.

Hmm..

Thanks,
-Kame
--
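One way to see why a NULL NODE_DATA(nid) fits the oops: the fault address in CR2 is 0x1e08, a small offset rather than a wild pointer, and RDX and R14 both hold 0x1e00. That pattern is what a field access through a NULL structure pointer produces. The sketch below uses a made-up layout (`toy_pgdat`, `toy_zonelist`), sized only so that the offsets match the register dump. The real pg_data_t layout depends on the kernel config and is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a zonelist: the faulting load is 8 bytes in. */
struct toy_zonelist {
	uint64_t cache_ptr;
	uint64_t first_ref;
};

/* Hypothetical stand-in for pg_data_t, padded so the zonelists sit at 0x1e00. */
struct toy_pgdat {
	unsigned char zones[0x1e00];
	struct toy_zonelist zonelists[2];
};

/* Address touched when reading zonelists[0].first_ref via a pgdat at 'base'. */
static uintptr_t toy_fault_address(uintptr_t base)
{
	return base + offsetof(struct toy_pgdat, zonelists)
		    + offsetof(struct toy_zonelist, first_ref);
}
```

With a base of 0 the function yields 0x1e08, the value in CR2, and the intermediate offset 0x1e00 is the value seen in RDX and R14.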
