Re: [BUILD_FAILURE] 2.6.25-rc2-mm1 - Build Failure at acpi_os

From: Andrew Morton
Date: Saturday, February 16, 2008 - 1:25 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc2/2.6.25-rc2-mm1/

- git-xfs is dropped due to git conflicts

- git-x86 is dropped due to too many changes to non-x86 code

- git-perfmon remains dropped due to rejects

- git-kgdb remains dropped due to rejects

- Added the slab/slub tree as git-slub.patch (Christoph Lameter)



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.  These probably are at least compilable.

- More-than-daily -mm snapshots may be found at
  http://userweb.kernel.org/~akpm/mmotm/.  These are almost certainly not
  compilable.
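
The bisection search mentioned in the boilerplate above halves the range of
suspect patches on every rebuild, which is why the cost stays near ten
build-and-boot cycles even for a long series.  A minimal sketch, assuming a
series of about a thousand patches (an illustrative figure, not taken from
this release); bisect_series(), shows_bug_fn and culprit_applied() are
hypothetical names, and in practice the predicate is a manual build, boot
and test:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical predicate: returns nonzero when a tree with the first
 * `applied` patches of the series shows the bug.
 */
typedef int (*shows_bug_fn)(size_t applied, void *ctx);

/*
 * Return the 1-based index of the first patch that introduces the bug.
 * Precondition: the tree is good with 0 patches and bad with all `n`.
 * If `steps` is non-NULL it receives the number of test cycles used.
 */
size_t bisect_series(size_t n, shows_bug_fn shows_bug, void *ctx,
		     size_t *steps)
{
	size_t good = 0;	/* largest prefix known to be good */
	size_t bad = n;		/* smallest prefix known to be bad */
	size_t count = 0;

	assert(shows_bug != NULL);

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (shows_bug(mid, ctx))
			bad = mid;
		else
			good = mid;
		count++;
	}
	if (steps)
		*steps = count;
	return bad;
}

/* Example predicate for experiments: ctx points at the culprit's index. */
int culprit_applied(size_t applied, void *ctx)
{
	return applied >= *(size_t *)ctx;
}
```

With n = 1000 the loop runs nine or ten times whichever patch is at fault,
which matches the "around ten rebuilds and reboots" estimate.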




Changes ...
From: Marcin Slusarz
Date: Saturday, February 16, 2008 - 3:59 am

arch/x86/kernel/built-in.o: In function `amd_smp_thermal_interrupt':
(.text+0xe03b): undefined reference to `mce_log_therm_throt_event'
arch/x86/kernel/built-in.o: In function `acpi_save_state_mem':
(.text+0x12239): undefined reference to `setup_trampoline'

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc2-mm1
# Sat Feb 16 11:32:49 2008
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
# CONFIG_QUICKLIST is not set
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
# CONFIG_KTIME_SCALAR is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT ...
From: Andrew Morton
Date: Saturday, February 16, 2008 - 4:09 am

ho hum, thanks.  I think I'll drop x86-amd-thermal-interrupt-support.patch.
 I don't think it's the final version anyway.

--

From: Marcin Slusarz
Date: Saturday, February 16, 2008 - 4:37 am

Ok, I had to revert x86-remove-pt_regs-arg-from-smp_thermal_interrupt before x86-amd-thermal-interrupt-support.

Second error vanished when I reverted "suspend: wakeup code in C".
--

From: Rafael J. Wysocki
Date: Saturday, February 16, 2008 - 5:22 pm

This one is easily fixed by the appended patch (whether it works is a separate

It will compile if you set CONFIG_SMP.  Working on a fix.

Thanks,
Rafael


---
 arch/x86/kernel/cpu/mcheck/mce_64.c      |    4 ++--
 arch/x86/kernel/cpu/mcheck/mce_thermal.h |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_64.c
===================================================================
--- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -317,7 +317,7 @@ void do_machine_check(struct pt_regs * r
 	atomic_dec(&mce_entry);
 }
 
-#ifdef CONFIG_X86_MCE_INTEL
+#ifdef CONFIG_X86_MCE
 /***
  * mce_log_therm_throt_event - Logs the thermal throttling event to mcelog
  * @cpu: The CPU on which the event occurred.
@@ -342,7 +342,7 @@ void mce_log_therm_throt_event(unsigned 
 	rdtscll(m.tsc);
 	mce_log(&m);
 }
-#endif /* CONFIG_X86_MCE_INTEL */
+#endif /* CONFIG_X86_MCE */
 
 /*
  * Periodic polling timer for "silent" machine check errors.  If the
Index: linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_thermal.h
===================================================================
--- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/cpu/mcheck/mce_thermal.h
+++ linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_thermal.h
@@ -4,5 +4,5 @@
 typedef void (*smp_thermal_interrupt_callback_t)(void);
 extern smp_thermal_interrupt_callback_t	smp_thermal_interrupt;
 
-void mce_log_therm_throt_event(unsigned int cpu, __u64 status);
+extern void mce_log_therm_throt_event(unsigned int cpu, __u64 status);
 
--

From: Marcin Slusarz
Date: Sunday, February 17, 2008 - 2:54 am

From: Rafael J. Wysocki
Date: Saturday, February 16, 2008 - 6:37 pm

The appended patch should fix the second error.

Thanks,
Rafael

---
On x86-64 the CPU trampoline code is now used while waking up from ACPI
suspend to RAM.  For this reason, make it depend on
(64BIT && ACPI_SLEEP) as well as on SMP, move the relevant declarations
to a separate header and move the definition of setup_trampoline() from
smpboot_64.c .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 arch/x86/Kconfig             |    2 +-
 arch/x86/kernel/acpi/sleep.c |    1 -
 arch/x86/kernel/acpi/sleep.h |    3 ++-
 arch/x86/kernel/e820_64.c    |    5 +++--
 arch/x86/kernel/setup_64.c   |   16 ++++++++++++++++
 arch/x86/kernel/smpboot_64.c |   23 ++---------------------
 include/asm-x86/smp_64.h     |    2 --
 include/asm-x86/trampoline.h |   18 ++++++++++++++++++
 8 files changed, 42 insertions(+), 28 deletions(-)

Index: linux-2.6.25-rc2-mm1/arch/x86/Kconfig
===================================================================
--- linux-2.6.25-rc2-mm1.orig/arch/x86/Kconfig
+++ linux-2.6.25-rc2-mm1/arch/x86/Kconfig
@@ -180,7 +180,7 @@ config X86_BIOS_REBOOT
 
 config X86_TRAMPOLINE
 	bool
-	depends on X86_SMP || (X86_VOYAGER && SMP)
+	depends on X86_SMP || (X86_VOYAGER && SMP) || (64BIT && ACPI_SLEEP)
 	default y
 
 config KTIME_SCALAR
Index: linux-2.6.25-rc2-mm1/arch/x86/kernel/e820_64.c
===================================================================
--- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/e820_64.c
+++ linux-2.6.25-rc2-mm1/arch/x86/kernel/e820_64.c
@@ -27,6 +27,7 @@
 #include <asm/setup.h>
 #include <asm/sections.h>
 #include <asm/kdebug.h>
+#include <asm/trampoline.h>
 
 struct e820map e820;
 
@@ -58,8 +59,8 @@ struct early_res {
 };
 static struct early_res early_res[MAX_EARLY_RES] __initdata = {
 	{ 0, PAGE_SIZE, "BIOS data page" },			/* BIOS data page */
-#ifdef CONFIG_SMP
-	{ SMP_TRAMPOLINE_BASE, SMP_TRAMPOLINE_BASE + 2*PAGE_SIZE, "SMP_TRAMPOLINE" },
+#ifdef CONFIG_X86_TRAMPOLINE
+	{ TRAMPOLINE_BASE, TRAMPOLINE_BASE + 2 * PAGE_SIZE, ...
From: Marcin Slusarz
Date: Sunday, February 17, 2008 - 2:56 am

From: Russell Leidich
Date: Tuesday, February 19, 2008 - 11:51 am

Thanks for fixing this, Rafael.

I must admit that I'm puzzled as to why this should fail, since my
Kconfig change forces K8_NB=y when X86_MCE_AMD=y -- unless this is a
discussion of my earlier patch, which did not include this change.




-- 
Russell Leidich
--

From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 9:15 am

Hi Andrew,

The 2.6.25-rc2-mm1 kernel panics while booting up on s390x:

Unable to handle kernel pointer dereference at virtual kernel address 0000000000000000
Oops: 0004 [#1] SMP
Modules linked in:
CPU: 0 Not tainted 2.6.25-rc2-mm1-autotest #1
Process swapper (pid: 1, task: 000000003f830000, ksp: 000000003f83ba48)
Krnl PSW : 0704a00180000000 000000000024b2be (futex_atomic_cmpxchg_std+0x12/0x28)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:2 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000074 00000000fffffff2 0000000000000000 0000000000000000
           0000000000000000 0000000000000001 0000000000000000 00000000004d0764
           0000000000000000 00000000004d8768 0000000000000000 000000003f83bdb0
           0000000000000040 0000000000343950 00000000000627e4 000000003f83bdb0
Krnl Code: 000000000024b2b2: b90400bf           lgr     %r11,%r15
           000000000024b2b6: a718fff2           lhi     %r1,-14
           000000000024b2ba: b2790100           sacf    256
          >000000000024b2be: ba342000           cs      %r3,%r4,0(%r2)
           000000000024b2c2: 1813               lr      %r1,%r3
           000000000024b2c4: b2790000           sacf    0
           000000000024b2c8: b9140021           lgfr    %r2,%r1
           000000000024b2cc: e3b0b0700004       lg      %r11,112(%r11)
Call Trace:
([<000000003f83bda8>] 0x3f83bda8)
 [<00000000004bdeec>] init+0x30/0x104
 [<00000000004b0c40>] kernel_init+0x1e0/0x370
 [<000000000001a5c6>] kernel_thread_starter+0x6/0xc
 [<000000000001a5c0>] kernel_thread_starter+0x0/0xc

 <4>---[ end trace 561bb236c800851f ]---
note: swapper[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init!

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Andrew Morton
Date: Saturday, February 16, 2008 - 12:45 pm

Possibilities are that something has gone wrong with the recent cmpxchg
changes which are now in mainline, or there's something wrong with
futex-fix-init-order.patch or
futex-runtime-enable-pi-and-robust-functionality.patch.

Or, of course, it's something else ;)

First question is: does this happen in current mainline?

If not, it would be useful if someone could test futex-fix-init-order.patch
and futex-runtime-enable-pi-and-robust-functionality.patch on current
mainline, because those are planned for 2.6.25.

Thanks.
--

From: Thomas Gleixner
Date: Saturday, February 16, 2008 - 12:49 pm

> > Oops: 0004 #1
From: Thomas Gleixner
Date: Saturday, February 16, 2008 - 12:50 pm

> > > Oops: 0004 #1
From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 8:40 pm

To confirm which patches cause the panic, I tested the 2.6.24.2 kernel with
futex-fix-init-order.patch and
futex-runtime-enable-pi-and-robust-functionality.patch applied, and they do
seem to cause the kernel panic.

Unable to handle kernel pointer dereference at virtual kernel address 0000000000000000
Oops: 0004 [#1] SMP
Modules linked in:
CPU: 0 Not tainted 2.6.25-rc2-mm1-autotest #1
Process swapper (pid: 1, task: 000000003f830000, ksp: 000000003f83ba48)
Krnl PSW : 0704a00180000000 000000000024b2be (futex_atomic_cmpxchg_std+0x12/0x28)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:2 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000074 00000000fffffff2 0000000000000000 0000000000000000
           0000000000000000 0000000000000001 0000000000000000 00000000004d0764
           0000000000000000 00000000004d8768 0000000000000000 000000003f83bdb0
           0000000000000040 0000000000343950 00000000000627e4 000000003f83bdb0
Krnl Code: 000000000024b2b2: b90400bf           lgr     %r11,%r15
           000000000024b2b6: a718fff2           lhi     %r1,-14
           000000000024b2ba: b2790100           sacf    256
          >000000000024b2be: ba342000           cs      %r3,%r4,0(%r2)
           000000000024b2c2: 1813               lr      %r1,%r3
           000000000024b2c4: b2790000           sacf    0
           000000000024b2c8: b9140021           lgfr    %r2,%r1
           000000000024b2cc: e3b0b0700004       lg      %r11,112(%r11)
Call Trace:
([<000000003f83bda8>] 0x3f83bda8)
 [<00000000004bdeec>] init+0x30/0x104
 [<00000000004b0c40>] kernel_init+0x1e0/0x370
 [<000000000001a5c6>] kernel_thread_starter+0x6/0xc
 [<000000000001a5c0>] kernel_thread_starter+0x0/0xc

 <4>---[ end trace 561bb236c800851f ]---
note: swapper[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init!

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Heiko Carstens
Date: Sunday, February 17, 2008 - 2:06 am

Thanks for reporting! Patch below should fix it.

Index: linux-2.6/arch/s390/lib/uaccess_std.c
===================================================================
--- linux-2.6.orig/arch/s390/lib/uaccess_std.c
+++ linux-2.6/arch/s390/lib/uaccess_std.c
@@ -293,8 +293,8 @@ int futex_atomic_cmpxchg_std(int __user 
 
 	asm volatile(
 		"   sacf 256\n"
-		"   cs   %1,%4,0(%5)\n"
-		"0: lr   %0,%1\n"
+		"0: cs   %1,%4,0(%5)\n"
+		"   lr   %0,%1\n"
 		"1: sacf 0\n"
 		EX_TABLE(0b,1b)
 		: "=d" (ret), "+d" (oldval), "=m" (*uaddr)
--
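
The thread does not spell out why moving the "0:" label fixes the panic, so
the following is one reading of the patch rather than Heiko's own
explanation.  EX_TABLE(0b,1b) records that a fault at label 0 should resume
at label 1.  With the label on "lr", a fault in "cs" (the instruction that
actually touches the user address, and the one flagged in the oops) has no
table entry.  A simplified model of that lookup, using the addresses from
the "Krnl Code" dump; struct ex_entry and find_fixup() are hypothetical
names, and the real kernel's table layout and search differ:

```c
#include <assert.h>
#include <stddef.h>

/* One exception-table entry: a fault at `insn` resumes at `fixup`. */
struct ex_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Return the fixup address for a faulting instruction, or 0 if none. */
unsigned long find_fixup(const struct ex_entry *table, size_t n,
			 unsigned long fault_addr)
{
	assert(table != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (table[i].insn == fault_addr)
			return table[i].fixup;
	return 0;
}

/* Addresses taken from the "Krnl Code" dump in the oops report. */
#define ADDR_CS    0x24b2beUL	/* cs   %r3,%r4,0(%r2): the faulting insn */
#define ADDR_LR    0x24b2c2UL	/* lr   %r1,%r3 */
#define ADDR_SACF0 0x24b2c4UL	/* sacf 0: label 1, the fixup target */

/* Before the patch: label 0 sits on "lr", so "cs" is not covered. */
const struct ex_entry table_before[] = { { ADDR_LR, ADDR_SACF0 } };

/* After the patch: label 0 sits on "cs", the user-space access. */
const struct ex_entry table_after[] = { { ADDR_CS, ADDR_SACF0 } };
```

In this model find_fixup(table_before, 1, ADDR_CS) finds nothing, which
corresponds to the oops, while find_fixup(table_after, 1, ADDR_CS) returns
the address of "sacf 0".  Resuming there skips the "lr", so the value loaded
earlier by "lhi %r1,-14" would presumably be what the function returns.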

From: Kamalesh Babulal
Date: Monday, February 18, 2008 - 7:08 am

Hi Heiko,

Thanks for the patch. I have tested it, and it fixes the bootup panic.

Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Randy Dunlap
Date: Saturday, February 16, 2008 - 10:21 am

cciss driver build errors on x86_64:

In file included from drivers/block/cciss.c:231:
drivers/block/cciss_scsi.c:1498:38: error: macro parameters must be comma-separated
drivers/block/cciss.c: In function 'cciss_seq_show_header':
drivers/block/cciss.c:272: error: implicit declaration of function 'cciss_seq_tape_report'
drivers/block/cciss.c: In function 'cciss_proc_write':
drivers/block/cciss.c:393: error: implicit declaration of function 'cciss_engage_scsi'
make[2]: *** [drivers/block/cciss.o] Error 1


---
~Randy
--

From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 10:31 am

Hi Andrew,

The 2.6.25-rc2-mm1 kernel build fails on powerpc:

  CC      security/keys/compat.o
security/keys/compat.c: In function ‘compat_sys_keyctl’:
security/keys/compat.c:83: error: implicit declaration of function ‘keyctl_get_security’
make[2]: *** [security/keys/compat.o] Error 1
make[1]: *** [security/keys] Error 2
make: *** [security] Error 2

The keys-add-keyctl-function-to-get-a-security-label.patch is causing
this build failure.

I have tested the patch below against the build failure only.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
--
--- linux-2.6.25-rc2/security/keys/internal.h	2008-02-17 05:03:30.000000000 +0530
+++ linux-2.6.25-rc2/security/keys/~internal.h	2008-02-17 05:46:16.000000000 +0530
@@ -155,6 +155,8 @@ extern long keyctl_negate_key(key_serial
 extern long keyctl_set_reqkey_keyring(int);
 extern long keyctl_set_timeout(key_serial_t, unsigned);
 extern long keyctl_assume_authority(key_serial_t);
+extern long keyctl_get_security(key_serial_t keyid, char __user *buffer,
+				size_t buflen);
 
 
 /*
-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Randy Dunlap
Date: Saturday, February 16, 2008 - 10:48 am

The ACPI wakeup in C patch (I think) won't build for me on x86_32
(i.e., i386 build on x86_64 system):

linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set
linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set
linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: -mpreferred-stack-boundary=2 is not between 4 and 12
make[4]: *** [arch/x86/kernel/acpi/realmode/wakeup.o] Error 1
make[3]: *** [arch/x86/kernel/acpi/realmode/wakeup.bin] Error 2

---
~Randy
--

From: Rafael J. Wysocki
Date: Saturday, February 16, 2008 - 6:18 pm

It compiles for me on a native i386.

Can you please give me a hint what to do to reproduce the problem?

Rafael
--

From: H. Peter Anvin
Date: Saturday, February 16, 2008 - 6:22 pm

Sounds like you're not adding -m32 to the gcc command line.

	-hpa
--

From: Randy Dunlap
Date: Saturday, February 16, 2008 - 7:19 pm

Yes, adding -m32 to the X86_32 config ccflags (as is done for the
X86_64 case) makes it build for me.  (like patch below)

Thanks.
---

From: Randy Dunlap <randy.dunlap@oracle.com>

Fix wakeup code build errors on x86_64.

linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set
linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set
linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: -mpreferred-stack-boundary=2 is not between 4 and 12

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 arch/x86/kernel/acpi/realmode/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/acpi/realmode/Makefile
+++ linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/Makefile
@@ -27,7 +27,7 @@ bootsrc		:= $(src)/../../../boot
 # How to compile the 16-bit code.  Note we always compile for -march=i386,
 # that way we can complain to the user if the CPU is insufficient.
 # Compile with _SETUP since this is similar to the boot-time setup code.
-cflags-$(CONFIG_X86_32) :=
+cflags-$(CONFIG_X86_32) := -m32
 cflags-$(CONFIG_X86_64) := -m32
 KBUILD_CFLAGS	:= $(LINUXINCLUDE) -g -Os -D_SETUP -D_WAKEUP -D__KERNEL__ \
 		   -I$(srctree)/$(bootsrc) \
--

From: H. Peter Anvin
Date: Saturday, February 16, 2008 - 8:58 pm

It's wrong, though, because you can't assume a 32-bit compiler knows 
about -m32.

You need $(call cc-option,-m32).

	-hpa
--

From: Randy Dunlap
Date: Saturday, February 16, 2008 - 9:38 pm

Thanks, Peter.  Tested/works.

---
From: Randy Dunlap <randy.dunlap@oracle.com>

Fix wakeup code build errors on x86_64.

linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set
linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set
linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: -mpreferred-stack-boundary=2 is not between 4 and 12

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 arch/x86/kernel/acpi/realmode/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/acpi/realmode/Makefile
+++ linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/Makefile
@@ -27,7 +27,7 @@ bootsrc		:= $(src)/../../../boot
 # How to compile the 16-bit code.  Note we always compile for -march=i386,
 # that way we can complain to the user if the CPU is insufficient.
 # Compile with _SETUP since this is similar to the boot-time setup code.
-cflags-$(CONFIG_X86_32) :=
+cflags-$(CONFIG_X86_32) := $(call cc-option, -m32)
 cflags-$(CONFIG_X86_64) := -m32
 KBUILD_CFLAGS	:= $(LINUXINCLUDE) -g -Os -D_SETUP -D_WAKEUP -D__KERNEL__ \
 		   -I$(srctree)/$(bootsrc) \
--

From: H. Peter Anvin
Date: Saturday, February 16, 2008 - 9:35 pm

I think this works for both; that's what we do for arch/x86/boot.

	-hpa

--

From: Randy Dunlap
Date: Saturday, February 16, 2008 - 9:47 pm

OK, that makes sense.  I think I'll let Rafael complete it.

-- 
~Randy
--

From: Rafael J. Wysocki
Date: Sunday, February 17, 2008 - 1:40 pm

OK, so that would be the appended patch.

Still, since there are several fixes against the "move the wakeup code to C"
patch, I'll probably fold them all into a new version of this patch and resend
it.

Thanks,
Rafael

---
 arch/x86/kernel/acpi/realmode/Makefile |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-2.6/arch/x86/kernel/acpi/realmode/Makefile
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/realmode/Makefile
+++ linux-2.6/arch/x86/kernel/acpi/realmode/Makefile
@@ -27,8 +27,6 @@ bootsrc		:= $(src)/../../../boot
 # How to compile the 16-bit code.  Note we always compile for -march=i386,
 # that way we can complain to the user if the CPU is insufficient.
 # Compile with _SETUP since this is similar to the boot-time setup code.
-cflags-$(CONFIG_X86_32) :=
-cflags-$(CONFIG_X86_64) := -m32
 KBUILD_CFLAGS	:= $(LINUXINCLUDE) -g -Os -D_SETUP -D_WAKEUP -D__KERNEL__ \
 		   -I$(srctree)/$(bootsrc) \
 		   $(cflags-y) \
@@ -41,6 +39,7 @@ KBUILD_CFLAGS	:= $(LINUXINCLUDE) -g -Os 
 			$(call cc-option, -fno-unit-at-a-time)) \
 		   $(call cc-option, -fno-stack-protector) \
 		   $(call cc-option, -mpreferred-stack-boundary=2)
+KBUILD_CFLAGS	+= $(call cc-option, -m32)
 KBUILD_AFLAGS	:= $(KBUILD_CFLAGS) -D__ASSEMBLY__
 
 WAKEUP_OBJS = $(addprefix $(obj)/,$(wakeup-y))
--

From: Sam Ravnborg
Date: Sunday, February 17, 2008 - 2:07 pm

For a 64-bit build we should error out if the compiler
fails to support -m32 (however unlikely that may be).
So I would prefer it to be unconditional for 64-bit.

But nit picking - I know.

	Sam
--

From: H. Peter Anvin
Date: Sunday, February 17, 2008 - 2:21 pm

We will err out anyway.

	-hpa
--

From: Sam Ravnborg
Date: Sunday, February 17, 2008 - 2:28 pm

But I assume in a less obvious way.
It is a bit more intuitive to error out on missing
-m32 support than gcc failing to support .code16
or some other inline assembler magic.

	Sam
--

From: H. Peter Anvin
Date: Sunday, February 17, 2008 - 2:31 pm

No, you will get the message "the selected CPU doesn't support the 
x86-64 architecture".

	-hpa
--

From: Sam Ravnborg
Date: Sunday, February 17, 2008 - 2:46 pm

A bit confusing but acceptable.

	Sam
--

From: H. Peter Anvin
Date: Sunday, February 17, 2008 - 2:20 pm

Yes, please.

	-hpa
--

From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 12:47 pm

Hi Andrew,

The 2.6.25-rc2-mm1 kernel, configured with randconfig, fails
to build on an x86_64 machine:

  CC      drivers/acpi/osl.o
drivers/acpi/osl.c:60:38: error: empty filename in #include
drivers/acpi/osl.c: In function ‘acpi_os_table_override’:
drivers/acpi/osl.c:399: error: ‘AmlCode’ undeclared (first use in this function)
drivers/acpi/osl.c:399: error: (Each undeclared identifier is reported only once
drivers/acpi/osl.c:399: error: for each function it appears in.)
make[2]: *** [drivers/acpi/osl.o] Error 1
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc2-mm1
# Sun Feb 17 08:07:17 2008
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
# CONFIG_QUICKLIST is not set
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
# CONFIG_KTIME_SCALAR is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General ...
From: Laura Garcia
Date: Saturday, February 16, 2008 - 1:01 pm

Hi,

If you select CONFIG_ACPI_CUSTOM_DSDT=y, you have to set a file path in the 
option CONFIG_ACPI_CUSTOM_DSDT_FILE="".


Best regards,
Laura.



--

From: Len Brown
Date: Thursday, February 21, 2008 - 12:08 am

garbage in, garbage out.

If you don't give this build option a file name where AmlCode lives,
then the build will be unable to find AmlCode[].

http://www.lesswatts.org/projects/acpi/overridingDSDT.php

cheers,
-Len
--

From: Nish Aravamudan
Date: Thursday, February 21, 2008 - 11:54 am

So we have a .config option whose sole purpose is to use another
.config option? That seems ... less than ideal. Is there not some
Kconfig voodoo we can do to only require the one option? Maybe
something like how CONFIG_INITRAMFS_SOURCE is done? Adding Sam to the
Cc, in case he has any ideas.

Thanks,
Nish
--

From: Sam Ravnborg
Date: Thursday, February 21, 2008 - 3:22 pm

Make sure STANDALONE is y for your randconfig builds.
See README for examples.

STANDALONE is there exactly to prevent the above but we cannot
control randconfig.

	Sam

--

From: Nish Aravamudan
Date: Thursday, February 21, 2008 - 6:38 pm

Hrm, if this is needed for randconfig to work, perhaps randconfig

While setting STANDALONE does fix the above, it doesn't answer the
more basic question I had -- do we really need both .config options in
this case? If it's simply a case of "That's how it is, won't be fixed,
there are higher priorities", that's good enough by me. Just seems a
shame that we have an option to enable another option, which is
required for the first option to be sensible -- seems like we should
only need the second option...

Thanks,
Nish
--

From: Sam Ravnborg
Date: Friday, February 22, 2008 - 11:08 am

I really do not see what problem you are trying to address.

STANDALONE is there as an easy way to turn off the options that require
sensible input to make a kernel compile.

And that makes _perfect_ sense when you do randconfig builds.

	Sam
--

From: Nish Aravamudan
Date: Friday, February 22, 2008 - 11:12 am

Yes it does. As I said above I'm *not* arguing about using STANDALONE
for randconfig builds.

What I was doing, perhaps unclearly, was asking if there was a real
Kconfig need to have both CONFIG_ACPI_CUSTOM_DSDT and
CONFIG_ACPI_CUSTOM_DSDT_FILE, when the latter *only* is visible with
the former and the former *only* makes sense with the latter. Couldn't
we just have CONFIG_ACPI_CUSTOM_DSDT_FILE and check that in the code?
Why do we need a boolean option to make another string option
available?

Thanks,
Nish
--

From: Randy Dunlap
Date: Friday, February 22, 2008 - 11:13 am

Is there a way to generate (in Kconfig language) the boolean
CONFIG_ACPI_CUSTOM_DSDT based on whether CONFIG_ACPI_CUSTOM_DSDT_FILE
== "" or != "" ?  I tried to muck around with that last night but
couldn't get it to work.  I.e., just present the ACPI_CUSTOM_DSDT_FILE
config symbol to the user and then generate the ACPI_CUSTOM_DSDT bool
based on the string value.


---
~Randy
--

From: Nish Aravamudan
Date: Friday, February 22, 2008 - 11:21 am

Thanks for re-expressing my question, Randy, this is exactly what I'm wondering.

Thanks,
Nish
--

From: Sam Ravnborg
Date: Friday, February 22, 2008 - 11:27 am

Something following this example?

config STRING
        string
        prompt "What string"
        default ""

config STRING_IS_NOT_EMPTY
        bool
        default STRING != ""


But that seems too easy - were you trying to do something
more complex than this?

	Sam
--

From: Randy Dunlap
Date: Friday, February 22, 2008 - 11:29 am

Yes, that's almost what I had.  I used def_bool n on the second config symbol,
but the bool value never changed when I changed the string value.
I'll be glad to look at it again though.

-- 
~Randy
--

From: Sam Ravnborg
Date: Friday, February 22, 2008 - 11:56 am

I tested the above in a small Kconfig file and it
works as expected. When I set the string to something
STRING_IS_NOT_EMPTY is equal to y.

	Sam
--

From: Randy Dunlap
Date: Friday, February 22, 2008 - 12:25 pm

Let's see what the ACPI people think about this change.

Thanks, Sam.
---
From: Randy Dunlap <randy.dunlap@oracle.com>

Make ACPI_CUSTOM_DSDT boolean config symbol a hidden and derived
value, based on the value of ACPI_CUSTOM_DSDT_FILE (string).
Only the latter is presented to the user as a config option.

This fixes problems with "make randconfig" setting ACPI_CUSTOM_DSDT
but leaving ACPI_CUSTOM_DSDT_FILE empty/blank.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 drivers/acpi/Kconfig |   19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

--- linux-2.6.25-rc2-git5.orig/drivers/acpi/Kconfig
+++ linux-2.6.25-rc2-git5/drivers/acpi/Kconfig
@@ -283,24 +283,23 @@ config ACPI_TOSHIBA
 	  If you have a legacy free Toshiba laptop (such as the Libretto L1
 	  series), say Y.
 
-config ACPI_CUSTOM_DSDT
-	bool "Include Custom DSDT"
+config ACPI_CUSTOM_DSDT_FILE
+	string "Custom DSDT Table file to include"
+	default ""
 	depends on !STANDALONE
-	default n 
 	help
 	  This option supports a custom DSDT by linking it into the kernel.
 	  See Documentation/acpi/dsdt-override.txt
 
-	  If unsure, say N.
-
-config ACPI_CUSTOM_DSDT_FILE
-	string "Custom DSDT Table file to include"
-	depends on ACPI_CUSTOM_DSDT
-	default ""
-	help
 	  Enter the full path name to the file which includes the AmlCode
 	  declaration.
 
+	  If unsure, don't enter a file name.
+
+config ACPI_CUSTOM_DSDT
+	bool
+	default ACPI_CUSTOM_DSDT_FILE != ""
+
 config ACPI_CUSTOM_DSDT_INITRD
 	bool "Read Custom DSDT from initramfs"
 	depends on BLK_DEV_INITRD
--

From: Len Brown
Date: Friday, February 22, 2008 - 10:41 pm

works for me!

applied.

thanks,
-len

ps.  CONFIG_ACPI_CUSTOM_DSDT's only use is to guard the use of
CONFIG_ACPI_CUSTOM_DSDT_FILE:

#ifdef CONFIG_ACPI_CUSTOM_DSDT
#include CONFIG_ACPI_CUSTOM_DSDT_FILE
#endif

we could get rid of it if cpp could do something like

#if (CONFIG_ACPI_CUSTOM_DSDT_FILE != "")
#include CONFIG_ACPI_CUSTOM_DSDT_FILE
#endif

but it doesn't look like cpp has a concept of strings in expressions.
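
To illustrate the distinction: the preprocessor cannot compare strings, but
the compiler proper can tell an empty string literal from a non-empty one
through sizeof.  The sketch below is only an illustration; the macro names
and the path are hypothetical stand-ins for the Kconfig value, and because
#include is handled during preprocessing, this does not replace the
CONFIG_ACPI_CUSTOM_DSDT guard.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two possible Kconfig string values. */
#define DSDT_FILE_UNSET ""
#define DSDT_FILE_SET   "/path/to/dsdt.hex"

/*
 * sizeof on a string literal counts the terminating NUL, so an empty
 * string has size 1.  The result is a compile-time constant, but it is
 * computed by the compiler after preprocessing has finished, so it is
 * not usable in an #if.
 */
#define STRING_IS_NOT_EMPTY(s) (sizeof(s) > 1)

static_assert(!STRING_IS_NOT_EMPTY(DSDT_FILE_UNSET), "empty string");
static_assert(STRING_IS_NOT_EMPTY(DSDT_FILE_SET), "non-empty string");

/* Usable in ordinary C code, for example to skip an override at run time. */
int custom_dsdt_configured(void)
{
	return STRING_IS_NOT_EMPTY(DSDT_FILE_SET);
}
```

This is why deriving the boolean in Kconfig, as the patch does, is the
simpler route: the #ifdef can stay as it is.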

--

From: Kamalesh Babulal
Date: Saturday, February 23, 2008 - 8:33 am

Thanks, the patch solves the build failure.

Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>

After applying the patch and continuing with the same randconfig
reported earlier, the build fails with the following error:

drivers/acpi/thermal.c: In function ‘acpi_thermal_init’:
drivers/acpi/thermal.c:1792: error: ‘thermal_dmi_table’ undeclared (first use in this function)
drivers/acpi/thermal.c:1792: error: (Each undeclared identifier is reported only once
drivers/acpi/thermal.c:1792: error: for each function it appears in.)
make[2]: *** [drivers/acpi/thermal.o] Error 1
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2

I have tested the patch for build failure only.

Signed-off-by: Kamalesh Babulal ...
From: Laurent Riffard
Date: Saturday, February 16, 2008 - 2:27 pm

On 16.02.2008 09:25, Andrew Morton wrote:
From: Arjan van de Ven
Date: Saturday, February 16, 2008 - 2:52 pm

Len: This WARN_ON says that ACPI is trying to call ioremap() on memory that the e820_table
lists as "kernel owned". Do you know why ACPI would do this? Would ACPI get upset if
the kernel would tell it to take a hike?
--

From: Brown, Len
Date: Sunday, February 17, 2008 - 9:58 pm

Depends on the BIOS -- as it is the BIOS AML that is making this
request.

-Len
--

From: Arjan van de Ven
Date: Sunday, February 17, 2008 - 10:18 pm

On Sun, 17 Feb 2008 23:58:03 -0500

is there any possible valid scenario where the BIOS AML would touch memory the 
kernel is using? (Since that seems to be what is going on; I'll cook up a diagnostics
patch to get more info but the warning gets spewed if ioremap() is trying to map
memory the kernel sees as ram)


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Arjan van de Ven
Date: Monday, February 18, 2008 - 12:35 pm

Can you try the patch below? It should print a bit more information so that we can
figure out who's really at fault here.. (eg it's either the diagnostics that are
wrong, or ACPI is doing something evil (on behalf of the bios), we need to know
what address is being triggered)


 From c346400b372a99a4158fce3ea45234bcf947bdf8 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Mon, 18 Feb 2008 08:01:47 -0800
Subject: [PATCH] More diagnostic output for the ioremap WARN_ON

now that ACPI seems to have triggered this WARN_ON.. we need to know
which address it triggers on to be able to judge a final "who's at fault".

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
  arch/x86/mm/ioremap.c |   10 +++++++++-
  1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 69f4981..524dd45 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -126,7 +126,15 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
  			return NULL;
  	}

-	WARN_ON_ONCE(page_is_ram(pfn));
+	for (pfn = phys_addr >> PAGE_SHIFT; pfn < max_pfn_mapped &&
+	     (pfn << PAGE_SHIFT) < last_addr; pfn++) {
+		if (page_is_ram(pfn)) {
+			printk(KERN_ERR "ioremap: trying to map RAM page at %lx\n",
+					pfn << PAGE_SHIFT);
+			WARN_ON_ONCE(page_is_ram(pfn));
+		}
+	}
+

  	switch (mode) {
  	case IOR_MODE_UNCACHED:
-- 
1.5.4.1

--

From: Laurent Riffard
Date: Monday, February 18, 2008 - 2:05 pm

I've got 2 new lines in dmesg output:

ACPI: EC: Look up EC in DSDT
ioremap: trying to map RAM page at 0       <===============
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:134 __ioremap+0xe8/0x1b6()
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.25-rc2-mm1 #41
 [<c0118989>] warn_on_slowpath+0x41/0x6d
 [<c0130ae6>] ? trace_hardirqs_off+0xb/0xd
 [<c0116326>] ? runqueue_is_locked+0x23/0x3f
 [<c0118d31>] ? release_console_sem+0x1be/0x1c6
 [<c0119304>] ? vprintk+0x2d0/0x31d
 [<c01129e6>] __ioremap+0xe8/0x1b6
 [<c0112acd>] ioremap_nocache+0xa/0xc
 [<c02a68a7>] acpi_os_map_memory+0x11/0x1a
 [<c020b6c7>] acpi_ex_system_memory_space_handler+0xd3/0x228
 [<c0203c08>] ? acpi_ev_address_space_dispatch+0x142/0x1a8
 [<c020b5f4>] ? acpi_ex_system_memory_space_handler+0x0/0x228
 [<c0203c2d>] acpi_ev_address_space_dispatch+0x167/0x1a8
 [<c020840d>] acpi_ex_access_region+0x1e4/0x270
 [<c02085ec>] acpi_ex_field_datum_io+0x153/0x2a1
 [<c0158af4>] ? cache_alloc_debugcheck_after+0xe9/0x165
 [<c02087cb>] acpi_ex_extract_from_field+0x91/0x224
 [<c0206bff>] ? acpi_ex_read_data_from_field+0x163/0x1b0
 [<c0206c1c>] acpi_ex_read_data_from_field+0x180/0x1b0
 [<c020d286>] acpi_ex_resolve_node_to_value+0x1aa/0x230
 [<c0207a62>] acpi_ex_resolve_to_value+0x270/0x2aa
 [<c0209e77>] acpi_ex_resolve_operands+0x24e/0x52f
 [<c0200857>] acpi_ds_exec_end_op+0xb7/0x4f4
 [<c0212d81>] acpi_ps_parse_loop+0x5e5/0x79c
 [<c021210c>] acpi_ps_parse_aml+0xb2/0x2dd
 [<c021353c>] acpi_ps_execute_method+0x13d/0x20d
 [<c020fba2>] acpi_ns_evaluate+0x10e/0x1b0
 [<c02164fa>] acpi_ut_evaluate_object+0x57/0x1a1
 [<c02166fe>] acpi_ut_execute_STA+0x22/0x7b
 [<c0218d91>] ? acpi_ut_release_mutex+0x85/0x8f
 [<c020f48d>] acpi_ns_get_device_callback+0x5a/0x121
 [<c021176e>] acpi_ns_walk_namespace+0xfa/0x114
 [<c020f3b1>] acpi_get_devices+0x47/0x5d
 [<c020f433>] ? acpi_ns_get_device_callback+0x0/0x121
 [<c021ce7a>] ? ec_parse_device+0x0/0x6e
 [<c03a4460>] acpi_ec_ecdt_probe+0xaa/0x10a
 ...
From: Arjan van de Ven
Date: Monday, February 18, 2008 - 2:12 pm

actually... it helps a lot.
I'll cook up a patch for this now :)
thanks for testing so quickly


--

From: Gabriel C
Date: Saturday, February 23, 2008 - 5:44 pm

[..]


Gabriel
--

From: Arjan van de Ven
Date: Saturday, February 23, 2008 - 7:50 pm

we fixed the cause of the machine you quoted; so I suspect yours is different..
Can you get me your stacktrace ? Can you try the patch from this thread to show
what memory the offender tries to access ?
--

From: Gabriel C
Date: Monday, February 25, 2008 - 9:31 am

Arjan , sorry for the lag.

With your patch from http://marc.info/?l=linux-kernel&m=120336371506283&w=2 I don't have a warning anymore.

Here are the 2 dmesgs, one from 2.6.25-rc3 and the other from 2.6.25-rc3 + your patch:

http://frugalware.org/~crazy/dmesg/dmesg
http://frugalware.org/~crazy/dmesg/dmesg_with_patch

It seems I'm not alone with that :)

Have a look at http://lkml.org/lkml/2008/2/24/265

Regards,

Gabriel
--

From: Arjan van de Ven
Date: Monday, February 25, 2008 - 3:44 pm

that is ... odd since it's the same in theory, just with some added printk's ;-(
--

From: Gabriel C
Date: Monday, February 25, 2008 - 4:33 pm

[Empty message]
From: Gabriel C
Date: Monday, February 25, 2008 - 4:59 pm

With only that part I get the warning again.

But I see the printk() twice so I guess this should be in general a WARN_ON() ?

http://frugalware.org/~crazy/dmesg/dmesg_with_patch_2

 
--

From: Mirco Tischler
Date: Saturday, March 1, 2008 - 8:40 am

I still get this in mainline (today's 2.6.25-rc3-git)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ac3c959..0a9a616 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -134,7 +134,11 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
                        return NULL;
        }

-       WARN_ON_ONCE(page_is_ram(pfn));
+       if (page_is_ram(pfn)) {
+               printk(KERN_ERR "ioremap: trying to map RAM page at %lx\n",
+                       pfn << PAGE_SHIFT);
+               WARN_ON_ONCE(page_is_ram(pfn));
+       }

        switch (mode) {
        case IOR_MODE_UNCACHED:

With this diagnostics patch applied, I get the following stacktrace:

[   23.223587] Allocate Port Service[0000:00:1c.3:pcie03]
[   23.232536] ioremap: trying to map RAM page at 1000
[   23.232590] ------------[ cut here ]------------
[   23.232633] WARNING: at arch/x86/mm/ioremap.c:140 __ioremap+0x232/0x280()
[   23.232678] Modules linked in:
[   23.232744] Pid: 48, comm: kacpid Not tainted 2.6.25-rc3-current-fix.ioremap.warning #1
[   23.232801]
[   23.232801] Call Trace:
[   23.232881]  [<ffffffff8023b4cf>] warn_on_slowpath+0x5f/0x80
[   23.232925]  [<ffffffff8023c657>] ? printk+0x67/0x70
[   23.232970]  [<ffffffff8038c5f2>] ? acpi_ec_transaction+0x1ec/0x207
[   23.233015]  [<ffffffff802271b2>] __ioremap+0x232/0x280
[   23.233059]  [<ffffffff8022721b>] ioremap_nocache+0xb/0x10
[   23.233103]  [<ffffffff80467792>] acpi_os_map_memory+0x13/0x21
[   23.233149]  [<ffffffff8037d580>] acpi_ex_system_memory_space_handler+0xd2/0x1c2
[   23.233204]  [<ffffffff8037d4ae>] ? acpi_ex_system_memory_space_handler+0x0/0x1c2
[   23.233261]  [<ffffffff80376320>] acpi_ev_address_space_dispatch+0x172/0x1c1
[   23.233307]  [<ffffffff8037a7ff>] acpi_ex_access_region+0x210/0x22d
[   23.233351]  [<ffffffff8037a90b>] acpi_ex_field_datum_io+0xef/0x183
[   23.233397]  [<ffffffff802ad1e2>] ? kmem_cache_alloc+0x82/0xc0
[   23.233441]  ...
From: Fabio Checconi
Date: Sunday, March 2, 2008 - 8:53 am

[cc'd relevant maintainers]


the same here on 2.6.25-rc3, with the innocent ibmphp_access_ebda()
that fires the WARN_ON() in __ioremap() asking for the pfn 0, even
after the page_is_ram() change.  With your patch the warning
disappears.

I think this is because the pfn checked by the original code (before
your patch) is the one after the last iteration, while your patch
checks for each pfn that is going to be mapped.  The latter should
be the intended behavior.  If I've understood the problem, the
(trivial) patch below should fix it.

Also, note that if last_addr is at the beginning of a page we can
__ioremap() normal RAM (in fact we only emit the warning with the
old code, instead of returning NULL.)  Is that possible/intended
behavior?  If not the loop should do one more iteration.


__ioremap() emits a warning if the pfn after the last one it's going
to map is of normal ram.  Correct this and emit the warning (once)
only if one of the asked pages is.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
---
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ac3c959..6f7b158 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -109,7 +109,7 @@ static int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
 			       enum ioremap_mode mode)
 {
-	unsigned long pfn, offset, last_addr, vaddr;
+	unsigned long pfn, offset, last_addr, vaddr, is_ram = 0;
 	struct vm_struct *area;
 	pgprot_t prot;
 
@@ -132,9 +132,10 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
 		if (page_is_ram(pfn) && pfn_valid(pfn) &&
 		    !PageReserved(pfn_to_page(pfn)))
 			return NULL;
+		is_ram |= page_is_ram(pfn);
 	}
 
-	WARN_ON_ONCE(page_is_ram(pfn));
+	WARN_ON_ONCE(is_ram);
 
 	switch (mode) {
 	case IOR_MODE_UNCACHED:
--

From: Arjan van de Ven
Date: Sunday, March 2, 2008 - 9:58 am

looks good to me; Ingo please apply
(Note: if no legit users show up I want to just remove support for mapping ram altogether
in 2.6.26 or so)
--

From: Ingo Molnar
Date: Monday, March 3, 2008 - 1:46 am

well upstream doesn't have the warning anymore, i queued up the patch 
below into x86.git#testing.

	Ingo

-------------------->
Subject: x86: warn about RAM pages in ioremap()
From: Ingo Molnar <mingo@elte.hu>
Date: Mon Mar 03 09:37:41 CET 2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/mm/ioremap.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: linux-x86.q/arch/x86/mm/ioremap.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/ioremap.c
+++ linux-x86.q/arch/x86/mm/ioremap.c
@@ -149,9 +149,11 @@ static void __iomem *__ioremap(unsigned 
 	for (pfn = phys_addr >> PAGE_SHIFT;
 				(pfn << PAGE_SHIFT) < last_addr; pfn++) {
 
-		if (page_is_ram(pfn) && pfn_valid(pfn) &&
-		    !PageReserved(pfn_to_page(pfn)))
+		int is_ram = page_is_ram(pfn);
+
+		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
 			return NULL;
+		WARN_ON_ONCE(is_ram);
 	}
 
 	switch (mode) {
--

From: Ingo Molnar
Date: Monday, March 3, 2008 - 1:47 am

i mean, x86.git#testing doesn't have the warning anymore.

	Ingo
--

From: Fabio Checconi
Date: Monday, March 3, 2008 - 3:21 am

In this way we can emit the warning even for pages that will not
be mapped, if asked for more than one page (e.g., one page triggers
the warning, one of the following triggers the return NULL condition,)
I don't think that would be useful, as the caller will notice the
error anyway.

I don't know if it's so important, but while we're at it, please
consider that, if last_addr % PAGE_SIZE == 0 the for loop exits
without checking the last pfn, that will be mapped.

--

From: Laurent Riffard
Date: Sunday, March 2, 2008 - 1:40 pm

Tested-by: Laurent Riffard <laurent.riffard@free.fr>

With this patch, the WARNING at arch/x86/mm/ioremap.c does not occur anymore.

thanks
~~
--

From: Mirco Tischler
Date: Sunday, March 2, 2008 - 4:35 pm

Yes, the patch does help for me too.

Thanks

Mirco
From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 9:10 pm

Hi Andrew,

The signals-do_signal_stop-use-signal_group_exit.patch is causing a
kernel panic while booting into the 2.6.25-rc2-mm1 kernel on x86.

There has been discussion on the patch for this panic on http://lkml.org/lkml/2008/2/16/99 

[   25.512919] BUG: unable to handle kernel paging request at 9d74e37b
[   25.514926] IP: [<c04a8fac>] proc_flush_task+0x5b/0x223
[   25.516934] Oops: 0000 [#1] SMP 
[   25.517918] last sysfs file: /sys/block/hdc/removable
[   25.517918] Modules linked in: dm_mirror dm_mod video output sbs sbshc battery ac parport_pc lp parport floppy sg serio_raw ide_cd_mod cdrom scb2_flash mtd chipreg button i2c_piix4 i2c_core pcspkr tg3 mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
[   25.517918] 
[   25.517918] Pid: 1, comm: init Not tainted (2.6.25-rc2-mm1-autotest #1)
[   25.517918] EIP: 0060:[<c04a8fac>] EFLAGS: 00010282 CPU: 2
[   25.517918] EIP is at proc_flush_task+0x5b/0x223
[   25.517918] EAX: 9d74e35b EBX: f7881ef0 ECX: f7a5ed84 EDX: a56b6b6b
[   25.517918] ESI: f74f76f8 EDI: a56b6b6b EBP: f7881f08 ESP: f7881ec0
[   25.517918]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[   25.517918] Process init (pid: 1, ti=f7881000 task=f788adf0 task.ti=f7881000)
[   25.517918] Stack: f7a5ed84 00000001 f7a5ed58 f7a5ed58 00000000 a56b6b6b f7852750 f7a5ed68 
[   25.517918]        35881ef4 00003130 f788b4a4 f788adf0 0012d6c5 00000003 f7881ee3 00000003 
[   25.517918]        f7a72df0 f7a72df0 f7881f1c c042676d 00000003 000001f5 f7a72df0 f7881f78 
[   25.517918] Call Trace:
[   25.517918]  [<c042676d>] release_task+0x19/0x2d5
[   25.517918]  [<c042701b>] do_wait+0x5f2/0x8fc
[   25.517918]  [<c041d9e7>] default_wake_function+0x0/0xd
[   25.517918]  [<c042739d>] sys_wait4+0x78/0x8e
[   25.517918]  [<c04273c6>] sys_waitpid+0x13/0x15
[   25.517918]  [<c04039ba>] sysenter_past_esp+0x5f/0x99
[   25.517918]  =======================
[   25.517918] Code: 1c 89 4d d4 89 4d c4 89 55 b8 e9 b5 01 00 00 31 ff 83 7d c4 00 74 ...
From: Andrew Morton
Date: Saturday, February 16, 2008 - 9:50 pm

hm, are you sure that signals-do_signal_stop-use-signal_group_exit.patch
causes this oops?  
--

From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 10:02 pm

Sorry, signals-do_signal_stop-use-signal_group_exit.patch is not causing the problem; I'm not sure what is causing this panic :-(

--

From: Oleg Nesterov
Date: Sunday, February 17, 2008 - 5:41 am

Kamalesh, could you send me the output of "objdump -d fs/proc/base.o" ?
and just in case, fs/proc/base.s (make fs/proc/base.s).

Oleg.

--

From: Rik van Riel
Date: Wednesday, February 20, 2008 - 12:34 pm

On Sun, 17 Feb 2008 09:40:33 +0530

I wonder if this one is related.   Also with 2.6.25-rc2-mm1 on x86_64:

BUG: unable to handle kernel paging request at 0000000000200200
IP: [<ffffffff81043d3c>] free_pid+0x35/0x90
PGD 43c00c067 PUD 43e5f1067 PMD 0 
Oops: 0002 [1] SMP 
last sysfs file: /sys/devices/pnp0/00:0b/id
CPU 7 
Modules linked in: dm_multipath qla2xxx bnx2 iTCO_wdt iTCO_vendor_support serio_raw rtc_cmos pcspkr watchdog_core scsi_transport_fc watchdog_dev i5000_edac edac_core button dcdbas joydev sg sr_mod cdrom usb_storage ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 1992, comm: S05kudzu Not tainted 2.6.25-rc2-mm1 #4
RIP: 0010:[<ffffffff81043d3c>]  [<ffffffff81043d3c>] free_pid+0x35/0x90
RSP: 0018:ffff81043c895e58  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff81043dd31440 RCX: ffff81043e5ffb08
RDX: 0000000000200200 RSI: 0000000000000046 RDI: 0000000000000000
RBP: ffff81043b9703c0 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff81043d1a R11: 0000000000000000 R12: ffff81043e5ffac0
R13: 0000000000000000 R14: 0000000000000000 R15: 00000000008cd530
FS:  00007f68f99786f0(0000) GS:ffff81043e7100c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000200200 CR3: 0000000436c1f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process S05kudzu (pid: 1992, threadinfo ffff81043c894000, task ffff81043b9acb40)
Stack:  ffff81043dd31440 ffff81043b9703c0 ffff81043c84ae40 ffffffff81035a6d
 ffff81043b9703c0 0000000000000000 00000000000007cd ffffffff810362b7
 ffff81043c895f18 ffffffff81051316 0000000000000000 00007fff01989514
Call Trace:
 [<ffffffff81035a6d>] ? release_task+0x1be/0x346
 [<ffffffff810362b7>] ? do_wait+0x6c2/0xa0e
 [<ffffffff81051316>] ? trace_hardirqs_on_caller+0xf2/0x115
 [<ffffffff8102ac72>] ? ...
From: Oleg Nesterov
Date: Wednesday, February 20, 2008 - 1:04 pm

Yes, please look at http://marc.info/?t=120309840500006

Btw. The bug in tty_io.c _can_ explain this trace, but it would be nice
to ensure we don't have other problems. Could you try this

	http://marc.info/?l=linux-kernel&m=120352655031911

patch?

(I can't understand why this happens at the boot time, and it is not
 reproducible on my side).

Oleg.

--

From: Rik van Riel
Date: Wednesday, February 20, 2008 - 3:53 pm

On Wed, 20 Feb 2008 23:04:40 +0300

Fun.  With that debugging patch applied, the oops on boot no longer
happens...

No, I have no idea why...

-- 
All Rights Reversed
--

From: Cedric Le Goater
Date: Thursday, February 28, 2008 - 6:13 am

Hello !


It also fixed an oops at boot time for me. Here's a warning I got with Oleg's patch.

Thanks,

C.

WARNING: at /home/legoater/linux/2.6.25-rc2-mm1/kernel/pid.c:213 put_pid+0x4b/0x82()
Modules linked in: tg3 sg joydev ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 1, comm: init Not tainted 2.6.25-rc2-mm1 #6

Call Trace:
 [<ffffffff8022f0a1>] warn_on_slowpath+0x58/0x85
 [<ffffffff8024b87d>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff8024b84c>] ? trace_hardirqs_on_caller+0xf2/0x116
 [<ffffffff8024b87d>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff8024b84c>] ? trace_hardirqs_on_caller+0xf2/0x116
 [<ffffffff8023fb0f>] put_pid+0x4b/0x82
 [<ffffffff802ccd6e>] ? proc_delete_inode+0x0/0x4f
 [<ffffffff802ccd90>] proc_delete_inode+0x22/0x4f
 [<ffffffff802ccd6e>] ? proc_delete_inode+0x0/0x4f
 [<ffffffff802a0826>] generic_delete_inode+0xb8/0x138
 [<ffffffff8029fd5d>] iput+0x7c/0x80
 [<ffffffff8029d835>] dentry_iput+0xa3/0xbb
 [<ffffffff8029d8ea>] d_kill+0x21/0x42
 [<ffffffff8029ebee>] dput+0x114/0x125
 [<ffffffff802cefe0>] proc_flush_task+0x125/0x28f
 [<ffffffff8024b87d>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff802312a5>] release_task+0x24/0x331
 [<ffffffff80231ca1>] do_wait+0x6ef/0xa99
 [<ffffffff80227b43>] ? default_wake_function+0x0/0xf
 [<ffffffff802320dc>] sys_wait4+0x91/0xab
 [<ffffffff8020b21b>] system_call_after_swapgs+0x7b/0x80

--

From: Alan Cox
Date: Wednesday, February 20, 2008 - 1:07 pm

On Wed, 20 Feb 2008 14:34:17 -0500

Probably - am testing some locking patches now
--

From: Kamalesh Babulal
Date: Saturday, February 16, 2008 - 10:08 pm

Hi Andrew,

The 2.6.25-rc2-mm1 kernel oopses, followed by soft lockups several times (I have pasted
only some of them), on the x86_64 machine. The machine has 4 CPUs.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000219
IP: [<ffffffff802ee99a>] security_inode_getattr+0x4/0x21
PGD 1da947067 PUD 1e1803067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu1/cpufreq/scaling_setspeed
CPU 2 
Modules linked in: auth_rpcgss exportfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq dm_mirror dm_mod video output sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy tg3 button ide_cd_mod cdrom serio_raw i2c_i801 pcspkr e752x_edac edac_core shpchp i2c_core aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode]
Pid: 3069, comm: modprobe Not tainted 2.6.25-rc2-mm1-autotest #1
RIP: 0010:[<ffffffff802ee99a>]  [<ffffffff802ee99a>] security_inode_getattr+0x4/0x21
RSP: 0018:ffff8101da9e9ea0  EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff8101e1cd7a40 RCX: 0000000000000001
RDX: ffff8101da9e9ef8 RSI: ffff8101e1cd7a40 RDI: ffff8101e5946dc0
RBP: 00000000fffffff7 R08: 0000000000000002 R09: 0000000000000002
R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
R13: ffff8101da9e9ef8 R14: ffff8101e5946dc0 R15: 000000000061a660
FS:  00007fc33bc746f0(0000) GS:ffff8101e714de40(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000219 CR3: 00000001da894000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 3069, threadinfo ffff8101da9e8000, task ffff8101e51975e0)
Stack:  ffffffff8028e55d ffff8101e7111300 00000000fffffff7 ffff8101da9e9ef8
 0000000000000003 0000000000000001 ffffffff8028e5ca 00007fff43c90120
 0000000000618e40 0000000000000000 ffffffff8028e5ec ffffffff8025b7e3
Call Trace:
 [<ffffffff8028e55d>] ...
From: Andrew Morton
Date: Saturday, February 16, 2008 - 10:24 pm

Beats me.  Looks like we somehow passed a garbage dentry* into
security_inode_getattr().  But 0x219?  That could be an offset from an
accidentally IS_ERR pointer, but sizeof(struct dentry) is only 0xa0 here,
so the pointer would have to have a value of -0x139 or less, and that's
outside the range of any sane errnos.

If it's reproducible then a bisection search would be great, please.
--

From: Kamalesh Babulal
Date: Sunday, February 17, 2008 - 12:36 am

Hi Andrew,

I tried reproducing this panic, but was unsuccessful even after four attempts.
In one of those attempts I got the following kernel panic:

BUG: unable to handle kernel paging request at 000000000508fffe
IP: [<ffff8101e5d15e44>]
PGD 1e382b067 PUD 1e38a9067 PMD 0 
Oops: 0002 [1] SMP 
last sysfs file: /sys/block/hda/removable
CPU 3 
Modules linked in: dm_mirror dm_mod video output sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy tg3 ide_cd_mod button cdrom serio_raw i2c_i801 e752x_edac shpchp edac_core i2c_core pcspkr aic79xx scsi_transport_spi sd_mod scsi_mod ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.25-rc2-mm1-autotest #1
RIP: 0010:[<ffff8101e5d15e44>]  [<ffff8101e5d15e44>]
RSP: 0018:ffff8101e71dbf08  EFLAGS: 00010282
RAX: ffff8101e408cb00 RBX: ffff81000104175f RCX: ffffffffffffffff
RDX: 0000000000000060 RSI: 7fffffffffffffff RDI: ffff8101e408cb00
RBP: ffff8101e5839680 R08: 0000000000000004 R09: 000000000000003c
R10: ffff8101e711a4c8 R11: ffff8101e71dbf10 R12: 0000000000000002
R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8101e714d640(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000508fffe CR3: 00000001e5cbf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8101e71d2000, task ffff8101e71cea70)
Stack:  ffffffff80260a4c 0000000000000001 ffffffff806800f0 000000000000000a
 ffffffff80260ada ffffffff806800e0 ffffffff80236f33 ffff8101e71d3e88
 0000000000000046 ffff8101e71dbf78 0000000000000000 0000000000000000
Call Trace:
 <IRQ>  [<ffffffff80260a4c>] __rcu_process_callbacks+0x10f/0x17a
 [<ffffffff80260ada>] rcu_process_callbacks+0x23/0x43
 [<ffffffff80236f33>] __do_softirq+0x55/0xc4
 [<ffffffff8020cfec>] call_softirq+0x1c/0x28
 [<ffffffff8020e677>] do_softirq+0x2c/0x68
 ...
From: Randy Dunlap
Date: Saturday, February 16, 2008 - 10:16 pm

ACPI is enabled, but DMI=n.

linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c: In function 'acpi_thermal_init':
linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c:1792: error: 'thermal_dmi_table' undeclared (first use in this function)
linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c:1792: error: (Each undeclared identifier is reported only once
linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c:1792: error: for each function it appears in.)
make[3]: *** [drivers/acpi/thermal.o] Error 1


---
~Randy
--

From: Andrew Morton
Date: Saturday, February 16, 2008 - 10:44 pm

Bustage in x86-configurable-dmi-scanning-code.patch.  Previously, DMI=y was
just hardwired.  Now, it becomes selectable and stuff breaks.

I guess the DMI=n version of dmi_check_system() could become a macro so we
don't emit a reference to its argument, but that might generate
unused-variable warnings elsewhere.

--

From: Thomas Petazzoni
Date: Monday, February 18, 2008 - 3:15 am

Hi,

On Sat, 16 Feb 2008 21:44:10 -0800,

Thanks for your report. The issue is that some DMI fixup tables and
callbacks are defined inside #ifdef CONFIG_DMI, some others are not. We
need to normalize that to fix the build issue in all situations.

I've thought about it, and I see two options, but I can't decide which
one is the best, so I request your opinion on that.

 1) Remove the #ifdef CONFIG_DMI around DMI fixup tables and callbacks
    definition, so that everything exists and gcc is happy. gcc is able
    to optimize out the DMI fixup table (it is not present in the binary
    when compiling with DMI=n), but gcc doesn't seem to be able to
    optimize out the DMI fixup callbacks (they are still present in the
    binary). So this would leave some unused code in the binary, which
    is not completely satisfying.

 2) Define macros such as DECLARE_DMI_FIXUP_TABLE and
    DECLARE_DMI_FIXUP_CALLBACK, which could then be used like this:

DECLARE_DMI_FIXUP_CALLBACK(set_bios_reboot, __init, d, {
	if (reboot_type != BOOT_BIOS) {
		reboot_type = BOOT_BIOS;
		printk(KERN_INFO "%s series board detected. Selecting BIOS-method for reboots.\n", d->ident);
	}
	return 0;
});

DECLARE_DMI_FIXUP_TABLE(reboot_dmi_table, __initdata, {
	{	/* Handle problems with rebooting on Dell E520's */
		.callback = set_bios_reboot,
		.ident = "Dell E520",
		.matches = {
			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
			DMI_MATCH(DMI_PRODUCT_NAME, "Dell DM061"),
		},
	}
});

     And use them everywhere, so that DMI fixup tables and callbacks
     are properly compiled out when DMI=n. Here are the macro definitions:

#ifdef CONFIG_DMI

#define DECLARE_DMI_FIXUP_TABLE(name, opts, contents...) \
	static struct dmi_system_id opts name [] = contents

#define DECLARE_DMI_FIXUP_CALLBACK(name, opts, id, contents...) \
	static int opts name(const struct dmi_system_id *id) contents

#else

#define DECLARE_DMI_FIXUP_TABLE(name, opts, contents...)

#define ...
From: Andrew Morton
Date: Monday, February 18, 2008 - 5:13 am

Option 3 would be to add more #ifdef CONFIG_DMI lines around the place.  How
ugly would that get?

--

From: Thomas Petazzoni
Date: Tuesday, February 19, 2008 - 8:55 am

On Mon, 18 Feb 2008 04:13:40 -0800,

Like the attached patch. #ifdef CONFIG_DMI everywhere :-(

Sincerely,

Thomas

---

Turn CONFIG_DMI into a selectable option if EMBEDDED is defined, in
order to be able to remove the DMI table scanning code if it's not
needed, and then reduce the kernel code size.

The DMI code users are modified, so that they either depend on
CONFIG_DMI (for the drivers who really need DMI to work) or their
DMI-related code is enclosed in #ifdef CONFIG_DMI.

With CONFIG_DMI (i.e before) :

   text    data     bss     dec     hex filename
1076076  128656   98304 1303036  13e1fc vmlinux

Without CONFIG_DMI (i.e after) :

   text    data     bss     dec     hex filename
1068092  126308   98304 1292704  13b9a0 vmlinux

Result:

   text    data     bss     dec     hex filename
  -7984   -2348       0  -10332   -285c vmlinux

The new option appears in "Processor type and features", only when
CONFIG_EMBEDDED is defined.

This patch is part of the Linux Tiny project, and is based on previous
work done by Matt Mackall <mpm@selenic.com>.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

---
 arch/x86/Kconfig                           |   13 ++++++++++---
 arch/x86/kernel/acpi/boot.c                |    4 ++--
 arch/x86/kernel/acpi/sleep_32.c            |    2 ++
 arch/x86/kernel/apm_32.c                   |    2 ++
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c |    4 ++--
 arch/x86/kernel/cpu/cpufreq/powernow-k7.c  |    3 ++-
 arch/x86/kernel/io_delay.c                 |    2 ++
 arch/x86/kernel/reboot.c                   |    2 ++
 arch/x86/kernel/tsc_32.c                   |    2 ++
 arch/x86/mach-generic/bigsmp.c             |    3 ++-
 arch/x86/pci/acpi.c                        |    2 ++
 arch/x86/pci/common.c                      |    2 ++
 arch/x86/pci/fixup.c                       |    5 ++++-
 arch/x86/pci/irq.c                         |    2 ++
 drivers/acpi/sleep/main.c                  |    2 ++
 ...
From: Randy Dunlap
Date: Tuesday, February 19, 2008 - 10:41 am

Does this patch apply to -mm?  Seems like no.

After converting it from mime(?) to ASCII and fixing one #if
(change "and" to "&&") & fixing patch rejects, it does build cleanly.


---
~Randy
--

From: Thomas Petazzoni
Date: Tuesday, February 19, 2008 - 3:00 pm

On Tue, 19 Feb 2008 09:41:47 -0800,


Probably due to my PGP-MIME signature. Will try to remember that


The rejects are probably due to the patch being applied to -mm. It
applies fine on -rc here.

Any opinion about whether the patch is clean ? Worth it ?

Thanks for testing the patch,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Free Embedded Linux Training Materials
on http://free-electrons.com/training
(More than 1500 pages!)
--

From: Randy Dunlap
Date: Tuesday, February 19, 2008 - 3:05 pm

It seems reasonable to me as long as the option depends on EMBEDDED,
as it does.

-- 
~Randy
--

From: Andrew Morton
Date: Tuesday, February 19, 2008 - 4:21 pm

ug, sorry, if I'd realised it was like this I'd have said "don't bother". 
Apart from the obvious problem, this means that people will keep breaking
CONFIG_DMI=n all the time, because they will forget the ifdefs, and the
number of people who test with CONFIG_DMI=n will be small.


--

From: Thomas Petazzoni
Date: Wednesday, February 20, 2008 - 12:21 am

On Tue, 19 Feb 2008 15:21:29 -0800,

Yes, #ifdef CONFIG_DMI is not very comfortable. That's why I proposed
things such as DECLARE_DMI_FIXUP_TABLE(), because it would force people
to use these macros, which would then work correctly depending on
DMI=y/n. However, there's still the issue of driver_data that I
mentioned in my earlier post.

What should I do? Option 1? Option 2? Give up on the patch?

Thanks for your comments,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Free Embedded Linux Training Materials
on http://free-electrons.com/training
(More than 1500 pages!)
From: Andrew Morton
Date: Wednesday, February 20, 2008 - 2:55 am

Option 1 would be best, I think:

 1) Remove the #ifdef CONFIG_DMI around DMI fixup tables and callbacks
    definition, so that everything exists and gcc is happy. gcc is able
    to optimize out the DMI fixup table (it is not present in the binary
    when compiling with DMI=n), but gcc doesn't seem to be able to
    optimize out the DMI fixup callbacks (they are still present in the
    binary). So this would leave some unused code in the binary, which
    is not completely satisfying.

gcc _should_ be able to remove the callbacks as long as they are static and
have no references.  If even the latest gcc versions are still including
unreferenced static functions in the final vmlinux then let's get gcc fixed?
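
A minimal user-space sketch of what Option 1 relies on. Every name in it
(DEMO_DMI, struct demo_dmi_id, demo_dmi_check, demo_quirk) is a hypothetical
stand-in, not the real DMI API. The table and callback are defined without
any #ifdef; when the lookup helper collapses to "return 0", nothing refers to
them any more, which is the situation where the compiler is allowed to drop
both.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_DMI: build with -DDEMO_DMI=1 to enable. */
#ifndef DEMO_DMI
#define DEMO_DMI 0
#endif

struct demo_dmi_id {
	int (*callback)(const struct demo_dmi_id *id);
	const char *ident;
};

#if DEMO_DMI
/* Enabled: walk the table and run the callback of each matching entry. */
static inline int demo_dmi_check(const struct demo_dmi_id *table,
				 const char *machine)
{
	int hits = 0;

	for (; table->ident; table++) {
		if (strcmp(table->ident, machine) == 0) {
			hits++;
			if (table->callback && table->callback(table))
				break;
		}
	}
	return hits;
}
#else
/* Disabled: same signature, so callers and tables need no #ifdef. */
static inline int demo_dmi_check(const struct demo_dmi_id *table,
				 const char *machine)
{
	(void)table;
	(void)machine;
	return 0;
}
#endif

static int quirk_hits;

/* Static callback, reachable only through the table below. */
static int demo_quirk(const struct demo_dmi_id *id)
{
	(void)id;
	quirk_hits++;
	return 0;
}

/* Always defined; unreferenced once demo_dmi_check() is the stub. */
static const struct demo_dmi_id demo_table[] = {
	{ .callback = demo_quirk, .ident = "Example Board" },
	{ .callback = NULL, .ident = NULL },
};

int demo_apply_quirks(const char *machine)
{
	return demo_dmi_check(demo_table, machine);
}
```

Whether a particular gcc release really discards demo_quirk() in the
disabled build can be checked by running nm on the object file; the sketch
makes no claim about any specific compiler version.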

--

From: Randy Dunlap
Date: Saturday, February 16, 2008 - 10:32 pm

When SMP=n, x86_64 build gets:

arch/x86/kernel/built-in.o: In function `acpi_save_state_mem':
(.text+0xfd7f): undefined reference to `setup_trampoline'
make[1]: *** [.tmp_vmlinux1] Error 1

---
~Randy
--

From: Andrew Morton
Date: Saturday, February 16, 2008 - 10:46 pm

Thanks.  Say hello to Pavel.
--

From: Pavel Machek
Date: Sunday, February 17, 2008 - 12:52 pm

Sorry, I was in mountains, with electricity 5 hours a day... I believe
this was solved already?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Rafael J. Wysocki
Date: Sunday, February 17, 2008 - 1:12 pm

I've sent a patch, it's being tested.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Sunday, February 17, 2008 - 4:23 am

Please try the patch at http://lkml.org/lkml/2008/2/16/301 .

Thanks,
Rafael

--

From: Randy Dunlap
Date: Sunday, February 17, 2008 - 10:54 am

Yes, fixed.  Thanks.

-- 
~Randy
--

From: Joel Becker
Date: Saturday, February 16, 2008 - 11:25 pm

Andrew,
	This patch, by introducing sizeof(long) in the BITS_TO_LONGS
math, changes BITS_TO_LONGS from an int to an unsigned long.  We noticed
because this printk in fs/ocfs2/dlm/dlmdomain.c:

    mlog(ML_ERROR,
         "map_size %u != BITS_TO_LONGS(O2NM_MAX_NODES) %u\n",
         map_size, BITS_TO_LONGS(O2NM_MAX_NODES));

now gives this warning:

    fs/ocfs2/dlm/dlmdomain.c:938: warning: format '%u' expects type
    'unsigned int', but argument 7 has type 'long unsigned int'

We can tweak the printk once the patch goes to Linus, no worries.  I
just wanted to send a heads up in case the size change affects anything
else.
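
A small user-space sketch of the type change described above. The two macros
are illustrative stand-ins with DEMO_ prefixes, not copies of the kernel
headers: the first does its arithmetic purely in int, the second divides by
a sizeof() expression and therefore yields size_t, which is
'long unsigned int' on x86_64. The %zu format in demo_format_longs() is one
option in standard C for printing such a value; the thread does not say how
the ocfs2 printk was eventually adjusted.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* int arithmetic throughout, like a literal 32 or 64 would give. */
#define DEMO_BITS_PER_LONG ((int)(CHAR_BIT * sizeof(long)))
#define DEMO_BITS_TO_LONGS_OLD(bits) \
	(((bits) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* sizeof() yields size_t, so the whole expression becomes size_t. */
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define DEMO_BITS_TO_LONGS_NEW(nr) \
	DEMO_DIV_ROUND_UP(nr, CHAR_BIT * sizeof(long))

/* Formats the count with %zu, which matches size_t on every ABI. */
int demo_format_longs(char *buf, size_t len, int nbits)
{
	return snprintf(buf, len, "%zu", DEMO_BITS_TO_LONGS_NEW(nbits));
}
```

On a 32-bit build size_t is typically 'unsigned int', which would explain
why the warning only showed up on x86_64.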

Joel

-- 

 There are more things in heaven and earth, Horatio,
 Than are dreamt of in your philosophy.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
--

From: Joel Becker
Date: Saturday, February 16, 2008 - 11:32 pm

When building on x86_64.  I forgot that bit :-)

Joel

-- 

Life's Little Instruction Book #444

	"Never underestimate the power of a kind word or deed."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
--

From: Andrew Morton
Date: Saturday, February 16, 2008 - 11:51 pm

Thanks.  That was not intentional and I'll drop the patch.
--

From: Alexey Dobriyan
Date: Sunday, February 17, 2008 - 3:50 am

Guys, create_proc_entry() is slightly racy in case of modular code and
proc_create() was invented to fix it. Eventually all create_proc_entry()
users will be converted to proc_create(), so please do it for new code.
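
A user-space model of the kind of window being described, under the
assumption that the race sits between an entry becoming visible and its
operations being filled in. The demo_* types and functions are hypothetical
and do not mirror the real procfs structures; the only point is the ordering
of "publish" versus "set ops".

```c
#include <stddef.h>

/* Hypothetical model of a registry entry; not the real proc_dir_entry. */
struct demo_ops {
	int (*open)(void);
};

struct demo_entry {
	const char *name;
	const struct demo_ops *ops;
};

#define DEMO_MAX 8
static struct demo_entry *demo_visible[DEMO_MAX];
static int demo_count;

/* Publishing makes the entry reachable by lookups. */
static void demo_publish(struct demo_entry *e)
{
	if (demo_count < DEMO_MAX)
		demo_visible[demo_count++] = e;
}

/* Old style: publish first, the caller assigns ops afterwards. */
struct demo_entry *demo_create_entry(struct demo_entry *e, const char *name)
{
	e->name = name;
	e->ops = NULL;
	demo_publish(e);	/* visible with ops == NULL from here on */
	return e;
}

/* New style: ops is an argument and is set before publishing. */
struct demo_entry *demo_create(struct demo_entry *e, const char *name,
			       const struct demo_ops *ops)
{
	e->name = name;
	e->ops = ops;
	demo_publish(e);
	return e;
}

/* What a concurrent opener would observe at this instant. */
int demo_entry_ready(const struct demo_entry *e)
{
	return e->ops != NULL && e->ops->open != NULL;
}

static int demo_open(void)
{
	return 0;
}

const struct demo_ops demo_file_ops = { .open = demo_open };
```

With the old style, anything that looks the entry up between
demo_create_entry() and the later "e->ops = ..." assignment sees an entry
without operations; the new style leaves no such gap.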
--

From: Andrew Morton
Date: Monday, February 18, 2008 - 6:01 am

The first two patches are -mm-only debug patches - they won't be going into
mainline.

I'll make a note that
cciss-procfs-updates-to-display-info-about-many-volumes.patch needs
updating, thanks.

--

From: Daniel Walker
Date: Monday, February 18, 2008 - 9:13 am

profile-likely-unlikely-macros isn't new code, but I can still convert it
to proc_create() if you like.

Daniel
--

From: Miller, Mike (OS Dev)
Date: Tuesday, February 19, 2008 - 3:25 pm

Why am I getting "implicit declaration of function 'proc_create'" when I try to use that function? I have kernel version 2.6.24, gcc version 4.1.1 20070105 (Red Hat 4.1.1-51). Am I missing a header file?

-- mikem
--

From: Andrew Morton
Date: Tuesday, February 19, 2008 - 4:59 pm

2.6.24 is 47MB of diff ago.  Please always develop and test against
development kernels!
--

From: Randy Dunlap
Date: Sunday, February 17, 2008 - 5:14 pm

CIFS has some build problems:

linux-2.6.25-rc2-mm1/fs/cifs/cifs_debug.c:922: error: static declaration of 'cifs_proc_init' follows non-static declaration
linux-2.6.25-rc2-mm1/fs/cifs/cifsproto.h:112: error: previous declaration of 'cifs_proc_init' was here
linux-2.6.25-rc2-mm1/fs/cifs/cifs_debug.c:926: error: static declaration of 'cifs_proc_clean' follows non-static declaration
linux-2.6.25-rc2-mm1/fs/cifs/cifsproto.h:113: error: previous declaration of 'cifs_proc_clean' was here
make[3]: *** [fs/cifs/cifs_debug.o] Error 1

.config is attached.

---
~Randy
From: Steve French
Date: Sunday, February 17, 2008 - 9:10 pm

Thanks for spotting this - it would only happen if CONFIG_PROC_FS is disabled.
I have fixed it in the cifs-2.6.git tree so should be fine next time akpm pulls.
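
The actual cifs-2.6.git change is not shown in this thread. As a hedged
illustration of how this class of error is usually avoided, the sketch below
keeps the prototype and the static inline stub in mutually exclusive
branches, so the compiler never sees a non-static declaration followed by a
static one. DEMO_PROC_FS and the demo_* functions are hypothetical names.

```c
/* Hypothetical switch standing in for CONFIG_PROC_FS. */
#ifndef DEMO_PROC_FS
#define DEMO_PROC_FS 0
#endif

/* --- what would live in the shared header --- */
#if DEMO_PROC_FS
void demo_proc_init(void);
void demo_proc_clean(void);
#else
/* Stubs are declared static inline here and nowhere else. */
static inline void demo_proc_init(void) { }
static inline void demo_proc_clean(void) { }
#endif

/* --- what would live in the .c file --- */
#if DEMO_PROC_FS
static int demo_proc_users;

void demo_proc_init(void)
{
	demo_proc_users++;
}

void demo_proc_clean(void)
{
	demo_proc_users--;
}
#endif

/* Callers need no #ifdef of their own. */
int demo_module_start(void)
{
	demo_proc_init();
	demo_proc_clean();
	return 0;
}
```

The error above appears when the header branch is unconditional while the
.c file still supplies static stubs for the disabled case.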




-- 
Thanks,

Steve
--

From: Randy Dunlap
Date: Sunday, February 17, 2008 - 5:16 pm

Building i386 kernel on x86_64, I see a build error in linking:

kernel/built-in.o: In function `jiffies_64_to_usecs':
(.text+0xeaed): undefined reference to `__udivdi3'
make[1]: *** [.tmp_vmlinux1] Error 1

.config attached.

---
~Randy
From: Andrew Morton
Date: Monday, February 18, 2008 - 2:34 am

Thanks - that'll be provide-u64-version-of-jiffies_to_usecs-in-kernel-tsacctc.patch
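
Background for the error: on 32-bit x86, gcc turns a plain 64-bit division
into a call to the libgcc routine __udivdi3, which the kernel does not link
against; kernel code normally goes through do_div() instead. The sketch
below is a user-space illustration of the shift-and-subtract technique such
a helper can use. The demo_* names are hypothetical, and this is not the
content of the patch Andrew names.

```c
#include <stdint.h>

/*
 * Divide *n by base in place and return the remainder, using only
 * shifts, compares, subtractions and one 32-bit division.
 * base must be non-zero.
 */
uint32_t demo_div64_32(uint64_t *n, uint32_t base)
{
	uint64_t rem = *n;
	uint64_t b = base;
	uint64_t res = 0;
	uint64_t d = 1;
	uint32_t high = (uint32_t)(rem >> 32);

	/* Reduce the upper 32 bits first so the loop below stays short. */
	if (high >= base) {
		high /= base;
		res = (uint64_t)high << 32;
		rem -= (uint64_t)(high * base) << 32;
	}

	/* Scale the divisor up until it reaches the remainder. */
	while ((int64_t)b > 0 && b < rem) {
		b += b;
		d += d;
	}

	/* Classic shift-and-subtract long division. */
	do {
		if (rem >= b) {
			rem -= b;
			res += d;
		}
		b >>= 1;
		d >>= 1;
	} while (d);

	*n = res;
	return (uint32_t)rem;
}

/* Hypothetical helper in the spirit of a ticks-to-microseconds conversion. */
uint64_t demo_ticks_to_usecs(uint64_t ticks, uint32_t hz)
{
	uint64_t usecs = ticks * 1000000ULL;

	demo_div64_32(&usecs, hz);
	return usecs;
}
```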
--

From: Randy Dunlap
Date: Sunday, February 17, 2008 - 10:17 pm

It's possible to config a specific CPU and also enable Intel MCE checks
and AMD MCE checks, ending with this:

arch/x86/kernel/built-in.o: In function `smp_thermal_interrupt_init':
mce_amd_64.c:(.text+0xd7ea): undefined reference to `num_k8_northbridges'
mce_amd_64.c:(.text+0xd800): undefined reference to `k8_northbridges'
mce_amd_64.c:(.text+0xd863): undefined reference to `k8_northbridges'
mce_amd_64.c:(.text+0xd888): undefined reference to `k8_northbridges'
mce_amd_64.c:(.text+0xd8b1): undefined reference to `num_k8_northbridges'
mce_amd_64.c:(.text+0xd8d8): undefined reference to `num_k8_northbridges'
mce_amd_64.c:(.text+0xd8f3): undefined reference to `k8_northbridges'
mce_amd_64.c:(.text+0xd917): undefined reference to `num_k8_northbridges'
make[1]: *** [.tmp_vmlinux1] Error 1

.config file is attached.

---
~Randy
From: Yinghai Lu
Date: Sunday, February 17, 2008 - 11:03 pm

same config no problem with mainline.

YH
--

From: Adrian Bunk
Date: Monday, February 18, 2008 - 6:31 am

That's x86-amd-thermal-interrupt-support.patch failing with 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--

From: Randy Dunlap
Date: Monday, February 18, 2008 - 10:13 am

cciss driver has a bad macro definition:

#else /* no CONFIG_CISS_SCSI_TAPE */

/* If no tape support, then these become defined out of existence */

#define cciss_scsi_setup(cntl_num)
#define cciss_unregister_scsi(ctlr)
#define cciss_register_scsi(ctlr)
#define cciss_seq_tape_report(struct seq_file *seq, int ctlr)

#endif /* CONFIG_CISS_SCSI_TAPE */

which causes this error:

In file included from /local/linsrc/linux-2.6.25-rc2-mm1/drivers/block/cciss.c:231:
/local/linsrc/linux-2.6.25-rc2-mm1/drivers/block/cciss_scsi.c:1498:38: error: macro parameters must be comma-separated
make[3]: *** [drivers/block/cciss.o] Error 1
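
For illustration, two ways a no-op stub can be written so that the
preprocessor accepts it. A function-like macro takes bare identifiers as
parameters, so types such as "struct seq_file *" cannot appear in its
parameter list; a static inline function can carry them. The demo_* names
are hypothetical and this is not the fix that was eventually merged.

```c
/* Hypothetical stand-in; not the real seq_file definition. */
struct demo_seq {
	int lines;
};

/* Variant A: macro stub. Parameters are bare identifiers, no types. */
#define demo_tape_report_macro(seq, ctlr) do { } while (0)

/* Variant B: inline stub. Types are allowed and arguments get checked. */
static inline void demo_tape_report_inline(struct demo_seq *seq, int ctlr)
{
	(void)seq;
	(void)ctlr;
}

/* Either stub can be called like the real function would be. */
int demo_show(struct demo_seq *seq, int ctlr)
{
	demo_tape_report_macro(seq, ctlr);
	demo_tape_report_inline(seq, ctlr);
	return seq->lines;
}
```

Variant B has the side benefit that callers are still type-checked when the
feature is configured out.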

---
~Randy
--

From: Kevin Winchester
Date: Monday, February 18, 2008 - 5:08 pm

I don't think I've seen anyone else report this, but if I'm wrong, I'm
sure someone will point me to the thread.  This is during boot up, and
doesn't seem to have any effect on the running system, that I can tell.

[    0.090840] ------------[ cut here ]------------
[    0.090920] WARNING: at kernel/lockdep.c:2677 check_flags+0x8d/0x12d()
[    0.090986] Pid: 1, comm: swapper Not tainted 2.6.25-rc2-mm1 #49
[    0.090986]  [<c011b92d>] warn_on_slowpath+0x41/0x72
[    0.090986]  [<c0136ff1>] ? __lock_acquire+0xb99/0xbb5
[    0.090986]  [<c0140632>] ? ftrace_record_ip+0x11e/0x17a
[    0.090986]  [<c0102bd0>] ? mcount_call+0x5/0x9
[    0.090986]  [<c01344f0>] ? check_chain_key+0xe/0x16a
[    0.090986]  [<c0140632>] ? ftrace_record_ip+0x11e/0x17a
[    0.090986]  [<c01dd1a8>] ? debug_locks_off+0x8/0x3c
[    0.090986]  [<c0102bd0>] ? mcount_call+0x5/0x9
[    0.090986]  [<c01182c4>] ? sub_preempt_count+0xa/0xb0
[    0.090986]  [<c03453bd>] ? _spin_unlock_irqrestore+0x47/0x5d
[    0.090986]  [<c0140632>] ? ftrace_record_ip+0x11e/0x17a
[    0.090986]  [<c0102bd0>] ? mcount_call+0x5/0x9
[    0.090986]  [<c0134442>] check_flags+0x8d/0x12d
[    0.090986]  [<c0137043>] lock_acquire+0x36/0x82
[    0.090986]  [<c03440c2>] down_write+0x2d/0x48
[    0.090986]  [<c015e91a>] ? kmem_cache_create+0x21/0x1a7
[    0.090986]  [<c015e91a>] kmem_cache_create+0x21/0x1a7
[    0.090986]  [<c0449642>] filelock_init+0x23/0x2c
[    0.090986]  [<c016d908>] ? init_once+0x0/0x11
[    0.090986]  [<c043d697>] kernel_init+0xb6/0x203
[    0.090986]  [<c0102dae>] ? restore_nocheck_notrace+0x0/0xe
[    0.090986]  [<c043d5e1>] ? kernel_init+0x0/0x203
[    0.090986]  [<c043d5e1>] ? kernel_init+0x0/0x203
[    0.090986]  [<c01038d3>] kernel_thread_helper+0x7/0x10
[    0.090986]  =======================
[    0.090986] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.090986] possible reason: unannotated irqs-on.
[    0.090986] irq event stamp: 1404
[    0.090986] hardirqs last  enabled at (1403): ...
From: Andrew Morton
Date: Monday, February 18, 2008 - 5:15 pm

Looks like an ftrace-vs-lockdep problem.
--

From: Steven Rostedt
Date: Monday, February 18, 2008 - 5:22 pm

Is there a .config around to look at?

Thanks,

-- Steve

--

From: Kevin Winchester
Date: Monday, February 18, 2008 - 6:15 pm

Sorry, here it is.

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc2-mm1
# Sun Feb 17 13:12:53 2008
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_RT_GROUP_SCHED is not ...
From: Tilman Schmidt
Date: Wednesday, February 20, 2008 - 2:14 pm

Ok, so here's my story on 2.6.25-rc2-mm1:

Built fine on my Pentium D in 32 bit mode, booted too, although
complaining once already while unpacking the initramfs:

<0>[    0.069176] BUG: spinlock bad magic on CPU#0, swapper/0
<0>[    0.069324]  lock: c2c19480, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
<4>[    0.069559] Pid: 0, comm: swapper Not tainted 2.6.25-rc2-mm1-testing #1
<4>[    0.069710]  [spin_bug+129/140] spin_bug+0x81/0x8c
<4>[    0.069907]  [_raw_spin_unlock+30/118] _raw_spin_unlock+0x1e/0x76
<4>[    0.069997]  [_spin_unlock+34/65] _spin_unlock+0x22/0x41
<4>[    0.070194]  [mnt_want_write+103/138] mnt_want_write+0x67/0x8a
<4>[    0.070390]  [sys_mkdirat+139/219] sys_mkdirat+0x8b/0xdb
<4>[    0.070584]  [clean_path+27/79] ? clean_path+0x1b/0x4f
<4>[    0.070829]  [trace_hardirqs_on+11/13] ? trace_hardirqs_on+0xb/0xd
<4>[    0.071185]  [sys_mkdir+21/23] sys_mkdir+0x15/0x17
<4>[    0.071378]  [do_name+279/440] do_name+0x117/0x1b8
<4>[    0.071570]  [write_buffer+34/49] write_buffer+0x22/0x31
<4>[    0.071763]  [flush_window+105/184] flush_window+0x69/0xb8
<4>[    0.071996]  [unpack_to_rootfs+1585/2238] unpack_to_rootfs+0x631/0x8be
<4>[    0.072192]  [trace_hardirqs_on_caller+248/301] ? trace_hardirqs_on_caller+0xf8/0x12d
<4>[    0.072440]  [restore_nocheck_notrace+0/16] ? restore_nocheck_notrace+0x0/0x10
<4>[    0.072689]  [populate_rootfs+37/270] populate_rootfs+0x25/0x10e
<4>[    0.072886]  [alternative_instructions+344/349] ? alternative_instructions+0x158/0x15d
<4>[    0.073139]  [start_kernel+840/858] start_kernel+0x348/0x35a
<4>[    0.073335]  =======================

Still, X came up fine, I could log in (Gnome feeling subjectively
a bit sluggish), call up a web page from the Internet in Firefox,
and start perusing ...
From: Patrick McHardy
Date: Wednesday, February 20, 2008 - 2:50 pm

I guess the cause of this is a combination of preemptible
RCU and conntrack using RCU since 2.6.25-rc. Using
NF_CT_STAT_INC_ATOMIC should fix it, but I'd prefer
to have a fix that doesn't increase overhead when regular
RCU is used.

I'll see if I can find a better way to fix this tomorrow.

--

From: Patrick McHardy
Date: Thursday, February 21, 2008 - 4:28 am

Could you test whether this patch fixes the netfilter
warnings please?

From: Stephen Hemminger
Date: Thursday, February 21, 2008 - 9:32 am

On Thu, 21 Feb 2008 12:28:50 +0100

Use rcu_read_lock instead. local_bh_disable() won't work with some of the other forms
of RCU alternatives.

--

From: Patrick McHardy
Date: Thursday, February 21, 2008 - 9:34 am

The caller already calls rcu_read_lock(). This is for the per-cpu
statistics.

--

From: Tilman Schmidt
Date: Thursday, February 21, 2008 - 5:40 pm

Yes, it does; and the system also survives substantially longer.
(IOW, it hasn't crashed on me so far.)

Thanks,
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Tilman Schmidt
Date: Thursday, February 21, 2008 - 5:52 pm

Which of course it did the second after I sent off that mail. :-(
No message at all this time at the time of the crash, even though
I had "tail -f /var/log/messages" running in an ssh session.

So the nf_conntrack BUG is fixed, but the crash (and of course the
swapper "spinlock bad magic" BUG) persists.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Paul E. McKenney
Date: Friday, February 22, 2008 - 10:09 am

Do you have CONFIG_DEBUG_PREEMPT set?  That would help find any other
bugs similar to nf_conntrack.

							Thanx, Paul
--

From: Tilman Schmidt
Date: Monday, February 25, 2008 - 1:54 am

CONFIG_DEBUG_PREEMPT=y was set but didn't produce anything.
Or perhaps it did and the message just didn't make it to the disk.
Time to set up a test with netconsole, I guess.

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Tilman Schmidt
Date: Wednesday, February 27, 2008 - 9:37 am

Bad news: With 2.6.25-rc3, that bug has made it into mainline.
Good news: Your patch fixes it there, too.

So I suggest you forward it there as soon as possible.

Thanks,
Tilman

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Patrick McHardy
Date: Wednesday, February 27, 2008 - 9:47 am

Already done, should hit upstream soon.

--

From: Andrew Morton
Date: Thursday, February 21, 2008 - 5:38 am

(net-related cc's removed)

This looks like a startup ordering bug in mnt_want_write().
--

From: Christoph Hellwig
Date: Thursday, February 21, 2008 - 9:46 am

Do you have CONFIG_ACPI_CUSTOM_DSDT_INITRD set?
--

From: Tilman Schmidt
Date: Thursday, February 21, 2008 - 5:10 pm

Negative.

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Dave Hansen
Date: Thursday, February 21, 2008 - 12:36 pm

Let me look into it a bit.  Although, it does seem that this stuff is
just calling into the filesystem code too early.  The mnt_writers[]
spinlocks are init'd with a:

	fs_initcall(init_mnt_writers);

and populate_rootfs() is supposed to happen in a rootfs_initcall() so
I'm a bit confused how it happened in this order.
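
A toy model of the ordering in question, not kernel code. It assumes only
what is stated above (the lock is set up at the fs level, populate_rootfs()
is registered one level later) plus what the trace earlier in the thread
shows, namely populate_rootfs() being reached directly from start_kernel().
The level numbers, the magic value and all demo_* names are inventions of
this sketch.

```c
#include <stddef.h>

/* Level values are for this sketch only; lower runs first. */
enum demo_level { DEMO_FS = 5, DEMO_ROOTFS = 6, DEMO_DEVICE = 7 };

#define DEMO_MAGIC 0xdead4eadu

static unsigned int demo_lock_magic;	/* 0 until the lock is initialised */
static int demo_bad_magic_seen;

static int demo_init_mnt_writers(void)
{
	demo_lock_magic = DEMO_MAGIC;
	return 0;
}

static int demo_populate_rootfs(void)
{
	/* Using the lock before its init call shows up as "bad magic". */
	if (demo_lock_magic != DEMO_MAGIC)
		demo_bad_magic_seen++;
	return 0;
}

struct demo_initcall {
	enum demo_level level;
	int (*fn)(void);
};

/* Registration order does not matter, only the level does. */
static const struct demo_initcall demo_calls[] = {
	{ DEMO_ROOTFS, demo_populate_rootfs },
	{ DEMO_FS, demo_init_mnt_writers },
};

/* Run every registered call, lowest level first. */
void demo_do_initcalls(void)
{
	size_t n = sizeof(demo_calls) / sizeof(demo_calls[0]);

	for (int level = DEMO_FS; level <= DEMO_DEVICE; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)demo_calls[i].level == level)
				demo_calls[i].fn();
}

/* Models a direct early call that bypasses the level ordering. */
void demo_early_boot(void)
{
	demo_populate_rootfs();
	demo_do_initcalls();
}
```

Run through demo_do_initcalls() alone, the model never sees a bad magic;
the extra direct call in demo_early_boot() is what trips it, which is one
possible reading of the trace and not a confirmed diagnosis.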

-- Dave

--
