ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc2/2.6.25-rc2-mm1/ - git-xfs is dropped due to git conflicts - git-x86 is dropped due to too many changes to non-x86 code - git-perfmon remains dropped due to rejects - git-kgdb remains dropped due to rejects - Added the slab/slub tree as git-slub.patch (Christoph Lameter) Boilerplate: - See the `hot-fixes' directory for any important updates to this patchset. - To fetch an -mm tree using git, use (for example) git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1 git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1 - -mm kernel commit activity can be reviewed by subscribing to the mm-commits mailing list. echo "subscribe mm-commits" | mail majordomo@vger.kernel.org - If you hit a bug in -mm and it is not obvious which patch caused it, it is most valuable if you can perform a bisection search to identify which patch introduced the bug. Instructions for this process are at http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt But beware that this process takes some time (around ten rebuilds and reboots), so consider reporting the bug first and if we cannot immediately identify the faulty patch, then perform the bisection search. - When reporting bugs, please try to Cc: the relevant maintainer and mailing list on any email. - When reporting bugs in this kernel via email, please also rewrite the email Subject: in some manner to reflect the nature of the bug. Some developers filter by Subject: when looking for messages to read. - Occasional snapshots of the -mm lineup are uploaded to ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on the mm-commits list. These probably are at least compilable. - More-than-daily -mm snapshots may be found at http://userweb.kernel.org/~akpm/mmotm/. These are almost certainly not compileable. Changes ...
arch/x86/kernel/built-in.o: In function `amd_smp_thermal_interrupt': (.text+0xe03b): undefined reference to `mce_log_therm_throt_event' arch/x86/kernel/built-in.o: In function `acpi_save_state_mem': (.text+0x12239): undefined reference to `setup_trampoline' # # Automatically generated make config: don't edit # Linux kernel version: 2.6.25-rc2-mm1 # Sat Feb 16 11:32:49 2008 # CONFIG_64BIT=y # CONFIG_X86_32 is not set CONFIG_X86_64=y CONFIG_X86=y # CONFIG_GENERIC_LOCKBREAK is not set CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y # CONFIG_QUICKLIST is not set CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y # CONFIG_GENERIC_GPIO is not set CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_RWSEM_GENERIC_SPINLOCK=y # CONFIG_RWSEM_XCHGADD_ALGORITHM is not set # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y CONFIG_ZONE_DMA32=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_AUDIT_ARCH=y CONFIG_ARCH_SUPPORTS_AOUT=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y # CONFIG_KTIME_SCALAR is not set CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y # CONFIG_TASKSTATS is not set # CONFIG_AUDIT ...
ho hum, thanks. I think I'll drop x86-amd-thermal-interrupt-support.patch. I don't think it's the final version anyway. --
Ok, I had to revert x86-remove-pt_regs-arg-from-smp_thermal_interrupt before x86-amd-thermal-interrupt-support. Second error vanished when I reverted "suspend: wakeup code in C". --
This one is easily fixed by the appended patch (whether it works is a separate It will compile if you set CONFIG_SMP. Working on a fix. Thanks, Rafael --- arch/x86/kernel/cpu/mcheck/mce_64.c | 4 ++-- arch/x86/kernel/cpu/mcheck/mce_thermal.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) Index: linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_64.c =================================================================== --- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/cpu/mcheck/mce_64.c +++ linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_64.c @@ -317,7 +317,7 @@ void do_machine_check(struct pt_regs * r atomic_dec(&mce_entry); } -#ifdef CONFIG_X86_MCE_INTEL +#ifdef CONFIG_X86_MCE /*** * mce_log_therm_throt_event - Logs the thermal throttling event to mcelog * @cpu: The CPU on which the event occurred. @@ -342,7 +342,7 @@ void mce_log_therm_throt_event(unsigned rdtscll(m.tsc); mce_log(&m); } -#endif /* CONFIG_X86_MCE_INTEL */ +#endif /* CONFIG_X86_MCE */ /* * Periodic polling timer for "silent" machine check errors. If the Index: linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_thermal.h =================================================================== --- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/cpu/mcheck/mce_thermal.h +++ linux-2.6.25-rc2-mm1/arch/x86/kernel/cpu/mcheck/mce_thermal.h @@ -4,5 +4,5 @@ typedef void (*smp_thermal_interrupt_callback_t)(void); extern smp_thermal_interrupt_callback_t smp_thermal_interrupt; -void mce_log_therm_throt_event(unsigned int cpu, __u64 status); +extern void mce_log_therm_throt_event(unsigned int cpu, __u64 status); --
The appended patch should fix the second error.
Thanks,
Rafael
---
On x86-64 the CPU trampoline code is now used while waking up from ACPI
suspend to RAM. For this reason, make it depend on
(64BIT && ACPI_SLEEP) as well as on SMP, move the relevant declarations
to a separate header and move the definition of setup_trampoline() from
smpboot_64.c .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
arch/x86/Kconfig | 2 +-
arch/x86/kernel/acpi/sleep.c | 1 -
arch/x86/kernel/acpi/sleep.h | 3 ++-
arch/x86/kernel/e820_64.c | 5 +++--
arch/x86/kernel/setup_64.c | 16 ++++++++++++++++
arch/x86/kernel/smpboot_64.c | 23 ++---------------------
include/asm-x86/smp_64.h | 2 --
include/asm-x86/trampoline.h | 18 ++++++++++++++++++
8 files changed, 42 insertions(+), 28 deletions(-)
Index: linux-2.6.25-rc2-mm1/arch/x86/Kconfig
===================================================================
--- linux-2.6.25-rc2-mm1.orig/arch/x86/Kconfig
+++ linux-2.6.25-rc2-mm1/arch/x86/Kconfig
@@ -180,7 +180,7 @@ config X86_BIOS_REBOOT
config X86_TRAMPOLINE
bool
- depends on X86_SMP || (X86_VOYAGER && SMP)
+ depends on X86_SMP || (X86_VOYAGER && SMP) || (64BIT && ACPI_SLEEP)
default y
config KTIME_SCALAR
Index: linux-2.6.25-rc2-mm1/arch/x86/kernel/e820_64.c
===================================================================
--- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/e820_64.c
+++ linux-2.6.25-rc2-mm1/arch/x86/kernel/e820_64.c
@@ -27,6 +27,7 @@
#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/kdebug.h>
+#include <asm/trampoline.h>
struct e820map e820;
@@ -58,8 +59,8 @@ struct early_res {
};
static struct early_res early_res[MAX_EARLY_RES] __initdata = {
{ 0, PAGE_SIZE, "BIOS data page" }, /* BIOS data page */
-#ifdef CONFIG_SMP
- { SMP_TRAMPOLINE_BASE, SMP_TRAMPOLINE_BASE + 2*PAGE_SIZE, "SMP_TRAMPOLINE" },
+#ifdef CONFIG_X86_TRAMPOLINE
+ { TRAMPOLINE_BASE, TRAMPOLINE_BASE + 2 * PAGE_SIZE, ...Thanks for fixing this, Rafael. I must admit that I'm puzzled as to why this should fail, since my Kconfig change forces K8_NB=y when X86_MCE_AMD=y -- unless this is a discussion of my earlier patch, which did not include this change. -- Russell Leidich --
Hi Andrew,
The 2.6.25-rc2-mm1 kernel panics while booting on s390x:
Unable to handle kernel pointer dereference at virtual kernel address 0000000000000000
Oops: 0004 [#1] SMP
Modules linked in:
CPU: 0 Not tainted 2.6.25-rc2-mm1-autotest #1
Process swapper (pid: 1, task: 000000003f830000, ksp: 000000003f83ba48)
Krnl PSW : 0704a00180000000 000000000024b2be (futex_atomic_cmpxchg_std+0x12/0x28)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:2 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000074 00000000fffffff2 0000000000000000 0000000000000000
0000000000000000 0000000000000001 0000000000000000 00000000004d0764
0000000000000000 00000000004d8768 0000000000000000 000000003f83bdb0
0000000000000040 0000000000343950 00000000000627e4 000000003f83bdb0
Krnl Code: 000000000024b2b2: b90400bf lgr %r11,%r15
000000000024b2b6: a718fff2 lhi %r1,-14
000000000024b2ba: b2790100 sacf 256
>000000000024b2be: ba342000 cs %r3,%r4,0(%r2)
000000000024b2c2: 1813 lr %r1,%r3
000000000024b2c4: b2790000 sacf 0
000000000024b2c8: b9140021 lgfr %r2,%r1
000000000024b2cc: e3b0b0700004 lg %r11,112(%r11)
Call Trace:
([<000000003f83bda8>] 0x3f83bda8)
[<00000000004bdeec>] init+0x30/0x104
[<00000000004b0c40>] kernel_init+0x1e0/0x370
[<000000000001a5c6>] kernel_thread_starter+0x6/0xc
[<000000000001a5c0>] kernel_thread_starter+0x0/0xc
<4>---[ end trace 561bb236c800851f ]---
note: swapper[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init!
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--
Possibilities are that something has gone wrong with the recent cmpxchg changes which are now in mainline, or there's something wrong with futex-fix-init-order.patch or futex-runtime-enable-pi-and-robust-functionality.patch. Or, of course, it's something else ;) First question is: does this happen in current mainline? If not, it would be useful if someone could test futex-fix-init-order.patch and futex-runtime-enable-pi-and-robust-functionality.patch on current mainline, because those are planned for 2.6.25. Thanks. --
To confirm which patches cause the panic, I tested the 2.6.24.2 kernel with
futex-fix-init-order.patch and futex-runtime-enable-pi-and-robust-functionality.patch
applied, and they seem to cause the kernel panic.
Unable to handle kernel pointer dereference at virtual kernel address 0000000000000000
Oops: 0004 [#1] SMP
Modules linked in:
CPU: 0 Not tainted 2.6.25-rc2-mm1-autotest #1
Process swapper (pid: 1, task: 000000003f830000, ksp: 000000003f83ba48)
Krnl PSW : 0704a00180000000 000000000024b2be (futex_atomic_cmpxchg_std+0x12/0x28)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:2 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000074 00000000fffffff2 0000000000000000 0000000000000000
0000000000000000 0000000000000001 0000000000000000 00000000004d0764
0000000000000000 00000000004d8768 0000000000000000 000000003f83bdb0
0000000000000040 0000000000343950 00000000000627e4 000000003f83bdb0
Krnl Code: 000000000024b2b2: b90400bf lgr %r11,%r15
000000000024b2b6: a718fff2 lhi %r1,-14
000000000024b2ba: b2790100 sacf 256
>000000000024b2be: ba342000 cs %r3,%r4,0(%r2)
000000000024b2c2: 1813 lr %r1,%r3
000000000024b2c4: b2790000 sacf 0
000000000024b2c8: b9140021 lgfr %r2,%r1
000000000024b2cc: e3b0b0700004 lg %r11,112(%r11)
Call Trace:
([<000000003f83bda8>] 0x3f83bda8)
[<00000000004bdeec>] init+0x30/0x104
[<00000000004b0c40>] kernel_init+0x1e0/0x370
[<000000000001a5c6>] kernel_thread_starter+0x6/0xc
[<000000000001a5c0>] kernel_thread_starter+0x0/0xc
<4>---[ end trace 561bb236c800851f ]---
note: swapper[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init!
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--
Thanks for reporting! Patch below should fix it. Index: linux-2.6/arch/s390/lib/uaccess_std.c =================================================================== --- linux-2.6.orig/arch/s390/lib/uaccess_std.c +++ linux-2.6/arch/s390/lib/uaccess_std.c @@ -293,8 +293,8 @@ int futex_atomic_cmpxchg_std(int __user asm volatile( " sacf 256\n" - " cs %1,%4,0(%5)\n" - "0: lr %0,%1\n" + "0: cs %1,%4,0(%5)\n" + " lr %0,%1\n" "1: sacf 0\n" EX_TABLE(0b,1b) : "=d" (ret), "+d" (oldval), "=m" (*uaddr) --
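For readers following the fix: EX_TABLE(0b,1b) records that a fault at label 0 should resume at label 1. Before the patch, label 0 sat on the lr instruction, so a fault in cs (the instruction marked ">" in the oops) had no matching entry. The following is only a user-space sketch of that lookup, not the kernel's implementation; struct ex_entry, find_fixup() and the two tables are hypothetical names, and the addresses are copied from the "Krnl Code" dump in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an exception-table entry. */
struct ex_entry {
	unsigned long insn;   /* address of the instruction allowed to fault */
	unsigned long fixup;  /* address to resume at after such a fault */
};

/* Addresses taken from the "Krnl Code" lines of the oops report. */
#define ADDR_CS    0x24b2beUL  /* cs %r3,%r4,0(%r2): the faulting instruction */
#define ADDR_LR    0x24b2c2UL  /* lr %r1,%r3 */
#define ADDR_SACF0 0x24b2c4UL  /* sacf 0: the "1:" fixup label */

/* Before the patch, label "0:" was on lr; after it, on cs. */
static const struct ex_entry before_fix[] = { { ADDR_LR, ADDR_SACF0 } };
static const struct ex_entry after_fix[]  = { { ADDR_CS, ADDR_SACF0 } };

/* Linear search; returns the fixup address, or 0 if the fault is unhandled. */
static unsigned long find_fixup(const struct ex_entry *tbl, size_t n,
				unsigned long fault_addr)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == fault_addr)
			return tbl[i].fixup;
	return 0;
}
```

Calling find_fixup(before_fix, 1, ADDR_CS) finds nothing, which corresponds to the unhandled fault in the oops; with after_fix the same fault resolves to the sacf 0 address.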
Hi Heiko,
Thanks for the patch. I have tested it, and it fixes the bootup panic.
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Index: linux-2.6/arch/s390/lib/uaccess_std.c
===================================================================
--- linux-2.6.orig/arch/s390/lib/uaccess_std.c
+++ linux-2.6/arch/s390/lib/uaccess_std.c
@@ -293,8 +293,8 @@ int futex_atomic_cmpxchg_std(int __user
asm volatile(
" sacf 256\n"
- " cs %1,%4,0(%5)\n"
- "0: lr %0,%1\n"
+ "0: cs %1,%4,0(%5)\n"
+ " lr %0,%1\n"
"1: sacf 0\n"
EX_TABLE(0b,1b)
: "=d" (ret), "+d" (oldval), "=m" (*uaddr)
--
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--
cciss driver build errors on x86_64: In file included from drivers/block/cciss.c:231: drivers/block/cciss_scsi.c:1498:38: error: macro parameters must be comma-separated drivers/block/cciss.c: In function 'cciss_seq_show_header': drivers/block/cciss.c:272: error: implicit declaration of function 'cciss_seq_tape_report' drivers/block/cciss.c: In function 'cciss_proc_write': drivers/block/cciss.c:393: error: implicit declaration of function 'cciss_engage_scsi' make[2]: *** [drivers/block/cciss.o] Error 1 --- ~Randy --
Hi Andrew, The 2.6.25-rc2-mm1 kernel build fails on the powerpc(s) CC security/keys/compat.o security/keys/compat.c: In function ‘compat_sys_keyctl’: security/keys/compat.c:83: error: implicit declaration of function ‘keyctl_get_security’ make[2]: *** [security/keys/compat.o] Error 1 make[1]: *** [security/keys] Error 2 make: *** [security] Error 2 The keys-add-keyctl-function-to-get-a-security-label.patch is causing this build failure. I have tested the patch for the build failure only Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> -- --- linux-2.6.25-rc2/security/keys/internal.h 2008-02-17 05:03:30.000000000 +0530 +++ linux-2.6.25-rc2/security/keys/~internal.h 2008-02-17 05:46:16.000000000 +0530 @@ -155,6 +155,8 @@ extern long keyctl_negate_key(key_serial extern long keyctl_set_reqkey_keyring(int); extern long keyctl_set_timeout(key_serial_t, unsigned); extern long keyctl_assume_authority(key_serial_t); +extern long keyctl_get_security(key_serial_t keyid, char __user *buffer, + size_t buflen); /* -- Thanks & Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL. --
The ACPI wakeup in C patch (I think) won't build for me on x86_32 (i.e., i386 build on x86_64 system): linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: -mpreferred-stack-boundary=2 is not between 4 and 12 make[4]: *** [arch/x86/kernel/acpi/realmode/wakeup.o] Error 1 make[3]: *** [arch/x86/kernel/acpi/realmode/wakeup.bin] Error 2 --- ~Randy --
It compiles for me on a native i386. Can you please give me a hint what to do to reproduce the problem? Rafael --
Sounds like you're not adding -m32 to the gcc command line. -hpa --
Yes, adding -m32 to the X86_32 config ccflags (as is done for the X86_64 case) makes it build for me. (like patch below) Thanks. --- From: Randy Dunlap <randy.dunlap@oracle.com> Fix wakeup code build errors on x86_64. linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: -mpreferred-stack-boundary=2 is not between 4 and 12 Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> --- arch/x86/kernel/acpi/realmode/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/acpi/realmode/Makefile +++ linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/Makefile @@ -27,7 +27,7 @@ bootsrc := $(src)/../../../boot # How to compile the 16-bit code. Note we always compile for -march=i386, # that way we can complain to the user if the CPU is insufficient. # Compile with _SETUP since this is similar to the boot-time setup code. -cflags-$(CONFIG_X86_32) := +cflags-$(CONFIG_X86_32) := -m32 cflags-$(CONFIG_X86_64) := -m32 KBUILD_CFLAGS := $(LINUXINCLUDE) -g -Os -D_SETUP -D_WAKEUP -D__KERNEL__ \ -I$(srctree)/$(bootsrc) \ --
It's wrong, though, because you can't assume a 32-bit compiler knows about -m32. You need $(call cc-option,-m32). -hpa --
Thanks, Peter. Tested/works. --- From: Randy Dunlap <randy.dunlap@oracle.com> Fix wakeup code build errors on x86_64. linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: CPU you selected does not support x86-64 instruction set linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/wakeup.S:0: error: -mpreferred-stack-boundary=2 is not between 4 and 12 Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> --- arch/x86/kernel/acpi/realmode/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-2.6.25-rc2-mm1.orig/arch/x86/kernel/acpi/realmode/Makefile +++ linux-2.6.25-rc2-mm1/arch/x86/kernel/acpi/realmode/Makefile @@ -27,7 +27,7 @@ bootsrc := $(src)/../../../boot # How to compile the 16-bit code. Note we always compile for -march=i386, # that way we can complain to the user if the CPU is insufficient. # Compile with _SETUP since this is similar to the boot-time setup code. -cflags-$(CONFIG_X86_32) := +cflags-$(CONFIG_X86_32) := $(call cc-option, -m32) cflags-$(CONFIG_X86_64) := -m32 KBUILD_CFLAGS := $(LINUXINCLUDE) -g -Os -D_SETUP -D_WAKEUP -D__KERNEL__ \ -I$(srctree)/$(bootsrc) \ --
I think this works for both; that's what we do for arch/x86/boot. -hpa --
OK, that makes sense. I think I'll let Rafael complete it. -- ~Randy --
OK, so that would be the appended patch. Still, since there are several fixes against the "move the wakeup code to C" patch, I'll probably fold them all into a new version of this patch and resend it. Thanks, Rafael --- arch/x86/kernel/acpi/realmode/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) Index: linux-2.6/arch/x86/kernel/acpi/realmode/Makefile =================================================================== --- linux-2.6.orig/arch/x86/kernel/acpi/realmode/Makefile +++ linux-2.6/arch/x86/kernel/acpi/realmode/Makefile @@ -27,8 +27,6 @@ bootsrc := $(src)/../../../boot # How to compile the 16-bit code. Note we always compile for -march=i386, # that way we can complain to the user if the CPU is insufficient. # Compile with _SETUP since this is similar to the boot-time setup code. -cflags-$(CONFIG_X86_32) := -cflags-$(CONFIG_X86_64) := -m32 KBUILD_CFLAGS := $(LINUXINCLUDE) -g -Os -D_SETUP -D_WAKEUP -D__KERNEL__ \ -I$(srctree)/$(bootsrc) \ $(cflags-y) \ @@ -41,6 +39,7 @@ KBUILD_CFLAGS := $(LINUXINCLUDE) -g -Os $(call cc-option, -fno-unit-at-a-time)) \ $(call cc-option, -fno-stack-protector) \ $(call cc-option, -mpreferred-stack-boundary=2) +KBUILD_CFLAGS += $(call cc-option, -m32) KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__ WAKEUP_OBJS = $(addprefix $(obj)/,$(wakeup-y)) --
For a 64 bit build we should error out if the compiler fails to support -m32 (however unlikely that may be). So I would prefer it unconditional for 64 bit. But nit picking - I know. Sam --
But I assume in less obvious way. It is a bit more intuitive to error out on missing -m32 support than gcc failing to support .code16 or some other inline assembler magic. Sam --
No, you will get the message "the selected CPU doesn't support the x86-64 architecture". -hpa --
Hi Andrew, The 2.6.25-rc2-mm1 kernel with randconfig build option, fails to build on x86_64 machine CC drivers/acpi/osl.o drivers/acpi/osl.c:60:38: error: empty filename in #include drivers/acpi/osl.c: In function ‘acpi_os_table_override’: drivers/acpi/osl.c:399: error: ‘AmlCode’ undeclared (first use in this function) drivers/acpi/osl.c:399: error: (Each undeclared identifier is reported only once drivers/acpi/osl.c:399: error: for each function it appears in.) make[2]: *** [drivers/acpi/osl.o] Error 1 make[1]: *** [drivers/acpi] Error 2 make: *** [drivers] Error 2 # # Automatically generated make config: don't edit # Linux kernel version: 2.6.25-rc2-mm1 # Sun Feb 17 08:07:17 2008 # CONFIG_64BIT=y # CONFIG_X86_32 is not set CONFIG_X86_64=y CONFIG_X86=y # CONFIG_GENERIC_LOCKBREAK is not set CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y # CONFIG_QUICKLIST is not set CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_HWEIGHT=y # CONFIG_GENERIC_GPIO is not set CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_RWSEM_GENERIC_SPINLOCK=y # CONFIG_RWSEM_XCHGADD_ALGORITHM is not set # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y CONFIG_ZONE_DMA32=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_AUDIT_ARCH=y CONFIG_ARCH_SUPPORTS_AOUT=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y # CONFIG_KTIME_SCALAR is not set CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General ...
Hi, If you select CONFIG_ACPI_CUSTOM_DSDT=y, you have to set a file path in the option CONFIG_ACPI_CUSTOM_DSDT_FILE="". Best regards, Laura. --
garbage in, garbage out. If you don't give this build option a file name where AmlCode lives, then the build will be unable to find AmlCode[]. http://www.lesswatts.org/projects/acpi/overridingDSDT.php cheers, -Len --
So we have a .config option whose sole purpose is to use another .config option? That seems ... less than ideal. Is there not some Kconfig voodoo we can do to only require the one option? Maybe something like how CONFIG_INITRAMFS_SOURCE is done? Adding Sam to the Cc, in case he has any ideas. Thanks, Nish --
Make sure STANDALONE is y for your randconfig builds. See README for examples. STANDALONE is there exactly to prevent the above but we cannot control randconfig. Sam --
Hrm, if this is needed for randconfig to work, perhaps randconfig While setting STANDALONE does fix the above, it doesn't answer the more basic question I had -- do we really need both .config options in this case? If it's simply a case of "That's how it is, won't be fixed, there are higher priorities", that's good enough by me. Just seems a shame that we have an option to enable another option, which is required for the first option to be sensible -- seems like we should only need the second option... Thanks, Nish --
I really do not see what problem you are trying to address. STANDALONE is there as an easy way to turn off the options that require sensible input to make a kernel compile. And that makes _perfect_ sense when you do randconfig builds. Sam --
Yes it does. As I said above I'm *not* arguing about using STANDALONE for randconfig builds. What I was doing, perhaps unclearly, was asking if there was a real Kconfig need to have both CONFIG_ACPI_CUSTOM_DSDT and CONFIG_ACPI_CUSTOM_DSDT_FILE, when the latter *only* is visible with the former and the former *only* makes sense with the latter. Couldn't we just have CONFIG_ACPI_CUSTOM_DSDT_FILE and check that in the code? Why do we need a boolean option to make another string option available? Thanks, Nish --
Is there a way to generate (in Kconfig language) the boolean CONFIG_ACPI_CUSTOM_DSDT based on whether CONFIG_ACPI_CUSTOM_DSDT_FILE == "" or != "" ? I tried to muck around with that last night but couldn't get it to work. I.e., just present the ACPI_CUSTOM_DSDT_FILE config symbol to the user and then generate the ACPI_CUSTOM_DSDT bool based on the string value. --- ~Randy --
Thanks for re-expressing my question, Randy, this is exactly what I'm wondering. Thanks, Nish --
Something following this example?
config STRING
string
prompt "What string"
default ""
config STRING_IS_NOT_EMPTY
bool
default STRING != ""
But that seems too easy - were you trying to do something
more complex than this?
Sam
--
Yes, that's almost what I had. I used def_bool n on the second config symbol, but the bool value never changed when I changed the string value. I'll be glad to look at it again though. -- ~Randy --
I tested that above in a small Kconfig file and it works as expected. When I set the string to something STRING_IS_NOT_EMPTY is equal to y. Sam --
Let's see what the ACPI people think about this change. Thanks, Sam. --- From: Randy Dunlap <randy.dunlap@oracle.com> Make ACPI_CUSTOM_DSDT boolean config symbol a hidden and derived value, based on the value of ACPI_CUSTOM_DSDT_FILE (string). Only the latter is presented to the user as a config option. This fixes problems with "make randconfig" setting ACPI_CUSTOM_DSDT but leaving ACPI_CUSTOM_DSDT_FILE empty/blank. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> --- drivers/acpi/Kconfig | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) --- linux-2.6.25-rc2-git5.orig/drivers/acpi/Kconfig +++ linux-2.6.25-rc2-git5/drivers/acpi/Kconfig @@ -283,24 +283,23 @@ config ACPI_TOSHIBA If you have a legacy free Toshiba laptop (such as the Libretto L1 series), say Y. -config ACPI_CUSTOM_DSDT - bool "Include Custom DSDT" +config ACPI_CUSTOM_DSDT_FILE + string "Custom DSDT Table file to include" + default "" depends on !STANDALONE - default n help This option supports a custom DSDT by linking it into the kernel. See Documentation/acpi/dsdt-override.txt - If unsure, say N. - -config ACPI_CUSTOM_DSDT_FILE - string "Custom DSDT Table file to include" - depends on ACPI_CUSTOM_DSDT - default "" - help Enter the full path name to the file which includes the AmlCode declaration. + If unsure, don't enter a file name. + +config ACPI_CUSTOM_DSDT + bool + default ACPI_CUSTOM_DSDT_FILE != "" + config ACPI_CUSTOM_DSDT_INITRD bool "Read Custom DSDT from initramfs" depends on BLK_DEV_INITRD --
works for me! applied. thanks, -len ps. CONFIG_ACPI_CUSTOM_DSDT's only use is to guard the use of CONFIG_ACPI_CUSTOM_DSDT_FILE: #ifdef CONFIG_ACPI_CUSTOM_DSDT #include CONFIG_ACPI_CUSTOM_DSDT_FILE #endif we could get rid of it if cpp could do something like #if (CONFIG_ACPI_CUSTOM_DSDT_FILE != "") #include CONFIG_ACPI_CUSTOM_DSDT_FILE #endif but it doesn't look like cpp has a concept of strings in expressions. --
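To illustrate the point about cpp: string literals are not permitted in #if expressions, so the "is the file name empty?" test cannot be done by the preprocessor. The compiler proper can still size a literal, as in the sketch below. This is an illustration only, not what drivers/acpi/osl.c does, and it cannot guard an #include, which is why the derived Kconfig boolean is still needed. CUSTOM_DSDT_FILE, HAVE_CUSTOM_DSDT and dsdt_source() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_ACPI_CUSTOM_DSDT_FILE; "" means unset. */
#define CUSTOM_DSDT_FILE ""

/*
 * `#if (CUSTOM_DSDT_FILE != "")` is rejected by the preprocessor.
 * sizeof on a string literal is a compile-time constant, though:
 * an empty literal holds only the terminating NUL, so its size is 1.
 */
enum { HAVE_CUSTOM_DSDT = sizeof(CUSTOM_DSDT_FILE) > 1 };

/* Report which DSDT would be used, based on the compile-time flag. */
static const char *dsdt_source(void)
{
	return HAVE_CUSTOM_DSDT ? CUSTOM_DSDT_FILE : "(firmware-provided DSDT)";
}
```

With the empty default above, HAVE_CUSTOM_DSDT evaluates to 0; defining CUSTOM_DSDT_FILE as a non-empty path would make it 1.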
Thanks, the patch solves the build failure.
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
drivers/acpi/Kconfig | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
--- linux-2.6.25-rc2-git5.orig/drivers/acpi/Kconfig
+++ linux-2.6.25-rc2-git5/drivers/acpi/Kconfig
@@ -283,24 +283,23 @@ config ACPI_TOSHIBA
If you have a legacy free Toshiba laptop (such as the Libretto
L1
series), say Y.
-config ACPI_CUSTOM_DSDT
- bool "Include Custom DSDT"
+config ACPI_CUSTOM_DSDT_FILE
+ string "Custom DSDT Table file to include"
+ default ""
depends on !STANDALONE
- default n
help
This option supports a custom DSDT by linking it into the
kernel.
See Documentation/acpi/dsdt-override.txt
- If unsure, say N.
-
-config ACPI_CUSTOM_DSDT_FILE
- string "Custom DSDT Table file to include"
- depends on ACPI_CUSTOM_DSDT
- default ""
- help
Enter the full path name to the file which includes the AmlCode
declaration.
+ If unsure, don't enter a file name.
+
+config ACPI_CUSTOM_DSDT
+ bool
+ default ACPI_CUSTOM_DSDT_FILE != ""
+
config ACPI_CUSTOM_DSDT_INITRD
bool "Read Custom DSDT from initramfs"
depends on BLK_DEV_INITRD
After applying the patch and continuing with the same randconfig
reported earlier, the build fails with following error
drivers/acpi/thermal.c: In function ‘acpi_thermal_init’:
drivers/acpi/thermal.c:1792: error: ‘thermal_dmi_table’ undeclared (first use in this function)
drivers/acpi/thermal.c:1792: error: (Each undeclared identifier is reported only once
drivers/acpi/thermal.c:1792: error: for each function it appears in.)
make[2]: *** [drivers/acpi/thermal.o] Error 1
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2
I have tested the patch for build failure only.
Signed-off-by: Kamalesh Babulal ...Le 16.02.2008 09:25, Andrew Morton a
Len: This WARN_ON says that ACPI is trying to call ioremap() on memory that the e820_table lists as "kernel owned". Do you know why ACPI would do this? Would ACPI get upset if the kernel would tell it to take a hike? --
Depends on the BIOS -- as it is the BIOS AML that is making this request. -Len --
On Sun, 17 Feb 2008 23:58:03 -0500 is there any possible valid scenario where the BIOS AML would touch memory the kernel is using? (Since that seems to be what is going on; I'll cook up a diagnostics patch to get more info but the warning gets spewed if ioremap() is trying to map memory the kernel sees as ram) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
Can you try the patch below? It should print a bit more information so that we can
figure out who's really at fault here.. (eg it's either the diagnostics that are
wrong, or ACPI is doing something evil (on behalf of the bios), we need to know
what address is being triggered)
From c346400b372a99a4158fce3ea45234bcf947bdf8 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Mon, 18 Feb 2008 08:01:47 -0800
Subject: [PATCH] More diagnostic output for the ioremap WARN_ON
now that ACPI seems to have triggered this WARN_ON.. we need to know
which address it triggers on to be able to judge a final "who's at fault".
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
arch/x86/mm/ioremap.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 69f4981..524dd45 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -126,7 +126,15 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
return NULL;
}
- WARN_ON_ONCE(page_is_ram(pfn));
+ for (pfn = phys_addr >> PAGE_SHIFT; pfn < max_pfn_mapped &&
+ (pfn << PAGE_SHIFT) < last_addr; pfn++) {
+ if (page_is_ram(pfn)) {
+ printk(KERN_ERR "ioremap: trying to map RAM page at %lx\n",
+ pfn << PAGE_SHIFT);
+ WARN_ON_ONCE(page_is_ram(pfn));
+ }
+ }
+
switch (mode) {
case IOR_MODE_UNCACHED:
--
1.5.4.1
--
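The loop in the diagnostic patch above can be tried in user space with a mock memory map. This is a sketch under stated assumptions, not kernel code: mock_page_is_ram(), RAM_END_PFN and count_ram_pages() are hypothetical, and the loop condition is copied from the patch as posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12

/* Hypothetical memory map: page frames [0, RAM_END_PFN) are RAM. */
#define RAM_END_PFN 0x100UL

/* Mock of the kernel's page_is_ram() for this illustration only. */
static bool mock_page_is_ram(unsigned long pfn)
{
	return pfn < RAM_END_PFN;
}

/*
 * Walk every page frame covered by [phys_addr, last_addr), bounded by
 * max_pfn, using the same loop condition as the patch. Returns the number
 * of RAM pages found and stores the address of the first one in *first.
 */
static unsigned long count_ram_pages(unsigned long phys_addr,
				     unsigned long last_addr,
				     unsigned long max_pfn,
				     unsigned long *first)
{
	unsigned long hits = 0;

	assert(first != NULL);
	for (unsigned long pfn = phys_addr >> PAGE_SHIFT;
	     pfn < max_pfn && (pfn << PAGE_SHIFT) < last_addr; pfn++) {
		if (mock_page_is_ram(pfn)) {
			if (hits == 0)
				*first = pfn << PAGE_SHIFT;
			hits++;
		}
	}
	return hits;
}
```

For a request covering 0x0 to 0xfff this reports one RAM page at address 0, matching the kind of "trying to map RAM page at ..." line the patch is meant to print.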
I've got 2 new lines in dmesg output: ACPI: EC: Look up EC in DSDT ioremap: trying to map RAM page at 0 <=============== ------------[ cut here ]------------ WARNING: at arch/x86/mm/ioremap.c:134 __ioremap+0xe8/0x1b6() Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.25-rc2-mm1 #41 [<c0118989>] warn_on_slowpath+0x41/0x6d [<c0130ae6>] ? trace_hardirqs_off+0xb/0xd [<c0116326>] ? runqueue_is_locked+0x23/0x3f [<c0118d31>] ? release_console_sem+0x1be/0x1c6 [<c0119304>] ? vprintk+0x2d0/0x31d [<c01129e6>] __ioremap+0xe8/0x1b6 [<c0112acd>] ioremap_nocache+0xa/0xc [<c02a68a7>] acpi_os_map_memory+0x11/0x1a [<c020b6c7>] acpi_ex_system_memory_space_handler+0xd3/0x228 [<c0203c08>] ? acpi_ev_address_space_dispatch+0x142/0x1a8 [<c020b5f4>] ? acpi_ex_system_memory_space_handler+0x0/0x228 [<c0203c2d>] acpi_ev_address_space_dispatch+0x167/0x1a8 [<c020840d>] acpi_ex_access_region+0x1e4/0x270 [<c02085ec>] acpi_ex_field_datum_io+0x153/0x2a1 [<c0158af4>] ? cache_alloc_debugcheck_after+0xe9/0x165 [<c02087cb>] acpi_ex_extract_from_field+0x91/0x224 [<c0206bff>] ? acpi_ex_read_data_from_field+0x163/0x1b0 [<c0206c1c>] acpi_ex_read_data_from_field+0x180/0x1b0 [<c020d286>] acpi_ex_resolve_node_to_value+0x1aa/0x230 [<c0207a62>] acpi_ex_resolve_to_value+0x270/0x2aa [<c0209e77>] acpi_ex_resolve_operands+0x24e/0x52f [<c0200857>] acpi_ds_exec_end_op+0xb7/0x4f4 [<c0212d81>] acpi_ps_parse_loop+0x5e5/0x79c [<c021210c>] acpi_ps_parse_aml+0xb2/0x2dd [<c021353c>] acpi_ps_execute_method+0x13d/0x20d [<c020fba2>] acpi_ns_evaluate+0x10e/0x1b0 [<c02164fa>] acpi_ut_evaluate_object+0x57/0x1a1 [<c02166fe>] acpi_ut_execute_STA+0x22/0x7b [<c0218d91>] ? acpi_ut_release_mutex+0x85/0x8f [<c020f48d>] acpi_ns_get_device_callback+0x5a/0x121 [<c021176e>] acpi_ns_walk_namespace+0xfa/0x114 [<c020f3b1>] acpi_get_devices+0x47/0x5d [<c020f433>] ? acpi_ns_get_device_callback+0x0/0x121 [<c021ce7a>] ? ec_parse_device+0x0/0x6e [<c03a4460>] acpi_ec_ecdt_probe+0xaa/0x10a ...
actually... it helps a lot. I'll cook up a patch for this now :) thanks for testing so quickly --
we fixed the cause of the machine you quoted; so I suspect yours is different.. Can you get me your stacktrace ? Can you try the patch from this thread to show what memory the offender tries to access ? --
Arjan, sorry for the lag. With your patch from http://marc.info/?l=linux-kernel&m=120336371506283&w=2 I don't have a warning anymore. Here are the 2 dmesg outputs, one from 2.6.25-rc3 and the other from 2.6.25-rc3 + your patch: http://frugalware.org/~crazy/dmesg/dmesg http://frugalware.org/~crazy/dmesg/dmesg_with_patch It seems I'm not alone with that :) Have a look at http://lkml.org/lkml/2008/2/24/265 Regards, Gabriel --
that is ... odd since it's the same in theory, just with some added printk's ;-( --
With only that part I get the warning again. But I see the printk() twice, so I guess this should in general be a WARN_ON()? http://frugalware.org/~crazy/dmesg/dmesg_with_patch_2 --
I still get this in mainline (today's 2.6.25-rc3-git)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ac3c959..0a9a616 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -134,7 +134,11 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
return NULL;
}
- WARN_ON_ONCE(page_is_ram(pfn));
+ if (page_is_ram(pfn)) {
+ printk(KERN_ERR "ioremap: trying to map RAM page at %lx\n",
+ pfn << PAGE_SHIFT);
+ WARN_ON_ONCE(page_is_ram(pfn));
+ }
switch (mode) {
case IOR_MODE_UNCACHED:
With this diagnostics patch applied, I get the following stacktrace:
[ 23.223587] Allocate Port Service[0000:00:1c.3:pcie03]
[ 23.232536] ioremap: trying to map RAM page at 1000
[ 23.232590] ------------[ cut here ]------------
[ 23.232633] WARNING: at arch/x86/mm/ioremap.c:140 __ioremap+0x232/0x280()
[ 23.232678] Modules linked in:
[ 23.232744] Pid: 48, comm: kacpid Not tainted 2.6.25-rc3-current-fix.ioremap.warning #1
[ 23.232801]
[ 23.232801] Call Trace:
[ 23.232881] [<ffffffff8023b4cf>] warn_on_slowpath+0x5f/0x80
[ 23.232925] [<ffffffff8023c657>] ? printk+0x67/0x70
[ 23.232970] [<ffffffff8038c5f2>] ? acpi_ec_transaction+0x1ec/0x207
[ 23.233015] [<ffffffff802271b2>] __ioremap+0x232/0x280
[ 23.233059] [<ffffffff8022721b>] ioremap_nocache+0xb/0x10
[ 23.233103] [<ffffffff80467792>] acpi_os_map_memory+0x13/0x21
[ 23.233149] [<ffffffff8037d580>] acpi_ex_system_memory_space_handler+0xd2/0x1c2
[ 23.233204] [<ffffffff8037d4ae>] ? acpi_ex_system_memory_space_handler+0x0/0x1c2
[ 23.233261] [<ffffffff80376320>] acpi_ev_address_space_dispatch+0x172/0x1c1
[ 23.233307] [<ffffffff8037a7ff>] acpi_ex_access_region+0x210/0x22d
[ 23.233351] [<ffffffff8037a90b>] acpi_ex_field_datum_io+0xef/0x183
[ 23.233397] [<ffffffff802ad1e2>] ? kmem_cache_alloc+0x82/0xc0
[ 23.233441] ...[cc'd relevant maintainers]
the same here on 2.6.25-rc3, with the innocent ibmphp_access_ebda()
that fires the WARN_ON() in __ioremap() asking for the pfn 0, even
after the page_is_ram() change. With your patch the warning
disappears.
I think this is because the pfn checked by the original code (before
your patch) is the one after the last iteration, while your patch
checks for each pfn that is going to be mapped. The latter should
be the intended behavior. If I've understood the problem, the
(trivial) patch below should fix it.
Also, note that if last_addr is at the beginning of a page we can
__ioremap() normal RAM (in fact we only emit the warning with the
old code, instead of returning NULL.) Is that possible/intended
behavior? If not the loop should do one more iteration.
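The difference between the two checks can be modelled in user space. This is only a minimal sketch, not kernel code: page_is_ram_model() is a hypothetical stand-in for page_is_ram() that pretends RAM starts at 1 MiB, and warns_old() / warns_new() are illustrative names. The loop bounds mirror the ones used in __ioremap().

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12

/* Hypothetical stand-in for page_is_ram(): pretend RAM starts at 1 MiB. */
static bool page_is_ram_model(unsigned long pfn)
{
	return pfn >= 0x100;
}

/* Old behaviour: test whatever pfn the loop variable holds after the loop. */
bool warns_old(unsigned long phys_addr, unsigned long size)
{
	unsigned long last_addr, pfn;

	assert(size > 0);
	last_addr = phys_addr + size - 1;
	for (pfn = phys_addr >> PAGE_SHIFT;
	     (pfn << PAGE_SHIFT) < last_addr; pfn++)
		;	/* per-page checks elided in this model */

	return page_is_ram_model(pfn);
}

/* Patched behaviour: accumulate the result for every pfn the loop visits. */
bool warns_new(unsigned long phys_addr, unsigned long size)
{
	unsigned long last_addr, pfn;
	bool is_ram = false;

	assert(size > 0);
	last_addr = phys_addr + size - 1;
	for (pfn = phys_addr >> PAGE_SHIFT;
	     (pfn << PAGE_SHIFT) < last_addr; pfn++)
		is_ram |= page_is_ram_model(pfn);

	return is_ram;
}
```

With the modelled RAM boundary, a one-page request at 0xff000 ends just below RAM: warns_old() still reports RAM because it looks at pfn 0x100, while warns_new() only looks at pfn 0xff.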
__ioremap() emits a warning if the pfn after the last one it's going
to map is of normal ram. Correct this and emit the warning (once)
only if one of the asked pages is.
Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
---
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ac3c959..6f7b158 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -109,7 +109,7 @@ static int ioremap_change_attr(unsigned long vaddr, unsigned long size,
static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
enum ioremap_mode mode)
{
- unsigned long pfn, offset, last_addr, vaddr;
+ unsigned long pfn, offset, last_addr, vaddr, is_ram = 0;
struct vm_struct *area;
pgprot_t prot;
@@ -132,9 +132,10 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
if (page_is_ram(pfn) && pfn_valid(pfn) &&
!PageReserved(pfn_to_page(pfn)))
return NULL;
+ is_ram |= page_is_ram(pfn);
}
- WARN_ON_ONCE(page_is_ram(pfn));
+ WARN_ON_ONCE(is_ram);
switch (mode) {
case IOR_MODE_UNCACHED:
--
looks good to me; Ingo please apply (Note: if no legit users show up I want to just remove support for mapping ram altogether in 2.6.26 or so) --
well upstream doesn't have the warning anymore, i queued up the patch
below into x86.git#testing.
Ingo
-------------------->
Subject: x86: warn about RAM pages in ioremap()
From: Ingo Molnar <mingo@elte.hu>
Date: Mon Mar 03 09:37:41 CET 2008
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/mm/ioremap.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Index: linux-x86.q/arch/x86/mm/ioremap.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/ioremap.c
+++ linux-x86.q/arch/x86/mm/ioremap.c
@@ -149,9 +149,11 @@ static void __iomem *__ioremap(unsigned
for (pfn = phys_addr >> PAGE_SHIFT;
(pfn << PAGE_SHIFT) < last_addr; pfn++) {
- if (page_is_ram(pfn) && pfn_valid(pfn) &&
- !PageReserved(pfn_to_page(pfn)))
+ int is_ram = page_is_ram(pfn);
+
+ if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
return NULL;
+ WARN_ON_ONCE(is_ram);
}
switch (mode) {
--
i mean, x86.git#testing doesn't have the warning anymore. Ingo --
In this way we can emit the warning even for pages that will not be mapped, if asked for more than one page (e.g., one page triggers the warning and one of the following pages triggers the return NULL condition). I don't think that would be useful, as the caller will notice the error anyway. I don't know if it's so important, but while we're at it, please consider that if last_addr % PAGE_SIZE == 0, the for loop exits without checking the last pfn, which will be mapped. --
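The page-aligned last_addr case can be checked with a small user-space model. This is a sketch under assumptions: the function names are made up for illustration, and the "<=" variant is just one possible way of doing "one more iteration", not a fix taken from the thread.

```c
#include <assert.h>

#define PAGE_SHIFT 12

/* Number of pfns visited by the loop as written in the patch ("<"). */
unsigned long pfns_checked_lt(unsigned long phys_addr, unsigned long size)
{
	unsigned long last_addr, pfn, n = 0;

	assert(size > 0);
	last_addr = phys_addr + size - 1;
	for (pfn = phys_addr >> PAGE_SHIFT;
	     (pfn << PAGE_SHIFT) < last_addr; pfn++)
		n++;
	return n;
}

/* Same loop with "<=", so a page starting exactly at last_addr is visited. */
unsigned long pfns_checked_le(unsigned long phys_addr, unsigned long size)
{
	unsigned long last_addr, pfn, n = 0;

	assert(size > 0);
	last_addr = phys_addr + size - 1;
	for (pfn = phys_addr >> PAGE_SHIFT;
	     (pfn << PAGE_SHIFT) <= last_addr; pfn++)
		n++;
	return n;
}

/* Pages the request touches: the first pfn through the pfn of last_addr. */
unsigned long pfns_touched(unsigned long phys_addr, unsigned long size)
{
	unsigned long last_addr;

	assert(size > 0);
	last_addr = phys_addr + size - 1;
	return (last_addr >> PAGE_SHIFT) - (phys_addr >> PAGE_SHIFT) + 1;
}
```

For phys_addr 0x1000 and size 0x1001, last_addr is 0x2000: two pages are touched, but the "<" loop visits only the first one. For a request that ends inside a page the two loops visit the same pfns.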
Tested-by: Laurent Riffard <laurent.riffard@free.fr> With this patch, the WARNING at arch/x86/mm/ioremap.c does not occur anymore. thanks ~~ --
Yes, the patch does help for me too. Thanks, Mirco
Hi Andrew, The signals-do_signal_stop-use-signal_group_exit.patch is causing the kernel panic, while booting in to the 2.6.25-rc2-mm1 kernel on x86. There has been discussion on the patch for this panic on http://lkml.org/lkml/2008/2/16/99 [ 25.512919] BUG: unable to handle kernel paging request at 9d74e37b [ 25.514926] IP: [<c04a8fac>] proc_flush_task+0x5b/0x223 [ 25.516934] Oops: 0000 [#1] SMP [ 25.517918] last sysfs file: /sys/block/hdc/removable [ 25.517918] Modules linked in: dm_mirror dm_mod video output sbs sbshc battery ac parport_pc lp parport floppy sg serio_raw ide_cd_mod cdrom scb2_flash mtd chipreg button i2c_piix4 i2c_core pcspkr tg3 mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [ 25.517918] [ 25.517918] Pid: 1, comm: init Not tainted (2.6.25-rc2-mm1-autotest #1) [ 25.517918] EIP: 0060:[<c04a8fac>] EFLAGS: 00010282 CPU: 2 [ 25.517918] EIP is at proc_flush_task+0x5b/0x223 [ 25.517918] EAX: 9d74e35b EBX: f7881ef0 ECX: f7a5ed84 EDX: a56b6b6b [ 25.517918] ESI: f74f76f8 EDI: a56b6b6b EBP: f7881f08 ESP: f7881ec0 [ 25.517918] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 25.517918] Process init (pid: 1, ti=f7881000 task=f788adf0 task.ti=f7881000) [ 25.517918] Stack: f7a5ed84 00000001 f7a5ed58 f7a5ed58 00000000 a56b6b6b f7852750 f7a5ed68 [ 25.517918] 35881ef4 00003130 f788b4a4 f788adf0 0012d6c5 00000003 f7881ee3 00000003 [ 25.517918] f7a72df0 f7a72df0 f7881f1c c042676d 00000003 000001f5 f7a72df0 f7881f78 [ 25.517918] Call Trace: [ 25.517918] [<c042676d>] release_task+0x19/0x2d5 [ 25.517918] [<c042701b>] do_wait+0x5f2/0x8fc [ 25.517918] [<c041d9e7>] default_wake_function+0x0/0xd [ 25.517918] [<c042739d>] sys_wait4+0x78/0x8e [ 25.517918] [<c04273c6>] sys_waitpid+0x13/0x15 [ 25.517918] [<c04039ba>] sysenter_past_esp+0x5f/0x99 [ 25.517918] ======================= [ 25.517918] Code: 1c 89 4d d4 89 4d c4 89 55 b8 e9 b5 01 00 00 31 ff 83 7d c4 00 74 ...
hm, are you sure that signals-do_signal_stop-use-signal_group_exit.patch causes this oops? --
sorry signals-do_signal_stop-use-signal_group_exit.patch is not causing the problem, not sure of what is causing this panic :-( --
Kamalesh, could you send me the output of "objdump -d fs/proc/base.o" ? and just in case, fs/proc/base.s (make fs/proc/base.s). Oleg. --
On Sun, 17 Feb 2008 09:40:33 +0530 I wonder if this one is related. Also with 2.6.25-rc2-mm1 on x86_64: BUG: unable to handle kernel paging request at 0000000000200200 IP: [<ffffffff81043d3c>] free_pid+0x35/0x90 PGD 43c00c067 PUD 43e5f1067 PMD 0 Oops: 0002 [1] SMP last sysfs file: /sys/devices/pnp0/00:0b/id CPU 7 Modules linked in: dm_multipath qla2xxx bnx2 iTCO_wdt iTCO_vendor_support serio_raw rtc_cmos pcspkr watchdog_core scsi_transport_fc watchdog_dev i5000_edac edac_core button dcdbas joydev sg sr_mod cdrom usb_storage ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode] Pid: 1992, comm: S05kudzu Not tainted 2.6.25-rc2-mm1 #4 RIP: 0010:[<ffffffff81043d3c>] [<ffffffff81043d3c>] free_pid+0x35/0x90 RSP: 0018:ffff81043c895e58 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff81043dd31440 RCX: ffff81043e5ffb08 RDX: 0000000000200200 RSI: 0000000000000046 RDI: 0000000000000000 RBP: ffff81043b9703c0 R08: 0000000000000000 R09: 0000000000000001 R10: ffffffff81043d1a R11: 0000000000000000 R12: ffff81043e5ffac0 R13: 0000000000000000 R14: 0000000000000000 R15: 00000000008cd530 FS: 00007f68f99786f0(0000) GS:ffff81043e7100c0(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000200200 CR3: 0000000436c1f000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process S05kudzu (pid: 1992, threadinfo ffff81043c894000, task ffff81043b9acb40) Stack: ffff81043dd31440 ffff81043b9703c0 ffff81043c84ae40 ffffffff81035a6d ffff81043b9703c0 0000000000000000 00000000000007cd ffffffff810362b7 ffff81043c895f18 ffffffff81051316 0000000000000000 00007fff01989514 Call Trace: [<ffffffff81035a6d>] ? release_task+0x1be/0x346 [<ffffffff810362b7>] ? do_wait+0x6c2/0xa0e [<ffffffff81051316>] ? trace_hardirqs_on_caller+0xf2/0x115 [<ffffffff8102ac72>] ? ...
Yes, please look at http://marc.info/?t=120309840500006 Btw. The bug in tty_io.c _can_ explain this trace, but it would be nice to ensure we don't have other problems. Could you try this http://marc.info/?l=linux-kernel&m=120352655031911 patch? (I can't understand why this happens at boot time, and it is not reproducible on my side). Oleg. --
On Wed, 20 Feb 2008 23:04:40 +0300 Fun. With that debugging patch applied, the oops on boot no longer happens... No, I have no idea why... -- All Rights Reversed --
Hello ! It also fixed a oops at boot time for me. Here's a warning I got with oleg patch. Thanks, C. WARNING: at /home/legoater/linux/2.6.25-rc2-mm1/kernel/pid.c:213 put_pid+0x4b/0x82() Modules linked in: tg3 sg joydev ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 1, comm: init Not tainted 2.6.25-rc2-mm1 #6 Call Trace: [<ffffffff8022f0a1>] warn_on_slowpath+0x58/0x85 [<ffffffff8024b87d>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8024b84c>] ? trace_hardirqs_on_caller+0xf2/0x116 [<ffffffff8024b87d>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8024b84c>] ? trace_hardirqs_on_caller+0xf2/0x116 [<ffffffff8023fb0f>] put_pid+0x4b/0x82 [<ffffffff802ccd6e>] ? proc_delete_inode+0x0/0x4f [<ffffffff802ccd90>] proc_delete_inode+0x22/0x4f [<ffffffff802ccd6e>] ? proc_delete_inode+0x0/0x4f [<ffffffff802a0826>] generic_delete_inode+0xb8/0x138 [<ffffffff8029fd5d>] iput+0x7c/0x80 [<ffffffff8029d835>] dentry_iput+0xa3/0xbb [<ffffffff8029d8ea>] d_kill+0x21/0x42 [<ffffffff8029ebee>] dput+0x114/0x125 [<ffffffff802cefe0>] proc_flush_task+0x125/0x28f [<ffffffff8024b87d>] ? trace_hardirqs_on+0xd/0xf [<ffffffff802312a5>] release_task+0x24/0x331 [<ffffffff80231ca1>] do_wait+0x6ef/0xa99 [<ffffffff80227b43>] ? default_wake_function+0x0/0xf [<ffffffff802320dc>] sys_wait4+0x91/0xab [<ffffffff8020b21b>] system_call_after_swapgs+0x7b/0x80 --
On Wed, 20 Feb 2008 14:34:17 -0500 Probably - am testing some locking patches now --
Hi Andrew, The 2.6.25-rc2-mm1 kernel oopses, followed by softlockup several times (have pasted only some of them) on the x86_64 machine. The machine has 4 cpu(s). BUG: unable to handle kernel NULL pointer dereference at 0000000000000219 IP: [<ffffffff802ee99a>] security_inode_getattr+0x4/0x21 PGD 1da947067 PUD 1e1803067 PMD 0 Oops: 0000 [1] SMP last sysfs file: /sys/devices/system/cpu/cpu1/cpufreq/scaling_setspeed CPU 2 Modules linked in: auth_rpcgss exportfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq dm_mirror dm_mod video output sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy tg3 button ide_cd_mod cdrom serio_raw i2c_i801 pcspkr e752x_edac edac_core shpchp i2c_core aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd [last unloaded: microcode] Pid: 3069, comm: modprobe Not tainted 2.6.25-rc2-mm1-autotest #1 RIP: 0010:[<ffffffff802ee99a>] [<ffffffff802ee99a>] security_inode_getattr+0x4/0x21 RSP: 0018:ffff8101da9e9ea0 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8101e1cd7a40 RCX: 0000000000000001 RDX: ffff8101da9e9ef8 RSI: ffff8101e1cd7a40 RDI: ffff8101e5946dc0 RBP: 00000000fffffff7 R08: 0000000000000002 R09: 0000000000000002 R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000 R13: ffff8101da9e9ef8 R14: ffff8101e5946dc0 R15: 000000000061a660 FS: 00007fc33bc746f0(0000) GS:ffff8101e714de40(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000219 CR3: 00000001da894000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 3069, threadinfo ffff8101da9e8000, task ffff8101e51975e0) Stack: ffffffff8028e55d ffff8101e7111300 00000000fffffff7 ffff8101da9e9ef8 0000000000000003 0000000000000001 ffffffff8028e5ca 00007fff43c90120 0000000000618e40 0000000000000000 ffffffff8028e5ec ffffffff8025b7e3 Call Trace: [<ffffffff8028e55d>] ...
Beats me. Looks like we somehow passed a garbage dentry* into security_inode_getattr(). But 0x219? That could be an offset from an accidentally IS_ERR pointer, but sizeof(struct dentry) is only 0xa0 here, so the pointer would have to have a value of -0x139 or less, and that's outside the range of any sane errnos. If it's reproducible then a bisection search would be great, please. --
Hi Andrew, I tried reproducing this panic, but was unsuccessful is reproducing it even after four rounds of try, One of those round i had the following kernel panic BUG: unable to handle kernel paging request at 000000000508fffe IP: [<ffff8101e5d15e44>] PGD 1e382b067 PUD 1e38a9067 PMD 0 Oops: 0002 [1] SMP last sysfs file: /sys/block/hda/removable CPU 3 Modules linked in: dm_mirror dm_mod video output sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg floppy tg3 ide_cd_mod button cdrom serio_raw i2c_i801 e752x_edac shpchp edac_core i2c_core pcspkr aic79xx scsi_transport_spi sd_mod scsi_mod ehci_hcd ohci_hcd uhci_hcd Pid: 0, comm: swapper Not tainted 2.6.25-rc2-mm1-autotest #1 RIP: 0010:[<ffff8101e5d15e44>] [<ffff8101e5d15e44>] RSP: 0018:ffff8101e71dbf08 EFLAGS: 00010282 RAX: ffff8101e408cb00 RBX: ffff81000104175f RCX: ffffffffffffffff RDX: 0000000000000060 RSI: 7fffffffffffffff RDI: ffff8101e408cb00 RBP: ffff8101e5839680 R08: 0000000000000004 R09: 000000000000003c R10: ffff8101e711a4c8 R11: ffff8101e71dbf10 R12: 0000000000000002 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8101e714d640(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 000000000508fffe CR3: 00000001e5cbf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffff8101e71d2000, task ffff8101e71cea70) Stack: ffffffff80260a4c 0000000000000001 ffffffff806800f0 000000000000000a ffffffff80260ada ffffffff806800e0 ffffffff80236f33 ffff8101e71d3e88 0000000000000046 ffff8101e71dbf78 0000000000000000 0000000000000000 Call Trace: <IRQ> [<ffffffff80260a4c>] __rcu_process_callbacks+0x10f/0x17a [<ffffffff80260ada>] rcu_process_callbacks+0x23/0x43 [<ffffffff80236f33>] __do_softirq+0x55/0xc4 [<ffffffff8020cfec>] call_softirq+0x1c/0x28 [<ffffffff8020e677>] do_softirq+0x2c/0x68 ...
ACPI is enabled, but DMI=n. linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c: In function 'acpi_thermal_init': linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c:1792: error: 'thermal_dmi_table' undeclared (first use in this function) linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c:1792: error: (Each undeclared identifier is reported only once linux-2.6.25-rc2-mm1/drivers/acpi/thermal.c:1792: error: for each function it appears in.) make[3]: *** [drivers/acpi/thermal.o] Error 1 --- ~Randy --
Bustage in x86-configurable-dmi-scanning-code.patch. Previously, DMI=y was just hardwired. Now, it becomes selectable and stuff breaks. I guess the DMI=n version of dmi_check_system() could become a macro so we don't emit a reference to its argument, but that might generate unused-variable warnings elsewhere. --
Hi,
On Sat, 16 Feb 2008 21:44:10 -0800,
Thanks for your report. The issue is that some DMI fixup tables and
callbacks are defined inside #ifdef CONFIG_DMI, some others are not. We
need to normalize that to fix the build issue in all situations.
I've thought about it, and I see two options, but I can't decide which
one is the best, so I request your opinion on that.
1) Remove the #ifdef CONFIG_DMI around DMI fixup tables and callbacks
definition, so that everything exists and gcc is happy. gcc is able
to optimize out the DMI fixup table (it is not present in the binary
when compiling with DMI=n), but gcc doesn't seem to be able to
optimize out the DMI fixup callbacks (they are still present in the
binary). So this would leave some unused code in the binary, which
is not completely satisfying.
2) Define macros such as DECLARE_DMI_FIXUP_TABLE and
DECLARE_DMI_FIXUP_CALLBACK, which could then be used like this:
DECLARE_DMI_FIXUP_CALLBACK(set_bios_reboot, __init, d, {
if (reboot_type != BOOT_BIOS) {
reboot_type = BOOT_BIOS;
printk(KERN_INFO "%s series board detected. Selecting BIOS-method for reboots.\n", d->ident);
}
}
return 0;
});
DECLARE_DMI_FIXUP_TABLE(reboot_dmi_table, __initdata, {
{ /* Handle problems with rebooting on Dell E520's */
.callback = set_bios_reboot,
.ident = "Dell E520",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell DM061"),
},
}
});
And use them everywhere, so that DMI fixup tables and callbacks
are properly compiled out when DMI=n. Here are the macro definitions:
#ifdef CONFIG_DMI
#define DECLARE_DMI_FIXUP_TABLE(name, opts, contents...) \
static struct dmi_system_id opts name [] = contents
#define DECLARE_DMI_FIXUP_CALLBACK(name, opts, id, contents...) \
static int opts name(const struct dmi_system_id *id) contents
#else
#define DECLARE_DMI_FIXUP_TABLE(name, opts, contents...)
#define ...
Option 3 would be to add more #ifdef CONFIG_DMI lines around the place. How ugly would that get? --
On Mon, 18 Feb 2008 04:13:40 -0800,
Like the attached patch. #ifdef CONFIG_DMI everywhere :-(
Sincerely,
Thomas
---
Turn CONFIG_DMI into a selectable option if EMBEDDED is defined, in order to be able to remove the DMI table scanning code if it's not needed, and then reduce the kernel code size.
The DMI code users are modified, so that they either depend on CONFIG_DMI (for the drivers who really need DMI to work) or their DMI-related code is enclosed in #ifdef CONFIG_DMI.
With CONFIG_DMI (i.e. before):
   text    data     bss     dec     hex filename
1076076  128656   98304 1303036  13e1fc vmlinux
Without CONFIG_DMI (i.e. after):
   text    data     bss     dec     hex filename
1068092  126308   98304 1292704  13b9a0 vmlinux
Result:
   text    data     bss     dec     hex filename
  -7984   -2348       0  -10332   -285c vmlinux
The new option appears in "Processor type and features", only when CONFIG_EMBEDDED is defined.
This patch is part of the Linux Tiny project, and is based on previous work done by Matt Mackall <mpm@selenic.com>.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
arch/x86/Kconfig | 13 ++++++++++---
arch/x86/kernel/acpi/boot.c | 4 ++--
arch/x86/kernel/acpi/sleep_32.c | 2 ++
arch/x86/kernel/apm_32.c | 2 ++
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c | 4 ++--
arch/x86/kernel/cpu/cpufreq/powernow-k7.c | 3 ++-
arch/x86/kernel/io_delay.c | 2 ++
arch/x86/kernel/reboot.c | 2 ++
arch/x86/kernel/tsc_32.c | 2 ++
arch/x86/mach-generic/bigsmp.c | 3 ++-
arch/x86/pci/acpi.c | 2 ++
arch/x86/pci/common.c | 2 ++
arch/x86/pci/fixup.c | 5 ++++-
arch/x86/pci/irq.c | 2 ++
drivers/acpi/sleep/main.c | 2 ++
...
Does this patch apply to -mm? Seems like no. After converting it from MIME(?) to ASCII, fixing one #if (changing "and" to "&&") and fixing patch rejects, it does build cleanly. --- ~Randy --
On Tue, 19 Feb 2008 09:41:47 -0800, Probably due to my PGP-MIME signature. Will try to remember that. The rejects are probably due to the patch being applied to -mm. It applies fine on -rc here. Any opinion about whether the patch is clean? Worth it? Thanks for testing the patch, Thomas -- Thomas Petazzoni, Free Electrons Free Embedded Linux Training Materials on http://free-electrons.com/training (More than 1500 pages!) --
It seems reasonable to me as long as the option depends on EMBEDDED, as it does. -- ~Randy --
ug, sorry, if I'd realised it was like this I'd have said "don't bother". Apart from the obvious problem, this means that people will keep breaking CONFIG_DMI=n all the time, because they will forget the ifdefs, and the number of people who test with CONFIG_DMI=n will be small. --
On Tue, 19 Feb 2008 15:21:29 -0800, Yes, #ifdef CONFIG_DMI is not very comfortable. That's why I proposed things such as DECLARE_DMI_FIXUP_TABLE(), because it would force people to use these macros, which would then work correctly depending on DMI=y/n. However, there's still the issue of driver_data that I mentioned in my earlier post. What should I do? Option 1? Option 2? Give up on the patch? Thanks for your comments, Thomas -- Thomas Petazzoni, Free Electrons Free Embedded Linux Training Materials on http://free-electrons.com/training (More than 1500 pages!)
Option 1 would be best, I think:
1) Remove the #ifdef CONFIG_DMI around DMI fixup tables and callbacks
definition, so that everything exists and gcc is happy. gcc is able
to optimize out the DMI fixup table (it is not present in the binary
when compiling with DMI=n), but gcc doesn't seem to be able to
optimize out the DMI fixup callbacks (they are still present in the
binary). So this would leave some unused code in the binary, which
is not completely satisfying.
gcc _should_ be able to remove the callbacks as long as they are static and
have no references. If even the latest gcc versions are still including the
unreferenced, static function in the final vmlinux then let's get gcc fixed?
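A rough user-space sketch of what option 1 looks like, for illustration only. The struct below is a cut-down, hypothetical version of struct dmi_system_id, reboot_init() is an invented caller, and the DMI=n stub is assumed to be a static inline that ignores its argument. The table and callback are defined unconditionally; with DMI=n nothing refers to them once the stub is inlined, so an optimizing compiler is free to discard both.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical version of struct dmi_system_id. */
struct dmi_system_id {
	int (*callback)(const struct dmi_system_id *id);
	const char *ident;
};

#ifdef CONFIG_DMI
/* With DMI=y the real implementation is provided elsewhere. */
int dmi_check_system(const struct dmi_system_id *list);
#else
/* Assumed DMI=n stub: no table walk, no callbacks, always "no match". */
static inline int dmi_check_system(const struct dmi_system_id *list)
{
	(void)list;
	return 0;
}
#endif

static int reboot_type;	/* 0 = default, 1 = BIOS (model values) */

/* Defined without any #ifdef; only the table below refers to it. */
static int set_bios_reboot(const struct dmi_system_id *d)
{
	(void)d;
	reboot_type = 1;
	return 0;
}

static const struct dmi_system_id reboot_dmi_table[] = {
	{ .callback = set_bios_reboot, .ident = "Dell E520" },
	{ .callback = NULL }	/* terminator */
};

/* Hypothetical caller: compiles the same way with DMI=y and DMI=n. */
int reboot_init(void)
{
	return dmi_check_system(reboot_dmi_table);
}
```

Whether a given gcc version really drops set_bios_reboot() from the final image is exactly the open question above; the sketch only shows that the source needs no #ifdef at the use sites.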
--
When SMP=n, x86_64 build gets: arch/x86/kernel/built-in.o: In function `acpi_save_state_mem': (.text+0xfd7f): undefined reference to `setup_trampoline' make[1]: *** [.tmp_vmlinux1] Error 1 --- ~Randy --
Thanks. Say hello to Pavel. --
Sorry, I was in mountains, with electricity 5 hours a day... I believe this was solved already? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
I've sent a patch, it's being tested. Thanks, Rafael --
Please try the patch at http://lkml.org/lkml/2008/2/16/301 . Thanks, Rafael --
Yes, fixed. Thanks. -- ~Randy --
Andrew,
This patch, by introducing sizeof(long) in the BITS_TO_LONGS
math, changes BITS_TO_LONGS from an int to an unsigned long. We noticed
because this printk in fs/ocfs2/dlm/dlmdomain.c:
mlog(ML_ERROR,
"map_size %u != BITS_TO_LONGS(O2NM_MAX_NODES) %u\n",
map_size, BITS_TO_LONGS(O2NM_MAX_NODES));
now gives this warning:
fs/ocfs2/dlm/dlmdomain.c:938: warning: format '%u' expects type
'unsigned int', but argument 7 has type 'long unsigned int'
We can tweak the printk once the patch goes to Linus, no worries. I
just wanted to send a heads up in case the size change affects anything
else.
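The type change can be reproduced outside the kernel. The sketch below is an approximation under assumptions: the two macro bodies are modelled on the description above rather than copied from the patch, __SIZEOF_LONG__ is a GCC/Clang predefined macro, MAX_NODES with value 255 is an arbitrary stand-in for O2NM_MAX_NODES, and the %zu fix is one possible tweak (a cast to unsigned int would be another).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Old style: the divisor is a plain int constant, so the result is int. */
#define BITS_PER_LONG_INT	(CHAR_BIT * __SIZEOF_LONG__)
#define BITS_TO_LONGS_OLD(nr)	DIV_ROUND_UP(nr, BITS_PER_LONG_INT)

/* New style: sizeof yields size_t, which converts the whole expression. */
#define BITS_TO_LONGS_NEW(nr)	DIV_ROUND_UP(nr, CHAR_BIT * sizeof(long))

/* C11 helpers that report the static type of an expression. */
#define IS_INT(x)	_Generic((x), int: 1, default: 0)
#define IS_SIZE_T(x)	_Generic((x), size_t: 1, default: 0)

#define MAX_NODES 255	/* arbitrary stand-in for O2NM_MAX_NODES */

/* Formats the message with a conversion that matches the new type. */
int format_map_size(char *buf, size_t len, unsigned int map_size)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"map_size %u != BITS_TO_LONGS(MAX_NODES) %zu",
			map_size, BITS_TO_LONGS_NEW(MAX_NODES));
}
```

The value is unchanged; only the type of the expression differs, which is why the effect shows up as a format warning on builds where unsigned long is wider than unsigned int.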
Joel
--
There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
--
When building on x86_64. I forgot that bit :-) Joel -- Life's Little Instruction Book #444 "Never underestimate the power of a kind word or deed." Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 --
Guys, create_proc_entry() is slightly racy in case of modular code and proc_create() was invented to fix it. Eventually all create_proc_entry() users will be converted to proc_create(), so please do it for new code. --
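A toy user-space model of the ordering problem, as I understand it: with the two-step pattern the entry is visible before the caller has assigned its file operations, while the one-step pattern sets them first. Everything here (entry_model, publish(), the *_model functions) is hypothetical and single-threaded; publish() merely records what a concurrent open() would have seen at that instant.

```c
#include <assert.h>
#include <stddef.h>

struct file_ops_model {
	int (*open)(void);
};

struct entry_model {
	const struct file_ops_model *fops;
	int visible;
};

/* What a reader racing with registration would find in ->fops. */
const struct file_ops_model *seen_at_publish;

static void publish(struct entry_model *e)
{
	e->visible = 1;
	/* A concurrent open() could run right here. */
	seen_at_publish = e->fops;
}

/* Two-step pattern: publish first, the caller fills in ->fops afterwards. */
struct entry_model *create_entry_model(struct entry_model *e)
{
	assert(e != NULL);
	e->fops = NULL;
	publish(e);
	return e;
}

/* One-step pattern: ->fops is set before the entry becomes visible. */
struct entry_model *create_model(struct entry_model *e,
				 const struct file_ops_model *fops)
{
	assert(e != NULL);
	e->fops = fops;
	publish(e);
	return e;
}
```

In the model, a caller of create_entry_model() that assigns ->fops on the returned entry is already too late for anything that looked at the entry during publish(); create_model() has no such window.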
The first two patches are -mm-only debug patches - they won't be going into mainline. I'll make a note that cciss-procfs-updates-to-display-info-about-many-volumes.patch needs updating, thanks. --
profile-likely-unlikely-macros isn't new code, but I can still convert it to proc_create() if you like. Daniel --
Why am I getting "implicit declaration of function 'proc_create'" when I try to use that function? I have kernel version 2.6.24, gcc version 4.1.1 20070105 (Red Hat 4.1.1-51). Am I missing a header file? -- mikem --
2.6.24 is 47MB of diff ago. Please always develop and test against development kernels! --
CIFS has some build problems: linux-2.6.25-rc2-mm1/fs/cifs/cifs_debug.c:922: error: static declaration of 'cifs_proc_init' follows non-static declaration linux-2.6.25-rc2-mm1/fs/cifs/cifsproto.h:112: error: previous declaration of 'cifs_proc_init' was here linux-2.6.25-rc2-mm1/fs/cifs/cifs_debug.c:926: error: static declaration of 'cifs_proc_clean' follows non-static declaration linux-2.6.25-rc2-mm1/fs/cifs/cifsproto.h:113: error: previous declaration of 'cifs_proc_clean' was here make[3]: *** [fs/cifs/cifs_debug.o] Error 1 .config is attached. --- ~Randy
Thanks for spotting this - it only would happen if CONFIG_PROC_FS is disabled. I have fixed it in the cifs-2.6.git tree so should be fine next time akpm pulls. -- Thanks, Steve --
Building i386 kernel on x86_64, I see a build error in linking: kernel/built-in.o: In function `jiffies_64_to_usecs': (.text+0xeaed): undefined reference to `__udivdi3' make[1]: *** [.tmp_vmlinux1] Error 1 .config attached. --- ~Randy
It's possible to config a specific CPU and also enable Intel MCE checks and AMD MCE checks, ending with this: arch/x86/kernel/built-in.o: In function `smp_thermal_interrupt_init': mce_amd_64.c:(.text+0xd7ea): undefined reference to `num_k8_northbridges' mce_amd_64.c:(.text+0xd800): undefined reference to `k8_northbridges' mce_amd_64.c:(.text+0xd863): undefined reference to `k8_northbridges' mce_amd_64.c:(.text+0xd888): undefined reference to `k8_northbridges' mce_amd_64.c:(.text+0xd8b1): undefined reference to `num_k8_northbridges' mce_amd_64.c:(.text+0xd8d8): undefined reference to `num_k8_northbridges' mce_amd_64.c:(.text+0xd8f3): undefined reference to `k8_northbridges' mce_amd_64.c:(.text+0xd917): undefined reference to `num_k8_northbridges' make[1]: *** [.tmp_vmlinux1] Error 1 .config file is attached. --- ~Randy
same config no problem with mainline. YH --
That's x86-amd-thermal-interrupt-support.patch failing with
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
--
cciss driver has a bad macro definition: #else /* no CONFIG_CISS_SCSI_TAPE */ /* If no tape support, then these become defined out of existence */ #define cciss_scsi_setup(cntl_num) #define cciss_unregister_scsi(ctlr) #define cciss_register_scsi(ctlr) #define cciss_seq_tape_report(struct seq_file *seq, int ctlr) #endif /* CONFIG_CISS_SCSI_TAPE */ which causes this error: In file included from /local/linsrc/linux-2.6.25-rc2-mm1/drivers/block/cciss.c:231: /local/linsrc/linux-2.6.25-rc2-mm1/drivers/block/cciss_scsi.c:1498:38: error: macro parameters must be comma-separated make[3]: *** [drivers/block/cciss.o] Error 1 --- ~Randy --
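For reference, a function-like macro takes bare parameter names, not typed declarations, which is what the preprocessor is complaining about. A minimal sketch of stub macros that compile is shown below; the do { } while (0) bodies are my choice for statement safety (the driver's stubs expand to nothing), and show_controller() is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>

/* Stubs for the no-tape-support case: parameter names only, no types. */
#define cciss_scsi_setup(cntl_num)		do { } while (0)
#define cciss_unregister_scsi(ctlr)		do { } while (0)
#define cciss_register_scsi(ctlr)		do { } while (0)
#define cciss_seq_tape_report(seq, ctlr)	do { } while (0)

struct seq_file;	/* opaque in this sketch */

/* Hypothetical caller: the stub swallows both arguments. */
int show_controller(struct seq_file *seq, int ctlr)
{
	cciss_seq_tape_report(seq, ctlr);
	return ctlr;
}
```

If type checking of the arguments matters even without tape support, a static inline function taking (struct seq_file *seq, int ctlr) would be an alternative to the macro.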
I don't think I've seen anyone else report this, but if I'm wrong, I'm sure someone will point me to the thread. This is during boot up, and doesn't seem to have any effect on the running system, that I can tell. [ 0.090840] ------------[ cut here ]------------ [ 0.090920] WARNING: at kernel/lockdep.c:2677 check_flags+0x8d/0x12d() [ 0.090986] Pid: 1, comm: swapper Not tainted 2.6.25-rc2-mm1 #49 [ 0.090986] [<c011b92d>] warn_on_slowpath+0x41/0x72 [ 0.090986] [<c0136ff1>] ? __lock_acquire+0xb99/0xbb5 [ 0.090986] [<c0140632>] ? ftrace_record_ip+0x11e/0x17a [ 0.090986] [<c0102bd0>] ? mcount_call+0x5/0x9 [ 0.090986] [<c01344f0>] ? check_chain_key+0xe/0x16a [ 0.090986] [<c0140632>] ? ftrace_record_ip+0x11e/0x17a [ 0.090986] [<c01dd1a8>] ? debug_locks_off+0x8/0x3c [ 0.090986] [<c0102bd0>] ? mcount_call+0x5/0x9 [ 0.090986] [<c01182c4>] ? sub_preempt_count+0xa/0xb0 [ 0.090986] [<c03453bd>] ? _spin_unlock_irqrestore+0x47/0x5d [ 0.090986] [<c0140632>] ? ftrace_record_ip+0x11e/0x17a [ 0.090986] [<c0102bd0>] ? mcount_call+0x5/0x9 [ 0.090986] [<c0134442>] check_flags+0x8d/0x12d [ 0.090986] [<c0137043>] lock_acquire+0x36/0x82 [ 0.090986] [<c03440c2>] down_write+0x2d/0x48 [ 0.090986] [<c015e91a>] ? kmem_cache_create+0x21/0x1a7 [ 0.090986] [<c015e91a>] kmem_cache_create+0x21/0x1a7 [ 0.090986] [<c0449642>] filelock_init+0x23/0x2c [ 0.090986] [<c016d908>] ? init_once+0x0/0x11 [ 0.090986] [<c043d697>] kernel_init+0xb6/0x203 [ 0.090986] [<c0102dae>] ? restore_nocheck_notrace+0x0/0xe [ 0.090986] [<c043d5e1>] ? kernel_init+0x0/0x203 [ 0.090986] [<c043d5e1>] ? kernel_init+0x0/0x203 [ 0.090986] [<c01038d3>] kernel_thread_helper+0x7/0x10 [ 0.090986] ======================= [ 0.090986] ---[ end trace 4eaa2a86a8e2da22 ]--- [ 0.090986] possible reason: unannotated irqs-on. [ 0.090986] irq event stamp: 1404 [ 0.090986] hardirqs last enabled at (1403): ...
Sorry, here it is. # # Automatically generated make config: don't edit # Linux kernel version: 2.6.25-rc2-mm1 # Sun Feb 17 13:12:53 2008 # # CONFIG_64BIT is not set CONFIG_X86_32=y # CONFIG_X86_64 is not set CONFIG_X86=y # CONFIG_GENERIC_LOCKBREAK is not set CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y # CONFIG_GENERIC_GPIO is not set CONFIG_ARCH_MAY_HAVE_PC_FDC=y # CONFIG_RWSEM_GENERIC_SPINLOCK is not set CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y # CONFIG_GENERIC_TIME_VSYSCALL is not set CONFIG_ARCH_HAS_CPU_RELAX=y # CONFIG_HAVE_SETUP_PER_CPU_AREA is not set CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y # CONFIG_ZONE_DMA32 is not set CONFIG_ARCH_POPULATES_NODE_MAP=y # CONFIG_AUDIT_ARCH is not set CONFIG_ARCH_SUPPORTS_AOUT=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_X86_BIOS_REBOOT=y CONFIG_KTIME_SCALAR=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y # CONFIG_SYSVIPC is not set # CONFIG_POSIX_MQUEUE is not set # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 # CONFIG_CGROUPS is not set CONFIG_GROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y # CONFIG_RT_GROUP_SCHED is not ...
Ok, so here's my story on 2.6.25-rc2-mm1: Built fine on my Pentium D in 32 bit mode, booted too, although complaining once already while unpacking the initramfs:

<0>[ 0.069176] BUG: spinlock bad magic on CPU#0, swapper/0
<0>[ 0.069324] lock: c2c19480, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
<4>[ 0.069559] Pid: 0, comm: swapper Not tainted 2.6.25-rc2-mm1-testing #1
<4>[ 0.069710] [spin_bug+129/140] spin_bug+0x81/0x8c
<4>[ 0.069907] [_raw_spin_unlock+30/118] _raw_spin_unlock+0x1e/0x76
<4>[ 0.069997] [_spin_unlock+34/65] _spin_unlock+0x22/0x41
<4>[ 0.070194] [mnt_want_write+103/138] mnt_want_write+0x67/0x8a
<4>[ 0.070390] [sys_mkdirat+139/219] sys_mkdirat+0x8b/0xdb
<4>[ 0.070584] [clean_path+27/79] ? clean_path+0x1b/0x4f
<4>[ 0.070829] [trace_hardirqs_on+11/13] ? trace_hardirqs_on+0xb/0xd
<4>[ 0.071185] [sys_mkdir+21/23] sys_mkdir+0x15/0x17
<4>[ 0.071378] [do_name+279/440] do_name+0x117/0x1b8
<4>[ 0.071570] [write_buffer+34/49] write_buffer+0x22/0x31
<4>[ 0.071763] [flush_window+105/184] flush_window+0x69/0xb8
<4>[ 0.071996] [unpack_to_rootfs+1585/2238] unpack_to_rootfs+0x631/0x8be
<4>[ 0.072192] [trace_hardirqs_on_caller+248/301] ? trace_hardirqs_on_caller+0xf8/0x12d
<4>[ 0.072440] [restore_nocheck_notrace+0/16] ? restore_nocheck_notrace+0x0/0x10
<4>[ 0.072689] [populate_rootfs+37/270] populate_rootfs+0x25/0x10e
<4>[ 0.072886] [alternative_instructions+344/349] ? alternative_instructions+0x158/0x15d
<4>[ 0.073139] [start_kernel+840/858] start_kernel+0x348/0x35a
<4>[ 0.073335] =======================

Still, X came up fine, I could log in (Gnome feeling subjectively a bit sluggish), call up a web page from the Internet in Firefox, and start perusing ...
I guess the cause for this is a combination of preemptible RCU and conntrack using RCU since 2.6.25-rc. Using NF_CT_STAT_INC_ATOMIC should fix it, but I'd prefer to have a fix that doesn't increase overhead when regular RCU is used. I'll see if I can find a better way to fix this tomorrow.
Could you test whether this patch fixes the netfilter warnings please?
On Thu, 21 Feb 2008 12:28:50 +0100 Use rcu_read_lock() instead. local_bh_disable() won't work with some of the alternative RCU implementations.
The caller already calls rcu_read_lock(). This is for the per-cpu statistics.
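To make the point about the per-cpu statistics concrete, here is a small userspace model of the problem. It is only a sketch, not the kernel code: every model_* name is hypothetical, and the "pin depth" counter merely stands in for the kernel's notion of a task that cannot be preempted or migrated. It assumes that a classic RCU read-side section pins the task while a preemptible RCU read-side section does not, so a plain per-cpu increment under rcu_read_lock() alone becomes unsafe, and that a variant which disables bottom halves around the increment is safe under either flavour.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Hypothetical stand-in for a per-CPU statistics structure. */
struct model_ct_stat {
	unsigned long found;
};

static struct model_ct_stat model_stat[MODEL_NR_CPUS];
static int model_current_cpu;        /* CPU the modelled task runs on */
static int model_pin_depth;          /* > 0: task cannot be preempted or migrated */
static bool model_preemptible_rcu;   /* selects which RCU flavour is modelled */
static unsigned long model_warnings; /* per-CPU accesses made while preemptible */

static void model_bh_disable(void)
{
	model_pin_depth++;
}

static void model_bh_enable(void)
{
	assert(model_pin_depth > 0);
	model_pin_depth--;
}

/* Classic RCU read side pins the task; the preemptible flavour does not. */
static void model_rcu_read_lock(void)
{
	if (!model_preemptible_rcu)
		model_pin_depth++;
}

static void model_rcu_read_unlock(void)
{
	if (!model_preemptible_rcu) {
		assert(model_pin_depth > 0);
		model_pin_depth--;
	}
}

/* Return this CPU's slot, counting a warning if the task could migrate. */
static struct model_ct_stat *model_this_cpu_stat(void)
{
	if (model_pin_depth == 0)
		model_warnings++;
	return &model_stat[model_current_cpu];
}

/* Plain increment: only safe if the caller already pinned the task. */
static void model_stat_inc(void)
{
	model_this_cpu_stat()->found++;
}

/* Increment that pins the task itself, at the cost of two extra calls. */
static void model_stat_inc_atomic(void)
{
	model_bh_disable();
	model_this_cpu_stat()->found++;
	model_bh_enable();
}
```

In this model, model_stat_inc() inside a read-side section is clean with the classic flavour and raises a warning with the preemptible one, while model_stat_inc_atomic() is clean in both cases. The extra disable/enable pair is the overhead the earlier mail would like to avoid when regular RCU is in use.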
Yes, it does; and the system also survives substantially longer. (IOW, it hasn't crashed on me so far.) Thanks, T.
-- 
Tilman Schmidt
E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
Which of course it did the second after I sent off that mail. :-( No message at all this time at the time of the crash, even though I had "tail -f /var/log/messages" running in an ssh session. So the nf_conntrack BUG is fixed, but the crash (and of course the swapper "spinlock bad magic" BUG) persists.
Do you have CONFIG_DEBUG_PREEMPT set? That would help find any other bugs similar to nf_conntrack. Thanx, Paul
CONFIG_DEBUG_PREEMPT=y was set but didn't produce anything. Or perhaps it did and the message just didn't make it to the disk. Time to set up a test with netconsole, I guess.
Bad news: With 2.6.25-rc3, that bug has made it into mainline. Good news: Your patch fixes it there, too. So I suggest you forward it there as soon as possible. Thanks, Tilman
Already done, should hit upstream soon.
(net-related cc's removed) This looks like a startup ordering bug in mnt_want_write().
Do you have CONFIG_ACPI_CUSTOM_DSDT_INITRD set?
Negative. HTH T.
Let me look into it a bit. Although, it does seem that this stuff is just calling into the filesystem code too early. The mnt_writers[] spinlocks are init'd with a:

	fs_initcall(init_mnt_writers);

and populate_rootfs() is supposed to happen in a rootfs_initcall(), so I'm a bit confused about how it happened in this order.

-- Dave
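The ordering argument above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel's initcall machinery: all model_* names are hypothetical, the numeric levels only encode "fs runs before rootfs", and the magic constant is an illustrative non-zero marker for an initialised lock. A lock that lives in zeroed memory and has not been initialised yet has magic 0, which is what the "spinlock bad magic ... .magic: 00000000" report earlier in the thread shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_SPINLOCK_MAGIC 0xdead4eadu /* illustrative "initialised" marker */

/* Only the relative order matters here: fs-level calls run before rootfs. */
enum model_level {
	MODEL_LEVEL_FS = 5,
	MODEL_LEVEL_ROOTFS = 6
};

struct model_lock {
	unsigned int magic;
};

struct model_initcall {
	int level;
	int (*fn)(void);
};

static struct model_lock model_mnt_writer_lock; /* zeroed: magic == 0 */
static int model_bad_magic_reports;

/* Stand-in for the fs-level initialiser of the lock. */
static int model_init_mnt_writers(void)
{
	model_mnt_writer_lock.magic = MODEL_SPINLOCK_MAGIC;
	return 0;
}

/* Stand-in for a debug check performed whenever the lock is used. */
static void model_lock_check(const struct model_lock *lock)
{
	if (lock->magic != MODEL_SPINLOCK_MAGIC)
		model_bad_magic_reports++;
}

/* Stand-in for unpacking the initramfs, which ends up using the lock. */
static int model_populate_rootfs(void)
{
	model_lock_check(&model_mnt_writer_lock);
	return 0;
}

static int model_cmp_level(const void *a, const void *b)
{
	const struct model_initcall *x = a;
	const struct model_initcall *y = b;

	return (x->level > y->level) - (x->level < y->level);
}

/* Run every registered call in ascending level order. */
static void model_run_initcalls(struct model_initcall *calls, size_t n)
{
	size_t i;

	qsort(calls, n, sizeof(*calls), model_cmp_level);
	for (i = 0; i < n; i++) {
		assert(calls[i].fn != NULL);
		calls[i].fn();
	}
}
```

If model_populate_rootfs() is only ever reached through model_run_initcalls(), the lock is initialised first and no report is produced. A report appears only when something calls it directly before the ordered run, which matches the puzzle in the mail: the trace shows populate_rootfs() being reached from start_kernel(), so some path other than the rootfs-level initcall appears to be involved.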
