[BUILD_FAILURE] 2.6.25-rc5-mm1 build fails at startup_ipi_hook() with randconfig

From: Andrew Morton
Date: Tuesday, March 11, 2008 - 1:14 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc5/2.6.25-rc5-mm1/

- Added the kgdb tree as git-kgdb-light (Jason Wessel, Ingo Molnar)

- Added a random-security-stuff-apart-from-selinux tree as
  git-security-testing (James Morris)

- suspend-to-disk is still busted on my x86_64 t61p (git-x86, iirc)



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.  These probably are at least compilable.

- More-than-daily -mm snapshots may be found at
  http://userweb.kernel.org/~akpm/mmotm/.  These are almost certainly not
  compilable.
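
The "around ten rebuilds" figure in the bisection item above follows from binary search: halving a series of roughly a thousand patches takes about ten steps. A minimal sketch, assuming a hypothetical `is_bad` predicate that stands in for "apply the first k patches, rebuild, reboot and test"; none of these names come from any -mm tooling.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical predicate: nonzero if the tree with the first `count`
 * patches applied shows the bug.  In real life this is a manual rebuild
 * and reboot, not a function call.
 */
typedef int (*is_bad_fn)(size_t count, void *ctx);

/*
 * Return the 1-based index of the first patch that introduces the bug,
 * assuming "no patches applied" is good and "all n applied" is bad.
 * If `steps` is non-NULL it receives the number of builds needed.
 */
size_t bisect_first_bad(size_t n, is_bad_fn is_bad, void *ctx, size_t *steps)
{
	size_t good = 0;	/* known good with this many patches applied */
	size_t bad = n;		/* known bad with this many patches applied */
	size_t used = 0;

	assert(n > 0);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		used++;
		if (is_bad(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	if (steps)
		*steps = used;
	return bad;
}

/* Example predicate: the bug appears once patch number *ctx is applied. */
int bad_from(size_t count, void *ctx)
{
	return count >= *(size_t *)ctx;
}
```

Each loop iteration corresponds to one rebuild and reboot, which is why reporting the bug first is often cheaper than bisecting straight away.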



Changes since 2.6.25-rc3-mm1:

 ...
From: Kamalesh Babulal
Date: Tuesday, March 11, 2008 - 3:16 am

Hi Andrew,

The 2.6.25-rc5-mm1 kernel build fails with allyesconfig

  LD      .tmp_vmlinux1
fs/built-in.o: In function `reiser4_debugtrap':
/root/kernels/linux-2.6.25-rc5/fs/reiser4/debug.c:295: undefined reference to `breakpoint'
make: *** [.tmp_vmlinux1] Error 1

This build failure was introduced by reiser4.patch; I think breakpoint()
is being called where kgdb_breakpoint() should be used.


--- linux-2.6.25-rc5/fs/reiser4/debug.c	2008-03-11 22:12:45.000000000 +0530
+++ linux-2.6.25-rc5/fs/reiser4/~debug.c	2008-03-11 23:14:54.000000000 +0530
@@ -291,8 +291,8 @@ void reiser4_debugtrap(void)
 {
 	/* do nothing. Put break point here. */
 #if defined(CONFIG_KGDB) && !defined(CONFIG_REISER4_FS_MODULE)
-	extern void breakpoint(void);
-	breakpoint();
+	extern void kgdb_breakpoint(void);
+	kgdb_breakpoint();
 #endif
 }
 #endif
-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Edward Shishkin
Date: Tuesday, March 11, 2008 - 3:56 am

wow, kgdb is enabled again..

Thanks!
Edward


--

From: Kamalesh Babulal
Date: Tuesday, March 11, 2008 - 5:55 am

Hi Andrew,

The 2.6.25-rc5-mm1 kernel build fails with allmodconfig 

  MODPOST 2279 modules
ERROR: "probe_4drives" [drivers/ide/ide-core.ko] undefined!

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
From: Andrew Morton
Date: Tuesday, March 11, 2008 - 10:41 am

On Tue, 11 Mar 2008 18:25:02 +0530

Yes, it has been doing this for a while.  But apparently it doesn't happen
for Bart with just his patch queue.  <slightlypeeved>If your subsystem
fails in my tree, that doesn't automatically make it my
problem</slightlypeeved>

I'll take a look, see what went wrong.
--

From: Bartlomiej Zolnierkiewicz
Date: Tuesday, March 11, 2008 - 12:35 pm

No need to peeve - it is not like I forgot about the problem or ignored it
after I got your _first_ mail.

I just couldn't reproduce it here, so instead I have been working on making
_all_ probe_* variables static (it should make the problem vanish alongside
some other nice improvements).

Thanks,
Bart
--

From: Andrew Morton
Date: Tuesday, March 11, 2008 - 11:19 am

On Tue, 11 Mar 2008 18:25:02 +0530

Caused by ide-mm-ide-add-ide-4drives-host-driver-take-3.patch.  Applying
that patch alone to current mainline causes the above error after i386
`make allmodconfig'.

Just exporting the symbol doesn't fix it, so something funny is going on.

probe_4drives should not be initialised to zero.

probe_4drives should not be declared extern in drivers/ide/ide.c - please
declare it in a header which is included by the definition site and by all
users.
--
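
The two points about probe_4drives can be illustrated outside the kernel. This is a minimal single-file sketch, not the actual ide patch: the header name in the comment and the two helper functions are hypothetical. The explicit "= 0" is redundant because C zero-initialises objects with static storage duration, and a single shared declaration lets the compiler check the definition against every user.

```c
#include <assert.h>

/* ---- would live in a shared header, e.g. a hypothetical "ide-probe.h" ---- */
extern int probe_4drives;	/* one declaration, seen by definition and users */

/* ---- definition site: a .c file that includes the header ---- */
int probe_4drives;		/* no "= 0": static storage starts out zeroed */

/* ---- users: other .c files that include the same header ---- */
int want_4drives_probe(void)
{
	return probe_4drives != 0;
}

void set_4drives_probe(int enable)
{
	assert(enable == 0 || enable == 1);
	probe_4drives = enable;
}
```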

From: Bartlomiej Zolnierkiewicz
Date: Tuesday, March 11, 2008 - 12:36 pm

I was aware of the warnings and this was only temporary (it is already fixed
by a to-be-posted-today patch which removes the deprecated "idex=" kernel
parameters and makes _all_ probe_* variables static).

Thanks,
Bart
--

From: Randy Dunlap
Date: Tuesday, March 11, 2008 - 10:09 am

randconfig (x86_64) with
PCI=n
PARAVIRT=y
VSMP=n

ends with

arch/x86/kernel/built-in.o: In function `is_vsmp_box':
(.text+0x1178d): undefined reference to `early_pci_allowed'
arch/x86/kernel/built-in.o: In function `is_vsmp_box':
(.text+0x117a9): undefined reference to `read_pci_config'
arch/x86/kernel/built-in.o: In function `vsmp_init':
(.init.text+0x4fcc): undefined reference to `early_pci_allowed'
arch/x86/kernel/built-in.o: In function `vsmp_init':
(.init.text+0x501a): undefined reference to `read_pci_config'
make[1]: *** [.tmp_vmlinux1] Error 1

config attached.

---
~Randy
From: Jeremy Fitzhardinge
Date: Tuesday, March 11, 2008 - 11:18 am

Randy Dunlap wrote:


    J
--

From: Ravikiran G Thirumalai
Date: Tuesday, March 11, 2008 - 5:10 pm

Would anyone object to having PARAVIRT depend on PCI, since the
vsmp paravirt bits depend on PCI cfg space to determine if the system is
vsmp?   If not, this patch would suffice.

Glauber?

Thanks,
Kiran

---

Make PARAVIRT depend on PCI.

vSMP PARAVIRT ops probe the pci config space to determine if the
system is indeed a ScaleMP vSMP box.  Hence, depend on PCI to enable
PARAVIRT.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>

Index: linux-2.6.24/arch/x86/Kconfig
===================================================================
--- linux-2.6.24.orig/arch/x86/Kconfig	2008-03-11 16:38:26.000000000 -0700
+++ linux-2.6.24/arch/x86/Kconfig	2008-03-11 16:50:52.000000000 -0700
@@ -384,7 +384,7 @@ source "arch/x86/lguest/Kconfig"
 
 config PARAVIRT
 	bool "Enable paravirtualization code"
-	depends on !(X86_VISWS || X86_VOYAGER)
+	depends on !(X86_VISWS || X86_VOYAGER) && PCI
 	help
 	  This changes the kernel so it can modify itself when it is run
 	  under a hypervisor, potentially improving performance significantly
--

From: Randy Dunlap
Date: Tuesday, March 11, 2008 - 6:42 pm

Works for me.
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>



-- 
~Randy
--

From: Jeremy Fitzhardinge
Date: Tuesday, March 11, 2008 - 6:51 pm

NAK.  Xen doesn't depend on PCI at all.   Why not make VSMP depend on 
PCI?  Then you could put something like:

#ifdef CONFIG_X86_VSMP
extern void vsmp_init(void);
extern int is_vsmp_box(void);
#else
static inline void vsmp_init(void)
{
}

static inline int is_vsmp_box(void)
{
	return 0;
}
#endif


in an appropriate header.

Hm, looks like arch/x86/kernel/Makefile should be

obj-$(CONFIG_X86_VSMP)		+= vsmp_64.o


rather than making it depend directly on CONFIG_PARAVIRT.

    J
--

From: Ingo Molnar
Date: Wednesday, March 12, 2008 - 12:14 am

hm, that's not a good idea - there's nothing in lguest, Xen and even KVM 
that is inherently tied to PCI.

	Ingo
--

From: serge
Date: Tuesday, March 11, 2008 - 1:23 pm

Compiles and boots perfectly on s390 here.

thanks,
--

From: Andrew Morton
Date: Tuesday, March 11, 2008 - 1:39 pm

whee.

Things are going much much more smoothly now than they were in 2.6.24-rcX
and 2.6.23-rcX.  Tree integration problems are negligible, build errors
are far fewer and runtime problems seem to be rarer too.   Fingers crossed.

I guess this is due to a combination of

a) linux-next

b) intensive whining and

c) extra care which maintainers are taking (due to a) and b))


I suspect that fewer people are testing linux-next and -mm nowadays.  We
should encourage them to do so, although given the general
trainwreckishness of current mainline, this isn't really where our effort
should be expended.
--

From: Torsten Kaiser
Date: Wednesday, March 12, 2008 - 12:33 pm

On Tue, Mar 11, 2008 at 9:39 PM, Andrew Morton

2.6.25-rc3-mm1 worked nicely for me, but 2.6.25-rc5-mm1 does not boot.

dmesg:
[    0.000000] Linux version 2.6.25-rc5-mm1 (root@treogen) (gcc version 4.2.3 (Gentoo 4.2.3 p1.0)) #1 SMP Wed Mar 12 19:51:41 CET 2008
[    0.000000] Command line: earlyprintk=serial,ttyS0,115200 console=ttyS0,115200 console=tty1 crypt_root=/dev/md1 sata_nv.swncq=1
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000dffd0000 (usable)
[    0.000000]  BIOS-e820: 00000000dffd0000 - 00000000dffde000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000dffde000 - 00000000e0000000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
[    0.000000]  BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
[    0.000000] console [earlyser0] enabled
[    0.000000] end_pfn_map = 1179648
[    0.000000] DMI present.
[    0.000000] ACPI: RSDP 000FB080, 0024 (r2 ACPIAM)
[    0.000000] ACPI: XSDT DFFD0100, 0064 (r1 A_M_I_ OEMXSDT   4000713 MSFT       97)
[    0.000000] ACPI: FACP DFFD0290, 00F4 (r3 A_M_I_ OEMFACP   4000713 MSFT       97)
[    0.000000] ACPI: DSDT DFFD0450, 4FC5 (r1  S0027 S0027000        0 INTL 20051117)
[    0.000000] ACPI: FACS DFFDE000, 0040
[    0.000000] ACPI: APIC DFFD0390, 0080 (r1 A_M_I_ OEMAPIC   4000713 MSFT       97)
[    0.000000] ACPI: MCFG DFFD0410, 003C (r1 A_M_I_ OEMMCFG   4000713 MSFT       97)
[    0.000000] ACPI: OEMB DFFDE040, 0060 (r1 A_M_I_ AMI_OEM   4000713 MSFT       97)
[    0.000000] ACPI: HPET DFFD5420, 0038 (r1 A_M_I_ OEMHPET0  4000713 MSFT       97)
[    0.000000] ACPI: MCFG ...
From: Andrew Morton
Date: Wednesday, March 12, 2008 - 12:44 pm

On Wed, 12 Mar 2008 20:33:02 +0100

So you aren't using netconsole.  I had a series of hangs yesterday which
went away when netconsole was disabled.  I think netconsole is still

OK, so it looks like it died during networking initialisation.

Could you please add initcall_debug to the boot command line so we can see
which function it is getting stuck in?

--
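
With initcall_debug the kernel logs one line before and one line after each init function, so the last "calling" line without a matching "returned" line names the function that is stuck. A rough user-space sketch of that bracketing follows; `run_initcall_debug` and `demo_init` are hypothetical names, and the real kernel implementation differs in detail.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

typedef int (*initcall_t)(void);

/* Run one init function, logging entry, return value and elapsed time. */
int run_initcall_debug(const char *name, initcall_t fn)
{
	struct timespec t0, t1;
	long msecs;
	int ret;

	printf("calling  %s()\n", name);
	fflush(stdout);		/* get the line out before a possible hang */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	ret = fn();
	clock_gettime(CLOCK_MONOTONIC, &t1);

	msecs = (long)(t1.tv_sec - t0.tv_sec) * 1000 +
		(t1.tv_nsec - t0.tv_nsec) / 1000000;
	printf("initcall %s() returned %d after %ld msecs\n", name, ret, msecs);
	return ret;
}

/* Hypothetical init function, used only for illustration. */
int demo_init(void)
{
	return 0;
}
```

A function that never returns leaves only its "calling" line on the console, which is the information being asked for here.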

From: Torsten Kaiser
Date: Wednesday, March 12, 2008 - 1:01 pm

On Wed, Mar 12, 2008 at 8:44 PM, Andrew Morton


Yes, here is the result:
[    2.573979] PCI-DMA: Disabling AGP.
[    2.577639] PCI-DMA: aperture base @ 8000000 size 65536 KB
[    2.589504] PCI-DMA: using GART IOMMU.
[    2.593258] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
[    2.600132] initcall pci_iommu_init+0x0/0x20() returned 0 after 19 msecs
[    2.622146] calling  hpet_late_init+0x0/0x140()
[    2.626689] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 31
[    2.633022] hpet0: 3 32-bit timers, 25000000 Hz
[    2.638562] initcall hpet_late_init+0x0/0x140() returned 0 after 9 msecs
[    2.654545] calling  clocksource_done_booting+0x0/0x20()
[    2.659855] initcall clocksource_done_booting+0x0/0x20()<6>Time: hpet clocksource has been installed.
[    2.662185]  returned 0 after 0 msecs
[    2.688448] calling  init_pipe_fs+0x0/0x60()
[    2.695423] initcall init_pipe_fs+0x0/0x60() returned 0 after 0 msecs
[    2.705784] calling  init_mnt_writers+0x0/0x70()
[    2.711681] initcall init_mnt_writers+0x0/0x70() returned 0 after 0 msecs
[    2.721678] calling  eventpoll_init+0x0/0x90()
[    2.731644] initcall eventpoll_init+0x0/0x90() returned 0 after 0 msecs
[    2.738295] calling  anon_inode_init+0x0/0x130()
[    2.751614] initcall anon_inode_init+0x0/0x130() returned 0 after 0 msecs
[    2.771585] calling  pcie_aspm_init+0x0/0x30()
[    2.779297] initcall pcie_aspm_init+0x0/0x30() returned 0 after 2 msecs
[    2.793911] calling  acpi_event_init+0x0/0x52()

-> This time the system appeared to hang already at this point, but just
pressing the 'Alt' key let it continue until the network hang.
(I tried this a second time; again it paused here until I pressed a key.)

[   94.857929] initcall acpi_event_init+0x0/0x52() returned 0 after 29276 msecs
[   94.865002] calling  pnp_system_init+0x0/0x20()
[   94.877935] system 00:06: ioport range 0x4d0-0x4d1 has been reserved
[   94.884286] system 00:06: ioport range 0x7b0-0x7df has been reserved
[   94.897886] system 00:06: ...
From: Torsten Kaiser
Date: Thursday, March 13, 2008 - 3:05 pm

On Wed, Mar 12, 2008 at 9:01 PM, Torsten Kaiser

CONFIG_PCIEASPM does not change anything.
Also testing the range of ipc patches you suggested to Badari did not fix it.

I did a bisect. These patches remain as candidates, but I do not
have time for more bisect steps until tomorrow:

git-scsi-misc
git-sh
execute-tasklets-in-the-same-order-they-were-queued
git-sched
sched: work around hrtick related lockup
sched: make sure jiffies is up to date before calling __update_rq_clock()
sched: fix rq->clock overflows detection with CONFIG_NO_HZ
sched: make cpu_clock() globally synchronous
sched: remove isolcpus
ftrace: make the task state char-string visible to all
sched: add latency tracer callbacks to the scheduler
latencytop: optimize LT_BACKTRACEDEPTH loops a bit
sched: cleanup old and rarely used 'debug' features.
[SCSI] zfcp: convert zfcp to use target reset and device reset handler
[SCSI] qla4xxx: Add target reset functionality
[SCSI] scsi_error: add target reset handler
[SCSI] ps3rom: Simplify fill_from_dev_buffer()
[SCSI] scsi_debug: use shost_priv macro
[SCSI] scsi_debug: remove unnecessary checking
[SCSI] scsi_debug: remove scsi_debug.h
[SCSI] scsi_debug: stop including drivers/scsi/scsi.h
[SCSI] Remove random noop unchecked_isa_dma users
[SCSI] aacraid: READ_CAPACITY_16 shouldn't trust allocation length in cdb
[SCSI] st: show options currently set in sysfs
[SCSI] st: add option to use SILI in variable block reads
[SCSI] gdth: remove command accessors
[SCSI] aic94xx: Use sas_request_addr() to provide SAS WWN if the adapter lacks one
[SCSI] libsas: Provide a transport-level facility to request SAS addrs
[SCSI] ips: sg chaining support to the path to non I/O commands
[SCSI] gdth: convert to PCI hotplug API
[SCSI] gdth: PCI probe cleanups, prep for PCI hotplug API conversion
rtc: rtc-sh: Add support for periodic IRQs.
sh: SuperH KEYSC keypad data for Solution Engine 7722
sh: SuperH KEYSC keypad data for MigoR
sh: SuperH KEYSC platform driver

Torsten
--

From: Andrew Morton
Date: Thursday, March 13, 2008 - 3:35 pm

On Thu, 13 Mar 2008 23:05:11 +0100



--

From: Badari Pulavarty
Date: Thursday, March 13, 2008 - 4:10 pm

Yes. I found the following patch to be the culprit.

sched: make sure jiffies is up to date before calling __update_rq_clock()

Torsten, looking at your output, it looks like it hung at the same
place. Backing out this patch should help. Try it out. I am sure
you also have CONFIG_DETECT_SOFTLOCKUP=y in your config ?


commit 60befbc1c0b6d141c9c26e61ddd303aedd1e7396
Author: Guillaume Chazarain <guichaz@yahoo.fr>
Date:   Mon Mar 10 08:16:41 2008 +0100

    sched: make sure jiffies is up to date before calling __update_rq_clock()

    Now that __update_rq_clock() uses jiffies to detect clock overflows,
    make sure jiffies are up to date before touch_softlockup_watchdog().

    Removed a touch_softlockup_watchdog() call becoming redundant with the
    added tick_nohz_update_jiffies().

    Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -66,6 +66,7 @@
 #include <linux/unistd.h>
 #include <linux/pagemap.h>
 #include <linux/hrtimer.h>
+#include <linux/tick.h>
 
 #include <asm/tlb.h>
 #include <asm/irq_regs.h>
@@ -913,7 +914,7 @@ void sched_clock_idle_wakeup_event(u64 delta_ns)
        rq->prev_clock_raw = now;
        rq->clock += delta_ns;
        spin_unlock(&rq->lock);
-       touch_softlockup_watchdog();
+       tick_nohz_update_jiffies();
 }
 EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event);

Thanks,
Badari

--

From: Ingo Molnar
Date: Friday, March 21, 2008 - 5:12 am

thanks Badari, i've backed out this patch.

	Ingo
--

From: Dave Young
Date: Tuesday, March 11, 2008 - 6:14 pm

Hi, I got the following lockdep warning:
(add linux-acpi to cc)

[    0.097109] ACPI: Core revision 20070126
[    0.097282] INFO: trying to register non-static key.
[    0.097355] the code is fine but needs lockdep annotation.
[    0.097428] turning off the locking correctness validator.
[    0.097503] Pid: 0, comm: swapper Not tainted 2.6.25-rc5-mm1 #3
[    0.097578]  [<c0127bf8>] ? printk+0x18/0x20
[    0.097716]  [<c014b01c>] __lock_acquire+0x40c/0x760
[    0.097822]  [<c0181ba0>] ? alloc_debug_processing+0xb0/0x140
[    0.097959]  [<c014b969>] lock_acquire+0x79/0xb0
[    0.098063]  [<c0140204>] ? down_trylock+0x14/0x40
[    0.098197]  [<c03df9e8>] _spin_lock_irqsave+0x48/0xa0
[    0.098303]  [<c0140204>] ? down_trylock+0x14/0x40
[    0.098436]  [<c0140204>] down_trylock+0x14/0x40
[    0.098540]  [<c027c7ea>] acpi_os_wait_semaphore+0x3e/0xb9
[    0.098647]  [<c029263e>] acpi_ut_acquire_mutex+0x34/0x72
[    0.098753]  [<c0289ab1>] acpi_ns_root_initialize+0x19/0x250
[    0.098859]  [<c05453e6>] acpi_initialize_subsystem+0x42/0x64
[    0.098966]  [<c0545725>] acpi_early_init+0x50/0xef
[    0.099070]  [<c052b7f6>] start_kernel+0x1e6/0x250
[    0.099175]  [<c052b1a0>] ? unknown_bootoption+0x0/0x130
[    0.099310]  [<c052b008>] __init_begin+0x8/0x10
[    0.099414]  =======================
--

From: Laurent Riffard
Date: Wednesday, March 12, 2008 - 12:21 am

The kernel won't build if CONFIG_NO_HZ=y and CONFIG_PREEMPT_RCU=y:

$ grep -e PREEMPT -e HZ .config
CONFIG_NO_HZ=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RCU=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_DEBUG_PREEMPT=y
$
$ make
...
  CC      init/main.o
In file included from include/linux/rcupdate.h:60,
                 from include/linux/rculist.h:11,
                 from include/linux/dcache.h:9,
                 from include/linux/fs.h:279,
                 from include/linux/proc_fs.h:6,
                 from init/main.c:15:
include/linux/rcupreempt.h: In function 'rcu_enter_nohz':
include/linux/rcupreempt.h:91: error: 'HZ' undeclared (first use in this function)
include/linux/rcupreempt.h:91: error: (Each undeclared identifier is reported only once
include/linux/rcupreempt.h:91: error: for each function it appears in.)
include/linux/rcupreempt.h: In function 'rcu_exit_nohz':
include/linux/rcupreempt.h:99: error: 'HZ' undeclared (first use in this function)
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2
$

At first glance, I would suspect these patches:
add-warn_on_secs-macro.patch
use-warn_on_secs-in-rcupreempth.patch

~~
laurent
--

From: Andrew Morton
Date: Wednesday, March 12, 2008 - 12:44 am

hm, it works OK for me, but I don't have your full config.

This, I guess:

--- a/include/asm-generic/bug.h~add-warn_on_secs-macro-fix-fix
+++ a/include/asm-generic/bug.h
@@ -2,7 +2,7 @@
 #define _ASM_GENERIC_BUG_H
 
 #include <linux/compiler.h>
-
+#include <linux/param.h>
 
 #ifdef CONFIG_BUG
 
_

--

From: Laurent Riffard
Date: Wednesday, March 12, 2008 - 2:32 pm

Yes it does work, thanks.

But it does hang on boot. I'm unable to get any information with 
Sysrq-keys. I'll start a bisection to narrow this problem.

I attached my .config FWIW.
~~
laurent

-------
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc5-mm1
# Wed Mar 12 08:03:47 2008
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not ...
From: Tilman Schmidt
Date: Wednesday, March 12, 2008 - 4:43 pm

Works fine here with those very same settings, except for a BUG message
and a warning I'll report separately and which don't seem to have any
serious consequences. This is a 32 bit build on a rather ordinary
Pentium D/Intel motherboard/openSUSE 10.3 workstation.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Kamalesh Babulal
Date: Wednesday, March 12, 2008 - 2:17 am

Hi Andrew,

The 2.6.25-rc5-mm1 kernel build fails with randconfig compile

  CC      arch/x86/kernel/asm-offsets.s
In file included from include/asm/irqflags.h:59,
                 from include/linux/irqflags.h:46,
                 from include/asm/system.h:11,
                 from include/asm/processor.h:21,
                 from include/asm/atomic_32.h:5,
                 from include/asm/atomic.h:2,
                 from include/linux/crypto.h:20,
                 from arch/x86/kernel/asm-offsets_32.c:7,
                 from arch/x86/kernel/asm-offsets.c:2:
include/asm/paravirt.h: In function 
From: Kamalesh Babulal
Date: Wednesday, March 12, 2008 - 5:55 am

Hi Andrew,

The 2.6.25-rc5-mm1 kernel panics during bootup on powerpc

returning from prom_init
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc00000000000d5dc
cpu 0x0: Vector: 300 (Data Access) at [c0000000007636e0]
    pc: c00000000000d5dc: .do_IRQ+0x74/0x1f4
    lr: c00000000000d5a8: .do_IRQ+0x40/0x1f4
    sp: c000000000763960
   msr: 8000000000001032
   dar: 0
 dsisr: 40000000
  current = 0xc000000000688e60
  paca    = 0xc000000000689900
    pid   = 0, comm = swapper
enter ? for help
[c000000000763a00] c000000000004c24 hardware_interrupt_entry+0x24/0x28
--- Exception: 501 (Hardware Interrupt) at c0000000006021b0 .free_bootmem_core+0x94/0xcc
[link register   ] c00000000060373c .free_bootmem_with_active_regions+0x78/0xb8
[c000000000763cf0] c000000000602610 .init_bootmem_core+0x5c/0xfc (unreliable)
[c000000000763d80] c0000000005eb68c .do_init_bootmem+0x964/0xaf0
[c000000000763e50] c0000000005e03b0 .setup_arch+0x1a4/0x218
[c000000000763ee0] c0000000005d76bc .start_kernel+0xe8/0x424
[c000000000763f90] c000000000008590 .start_here_common+0x60/0xd0

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

From: Andrew Morton
Date: Wednesday, March 12, 2008 - 10:46 am

Beats me.  Maybe we're still enabling interrupts too early.  But the new
semaphore code got fixed (didn't it?)

--

From: Matthew Wilcox
Date: Wednesday, March 12, 2008 - 10:51 am

On the 7th, according to my records.  Easy to check -- look in
kernel/semaphore.c and see whether down() is using spin_lock_irqsave
(good) or spin_lock_irq (bad).

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
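
The good/bad distinction above matters during early boot, when interrupts are still disabled: the plain _irq unlock enables interrupts unconditionally, while the irqsave/irqrestore pair puts back whatever state it found. A toy user-space model, not kernel code: the `irqs_enabled` flag and the `toy_` functions are hypothetical stand-ins that mimic only the interrupt-flag side effects of the real primitives.

```c
#include <stdbool.h>

static bool irqs_enabled;	/* models the CPU interrupt-enable flag */

/* Models spin_lock_irq()/spin_unlock_irq(): unlock always enables IRQs. */
static void toy_lock_irq(void)   { irqs_enabled = false; }
static void toy_unlock_irq(void) { irqs_enabled = true; }

/* Models spin_lock_irqsave()/spin_unlock_irqrestore(): state is restored. */
static bool toy_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

static void toy_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* IRQ state left behind by the _irq pair when entered in state `initial`. */
bool irq_state_after_irq_variant(bool initial)
{
	irqs_enabled = initial;
	toy_lock_irq();
	toy_unlock_irq();
	return irqs_enabled;
}

/* IRQ state left behind by the irqsave pair when entered in state `initial`. */
bool irq_state_after_irqsave_variant(bool initial)
{
	bool flags;

	irqs_enabled = initial;
	flags = toy_lock_irqsave();
	toy_unlock_irqrestore(flags);
	return irqs_enabled;
}
```

Entering with interrupts off, the _irq variant returns with them on, which is the "enabling interrupts too early" failure mode suspected in this subthread.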

From: Michael Ellerman
Date: Wednesday, March 12, 2008 - 3:26 pm

down() looks OK, but there's still a spin_lock_irq() in __down_common(),
although I don't know if it makes sense for us to be in __down() at that
stage.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
From: Matthew Wilcox
Date: Wednesday, March 12, 2008 - 3:33 pm

The spin_lock_irq in __down_common is correct.  We're going to schedule(),
so we spin_unlock_irq() to save us passing the flags into the helper
function.  If we had interrupts disabled on entry, there's an Aieee
for that.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Kamalesh Babulal
Date: Thursday, March 13, 2008 - 6:02 am

Hi All,

Sorry for all the noise :-(, something was wrong in the test setup on my end:
the kernel was 2.6.25-rc3-mm1, not 2.6.25-rc5-mm1. This bug is not seen in the
2.6.25-rc5-mm1 kernel.

-- 
Thanks & Regards,
Kamalesh Babulal,
--

From: Benjamin Herrenschmidt
Date: Wednesday, March 12, 2008 - 1:40 pm

Won't lockdep/irqtrace warn if that happens ? You don't yet have the
lockdep patches for ppc64 (I'm still trying to find out why they break
iSeries) but it should warn of such a spurious IRQ enable on other
archs too... At least, from a quick look at the code, it -seems- that it
does have such a test.

Cheers,
Ben.


--

From: Badari Pulavarty
Date: Wednesday, March 12, 2008 - 11:14 am

Is this only on one machine ? happens all the time ?

I ran into similar issues on rc3-mm1. rc5-mm1 seems to be working fine
for me on ppc64.

Thanks,
Badari

--

From: Badari Pulavarty
Date: Wednesday, March 12, 2008 - 11:10 am

I am having trouble booting rc5-mm1 on my x86_64. (ppc64 boots & works
fine). Seems to be a networking issue (hangs on boot). Here are the
messages on the console (not really useful to me).

On a good kernel (rc5), the next set of messages would be ..

IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 13864 for ipc namespace ffffffff806903a0
..

Sorry for not being really useful here. But I would like to
know if it's a known issue ? Or should I start bisecting ?

Thanks,
Badari

Linux version 2.6.25-rc5-mm1 (root@elm3b29) (gcc version 4.1.0 (SUSE Linux)) #1 SMP Wed Mar 12 12:27:14 PDT 2008
Command line: root=/dev/hda2 vga=0x314  crashkernel=64M@16M selinux=0   console=tty0 console=ttyS0,38400 resume=/dev/hda1 resume=/dev/hda1  splash=silent showopts
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000ca000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000dfef0000 (usable)
 BIOS-e820: 00000000dfef0000 - 00000000dfeff000 (ACPI data)
 BIOS-e820: 00000000dfeff000 - 00000000dff00000 (ACPI NVS)
 BIOS-e820: 00000000dff00000 - 00000000e0000000 (usable)
 BIOS-e820: 00000000fec00000 - 00000000fec00400 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000001e0000000 (usable)
end_pfn_map = 1966080
DMI 2.3 present.
ACPI: RSDP 000F6970, 0024 (r2 PTLTD )
ACPI: XSDT DFEFC625, 003C (r1 PTLTD      XSDT    6040000  LTP        0)
ACPI: FACP DFEFED02, 00F4 (r3 AMD    HAMMER    ...
From: Andrew Morton
Date: Wednesday, March 12, 2008 - 11:15 am

Would be good, please.

I guess here:

#
# ipc
#
ipc-use-ipc_buildid-directly-from-ipc_addid.patch
ipc-use-ipc_buildid-directly-from-ipc_addid-cleanup.patch
#
ipc-scale-msgmni-to-the-amount-of-lowmem.patch
ipc-scale-msgmni-to-the-number-of-ipc-namespaces.patch
ipc-define-the-slab_memory_callback-priority-as-a-constant.patch
ipc-recompute-msgmni-on-memory-add--remove.patch
ipc-invoke-the-ipcns-notifier-chain-as-a-work-item.patch
ipc-recompute-msgmni-on-ipc-namespace-creation-removal.patch
ipc-do-not-recompute-msgmni-anymore-if-explicitly-set-by-user.patch
ipc-re-enable-msgmni-automatic-recomputing-msgmni-if-set-to-negative.patch
#
ipc-semaphores-code-factorisation.patch
ipc-shared-memory-introduce-shmctl_down.patch
ipc-message-queues-introduce-msgctl_down.patch
ipc-semaphores-move-the-rwmutex-handling-inside-semctl_down.patch
ipc-semaphores-remove-one-unused-parameter-from-semctl_down.patch
ipc-get-rid-of-the-use-_setbuf-structure.patch
ipc-introduce-ipc_update_perm.patch
ipc-consolidate-all-xxxctl_down-functions.patch
ipc-consolidate-all-xxxctl_down-functions-fix.patch

would be the place to start looking.
--

From: Badari Pulavarty
Date: Thursday, March 13, 2008 - 10:09 am

Hi Andrew,

Finally narrowed down the problem to git-sched.patch in rc5-mm1.
I am going to find out which individual patch in that git tree caused my
amd64 boot hang.

Peter, Ingo - here are the boot messages on the console. Any ideas ?
config file attached.

Thanks,
Badari

Linux version 2.6.25-rc5 (root@elm3b29) (gcc version 4.1.0 (SUSE Linux)) #11 SMP Thu Mar 13 12:28:17 PDT 2008
Command line: root=/dev/hda2 vga=0x314  crashkernel=64M@16M selinux=0   console=tty0 console=ttyS0,38400 resume=/dev/hda1 resume=/dev/hda1  splash=silent showopts
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000ca000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000dfef0000 (usable)
 BIOS-e820: 00000000dfef0000 - 00000000dfeff000 (ACPI data)
 BIOS-e820: 00000000dfeff000 - 00000000dff00000 (ACPI NVS)
 BIOS-e820: 00000000dff00000 - 00000000e0000000 (usable)
 BIOS-e820: 00000000fec00000 - 00000000fec00400 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000001e0000000 (usable)
end_pfn_map = 1966080
DMI 2.3 present.
ACPI: RSDP 000F6970, 0024 (r2 PTLTD )
ACPI: XSDT DFEFC625, 003C (r1 PTLTD      XSDT    6040000  LTP        0)
ACPI: FACP DFEFED02, 00F4 (r3 AMD    HAMMER    6040000 PTEC    F4240)
ACPI: DSDT DFEFC661, 262D (r1 AMD-K8  AMDACPI  6040000 MSFT  100000D)
ACPI: FACS DFEFFFC0, 0040
ACPI: SRAT DFEFEDF6, 0160 (r1 AMD    HAMMER    6040000 AMD         1)
ACPI: APIC DFEFEF56, 00AA (r1 PTLTD      APIC    6040000  LTP        0)
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: PXM 2 -> APIC 2 -> Node 2
SRAT: PXM 3 -> APIC 3 -> Node 3
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 0-e0000000
SRAT: Node 0 PXM 0 0-180000000
SRAT: PXM 1 (100000000-1a0000000) overlaps with PXM 0 (0-180000000)
SRAT: SRAT not used.
Scanning NUMA ...
From: Badari Pulavarty
Date: Thursday, March 13, 2008 - 10:40 am

Further narrowed it down to the following patch in git-sched.patch.
When I back out this patch from rc5-mm1, my amd64 box boots fine.

commit 60befbc1c0b6d141c9c26e61ddd303aedd1e7396
Author: Guillaume Chazarain <guichaz@yahoo.fr>
Date:   Mon Mar 10 08:16:41 2008 +0100

    sched: make sure jiffies is up to date before calling __update_rq_clock()

    Now that __update_rq_clock() uses jiffies to detect clock overflows,
    make sure jiffies are up to date before touch_softlockup_watchdog().

    Removed a touch_softlockup_watchdog() call becoming redundant with the
    added tick_nohz_update_jiffies().

    Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -66,6 +66,7 @@
 #include <linux/unistd.h>
 #include <linux/pagemap.h>
 #include <linux/hrtimer.h>
+#include <linux/tick.h>
 
 #include <asm/tlb.h>
 #include <asm/irq_regs.h>
@@ -913,7 +914,7 @@ void sched_clock_idle_wakeup_event(u64 delta_ns)
        rq->prev_clock_raw = now;
        rq->clock += delta_ns;
        spin_unlock(&rq->lock);
-       touch_softlockup_watchdog();
+       tick_nohz_update_jiffies();
 }
 EXPORT_SYMBOL_GPL(sched_clock_idle_wakeup_event);


Thanks,
Badari

--

From: Guillaume Chazarain
Date: Thursday, March 13, 2008 - 10:55 am

I didn't know this patch could prevent booting, but anyway it should
have been removed a long time ago:
http://lkml.org/lkml/2008/1/25/408

Thanks.

-- 
Guillaume
--

From: Badari Pulavarty
Date: Thursday, March 13, 2008 - 11:20 am

I don't know what's happening either, but my debugging shows that
tick_nohz_update_jiffies() always returns due to the following
check without calling touch_softlockup_watchdog().

        if (!ts->tick_stopped)
                return;

BTW, I have CONFIG_DETECT_SOFTLOCKUP=y in my config.


Thanks,
Badari        

--
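
A minimal user-space sketch of the behaviour Badari describes above. It is a model only, not the kernel code: the struct, the counter and every `_model` name are hypothetical, and the only detail taken from the thread is the `if (!ts->tick_stopped) return;` check sitting in front of the watchdog touch. Under those assumptions it shows why swapping touch_softlockup_watchdog() for tick_nohz_update_jiffies() means the watchdog is no longer touched whenever the tick was not stopped.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU tick state discussed above. */
struct tick_sched_model {
    bool tick_stopped;      /* true only while this CPU's tick is stopped */
    unsigned long jiffies;  /* models the jiffies counter */
};

static int watchdog_touches;  /* counts calls to the modelled watchdog touch */

static void touch_softlockup_watchdog_model(void)
{
    watchdog_touches++;
}

/* Mirrors the early return quoted in the message: nothing below the
 * check runs unless the tick had been stopped. */
static void tick_nohz_update_jiffies_model(struct tick_sched_model *ts)
{
    if (!ts->tick_stopped)
        return;
    ts->jiffies++;                      /* stand-in for catching jiffies up */
    touch_softlockup_watchdog_model();
}

/* Before the patch: the watchdog is touched unconditionally. */
static void wakeup_before_patch(struct tick_sched_model *ts)
{
    (void)ts;
    touch_softlockup_watchdog_model();
}

/* After the patch: the touch only happens behind the tick_stopped check. */
static void wakeup_after_patch(struct tick_sched_model *ts)
{
    tick_nohz_update_jiffies_model(ts);
}
```

With `tick_stopped` false, `wakeup_before_patch()` bumps the counter and `wakeup_after_patch()` leaves it unchanged, which matches the debug observation in the message.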

From: Tilman Schmidt
Date: Wednesday, March 12, 2008 - 4:54 pm


This still complains during startup:

<6>[    0.063442] Checking 'hlt' instruction... OK.
<0>[    0.068233] BUG: spinlock bad magic on CPU#0, swapper/0
<0>[    0.068996]  lock: c2c19380, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
<4>[    0.069227] Pid: 0, comm: swapper Not tainted 2.6.25-rc5-mm1-testing #1
<4>[    0.069369]  [spin_bug+124/135] spin_bug+0x7c/0x87
<4>[    0.069563]  [_raw_spin_unlock+25/113] _raw_spin_unlock+0x19/0x71
<4>[    0.069752]  [_spin_unlock+29/60] _spin_unlock+0x1d/0x3c
<4>[    0.069941]  [mnt_want_write+98/136] mnt_want_write+0x62/0x88
<4>[    0.070131]  [sys_mkdirat+134/214] sys_mkdirat+0x86/0xd6
<4>[    0.070322]  [clean_path+22/74] ? clean_path+0x16/0x4a
<4>[    0.070558]  [kfree+216/236] ? kfree+0xd8/0xec
<4>[    0.070793]  [sys_mkdir+16/18] sys_mkdir+0x10/0x12
<4>[    0.070995]  [do_name+274/435] do_name+0x112/0x1b3
<4>[    0.071184]  [write_buffer+29/44] write_buffer+0x1d/0x2c
<4>[    0.071371]  [flush_window+100/179] flush_window+0x64/0xb3
<4>[    0.071558]  [unpack_to_rootfs+1580/2233] unpack_to_rootfs+0x62c/0x8b9
<4>[    0.071747]  [populate_rootfs+32/265] populate_rootfs+0x20/0x109
<4>[    0.071995]  [alternative_instructions+339/344] ? alternative_instructions+0x153/0x158
<4>[    0.072235]  [start_kernel+835/853] start_kernel+0x343/0x355
<4>[    0.072422]  [i386_start_kernel+8/10] i386_start_kernel+0x8/0xa
<4>[    0.072610]  =======================
<6>[    0.072808] Unpacking initramfs... done

System comes up fine, though. Not sure whom to CC.
Machine's a dual-core Pentium D running a 32 bit kernel.
Let me know if you want me to provide more information or test anything.

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Andrew Morton
Date: Wednesday, March 12, 2008 - 5:04 pm

On Thu, 13 Mar 2008 00:54:43 +0100


I thought we already fixed this, actually.  Maybe we just talked about it
a bit?
--

From: Dave Hansen
Date: Thursday, March 13, 2008 - 2:48 pm

I'm really confused by this one.  It looks to me like the initcalls got
all out of whack in their ordering.  There's no way in hell that the
populate_rootfs() call should be happening right next to cpu

If you can send me your vmlinux (not vmlinuz), I'll see how the
initcalls are laid out in it.  What distro and compiler are you on?

-- Dave

--

From: Dave Hansen
Date: Thursday, March 13, 2008 - 1:46 pm

Hi Tim,

Could you send me your full dmesg along with your kernel .config?  I
think this is an ordering issue in bootup, but I'd like to be sure.
Bonus points if I can also have your initrd. :)

-- Dave

--

From: Tilman Schmidt
Date: Thursday, March 13, 2008 - 5:35 pm

Dave,



Ok, you asked for it. Find the lot at:

http://gollum.phnxsoft.com/~ts/linux/

I guess you know what to expect, sizewise. :-)


openSUSE 10.3 and the toolchain it brought along, including GCC 4.2.1.

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Dave Hansen
Date: Friday, March 14, 2008 - 11:03 am

Tim, thanks for the excellent debugging info.  It's making this much
easier.  I actually booted your vmlinux in a kvm image and I got the
same error.  However, it is in a *completely* bogus place.  Certainly
before initcalls get run and before the lock you saw the BUG_ON() for
got initialized. 

I'm going to go try and find a gcc-4.2 and compile on that.

Andrew, I don't think this is an actual bug in the r/o bind mount code,
but a random, way early call to populate_rootfs(), somehow.  I'll keep
looking into it, though.

Notice in the dmesg that this all happens even before the first initcall
is made?  It isn't the SMP alternates freeing, at least.  I booted this
on SMP, too, with the same results. 

[    0.074025] Intel machine check architecture supported.
[    0.075013] Intel machine check reporting enabled on CPU#0.
[    0.076010] Compat vDSO mapped to ffffe000.
[    0.078030] Checking 'hlt' instruction... OK.
[    0.083987] SMP alternatives: switching to UP code
[    0.084023] Freeing SMP alternatives: 9k freed
[    0.086006] BUG: spinlock bad magic on CPU#0, swapper/0
[    0.087002]  lock: c1754380, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
[    0.088003] Pid: 0, comm: swapper Not tainted 2.6.25-rc5-mm1-testing #2
[    0.089001]  [<c01f728c>] spin_bug+0x7c/0x87
[    0.091002]  [<c01f72b0>] _raw_spin_unlock+0x19/0x71
[    0.093001]  [<c0301922>] _spin_unlock+0x1d/0x3c
[    0.095001]  [<c01981aa>] mnt_want_write+0x62/0x88
[    0.097000]  [<c018c382>] sys_mkdirat+0x86/0xd6
[    0.098695]  [<c04260ab>] ? clean_path+0x16/0x4a
[    0.100000]  [<c017fd6f>] ? kfree+0xd8/0xec
[    0.101999]  [<c018c3e2>] sys_mkdir+0x10/0x12
[    0.103999]  [<c0426353>] do_name+0x112/0x1b3
[    0.104999]  [<c042558b>] write_buffer+0x1d/0x2c
[    0.106999]  [<c04255fe>] flush_window+0x64/0xb3
[    0.108998]  [<c04272f5>] unpack_to_rootfs+0x62c/0x8b9
[    0.111000]  [<c0127d76>] ? printk+0x15/0x17
[    0.112671]  [<c0118982>] ? free_init_pages+0x82/0x8d
[    0.113998]  [<c04275a2>] ...
From: Dave Hansen
Date: Friday, March 14, 2008 - 1:06 pm

For those of you new to this thread, here's the initial report:

	http://marc.info/?t=120536629300001&r=1&w=2

I'm pretty sure the root cause of this bug is this commit:

	ACPI: basic initramfs DSDT override support
	71fc47a9adf8ee89e5c96a47222915c5485ac437

Which did this hunk:
        
        @@ -648,6 +654,7 @@ asmlinkage void __init start_kernel(void)
         
                check_bugs();
         
        +       populate_rootfs(); /* For DSDT override from initramfs */
                acpi_early_init(); /* before LAPIC and SMP init */
         
                /* Do the rest non-__init'ed, we're now alive */
        	rest_init();
        ...

Well, the fs initcalls aren't actually done until during rest_init(),
including initializing my mnt_writer[] spinlocks.  I guess I could
statically initialize them, but that's not the root of the problem, it's
just the canary in the coal mine.

I think the populate_rootfs() call is completely bogus and certainly
can't be done before the initcalls.  But, I don't immediately have any
better suggestions for you.  Can you delay the ACPI init until after the
fs initcalls are made?

-- Dave

--
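
A minimal user-space sketch of the failure mode Dave describes, assuming a debug lock that carries a magic value set only by its init function. Everything here is a model: the `_model` names, the report counter and the magic constant are illustrative stand-ins, not the kernel's definitions. The point is only the ordering: if the populate step runs before the lock-initialising initcall, every lock operation sees `.magic: 00000000`, as in the log above.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOCK_MAGIC_MODEL 0xdead4eadu   /* illustrative magic value */

struct debug_lock_model {
    uint32_t magic;   /* stays 0 until lock_init_model() has run */
    bool locked;
};

/* Static storage is zero-filled, so magic starts out as 00000000. */
static struct debug_lock_model mnt_writer_lock_model;
static int bad_magic_reports;

static void lock_init_model(struct debug_lock_model *l)
{
    l->magic = LOCK_MAGIC_MODEL;
    l->locked = false;
}

/* Debug check run on every lock operation. */
static void check_magic_model(const struct debug_lock_model *l)
{
    if (l->magic != LOCK_MAGIC_MODEL)
        bad_magic_reports++;   /* the kernel would print "bad magic" here */
}

static void lock_model(struct debug_lock_model *l)
{
    check_magic_model(l);
    l->locked = true;
}

static void unlock_model(struct debug_lock_model *l)
{
    check_magic_model(l);
    l->locked = false;
}

/* Models the fs initcall that sets up the per-CPU writer locks. */
static void init_mnt_writers_model(void)
{
    lock_init_model(&mnt_writer_lock_model);
}

/* Models the early populate step taking and dropping a writer lock. */
static void populate_rootfs_model(void)
{
    lock_model(&mnt_writer_lock_model);
    unlock_model(&mnt_writer_lock_model);
}
```

Calling `populate_rootfs_model()` before `init_mnt_writers_model()` raises the report counter; calling it afterwards does not.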

From: Linus Torvalds
Date: Friday, March 14, 2008 - 1:20 pm

Time to just revert that one? It caused some other issues too, iirc. 

Len?

		Linus
--

From: Eric Piel
Date: Friday, March 14, 2008 - 1:51 pm

Hi,

I have made a patch to fix problems with regards to early userspace 
calls (http://lkml.org/lkml/2008/2/23/306) but I don't think it will 
solve this bug. So far I had not heard of problems with filesystem 
initialization.

I'm not sure it would be possible to delay acpi_early_init() until after 
the fs initcalls. Maybe Len knows. How about trying the opposite: what 
is the bare minimum to initialize so that the rootfs can be populated
and read? Would it be possible to have a kind of 
early_mnt_writer_initialize() that would do that?

See you,
Eric
--

From: Dave Hansen
Date: Friday, March 14, 2008 - 2:35 pm

I *can* probably do it earlier, maybe even statically, but I think
you're missing the point a bit here.  We've just been super lucky so far
that populate_rootfs() doesn't depend on any other initcalls (or at
least BUG_ON() because of them).  There may be some more buglets hiding
around.

It'd be a shame to have to have "super_early_fs_initcall()" logic for
every part of the VFS or any other initcall for that matter that you
might need.  How do we tell all future VFS hackers that they have to do
this so that the next guy doesn't break it?  I certainly missed it. :)

We could separate out the initcalls and just have the fs ones run before
the rest do.  But, I'm not sure what interactions *THAT* might have.
There are arch-specific initcalls, and I have no idea if the fs init
code depends on *those*.  That's a lot of code to check.

The patch itself nails it when it says:

+       /*
+        * Never do this at home, only the user-space is allowed to open a file.
+        * The clean way would be to use the firmware loader. But this code must be run
+        * before there is any userspace available. So we need a static/init firmware
+        * infrastructure, which doesn't exist yet...
+        */

I think requiring FS access this early in the boot process is just
broken.  It seems like the author of the patch knew a better way and
tried to get away with a hack.  I think it backfired. :)

-- Dave

--

From: Eric Piel
Date: Friday, March 14, 2008 - 3:50 pm

Actually, each time I look at init/main.c I feel like we are super lucky 
Well, my point was that actually populate_rootfs() does _very_ little 
with regard to FS manipulation, acpi_find_dsdt_initrd() even less. The 
task of checking that everything needed is available beforehand is 
certainly not of the same magnitude as that of the Danaides, as you
seemed to imply ;-)

The fact is, this patch has been tested a lot, because it's been used by 
several distributions for a long time. I expect that the only potential 
I'm actually the author of this comment... The static/init firmware 
infrastructure that I mentioned was more just about a way to hide the fs 
access in a special part of the kernel, not avoiding it. We used to have 
a different way but it was even uglier: append the DSDT after the 
initramfs, and then access it _directly_. This implies teaching 
populate_rootfs() to not panic when seeing DSDTs and losing the benefit
of the compression.

That said, I'm really not against any completely different approach. All
that is needed is being able to read a file early at boot (the DSDT) 
without having to recompile the kernel each time the file is modified. 
For instance someone had once mentioned modifying the in-kernel DSDT by 
unlinking and relinking the bzimage. If one can show me how to do that 
I'd be happy to implement it...

Eric
--

From: Dave Hansen
Date: Friday, March 14, 2008 - 4:29 pm

The problem is defining how much "very little" is, and making sure that
all the other kernel developers agree with you on it.

Anyway, I'm sick of too much bitching and too little coding.  Andrew,
here's a patch for -mm that will at least shut up the spinlock warnings.
Al, you'll also need something similar to this for when you get Linus to
pull your git tree that has the r/o bind mount patches. 
It's a hack, but I don't know any better way to do it until the ACPI
mess gets cleaned up.

Arjan, is there a way to statically set lockdep classes for a spinlock
that I'm missing?

I'll leave it to everyone else to describe the evils of calling into
*any* fs code before the fs initcalls have been made. 

-- Dave


I'm not happy with this patch, but I don't see an easier way
to do it.  We can't statically initialize the lockdep classes
as far as I can see.

---

 linux-2.6.git-dave/fs/namespace.c        |    3 +--
 linux-2.6.git-dave/include/linux/mount.h |    1 +
 linux-2.6.git-dave/init/main.c           |    8 ++++++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff -puN fs/namei.c~robind-statically-initialize-locks fs/namei.c
diff -puN fs/namespace.c~robind-statically-initialize-locks fs/namespace.c
--- linux-2.6.git/fs/namespace.c~robind-statically-initialize-locks	2008-03-14 16:12:44.000000000 -0700
+++ linux-2.6.git-dave/fs/namespace.c	2008-03-14 16:16:43.000000000 -0700
@@ -158,7 +158,7 @@ struct mnt_writer {
 } ____cacheline_aligned_in_smp;
 static DEFINE_PER_CPU(struct mnt_writer, mnt_writers);
 
-static int __init init_mnt_writers(void)
+int __init init_mnt_writers(void)
 {
 	int cpu;
 	for_each_possible_cpu(cpu) {
@@ -169,7 +169,6 @@ static int __init init_mnt_writers(void)
 	}
 	return 0;
 }
-fs_initcall(init_mnt_writers);
 
 static void unlock_mnt_writers(void)
 {
diff -puN init/main.c~robind-statically-initialize-locks init/main.c
--- linux-2.6.git/init/main.c~robind-statically-initialize-locks	2008-03-14 16:13:02.000000000 -0700
+++ ...
From: Tilman Schmidt
Date: Saturday, March 15, 2008 - 5:47 am


Sorry to say, it doesn't. That is, it does shut up the warning I
reported, but there's a new one appearing now instead, three lines
later. Here's the dmesg diff:

@@ -216,29 +216,30 @@
 CPU0: Thermal monitoring enabled
 Compat vDSO mapped to ffffe000.
 Checking 'hlt' instruction... OK.
-BUG: spinlock bad magic on CPU#0, swapper/0
- lock: c2c19380, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
-Pid: 0, comm: swapper Not tainted 2.6.25-rc5-mm1-testing #2
- [<c01f728c>] spin_bug+0x7c/0x87
- [<c01f72b0>] _raw_spin_unlock+0x19/0x71
- [<c0301922>] _spin_unlock+0x1d/0x3c
- [<c01981aa>] mnt_want_write+0x62/0x88
- [<c018c382>] sys_mkdirat+0x86/0xd6
- [<c04260ab>] ? clean_path+0x16/0x4a
- [<c017fd6f>] ? kfree+0xd8/0xec
- [<c018c3e2>] sys_mkdir+0x10/0x12
- [<c0426353>] do_name+0x112/0x1b3
- [<c042558b>] write_buffer+0x1d/0x2c
- [<c04255fe>] flush_window+0x64/0xb3
- [<c04272f5>] unpack_to_rootfs+0x62c/0x8b9
- [<c04275a2>] populate_rootfs+0x20/0x109
- [<c0429ed2>] ? alternative_instructions+0x153/0x158
- [<c04248f5>] start_kernel+0x343/0x355
- [<c0424019>] i386_start_kernel+0x8/0xa
 =======================
 Unpacking initramfs... done
-Freeing initrd memory: 8767k freed
+Freeing initrd memory: 8834k freed
 ACPI: Core revision 20070126
+INFO: trying to register non-static key.
+the code is fine but needs lockdep annotation.
+turning off the locking correctness validator.
+Pid: 0, comm: swapper Not tainted 2.6.25-rc5-mm1-testing #3
+ [<c014321e>] __lock_acquire+0x144/0xb6e
+ [<c010b1a2>] ? native_sched_clock+0xe0/0xff
+ [<c017fc57>] ? kmem_cache_alloc+0x89/0xc9
+ [<c0142ce0>] ? trace_hardirqs_on+0xe8/0x11d
+ [<c014404f>] lock_acquire+0x6a/0x90
+ [<c013b460>] ? down_trylock+0xc/0x27
+ [<c03016cb>] _spin_lock_irqsave+0x42/0x72
+ [<c013b460>] ? down_trylock+0xc/0x27
+ [<c013b460>] down_trylock+0xc/0x27
+ [<c021fa65>] acpi_os_wait_semaphore+0x67/0x13d
+ [<c023a39e>] acpi_ut_acquire_mutex+0x65/0xcf
+ [<c0230261>] ...
From: Linus Torvalds
Date: Saturday, March 15, 2008 - 12:21 pm

I've reverted the whole thing. Or rather, since there were various small 
fixup commits over time, and a simple revert doesn't really work, I ended 
up just removing the option and the code that was conditional on it - that 
way, if we really want to fight this out some time (after 2.6.25 is out) 
or some vendor wants to use a known-broken option anyway, there's a simple 
and fairly clean commit to revert the revert.

It's commit 9a9e0d685553af76cb6ae2af93cca4913e7fcd47, see 

	http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=9a9e0d...

for details if you aren't a git person.

But quite frankly I don't think that we even want to re-introduce this in 
that form. If we really want to have a dynamic custom DSDT, I think we 
should do the whole DSDT replacement *much* later by ACPI (like just 
before driver loading or something like that).

If the BIOS-provided DSDT is _so_ broken that we cannot even get core 
stuff like the CPU's going, I think it has more serious issues than any 
custom DSDT will ever fix, but letting ACPI actually switch DSDT's at 
run-time (instead of just replacing it when looking for it very very early 
in the boot sequence) in order to work around some device issues sounds 
reasonably sane.

So how about aiming to make that DSDT-replacement something you can do 
from any kernel module, _after_ the original DSDT has already been parsed? 
And then the whole "load it from initrd" turns into a regular thing that 
we can do pretty early, but that we don't have to do quite _this_ early!

		Linus
--

From: Éric Piel
Date: Saturday, March 15, 2008 - 12:42 pm

15/03/08 20:21, Linus Torvalds wrote/a 
From: Linus Torvalds
Date: Saturday, March 15, 2008 - 1:19 pm

So that avoids the VFS layer issues, but it's still strictly much worse 
than just having a run-time loading.

What's the problem with just loading a new DSDT later? Potentially as in 
*much* later: including when user-space is all up-and-running? 

For things like DVD install images, you'd quite possibly want to have a 
few known-workaround DSDT images with the installer, and just say "ok, we 
want to fix up this ACPI crap in order to get working suspend/resume" kind 
of thing.

So what's the reason for pushing for this insanely-early workaround in the 
first place, instead of letting user-space do something like

	cat my-dsdt-image > /proc/sys/acpi/DSDT

or whatever at runtime?

		Linus
--

From: Éric Piel
Date: Saturday, March 15, 2008 - 5:15 pm

Yeah, or probably more something like this nowadays ;-)
	cat my-dsdt-image > /sys/firmware/acpi/tables/DSDT

As I said in my previous email, I'm already convinced that late-override
of ACPI table approach would be very interesting to investigate.
However, this cannot be taken lightly. A _lot_ of places in the kernel
depend on the ACPI and nothing has ever been done in the direction of
dynamic modification of the ACPI tables. The implementation is likely to
be much bigger than the current 100 lines of patch.

That said, it should be possible to make some assumptions without
restricting the functionality much. Such as:
 * every object present in the original table is still present in the
new table
 * they keep the same name

Len, do you think it would be feasible? How do you think the
implementation could be done?

Eric

--
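
A small sketch of the two assumptions listed above, under the simplification that a table can be reduced to a flat list of object names. That reduction, the function names and the sample names in the usage note are all hypothetical; a real check would have to walk the ACPI namespace. The function accepts a replacement only if every original name is still present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if name occurs in names[0..count). */
static bool name_present(const char *name,
                         const char *const *names, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(name, names[i]) == 0)
            return true;
    return false;
}

/* Returns true if every object of the original table still exists,
 * under the same name, in the replacement table. Extra objects in
 * the replacement are allowed. */
static bool override_keeps_all_objects(const char *const *orig, size_t n_orig,
                                       const char *const *repl, size_t n_repl)
{
    for (size_t i = 0; i < n_orig; i++)
        if (!name_present(orig[i], repl, n_repl))
            return false;
    return true;
}
```

For example, an original list of `_SB_.PCI0` and `_SB_.LID0` is compatible with a replacement that adds `_SB_.BAT0`, but not with one that drops `_SB_.LID0`.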

From: Len Brown
Date: Monday, March 17, 2008 - 10:27 am

I agree with Linus' decision to revert/disable this feature.
I think it is appropriate to muck with this in -mm, but not in -rc6

I don't think re-loading the DSDT at run-time would be practical.

First, booting with the OEM DSDT may nullify the benefit
of overriding the OEM DSDT -- the damage may have already been done.

Secondly, unwinding everything that depends on the DSDT is on the
order of kexec or suspend/resume.  We're talking about all the stuff
that PNP does at boot time, plus device discovery and driver binding.

The feature on the table here is an initrd DSDT override.
We already have the ability to statically compile a DSDT
override into the kernel image.  That capability is sufficient
for kernel developers.

The initrd version of the DSDT override is really for one scenario.
Somebody who has a BIOS that even Windows can't deal with -- so
no amount of "Windows bug compatibility" will help Linux with it.
They must be capable enough to generate or acquire a modified DSDT.
They must be unwilling/unable to re-build their kernel from scratch
each time they update it.  Eg. following debian unstable updates etc.

I think that customer deserves support, particularly because they get
bragging rights that Linux works better on a box built for Windows
than Windows does:-)
However, I don't think there are enough customers like this to
justify a huge effort that would add risk to Linux.

-Len
--

From: Len Brown
Date: Monday, March 17, 2008 - 10:59 am

For a Linux distro to ship DSDT override images, they'd have to
have some licensing & support arrangement with the OEM
who actually owns that BIOS code.

While this wouldn't defy any laws of physics, it doesn't
look compatible with current industry business practices.

OEMs are more likely to simply ship a BIOS update ISO.

-Len

--

From: Pavel Machek
Date: Friday, March 21, 2008 - 6:17 am

You have interpreted code running (AML), and you want to replace it
with different code?

Akin to changing from one kernel to a different one during runtime?

Yes, I guess it might work for very simple changes, but if you need to change
data structures between original and modified DSDT, you are in for
big trouble, right?
							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Dave Hansen
Date: Sunday, March 23, 2008 - 9:00 am

Heh.  That gave me an idea.

Can we use kexec for this?  Let's say you get as far in boot as the
initrd and realize that you're running on one of these screwed up
systems.  Can you stick the new DSDT somewhere known (and safe) in
memory, and kexec yourself back to the beginning of the kernel boot?

When you boot up the second time, you have the new, shiny DSDT there
which is, of course, used instead of the bogus BIOS one.

It costs you some bootup time, but we're talking about working around
really busted hardware here.  

-- Dave

--

From: Pavel Machek
Date: Monday, March 24, 2008 - 9:03 am

Hmmm. I guess we should turn off acpi mode, kexec, turn on acpi mode
with new dsdt.

Turning off acpi is not exactly easy, but specs describe how to do
it...

So yes, this is hard but doable.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Eric Piel
Date: Monday, March 24, 2008 - 10:05 am

Why do you think it's necessary to turn off acpi mode? What will not 
work if we keep it on all the time?

BTW, let me summarize my understanding of the kexec approach:
* userspace writes the new DSDT (cat my-dsdt-image >
/sys/firmware/acpi/tables/DSDT)
* the kernel doesn't use this DSDT directly but keeps it somewhere warm
and fuzzy in the RAM
* userspace does a kexec
* the new kernel boots and at some (early) point, dsdt_override() is 
called. It detects that the special place in the RAM for a new DSDT is 
used. It provides this pointer to ACPI as the new place to read the DSDT.

Dave, am I correctly understanding the scenario you had in mind?

I have practically no knowledge of kexec. Is there a documented way to
pass a big chunk of data from one kernel to another one? How can I do that?

Eric
--
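
A user-space sketch of the last step in the summary above: deciding whether a staged DSDT is usable before handing its address to ACPI. The handoff region and the names `staged_dsdt_is_valid()` and `dsdt_override_model()` are hypothetical, since the thread does not define how the buffer is passed across kexec. The checks themselves rely on the standard ACPI table header layout: a 4-byte signature at offset 0, a little-endian 32-bit length at offset 4, a checksum byte at offset 9, a 36-byte header, and all bytes of the table summing to zero modulo 256.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HEADER_SIZE     36u  /* size of the common table header */
#define ACPI_LENGTH_OFFSET    4u  /* little-endian 32-bit table length */
#define ACPI_CHECKSUM_OFFSET  9u  /* byte that makes the table sum to 0 */

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Sum of len bytes, modulo 256. */
static uint8_t byte_sum(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + buf[i]);
    return sum;
}

/* Returns 1 if the region holds a plausible DSDT: right signature,
 * sane length, and a whole-table byte sum of zero. */
static int staged_dsdt_is_valid(const uint8_t *region, size_t region_size)
{
    uint32_t length;

    if (region_size < ACPI_HEADER_SIZE)
        return 0;
    if (memcmp(region, "DSDT", 4) != 0)
        return 0;
    length = read_le32(region + ACPI_LENGTH_OFFSET);
    if (length < ACPI_HEADER_SIZE || length > region_size)
        return 0;
    return byte_sum(region, length) == 0;
}

/* Hypothetical early hook: prefer the staged table, else keep the
 * firmware-provided one. */
static const uint8_t *dsdt_override_model(const uint8_t *firmware_dsdt,
                                          const uint8_t *region,
                                          size_t region_size)
{
    if (region != NULL && staged_dsdt_is_valid(region, region_size))
        return region;
    return firmware_dsdt;
}
```

Falling back to the firmware table whenever validation fails keeps a stale or corrupted region from being fed to the ACPI interpreter.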

From: Pavel Machek
Date: Monday, March 24, 2008 - 10:19 am

Yes, and now the ACPI layer tries to enable already enabled ACPI... which
is a no-no according to the spec, but you may be able to get away with it.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
pomozte zachranit klanovicky les:  http://www.ujezdskystrom.info/
--

From: Dave Hansen
Date: Monday, March 24, 2008 - 10:23 am

Yeah, that's basically what I was thinking.

But, this is only for a case where we can't do the real runtime
replacement that Linus has been advocating.  That approach is clearly
superior, but I would imagine that it'll require some serious ACPI

Heh.  Documented, no.  What OS do you think this is? ;)  I'm not sure it
has ever been really needed before.  

At one point, kexec just made a copy of the e820 table to tell the new
kernel where its RAM was.  If you carved out a chunk of memory and set
it as reserved, the new kernel could go looking there.

kexec is Eric Biederman's (on cc) baby, and he might have some more
concrete suggestions for you.

-- Dave

--

From: Helge Hafting
Date: Thursday, March 27, 2008 - 2:23 am

I see a problem here. 
This could work. And if it is successful, the "kexec reboot around
busted hw" trick will be used for other stuff as well.

So your broken machine reboots with some fix, then it reboots with the
custom DSDT. Is the previous fix preserved? Then a third problem is hit,
another kexec reboot. Is the first fix _and_ the custom DSDT
preserved on this reboot?  Or do we get an infinite sequence of reboots,
alternating between a couple of completely unrelated fixes for bad 
hw/bios...

Once there is more than one fix utilizing this trick, some "protocol" for
managing a string of kexec fixes might become necessary.

Helge Hafting

--
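
One way such a "protocol" could look, sketched under the assumption that a small word of state survives each kexec. The flag names, the struct and `plan_next_boot()` are hypothetical and nothing like this is proposed in the thread. The idea is that the set of applied fixes only ever grows, so the number of extra reboots is bounded by the number of distinct fixes and earlier fixes are never dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fix identifiers, one bit each. */
#define FIX_CUSTOM_DSDT ((uint32_t)1 << 0)
#define FIX_OTHER_QUIRK ((uint32_t)1 << 1)

/* State assumed to be handed from one kernel to the next. */
struct kexec_fix_state {
    uint32_t applied;   /* fixes already in effect for this boot */
};

/* needed: fixes this boot has discovered it requires.
 * Returns true if another kexec is required, and records the full
 * set of fixes the next kernel must carry. */
static bool plan_next_boot(struct kexec_fix_state *state, uint32_t needed)
{
    uint32_t missing = needed & ~state->applied;

    if (missing == 0)
        return false;           /* everything needed is already applied */
    state->applied |= missing;  /* earlier fixes are kept, new ones added */
    return true;
}
```

In Helge's scenario, the first boot requests one fix, the second boot requests the custom DSDT as well, and the third boot finds both already applied and continues without rebooting again.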

From: Len Brown
Date: Monday, March 17, 2008 - 11:05 am

I recommend that you make a new proposal for 2.6.26
that applies on top of Linus' top-of-tree and that we
include lkml in hashing it out rather than just linux-acpi.

thanks,
-Len
--

From: Dave Hansen
Date: Sunday, March 16, 2008 - 1:11 pm

Hi Tim,

Again, thanks for the excellent bug reporting. 

This is actually a different problem (and not my code again, thank
goodness).  I think a few of these got fixed in current -mm.  According

So, this looks like an on-stack ACPI structure that got initialized
wrongly.  At least we already have those dudes on the cc. :)

But, this might also get fixed by reverting the patch as Linus just did.
It might just be best to wait for another -mm release and see how it
settles out.  

-- Dave

--

From: Peter Zijlstra
Date: Monday, March 17, 2008 - 5:23 am

Actually looks like the semaphore thing again, it's a spinlock inside of

Looks like another of the semaphore thingies.. Does this go away once
you apply the semaphore lockdep fixup from here:

  http://lkml.org/lkml/2008/3/12/63

--

From: Tilman Schmidt
Date: Wednesday, March 19, 2008 - 4:50 pm

Yes, it does. With that patch on top of Dave's, I see no stack
backtraces in dmesg anymore.

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Len Brown
Date: Monday, March 17, 2008 - 10:48 am

DSDT's are generally 4KB to 64KB, so I don't think compression
for a DSDT override is important.

-Len
--

From: Tilman Schmidt
Date: Wednesday, March 12, 2008 - 5:15 pm


Late during boot, this issues the following warning on my Pentium D,
apparently when trying to load an appropriate CPU frequency driver:

[   56.759128] ------------[ cut here ]------------
[   56.765058] WARNING: at drivers/base/sys.c:173 sysdev_driver_register+0x34/0xce()
[   56.776027] Modules linked in: acpi_cpufreq(+) speedstep_lib ip6table_filter ip6_tables x_tables ipv6 microcode firmware_class loop osst st sr_mod cdrom pata_acpi bas_gigaset snd_hda_intel gigaset isdn snd_pcm ata_generic<6>ip_tables: (C) 2000-2006 Netfilter Core Team
[   56.785293]  snd_timer aic7xxx slhc snd ohci1394 rtc_cmos ieee1394 shpchp crc_ccitt iTCO_wdt e1000e rtc_core iTCO_vendor_support soundcore scsi_transport_spi watchdog_core pci_hotplug intel_agp button thermal agpgart rtc_lib processor i2c_i801 watchdog_dev parport_pc i2c_core snd_page_alloc parport pata_marvell sg ext3 jbd mbcache linear sd_mod usbhid hid ff_memless ahci libata scsi_mod ehci_hcd uhci_hcd usbcore dm_snapshot dm_mod
[   56.805358] Pid: 2856, comm: modprobe Not tainted 2.6.25-rc5-mm1-testing #1
[   56.810766]  [<c01272e9>] warn_on_slowpath+0x41/0x6d
[   56.820628]  [<c0230065>] ? acpi_ns_lookup+0x2b5/0x497
[   56.830455]  [<c0230e25>] ? acpi_evaluate_object+0x23e/0x249
[   56.840414]  [<c02ff809>] ? mutex_unlock+0x8/0xa
[   56.848380]  [<fa9fec1d>] ? acpi_processor_preregister_performance+0x4e6/0x4f1 [processor]
[   56.858297]  [<c0286438>] ? cpufreq_register_driver+0x42/0xfc
[   56.868263]  [<c026423d>] sysdev_driver_register+0x34/0xce
[   56.877974]  [<c0286476>] cpufreq_register_driver+0x80/0xfc
[   56.887327]  [<facde034>] acpi_cpufreq_init+0x34/0x3a [acpi_cpufreq]
[   56.897290]  [<c014ad7a>] sys_init_module+0x1816/0x1943
[   56.907304]  [<facb5000>] ? icmp_checkentry+0x0/0x14 [ip_tables]
[   56.917255]  [<c0183cd2>] ? sys_read+0x3b/0x60
[   56.925094]  [<c0106aec>] sysenter_past_esp+0x6d/0xc5
[   56.935071]  ...
From: Greg KH
Date: Thursday, March 13, 2008 - 11:34 am

This implies that a cpufreq module is getting registered twice in the
sysdev code :(

thanks,

greg k-h
--
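
A user-space sketch of how a registration routine can notice the double registration Greg mentions. The thread does not show the actual check at drivers/base/sys.c:173, so this is an assumption about the general technique rather than that code: a driver whose list entry is already linked is treated as registered. All `_model` names are hypothetical.

```c
#include <stddef.h>

/* Minimal doubly linked list node. */
struct list_node_model {
    struct list_node_model *next, *prev;
};

struct sysdev_driver_model {
    struct list_node_model entry;   /* NULL links mean "not registered" */
};

/* Empty circular list: the head points at itself. */
static struct list_node_model driver_list = { &driver_list, &driver_list };

/* Returns 0 on success, -1 if the driver is already on a list. */
static int register_driver_model(struct sysdev_driver_model *drv)
{
    if (drv->entry.next != NULL || drv->entry.prev != NULL)
        return -1;   /* second registration: warn instead of corrupting the list */

    /* Insert right after the list head. */
    drv->entry.next = driver_list.next;
    drv->entry.prev = &driver_list;
    driver_list.next->prev = &drv->entry;
    driver_list.next = &drv->entry;
    return 0;
}
```

Registering the same zero-initialised driver twice returns 0 the first time and -1 the second, leaving the list intact.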

From: Dave Jones
Date: Thursday, March 13, 2008 - 12:57 pm

On Thu, Mar 13, 2008 at 11:34:39AM -0700, Greg KH wrote:
 > On Thu, Mar 13, 2008 at 01:15:52AM +0100, Tilman Schmidt wrote:
 > > Am 11.03.2008 09:14 schrieb Andrew Morton:
 > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc5/2.6.25-rc5-mm1/
 > > 
 > > Late during boot, this issues the following warning on my Pentium D,
 > > apparently when trying to load an appropriate CPU frequency driver:
 > > 
 > > [   56.759128] ------------[ cut here ]------------
 > > [   56.765058] WARNING: at drivers/base/sys.c:173 sysdev_driver_register+0x34/0xce()
 > > [   56.776027] Modules linked in: acpi_cpufreq(+) speedstep_lib ip6table_filter ip6_tables x_tables ipv6 microcode firmware_class loop osst st sr_mod cdrom pata_acpi bas_gigaset snd_hda_intel gigaset isdn snd_pcm ata_generic<6>ip_tables: (C) 2000-2006 Netfilter Core Team
 > > [   56.785293]  snd_timer aic7xxx slhc snd ohci1394 rtc_cmos ieee1394 shpchp crc_ccitt iTCO_wdt e1000e rtc_core iTCO_vendor_support soundcore scsi_transport_spi watchdog_core pci_hotplug intel_agp button thermal agpgart rtc_lib processor i2c_i801 watchdog_dev parport_pc i2c_core snd_page_alloc parport pata_marvell sg ext3 jbd mbcache linear sd_mod usbhid hid ff_memless ahci libata scsi_mod ehci_hcd uhci_hcd usbcore dm_snapshot dm_mod
 > > [   56.805358] Pid: 2856, comm: modprobe Not tainted 2.6.25-rc5-mm1-testing #1
 > > [   56.810766]  [<c01272e9>] warn_on_slowpath+0x41/0x6d
 > > [   56.820628]  [<c0230065>] ? acpi_ns_lookup+0x2b5/0x497
 > > [   56.830455]  [<c0230e25>] ? acpi_evaluate_object+0x23e/0x249
 > > [   56.840414]  [<c02ff809>] ? mutex_unlock+0x8/0xa
 > > [   56.848380]  [<fa9fec1d>] ? acpi_processor_preregister_performance+0x4e6/0x4f1 [processor]
 > > [   56.858297]  [<c0286438>] ? cpufreq_register_driver+0x42/0xfc
 > > [   56.868263]  [<c026423d>] sysdev_driver_register+0x34/0xce
 > > [   56.877974]  [<c0286476>] cpufreq_register_driver+0x80/0xfc
 > > [   56.887327]  [<facde034>] acpi_cpufreq_init+0x34/0x3a [acpi_cpufreq]
 > > [   ...
From: Dave Jones
Date: Thursday, March 13, 2008 - 12:56 pm

On Thu, Mar 13, 2008 at 01:15:52AM +0100, Tilman Schmidt wrote:
 > Am 11.03.2008 09:14 schrieb Andrew Morton:
 > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc5/2.6.25-rc5-mm1/
 > 
 > Late during boot, this issues the following warning on my Pentium D,
 > apparently when trying to load an appropriate CPU frequency driver:
 > 
 > [   56.759128] ------------[ cut here ]------------
 > [   56.765058] WARNING: at drivers/base/sys.c:173 sysdev_driver_register+0x34/0xce()
 > [   56.776027] Modules linked in: acpi_cpufreq(+) speedstep_lib ip6table_filter ip6_tables x_tables ipv6 microcode firmware_class loop osst st sr_mod cdrom pata_acpi bas_gigaset snd_hda_intel gigaset isdn snd_pcm ata_generic<6>ip_tables: (C) 2000-2006 Netfilter Core Team
 > [   56.785293]  snd_timer aic7xxx slhc snd ohci1394 rtc_cmos ieee1394 shpchp crc_ccitt iTCO_wdt e1000e rtc_core iTCO_vendor_support soundcore scsi_transport_spi watchdog_core pci_hotplug intel_agp button thermal agpgart rtc_lib processor i2c_i801 watchdog_dev parport_pc i2c_core snd_page_alloc parport pata_marvell sg ext3 jbd mbcache linear sd_mod usbhid hid ff_memless ahci libata scsi_mod ehci_hcd uhci_hcd usbcore dm_snapshot dm_mod
 > [   56.805358] Pid: 2856, comm: modprobe Not tainted 2.6.25-rc5-mm1-testing #1
 > [   56.810766]  [<c01272e9>] warn_on_slowpath+0x41/0x6d
 > [   56.820628]  [<c0230065>] ? acpi_ns_lookup+0x2b5/0x497
 > [   56.830455]  [<c0230e25>] ? acpi_evaluate_object+0x23e/0x249
 > [   56.840414]  [<c02ff809>] ? mutex_unlock+0x8/0xa
 > [   56.848380]  [<fa9fec1d>] ? acpi_processor_preregister_performance+0x4e6/0x4f1 [processor]
 > [   56.858297]  [<c0286438>] ? cpufreq_register_driver+0x42/0xfc
 > [   56.868263]  [<c026423d>] sysdev_driver_register+0x34/0xce
 > [   56.877974]  [<c0286476>] cpufreq_register_driver+0x80/0xfc
 > [   56.887327]  [<facde034>] acpi_cpufreq_init+0x34/0x3a [acpi_cpufreq]
 > [   56.897290]  [<c014ad7a>] sys_init_module+0x1816/0x1943
 > [   56.907304]  [<facb5000>] ? ...
From: Greg KH
Date: Thursday, March 13, 2008 - 1:27 pm

Sure, that would be simple to do.  Will change it now, and should show
up in the next -mm.

thanks,

greg k-h
--

From: Tilman Schmidt
Date: Thursday, March 13, 2008 - 5:01 pm

 > Full dmesg please, with CPU_FREQ_DEBUG=y, and boot with cpufreq.debug=7

You can find it at
http://gollum.phnxsoft.com/~ts/linux/dmesg.out
and the corresponding .config right beside it at
http://gollum.phnxsoft.com/~ts/linux/config-2.6.25-rc5-mm1

CCing linux-acpi as you did in your other mail.

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Dave Jones
Date: Thursday, March 13, 2008 - 5:44 pm

On Fri, Mar 14, 2008 at 01:01:18AM +0100, Tilman Schmidt wrote:

 > > Full dmesg please, with CPU_FREQ_DEBUG=y, and boot with cpufreq.debug=7
 > 
 > You can find it at
 > http://gollum.phnxsoft.com/~ts/linux/dmesg.out
 > and the corresponding .config right beside it at
 > http://gollum.phnxsoft.com/~ts/linux/config-2.6.25-rc5-mm1
 > 
 > CCing linux-acpi as you did in your other mail.

The interesting bits..

[   46.075145] cpufreq-core: trying to register driver centrino

here we've done sysdev_driver_register(&cpu_sysdev_class,&cpufreq_sysdev_driver);
(see cpufreq_register_driver in drivers/cpufreq/cpufreq.c)
This is the only place we register sysdev entries.

[   46.075155] cpufreq-core: adding CPU 0
[   46.075163] speedstep-centrino: found unsupported CPU with Enhanced SpeedStep: send /proc/cpuinfo to cpufreq@lists.linux.org.uk
[   46.075167] cpufreq-core: initialization failed

this ENODEVs

[   46.075173] cpufreq-core: adding CPU 1
[   46.075176] cpufreq-core: initialization failed

Same for the 2nd CPU.

[   46.075180] cpufreq-core: no CPU initialized for driver centrino

here we hit this part of cpufreq_register_driver

                /* if all ->init() calls failed, unregister */
                if (ret) {
                        dprintk("no CPU initialized for driver %s\n",
                                                        driver_data->name);
                        sysdev_driver_unregister(&cpu_sysdev_class,
                                                &cpufreq_sysdev_driver);


So we release all the refs.


[   46.075185] cpufreq-core: unregistering CPU 0
[   46.075190] cpufreq-core: unregistering CPU 1

These are the sysdev callbacks.

[   46.429147] powernow: This module only works with AMD K7 CPUs
[   47.081642] speedstep-lib: x86: f, model: 6
[   47.081649] speedstep-ich: Intel(R) SpeedStep(TM) capable processor not found

These drivers don't even get as far as calling cpufreq_register_driver,
they ENODEV way before things get ...
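
For illustration, the register / init-fails / unregister sequence traced above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: every `toy_` name is hypothetical and none of this is the kernel's code. The "already registered" check stands in for the condition that Greg KH reads as a double registration, and it returns -EBUSY where the real code prints a WARNING.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a sysdev driver; not the kernel structure. */
struct toy_sysdev_driver {
	bool registered;
};

/* Hypothetical per-CPU init hook: 0 on success, negative errno on failure. */
typedef int (*toy_cpu_init_fn)(int cpu);

/* Refuse a second registration (the real code warns instead). */
int toy_sysdev_register(struct toy_sysdev_driver *drv)
{
	if (drv->registered)
		return -EBUSY;
	drv->registered = true;
	return 0;
}

void toy_sysdev_unregister(struct toy_sysdev_driver *drv)
{
	drv->registered = false;
}

/* Register first, try every CPU, and undo the registration if all fail. */
int toy_cpufreq_register_driver(struct toy_sysdev_driver *drv,
				toy_cpu_init_fn init, int ncpus)
{
	bool any_ok = false;
	int cpu, ret;

	ret = toy_sysdev_register(drv);
	if (ret)
		return ret;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (init(cpu) == 0)
			any_ok = true;

	/* Mirrors "if all ->init() calls failed, unregister". */
	if (!any_ok) {
		toy_sysdev_unregister(drv);
		return -ENODEV;
	}
	return 0;
}

/* Example hooks: one that behaves like an unsupported CPU, one that succeeds. */
int toy_init_unsupported(int cpu) { (void)cpu; return -ENODEV; }
int toy_init_ok(int cpu) { (void)cpu; return 0; }
```

In this model a failed driver leaves nothing registered, so a later driver registers cleanly. A second registration can only fail if some earlier path skipped the unregister step.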
From: Zhao Yakui
Date: Thursday, March 13, 2008 - 5:57 pm

Please set CONFIG_ACPI_DEBUG and boot the system with the option of
"acpi.debug_layer=0x01010000 acpi.debug_level=0x1f".

It will be great if the acpidump output is attached.


--

From: Tilman Schmidt
Date: Friday, March 14, 2008 - 2:58 am

CONFIG_ACPI_DEBUG is already set, but I cannot reboot the machine

Available now at
http://gollum.phnxsoft.com/~ts/linux/acpidump.out

HTH
T.

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Tilman Schmidt
Date: Saturday, March 15, 2008 - 5:16 am

Ok, that took a bit longer than I hoped, but the result is now
finally available at:

http://gollum.phnxsoft.com/~ts/linux/dmesg-acpidebug.out

Note that I doctored this a bit: the dmesg buffer had already
overflowed by the time I ran the dmesg command, so I manually
prepended the missing part from the file /var/log/boot.msg into
which SUSE saves the early kernel messages. The border between
the two is marked off by the string "~~~~~~~~splice~~~~~~~~",
and I left a line of overlap to make it very clear.

The output of acpidump is unchanged wrt what I already posted.
(Unsurprisingly, but nevertheless I checked. Call me paranoid. ;-)

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Helge Hafting
Date: Thursday, March 13, 2008 - 7:03 am

This kernel hangs during shutdown, and that prevents automatic poweroff.
I have one small patch that improves the iwl3945 wireless driver a little.
Dell D830 laptop, 64-bit smp


It looks like this:
*** Last service has quit. ***
Your system will now POWER OFF!
Goodbye
Bug: unable to handle kernel paging request at ffffffff8020a7ad
IP: [<ffffffff80211b5a>] text_poke+0xe/0x15
PGD 203067 PUD 2+7+63 PMD 7f3ba163 PTE 20a161
Oops: 0003 [1] SMP
last sysfs file: /sys/devices/LNXSYSTM:00/device:00/ACPI0003:00/power_supply/AC/online
CPU 0
Modules linked in: tun pcmcia dock piix iTCO_wdt ata_piix watchdog_core watchdog_dev intel_agp ata_generic hci_usb

Pid: 7606, comm: initng Not tainted 2.6.25-rc5-mm1

RIP: 0010:[<ffffffff80211b5a>] [<ffffffff80211b5a>]text_poke+0xe/0x15
RSP: 0000:ffff81007e559cb8 EFLAGS: 00010083

(register dump omitted, but I can reproduce anytime if it matters)
Process initng (pid 7606, threadinfo...)
call trace:
alternatives_smp_unlock
alternatives_smp_switch
? schedule_timeout
__cpu_die
_cpu_down
disable_nonboot_cpus
kernel_power_off
sys_reboot
? handle_mm_fault
? __up_read
? do_page_fault
? __put_user
? error_exit
system_call_after_swapgs

(rest omitted)

sysrq still works at this point
sysrq+P gives:

CPU 0:
Modules linked in (same as before)
Pid: 0, comm: swapper Tainted: G D 2.6.25-rc5-mm1
RIP ... acpi_idle_enter
(register dump omitted)
acpi_idle_enter_bm
menu_select
cpuidle_idle_call
cpuidle_idle_call
default_idle
cpu_idle
rest_init


sysrq+O fails to deactivate a mouse, complains that the disk
may not be spun down properly, prepares for sleep state S5,
but doesn't power off. sysrq doesn't work after this.

Helge Hafting

--

From: Andrew Morton
Date: Thursday, March 13, 2008 - 9:12 am

Yes, I was hitting the text_poke() oops with 2.6.25-rc3-mm1 but not with
2.6.25-rc5-mm1.

This _might_ have been due to a snafu in git-x86: it had a [patch 2/2] from
Mathieu but was missing the needed [patch 1/2].  But I don't know if this
was the cause and I don't know whether 2.6.25-rc3-mm1's git-x86 had the
same problem.

--

From: Helge Hafting
Date: Tuesday, March 25, 2008 - 5:23 am

Andrew Morton wrote:
The problem seems to be solved in 2.6.25-rc6.

Helge Hafting
--

From: Tilman Schmidt
Date: Thursday, March 13, 2008 - 12:48 pm

 > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc5/2.6.25-rc5-mm1/

I'm noticing a strange effect with this:

On my openSUSE 10.3 development machine with SUSE's default MTA
Postfix installed, I occasionally send a pre-formatted mail by
feeding it directly into "/usr/sbin/sendmail -t". If I try that
while running a 2.6.25-rc5-mm1 kernel, I get:

ts@xenon:~/kernel> /usr/sbin/sendmail -t < patch-usb-reduce-syslog-clutter-v3
postdrop: warning: can't open /proc/net/if_inet6 (Permission denied) - skipping IPv6 configuration
sendmail: warning: command "/usr/sbin/postdrop -r" exited with status 1
sendmail: fatal: ts(1000): unable to execute /usr/sbin/postdrop -r: Success
ts@xenon:~/kernel>

and unsurprisingly, the mail is not sent. If I do the same as root,
everything works as usual, there is no console output from the
sendmail command, and the mail goes out as it should. All other
networking applications appear to be running normally.

On a 2.6.25-rc5 (non-mm) kernel I do not need to run the sendmail
command as root. It works just as well if I run it as myself.

IPv6 is not in use on that machine. The Ethernet interface has
just the link local IPv6 address. Possibly relevant information:

ts@xenon:~> /sbin/ifconfig -a
eth0      Protokoll:Ethernet  Hardware Adresse 00:19:D1:03:D8:FF
          inet Adresse:192.168.59.102  Bcast:192.168.59.255  Maske:255.255.255.0
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:78 errors:0 dropped:0 overruns:0 frame:0
          TX packets:145 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 Sendewarteschlangenlänge:100
          RX bytes:9547 (9.3 Kb)  TX bytes:17952 (17.5 Kb)
          Speicher:92c00000-92c20000

lo        Protokoll:Lokale Schleife
          inet Adresse:127.0.0.1  Maske:255.0.0.0
          inet6 Adresse: ::1/128 Gültigkeitsbereich:Maschine
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:2 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 ...
From: Daniel Lezcano
Date: Thursday, March 13, 2008 - 3:21 pm

Hi Tilman,

Could you send the config file used to compile the kernel?
--

From: Tilman Schmidt
Date: Thursday, March 13, 2008 - 5:08 pm

Sure. You can find it at
http://gollum.phnxsoft.com/~ts/linux/config-2.6.25-rc5-mm1

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Daniel Lezcano
Date: Monday, March 17, 2008 - 3:44 am

[Empty message]
From: Benjamin Thery
Date: Monday, March 17, 2008 - 5:50 am

I also tried to reproduce your problem with Postfix (on a Debian
distro) but failed to obtain the error message.

While googling for the error string, I found this link, which reports
the same kind of error when Postfix is used with grsecurity (in 2006):

http://blog.jensthebrain.de/archives/2006/12/11/IPv6-Probleme-mit-Postfix-und-grsecurity

I barely understand German so I'm not sure it is related to your problem.

Benjamin


--

From: Tilman Schmidt
Date: Monday, March 17, 2008 - 6:35 am

The userspace failure described there is indeed the same as mine:
Postfix's sendmail command tries to open "/proc/net/if_inet6"
which fails with EACCES.

But I have never installed grsecurity on this machine, and the
problem appeared for me only with kernel 2.6.25-rc5-mm1, not when
running kernel 2.6.25-rc5 on the same machine, so I guess the
cause must be something different.

What's also strange is that I can "cat /proc/net/if_inet6" from
the command line as the same non-root user with no problem at all.
strace of "cat /proc/net/if_inet6" has:

open("/proc/net/if_inet6", O_RDONLY|O_LARGEFILE) = 3

strace of "/usr/sbin/sendmail", however:

open("/proc/net/if_inet6", O_RDONLY) = -1 EACCES (Permission denied)

Both run as

ts@xenon:~> id
uid=1000(ts) gid=100(users) groups=0(root),14(uucp),16(dialout),33(video),100(users),112(bacula)

HTH
T.

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Tilman Schmidt
Date: Monday, March 17, 2008 - 6:06 am

It's the one that comes with openSUSE 10.3:

ts@xenon:~> rpm -q postfix

Sure, no problem. You may find them at

http://gollum.phnxsoft.com/~ts/linux/main.cf
http://gollum.phnxsoft.com/~ts/linux/strace.log

HTH
T.

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Daniel Lezcano
Date: Monday, March 17, 2008 - 6:17 am

Thank you very much,  I will try to reproduce it with a simple program.
--

From: Benjamin Thery
Date: Wednesday, March 19, 2008 - 10:52 am

Tilman,

I've finally managed to reproduce your problem with Postfix on one of
my victims.

Earlier, in the afternoon, I wrote a piece of code that triggered a
similar behaviour, but I wasn't sure it was exactly the problem you
found. So, I've rebuilt Postfix, added some traces and, voila, same
issue as yours.
(The version of Postfix originally installed on my machine seems to
have IPv6 disabled)

I bisected the problem to the commit "[NET]: Make /proc/net a symlink
on /proc/self/net (v3)"

Here is what happens:

- Recently /proc/net has been moved to /proc/self/net, and /proc/net is
  now a symlink to this directory.
- Before that everybody could access /proc/net and read /proc/net/if_inet6:
   dr-xr-xr-x   6 root      root              0 2008-03-05 15:23 /proc/net

- Now, /proc/self/net has a more restrictive access mode and only the
  owner of the process can enter the directory:
  dr-xr--r-- 5 toto toto 0 Mar 19 17:30 net

  This is not a problem in most cases, but it becomes annoying when a
  process decides to change its UID or GID. It may lose access to its
  own /proc/self/net entries.

- What happens in the Postfix case is the 'sendmail' process executes the
   '/usr/sbin/postdrop' binary to enqueue the message, but unfortunately
   '/usr/sbin/postdrop' has the setgid bit set:
   -rwxr-sr-x 1 root postdrop 479475 Mar 19 17:14 /usr/sbin/postdrop

   The process egid changes and this seems to be problematic to access
   /proc/self/net/if_inet6. :)
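
To make the difference between the two directory modes concrete, here is a small sketch that checks the search (x) bit for each permission class. It is a simplified model of the classic Unix mode bits only (no capabilities, ACLs or security modules), and the `toy_` names are hypothetical. The octal values 0555 and 0544 are simply the listings dr-xr-xr-x and dr-xr--r-- from above.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Which permission class the caller falls into for a given inode. */
enum toy_class { TOY_OWNER, TOY_GROUP, TOY_OTHER };

/* Simplified model: true if `mode` grants search (x) permission to `cls`. */
bool toy_may_search(mode_t mode, enum toy_class cls)
{
	switch (cls) {
	case TOY_OWNER:
		return (mode & S_IXUSR) != 0;
	case TOY_GROUP:
		return (mode & S_IXGRP) != 0;
	default:
		return (mode & S_IXOTH) != 0;
	}
}
```

With 0555 every class may traverse the directory. With 0544 only the owner class may, so a process that is no longer treated as the owner cannot reach if_inet6 even though the file itself is readable.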

I've attached a tiny test program that can be used to reproduce the problem
without Postfix.
- Either execute it as root and pass it an unprivileged uid as argument:
  ./test-proc_net_if_inet6 1001

- Or change its ownership and access mode to: -rwxr-sr-x root postdrop
  and execute it as an ordinary user:
   chown root:postdrop test-proc_net_if_inet6; chmod 2755 test-proc_net_if_inet6
   ./test-proc_net_if_inet6
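
The attachment itself is not preserved in this archive. As a rough idea of what the first variant of such a reproducer might look like, here is a minimal sketch; it is not the original program, and the helper names `toy_try_open` and `toy_open_as_uid` are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to open `path` read-only; return 0 on success or the errno value. */
int toy_try_open(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}

/*
 * Switch to `uid` (needs root unless `uid` is the caller's own uid) and
 * then try to open `path`. Returns 0 on success, or the errno reported
 * by setuid() or open().
 */
int toy_open_as_uid(uid_t uid, const char *path)
{
	if (setuid(uid) != 0)
		return errno;
	return toy_try_open(path);
}
```

Run as root, `toy_open_as_uid(1001, "/proc/net/if_inet6")` would correspond to the first variant described above. Going by the strace output earlier in the thread, an affected kernel would presumably report EACCES here.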

I've found the cause but not the fix. :)
(Adding Pavel in cc:)

Regards,
Benjamin


From: Andrew Morton
Date: Wednesday, March 19, 2008 - 2:16 pm

On Wed, 19 Mar 2008 18:52:41 +0100

Thanks for that - most useful.

Although this is advertised as a 2.6.25-rc5-mm1 problem, I assume the
regression is also in mainline? 2.6.25-rc6?

--

From: Benjamin Thery
Date: Wednesday, March 19, 2008 - 3:14 pm

On Wed, Mar 19, 2008 at 10:16 PM, Andrew Morton

Yes, it is in mainline. I reproduced it on 2.6.25-rc5.

Benjamin
--

From: David Miller
Date: Wednesday, March 19, 2008 - 3:49 pm

From: Andrew Morton <akpm@linux-foundation.org>

It is in 2.6.25-rc6, correct.

If Pavel or someone else doesn't produce a good fix soon
I'll revert the guilty change as this bug is worse than
the problem that changeset fixes.
--

From: Benjamin Thery
Date: Thursday, March 20, 2008 - 1:26 am

Andre Noll sent a patch to LKML, acked by Pavel:

"Fix permissions of /proc/net"
http://thread.gmane.org/gmane.linux.kernel/655148

Benjamin
--

From: Rafael J. Wysocki
Date: Thursday, March 20, 2008 - 3:21 am

Have you tested that patch?

Rafael
--

From: Pavel Emelyanov
Date: Thursday, March 20, 2008 - 5:52 am

From: Benjamin Thery
Date: Thursday, March 20, 2008 - 6:48 am

Also tested here. It fixes the regression.


Benjamin
--

From: Rafael J. Wysocki
Date: Thursday, March 20, 2008 - 7:38 am

OK, thanks.

Rafael
--

From: Tilman Schmidt
Date: Wednesday, March 19, 2008 - 4:31 pm

My results:

up to 2.6.25-rc5 -- good
2.6.25-rc5-mm1 -- bad
2.6.25-rc6 -- bad

HTH
T.

-- 
Tilman Schmidt                          E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)

From: Laurent Riffard
Date: Thursday, March 13, 2008 - 3:07 pm

Le 11.03.2008 09:14, Andrew Morton a 
From: Andrew Morton
Date: Thursday, March 13, 2008 - 3:38 pm

On Thu, 13 Mar 2008 23:07:30 +0100

Actually I later dropped
signals-send_signal-factor-out-signal_group_exit-checks.patch at Oleg's
request.

But I don't think we did that because it was known to be buggy, so perhaps
the same bug crept back in in another form..


--

From: Oleg Nesterov
Date: Thursday, March 13, 2008 - 10:26 pm

Laurent, thanks a lot!


Yes, currently I suspect we have another bug.

And. While doing this patch I forgot we should fix the bugs with init first!
(will try to make the patch soon).

Laurent, any chance you can try 2.6.25-rc5-mm1 + the patch below?
Unlikely it can help, but would be great to be sure.

Oleg.

--- MM/kernel/signal.c~	2008-03-14 08:08:07.000000000 +0300
+++ MM/kernel/signal.c	2008-03-14 08:08:17.000000000 +0300
@@ -719,6 +719,10 @@ static void complete_signal(int sig, str
 		/*
 		 * This signal will be fatal to the whole group.
 		 */
+if (is_global_init(p)) {
+	printk(KERN_CRIT "ERR!! init is killed by %d\n", sig);
+	WARN_ON_ONCE(1);
+} else
 		if (!sig_kernel_coredump(sig)) {
 			/*
 			 * Start a group exit and wake everybody up.

--

From: Laurent Riffard
Date: Friday, March 14, 2008 - 2:06 pm

Le 14.03.2008 06:26, Oleg Nesterov a 
From: Oleg Nesterov
Date: Saturday, March 15, 2008 - 5:03 am

Great. Thanks a lot Laurent!

So what happens is:

We have the very old bug (bugs, actually) with the global init && signals
which I tried to fix many times but can't find a simple solution. The fatal
signal sent to init doesn't really kill it (we have the check in
get_signal_to_deliver) but it sets SIGNAL_GROUP_EXIT. This is wrong, now
init can't exec, this has other bad implications, and this is just insane.

With the signals-send_signal-factor-out-signal_group_exit-checks.patch the
task with SIGNAL_GROUP_EXIT doesn't receive the signals. While this change
itself is (I hope) correct, the "killed" /sbin/init now can't see SIGCHLD

Not a kernel problem, but this looks a bit strange to me.

init has SIG_DFL for SIGUSR1, and someone does kill(1, SIGUSR1).
Note that init was explicitly targeted, the signal was not sent
to a pgrp or -1.
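
The interaction described above can be pictured with a toy state machine. This is a sketch only: the `toy_` names and the flag value are hypothetical, and the two functions merely model the behaviour as described in this message (a fatal signal sets the group-exit flag even for init, and a task with that flag set no longer receives signals). It is not the kernel's signal code.

```c
#include <stdbool.h>

#define TOY_SIGNAL_GROUP_EXIT 0x1u

struct toy_task {
	bool is_global_init;
	unsigned int signal_flags;
	unsigned int delivered;	/* number of signals actually queued */
	bool alive;
};

/* A fatal signal marks the whole group as exiting, even for init. */
void toy_send_fatal(struct toy_task *t)
{
	t->signal_flags |= TOY_SIGNAL_GROUP_EXIT;
	/* Models the check said to live in get_signal_to_deliver: init survives. */
	if (!t->is_global_init)
		t->alive = false;
}

/* With the factored-out check, a task in group exit gets no new signals. */
bool toy_send_signal(struct toy_task *t)
{
	if (t->signal_flags & TOY_SIGNAL_GROUP_EXIT)
		return false;
	t->delivered++;
	return true;
}
```

In this model a single kill(1, SIGUSR1) with the default action leaves init alive but permanently deaf to later signals such as SIGCHLD.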

Most likely Ubuntu knows what it does, and I can't find any email
at ubuntu.com to cc...

Oleg.

--

From: Mariusz Kozlowski
Date: Sunday, March 16, 2008 - 2:38 pm

Hello,

	The build on my laptop (32bit x86) fails.

sound/drivers/pcsp/pcsp.c: In function 'snd_pcsp_create':
sound/drivers/pcsp/pcsp.c:54: error: 'loops_per_jiffy' undeclared (first use in this function)
sound/drivers/pcsp/pcsp.c:54: error: (Each undeclared identifier is reported only once
sound/drivers/pcsp/pcsp.c:54: error: for each function it appears in.)

Seems like the patch below is needed.

	Mariusz

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

--- linux-2.6.25-rc5-mm1-a/sound/drivers/pcsp/pcsp.c	2008-03-16 21:34:28.000000000 +0100
+++ linux-2.6.25-rc5-mm1-b/sound/drivers/pcsp/pcsp.c	2008-03-16 21:58:58.000000000 +0100
@@ -12,6 +12,7 @@
 #include <sound/initval.h>
 #include <sound/pcm.h>
 #include <linux/input.h>
+#include <linux/delay.h>
 #include <asm/bitops.h>
 #include "pcsp_input.h"
 #include "pcsp.h"
--

From: Mariusz Kozlowski
Date: Friday, March 28, 2008 - 3:52 pm

Hello,

	The gregkh-pci-pci-sparc64-use-generic-pci_enable_resources.patch which
replaces arch-specific code with generic pci_enable_resources() makes my sparc64
box unable to boot (that's what quilt bisection says). At first I see these messages:

hme 0000:00:01.1: device not available because of BAR 0 [1ff80008000:1ff8000f01f] collisions
sym53c8xx 0000:00:03.0: device not available because of BAR 0 [1fe02010400:1fe020104ff] collisions
sym53c8xx 0000:00:03.1: device not available because of BAR 0 [1fe02010800:1fe020108ff] collisions

and finally, infamous

VFS: Cannot open root device "sda3" or unknown-block(0,0)

	Mariusz

PS. I attached .config used at bisection time.

# lspci 
0000:00:00.0 Host bridge: Sun Microsystems Computer Corp. Psycho PCI Bus Module
0000:00:01.0 Bridge: Sun Microsystems Computer Corp. EBUS (rev 01)
0000:00:01.1 Ethernet controller: Sun Microsystems Computer Corp. Happy Meal (rev 01)
0000:00:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14)
0000:00:03.1 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14)
0001:00:00.0 Host bridge: Sun Microsystems Computer Corp. Psycho PCI Bus Module

$ uname -a
Linux sparc64 2.6.25-rc5 #2 SMP PREEMPT Fri Mar 28 12:16:30 CET 2008 sparc64 sun4u TI UltraSparc II (BlackBird) GNU/Linux
From: David Miller
Date: Friday, March 28, 2008 - 4:10 pm

From: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

Yes, that generic code won't work because of the NULL
r->parent check.

Alpha, ARM, V32, FRV, IA64, MIPS, MN10300, PARISC, PPC,
SH, V850, X86, and Xtensa are all likely to run into
problems because of this change.

The only platform that did the check as a test of r->parent
being NULL is Powerpc.

The rest either didn't check (like sparc64), or tested it by going:

	if (!r->start && r->end)

So the amount of potential breakage from this change is enormous.
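
The two styles of check can be put side by side in a small sketch. The struct and function names are hypothetical (the kernel's `struct resource` has more fields), and the example resource assumes, as the reply above suggests, that on sparc64 a firmware-assigned BAR has a valid range but no parent set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy version of a PCI resource; only the fields discussed here. */
struct toy_resource {
	unsigned long long start;
	unsigned long long end;
	struct toy_resource *parent;
};

/* Generic-code style check: no parent means "not claimed", so refuse. */
bool toy_rejected_by_parent_check(const struct toy_resource *r)
{
	return r->parent == NULL;
}

/* Older per-arch style check: an end but no start means "not assigned". */
bool toy_rejected_by_range_check(const struct toy_resource *r)
{
	return !r->start && r->end;
}
```

For the hme BAR 0 range from the report (1ff80008000 to 1ff8000f01f) with a NULL parent, the parent check rejects the device while the range check accepts it, which matches the "device not available because of BAR 0 ... collisions" messages.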
--

From: Benjamin Herrenschmidt
Date: Friday, March 28, 2008 - 5:44 pm

ppc and x86 won't have problem, I haven't checked the others, sparc64

Yup, though that makes sense to do it that way on platforms that

Not that big, but yeah, it should be limited to platforms that actually
build a resource-tree and keep track of assigned & allocated resources,
which sparc64 doesn't (which is fair enough, if your firmware is 100%
right and your kernel never has to assign things itself). A non-NULL
parent is a 100% indication that the resource was properly claimed and
put in the resource-tree (and thus is non conflicting) on those
platforms, but it's unused on sparc64.

Basically, on platforms like x86 or powerpc, the PCI subsystem at boot
builds a resource tree by collecting resources for all enabled devices
and bridges in a first pass, then all others in a second pass, checking
for conflicts or unassigned ones, and potentially re-assigning and
re-allocating bridges if necessary.

Sparc64 takes a different approach, it basically doesn't bother with a
full resource tree, and just claims what drivers claim, which is fine as
long as you are certain that you always get a perfectly well assigned &
non conflicting setup done by your firmware.

The "full featured" approach is necessary for platforms where this isn't
the case, such as powerpc, even with a pretty good OF like Apple ones,
since they love to not assign resources that they know their MacOS
driver will not need (such as not assigning IO space and closing it on
the P2P bridge), which doesn't necessarily fit the
requirements of the Linux drivers, in addition to the gross bugs they
have on some versions when using cards with P2P bridges on them.

In addition, we also need that resource management to be able to
dynamically assign resources after boot as our OF doesn't stay alive to
do it, such as when using cardbus cards, or other types of hotplug things
for which the firmware doesn't do dynamic resource allocation.

So, the meat of the original patch isn't bad per-se. There is definitely
a ...