ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm...

- Added the wm97xx touchscreen driver tree, as git-wm97xx.patch (Mark Brown <broonie@opensource.wolfsonmicro.com>)
- git-alsa.patch has been replaced by git-alsa-tiwai.patch
- git-drm.patch is dropped due to build errors
- git-md-accel.patch has been replaced with git-async_tx.patch
- git-xfs.patch is dropped due to extensive git rejects.
- Added the VFS git tree, as git-vfs.patch (Al Viro <viro@zeniv.linux.org.uk>)

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the mm-commits mailing list.

  echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is most valuable if you can perform a bisection search to identify which patch introduced the bug. Instructions for this process are at

  http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and reboots), so consider reporting the bug first and if we cannot immediately identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing list on any email.

- When reporting bugs in this kernel via email, please also rewrite the email Subject: in some manner to reflect the nature of the bug. Some developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on the mm-commits list. These probably are ...
Been seeing these crop up once in a while - can take hours after a reboot before I see the first one, but once I see one, I'm likely to see more, at a frequency of anywhere from ~5 seconds to ~10 minutes between BUG msgs.

BUG: scheduling while atomic: swapper/0/0xffffffff
Pid: 0, comm: swapper Tainted: P 2.6.25-rc8-mm1 #4
Call Trace:
[<ffffffff8020b2f4>] ? default_idle+0x0/0x74
[<ffffffff8022be19>] __schedule_bug+0x5d/0x61
[<ffffffff80552aea>] schedule+0x11a/0x9e4
[<ffffffff805536ce>] ? preempt_schedule+0x3c/0xaa
[<ffffffff802480f1>] ? hrtimer_forward+0x82/0x96
[<ffffffff804600a4>] ? cpuidle_idle_call+0x0/0xd5
[<ffffffff8020b2f4>] ? default_idle+0x0/0x74
[<ffffffff8020b2e0>] cpu_idle+0xf6/0x10a
[<ffffffff80540cb2>] rest_init+0x86/0x8a

Eventually, I end up with a basically hung system, and need to alt-sysrq-B.

Yes, I know it's tainted, and it's possible the root cause is a self-inflicted buggy module - but the traceback above seems odd. Did some of my code manage to idle the CPU while is_atomic was set, or is the path from cpu_idle on down doing something it shouldn't be?

(I admit being confused - if my code was the source of the is_atomic error, shouldn't it have been caught on the *previous* call to schedule - the one that ran through all the queues and decided we should invoke idle?)
Sounds sane. Perhaps preempt_count is getting mucked up in interrupt context? iirc there's some toy in either the recently-added tracing code or still in the -rt tree which would help find a missed unlock, but I forget what it was. Ingo will know... --
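A note on reading the report above: if the last field of the "scheduling while atomic" line is the task's preempt count printed as a 32-bit hex value (an assumption about the message format, not something stated in this thread), then 0xffffffff is simply -1, which is what one unmatched decrement would produce. A minimal sketch of that arithmetic, with every model_ name hypothetical and no kernel code involved:

```c
#include <stdint.h>

/* Toy stand-in for the per-task preempt counter; not the kernel's code. */
static int32_t model_preempt_count;

void model_preempt_disable(void) { model_preempt_count++; }
void model_preempt_enable(void)  { model_preempt_count--; }

/* Returns 1 if a schedule() at this point would be flagged as atomic. */
int model_in_atomic(void) { return model_preempt_count != 0; }

/* One disable followed by two enables: the counter underflows to -1. */
uint32_t model_unbalanced_enable(void)
{
    model_preempt_count = 0;
    model_preempt_disable();
    model_preempt_enable();
    model_preempt_enable();   /* the stray, unmatched enable */
    return (uint32_t)model_preempt_count;
}
```

In this model a nonzero count of either sign trips the check, so a missed unlock and an extra unlock both produce the same BUG line; only the printed value tells them apart.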
After

$ echo -n 4-1.2 >/sys/bus/usb/drivers/usb/unbind
$ echo -n 4-1.2 >/sys/bus/usb/drivers/usb/bind

I have this in logs:

sysfs: duplicate filename 'usbdev4.12_ep81' can not be created
------------[ cut here ]------------
WARNING: at /home/l/latest/xxx/fs/sysfs/dir.c:425 sysfs_add_one+0x99/0xc0()
Modules linked in: usbhid hid nls_cp437 vfat fat usb_storage tun bitrev ipv6 arc4 ecb crypto_blkcipher cryptomgr crypto_algapi ath5k mac80211 sr_mod crc32 ohci1394 rtc_cmos cfg80211 ieee1394 floppy rtc_core ehci_hcd rtc_lib ff_memless cdrom [last unloaded: hid]
Pid: 539, comm: bash Tainted: G W 2.6.25-rc8-mm1_64 #395
Call Trace:
[<ffffffff8022f07f>] warn_on_slowpath+0x5f/0x80
[<ffffffff80230197>] ? printk+0x67/0x70
[<ffffffff802d9bd0>] ? sysfs_ilookup_test+0x0/0x20
[<ffffffff802a12e8>] ? ifind+0x58/0xc0
[<ffffffff802d9bd0>] ? sysfs_ilookup_test+0x0/0x20
[<ffffffff802d9f49>] sysfs_add_one+0x99/0xc0
[<ffffffff802daf68>] sysfs_create_link+0xa8/0x130
[<ffffffff8038ebda>] device_add+0x2aa/0x4d0
[<ffffffff80310c26>] ? kobject_init+0x36/0x80
[<ffffffff8038ee19>] device_register+0x19/0x20
[<ffffffff803dbbec>] usb_create_ep_files+0x19c/0x320
[<ffffffff803dadb3>] usb_create_sysfs_intf_files+0xd3/0x100
[<ffffffff803d630c>] usb_set_configuration+0x3ac/0x5f0
[<ffffffff803df81a>] generic_probe+0x7a/0xb0
[<ffffffff803d83fa>] usb_probe_device+0x3a/0x40
[<ffffffff80390ceb>] driver_probe_device+0x9b/0x1a0
[<ffffffff803901b3>] driver_bind+0xb3/0x100
[<ffffffff8038f8a7>] drv_attr_store+0x27/0x30
[<ffffffff802d94ab>] sysfs_write_file+0xeb/0x140
[<ffffffff8028cc57>] vfs_write+0xc7/0x170
[<ffffffff8028d2f0>] sys_write+0x50/0x90
[<ffffffff8020b5eb>] system_call_after_swapgs+0x7b/0x80
---[ end trace 6ee6d593d4e510b4 ]---

I think this is a 2.6.25-rc5-mm1 regression, there while :; do echo -n 4-1.2 ...
Does this also show up in 2.6.25-rc8 without -mm? I thought I fixed this already, I don't see what slipped into -mm that would have caused it to come back. Time to run some more tests... Oh, also note that binding and unbinding the main "usb" driver is not encouraged, or even supported. I'm amazed it works, as this is not something that any "real" user would do as it makes no sense at all because we have no "alternative" drivers yet for the main USB device. thanks, greg k-h --
It's a real bug. I don't have time to track it down now. Next week... Alan Stern --
Here's the answer. The bug was introduced when the definition of device_is_registered() in include/linux/device.h was changed. The old definition returned 0 when called inside a driver's remove method for a device being unregistered, whereas the new definition returns 1. I don't know when this change was made.

This patch ought to fix the problem. Jiri, can you confirm that it works?

Alan Stern

-----------------------------------------------------------

Removing an interface's sysfs files before unregistering the interface doesn't work properly, because usb_unbind_interface() will reinstall altsetting 0 and thereby create new sysfs files. This patch (as1074) removes the files after the unregistration is finished. It's not quite as clean, but at least it works.

Also, there's no need to check if an interface has been registered before removing its sysfs files. If it hasn't been registered then the files won't have been created, so usb_remove_sysfs_intf_files() will simply do nothing.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

---

Index: usb-2.6/drivers/usb/core/message.c
===================================================================
--- usb-2.6.orig/drivers/usb/core/message.c
+++ usb-2.6/drivers/usb/core/message.c
@@ -1089,8 +1089,8 @@ void usb_disable_device(struct usb_devic
 continue;
 dev_dbg(&dev->dev, "unregistering interface %s\n",
 interface->dev.bus_id);
- usb_remove_sysfs_intf_files(interface);
 device_del(&interface->dev);
+ usb_remove_sysfs_intf_files(interface);
 }

 /* Now that the interfaces are unbound, nobody should
@@ -1231,7 +1231,7 @@ int usb_set_interface(struct usb_device
 */

 /* prevent submissions using previous endpoint settings */
- if (iface->cur_altsetting != alt && device_is_registered(&iface->dev))
+ if (iface->cur_altsetting != alt)
 usb_remove_sysfs_intf_files(iface);
 usb_disable_interface(dev, iface);

@@ -1330,8 +1330,7...
Tested-by: Jiri Slaby <jirislaby@gmail.com> Works well, thanks. --
I've changed that in the -mm tree to make some PCI stuff much easier. I didn't realize that USB was depending on when this was being set, sorry about it. I like your fix better, it makes the code path much simpler :) thanks, greg k-h --
Well, it's not really any _simpler_, since all I did was interchange two lines of code. But I agree this way is better. It doesn't depend on the behavior of device_is_registered() in the ill-defined situation where the device is in the middle of being unregistered. Alan Stern --
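The ordering argument in this sub-thread can be sketched with a toy model. This is only an illustration under stated assumptions: the model_ names are hypothetical, and the one behaviour taken from the patch description is that unbinding an interface reinstalls altsetting 0 and thereby recreates the endpoint files.

```c
/* Toy model of an interface and its sysfs files; all names are hypothetical. */
struct model_intf {
    int bound;        /* a driver is bound to the interface */
    int sysfs_files;  /* endpoint files currently present */
};

void model_remove_files(struct model_intf *i) { i->sysfs_files = 0; }

/* Unbinding reinstalls altsetting 0, which recreates the endpoint files. */
void model_unbind(struct model_intf *i)
{
    i->bound = 0;
    i->sysfs_files = 1;
}

/* Stand-in for device_del(): unregistering unbinds the driver first. */
void model_device_del(struct model_intf *i)
{
    if (i->bound)
        model_unbind(i);
}

/* Old order: files are removed, then recreated during unbind and left behind. */
int model_teardown_old(void)
{
    struct model_intf i = { 1, 1 };
    model_remove_files(&i);
    model_device_del(&i);
    return i.sysfs_files;
}

/* Patched order: unregister first, then remove whatever files exist. */
int model_teardown_new(void)
{
    struct model_intf i = { 1, 1 };
    model_device_del(&i);
    model_remove_files(&i);
    return i.sysfs_files;
}
```

In the model, the old order leaves one stale file behind, which is the kind of leftover that would make a later bind hit "duplicate filename"; the patched order leaves none and does not need to ask whether the device is still registered.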
8/2.6.25-rc8-mm1/

This fails to come up on my development machine, apparently because it has trouble accessing the SATA hard disks.

Hardware: Intel Pentium D940, Intel DQ965GF board, two SATA hard disks.

Some unusual things I noticed during the boot process:

- a message "doing fast boot" that looked unfamiliar; unfortunately it scrolled off too quickly to note its context

- for each of the two SATA ports in use, a message "SATA port is slow to respond, please be patient" accompanied by about 10 secs wait

- it actually got past the point where it mounts the root file system, so it must have thought it could access the disks

- finally, the system hung completely after the SUSE startup messages

  Setting current sysctl status from /etc/sysctl.conf
  net.ipv4.icmp_echo_ignore_broadcasts = 1

  with a dead keyboard and I had to hit the Win^Wreset button.

- After rebooting into 2.6.24-rc8 (which works fine), nothing had been written to the disks, not even the dmesg output which SUSE usually dumps into /var/log/boot.msg early during startup.

Before I try booting that kernel again, any instructions on what to watch out for? Is netconsole usable again? Other ideas?

Regards,
Tilman
On Fri, 04 Apr 2008 01:08:19 +0200 Usual stuff: `diff -u dmesg-2.6.25-rc8 dmesg-2.6.25-rc8-mm1'. Bisection. Thanks. --
Final report, seeing -mm2 is out:

- Netconsole works. (grumblestupidsusefirewallgrumble)

- The hang during boot only happens with kernels compiled with CONFIG_CIFS_EXPERIMENTAL=y
  It also doesn't always happen at the same point in the boot sequence. I'm suspecting it might be triggered by some network packet. Anyway, it's obviously *not* a SATA problem. (That was just me jumping to conclusions, because ...)

- That leaves only the messages

  ata1: port is slow to respond, please be patient (Status 0x80)
  ata1: COMRESET failed (errno=-16)

  and accompanying delays during boot, for each installed SATA disk. I'll try to find the time to retest this with 2.6.25-rc8-mm2.

Thanks,
Tilman

--
Tilman Schmidt E-Mail: tilman@imap.cc
Wehrhausweg 66 Fax: +49 228 4299019
53227 Bonn, Germany
Done. The messages and delays do *not* happen with 2.6.25-rc8-mm2. HTH Tilman
I don't remember seeing a report of the CIFS hang. It might be caused by bkl-removal-convert-cifs-over-to-unlocked_ioctl.patch, but it's hard to see That would be good, thanks. --
This is taking longer than I hoped, so here's a little progress report.
That message doesn't make it into dmesg. It's apparently a Suse thing,
These messages seem to be a separate issue. I also get them with
a .config that otherwise brings up the system successfully. That
allowed me to capture a dmesg, so here are some possibly interesting
hunks of the diff between a mainline kernel and a working 2.6.25-rc8-mm1
one:
--- dmesg-2.6.25-rc8-git.nots-reordered 2008-04-09 15:29:52.000000000 +0200
+++ dmesg-2.6.25-rc8-mm1.nots 2008-04-09 00:48:42.000000000 +0200
@@ -1,4 +1,4 @@
- Linux version 2.6.25-rc8-testing-00210-g51ac03f (ts@xenon) (gcc version 4.2.1 (SUSE Linux)) #37 SMP PREEMPT Wed Apr 9 01:27:07 CEST 2008
+ Linux version 2.6.25-rc8-mm1-testing (ts@xenon) (gcc version 4.2.1 (SUSE Linux)) #6 SMP PREEMPT Wed Apr 9 00:24:23 CEST 2008
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000008f000 (usable)
BIOS-e820: 000000000008f000 - 00000000000a0000 (reserved)
[...]
@@ -244,12 +277,10 @@
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) D CPU 3.20GHz stepping 04
- Total of 2 processors activated (12796.06 BogoMIPS).
+ Total of 2 processors activated (12796.87 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
- checking TSC synchronization [CPU#0 -> CPU#1]:
- Measured 560 cycles TSC warp between CPUs, turning off TSC clock.
- Marking TSC unstable due to: check_tsc_sync_source failed.
+ checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
CPU0 attaching sched-domain:
domain 0: span 03
[Nice - at last a kernel that likes my TSC; not sure if it matters though.]
@@ -846,26 +880,36 @@
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi0 : ahci
PM: Adding info for No Bus:host0
+ PM: Adding info for No Bus:host0
scsi1 : ahci
PM: Adding info for No...

Actually git-agp.patch is broken, and should have been dropped. The drm patch is fine. I did mention this to you in two separate e-mails :) Dave. --
git-drm has a bunch of git rejects against mainline. I had a go at fixing them but it didn't work out and I had other stuff to look at. --
(Yes, I know the kernel is tainted. Hopefully the traceback will make enough sense that it won't matter. I think I cc'd most everybody who is listed in MAINTAINERS or had a non-trivial jbd, quota, or ext3 patch in the broken-out/) So I was running a 'yum update' on my laptop, walked away to ask a cow-orker a question, and came back to find it had BUG'ed twice... Amazingly enough, although it died in ext3 code, it apparently only nuked whatever filesystem it was handling, as syslog was still able to log the gory details into a file in /var. Given that a kernel rpm was the one it failed on, the I/O was almost certainly on either / or /boot - both ext3. / is mounted with quotas, /boot isn't, so I'm betting on / Apr 2 13:48:07 turing-police yum: Updated: texlive-texmf-latex-2007-18.fc9.noarch Apr 2 13:48:08 turing-police yum: Updated: 1:openoffice.org-xsltfilter-2.4.0-12.4.fc9.x86_64 Apr 2 13:48:09 turing-police yum: Updated: 1:openoffice.org-javafilter-2.4.0-12.4.fc9.x86_64 Apr 2 13:48:12 turing-police yum: Updated: kernel-headers-2.6.25-0.185.rc7.git6.fc9.x86_64 (here, it started updating kernel-2.6.25-0.185.rc7.git6 and died while I wasn't looking) [34895.379293] ------------[ cut here ]------------ [34895.379299] kernel BUG at fs/jbd/transaction.c:275! 
[34895.379302] invalid opcode: 0000 [1] PREEMPT SMP [34895.379306] last sysfs file: /sys/devices/platform/coretemp.1/temp1_input [34895.379309] CPU 0 [34895.379311] Modules linked in: gspca(U) compat_ioctl32 videodev v4l1_compat irnet ppp_generic slhc irtty_sir sir_dev ircomm_tty ircomm irda crc_ccitt coretemp vmnet(P)(U) vmmon(P)(U) nf_conntrack_ftp xt_pkttype ipt_REJECT ipt_osf nf_conntrack_ipv4 xt_ipisforif ipt_recent ipt_LOG xt_u32 iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables sha256_generic aes_generic acpi_cpufreq tpm_tis arc4 pcmcia ecb iwl3945 yenta_socket nvidia(P)(U) iTCO_wdt firmware_class iTCO_vendor_support rsrc_nonstatic mac80211 v...
<snip>
Try this patch, it will keep us from re-entering the fs when we aren't supposed
to. cc'ing Eric Paris since he's the only selinux guy I know :). I don't think
any of the other allocations in here need to be fixed, but I didn't look too
carefully.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index c2fef7b..820d07a 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -180,7 +180,7 @@ static int inode_alloc_security(struct inode *inode)
struct task_security_struct *tsec = current->security;
struct inode_security_struct *isec;
- isec = kmem_cache_zalloc(sel_inode_cache, GFP_KERNEL);
+ isec = kmem_cache_zalloc(sel_inode_cache, GFP_NOFS);
if (!isec)
return -ENOMEM;
@@ -2429,7 +2429,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
return -EOPNOTSUPP;
if (name) {
- namep = kstrdup(XATTR_SELINUX_SUFFIX, GFP_KERNEL);
+ namep = kstrdup(XATTR_SELINUX_SUFFIX, GFP_NOFS);
if (!namep)
return -ENOMEM;
*name = namep;
---- Stephen Smalley National Security Agency --
On Wed, 2 Apr 2008 15:27:15 -0400 Might fix it. But 2.6.24's inode_alloc_security() also uses GFP_KERNEL and doesn't have this bug. What changed? --
Looks legitimate, although we've been doing that since Linux 2.6.0-test3 (selinux merge) for inode_alloc_security and d_instantiate, and since Linux 2.6.14 for inode_init_security, so something is at least triggering it more easily now. inode_doinit_with_dentry looks like another instance and security_context_to_sid_core as well. -- Stephen Smalley National Security Agency --
I guess it is just the combination of someone using SELinux + quota (or several journaling filesystems) + being unlucky under memory pressure that makes this happen only rarely. Josef, have you been successful in reproducing the problem under older kernel? Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs --
Not yet, I haven't been lucky enough apparently. I'm going to kick it off on a couple of boxes over the weekend and see if I can't hit it on one of them instead of just relying on the one box. Josef --
Thanks, I'll push this to Linus, but note that further analysis is required. -- James Morris <jmorris@namei.org> --
Please review.
More cases where SELinux must not re-enter the fs code.
Called from the d_instantiate security hook.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
---
security/selinux/hooks.c | 7 ++++---
security/selinux/include/security.h | 3 ++-
security/selinux/ss/services.c | 12 +++++++-----
3 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 41a049f..95b51b6 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1143,7 +1143,7 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
}
len = INITCONTEXTLEN;
- context = kmalloc(len, GFP_KERNEL);
+ context = kmalloc(len, GFP_NOFS);
if (!context) {
rc = -ENOMEM;
dput(dentry);
@@ -1161,7 +1161,7 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
}
kfree(context);
len = rc;
- context = kmalloc(len, GFP_KERNEL);
+ context = kmalloc(len, GFP_NOFS);
if (!context) {
rc = -ENOMEM;
dput(dentry);
@@ -1185,7 +1185,8 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
rc = 0;
} else {
rc = security_context_to_sid_default(context, rc, &sid,
- sbsec->def_sid);
+ sbsec->def_sid,
+ GFP_NOFS);
if (rc) {
printk(KERN_WARNING "%s: context_to_sid(%s) "
"returned %d for dev=%s ino=%ld\n",
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index f7d2f03..44e12ec 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -86,7 +86,8 @@ int security_sid_to_context(u32 sid, char **scontext,
int security_context_to_sid(char *scontext, u32 scontext_len,
u32 *out_sid);
-int security_context_to_sid_default(char *scontext, u32 scontext_len, u32 *out_sid, u32 def_sid);
+int security_contex...

-- James Morris <jmorris@namei.org> --
I don't see why the problem couldn't happen in 2.6.24, I'm sure if I generate enough memory pressure and start creating a bunch of files I could reproduce the same thing. /me wanders off to try, Josef --
On Wed, 02 Apr 2008 15:12:49 -0400 The backtrace tells it all - we were inside a transaction for filesystem A, went into page reclaim, reclaimed an inode for filesystem B and then DQUOT_DROP() tried to start a transaction on filesystem B. JBD doesn't like cross-fs nested transactions (it'll corrupt task_struct.journal_info, and will cause ab/ba deadlocks). So it went BUG. Presumably something in the quota updates in -mm caused this. --
I think quota is innocent in this ;). We start a transaction in ext3_dquot_drop() for quite some time already. The problem is really in inode_alloc_security() as Josef pointed out. We really aren't allowed to allocate with GFP_KERNEL there because the reclaim code could as well decide to just write an inode on a different filesystem... Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR --
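The failure mode described in the last two messages can be sketched as a small userspace model. It is a simplification under stated assumptions: MODEL_ALLOW_FS is a hypothetical flag standing in for the difference between GFP_KERNEL and GFP_NOFS, and model_current_journal stands in for task_struct.journal_info. None of this is JBD or allocator code.

```c
#include <stddef.h>

/* Hypothetical flag bit standing in for "allocation may re-enter fs code". */
#define MODEL_ALLOW_FS 0x1

struct model_journal { int id; };

/* Stand-in for task_struct.journal_info: the journal of the open handle. */
static struct model_journal *model_current_journal;

/* Returns 0 on success, -1 where JBD would BUG on a cross-fs nested start. */
int model_journal_start(struct model_journal *j)
{
    if (model_current_journal && model_current_journal != j)
        return -1;
    model_current_journal = j;
    return 0;
}

void model_journal_stop(void) { model_current_journal = NULL; }

/* Under memory pressure, an FS-capable allocation may reclaim an inode of
 * another filesystem, whose quota drop starts a transaction there. */
int model_alloc_under_pressure(int flags, struct model_journal *other_fs)
{
    if (flags & MODEL_ALLOW_FS)
        return model_journal_start(other_fs);
    return 0; /* reclaim skips filesystem work, so no nested start */
}

/* Allocate while holding a handle on fs A; fs B is the reclaim victim. */
int model_run(int flags)
{
    struct model_journal a = { 1 }, b = { 2 };
    int rc;

    model_current_journal = NULL;
    model_journal_start(&a);
    rc = model_alloc_under_pressure(flags, &b);
    model_journal_stop();
    return rc;
}
```

The model only fails when reclaim actually picks an inode from a second journaled filesystem, which fits the observation in the thread that the allocation flags had been wrong for years without the bug being reported.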
Hello,

sparc64 box, gcc 4.1.2

CC arch/sparc64/mm/init.o
arch/sparc64/mm/init.c: In function 'paging_init':
arch/sparc64/mm/init.c:1303: error: size of array 'type name' is negative

and this is

BUILD_BUG_ON(BITS_PER_LONG - NR_PAGEFLAGS != 32);

Mariusz
yup, thanks. That's due to some page-flag rework in the memory management queue. The patches which broke mips as well. I'm pushing cross-compilers in Christoph's direction and hoping stuff gets fixed... --
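For readers puzzled by the wording of the sparc64 error: a compile-time assertion of this kind is commonly built by declaring an array type whose size goes negative when the condition is true. The sketch below uses that trick under a different macro name; the two constants are illustrative assumptions, not the values from the failing sparc64 build.

```c
/* A true condition yields char[-1], which the compiler rejects;
 * a false condition yields char[1], which compiles to nothing. */
#define MODEL_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define MODEL_BITS_PER_LONG 64   /* assumed 64-bit target */
#define MODEL_NR_PAGEFLAGS  32   /* hypothetical flag count */

int model_check_layout(void)
{
    /* Compiles: 64 - 32 == 32, so the array size is 1. */
    MODEL_BUILD_BUG_ON(MODEL_BITS_PER_LONG - MODEL_NR_PAGEFLAGS != 32);
    /* With MODEL_NR_PAGEFLAGS changed to 33 the line above would stop
     * compiling, with the compiler complaining about a negative array size. */
    return MODEL_BITS_PER_LONG - MODEL_NR_PAGEFLAGS;
}
```

Read this way, the report says the page-flag rework changed NR_PAGEFLAGS so that the sparc64 layout assumption no longer holds.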
CC [M] drivers/net/wireless/iwlwifi/iwl3945-base.o drivers/net/wireless/iwlwifi/iwl3945-base.c: In function 'iwl3945_build_tx_cmd_basic': drivers/net/wireless/iwlwifi/iwl3945-base.c:2492: error: 'struct iwl3945_priv' has no member named 'rxtxpackets' make[4]: *** [drivers/net/wireless/iwlwifi/iwl3945-base.o] Error 1 # # Automatically generated make config: don't edit # Linux kernel version: 2.6.25-rc8-mm1 # Tue Apr 1 21:44:54 2008 # # CONFIG_64BIT is not set CONFIG_X86_32=y # CONFIG_X86_64 is not set CONFIG_X86=y # CONFIG_GENERIC_LOCKBREAK is not set CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y # CONFIG_GENERIC_GPIO is not set CONFIG_ARCH_MAY_HAVE_PC_FDC=y # CONFIG_RWSEM_GENERIC_SPINLOCK is not set CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y # CONFIG_GENERIC_TIME_VSYSCALL is not set CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y # CONFIG_HAVE_SETUP_PER_CPU_AREA is not set CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y # CONFIG_ZONE_DMA32 is not set CONFIG_ARCH_POPULATES_NODE_MAP=y # CONFIG_AUDIT_ARCH is not set CONFIG_ARCH_SUPPORTS_AOUT=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_X86_SMP=y CONFIG_X86_32_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y CONFIG_KTIME_SCALAR=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y C...
Thanks! John Linville just posted a fix for this problem to
wireless-testing ("drivers/net/wireless/iwlwifi/iwl-3945.h: correct
CONFIG_IWL4965_LEDS typo")
Reinette
--
And with John's fix, I'm able to build with IWL3945_LEDS defined and there's now an "ooooh shiny" LED that hasn't worked since I got the laptop. :)
Apparently not ready for prime time...
Hi Andrew,
The 2.6.25-rc8-mm1 kernel build fails on x86_64, when compiled with randconfig option
In file included from include/net/dst.h:15,
from include/net/sock.h:57,
from include/linux/if_pppox.h:145,
from fs/compat_ioctl.c:39:
include/net/neighbour.h: In function

Hi Andrew,

The 2.6.25-rc8-mm1 kernel panics during bootup on the power machine(s).

[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at arch/powerpc/mm/init_64.c:240!
[ 0.000000] Oops: Exception in kernel mode, sig: 5 [#1]
[ 0.000000] SMP NR_CPUS=32 NUMA PowerMac
[ 0.000000] Modules linked in:
[ 0.000000] NIP: c0000000003d1dcc LR: c0000000003d1dc4 CTR: c00000000002b6ac
[ 0.000000] REGS: c00000000049b960 TRAP: 0700 Not tainted (2.6.25-rc8-mm1-autokern1)
[ 0.000000] MSR: 9000000000021032 <ME,IR,DR> CR: 44000088 XER: 20000000
[ 0.000000] TASK = c0000000003f9c90[0] 'swapper' THREAD: c000000000498000 CPU: 0
[ 0.000000] GPR00: c0000000003d1dc4 c00000000049bbe0 c0000000004989d0 0000000000000001
[ 0.000000] GPR04: d59aca40f0000000 000000000b000000 0000000000000010 0000000000000000
[ 0.000000] GPR08: 0000000000000004 0000000000000001 c00000027e520800 c0000000004bf0f0
[ 0.000000] GPR12: c0000000004bf020 c0000000003fa900 0000000000000000 0000000000000000
[ 0.000000] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.000000] GPR20: 0000000000000000 0000000000000000 0000000000000000 4000000001400000
[ 0.000000] GPR24: 00000000017d64b0 c0000000003d6250 0000000000000000 c000000000504000
[ 0.000000] GPR28: 0000000000000000 cf000000001f8000 0000000001000000 cf00000000000000
[ 0.000000] NIP [c0000000003d1dcc] .vmemmap_populate+0xb8/0xf4
[ 0.000000] LR [c0000000003d1dc4] .vmemmap_populate+0xb0/0xf4
[ 0.000000] Call Trace:
[ 0.000000] [c00000000049bbe0] [c0000000003d1dc4] .vmemmap_populate+0xb0/0xf4 (unreliable)
[ 0.000000] [c00000000049bc70] [c0000000003d2ee8] .sparse_mem_map_populate+0x38/0x60
[ 0.000000] [c00000000049bd00] [c0000000003c242c] .sparse_early_mem_map_alloc+0x54/0x94
[ 0.000000] [c00000000049bd90] [c0000000003c250c] .sparse_init+0xa0/0x20c
[ 0.000000] [c00000000049be50] [c0000000003ab7d0] .setup_arch+0x1ac/0x218
[ 0.000000] [c00000000049bee0] ...
int __meminit vmemmap_populate(struct page *start_page,
unsigned long nr_pages, int node)
{
unsigned long mode_rw;
unsigned long start = (unsigned long)start_page;
unsigned long end = (unsigned long)(start_page + nr_pages);
unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX;
/* Align to the page size of the linear mapping. */
start = _ALIGN_DOWN(start, page_size);
for (; start < end; start += page_size) {
int mapped;
void *p;
if (vmemmap_populated(start, page_size))
continue;
p = vmemmap_alloc_block(page_size, node);
if (!p)
return -ENOMEM;
pr_debug("vmemmap %08lx allocated at %p, physical %08lx.\n",
start, p, __pa(p));
mapped = htab_bolt_mapping(start, start + page_size,
__pa(p), mode_rw, mmu_linear_psize,
mmu_kernel_ssize);
=====> BUG_ON(mapped < 0);
}
return 0;
}
Beats me. pseries? Badari has been diddling with the bolted memory code
in git-powerpc...
--
It does look like this is resolved with the patch below, if my testing
is to be believed (results out on TKO):
[PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
From: Yinghai Lu <yhlu.kernel@gmail.com>
Andrew, I believe you just sucked that up into -mm.
-apw
--
One of the machines is a Power5 and another is a PowerMac G5; the same kernel panic is seen on both.

-- Thanks & Regards, Kamalesh Babulal, Linux Technology Center, IBM, ISTL. --
Can you enable DEBUG_LOW in arch/powerpc/platforms/pseries/lpar.c, that should show what's happening in hpte_insert().

cheers

--
Michael Ellerman
OzLabs, IBM Australia Development Lab
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors, we borrow it from our children. - S.M.A.R.T Person
Okay. Found it. Root cause is: mm-make-mem_map-allocation-continuous.patch and its friends in -mm. You have to call sparse_init_one_section() on each pmap and usemap as we allocate - since valid_section() depends on it (which is needed by vmemmap_populate() to check if the section is populated or not). On ppc, we need to call htab_bolt_mapping() on each section and we need to skip existing sections. These patches tried to group all allocations together and then later call sparse_init_one_section() - which is not good :( Please let me know if it doesn't make sense - I will try to explain better :) Thanks, Badari --
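The dependency Badari describes can be illustrated with a toy model. It assumes, for illustration only, that four sections share one large vmemmap page and that a second bolted mapping of the same page fails; the model_ names and the numbers are hypothetical and do not come from the powerpc code.

```c
#define MODEL_SECTIONS      8
#define MODEL_SECS_PER_PAGE 4   /* sections whose mem_map shares one large page */
#define MODEL_PAGES (MODEL_SECTIONS / MODEL_SECS_PER_PAGE)

static int model_valid[MODEL_SECTIONS];  /* stand-in for valid_section() */
static int model_bolted[MODEL_PAGES];    /* large pages already bolted */

/* The page counts as populated if any section it backs is already valid. */
static int model_populated(int page)
{
    int s;

    for (s = page * MODEL_SECS_PER_PAGE; s < (page + 1) * MODEL_SECS_PER_PAGE; s++)
        if (model_valid[s])
            return 1;
    return 0;
}

/* Returns -1 where a second bolted mapping of the same address would fail. */
static int model_populate(int section)
{
    int page = section / MODEL_SECS_PER_PAGE;

    if (model_populated(page))
        return 0;
    if (model_bolted[page])
        return -1;
    model_bolted[page] = 1;
    return 0;
}

static void model_reset(void)
{
    int i;

    for (i = 0; i < MODEL_SECTIONS; i++)
        model_valid[i] = 0;
    for (i = 0; i < MODEL_PAGES; i++)
        model_bolted[i] = 0;
}

/* Mark each section valid right after its mem_map is populated. */
int model_init_interleaved(void)
{
    int s, failures = 0;

    model_reset();
    for (s = 0; s < MODEL_SECTIONS; s++) {
        if (model_populate(s) < 0)
            failures++;
        model_valid[s] = 1;
    }
    return failures;
}

/* Populate everything first, mark the sections valid only afterwards. */
int model_init_deferred(void)
{
    int s, failures = 0;

    model_reset();
    for (s = 0; s < MODEL_SECTIONS; s++)
        if (model_populate(s) < 0)
            failures++;
    for (s = 0; s < MODEL_SECTIONS; s++)
        model_valid[s] = 1;
    return failures;
}
```

With the interleaved order every later section sees its neighbour as valid and skips the mapping; with the deferred order each of them tries to map the same page again. The model counts those failures, whereas the real code stops at the first one via BUG_ON(mapped < 0).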
http://lkml.org/lkml/2008/4/2/592 Thanks YH --
Will send you a patch to work around it... YH --
Just defining DEBUG_LOW did not fetch any debug information, so I added some printk to htab_bolt_mapping () and pSeries_lpar_hpte_insert ()

[boot]0012 Setup Arch
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart 3000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=0000000003000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart 4000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=0000000004000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart 5000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=0000000005000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart 6000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=0000000006000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart 8000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=0000000008000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart 9000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=0000000009000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vstart cf00000000000000, vend cf00000001000000, pstart a000000,mode 190, psize 4, ssize 0)
htab_bolt_mapping: calling c000000000888f00
_hpte_insert(group=252078, va=d59aca40f0000000, pa=000000000a000000, rflags=194, vflags=10, psize=4 ssize=0)
htab_bolt_mapping (vsta...
Kamalesh, With your config, I am able to reproduce the problem. I haven't touched that part of the code. I can take a look at it. It looks like we are trying to create a mapping for the same "vaddr" multiple times, and we get failures after a few creates. I am not sure why we are trying to create it so many times with the same vaddr. Thanks, Badari --
Dell Latitude D820, Core2 T7200, x86_64. Built my usual .config cleanly, booted OK, has gone for a half hour of fairly representative usage without any oopsen or other dmesg surprises...
Yes, it passed testing on my six test machines without any runtime problems at all. Weird. Lots of compile-time problems, but that's usual. --
Hi Andrew,
MIPS build fails with the following:
$ make ARCH=mips CROSS_COMPILE=mips-unknown-linux-gnu-
...
[skipped]
...
CC arch/mips/mips-boards/generic/init.o
In file included from include/asm/cacheflush.h:13,
from arch/mips/mips-boards/generic/init.c:30:
include/linux/mm.h:411:63: "NR_PAGEFLAGS" is not defined
include/linux/mm.h:459:62: "NR_PAGEFLAGS" is not defined
make[1]: *** [arch/mips/mips-boards/generic/init.o] Error 1
make: *** [arch/mips/mips-boards/generic] Error 2
Thanks,
Dmitri
--
ahh, yup, known problem, sorry. We are slowly working on a fix. --
This is the fix that I posted a couple of days ago after Andrew noted the
problem:
From: Christoph Lameter <clameter@sgi.com>
Subject: Allow override of definition for asm constant
MIPS has a different way of defining asm constants which causes troubles
for bounds.h generation (see also the Kbuild script).
Add a new per arch CONFIG variable
CONFIG_ASM_SYMBOL_PREFIX
which can be set to define an alternate header for asm constant definitions.
Use this for MIPS to make bounds determination work right.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
arch/mips/Kconfig | 7 +++++++
kernel/bounds.c | 11 ++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)
Index: linux-2.6.25-rc5-mm1/arch/mips/Kconfig
===================================================================
--- linux-2.6.25-rc5-mm1.orig/arch/mips/Kconfig 2008-03-31 13:14:26.888383587 -0700
+++ linux-2.6.25-rc5-mm1/arch/mips/Kconfig 2008-03-31 13:14:28.028403612 -0700
@@ -2019,6 +2019,13 @@ config I8253
config ZONE_DMA32
bool
+#
+# Used to override gas symbol setup in kernel/bounds.c.
+#
+config ASM_SYMBOL_PREFIX
+ string
+ default "@@@#define "
+
source "drivers/pcmcia/Kconfig"
source "drivers/pci/hotplug/Kconfig"
Index: linux-2.6.25-rc5-mm1/kernel/bounds.c
===================================================================
--- linux-2.6.25-rc5-mm1.orig/kernel/bounds.c 2008-03-31 13:14:26.904383870 -0700
+++ linux-2.6.25-rc5-mm1/kernel/bounds.c 2008-03-31 13:14:28.028403612 -0700
@@ -9,8 +9,17 @@
#include <linux/page-flags.h>
#include <linux/mmzone.h>
+#ifdef CONFIG_ASM_SYMBOL_PREFIX
+#define PREFIX CONFIG_ASM_SYMBOL_PREFIX
+#else
+/*
+ * Standard gas way of defining an asm symbol
+ */
+#define PREFIX "->"
+#endif
+
#define DEFINE(sym, val) \
- asm volatile("\n->" #sym " %0 " #val : : "i" (val))
+ asm volatile("\n" PREFIX #sym " %0 " : : "i" (val))
#define BLANK() asm volatile("\n->" : :)
--On Wed, 2 Apr 2008 10:33:32 -0700 (PDT)
I'm obviously missing something here.
i386 generates
->NR_PAGEFLAGS $18 __NR_PAGEFLAGS #
mips generates
->NR_PAGEFLAGS 18 __NR_PAGEFLAGS #
The only difference is the "$". This can be trivially handled in the sed expression which filters this .s file.
Why are we diddling with that "->" thing, and why does it even exist? --
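The point about the "$" can be checked with a small shell sketch. The expression below is the default sed-y rule from the top-level Kbuild with the make and shell escaping undone. The function name gen_define is an assumption made for illustration, and the sample inputs are the two lines quoted above with the trailing "#" left off.

```shell
# Default Kbuild sed rule ("$$" from the Makefile written as a plain "$").
# gen_define is a hypothetical helper name used only in this sketch.
gen_define() {
	sed -ne '/^->/{s:^->\([^ ]*\) [$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}'
}

# i386-style operand (with "$") and mips-style operand (without it).
printf '%s\n' '->NR_PAGEFLAGS $18 __NR_PAGEFLAGS' | gen_define
printf '%s\n' '->NR_PAGEFLAGS 18 __NR_PAGEFLAGS' | gen_define
```

Because the "[$#]*" part matches zero or more "$" or "#" characters, both invocations should print the same line: #define NR_PAGEFLAGS 18 /* __NR_PAGEFLAGS */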
For some reason the asm-offset.c for mips generates it differently and the
Maybe the simple solution is to drop the strange mips way of doing things.
I guess this is some convention to allow the Kbuild sed script to extract the value. There must be some reason that they added the strange prefix. ccing Sam who may shed some light on this. --
When the asm-offset stuff was consolidated the mips variant did not match the others. I do not recall if I ever tried this on a mips tool-chain, and as my dev box is busted atm I cannot even test it out now.
I would be happy if we could kill the MIPS specific sed expression in the top-level Kbuild file.
Ralf - can you take a look at this and see if mips really generates different assembler syntax which warrants the different sed expression. If mips really needs a different sed expression then we should adjust it so the output is similar to the other archs.
Sam --
The reason for MIPS doing things a little differently is that the resulting
<asm/asm-offsets.h> doesn't look like machine generated gibberish. So
how about the patch below, which combines the two sed expressions.
Ralf
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
diff --git a/Kbuild b/Kbuild
index 7136de7..2bd4a3c 100644
--- a/Kbuild
+++ b/Kbuild
@@ -52,10 +52,8 @@ targets += arch/$(SRCARCH)/kernel/asm-offsets.s
# Default sed regexp - multiline due to syntax constraints
define sed-y
- "/^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}"
+ "/^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}; /^@@@/{s/^@@@//; s/ \#.*\$$//; p;};"
endef
-# Override default regexp for specific architectures
-sed-$(CONFIG_MIPS) := "/^@@@/{s/^@@@//; s/ \#.*\$$//; p;}"
quiet_cmd_offsets = GEN $@
define cmd_offsets
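A minimal shell sketch of what the combined expression does, assuming the two rules are passed as separate -e scripts (equivalent to the single string in the patch) with the make and shell escaping undone. The function name gen_offsets and the sample input lines, including the trailing "# illustrative" comment, are assumptions made for illustration.

```shell
# Combined rule: "->" lines become #define lines, "@@@" lines are passed
# through with the marker and any trailing " #..." comment stripped.
# gen_offsets is a hypothetical helper name used only in this sketch.
gen_offsets() {
	sed -n \
	    -e '/^->/{s:^->\([^ ]*\) [$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}' \
	    -e '/^@@@/{s/^@@@//; s/ #.*$//; p;}'
}

printf '%s\n' \
	'->NR_PAGEFLAGS $18 __NR_PAGEFLAGS' \
	'@@@/* MIPS pt_regs offsets. */' \
	'@@@#define PT_R0 0 # illustrative' | gen_offsets
```

With this input the sketch should print a #define for NR_PAGEFLAGS, the comment line unchanged, and "#define PT_R0 0", so both styles of marker survive in one pass.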
--Well but it is machine generated and it may be best if mips would do more
of the same that is done in other arches? We do not want special arch
cases in Kbuild.
How about this patch?
Subject: Standardize mips asm-offsets.c somewhat
mips uses a different pattern to signal a constant in the asm code generated
by asm-offsets.c which in turn requires special handling in Kbuild and
causes trouble for the new mechanism to count the number of page flags.
Remove the special handling and make mips use the same string as all the
other arches (->).
It seems that MIPS tried to have nice looking asm output. Sadly this
patch disturbs that nice formatting somewhat and makes it look like asm
output for any other arch.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
Kbuild | 2
arch/mips/kernel/asm-offsets.c | 392 ++++++++++++++++++++---------------------
2 files changed, 196 insertions(+), 198 deletions(-)
Index: linux-2.6.25-rc8-mm1/Kbuild
===================================================================
--- linux-2.6.25-rc8-mm1.orig/Kbuild 2008-04-03 14:53:38.581697916 -0700
+++ linux-2.6.25-rc8-mm1/Kbuild 2008-04-03 14:53:41.411694858 -0700
@@ -54,8 +54,6 @@ targets += arch/$(SRCARCH)/kernel/asm-of
define sed-y
"/^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}"
endef
-# Override default regexp for specific architectures
-sed-$(CONFIG_MIPS) := "/^@@@/{s/^@@@//; s/ \#.*\$$//; p;}"
quiet_cmd_offsets = GEN $@
define cmd_offsets
Index: linux-2.6.25-rc8-mm1/arch/mips/kernel/asm-offsets.c
===================================================================
--- linux-2.6.25-rc8-mm1.orig/arch/mips/kernel/asm-offsets.c 2008-04-03 14:53:38.601695308 -0700
+++ linux-2.6.25-rc8-mm1/arch/mips/kernel/asm-offsets.c 2008-04-03 14:59:46.939017142 -0700
@@ -20,193 +20,193 @@
#define text(t) __asm__("\n@@@" t)
#define _offset(type, member) (&(((type *)NULL)->member))
#define...
Almost. It compiles into a usable header but breaks the text() macro, which is used to emit a comment (actually any string literal) into the generated header. With your patch nothing will be emitted. The existing non-MIPS sed expression in Kbuild doesn't allow for that, which is why I added the handling of @@@-prefixed strings to the sed expression. And once that is there the remaining asm-offset.c change is no longer needed.
Ralf --
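The objection about text() can be reproduced with a short shell sketch, assuming the default Kbuild rule is the only one applied and that text() keeps emitting "@@@"-prefixed lines. The function name default_rule and the sample input lines are assumptions made for illustration.

```shell
# Default (non-MIPS) Kbuild rule only; default_rule is a hypothetical name.
default_rule() {
	sed -ne '/^->/{s:^->\([^ ]*\) [$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}'
}

# The text()-style line carries the "@@@" marker, so sed -n never prints it,
# while the "->" constant is converted as usual.
printf '%s\n' \
	'@@@/* MIPS pt_regs offsets. */' \
	'->PT_R0 0 offsetof(struct pt_regs, regs[0])' | default_rule
```

Only the #define for PT_R0 should appear in the output. The comment line is silently dropped, which is the behaviour the "@@@" handling in the combined expression is meant to avoid.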
The text macro still emits the same text. Nothing is changed. Why does Why would kbuild have to handle comments? --
Ahh you want to insert comments into the generated include/asm-*/asm-offsets.h. Hmmm, the header comments state that it was generated so one would hopefully look at the source file instead? If we want comments etc in there then we may want to do it in some standardized fashion that works across all arches. Most of the arch/*/asm-offsets.c contents are exactly the same. Mips is deviating the most. If we could put some of the common stuff into common header files then this may turn out to be a nice code cleanup. --
I confirm that with this patch applied, the kernel build succeeds. Did not try to boot it, though. Thanks, --
