ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc7/2.6.23-rc7-mm...

- New git tree git-powerpc-galak.patch added to the -mm lineup: ppc32 things,
  mainly (Kumar Gala <galak@gate.crashing.org>)

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the mm-commits
  mailing list.

  echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at

  http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the email
  Subject: in some manner to reflect the nature of the bug. Some developers
  filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.

Changes since 2.6.23-rc6-mm1:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-arm.patch
 git-audit-master.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-powerpc.patch
 git-powerpc-galak.patch
 git-drm.patch
 git-dvb.patch
 git-hwmon.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ieee1394.patch...
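The "around ten rebuilds and reboots" figure in the bisection note follows from halving the patch series at every step. The sketch below is only an illustration of that arithmetic, assuming a series on the order of a thousand patches (the announcement gives no count). The names bisect_steps, bisect_first_bad and demo_is_bad are hypothetical and are not part of any -mm tooling.

```c
#include <stddef.h>

/*
 * Hypothetical helper: worst-case number of build/boot cycles needed to
 * isolate one bad patch in a series of n_patches by bisection.
 */
unsigned int bisect_steps(size_t n_patches)
{
	unsigned int steps = 0;
	size_t span = 1;

	/* Each test halves the candidate range, so find the smallest
	 * steps for which 2^steps >= n_patches. */
	while (span < n_patches) {
		span *= 2;
		steps++;
	}
	return steps;
}

/*
 * Hypothetical driver: is_bad(k) reports whether a tree with the first k
 * patches applied shows the bug. Assumes the tree with no patches is good
 * and the tree with all patches is bad. Returns the 1-based index of the
 * first patch whose application makes the tree bad.
 */
size_t bisect_first_bad(size_t n_patches, int (*is_bad)(size_t applied))
{
	size_t lo = 0;		/* known good: nothing applied */
	size_t hi = n_patches;	/* known bad: everything applied */

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (is_bad(mid))
			hi = mid;
		else
			lo = mid;
	}
	return hi;
}

/* Made-up predicate for demonstration: patch 637 introduces the bug. */
int demo_is_bad(size_t applied)
{
	return applied >= 637;
}
```

With a thousand patches, bisect_steps(1000) yields ten cycles, which is why reporting the bug first is often the cheaper option.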
On 24.09.2007 11:17, Andrew Morton wrote:
[adding kexec m-l] --- ~Randy Phaedrus says that Quality is about caring. -
Hi Andrew, The drivers/net/pasemi_mac seems to be broken and build fails with CC [M] drivers/net/pasemi_mac.o drivers/net/pasemi_mac.c: In function ‘pasemi_mac_probe’: drivers/net/pasemi_mac.c:1153: error: conflicting types for ‘mac’ drivers/net/pasemi_mac.c:1151: error: previous declaration of ‘mac’ was here drivers/net/pasemi_mac.c:1170: error: incompatible types in assignment drivers/net/pasemi_mac.c:1172: error: request for member ‘pdev’ in something not a structure or union drivers/net/pasemi_mac.c:1173: error: request for member ‘netdev’ in something not a structure or union drivers/net/pasemi_mac.c:1175: error: request for member ‘napi’ in something not a structure or union drivers/net/pasemi_mac.c:1180: error: request for member ‘dma_txch’ in something not a structure or union drivers/net/pasemi_mac.c:1181: error: request for member ‘dma_rxch’ in something not a structure or union drivers/net/pasemi_mac.c:1187: error: request for member ‘dma_if’ in something not a structure or union drivers/net/pasemi_mac.c:1189: error: request for member ‘dma_if’ in something not a structure or union drivers/net/pasemi_mac.c:1194: error: request for member ‘type’ in something not a structure or union drivers/net/pasemi_mac.c:1197: error: request for member ‘type’ in something not a structure or union drivers/net/pasemi_mac.c:1205: warning: passing argument 1 of ‘pasemi_get_mac_addr’ from incompatible pointer type drivers/net/pasemi_mac.c:1205: error: request for member ‘mac_addr’ in something not a structure or union drivers/net/pasemi_mac.c:1209: error: request for member ‘mac_addr’ in something not a structure or union drivers/net/pasemi_mac.c:1209: error: request for member ‘mac_addr’ in something not a structure or union drivers/net/pasemi_mac.c:1216: warning: passing argument 1 of ‘pasemi_mac_map_regs’ from incompatible pointer type drivers/net/pasemi_mac.c:1220: error: request for member ‘rx_status’ in so...
Hi Andrew,
The build fails with the following error:
CC drivers/block/ps3disk.o
drivers/block/ps3disk.c: In function ‘ps3disk_scatter_gather’:
drivers/block/ps3disk.c:115: error: ‘bio’ undeclared (first use in this
function)
drivers/block/ps3disk.c:115: error: (Each undeclared identifier is
reported only once
drivers/block/ps3disk.c:115: error: for each function it appears in.)
drivers/block/ps3disk.c:115: error: ‘j’ undeclared (first use in this
function)
drivers/block/ps3disk.c:116: error: implicit declaration of function
‘bio_kunmap_bvec’
make[2]: *** [drivers/block/ps3disk.o] Error 1
make[1]: *** [drivers/block] Error 2
make: *** [drivers] Error 2
The function bio_kunmap_bvec is missing. I checked git-block.patch
as well as linux/kernel/git/axboe/linux-2.6-block.git and did not
find this function.
Previously this function was replaced by __bio_kunmap_atomic().
This patch does not solve the "implicit declaration of function
‘bio_kunmap_bvec’" error.
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com
<mailto:kamalesh@linux.vnet.ibm.com>>
---
--- linux-2.6.23-rc7/drivers/block/ps3disk.c 2007-09-24 20:50:41.000000000 +0530
+++ linux-2.6.23-rc7/drivers/block/~ps3disk.c 2007-09-24 20:50:59.000000000 +0530
@@ -112,7 +112,7 @@ static void ps3disk_scatter_gather(struc
else
memcpy(buf, dev->bounce_buf+offset, size);
offset += size;
- flush_kernel_dcache_page(bio_iovec_idx(bio, j)->bv_page);
+ flush_kernel_dcache_page(bio_iovec_idx(iter.bio, iter.i)->bv_page);
bio_kunmap_bvec(bvec, flags);
i++;
}
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
On (25/09/07 01:11), Kamalesh Babulal didst pronounce:

Your mailer appears to have mangled both your signoff and the whitespace in
the patch and it does not apply. However, fixing it does not solve the
problem because of this mysterious bio_kunmap_bvec() that is only referenced
by this --

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
This should fix things up.

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index 8e05ba7..a7fd66a 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -106,14 +106,14 @@ static void ps3disk_scatter_gather(struct ps3_storage_device *dev,
 			(unsigned long)iter.bio->bi_sector);
 		size = bvec->bv_len;
-		buf = bvec_kmap_irq(bvec, flags);
+		buf = bvec_kmap_irq(bvec, &flags);
 		if (gather)
 			memcpy(dev->bounce_buf+offset, buf, size);
 		else
 			memcpy(buf, dev->bounce_buf+offset, size);
 		offset += size;
-		flush_kernel_dcache_page(bio_iovec_idx(bio, j)->bv_page);
-		bio_kunmap_bvec(bvec, flags);
+		flush_kernel_dcache_page(bvec->bv_page);
+		bvec_kunmap_irq(buf, &flags);
 		i++;
 	}
 }

-- 
Jens Axboe
This builds although I lack the hardware to really test it. However, in
2.6.23-rc8-mm1 it collides with git-block-ps3disk-fix.patch. This is a
version on top of that stack but I guess the best thing to do is replace
git-block-ps3disk-fix.patch with Jens' patch once it is signed off.

Not signing off because this is just a rebase. Assuming the other one gets
signed off, consider it;

Acked-by: Mel Gorman <mel@csn.ul.ie>

---
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc8-mm1-clean/drivers/block/ps3disk.c linux-2.6.23-rc8-mm1-fix-ps3disk/drivers/block/ps3disk.c
--- linux-2.6.23-rc8-mm1-clean/drivers/block/ps3disk.c	2007-09-25 12:05:40.000000000 +0100
+++ linux-2.6.23-rc8-mm1-fix-ps3disk/drivers/block/ps3disk.c	2007-09-25 12:09:19.000000000 +0100
@@ -106,14 +106,14 @@ static void ps3disk_scatter_gather(struc
 			(unsigned long)iter.bio->bi_sector);
 		size = bvec->bv_len;
-		buf = bvec_kmap_irq(bvec, flags);
+		buf = bvec_kmap_irq(bvec, &flags);
 		if (gather)
 			memcpy(dev->bounce_buf+offset, buf, size);
 		else
 			memcpy(buf, dev->bounce_buf+offset, size);
 		offset += size;
-		flush_kernel_dcache_page(bio_iovec_idx(iter.bio, iter.i)->bv_page);
-		bio_kunmap_bvec(bvec, flags);
+		flush_kernel_dcache_page(bvec->bv_page);
+		bvec_kunmap_irq(buf, &flags);
 		i++;
 	}
 }

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
Thanks, but I already integrated the fix into the existing patch, so that bisect will work. -- Jens Axboe -
With the five hotfixes applied it works for me.

But it fails to power down my system when shutting down. It prints 'System
halted' twice and blinks the keyboard LEDs, but does not switch off. On all
other kernel versions I only see one keyboard blink before the power goes out.

I compared its dmesg to vanilla -rc7 and -rc4-mm1, but except that -rc4
assigns different IRQs I can't see any differences apart from the normal
variation in BogoMips etc.

As the system still responded to SysRq I got the following information:

[ 415.770000] SysRq : Show Regs
[ 415.770000] CPU 3:
[ 415.780000] Modules linked in: radeon drm nfsd exportfs ipv6 tuner tea5767 tda8290 tuner_simple mt20xx tvaudio msp3400 bttv video_buf ir_common compat_ioctl32 btcx_risc tveeprom videodev v4l2_common v4l1_compat pata_amd usbhid hid sg
[ 415.780000] Pid: 0, comm: swapper Not tainted 2.6.23-rc7-mm1 #1
[ 415.780000] RIP: 0010:[<ffffffff8020ac79>] [<ffffffff8020ac79>] default_idle+0x29/0x40
[ 415.780000] RSP: 0018:ffff81010038bf30 EFLAGS: 00000246
[ 415.780000] RAX: 0000000000000400 RBX: ffffffff80810040 RCX: 0000000000000000
[ 415.780000] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000005
[ 415.780000] RBP: 0000000000030400 R08: 0000000000000000 R09: ffff81010038be68
[ 415.950000] R10: 000000000100002c R11: ffffffff80219be0 R12: 0000000000000000
[ 415.950000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 415.950000] FS: 00007f35c69726f0(0000) GS:ffff810100319700(0000) knlGS:0000000000000000
[ 415.950000] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 415.950000] CR2: 00007fe432928c40 CR3: 0000000000201000 CR4: 00000000000006e0
[ 416.070000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 416.070000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 416.070000]
[ 416.070000] Call Trace:
[ 416.070000] [<ffffffff8020acea>] cpu_idle+0x5a/0x90
[ 416.070000]

No blocked tasks were shown with SysRq+...
On Mon, 24 Sep 2007 21:07:19 +0200

hm, dunno. The only substantial patch which touches
arch/x86_64/kernel/process.c (which is where cpu_idle lives) is
x86_64-prep-idle-loop-for-dynticks.patch.

The problem is, 2.6.23-rc6-mm1's git-acpi patch had all the new cpuidle code
in it. Len dropped all that code over the weekend (which is when I picked
this copy of his tree), so 2.6.23-rc7-mm1 doesn't have the cpuidle code. Len
will be reapplying the cpuidle patches today(ish) so next -mm _will_ have the
cpuidle code.

So what we have in rc7-mm1 is this transient no-cpuidle state. It could be
that the x86_64 dynticks code (which was developed and previously tested in
conjunction with the cpuidle patches) has some dependency on cpuidle.

So it's all a bit of a mess :( I think I'll basically stop applying things
which don't look like bugfixes for a while and try to get more -mm's out, as
we seriously need to get this lot stabilised asap.

Len, would it be possible to restore cpuidle sometime today please?
Can you check whether 2.6.23-rc7 +
http://tglx.de/projects/hrtimers/2.6.23-rc7/patch-2.6.23-rc7-hrt1.patch

It should not. cpuidle makes use of dynticks, not the other way round.

tglx
Yes, powers off normally. Torsten -
Ok, so it's probably some merge artifact in -mm. We'll get this sorted out once Len has his new tree available. tglx -
There isn't a total_memory identifier within this function's scope. The patch
was compile/link tested.

Signed-off-by: Bob Picco <bob.picco@hp.com>

 arch/ia64/kernel/efi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc7-mm1/arch/ia64/kernel/efi.c
===================================================================
--- linux-2.6.23-rc7-mm1.orig/arch/ia64/kernel/efi.c	2007-09-24 09:54:40.000000000 -0400
+++ linux-2.6.23-rc7-mm1/arch/ia64/kernel/efi.c	2007-09-24 10:50:51.000000000 -0400
@@ -1085,7 +1085,7 @@ efi_memmap_init(unsigned long *s, unsign
 	*s = (u64)kern_memmap;
 	*e = (u64)++k;
 
-	return total_memory;
+	return total_mem;
 }
 
 void
I'm observing a problem with this kernel (as well as 2.6.23-rc6-mm1) which
manifests itself only in my Postfix/application mail.logs:
Sep 25 00:25:40 tornado postfix/smtp[12520]: fatal: select lock: Cannot allocate
memory
Sep 25 00:25:41 tornado postfix/master[8002]: warning: process
/usr/lib64/postfix/smtp pid 12520 exit status 1
This is happening frequently with processes started via 'master' (smtp, smtpd
and cleanup), but it does not appear to have any noticeable operational impact
apart from logging a lot of copies of this message.
The corresponding code in Postfix which triggers this is (choice of 3 files in
src/master are all possibilities which all have much the same code)
/*
* The event loop, at last.
*/
while (var_use_limit == 0 || use_count < var_use_limit || client_count > 0) {
if (multi_server_lock != 0) {
watchdog_stop(watchdog);
if (myflock(vstream_fileno(multi_server_lock), INTERNAL_LOCK,
MYFLOCK_OP_EXCLUSIVE) < 0)
msg_fatal("select lock: %m");
}
watchdog_start(watchdog);
delay = loop ? loop(multi_server_name, multi_server_argv) : -1;
event_loop(delay);
}
multi_server_exit();
}
Now I'm not convinced this is an application problem, because I'm only seeing
this after running up kernel 2.6.23-rc6-mm1 or 2.6.23-rc7-mm1 and with NO
changes to the application itself. Using the same application binaries it does
not occur with 2.6.22 mainline. [I didn't get a lot of testing with the -mm
release prior to that unfortunately due to some other breakage.]
Is there anything new in the last two or so -mm kernels which could have caused
this?
I've put my .config up at http://www.reub.net/files/kernel/2.6.23-rc7-mm1.config
Thanks,
Reuben
ug. Lots of people have been futzing with the fs/locks.c code:

cleanup-macros-for-distinguishing-mandatory-locks.patch
fix-potential-oops-in-generic_setlease-v2.patch
fix-potential-oops-in-generic_setlease.patch
fs-locksc-use-list_for_each_entry-instead-of-list_for_each.patch
git-nfs.patch
git-nfsd.pc
rework-proc-locks-via-seq_files-and-seq_list-helpers-fix-2.patch
rework-proc-locks-via-seq_files-and-seq_list-helpers.patch
slab-api-remove-useless-ctor-parameter-and-reorder-parameters.patch
Oog. Looks like it's the "Memory shortage can result in inconsistent
flocks state" patch--the error variable is being set in some cases when
it shouldn't be. Does the following fix it?
That's in my git tree, not in mainline. I'll fix up my copy.
And I'll spend some time today figuring out what to do about regression
testing for the posix lock, flock, and lease code.
Thanks for the bug report!
--b.
diff --git a/fs/locks.c b/fs/locks.c
index a6c5917..3e8bfd2 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -740,6 +740,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
new_fl = locks_alloc_lock();
if (new_fl == NULL)
goto out;
+ error = 0;
}
for_each_lock(inode, before) {
-Yes that has fixed it, thanks! Reuben -
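The one-line fix above is an instance of a common error-path pattern: a pessimistic error code is set before an allocation and has to be cleared again once the allocation succeeds. The following is a minimal userspace sketch of that shape, not the fs/locks.c code. It assumes the preset value was -ENOMEM, which would match the "Cannot allocate memory" text Postfix logged; demo_lock and demo_lock_file are hypothetical names, and the fixed parameter exists only to show both behaviours side by side.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lock object; not struct file_lock. */
struct demo_lock {
	int type;
};

/*
 * need_alloc: the request needs a freshly allocated lock.
 * conflict:   a conflicting lock is found later in the function.
 * fixed:      0 reproduces the buggy shape, 1 applies the "error = 0" fix.
 */
int demo_lock_file(int need_alloc, int conflict, int fixed)
{
	struct demo_lock *new_fl = NULL;
	int error = 0;

	if (need_alloc) {
		error = -ENOMEM;	/* pessimistic default (assumed) */
		new_fl = malloc(sizeof(*new_fl));
		if (new_fl == NULL)
			goto out;
		if (fixed)
			error = 0;	/* allocation worked: clear it */
	}

	if (conflict) {
		error = -EAGAIN;
		goto out;
	}
	/* A real implementation would insert new_fl into a lock list here. */
out:
	free(new_fl);
	return error;
}
```

Without the reset, a request that allocates successfully and meets no conflict still returns -ENOMEM, which is consistent with the failures appearing only as log noise rather than as real memory pressure.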
a.k.a. mm-use-pagevec-to-rotate-reclaimable-page-fix-2.patch
rotate_reclaimable_page() is not necessarily called with IRQ disabled:
it must do so when calling the helpfully commented pagevec_move_tail().
Hmm, if pagevec_move_tail() is assuming IRQ disabled, why should it
bother with irqsave/irqrestore variants of spin_lock? Because we like
to see them on lru_lock? But vmscan.c already has one bare spin_lock().
Signed-off-by: Hugh Dickins <hugh@veritas.com>
---
mm/swap.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- 2.6.23-rc7-mm1/mm/swap.c 2007-09-24 11:05:55.000000000 +0100
+++ linux/mm/swap.c 2007-09-24 13:08:12.000000000 +0100
@@ -102,7 +102,6 @@ static void pagevec_move_tail(struct pag
int i;
int pgmoved = 0;
struct zone *zone = NULL;
- unsigned long uninitialized_var(flags);
for (i = 0; i < pagevec_count(pvec); i++) {
struct page *page = pvec->pages[i];
@@ -110,9 +109,9 @@ static void pagevec_move_tail(struct pag
if (pagezone != zone) {
if (zone)
- spin_unlock_irqrestore(&zone->lru_lock, flags);
+ spin_unlock(&zone->lru_lock);
zone = pagezone;
- spin_lock_irqsave(&zone->lru_lock, flags);
+ spin_lock(&zone->lru_lock);
}
if (PageLRU(page) && !PageActive(page)) {
list_move_tail(&page->lru, &zone->inactive_list);
@@ -120,7 +119,7 @@ static void pagevec_move_tail(struct pag
}
}
if (zone)
- spin_unlock_irqrestore(&zone->lru_lock, flags);
+ spin_unlock(&zone->lru_lock);
__count_vm_events(PGROTATED, pgmoved);
release_pages(pvec->pages, pvec->nr, pvec->cold);
pagevec_reinit(pvec);
@@ -150,6 +149,7 @@ void move_tail_pages()
int rotate_reclaimable_page(struct page *page)
{
struct pagevec *pvec;
+ unsigned long flags;
if (PageLocked(page))
return 1;
@@ -162,9 +162,11 @@ int rotate_reclaimable_page(struct page
if (PageLRU(page) && !PageActive(page)) {
page_cache_...

Hi Andrew,

Kernel BUG on x86_64 (AMD Opteron(tm) Processor 844). A similar kernel BUG
was reported for 2.6.23-rc2-mm1 at http://lkml.org/lkml/2007/8/10/20 and
mm-dirty-balancing-for-tasks.patch was dropped from 2.6.23-rc2-mm2. The same
patch is in this -mm version, so I suspect it may be the same patch
triggering this BUG.

BUG: soft lockup - CPU#0 stuck for 11s! [events/0:15]
CPU 0:
Modules linked in:
Pid: 15, comm: events/0 Tainted: G D 2.6.23-rc7-mm1-autokern1 #1
RIP: 0010:[<ffffffff8021be46>] [<ffffffff8021be46>] __smp_call_function_mask+0x9a/0xc4
RSP: 0000:ffff8100017add80 EFLAGS: 00000297
RAX: 00000000000000fc RBX: ffff8100017adde0 RCX: 0000000000000001
RDX: 00000000000008fc RSI: 00000000000000fc RDI: 000000000000000e
RBP: ffffc20002d11000 R08: ffff8100017ac000 R09: ffffffff80675e38
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000f
R13: ffffffff8021bcfe R14: 0000000000000000 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffffffff8065a000(0000) knlGS:00000000556aa2a0
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffc20002d11008 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
Inexact backtrace:
 [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
 [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
 [<ffffffff8021becf>] smp_call_function_mask+0x5f/0x72
 [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
 [<ffffffff8021bf82>] smp_call_function+0x19/0x1b
 [<ffffffff8023a773>] on_each_cpu+0x16/0x2b
 [<ffffffff802158a2>] mcheck_timer+0x0/0x7c
 [<ffffffff802158c0>] mcheck_timer+0x1e/0x7c
 [<ffffffff802444b9>] run_workqueue+0x88/0x109
 [<ffffffff8024453a>] worker_thread+0x0/0xf4
 [<ffffffff80244623>] worker_thread+0xe9/0xf4
 [<ffffffff8024841d>] autoremove_wake_function+0x0/0x37
 [<ffffffff8024841d>] autoremove_wa...
hm, I thought we'd fixed the problems in that patchset. Peter, were you aware of this one? -
On Mon, 24 Sep 2007 09:44:48 -0700 Andrew Morton Nope, and the stacktrace is utterly puzzling. /me goes read the lkml.org link Kamalesh Babulal: do you still get: BUG: spinlock bad magic on msgs? Because those I could reproduce using fsx, and I fixed all that. -
Hi Peter,

I do not get "BUG: spinlock bad magic" messages any more, but the soft lockup
message is thrown more than 30 times while running the LTP runall.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
On Mon, 24 Sep 2007 22:38:03 +0530 Kamalesh Babulal
It would be good to know what function on_each_cpu is executing, could
you try something like:
---
kernel/softirq.c | 5 +++++
kernel/softlockup.c | 7 +++++++
2 files changed, 12 insertions(+)
Index: linux-2.6/kernel/softirq.c
===================================================================
--- linux-2.6.orig/kernel/softirq.c
+++ linux-2.6/kernel/softirq.c
@@ -645,6 +645,8 @@ __init int spawn_ksoftirqd(void)
}
#ifdef CONFIG_SMP
+
+DEFINE_PER_CPU(void (*)(void *info), last_on_each_cpu);
/*
* Call a function on all processors
*/
@@ -653,6 +655,9 @@ int on_each_cpu(void (*func) (void *info
int ret = 0;
preempt_disable();
+
+ per_cpu(last_on_each_cpu, smp_processor_id()) = func;
+
ret = smp_call_function(func, info, retry, wait);
local_irq_disable();
func(info);
Index: linux-2.6/kernel/softlockup.c
===================================================================
--- linux-2.6.orig/kernel/softlockup.c
+++ linux-2.6/kernel/softlockup.c
@@ -15,6 +15,8 @@
#include <linux/notifier.h>
#include <linux/module.h>
#include <linux/kgdb.h>
+#include <linux/percpu.h>
+#include <linux/kallsyms.h>
#include <asm/irq_regs.h>
@@ -71,6 +73,8 @@ void touch_all_softlockup_watchdogs(void
}
EXPORT_SYMBOL(touch_all_softlockup_watchdogs);
+DECLARE_PER_CPU(void (*)(void *), last_on_each_cpu);
+
/*
* This callback runs from the timer interrupt, and checks
* whether the watchdog thread has hung or not:
@@ -122,6 +126,9 @@ void softlockup_tick(void)
printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
this_cpu, now - touch_timestamp,
current->comm, task_pid_nr(current));
+ printk(KERN_ERR " last_on_each_cpu: [<%p>] ",
+ per_cpu(last_on_each_cpu, this_cpu));
+ print_symbol("%s\n", (unsigned long)per_cpu(last_on_each_cpu, this_cpu));
if (regs)
show_regs(regs);
else
-On Mon, 24 Sep 2007 21:20:58 +0200 Peter Zijlstra I've just completed 2 full ltp runs on a dual-core opteron machine but could not reproduce this problem. Kamalesh, would it be possible for you to reproduce with that patch, so we can see what function is holding up the cpu? -
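The debugging patch above records, per CPU, the last function handed to on_each_cpu() so that the soft lockup report can name it. A minimal userspace sketch of the same bookkeeping idea follows. It is an illustration only: a plain array stands in for the kernel's per-CPU variables, and every demo_* name as well as DEMO_NR_CPUS is hypothetical.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* arbitrary size for the illustration */

typedef void (*demo_func_t)(void *info);

/* One slot per CPU, standing in for DEFINE_PER_CPU() in the patch. */
static demo_func_t demo_last_on_each_cpu[DEMO_NR_CPUS];

/*
 * Record the callback before running it, so that a watchdog inspecting
 * the slot while the call is stuck can still see what was dispatched.
 */
void demo_on_each_cpu(int this_cpu, demo_func_t func, void *info)
{
	demo_last_on_each_cpu[this_cpu] = func;
	func(info);
}

/* What a watchdog would read when it decides a CPU is stuck. */
demo_func_t demo_last_func(int cpu)
{
	return demo_last_on_each_cpu[cpu];
}

/* Example callback: bump a counter passed through info. */
void demo_count(void *info)
{
	(*(int *)info)++;
}
```

Storing the pointer before the call rather than after is the point of the technique: if the callback never returns, the slot already identifies it.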
Hi Peter,

After running the test with the patch you provided, I observed an oops
message at the top of these soft lockup messages, and the oops is the same as
the one reported at http://lkml.org/lkml/2007/9/24/107.

When I applied the patch proposed for that oops at
http://lkml.org/lkml/2007/9/25/57, neither the oops nor the soft lockups
were seen any more.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
I also get this compile error on s390. 'linux/scatterlist.h' has disappeared from the #include pile but where ? /home/clg/linux/2.6.23-rc7-mm1/net/sctp/auth.c: In function `sctp_auth_calculate_hmac': /home/clg/linux/2.6.23-rc7-mm1/net/sctp/auth.c:695: error: storage size of 'sg' isn't known /home/clg/linux/2.6.23-rc7-mm1/net/sctp/auth.c:695: warning: unused variable `sg' Cheers, C. -
putting Vlad in Cc:

The following patch works of course but it seems too simplistic for s390.

Cheers,

C.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
---
 net/sctp/auth.c | 1 +
 1 file changed, 1 insertion(+)

Index: 2.6.23-rc7-mm1/net/sctp/auth.c
===================================================================
--- 2.6.23-rc7-mm1.orig/net/sctp/auth.c
+++ 2.6.23-rc7-mm1/net/sctp/auth.c
@@ -36,6 +36,7 @@
 
 #include <linux/types.h>
 #include <linux/crypto.h>
+#include <linux/scatterlist.h>
 #include <net/sctp/sctp.h>
 #include <net/sctp/auth.h>
Thanks, applied. -- Jens Axboe -
Odd that it didn't show up on x86 or ia64, but simple enough. ACK. -
Most likely those archs end up pulling in scatterlist.h through some other maze of includes. -- Jens Axboe -
Hi Andrew, Kernel oops over x86_64 (AMD Opteron(tm) Processor 844) Unable to handle kernel NULL pointer dereference at 0000000000000070 RIP: [<ffffffff80290630>] fasync_helper+0x6b/0xe4 PGD 181949067 PUD 182228067 PMD 0 Oops: 0000 [1] SMP last sysfs file: /devices/system/node/possible CPU 3 Modules linked in: Pid: 18156, comm: fcntl23 Not tainted 2.6.23-rc7-mm1-autokern1 #1 RIP: 0010:[<ffffffff80290630>] [<ffffffff80290630>] fasync_helper+0x6b/0xe4 RSP: 0000:ffff810082bdfdb8 EFLAGS: 00010046 RAX: 00000000fffffff4 RBX: ffff8101821a9000 RCX: 0000000000000000 RDX: ffff8101821a9000 RSI: ffff810180026900 RDI: ffffffff806286b8 RBP: ffff810082bdfde8 R08: 0000000000000002 R09: ffff81018072124b R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000070 R13: 0000000000000001 R14: 0000000000000000 R15: ffff810181875cc0 FS: 0000000000000000(0000) GS:ffff810180721380(0063) knlGS:00000000556aa2a0 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000000000070 CR3: 00000001818c4000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process fcntl23 (pid: 18156, threadinfo ffff810082bde000, task ffff810082309530) last branch before last exception/interrupt from [<ffffffff804ad9e0>] _write_lock_irq+0x14/0x15 to [<ffffffff80290630>] fasync_helper+0x6b/0xe4 Stack: 0000000400000004 0000000000000000 0000000000000000 ffff810181875cc0 0000000000000004 ffff810182b3d238 ffff810082bdfee8 ffffffff802939cc ffff810082bdfe18 0000000000000000 0000000000000000 ffff810082bdfe10 Call Trace: [<ffffffff802939cc>] fcntl_setlease+0x99/0x101 [<ffffffff80290370>] sys_fcntl+0x2a3/0x2ce [<ffffffff802b14cf>] compat_sys_fcntl64+0x2ee/0x2ff [<ffffffff80224292>] ia32_sysret+0x0/0xa DWARF2 unwinder stuck at ia32_sysret+0x0/0xa Leftover inexact backtrace: Code: 49 8b 34 24 4c 89 e2 48 85 f6 74 2a 4c 39 7e 10 75 1a 45 85 RIP [<ff...
Please, try with this patch too:

diff --git a/fs/locks.c b/fs/locks.c
index c0fe71a..f599508 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1423,7 +1423,7 @@ int generic_setlease(struct file *filp,
 	locks_copy_lock(new_fl, lease);
 	locks_insert_lock(before, new_fl);
 
-	*flp = fl;
+	*flp = new_fl;
 	return 0;
 
 out:
Hi, Pavel, You did not signoff on the patch. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -
I did not, but this is just a patch to test. I know, that it most likely fixes the problem, but since Kamalesh didn't tell us how he had triggered it, I'd like him to Ack it :) Thanks, Pavel -
Ok, just wanted to let you know in case you missed it out. In case Andrew picked it up. That's all! -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c: In function `dasd_eckd_build_cp': /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1181: error: syntax error before "struct" /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: `iter' undeclared (first use in this function) /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: (Each undeclared identifier is reported only once /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: for each function it appears in.) /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: `bv' undeclared (first use in this function) /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: left-hand operand of comma expression has no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: left-hand operand of comma expression has no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: left-hand operand of comma expression has no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: left-hand operand of comma expression has no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: statement with no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: statement with no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: statement with no effect /home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: statement with no effect make[3]: *** [drivers/s390/block/dasd_eckd.o] Error 1 make[2]: *** [drivers/s390/block] Error 2 Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> --- drivers/s390/block/dasd_eckd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: 2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c =================================================================== --- 
2.6.23-rc7-mm1.orig/drivers/s390/block/dasd_eckd.c...
Oops, looks like neither Neil nor I cross compiled this on s390. Thanks, I'll apply it. -- Jens Axboe -
Seeing the following from an older power LPAR, pretty sure we had
this in the previous -mm also:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000047ac8
cpu 0x0: Vector: 300 (Data Access) at [c00000000058f750]
pc: c000000000047ac8: .pSeries_log_error+0x364/0x420
lr: c000000000047a4c: .pSeries_log_error+0x2e8/0x420
sp: c00000000058f9d0
msr: 8000000000001032
dar: 0
dsisr: 42000000
current = 0xc0000000004a9b30
paca = 0xc0000000004aa700
pid = 0, comm = swapper
enter ? for help
[c00000000058faf0] c000000000021164 .rtas_call+0x200/0x250
[c00000000058fba0] c000000000049cd0 .early_enable_eeh+0x168/0x360
[c00000000058fc70] c00000000002f674 .traverse_pci_devices+0x8c/0x138
[c00000000058fd10] c000000000460ce8 .eeh_init+0x1a8/0x200
[c00000000058fdb0] c00000000045fb70 .pSeries_setup_arch+0x128/0x234
[c00000000058fe40] c00000000044f830 .setup_arch+0x214/0x24c
[c00000000058fee0] c000000000446a38 .start_kernel+0xd4/0x3e4
[c00000000058ff90] c000000000373194 .start_here_common+0x54/0x58
This machine is a:
processor : 0
cpu : POWER4+ (gq)
clock : 1703.965296MHz
revision : 19.0
[...]
timebase : 212995662
machine : CHRP IBM,7040-681
-apw
I haven't forgotten about this ... and am looking at it now. Seems that whenever I go to reserve the machine pSeries-102, someone else is using it :-)

--linas
This panic is caused by "[POWERPC] pseries: Fix jumbled no_logging flag."
(79c0108d1b9db4864ab77b2a95dfa04f2dcf264c), in the powerpc/for-2.6.24
branch. It looks to me that we have logging enabled too early now.
I think the following is a reasonable fix?
---
Explicitly enable RTAS error logging, when it should be ready.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
arch/powerpc/platforms/pseries/rtasd.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
index 30925d2..0df5d0d 100644
--- a/arch/powerpc/platforms/pseries/rtasd.c
+++ b/arch/powerpc/platforms/pseries/rtasd.c
@@ -54,7 +54,10 @@ static unsigned int rtas_event_scan_rate;
static int full_rtas_msgs = 0;
/* Stop logging to nvram after first fatal error */
-static int no_more_logging;
+static int no_more_logging = 1; /* Until we initialize everything,
+ * make sure we don't try logging
+ * anything */
+
static int error_log_cnt;
@@ -414,6 +417,8 @@ static int rtasd(void *unused)
memset(logdata, 0, rtas_error_log_max);
rc = nvram_read_error_log(logdata, rtas_error_log_max,
&err_type, &error_log_cnt);
+ /* We can use rtas_log_buf now */
+ no_more_logging = 0;
if (!rc) {
if (err_type != ERR_FLAG_ALREADY_LOGGED) {
Yours Tony
linux.conf.au http://linux.conf.au/ || http://lca2008.linux.org.au/
Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!
I realise it'll make the patch bigger, but this doesn't seem like a
particularly good name for the variable anymore.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
Sure, what about?
Clarify when RTAS logging is enabled.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
arch/powerpc/platforms/pseries/rtasd.c | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
index 30925d2..73401c8 100644
--- a/arch/powerpc/platforms/pseries/rtasd.c
+++ b/arch/powerpc/platforms/pseries/rtasd.c
@@ -54,8 +54,9 @@ static unsigned int rtas_event_scan_rate;
static int full_rtas_msgs = 0;
/* Stop logging to nvram after first fatal error */
-static int no_more_logging;
-
+static int logging_enabled; /* Until we initialize everything,
+ * make sure we don't try logging
+ * anything */
static int error_log_cnt;
/*
@@ -217,7 +218,7 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
}
/* Write error to NVRAM */
- if (!no_more_logging && !(err_type & ERR_FLAG_BOOT))
+ if (logging_enabled && !(err_type & ERR_FLAG_BOOT))
nvram_write_error_log(buf, len, err_type, error_log_cnt);
/*
@@ -229,8 +230,8 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
printk_log_rtas(buf, len);
/* Check to see if we need to or have stopped logging */
- if (fatal || no_more_logging) {
- no_more_logging = 1;
+ if (fatal || !logging_enabled) {
+ logging_enabled = 0;
spin_unlock_irqrestore(&rtasd_log_lock, s);
return;
}
@@ -302,7 +303,7 @@ static ssize_t rtas_log_read(struct file * file, char __user * buf,
spin_lock_irqsave(&rtasd_log_lock, s);
/* if it's 0, then we know we got the last one (the one in NVRAM) */
- if (rtas_log_size == 0 && !no_more_logging)
+ if (rtas_log_size == 0 && logging_enabled)
nvram_clear_error_log();
spin_unlock_irqrestore(&rtasd_log_lock, s);
@@ -414,6 +415,8 @@ static int rtasd(void *unused)
memset(logdata, 0, rtas_err...

For what it's worth, on a different ppc64 box, this resolves a similar
panic for me.

Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>

Thanks,
Nish
For the reasons explained, I'd really like to nack Tony's patch. --linas -
I see. Can you reply in this thread with the patch you mentioned in your other reply? (or point me to a copy of it) Thanks, Nish -
What exactly happens that allows us to do logging? I don't see any
ordering between anything else and the setting of the flag, and AFAICT
we're not inside a spinlock or anything here.

cheers

--
Michael Ellerman
Until we allocate the error log buffer. The original crash was
for a null-pointer deref of the unallocated buffer. I just sent
out a patch to fix this; it's a bit simpler than the below.
In that email, I remarked:
Andy Whitcroft's crash was apparently due to firmware complaining
about lost power, (actually, lost power supply redundancy!), which
occurred very early during boot.
Type 00000040 (EPOW)
Status: bypassed new
Residual error from previous boot.
EPOW Sensor Value: 00000002
EPOW warning due to loss of redundancy.
EPOW general power fault.
I've no clue why firmware thought it was OK to report this
during one of the earliest calls to RTAS; I'm still investigating
that.
--linas
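
The failure mode described above (an error arriving before the log buffer exists) can be sketched in plain userspace C. This is only an illustration under assumptions: log_buf, log_error(), log_init() and LOG_BUF_SIZE are hypothetical stand-ins, not the names or code in rtasd.c, and the sketch is single-threaded, so it says nothing about the ordering or locking question raised earlier in the thread.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LOG_BUF_SIZE 64		/* hypothetical size, for illustration only */

static char *log_buf;		/* stand-in for the error log buffer; NULL until set up */
static int logging_enabled;	/* zero-initialised, so nothing is logged early */

/* Returns 1 if the message was stored, 0 if it was dropped. */
static int log_error(const char *msg)
{
	/* Too early: the buffer has not been allocated yet, so drop the event. */
	if (!logging_enabled || log_buf == NULL)
		return 0;

	strncpy(log_buf, msg, LOG_BUF_SIZE - 1);
	log_buf[LOG_BUF_SIZE - 1] = '\0';
	return 1;
}

/* Allocate first, then flip the flag, so the flag never precedes the buffer. */
static int log_init(void)
{
	log_buf = calloc(1, LOG_BUF_SIZE);
	if (log_buf == NULL)
		return -1;
	logging_enabled = 1;
	return 0;
}
```

Calling log_error() before log_init() drops the message instead of dereferencing a NULL buffer; after log_init() succeeds the same call stores it. In a kernel context the flag update would presumably also need to be ordered against concurrent callers, which this sketch does not attempt.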
-Fine, but on some boots (I noticed this on rc6-mm1 too, but not before):
0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
# lspci -vns 0000:00:1a.7
00:1a.7 0c03: 8086:293c (rev 02) (prog-if 20 [EHCI])
Subsystem: 8086:293c
Flags: bus master, medium devsel, latency 0, IRQ 19
Memory at ffa7b400 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port
Capabilities: [98] Vendor Specific Information
# lspci -vns 0000:00:1d.7
00:1d.7 0c03: 8086:293a (rev 02) (prog-if 20 [EHCI])
Subsystem: 8086:293a
Flags: bus master, medium devsel, latency 0, IRQ 23
Memory at ffa7b000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port
Capabilities: [98] Vendor Specific Information
regards,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-Any changes in your BIOS setup? What about with vanilla 2.6.23-rc6? Or vanilla 2.6.23-rc7? The USB part of the code here hasn't changed in quite a while. Any difference in behavior must be the result of changes in some other part of the kernel. Possibly ACPI. This might be a good job for git-bisect. Alan Stern -
unlikely, but still possible -- I've made some changes in BIOS recently when I

Ok, I'll play with that little bit.

thanks,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
USB Legacy Support is about the only change which springs to mind. But who knows... A buggy BIOS could do almost anything. Alan Stern -
Hmm, I have USB legacy keyboard support switched on so that I can type in
GRUB and the BIOS. I booted 23-rc7 4 times and the latest -mm 3 times just
now and can't reproduce it; I just wonder what triggers it.

regards,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
Warm boot vs. cold boot, maybe. Alan Stern -
Hmm, no. I don't know, I can't see it anymore so far (using rc8-mm2).
I'll keep an eye on it anyway.

thanks,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
Getting compile errors on S390:
CC arch/s390/mm/cmm.o
arch/s390/mm/cmm.c: In function `cmm_init':
arch/s390/mm/cmm.c:431: error: implicit declaration of function
`register_oom_notifier'
arch/s390/mm/cmm.c:443: error: implicit declaration of function
`unregister_oom_notifier'
make[1]: *** [arch/s390/mm/cmm.o] Error 1
make: *** [arch/s390/mm] Error 2
-apw
Yes. It's from oom-move-prototypes-to-appropriate-header-file.patch.
I think this patch fixes it.

C.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
---
arch/s390/mm/cmm.c | 1 +
1 file changed, 1 insertion(+)
Index: 2.6.23-rc7-mm1/arch/s390/mm/cmm.c
===================================================================
--- 2.6.23-rc7-mm1.orig/arch/s390/mm/cmm.c
+++ 2.6.23-rc7-mm1/arch/s390/mm/cmm.c
@@ -17,6 +17,7 @@
#include <linux/ctype.h>
#include <linux/swap.h>
#include <linux/kthread.h>
+#include <linux/oom.h>
#include <asm/pgalloc.h>
#include <asm/uaccess.h>
Hi Andrew,
The kernel build fails with
CC arch/ia64/kernel/efi.o
arch/ia64/kernel/efi.c: In function 'efi_memmap_init':
arch/ia64/kernel/efi.c:1088: error: 'total_memory' undeclared (first use in this function)
arch/ia64/kernel/efi.c:1088: error: (Each undeclared identifier is reported only once
arch/ia64/kernel/efi.c:1088: error: for each function it appears in.)
make[1]: *** [arch/ia64/kernel/efi.o] Error 1
make: *** [arch/ia64/kernel] Error 2
The use-extended-crashkernel-command-line-on-ia64.patch uses total_mem but
returns total_memory, which is not declared.
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
---
--- linux-2.6.23-rc7/arch/ia64/kernel/efi.c 2007-09-24 15:28:06.000000000 +0530
+++ linux-2.6.23-rc7/arch/ia64/kernel/~efi.c 2007-09-24 16:56:03.000000000 +0530
@@ -1085,7 +1085,7 @@ efi_memmap_init(unsigned long *s, unsign
*s = (u64)kern_memmap;
*e = (u64)++k;
- return total_memory;
+ return total_mem;
}
void
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
-This patch:
- makes hidp_setup_input() return int to indicate errors;
- checks its return value to handle errors.
And this time it is against -rc7-mm1 tree.
Thanks to roel and Marcel Holtmann for comments.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
---
net/bluetooth/hidp/core.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
Index: linux-2.6.23-rc7-mm1/net/bluetooth/hidp/core.c
===================================================================
--- linux-2.6.23-rc7-mm1.orig/net/bluetooth/hidp/core.c
+++ linux-2.6.23-rc7-mm1/net/bluetooth/hidp/core.c
@@ -625,7 +625,7 @@ static struct device *hidp_get_device(st
return conn ? &conn->dev : NULL;
}
-static inline void hidp_setup_input(struct hidp_session *session, struct hidp_connadd_req *req)
+static inline int hidp_setup_input(struct hidp_session *session, struct hidp_connadd_req *req)
{
struct input_dev *input = session->input;
int i;
@@ -669,7 +669,7 @@ static inline void hidp_setup_input(stru
input->event = hidp_input_event;
- input_register_device(input);
+ return input_register_device(input);
}
static int hidp_open(struct hid_device *hid)
@@ -822,8 +822,11 @@ int hidp_add_connection(struct hidp_conn
session->flags = req->flags & (1 << HIDP_BLUETOOTH_VENDOR_ID);
session->idle_to = req->idle_to;
- if (session->input)
- hidp_setup_input(session, req);
+ if (session->input) {
+ err = hidp_setup_input(session, req);
+ if (err < 0)
+ goto failed;
+ }
if (session->hid)
hidp_setup_hid(session, req);
-Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Regards Marcel -
From: Marcel Holtmann <marcel@holtmann.org> Applied, thanks. -
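
The shape of that fix (a setup helper that returns the registration result, and a caller that bails out through a cleanup label) can be sketched outside the kernel as follows. Everything here is a hypothetical stand-in: struct session, register_input(), setup_input() and add_connection() only mimic the structure of the patch, and the simulate_failure parameter exists purely so the error path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

struct session {
	int has_input;		/* stand-in for session->input being non-NULL */
	int registered;
};

/* Hypothetical stand-in for a registration call: 0 or a negative errno. */
static int register_input(struct session *s, int simulate_failure)
{
	if (simulate_failure)
		return -ENOMEM;
	s->registered = 1;
	return 0;
}

/* Returns int instead of void so the registration result is not lost. */
static int setup_input(struct session *s, int simulate_failure)
{
	/* ... device fields would be filled in here ... */
	return register_input(s, simulate_failure);
}

static int add_connection(int has_input, int simulate_failure)
{
	struct session *s = calloc(1, sizeof(*s));
	int err;

	if (!s)
		return -ENOMEM;
	s->has_input = has_input;

	if (s->has_input) {
		err = setup_input(s, simulate_failure);
		if (err < 0)
			goto failed;	/* propagate instead of ignoring */
	}

	free(s);	/* a real driver would keep the session alive here */
	return 0;

failed:
	free(s);
	return err;
}
```

With this structure add_connection(1, 1) returns -ENOMEM rather than continuing with an unregistered device, while add_connection(1, 0) and add_connection(0, 1) both return 0.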
It lived fast, it died young, it didn't leave a pretty corpse...
Something in the startup scripts did a 'touch', and ker-blam.
[ 15.668000] Unable to handle kernel NULL pointer dereference at 0000000000000252 RIP:
[ 15.668000] [<ffffffff802a1dd1>] __mnt_is_readonly+0x9/0x1e
[ 15.668000] PGD 52be067 PUD 5645067 PMD 0
[ 15.668000] Oops: 0000 [1] PREEMPT SMP
[ 15.668000] last sysfs file: /block/dm-13/dev
[ 15.668000] CPU 0
[ 15.668000] Modules linked in: rtc
[ 15.668000] Pid: 528, comm: touch Not tainted 2.6.23-rc7-mm1 #1
[ 15.668000] RIP: 0010:[<ffffffff802a1dd1>] [<ffffffff802a1dd1>] __mnt_is_readonly+0x9/0x1e
[ 15.668000] RSP: 0018:ffff8100045fddd8 EFLAGS: 00010202
[ 15.668000] RAX: 0000000000000001 RBX: ffff810002c10680 RCX: 0000000000000001
[ 15.668000] RDX: ffff810082504000 RSI: ffff810005243168 RDI: 0000000000000202
[ 15.668000] RBP: ffff8100045fddd8 R08: 0000000000000001 R09: 0000000000000002
[ 15.668000] R10: 0000000000000000 R11: ffff8100045fde68 R12: 0000000000000202
[ 15.668000] R13: 00000000ffffffe2 R14: ffff8100052c1d80 R15: ffff8100039aa8a0
[ 15.668000] FS: 00007f9527f596f0(0000) GS:ffffffff806b6000(0000) knlGS:0000000000000000
[ 15.668000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 15.668000] CR2: 0000000000000252 CR3: 00000000052cb000 CR4: 00000000000006e0
[ 15.668000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 15.668000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 15.668000] Process touch (pid: 528, threadinfo ffff8100045fc000, task ffff8100047517e0)
[ 15.668000] last branch before last exception/interrupt
[ 15.668000] from [<ffffffff802a4d1b>] mnt_want_write+0x44/0xb5
[ 15.668000] to [<ffffffff802a1dc8>] __mnt_is_readonly+0x0/0x1e
[ 15.668000] Stack: ffff8100045fde08 ffffffff802a4d20 ffff8100045fddf8 0000000000000000
[ 15.668000] 00000000fffffff7 ffff810005243140 ffff8100045fdf28 ffffffff802ad288
[ 15.668000] ...
do_utimes passes an uninitialized vfsmount into mnt_want_write. Here's
the quick fix (untested), but the right fix is to restructure the complete
mess do_utimes is (never let a libc developer write your kernel code.. :)):
Index: linux-2.6.23-rc6/fs/utimes.c
===================================================================
--- linux-2.6.23-rc6.orig/fs/utimes.c 2007-09-24 14:02:24.000000000 +0200
+++ linux-2.6.23-rc6/fs/utimes.c 2007-09-24 14:03:57.000000000 +0200
@@ -59,6 +59,7 @@ long do_utimes(int dfd, char __user *fil
struct inode *inode;
struct iattr newattrs;
struct file *f = NULL;
+ struct vfsmount *mnt;
error = -EINVAL;
if (times && (!nsec_valid(times[0].tv_nsec) ||
@@ -79,17 +80,19 @@ long do_utimes(int dfd, char __user *fil
if (!f)
goto out;
dentry = f->f_path.dentry;
+ mnt = f->f_path.mnt;
} else {
error = __user_walk_fd(dfd, filename, (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW, &nd);
if (error)
goto out;
dentry = nd.dentry;
+ mnt = nd.mnt;
}
inode = dentry->d_inode;
- error = mnt_want_write(nd.mnt);
+ error = mnt_want_write(mnt);
if (error)
goto dput_and_out;
-Close - it still blew up, as one reference to nd.mnt remained. Fixed patch
is appended - system boots all the way with this applied.
--- linux-2.6.23-rc7-mm1/fs/utimes.c.dist 2007-09-24 05:57:38.000000000 -0400
+++ linux-2.6.23-rc7-mm1/fs/utimes.c 2007-09-24 08:48:34.000000000 -0400
@@ -59,6 +59,7 @@ long do_utimes(int dfd, char __user *fil
struct inode *inode;
struct iattr newattrs;
struct file *f = NULL;
+ struct vfsmount *mnt;
error = -EINVAL;
if (times && (!nsec_valid(times[0].tv_nsec) ||
@@ -79,17 +80,19 @@ long do_utimes(int dfd, char __user *fil
if (!f)
goto out;
dentry = f->f_path.dentry;
+ mnt = f->f_path.mnt;
} else {
error = __user_walk_fd(dfd, filename, (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW, &nd);
if (error)
goto out;
dentry = nd.dentry;
+ mnt = nd.mnt;
}
inode = dentry->d_inode;
- error = mnt_want_write(nd.mnt);
+ error = mnt_want_write(mnt);
if (error)
goto dput_and_out;
@@ -135,7 +138,7 @@ long do_utimes(int dfd, char __user *fil
error = notify_change(dentry, &newattrs);
mutex_unlock(&inode->i_mutex);
mnt_drop_write_and_out:
- mnt_drop_write(nd.mnt);
+ mnt_drop_write(mnt);
dput_and_out:
if (f)
fput(f);
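
For readers following along, the idea behind the fix can be sketched with mock types in userspace C: pick the mount from whichever source was actually filled in (the open file or the path lookup) and use that one pointer for every later call. The struct definitions and want_write_for() below are simplified, hypothetical stand-ins, not the kernel's definitions, and mnt_want_write() here is a toy that only checks a readonly flag.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified mock types; the real kernel structures carry far more state. */
struct vfsmount  { int readonly; };
struct dentry    { int unused; };
struct path      { struct vfsmount *mnt; struct dentry *dentry; };
struct file      { struct path f_path; };
struct nameidata { struct vfsmount *mnt; struct dentry *dentry; };

/* Toy stand-in: refuse writes on a read-only mount. */
static int mnt_want_write(struct vfsmount *mnt)
{
	return mnt->readonly ? -EROFS : 0;
}

/*
 * Exactly one of f and nd is valid, mirroring the two branches in the
 * patch. Reading nd->mnt unconditionally is the bug: on the file branch
 * the lookup never ran, so that field holds garbage.
 */
static int want_write_for(struct file *f, struct nameidata *nd)
{
	struct vfsmount *mnt;

	if (f)
		mnt = f->f_path.mnt;	/* descriptor-based call */
	else
		mnt = nd->mnt;		/* path-based call */

	return mnt_want_write(mnt);
}
```

Keeping a single local mnt also means the matching drop call later in the function uses the same pointer, which is the reference the first version of the patch missed.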