Re: 2.6.23-rc7-mm1 -- powerpc rtas panic

To: <linux-kernel@...>
Date: Monday, September 24, 2007 - 5:17 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc7/2.6.23-rc7-mm...

- New git tree git-powerpc-galak.patch added to the -mm lineup: ppc32
  things, mainly (Kumar Gala <galak@gate.crashing.org>)



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.
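The "around ten rebuilds" figure in the bisection item above is what a binary search gives for a series on the order of a thousand patches, since each rebuild halves the remaining range (2^10 = 1024). The following is only a minimal sketch of that search, not the procedure from bisecting-mm-trees.txt: `bisect_first_bad`, `is_bad_fn` and `bad_from_threshold` are hypothetical names, and the series size is an assumption rather than a figure stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: nonzero if a tree with the first n_applied
 * patches of the series applied shows the bug (one rebuild + reboot). */
typedef int (*is_bad_fn)(size_t n_applied, void *ctx);

/* Returns the 1-based index of the first patch that introduces the bug,
 * assuming the base tree (0 patches) is good and the full series is bad.
 * If steps is non-NULL it receives the number of rebuilds performed. */
size_t bisect_first_bad(size_t n_patches, is_bad_fn is_bad, void *ctx,
                        unsigned *steps)
{
    size_t good = 0, bad = n_patches;
    unsigned n = 0;

    assert(n_patches > 0 && is_bad != NULL);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;      /* bug already present: culprit is at or before mid */
        else
            good = mid;     /* still fine: culprit is after mid */
        n++;
    }
    if (steps)
        *steps = n;
    return bad;
}

/* Example predicate for illustration: the bug appears once patch
 * number *ctx has been applied. */
int bad_from_threshold(size_t n_applied, void *ctx)
{
    return n_applied >= *(size_t *)ctx;
}
```

With an assumed series of 1000 patches the loop never needs more than ten iterations, which matches the cost quoted above.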



Changes since 2.6.23-rc6-mm1:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-arm.patch
 git-audit-master.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-powerpc.patch
 git-powerpc-galak.patch
 git-drm.patch
 git-dvb.patch
 git-hwmon.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ieee1394.patch...
To: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 4:10 pm

On 24.09.2007 11:17, Andrew Morton wrote:
To: Laurent Riffard <laurent.riffard@...>, <kexec@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 7:11 pm

[adding kexec m-l]



---
~Randy
Phaedrus says that Quality is about caring.
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 6:20 pm

Hi Andrew,

The drivers/net/pasemi_mac driver seems to be broken and the build fails with

CC [M] drivers/net/pasemi_mac.o
drivers/net/pasemi_mac.c: In function ‘pasemi_mac_probe’:
drivers/net/pasemi_mac.c:1153: error: conflicting types for ‘mac’
drivers/net/pasemi_mac.c:1151: error: previous declaration of ‘mac’ was here
drivers/net/pasemi_mac.c:1170: error: incompatible types in assignment
drivers/net/pasemi_mac.c:1172: error: request for member ‘pdev’ in
something not a structure or union
drivers/net/pasemi_mac.c:1173: error: request for member ‘netdev’ in
something not a structure or union
drivers/net/pasemi_mac.c:1175: error: request for member ‘napi’ in
something not a structure or union
drivers/net/pasemi_mac.c:1180: error: request for member ‘dma_txch’ in
something not a structure or union
drivers/net/pasemi_mac.c:1181: error: request for member ‘dma_rxch’ in
something not a structure or union
drivers/net/pasemi_mac.c:1187: error: request for member ‘dma_if’ in
something not a structure or union
drivers/net/pasemi_mac.c:1189: error: request for member ‘dma_if’ in
something not a structure or union
drivers/net/pasemi_mac.c:1194: error: request for member ‘type’ in
something not a structure or union
drivers/net/pasemi_mac.c:1197: error: request for member ‘type’ in
something not a structure or union
drivers/net/pasemi_mac.c:1205: warning: passing argument 1 of
‘pasemi_get_mac_addr’ from incompatible pointer type
drivers/net/pasemi_mac.c:1205: error: request for member ‘mac_addr’ in
something not a structure or union
drivers/net/pasemi_mac.c:1209: error: request for member ‘mac_addr’ in
something not a structure or union
drivers/net/pasemi_mac.c:1209: error: request for member ‘mac_addr’ in
something not a structure or union
drivers/net/pasemi_mac.c:1216: warning: passing argument 1 of
‘pasemi_mac_map_regs’ from incompatible pointer type
drivers/net/pasemi_mac.c:1220: error: request for member ‘rx_status’ in
so...
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <axboe@...>, <neilb@...>
Date: Monday, September 24, 2007 - 3:41 pm

Hi Andrew,

The build fails with the following error:

CC drivers/block/ps3disk.o
drivers/block/ps3disk.c: In function ‘ps3disk_scatter_gather’:
drivers/block/ps3disk.c:115: error: ‘bio’ undeclared (first use in this
function)
drivers/block/ps3disk.c:115: error: (Each undeclared identifier is
reported only once
drivers/block/ps3disk.c:115: error: for each function it appears in.)
drivers/block/ps3disk.c:115: error: ‘j’ undeclared (first use in this
function)
drivers/block/ps3disk.c:116: error: implicit declaration of function
‘bio_kunmap_bvec’
make[2]: *** [drivers/block/ps3disk.o] Error 1
make[1]: *** [drivers/block] Error 2
make: *** [drivers] Error 2

The function bio_kunmap_bvec is missing. I checked git-block.patch
as well as linux/kernel/git/axboe/linux-2.6-block.git and did not
find this function.

Previously this function was replaced by __bio_kunmap_atomic().
This patch does not solve the "implicit declaration of function
‘bio_kunmap_bvec’" error.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com
<mailto:kamalesh@linux.vnet.ibm.com>>
---

--- linux-2.6.23-rc7/drivers/block/ps3disk.c    2007-09-24 20:50:41.000000000 +0530
+++ linux-2.6.23-rc7/drivers/block/~ps3disk.c   2007-09-24 20:50:59.000000000 +0530
@@ -112,7 +112,7 @@ static void ps3disk_scatter_gather(struc
                else
                        memcpy(buf, dev->bounce_buf+offset, size);
                offset += size;
-               flush_kernel_dcache_page(bio_iovec_idx(bio, j)->bv_page);
+              flush_kernel_dcache_page(bio_iovec_idx(iter.bio, iter.i)->bv_page);
                bio_kunmap_bvec(bvec, flags);
                i++;
        }

-- 

Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

-
To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <axboe@...>, <neilb@...>, <Geert.Uytterhoeven@...>, <geoffrey.levand@...>
Date: Tuesday, September 25, 2007 - 6:23 am

On (25/09/07 01:11), Kamalesh Babulal didst pronounce:


Your mailer appears to have mangled both your signoff and the whitespace in
the patch and it does not apply. However, fixing it does not solve the problem
because of this mysterious bio_kunmap_bvec() that is only referenced by this

-- 
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
To: Mel Gorman <mel@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <neilb@...>, <Geert.Uytterhoeven@...>, <geoffrey.levand@...>
Date: Tuesday, September 25, 2007 - 6:31 am

This should fix things up.

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index 8e05ba7..a7fd66a 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -106,14 +106,14 @@ static void ps3disk_scatter_gather(struct ps3_storage_device *dev,
 			(unsigned long)iter.bio->bi_sector);
 
 		size = bvec->bv_len;
-		buf = bvec_kmap_irq(bvec, flags);
+		buf = bvec_kmap_irq(bvec, &flags);
 		if (gather)
 			memcpy(dev->bounce_buf+offset, buf, size);
 		else
 			memcpy(buf, dev->bounce_buf+offset, size);
 		offset += size;
-		flush_kernel_dcache_page(bio_iovec_idx(bio, j)->bv_page);
-		bio_kunmap_bvec(bvec, flags);
+		flush_kernel_dcache_page(bvec->bv_page);
+		bvec_kunmap_irq(buf, &flags);
 		i++;
 	}
 }
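The two changes in this hunk follow one pattern: the map helper must be handed the address of `flags` so it can store the saved interrupt state there, and the unmap helper takes the mapped buffer plus the same `flags` pointer to restore it. What follows is a userspace model of that contract, not the kernel implementation: `model_irq_state`, `model_vec`, `model_kmap_irq`, `model_kunmap_irq` and `model_irqs_enabled` are hypothetical stand-ins, and the "interrupt state" is just a variable.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the CPU interrupt-enable state: 1 = enabled, 0 = disabled. */
static unsigned long model_irq_state = 1;

/* Simplified segment descriptor: a page plus offset and length. */
struct model_vec {
    char *page;
    size_t offset;
    size_t len;
};

/* Save the current state into *flags, "disable interrupts", and return
 * the address of the segment. Passing flags by value would leave the
 * caller's variable untouched, so there would be nothing to restore. */
char *model_kmap_irq(struct model_vec *vec, unsigned long *flags)
{
    assert(vec != NULL && flags != NULL);
    *flags = model_irq_state;
    model_irq_state = 0;
    return vec->page + vec->offset;
}

/* Undo the mapping and restore whatever state the map call saved. */
void model_kunmap_irq(char *buf, unsigned long *flags)
{
    assert(buf != NULL && flags != NULL);
    model_irq_state = *flags;
}

unsigned long model_irqs_enabled(void)
{
    return model_irq_state;
}
```

A caller pairs the two with the same local variable, e.g. `buf = model_kmap_irq(&vec, &flags); ... model_kunmap_irq(buf, &flags);`, which mirrors the shape of the corrected lines in the patch.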

-- 
Jens Axboe

-
To: Jens Axboe <jens.axboe@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <neilb@...>, <Geert.Uytterhoeven@...>, <geoffrey.levand@...>
Date: Tuesday, September 25, 2007 - 7:15 am

This builds although I lack the hardware to really test it. However, in
2.6.23-rc8-mm1 it collides with git-block-ps3disk-fix.patch. This is a
version on top of that stack but I guess the best thing to do is replace
git-block-ps3disk-fix.patch with Jens's patch once it is signed off.

Not signing off because this is just a rebase. Assuming the other one
gets signed off, consider it;

Acked-by: Mel Gorman <mel@csn.ul.ie>

--- 

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc8-mm1-clean/drivers/block/ps3disk.c linux-2.6.23-rc8-mm1-fix-ps3disk/drivers/block/ps3disk.c
--- linux-2.6.23-rc8-mm1-clean/drivers/block/ps3disk.c	2007-09-25 12:05:40.000000000 +0100
+++ linux-2.6.23-rc8-mm1-fix-ps3disk/drivers/block/ps3disk.c	2007-09-25 12:09:19.000000000 +0100
@@ -106,14 +106,14 @@ static void ps3disk_scatter_gather(struc
 			(unsigned long)iter.bio->bi_sector);
 
 		size = bvec->bv_len;
-		buf = bvec_kmap_irq(bvec, flags);
+		buf = bvec_kmap_irq(bvec, &flags);
 		if (gather)
 			memcpy(dev->bounce_buf+offset, buf, size);
 		else
 			memcpy(buf, dev->bounce_buf+offset, size);
 		offset += size;
-		flush_kernel_dcache_page(bio_iovec_idx(iter.bio, iter.i)->bv_page);
-		bio_kunmap_bvec(bvec, flags);
+		flush_kernel_dcache_page(bvec->bv_page);
+		bvec_kunmap_irq(buf, &flags);
 		i++;
 	}
 }

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
To: Mel Gorman <mel@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <neilb@...>, <Geert.Uytterhoeven@...>, <geoffrey.levand@...>
Date: Tuesday, September 25, 2007 - 7:23 am

Thanks, but I already integrated the fix into the existing patch, so
that bisect will work.

-- 
Jens Axboe

-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 3:07 pm

With the five hotfixes applied it works for me.

But it fails to power down my system when shutting down.

It prints 'System halted' twice and blinks the keyboard LEDs, but does
not switch off. On all other kernel versions I only see one keyboard
blink before the power goes out.

I compared its dmesg to vanilla-rc7 and -rc4-mm1, but except that rc-4
assigns different IRQs I can't see any differences other than the normal
variation in BogoMips etc.

As the system still responded to SysRq I got the following information:
[  415.770000] SysRq : Show Regs
[  415.770000] CPU 3:
[  415.780000] Modules linked in: radeon drm nfsd exportfs ipv6 tuner
tea5767 tda8290 tuner_simple mt20xx tvaudio msp3400 bttv video_buf
ir_common compat_ioctl32 btcx_risc tveeprom videodev v4l2_common
v4l1_compat pata_amd usbhid hid sg
[  415.780000] Pid: 0, comm: swapper Not tainted 2.6.23-rc7-mm1 #1
[  415.780000] RIP: 0010:[<ffffffff8020ac79>]  [<ffffffff8020ac79>]
default_idle+0x29/0x40
[  415.780000] RSP: 0018:ffff81010038bf30  EFLAGS: 00000246
[  415.780000] RAX: 0000000000000400 RBX: ffffffff80810040 RCX: 0000000000000000
[  415.780000] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000005
[  415.780000] RBP: 0000000000030400 R08: 0000000000000000 R09: ffff81010038be68
[  415.950000] R10: 000000000100002c R11: ffffffff80219be0 R12: 0000000000000000
[  415.950000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  415.950000] FS:  00007f35c69726f0(0000) GS:ffff810100319700(0000)
knlGS:0000000000000000
[  415.950000] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  415.950000] CR2: 00007fe432928c40 CR3: 0000000000201000 CR4: 00000000000006e0
[  416.070000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  416.070000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  416.070000]
[  416.070000] Call Trace:
[  416.070000]  [<ffffffff8020acea>] cpu_idle+0x5a/0x90
[  416.070000]

No blocked tasks were shown with SysRq+...
To: Torsten Kaiser <just.for.lkml@...>
Cc: <linux-kernel@...>, Thomas Gleixner <tglx@...>, Len Brown <lenb@...>
Date: Monday, September 24, 2007 - 3:34 pm

On Mon, 24 Sep 2007 21:07:19 +0200



hm, dunno.  The only substantial patch which touches
arch/x86_64/kernel/process.c (which is where cpu_idle lives) is
x86_64-prep-idle-loop-for-dynticks.patch.

The problem is, 2.6.23-rc6-mm1's git-acpi patch had all the new cpuidle
code in it.  Len dropped all that code over the weekend (which is when I
picked this copy of his tree), so 2.6.23-rc7-mm1 doesn't have the cpuidle
code.  Len will be reapplying the cpuidle patches today(ish) so next -mm
_will_ have the cpuidle code.

So what we have in rc7-mm1 is this transient no-cpuidle state.  It could be
that the x86_64 dynticks code (which was previously developed and tested in
conjunction with the cpuidle patches) has some dependency on cpuidle.

So it's all a bit of a mess :(

I think I'll basically stop applying things which don't look like bugfixes
for a while and try to get more -mm's out, as we seriously need to get this
lot stabilised asap.

Len, would it be possible to restore cpuidle sometime today please?
-
To: Andrew Morton <akpm@...>
Cc: Torsten Kaiser <just.for.lkml@...>, <linux-kernel@...>, Len Brown <lenb@...>
Date: Monday, September 24, 2007 - 4:25 pm

Can you check whether 2.6.23-rc7 +
http://tglx.de/projects/hrtimers/2.6.23-rc7/patch-2.6.23-rc7-hrt1.patch


It should not. cpuidle makes use of dynticks not the other way round.

	tglx


-
To: Thomas Gleixner <tglx@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Len Brown <lenb@...>
Date: Tuesday, September 25, 2007 - 3:32 am

Yes, powers off normally.

Torsten
-
To: Torsten Kaiser <just.for.lkml@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Len Brown <lenb@...>
Date: Tuesday, September 25, 2007 - 3:44 am

Ok, so it's probably some merge artifact in -mm. We'll get this sorted
out once Len has his new tree available.

	tglx


-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <bob.picco@...>
Date: Monday, September 24, 2007 - 11:18 am

There isn't a total_memory identifier within this function's scope. The
patch was compile/link tested.

Signed-off-by: Bob Picco <bob.picco@hp.com>

 arch/ia64/kernel/efi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc7-mm1/arch/ia64/kernel/efi.c
===================================================================
--- linux-2.6.23-rc7-mm1.orig/arch/ia64/kernel/efi.c	2007-09-24 09:54:40.000000000 -0400
+++ linux-2.6.23-rc7-mm1/arch/ia64/kernel/efi.c	2007-09-24 10:50:51.000000000 -0400
@@ -1085,7 +1085,7 @@ efi_memmap_init(unsigned long *s, unsign
 	*s = (u64)kern_memmap;
 	*e = (u64)++k;
 
-	return total_memory;
+	return total_mem;
 }
 
 void
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 10:52 am

I'm observing a problem with this kernel (as well as 2.6.23-rc6-mm1) which 
manifests itself only in my Postfix/application mail.logs:

Sep 25 00:25:40 tornado postfix/smtp[12520]: fatal: select lock: Cannot allocate 
memory
Sep 25 00:25:41 tornado postfix/master[8002]: warning: process 
/usr/lib64/postfix/smtp pid 12520 exit status 1

This is happening frequently with processes started via 'master' (smtp, smtpd 
and cleanup), but it does not appear to have any noticeable operational impact 
apart from logging a lot of copies of this message.

The corresponding code in Postfix which triggers this is (choice of 3 files in 
src/master are all possibilities which all have much the same code)

     /*
      * The event loop, at last.
      */
     while (var_use_limit == 0 || use_count < var_use_limit || client_count > 0) {
         if (multi_server_lock != 0) {
             watchdog_stop(watchdog);
             if (myflock(vstream_fileno(multi_server_lock), INTERNAL_LOCK,
                         MYFLOCK_OP_EXCLUSIVE) < 0)
                 msg_fatal("select lock: %m");
         }
         watchdog_start(watchdog);
         delay = loop ? loop(multi_server_name, multi_server_argv) : -1;
         event_loop(delay);
     }
     multi_server_exit();
}


Now I'm not convinced this is an application problem, because I'm only seeing 
this after running up kernel 2.6.23-rc6-mm1 or 2.6.23-rc7-mm1 and with NO 
changes to the application itself.  Using the same application binaries it does 
not occur with 2.6.22 mainline.  [I didn't get a lot of testing with the -mm 
release prior to that unfortunately due to some other breakage.]

Is there anything new in the last two or so -mm kernels which could have caused 
this?

I've put my .config up at http://www.reub.net/files/kernel/2.6.23-rc7-mm1.config

Thanks,
Reuben
-
To: Reuben Farrelly <reuben-linuxkernel@...>
Cc: <linux-kernel@...>, Pavel Emelyanov <xemul@...>, J. Bruce Fields <bfields@...>
Date: Monday, September 24, 2007 - 12:59 pm

ug.

Lots of people have been futzing with the fs/locks.c code:

cleanup-macros-for-distinguishing-mandatory-locks.patch
fix-potential-oops-in-generic_setlease-v2.patch
fix-potential-oops-in-generic_setlease.patch
fs-locksc-use-list_for_each_entry-instead-of-list_for_each.patch
git-nfs.patch
git-nfsd.patch
rework-proc-locks-via-seq_files-and-seq_list-helpers-fix-2.patch
rework-proc-locks-via-seq_files-and-seq_list-helpers.patch
slab-api-remove-useless-ctor-parameter-and-reorder-parameters.patch



-
To: Andrew Morton <akpm@...>
Cc: Reuben Farrelly <reuben-linuxkernel@...>, <linux-kernel@...>, Pavel Emelyanov <xemul@...>
Date: Monday, September 24, 2007 - 1:12 pm

Oog.  Looks like it's the "Memory shortage can result in inconsistent
flocks state" patch--the error variable is being set in some cases when
it shouldn't be.  Does the following fix it?

That's in my git tree, not in mainline.  I'll fix up my copy.

And I'll spend some time today figuring out what to do about regression
testing for the posix lock, flock, and lease code.

Thanks for the bug report!

--b.

diff --git a/fs/locks.c b/fs/locks.c
index a6c5917..3e8bfd2 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -740,6 +740,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 		new_fl = locks_alloc_lock();
 		if (new_fl == NULL)
 			goto out;
+		error = 0;
 	}
 
 	for_each_lock(inode, before) {
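The one-line fix above addresses a common error-path slip: a pessimistic error code is set before an allocation so the failure branch can jump straight to the exit, but it is never cleared once the allocation succeeds. The sketch below models that in userspace, assuming (the hunk does not show it) that `error` is set to `-ENOMEM` just before the allocation; `take_lock`, `struct lock` and the `apply_fix` switch are hypothetical and exist only to contrast the two behaviours.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct lock {
    int type;
};

/* Hypothetical model of the flow around the patched lines.
 * want_unlock: nonzero models an unlock request (no allocation needed).
 * apply_fix:   nonzero includes the "error = 0" line from the patch.
 * On return *out holds the newly installed lock, or NULL. */
int take_lock(int want_unlock, int apply_fix, struct lock **out)
{
    int error = 0;
    struct lock *new_fl = NULL;

    assert(out != NULL);
    *out = NULL;

    if (!want_unlock) {
        error = -ENOMEM;                 /* pessimistic default (assumed) */
        new_fl = malloc(sizeof(*new_fl));
        if (new_fl == NULL)
            goto out;
        if (apply_fix)
            error = 0;                   /* allocation worked: clear it */
    }

    /* Rest of the function: it succeeds without touching error. */
    if (new_fl) {
        new_fl->type = 1;
        *out = new_fl;
    }
out:
    return error;
}
```

Without the reset, the lock is installed and the caller still sees `-ENOMEM`, which would be consistent with the "Cannot allocate memory" message Postfix logs while otherwise continuing to work.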
-
To: J. Bruce Fields <bfields@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Pavel Emelyanov <xemul@...>
Date: Monday, September 24, 2007 - 5:31 pm

Yes that has fixed it, thanks!

Reuben
-
To: Andrew Morton <akpm@...>
Cc: Hisashi Hifumi <hifumi.hisashi@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 9:17 am

a.k.a. mm-use-pagevec-to-rotate-reclaimable-page-fix-2.patch

rotate_reclaimable_page() is not necessarily called with IRQ disabled:
it must do so when calling the helpfully commented pagevec_move_tail().

Hmm, if pagevec_move_tail() is assuming IRQ disabled, why should it
bother with irqsave/irqrestore variants of spin_lock?  Because we like
to see them on lru_lock?  But vmscan.c already has one bare spin_lock().

Signed-off-by: Hugh Dickins <hugh@veritas.com>
---

 mm/swap.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- 2.6.23-rc7-mm1/mm/swap.c	2007-09-24 11:05:55.000000000 +0100
+++ linux/mm/swap.c	2007-09-24 13:08:12.000000000 +0100
@@ -102,7 +102,6 @@ static void pagevec_move_tail(struct pag
 	int i;
 	int pgmoved = 0;
 	struct zone *zone = NULL;
-	unsigned long uninitialized_var(flags);
 
 	for (i = 0; i < pagevec_count(pvec); i++) {
 		struct page *page = pvec->pages[i];
@@ -110,9 +109,9 @@ static void pagevec_move_tail(struct pag
 
 		if (pagezone != zone) {
 			if (zone)
-				spin_unlock_irqrestore(&zone->lru_lock, flags);
+				spin_unlock(&zone->lru_lock);
 			zone = pagezone;
-			spin_lock_irqsave(&zone->lru_lock, flags);
+			spin_lock(&zone->lru_lock);
 		}
 		if (PageLRU(page) && !PageActive(page)) {
 			list_move_tail(&page->lru, &zone->inactive_list);
@@ -120,7 +119,7 @@ static void pagevec_move_tail(struct pag
 		}
 	}
 	if (zone)
-		spin_unlock_irqrestore(&zone->lru_lock, flags);
+		spin_unlock(&zone->lru_lock);
 	__count_vm_events(PGROTATED, pgmoved);
 	release_pages(pvec->pages, pvec->nr, pvec->cold);
 	pagevec_reinit(pvec);
@@ -150,6 +149,7 @@ void move_tail_pages()
 int rotate_reclaimable_page(struct page *page)
 {
 	struct pagevec *pvec;
+	unsigned long flags;
 
 	if (PageLocked(page))
 		return 1;
@@ -162,9 +162,11 @@ int rotate_reclaimable_page(struct page 
 
 	if (PageLRU(page) && !PageActive(page)) {
 		page_cache_...
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 9:13 am

Hi Andrew,

Kernel BUG on x86_64 (AMD Opteron(tm) Processor 844).

A similar kernel BUG was reported for 2.6.23-rc2-mm1
at http://lkml.org/lkml/2007/8/10/20 and the
mm-dirty-balancing-for-tasks.patch was dropped from 2.6.23-rc2-mm2.
The same patch is in this -mm version, so I suspect it may be
triggering this BUG as well.

BUG: soft lockup - CPU#0 stuck for 11s! [events/0:15]
CPU 0:
Modules linked in:
Pid: 15, comm: events/0 Tainted: G      D 2.6.23-rc7-mm1-autokern1 #1
RIP: 0010:[<ffffffff8021be46>]  [<ffffffff8021be46>] __smp_call_function_mask+0x9a/0xc4
RSP: 0000:ffff8100017add80  EFLAGS: 00000297
RAX: 00000000000000fc RBX: ffff8100017adde0 RCX: 0000000000000001
RDX: 00000000000008fc RSI: 00000000000000fc RDI: 000000000000000e
RBP: ffffc20002d11000 R08: ffff8100017ac000 R09: ffffffff80675e38
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000f
R13: ffffffff8021bcfe R14: 0000000000000000 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffffffff8065a000(0000) knlGS:00000000556aa2a0
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffc20002d11008 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
Inexact backtrace:
 [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
 [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
 [<ffffffff8021becf>] smp_call_function_mask+0x5f/0x72
 [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
 [<ffffffff8021bf82>] smp_call_function+0x19/0x1b
 [<ffffffff8023a773>] on_each_cpu+0x16/0x2b
 [<ffffffff802158a2>] mcheck_timer+0x0/0x7c
 [<ffffffff802158c0>] mcheck_timer+0x1e/0x7c
 [<ffffffff802444b9>] run_workqueue+0x88/0x109
 [<ffffffff8024453a>] worker_thread+0x0/0xf4
 [<ffffffff80244623>] worker_thread+0xe9/0xf4
 [<ffffffff8024841d>] autoremove_wake_function+0x0/0x37
 [<ffffffff8024841d>] autoremove_wa...
To: Kamalesh Babulal <kamalesh@...>
Cc: <linux-kernel@...>, Peter Zijlstra <a.p.zijlstra@...>
Date: Monday, September 24, 2007 - 12:44 pm

hm, I thought we'd fixed the problems in that patchset.  Peter, were
you aware of this one?
-
To: Andrew Morton <akpm@...>
Cc: Kamalesh Babulal <kamalesh@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 12:57 pm

On Mon, 24 Sep 2007 09:44:48 -0700 Andrew Morton

Nope, and the stacktrace is utterly puzzling.

/me goes read the lkml.org link

Kamalesh Babulal: do you still get:
  BUG: spinlock bad magic on

msgs?

Because those I could reproduce using fsx, and I fixed all that.
-
To: Peter Zijlstra <a.p.zijlstra@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 1:08 pm

Hi Peter,

I do not get BUG: spinlock bad magic messages any more, but the soft lockup
message is thrown more than 30 times while running the LTP runall.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
-
To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 3:20 pm

On Mon, 24 Sep 2007 22:38:03 +0530 Kamalesh Babulal

It would be good to know what function on_each_cpu is executing, could
you try something like:

---
 kernel/softirq.c    |    5 +++++
 kernel/softlockup.c |    7 +++++++
 2 files changed, 12 insertions(+)

Index: linux-2.6/kernel/softirq.c
===================================================================
--- linux-2.6.orig/kernel/softirq.c
+++ linux-2.6/kernel/softirq.c
@@ -645,6 +645,8 @@ __init int spawn_ksoftirqd(void)
 }
 
 #ifdef CONFIG_SMP
+
+DEFINE_PER_CPU(void (*)(void *info), last_on_each_cpu);
 /*
  * Call a function on all processors
  */
@@ -653,6 +655,9 @@ int on_each_cpu(void (*func) (void *info
 	int ret = 0;
 
 	preempt_disable();
+
+	per_cpu(last_on_each_cpu, smp_processor_id()) = func;
+
 	ret = smp_call_function(func, info, retry, wait);
 	local_irq_disable();
 	func(info);
Index: linux-2.6/kernel/softlockup.c
===================================================================
--- linux-2.6.orig/kernel/softlockup.c
+++ linux-2.6/kernel/softlockup.c
@@ -15,6 +15,8 @@
 #include <linux/notifier.h>
 #include <linux/module.h>
 #include <linux/kgdb.h>
+#include <linux/percpu.h>
+#include <linux/kallsyms.h>
 
 #include <asm/irq_regs.h>
 
@@ -71,6 +73,8 @@ void touch_all_softlockup_watchdogs(void
 }
 EXPORT_SYMBOL(touch_all_softlockup_watchdogs);
 
+DECLARE_PER_CPU(void (*)(void *), last_on_each_cpu);
+
 /*
  * This callback runs from the timer interrupt, and checks
  * whether the watchdog thread has hung or not:
@@ -122,6 +126,9 @@ void softlockup_tick(void)
 	printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
 			this_cpu, now - touch_timestamp,
 			current->comm, task_pid_nr(current));
+	printk(KERN_ERR " last_on_each_cpu: [<%p>] ",
+			per_cpu(last_on_each_cpu, this_cpu));
+	print_symbol("%s\n", (unsigned long)per_cpu(last_on_each_cpu, this_cpu));
 	if (regs)
 		show_regs(regs);
 	else
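The debugging patch above works by recording, per CPU, the last function handed to on_each_cpu(), so that the soft lockup report can name it. A rough userspace model of that idea follows; it is only a sketch, with a plain array standing in for the kernel's per-CPU variable, and `MODEL_NR_CPUS`, `model_on_each_cpu`, `model_last_func` and `model_count_call` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

typedef void (*cpu_func_t)(void *info);

/* One slot per CPU; in the patch this is a per-CPU variable. */
static cpu_func_t last_on_each_cpu[MODEL_NR_CPUS];

/* Record the function before running it, so that if it never returns
 * a watchdog can still look up which function was in flight. */
void model_on_each_cpu(unsigned cpu, cpu_func_t func, void *info)
{
    assert(cpu < MODEL_NR_CPUS && func != NULL);
    last_on_each_cpu[cpu] = func;
    func(info);
}

/* What the watchdog would print for a stuck CPU (NULL if nothing ran). */
cpu_func_t model_last_func(unsigned cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    return last_on_each_cpu[cpu];
}

/* Example callback for illustration: counts how often it is invoked. */
void model_count_call(void *info)
{
    ++*(int *)info;
}
```

The important design point is the ordering: the store happens before the call, because a function that locks up never reaches any code placed after it.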
-
To: Peter Zijlstra <a.p.zijlstra@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Tuesday, September 25, 2007 - 7:05 am

On Mon, 24 Sep 2007 21:20:58 +0200 Peter Zijlstra

I've just completed 2 full ltp runs on a dual-core opteron machine but
could not reproduce this problem.

Kamalesh, would it be possible for you to reproduce with that patch, so
we can see what function is holding up the cpu?
-
To: Peter Zijlstra <a.p.zijlstra@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Tuesday, September 25, 2007 - 9:07 am

Hi Peter,

After running the test with the patch you provided, I observed an oops message
that appeared above these soft lockup messages; the oops is the same as
the one reported at http://lkml.org/lkml/2007/9/24/107.

When I applied the patch for the oops proposed at
http://lkml.org/lkml/2007/9/25/57, neither the oops nor the soft lockups were seen.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>
Date: Monday, September 24, 2007 - 9:00 am

I also get this compile error on s390. 'linux/scatterlist.h' has disappeared
from the #include pile, but where?

/home/clg/linux/2.6.23-rc7-mm1/net/sctp/auth.c: In function `sctp_auth_calculate_hmac':
/home/clg/linux/2.6.23-rc7-mm1/net/sctp/auth.c:695: error: storage size of 'sg' isn't known
/home/clg/linux/2.6.23-rc7-mm1/net/sctp/auth.c:695: warning: unused variable `sg'

Cheers,

C.
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>, Vlad Yasevich <vladislav.yasevich@...>
Date: Monday, September 24, 2007 - 9:10 am

putting Vlad in Cc: 


The following patch works of course, but it seems too simplistic for s390.

Cheers,

C.


Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
---
 net/sctp/auth.c |    1 +
 1 file changed, 1 insertion(+)

Index: 2.6.23-rc7-mm1/net/sctp/auth.c
===================================================================
--- 2.6.23-rc7-mm1.orig/net/sctp/auth.c
+++ 2.6.23-rc7-mm1/net/sctp/auth.c
@@ -36,6 +36,7 @@
 
 #include <linux/types.h>
 #include <linux/crypto.h>
+#include <linux/scatterlist.h>
 #include <net/sctp/sctp.h>
 #include <net/sctp/auth.h>
-
To: Cedric Le Goater <clg@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>, Vlad Yasevich <vladislav.yasevich@...>
Date: Monday, September 24, 2007 - 12:57 pm

Thanks, applied.

-- 
Jens Axboe

-
To: Cedric Le Goater <clg@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>
Date: Monday, September 24, 2007 - 9:29 am

Odd that it didn't show up on x86 or ia64, but simple enough.

ACK.


-
To: Vlad Yasevich <vladislav.yasevich@...>
Cc: Cedric Le Goater <clg@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>
Date: Monday, September 24, 2007 - 12:58 pm

Most likely those archs end up pulling in scatterlist.h through some
other maze of includes.

-- 
Jens Axboe

-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 8:55 am

Hi Andrew,

Kernel oops on x86_64 (AMD Opteron(tm) Processor 844)

Unable to handle kernel NULL pointer dereference at 0000000000000070 RIP: 
 [<ffffffff80290630>] fasync_helper+0x6b/0xe4
PGD 181949067 PUD 182228067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /devices/system/node/possible
CPU 3 
Modules linked in:
Pid: 18156, comm: fcntl23 Not tainted 2.6.23-rc7-mm1-autokern1 #1
RIP: 0010:[<ffffffff80290630>]  [<ffffffff80290630>] fasync_helper+0x6b/0xe4
RSP: 0000:ffff810082bdfdb8  EFLAGS: 00010046
RAX: 00000000fffffff4 RBX: ffff8101821a9000 RCX: 0000000000000000
RDX: ffff8101821a9000 RSI: ffff810180026900 RDI: ffffffff806286b8
RBP: ffff810082bdfde8 R08: 0000000000000002 R09: ffff81018072124b
R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000070
R13: 0000000000000001 R14: 0000000000000000 R15: ffff810181875cc0
FS:  0000000000000000(0000) GS:ffff810180721380(0063) knlGS:00000000556aa2a0
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000000000070 CR3: 00000001818c4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process fcntl23 (pid: 18156, threadinfo ffff810082bde000, task ffff810082309530)
last branch before last exception/interrupt
 from  [<ffffffff804ad9e0>] _write_lock_irq+0x14/0x15
 to  [<ffffffff80290630>] fasync_helper+0x6b/0xe4
Stack:  0000000400000004 0000000000000000 0000000000000000 ffff810181875cc0
 0000000000000004 ffff810182b3d238 ffff810082bdfee8 ffffffff802939cc
 ffff810082bdfe18 0000000000000000 0000000000000000 ffff810082bdfe10
Call Trace:
 [<ffffffff802939cc>] fcntl_setlease+0x99/0x101
 [<ffffffff80290370>] sys_fcntl+0x2a3/0x2ce
 [<ffffffff802b14cf>] compat_sys_fcntl64+0x2ee/0x2ff
 [<ffffffff80224292>] ia32_sysret+0x0/0xa
DWARF2 unwinder stuck at ia32_sysret+0x0/0xa
Leftover inexact backtrace:
Code: 49 8b 34 24 4c 89 e2 48 85 f6 74 2a 4c 39 7e 10 75 1a 45 85 
RIP  [<ff...
To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 9:10 am

Please, try with this patch too:

diff --git a/fs/locks.c b/fs/locks.c
index c0fe71a..f599508 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1423,7 +1423,7 @@ int generic_setlease(struct file *filp, 
 	locks_copy_lock(new_fl, lease);
 	locks_insert_lock(before, new_fl);
 
-	*flp = fl;
+	*flp = new_fl;
 	return 0;
 
 out:

-
To: Pavel Emelyanov <xemul@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 9:21 am

Hi, Pavel,

You did not signoff on the patch.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-
To: <balbir@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 11:34 am

I did not, but this is just a patch to test. I know that it
most likely fixes the problem, but since Kamalesh didn't tell
us how he had triggered it, I'd like him to Ack it :)

Thanks,
Pavel


-
To: Pavel Emelyanov <xemul@...>
Cc: Kamalesh Babulal <kamalesh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 12:10 pm

Ok, I just wanted to let you know, in case you missed it and
Andrew picks the patch up. That's all!


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>, Jens Axboe <jens.axboe@...>
Date: Monday, September 24, 2007 - 8:47 am

/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c: In function `dasd_eckd_build_cp':
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1181: error: syntax error before "struct"
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: `iter' undeclared (first use in this function)
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: (Each undeclared identifier is reported only once
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: for each function it appears in.)
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: error: `bv' undeclared (first use in this function)
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: left-hand operand of comma expression has no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: left-hand operand of comma expression has no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: left-hand operand of comma expression has no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: left-hand operand of comma expression has no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: statement with no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1209: warning: statement with no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: statement with no effect
/home/clg/linux/2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c:1257: warning: statement with no effect
make[3]: *** [drivers/s390/block/dasd_eckd.o] Error 1
make[2]: *** [drivers/s390/block] Error 2

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
---
 drivers/s390/block/dasd_eckd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: 2.6.23-rc7-mm1/drivers/s390/block/dasd_eckd.c
===================================================================
--- 2.6.23-rc7-mm1.orig/drivers/s390/block/dasd_eckd.c...
To: Cedric Le Goater <clg@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Heiko Carstens <heiko.carstens@...>
Date: Monday, September 24, 2007 - 12:56 pm

Oops, looks like neither Neil nor I cross compiled this on s390. Thanks,
I'll apply it.

-- 
Jens Axboe

-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <linuxppc-dev@...>
Date: Monday, September 24, 2007 - 8:35 am

Seeing the following from an older power LPAR, pretty sure we had
this in the previous -mm also:

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000047ac8
cpu 0x0: Vector: 300 (Data Access) at [c00000000058f750]
    pc: c000000000047ac8: .pSeries_log_error+0x364/0x420
    lr: c000000000047a4c: .pSeries_log_error+0x2e8/0x420
    sp: c00000000058f9d0
   msr: 8000000000001032
   dar: 0
 dsisr: 42000000
  current = 0xc0000000004a9b30
  paca    = 0xc0000000004aa700
    pid   = 0, comm = swapper
enter ? for help
[c00000000058faf0] c000000000021164 .rtas_call+0x200/0x250
[c00000000058fba0] c000000000049cd0 .early_enable_eeh+0x168/0x360
[c00000000058fc70] c00000000002f674 .traverse_pci_devices+0x8c/0x138
[c00000000058fd10] c000000000460ce8 .eeh_init+0x1a8/0x200
[c00000000058fdb0] c00000000045fb70 .pSeries_setup_arch+0x128/0x234
[c00000000058fe40] c00000000044f830 .setup_arch+0x214/0x24c
[c00000000058fee0] c000000000446a38 .start_kernel+0xd4/0x3e4
[c00000000058ff90] c000000000373194 .start_here_common+0x54/0x58

This machine is a:

processor       : 0
cpu             : POWER4+ (gq)
clock           : 1703.965296MHz
revision        : 19.0

[...]

timebase        : 212995662
machine         : CHRP IBM,7040-681

-apw
-
To: Andy Whitcroft <apw@...>
Cc: Andrew Morton <akpm@...>, <linuxppc-dev@...>, <linux-kernel@...>
Date: Tuesday, October 2, 2007 - 7:28 pm

I haven't forgotten about this ... and am looking at it now.
Seems that whenever I go to reserve the machine pSeries-102,
someone else is using it :-)

--linas
-
To: Linas Vepstas <linas@...>
Cc: Andy Whitcroft <apw@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Tuesday, October 2, 2007 - 8:26 pm

This panic is caused by "[POWERPC] pseries: Fix jumbled no_logging flag."
(79c0108d1b9db4864ab77b2a95dfa04f2dcf264c), in the powerpc/for-2.6.24
branch.  It looks to me that we have logging enabled too early now.

I think the following is a reasonable fix?

---
Explicitly enable RTAS error logging, when it should be ready.


Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>

---

 arch/powerpc/platforms/pseries/rtasd.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
index 30925d2..0df5d0d 100644
--- a/arch/powerpc/platforms/pseries/rtasd.c
+++ b/arch/powerpc/platforms/pseries/rtasd.c
@@ -54,7 +54,10 @@ static unsigned int rtas_event_scan_rate;
 static int full_rtas_msgs = 0;
 
 /* Stop logging to nvram after first fatal error */
-static int no_more_logging;
+static int no_more_logging = 1; /* Until we initialize everything,
+                                 * make sure we don't try logging
+                                 * anything */
+
 
 static int error_log_cnt;
 
@@ -414,6 +417,8 @@ static int rtasd(void *unused)
 	memset(logdata, 0, rtas_error_log_max);
 	rc = nvram_read_error_log(logdata, rtas_error_log_max,
 	                          &err_type, &error_log_cnt);
+	/* We can use rtas_log_buf now */
+	no_more_logging = 0;
 
 	if (!rc) {
 		if (err_type != ERR_FLAG_ALREADY_LOGGED) {

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!

-
To: Tony Breeds <tony@...>
Cc: Linas Vepstas <linas@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Tuesday, October 2, 2007 - 8:30 pm

I realise it'll make the patch bigger, but this doesn't seem like a
particularly good name for the variable anymore.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
To: Michael Ellerman <michael@...>
Cc: Linas Vepstas <linas@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Tuesday, October 2, 2007 - 9:19 pm

Sure, what about?

Clarify when RTAS logging is enabled.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>

---
 arch/powerpc/platforms/pseries/rtasd.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/rtasd.c b/arch/powerpc/platforms/pseries/rtasd.c
index 30925d2..73401c8 100644
--- a/arch/powerpc/platforms/pseries/rtasd.c
+++ b/arch/powerpc/platforms/pseries/rtasd.c
@@ -54,8 +54,9 @@ static unsigned int rtas_event_scan_rate;
 static int full_rtas_msgs = 0;
 
 /* Stop logging to nvram after first fatal error */
-static int no_more_logging;
-
+static int logging_enabled; /* Until we initialize everything,
+                             * make sure we don't try logging
+                             * anything */
 static int error_log_cnt;
 
 /*
@@ -217,7 +218,7 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
 	}
 
 	/* Write error to NVRAM */
-	if (!no_more_logging && !(err_type & ERR_FLAG_BOOT))
+	if (logging_enabled && !(err_type & ERR_FLAG_BOOT))
 		nvram_write_error_log(buf, len, err_type, error_log_cnt);
 
 	/*
@@ -229,8 +230,8 @@ void pSeries_log_error(char *buf, unsigned int err_type, int fatal)
 		printk_log_rtas(buf, len);
 
 	/* Check to see if we need to or have stopped logging */
-	if (fatal || no_more_logging) {
-		no_more_logging = 1;
+	if (fatal || !logging_enabled) {
+		logging_enabled = 0;
 		spin_unlock_irqrestore(&rtasd_log_lock, s);
 		return;
 	}
@@ -302,7 +303,7 @@ static ssize_t rtas_log_read(struct file * file, char __user * buf,
 
 	spin_lock_irqsave(&rtasd_log_lock, s);
 	/* if it's 0, then we know we got the last one (the one in NVRAM) */
-	if (rtas_log_size == 0 && !no_more_logging)
+	if (rtas_log_size == 0 && logging_enabled)
 		nvram_clear_error_log();
 	spin_unlock_irqrestore(&rtasd_log_lock, s);
 
@@ -414,6 +415,8 @@ static int rtasd(void *unused)
 	memset(logdata, 0, rtas_err...
To: Tony Breeds <tony@...>
Cc: Michael Ellerman <michael@...>, Linas Vepstas <linas@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, October 4, 2007 - 8:01 pm

For what it's worth, on a different ppc64 box, this resolves a similar
panic for me.

Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>

Thanks,
Nish
-
To: Nish Aravamudan <nish.aravamudan@...>
Cc: Tony Breeds <tony@...>, Michael Ellerman <michael@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Friday, October 5, 2007 - 12:03 pm

For the reasons explained, I'd really like to nack Tony's patch.

--linas
-
To: Linas Vepstas <linas@...>
Cc: Tony Breeds <tony@...>, Michael Ellerman <michael@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Sunday, October 7, 2007 - 11:47 pm

I see. Can you reply in this thread with the patch you mentioned in
your other reply? (or point me to a copy of it)

Thanks,
Nish
-
To: Tony Breeds <tony@...>
Cc: Linas Vepstas <linas@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Wednesday, October 3, 2007 - 12:09 am

What exactly happens that allows us to do logging? I don't see any
ordering between anything else and the setting of the flag, and AFAICT
we're not inside a spinlock or anything here.
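
The ordering concern above can be illustrated with a minimal userspace sketch. It assumes C11 atomics as a stand-in for whatever barrier or lock the kernel code would actually use; `log_buf`, `logging_ready`, `enable_logging()` and `get_log_buf()` are hypothetical names, not symbols from rtasd.c. The writer sets up the buffer first and then sets the flag with release semantics, so a reader that observes the flag with acquire semantics also observes the buffer.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from rtasd.c. */
static char buf_storage[64];
static char *log_buf;              /* plain pointer, published via the flag */
static atomic_int logging_ready;   /* 0 until the buffer is usable */

/* Writer side: set up the buffer first, then publish the flag. */
void enable_logging(void)
{
	log_buf = buf_storage;
	/* release: the store to log_buf is visible before the flag is */
	atomic_store_explicit(&logging_ready, 1, memory_order_release);
}

/* Reader side: returns the buffer, or NULL if logging is not ready yet. */
char *get_log_buf(void)
{
	/* acquire: pairs with the release store in enable_logging() */
	if (!atomic_load_explicit(&logging_ready, memory_order_acquire))
		return NULL;
	return log_buf;
}
```

Without the release/acquire pairing (or a lock around both sides), nothing stops another CPU from seeing the flag set before it sees the buffer pointer.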

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
To: Michael Ellerman <michael@...>
Cc: Tony Breeds <tony@...>, <linuxppc-dev@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Wednesday, October 3, 2007 - 2:50 pm

Until we allocate the error log buffer. The original crash was 
for a null-pointer deref of the unallocated buffer. I just sent 
out a patch to fix this; it's a bit simpler than the below.

In that email, I remarked:

Andy Whitcroft's crash was apparently due to firmware complaining
about lost power (actually, lost power supply redundancy!), which
occurred very early during boot.

    Type                00000040 (EPOW)
    Status:             bypassed new
    Residual error from previous boot.
    EPOW Sensor Value:  00000002
    EPOW warning due to loss of redundancy.
    EPOW general power fault.

I've no clue why firmware thought it was OK to report this
during one of the earliest calls to RTAS; I'm still investigating
that.
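
Since the crash is described as a NULL dereference of a buffer that had not been allocated yet, the general shape of a guard against it can be sketched as below. This is a hypothetical userspace illustration, not the patch referred to in this message; `log_init()`, `log_error()` and `log_buf` are assumed names, and `malloc()` stands in for the kernel allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not the symbols used in rtasd.c. */
static char *log_buf;          /* NULL until log_init() has run */
static size_t log_buf_size;

/* Allocate the log buffer. Returns 0 on success, -1 on failure. */
int log_init(size_t size)
{
	log_buf = malloc(size);
	if (!log_buf)
		return -1;
	log_buf_size = size;
	return 0;
}

/*
 * Store an error record. Returns the number of bytes stored, or -1 when
 * called before log_init(): the event is dropped instead of writing
 * through a NULL pointer.
 */
int log_error(const char *msg, size_t len)
{
	if (!log_buf)
		return -1;
	if (len > log_buf_size)
		len = log_buf_size;	/* truncate to the buffer size */
	memcpy(log_buf, msg, len);
	return (int)len;
}
```

With a guard like this, an event reported before initialization is lost rather than fatal, whatever the reason firmware raised it that early.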

--linas
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, David Brownell <dbrownell@...>, <linux-usb-devel@...>
Date: Monday, September 24, 2007 - 8:33 am

Fine, but on some boots (I noticed this on rc6-mm1 too, but not before):
0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001

# lspci -vns 0000:00:1a.7
00:1a.7 0c03: 8086:293c (rev 02) (prog-if 20 [EHCI])
        Subsystem: 8086:293c
        Flags: bus master, medium devsel, latency 0, IRQ 19
        Memory at ffa7b400 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port
        Capabilities: [98] Vendor Specific Information

# lspci -vns 0000:00:1d.7
00:1d.7 0c03: 8086:293a (rev 02) (prog-if 20 [EHCI])
        Subsystem: 8086:293a
        Flags: bus master, medium devsel, latency 0, IRQ 23
        Memory at ffa7b000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port
        Capabilities: [98] Vendor Specific Information

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-
To: Jiri Slaby <jirislaby@...>
Cc: Andrew Morton <akpm@...>, David Brownell <dbrownell@...>, <linux-kernel@...>, <linux-usb-devel@...>
Date: Monday, September 24, 2007 - 10:41 am

Any changes in your BIOS setup?

What about with vanilla 2.6.23-rc6?  Or vanilla 2.6.23-rc7?

The USB part of the code here hasn't changed in quite a while.  Any 
difference in behavior must be the result of changes in some other part 
of the kernel.  Possibly ACPI.

This might be a good job for git-bisect.

Alan Stern

-
To: Alan Stern <stern@...>
Cc: Andrew Morton <akpm@...>, David Brownell <dbrownell@...>, <linux-kernel@...>, <linux-usb-devel@...>
Date: Monday, September 24, 2007 - 2:45 pm

unlikely, but still possible -- I've made some changes in BIOS recently when I

Ok, I'll play with that little bit.

thanks,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-
To: Jiri Slaby <jirislaby@...>
Cc: Andrew Morton <akpm@...>, David Brownell <dbrownell@...>, <linux-kernel@...>, <linux-usb-devel@...>
Date: Monday, September 24, 2007 - 3:06 pm

USB Legacy Support is about the only change which springs to mind.  But 
who knows...  A buggy BIOS could do almost anything.

Alan Stern

-
To: Alan Stern <stern@...>
Cc: Andrew Morton <akpm@...>, David Brownell <dbrownell@...>, <linux-kernel@...>, <linux-usb-devel@...>
Date: Monday, September 24, 2007 - 3:18 pm

Hmm, I have USB legacy keyboard support switched on so that I can type in grub
and the BIOS.

I booted 23-rc7 4 times and the latest -mm 3 times just now and can't reproduce
it. I just wonder what this depends on.

regards,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-
To: Jiri Slaby <jirislaby@...>
Cc: Andrew Morton <akpm@...>, David Brownell <dbrownell@...>, <linux-kernel@...>, <linux-usb-devel@...>
Date: Monday, September 24, 2007 - 3:41 pm

Warm boot vs. cold boot, maybe.

Alan Stern

-
To: Alan Stern <stern@...>
Cc: Andrew Morton <akpm@...>, David Brownell <dbrownell@...>, <linux-kernel@...>, <linux-usb-devel@...>
Date: Sunday, September 30, 2007 - 4:26 am

Hmm, no. I don't know; I haven't seen it again so far (using rc8-mm2). I'll keep
an eye on it anyway.

thanks,
-- 
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Martin Schwidefsky <schwidefsky@...>, Heiko Carstens <heiko.carstens@...>, <linux390@...>, <linux-s390@...>
Date: Monday, September 24, 2007 - 8:32 am

Getting compile errors on S390:

    CC      arch/s390/mm/cmm.o
  arch/s390/mm/cmm.c: In function `cmm_init':
  arch/s390/mm/cmm.c:431: error: implicit declaration of function
  				`register_oom_notifier'
  arch/s390/mm/cmm.c:443: error: implicit declaration of function
  				`unregister_oom_notifier'
  make[1]: *** [arch/s390/mm/cmm.o] Error 1
  make: *** [arch/s390/mm] Error 2

-apw
-
To: Andy Whitcroft <apw@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, Martin Schwidefsky <schwidefsky@...>, Heiko Carstens <heiko.carstens@...>, <linux390@...>, <linux-s390@...>, David Rientjes <rientjes@...>
Date: Monday, September 24, 2007 - 8:49 am

yes. It's from oom-move-prototypes-to-appropriate-header-file.patch.

I think this patch fixes it.

C.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
---
 arch/s390/mm/cmm.c |    1 +
 1 file changed, 1 insertion(+)

Index: 2.6.23-rc7-mm1/arch/s390/mm/cmm.c
===================================================================
--- 2.6.23-rc7-mm1.orig/arch/s390/mm/cmm.c
+++ 2.6.23-rc7-mm1/arch/s390/mm/cmm.c
@@ -17,6 +17,7 @@
 #include <linux/ctype.h>
 #include <linux/swap.h>
 #include <linux/kthread.h>
+#include <linux/oom.h>
 
 #include <asm/pgalloc.h>
 #include <asm/uaccess.h>
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 7:42 am

Hi Andrew,

The kernel build fails with 

  CC      arch/ia64/kernel/efi.o
arch/ia64/kernel/efi.c: In function 'efi_memmap_init':
arch/ia64/kernel/efi.c:1088: error: 'total_memory' undeclared (first use in this function)
arch/ia64/kernel/efi.c:1088: error: (Each undeclared identifier is reported only once
arch/ia64/kernel/efi.c:1088: error: for each function it appears in.)
make[1]: *** [arch/ia64/kernel/efi.o] Error 1
make: *** [arch/ia64/kernel] Error 2

The use-extended-crashkernel-command-line-on-ia64.patch uses total_mem but
returns total_memory.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
---
--- linux-2.6.23-rc7/arch/ia64/kernel/efi.c     2007-09-24 15:28:06.000000000 +0530
+++ linux-2.6.23-rc7/arch/ia64/kernel/~efi.c    2007-09-24 16:56:03.000000000 +0530
@@ -1085,7 +1085,7 @@ efi_memmap_init(unsigned long *s, unsign
        *s = (u64)kern_memmap;
        *e = (u64)++k;
 
-       return total_memory;
+       return total_mem;
 }
 
 void

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
-
To: Andrew Morton <akpm@...>
Cc: roel <12o3l@...>, <linux-kernel@...>, <bluez-devel@...>, <maxk@...>, Marcel Holtmann <marcel@...>
Date: Monday, September 24, 2007 - 7:30 am

This patch:
- makes hidp_setup_input() return int to indicate errors;
- checks its return value to handle errors.

And this time it is against -rc7-mm1 tree.

Thanks to roel and Marcel Holtmann for comments.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>

---
 net/bluetooth/hidp/core.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

Index: linux-2.6.23-rc7-mm1/net/bluetooth/hidp/core.c
===================================================================
--- linux-2.6.23-rc7-mm1.orig/net/bluetooth/hidp/core.c
+++ linux-2.6.23-rc7-mm1/net/bluetooth/hidp/core.c
@@ -625,7 +625,7 @@ static struct device *hidp_get_device(st
 	return conn ? &conn->dev : NULL;
 }
 
-static inline void hidp_setup_input(struct hidp_session *session, struct hidp_connadd_req *req)
+static inline int hidp_setup_input(struct hidp_session *session, struct hidp_connadd_req *req)
 {
 	struct input_dev *input = session->input;
 	int i;
@@ -669,7 +669,7 @@ static inline void hidp_setup_input(stru
 
 	input->event = hidp_input_event;
 
-	input_register_device(input);
+	return input_register_device(input);
 }
 
 static int hidp_open(struct hid_device *hid)
@@ -822,8 +822,11 @@ int hidp_add_connection(struct hidp_conn
 	session->flags   = req->flags & (1 << HIDP_BLUETOOTH_VENDOR_ID);
 	session->idle_to = req->idle_to;
 
-	if (session->input)
-		hidp_setup_input(session, req);
+	if (session->input) {
+		err = hidp_setup_input(session, req);
+		if (err < 0)
+			goto failed;
+	}
 
 	if (session->hid)
 		hidp_setup_hid(session, req);
-
To: WANG Cong <xiyou.wangcong@...>
Cc: Andrew Morton <akpm@...>, roel <12o3l@...>, <linux-kernel@...>, <bluez-devel@...>, <maxk@...>
Date: Monday, September 24, 2007 - 6:18 pm

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Regards

Marcel


-
To: <marcel@...>
Cc: <xiyou.wangcong@...>, <akpm@...>, <12o3l@...>, <linux-kernel@...>, <bluez-devel@...>, <maxk@...>
Date: Wednesday, September 26, 2007 - 1:57 am

From: Marcel Holtmann <marcel@holtmann.org>

Applied, thanks.
-
To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Monday, September 24, 2007 - 6:35 am

It lived fast, it died young, it didn't leave a pretty corpse...

Something in the startup scripts did a 'touch', and ker-blam.

[   15.668000] Unable to handle kernel NULL pointer dereference at 0000000000000252 RIP: 
[   15.668000]  [<ffffffff802a1dd1>] __mnt_is_readonly+0x9/0x1e
[   15.668000] PGD 52be067 PUD 5645067 PMD 0 
[   15.668000] Oops: 0000 [1] PREEMPT SMP 
[   15.668000] last sysfs file: /block/dm-13/dev
[   15.668000] CPU 0 
[   15.668000] Modules linked in: rtc
[   15.668000] Pid: 528, comm: touch Not tainted 2.6.23-rc7-mm1 #1
[   15.668000] RIP: 0010:[<ffffffff802a1dd1>]  [<ffffffff802a1dd1>] __mnt_is_readonly+0x9/0x1e
[   15.668000] RSP: 0018:ffff8100045fddd8  EFLAGS: 00010202
[   15.668000] RAX: 0000000000000001 RBX: ffff810002c10680 RCX: 0000000000000001
[   15.668000] RDX: ffff810082504000 RSI: ffff810005243168 RDI: 0000000000000202
[   15.668000] RBP: ffff8100045fddd8 R08: 0000000000000001 R09: 0000000000000002
[   15.668000] R10: 0000000000000000 R11: ffff8100045fde68 R12: 0000000000000202
[   15.668000] R13: 00000000ffffffe2 R14: ffff8100052c1d80 R15: ffff8100039aa8a0
[   15.668000] FS:  00007f9527f596f0(0000) GS:ffffffff806b6000(0000) knlGS:0000000000000000
[   15.668000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   15.668000] CR2: 0000000000000252 CR3: 00000000052cb000 CR4: 00000000000006e0
[   15.668000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   15.668000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   15.668000] Process touch (pid: 528, threadinfo ffff8100045fc000, task ffff8100047517e0)
[   15.668000] last branch before last exception/interrupt
[   15.668000]  from  [<ffffffff802a4d1b>] mnt_want_write+0x44/0xb5
[   15.668000]  to  [<ffffffff802a1dc8>] __mnt_is_readonly+0x0/0x1e
[   15.668000] Stack:  ffff8100045fde08 ffffffff802a4d20 ffff8100045fddf8 0000000000000000
[   15.668000]  00000000fffffff7 ffff810005243140 ffff8100045fdf28 ffffffff802ad288
[   15.668000]  ...
To: <Valdis.Kletnieks@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 8:05 am

do_utimes passes an uninitialized vfsmount into mnt_want_write.  Here's
the quick fix (untested), but the right fix is to restructure the complete
mess that do_utimes is (never let a libc developer write your kernel code.. :)):


Index: linux-2.6.23-rc6/fs/utimes.c
===================================================================
--- linux-2.6.23-rc6.orig/fs/utimes.c	2007-09-24 14:02:24.000000000 +0200
+++ linux-2.6.23-rc6/fs/utimes.c	2007-09-24 14:03:57.000000000 +0200
@@ -59,6 +59,7 @@ long do_utimes(int dfd, char __user *fil
 	struct inode *inode;
 	struct iattr newattrs;
 	struct file *f = NULL;
+	struct vfsmount *mnt;
 
 	error = -EINVAL;
 	if (times && (!nsec_valid(times[0].tv_nsec) ||
@@ -79,17 +80,19 @@ long do_utimes(int dfd, char __user *fil
 		if (!f)
 			goto out;
 		dentry = f->f_path.dentry;
+		mnt = f->f_path.mnt;
 	} else {
 		error = __user_walk_fd(dfd, filename, (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW, &nd);
 		if (error)
 			goto out;
 
 		dentry = nd.dentry;
+		mnt = nd.mnt;
 	}
 
 	inode = dentry->d_inode;
 
-	error = mnt_want_write(nd.mnt);
+	error = mnt_want_write(mnt);
 	if (error)
 		goto dput_and_out;
 


-
To: Christoph Hellwig <hch@...>, Dave Hansen <haveblue@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Monday, September 24, 2007 - 8:58 am

Close - it still blew up, as one reference to nd.mnt remained.  Fixed patch
is appended - system boots all the way with this applied.

--- linux-2.6.23-rc7-mm1/fs/utimes.c.dist	2007-09-24 05:57:38.000000000 -0400
+++ linux-2.6.23-rc7-mm1/fs/utimes.c	2007-09-24 08:48:34.000000000 -0400
@@ -59,6 +59,7 @@ long do_utimes(int dfd, char __user *fil
 	struct inode *inode;
 	struct iattr newattrs;
 	struct file *f = NULL;
+	struct vfsmount *mnt;
 
 	error = -EINVAL;
 	if (times && (!nsec_valid(times[0].tv_nsec) ||
@@ -79,17 +80,19 @@ long do_utimes(int dfd, char __user *fil
 		if (!f)
 			goto out;
 		dentry = f->f_path.dentry;
+		mnt = f->f_path.mnt;
 	} else {
 		error = __user_walk_fd(dfd, filename, (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW, &nd);
 		if (error)
 			goto out;
 
 		dentry = nd.dentry;
+		mnt = nd.mnt;
 	}
 
 	inode = dentry->d_inode;
 
-	error = mnt_want_write(nd.mnt);
+	error = mnt_want_write(mnt);
 	if (error)
 		goto dput_and_out;
 
@@ -135,7 +138,7 @@ long do_utimes(int dfd, char __user *fil
 	error = notify_change(dentry, &newattrs);
 	mutex_unlock(&inode->i_mutex);
 mnt_drop_write_and_out:
-	mnt_drop_write(nd.mnt);
+	mnt_drop_write(mnt);
 dput_and_out:
 	if (f)
 		fput(f);
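
The pattern fixed by the patch above can be shown in miniature. This is a hypothetical userspace sketch, not kernel code: `struct mount`, `struct handle` and `struct lookup` stand in for `struct vfsmount`, `struct file` and `struct nameidata`, and `want_write()`/`drop_write()` are simplified counters. The point is that the mount is picked once, in whichever branch ran, and that same pointer is used for both the want and the drop call.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct vfsmount, struct file, struct nameidata. */
struct mount  { int writers; };
struct handle { struct mount *mnt; };	/* open file: carries its own mount */
struct lookup { struct mount *mnt; };	/* result of a path walk */

static int  want_write(struct mount *m) { m->writers++; return 0; }
static void drop_write(struct mount *m) { m->writers--; }

/*
 * Exactly one of f or path is expected to be non-NULL.
 * Returns 0 on success, -1 if neither is given.
 */
int update_times(struct handle *f, struct lookup *path)
{
	struct mount *mnt;
	int error;

	if (f)
		mnt = f->mnt;		/* fd case: no path walk was done */
	else if (path)
		mnt = path->mnt;	/* path case */
	else
		return -1;

	error = want_write(mnt);
	if (error)
		return error;

	/* ... the timestamp update itself would go here ... */

	drop_write(mnt);		/* same mount as passed to want_write() */
	return 0;
}
```

Reading the mount from the lookup result in the fd case, where the lookup never ran, is the uninitialized access described earlier in the thread; missing the second use at the drop call is what made the first version of the fix still blow up.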
To: <Valdis.Kletnieks@...>
Cc: Christoph Hellwig <hch@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Subject: