Re: [-mm patch] kernel/lockdep_proc.c: make 2 functions static

From: Andrew Morton
Date: Wednesday, May 30, 2007 - 11:58 pm

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc3/2.6.22-rc3-mm1/

- Merged the convert-cpusets-to-container-infrastructure patches.  These
  will probably be dropped and redone.



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.
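
As a rough illustration of the "around ten rebuilds" estimate in the bisection item above: each bisection step halves the range of candidate patches, so ten steps are enough to isolate one patch out of roughly a thousand. A minimal sketch in C, where demo_bisect_steps() is a made-up helper for this note and not part of any -mm tooling:

```c
#include <assert.h>

/* Number of halving steps needed to narrow n_patches down to one. */
unsigned int demo_bisect_steps(unsigned int n_patches)
{
	unsigned int steps = 0;

	assert(n_patches > 0);
	while (n_patches > 1) {
		n_patches = (n_patches + 1) / 2;	/* keep the larger half */
		steps++;
	}
	return steps;
}
```

With this helper, demo_bisect_steps(1000) evaluates to 10, which matches the order of magnitude quoted above.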



Changes since 2.6.22-rc2-mm1:


 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-drm.patch
 git-dvb.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ieee1394.patch
 git-input.patch
 git-kbuild.patch
 git-kvm.patch
 git-leds.patch
 git-libata-all.patch
 git-md-accel.patch
 ...
From: Cornelia Huck
Date: Thursday, May 31, 2007 - 5:09 am

On Wed, 30 May 2007 23:58:23 -0700,


scsi fails to build on !HAS_DMA architectures:


I split those functions out into a new file. Builds on s390 and i386.



scsi: Don't build scsi_dma_{map,unmap} for !HAS_DMA

Move scsi_dma_{map,unmap} into scsi_lib_dma.c which is only built
if HAS_DMA is set.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

---
 drivers/scsi/Kconfig        |    5 ++++
 drivers/scsi/Makefile       |    6 ++---
 drivers/scsi/scsi_lib.c     |   38 ---------------------------------
 drivers/scsi/scsi_lib_dma.c |   50 ++++++++++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_cmnd.h    |    2 +
 5 files changed, 60 insertions(+), 41 deletions(-)

--- linux-2.6.orig/drivers/scsi/Kconfig
+++ linux-2.6/drivers/scsi/Kconfig
@@ -10,6 +10,7 @@ config RAID_ATTRS
 config SCSI
 	tristate "SCSI device support"
 	depends on BLOCK
+	select SCSI_DMA if HAS_DMA
 	---help---
 	  If you want to use a SCSI hard disk, SCSI tape drive, SCSI CD-ROM or
 	  any other SCSI device under Linux, say Y and make sure that you know
@@ -29,6 +30,10 @@ config SCSI
 	  However, do not compile this as a module if your root file system
 	  (the one containing the directory /) is located on a SCSI device.
 
+config SCSI_DMA
+	bool
+	default n
+
 config SCSI_TGT
 	tristate "SCSI target support"
 	depends on SCSI && EXPERIMENTAL
--- linux-2.6.orig/drivers/scsi/Makefile
+++ linux-2.6/drivers/scsi/Makefile
@@ -145,9 +145,9 @@ obj-$(CONFIG_SCSI_DEBUG)	+= scsi_debug.o
 obj-$(CONFIG_SCSI_WAIT_SCAN)	+= scsi_wait_scan.o
 
 scsi_mod-y			+= scsi.o hosts.o scsi_ioctl.o constants.o \
-				   scsicam.o scsi_error.o scsi_lib.o \
-				   scsi_scan.o scsi_sysfs.o \
-				   scsi_devinfo.o
+				   scsicam.o scsi_error.o scsi_lib.o
+scsi_mod-$(CONFIG_SCSI_DMA)	+= scsi_lib_dma.o
+scsi_mod-y			+= scsi_scan.o scsi_sysfs.o scsi_devinfo.o
 scsi_mod-$(CONFIG_SCSI_NETLINK)	+= scsi_netlink.o
 scsi_mod-$(CONFIG_SYSCTL)	+= scsi_sysctl.o
 scsi_mod-$(CONFIG_SCSI_PROC_FS)	+= ...
From: Matthew Wilcox
Date: Thursday, May 31, 2007 - 5:15 am

Why not just put #ifdef CONFIG_HAS_DMA / #endif around the pair of
functions?  I don't see the need to add a new Kconfig symbol and a new
file for this.

-

From: Cornelia Huck
Date: Thursday, May 31, 2007 - 5:20 am

On Thu, 31 May 2007 06:15:57 -0600,

I prefer a new file over #ifdefs in c files. (New dma-dependent stuff
would also have a place where it could go to.)

But I'll do whatever ends up as consensus :)
-

From: Jeff Garzik
Date: Thursday, May 31, 2007 - 5:35 am

50 lines isn't much need for a new file.

	Jeff



-

From: Cornelia Huck
Date: Thursday, May 31, 2007 - 8:11 am

On Thu, 31 May 2007 08:35:13 -0400,

OK, so here's an alternative patch:


scsi: Don't build scsi_dma_{map,unmap} for !HAS_DMA

Use #ifdef CONFIG_HAS_DMA for the two dma-dependent functions.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

---
 drivers/scsi/scsi_lib.c  |    2 ++
 include/scsi/scsi_cmnd.h |    2 ++
 2 files changed, 4 insertions(+)

--- linux-2.6.orig/drivers/scsi/scsi_lib.c
+++ linux-2.6/drivers/scsi/scsi_lib.c
@@ -2291,6 +2291,7 @@ void scsi_kunmap_atomic_sg(void *virt)
 }
 EXPORT_SYMBOL(scsi_kunmap_atomic_sg);
 
+#ifdef CONFIG_HAS_DMA
 /**
  * scsi_dma_map - perform DMA mapping against command's sg lists
  * @cmd:	scsi command
@@ -2328,3 +2329,4 @@ void scsi_dma_unmap(struct scsi_cmnd *cm
 	}
 }
 EXPORT_SYMBOL(scsi_dma_unmap);
+#endif
--- linux-2.6.orig/include/scsi/scsi_cmnd.h
+++ linux-2.6/include/scsi/scsi_cmnd.h
@@ -135,8 +135,10 @@ extern void scsi_kunmap_atomic_sg(void *
 extern struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *, gfp_t);
 extern void scsi_free_sgtable(struct scatterlist *, int);
 
+#ifdef CONFIG_HAS_DMA
 extern int scsi_dma_map(struct scsi_cmnd *cmd);
 extern void scsi_dma_unmap(struct scsi_cmnd *cmd);
+#endif
 
 #define scsi_sg_count(cmd) ((cmd)->use_sg)
 #define scsi_sglist(cmd) ((struct scatterlist *)(cmd)->request_buffer)
-

From: Christoph Hellwig
Date: Thursday, May 31, 2007 - 8:13 am

The scsi core shouldn't know anything about dma mappings, so a separate
file is a good idea just to keep the separation clean.
-

From: Andrew Morton
Date: Thursday, May 31, 2007 - 3:10 pm

On Thu, 31 May 2007 16:13:38 +0100

ok, let's go this way.

Cornelia, afaict your patch has no actual dependency upon Dan's
dma-mapping-prevent-dma-dependent-code-from-linking-on.patch, correct?  If
so, I can merge it via James and then merge Dan's patch once James has
merged.

If there is a dependency then I guess I merge both into a single diff and
merge it all in one hit.

btw, this:

diff -puN include/scsi/scsi_cmnd.h~scsi-dont-build-scsi_dma_mapunmap-for-has_dma include/scsi/scsi_cmnd.h
--- a/include/scsi/scsi_cmnd.h~scsi-dont-build-scsi_dma_mapunmap-for-has_dma
+++ a/include/scsi/scsi_cmnd.h
@@ -135,8 +135,10 @@ extern void scsi_kunmap_atomic_sg(void *
 extern struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *, gfp_t);
 extern void scsi_free_sgtable(struct scatterlist *, int);
 
+#ifdef CONFIG_SCSI_DMA
 extern int scsi_dma_map(struct scsi_cmnd *cmd);
 extern void scsi_dma_unmap(struct scsi_cmnd *cmd);
+#endif
 
 #define scsi_sg_count(cmd) ((cmd)->use_sg)
 #define scsi_sglist(cmd) ((struct scatterlist *)(cmd)->request_buffer)

We don't really need the ifdefs here.  If someone incorrectly calls these
functions then they'll get a link-time failure anyway.  The downside of
removing these ifdefs is that they won't get a compile-time warning, but I
tend to think that this small cost is worth it.
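
To make the trade-off concrete, here is a small userspace sketch (not kernel code). The names demo_dma_map(), demo_dma_unmap(), demo_submit() and DEMO_HAS_DMA are invented for illustration; DEMO_HAS_DMA stands in for the real config symbol. The declarations carry no #ifdef, the definitions are only built when the symbol is set, and a caller that never references them still builds cleanly:

```c
#include <assert.h>

/* Hypothetical stand-in for the real config symbol. Left undefined here
 * to model an architecture without DMA support. */
/* #define DEMO_HAS_DMA 1 */

/* "Header" part: declared unconditionally, with no #ifdef around it. */
extern int demo_dma_map(int nents);
extern void demo_dma_unmap(int nents);

#ifdef DEMO_HAS_DMA
/* "DMA file" part: only built when DMA support is configured. */
int demo_dma_map(int nents) { return nents; }
void demo_dma_unmap(int nents) { (void)nents; }
#endif

/* A caller that only references the DMA helpers when they exist. */
int demo_submit(int nents)
{
	int mapped = 0;

	assert(nents >= 0);
#ifdef DEMO_HAS_DMA
	mapped = demo_dma_map(nents);
	demo_dma_unmap(nents);
#endif
	return mapped;	/* 0 on the non-DMA path */
}
```

A caller that did reference demo_dma_map() in this configuration would still compile and would then fail at link time with an undefined reference, which is the behavior described above.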
-

From: Cornelia Huck
Date: Friday, June 1, 2007 - 12:09 am

On Thu, 31 May 2007 15:10:05 -0700,


OK, fine with me.
-

From: Michal Piotrowski
Date: Thursday, May 31, 2007 - 8:29 am

Hi,


FYI suspend to disk doesn't work anymore on my box, system hangs after "Suspending console(s)" message.

[  186.297753] Shrinking memory...  -\|/-\|done (113064 pages freed)
[  187.841914] Freed 452256 kbytes in 1.54 seconds (293.67 MB/s)
[  187.847730] Suspending console(s)

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc3-mm1/console.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.22-rc3-mm1/mm-config

Regards,
Michal


-- 
"Najbardziej brakowało mi twojego milczenia."
-- Andrzej Sapkowski "Coś więcej"
-

From: Rafael J. Wysocki
Date: Thursday, May 31, 2007 - 12:58 pm

Hmm, that might be a couple of things, actually.

To see if the patches directly related to hibernation/suspend cause this, can
you please test 2.6.22-rc3 with the patch series at

http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc3/patches/

applied?

Greetings,
Rafael
-

From: Rafael J. Wysocki
Date: Thursday, May 31, 2007 - 2:30 pm

Ahem, I broke it. :-(

Andrew, the following fix is needed on top of
freezer-make-kernel-threads-nonfreezable-by-default.patch

---
From: Rafael J. Wysocki <rjw@sisk.pl>

migration_thread should not be freezable, or it will break hibernation and
suspend on SMP.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/sched.c |    3 ---
 1 file changed, 3 deletions(-)

Index: linux-2.6.22-rc3/kernel/sched.c
===================================================================
--- linux-2.6.22-rc3.orig/kernel/sched.c
+++ linux-2.6.22-rc3/kernel/sched.c
@@ -5157,13 +5157,10 @@ static int migration_thread(void *data)
 	BUG_ON(rq->migration_thread != current);
 
 	set_current_state(TASK_INTERRUPTIBLE);
-	set_freezable();
 	while (!kthread_should_stop()) {
 		struct migration_req *req;
 		struct list_head *head;
 
-		try_to_freeze();
-
 		spin_lock_irq(&rq->lock);
 
 		if (cpu_is_offline(cpu)) {
-

From: Michal Piotrowski
Date: Thursday, May 31, 2007 - 10:53 am

CPU hotplug test triggered this

[ 4972.038008] CPU 1 is now offline
[ 4972.041411] lockdep: not fixing up alternatives.
[ 4972.051553] 
[ 4972.051555] =================================
[ 4972.057562] [ INFO: inconsistent lock state ]
[ 4972.062056] 2.6.22-rc3-mm1 #10
[ 4972.065184] ---------------------------------
[ 4972.069663] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
[ 4972.075758] sh/702 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 4972.080554]  (&n->list_lock){++..}, at: [<c0181288>] add_partial+0xe/0x27

l *0xc0181288
0xc0181288 is in add_partial (/home/devel/linux-mm/mm/slub.c:1193).
1188    }
1189
1190    static void add_partial(struct kmem_cache_node *n, struct page *page)
1191    {
1192            spin_lock(&n->list_lock);
1193            n->nr_partial++;
1194            list_add(&page->lru, &n->partial);
1195            spin_unlock(&n->list_lock);
1196    }
1197


[ 4972.087650] {in-hardirq-W} state was registered at:
[ 4972.092656]   [<c0143bfe>] mark_lock+0x82/0x557
[ 4972.097323]   [<c0144c5f>] __lock_acquire+0x476/0xd36
[ 4972.102562]   [<c01455bd>] lock_acquire+0x9e/0xb8
[ 4972.107342]   [<c0348993>] _spin_lock+0x38/0x62
[ 4972.111993]   [<c0181d29>] deactivate_slab+0xb9/0x179
[ 4972.117300]   [<c0181e56>] flush_slab+0x6d/0x72
[ 4972.122063]   [<c0181e8c>] __flush_cpu_slab+0x31/0x36
[ 4972.127335]   [<c0181ea5>] flush_cpu_slab+0x14/0x17
[ 4972.132401]   [<c0113d6f>] smp_call_function_interrupt+0x3a/0x56
[ 4972.138607]   [<c0104c73>] call_function_interrupt+0x33/0x38
[ 4972.144503]   [<c0102b59>] default_idle+0x50/0x69
[ 4972.149421]   [<c01023eb>] cpu_idle+0xb3/0xf8
[ 4972.153889]   [<c03454fa>] rest_init+0x56/0x58
[ 4972.158402]   [<c04f39c7>] start_kernel+0x351/0x359
[ 4972.163450]   [<ffffffff>] 0xffffffff
[ 4972.167221] irq event stamp: 2451
[ 4972.170695] hardirqs last  enabled at (2451): [<c0104228>] restore_nocheck+0x12/0x15
[ 4972.178699] hardirqs last disabled at (2449): [<c012b4e9>] __do_softirq+0x93/0xe5
[ 4972.186393] softirqs last  ...
From: Michal Piotrowski
Date: Thursday, May 31, 2007 - 11:08 am

not guilty

Regards,
Michal

-- 
"Najbardziej brakowało mi twojego milczenia."
-- Andrzej Sapkowski "Coś więcej"
-

From: Andrew Morton
Date: Thursday, May 31, 2007 - 11:31 am

On Thu, 31 May 2007 19:53:07 +0200

Yep, that's a bug in slub.  We take that lock in the IPI handler.  If a CPU
is currently holding that lock and then takes the IPI and enters

Perhaps a suitable fix would be local_irq_disable() in flush_slab().
-

From: Christoph Lameter
Date: Thursday, May 31, 2007 - 11:41 am

add_partial runs with interrupts disabled. The interrupts are disabled 

A cpu cannot enter an IPI handler while interrupts are disabled. That 

As far as I can tell: Interrupts are always disabled when flush_slab is 
run. 

Sometimes we use spin_lock_irqsave for the list_lock and at other times 
spin_lock if interrupts are already disabled. Is that the problem?

-

From: Andrew Morton
Date: Thursday, May 31, 2007 - 11:53 am

On Thu, 31 May 2007 11:41:22 -0700 (PDT)

Nope, the problem is in the part of my email which you deleted ;)

[ 4972.243670] 
[ 4972.243670] stack backtrace:
[ 4972.248166]  [<c0105281>] dump_trace+0x63/0x1eb
[ 4972.252755]  [<c0105423>] show_trace_log_lvl+0x1a/0x2f
[ 4972.257969]  [<c0106061>] show_trace+0x12/0x14
[ 4972.262463]  [<c0106079>] dump_stack+0x16/0x18
[ 4972.266974]  [<c0142ff8>] print_usage_bug+0x140/0x14a
[ 4972.272109]  [<c0143e1a>] mark_lock+0x29e/0x557
[ 4972.276708]  [<c0144cda>] __lock_acquire+0x4f1/0xd36
[ 4972.281740]  [<c01455bd>] lock_acquire+0x9e/0xb8
[ 4972.286416]  [<c0348993>] _spin_lock+0x38/0x62
[ 4972.290936]  [<c0181288>] add_partial+0xe/0x27
[ 4972.295458]  [<c0181cd7>] deactivate_slab+0x67/0x179
[ 4972.300497]  [<c0181e56>] flush_slab+0x6d/0x72
[ 4972.305018]  [<c0181e8c>] __flush_cpu_slab+0x31/0x36
[ 4972.310049]  [<c01836e8>] slab_cpuup_callback+0x38/0x5b
[ 4972.315348]  [<c01325d2>] notifier_call_chain+0x2b/0x4a
[ 4972.320637]  [<c013261e>] __raw_notifier_call_chain+0x19/0x1e
[ 4972.326473]  [<c013263d>] raw_notifier_call_chain+0x1a/0x1c
[ 4972.332117]  [<c014bb62>] _cpu_down+0x19c/0x25a
[ 4972.336724]  [<c014bc48>] cpu_down+0x28/0x3a
[ 4972.341063]  [<c027f800>] store_online+0x27/0x5a
[ 4972.345757]  [<c027c854>] sysdev_store+0x20/0x25
[ 4972.350443]  [<c01c2739>] sysfs_write_file+0xc5/0xfd
[ 4972.355482]  [<c0187243>] vfs_write+0xd1/0x15a
[ 4972.360004]  [<c0187873>] sys_write+0x3d/0x72
[ 4972.364411]  [<c01041e0>] syscall_call+0x7/0xb
[ 4972.368924]  [<b7ff0410>] 0xb7ff0410
[ 4972.372562]  =======================
[ 4975.412963] lockdep: not fixing up alternatives.

we're not disabling local irqs on the cpu hotplug path.

Could do local_irq_disable() in slab_cpuup_callback(), I guess.
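
For readers unfamiliar with the "{in-hardirq-W} -> {hardirq-on-W}" report: the same lock was first seen taken from a hardirq handler (the IPI path) and later taken with hardirqs enabled (the hotplug path). The following is a toy userspace model of that bookkeeping, assuming a much simpler scheme than lockdep really uses; struct demo_lock_class and demo_mark_acquire() are invented names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Usage history for one class of lock, loosely modeled on lockdep. */
struct demo_lock_class {
	bool used_in_hardirq;	/* ever taken from hardirq context */
	bool used_hardirqs_on;	/* ever taken with hardirqs enabled */
};

/*
 * Record one acquisition. Returns false once the class has been taken
 * both from a hardirq handler and with hardirqs enabled.
 */
bool demo_mark_acquire(struct demo_lock_class *c, bool in_hardirq,
		       bool hardirqs_enabled)
{
	assert(c != NULL);
	if (in_hardirq)
		c->used_in_hardirq = true;
	else if (hardirqs_enabled)
		c->used_hardirqs_on = true;
	return !(c->used_in_hardirq && c->used_hardirqs_on);
}
```

In this model, an acquisition from the IPI handler followed by one from process context with interrupts on trips the check, while taking the lock from process context with interrupts off does not.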
-

From: Christoph Lameter
Date: Thursday, May 31, 2007 - 11:57 am

Ahh I see.


SLUB: Fix locking for hotplug callbacks.

Hotplug callbacks seem to be performed with interrupts enabled. Slub requires
interrupts to be disabled for flushing caches.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: slub/mm/slub.c
===================================================================
--- slub.orig/mm/slub.c	2007-05-31 11:49:48.000000000 -0700
+++ slub/mm/slub.c	2007-05-31 11:54:09.000000000 -0700
@@ -2663,6 +2663,19 @@ static void for_all_slabs(void (*func)(s
 }
 
 /*
+ * Version of __flush_cpu_slab for the case that interrupts
+ * are enabled.
+ */
+static void cpu_slab_flush(struct kmem_cache *s, int cpu)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__flush_cpu_slab(s, cpu);
+	local_irq_restore(flags);
+}
+
+/*
  * Use the cpu notifier to insure that the cpu slabs are flushed when
  * necessary.
  */
@@ -2676,7 +2689,7 @@ static int __cpuinit slab_cpuup_callback
 	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
-		for_all_slabs(__flush_cpu_slab, cpu);
+		for_all_slabs(cpu_slab_flush, cpu);
 		break;
 	default:
 		break;
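
The wrapper in the patch above follows the usual save/disable/restore shape. A userspace sketch of that shape, with a plain boolean standing in for the CPU interrupt flag (all demo_* names are made up; the real local_irq_save() and local_irq_restore() are kernel macros and are not used here):

```c
#include <assert.h>
#include <stdbool.h>

bool demo_irqs_enabled = true;		/* simulated interrupt-enable flag */
bool demo_flush_ran_with_irqs_off;	/* recorded by the flush stub below */

/* Simulated irq save: remember the old state, then disable. */
static void demo_irq_save(unsigned long *flags)
{
	*flags = demo_irqs_enabled ? 1UL : 0UL;
	demo_irqs_enabled = false;
}

/* Simulated irq restore: put back whatever state was saved. */
static void demo_irq_restore(unsigned long flags)
{
	demo_irqs_enabled = (flags != 0);
}

/* Stub for a flush routine that requires interrupts to be off. */
static void demo_flush(int cpu)
{
	assert(cpu >= 0);
	demo_flush_ran_with_irqs_off = !demo_irqs_enabled;
}

/* Wrapper in the shape of cpu_slab_flush() from the patch. */
void demo_cpu_slab_flush(int cpu)
{
	unsigned long flags;

	demo_irq_save(&flags);
	demo_flush(cpu);
	demo_irq_restore(flags);
}
```

Saving and restoring, rather than unconditionally re-enabling, means a caller that already had interrupts off gets them back in the same state.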
 
-

From: Mariusz Kozlowski
Date: Thursday, May 31, 2007 - 1:43 pm

Hello

	This is from iMac G3. The spufs_mem_mmap_fault() code looks bad
in arch/powerpc/platforms/cell/spufs/file.c but somehow I'm unable to find
the patch to blame hmm.

arch/powerpc/platforms/cell/spufs/file.c: In function 'spufs_mem_mmap_fault':
arch/powerpc/platforms/cell/spufs/file.c:122: error: 'address' undeclared (first use in this function)
arch/powerpc/platforms/cell/spufs/file.c:122: error: (Each undeclared identifier is reported only once
arch/powerpc/platforms/cell/spufs/file.c:122: error: for each function it appears in.)
arch/powerpc/platforms/cell/spufs/file.c:141: error: expected ';' before 'if'
arch/powerpc/platforms/cell/spufs/file.c:122: warning: unused variable 'addr0'
make[3]: *** [arch/powerpc/platforms/cell/spufs/file.o] Error 1
make[2]: *** [arch/powerpc/platforms/cell/spufs] Error 2
make[1]: *** [arch/powerpc/platforms/cell] Error 2

Regards,

	Mariusz
-

From: Andrew Morton
Date: Thursday, May 31, 2007 - 2:19 pm

On Thu, 31 May 2007 22:43:18 +0200

Yeah, that's the fix-fault-vs-invalidate-race patches, or my poor attempt
to fix them when spufs changed.  I suppose I'll have a poke at it next time
I get the powerpc machine fired up.
-

From: Mariusz Kozlowski
Date: Friday, June 1, 2007 - 1:50 pm

I #if 0'ed that piece of code inside spufs_mem_mmap_fault() and ran make again.
This is 'make allmodconfig && make' result:

ERROR: ".ps3av_set_hdr" [drivers/ps3/ps3av_cmd.ko] undefined!
ERROR: ".ps3av_do_pkt" [drivers/ps3/ps3av_cmd.ko] undefined!
ERROR: ".ps3_vuart_write" [drivers/ps3/ps3av_cmd.ko] undefined!
ERROR: ".ps3_vuart_read" [drivers/ps3/ps3av_cmd.ko] undefined!
ERROR: ".ps3av_cmd_fin" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_av_video_disable_sig" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_set_video_mode" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_video_get_monitor_info" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_audio_active" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_set_audio_mode" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_set_av_audio_param" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_vuart_read" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_av_get_hw_conf" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_av_video_mute" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_av_tv_mute" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_audio_mode" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_av_hdmi_mode" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_set_av_video_cs" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_vuart_write" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_audio_mute" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_avb_param" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_init" [drivers/ps3/ps3av.ko] undefined!
ERROR: ".ps3av_cmd_av_audio_mute" [drivers/ps3/ps3av.ko] undefined!
ERROR: "pmu_batteries" [drivers/power/pmu_battery.ko] undefined!
ERROR: "pmu_battery_count" [drivers/power/pmu_battery.ko] undefined!
ERROR: "pmu_power_flags" [drivers/power/pmu_battery.ko] undefined!
ERROR: "irq_map" [drivers/net/pasemi_mac.ko] undefined!
ERROR: "pmu_batteries" [drivers/macintosh/apm_emu.ko] undefined!
ERROR: "pmu_battery_count" ...
From: Andrew Morton
Date: Friday, June 1, 2007 - 2:02 pm

On Fri, 1 Jun 2007 22:50:58 +0200

Yeah, allmodconfig tends to fall over in a heap on a lot of the
less-lavishly-maintained architectures.  If any of these are specific to
-mm then I guess we should fix them up, prevent the kernel from actually
going backwards.



-

From: Mariusz Kozlowski
Date: Friday, June 1, 2007 - 2:21 pm

I recall compiling earlier versions of -mm on this iMac just fine a few months ago.
Now it looks like a bunch of new warnings appeared and it fails to compile for various
reasons in different places. I'm thinking of running -mm in the next few days so maybe
something interesting will come up :-)

BTW. This is 'make allnoconfig && make' result:

  MODPOST vmlinux
ln: accessing `arch/powerpc/boot/zImage': No such file or directory
make[1]: *** [arch/powerpc/boot/zImage] Error 1
make: *** [zImage] Error 2
-

From: Benjamin Herrenschmidt
Date: Friday, June 1, 2007 - 4:30 pm

Some of the latter seem to be related to the lack of CONFIG_PM .. it's
not so much a lavish maintainership issue as the fact that nobody ever
builds the powermac drivers without CONFIG_PM :-) I'll look into fixing
some of these.

As for the ps3 bits, it's a known problem, the ps3 support is still very
much a work in progress.

Cheers,
Ben.


-

From: Segher Boessenkool
Date: Saturday, June 2, 2007 - 1:40 am

To be fair, almost all of the powerpc allmodconfig build
problems are caused by x86-only drivers (and most of those
I doubt still work on x86, even).


Segher

-

From: Valdis.Kletnieks
Date: Thursday, May 31, 2007 - 3:05 pm

Builds, boots, seems to be behaving on my laptop (Dell D820, X86_64).

Meta-question: Is there a useful address/mailbox/webpage to toss *working*
reports at?  



From: Andrew Morton
Date: Thursday, May 31, 2007 - 3:16 pm

On Thu, 31 May 2007 18:05:04 -0400


ooh, don't know.  Nobody's ever had a -mm kernel which worked before ;)
-

From: Mark Fasheh
Date: Thursday, May 31, 2007 - 4:13 pm

Andrew, thanks for getting that back in there.


mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch broke ocfs2 shared
writable mmap. We hang on a page lock because ->page_mkwrite() is
being called with the page already locked:

+	/*
+	 * For consistency in subsequent calls, make the nopage_page always
+	 * locked.
+	 */
+	if (unlikely(!(vma->vm_flags & VM_CAN_INVALIDATE)))
+		lock_page(nopage_page);

It wasn't previously being called with the page lock held, intentionally.
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com
-

From: Nick Piggin
Date: Thursday, May 31, 2007 - 6:01 pm

Ah, I didn't realise you were using that yet. I expect ocfs2 is using
VM_CAN_INVALIDATE there anyway.

Hmm, this becomes easier to deal with after page_mkwrite is merged with
->fault. But for now, can we just lock the page at the do_wp_page site
as well, and change the API? All users I have seen want the page locked
there anyway...

-

From: Mark Fasheh
Date: Thursday, May 31, 2007 - 6:24 pm

Unfortunately that doesn't work for ocfs2 for exactly the same reasons page
lock doesn't work during a write either - there's a cluster lock inversion
and we might have to zero adjacent pages for an allocating write.

What's involved in merging it with ->fault?

Here's a nasty idea... Would it be valid for ->page_mkwrite to unlock the
page, so long as it's returned in a locked state? Though, do we even need
the page lock that early? It seemed to me that you were adding it for
consistency reasons (I could be wrong though).
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com
-

From: Nick Piggin
Date: Thursday, May 31, 2007 - 6:34 pm

I guess you could just fail the page_mkwrite and have it try again? ..

I just haven't sent the patch (because at the time there were no
page_mkwrite users to look at).

It is nicer too, because the nopage path only has to call into
the filesystem once, to return the page (the filesystem can check
whether it is for write, and do the page_mkwrite thing at that
time). do_wp_page obviously still involves the extra call, and
that will be with a flag telling the fs that it isn't a "nopage"

You could do that, but you'd have to probably check that it is
within i_size after you relock it, I think... yeah, that might
be the best thing for ocfs to do for now.
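
A sketch of the kind of recheck meant here, written as plain userspace C. The structures and the demo_page_still_valid() helper are invented for illustration; the mapping comparison is an added assumption on top of the i_size check mentioned above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096L

struct demo_inode {
	long i_size;			/* file size in bytes */
};

struct demo_cached_page {
	struct demo_inode *mapping;	/* NULL once truncated away */
	long index;			/* offset in the file, in pages */
};

/*
 * After retaking the page lock, confirm the page still belongs to the
 * file and still starts inside i_size before treating it as writable.
 */
bool demo_page_still_valid(const struct demo_cached_page *page,
			   const struct demo_inode *inode)
{
	assert(page != NULL && inode != NULL);
	if (page->mapping != inode)
		return false;
	return page->index * DEMO_PAGE_SIZE < inode->i_size;
}
```

If the check fails, the caller would back out rather than mark the page writable; how it backs out is left open in this sketch.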
-

From: Mark Fasheh
Date: Thursday, May 31, 2007 - 6:45 pm

Well, ocfs2 already does i_size checks in page_mkwrite, so we're covered
with respect to truncate races.

I'm still not clear though - what was the reason for adding the page locking
there in the 1st place?
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com
-

From: Nick Piggin
Date: Thursday, May 31, 2007 - 6:53 pm

Yeah, it's to cover page invalidation races. There is a description in
an earlier patch's changelog.

-

From: Mark Fasheh
Date: Thursday, May 31, 2007 - 10:20 pm

Ok. So how about the attached patch? It's a bit different than discussed,
but I think it's much cleaner because it preserves the current behavior of
the callback and keeps that bit of page locking inside core code. Not tested
as of yet, but I can run it tomorrow.
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com

From: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] Release page lock before calling ->page_mkwrite

__do_fault() was calling ->page_mkwrite() with the page lock held, which
violates the locking rules for that callback. Release and retake the page
lock around the callback to avoid deadlocking file systems which manually
take it.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
---
 mm/memory.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7221618..491cc27 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2378,11 +2378,14 @@ static int __do_fault(struct mm_struct *
 			 * address space wants to know that the page is about
 			 * to become writable
 			 */
-			if (vma->vm_ops->page_mkwrite &&
-			    vma->vm_ops->page_mkwrite(vma, page) < 0) {
-				fdata.type = VM_FAULT_SIGBUS;
-				anon = 1; /* no anon but release faulted_page */
-				goto out;
+			if (vma->vm_ops->page_mkwrite) {
+				unlock_page(page);
+				if (vma->vm_ops->page_mkwrite(vma, page) < 0) {
+					fdata.type = VM_FAULT_SIGBUS;
+					anon = 1; /* no anon but release faulted_page */
+					goto out_unlocked;
+				}
+				lock_page(page);
 			}
 		}
 
@@ -2434,6 +2437,7 @@ static int __do_fault(struct mm_struct *
 
 out:
 	unlock_page(faulted_page);
+out_unlocked:
 	if (anon)
 		page_cache_release(faulted_page);
 	else if (dirty_page) {
-- 
1.4.2.3
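
The deadlock and the fix can be modeled in userspace with a non-recursive lock flag. This is only a sketch: struct demo_page and the demo_* functions are invented, and a failed trylock stands in for the point where a real blocking lock_page() would hang:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_page {
	bool locked;
};

/* Non-recursive lock: a second acquisition fails instead of blocking. */
bool demo_trylock_page(struct demo_page *p)
{
	assert(p != NULL);
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

void demo_unlock_page(struct demo_page *p)
{
	assert(p != NULL);
	p->locked = false;
}

/* Hypothetical filesystem callback that takes the page lock itself.
 * Returns -1 where a real blocking lock would hang forever. */
int demo_page_mkwrite(struct demo_page *p)
{
	if (!demo_trylock_page(p))
		return -1;
	demo_unlock_page(p);
	return 0;
}

/* Fault path holding the page lock; drop_lock selects the patched flow. */
int demo_do_fault(struct demo_page *p, bool drop_lock)
{
	bool held = demo_trylock_page(p);	/* core code takes the lock */
	int ret;

	assert(held);
	if (drop_lock) {
		demo_unlock_page(p);		/* release before the callback */
		held = false;
	}
	ret = demo_page_mkwrite(p);
	if (drop_lock && ret == 0)
		held = demo_trylock_page(p);	/* retake after the callback */
	if (held)
		demo_unlock_page(p);		/* corresponds to "out:" */
	return ret;
}
```

Calling demo_do_fault() with drop_lock set to false reproduces the self-deadlock as a -1 return; with drop_lock set to true the callback succeeds and the page ends up unlocked.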

-

From: Mark Fasheh
Date: Friday, June 1, 2007 - 3:01 pm

Ok - this patch seems to check out fine in testing - no more deadlocking.

Andrew, if this is ok with you I'd really like to see that fix in -mm. Ocfs2
mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch I think we're
pretty safe (as I noted before) because Ocfs2 re-checks the mapping under
lock to protect against truncate races. That's been an "unwritten"
requirement of page_mkwrite() anyway.

Speaking of requirements, attached is my sad attempt at documenting the API.
I know it might be merged into ->fault at some point, but we really ought to
have _something_ in the meantime.
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com


From: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] Document ->page_mkwrite() locking

There seems to be very little documentation about this callback in general.
The locking in particular is a bit tricky, so it's worth having this in
writing.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

---

 Documentation/filesystems/Locking |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

2320eadfa34199c779638edbdbb6c491df09c49b
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 970c8ec..91ec4b4 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -512,13 +512,22 @@ prototypes:
 	void (*close)(struct vm_area_struct*);
 	struct page *(*fault)(struct vm_area_struct*, struct fault_data *);
 	struct page *(*nopage)(struct vm_area_struct*, unsigned long, int *);
+	int (*page_mkwrite)(struct vm_area_struct *, struct page *);
 
 locking rules:
-		BKL	mmap_sem
+		BKL	mmap_sem	PageLocked(page)
 open:		no	yes
 close:		no	yes
 fault:		no	yes
 nopage:		no	yes
+page_mkwrite:	no	yes		no
+
+	->page_mkwrite() is called when a previously read-only page is
+about to become writeable. The file system is responsible for
+protecting against truncate races. Once appropriate action has been
+taking to lock out truncate, ...
From: Andrew Morton
Date: Friday, June 1, 2007 - 3:25 pm

On Fri, 1 Jun 2007 15:01:18 -0700

ug, OK.  I get a ginormous reject when merging ocfs2 on Nick's stuff which
I've been largely ignoring thus far.

Perhaps I need to go back to staging Nick's stuff after the git trees.  I'll
take a look.
-

From: Mark Fasheh
Date: Friday, June 1, 2007 - 3:33 pm

Huh, I'm a bit confused... I created this patch on top of 2.6.22-rc3-mm1
which most certainly contains a merge of git-ocfs2.patch and the series
which at least contains
mm-fix-fault-vs-invalidate-race-for-linear-mappings.patch.

So, which of Nick's patches are we talking about here?

Btw, I know you tend to handle rejects yourself, but if it's a major PITA
I'd be happy to help out. Boy, I'm hoping I didn't just ask for a load of
trouble there :)
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com
-

From: Andrew Morton
Date: Friday, June 1, 2007 - 3:47 pm

On Fri, 1 Jun 2007 15:33:02 -0700

Right.  I did a lot of tricksy work for rc3-mm1 to merge git-ocfs2 on top
of Nick's stuff.  Then I repulled your tree and lost it all.  This is
because I was dumb and I fixed rc3-mm1's git-ocfs.patch rather than doing a
separate fix-rejects-in-git-ocfs2.patch.


Is OK - I'll move Nick's patches back to behind the git trees and it'll all come
good.

-

From: Mark Fasheh
Date: Friday, June 1, 2007 - 3:53 pm

Phew ok. Once again, thanks for all the work you do getting the ocfs2 git
patches into -mm.
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com
-

From: Arnaldo Carvalho de Melo
Date: Thursday, May 31, 2007 - 7:01 pm

http://lkml.org/lkml/2007/5/30/565 fixes the above patch :-)

<BIG SNIP>

- Arnaldo
-

From: Andrew Morton
Date: Thursday, May 31, 2007 - 7:12 pm

I don't know what you mean.  The code is already using
for_each_possible_cpu() and
x86-fix-oprofile-double-free-was-re-multiple-free.patch doesn't change
that.

-

From: Arnaldo Carvalho de Melo
Date: Thursday, May 31, 2007 - 7:24 pm

Yes, the code, i.e. nmi_setup already uses for_each_possible_cpu(), that 
is not the problem. The problem is allocate_msr doing a 
for_each_online_cpu(), i.e. not allocating for each_possible_cpu. Chris 
tested and acked the patch: http://lkml.org/lkml/2007/5/31/36
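
The mismatch described above can be shown with a toy model: allocate per-CPU slots using one mask, then walk them using another. The CPU masks, slot layout and demo_* helpers below are all hypothetical and only illustrate the possible-versus-online distinction, not the oprofile code itself:

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 4

/* Hypothetical CPU masks: four possible CPUs, two currently online. */
const int demo_possible[DEMO_NR_CPUS] = { 1, 1, 1, 1 };
const int demo_online[DEMO_NR_CPUS]   = { 1, 1, 0, 0 };

/* Allocate one slot for every CPU set in mask. */
void demo_allocate(int *slots[DEMO_NR_CPUS], const int mask[DEMO_NR_CPUS])
{
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (!mask[cpu])
			continue;
		slots[cpu] = calloc(1, sizeof(int));
		assert(slots[cpu] != NULL);
	}
}

/* Count possible CPUs whose slot was never allocated. */
int demo_count_missing(int *slots[DEMO_NR_CPUS])
{
	int cpu, missing = 0;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (demo_possible[cpu] && slots[cpu] == NULL)
			missing++;
	return missing;
}

/* Release every slot and reset the table. */
void demo_free(int *slots[DEMO_NR_CPUS])
{
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		free(slots[cpu]);
		slots[cpu] = NULL;
	}
}
```

Allocating with demo_online leaves two slots unset for code that later iterates over demo_possible; allocating with demo_possible leaves none.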

- Arnaldo
-

From: Michael Ellerman
Date: Thursday, May 31, 2007 - 8:52 pm

I think these two should be in the 2.6.22 definite-queue, unless Eric disagrees.

cheers
-

From: Eric W. Biederman
Date: Thursday, May 31, 2007 - 10:55 pm

They are simple bug fixes for regressions in 2.6.22 so I don't see a reason
to delay them.

Eric
-

From: Mel Gorman
Date: Friday, June 1, 2007 - 9:42 am

Came across this while automating allnoconfig, allmodconfig
and defconfig build tests. I haven't checked to make 100% sure but
rework-ptep_set_access_flags-and-fix-sun4c.patch is the most likely candidate
based on the error - patch author cc'd.

Test result
===========
Machine name: elm3b10
Architecture: ia64
Build args:   kernel 2.6.22-rc3-mm1 
Result:       Failed

Standard build:          Completed successfully
make allnoconfig build:  Failed and terminated the run
06/01/07-08:10:54 building kernel - make -j4 vmlinux.gz
  CHK     include/linux/version.h
  UPD     include/linux/version.h
  CHK     include/linux/utsrelease.h
  UPD     include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-ia64
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/basic/docproc
  CC      arch/ia64/kernel/asm-offsets.s
  CC      scripts/mod/empty.o
  HOSTCC  scripts/mod/mk_elfconfig
  MKELF   scripts/mod/elfconfig.h
  HOSTCC  scripts/mod/file2alias.o
  GEN     include/asm-ia64/asm-offsets.h
  CALL    scripts/checksyscalls.sh
<stdin>:1380:2: warning: #warning syscall revokeat not implemented
<stdin>:1384:2: warning: #warning syscall frevoke not implemented
<stdin>:1392:2: warning: #warning syscall sched_yield_to not implemented
  HOSTCC  scripts/mod/modpost.o
  HOSTCC  scripts/kallsyms
  HOSTCC  scripts/mod/sumversion.o
  HOSTCC  scripts/conmakehash
  HOSTLD  scripts/mod/modpost
  CC      init/main.o
  LD      usr/built-in.o
  CHK     include/linux/compile.h
  UPD     include/linux/compile.h
  CC      arch/ia64/kernel/acpi.o
  CC      init/do_mounts.o
  AS      arch/ia64/kernel/entry.o
  CC      arch/ia64/kernel/efi.o
  CC      init/noinitramfs.o
  CC      init/calibrate.o
  CC      init/version.o
  LD      init/mounts.o
  LD      init/built-in.o
  AS      arch/ia64/kernel/efi_stub.o
  CC      arch/ia64/mm/init.o
  LDS     arch/ia64/kernel/gate.lds
  AS      arch/ia64/kernel/gate.o
  AS      arch/ia64/kernel/fsys.o
  CC      arch/ia64/kernel/ia64_ksyms.o
  CC      ...
From: Andrew Morton
Date: Friday, June 1, 2007 - 10:00 am

this?

--- a/include/asm-ia64/pgtable.h~rework-ptep_set_access_flags-and-fix-sun4c-fix
+++ a/include/asm-ia64/pgtable.h
@@ -546,7 +546,7 @@ extern void lazy_mmu_prot_update (pte_t 
 # define ptep_set_access_flags(__vma, __addr, __ptep, __entry, __safely_writable) \
 ({										\
 	int __changed = !pte_same(*(__ptep), __entry);				\
-	if (__changed) {							\
+	if (__changed)							\
 		ptep_establish(__vma, __addr, __ptep, __entry)			\
 	__changed;								\
 })

-

From: Mel Gorman
Date: Friday, June 1, 2007 - 11:50 am

Fails with

mm/memory.c: In function `do_wp_page':
mm/memory.c:1700: error: parse error before "__changed"
mm/memory.c: In function `handle_pte_fault':
mm/memory.c:2544: error: parse error before "__changed"

Am currently testing the following;

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.22-rc3-mm1-clean/include/asm-ia64/pgtable.h linux-2.6.22-rc3-mm1-ia64fix/include/asm-ia64/pgtable.h
--- linux-2.6.22-rc3-mm1-clean/include/asm-ia64/pgtable.h	2007-06-01 09:24:39.000000000 +0100
+++ linux-2.6.22-rc3-mm1-ia64fix/include/asm-ia64/pgtable.h	2007-06-01 19:44:48.000000000 +0100
@@ -546,8 +546,8 @@ extern void lazy_mmu_prot_update (pte_t 
 # define ptep_set_access_flags(__vma, __addr, __ptep, __entry, __safely_writable) \
 ({										\
 	int __changed = !pte_same(*(__ptep), __entry);				\
-	if (__changed) {							\
-		ptep_establish(__vma, __addr, __ptep, __entry)			\
+	if (__changed)							\
+		ptep_establish(__vma, __addr, __ptep, __entry);			\
 	__changed;								\
 })
 #endif

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-

From: Mel Gorman
Date: Friday, June 1, 2007 - 1:55 pm

IA64 fails to build with allnoconfig due to an error in the !CONFIG_SMP
case. The following patch fixes it and should be considered a fix to
rework-ptep_set_access_flags-and-fix-sun4c.patch.

allmodconfig and defconfig tests are still running.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 pgtable.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.22-rc3-mm1-clean/include/asm-ia64/pgtable.h linux-2.6.22-rc3-mm1-ia64fix/include/asm-ia64/pgtable.h
--- linux-2.6.22-rc3-mm1-clean/include/asm-ia64/pgtable.h	2007-06-01 09:24:39.000000000 +0100
+++ linux-2.6.22-rc3-mm1-ia64fix/include/asm-ia64/pgtable.h	2007-06-01 19:44:48.000000000 +0100
@@ -546,8 +546,8 @@ extern void lazy_mmu_prot_update (pte_t 
 # define ptep_set_access_flags(__vma, __addr, __ptep, __entry, __safely_writable) \
 ({										\
 	int __changed = !pte_same(*(__ptep), __entry);				\
-	if (__changed) {							\
-		ptep_establish(__vma, __addr, __ptep, __entry)			\
+	if (__changed)							\
+		ptep_establish(__vma, __addr, __ptep, __entry);			\
 	__changed;								\
 })
 #endif
-
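
Note on the parse error in this subthread: the macro is a GNU statement
expression, whose value is its last expression. A small sketch under
assumptions (fake_establish() and set_if_changed() are invented
stand-ins, not kernel API) shows the working shape. If the ';' after
the call is dropped, the parser sees the call immediately followed by
the identifier, which is the "parse error before" message reported for
mm/memory.c.

```c
#include <assert.h>
#include <stddef.h>

static int establish_calls;	/* counts calls to the stand-in below */

/* Stand-in for ptep_establish(); the real one installs a page table entry. */
static void fake_establish(int *slot, int entry)
{
	assert(slot != NULL);
	*slot = entry;
	establish_calls++;
}

/*
 * Same shape as the fixed macro: compare, update only on change, and
 * yield the "changed" flag as the value of the whole expression.
 * Requires GCC or Clang, since ({ ... }) is a GNU extension.
 */
#define set_if_changed(slot, entry)				\
({								\
	int changed_ = (*(slot) != (entry));			\
	if (changed_)						\
		fake_establish((slot), (entry));		\
	changed_;						\
})
```

A caller can then write "if (set_if_changed(&slot, v)) ..." and act only
when the stored value was actually updated.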

From: Adrian Bunk
Date: Saturday, June 2, 2007 - 6:57 am

I'm getting the following compile error in 2.6.22-rc3-mm1 with 
CONFIG_X86_CMPXCHG=n (with -Werror-implicit-function-declaration - 
otherwise it would be a link error):

<--  snip  -->

...
  CC      drivers/xen/grant-table.o
/home/bunk/linux/kernel-2.6/linux-2.6.22-rc3-mm1/drivers/xen/grant-table.c: In function ‘gnttab_end_foreign_access_ref’:
/home/bunk/linux/kernel-2.6/linux-2.6.22-rc3-mm1/drivers/xen/grant-table.c:203: error: implicit declaration of function ‘sync_cmpxchg’
make[3]: *** [drivers/xen/grant-table.o] Error 1

<--  snip  -->

Adding a dependency of XEN on X86_CMPXCHG should not be a problem and 
should not prevent any reasonable real-life usage.

But what worries me is that a seemingly architecture independent 
driver uses a function only available in some configurations.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Adrian Bunk
Date: Thursday, June 28, 2007 - 4:36 pm

Still present as of 2.6.22-rc6-mm1.

cu
Adrian

-

From: Jeremy Fitzhardinge
Date: Thursday, June 28, 2007 - 8:21 pm

Sorry, must have missed this one.   Um, yeah, I guess we have a 
dependency on cmpxchg.  I'll do a patch.

    J

-

From: Adrian Bunk
Date: Saturday, June 2, 2007 - 10:06 am

CONFIG_XEN_BLKDEV_FRONTEND shouldn't silently prevent the compilation of 
most other block drivers.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---
--- linux-2.6.22-rc3-mm1/drivers/block/Makefile.old	2007-06-02 18:21:12.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/block/Makefile	2007-06-02 18:21:53.000000000 +0200
@@ -28,7 +28,7 @@
 obj-$(CONFIG_VIODASD)		+= viodasd.o
 obj-$(CONFIG_BLK_DEV_SX8)	+= sx8.o
 obj-$(CONFIG_BLK_DEV_UB)	+= ub.o
-obj-$(CONFIG_XEN_BLKDEV_FRONTEND) := xen-blkfront.o
+obj-$(CONFIG_XEN_BLKDEV_FRONTEND) += xen-blkfront.o
 obj-$(CONFIG_XILINX_SYSACE)	+= xsysace.o
 obj-$(CONFIG_LGUEST_GUEST)	+= lguest_blk.o
 

-

From: Adrian Bunk
Date: Saturday, June 2, 2007 - 10:14 am

statistics-infrastructure-make-printk_clock-a-generic-kernel-wide-nsec-resolution.patch 
shows why __attribute__((weak)) is harmful: you don't see when a 
required non-weak implementation is missing.

In this case, the weak printk_clock() was renamed to timestamp_clock(), 
but the ARM and i386 implementations weren't renamed...

cu
Adrian

-
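
Note on the weak-symbol complaint: a minimal sketch of the failure mode,
assuming GCC or Clang. The function bodies and return values are
invented for illustration; only the two function names come from the
mail above. After the rename, the old override is still a valid
function, so nothing fails to compile or link, and callers of the new
name quietly get the weak fallback.

```c
#include <assert.h>

/* Weak generic fallback under the new name (hypothetical body). */
__attribute__((weak)) unsigned long long timestamp_clock(void)
{
	return 0;	/* stand-in for a coarse default clock */
}

/*
 * "Architecture" override that was never renamed. It no longer
 * overrides anything, yet the build stays silent about it.
 */
unsigned long long printk_clock(void)
{
	return 123456789ULL;	/* stand-in for a precise arch clock */
}

/* Callers of the new name silently end up in the weak fallback. */
static unsigned long long read_timestamp(void)
{
	unsigned long long now = timestamp_clock();

	/* The precise clock exists, but under a name nobody calls now. */
	assert(now != printk_clock());
	return now;
}
```

Had timestamp_clock() been an ordinary declaration with no weak
default, the missing implementation would show up as a link error.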

From: Andrew Morton
Date: Monday, June 4, 2007 - 2:22 pm

On Sat, 2 Jun 2007 19:14:25 +0200

printk_clock() is sched_clock() in disguise, and I'm not sure that making
sched_clock() more widely available in this fashion is something that we
want to do anyway.

Anyway, the statistics patches have just celebrated their first birthday
and I don't see that they're getting sufficient momentum or interest to
ever get into mainline so I think I'll drop them, sorry.

-

From: Martin Peschke
Date: Monday, June 4, 2007 - 4:52 pm

Andrew,
the lock contention statistics, which have been added to -mm recently, duplicate 
code that we have in the statistics patches. I think I can slim the lock 
tracking patches down considerably (similar to my attempt at 
timerstats). I have a working prototype that is getting some final polish. 
Would you like to wait and see how this goes?

As to timestamp_clock(): it's useful for statistics, but still a minor feature. 
It would be unfortunate if that were the stumbling block for my patches. Am I 
right that the fix for the issue pointed out by Adrian is to rename those two 
occurrences of printk_clock()? Do you want me to submit a patch?

-

From: Russell King
Date: Monday, June 4, 2007 - 8:59 pm

Note that sched_clock() can not be used early on ARM; it might want to
access MMIO which is not accessible until later in setup_arch().  This

If it ends up being based upon sched_clock() instead of printk_clock()
on ARM then it'll break stuff horribly (== non-bootable kernels.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
-

From: Adrian Bunk
Date: Saturday, June 2, 2007 - 12:09 pm

The ASYNC_* options are for internal helper code and should therefore 
not be user visible.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

BTW: Please move the async_tx directory under drivers/ or lib/

 async_tx/Kconfig |   28 ++++++++--------------------
 1 file changed, 8 insertions(+), 20 deletions(-)

--- linux-2.6.22-rc3-mm1/async_tx/Kconfig.old	2007-06-02 19:30:32.000000000 +0200
+++ linux-2.6.22-rc3-mm1/async_tx/Kconfig	2007-06-02 19:31:20.000000000 +0200
@@ -1,27 +1,15 @@
-menuconfig ASYNC_CORE
-	tristate "Asynchronous Bulk Memory Transfers/Transforms"
-	default n
-	---help---
-	  This enables the async_tx interface layer for dma (offload) engines.
-	  Subsystems coded to this api will use offload engines for bulk memory
-	  operations (e.g. memcpy, memset, xor...).  When an offload engine is not
-	  available the interface will implicitly fall back to a software
-	  implementation of the operation.
-
-	  If unsure, say N
-
-if ASYNC_CORE
+config ASYNC_CORE
+	tristate
 
 config ASYNC_MEMCPY
-	default m
-	tristate "async_memcpy support"
+	tristate
+	select ASYNC_CORE
 
 config ASYNC_XOR
-	default m
-	tristate "async_xor support"
+	tristate
+	select ASYNC_CORE
 
 config ASYNC_MEMSET
-	default m
-	tristate "async_memset support"
+	tristate
+	select ASYNC_CORE
 
-endif

-

From: Williams, Dan J
Date: Monday, June 4, 2007 - 9:19 am

Yes, I was feeling somewhat exposed with the options in the top level
config, but at least it got a few more eyes on the code.  I will fold in
your patch and move async_tx under lib/ for now, but at some point I
would like to investigate the potential synergies with crypto/.

Thanks,
Dan
-

From: Adrian Bunk
Date: Sunday, June 3, 2007 - 1:54 pm

This patch makes the needlessly global xpad_play_effect() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---
--- linux-2.6.22-rc3-mm1/drivers/input/joystick/xpad.c.old	2007-06-03 22:24:23.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/input/joystick/xpad.c	2007-06-03 22:24:39.000000000 +0200
@@ -376,7 +376,8 @@
 		   __FUNCTION__, retval);
 }
 
-int xpad_play_effect(struct input_dev *dev, void *data, struct ff_effect *effect)
+static int xpad_play_effect(struct input_dev *dev, void *data,
+			    struct ff_effect *effect)
 {
 	struct usb_xpad *xpad = input_get_drvdata(dev);
 

-

From: Adrian Bunk
Date: Sunday, June 3, 2007 - 1:54 pm

This patch makes some needlessly global code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 drivers/i2c/chips/ds1682.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- linux-2.6.22-rc3-mm1/drivers/i2c/chips/ds1682.c.old	2007-06-03 22:18:53.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/i2c/chips/ds1682.c	2007-06-03 22:21:07.000000000 +0200
@@ -121,12 +121,12 @@
 /*
  * Simple register attributes
  */
-SENSOR_DEVICE_ATTR_2(elapsed_time, S_IRUGO | S_IWUSR, ds1682_show, ds1682_store,
-		     4, DS1682_REG_ELAPSED);
-SENSOR_DEVICE_ATTR_2(alarm_time, S_IRUGO | S_IWUSR, ds1682_show, ds1682_store,
-		     4, DS1682_REG_ALARM);
-SENSOR_DEVICE_ATTR_2(event_count, S_IRUGO | S_IWUSR, ds1682_show, ds1682_store,
-		     2, DS1682_REG_EVT_CNTR);
+static SENSOR_DEVICE_ATTR_2(elapsed_time, S_IRUGO | S_IWUSR, ds1682_show,
+			    ds1682_store, 4, DS1682_REG_ELAPSED);
+static SENSOR_DEVICE_ATTR_2(alarm_time, S_IRUGO | S_IWUSR, ds1682_show,
+			    ds1682_store, 4, DS1682_REG_ALARM);
+static SENSOR_DEVICE_ATTR_2(event_count, S_IRUGO | S_IWUSR, ds1682_show,
+			    ds1682_store, 2, DS1682_REG_EVT_CNTR);
 
 static const struct attribute_group ds1682_group = {
 	.attrs = (struct attribute *[]) {

-

From: Jean Delvare
Date: Monday, June 4, 2007 - 1:15 am

Hi Adrian,


Good catch. I've folded this fix into i2c-ds1628-new-driver.patch,
thanks for reporting.

-- 
Jean Delvare
-

From: Adrian Bunk
Date: Sunday, June 3, 2007 - 1:54 pm

This patch makes the needlessly global dmi_id_init() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---
--- linux-2.6.22-rc3-mm1/drivers/firmware/dmi-id.c.old	2007-06-03 22:17:10.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/firmware/dmi-id.c	2007-06-03 22:17:24.000000000 +0200
@@ -161,7 +161,7 @@
 
 extern int dmi_available;
 
-int __init dmi_id_init(void)
+static int __init dmi_id_init(void)
 {
 	int ret, i;
 

-

From: Greg KH
Date: Thursday, June 7, 2007 - 9:38 pm

Thanks, I've merged this with the original.

greg k-h
-

From: Adrian Bunk
Date: Sunday, June 3, 2007 - 1:54 pm

Due to a typo the tea5761 tuner support was dead code.

This patch also fixes a bug in the no longer dead code:
A void function can't return anything.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 drivers/media/video/tuner-core.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- linux-2.6.22-rc3-mm1/drivers/media/video/tuner-core.c.old	2007-06-03 22:29:37.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/media/video/tuner-core.c	2007-06-03 22:35:54.000000000 +0200
@@ -25,7 +25,7 @@
 
 /* standard i2c insmod options */
 static unsigned short normal_i2c[] = {
-#ifdef CONFIG_TUNER_5761
+#ifdef CONFIG_TUNER_TEA5761
 	0x10,
 #endif
 	0x42, 0x43, 0x4a, 0x4b,			/* tda8290 */
@@ -192,12 +192,12 @@ static void set_type(struct i2c_client *
 		}
 		t->mode_mask = T_RADIO;
 		break;
-#ifdef CONFIG_TUNER_5761
+#ifdef CONFIG_TUNER_TEA5761
 	case TUNER_TEA5761:
 		if (tea5761_tuner_init(c) == EINVAL) {
 			t->type = TUNER_ABSENT;
 			t->mode_mask = T_UNINITIALIZED;
-			return -ENODEV;
+			return;
 		}
 		t->mode_mask = T_RADIO;
 		break;
@@ -473,7 +473,7 @@ static int tuner_attach(struct i2c_adapt
 	/* autodetection code based on the i2c addr */
 	if (!no_autodetect) {
 		switch (addr) {
-#ifdef CONFIG_TUNER_5761
+#ifdef CONFIG_TUNER_TEA5761
 		case 0x10:
 			if (tea5761_autodetection(&t->i2c) != EINVAL) {
 				t->type = TUNER_TEA5761;
-
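
Note on the tea5761 typo: a small sketch of why a misspelled config
symbol goes unnoticed. The array names, set_type_demo() and the
ARRAY_LEN helper are assumptions for illustration; only the two
CONFIG_ names come from the patch. Code guarded by a symbol that is
never defined is dropped by the preprocessor, so even an invalid
statement inside it is never seen by the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for what the generated config header would define. */
#define CONFIG_TUNER_TEA5761 1

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Misspelled guard: 0x10 is silently left out of the probe list. */
static const unsigned short addrs_typo[] = {
#ifdef CONFIG_TUNER_5761
	0x10,
#endif
	0x42, 0x43,
};

/* Correct guard: 0x10 is part of the probe list. */
static const unsigned short addrs_fixed[] = {
#ifdef CONFIG_TUNER_TEA5761
	0x10,
#endif
	0x42, 0x43,
};

static void set_type_demo(int *type)
{
	assert(type != NULL);
#ifdef CONFIG_TUNER_5761
	return -1;	/* invalid in a void function, but never compiled */
#endif
	*type = 1;
}
```

Correcting the guard name is what exposes the bad return statement,
which is why the patch has to fix both at once.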

From: Valdis.Kletnieks
Date: Monday, June 4, 2007 - 11:00 am

Under 22-rc2-mm1, if my VPN connection got reset, ppp0 just quietly went away.

Under 22-rc3-mm1, it seems to end up wedged and waiting for references to
go away:

Jun  4 09:23:01 turing-police kernel: [90089.270707] unregister_netdevice: waiting for ppp0 to become free. Usage count = 8
Jun  4 09:23:11 turing-police kernel: [90099.396121] unregister_netdevice: waiting for ppp0 to become free. Usage count = 8
Jun  4 09:23:21 turing-police kernel: [90109.520574] unregister_netdevice: waiting for ppp0 to become free. Usage count = 8
Jun  4 09:23:32 turing-police kernel: [90119.653129] unregister_netdevice: waiting for ppp0 to become free. Usage count = 8

'echo t > /proc/sysrq-trigger' shows pppd hung up here:

Jun  4 10:52:57 turing-police kernel: [95478.047892] pppd          D 0000000105ad3830  4968  3815      1 (NOTLB)
Jun  4 10:52:57 turing-police kernel: [95478.047902]  ffff810008d5fd78 0000000000000086 0000000000000000 ffff810003490000
Jun  4 10:52:57 turing-police kernel: [95478.047911]  ffff810008d5fd28 ffff810008d4a040 ffff810003461820 ffff810008d4a2b0
Jun  4 10:52:57 turing-police kernel: [95478.047920]  0000000105ad3733 0000000000000202 00000000000000ff ffffffff80239795
Jun  4 10:52:57 turing-police kernel: [95478.047928] Call Trace:
Jun  4 10:52:57 turing-police kernel: [95478.047936]  [<ffffffff805207a2>] schedule_timeout+0x8d/0xb4
Jun  4 10:52:57 turing-police kernel: [95478.047945]  [<ffffffff805207e2>] schedule_timeout_uninterruptible+0x19/0x1b
Jun  4 10:52:57 turing-police kernel: [95478.047954]  [<ffffffff802397bb>] msleep+0x14/0x1e
Jun  4 10:52:57 turing-police kernel: [95478.047963]  [<ffffffff8048aa4e>] netdev_run_todo+0x12f/0x234 
Jun  4 10:52:57 turing-police kernel: [95478.047972]  [<ffffffff8049166f>] rtnl_unlock+0x35/0x37
Jun  4 10:52:57 turing-police kernel: [95478.047981]  [<ffffffff804894a9>] unregister_netdev+0x1e/0x23
Jun  4 10:52:57 turing-police kernel: [95478.047994]  [<ffffffff88a5f2c2>] :ppp_generic:ppp_shutdown_interface+0x67/0xbb
Jun  4 10:52:57 ...
From: Andrew Morton
Date: Tuesday, June 5, 2007 - 11:14 pm

I don't know what could have caused this, sorry.  If it's still there in next -mm
(which is still 100000 compile fixes away) it'd be good if you could bisect it.
Suspects would be git-net.patch, get-netdev-all.patch and gregkh-driver-*.patch

Thanks.
-

From: Adrian Bunk
Date: Monday, June 4, 2007 - 3:12 pm

e1000_{read,write}_pci_cfg() are no longer used.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 drivers/net/e1000/e1000_hw.h   |    2 --
 drivers/net/e1000/e1000_main.c |    4 ++++
 2 files changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_hw.h.old	2007-06-04 22:03:05.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_hw.h	2007-06-04 22:03:14.000000000 +0200
@@ -421,8 +421,6 @@ void e1000_tbi_adjust_stats(struct e1000
 void e1000_get_bus_info(struct e1000_hw *hw);
 void e1000_pci_set_mwi(struct e1000_hw *hw);
 void e1000_pci_clear_mwi(struct e1000_hw *hw);
-void e1000_read_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t * value);
-void e1000_write_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t * value);
 int32_t e1000_read_pcie_cap_reg(struct e1000_hw *hw, uint32_t reg, uint16_t *value);
 void e1000_pcix_set_mmrbc(struct e1000_hw *hw, int mmrbc);
 int e1000_pcix_get_mmrbc(struct e1000_hw *hw);
--- linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_main.c.old	2007-06-04 22:03:24.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_main.c	2007-06-04 22:03:40.000000000 +0200
@@ -4888,6 +4888,8 @@ e1000_pci_clear_mwi(struct e1000_hw *hw)
 	pci_clear_mwi(adapter->pdev);
 }
 
+#if 0
+
 void
 e1000_read_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t *value)
 {
@@ -4904,6 +4906,8 @@ e1000_write_pci_cfg(struct e1000_hw *hw,
 	pci_write_config_word(adapter->pdev, reg, *value);
 }
 
+#endif  /*  0  */
+
 int
 e1000_pcix_get_mmrbc(struct e1000_hw *hw)
 {

-

From: Adrian Bunk
Date: Monday, June 4, 2007 - 3:13 pm

This patch contains the following cleanups:
- make the following needlessly global functions static:
  - core.c: mmc_schedule_delayed_work()
  - core.c: mmc_flush_scheduled_work()
- remove the prototype of the following non-existing function:
  - core.h: mmc_schedule_work()
- add proper prototypes for three functions from core.c in core.h
 
Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 drivers/mmc/core/core.c |   37 +++++++++++++++++++------------------
 drivers/mmc/core/core.h |    6 +++---
 drivers/mmc/core/host.c |    4 ----
 3 files changed, 22 insertions(+), 25 deletions(-)

--- linux-2.6.22-rc3-mm1/drivers/mmc/core/core.h.old	2007-06-04 21:53:42.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/mmc/core/core.h	2007-06-04 23:19:44.000000000 +0200
@@ -64,9 +64,9 @@ static inline void mmc_delay(unsigned in
 	}
 }
 
-int mmc_schedule_work(struct work_struct *work);
-int mmc_schedule_delayed_work(struct delayed_work *work, unsigned long delay);
-void mmc_flush_scheduled_work(void);
+void mmc_rescan(struct work_struct *work);
+void mmc_start_host(struct mmc_host *host);
+void mmc_stop_host(struct mmc_host *host);
 
 #endif
 
--- linux-2.6.22-rc3-mm1/drivers/mmc/core/core.c.old	2007-06-04 21:54:46.000000000 +0200
+++ linux-2.6.22-rc3-mm1/drivers/mmc/core/core.c	2007-06-04 21:57:38.000000000 +0200
@@ -37,6 +37,25 @@
 extern int mmc_attach_mmc(struct mmc_host *host, u32 ocr);
 extern int mmc_attach_sd(struct mmc_host *host, u32 ocr);
 
+static struct workqueue_struct *workqueue;
+
+/*
+ * Internal function. Schedule delayed work in the MMC work queue.
+ */
+static int mmc_schedule_delayed_work(struct delayed_work *work,
+				     unsigned long delay)
+{
+	return queue_delayed_work(workqueue, work, delay);
+}
+
+/*
+ * Internal function. Flush all scheduled work from the MMC work queue.
+ */
+static void mmc_flush_scheduled_work(void)
+{
+	flush_workqueue(workqueue);
+}
+
 /**
  *	mmc_request_done - finish processing an MMC request
  *	@host: MMC host ...
From: Pierre Ossman
Date: Wednesday, June 6, 2007 - 11:36 am

Thanks, applied.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:11 am

hm, mm1 hangs during bootup on one of my boxes:

 Calling initcall 0xc0628d39: pci_mmcfg_late_insert_resources+0x0/0x44()
 initcall 0xc0628d39: pci_mmcfg_late_insert_resources+0x0/0x44() returned 0.
 initcall 0xc0628d39 ran for 0 msecs: 
 pci_mmcfg_late_insert_resources+0x0/0x44()
 Calling initcall 0xc062abd1: tcp_congestion_default+0x0/0xf()
 initcall 0xc062abd1: tcp_congestion_default+0x0/0xf() returned 0.
 initcall 0xc062abd1 ran for 0 msecs: tcp_congestion_default+0x0/0xf()

it usually hangs in different places. Full bootlog below. Same kernel 
bzImage boots fine on another box.

the NMI watchdog warning is a bit weird:

 Calling initcall 0xc0613e4f: check_nmi_watchdog+0x0/0x1a8()
 Testing NMI watchdog ... CPU#0: NMI appears to be stuck (0->0)!
 CPU#1: NMI appears to be stuck (0->0)!
 initcall 0xc0613e4f: check_nmi_watchdog+0x0/0x1a8() returned -1.
 initcall 0xc0613e4f ran for 27 msecs: check_nmi_watchdog+0x0/0x1a8()

i'll test it in a minute with that turned off.

	Ingo

------------>
Linux version 2.6.22-rc3-mm1-v16-rc2 (mingo@dione) (gcc version 4.0.2) #9 SMP Tue Jun 5 10:48:06 CEST 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
 BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
 BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
using polling idle threads.
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f5680
Entering add_active_range(0, 0, 262128) 0 entries of 256 used
sizeof(struct page) = 32
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 ->   262128
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
 ...
From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:18 am

yeah, with nmi_watchdog=0 it works fine. nmi_watchdog=2 always worked on 
this box so this is some new regression.

	Ingo
-

From: Andrew Morton
Date: Tuesday, June 5, 2007 - 2:24 am

In my experience that means that it wedged in a timer tick.  Often the first

hm.  I haven't seen any similar reports.
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:33 am

yeah. I tried !hres and !dynticks too and that doesn't make any 
difference to the end result - so my guess is on the NMI watchdog 
re-programming thing on K8 CPUs (running the 32-bit kernel), which is 
done in every NMI tick. check_watchdog() for some reason thought there's 
no NMI, and later on an NMI still arrived? Something like that.

vanilla kernel works fine.

	Ingo
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:39 am

i have put an early_printk() into the NMI handler but it never triggers.

	Ingo
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:42 am

commenting out check_nmi_watchdog() produces a booting kernel too. So 
it's a side-effect of check_nmi_watchdog(). Problem is, nothing in nmi.c 
changed in -mm1 AFAICS.

	Ingo
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:45 am

ah, plain -rc3 hangs too. So it's one of these commits i guess:

commit 1eeb66a1bb973534dc3d064920a5ca683823372e
commit 09198e68501a7e34737cd9264d266f42429abcdc
commit bbba11c35baaad3f70f32e185a2c1d40d7901fe9
commit bf8696ed6dfa561198b4736deaf11ab68dcc4845

	Ingo
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:50 am

oh, damn. The resulting kernel after having undone these doesn't even 
build ...

	Ingo
-

From: Ingo Molnar
Date: Tuesday, June 5, 2007 - 2:56 am

Andi/Linus Cc:-ed - these NMI watchdog changes came over the x86_64 
tree. I suspect this patch:

 commit 09198e68501a7e34737cd9264d266f42429abcdc
 Author: Andi Kleen <ak@suse.de>
 Date:   Wed May 2 19:27:20 2007 +0200

     [PATCH] i386: Clean up NMI watchdog code

Andi - just boot with nmi_watchdog=2 on a dual-core Athlon64 CPU.

	Ingo
-

From: Björn
Date: Sunday, June 10, 2007 - 11:10 am

I still fail to reproduce this, could you send me your config?

Thanks,
Björn
-

From: Ingo Molnar
Date: Monday, June 18, 2007 - 5:11 am

From: Björn
Date: Monday, June 18, 2007 - 7:31 am

Still no hang here. Just to make sure that I didn't mess the test up,
here's what I did:

Get pristine 2.6.22-rc5 kernel sources.
Put your config in place as .config
Run: make ARCH=i386 CFLAGS_KERNEL="-m32" AFLAGS_KERNEL="-m32"
Install the kernel.
Reboot, pass "nmi_watchdog=2" as kernel parameter.

As your config is for a 64bit kernel, several config items had to be set
manually. In one run, I accepted the default values, in the second run,
I tried to adjust those items to match your 64bit config as well as I
could.

Anything wrong with that?

Björn
-

From: Ingo Molnar
Date: Sunday, June 24, 2007 - 11:18 pm

FYI, latest -git _still_ hangs with the NMI watchdog with 
nmi_watchdog=2, at the same place:

 Calling initcall 0xc06cc620: check_nmi_watchdog+0x0/0x1f0()
 Testing NMI watchdog ... CPU#0: NMI appears to be stuck (0->0)!
 CPU#1: NMI appears to be stuck (0->0)!
 initcall 0xc06cc620: check_nmi_watchdog+0x0/0x1f0() returned -1.
 initcall 0xc06cc620 ran for 27 msecs: check_nmi_watchdog+0x0/0x1f0()
 initcall at 0xc06cc620: check_nmi_watchdog+0x0/0x1f0(): returned with 
 error code -1
 Calling initcall 0xc06ccbb0: io_apic_bug_finalize+0x0/0x20()
 initcall 0xc06ccbb0: io_apic_bug_finalize+0x0/0x20() returned 0.
 initcall 0xc06ccbb0 ran for 0 msecs: io_apic_bug_finalize+0x0/0x20()
 Calling initcall 0xc06ccd00: balanced_irq_init+0x0/0x1e0()
 Starting balanced_irq
 [hard hang]

full bootlog attached below.

	Ingo

-------------------------------->
Linux version 2.6.22-rc5-cfs-v19 (mingo@dione) (gcc version 4.0.2) #7 SMP Mon Jun 25 08:09:25 CEST 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
 BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
 BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f5680
Entering add_active_range(0, 0, 262128) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 ->   262128
early_node_map[1] active PFN ranges
    0:        0 ->   262128
On node 0 totalpages: 262128
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  ...
From: Ingo Molnar
Date: Sunday, June 24, 2007 - 11:59 pm

hm, restoring nmi.c to the v2.6.21 state does not fix the nmi_watchdog=2 
hang. I'll do a bisection run.

	Ingo

-

From: Ingo Molnar
Date: Monday, June 25, 2007 - 1:05 am

and after spending an hour on 15 bisection steps:

 git-bisect start
 git-bisect good d1be341dba5521506d9e6dccfd66179080705bea
 git-bisect bad a06381fec77bf88ec6c5eb6324457cb04e9ffd69
 git-bisect bad 794543a236074f49a8af89ef08ef6a753e4777e5
 git-bisect good 24a77daf3d80bddcece044e6dc3675e427eef3f3
 git-bisect bad ea62ccd00fd0b6720b033adfc9984f31130ce195
 git-bisect good 7e20ef030dde0e52dd5a57220ee82fa9facbea4e
 git-bisect bad f19cccf366a07e05703c90038704a3a5ffcb0607
 git-bisect good 0d08e0d3a97cce22ebf80b54785e00d9b94e1add
 git-bisect bad 856f44ff4af6e57fdc39a8b2bec498c88438bd27
 git-bisect bad f8822f42019eceed19cc6c0f985a489e17796ed8
 git-bisect good 1c3d99c11c47c8a1a9ed6a46555dbf6520683c52
 git-bisect good b239fb2501117bf3aeb4dd6926edd855be92333d
 git-bisect good 98de032b681d8a7532d44dfc66aa5c0c1c755a9d
 git-bisect good 42c24fa22e86365055fc931d833f26165e687c19

the winner is ...

 f8822f42019eceed19cc6c0f985a489e17796ed8 is first bad commit
 commit f8822f42019eceed19cc6c0f985a489e17796ed8
 Author: Jeremy Fitzhardinge <jeremy@goop.org>
 Date:   Wed May 2 19:27:14 2007 +0200

    [PATCH] i386: PARAVIRT: Consistently wrap paravirt ops callsites to make them patchable

... our wonderful paravirt subsystem, honed to eternal perfection by the 
testing-machine x86_64 tree.

reverting -git-curr's paravirt.c, paravirt.h, smp.c and tlbflush.h to 
before the bad commit makes the NMI watchdog work again. Patch against 
-rc6 is below.

	Ingo

------------------------>
Subject: [patch, 2.6.22-rc6] fix nmi_watchdog=2 bootup hang
From: Ingo Molnar <mingo@elte.hu>

nmi_watchdog=2 hangs on i386:

 Calling initcall 0xc06cc620: check_nmi_watchdog+0x0/0x1f0()
 Testing NMI watchdog ... CPU#0: NMI appears to be stuck (0->0)!
 CPU#1: NMI appears to be stuck (0->0)!
 initcall 0xc06cc620: check_nmi_watchdog+0x0/0x1f0() returned -1.
 initcall 0xc06cc620 ran for 27 msecs: check_nmi_watchdog+0x0/0x1f0()
 initcall at 0xc06cc620: check_nmi_watchdog+0x0/0x1f0(): returned with
 error code ...
From: Ingo Molnar
Date: Monday, June 25, 2007 - 1:26 am

and of course i'm happy to test any patch that is simpler than the 
brutal revert i sent.

	Ingo
-

From: Björn
Date: Monday, June 25, 2007 - 5:45 am

wrmsrl() looks broken, dropping the upper 32bits of the value to be
written. Does this help?

Björn
---
diff --git a/include/asm-i386/paravirt.h b/include/asm-i386/paravirt.h
index d7a0512..7f846a7 100644
--- a/include/asm-i386/paravirt.h
+++ b/include/asm-i386/paravirt.h
@@ -539,7 +539,7 @@ static inline int paravirt_write_msr(unsigned msr, unsigned low, unsigned high)
 	val = paravirt_read_msr(msr, &_err);	\
 } while(0)
 
-#define wrmsrl(msr,val)		((void)paravirt_write_msr(msr, val, 0))
+#define wrmsrl(msr,val)		wrmsr(msr, (u32)((u64)(val)), ((u64)(val))>>32)
 #define wrmsr_safe(msr,a,b)	paravirt_write_msr(msr, a, b)
 
 /* rdmsr with exception handling */
-
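
Note on the wrmsrl() fix: a userspace sketch of the truncation, under
stated assumptions. fake_write_msr(), DEMO_MSR, the two macro names and
the test value are invented for illustration; only the shapes of the
old and new macro bodies follow the patch above. The old form passes
the 64-bit value as the low half and hard-codes 0 for the high half.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MSR 0x1u	/* arbitrary register number, hypothetical */

/* Arguments of the last emulated MSR write. */
static unsigned last_low, last_high;

/* Stand-in for paravirt_write_msr(): value arrives as two 32-bit halves. */
static int fake_write_msr(unsigned msr, unsigned low, unsigned high)
{
	(void)msr;
	last_low = low;
	last_high = high;
	return 0;
}

/* Old shape: 'val' is truncated to 32 bits and 'high' is always 0. */
#define wrmsrl_broken(msr, val)	((void)fake_write_msr((msr), (val), 0))

/* Fixed shape: split the 64-bit value into both halves explicitly. */
#define wrmsrl_fixed(msr, val)						\
	((void)fake_write_msr((msr), (uint32_t)((uint64_t)(val)),	\
			      (uint32_t)(((uint64_t)(val)) >> 32)))

/* Reassemble what the emulated register would now hold. */
static uint64_t last_written(void)
{
	return ((uint64_t)last_high << 32) | last_low;
}

static void wrmsrl_demo(void)
{
	uint64_t v = 0x0000ffff80000001ULL;	/* needs more than 32 bits */

	wrmsrl_broken(DEMO_MSR, v);
	assert(last_written() == 0x80000001ULL);	/* upper half lost */

	wrmsrl_fixed(DEMO_MSR, v);
	assert(last_written() == v);			/* full value arrives */
}
```

Any value that fits in 32 bits behaves the same with both macros, so
only callers that write wider values would notice the difference.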

From: Jeremy Fitzhardinge
Date: Monday, June 25, 2007 - 5:49 am

Crap.  That's embarrassing.  Does it help?  It seems likely.  
(Esp since Ingo didn't even have CONFIG_PARAVIRT enabled, so most of his 
revert would have been dead code anyway.)

    J
-

From: Björn
Date: Monday, June 25, 2007 - 6:06 am

He has. The config Ingo sent was for x86_64, which (AFAICT) doesn't have
CONFIG_PARAVIRT, so the config was unfortunately useless. But his
bootlog tells us:

Booting paravirtualized kernel on bare hardware

Björn
-

From: Ingo Molnar
Date: Monday, June 25, 2007 - 11:50 am

this did the trick, rc6 plus your fix and the NMI watchdog works again! 
Thanks! I suspect other code (oprofile?) broke due to this too.

below is a tidied up patch for upstream application. Must-have for 
2.6.22.

	Ingo

----------------->
From: Björn Steinbrink <B.Steinbrink@gmx.de>
Subject: [patch, 2.6.22-rc6] fix nmi_watchdog=2 bootup hang, take #2

wrmsrl() is broken, dropping the upper 32bits of the value to be
written. This broke the NMI watchdog on AMD hardware. (and it
probably broke other code too.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/asm-i386/paravirt.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/include/asm-i386/paravirt.h
===================================================================
--- linux.orig/include/asm-i386/paravirt.h
+++ linux/include/asm-i386/paravirt.h
@@ -539,7 +539,7 @@ static inline int paravirt_write_msr(uns
 	val = paravirt_read_msr(msr, &_err);	\
 } while(0)
 
-#define wrmsrl(msr,val)		((void)paravirt_write_msr(msr, val, 0))
+#define wrmsrl(msr,val)		wrmsr(msr, (u32)((u64)(val)), ((u64)(val))>>32)
 #define wrmsr_safe(msr,a,b)	paravirt_write_msr(msr, a, b)
 
 /* rdmsr with exception handling */
-

From: Jeremy Fitzhardinge
Date: Monday, June 25, 2007 - 5:40 am

Er, wow.  I've been running with this stuff for months without a 
problem.   Do you have CONFIG_PARAVIRT enabled?  Do you still get the 
hang if you boot with "noreplace-paravirt" to disable the patching? 

Your revert patch seems to take out quite a lot of stuff, some unrelated 
to the paravirt_ops.  Where did that come from?

I presume there's one bad callsite in here which is used by the nmi path 
more or less exclusively.  Is the bug simply that it hangs if you boot 

What's this?  This isn't paravirt_ops related, is it?

    J
-

From: Björn
Date: Monday, June 25, 2007 - 6:13 am

Are you running on AMD hardware? As Intel performance counters are only
32 bits wide, the wrmsrl bug should be a non-issue at least for the NMI
watchdog on Intel hardware. AMD uses 48bit wide performance counters,
which are probably less happy ;-)

Björn
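
The truncation discussed above can be sketched in plain user-space C. This is
only an illustration, not kernel code: struct fake_msr, fake_wrmsr(),
fake_rdmsrl(), wrmsrl_broken() and wrmsrl_fixed() are hypothetical stand-ins
for the real MSR accessors, and the 48-bit value is an arbitrary example
rather than anything the NMI watchdog is known to write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an MSR: records the two 32-bit halves written. */
struct fake_msr {
	uint32_t low;
	uint32_t high;
};

/* Stand-in for wrmsr(msr, low, high): stores both halves. */
static inline void fake_wrmsr(struct fake_msr *m, uint32_t low, uint32_t high)
{
	m->low = low;
	m->high = high;
}

/* Old macro's behaviour: value becomes the low half, high half is always 0. */
static inline void wrmsrl_broken(struct fake_msr *m, uint64_t val)
{
	fake_wrmsr(m, (uint32_t)val, 0);
}

/* Fixed macro's behaviour: the 64-bit value is split into both halves. */
static inline void wrmsrl_fixed(struct fake_msr *m, uint64_t val)
{
	fake_wrmsr(m, (uint32_t)val, (uint32_t)(val >> 32));
}

/* Reassemble the 64-bit value the fake register currently holds. */
static inline uint64_t fake_rdmsrl(const struct fake_msr *m)
{
	return ((uint64_t)m->high << 32) | m->low;
}

/* Self-check: call from main() or a test harness. */
static inline void check_wrmsrl_sketch(void)
{
	struct fake_msr m = { 0, 0 };
	const uint64_t val = 0x0000FFFFFFFFFC18ULL; /* illustrative 48-bit value */

	wrmsrl_broken(&m, val);
	assert(fake_rdmsrl(&m) == 0x00000000FFFFFC18ULL); /* bits 32..47 lost */

	wrmsrl_fixed(&m, val);
	assert(fake_rdmsrl(&m) == val);                   /* all 48 bits kept */
}
```

A value that fits in 32 bits round-trips through both variants unchanged,
which would be consistent with the bug going unnoticed on hardware whose
counters are only 32 bits wide.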
-

From: Rusty Russell
Date: Tuesday, June 5, 2007 - 8:16 am

drivers/built-in.o: In function `ahci_port_start':
/home/rusty/linux-2.6.22-rc3-mm1/drivers/ata/ahci.c:1631: undefined reference to `ahci_port_resume'

Presumably because:
# CONFIG_PM is not set

Cheers,
Rusty.


-

From: Adrian Bunk
Date: Tuesday, June 5, 2007 - 2:50 pm

This patch makes some needlessly global code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

BTW: Please don't #include C files in sched.c

 kernel/sched.c      |    2 +-
 kernel/sched_fair.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.22-rc3-mm1/kernel/sched_fair.c.old	2007-06-05 22:18:18.000000000 +0200
+++ linux-2.6.22-rc3-mm1/kernel/sched_fair.c	2007-06-05 22:29:00.000000000 +0200
@@ -134,7 +134,7 @@
 	return curr->load_weight * (s64)(granularity / NICE_0_LOAD);
 }
 
-unsigned long get_rq_load(struct rq *rq)
+static unsigned long get_rq_load(struct rq *rq)
 {
 	unsigned long load = rq->cpu_load[CPU_LOAD_IDX_MAX-1] + 1;
 
@@ -384,7 +384,7 @@
 	p->exec_start = 0;
 }
 
-long div64_s(s64 divident, unsigned long divisor)
+static long div64_s(s64 divident, unsigned long divisor)
 {
 	u64 tmp;
 
--- linux-2.6.22-rc3-mm1/kernel/sched.c.old	2007-06-05 22:29:19.000000000 +0200
+++ linux-2.6.22-rc3-mm1/kernel/sched.c	2007-06-05 22:29:38.000000000 +0200
@@ -564,7 +564,7 @@
  * if you go up 1 level, it's -10% CPU usage, if you go down 1 level
  * it's +10% CPU usage.
  */
-const int prio_to_weight[40] = {
+static const int prio_to_weight[40] = {
 /* -20 */ 88818, 71054, 56843, 45475, 36380, 29104, 23283, 18626, 14901, 11921,
 /* -10 */  9537,  7629,  6103,  4883,  3906,  3125,  2500,  2000,  1600,  1280,
 /*   0 */  NICE_0_LOAD /* 1024 */,

-

From: Andrew Morton
Date: Tuesday, June 5, 2007 - 11:54 pm

"divident" does appear to be a word, but I suspect "dividend" was intended.

Why is this function lurking in the CPU scheduler rather than in
lib/somewhere.c?

Doesn't an unsigned divide give the same result as a signed one?
-

From: Ingo Molnar
Date: Wednesday, June 6, 2007 - 12:30 am

In this case it's not that bad. It makes the source quite a bit cleaner 
and avoids having to create artificial interfaces, global functions, 


no! 0xfffffff0 / 2 is 0x7ffffff8 when the division is unsigned, and 
0xfffffff8 (== -8) when signed. On x86 the silicon only offers us unsigned 
64-bit division, so we first have to make '+16' out of -16, then divide 
by 2, and turn the +8 into -8. (On x86_64 there's no such problem, 
there's an idiv and a div 64-bit instruction as well, and gcc picks the 
right one depending on the type of the variable.)

i think Roman has recently done a nice cleanup patch that introduces 
this? I'll change CFS to use that interface once it's upstream.

	Ingo
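
The negate, divide, negate-back approach described above can be sketched as
follows. This is a minimal user-space illustration, not the div64_s() from
sched_fair.c: the name div_s64_sketch() and its exact signature are
assumptions, and the final conversion back to a signed type relies on the
two's-complement behaviour that gcc provides.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: signed 64-bit dividend, unsigned 32-bit divisor,
 * implemented with unsigned division only.
 */
static inline int64_t div_s64_sketch(int64_t dividend, uint32_t divisor)
{
	uint64_t magnitude;

	assert(divisor != 0);

	if (dividend < 0) {
		/* Negate in unsigned arithmetic so INT64_MIN is handled too. */
		magnitude = (0 - (uint64_t)dividend) / divisor;
		/* Negate the quotient again to restore the sign. */
		return (int64_t)(0 - magnitude);
	}

	return (int64_t)((uint64_t)dividend / divisor);
}
```

With this sketch, div_s64_sketch(-16, 2) yields -8, whereas dividing the same
32-bit pattern as an unsigned number (0xfffffff0 / 2) yields 0x7ffffff8, so
the two kinds of division do not agree for negative dividends.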
-

From: Adrian Bunk
Date: Wednesday, June 6, 2007 - 5:31 am

The idiom used in the kernel for such code is "global code and compile 
the files separately". The expected inclusion for a C file into the 
kernel is through the Makefile, and everything else is surprising when 
looking through the code.

"artificial interfaces" is not a problem since these are completely 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Ingo Molnar
Date: Wednesday, June 6, 2007 - 12:02 am

thanks, applied.

	Ingo
-

From: Adrian Bunk
Date: Tuesday, June 5, 2007 - 2:51 pm

This patch makes two needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 kernel/lockdep_proc.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.22-rc3-mm1/kernel/lockdep_proc.c.old	2007-06-05 22:07:30.000000000 +0200
+++ linux-2.6.22-rc3-mm1/kernel/lockdep_proc.c	2007-06-05 22:07:57.000000000 +0200
@@ -364,7 +364,7 @@
 /*
  * sort on absolute number of contentions
  */
-int lock_stat_cmp(const void *l, const void *r)
+static int lock_stat_cmp(const void *l, const void *r)
 {
 	const struct lock_stat_data *dl = l, *dr = r;
 	unsigned long nl, nr;
@@ -567,8 +567,8 @@
 	return res;
 }
 
-ssize_t lock_stat_write(struct file *file, const char __user *buf,
-		size_t count, loff_t *ppos)
+static ssize_t lock_stat_write(struct file *file, const char __user *buf,
+			       size_t count, loff_t *ppos)
 {
 	struct lock_class *class;
 	char c;

-

From: Peter Zijlstra
Date: Tuesday, June 5, 2007 - 3:34 pm

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>


-

From: Adrian Bunk
Date: Tuesday, June 5, 2007 - 2:50 pm

This patch makes needlessly global code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 kernel/power/disk.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.22-rc3-mm1/kernel/power/disk.c.old	2007-06-05 22:12:07.000000000 +0200
+++ linux-2.6.22-rc3-mm1/kernel/power/disk.c	2007-06-05 22:13:37.000000000 +0200
@@ -45,7 +45,7 @@
 
 static int hibernation_mode = HIBERNATION_SHUTDOWN;
 
-struct hibernation_ops *hibernation_ops;
+static struct hibernation_ops *hibernation_ops;
 
 /**
  * hibernation_set_ops - set the global hibernate operations
@@ -231,7 +231,7 @@
  *	to power off or reboot.
  */
 
-void power_down(void)
+static void power_down(void)
 {
 	switch (hibernation_mode) {
 	case HIBERNATION_TEST:

-

From: Rafael J. Wysocki
Date: Tuesday, June 5, 2007 - 3:10 pm

ACK


-- 
"Premature optimization is the root of all evil." - Donald Knuth
-

From: Adrian Bunk
Date: Tuesday, June 5, 2007 - 2:51 pm

This patch makes the needlessly global struct proc_pid_sched_operations 
static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---
--- linux-2.6.22-rc3-mm1/fs/proc/base.c.old	2007-06-05 22:01:08.000000000 +0200
+++ linux-2.6.22-rc3-mm1/fs/proc/base.c	2007-06-05 22:41:57.000000000 +0200
@@ -952,7 +952,7 @@
 	return ret;
 }
 
-const struct file_operations proc_pid_sched_operations = {
+static const struct file_operations proc_pid_sched_operations = {
 	.open		= sched_open,
 	.read		= seq_read,
 	.write		= sched_write,
-

From: Ingo Molnar
Date: Wednesday, June 6, 2007 - 12:32 am

thanks.

Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo
-
