Re: [PATCH] mmc: fix for CONFIG_PM disabled

Previous thread: viafb output device rework - part 1 by Florian Tobias Schandinat on Wednesday, August 11, 2010 - 4:49 pm. (12 messages)

Next thread: Re: + drivers-acpi-apei-erst-dbgc-get_useru64-doesnt-work-on-i386.patch added to -mm tree by Randy Dunlap on Wednesday, August 11, 2010 - 4:43 pm. (10 messages)
From: akpm
Date: Wednesday, August 11, 2010 - 4:10 pm

The mm-of-the-moment snapshot 2010-08-11-16-10 has been uploaded to

   http://userweb.kernel.org/~akpm/mmotm/

and will soon be available at

   git://zen-kernel.org/kernel/mmotm.git

It contains the following patches against ...
From: Valdis.Kletnieks
Date: Thursday, August 12, 2010 - 9:18 am

Throws a RCU complaint.  Hopefully somebody on the cc: list knows what it is about...

[    0.026136] CPU0: Intel(R) Core(TM)2 Duo CPU     P8700  @ 2.53GHz stepping 0a
[    0.028399] NMI watchdog enabled, takes one hw-pmu counter.
[    0.030019] lockdep: fixing up alternatives.
[    0.031178] 
[    0.031179] ===================================================
[    0.031182] [ INFO: suspicious rcu_dereference_check() usage. ]
[    0.031184] ---------------------------------------------------
[    0.031187] kernel/sched.c:618 invoked rcu_dereference_check() without protection!
[    0.031189] 
[    0.031189] other info that might help us debug this:
[    0.031190] 
[    0.031192] 
[    0.031193] rcu_scheduler_active = 1, debug_locks = 1
[    0.031195] 3 locks held by kworker/0:0/4:
[    0.031197]  #0:  (events){+.+.+.}, at: [<ffffffff810504ca>] process_one_work+0x1b6/0x37d
[    0.031210]  #1:  ((&c_idle.work)){+.+.+.}, at: [<ffffffff810504ca>] process_one_work+0x1b6/0x37d
[    0.031217]  #2:  (&rq->lock){-.-...}, at: [<ffffffff81b5f9b8>] init_idle+0x2b/0x114
[    0.031225] 
[    0.031226] stack backtrace:
[    0.031229] Pid: 4, comm: kworker/0:0 Not tainted 2.6.35-mmotm0811 #1
[    0.031232] Call Trace:
[    0.031237]  [<ffffffff810661eb>] lockdep_rcu_dereference+0x9d/0xa5
[    0.031242]  [<ffffffff8102b751>] task_group+0x7b/0x8a
[    0.031246]  [<ffffffff81b5f9b8>] ? init_idle+0x2b/0x114
[    0.031250]  [<ffffffff8102b775>] set_task_rq+0x15/0x6e
[    0.031253]  [<ffffffff81b5fa5e>] init_idle+0xd1/0x114
[    0.031257]  [<ffffffff81b5fb44>] fork_idle+0x8e/0x9d
[    0.031261]  [<ffffffff81b5de6f>] do_fork_idle+0x17/0x28
[    0.031265]  [<ffffffff8105052b>] process_one_work+0x217/0x37d
[    0.031269]  [<ffffffff810504ca>] ? process_one_work+0x1b6/0x37d
[    0.031273]  [<ffffffff81b5de58>] ? do_fork_idle+0x0/0x28
[    0.031277]  [<ffffffff81051775>] worker_thread+0x17e/0x251
[    0.031281]  [<ffffffff810515f7>] ? worker_thread+0x0/0x251
[    0.031285]  [<ffffffff8105544a>] ...
From: Paul E. McKenney
Date: Monday, August 16, 2010 - 10:23 am

Hello, Valdis!


Interesting!  My first thought was that this is a false positive, given
that lockdep_is_held(&task_rq(p)->lock) is one of the arguments to
task_subsys_state_check() and thus to rcu_dereference_check().  However...

Given the "lockdep: fixing up alternatives" above, we know that cpu==1,
and that the code is running on CPU 0.

So init_idle() acquires the specified CPU's runqueue lock:

	struct rq *rq = cpu_rq(cpu);
	...
	raw_spin_lock_irqsave(&rq->lock, flags);

Then init_idle() goes on to initialize a number of fields in the
new idle task's task structure, then calls __set_task_cpu() to set
up the new idle task on the specified CPU.

Now, __set_task_cpu() invokes set_task_rq(), which invokes task_group(),
which as mentioned before specifies lockdep_is_held(&task_rq(p)->lock)
as one of the splat-avoiding conditions.  But the new idle task does
not yet have its current CPU set to CPU 1 -- that doesn't happen until
the end of __set_task_cpu().  Therefore, task_rq(p) will return 0.

So, if I am reading the code correctly, task_group() will be checking
for CPU 0's runqueue, when we are instead holding CPU 1's runqueue lock.
The patch below fixes this by acquiring both locks, as is done during
task migration.  Untested, probably does not even compile.


Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 sched.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 70fa78d..81a6a0a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5314,9 +5314,11 @@ void __cpuinit init_idle_bootup_task(struct task_struct *idle)
 void __cpuinit init_idle(struct task_struct *idle, int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
+	struct rq *oldrq = task_rq(idle);
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&rq->lock, flags);
+	local_irq_save(flags);
+	double_rq_lock(oldrq, rq);
 
 	__sched_fork(idle);
 	idle->state = TASK_RUNNING;
@@ -5329,7 +5331,8 @@ void __cpuinit init_idle(struct ...
From: Zdenek Kabelac
Date: Tuesday, October 5, 2010 - 3:05 am

I'm still seeing this INFO message on my vanilla 2.6.36-rc kernel.

----------------------

ftrace: converting mcount calls to 0f 1f 44 00 00
ftrace: allocating 16045 entries in 63 pages
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz stepping 0a
NMI watchdog enabled, takes one hw-pmu counter.
lockdep: fixing up alternatives.

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/sched.c:618 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
3 locks held by kworker/0:0/4:
 #0:  (events){+.+.+.}, at: [<ffffffff8106e78e>] process_one_work+0x12e/0x560
 #1:  ((&c_idle.work)){+.+.+.}, at: [<ffffffff8106e78e>]
process_one_work+0x12e/0x560
 #2:  (&rq->lock){......}, at: [<ffffffff814772c3>] init_idle+0x30/0x12c

stack backtrace:
Pid: 4, comm: kworker/0:0 Not tainted 2.6.36-rc6-00085-g6e34025 #1
Call Trace:
 [<ffffffff81089c1b>] lockdep_rcu_dereference+0xbb/0xc0
 [<ffffffff8103e7d5>] set_task_rq+0x2f5/0x300
 [<ffffffff81477374>] init_idle+0xe1/0x12c
 [<ffffffff81477769>] fork_idle+0x90/0x9f
 [<ffffffff81048dfa>] ? enqueue_entity+0x13a/0x430
 [<ffffffff81482789>] ? sub_preempt_count+0x59/0x60
 [<ffffffff814752f5>] do_fork_idle+0x1c/0x2d
 [<ffffffff8106e7fa>] process_one_work+0x19a/0x560
 [<ffffffff8106e78e>] ? process_one_work+0x12e/0x560
 [<ffffffff814752d9>] ? do_fork_idle+0x0/0x2d
 [<ffffffff81070189>] worker_thread+0x169/0x340
 [<ffffffff81070020>] ? worker_thread+0x0/0x340
 [<ffffffff810752e6>] kthread+0xa6/0xb0
 [<ffffffff81004014>] kernel_thread_helper+0x4/0x10
 [<ffffffff8147efcb>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff8147f600>] ? restore_args+0x0/0x30
 [<ffffffff81075240>] ? kthread+0x0/0xb0
 [<ffffffff81004010>] ? kernel_thread_helper+0x0/0x10
Booting Node   0, Processors  ...
From: Paul E. McKenney
Date: Wednesday, October 6, 2010 - 4:04 pm

Hello, Zdenek,

I believe that the following patch from Peter Z. should address this.

							Thanx, Paul

------------------------------------------------------------------------

commit e3dd67d97b3c2aad366b845c797745a78efaf90d
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Thu Sep 16 17:50:31 2010 +0200

    sched: fix RCU lockdep splat from task_group()
    
    This addresses the following RCU lockdep splat:
    
    [0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
    [0.052999] lockdep: fixing up alternatives.
    [0.054105]
    [0.054106] ===================================================
    [0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
    [0.054999] ---------------------------------------------------
    [0.054999] kernel/sched.c:616 invoked rcu_dereference_check() without protection!
    [0.054999]
    [0.054999] other info that might help us debug this:
    [0.054999]
    [0.054999]
    [0.054999] rcu_scheduler_active = 1, debug_locks = 1
    [0.054999] 3 locks held by swapper/1:
    [0.054999]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff814be933>] cpu_up+0x42/0x6a
    [0.054999]  #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810400d8>] cpu_hotplug_begin+0x2a/0x51
    [0.054999]  #2:  (&rq->lock){-.-...}, at: [<ffffffff814be2f7>] init_idle+0x2f/0x113
    [0.054999]
    [0.054999] stack backtrace:
    [0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
    [0.054999] Call Trace:
    [0.054999]  [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
    [0.054999]  [<ffffffff810325c3>] task_group+0x7b/0x8a
    [0.054999]  [<ffffffff810325e5>] set_task_rq+0x13/0x40
    [0.054999]  [<ffffffff814be39a>] init_idle+0xd2/0x113
    [0.054999]  [<ffffffff814be78a>] fork_idle+0xb8/0xc7
    [0.054999]  [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
    [0.054999]  [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
    [0.054999]  [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
    [0.054999]  [<ffffffff814bcea6>] ? ...
From: Ben Greear
Date: Wednesday, October 6, 2010 - 4:18 pm

I get a similar lockdep splat, even with that patch applied,
so I think it is not completely fixed.

I'm using wireless-testing, which is based on 2.6.36-rc6.

See my previous email:
http://groups.google.com/group/linux.kernel/browse_thread/thread/fcef23494cfda353

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

--

From: Zdenek Kabelac
Date: Monday, October 18, 2010 - 5:26 am

I'm using kernel patched with this patch - but I still get this error
- though at different place:
(not really sure how it is related - but of course the RCU complain
disappeared during  boot).

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/sched.c:618 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
1 lock held by make/6137:
 #0:  (&rq->lock){......}, at: [<ffffffff810487d7>] task_fork_fair+0x67/0x180

stack backtrace:
Pid: 6137, comm: make Not tainted 2.6.36-rc8-00024-ga7ac73b #6
Call Trace:
 [<ffffffff81089b7b>] lockdep_rcu_dereference+0xbb/0xc0
 [<ffffffff8103e605>] set_task_rq+0x2f5/0x300
 [<ffffffff810488db>] task_fork_fair+0x16b/0x180
 [<ffffffff8104b634>] sched_fork+0xe4/0x280
 [<ffffffff8104fa55>] copy_process+0x6e5/0x13d0
 [<ffffffff81119809>] ? __do_fault+0x3b9/0x4b0
 [<ffffffff810507fb>] do_fork+0x8b/0x490
 [<ffffffff8111d6b6>] ? handle_mm_fault+0x196/0xa90
 [<ffffffff8147dc2d>] ? retint_swapgs+0xe/0x13
 [<ffffffff8147dc2d>] ? retint_swapgs+0xe/0x13
 [<ffffffff8100c395>] sys_vfork+0x25/0x30
 [<ffffffff81003583>] stub_vfork+0x13/0x20
 [<ffffffff810031db>] ? system_call_fastpath+0x16/0x1b
loop: module loaded


Zdenek
--

From: Paul E. McKenney
Date: Sunday, November 7, 2010 - 11:46 am

OK, so this one requires that we either be in an rcu_read_lock()
critical section, that we hold the task's alloc_lock(), that the
cgroup_mutex is held, or that the runqueue lock for the CPU that
the task last ran on is held.  Unfortunately, we are creating the
task, so there is no last CPU that it ran on, thus confusing the
check.

But this looks -really- familiar...

Could you please apply commit b0a0f667, which hit mainline after 2.6.36?

							Thanx, Paul
--

From: Randy Dunlap
Date: Thursday, August 12, 2010 - 9:36 am

From: Randy Dunlap <randy.dunlap@oracle.com>

Minimal patch to fix mmc to build when CONFIG_PM is not enabled:

(.text+0x128fcd): undefined reference to `mmc_pm_notify'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 drivers/mmc/core/host.c |    2 ++
 1 file changed, 2 insertions(+)

Seems to be needed in mainline also.

--- mmotm-2010-0811-1610.orig/drivers/mmc/core/host.c
+++ mmotm-2010-0811-1610/drivers/mmc/core/host.c
@@ -86,7 +86,9 @@ struct mmc_host *mmc_alloc_host(int extr
 	init_waitqueue_head(&host->wq);
 	INIT_DELAYED_WORK(&host->detect, mmc_rescan);
 	INIT_DELAYED_WORK_DEFERRABLE(&host->disable, mmc_host_deeper_disable);
+#ifdef CONFIG_PM
 	host->pm_notify.notifier_call = mmc_pm_notify;
+#endif
 
 	/*
 	 * By default, hosts do not support SGIO or large requests.
--

From: Uwe Kleine-König
Date: Wednesday, August 18, 2010 - 2:10 am

I sent the same patch[1] a few hours later than Randy, mine got

	Acked-by: Kukjin Kim <kgene.kim@samsung.com>
	Acked-by: Maxim Levitsky <maximlevitsky@gmail.com>

.  I think it's fine to add these to Randy's patch, too, together with
my Ack.  Maybe add a reference to the breaking commit as my patch did?
	
Thanks
Uwe

[1] http://mid.gmane.org/1281691473-15481-1-git-send-email-u.kleine-koenig@pengutronix.de

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
--

From: Valdis.Kletnieks
Date: Thursday, August 12, 2010 - 9:37 am

Not sure if it's an e1000e bug, or an iptables bug that happened to trip on
like the first few packets accepted after the interface came up, so I'll cc:
everybody and let ya'll fight over it. :)

[  431.011194] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
[  431.062183] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
[  431.064607] ADDRCONF(NETDEV_UP): eth0: link is not ready
[  432.691161] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
[  432.691177] e1000e 0000:00:19.0: eth0: 10/100 speed: disabling TSO
[  432.695461] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[  432.697278] 
[  432.697279] =================================
[  432.697477] [ INFO: inconsistent lock state ]
[  432.697581] 2.6.35-mmotm0811 #1
[  432.697682] ---------------------------------
[  432.697785] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  432.697890] kworker/0:0/0 [HC0[0]:SC1[2]:HE1:SE0] takes:
[  432.697994]  (&(&lock->lock)->rlock){+.?...}, at: [<ffffffff814d39c8>] ip6t_do_table+0xc1/0x646
[  432.698028] {SOFTIRQ-ON-W} state was registered at:
[  432.698028]   [<ffffffff8106762f>] __lock_acquire+0x3a3/0xd6a
[  432.698028]   [<ffffffff81068514>] lock_acquire+0x10a/0x130
[  432.698028]   [<ffffffff81557cd9>] _raw_spin_lock+0x36/0x45
[  432.698028]   [<ffffffff814d3375>] xt_info_wrlock+0x1c/0x1e
[  432.698028]   [<ffffffff814d48df>] get_counters+0x93/0x14a
[  432.698028]   [<ffffffff814d49c3>] alloc_counters.clone.3+0x2d/0x41
[  432.698028]   [<ffffffff814d4f98>] do_ip6t_get_ctl+0x110/0x367
[  432.698028]   [<ffffffff814402a7>] nf_sockopt+0x5c/0x88
[  432.698028]   [<ffffffff814402e6>] nf_getsockopt+0x13/0x15
[  432.698028]   [<ffffffff814ba05e>] ipv6_getsockopt+0x94/0xc3
[  432.698028]   [<ffffffff814c1175>] rawv6_getsockopt+0x48/0x54
[  432.698028]   [<ffffffff8141533a>] sock_common_getsockopt+0xf/0x11
[  432.698028]   [<ffffffff814147dd>] sys_getsockopt+0x75/0x93
[  432.698028]   [<ffffffff8100272b>] system_call_fastpath+0x16/0x1b
[  432.698028] irq event stamp: ...
From: Valdis.Kletnieks
Date: Thursday, August 12, 2010 - 11:59 am

Something appears to be borked in the ALSA arena.  There's no actual volume coming
out of the system, and 'alsamixer' is insisting that the volume slider only goes from 0
to 10% or so, no further.  However, experimentation shows that the volume slider
in 'xine' *does* affect the 'Amp-Out vals' lines, and alsamixer has *no* effect on
what 'Amp-Out vals' lists.

A diff of alsa-info.sh for the two kernels shows them being identical, so I'm only
attaching one copy.

It may be the weekend before I find time to do a bisection of this.
From: Takashi Iwai
Date: Thursday, August 12, 2010 - 12:37 pm

At Thu, 12 Aug 2010 14:59:32 -0400,

Hm, there shouldn't be a change regarding the volume control.
There can be an issue about PCM stream, e.g. in commit 
    eb541337b7a43822fce7d0c9d967ee149b2d9a96
    ALSA: hda - Make converter setups sticky

To be sure, could you try to revert it?  If it doesn't help but still
you get strange volume behavior, please get alsa-info.sh output at
different volume levels for comparison.


thanks,

Takashi
--

From: Jiri Slaby
Date: Thursday, August 12, 2010 - 2:06 pm

Didn't you (like some other people) get into the state where pulseaudio
doesn't work? It chooses as an output a dummy driver automatically, then
you can change volume, play sound, but actually it all goes to /dev/null.

It took me a while before I figured out that it's a "dummy" driver I
have in pulseaudio.

regards,
-- 
js
--

From: Takashi Iwai
Date: Thursday, August 12, 2010 - 2:11 pm

At Thu, 12 Aug 2010 23:06:22 +0200,

Looks like there is a breakage regarding open/close due to fs/notify/*
changes.  I guess you can hear still sounds like:

	% aplay -Dplughw foo.wav


Takashi
--

From: Valdis.Kletnieks
Date: Thursday, August 12, 2010 - 7:13 pm

Confirming that Linus's patch fixes it for me:

commit 2069601b3f0ea38170d4b509b89f3ca0a373bdc1
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Aug 12 14:23:04 2010 -0700

    Revert "fsnotify: store struct file not struct path"
    
    This reverts commit 3bcf3860a4ff9bbc522820b4b765e65e4deceb3e (and the
    accompanying commit c1e5c954020e "vfs/fsnotify: fsnotify_close can delay
    the final work in fput" that was a horribly ugly hack to make it work at
    all).


Previous thread: viafb output device rework - part 1 by Florian Tobias Schandinat on Wednesday, August 11, 2010 - 4:49 pm. (12 messages)

Next thread: Re: + drivers-acpi-apei-erst-dbgc-get_useru64-doesnt-work-on-i386.patch added to -mm tree by Randy Dunlap on Wednesday, August 11, 2010 - 4:43 pm. (10 messages)