On Sat, Aug 30, 2008 at 5:30 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
Ok, mine is definitely caused by CPU hotplug. The issue you are seeing is
probably more important, though; sorry for hijacking your thread. You
should be able to capture the same SysRq output on your side as well.
Here is SysRq-w (dump blocked tasks) and SysRq-d (show held locks) from
my machine:
SysRq : Show Blocked State
task PC stack pid father
bash D d450a1ee 5912 3772 3720
f68e1df4 00200046 2fb00387 d450a1ee 2fb00387 00000004 f68e1df0 c015aa4a
c2036d80 f7830cc0 1e881e2a 000000bc f67d72c0 f67d7534 c2036d80 f68e0000
f86e8552 00000000 6db5f2ce 00000000 f86e8552 f67d781c f67d72c0 c0d1dcf4
Call Trace:
[<c015aa4a>] ? __lock_acquire+0x27a/0xa00
[<c05a1a38>] schedule_timeout+0x78/0xc0
[<c015904f>] ? mark_held_locks+0x6f/0x90
[<c015927b>] ? trace_hardirqs_on+0xb/0x10
[<c01591e4>] ? trace_hardirqs_on_caller+0xd4/0x160
[<c015927b>] ? trace_hardirqs_on+0xb/0x10
[<c05a0e33>] wait_for_common+0xb3/0x130
[<c012c860>] ? default_wake_function+0x0/0x10
[<c05a0f42>] wait_for_completion+0x12/0x20
[<c016c264>] __stop_machine+0x184/0x1f0
[<c016c310>] ? stop_cpu+0x0/0xa0
[<c016c0d0>] ? chill+0x0/0x10
[<c057e920>] ? take_cpu_down+0x0/0x30
[<c057ea77>] _cpu_down+0xd7/0x270
[<c0179d1d>] ? __synchronize_sched+0x2d/0x40
[<c0146ef0>] ? wakeme_after_rcu+0x0/0x10
[<c057ec5a>] cpu_down+0x4a/0x70
[<c057ff68>] store_online+0x38/0x80
[<c057ff30>] ? store_online+0x0/0x80
[<c0312f5c>] sysdev_store+0x2c/0x40
[<c01e67b2>] sysfs_write_file+0xa2/0x100
[<c01a8606>] vfs_write+0x96/0x130
[<c01e6710>] ? sysfs_write_file+0x0/0x100
[<c01a8b4d>] sys_write+0x3d/0x70
[<c01040db>] sysenter_do_call+0x12/0x3f
=======================
SysRq : Show Locks Held
Showing all locks held in the system:
1 lock held by mingetty/3520:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3521:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3522:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3523:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3524:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
4 locks held by bash/3772:
#0: (&buffer->mutex){--..}, at: [<c01e673b>] sysfs_write_file+0x2b/0x100
#1: (cpu_add_remove_lock){--..}, at: [<c01367af>] cpu_maps_update_begin+0xf/00
#2: (&cpu_hotplug.lock){--..}, at: [<c013680a>] cpu_hotplug_begin+0x1a/0x50
#3: (lock){--..}, at: [<c016c132>] __stop_machine+0x52/0x1f0
1 lock held by bash/5074:
#0: (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
2 locks held by bash/6363:
#0: (sysrq_key_table_lock){++..}, at: [<c030094a>] __handle_sysrq+0x1a/0x120
#1: (tasklist_lock){..--}, at: [<c01574a1>] debug_show_all_locks+0x31/0x180
=============================================
So in my case bash seems to be stuck in __stop_machine(), waiting for
the &finished completion while still holding the stop_machine mutex
(lock #3 above):
/* This will release the thread on our CPU. */
put_cpu();
wait_for_completion(&finished);
mutex_unlock(&lock);
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036