Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!

From: Vegard Nossum
Date: Saturday, August 30, 2008 - 8:54 am

On Sat, Aug 30, 2008 at 5:30 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:

OK, mine is definitely caused by CPU hotplug. The issue you are seeing is
probably more important, though, so sorry for hijacking your thread. You
should be able to capture the same SysRq output on your side. Here is
SysRq-w (dump blocked tasks) and SysRq-d (show held locks) from my machine:

SysRq : Show Blocked State
  task                PC stack   pid father
bash          D d450a1ee  5912  3772   3720
       f68e1df4 00200046 2fb00387 d450a1ee 2fb00387 00000004 f68e1df0 c015aa4a
       c2036d80 f7830cc0 1e881e2a 000000bc f67d72c0 f67d7534 c2036d80 f68e0000
       f86e8552 00000000 6db5f2ce 00000000 f86e8552 f67d781c f67d72c0 c0d1dcf4
Call Trace:
 [<c015aa4a>] ? __lock_acquire+0x27a/0xa00
 [<c05a1a38>] schedule_timeout+0x78/0xc0
 [<c015904f>] ? mark_held_locks+0x6f/0x90
 [<c015927b>] ? trace_hardirqs_on+0xb/0x10
 [<c01591e4>] ? trace_hardirqs_on_caller+0xd4/0x160
 [<c015927b>] ? trace_hardirqs_on+0xb/0x10
 [<c05a0e33>] wait_for_common+0xb3/0x130
 [<c012c860>] ? default_wake_function+0x0/0x10
 [<c05a0f42>] wait_for_completion+0x12/0x20
 [<c016c264>] __stop_machine+0x184/0x1f0
 [<c016c310>] ? stop_cpu+0x0/0xa0
 [<c016c0d0>] ? chill+0x0/0x10
 [<c057e920>] ? take_cpu_down+0x0/0x30
 [<c057ea77>] _cpu_down+0xd7/0x270
 [<c0179d1d>] ? __synchronize_sched+0x2d/0x40
 [<c0146ef0>] ? wakeme_after_rcu+0x0/0x10
 [<c057ec5a>] cpu_down+0x4a/0x70
 [<c057ff68>] store_online+0x38/0x80
 [<c057ff30>] ? store_online+0x0/0x80
 [<c0312f5c>] sysdev_store+0x2c/0x40
 [<c01e67b2>] sysfs_write_file+0xa2/0x100
 [<c01a8606>] vfs_write+0x96/0x130
 [<c01e6710>] ? sysfs_write_file+0x0/0x100
 [<c01a8b4d>] sys_write+0x3d/0x70
 [<c01040db>] sysenter_do_call+0x12/0x3f
 =======================


SysRq : Show Locks Held

Showing all locks held in the system:
1 lock held by mingetty/3520:
 #0:  (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3521:
 #0:  (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3522:
 #0:  (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3523:
 #0:  (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
1 lock held by mingetty/3524:
 #0:  (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
4 locks held by bash/3772:
 #0:  (&buffer->mutex){--..}, at: [<c01e673b>] sysfs_write_file+0x2b/0x100
 #1:  (cpu_add_remove_lock){--..}, at: [<c01367af>] cpu_maps_update_begin+0xf/00
 #2:  (&cpu_hotplug.lock){--..}, at: [<c013680a>] cpu_hotplug_begin+0x1a/0x50
 #3:  (lock){--..}, at: [<c016c132>] __stop_machine+0x52/0x1f0
1 lock held by bash/5074:
 #0:  (&tty->atomic_read_lock){--..}, at: [<c02f3004>] read_chan+0x424/0x640
2 locks held by bash/6363:
 #0:  (sysrq_key_table_lock){++..}, at: [<c030094a>] __handle_sysrq+0x1a/0x120
 #1:  (tasklist_lock){..--}, at: [<c01574a1>] debug_show_all_locks+0x31/0x180

=============================================
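
In case it helps with getting the same two dumps without a console
keyboard: a minimal sketch, assuming the usual /proc/sysrq-trigger
interface and root access. The helper names (sysrq_trigger,
dump_blocked_and_locks) are made up for illustration, and the path is a
parameter so the code can be tried against an ordinary file first.

```c
#include <stdio.h>

/* Write one SysRq command character to a trigger file.
 * On a real system the path would be /proc/sysrq-trigger (needs root);
 * it is a parameter here so the helper can be tested on a plain file.
 * Returns 0 on success, -1 on failure. */
int sysrq_trigger(const char *path, char cmd)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputc(cmd, f) != EOF);
    /* fclose() flushes the buffer, so a failed write can surface here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/* Request the two dumps from this mail:
 * 'w' = show blocked tasks, 'd' = show all held locks. */
int dump_blocked_and_locks(const char *path)
{
    if (sysrq_trigger(path, 'w') != 0)
        return -1;
    return sysrq_trigger(path, 'd');
}
```

Calling dump_blocked_and_locks("/proc/sysrq-trigger") as root should put
both dumps in the kernel log. As far as I know, the 'd' dump only exists
on kernels built with lock debugging (lockdep).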

So in my case it seems to be stuck in __stop_machine(), waiting for the
&finished completion:

        /* This will release the thread on our CPU. */
        put_cpu();
        wait_for_completion(&finished);
        mutex_unlock(&lock);
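
To illustrate why this shows up as a task blocked forever: a rough
userspace analogue of the completion pattern, written with pthreads. This
is not the kernel implementation; the struct and function names
(completion_sketch, completion_init, completion_signal,
completion_wait_ms) are invented for the sketch, and the timeout exists
only so the example can terminate. The kernel's wait_for_completion() has
no timeout, so if nobody ever signals the completion, the waiter stays
blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-in for a completion (illustrative only). */
struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

void completion_init(struct completion_sketch *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Mark the work as finished and wake any waiters. */
void completion_signal(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Wait up to timeout_ms for the completion.
 * Returns true if it was signalled, false if the wait timed out. */
bool completion_wait_ms(struct completion_sketch *c, long timeout_ms)
{
    struct timespec deadline;
    bool ok = true;

    /* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&c->lock);
    while (!c->done) {
        if (pthread_cond_timedwait(&c->cond, &c->lock, &deadline) == ETIMEDOUT) {
            ok = c->done;   /* re-check once in case of a late signal */
            break;
        }
    }
    pthread_mutex_unlock(&c->lock);
    return ok;
}
```

If completion_signal() is never called, completion_wait_ms() returns false
after the timeout. Without a timeout the caller would simply never return,
which is roughly what the bash task in the trace is doing.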


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
--

Messages in current thread:
BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!, Thomas Backlund, (Sat Aug 30, 5:46 am)
Re: BUG: kernel-2.6.27-rc5: soft lockup - CPU#X stuck for 61s!, Vegard Nossum, (Sat Aug 30, 8:54 am)