Re: latest -git: hibernate: possible circular locking dependency detected

From: Dmitry Adamushko
Date: Thursday, August 21, 2008 - 10:38 am

Hi,

[ cc: Peter and Oleg ]

heh, my mind might also have been 'hibernated' by the evening, but I
still dare to speculate :-)


=======================================================

this path is triggered as a result of "echo disk > /sys/power/state"

disable_nonboot_cpus() calls cpu_maps_update_begin(), which takes
"cpu_add_remove_lock" (lock-1).

If we then go down the road cleanup_workqueue_thread() ->
flush_cpu_workqueue(), it will take "cwq->lock" (lock-2).
So this should be the second lock.



hmm, did you somehow hit "Sysrq + o"?

'cause I don't see any other places (say, with handle_sysrq(k,...)
where "k" might be 'o') from which do_poweroff() might have been
triggered...

however, I think there are 2 problems with handle_poweroff()
[ kernel/power/poweroff.c ]

(1) it doesn't ensure that the 'cpu' it gets via
first_cpu(cpu_online_map) can't disappear (race with cpu_down()) on
the way to schedule_work_on()

[ I presume neither the generic sysrq nor the console layer takes care
of it. They shouldn't, of course ]

(2) run_workqueue() [ which in the end calls do_poweroff() ] takes the
"cwq->lock" (which is lock-2 in our terminology)

well, actually it releases it before calling "work->func()", but is
the 'lockdep' annotation right here? Peter?

(I admit I have never looked at lockdep and am making assumptions
about its syntax here).

Lock-1 will then be taken as a result of

do_poweroff() -> kernel_power_off() -> disable_nonboot_cpus()

which calls cpu_maps_update_begin() and takes "cpu_add_remove_lock"

and this looks dangerous, for the same reason as the earlier use of
get_online_cpus() by workqueue handlers, before CPU_POST_DEAD was
introduced
(http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3da1c84c00...)

I guess it may deadlock: lock-1 has already been taken before calling
cleanup_workqueue_thread() -> flush_cpu_workqueue(), and the flush can
only complete once the queued do_poweroff() work has run, which in
turn depends on being able to acquire the very same lock.
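
To make the suspected cycle concrete, here is a minimal sketch in
Python. It is not kernel code and not how lockdep is actually
implemented. It only records "held -> then acquired" edges for the two
paths discussed above and searches the resulting graph for a cycle.
find_cycle() and the two edge tuples are hypothetical names made up for
illustration; the lock names are the ones used in this message, and
modelling the workqueue dependency as a plain "cwq->lock" edge is the
same assumption I make above.

```python
def find_cycle(edges):
    """Return one cycle in the 'held -> then acquired' graph, or None."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)

    def visit(node, path):
        # Reaching a node that is already on the current path closes a cycle.
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in list(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Hibernate path: disable_nonboot_cpus() holds cpu_add_remove_lock (lock-1),
# then cleanup_workqueue_thread() -> flush_cpu_workqueue() depends on lock-2.
hibernate_path = ("cpu_add_remove_lock", "cwq->lock")

# Poweroff path as assumed here: run_workqueue() is annotated with lock-2,
# then do_poweroff() -> disable_nonboot_cpus() takes lock-1.
poweroff_path = ("cwq->lock", "cpu_add_remove_lock")

print(find_cycle([hibernate_path]))
print(find_cycle([hibernate_path, poweroff_path]))
```

With only the hibernate edge there is nothing to report; once the
poweroff edge is added, the two locks form a loop, which is the shape
of dependency the report complains about. The search is a naive
depth-first walk, fine for a handful of locks but not meant to scale.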

hm?





-- 
Best regards,
Dmitry Adamushko