Re: [PATCH] fix BUG using smp_processor_id() in touch_nmi_watchdog and touch_softlockup_watchdog

From: Frederic Weisbecker
Date: Tuesday, August 17, 2010 - 7:48 pm

On Tue, Aug 17, 2010 at 09:13:20AM -0400, Don Zickus wrote:


(Adding Len Brown in Cc.

Len, this is about acpi_os_stall(), which touches the watchdog while
running in a preemptable section. This triggers warnings because of
the use of local cpu accessors. We are debating the appropriate
way to solve this).

The more I think about it, the more I think it doesn't make sense
to have touch_nmi_watchdog() callable from preemptable code.

It is buggy by nature.

If you run in a preemptable section, then interrupts can fire, and if
they can, the nmi watchdog is fine and doesn't need to be touched.

Here the problem is more in the softlockup watchdog, because even if you
run in a preemptable section, if you run a !CONFIG_PREEMPT kernel, then
you can't be preempted and the watchdog won't be scheduled until the
udelay loop finishes. But to solve that you would need cond_resched()
calls, not touching the watchdog.

Touching the softlockup watchdog doesn't make sense either
if you can migrate: you can run the udelay on CPU 0, then migrate to
CPU 1 and call touch_softlockup_watchdog() from there. That
definitely makes no sense. This is buggy.
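
To make the migration problem concrete, here is a minimal userspace sketch,
not kernel code. The touch_ts[] array, touch_on() and looks_stuck() are
hypothetical stand-ins for the per-CPU watchdog timestamp and its check; the
tick values are arbitrary. It only illustrates that a touch issued after
migrating lands on the wrong CPU's timestamp:

```c
#include <assert.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the per-CPU "last touched" timestamp. */
static unsigned long touch_ts[NR_CPUS];

/* Model of touching the watchdog of whichever CPU the caller runs on. */
static void touch_on(int cpu, unsigned long now)
{
	touch_ts[cpu] = now;
}

/* Returns 1 if this CPU's timestamp is older than the threshold. */
static int looks_stuck(int cpu, unsigned long now, unsigned long thresh)
{
	return now - touch_ts[cpu] > thresh;
}

/*
 * The task busy-loops on CPU 0, migrates, then touches the watchdog
 * from CPU 1. Returns 1 if CPU 0 still looks stuck while CPU 1 was
 * refreshed for no reason.
 */
static int migrated_touch_misses_cpu0(void)
{
	unsigned long now = 0;

	touch_on(0, now);
	touch_on(1, now);

	now += 100;		/* udelay loop ran on CPU 0 */
	touch_on(1, now);	/* ...but the touch happens on CPU 1 */

	return looks_stuck(0, now, 60) && !looks_stuck(1, now, 60);
}
```

In this model the touch neither helps CPU 0, where the time was spent, nor is
it justified on CPU 1, where it could mask a real lockup.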

And because we want to avoid such buggy uses of the touch_whatever_watchdog()
APIs, these functions must continue to check that they are called from
non-preemptable code. Randomly touching the watchdog could hide real lockups
from the user.

The problem is in the caller. Consider such a udelay loop:

* if it's in an irq disabled section, call touch_nmi_watchdog(), because this
  could prevent the nmi watchdog irq from firing
* if it's in a non-preemptable section, call touch_softlockup_watchdog(), because
  this could prevent the softlockup watchdog task from being scheduled
* if it's from a preemptable task context, this should call cond_resched() to
  avoid huge latencies on !CONFIG_PREEMPT
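
The three cases above can be sketched as a small decision function. This is a
userspace illustration only: struct stall_ctx, enum stall_action and
pick_stall_action() are hypothetical names, not existing kernel APIs, and a
real caller would know its context statically rather than test flags at run
time.

```c
#include <assert.h>
#include <stdbool.h>

/* What a long busy-wait should do, per the three cases in the list. */
enum stall_action {
	ACT_TOUCH_NMI_WATCHDOG,		/* irqs disabled */
	ACT_TOUCH_SOFTLOCKUP_WATCHDOG,	/* non-preemptable, irqs enabled */
	ACT_COND_RESCHED,		/* preemptable task context */
};

/* Hypothetical description of the context the busy loop runs in. */
struct stall_ctx {
	bool irqs_disabled;
	bool preemptible;
};

static enum stall_action pick_stall_action(struct stall_ctx ctx)
{
	/* Assumption: a section with irqs disabled is never preemptable. */
	assert(!(ctx.irqs_disabled && ctx.preemptible));

	if (ctx.irqs_disabled)
		return ACT_TOUCH_NMI_WATCHDOG;
	if (!ctx.preemptible)
		return ACT_TOUCH_SOFTLOCKUP_WATCHDOG;
	return ACT_COND_RESCHED;
}
```

The checks are ordered from the most restrictive context to the least, so each
context maps to exactly one action.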


But acpi_os_stall() seems to be called from 4 different places, and these places
may run in different contexts like those described above.

The ACPI code should probably use more specific busy-loop APIs, depending on the
context it runs in.
