the Blackfin on-chip watchdog has controllable behavior ... it can be
configured to reset the processor (like a normal watchdog), or it can
be configured to simply generate an interrupt.
i can see embedded systems where simply resetting the system is not
desirable ... perhaps it's the control system for some machinery and
resetting the system will force it to reinitialize itself which could
cause problems if a guy is servicing the insides at the time ;)
the Blackfin watchdog driver has a module parameter, "action" ... the
default will have the watchdog act like every other watchdog out there
-- it reboots after a timeout. however, by setting the action param
appropriately, the watchdog goes into simple interrupt generation.
the question then becomes, how do people developing their
board-specific version customize what happens when the timeout occurs
and the interrupt is fired ? making every customer who wishes to
customize the watchdog behavior edit the watchdog driver is
troublesome as they're now blurring the distinct parts: the
watchdog-specific piece and their board-specific piece.
what i'm doing now is weak symbols:
...
extern irqreturn_t bfin_board_watchdog_interrupt(void) __attribute__((weak));
static irqreturn_t bfin_wdt_interrupt(int irq, void *dev_id)
{
if (bfin_board_watchdog_interrupt) {
return bfin_board_watchdog_interrupt();
} else {
bfin_wdt_stop();
bfin_wdt_keepalive();
bfin_wdt_start();
return IRQ_HANDLED;
}
}
...
is this completely bad mojo ? is there some other mechanism that
provides what i want and i just dont know about it ? or do i just
make people change the driver to fit their application, thus throwing
out the idea of keeping all board-specific details in just the boards
file ...
-mike
-
It sounds like your constraining your driver based on terminology. Watchdogs on most embedded platforms support either a 'reset' mode or otherwise act as periodic timers, trying to push both of these functionalities in to a watchdog driver is rather pointless. CONFIG_WATCHDOG implies 'reset' mode by definition. If you wish to use your watchdog timer as a periodic timer, simply have a clocksource/clockevents established for it, leave the watchdog driver as a reset-only thing, and let the user decide which one they want either via Kconfig or the kernel command line. (The watchdog driver can just -ENODEV or -EBUSY if the clocksource is active). -
I agree for a product in the hands of a customer: let the watchdog
pull your bacon out of the fire.
But what about debugging? Suppose your embedded computer with custom
drivers locks up solid every few hundred hours. It would be nice if
the watchdog gave a stack dump instead erasing the evidence. How about
having "action=reset" and "action=debug"?
-- Daniel
-
Again, CONFIG_WATCHDOG implies reset by definition. If you'd like to propose pluggable policies for CONFIG_WATCHDOG and post the code for that, go right ahead. For soft lockups, there's already the softlockup code which does what you seem to be leaning towards. For hard lockups (which is where CONFIG_WATCHDOG comes in handy), you're likely not going to get any output anyways. In either case, wiggling this in to CONFIG_WATCHDOG is very much changing the meaning of what CONFIG_WATCHDOG means today, and doesn't seem to buy us anything. -
"action" corresponds one to one with the functionality of the hardware device ... what the board guy wishes to do when the interrupt expires is up to the board guy and that may include doing a debug dump somewhere ... that is most definitely not a realm i want to tread into it ;) -mike -
I understand what you mean - typically - most people think of watchdog == reset. But, calling it a periodic timer, and servicing it with the watchdog user space demon is even more confusing - isn't it? -Robin -
No, not typically. This is _precisely_ what CONFIG_WATCHDOG means, and Calling it a periodic timer when its in periodic timer mode makes sense. Why you would want to interface that with a userspace watchdog daemon is beyond me, they're conceptually unrelated. Please read my original mail on the subject. I'm not advocating hiding a clocksource somewhere in the depths of CONFIG_WATCHDOG, they're completely unrelated. -
No disagreements - but I don't think that a watchdog that doesn't cause a
Agreed again - periodic timers have nothing to do with watchdogs. This is
where I am confused about why you are saying that the only event a watchdog
I (and many others) consider a "watchdog" a clock sink - something that needs
to be poked within certain limits (too fast can indicate a failures just as
too slow is a failure).
The event or how something is notified of the failure of the watchdog to be
serviced shouldn't determine what the name is.
What's in a name? that which we call a watchdog
By any other name would smell as sweet;
-Bill S
-Robin
-
I'm not sure what else you think it is? On most platforms, when it's not in reset mode, it works as a free-running timer with an IRQ generated on overflow. I've certainly used the watchdog as a system timer before on No, what I said was that the only event that _matters_ to CONFIG_WATCHDOG is a hard reset. So far no one has suggested anything outside of hard reset, periodic timer, or softlockup detection that would be useful to extend CONFIG_WATCHDOG for. If you're talking about specific events, clockevents are still a much better way to go than trying to hammer something in to CONFIG_WATCHDOG that it was never designed for. If you have some 'special' events for your watchdog that would be of use to others, tying these in as Currently there's nothing in the kernel that cares about clearing 'too fast'. I can't imagine why this _should_ be treated as a failure, but feel free If all you want is a timer that you occasionally have to poke and then take some notification when it expires, you can just use a regular one-shot timer anyways and bank off of the system timer, the 'watchdog' is certainly not doing anything useful at this point. So far the only example anyone has provided outside of periodic timers or hardware reset has been dumping the stack when something gets stuck. Softlockup does this already today, using a timer. If your system is completely dead, you won't have any way to trigger or see the stack dump anyways, so the watchdog doesn't buy you anything there, either. What many watchdogs do today is simply to have split timer for userspace and the actual hardware (where userspace has to poke the timer every now and then, or the kernel will allow the overflow). This is pretty common for watchdogs with very fast overflow periods. There's certainly nothing wrong with having a timer that runs out and kicks a notifier chain if there's something special you want to do, but tying up the watchdog hardware for that is silly. There are many other things one has to use the ...
Many watchdogs can be hooked up to a non-maskable interrupt (NMI) that
cannot be disabled or preempted. You get a stack dump even for drastic
bugs: ISR lock up, timer misconfiguration, level-sensitive interrupt
line stuck asserted, and so forth. Getting that information by other
means can be painful and/or expensive.
The Blackfin chip in the original message appears to support watchdog
NMI.
-- Daniel
-
it does ... it has four modes: - reset (drivers/char/watchdog/bfin_wdt.c) - interrupt (i'll prob write a clockevents driver for this) - NMI (no plans to do anything for this as NMI is unused in Blackfin) - nothing (have yet to find a use case for this) -mike -
my constraint was trying to keep all of the code that deals with the watchdog in one file ... those were the blinders i had on from the get go so the idea of having different drivers that work with the watchdog hmm, i'll poke the clocksource/clockevents stuff as well as the notifier idea from Alan thanks -mike -
There are two possibilities of interest I can think of (and maybe both are useful). One is to deliver a signal to someone on expiry the other would be to use notifier chains and export either the notifier or add/remove operations. That allows multiple modules and users to chain onto the expiry event Take a look at include/linux/notifier.h -
