Re: how to allow board writers to customize driver behavior (watchdog here)

From: Mike Frysinger
Date: Wednesday, May 23, 2007 - 9:21 pm

the Blackfin on-chip watchdog has controllable behavior ... it can be
configured to reset the processor (like a normal watchdog), or it can
be configured to simply generate an interrupt.

i can see embedded systems where simply resetting the system is not
desirable ... perhaps it's the control system for some machinery and
resetting the system will force it to reinitialize itself which could
cause problems if a guy is servicing the insides at the time ;)

the Blackfin watchdog driver has a module parameter, "action" ... the
default will have the watchdog act like every other watchdog out there
-- it reboots after a timeout.  however, by setting the action param
appropriately, the watchdog goes into simple interrupt generation.
the question then becomes, how do people developing their
board-specific version customize what happens when the timeout occurs
and the interrupt is fired ?  making every customer who wishes to
customize the watchdog behavior edit the watchdog driver is
troublesome as they're now blurring the distinct parts: the
watchdog-specific piece and their board-specific piece.

what i'm doing now is weak symbols:
...
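/* weak declaration: resolves to NULL unless a board file defines it */
extern irqreturn_t bfin_board_watchdog_interrupt(void) __attribute__((weak));

static irqreturn_t bfin_wdt_interrupt(int irq, void *dev_id)
{
    /* a board-specific handler, if linked in, takes over completely */
    if (bfin_board_watchdog_interrupt)
        return bfin_board_watchdog_interrupt();

    /* default: reload and re-arm the watchdog */
    bfin_wdt_stop();
    bfin_wdt_keepalive();
    bfin_wdt_start();
    return IRQ_HANDLED;
}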
...

is this completely bad mojo ?  is there some other mechanism that
provides what i want and i just dont know about it ?  or do i just
make people change the driver to fit their application, thus throwing
out the idea of keeping all board-specific details in just the boards
file ...
-mike
-
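
For anyone who wants to try the weak-symbol idea outside the kernel, here is a minimal user-space sketch of a close variant: instead of testing a weak declaration against NULL, the driver supplies a weak default that a board file can replace with a normal (strong) definition of the same name. It assumes GCC or Clang; the names `board_watchdog_interrupt`, `wdt_interrupt`, `default_restarts` and the `irq_result` enum are made up for illustration and are not part of the Blackfin driver.

```c
/* Hypothetical stand-in for the kernel's irqreturn_t values. */
enum irq_result { IRQ_NONE_SIM = 0, IRQ_HANDLED_SIM = 1 };

/* Counts how often the default handler ran, so the fallback is observable. */
static int default_restarts;

/*
 * Driver-side default. Because it is weak, the linker replaces it with a
 * board file's strong definition of the same name, if one is linked in.
 */
__attribute__((weak)) enum irq_result board_watchdog_interrupt(void)
{
    default_restarts++;            /* stands in for stop/keepalive/start */
    return IRQ_HANDLED_SIM;
}

/* Driver interrupt entry point: no NULL test needed, the hook always exists. */
enum irq_result wdt_interrupt(void)
{
    return board_watchdog_interrupt();
}
```

With no board file linked in, `wdt_interrupt()` runs the default; a board file overrides it simply by defining a non-weak `board_watchdog_interrupt()`. The driver source stays untouched either way, which is the property the original question is after.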

From: Paul Mundt
Date: Wednesday, May 23, 2007 - 10:23 pm

It sounds like you're constraining your driver based on terminology.
Watchdogs on most embedded platforms support either a 'reset' mode or
otherwise act as periodic timers; trying to push both of these
functionalities into a watchdog driver is rather pointless.
CONFIG_WATCHDOG implies 'reset' mode by definition.

If you wish to use your watchdog timer as a periodic timer, simply have a
clocksource/clockevents established for it, leave the watchdog driver as
a reset-only thing, and let the user decide which one they want either
via Kconfig or the kernel command line. (The watchdog driver can just
-ENODEV or -EBUSY if the clocksource is active).
-
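
The "-ENODEV or -EBUSY if the clocksource is active" arrangement amounts to two drivers arbitrating over one piece of hardware. A minimal user-space sketch of that arbitration follows; the ownership flag and every function name here are hypothetical, and a real kernel driver would more likely claim the register region or IRQ instead.

```c
#include <errno.h>

/* Hypothetical owners of the single on-chip watchdog timer block. */
enum wdt_owner { OWNER_NONE, OWNER_CLOCKEVENT, OWNER_WATCHDOG };

static enum wdt_owner current_owner = OWNER_NONE;

/* Try to take the hardware: 0 on success, -EBUSY if someone else holds it. */
static int wdt_hw_claim(enum wdt_owner who)
{
    if (current_owner != OWNER_NONE && current_owner != who)
        return -EBUSY;
    current_owner = who;
    return 0;
}

/* Give the hardware back; a no-op if the caller does not own it. */
static void wdt_hw_release(enum wdt_owner who)
{
    if (current_owner == who)
        current_owner = OWNER_NONE;
}

/* Reset-only watchdog driver: refuses to probe while the timer is in use. */
int watchdog_probe(void)
{
    return wdt_hw_claim(OWNER_WATCHDOG);
}

/* Periodic-timer user of the same hardware. */
int clockevent_setup(void)
{
    return wdt_hw_claim(OWNER_CLOCKEVENT);
}

void clockevent_teardown(void)
{
    wdt_hw_release(OWNER_CLOCKEVENT);
}
```

Whichever driver gets there first wins, and the other one fails cleanly, so the user's Kconfig or command-line choice decides the role of the hardware without either driver knowing about the other's internals.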

From: Daniel Newby
Date: Thursday, May 24, 2007 - 1:47 am

I agree for a product in the hands of a customer:  let the watchdog
pull your bacon out of the fire.

But what about debugging?  Suppose your embedded computer with custom
drivers locks up solid every few hundred hours.  It would be nice if
the watchdog gave a stack dump instead of erasing the evidence.  How about
having "action=reset" and "action=debug"?

    -- Daniel
-

From: Paul Mundt
Date: Thursday, May 24, 2007 - 2:32 am

Again, CONFIG_WATCHDOG implies reset by definition. If you'd like to
propose pluggable policies for CONFIG_WATCHDOG and post the code for
that, go right ahead.

For soft lockups, there's already the softlockup code which does what
you seem to be leaning towards. For hard lockups (which is where
CONFIG_WATCHDOG comes in handy), you're likely not going to get any
output anyways. In either case, wiggling this into CONFIG_WATCHDOG
very much changes what CONFIG_WATCHDOG means today, and
doesn't seem to buy us anything.
-

From: Mike Frysinger
Date: Thursday, May 24, 2007 - 8:08 am

"action" corresponds one to one with the functionality of the hardware
device ... what the board guy wishes to do when the watchdog expires and
the interrupt fires is up to the board guy, and that may include doing a
debug dump somewhere ... that is most definitely not a realm i want to
tread into ;)
-mike
-

From: Robin Getz
Date: Thursday, May 24, 2007 - 6:29 am

I understand what you mean - typically - most people think of watchdog == 
reset.

But, calling it a periodic timer, and servicing it with the watchdog user 
space daemon is even more confusing - isn't it?

-Robin
-

From: Paul Mundt
Date: Thursday, May 24, 2007 - 8:23 am

No, not typically. This is _precisely_ what CONFIG_WATCHDOG means, and
Calling it a periodic timer when it's in periodic timer mode makes sense.
Why you would want to interface that with a userspace watchdog daemon is
beyond me, they're conceptually unrelated.

Please read my original mail on the subject. I'm not advocating hiding a
clocksource somewhere in the depths of CONFIG_WATCHDOG, they're
completely unrelated.
-

From: Robin Getz
Date: Thursday, May 24, 2007 - 10:32 am

No disagreements - but I don't think that a watchdog that doesn't cause a 

Agreed again - periodic timers have nothing to do with watchdogs. This is 
where I am confused about why you are saying that the only event a watchdog 


I (and many others) consider a "watchdog" a clock sink - something that needs 
to be poked within certain limits (too fast can indicate a failure just as 
too slow is a failure).
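
A minimal sketch of that "poked within certain limits" idea, often called a windowed watchdog: a kick is accepted only if it arrives neither too soon nor too late after the previous one. The struct, the tick units and all names are assumptions for illustration; nothing in the thread says the Blackfin hardware or any kernel code works this way.

```c
#include <stdint.h>

enum kick_result { KICK_OK, KICK_TOO_EARLY, KICK_TOO_LATE };

/* Hypothetical windowed watchdog state, measured in arbitrary ticks. */
struct window_wdt {
    uint32_t min_ticks;   /* earliest acceptable gap between kicks */
    uint32_t max_ticks;   /* latest acceptable gap between kicks */
    uint32_t last_kick;   /* tick of the last accepted kick */
};

/* Classify a kick arriving at tick `now`; only a valid kick moves the window. */
enum kick_result window_wdt_kick(struct window_wdt *w, uint32_t now)
{
    /* Unsigned subtraction stays correct across counter wrap-around. */
    uint32_t elapsed = now - w->last_kick;

    if (elapsed < w->min_ticks)
        return KICK_TOO_EARLY;
    if (elapsed > w->max_ticks)
        return KICK_TOO_LATE;

    w->last_kick = now;
    return KICK_OK;
}
```

What happens on `KICK_TOO_EARLY` or `KICK_TOO_LATE` (reset, interrupt, notification) is deliberately left out, since that is exactly the policy question under discussion.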

The event, or how something is notified that the watchdog was not 
serviced, shouldn't determine what the name is.

    What's in a name? that which we call a watchdog
    By any other name would smell as sweet;
                                         -Bill S

-Robin
-

From: Paul Mundt
Date: Thursday, May 24, 2007 - 9:04 pm

I'm not sure what else you think it is? On most platforms, when it's not
in reset mode, it works as a free-running timer with an IRQ generated on
overflow. I've certainly used the watchdog as a system timer before on
No, what I said was that the only event that _matters_ to CONFIG_WATCHDOG
is a hard reset. So far no one has suggested anything outside of hard
reset, periodic timer, or softlockup detection that would be useful to
extend CONFIG_WATCHDOG for.

If you're talking about specific events, clockevents are still a much
better way to go than trying to hammer something in to CONFIG_WATCHDOG
that it was never designed for. If you have some 'special' events for
your watchdog that would be of use to others, tying these in as
Currently there's nothing in the kernel that cares about clearing 'too fast'.
I can't imagine why this _should_ be treated as a failure, but feel free
If all you want is a timer that you occasionally have to poke and then
take some notification when it expires, you can just use a regular
one-shot timer anyways and bank off of the system timer, the 'watchdog'
is certainly not doing anything useful at this point.

So far the only example anyone has provided outside of periodic timers or
hardware reset has been dumping the stack when something gets stuck.
Softlockup does this already today, using a timer.

If your system is completely dead, you won't have any way to trigger or
see the stack dump anyways, so the watchdog doesn't buy you anything
there, either.

What many watchdogs do today is simply to have a split timer for userspace
and the actual hardware (where userspace has to poke the timer every now
and then, or the kernel will allow the overflow). This is pretty common
for watchdogs with very fast overflow periods.
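
A rough user-space model of that split-timer scheme, assuming a kernel timer that fires more often than the hardware overflows: the kernel keeps reloading the hardware only while userspace has pinged recently enough. The struct layout, the tick units and the function names are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state shared by the userspace-facing and hardware-facing halves. */
struct split_wdt {
    uint32_t user_timeout;    /* ticks userspace may stay silent */
    uint32_t last_user_ping;  /* tick of the last userspace ping */
    uint32_t hw_pets;         /* how often the kernel reloaded the hardware */
};

/* Called when userspace pokes the watchdog device. */
void split_wdt_user_ping(struct split_wdt *w, uint32_t now)
{
    w->last_user_ping = now;
}

/*
 * Called from a kernel timer running faster than the hardware overflow.
 * Returns true if the hardware was reloaded, false if it was left alone
 * so that the overflow (and the reset) can happen.
 */
bool split_wdt_kernel_tick(struct split_wdt *w, uint32_t now)
{
    if (now - w->last_user_ping > w->user_timeout)
        return false;         /* userspace went quiet: allow the overflow */

    w->hw_pets++;             /* stands in for reloading the hardware counter */
    return true;
}
```

This lets userspace see a comfortable timeout of many seconds even when the hardware counter overflows far more quickly.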

There's certainly nothing wrong with having a timer that runs out and
kicks a notifier chain if there's something special you want to do, but
tying up the watchdog hardware for that is silly. There are many other
things one has to use the ...
-

From: Daniel Newby
Date: Friday, May 25, 2007 - 3:09 am

Many watchdogs can be hooked up to a non-maskable interrupt (NMI) that
cannot be disabled or preempted.  You get a stack dump even for drastic
bugs:  ISR lock up, timer misconfiguration, level-sensitive interrupt
line stuck asserted, and so forth.  Getting that information by other
means can be painful and/or expensive.

The Blackfin chip in the original message appears to support watchdog
NMI.

    -- Daniel
-

From: Mike Frysinger
Date: Friday, May 25, 2007 - 10:55 am

it does ... it has four modes:
 - reset (drivers/char/watchdog/bfin_wdt.c)
 - interrupt (i'll prob write a clockevents driver for this)
 - NMI (no plans to do anything for this as NMI is unused in Blackfin)
 - nothing (have yet to find a use case for this)
-mike
-
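
The four hardware modes listed above map naturally onto an enum. The sketch below shows one way an "action" setting could be translated into such a mode; the string names, the enum and the parser are assumptions for illustration, since the thread does not show how the driver's real "action" parameter is encoded.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the four hardware modes, plus an error value. */
enum wdt_action {
    WDT_ACTION_RESET,      /* reset the processor on expiry */
    WDT_ACTION_INTERRUPT,  /* raise a normal interrupt */
    WDT_ACTION_NMI,        /* raise a non-maskable interrupt */
    WDT_ACTION_NONE,       /* do nothing on expiry */
    WDT_ACTION_INVALID
};

/* Map a textual action to a mode; unknown or missing text is rejected. */
enum wdt_action wdt_parse_action(const char *name)
{
    /* Order must match the enum above. */
    static const char *const names[] = { "reset", "interrupt", "nmi", "none" };
    size_t i;

    if (name == NULL)
        return WDT_ACTION_INVALID;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcmp(name, names[i]) == 0)
            return (enum wdt_action)i;
    }
    return WDT_ACTION_INVALID;
}
```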

From: Mike Frysinger
Date: Thursday, May 24, 2007 - 8:12 am

my constraint was trying to keep all of the code that deals with the
watchdog in one file ... those were the blinders i had on from the get
go so the idea of having different drivers that work with the watchdog

hmm, i'll poke the clocksource/clockevents stuff as well as the
notifier idea from Alan

thanks
-mike
-

From: Alan Cox
Date: Thursday, May 24, 2007 - 3:01 am

There are two possibilities of interest I can think of (and maybe both
are useful). One is to deliver a signal to someone on expiry; the other
would be to use notifier chains and export either the notifier or
add/remove operations. That allows multiple modules and users to chain
onto the expiry event.

Take a look at include/linux/notifier.h

-
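
To make the notifier-chain suggestion concrete, here is a small user-space model of the idea: interested parties register a callback, and the watchdog code walks the list on expiry. This is not the kernel API itself. The kernel's own version lives in include/linux/notifier.h (`struct notifier_block`, `atomic_notifier_chain_register()`, `atomic_notifier_call_chain()`); every name below is a hypothetical stand-in, and locking is left out entirely.

```c
#include <stddef.h>

#define WDT_EVENT_EXPIRED    1UL
#define WDT_NOTIFY_CONTINUE  0
#define WDT_NOTIFY_STOP      1

/* Hypothetical analogue of a notifier block: callback, link, priority. */
struct wdt_notifier {
    int (*call)(struct wdt_notifier *self, unsigned long event, void *data);
    struct wdt_notifier *next;
    int priority;                       /* higher priority runs first */
};

static struct wdt_notifier *wdt_chain;  /* head of the expiry chain */

/* Insert into the chain, keeping it sorted by descending priority. */
void wdt_notifier_register(struct wdt_notifier *n)
{
    struct wdt_notifier **p = &wdt_chain;

    while (*p && (*p)->priority >= n->priority)
        p = &(*p)->next;
    n->next = *p;
    *p = n;
}

/* Remove an entry from the chain; a no-op if it is not registered. */
void wdt_notifier_unregister(struct wdt_notifier *n)
{
    struct wdt_notifier **p;

    for (p = &wdt_chain; *p; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            return;
        }
    }
}

/* Run the chain; returns how many callbacks were invoked. */
int wdt_notifier_call_chain(unsigned long event, void *data)
{
    struct wdt_notifier *n;
    int called = 0;

    for (n = wdt_chain; n; n = n->next) {
        called++;
        if (n->call(n, event, data) == WDT_NOTIFY_STOP)
            break;                      /* a handler claimed the event */
    }
    return called;
}

/* Example board handler: counts expiries via the data pointer. */
int count_expiry(struct wdt_notifier *self, unsigned long event, void *data)
{
    (void)self;
    if (event == WDT_EVENT_EXPIRED)
        (*(int *)data)++;
    return WDT_NOTIFY_CONTINUE;
}

/* Example handler that counts the expiry and stops the chain. */
int count_and_stop(struct wdt_notifier *self, unsigned long event, void *data)
{
    count_expiry(self, event, data);
    return WDT_NOTIFY_STOP;
}
```

A board file would register its handler at init time and the watchdog interrupt would call `wdt_notifier_call_chain(WDT_EVENT_EXPIRED, ...)`, so several modules can react to the same expiry without the driver knowing about any of them.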
