On Sun, Apr 22, 2007 at 01:31:55PM +0100, Christoph Hellwig wrote:
I'm not quite sure what the intent of this patch really is, being
at most a somewhat passing and casual user of kernel threads.
Some background may be useful (this is in reply to some comments
from Andrew Morton):
EEH events are supposed to be very rare, as they correspond to
hardware failures, typically PCI bus parity errors, but also
things like wild DMAs. The code that generates these will limit
them to no more than 6 per hour per PCI device. Any more than that,
and the PCI device is permanently disabled (the sysadmin would
need to do something to recover).
The only reason for using threads here is to get the error recovery
out of an interrupt context (where errors may be detected), and then,
an hour later, decrement a counter (which is how we limit these to
6 per hour). Thread reaping is "trivial": the thread just exits
after an hour.
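
To make the counting concrete, here is a rough userspace sketch of the
bookkeeping described above. It is not the actual EEH code: the struct,
the function names and the SKETCH_ prefix are all made up for
illustration, and the one-hour sleep done by the thread is replaced by
an explicit call so the logic can be followed (and tested) without
waiting.

```c
#include <stdbool.h>

#define SKETCH_MAX_EVENTS_PER_HOUR 6	/* the limit mentioned above */

/* Hypothetical per-device state; not the kernel's real structure. */
struct sketch_pci_dev {
	int  recent_events;		/* events seen within the last hour */
	bool permanently_disabled;	/* set once the limit is exceeded */
};

/*
 * Called when an error is detected on the device.
 * Returns true if recovery should be attempted, false if the device
 * is (or has just become) permanently disabled.
 */
static bool sketch_report_event(struct sketch_pci_dev *dev)
{
	if (dev->permanently_disabled)
		return false;

	dev->recent_events++;
	if (dev->recent_events > SKETCH_MAX_EVENTS_PER_HOUR) {
		dev->permanently_disabled = true;
		return false;
	}
	return true;
}

/*
 * Stands in for what the per-event thread does after sleeping for an
 * hour: drop the counter by one, then exit.
 */
static void sketch_event_expired(struct sketch_pci_dev *dev)
{
	if (dev->recent_events > 0)
		dev->recent_events--;
}
```

With this model, six events inside an hour are all handed to recovery,
a seventh disables the device, and each call to sketch_event_expired()
frees up one slot again as long as the device has not been disabled.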
Since these events are rare, I've no particular concern about
performance or resource consumption. The current code seems
to work just fine. :-)
--linas