The following comment refers to the "Timer routine either sets" below,
right?
Correct on both counts. I had forgotten that the watchdog routine
clears STS_IAA.
Okay, so this isn't as bad as it seemed. I don't have a copy of your
most recent patch, but it seems clear that the watchdog routine must:

	First remove the circumstances that would cause the controller
	to set IAA.  I guess that means clearing IAAD; it's not
	entirely clear from the spec whether this will do what we
	want.

	Then clear IAA (if it happens to be set).

This is the only way to avoid the race, and I know that my original
version of the routine does these steps in the wrong order (if at all).
That should be fixed. Given sufficiently bizarre hardware we can't be
certain that things won't still go wrong on occasion, but this is the
best we can do for now -- weird hardware can be handled as it arises.
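
To illustrate the ordering argument, here is a small user-space model in
C. It is not driver code. The register struct, the fake hardware step,
and both watchdog function names are made up for this sketch; only the
CMD_IAAD and STS_IAA bit positions follow the EHCI register layout. It
also assumes that clearing IAAD really does stop the controller from
raising IAA, which (as noted above) the spec does not make entirely
clear.

```c
#include <stdint.h>

/* Bit positions as in the EHCI USBCMD and USBSTS registers. */
#define CMD_IAAD (1u << 6)   /* Interrupt on Async Advance Doorbell */
#define STS_IAA  (1u << 5)   /* Interrupt on Async Advance */

/* Hypothetical stand-in for the controller registers. */
struct fake_regs {
	uint32_t command;
	uint32_t status;
};

/* Hypothetical hook type: lets a test run "hardware" between two steps. */
typedef void (*hw_step_fn)(struct fake_regs *);

/*
 * Model of the controller: if the doorbell is still pending, it
 * completes the advance, drops the doorbell, and raises IAA.
 */
static void fake_hw_advance(struct fake_regs *r)
{
	if (r->command & CMD_IAAD) {
		r->command &= ~CMD_IAAD;
		r->status |= STS_IAA;
	}
}

/* Proposed order: remove the cause first, then clear the status bit. */
static void watchdog_right_order(struct fake_regs *r, hw_step_fn between)
{
	r->command &= ~CMD_IAAD;	/* step 1: no new IAA can be raised */
	if (between)
		between(r);		/* hardware may run here; harmless */
	/* step 2: the real register is write-1-to-clear; modeled as a plain clear */
	r->status &= ~STS_IAA;
}

/* Reversed order, shown only to expose the race window. */
static void watchdog_wrong_order(struct fake_regs *r, hw_step_fn between)
{
	r->status &= ~STS_IAA;
	if (between)
		between(r);		/* hardware can raise IAA again here */
	r->command &= ~CMD_IAAD;
}
```

With the doorbell pending and the fake hardware running between the two
steps, the first routine leaves IAA clear, while the reversed one leaves
a stale IAA behind for the next interrupt to trip over.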
The other change to make (which you have already anticipated) is to
guard against ehci->reclaim == NULL in end_unlink_async(). There's no
real need for a warning or stack dump; it should just return silently
when this happens. If there is a warning, maybe it should be placed at
the site of the caller (for example, in ehci_irq() when STS_IAA is
detected).
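
A rough sketch of that guard, again as a stand-alone C model rather than
the actual driver function. The struct layouts, the "_sketch" function
names, the int return value, and the spurious counter are all
assumptions made for illustration; the real end_unlink_async() has its
own signature and does much more work.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the driver structures. */
struct fake_qh {
	int unlinked;
};

struct fake_ehci {
	struct fake_qh *reclaim;	/* QH waiting for the async advance */
};

/*
 * Returns 1 if a QH was reclaimed, 0 if there was nothing to do.
 * The return value exists only so the sketch can be checked.
 */
static int end_unlink_async_sketch(struct fake_ehci *ehci)
{
	struct fake_qh *qh = ehci->reclaim;

	/* The watchdog may already have handled it: return silently. */
	if (qh == NULL)
		return 0;

	ehci->reclaim = NULL;
	qh->unlinked = 1;
	return 1;
}

/*
 * Caller-side model of the STS_IAA branch of the interrupt handler.
 * If any diagnostic is wanted, this is where it would go; here it is
 * only a counter.
 */
static int irq_saw_iaa_sketch(struct fake_ehci *ehci, int *spurious)
{
	if (ehci->reclaim == NULL)
		(*spurious)++;
	return end_unlink_async_sketch(ehci);
}
```

Keeping the check inside the callee makes every call path safe, while
leaving any reporting to the caller keeps the common watchdog-won-the-race
case quiet.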
Alan Stern