(added linux-kernel to CC, since I think it's more of a general kernel
issue)To be brutally frank, I always thought this disable_irq() mess was a
hack both ugly and fragile. This disable_irq() work that appeared in a
couple net drivers was correct at the time, so I didn't feel I had the
justification to reject it, but it still gave me a bad feeling.I think the scenario you outline is an illustration of the approach's
fragility: disable_irq() is a heavy hammer that originated with INTx,
and it relies on a chip-specific disable method (kernel/irq/manage.c)
that practically guarantees behavior will vary across MSI/INTx/etc.Practices like forcedeth's unique locking work for a time, but it should
be a warning sign any time you stray from the normal spin_lock_irqsave()
method of synchronization.Based on your report, it is certainly possible that there is a problem
with MSI's desc->chip->disable() method... but I would actually
recommend working around the problem by making the forcedeth locking
more standardized by removing all those disable_irq() hacks.Using spinlocks like other net drivers (note: avoid NETIF_F_LLTX
drivers) has a high probability of both fixing your current problem, and
giving forcedeth a more stable foundation for the long term. In my
humble opinion :)Jeff
-
I checked the code: IRQ_DISABLE is implemented in software, i.e.
handle_level_irq() only calls handle_IRQ_event() [and then the nic irq
handler] if IRQ_DISABLE is not set.
OTHO: The last trace looks as if nv_do_nic_poll() is interrupted by an irq.Perhaps something corrupts dev->irq? The irq is requested with
request_irq(np->pci_dev->irq, handler, IRQF_SHARED, dev->name, dev)
and disabled with
disable_irq_lockdep(dev->irq);Someone around with a MSI capable board? The forcedeth driver does
dev->irq = pci_dev->irq
in nv_probe(), especially before pci_enable_msi().
Does pci_enable_msi() change pci_dev->irq? Then we would disable the
wrong interrupt....--
Manfred
-
Remember, fundamentally MSI-X is a one-to-many relationship, when you
consider a single PCI device might have multiple vectors.Jeff
-
msi-x is using other entry
if (np->msi_flags & NV_MSI_X_ENABLED)
enable_irq_lockdep(np->msi_x_entry[NV_MSI_X_VECTOR_ALL].vector);
YH
-
Correct, but the overall point was that MSI-X conceptually conflicts
with the existing "lockless" disable_irq() schedule, which was written
when there was a one-one relationship between irq, PCI device, and work
to be done.Jeff
-
at this point, nic in mcp55 is using msi or INTx.
YH
-
Correct.
For msi-x, the driver does three disable_irq() calls to the correct
vectors. ugly, but nevertheless correct.
The bug only affected msi: The driver did disable_irq(<old INTx irq
num>) instead of disable_irq(<new msi-x irq num>).
The patch that I've attached to the bugzilla report 9047 seems to fix
the crash, thus I would propose to apply it to 2.6.23 and 2.6.24. I'll
send a seperate mail.All other problem [reduce code duplication, less ugly locking, ...]
should be fixed with independant patches.--
-
Can I use your new driver with RHEL 5 or RHEL 5.1?
YH
-
Not without modification, since it depends on the napi_struct work
currently in torvalds/linux-2.6.git.But I am currently rewriting the fe-lock yet again, and most of those
changes can be applied to pre-napi_struct forcedeth.Jeff
-
the request_irq==>setup_irq will make dev->irq = pci_dev->irq.
YH
-
Where is that?
Otherwise I would propose the attached patch. My board is not
MSI-capable, thus I can't test it myself.--
Manfred
Why not just copy pcidev->irq to dev->irq once ?
Ben.
-
it seems e1000 is using np->pci_dev->irq directly too.
YH
-
Heh, allright, doesn't matter, I was just proposing to avoid one more
indirection :-)Ben.
-
in nv_request_irq
if ((ret = pci_enable_msi(np->pci_dev)) == 0) {
np->msi_flags |= NV_MSI_ENABLED;
if (request_irq(np->pci_dev->irq, handler,
IRQF_SHARED, dev->name, dev) != 0) {in request_irq
int request_irq(unsigned int irq, irq_handler_t handler,
unsigned long irqflags, const char *devname, void *dev_id)
...
action->dev_id = dev_id;
...
retval = setup_irq(irq, action);in setup_irq
int setup_irq(unsigned int irq, struct irqaction *new)
....
new->irq = irq;it seems I missed that here new is irqaction instead of net_dev.
YH
-
I wonder if the race is between soft_timer for nv_do_nic_poll from
different CPUsYH
-
Interested parties should try the forcedeth patches I just posted :)
Jeff
-
The patches look good, but I would still prefer to understand why the
current implementation causes crashes before rewriting everything.I've just noticed that netpoll_send_skb() calls ->hard_start_xmit() and
->poll() within local_irq_save().
Thus it seems that spin_lock_irqsave() must be used in the poll() and
hard_start_xmit() callbacks, at least in nics that support
POLL_CONTROLLER :-(--
Manfred-
On Fri, 28 Sep 2007 22:47:16 -0400
I'll try and clean it up if the author doesn't get to it first.
--
Stephen Hemminger <shemminger@linux-foundation.org>-
I didn't see anything in disable_irq that would cause it to fail in
the suggested way. But I couldn't quite convince myself we were
race free either. I didn't see anything that was specific to MSI
that would cause something. But switching from level to edge
triggered, and to a lower latency delivery path may have causedI took a look at the underlying side of this.
I don't know if the MSI capability for the forcedeth supports a mask
bit or not. Mine doesn't even have a msi capability. If it doesn't
support a mask bit the pci spec provides not valid way to mask the
interrupt, so what we do is actually disable the msi capability.
At which point we might get weird INTx interactions.We have a similar case with ioapics and INTx that also turns
a hardware level disable into a reroute to another irq command.
So I'm going to take a look and see how infrequently we can use
hardware level disabled.Since it looks like hardware level disables tend to be creatively
implemented I recommend using disable_irq as little as possible.Eric
-
also the driver need to have seperate handlers for ioapic, msi, msi-x...
current nv_nic_irq_optimized keep checking msi_flag...also nv_enable_hw_interrupts and nv_disable_hw_interrupts is not
corresponding in MSI side..YH
-
| Greg Kroah-Hartman | [PATCH 004/196] Chinese: add translation of SubmittingPatches |
| Rafael J. Wysocki | [Bug #11210] libata badness |
| Andrea Arcangeli | [PATCH 00 of 11] mmu notifier #v16 |
| Andrew Morton | Re: -mm merge plans for 2.6.23 -- sys_fallocate |
git: | |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| Daniel Eischen | Re: error with thread |
| David Miller | Re: [GIT]: Networking |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
