Does MSI-X work properly in 2.6.20+ kernels? Having some issues...

Submitted by Anonymous
on October 11, 2007 - 10:27am

Hey all,
I'm playing with a box that has 10 Gbit NICs in it. The NICs support MSI-X, and I have verified that the driver calls pci_enable_msix(), which returns success. Once the machine is up and running, /proc/interrupts shows the MSI-X vectors in use, and in theory each of those interrupts should be able to run on any core. For some strange reason, all of the interrupts run only on the first core. This becomes a problem above roughly 3.5 Gbit/s, where the soft interrupt workload consumes 100% of the first core and I can't push any more traffic. I've tried kernels 2.6.16, 2.6.17, 2.6.20, and 2.6.22, and all of them seem to have the same problem. I can manually change the smp_affinity entries in /proc, but then the card stops working after about an hour of use. Does anyone know if there's anything else I need?

Thanks in advance!

===============================

(output of cat /proc/interrupts)
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
0: 246757 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge timer
8: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge rtc
9: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level acpi
58: 232144 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level eth0
90: 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level ehci_hcd:usb1
98: 291 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level ohci_hcd:usb2
146: 93317 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i0
154: 185346 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i1
162: 252905 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i2
170: 244500 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i3
178: 279896 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i4
186: 234754 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i5
194: 203388 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i6
202: 286182 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_i7
210: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth5_s8
NMI: 453 51 38 29 35 96 28 28 29 29 29 28 28 32 30 30
LOC: 246075 246058 246041 246024 246007 245990 245973 245956 245939 245922 245905 245888 245871 245854 245837 245820
ERR: 0
MIS: 0

napi?

strcmp on October 11, 2007 - 4:25pm

I thought that NAPI detects the high interrupt rate and switches to polling instead of interrupts? Also, did you notice that not only the MSI-X interrupts but all interrupts arrive only at CPU0? Can your hardware do better at all?
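
A quick way to confirm that observation on a live box is to parse /proc/interrupts and list the IRQs whose counts all sit on a single CPU. This is only a rough Python sketch: the function names parse_interrupts and single_cpu_irqs are made up for illustration, and it assumes the usual layout (a header row of CPU names, then one row per IRQ with one count per CPU).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (cpu_names, rows).

    rows maps the IRQ label (e.g. "146" or "NMI") to a tuple of
    (per-CPU counts, description). Rows that do not carry one count
    per CPU (such as ERR and MIS) are skipped.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()
    ncpu = len(cpus)
    rows = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(counts) != ncpu:
            continue  # not a per-CPU row
        rows[label.strip()] = (counts, " ".join(fields[ncpu:]))
    return cpus, rows


def single_cpu_irqs(rows):
    """Map each IRQ label to the one CPU index that handled all of its interrupts."""
    result = {}
    for label, (counts, _desc) in rows.items():
        busy = [cpu for cpu, count in enumerate(counts) if count > 0]
        if len(busy) == 1:
            result[label] = busy[0]
    return result
```

Calling parse_interrupts(open("/proc/interrupts").read()) and passing the rows to single_cpu_irqs should, for the output posted above, report CPU index 0 for every IRQ that has fired, while NMI and LOC are spread across all cores.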

Check

Anonymous (not verified) on October 12, 2007 - 1:30am

Check /proc/irq/[irq_num]/smp_affinity. That is a bitmask of the cores to which the interrupt can be delivered. Also, is this on an AMD system? If so, there is a restriction in the kernel to only operate in flat physical APIC mode, which may be limiting interrupt delivery to a single core.
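
For reference, the mask is hexadecimal, with bit N standing for CPU N, so on a 16-core box ffff means all cores and 0001 means CPU0 only. Here is a minimal Python sketch for converting between a mask and a CPU list. The helper names are hypothetical, and the encoder assumes CPU numbers below 32, since wider masks are written as comma-separated 32-bit groups.

```python
def mask_to_cpus(mask_text):
    """Decode an smp_affinity hex mask into a sorted list of CPU numbers.

    Wide masks are shown as comma-separated 32-bit groups of 8 hex digits,
    so dropping the commas gives one long hex number.
    """
    value = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def cpus_to_mask(cpus):
    """Encode CPU numbers (assumed to be in the range 0..31) as a hex mask string."""
    value = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPU numbers 0..31")
        value |= 1 << cpu
    return format(value, "x")
```

For example, cpus_to_mask([2]) gives 4, which is the value you would write (as root) to /proc/irq/146/smp_affinity to steer eth5_i0 to CPU2.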
