> Stephen Hemminger wrote:
>> On Fri, 24 Aug 2007 17:47:15 +0200
>> Jan-Bernd Themann <ossthema@de.ibm.com> wrote:
>>
>>> Hi,
>>>
>>> On Friday 24 August 2007 17:37,
akepner@sgi.com wrote:
>>>> On Fri, Aug 24, 2007 at 03:59:16PM +0200, Jan-Bernd Themann wrote:
>>>>> .......
>>>>> 3) On modern systems the incoming packets are processed very fast.
>>>>> Especially
>>>>> on SMP systems when we use multiple queues we process only a
>>>>> few packets
>>>>> per napi poll cycle. So NAPI does not work very well here and
>>>>> the interrupt rate is still high. What we need would be some
>>>>> sort of timer polling mode which will schedule a device after a
>>>>> certain amount of time for high load situations. With high
>>>>> precision timers this could work well. Current
>>>>> usual timers are too slow. A finer granularity would be needed
>>>>> to keep the
>>>>> latency down (and queue length moderate).
>>>>>
>>>> We found the same on ia64-sn systems with tg3 a couple of years
>>>> ago. Using simple interrupt coalescing ("don't interrupt until
>>>> you've received N packets or M usecs have elapsed") worked
>>>> reasonably well in practice. If your h/w supports that (and I'd
>>>> guess it does, since it's such a simple thing), you might try it.
>>>>
>>> I don't see how this should work. Our latest machines are fast
>>> enough that they
>>> simply empty the queue during the first poll iteration (in most cases).
>>> Even if you wait until X packets have been received, it does not
>>> help for
>>> the next poll cycle. The average number of packets we process per
>>> poll queue
>>> is low. So a timer would be preferable that periodically polls the
>>> queue, without the need of generating a HW interrupt. This would
>>> allow us
>>> to wait until a reasonable amount of packets have been received in
>>> the meantime
>>> to keep the poll overhead low. This would also be useful in combination
>>> with LRO.
>>>
>>
>> You need hardware support for deferred interrupts. Most devices have
>> it (e1000, sky2, tg3)
>> and it interacts well with NAPI. It is not a generic thing you want
>> done by the stack,
>> you want the hardware to hold off interrupts until X packets or Y
>> usecs have expired.
>
> Does hardware interrupt mitigation really interact well with NAPI? In
> my experience, holding off interrupts for X packets or Y usecs does
> more harm than good; such hardware features are useful only when the
> OS has no NAPI-like mechanism.
>
> When tuning NAPI drivers for packets/sec performance (which is a good
> indicator of driver performance), I make sure that the driver stays in
> NAPI polled mode while it has any rx or tx work to do. If the CPU is
> fast enough that all work is always completed on each poll, I have the
> driver stay in polled mode until dev->poll() is called N times with no
> work being done. This keeps interrupts disabled for reasonable traffic
> levels, while minimizing packet processing latency. No need for
> hardware interrupt mitigation.