Hello Calin,
Thursday, April 30, 2009, 1:49:46 AM, you wrote:
> Wednesday, April 29, 2009, 7:21:11 PM, you wrote:
>> I finally managed to disable NAPI on e1000e - apparently it can only be
>> done on the "official" Intel driver (downloaded from their website), by
>> compiling with "make CFLAGS_EXTRA=-DE1000E_NO_NAPI". This doesn't seem
>> to be available in the (2.6.29) kernel driver.
>> With NAPI disabled, 4 (of 8) cores go to 100% (instead of only one), but
>> overall throughput *decreases* from ~110K pps (with NAPI) to ~80K pps.
>> This makes sense, since h/w interrupt is much more time consuming than
>> polling (that's the whole idea behind NAPI anyway).
>> Radu Rendec
> I tested with e1000 only, on a single quad-core CPU - the L2 cache was
> shared between the cores.
> For 8 cores I suppose you have 2 quad-core CPUs. If the cores actually
> used belong to different physical CPUs, L2 cache sharing does not occur -
> maybe this could explain the performance drop in your case.
> Or there may be another explanation...
> Anyway - coming back to David Miller's words:
> "HTB acts upon global state, so anything that goes into a particular
> device's HTB ruleset is going to be single threaded.
> There really isn't any way around this."
> It could be the only way to get more power is to increase the number
> of devices where you are shaping. You could split the IP space into 4 groups
> and direct the traffic to 4 IMQ devices with 4 iptables rules -
> -d 0.0.0.0/2 -j IMQ --todev imq0,
> -d 64.0.0.0/2 -j IMQ --todev imq1, etc...
> Or you can customize the split depending on the traffic distribution.
> ipset nethash match can also be used.
> The 4 devices can have the same htb ruleset, only the right parts
> of it will match.
> You should test with 4 flows that use all the devices simultaneously and
> see what is the aggregate throughput.
> The performance gained through parallelism might be a lot higher than the
> added overhead of iptables and/or ipset nethash match. Anyway - this is more of
> a "hack" than a clean solution :)
> p.s.: latest IMQ at http://www.linuximq.net/ is for 2.6.26 so you will
> need to try with that
You will also need -i ethX (router), or -m physdev --physdev-in ethX
(bridge) to differentiate between upload and download in the iptables rules.
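
A minimal sketch of the four-way split for the router case, written as a
dry run that only prints the rules instead of applying them. Several things
here are assumptions and not something I have tested on your setup: the
mangle table and PREROUTING chain, the interface name eth0, the helper name
gen_imq_rules, and the numeric form of --todev (the IMQ target builds I
remember take a device index, so imq0 becomes 0; adjust if yours differs).

```shell
#!/bin/sh
# Dry run: print one iptables rule per /2 block of the IPv4 destination
# space, each steering its quarter of the traffic to a separate IMQ device.
# Nothing is applied; review the output before piping it to a shell.

gen_imq_rules() {
    in_if="$1"   # ingress interface (hypothetical name, e.g. eth0)
    idx=0        # IMQ device index: 0 -> imq0, 1 -> imq1, ...
    for net in 0.0.0.0/2 64.0.0.0/2 128.0.0.0/2 192.0.0.0/2; do
        # Assumed placement: mangle table, PREROUTING chain.
        echo "iptables -t mangle -A PREROUTING -i $in_if -d $net -j IMQ --todev $idx"
        idx=$((idx + 1))
    done
}

gen_imq_rules eth0
```

For the bridge case, the -i match would be replaced with
-m physdev --physdev-in ethX as described above. Each imqN device then
gets its own copy of the htb ruleset.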
--
Best regards,
Calin
mailto:calin.velea@gemenii.ro