It all depends on what the user passes to the test-specific -m option, which sets
how much data gets shoved into the socket each time "send" is called, and, I
suppose, on whether they use the test-specific -D option to set TCP_NODELAY in the
case of a TCP test with small values of -m. E.g.:
netperf -t TCP_STREAM ... -- -m 64K
vs
netperf -t TCP_STREAM ... -- -m 1024
vs
netperf -t TCP_STREAM ... -- -m 1024 -D
vs
netperf -t UDP_STREAM ... -- -m 1024
etc etc.
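To put rough numbers on the -m cases above, here is a minimal shell sketch of how
many MSS-sized segments a single "send" call can turn into. The MSS of 1448 bytes
(1500-byte MTU with TCP timestamps enabled) and the helper name segments_per_send
are assumptions for illustration only, not anything netperf reports, and what
actually goes out on the wire also depends on Nagle, the congestion window and so
on.

```shell
#!/bin/sh
# Hypothetical helper: ceiling of (bytes per send) / (MSS).
# Assumes an MSS of 1448 bytes unless a second argument is given.
segments_per_send() {
    bytes=$1
    mss=${2:-1448}
    echo $(( (bytes + mss - 1) / mss ))
}

segments_per_send 65536   # -m 64K  -> 46
segments_per_send 1024    # -m 1024 -> 1
```

Under those assumptions a 64K send hands the stack dozens of segments' worth of
data that TSO/GSO can batch, while a 1024-byte send is under one MSS on its own, so
any benefit depends on small sends being coalesced, which -D tends to work against.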
If the netperf test is:
netperf -t TCP_RR ... -- -r 1 (single-byte request/response)
then TSO/GSO/USO won't matter at all, since a one-byte payload leaves nothing to
segment, and it probably still won't matter even if the user has ./configure'd
netperf with --enable-burst and does:
netperf -t TCP_RR ... -- -r 1 -b 64
or
netperf -t TCP_RR ... -- -r 1 -b 64 -D
which was basically what I was doing for the 32-core scaling stuff I posted about
a few weeks ago. That was running on multi-queue NICs, so looking at some of the
profiles of the "no iptables" data may help show how big or small the problem is.
Keep in mind that my runs (either the XFrame II runs, or the Chelsio T3C runs
before them) had one queue per core in the system, and as such may be a best-case
scenario as far as per-queue lock contention goes.
ftp://ftp.netperf.org/
happy benchmarking,
rick jones
BTW, that setup went "poof" and had to go to other nefarious porpoises. I'm not
sure when I can recreate it, but I still have both the XFrame and T3C NICs for
when the HW comes free again.