Thanks Jarek.
Pawel made some reporting errors in the fib thread, so I am not sure he really
tried 2.6.30 and got the same oprofile results.
rt_worker_func() taking 13% of cpu0 is a warning sign for me :)
And 21% of cpu0 and 34% of cpu6 taken by oprofiled seems odd too...
Pawel, could you give us the output of:
grep . /proc/sys/net/ipv4/route/*
cat /proc/interrupts
on each of your kernels (before 2.6.29, 2.6.29, 2.6.30, ...)
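
To make the outputs easier to compare across kernels, the two commands above could be wrapped in a small script. This is only a sketch: the function names `snapshot` and `format_snapshot` are made up for illustration, and entries that cannot be read are simply skipped.

```python
import glob
import os
import platform


def snapshot(route_dir="/proc/sys/net/ipv4/route",
             interrupts_path="/proc/interrupts"):
    """Collect roughly what `grep . route/*` and `cat /proc/interrupts` show."""
    data = {"kernel": platform.release(), "route": {}, "interrupts": ""}
    for path in sorted(glob.glob(os.path.join(route_dir, "*"))):
        try:
            with open(path) as f:
                data["route"][os.path.basename(path)] = f.read().strip()
        except OSError:
            # Unreadable entries (for example write-only ones) are skipped.
            continue
    try:
        with open(interrupts_path) as f:
            data["interrupts"] = f.read()
    except OSError:
        # Leave the field empty if the file is missing or unreadable.
        pass
    return data


def format_snapshot(data):
    """Render a snapshot as plain text, one route sysctl per line."""
    lines = ["kernel: " + data["kernel"]]
    for name, value in sorted(data["route"].items()):
        lines.append("%s = %s" % (name, value))
    lines.append(data["interrupts"].rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_snapshot(snapshot()))
```

Run once per kernel and save the output to a file named after the kernel version; a plain diff of two files then shows any change in the route sysctls or in the interrupt distribution.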
I suspect a change in hash table size, and/or a change in interrupt affinities...
The change in hash table size comes from commit c9503e0fe052020e0294cd07d0ecd982eb7c9177
But as Pawel mentioned "net.ipv4.route.gc_thresh = 190536", I believe
his hash table is smaller than 512k entries!

Author: Anton Blanchard <anton@samba.org>
Date: Mon Apr 27 05:42:24 2009 -0700

    ipv4: Limit size of route cache hash table

    Right now we have no upper limit on the size of the route cache hash table.
    On a 128GB POWER6 box it ends up as 32MB:

    IP route cache hash table entries: 4194304 (order: 9, 33554432 bytes)

    It would be nice to cap this for memory consumption reasons, but a massive
    hashtable also causes a significant spike when measuring OS jitter.

    With a 32MB hashtable and 4 million entries, rt_worker_func is taking
    5 ms to complete. On another system with more memory it's taking 14 ms.
    Even though rt_worker_func does call cond_sched() to limit its impact,
    in an HPC environment we want to keep all sources of OS jitter to a minimum.

    With the patch applied we limit the number of entries to 512k which
    can still be overridden by using the rt_entries boot option:

    IP route cache hash table entries: 524288 (order: 6, 4194304 bytes)

    With this patch rt_worker_func now takes 0.460 ms on the same system.

    Signed-off-by: Anton Blanchard <anton@samba.org>
    Acked-by: Eric Dumazet <dada1@cosmosbay.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
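
The numbers in the two boot messages quoted in the commit can be cross-checked with a little arithmetic. The sketch below uses only figures from the commit message. The per-bucket size and the page size are not stated there; they are derived from the log lines, on the assumption that "order" means a power-of-two number of pages. The helper names are made up for illustration.

```python
def bucket_bytes(entries, table_bytes):
    """Bytes used per hash bucket, derived from the boot message."""
    return table_bytes // entries


def page_bytes(order, table_bytes):
    """Page size implied by an allocation of 2**order pages."""
    return table_bytes // (1 << order)


# Figures copied from the commit message.
before = {"entries": 4194304, "order": 9, "bytes": 33554432, "worker_ms": 5.0}
after = {"entries": 524288, "order": 6, "bytes": 4194304, "worker_ms": 0.460}

# How much the table shrank and how much rt_worker_func sped up.
size_ratio = before["entries"] / after["entries"]
time_ratio = before["worker_ms"] / after["worker_ms"]

if __name__ == "__main__":
    for label, cfg in (("before", before), ("after", after)):
        print(label,
              "bucket bytes:", bucket_bytes(cfg["entries"], cfg["bytes"]),
              "implied page bytes:", page_bytes(cfg["order"], cfg["bytes"]))
    print("table size ratio:", size_ratio)
    print("rt_worker_func time ratio: %.1f" % time_ratio)
```

Both log lines work out to the same bucket size and the same implied page size, so they are consistent with each other. The table shrinks by a factor of 8, while the reported rt_worker_func time drops by somewhat more than that.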