OK, I have a last comment Stephen
(I tested your patches on my dev machine and got nice results)
In net/ipv4/netfilter/ip_tables.c,
get_counters() can be quite slow on large firewalls, and it is
called from alloc_counters() with BH disabled, but also from __do_replace()
without BH disabled.
So get_counters() first iterates on set_entry_to_counter() for current cpu
but might be interrupted and we get bad results...
Solution is to always use seqcount_t protection, and no BH disabling at all.
Then, I wonder if all this is valid, since alloc_counters() might be supposed
to perform an atomic snapshots of all counters. This was possible because
of a write_lock_bh(), but does it make any sense to block all packets for
such a long time ??? If this is a strong requirement, I am afraid we have
to make other changes...
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
net/ipv4/netfilter/ip_tables.c | 25 +++----------------------
1 files changed, 3 insertions(+), 22 deletions(-)
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 7c50e2e..d208b25 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -900,25 +900,8 @@ get_counters(const struct xt_table_info *t,
{
unsigned int cpu;
unsigned int i;
- unsigned int curcpu;
-
- /* Instead of clearing (by a previous call to memset())
- * the counters and using adds, we set the counters
- * with data used by 'current' CPU
- * We dont care about preemption here.
- */
- curcpu = raw_smp_processor_id();
-
- i = 0;
- IPT_ENTRY_ITERATE(t->entries[curcpu],
- t->size,
- set_entry_to_counter,
- counters,
- &i);
for_each_possible_cpu(cpu) {
- if (cpu == curcpu)
- continue;
i = 0;
IPT_ENTRY_ITERATE(t->entries[cpu],
t->size,
@@ -939,15 +922,12 @@ static struct xt_counters * alloc_counters(struct xt_table *table)
(other than comefrom, which userspace doesn't care
about). */
countersize = sizeof(struct xt_counters) * private->number;
- counters = vmalloc_node(countersize, numa_node_id());
+ counters = __vmalloc(countersize, GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
if (counters == NULL)
return ERR_PTR(-ENOMEM);
- /* First, sum counters... */
- local_bh_disable();
get_counters(private, counters);
- local_bh_enable();
return counters;
}
@@ -1209,7 +1189,8 @@ __do_replace(struct net *net, const char *name, unsigned int valid_hooks,
void *loc_cpu_old_entry;
ret = 0;
- counters = vmalloc(num_counters * sizeof(struct xt_counters));
+ counters = __vmalloc(num_counters * sizeof(struct xt_counters),
+ GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL);
if (!counters) {
ret = -ENOMEM;
goto out;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html