[RFC 0/4]x86: allocate up to 32 tlb invalidate vectors

From: Shaohua Li
Date: Tuesday, November 2, 2010 - 11:44 pm

Hi,
In workloads with heavy page reclaim, flush_tlb_page() is called
frequently. We currently have 8 vectors for TLB flush, which is fine for
small machines. But on big machines with a lot of CPUs, the 8 vectors are
shared by all CPUs and we need a lock to protect them. This causes a lot
of lock contention. Please see patch 3 for detailed lock contention
numbers.
Andi Kleen suggested we use 32 vectors for TLB flush, which should be
fine even for 8-socket machines. Testing shows this reduces lock
contention dramatically (see patch 3 for numbers).
One might argue that this wastes too many vectors and leaves fewer
vectors for devices. This could be a problem. But even if we use 32
vectors, we still leave 78 vectors for devices. And now that we have
per-CPU vectors, vectors aren't scarce any more, but I'm open to
discussion if anybody has objections.

Thanks,
Shaohua

Messages in current thread:
[RFC 0/4]x86: allocate up to 32 tlb invalidate vectors, Shaohua Li, (Tue Nov 2, 11:44 pm)
Re: [RFC 0/4]x86: allocate up to 32 tlb invalidate vectors, H. Peter Anvin, (Mon Nov 15, 10:53 am)