On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:

I ported the GRU driver to use the latest #v6 patch and ran a series of
tests on it using our system simulator. The simulator is slow, so true
stress or swapping is not possible - at least within a finite amount of
time.

Functionally, the #v6 patch seems to work for the GRU. However, I did
notice two significant differences that make the #v6 performance worse
for the GRU than Christoph's patch. I think one difference is easily
fixable but the other is more difficult:

  - The mmu_notifier_release() callout is at a different place in the
    2 patches. Christoph has the callout BEFORE the call to
    unmap_vmas(), whereas you have it AFTER. The net result is that the
    GRU does a LOT of 1-page TLB flushes during process teardown. These
    flushes are not done with Christoph's patch.

  - The range callouts in Christoph's patch benefit the GRU because
    multiple TLB entries can be flushed with a single GRU instruction
    (the GRU hardware supports a range flush using a vaddr & length).
    The #v6 patch does a TLB flush for each page in the range. Flushing
    on the GRU is slow, so being able to flush multiple pages with a
    single request is a benefit.

Seems like the latter difference could be significant for other users
of mmu notifiers.

--- jack
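
To make the second point concrete, here is a minimal user-space sketch of the difference in flush request counts. It is not GRU driver or mmu notifier code: hw_flush_range(), flush_per_page(), flush_as_range() and SKETCH_PAGE_SIZE are all hypothetical names made up for illustration, and the only thing modeled is that one request can cover a vaddr and a length.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed page size for the sketch; not taken from the GRU or the patches. */
#define SKETCH_PAGE_SIZE 4096UL

/* Counts how many flush requests were issued to the (imaginary) hardware. */
static unsigned long flush_requests;

/*
 * Hypothetical stand-in for a single hardware flush request that covers
 * the range [vaddr, vaddr + len). Here it only bumps a counter.
 */
static void hw_flush_range(unsigned long vaddr, unsigned long len)
{
    (void)vaddr;
    (void)len;
    flush_requests++;
}

/* Per-page style: one flush request for every page in [start, end). */
static unsigned long flush_per_page(unsigned long start, unsigned long end)
{
    unsigned long addr;

    assert(start <= end);
    assert(start % SKETCH_PAGE_SIZE == 0 && end % SKETCH_PAGE_SIZE == 0);

    flush_requests = 0;
    for (addr = start; addr < end; addr += SKETCH_PAGE_SIZE)
        hw_flush_range(addr, SKETCH_PAGE_SIZE);
    return flush_requests;
}

/* Range style: a single flush request for the whole of [start, end). */
static unsigned long flush_as_range(unsigned long start, unsigned long end)
{
    assert(start <= end);

    flush_requests = 0;
    if (end > start)
        hw_flush_range(start, end - start);
    return flush_requests;
}
```

Under these assumptions, tearing down a 256-page mapping costs 256 requests with flush_per_page() and 1 with flush_as_range(). If each request is slow, as described above for the GRU, the request count is what dominates.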
