On Mon, 23 Jun 2008, Hugh Dickins wrote:

The problem is that the old code said:

 - we can use FOLL_ANON, assuming that the vma has no vm_ops, or has no
   "fault" callback.

That was fundamentally broken. Because you can have a "nopfn" callback.

But it's hard to notice, since the whole FOLL_ANON code only _used_ to
trigger if a whole page table was missing.

The VM_LOCKED test was just crazy, but I doubt it was the cause of the bug.

That's still crazy. make_pages_present() already does:

	write = (vma->vm_flags & VM_WRITE) != 0;

and passes that in to "get_user_pages()". So for a writable mapping, we'll
elide the FOLL_ANON case anyway, and for a read-only mapping we should have
used ZERO_PAGE.

Damn. Oh, well. We can certainly re-instate the insane behaviour for
mlock(). Not that we historically used to - we used to just map in
ZERO_PAGE.

So here's a third patch to test. It removes the VM_SHARED thing just to get
us closer to the original code (and because do_no_page() didn't do it
historically, so let's not do it either), and it re-instates the insane
VM_LOCKED test with a comment.

Jeff, does this still work with vmware?

		Linus

---
 mm/memory.c |   20 ++++++++++++++++++--
 1 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 9aefaae..a2ce28d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1045,6 +1045,23 @@ no_page_table:
 	return page;
 }
 
+/* Can we do the FOLL_ANON optimization? */
+static inline int use_zero_page(struct vm_area_struct *vma)
+{
+	/*
+	 * We don't want to optimize FOLL_ANON for make_pages_present()
+	 * when it tries to page in a VM_LOCKED region.
+	 */
+	if (vma->vm_flags & VM_LOCKED)
+		return 0;
+	/*
+	 * And if we have a fault or a nopfn routine, it's not an
+	 * anonymous region.
+	 */
+	return !vma->vm_ops ||
+		(!vma->vm_ops->fault && !vma->vm_ops->nopfn);
+}
+
 int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long start, int len, int write, int force,
 		struct page **pages, struct vm_area_struct **vmas)
@@ -1119,8 +1136,7 @@ int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		foll_flags = FOLL_TOUCH;
 		if (pages)
 			foll_flags |= FOLL_GET;
-		if (!write && !(vma->vm_flags & VM_LOCKED) &&
-		    (!vma->vm_ops || !vma->vm_ops->fault))
+		if (!write && use_zero_page(vma))
 			foll_flags |= FOLL_ANON;
 
 		do {
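
For readers following along outside the kernel tree, here is a rough user-space sketch of the difference between the removed test and use_zero_page(). It is not kernel code: the struct definitions are cut-down stand-ins with simplified callback signatures, the VM_LOCKED value is illustrative, and old_check(), stub_nopfn(), nopfn_only_case_differs() and self_check() are hypothetical names made up for this example. The !write part of the condition is left out because it is the same before and after the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag value; only its being a distinct bit matters here. */
#define VM_LOCKED 0x2000UL

struct vm_area_struct;

/* Stand-in for the kernel struct: just the two callbacks the check reads. */
struct vm_operations_struct {
	int (*fault)(struct vm_area_struct *vma);
	unsigned long (*nopfn)(struct vm_area_struct *vma, unsigned long addr);
};

/* Stand-in for the kernel struct: just the two fields the check reads. */
struct vm_area_struct {
	unsigned long vm_flags;
	const struct vm_operations_struct *vm_ops;
};

/* The test removed by the patch: only looks at the "fault" callback. */
int old_check(const struct vm_area_struct *vma)
{
	return !(vma->vm_flags & VM_LOCKED) &&
		(!vma->vm_ops || !vma->vm_ops->fault);
}

/* Same logic as use_zero_page() in the patch: also rejects "nopfn". */
int use_zero_page(const struct vm_area_struct *vma)
{
	if (vma->vm_flags & VM_LOCKED)
		return 0;
	return !vma->vm_ops ||
		(!vma->vm_ops->fault && !vma->vm_ops->nopfn);
}

/* Hypothetical nopfn handler; its body is irrelevant to the check. */
static unsigned long stub_nopfn(struct vm_area_struct *vma, unsigned long addr)
{
	(void)vma;
	(void)addr;
	return 0;
}

/* Returns 1 if a nopfn-only vma passes the old test but not the new one. */
int nopfn_only_case_differs(void)
{
	const struct vm_operations_struct ops = {
		.fault = NULL,
		.nopfn = stub_nopfn,
	};
	const struct vm_area_struct vma = { .vm_flags = 0, .vm_ops = &ops };

	return old_check(&vma) == 1 && use_zero_page(&vma) == 0;
}

/* Small self-check covering the nopfn-only case. */
void self_check(void)
{
	assert(nopfn_only_case_differs());
}
```

A vma with no vm_ops at all passes both tests, and a VM_LOCKED vma fails both, so the nopfn-only mapping is the one case where the two disagree.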
