On Wed, 14 May 2008, Christoph Lameter wrote:

So you queue them. That's what we do with things like the dirty bit. We need to hold various spinlocks to look up pages, but then we can't actually call the filesystem with the spinlock held.

Converting a spinlock to a waiting lock for things like that is simply not acceptable. You have to work with the system.

Yeah, there's only a single bit worth of information on whether a page is dirty or not, so "queueing" that information is trivial (it's just the return value from "page_mkclean_file()"). Some things are harder than others, and I suspect you need some kind of "gather" structure to queue up all the vma's that can be affected.

But it sounds like for the case of rmap, the approach of:

 - the page lock is the higher-level "sleeping lock" (which makes sense, since this is very close to an IO event, and that is what the page lock is generally used for)

   But hey, it could be anything else - maybe you have some other even bigger lock to allow you to handle lots of pages in one go.

 - with that lock held, you do the whole rmap dance (which requires spinlocks) and gather up the vma's and the struct mm's involved.

 - outside the spinlocks you then do whatever it is you need to do.

This doesn't sound all that different from TLB shoot-down in SMP, and the "mmu_gather" structure.

Now, admittedly we can do the TLB shoot-down while holding the spinlocks, but if we couldn't that's how we'd still do it: it would get more involved (because we'd need to guarantee that the gather can hold *all* the pages - right now we can just flush in the middle if we need to), but it wouldn't be all that fundamentally different.

And no, I really haven't even wanted to look at what XPMEM really needs to do, so maybe the above thing doesn't work for you, and you have other issues. I'm just pointing you in a general direction, not trying to say "this is exactly how to get there".

		Linus
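
[Editorial illustration, not part of the original mail.] The gather-then-act pattern described above could be sketched roughly as follows in plain user-space C. This is a toy model under stated assumptions, not kernel code: struct toy_vma, struct vma_gather, the toy_spin_lock()/toy_spin_unlock() helpers, and sleeping_callback() are all hypothetical names invented here. The "spinlock" is just a flag, so the assertions can check that the sleeping step never runs while it is held.

```c
#include <assert.h>
#include <stddef.h>

#define GATHER_MAX 16            /* hypothetical fixed capacity of the gather */

struct toy_vma {                 /* stand-in for a vma; not the kernel struct */
    int id;
    int notified;
};

struct vma_gather {              /* queue of vmas found during the rmap walk */
    struct toy_vma *vmas[GATHER_MAX];
    size_t nr;
};

static int spin_held;            /* toy spinlock: 1 while "held" */

static void toy_spin_lock(void)   { assert(!spin_held); spin_held = 1; }
static void toy_spin_unlock(void) { assert(spin_held);  spin_held = 0; }

/* Models work that may sleep, so it must never run under the spinlock. */
static void sleeping_callback(struct toy_vma *vma)
{
    assert(!spin_held);
    vma->notified = 1;
}

/*
 * Phase 1: walk the mappings under the spinlock and only record them.
 * Returns 0 on success, -1 if the gather cannot hold every mapping
 * (there is no "flush in the middle" here, so it has to hold all of them).
 */
static int gather_vmas(struct vma_gather *g, struct toy_vma *maps, size_t n)
{
    int ret = 0;

    toy_spin_lock();
    g->nr = 0;
    for (size_t i = 0; i < n; i++) {
        if (g->nr == GATHER_MAX) {
            ret = -1;
            break;
        }
        g->vmas[g->nr++] = &maps[i];
    }
    toy_spin_unlock();
    return ret;
}

/* Phase 2: outside the spinlock, do the work that may sleep. */
static void process_gathered(struct vma_gather *g)
{
    for (size_t i = 0; i < g->nr; i++)
        sleeping_callback(g->vmas[i]);
}

/* Returns the number of vmas notified, or -1 if the gather was too small. */
static int notify_page_mappings(struct toy_vma *maps, size_t n)
{
    struct vma_gather g;

    /* A higher-level sleeping lock (the page lock in the mail) would be
     * taken here and released at the end; it is omitted in this toy. */
    if (gather_vmas(&g, maps, n) < 0)
        return -1;
    process_gathered(&g);
    return (int)g.nr;
}
```

Calling notify_page_mappings() on an array of three toy_vma entries marks all three as notified, while an array larger than GATHER_MAX is rejected before any callback runs. That rejection mirrors the point above about needing the gather to hold *all* the entries when flushing in the middle is not an option.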
