On Tue, 11 Sep 2007 18:31:12 +0900 Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> wrote:Perfect changelog, thanks. It really helps everyone. Your email client performs space-stuffing. That means the recipient needs to do s/^ / /g to apply the patch. I really don't like the games we play with page refcounts here. By the time we consider a page, that page could have been freed and could be reused for any purpose whatever. We don't have any business playing around with the page's internal state when we're unsure of what it is being used for. Nick is the guy who skates on thin ice. If you join him, we all drown ;) The page_count() test here is a bit of a worry, too. Why do we need it? The caller must have pinned the page in some fashion else we couldn't use it safely in this function at all. I assume that you discovered that once we've cleared PageWriteback(), the page can get reclaimed elsewhere? If so, that could still happen immediately after the page_count() test. It's all a bit of a worry. Deferring the ClearPageWriteback() will fix any race concerns, but I do think that we need to take a ref on the page for the pagevec ownership. The tricky part will be working out whether you've found all the places which need to have move_tail_pages() added to them. pagevec_move_tail() needs a comment explaining that it must be called with preemption disabled. Otherwise it has nasty races. Actually, given that pagevec_move_tail() is called from both interrupt and non-interrupt context, I guess that it needs local_irq_save() protection as well. In which case the preempt_disable() and preempt_enable() in move_tail_pages() can be optimised away (use local_irq_save() and __get_cpu_var()). So I do think that for safety and sanity's sake, we should be taking a ref on the pages when they are in a pagevec. That's going to hurt your nice performance numbers :( Please consider doing a single call to __count_vm_events() in pagevec_move_tail(), instead of multiple calls to __count_vm_event(). It's a bit of a concern that we can have ((num_online_cpus - 1) * PAGEVEC_SIZE) pages outstanding in the other CPU's pagevecs. But I _guess_ that'll be OK - it's a relatively small number of pages. What happens during cpu hot-unplug? If we take a ref on the pageved'ed pages then we have pages stranded in the now-unplugged cpu's pagevec? Might need a hotplug notifier for that. Memory hot-unplug? What happens if we have pointers to hot-unplugged pages in some cpu's pagevec? Nothing, I guess. Historical note on pagevecs: like just about everything else in page reclaim, this is primarily an IO scheduling problem (if rotating disks ever become obsolete, we get to throw out a huge amount of stuff). When I first did the pagevec batching code I found that pageout was doing increased amounts of disk seeking. This was because a few pages were getting stranded in the per-cpu pagevecs on "other" CPUs. So pageout on "this" CPU was doing large writes with small "holes" in them (small problem). But later, when the "other" CPU's pagevec was spilled, we ended up with pages on the LRU which were dirty, but which were no longer close-on-disk to any other dirty pages. IOW: hiding dirty pages in the per-cpu pagevecs was deoptimising IO patterns coming out of page reclaim. So lru_add_drain() got added and calls to it were sprinkled around the place. That's just an fyi. I doubt if similar problems will occur with your change, because these pages are clean. -
| Linus Torvalds | Linux 2.6.27-rc8 |
| Greg KH | [patch 00/71] 2.6.26-stable review |
| Dmitry Torokhov | 2.6.27-rc8+ - first impressions |
| Rafael J. Wysocki | [Bug #11215] INFO: possible recursive locking detected ps2 command |
git: | |
| Christian MICHON | Re: MinGW port - initial work uploaded |
| Luiz Fernando N. Capitulino | Libification project (SoC) |
| Linus Torvalds | People unaware of the importance of "git gc"? |
| Jakub Narebski | [RFC] Git User's Survey 2008 |
| Richard Stallman | Real men don't attack straw men |
| Tony Abernethy | Re: What is our ultimate goal?? |
| GVG GVG | ssh_exchange_identification: Connection closed by remote host |
| James Hartley | scp batch mode? |
| Ingo Molnar | Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets |
| Timo Teräs | Re: xfrm_state locking regression... |
| Ingo Molnar | Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+ |
| Natalie Protasevich | [BUG] New Kernel Bugs |
