Hi Andrew,

Unfortunately, I agreed with your suggestion too hastily. Not only would it be complex to implement, it does not work. It took me several days to put my finger on exactly why. Here it is in a nutshell: resources may be consumed _after_ the gatekeeper makes the "go, no go" throttling decision.

To illustrate, throw 10,000 bios simultaneously at a block stack that is supposed to allow only about 1,000 in flight at a time. If the block stack allocates memory somewhat late in its servicing scheme (for example, when it sends a network message), then it is possible that no actual resource consumption will have taken place before all 10,000 bios are allowed past the gatekeeper, and deadlock is sure to occur sooner or later.

In general, we must throttle against the maximum requirement of in-flight bios rather than against the measured consumption. This achieves the invariant I have touted, namely that memory consumption on the block writeout path must be bounded. We could therefore possibly use your suggestion, or something resembling it, to implement a debug check that the programmer did in fact do their bounds arithmetic correctly, but it is not useful for enforcing the bound itself.

In case that coffin needs more nails in it, consider that we would need to account not only for page allocations, but for frees as well. So what tells us that a page has returned to the reserve pool? Oops, tough one. The page may have been returned to a slab and thus not actually freed, though it remains available for satisfying new bio transactions. Because of such caching, your algorithm would quickly lose track of available resources and grind to a halt.

Never mind that keeping track of page frees is a nasty problem in itself. They can occur in interrupt context, so forget the current-> idea. Even keeping track of page allocations for bio transactions in normal context will be a mess, and that is the easy part.
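To make the gate arithmetic concrete, here is a minimal userspace sketch in C of the two kinds of gatekeeper discussed above. It is an illustration only, not kernel code and not the patch posted at the head of this thread: every name in it (struct throttle, throttle_try_admit, struct measured_gate, and so on) is hypothetical, and each bio is assumed to carry a worst-case cost of one unit. The first gate charges the maximum requirement at submission time; the second decides from measured consumption, which is still zero while the burst arrives because the stack allocates late.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two gates; all names are assumptions. */

/* Gate that charges the worst-case requirement when a bio is admitted. */
struct throttle {
    unsigned limit;     /* maximum units allowed in flight */
    unsigned inflight;  /* units charged at the gate and not yet released */
};

/* Admit only if the worst-case cost still fits under the bound. */
static bool throttle_try_admit(struct throttle *t, unsigned max_cost)
{
    if (max_cost > t->limit - t->inflight)
        return false;           /* the submitter has to wait */
    t->inflight += max_cost;    /* charged now, before any allocation */
    return true;
}

/* Return the charge when the bio completes. */
static void throttle_release(struct throttle *t, unsigned max_cost)
{
    assert(t->inflight >= max_cost);    /* debug check on the arithmetic */
    t->inflight -= max_cost;
}

/* Flawed gate: decides from consumption measured so far. */
struct measured_gate {
    unsigned limit;     /* intended bound on consumption */
    unsigned consumed;  /* units actually allocated so far */
};

static bool measured_try_admit(const struct measured_gate *g)
{
    return g->consumed < g->limit;
}

/* Submit n unit-cost bios before any is serviced; count how many get in. */
static unsigned burst_reserved(struct throttle *t, unsigned n)
{
    unsigned admitted = 0;

    for (unsigned i = 0; i < n; i++)
        if (throttle_try_admit(t, 1))
            admitted++;
    return admitted;
}

/* Same burst against the measuring gate. Nothing has been allocated yet,
 * so consumed never changes while the burst is being admitted. */
static unsigned burst_measured(const struct measured_gate *g, unsigned n)
{
    unsigned admitted = 0;

    for (unsigned i = 0; i < n; i++)
        if (measured_try_admit(g))
            admitted++;
    return admitted;
}
```

With a limit of 1,000 and a burst of 10,000 bios, the reserving gate stops admitting once 1,000 units are charged, while the measuring gate lets the whole burst through because its counter has not moved yet. Note that the release side needs no knowledge of where a freed page ended up: it simply returns the charge made at admission.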
I can just imagine the code attempting to implement this accounting approach accreting into a monster that gets confusingly close to working without ever actually getting there.

We do have a simple, elegant solution posted at the head of this thread, which is known to work.

Regards,

Daniel
