Cc: Christoph Lameter <clameter@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <dkegel@...>, David Miller <davem@...>, Daniel Phillips <phillips@...>
On Tue, Aug 21, 2007 at 05:29:27PM +0200, Peter Zijlstra wrote:
Yes.
OK, I don't know exactly about MPI workloads. But I mean a few basic
things like the C and MPI libraries could already be quite big before
you even consider the application text (OK it won't be all paged in).
Maybe it won't be enough, but I think some form of recursive reclaim
will be better than our current scheme. Even assuming your patches are
in the kernel, don't you think it is a good idea to _not_ have potentially
complex writeout from reclaim just default to using up memory reserves?
But the code would end up better, wouldn't it? And it could be done
incrementally?
It wouldn't use reclaimable memory as such, but would have some small
amounts of reserve memory for allocating all those things required to
get a response from critical sockets. NBD for example would also then
be sure to reserve enough memory to at least clean one page etc. That's
the way the block layer has gone, which seems to be pretty good and I
think much better than having it in the buddy allocator.
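
As a rough userspace sketch of what I mean by a small per-subsystem
reserve (this is not the kernel's mempool code; reserve_pool, pool_alloc
and the simulate_oom flag are names made up for illustration, and malloc
stands in for the page allocator): try the normal allocator first, dip
into the preallocated reserve only when that fails, and refill the
reserve before giving memory back.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a subsystem-private reserve. */
struct reserve_pool {
	void **elems;      /* preallocated reserve elements */
	size_t min_nr;     /* how many elements the pool guarantees */
	size_t cur_nr;     /* elements currently sitting in the reserve */
	size_t elem_size;  /* size of each element in bytes */
};

static void pool_destroy(struct reserve_pool *p)
{
	while (p->cur_nr > 0)
		free(p->elems[--p->cur_nr]);
	free(p->elems);
	p->elems = NULL;
}

/* Fill the reserve up front, while memory is still plentiful. */
static int pool_init(struct reserve_pool *p, size_t min_nr, size_t elem_size)
{
	p->elems = calloc(min_nr, sizeof(void *));
	if (!p->elems)
		return -1;
	p->min_nr = min_nr;
	p->cur_nr = 0;
	p->elem_size = elem_size;
	while (p->cur_nr < min_nr) {
		void *e = malloc(elem_size);
		if (!e) {
			pool_destroy(p);
			return -1;
		}
		p->elems[p->cur_nr++] = e;
	}
	return 0;
}

/*
 * Normal allocation first; the reserve is only touched when that fails.
 * simulate_oom forces the failure path so the fallback can be exercised.
 */
static void *pool_alloc(struct reserve_pool *p, int simulate_oom)
{
	void *e = simulate_oom ? NULL : malloc(p->elem_size);

	if (e)
		return e;
	if (p->cur_nr > 0)
		return p->elems[--p->cur_nr];
	return NULL; /* a real pool would wait here for an element to be freed */
}

/* Top the reserve back up before returning memory to the system. */
static void pool_free(struct reserve_pool *p, void *e)
{
	assert(e != NULL);
	if (p->cur_nr < p->min_nr)
		p->elems[p->cur_nr++] = e;
	else
		free(e);
}
```

The point is that the reserve is sized and owned by the subsystem that
needs it (say, enough to clean one page), so the guarantee doesn't depend
on who else is eating into a global reserve in the allocator.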
I don't know if that is a really good advantage. The amount of memory
involved should just be pretty small. I mean it is an advantage, but
there are other disadvantages (imagine the mess if other subsystems used
their own global reserves in the allocator rather than mempools etc). I
don't see why networking is fundamentally more deserving of its own pools
in the allocator than anybody else.