Re: RFC: CONFIG_PAGE_SHIFT (aka software PAGE_SIZE)

From: Andrea Arcangeli
Date: Tuesday, July 10, 2007 - 3:11 am

On Mon, Jul 09, 2007 at 09:20:31AM +1000, David Chinner wrote:

I didn't misunderstand. But the reason you can't use a larger
blocksize than 4k is because the PAGE_SIZE is 4k, and
CONFIG_PAGE_SHIFT raises the PAGE_SIZE to 8k or more, so you can then
enlarge the filesystem blocksize too.
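
To make the relation concrete, here is a minimal sketch of how a software page shift maps to PAGE_SIZE and to the largest blocksize that still fits in one page. It assumes a 4k hardware page (shift 12); HW_PAGE_SHIFT and the helper names are hypothetical, for illustration only, and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the hardware page is 4k, i.e. a shift of 12. */
#define HW_PAGE_SHIFT 12

/* Hypothetical helper: software PAGE_SIZE for a given page shift. */
static inline size_t soft_page_size(unsigned int page_shift)
{
    assert(page_shift >= HW_PAGE_SHIFT && page_shift < 8 * sizeof(size_t));
    return (size_t)1 << page_shift;
}

/* Number of hardware ptes that back one software page. */
static inline size_t hw_ptes_per_soft_page(unsigned int page_shift)
{
    return soft_page_size(page_shift) >> HW_PAGE_SHIFT;
}

/* A blocksize fits in a single page only if it does not exceed PAGE_SIZE. */
static inline int blocksize_fits_one_page(size_t blocksize,
                                          unsigned int page_shift)
{
    return blocksize <= soft_page_size(page_shift);
}
```

With a shift of 12 an 8k blocksize does not fit in one page; with a shift of 13 it does, at the price of two hardware ptes per software page.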


Of course, for I/O performance the CPU cost is mostly irrelevant,
especially with slow storage.


Yes, I'm aware of this, and my patch allows it too, in the same way.
The fundamental difference is that it should help your I/O layout
optimizations with a larger blocksize, while at the same time making
the _whole_ kernel faster. And it won't even waste more pagecache than
a variable order page size would (both CONFIG_PAGE_SHIFT and a
variable order page size will waste some pagecache compared to a 4k
page size). So both are best used for workloads manipulating large
files.


That should be possible the same way with both designs.


Sorry.


Totally agreed, your approach would be much better for dvd on the
desktop. If only I could trust it to be reliable (I guess I'd rather
stick to growisofs).

But for your _own_ usage, the big box with lots of ram and where a
blocksize of 4k is a blocker, my design should be much better because
it'll give you many more advantages on the CPU side too (the only
downside is the higher complexity in the pte manipulations).

Think about it: even if you end up mounting xfs with a 64k blocksize
on a kernel with a 16k PAGE_SIZE, that's still a smaller fragmentation
risk than using a 64k blocksize on a kernel with a 4k PAGE_SIZE. The
risk of failing defrag when the undefragmentable alloc_page() returns
a 4k page is much higher than when it returns a 16k page. The CPU cost
of defrag itself will be diminished by a factor of 4 too.
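
The arithmetic behind that comparison can be sketched as follows. This is only an illustration under the assumption that a block is backed by a power-of-two number of contiguous base pages; alloc_order() and pages_per_block() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= blocksize. */
static unsigned int alloc_order(size_t blocksize, size_t page_size)
{
    unsigned int order = 0;

    assert(page_size > 0 && blocksize > 0);
    while ((page_size << order) < blocksize)
        order++;
    return order;
}

/* Number of contiguous base pages needed to back one block. */
static size_t pages_per_block(size_t blocksize, size_t page_size)
{
    return (size_t)1 << alloc_order(blocksize, page_size);
}
```

A 64k block needs an order-4 allocation (16 contiguous pages) with a 4k PAGE_SIZE, but only an order-2 allocation (4 contiguous pages) with a 16k PAGE_SIZE. The 16-to-4 drop in pages that must be contiguous is one way to read the factor of 4.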


The equivalent waste will happen on disk if you raise the blocksize to
64k. The same waste will happen as well if you mounted the filesystem
with the cache kernel tree using a variable order page size of 64k.

I guess for maximizing cache usage during kernel development the ideal
PAGE_SIZE would be smaller than 4k...
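
As a rough illustration of that waste, the sketch below rounds a file size up to the allocation unit (the page size in cache, or the blocksize on disk) and returns the unused tail. The helper name and the sample file size are assumptions for illustration, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes wasted when a file of file_size bytes is stored in whole
 * units of `unit` bytes (a page in pagecache, or a block on disk).
 */
static size_t tail_waste(size_t file_size, size_t unit)
{
    size_t rounded;

    assert(unit > 0);
    rounded = (file_size + unit - 1) / unit * unit;
    return rounded - file_size;
}
```

For a hypothetical 1000-byte source file, a 4k unit wastes 3096 bytes and a 64k unit wastes 64536 bytes, while a 1k unit would waste only 24. That is why a tree of many small files favors a small unit and large files favor a big one.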


You guys need to explain to me how you solved the defrag issue if you
can't defrag the return value of alloc_page(GFP_KERNEL) =
4k. Furthermore, you never seem to account for the CPU cost of defrag
on big systems that may need to memcpy a lot of ram. My design doesn't
need proofs, it never requires memcpy, and it'll just always run as
fast as right after boot. Boosting the PAGE_SIZE is more of a black
and white, predictable thing, so I've no doubt I prefer it.

BTW, I asked Hugh to look into Bill's and Hugh's old patch to see if
there's some goodness we can copy to solve things like the underlying
overlapping anon page after write-protect faults over
MAP_PRIVATE. Perhaps there's a better way than looking at the nearby
ptes for one pointing to PG_anon or a swap entry, which is my current
idea. This is assuming their old patches were really using a similar
design to mine (btw, back then there was no PG_anon but I guess
checking page->mapping for null would have been enough to tell it was
an anon page).

Hugh also reminded me that at KS some years ago their old patch
boosting the PAGE_SIZE was dismissed because it looked unnecessary.
The major reason for wanting it back then was the mem_map_t array
size, and that's not an issue anymore on 64bit archs. But back then,
nobody proposed to boost the pagecache to order > 0 allocations, so
this is one reason why _now_ it's different. It's really your variable
order page size, and the defrag efforts that don't mathematically
guarantee defrag, that triggered my interest in CONFIG_PAGE_SHIFT.
