I suspect the cache line footprint is not the main problem here (we are
talking about only one additional cache line), but rather the potential
latency of fetching the other half. One possible alternative to increasing
the size of struct page would be to identify places that commonly touch a
page first (e.g. using oprofile) and then always add a prefetch() there to
fetch the other half of the struct page early.

A prefetch on something that is already in cache should be cheap,
so for the structs that don't straddle cache lines it shouldn't be a big
overhead.
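
To make the prefetch idea concrete, here is a rough userspace sketch. It is
not kernel code: struct fake_page, CACHE_LINE_SIZE, straddles_cache_line(),
prefetch_page_tail() and touch_page() are all made-up names, the 64-byte line
size is an assumption, and the GCC/Clang builtin __builtin_prefetch() stands
in for the kernel's prefetch().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64      /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for struct page; NOT the kernel's layout. */
struct fake_page {
    unsigned long flags;        /* touched first */
    char pad[56];               /* filler so the struct exceeds one line */
    unsigned long index;        /* lives in the "other half" */
};

/* Nonzero if the byte range [p, p + size) crosses a cache line boundary. */
static int straddles_cache_line(const void *p, size_t size)
{
    uintptr_t first = (uintptr_t)p / CACHE_LINE_SIZE;
    uintptr_t last = ((uintptr_t)p + size - 1) / CACHE_LINE_SIZE;

    return first != last;
}

/* Ask for the cache line holding the last byte of the struct (read, keep). */
static void prefetch_page_tail(const struct fake_page *page)
{
    assert(page != NULL);
    __builtin_prefetch((const char *)page + sizeof(*page) - 1, 0, 3);
}

/* A "first touch" site: issue the prefetch, then use both halves. */
static unsigned long touch_page(const struct fake_page *page)
{
    unsigned long v;

    prefetch_page_tail(page);
    v = page->flags;            /* first half, needed right away */
    return v + page->index;     /* second half, hopefully in flight by now */
}
```

The prefetch is issued unconditionally, on the theory above that it costs
little when the line is already present. straddles_cache_line() is only there
to check which structs would actually benefit.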

I don't think doing the ->virtual addition will buy very much,
because at least the 64-bit architectures will probably move
towards vmemmap where pfn->virt is quite cheap.
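
As a rough illustration of why this is cheap with a virtually contiguous
memory map: the conversions reduce to pointer arithmetic plus a shift. This
is a userspace sketch only. Every sketch_* name is hypothetical, the array
stands in for vmemmap, and the page shift and direct-map base are arbitrary
values picked for the example, not taken from any architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT       12                              /* assumed 4 KiB pages */
#define SKETCH_NR_PAGES         1024
#define SKETCH_DIRECT_MAP_BASE  UINT64_C(0xffff880000000000)    /* arbitrary example base */

/* Hypothetical minimal page descriptor. */
struct sketch_page {
    unsigned long flags;
};

/* Stand-in for vmemmap: one descriptor per pfn, contiguous in memory. */
static struct sketch_page sketch_memmap[SKETCH_NR_PAGES];

/* pfn -> descriptor: a single array index. */
static struct sketch_page *sketch_pfn_to_page(uint64_t pfn)
{
    assert(pfn < SKETCH_NR_PAGES);
    return &sketch_memmap[pfn];
}

/* descriptor -> pfn: a single pointer subtraction. */
static uint64_t sketch_page_to_pfn(const struct sketch_page *page)
{
    return (uint64_t)(page - sketch_memmap);
}

/* descriptor -> virtual address: subtraction, shift, add; no stored field. */
static uint64_t sketch_page_to_virt(const struct sketch_page *page)
{
    return SKETCH_DIRECT_MAP_BASE +
           (sketch_page_to_pfn(page) << SKETCH_PAGE_SHIFT);
}
```

With conversions this short, a cached ->virtual field saves very little
compared with recomputing the address.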

Of course the real long-term fix for struct page cache overhead
would be a larger soft page size.

-Andi