Oh, yeah!
When porting part of Ingo's work, I realized that a similar thing can be done
for fork().
If the whole address space is unmapped in init_new_context_skas(), the first
fix_range_common() call won't need to call unmap at all. He did this with
remap_file_pages(), where init_new_context_skas() must "unmap" everything
anyway.
This gives some speedup in lmbench (5% better in fork proc, 2% better in
exec proc), but the results are still controversial: one benchmark (called
'mmap latency') shows a 2% slowdown.
In a loop, it maps a region, touches one byte per page and unmaps it again,
with the region size growing on each pass (up to 32MB).
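To make that access pattern concrete, here is a minimal sketch of such a
loop. This is not lmbench's actual source: the function names are made up
for illustration, and I'm assuming an anonymous private mapping and a
size that doubles on each pass, which may differ from what the real
benchmark does.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `size` bytes, write one byte in every page
 * (forcing a page fault per page), then unmap the region again.
 * Returns the number of pages touched, or -1 on failure.
 */
static long map_touch_unmap(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    long touched = 0;
    char *region;
    size_t off;

    assert(page > 0);

    /* Assumption: anonymous memory; the real benchmark may map a file. */
    region = mmap(NULL, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    for (off = 0; off < size; off += (size_t)page) {
        ((volatile char *)region)[off] = 1;
        touched++;
    }

    if (munmap(region, size) != 0)
        return -1;
    return touched;
}

/*
 * Hypothetical driver: repeat the cycle with a growing region size,
 * doubling from `start` up to and including `max_size`.
 * Returns the total number of pages touched, or -1 on failure.
 */
static long run_growing(size_t start, size_t max_size)
{
    long total = 0;
    size_t size;

    if (start == 0)
        return -1;

    for (size = start; size <= max_size; size *= 2) {
        long n = map_touch_unmap(size);
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

Calling run_growing() with a small start size and a 32MB limit
(32UL * 1024 * 1024) exercises exactly the map / page fault / unmap paths
in question. To get a latency number you would still have to time each
pass, e.g. with clock_gettime().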
However, since the results aren't yet stable for some of the other
benchmarks (the context switching benchmark is crazy), I'm still studying
this.
===========================================
-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade