Its simpler code wise.
pcps are a particular problem on NUMA because the lists are replicated per
zone and per processor.
In the worst case we see that pcps cause a 5% performance drop (sequential
alloc w/o free followed by sequential free w/o allocs). See my page allocator
tests in my git tree.
pcps need improvement. The performance issues with the page allocator fastpath
are likely due to bloating of the fastpaths (antifrag did not do much good on
that level). Plus current crops of processors are sensitive to cache footprint
issues (seems that the tbench regression in the network stack also are due to
the same effect). Doubly linked lists are not good today because they touch
multiple cachelines.
--