On Mon, Aug 23, 2010 at 11:04:38AM -0500, Christoph Lameter wrote:
A delta on the NR_FREE_PAGES is the obvious problem. The page allocation
failure report I saw clearly stated that free was a value above min watermark
where as the buddy lists just as clearly showed that the number of pages on
the list were 0.
I am not aware of similar issues with another counter where drift causes
the system to make the wrong decision, are you?
Unfortunately, I do not have access to a machine large enough to investigate
around this area. All I have to go on is a few bug reports showing the delta
problem with NR_FREE_PAGES and test results in a patch functionally similar
to this patch showing that the livelock problem went away.
At best all we can do is keep an eye out for problems one large machines
that could be explained by counter drift. If such a bug is found with a
reporter with regular access to the machine for test kernels, we can
investigate if reducing the thresholds fix the problem without affecting
general performance.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--