Then do we need a new option, 'optimize for best overall performance', that
goes for size (and the corresponding wins there) most of the time, but
gives way to speed optimizations where they make a huge difference?
I started using -Os several years ago, even when it was hidden in the
embedded menu, because in many cases the smaller binary ended up being
faster.
In reality this was a flaw in gcc: on modern CPUs, with the larger
difference between CPU speed and memory speed, it still preferred to unroll
loops (eating more memory and blowing out the CPU cache) when it shouldn't
have.
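
To make the trade-off concrete, here is a minimal sketch of the same loop
in rolled and hand-unrolled form. The function names are made up for
illustration, and the unrolled version only approximates what a compiler
might emit; it is not taken from gcc's actual output.

```c
#include <stddef.h>

/* Rolled form: one add and one branch per element, small code footprint. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/*
 * Hand-unrolled by 4 (illustrative only): fewer branches per element,
 * but a larger loop body plus a cleanup loop for the leftover elements,
 * so more instruction-cache space is used for the same work.
 */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)  /* handle the remaining 0-3 elements */
        s += a[i];
    return s;
}
```

Both functions return the same result; the difference is code size.
Building the rolled version with `gcc -Os` and with `gcc -O2 -funroll-loops`
and comparing the outputs with `size` should show the growth, though which
one runs faster depends on the CPU and cache and has to be measured.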
If that has been fixed in later versions of gcc, this would be a good
thing. If it hasn't (possibly in part due to gcc optimizations being
designed to be cross-platform), then either the current 'go for size' or a
hybrid 'performance' option is needed.
David Lang