On Fri, Oct 12, 2007 at 10:25:34AM +0200, Jarek Poplawski wrote:
I'm not sure exactly what the situation is with the manufacturers,
but maybe they (at least Intel) wanted to keep their options open
WRT their barrier semantics, even if current implementations were
not taking full liberty of them.
I don't know quite what you're saying... the CPUs could probably get
performance by having weakly ordered loads, OTOH I think the Intel
ones might already do this speculatively so they appear in order but
essentially have the performance of weak order.
If you're just talking about this patch, then it probably isn't much
performance gain. I'm guessing you'd be lucky to measure it from
userspace.
I don't know if that would be worthwhile. It actually isn't always
trivial to trigger reordering. For example, on my dual-core core2,
in order to see reads pass writes, I have to do work on a set that
exceeds the cache size and does a huge amount of work to ensure it
is going to trigger that. If you can actually come up with a test
case that triggers load/load or store/store reordering, I'm sure
Intel / AMD would like to see it ;)
All existing processors as far as we know are in-order WRT loads vs
loads and stores vs stores. It was just a matter of getting the docs
clarified, which gives us more confidence that we're correct and a
reasonable guarnatee of forward compatibility.
So, I think the plan is just to merge these 3 patches during the
current window.
-