Re: [rfc][patch 3/3] x86: optimise barriers

To: Jarek Poplawski <jarkao2@...>
Cc: Nick Piggin <npiggin@...>, Linux Kernel Mailing List <linux-kernel@...>, Andi Kleen <ak@...>
Date: Friday, October 12, 2007 - 11:13 am

On Fri, 12 Oct 2007, Jarek Poplawski wrote:

I think the chip manufacturers really wanted to keep their options open.

Having the option to re-order loads in architecturally visible ways was 
something that they probably felt they really wanted to have. On the other 
hand:

 - I bet they had noticed that things break, and some applications depend 
   on fairly strong ordering (not necessarily in Linux-land, but..)

   I suspect hw manufacturers go through life hoping that "software 
   improves". They probably thought that getting rid of the old 16-bit 
   windows would mean that fewer people depended on undefined behaviour.

   And I suspect that they started noticing that no, with threads and 
   JVMs and things, *more* people started depending on fairly strong 
   memory ordering.

 - I suspect Intel in particular noticed that they can do a lot of very 
   aggressive re-ordering at a microarchitectural level, but can still 
   guarantee that *architecturally* they never show it (dynamic detection 
   of reordered loads being replayed on cache dirty events etc).

IOW, I suspect that both Intel and AMD noticed that while they had wanted 
to keep their options open, those options weren't really realistic, and 
not something that the market wanted (aggressive use of threading wants 
*stricter* memory ordering, not looser), and they could work well enough 
with a fairly strict memory model.


Quite frankly, even *within* Intel and AMD, there are damn few people who 
understand exactly what the memory ordering requirements and guarantees 
are and historically were for the different CPUs.

I would bet that had you asked a random (but still competent) Intel/AMD 
engineer that wasn't really intimately involved with the actual design of 
the cache protocols and memory pipelines, they would absolutely not have 
been able to tell you how the CPU actually worked.

So no, there's no way a software person could have afforded to say "it 
seems to work on my setup even without the barrier". On a dual-socket 
setup with a shared bus, that says absolutely *nothing* about the 
behaviour of the exact same CPU when used with a multi-bus chipset. Not to 
mention other revisions of the same CPU - much less a whole other 
microarchitecture.
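
To make the "works on my setup even without the barrier" trap concrete, 
here is a minimal userspace sketch of the classic message-passing pattern. 
It is an illustration only, not code from the patch: the names (payload, 
ready, run_message_passing) are hypothetical, and C11 fences stand in for 
kernel-style write and read barriers. It assumes a C11 compiler with 
pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names throughout; none of this comes from the patch. */
static int payload;        /* plain data, published through 'ready' */
static atomic_int ready;   /* flag the reader polls */

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;                               /* 1: write the data */
    atomic_thread_fence(memory_order_release);  /* plays the role of a write barrier */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);  /* 2: publish */
    return NULL;
}

static void *reader(void *arg)
{
    int *seen = arg;

    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                                       /* spin until published */
    atomic_thread_fence(memory_order_acquire);  /* plays the role of a read barrier */
    *seen = payload;                            /* ordered after the flag load */
    return NULL;
}

/* Runs one writer/reader pair and returns the payload the reader saw. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = 0;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

With both fences in place, run_message_passing() returns 42 under the 
C11 memory model on any conforming implementation. With the fences 
deleted, the program would very likely still appear to work on a given 
x86 box, which is exactly why such an observation proves nothing about 
other chipsets, revisions or microarchitectures.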

So yes, I've personally been aware for about a year that the memory 
ordering was likely going to be documented, but no way was I going to 
depend on it until Intel and AMD were ready to state so *publicly*. 
Because before that happens, they may have noticed errata etc that made it 
not work out.

Also, please note that we didn't even just change the barriers immediately 
when the docs came out. I want to do it soon - still *early* in the 2.6.24 
development cycle - exactly because bugs happen, and if somebody notices 
something strange, we'll have more time to perhaps decide that "oops, 
there's something bad going on, let's undo this for the real 2.6.24 
release until we can figure out the exact pattern".

		Linus