Please make a proposal for how gcc should be modified instead of just
shooting on the gcc people -- the root cause here is the way the C/C++
memory model is defined. (Note: I'm not in any way involved in gcc
development.)You can find my proposal to improve gcc here:
http://gcc.gnu.org/ml/gcc/2007-10/msg00465.htmlBart Van Assche.
-
Btw, I think this is fine per se, but putting "__attribute__((acquire))"
on the functions that acquire a lock does seem to be problematic, in that
quite often you might well want to inline those things. How would you
handle that?The fact is, the whole optimization is broken. You should never do
extraneous writes to anything but registers (or your own spill pool on the
stack - anything that hasn't had its address taken and cannot be visible
to outsiders). A C compiler should basically do what the user asks it to
do, and that means that it simply should be _very_ nervous about doing
optimizations that can have visible secondary effects.So the first question that should be asked is: "is that optimization even
worth it in the first place outside of registers and the spill pool?"If an optimization introduces visible behaviour differences to the
"obvious" stupid interpretation of the code, it should automatically be
something that should be given a lot of thought, and perhaps not enabled
at all by default (where "default" is certainly normal optimization
levels).And different languages have different usages. In C, it's quite common
(and _possible_) for the programmer to specify how to do things at a
fairly low level. That's not true in all other languages, and it affects
how a compiler should optimize things. In C, a compiler should give more
weight to "this is how the user specified the solution" because in C,
programmers really *do* that.So I don't think your proposal is wrong, but I think before even going to
something like that, you should ask yourself: "was that a valid
optimization to start with?"(That said, there may well be *other* reasons for wanting gcc to know
about lock/unlock behaviour and teaching gcc about barriers. If gcc starts
getting more threading knowledge, gcc may well need to have that kind of
information in other places).Linus
-
Thinking some more about this, you really have two cases:
- full functions taking/releasing locks (possibly conditionally, ie
with something lik etrylock and/or based on argument values).You simply *cannot* require these to be marked, because the locking may
have been done indirectly. Yes, you can mark things like
"pthread_mutex_trylock()" as being an acquire-function, but the fact
is, users will then wrap these things in *other* functions, and return
their return values.Ergo: a compiler *must* assume that a function call that it
didn't inline involves locking. There's no point in adding some
gcc-specific attributes to system header files, because it's not going
to fix anything in any portable program.- inline assembly (together with, potentially, compiler primitives).
That's the only other way to reliably do locking from C.This one gcc could certainly extend on. But would there really be any
upside? It would be easier/better to say that inline assembly (at least
if it clobbers memory or is volatile) has the same serialization issues
as a function call.So the fact is, any compiler that turns
if (conditional)
x++;into an unconditional write to x (where 'x' is potentially visible to the
outside - global visibility or has had its address taken) is just broken.No ifs, buts or maybes about it. You simply cannot do that optimization,
because there is no way for the compiler to know whether the conditional
implies that you hold a lock or not.Linus
-
A problem is that the serialization properties defined for functions
in the C standard only apply to volatile variables, not to
non-volatile variables. But for asm statements this can be solved by
adding memory to the list of clobbered registers -- this will prevent
any reordering of manipulations of non-volatile variables and asm
statements.Andrew, do you know whether gcc currently contains any optimization
that interchanges the order of accesses to non-volatile variables andI agree with the above, but I see this as a different issue -- it
wasn't my intention to solve this with my proposal for acquire and
release attributes.Bart Van Assche.
-
IFF the processor doesn't reorder them in hardware, which on some
processors is visibly out of order when viewed from an I/O device or
another CPU.You can stop the compiler but not the CPU - and some processors will
certainly speculatively load across conditionals, reorder writes etc-
The difference is that the CPU knows how to cancel most[1] side effects
of these speculative accesses (e.g. by not issuing exceptions[2] etc.).The compiler doesn't normally (except on some architectures with special support like IA64;
but I'm not sure gcc supports it there)[1] In some it can't and we've had problems with that in the past. e.g. in
a few cases speculative reads can be a problem. But we generally fix or
workaround those cases in the code.
[2] Modulo hardware bugs -- see the hall of shame in x86_64 fault.c-Andi
-
Well, when we're talking inline asms used for locking, the whole point of
using inline asm is exactly that you cannot do it with regular accesses,
and have to add architecture-specific barriers. If the user gets that
wrong, then it's a user problem, not a compiler issue.So that's not the problem. The problem is if the compiler then does other
things wrong *despite* the inline asm being correct.Linus
-
Bart Van Assche writes:
> Andrew, do you know whether gcc currently contains any optimization
> that interchanges the order of accesses to non-volatile variables
> and function calls ?It sure does.
Andrew.
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903
-
Note that doing so is perfectly fine.
But only for local variables that haven't had their addresses taken.
The fact is, those kinds of variables really *are* special. They are
provably not accessible from any other context, and re-ordering them (or
doing anything AT ALL to them - the most basic and very important
optimization is caching them in registers, of course) is always purely an
internal compiler issue.But if gcc re-orders functions calls with *other* memory accesses, gcc is
totally broken. I doubt it does that. It would break on all but the most
trivial programs, and it would be a clear violation of even standard C.HOWEVER: the bug that started this thread isn't even "reordering
accesses", it's *adding* accesses that weren't there (and please don't mix
this up with "volatile", since volatile is a totally unrelated issue and
has nothing what-so-ever to do with anything).Linus
-
Note that I'm very ambivalent about gcc. I think it's a *great* compiler.
I have my doubts about some of the things it does, but hey, it is an old
project, it has accumulated cruft over time, and cleaning things up is
often almost impossible.And bugs happen. I'm not complaining about that at all either, even if
sometimes a compiler bug is damn frustrating.And the fact is, most of the gcc developers are great.
So my basic worry about gcc is in fact none of the individual technical
problems themselves - those can be fixed. No, the problem I've seen in gcc
is that _some_ of the developers seem to be total d*ckheads when it comes
to "language definition", and seem to think that it's more important to
read the language spec like a lawyer than it is to solve actual user
problems.And that has come up before. It has nothing to do with this particular
"gcc doesn't create thread-safe code" issue. We had the exact same issue
with gcc creating totally unusable code due to having a fundamentally
braindead memory aliasing model. The language-lawyering people basically
could push their *idiotic* model onto gcc, just by quoting the standard,
and not caring about actual users at all.And that's a problem. In the kernel, we've historically always cared a lot
about POSIX and SuS, but while conforming to standards have been primary
goals since pretty much day one (ie I asked around about POSIX before the
very first release, and it's how I met some suckers^Wupstanding citizens
to try those early kernels), it has *always* taken a back seat to things
like compatibility with existing apps.The gcc lists seem to often get to the point where people quote the
standard, and that's that. In that environment, the paper standard (that
hass *nothing* to do with reality) trumps any other argument. "What we do
is _allowed_ by the standard" seems to be a good argument, even if it
breaks real code and there is no sane way to avoid doing it.And I really don't think it's eve...
So we have the great opportunity to change the standard, then
gcc will change ;-)C is in pre-review phase, and it is taking proposal and correction for the C1x
(so in few kernel release ;-) )ciao
cate
-
I see the smiley, but sadly, new standards take ten years or more to
mature. Which means that even if the upcoming one is "perfect", things
will be wrong with it, if only because people will have new usage
scenarios where the standard simply isn't relevant or that it otherwise
just doesn't address, and that then gets us back to the same issues
somewhere else.So it would be much better if developers just didn't think the standard
trumped "real and existing code and problems", and shot down the language
lawyers (and don't get me wrong - it's not just in gcc, btw. We _have_ had
some of the same behavior in the kernel, although I will argue that our
"backwards compatibility trumps pretty much everything else" rules at
least solves _some_ of the problems).Standards are just papers. Yes, they're important, but they are definitely
not more important than anything else, and they are a lot _less_ important
than some people seem to think. Gcc has done more for programming by being
a de-facto standard and widely available, than the _paper_ standards often
ever do!It's also sad that a lot of these things seem to be done in the name of
optimizing code, and then in many cases it drives people *away* from using
that optimizer for anything but benchmarking.In the kernel, we historically used to try for extreme optimizations,
these days we spend more time tuning the optimizations _down_ because they
aren't optimizations at all (ie using -Os instead of -O2), or they were
buggy enough that we have to explicitly disable them (aliasing,
"unit-at-a-time" etc).Linus
-
Linus Torvalds writes:
> So my basic worry about gcc is in fact none of the individual
> technical problems themselves - those can be fixed. No, the problem
> I've seen in gcc is that _some_ of the developers seem to be total
> d*ckheads when it comes to "language definition", and seem to think
> that it's more important to read the language spec like a lawyer
> than it is to solve actual user problems.Well, yeah. I know what you mean. However, at this moment, some gcc
developers are trying really hard not to be total d*ckheads about this
issue, but get gcc fixed. Give us a chance.Andrew.
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903
-
Can we get some kind of consensus that 'optimizations' that add writes to
any object that the programmer might have taken the address of are invalid
on any platform that supports memory protection? That seems like obvious
common sense to me.And it has the advantage that it can't be language-lawyered. There is no
document that states the rational requirements of a compiler that's going to
support a memory protection model. So they can be anything rational people
think they should be.DS
-
David Schwartz writes:
>
> > Well, yeah. I know what you mean. However, at this moment, some
> > gcc developers are trying really hard not to be total d*ckheads
> > about this issue, but get gcc fixed. Give us a chance.
>
> Can we get some kind of consensus that 'optimizations' that add
> writes to any object that the programmer might have taken the
> address of are invalid on any platform that supports memory
> protection?That's what the proposed standard language says, kinda-sorta. There's
an informal description at
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html.Anyway, we have fixed this bug and are committing it to all open gcc
branches. Credit to Ian Taylor for writing the patch.Andrew.
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903
-
There is other important information in the cited text. A.o. it is
explained that register promotion of potentially shared variables can
introduce data races. Or: register promotion can introduce bugs in
multithreaded software when compiled with optimization enabled. Are
there any register promotion transformations implemented in gcc that
can introduce data races in multithreaded software ? This is very
important information both for kernel developers and for developers of
multithreaded userspace applications.Another conclusion from the cited text is that in contrast with what
was stated before on the gcc mailing list, it is not required to
declare thread-shared variables volatile if that thread-shared data is
consistently protected by calls to locking functions.Bart Van Assche.
-
It all depends upon what threading standard you are using. If GCC is going
to support POSIX threading, it cannot require that thread-shared data be
marked 'volatile' since POSIX does not require this.It can offer semantic guarantees for volatile-qualified data if it wants to.
But POSIX provides a set of guarantees that do not require marking data as
'volatile' and if GCC is going to support POSIX threading, it has to support
providing those guarantees.As far as I know, no threading standard either requires 'volatile' or states
that it is sufficient for any particular purpose. So there seems to be no
reason to declare thread-shared variables as
volatile except as some kind of platform-specific optimization.POSIX mutexes are sufficient. They are necessary if there is no other way to
get the guarantees you need. Nothing prevents GCC from providing any
guarantees it wants for 'volatile' qualified data. But POSIX mutexes must
work as POSIX specifies or GCC cannot support POSIX threading.This is the nightmare scenario (thanks to Hans-J. Boehm):
int x;
bool need_to_lock;
pthread_mutex_t mutex;for(int i=0; i<50; i++)
{
if(unlikely(need_to_lock)) pthread_mutex_lock(&mutex);
x++;
if(unlikely(need_to_lock)) pthread_mutex_unlock(&mutex);
}Now suppose the compiler optimizes this as follows:
register=x;
for(int i=0; i<50; i++)
{
if(need_to_lock)
{
x=register; pthread_mutex_lock(&mutex) register=x;
}
register++;
if(need_to_lock)
{
x=register; pthread_mutex_unlock(&mutex); register=x;
}
}
x=register;This is a perfectly legal optimization for single-threaded code. It may in
fact be an actual optimization. Clearly, it totally destroys threaded code.This shows that, unfortunately, the normal assumption that not knowing
anything about the pthread functions ensures that optimizations won't break
them is incorrect.DS
-
Bart Van Assche writes:
> On 10/30/07, Andrew Haley <aph@redhat.com> wrote:
> > David Schwartz writes:
> > >
> > > Can we get some kind of consensus that 'optimizations' that add
> > > writes to any object that the programmer might have taken the
> > > address of are invalid on any platform that supports memory
> > > protection?
> >
> > That's what the proposed standard language says, kinda-sorta. There's
> > an informal description at
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html.
>
> There is other important information in the cited text. A.o. it is
> explained that register promotion of potentially shared variables
> can introduce data races. Or: register promotion can introduce bugs
> in multithreaded software when compiled with optimization
> enabled. Are there any register promotion transformations
> implemented in gcc that can introduce data races in multithreaded
> software ?I expect so. We're going to have to audit this whole enormous code
base to find them all and take them out.Note that some of these optimizations have been around since gcc 3.4.
> This is very important information both for kernel developers and
> for developers of multithreaded userspace applications.> Another conclusion from the cited text is that in contrast with
> what was stated before on the gcc mailing list, it is not required
> to declare thread-shared variables volatile if that thread-shared
> data is consistently protected by calls to locking functions.Well, let's be clear: ISO 9899:1999 doesn't say so, but the proposed
standard language does.Andrew.
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903
-
Has it already been decided who will do this audit, and when this
audit will happen ? Has a target date been set when this audit should
be complete, or is the completion of this audit a requirement for the
release of a specific gcc version ?And if there would exist register promotion transformations in gcc
that can introduce data races, which would be the optimization levels
that enable these transformations ?Bart Van Assche.
-
I am told that the gcc people realized that was indeed a bug (people were
able to show problems even in non-threaded environments with mprotect()),
and have now fixed it in the current gcc sources. That still leaves the
old versions with potential problems, but I think it makes it much less
interesting to audit for these things.Linus
-
What I understood from the gcc mailing list is that a patch has been
applied to the gcc sources that solves the issue with speculative
stores that was already discussed here on the LKML
(http://gcc.gnu.org/ml/gcc/2007-10/msg00554.html).But the issue I am referring to is a different issue: namely that a
compiler optimization called register promotion can introduce data
races. Hans J. Boehm has a clear explanation of this -- see also
paragraph 4.3 in
http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf or
http://portal.acm.org/citation.cfm?id=1064978.1065042 .Bart Van Assche.
-
Linus Torvalds writes:
>
>
> On Sun, 4 Nov 2007, Bart Van Assche wrote:
> >
> > Has it already been decided who will do this audit, and when this
> > audit will happen ? Has a target date been set when this audit
> > should be complete, or is the completion of this audit a
> > requirement for the release of a specific gcc version ?
>
> I am told that the gcc people realized that was indeed a bug
> (people were able to show problems even in non-threaded
> environments with mprotect()), and have now fixed it in the current
> gcc sources. That still leaves the old versions with potential
> problems, but I think it makes it much less interesting to audit
> for these things.We're back-porting the patch to all open branches. However, this
patch only affects one paticular case where gcc introduces a data
race; we're sure there are others not fixed.Andrew.
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903
-
| Davide Libenzi | Re: [patch 7/8] fdmap v2 - implement sys_socket2 |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
| Greg Kroah-Hartman | [PATCH 005/196] Chinese: add translation of SubmittingDrivers |
| Mariusz Kozlowski | [KJ PATCHES] mostly kmalloc + memset conversion to k[cz]alloc |
git: | |
| KOSAKI Motohiro | [bug?] tg3: Failed to load firmware "tigon/tg3_tso.bin" |
| Stefan Richter | Re: [GIT]: Networking |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 0/37] dccp: Feature negotiation - last call for comments |
