Hi all, Since version 4.3, gcc changed its behaviour concerning the x86/x86-64 ABI and the direction flag, that is it now assumes that the direction flag is cleared at the entry of a function and it doesn't clear once more if needed. This causes some problems with the Linux kernel which does not clear the direction flag when entering a signal handler. The small code below (for x86-64) demonstrates that. If the signal handler is using code that need the direction flag cleared (for example bzero() or memset()), the code is incorrectly executed. I guess this has to be fixed on the kernel side, but also gcc-4.3 could revert back to the old behaviour, that is clearing the direction flag when entering a routine that touches it until most people are running a fixed kernel. Kind regards, Aurelien [1] http://gcc.gnu.org/gcc-4.3/changes.html #include <stdint.h> #include <stdlib.h> #include <stdio.h> #include <signal.h> void handler(int signal) { uint64_t rflags; asm volatile("pushfq ; popq %0" : "=g" (rflags)); if (rflags & (1 << 10)) printf("DF = 1\n"); else printf("DF = 0\n"); } int main() { signal(SIGUSR1, handler); while(1) { asm volatile("std\r\n"); } return 0; } -- .''`. Aurelien Jarno | GPG: 1024D/F1BCDB73 : :' : Debian developer | Electrical Engineer `. `' aurel32@debian.org | aurelien@aurel32.net `- people.debian.org/~aurel32 | www.aurel32.net --
The Linux kernel currently does not clear the direction flag before calling a signal handler, whereas the x86/x86-64 ABI requires that. This become a real problem with gcc version 4.3, which assumes that the direction flag is correctly cleared at the entry of a function. This patches changes the setup_frame() functions to clear the direction before entering the signal handler. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> --- arch/x86/ia32/ia32_signal.c | 4 ++-- arch/x86/kernel/signal_32.c | 4 ++-- arch/x86/kernel/signal_64.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c index 1c0503b..5e7771a 100644 --- a/arch/x86/ia32/ia32_signal.c +++ b/arch/x86/ia32/ia32_signal.c @@ -500,7 +500,7 @@ int ia32_setup_frame(int sig, struct k_sigaction *ka, regs->ss = __USER32_DS; set_fs(USER_DS); - regs->flags &= ~X86_EFLAGS_TF; + regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF); if (test_thread_flag(TIF_SINGLESTEP)) ptrace_notify(SIGTRAP); @@ -600,7 +600,7 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, regs->ss = __USER32_DS; set_fs(USER_DS); - regs->flags &= ~X86_EFLAGS_TF; + regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF); if (test_thread_flag(TIF_SINGLESTEP)) ptrace_notify(SIGTRAP); diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c index caee1f0..0157a6f 100644 --- a/arch/x86/kernel/signal_32.c +++ b/arch/x86/kernel/signal_32.c @@ -407,7 +407,7 @@ static int setup_frame(int sig, struct k_sigaction *ka, * The tracer may want to single-step inside the * handler too. */ - regs->flags &= ~TF_MASK; + regs->flags &= ~(TF_MASK | X86_EFLAGS_DF); if (test_thread_flag(TIF_SINGLESTEP)) ptrace_notify(SIGTRAP); @@ -500,7 +500,7 @@ static int setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info, * The tracer may want to single-step inside the * handler too. ...
Acked-by: H. Peter Anvin <hpa@zytor.com> --
Linux should definitely follow the ABI. This is a bug, and a pretty serious such. -hpa --
Unfortunately, there are a lot of kernels out there already with this problem, and the symptoms are likely to be subtle. So even if it is true that it is the kernel that is "in the wrong", I think we still are going to need to give users a workaround from the gcc side as well. So I think gcc at least needs an *option* to revert to the old behavior, and there's a good argument to make it the default for now, at least for x86/x86-64 on Linux. --
And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0, they have the same behaviour as Linux, that is they don't clear DF before calling the signal handler. I also tested Hurd, and it causes a kernel crash. -- .''`. Aurelien Jarno | GPG: 1024D/F1BCDB73 : :' : Debian developer | Electrical Engineer `. `' aurel32@debian.org | aurelien@aurel32.net `- people.debian.org/~aurel32 | www.aurel32.net --
Hi, Sigh. We could perhaps insert a cld for all functions which can be recognized as possible signal handlers and call other unknown or string functions. But it's probably even faster to emit cld in front of the inline copies of mem functions again :-( Ciao, Michael. --
Yes, if there are four kernels that get it "wrong", that effectively means that the ABI document doesn't describe reality and gcc has to adjust. --
Kernels almost never follow ABI used by applications to last detail. Linux kernel is disabling red zone and use kernel code model, yet the ABI is not going to be adjusted for that. This is resonably easy to fix on kernel side in signal handling, or by removing std usage completely (I believe it is not performance win, but some benchmarking would be needed to double check) Honza --
That's not the issue. The issue is that the kernel leaks the DF from the code that took a signal to the signal handler. -hpa --
Hi, But we aren't talking about the ABI the kernel uses internally, but about what is exposed to user-space via signal handlers. _That_ part needs to be followed, and if it isn't it's a serious problem we somehow have to That is true. But it requires updating the kernel to a fixed one if you want to run your programs compiled by 4.3 :-/ Not something we'd like to demand. Ciao, Michael. --
I changed the title just for emphasis. I think that we can't ship 4.3.0 if signal handlers on x86/x86_64 platforms for both Linux and BSD systems will mysteriously (to the users) fail, and it doesn't matter whose fault it is. --
We didn't yet run into this issue and build openSUSE with 4.3 since more than three month. Richard. --
Well, how often do you take a trap inside an overlapping memmove()? -hpa --
Also, would it be possible to produce an exploit? If you can get string instructions to work "the wrong way", you might be able to overwrite data. "We haven't seen a problem" isn't the right answer. Can someone deliberately *create* a problem? And if we aren't sure, we should err on the side of safety. --
Oh, you mean releasing a kernel security update? ;) What does ICC or other compilers do? Richard. --
Right. So this problem is over-exaggerated. It's not like "any binary you create on that system will be broken on any other existing system." Richard. --
From: "Richard Guenther" <richard.guenther@gmail.com> I will be sure to hunt you down to help debug when someone reports that once every few weeks their multi-day simulation gives incorrect results :-) This is one of those cases where the bug is going to be a huge issue to people who actually hit it, and since we know about the problem, knowingly shipping something in that state is unforgivable. --
In this case, it appears that the 4.3.0 tarballs already hit the servers before the issue was discovered. It's not the end of the world; quite often .0 releases have some issue, and we can patch the 4_3 branch and recommend that the patch be used. --
Hi, Give them a LD_PRELOADable DSO that intercepts signal() and sig_action(), installing a trampoline that first does "cld" and then jumps to the real handler. That ought to be a quick test if it's this or a different issue. Many bugs are a big issue to people who actually hit them, and we had (and probably still have) far nastier corner case miscompilations here and there and nevertheless released. It never was the end of the world :) Let it be a data point that nobody noticed the problem which existed in trunk since more than a year (since 2006-12-06 to be precise). Ciao, Michael. --
This is the sort of stuff that security holes are made from. -hpa --
Hi, For security problems I prefer fixes over work-arounds. The fix lies in the kernel, the work-around in gcc. Ciao, Michael. --
From: Michael Matz <matz@suse.de> This depends upon how you interpret this ABI situation. There is at least some agreement that how things have actually been implemented by these kernels for more than 15 years trumps whatever a paper standard states. --
We had a similar argument about the undefinedness of signed int overflow. That's what the standard says, yet code that assumes otherwise is pervasive, including in gcc itself. If a standard is widely violated in a very consistent way, the violation in effect becomes standard. --
Incorrect. The bugs are in the ABI documentation and in gcc, and the fixes should be done there. Doing it in the kernel is the workaround. OG. --
That was the state with older gcc, but with newer gcc it does not necessarily
reset the flag before the next function call.
so e.g. if you have
memmove(...)
for (... very long loop .... ) {
/* no function calls */
/* signals happen */
}
the signal could see the direction flag
-Andi
--
memmove is supposed to (and does) do a cld insn after it finishes the backward copying. Jakub --
You can still take a signal inside memmove() itself, of course. -hpa --
The problem can be easily reproduced by using a glibc built with gcc 4.3, with SBCL (the gcc version doesn't matter). The signal handler in SBCL calls sigemptyset() which uses memset(). -- .''`. Aurelien Jarno | GPG: 1024D/F1BCDB73 : :' : Debian developer | Electrical Engineer `. `' aurel32@debian.org | aurelien@aurel32.net `- people.debian.org/~aurel32 | www.aurel32.net --
There are already gcc 4.3.0 packages on the FTP site. Sent from my iPhone --
Hi, FWIW I don't think it's a release blocker for 4.3.0. The error is arcane and happens seldomly if at all. And only on unfixed kernels. A program needs to do std explicitely, which most don't do _and_ get hit by a signal while begin in a std region. This happens so seldom that it didn't occur in building the next openSuSE 11.0, and it continually builds packages with 4.3 since months. It should be worked around in 4.3.1 if at all. Ciao, Michael. --
OK, I suppose that I over-reacted, and it seems that the ship has sailed in any case. I agree that it's obscure, and I think that the only reason to worry is if it introduces a means of attack, which seems unlikely. --
From: Michael Matz <matz@suse.de> Which translates right now into "all kernels." --
How would you know whether it has happened? OG. --
The same way you do with other bugs: You would observe unexpected behavior. In this case probably either corrupted memory or a SIGSEGV. David Daney --
So that probably means the programs you use for compiling packages probably aren't hit. Doesn't mean the packages you've compiled with it aren't hit. Compiling packages doesn't test what's in them at all. It's extremely rare, no doubt about it. It's just that it *yells* security issue in the making. It's not a source bug, i.e. not easily reviewable. It's related to signal handlers which are the mark of a server and/or more failure-conscious program than usual. It's obscure (breaking a stringop, probably memset, or a not-paranoid-enough inline asm in a signal handler through a running memmove in the main program, oh my) but reasonably predictable for someone looking for an exploitable flaw. It's gcc's job to adapt to the realities of its running environment, not the other way around. OG. --
I smell a false dilemma here. The problem doesn't have to be fixed/worked-around in either the kernel or GCC. Per your argument, one might claim it's the userland library's, or even the application's job to adapt to the realities of its running environment. GCC doesn't know what functions are signal handlers to insert cld in them. How could it fix the problem, then? How could it possibly fix custom assembly? How could it possibly fix object code containing signal handlers, compiled by other compilers? A userland system library, in theory, knows what functions are signal handlers. It could wrap function pointers passed as arguments to signal() such that they get cld. But then, applications that couldn't care less about this would take a hit. Applications, on the other hand, know when they might need cld. So, per your argument, they should adapt to the realities of their running environment, and add asm("cld"); to signal handlers that might need it. At times, it may be hard for them to know whether they need it, because too many factors may affect this need. E.g.: - if the kernel does cld for them, then they don't need it. But that's a run-time property, so it can't be tested at build time: the code may run on a different kernel that doesn't do it. - if none of the libraries they use mess with this flag, or none of the libraries they use from signal handlers depend on this flag, then they don't need it. But then, again, libraries may vary over time, and you can't assume the (dynamic) library that's available at build time will behave the same way at run time. So an application would have to do it conservatively, adding cld to their signal handlers just in case. But then, it would be more convenient if the library did it. And then, by the same argument, it would be more convenient if the kernel did it. (Compiler can't do it, since it doesn't know what's a signal handler in the general case.) And that's an argument to support the ABI specs as ...
Well, there is a (slight) difference: you know that a called function will not clobber your DF state; it's only the entry condition which is imprecise. The best would be if this could be controlled by a flag, which we can flip once kernel fixes has been around for long enough. -hpa --
I have to agree there. Whatever the decision that gcc will take, distributions will reenable the old behaviour for some time for to allow upgrades from a previous version. Providing a flag to switch the behaviour (whatever the default behaviour) will help a lot. -- .''`. Aurelien Jarno | GPG: 1024D/F1BCDB73 : :' : Debian developer | Electrical Engineer `. `' aurel32@debian.org | aurelien@aurel32.net `- people.debian.org/~aurel32 | www.aurel32.net --
From: Aurelien Jarno <aurelien@aurel32.net> I don't think this approach is tenable. If a distribution should ship with a "fixed" kernel and compiler enabling the new direction flag behavior, any binary you create on that system will be broken on any other existing system. I think we really are stuck with this forever, overwhelming practice over the past 15 years has dictated to us what the real ABI is. --
I think you've got the timescales wrong. Anything that we do now in gcc will take a while to percolate to the Linux distributions. It is far quicker for those distributions to fix their kernels as fast as possible. By the time any gcc fix is in the world all of this will be over. I suppose one could apply the precautionary principle, but those systems that don't update kernels won't update gcc either, so the solution won't work. Andrew. --
You seem to assume that running a gcc 4.3 compiled binary requires a gcc update. That is not necessarily true. -Andi --
It (sometimes) requires a libgcc and libstdc++ update. Richard. --
"Sometimes" is correct; many users commonly run newer compilers on older distros, with LD_LIBRARY_PATH set to pick up the correct C++ support library. This is particularly common on servers, where you don't want to mess with a working system but you might need to run newer code. So, we've been arguing for a while, so the question is what to do. Using a principle based on the old IETF concept of being liberal in what you accept, and conservative in what you send, I think that both the Linux kernel and gcc should fix the problem. The kernel should fix the information leak, and gcc should remove the assumption that the direction flag is set in a given direction on function entry. The gcc patch will be too late for 4.3.0, but it would be on the 4.3 branch, and we would recommend that distros pick it up for any compilers they ship. --
A patched GCC IMHO makes only sense if it is always-on, yet another option won't help in corner cases. And corner cases is exactly what people seem to care about. For this reason that we have this single release, 4.3.0, that behaves "bad" is already a problem. Richard. --
The option will help embedded vendors who can guarantee that it's not a problem. -hpa --
For very very low values of "help". To be realistic it is very unlikely anybody will measure a difference from a few more or a few less clds in a program. It's not that they're expensive instructions and they normally don't happen in inner loops either. "If you enable this option you will get an optimization that you cannot measure" @) -Andi --
They aren't? According to http://www.agner.org/optimize/instruction_tables.pdf , they have a latency of 52 cycles on at least one popular x86 chip. -Chris --
Hi, Only an assumption, and in fact wrong. See upthread for a benchmark. IIRC Uros also made measurements to justify the removal of cld (on P4 I think), where it helps tremendously on small memcpy loops. Ciao, Michael. --
Aurelien Jarno writes: > On Wed, Mar 05, 2008 at 11:58:34AM -0800, Joe Buck wrote: > > > > Aurelien Jarno wrote: > > > >Since version 4.3, gcc changed its behaviour concerning the x86/x86-64 > > > >ABI and the direction flag, that is it now assumes that the direction > > > >flag is cleared at the entry of a function and it doesn't clear once > > > >more if needed. > > > >... > > > >I guess this has to be fixed on the kernel side, but also gcc-4.3 could > > > >revert back to the old behaviour, that is clearing the direction flag > > > >when entering a routine that touches it until most people are running a > > > >fixed kernel. > > > > On Wed, Mar 05, 2008 at 08:00:42AM -0800, H. Peter Anvin wrote: > > > Linux should definitely follow the ABI. This is a bug, and a pretty > > > serious such. > > > > Unfortunately, there are a lot of kernels out there already with this > > problem, and the symptoms are likely to be subtle. So even if it is true > > that it is the kernel that is "in the wrong", I think we still are going > > to need to give users a workaround from the gcc side as well. > > > > So I think gcc at least needs an *option* to revert to the old behavior, > > and there's a good argument to make it the default for now, at least for > > x86/x86-64 on Linux. > > And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0, > they have the same behaviour as Linux, that is they don't clear DF > before calling the signal handler. FWIW, Solaris 10 (both 32- and 64-bit) gets it right. --
Hi, According to i386 psABI, --- The direction flag must be set "forward" direction before entry and upon exit from a function. --- So, asm statement should make sure that the direction flag is cleared before function returns and kernel should make sure that the direction flag is cleared when calling a signal handler. H.J. --
