FreeBSD: Fix for Hyper-Threading Vulnerability Considered Non-Trivial

May 30, 2005 - 10:18am
Submitted by njc on May 30, 2005 - 10:18am.
FreeBSD news

Colin Percival continues the discussion regarding the shared-cache vulnerability inherent in Hyper-Threading processors [story], offering potential mitigation techniques in the form of fixes to the FreeBSD schedulers. Based on Percival's original discovery, information leakage between threads that share a processor core, and the resulting opportunity to monitor memory access patterns, can be prevented by never co-scheduling threads that have differing privileges. Additionally, Percival advises that a currently scheduled in-kernel thread should be able to tell its siblings (which would likely run with the same privileges) to sleep while it is handling sensitive data in a "non-oblivious" manner, IPsec being a good example. This would further protect sensitive data from monitoring. For these two solutions, he suggests the use of p_candebug(9) for the first and an as yet unimplemented IPI (Inter-Processor Interrupt) mechanism for the second:

"1. The scheduler must be taught to not run threads on the same
processor core unless they p_candebug() each other. For reasons
of performance and locking, this is probably best accomplished by
only allowing threads to share a processor core if they belong
to the same process."

"2. When a thread is in the kernel, there must be a mechanism for
it to IPI its siblings and put them to sleep, and then wake them
up later. This would be used any time when a thread in the kernel
is about to handle sensitive data in a non-oblivious manner; IPsec
is a good example of where this would be necessary."
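
The first rule quoted above can be modelled in a few lines. What follows is only a toy sketch in Python, not FreeBSD scheduler code: the `Thread` record and the `may_share_core` and `pick_sibling` names are invented for illustration, and the check is the conservative "same process" test from the proposal rather than a real p_candebug(9) call.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Thread:
    tid: int  # thread id (hypothetical field)
    pid: int  # id of the owning process (hypothetical field)


def may_share_core(a: Thread, b: Thread) -> bool:
    """Conservative stand-in for 'a and b may debug each other':
    only threads of the same process may share a core."""
    return a.pid == b.pid


def pick_sibling(running: Thread, runqueue: List[Thread]) -> Optional[Thread]:
    """Return the first queued thread allowed to run next to `running`,
    or None, meaning the sibling logical CPU is left idle."""
    for candidate in runqueue:
        if may_share_core(running, candidate):
            return candidate
    return None
```

In this model, leaving the sibling idle is the price of the policy: when no queued thread belongs to the same process, the second logical CPU does no work.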

Stephan Uphoff, referring to the relative difficulty of accomplishing this work among other scheduler development tasks, comments:

"Currently we don't even know if a thread running on another virtual CPU is in the kernel or not. Throwing preemption, interrupt and kernel threads, pinned threads, priority inheritance and the IPIs in the mix make correct and efficient hyperthreading safe scheduling difficult. This also looks like a lot of work and I am wondering if it would not be better spent on other scheduler improvements."

Other developers comment on the proposed solution and offer alternatives. While there will likely be more deliberation, a fix other than disabling HTT may take some time to develop. Interested users should stay tuned. Read on for the partial thread and a link to the full discussion.



From: Colin Percival [email blocked]
To: freebsd-arch
Subject: Scheduler fixes for hyperthreading
Date: 2005-05-21 23:11:07

As you are probably all aware by now, HyperThreading has been
disabled on the stable and security branches due to a problem
with information leakage between threads which are scheduled
simultaneously on the two processor cores.  Clearly, some people
(and at least one large company) are unhappy about us having
hyperthreading disabled, so the security team would like to see
hyperthreading re-enabled by default as soon as we believe that
this can be done safely.

  The following must be done before hyperthreading is re-enabled:

1. The scheduler must be taught to not run threads on the same
processor core unless they p_candebug() each other.  For reasons
of performance and locking, this is probably best accomplished by
only allowing threads to share a processor core if they belong
to the same process.
2. When a thread is in the kernel, there must be a mechanism for
it to IPI its siblings and put them to sleep, and then wake them
up later.  This would be used any time when a thread in the kernel
is about to handle sensitive data in a non-oblivious manner; IPsec
is a good example of where this would be necessary.

  Does anyone want to step forward to work on this?

Colin Percival

From: Marcel Moolenaar [email blocked]
Subject: Re: Scheduler fixes for hyperthreading
Date: 2005-05-22 0:34:37

On May 21, 2005, at 4:11 PM, Colin Percival wrote:
*snip*
>   The following must be done before hyperthreading is re-enabled:
>
> 1. The scheduler must be taught to not run threads on the same
> processor core unless they p_candebug() each other.  For reasons
> of performance and locking, this is probably best accomplished by
> only allowing threads to share a processor core if they belong
> to the same process.
> 2. When a thread is in the kernel, there must be a mechanism for
> it to IPI its siblings and put them to sleep, and then wake them
> up later.  This would be used any time when a thread in the kernel
> is about to handle sensitive data in a non-oblivious manner; IPsec
> is a good example of where this would be necessary.
>
>   Does anyone want to step forward to work on this?

Maybe it's a better idea to describe the problem in much more
detail, rather than dictate what you want someone else to do?
A pointer to where the problem is described/discussed would
do.

Just a thought,

--
Marcel Moolenaar

From: Colin Percival [email blocked]
Subject: Re: Scheduler fixes for hyperthreading
Date: 2005-05-22 0:49:20

Marcel Moolenaar wrote:
> On May 21, 2005, at 4:11 PM, Colin Percival wrote:
>> The following must be done before hyperthreading is re-enabled:
>> [snip]
>
> Maybe it's a better idea to describe the problem in much more
> detail, rather than dictate what you want someone else to do?
> A pointer to where the problem is described/discussed would
> do.

The problem is described in my paper "Cache missing for fun and profit":
http://www.daemonology.net/papers/htt.pdf

Put simply, threads which share a processor core can monitor each others'
memory access patterns, so we need to ensure that such co-scheduling never
happens between threads which have different privileges.

The reason I cut through to explaining what needed to be done is that
I discussed this at length with several people from the FreeBSD security
team before and during BSDCan; but these discussions were obviously not
public, so I can't give a reference to them.

Colin Percival

From: Marcel Moolenaar [email blocked]
Subject: Re: Scheduler fixes for hyperthreading
Date: 2005-05-22 1:04:23

On May 21, 2005, at 5:49 PM, Colin Percival wrote:
> Marcel Moolenaar wrote:
>> On May 21, 2005, at 4:11 PM, Colin Percival wrote:
>>> The following must be done before hyperthreading is re-enabled:
>>> [snip]
>>
>> Maybe it's a better idea to describe the problem in much more
>> detail, rather than dictate what you want someone else to do?
>> A pointer to where the problem is described/discussed would
>> do.
>
> The problem is described in my paper "Cache missing for fun and
> profit":
> http://www.daemonology.net/papers/htt.pdf

Thanks.

> Put simply, threads which share a processor core can monitor each
> others' memory access patterns, so we need to ensure that such
> co-scheduling never happens between threads which have different
> privileges.

I'll be studying your paper to see if it can be shown that the HT
implementation in Itanium is affected as well. If not, any solution
must be sufficiently machine dependent.

> The reason I cut through to explaining what needed to be done is that
> I discussed this at length with several people from the FreeBSD
> security team before and during BSDCan; but these discussions were
> obviously not public, so I can't give a reference to them.

I can only assume that the discussion was i386 centric (as this is
typically the case). Hence my request for a problem description.

--
Marcel Moolenaar

From: Colin Percival [email blocked]
Subject: Re: Scheduler fixes for hyperthreading
Date: 2005-05-22 1:59:36

Marcel Moolenaar wrote:
> On May 21, 2005, at 5:49 PM, Colin Percival wrote:
>> Put simply, threads which share a processor core can monitor each
>> others' memory access patterns, so we need to ensure that such
>> co-scheduling never happens between threads which have different
>> privileges.
>
> I'll be studying your paper to see if it can be shown that the HT
> implementation in Itanium is affected as well.

My understanding is that there are no currently released ia64 processors
with hyperthreading support, but that some future ia64 processor(s) are
likely to be affected.

> I can only assume that the discussion was i386 centric (as this is
> typically the case). Hence my request for a problem description.

In addition to i386 and amd64, which are certainly affected, and ia64,
which will probably be affected, there is a good chance that some powerpc
processors are affected... the problem is a general one with shared caches
and probably affects all currently existing simultaneous multithreading
processors.

I think the "right solution" is to make the basic functionality machine
independent, but have the machine dependent initialization code determine
which sets of threads share caches.

Colin Percival

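
The split Percival describes at the end of this message, machine-dependent discovery feeding a machine-independent policy, can be sketched as follows. This is a toy Python model under stated assumptions: `md_detect_cache_groups`, `build_sibling_map`, and `shares_cache` are invented names, and the four-CPU topology is made up for the example.

```python
def md_detect_cache_groups():
    """Stand-in for machine-dependent init code.

    Hypothetical topology: logical CPUs 0 and 1 share one core's caches,
    and logical CPUs 2 and 3 share another's."""
    return [{0, 1}, {2, 3}]


def build_sibling_map(groups):
    """Machine-independent part: map each logical CPU to the set of
    other logical CPUs that share its caches."""
    siblings = {}
    for group in groups:
        for cpu in group:
            siblings[cpu] = set(group) - {cpu}
    return siblings


SIBLINGS = build_sibling_map(md_detect_cache_groups())


def shares_cache(cpu_a, cpu_b):
    """True if the two logical CPUs were reported as sharing caches."""
    return cpu_b in SIBLINGS.get(cpu_a, set())
```

Only the first function would differ between architectures in this model; the scheduler policy would consult `shares_cache` and never look at the hardware directly.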
From: Stephan Uphoff [email blocked]
Subject: Re: Scheduler fixes for hyperthreading
Date: 2005-05-22 2:44:25

On Sat, 2005-05-21 at 19:11, Colin Percival wrote:
> As you are probably all aware by now, HyperThreading has been
> disabled on the stable and security branches due to a problem
> with information leakage between threads which are scheduled
> simultaneously on the two processor cores.  Clearly, some people
> (and at least one large company) are unhappy about us having
> hyperthreading disabled, so the security team would like to see
> hyperthreading re-enabled by default as soon as we believe that
> this can be done safely.
>
>   The following must be done before hyperthreading is re-enabled:
>
> 1. The scheduler must be taught to not run threads on the same
> processor core unless they p_candebug() each other.  For reasons
> of performance and locking, this is probably best accomplished by
> only allowing threads to share a processor core if they belong
> to the same process.
> 2. When a thread is in the kernel, there must be a mechanism for
> it to IPI its siblings and put them to sleep, and then wake them
> up later.  This would be used any time when a thread in the kernel
> is about to handle sensitive data in a non-oblivious manner; IPsec
> is a good example of where this would be necessary.
>
>   Does anyone want to step forward to work on this?
>
> Colin Percival

While I have been to your talk I have not read your paper yet and the
following may be totally uninformed (Please be gentle :-) :

Would it be enough to disable access to RDTSC for user processes?
I believe the attack needs a very exact time source.

Beside benchmarking - is there any other real use for RDTSC ?
Is there any use of RDTSC that system requiring the security cannot live
without? (We could even try to emulate the instruction if we really need
to)

I have to think more about possible scheduler changes.
Currently we don't even know if a thread running on another virtual CPU
is in the kernel or not. Throwing preemption, interrupt and kernel
threads, pinned threads, priority inheritance and the IPIs in the mix
make correct and efficient hyperthreading safe scheduling difficult.
This also looks like a lot of work and I am wondering if it would not be
better spent on other scheduler improvements.

Stephan

From: Bruce M Simpson [email blocked]
Subject: Re: Scheduler fixes for hyperthreading
Date: 2005-05-22 2:59:15

On Sat, May 21, 2005 at 10:44:25PM -0400, Stephan Uphoff wrote:
> Beside benchmarking - is there any other real use for RDTSC ?
> Is there any use of RDTSC that system requiring the security cannot live
> without? (We could even try to emulate the instruction if we really need
> to)

A number of ports use RDTSC for high-resolution timing. The most obvious
examples being machine emulators mostly used for gaming (UAE and MAME
spring to mind, possibly also dosbox and others). I daresay VMware
probably uses RDTSC too.

BMS




Confused

May 31, 2005 - 9:06am

So this hardware bug inherent to a specific set of processors, which requires a significant, non-trivial fix to make safe in FreeBSD, doesn't exist in Linux on the same hardware? How's that work?

re: Confused

May 31, 2005 - 10:18am
animus (not verified)

I'm confused as well. There's so much misinformation blowing around on this one that it's hard to know what to think.

I was under the impression that linux was equally affected, but since it's a hard bug to exploit, Linus doesn't really care too much.

OSnews didn't make it any better by posting a completely bogus story about Linux not being affected -- if you read that, forcefully rm -rf it from your memory.

However, I'd certainly like to see a proof of concept showing that the 2.6.x kernel is affected -- just to either prove Linux is immune, or to shut up the fanboys who are deliberately spreading lies.

Linus cares. He just thinks t

May 31, 2005 - 2:10pm
Anonymous23 (not verified)

Linus cares. He just thinks that this is a userspace issue.

PS: Funny - you sound like a fanboy too.

userspace issue?!? scheduling

May 31, 2005 - 2:29pm
Anonymous

userspace issue?!? scheduling in a monolithic kernel is done in the kernel.... ...... some microkernels too (micro/mono hybrids?)

Why don't you just read the m

May 31, 2005 - 3:05pm
Anonymous23 (not verified)

Why don't you just read the mails?
There is absolutely no requirement for schedulers to protect against timing attacks. This is orthogonal to whether scheduling decisions are done in a monolithic kernel or some scheduling server.

I think you missed the "under

May 31, 2005 - 3:59pm
animus (not verified)

I think you missed the "under the impression" part.

This was the impression I was given based on the things I've read. It was not meant to be read out of context as "Linus doesn't care", because obviously not knowing him personally I can't honestly say that -- nor would I. The point of my post was to admit my confusion on the matter and hope someone had something of value to say about it.

Fanboy for what? Linux? BSD? Windows? Mac OS X? QNX? AmigaOS? GEOS?

Look man, you need to troll somewhere else; kerneltrap is for grownups.

Look man, you need to troll s

May 31, 2005 - 5:35pm
Anony (not verified)

Look man, you need to troll somewhere else; kerneltrap is for grownups.

Must resist feeding the trolls! Arrrghh.. dammit....
So what are you doing here?
;-)

Also a "problem" under Linux

May 31, 2005 - 1:04pm
Inhibit (not verified)

It's an issue on Linux as well and possibly (from my limited understanding) any (monolithic?) OS running on the hyperthreading model hardware. Check out the kerneltrap entry on that subject.

Linus expressed doubt over it being an exploitable flaw (or at least that it has been present and unexploited for some time in other forms) and other people weighed in.

Re: Also a "problem" under Linux

May 31, 2005 - 8:53pm
Brendan (not verified)

It would apply to all OSs that support hyper-threading (including microkernels). This includes Linux kernel version 2.6 (despite at least one article on the 'net claiming otherwise). No OS has a fix for it yet (despite at least one article on the 'net claiming FreeBSD was the first to fix it - they only disabled hyper-threading, which isn't a fix).

Judging from the full FreeBSD thread archive (linked to in this article) it might take a while for them to realize that a fix for the problem should come from the memory manager rather than the scheduler - the attack relies on cache timing, so I expect it's possible to disable caching for pages that contain data used for key generation and forget about timing and the scheduler.

This would require changes to libraries, but wouldn't significantly affect performance, and they can add support for VIA's new CPU (with CPU accelerated encryption, etc) while they're at it...

IMHO most schedulers that support multi-CPU and hyper-threading are complicated enough already (while most memory managers probably already support uncached page types)...

It's potentially a problem on

June 1, 2005 - 9:11am

It's potentially a problem on any OS that supports HT, regardless of OS structure used.

As you say, similar attacks have been around for a while but a) not with such high bandwidth and b) with it being (potentially) much harder to extract information from a non-co-operating process.

Linux: Bug is not really exploitable

June 1, 2005 - 5:25am

As has been reported and discussed on kerneltrap and other sites at length, the Linux kernel community currently does not see this bug as dangerous. The consensus is that while the possibility to exploit this theoretically exists, the timing constraints are too strict, and scheduling decisions by the kernel are too hard to influence effectively, for this bug to actually be used in practice. One requirement would be, for example, that the only threads in runnable state are the attacker and the attacked. If any other thread gets the processor even once during the attack, you have to start over. There are more prerequisites; this is just one.

The recommendation here is that most people shouldn't worry and those few that do should disable SMT. Desktop systems especially gain from SMT (not throughput-wise, but latency-wise), so a general deactivation of SMT is Not Worth It (TM).

Not exploitable?

June 3, 2005 - 3:55am
Wol (not verified)

The problem with that attitude is that there are at least TWO exploits in existence, and Colin has stated that at least one of them works pretty effectively.

It's all very well saying "I don't believe it can be done", but when your opponent says "fine, I'll show you, it's easy", you only look a fool if you repeat yourself.

Btw, I do agree with Linus it's a user-space problem - a simple change to the cryptography algorithm would stop the leak and it's in the algorithm that the fix belongs - but to simply say "I don't believe it's a problem" in the face of considerable evidence that the hack does work is simply stupidity.

Cheers,
Wol

It's not a hardware bug. Rather it's unintended consequences.

June 1, 2005 - 1:37pm

The issue here is that HT processors share the L1 and L2 caches between both simultaneously-executing threads. Encryption and security libraries often use lookup tables for s-boxes and key expansion. The access pattern within these lookup tables can be both key and data dependent.

A carefully written program can use the high-precision time-stamp counter to determine which of its memory accesses hit or miss the L1 data cache. By doing so carefully, it can determine what the access patterns of other simultaneously-executing tasks are. Based on that information and knowledge of the encryption implementation, an attacker can gain enough insight to dramatically reduce the search space when trying to hack someone's key.

In a single CPU environment, enough code runs during a context switch that L1 is sufficiently polluted prior to running the "observer" task. In other words, what little information was available from observing L1 hit/miss patterns is obscured in quite a bit of noise.

In an HT environment, two tasks run simultaneously on the same CPU without an intervening context switch. Thus, there is effectively no intervening pollution between the encryption task and the observer task. Thus, in an HT environment, the attack is much likelier to succeed, since it can get very high quality data on what other access patterns occur in parallel w/ the observer.
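
To make the observer idea concrete, here is a toy simulation in Python. It is a sketch under heavy assumptions, not an exploit: `ToyCache` and `observe` are invented names, the cache is a made-up direct-mapped model with eight sets, and the simulation reports hits and misses directly where a real observer would have to infer them from time-stamp counter readings.

```python
class ToyCache:
    """Direct-mapped cache model: one line per set, shared by both threads."""

    def __init__(self, num_sets=8):
        self.num_sets = num_sets
        self.lines = [None] * num_sets  # (owner, address) held in each set

    def access(self, owner, address):
        """Return True on a hit, False on a miss; a miss fills the line."""
        index = address % self.num_sets
        tag = (owner, address)
        hit = self.lines[index] == tag
        self.lines[index] = tag
        return hit


def observe(cache, victim_addresses):
    """Return the cache sets the victim touched, as seen by the observer."""
    # Prime: the observer fills every set with its own data.
    for s in range(cache.num_sets):
        cache.access("observer", s)
    # The victim runs on the sibling thread with no context switch between.
    for address in victim_addresses:
        cache.access("victim", address)
    # Probe: a miss means the victim evicted the observer's line.
    return [s for s in range(cache.num_sets)
            if not cache.access("observer", s)]
```

For example, `observe(ToyCache(), [3, 13])` returns `[3, 5]`, because address 13 maps to set 5 in an eight-set cache. If the victim's addresses are table indices derived from a key, the list of evicted sets is exactly the kind of information described above.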

This same mechanism can be used by two cooperating processes to provide a "covert channel"--a mechanism for two tasks to communicate without having an obvious link (e.g. a socket, file, shared-memory pool, or signals) between them. Suitably obscured, covert channels can be hard to find in a code audit, and nearly impossible to detect in a live binary-only setting. (Well, at least the sending side--the receiver should be obvious w/ all the references to the time stamp counter.)

Some people complain that "oh, this is CPU dependent--it won't work unless you know exactly what you're running on." True, but the specs for Intel processors' caches, including their line sizes, associativity, and so on are quite well known. And CPUID will tell you all sorts of useful information to let you know exactly what you're running on.

It appears Colin's approach is to avoid coexecution of the encryption task with anything that might be an observer task. (Or a "sending" task in parallel with a potential "receiving" task.) That's possibly the "right approach" if you're looking to solve the problem generically.

Linus' position has been (so far as I've seen) that the crypto problem could be solved in userspace by changing the crypto libraries. If you modify the libraries to "noise up" their cache footprint, you eliminate the data the observer process looks for. Alternately, if you modify how you use lookup tables so that the access pattern isn't so key or data dependent, you're safe. Such an approach requires proactive work on the part of the crypto writer. It also doesn't stop tasks from explicitly using this mechanism for communication, as in the case of a covert channel.
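
The second userspace option, making the lookup pattern independent of the key, can be sketched like this. It is an illustration in Python with invented names (`lookup_direct`, `lookup_oblivious`); Python itself gives no timing guarantees, so this shows only the shape of the technique, which a real crypto library would implement in C or assembly.

```python
def lookup_direct(table, secret_index):
    """Touches only table[secret_index], so the memory access pattern
    depends on the secret."""
    return table[secret_index]


def lookup_oblivious(table, secret_index):
    """Reads every entry and selects the wanted one with a bit mask, so
    the sequence of table accesses is the same for every index.

    Assumes the table holds non-negative integers."""
    result = 0
    for i, value in enumerate(table):
        mask = -(i == secret_index)  # -1 (all ones) on a match, else 0
        result |= value & mask
    return result
```

The cost is a full pass over the table on every lookup, which is why this kind of change requires deliberate work from the library author rather than coming for free.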

"This would further secure se

June 1, 2005 - 5:27am
rUdE_tUrNiP (not verified)

"This would further secure sensitive data from monitoring. For these two solutions, he suggests the use of p_candebug(9) for first and an as yet unimplemented IPI (Interprocessor Interrupt) mechanism for the second:"

OT: Doesn't FreeBSD 5.x have enough IPI mechanisms as it is? A half-dozen at last count... last I checked, you can do wonders with only one...

Re: OT - IPI mechanisms

June 1, 2005 - 2:45pm

Well, I guess the short answer to that (rhetorical) question is - if the requirement for a new feature necessitates an IPI mechanism due to the design choices of the OS in question, then no - in this case, they don't have enough IPI mechanisms. :) The architecture of the MP locking/mutex model in FreeBSD makes those IPIs "necessary" - the design choices made in 5.x require them.

Strictly speaking, I'm not sure that the present *quantity* of IPIs has anything to do with whether or not it's "good" to add more. I say that without knowing whether other OSs use many IPI mechanisms or just a single IPI subsystem. Is there a theoretical or anecdotal point where you really do have too many? I don't know. How is this done in Linux, Solaris, Windows, Netware...? I'd be interested in a clear explanation if anyone happens to know. In FreeBSD's case, the crux of the issue - when too many IPIs is a "bad thing" - is that the half-dozen IPIs you're referring to are complex, suffer from occasional deadlocking issues and inefficiencies, and the time to debug them has slowed progress of development in other areas. That's what the standing argument is/was anyway.

A single IPI mechanism/messaging subsystem may do "wonders", but I assume you're concluding that from Matt Dillon's position on the matter. I'm not inclined to agree or disagree, but there has been no formal testing to prove whether this is true and there has been no serious attempt to port the work from DFly over to fbsd's -current tree. It's a lot more complex than it sounds, and I'm not sure it would succeed without the willing participation of many major FreeBSD developers. It seems like that would be very far off if it were to ever happen at all. I'd definitely like to see it pursued at some point because the selling points are hard to resist. On the other hand, I'd also like to see how the current model shakes out.

I haven't had much time to play with DragonFly lately, maybe the proof is there?

Linux fix

June 4, 2005 - 4:58am

Fixing it in linux is possible using hyperthreading code we already have:
http://marc.theaimsgroup.com/?l=linux-kernel&m=111762732127504&w=2
A combination of that patch I submitted with a prctl and then using a wrapper on secure apps (gpg, ssh etc) would be enough, and the userspace apps wouldn't need to be modified at all; just run via the wrapper.
