I know Jeff Garzik says he's not interested in an anti-entropy pogrom for existing net drivers, but here is the patch if anyone else is interested..? :)
Only 12 net drivers are affected, the last of the theoretically-exploitable network entropy.
drivers/net/3c523.c
drivers/net/3c527.c
drivers/net/atlx/atl1.c
drivers/net/cris/eth_v10.c
drivers/net/ibmlana.c
drivers/net/macb.c
drivers/net/mv643xx_eth.c
drivers/net/netxen/netxen_nic_main.c
drivers/net/niu.c
drivers/net/qla3xxx.c
drivers/net/tg3.c
drivers/net/xen-netfront.c
chris
Signed-off-by: Chris Peterson <cpeterso@cpeterso.com>
---
diff -uprN linux-2.6.26-rc2-git3.orig/drivers/net/3c523.c linux-2.6.26-rc2-git3/drivers/net/3c523.c
--- linux-2.6.26-rc2-git3.orig/drivers/net/3c523.c 2008-05-13 22:37:06.000000000 -0700
+++ linux-2.6.26-rc2-git3/drivers/net/3c523.c 2008-05-13 23:11:58.000000000 -0700
@@ -289,8 +289,7 @@ static int elmc_open(struct net_device *
elmc_id_attn586(); /* disable interrupts */
- ret = request_irq(dev->irq, &elmc_interrupt, IRQF_SHARED | IRQF_SAMPLE_RANDOM,
- dev->name, dev);
+ ret = request_irq(dev->irq, &elmc_interrupt, IRQF_SHARED, dev->name, dev);
if (ret) {
printk(KERN_ERR "%s: couldn't get irq %d\n", dev->name, dev->irq);
elmc_id_reset586();
diff -uprN linux-2.6.26-rc2-git3.orig/drivers/net/3c527.c linux-2.6.26-rc2-git3/drivers/net/3c527.c
--- linux-2.6.26-rc2-git3.orig/drivers/net/3c527.c 2008-05-13 22:37:06.000000000 -0700
+++ linux-2.6.26-rc2-git3/drivers/net/3c527.c 2008-05-13 23:12:15.000000000 -0700
@@ -434,7 +434,7 @@ static int __init mc32_probe1(struct net
* Grab the IRQ
*/
- err = request_irq(dev->irq, &mc32_interrupt, IRQF_SHARED | IRQF_SAMPLE_RANDOM, DRV_NAME, dev);
+ err = request_irq(dev->irq, &mc32_interrupt, IRQF_SHARED, DRV_NAME, dev);
if (err) {
release_region(dev->base_addr, MC32_IO_EXTENT);
printk(KERN_ERR "%s: unable to get IRQ %d.\n", DRV_NAME, dev->irq);
diff -uprN linux-2.6.26-rc2-git3.orig/drivers/net/atlx/atl1.c ...

On Thu, 15 May 2008 00:11:10 -0700 (PDT)
Looks fine to me. If Jeff doesn't want to touch them then send them direct to Andrew/Linus. A more interesting alternative might be to mark things like network drivers with a new flag, say IRQF_SAMPLE_DUBIOUS, so that users can be given a switch to enable/disable their use depending upon the environment. Alan --
We've been hearing rumblings of big customers wanting (maybe requiring) wired network drivers from Intel to advertise this flag. Jeff, have you heard of such? I think the argument is that a headless system (no keyboard/mouse, no soundcard, probably no video) with a libata based driver and a network driver without IRQF_SAMPLE_RANDOM has *no* sources of entropy. In this case the argument is very strong for at least *some* source of entropy from interrupts so that randomness can get some external input. Just try rebuilding a kernel RPM over an ssh session and you'll see what I mean. In short, I agree with Alan's IRQF_SAMPLE_DUBIOUS, and know of Linux customers who also want the same. --
They should be made to read the Debian ssh security report three times, and understand that the same would apply to them if something did cause their network packet arrivals to be observed or non-random. Far better would be to get your CPU guys to put an RNG back into the systems or on the CPU die, a la VIA. Given I've even seen people using VIA boxes as a random number feeder (streaming random numbers over SSL), there is clearly a demand 8) Alan --
The Treacherous Platform Module includes an RNG. Someone (hi Jesse?) should implement support for TPM_GetRandom. All the specs are public, and the hardware is already in users' hands. Jeff --
Sounds like something he should neither use in the e1000 driver nor implement :) This would be an interesting thing to add to the generic rng support in linux though. Auke --
That's what I meant. Support should be implemented in the appropriate place in order to solve the problem Jesse's complaining about. That appropriate place being drivers/char/hw_random/ Jeff --
I will not pretend to understand everything mentioned, but having web searched on "TPM RNG" I came across this: http://www.mail-archive.com/cryptography@metzdowd.com/msg06299.html which may either encourage or discourage depending on one's point of view. rick jones --
Yep, in theory you can do it in userspace right now, with zero kernel modifications. But just my gut feeling about the Treacherous Platform Module makes me think we should have a kernel driver, for both ease of use, and ease of replacement should that usage turn out to be unwise. Jeff --
Here's an example patch (compile-tested only) to get people started. This function calls the TPM command, and returns TPM header + RNG data in the supplied buffer. A hw_random driver for TPM still needs to (a) parse the TPM header for return code, (b) extract RNG bytes out at offset 14, and (c) figure out some way to get a tpm_chip pointer. Spec at https://www.trustedcomputinggroup.org/specs/TPM/TCPA_Main_TCG_Architecture_v1_1b.pdf describes TPM_GetRandom on page 215. Jeff
(d) auto feed the information into random.c. Otherwise it'll be useless for most people. -Andi --
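For anyone picking this up, steps (a) and (b) from Jeff's list could look roughly like the user-space sketch below. It assumes the TPM 1.1b/1.2 response layout (2-byte tag, 4-byte paramSize, 4-byte return code, 4-byte randomBytesSize, then the data at offset 14, all big-endian). The function name tpm_parse_getrandom and the macros are made up for illustration; this is not the code from Jeff's patch, and step (c), finding a tpm_chip pointer, is not addressed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed response layout (big-endian fields):
 *   offset  0: tag              (2 bytes)
 *   offset  2: paramSize        (4 bytes, total response length)
 *   offset  6: returnCode       (4 bytes, 0 means success)
 *   offset 10: randomBytesSize  (4 bytes)
 *   offset 14: randomBytes
 */
#define TPM_RC_OFF        6
#define TPM_RNG_SIZE_OFF 10
#define TPM_RNG_DATA_OFF 14

/* Read a big-endian 32-bit value. */
uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: copy the RNG payload of a TPM_GetRandom response
 * into out. Returns the number of bytes copied, or -1 if the response is
 * truncated, inconsistent, or carries a non-zero return code. */
int tpm_parse_getrandom(const uint8_t *resp, size_t resp_len,
                        uint8_t *out, size_t out_cap)
{
    uint32_t param_size, n;

    assert(resp != NULL && out != NULL);
    if (resp_len < TPM_RNG_DATA_OFF)
        return -1;                      /* too short to hold the header */

    param_size = be32(resp + 2);
    if (param_size < TPM_RNG_DATA_OFF || param_size > resp_len)
        return -1;                      /* length field does not fit buffer */

    if (be32(resp + TPM_RC_OFF) != 0)
        return -1;                      /* (a) TPM reported an error */

    n = be32(resp + TPM_RNG_SIZE_OFF);
    if (n > param_size - TPM_RNG_DATA_OFF)
        return -1;                      /* claims more data than was sent */
    if (n > out_cap)
        n = (uint32_t)out_cap;          /* truncate to the caller's buffer */

    memcpy(out, resp + TPM_RNG_DATA_OFF, n);   /* (b) data at offset 14 */
    return (int)n;
}
```

A hw_random driver would do the same checks on the buffer returned by the TPM command before handing bytes to its caller.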
On Fri, 16 May 2008 02:27:36 +0200 No - you don't want to do FIPS randomness verification in kernel space. Plus all the other random generator inputs are done via the user space daemon as they should be. --
Just think a little bit: system has no randomness source except the hardware RNG. you do your strange randomness verification. if it fails what do you do? You don't feed anything into your entropy pool and all your random output is predictable (just boot time) If you add anything predictable from another source it's still predictable, no difference. Also in general what happens in the hypothetical case that your random generator e.g. generates all zeros (which is very unlikely but let's assume it): your entropy doesn't get significantly worse than it was before. Previously it was just seeded with the boot time (or other sources) and now you're adding some zeroes. The output is still as random as the previous state. While that changes the state of the entropy pool it doesn't make it any easier to predict. The only problem you got from possible bogus input is that the entropy counts will be wrong, but in my experience nearly all programs use /dev/urandom anyways because /dev/random is just a DoS waiting to happen and user space programmers know that. Basically with this insisting on FIPS you're violating the strong variant of Steinbach's rule: not only "never test for an error condition you don't know how to handle", but "never test for an error condition you can't handle" Also why do you not trust your random generator but trust your CPU to correctly execute the cryptographic algorithm? Yes and which makes them about useless because distros don't run that daemon by default so users don't get the feature. Besides it's all not needed anyways because the FIPS verification is pointless. -Andi --
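The "adding zeroes doesn't make it worse" point can be illustrated with a toy pool. The sketch below is NOT the mixing function in random.c; it borrows the FNV-1a step (XOR a byte, multiply by an odd constant) purely because both operations are invertible modulo 2^64. For any fixed, attacker-known input the map from old state to new state is therefore a bijection: two pools that differed before the feed still differ after it, so known input cannot collapse the set of possible states. The name toy_mix is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit "pool" mixer (illustrative only, not the kernel's).
 * XOR with a fixed byte is invertible, and multiplying by an odd
 * constant is invertible modulo 2^64, so for a fixed input the
 * mapping state -> new state is a permutation of all 2^64 states. */
uint64_t toy_mix(uint64_t state, const uint8_t *data, size_t len)
{
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        state ^= data[i];
        state *= 0x100000001b3ULL;   /* FNV-1a 64-bit prime (odd) */
    }
    return state;
}
```

What such a feed must not do is raise the entropy estimate, which is the accounting problem mentioned above.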
You can continue to feed data into the pool even if it fails the test. You just keep the entropy value same as before. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --
You could do that, but what advantage would it have? I don't think it's worth running the FIPS test, or rather requiring the user land daemon and leaving behind most of the userbase just for this. -Andi --
The obvious advantage is that you don't unblock /dev/random readers until there is real entropy available. Remember that a hardware RNG failure is a catastrophic event, so a heavy-handed response such as blocking /dev/random is reasonable. Cheers, Herbert Xu --
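The policy described here (always mix, only credit entropy when the sample looks sane) can be sketched as below. Assumptions to flag: struct toy_pool and pool_feed_hwrng are invented names, the mixer is a toy and not the kernel's, and the only check is the monobit test with the bounds I recall from FIPS 140-2 (ones count strictly between 9725 and 10275 in a 20,000-bit block). A real checker such as rngd runs more tests than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIPS_BLOCK_BYTES 2500          /* 20,000 bits per test block */

struct toy_pool {
    uint64_t state;                    /* toy pool contents */
    unsigned entropy_bits;             /* what blocking readers may draw */
};

/* Monobit test: the number of 1 bits in the block must fall strictly
 * between the assumed FIPS 140-2 bounds. Returns 1 on pass, 0 on fail. */
int monobit_ok(const uint8_t *block)
{
    unsigned ones = 0;
    size_t i;

    for (i = 0; i < FIPS_BLOCK_BYTES; i++) {
        uint8_t b = block[i];
        while (b) {                    /* clear lowest set bit each round */
            b &= (uint8_t)(b - 1);
            ones++;
        }
    }
    return ones > 9725 && ones < 10275;
}

/* Always mix the block into the pool; credit entropy only on a pass. */
void pool_feed_hwrng(struct toy_pool *p, const uint8_t *block,
                     unsigned claimed_bits)
{
    size_t i;

    assert(p != NULL && block != NULL);
    for (i = 0; i < FIPS_BLOCK_BYTES; i++) {
        p->state ^= block[i];
        p->state *= 0x100000001b3ULL;  /* toy invertible mixing step */
    }
    if (monobit_ok(block))
        p->entropy_bits += claimed_bits;
}
```

With this shape a dead RNG that emits all zeroes still stirs the pool, but /dev/random style readers keep blocking because the credit never moves.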
As far as I can figure out with some research (stracing, strings) pretty much every interesting cryptographic software except gpg keygen uses /dev/urandom anyways. They have to because too many systems don't have enough entropy and /dev/random simply blocks far too often and does not really work. When you check the now famous openssl source you see it uses /dev/urandom first simply because of this problem. They only have /dev/random for systems where /dev/urandom is not available. That is because the real world cryptographers care as much about DoS as about other issues. It's also quite understandable: "Sorry our company couldn't receive email because nobody banged on the keyboard of the mail server" Clearly that would be absurd, but if e.g. openssl used /dev/random you would easily get into that situation. Part of the problem here is of course this strange insistence to not auto-feed from all available random sources. If you set these entropy standards too high soon none are left and the entropy pool is often only very poorly fed. So by setting too high Is it? The pool is just as random as it was before because the hash output will depend on all previous input. So even if you e.g. add a known string of zeroes it should be still as much or as little unpredictable as it was before. The big difference is that it is just cryptographic security instead of true entropy, but without enough entropy (or you not trusting your entropy) there's no choice. Also my assumption is that if the hardware RNG fails the rest of the system (CPU, memory) will likely fail with it. Well I admit I'm not 100% sure about that, but the stories going around about RNG failures are so vaguely folksy that my assumptions are likely I only know of GPG initial key gen who really relies on it and I'm sure I wasn't the only one to feel silly when banging random keys on the keyboard while generating a key :) Obviously that doesn't work for all the interesting cases like session keys ...
Andi, can you please clarify what you mean by "auto-feeding /dev/urandom only" and "only get blocking /dev/random with the user daemon"? Are you suggesting that the kernel provides /dev/urandom and a userspace daemon (e.g. EGD) provides /dev/random? Also, if crypto apps like ssh and openssl rely on "insecure" /dev/urandom, then who actually relies on /dev/random? For comparison, FreeBSD does not even (AFAIK) have /dev/urandom. FreeBSD's /dev/random is nonblocking (like Linux's /dev/urandom) and includes network entropy. chris --
On Sat, 17 May 2008 12:54:02 -0700 I think the big kicker is the difference between a session key (short lived) and a "real" key such as a gpg key that lives for a long time and is used for multiple sessions and with different users (in crypto speak, Alice uses the same random key for Bob, Charlotte and David and potentially for a long time). For a session key, urandom is very likely an acceptable compromise; there's only so much data it's used for. For long term keys I can totally see why /dev/random is used instead. So both have value, just in different circumstances. --
We don't use it for most long term keys, e.g. ssh host keys. That is because even on high entropy systems /dev/random usually doesn't work during distribution installation because the system has not run long enough to collect significant entropy yet. See also the distinction between "user controlled visible cryptography" and "background cryptography" I introduced in an earlier mail on that topic. gpg can only get away with it because they rely on a high level of user education (so requiring banging on keys is ok), but that's not really an option for your normal "everyday background crypto", including longer term keys. So yes it's a nice theory, but without using the available randomness sources always it doesn't work. Instead I think just both urandom and random should try to rely on TPMs and other hardware rngs and always get high quality bit noise. -Andi --
One thing which I'm not sure most people understand is that there isn't that much difference between /dev/random and /dev/urandom. They are fed by the same sources, at the high level. There is a single large entropy pool which gets fed by whatever entropy sources the kernel can get its hands on, which periodically catastrophically seeds separate smaller pools used by /dev/random and /dev/urandom. The only difference is that /dev/random does entropy tracking; /dev/urandom doesn't. Hence, if you think the system hasn't run long enough to collect significant entropy, you need to distinguish between "has run long enough to collect entropy which causes the entropy credits, using a somewhat conservative estimation system, to let /dev/random extract the number of bits you need", and "has run long enough to collect entropy which is unpredictable by an outside attacker such that host keys generated by /dev/urandom really are secure". See why the qualifying statements are so important? If you really believe that there isn't enough entropy after installing a distribution, THEN YOU SHOULDN'T BE GENERATING SSH HOST KEYS. The problem is that the server scenario with no keyboard case is a really No, this distinction is a specious one, I think. It's really more a level of paranoia. People tend to be much more paranoid about their own personal keys, at least if they are well trained. Most people don't bother verifying ssh host keys the first time they contact a host, making them subject to a man-in-the-middle attack. But most people don't mind, by which we can deduce that most folks aren't as careful about their ssh key. If distributions really cared, they could very well introduce keyboard banging as part of the install process; but no, being able to do an unmanned, "turnkey" install is considered more important. That says something about how much they care about security right there. (By the way, if you are at least forcing the ...
If the World really cared about security, every cpu chip would supply a true source of random bits based on sampling some easily accessible quantum on-chip state, such as the tiny fluctuations in current flow across a resistance. I suspect supplying this would be about as expensive as supplying a true TSC driven directly by the external clock -- that is, so close to zero as to not matter. Joe --
It's made worse by not feeding in the network interrupts. I never quite understood the rationale behind that one either: if you worry about someone else controlling the timing of these events, why do you not worry about timings on the local system. e.g. it's not that hard to predict with similar accuracy when a hard disk interrupt happened when a local process read something from disk. Or when the keyboard/USB interrupt happened when you process keyboard input. On the other hand if only the low bits of the time stamp counter are used it should be still random enough in all cases because If people don't realize cryptography is used they can't really have any paranoia. And you can only get away with paranoia requiring In this case I would say it's more the program authors because the users It's more a question on how practical security is. If security gets too complicated nobody will use it. [classical example of that is to force users to use unrememberable "strong" passwords -- result is that they are just written down on Yes definitely, but the trouble is /dev/random does not use it by default. So even if you have a working hardware RNG there's no improvement on what comes out of /dev/u?random Yes, but we don't use it either by default. So even if you have Jeff's patch looked like a good start. I'll try to come up with a complete patch series that auto feeds. The only system I have with a TPM is a T61, let's see if trousers works on that. There's also a couple of other problems here I believe, in particular some of the kernel subsystems who get random numbers for their purpose get it too early before there's any chance of seeing. Also for a lot of kernel purposes (like the networking hash tables) it's really wasteful to use the precious entropy pool entropy. It should rather just seed some cryptographic PRNG once and then use output from that for kernel purposes (or alternatively at least not remove entropy credit from the random pool) -Andi --
Accepting packets requires you to trust another party -- the network provider and users of your host's networked services -- which you do not have to trust otherwise. --
What I meant was "only getting working blocking /dev/random with the user mode daemon". / The kernel would still provide /dev/random. But on systems without much entropy (which is pretty common) it will block often and be unusable unless you run some obscure user space daemons which regularly refeed /dev/random from hw_random and stops doing that if the FIPS test fails and makes /dev/random It's sad to say, but their implementation makes more sense than Linux's (including the feeding in of network data) I suspect that's the main reason I actually found that many /dev/random users as I found during my research. -Andi --
Security through obfuscation? Someone trying to predict the RNG can do so in theory, but if they have to keep track of network timings, disk activity, and 5 other things, then chances are that they fail often enough even if the attack is possible "in theory". Helge Hafting --
If programs just need some random data without relying on the fact that it's cryptographically strong /dev/urandom is the right choice. But some programs need entropy for doing crypto stuff, and a local DoS is harmless compared to the consequences of bad /dev/random data. Consider as a worst case the just discovered OpenSSL bug in Debian where all accounts with public key authentication and keys created on a Debian/Ubuntu system during the last 20 months [1] can be taken over by an attacker within less than 20 minutes with a simple brute force cu Adrian [1] 13 months for Debian stable users [2] http://www.derkeiler.com/Mailing-Lists/Full-Disclosure/2008-05/msg00416.html -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed --
No in this case /dev/urandom is the wrong choice. You should then seed some standard RNG with the time, pid as is the classical way and not use any precious entropy. Yes some programs don't do that, Even the cryptographic programs normally use /dev/urandom to get session keys etc. That is because they are definitely concerned about local DoS. Just strace your ssh daemon or your SSL web server to see Yes, but if you read the context of that patch it commented out the code that accessed /dev/urandom! Please reread my analysis of the issue. If you have already entropy in the pool the additional feed doesn't change anything. And if you don't it still stays the same. -Andi --
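For reference, the "classical way" mentioned here is the old time-and-pid seeding of the C library generator, roughly as sketched below. The helper names are invented for illustration. The output is trivially predictable to anyone who can guess the start time and pid, so this only fits uses like shuffling or simulations, never keys.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Seed rand() from wall-clock time and pid. Consumes no kernel entropy,
 * and offers no security: both inputs are easy to guess. */
void seed_noncrypto_rng(void)
{
    srand((unsigned)time(NULL) ^ (unsigned)getpid());
}

/* Return a value in [0, sides); the slight modulo bias is ignored here. */
int roll(int sides)
{
    assert(sides > 0);
    return rand() % sides;
}
```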
From: Alan Cox <alan@lxorguk.ukuu.org.uk> This does remind me of a deficiency in the current hwrng driver layer that I noticed while working on the Niagara2 RNG driver. If the device allows tweaking of settings (selecting different voltage oscillators per entropy source in my case) we really need some way to test the randomness generated by different settings so that we can make an optimal choice. One common scheme is to compute the Renyi entropy for blocks of random data using the various settings, something else we don't want in the kernel. So we would need some kind of interface so that userland could handle something like that. Something like: 1) Export number of possible configurations to userspace 2) Allow userspace to simply iterate over the configurations, something as simple as just specifying an integer index in the range 0 to num_configs Then userspace can do randomness analysis of the different configurations however it and the user's chosen policy dictate. --
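The userspace half of that proposal could look something like the sketch below, assuming userspace has already collected one equal-sized sample per configuration index (how it selects a configuration is exactly the interface that does not exist yet, so it is left out). To avoid floating point, the code ranks configurations by the sum of squared byte counts. The order-2 Renyi (collision) entropy is -log2(score / n^2), so the lowest score is the highest entropy. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of squared byte-value counts over a sample of n bytes.
 * Collision probability is score / n^2, and the order-2 Renyi entropy
 * is -log2 of that, so a LOWER score means HIGHER entropy per byte. */
uint64_t collision_score(const uint8_t *buf, size_t n)
{
    uint64_t count[256] = {0};
    uint64_t score = 0;
    size_t i;

    assert(buf != NULL || n == 0);
    for (i = 0; i < n; i++)
        count[buf[i]]++;
    for (i = 0; i < 256; i++)
        score += count[i] * count[i];
    return score;
}

/* Given one n-byte sample per configuration, return the index of the
 * configuration with the lowest collision score (ties keep the first). */
unsigned pick_best_config(const uint8_t *const samples[],
                          unsigned num_configs, size_t n)
{
    unsigned c, best = 0;
    uint64_t best_score = UINT64_MAX;

    assert(samples != NULL && num_configs > 0);
    for (c = 0; c < num_configs; c++) {
        uint64_t s = collision_score(samples[c], n);
        if (s < best_score) {
            best_score = s;
            best = c;
        }
    }
    return best;
}
```

A byte-frequency score like this is only a crude first filter; it says nothing about correlations between successive bytes, which a real policy would also want to examine.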
My curiosity got the better of me, so I enabled the TPM (Infineon)

moe:~# dmesg | grep -i tpm
tpm_inf_pnp 00:05: Found TPM with ID IFX0101
tpm_inf_pnp 00:05: TPM found: config base 0xff5b804e, data base 0xff5b8000, chip version 0x0006, vendor id 0x15d1 (Infineon), product id 0x0006 (SLD 9630 TT 1.1)

on a system I had, and using whatever Debian Lenny (w 2.6.24-1 kernel) offers for trousers and tcsd etc, a bunch of help from some other HPers, and a hacked example program from an HP-UX document I pressed-on without much understanding and arrived at:

moe:~# time ./raj_example -c 1000
making 1000 Tspi_TPM_GetRandom calls of 16 bytes each
real 0m28.033s
user 0m0.000s
sys 0m0.000s

28.033s / 1000 calls = about 28 ms per call. While this was happening top was reporting 1% CPU time in the tcsd. I've no idea how much feeding /dev/random would want and how often, but there is some crude data on overhead for pulling random numbers out of at least one TPM. Here is some varying of the number of bytes requested each time:

moe:~# for i in 1 16 32 64
> do
> time ./raj_example -c 1000 -r $i
> done
making 1000 Tspi_TPM_GetRandom calls of 1 bytes each
real 0m24.146s
user 0m0.008s
sys 0m0.000s
making 1000 Tspi_TPM_GetRandom calls of 16 bytes each
real 0m28.032s
user 0m0.000s
sys 0m0.000s
making 1000 Tspi_TPM_GetRandom calls of 32 bytes each
real 0m28.032s
user 0m0.004s
sys 0m0.000s
making 1000 Tspi_TPM_GetRandom calls of 64 bytes each
real 0m36.032s
user 0m0.004s
sys 0m0.000s

rick jones no, i don't plan on adding this to netperf :) --
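To put those timings in terms of feed rate, the wall-clock figures can be turned into bytes per second with a one-liner. The function name is made up; the inputs are just the "real" times from the runs above, in milliseconds, and the result is rounded down.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in whole bytes per second for `calls` requests of
 * `bytes_per_call` bytes that took `elapsed_ms` of wall-clock time. */
uint64_t bytes_per_second(uint64_t calls, uint64_t bytes_per_call,
                          uint64_t elapsed_ms)
{
    assert(elapsed_ms > 0);
    return calls * bytes_per_call * 1000u / elapsed_ms;
}
```

Plugging in the four runs gives roughly 41, 570, 1141 and 1776 bytes/s for 1, 16, 32 and 64 byte requests, so on this one TPM via trousers the per-call overhead dominates and larger requests are much cheaper per byte.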
I do indeed hear requests all the time, from people who want to make There are entropy sources on a headless box, even one without audio and video, that are more secure than adding IRQF_SAMPLE_RANDOM to network drivers. EGD demonstrates this, for example: http://egd.sourceforge.net/ It looks at snmp, w, last, uptime, iostats, vmstats, etc. And there are plenty of untapped entropy sources even so, such as reading temperature sensors, fan speed sensors on variable-speed fans, etc. Heck, "smartctl -d ata -a /dev/FOO" produces output that could be hashed and added as entropy. I'm interested to hear peoples' opinion of Chris P's patch, but definitely do not want to go in the other direction and start adding IRQF_SAMPLE_RANDOM, thus moving randomness in the direction of being externally exploitable. Jeff --
Is there nothing associated with the networking stack - NIC, driver, protocols, system calls which can be used as a source of entropy? rick jones --
The issue is with being externally observable and controllable, or, with some irq mitigation schemes, be made /too regular/. Interrupts (or timed mitigation events) may be triggered by the outside world, which makes it a very short path from remote attacker to local kernel entropy pool. Finally, with severe load, there are little or no interrupts thanks to heavy mitigation, which means your entropy pool may be externally DoS'd. Or at the very least, when your entropy needs to be INCREASED (due to heavy workload due to heavy traffic), your incoming entropy DECREASES due to decreased interrupts. [I just realized that last one. Heck, I'm even convincing myself even more it's a bad idea] Jeff --
so you have established that with any type of interrupt moderation (either NAPI or some form of irq throttling in the NIC hardware) that IRQF_SA_RANDOM will become more predictable. How about the non-NAPI and non-throttled case? I would argue that without any irq mitigation we can still use SA_RANDOM. Many (e.g. embedded) devices will want some extra form of entropy, and providing them it in this form will be very beneficial as these devices more commonly have no other form of entropy anymore. Auke --
When was the last time we added a new driver for new hardware, and it didn't support NAPI and/or hw mitigation? They are few and far between, "no other form of entropy"? See examples in this thread. Jeff --
Please correct me if I'm wrong, but this thread's conclusions seem to be: * network interrupts are an inappropriate source of entropy (see my patch) * headless servers need entropy, but should seek a better solution, such as EGD, hardware RNG, or other kernel entropy sources (but that is a separate task) * TPM RNG is a separate task and, if implemented, should be in drivers/char/hw_random/ chris --
That's my own opinion, yes. But not necessarily a consensus opinion :) Jeff --
I agree it's by far the _best_ solution. I think that some embedded devices that do not have any RNG hardware should be able to turn off NAPI/irq mitigation and possibly fall back on IRQF_SA_RANDOM. It's not as good as the above solution at all, but may be sufficient for headless embedded devices that are dying for some entropy. of course, with most of the network drivers being NAPI enabled by default this pretty much is not realistic (as Jeff G. pointed out). Unless someone writes an (e.g.) ethtool parameter to turn NAPI on/off :) --
The TPM RNG task is probably best done in a user space process, which might also do other things like pulling environmental noise from the sound card's microphone, etc. The problem is finding someone with the skills, time, and energy to create the appropriate user space daemon. All of the kernel interfaces already exist; it's just a matter of implementing the user space support. - Ted --
rngd is already deployed in most distros, via rng-tools. Motivated people are welcome to steal maintainership from me... Jeff --
I was, once. But I got into the trap of getting too out of sync from upstream rngd, and suddenly, packaging all those changes into incremental patches became too much non-fun work for me to find the energy to do so. It has been a few years since I really looked at that code, except for a few minor Debian packaging fixes. It is old code, and if my C is not world-class now, it was even worse at that time. But at least valgrind tells me I fixed the worst bugs, and FIPS 140-2 (the old version of it that still had criteria for non-deterministic RNGs) and DIEHARD couldn't find any issues with its output after several runs on gigabyte-sized files produced both from an Intel FWH RNG, and from a VIA PadLock RNG. If anyone wants to poke at it, get the Debian rng-tools source package. It directly supports the VIA PadLock in userspace in a suitably paranoid mode (checks that the RNG was not reprogrammed at every read), and does multithreading so that FIPS and output processing does not block (nor gets blocked) by /dev/hw_random reading, etc. Lots of cobwebs on it, though. And you probably need to get the version in Debian unstable AND the version in Debian experimental, to get the full code. If someone really wants to work on it, I can send the VC repo for them to play with. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh --
Neat. I always did prefer VIA padlock in userspace. I just sorta assumed a buffering, interrupt-driven TPM RNG driver would be better than doing it from userspace, but maybe that was a bad assumption to make on my part. It should be quite doable to support TPM RNG entirely via userspace, at any rate. Jeff --
I will tell you what. If someone manages to get trousers to actually *work* for data binding and sealing to the TPM in a ThinkPad T43 with an NSC/Winbond TPM (their "sup3r s3kr1t TPM-inside-the-SuperIO 8394T" crap one needs a NDA to get the documentation for), and I manage to duplicate it (i.e. make it work here too), I will write the rng-tools trousers interface code (at least for the Debian version) :-) The kernel TPM driver works, the BIOS works, and I have the PCRs updated properly during boot, but trousers get the tpm pubek key wrong for some reason (the kernel driver can read it just fine). The chip is good, IBM's stuff worked just fine with it. -- Henrique Holschuh --
There were some web pages on this subject that seemed to imply that IBM used a non-standard string-to-key algorithm, and that caused the incompatibility with Trousers. So if you initialized the TPM using the IBM Windows drivers, you have to mess around with TSS to get it to work correctly with Thinkpads. I tried for a bit to try to get it to work a while ago, but the few things I tried didn't work, and I eventually lost interest. - Ted --
None of that, I cleared the TPM. In fact, it took a while to find a non-black-magic way to get the IBM BIOS to unhide the "Clear the TPM" prompt... Anyway, the TPM is clean, and I have tried with and without passphrases (owner and operator), etc. It is either a userspace to kernel communications bug, or a trousers bug. The PUBEK is there, the kernel exports it nicely through sysfs, but trousers gets crap instead of the PUBEK if I ask it to get the PUBEK (and therefore, nothing useful in trousers works). -- Henrique Holschuh --
If I recall correctly, you need access to a magic TPM key just to *talk* to the TPM. Normally that key is stored in a file, and of course we can have a userspace helper pull that key into the kernel, but given the extensive Trousers infrastructure that can do this already, it seemed to make more sense to do it all in userspace, and not require any more kernel code. - Ted --
The TPM has some sort of idea of restricted operations. It will depend whether one can get random numbers as an anonymous party (and frankly, I don't care for looking at the TCG docs right now to find out). I certainly can ask the TPM "are you there?" even when it is disabled(!), so I would not be too surprised to find out that, as long as it is enabled, it will return random numbers to anyone. But access to the TPM requires a control layer which must have exclusive access to the chip. That layer would have to move into the kernel... IMHO, it is just not worth even bothering with the idea, and just do it all in userspace. -- Henrique Holschuh --
So where does one get entropy if not the ethernet adapter on many embedded systems? If you have no mouse, no keyboard, no hardware number generator, just ethernet ports and a serial console that usually receives no input. While ethernet might not be preferable if you have something else, sometimes you really don't have anything else. -- Len Sorensen --
Already answered in this thread... EGD illustrates how many sources of entropy remain, even in the example you just gave. Further, you do not want to rely on entropy from a source that declines just as network traffic increases. Jeff --
I don't know egd that well, but from a cursory look it gets data from such things as w or last (wtmp) which is static on most embedded boxes. It also uses netstat and snmp - surely this is at least as easy to manipulate as interrupt timings? I'm not a cryptographer by any means but it looks as if it works by magic. Last changed 2002, written in perl. No, I don't think I'll be shipping this on any systems any time soon. --
I will certainly keep applying a patch to the kernel to enable the ethernet driver as a source of entropy. I won't expect the upstream kernel to want it, but it certainly is useful to have some source of entropy. Generating an ssl key or the like can take an awful long time if you have no sources at all. The last thing I need is another perl script eating up resources for no good reason. -- Len Sorensen --
Inevitably some of the local-machine entropy sources will be static or externally influenced. That's the whole point of using several. If using one source was sufficient... we would just use that one and be done with it. :) The questions to ask are * is this collective snapshot of local machine state sufficiently unique? * is this local-machine state externally controllable within realistic netstat reflects local machine state of all sockets, including local ones, and including local details like tcp in-q and out-q. snmp can query MIBs such as ethernet wire stats, gaining entropy from pause/collision/etc. frame statistics. A set of mitigated network interrupt events is far, far more predictable and controllable than the collective state of a machine's network sockets, or the electrical state of the ethernet LAN link. For network-interrupt randomness to be subverted in some cases, one might need only to increase overall network traffic to a certain level. Jeff --
Ethernet is observable so ethernet isn't entropy. There is no "anything else" here -> there is no *anything* --
So what is one to do if a few applications want to read from /dev/random but you have no excellent source of entropy on the system? Wait forever? I think ethernet interrupts are better than nothing, and I really doubt that someone would be able to predict what the entropy generated would be in practice. If you have that good access to the physical network, then probably have physical access to the system itself, in which case it is all irrelevant. -- Len Sorensen --
Yes. If they don't need that level of security they can use /dev/urandom. Piping network randomness into /dev/urandom is probably quite sensible but not into /dev/random. Alan --
Well it isn't that things like ssh and ssl and such don't need that level of security, but if there is no way to get it you have to go for the best you can get. Fortunately it seems hardware RNG is becoming more common on embedded CPUs. -- Len Sorensen --
Of course in some cases, someone might be typing stuff at the serial console and you get some actual random input, but you can't rely on it being there. It's nice to get it if it's there, though. -- Len Sorensen --
So open /dev/urandom in that case. It's a user space problem - this is policy. Alan --
How does user space know? Maybe someone is using the serial console, maybe not. -- Len Sorensen --
On Fri, 16 May 2008 16:39:33 -0400 How does the kernel know - it has even less useful policy information ? --
I remember Jesse telling that he had this very same experience while installing a RH box on a headless system with a serial console - a box prompted the user to rattle a keyboard in order for the ssh key generation to continue :) you absolutely don't want to use urandom for that I assume, but if the system just sits dead waiting for randomness, and you can't see the popup asking for some entropy, you're pretty much screwed :) Auke --
If there was an IRQF_SAMPLE_DUBIOUS or IRQF_SAMPLE_URANDOM for network device IRQs to feed /dev/urandom, are there any other IRQs that should use it instead of IRQF_SAMPLE_RANDOM? From a cursory search, it seems like: * network drivers could use IRQF_SAMPLE_URANDOM * all (?) other device drivers could use IRQF_SAMPLE_RANDOM * and timer IRQs should not use either? chris --
Is it permissible for /dev/urandom to degrade to be externally influenced by a hostile party? For example, /dev/random has run out. So the output of /dev/urandom is now determined by previous values of /dev/random. I then send in a stack of network packets at regular intervals. So the output of /dev/urandom is now greatly determined by those packets. My search space for the resulting key is small since /dev/urandom appears to be random, but in fact is periodic. I'll also note that there is a huge number of periodic packets seen by hosts on quiet networks -- such as a preparation VLAN where a system administrator might choose to run up a new machine. --
That's not how it works. Basically, as long as there is *some* entropy in the system, even from the /var/lib/random-seed, or from keyboard interrupts, or from mouse interrupts, which is unknown to the attacker, in the worst case /dev/urandom will be no worse than a cryptographic random number generator. Even if you feed it huge amounts of known data, it won't allow you to "influence" the cryptographic random number generator --- unless of course SHA-1 is totally and thoroughly broken as a cryptographic hash algorithm (invalidating all public key certificates and digital signatures made using SHA-1 algorithm). There is a reason why /dev/random is world-writeable; it's perfectly safe to write arbitrary amounts of data into /dev/random. If the attacker doesn't know what has been mixed into the entropy pool, his life just got a lot harder. If it is the attacker mixing known data into the pool, it's no worse. The problems with /dev/urandom only appear if *all* of the data is known by the attacker --- so all of the keyboard interrupts, all of the network interrupts, all of the mouse interrupts, the initial random seed file --- everything. In practice the time when this has come up is very early in the initial install process, where there is no random seed file, and before any interrupt entropy has had a chance to be mixed into the pool, particularly if it is a headless (i.e., no keyboard, no mouse, no monitor) install process. And here there is no magic bullet. If you are doing a headless install, and there is no entropy, and you don't have a way of accessing a real hardware random number generator, THIS IS NOT THE If the attacker has the power to monitor your preparation/installation network exactly when the machine is being installed, you probably have worse problems on your hands --- for example, most distribution installs off of CD include the RPM, and then get on the network to grab the security updates. If you have an attacker on your preparation/install ...
El Sun, 25 May 2008 19:27:12 -0400 [ ... ] Just a shot in the dark... would hw sensors (raw data) chips be a good source --
For systems with high resolution timers, even if an attacker has total knowledge/control of the network, it doesn't seem realistically possible for them to determine the low order bits of the nanosecond timer of disk and network I/O system calls, if those were used as a source of entropy. I think this is a case of the (unrealistic) best being an enemy of the common (and realistic) good. Another idea that occurred to me: How about using the low order bits of the instruction memory address being executed that was interrupted by the HZ timer interrupt? This also doesn't seem to be something that an external attacker could realistically determine. And a combination of these approaches would be that much stronger, combined of course with any other available entropy sources. -Bill --
Think of constant-instructions-length processors :-) -- Krzysztof Halasa --
I'm not sure what you're driving at, but if it's that you shouldn't use the very last 2 or 3 bits, then sure those should be excluded. But that still leaves 9 or 10 bits at least in the page offset, and that's being conservative in the number of address bits to sample. -Bill --
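A rough illustration of the bit arithmetic Bill describes, assuming a 4096-byte page (12 offset bits): skipping the lowest 3 bits of the interrupted address leaves 9 bits inside the page offset. The function name and its parameters are hypothetical and exist only for this sketch; nothing like it is claimed to be in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: drop the skip_low least-significant bits of an
 * interrupted instruction address (they are fixed on processors with
 * aligned, constant-length instructions) and keep the next `keep` bits.
 */
uint32_t pc_sample_bits(uint64_t addr, unsigned skip_low, unsigned keep)
{
    assert(skip_low < 32 && keep >= 1 && keep <= 32);
    return (uint32_t)((addr >> skip_low) & ((UINT64_C(1) << keep) - 1));
}
```

With skip_low = 3 and keep = 9 the sample covers address bits 3 through 11. How much real entropy those bits carry is a separate question, which the replies below take up.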
We would still need to exclude any tight loops which an attacker could predict or influence a process to enter - the idle loop obviously, plus udelay(), memcpy() and probably many other functions. Some such loops may be in userland and therefore unknown to us. So there might not be nearly as many bits of entropy in the program counter as could be naively expected. What's more, once userland is blocked on /dev/random, there is no more entropy available from the program counter! Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. --
Around the same time I was working on getting the performance figures for the RNG in the Infineon TPM in the system I had, I tried, however briefly, to concoct a test using netperf and pulling the ITC on an Itanium processor to generate some randomness. I'm not at all sure I was doing things correctly - I was pulling the bottom one to 4 bits of the ITC after each call to recv() of a TCP_RR test - but when I tried to feed the resulting trickle of data through diehard (which I may have been running poorly) it was giving nothing but a p value of 0.000000 which while I don't grok the p-value itself, I understand that consistent value of 0.000000 is bad :( So, I may have had a bad test case. If someone has some suggestions for a better test of the low-order-bits-of-the-interval-timer hypothesis I'd love to hear about them. rick jones --
Well, I'd fear that hlt instruction in idle loop would be the one interrupted most. But low bits of tsc at timer interrupt would be fine entropy source. ..actually, bogomips varies a bit between boots, maybe we should hash it into the pool? Maybe we could even redo bogomips calculation at runtime and use low bits as random numbers? :-). Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
You don't know what packet-shaping us upstream ISPs are using.
If we're shaping then we're moving packets in time so that they
arrive upon the ticking of an output queue playout clock. That
is, packet arrival becomes periodic not random.
Linux has a class-based queuing implementation and this would
have a similar effect on outbound packets.
Nearby microwave ovens will add periodicity to the arrival
of WLAN data. It wouldn't shock me if multicast traffic
over WLANs (even if not addressed to the host in question)
had the same effect on unicast data.
TCP's behaviour hardly leads to random packet arrival times.
Take the probability of TCP data inter-packet arrival times.
It is at least a binomial distribution (and thus not a random
distribution, and thus not suitable for /dev/random):
- Case A: first packet in a TCP window transmission
- Case B: subsequent packets in a TCP window transmission
(probability rises to near 1 that another packet
will shortly follow this one).
TCP packet transmission times are also binomial and strongly
self-correlated.
Worst of all, packet arrivals and departures are remotely observable,
both to a classic remote attacker with access to the comms channel and
to another user on a multiuser host. So even if packet arrivals and
departures were totally random they would not be of use, since the
"random" numbers which contribute to the key would be known to the
attacker.
Regards, Glen
--
We have two random number interfaces:
- /dev/random
- /dev/urandom
If a customer wants to get data from /dev/random although there's not
enough entropy that's not a problem we can solve (we can only try to
gather more real entropy if possible).
If he can live with dubious data he can simply use /dev/urandom .
If a customer wants to use /dev/random and demands to get dubious data
there if nothing better is available fulfilling his wish only moves
the security bug from his crappy application to the Linux kernel.
But what we could perhaps do with some kind of IRQF_SAMPLE_DUBIOUS would
be to improve the quality of the data in /dev/urandom if there's not
enough entropy available?
I have seen embedded systems with zero entropy, and dubious entropy
might there be better than no entropy at all.
Or am I wrong on the latter?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
--
It's also relevant to the discussion to note that input data to kernel devrandom is mixed, and we can control the amount of "credit" applied to incoming entropy. Jeff --
Sure, and one possible thing to do is to simply always input the interrupt information to the random number generator, but give it a "entropy credit" of 0. That has the net result of potentially improving the entropy found in /dev/random and /dev/urandom, but not necessarily compromising /dev/random, since /dev/random's output is throttled by the entropy estimate. The only cost of doing this would be the overhead in sending the information into the entropy pool. - Ted --
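A toy user-space model of the zero-credit idea may make the distinction concrete. This is only a sketch under loud assumptions: the struct, both function names, and the FNV-1a style mixing step are stand-ins invented for illustration, not the kernel's pool or its mixing function. The point it shows is that samples always change the pool contents, while only the credit moves the estimate that gates /dev/random-style reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy pool: 64 bits of state plus a running entropy estimate. */
struct toy_pool {
    uint64_t state;        /* mixed pool contents */
    unsigned entropy_bits; /* how much entropy we claim to hold */
};

/*
 * Mix a sample into the pool and credit it with credit_bits of entropy.
 * The FNV-1a style step is a placeholder for a real mixing function.
 */
void toy_mix(struct toy_pool *p, const void *buf, size_t len,
             unsigned credit_bits)
{
    const unsigned char *b = buf;

    assert(p != NULL && (buf != NULL || len == 0));
    for (size_t i = 0; i < len; i++) {
        p->state ^= b[i];
        p->state *= UINT64_C(0x100000001b3);
    }
    p->entropy_bits += credit_bits;
}

/* A /dev/random-style read is allowed only if the estimate covers it. */
int toy_can_read_random(const struct toy_pool *p, unsigned want_bits)
{
    return p->entropy_bits >= want_bits;
}
```

Calling toy_mix() with credit_bits set to 0 models the network-interrupt case: the state is stirred, but a throttled reader gains nothing it can draw on.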
Note: I'm far from being any kind of expert on this topic, but I just had a crazy idea. What if we use the time between syscalls being made as a source of (very little) entropy? My point is that the rate (and timing between) syscalls is depending on very many factors; the kernel version (and configuration), the software installed, the software currently executing, the state of the software currently executing, the number of apps executing, the amount of network traffic, the accuracy of the hardware clock, the speed of (various) IO sources (network, disk, USB, etc), the speed (and type) of the CPU, the speed of memory. And various other things. I'd guess that predicting the syscall rate and interval between syscalls would be too hard to accurately predict to predict the actual entropy generated by that sampling in any real world scenario. Wouldn't that make it a reasonable entropy source for machines that have no other sources (and a fair contributor of entropy even for machines that do have other sources) ?? -- Jesper Juhl <jesper.juhl@gmail.com> Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html --
It Depends. For certain workloads, a lot of these issues might just boil out, or not result in as much entropy as you think. Think about a certificate server which doesn't get much traffic, but when it is contacted, it is expected to create new high security RSA keys and the public key certificates to go with it. If the attacker knows the machine type, distribution OS loaded, etc., it might not be that hard to brute force guess many of the factors you have listed above. Basically the question has always been one of the overhead to collect and boil down any input data (which after all, any user space process is estimating how much "entropy" should be ascribed to data which is sent into the entropy pool, and this is where you have to be very careful. If you screw the entropy credit information then security of /dev/random will be impacted. /dev/urandom won't be impacted since it doesn't care about the entropy estimation. That's why only root is allowed to use the ioctl which atomically sends in some "known to be random" data and the entropy credit ascribed to that data. - Ted --
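For reference, the root-only ioctl Ted mentions is RNDADDENTROPY from <linux/random.h>, which takes a struct rand_pool_info carrying both the data and the entropy credit. A minimal user-space sketch follows; the wrapper name and its sanity check on the credit are my own assumptions, and the caller is assumed to have opened /dev/random with sufficient privilege.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/random.h>

/*
 * Hypothetical wrapper: hand `len` bytes to the kernel pool and credit
 * them with `credit_bits` bits of entropy in one atomic step.
 * Returns 0 on success, -1 on failure (errno is set by ioctl).
 */
int add_credited_entropy(int fd, const void *data, size_t len,
                         int credit_bits)
{
    struct rand_pool_info *info;
    int ret;

    assert(data != NULL && credit_bits >= 0);
    /* Never claim more entropy than there are bits in the buffer. */
    if ((size_t)credit_bits > len * 8)
        return -1;

    info = malloc(sizeof(*info) + len);
    if (info == NULL)
        return -1;

    info->entropy_count = credit_bits; /* credit, in bits */
    info->buf_size = (int)len;         /* payload size, in bytes */
    memcpy(info->buf, data, len);

    ret = ioctl(fd, RNDADDENTROPY, info);
    free(info);
    return ret;
}
```

An unprivileged process can still write() the same bytes to /dev/random; that mixes them in without touching the entropy estimate.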
Hmm, I would like to know how you'd do that. Even if you a) know the exact distro installed (and the configuration), b) know the exact hardware in the machine, c) know the exact time it was booted and know that there have been no ssh logins or similar that might have generated syscalls, d) know exactly how many requests (and of what type) have been made to the server and the exact times they were made. How would you go about brute force guessing the contents of the entropy pool? If the server does, for example, this; every second it samples the number of syscalls made during that second and uses that number as the base of adding one or two bits of entropy. Every time a syscall is made it uses the "time since last syscall in 'us'" to add one bit of entropy to the pool. I'd say that even if that server sees very little (and even predictable) traffic, we may have; details of the filesystem layout on disk, a timer interrupt happening a few microseconds early due to a flaky chip, a background process initiating some action a millisecond early/late for scheduling reasons, the switch the machine is connected to causing a network packet to arrive a tiny bit later than normal and various other factors like that, to cause the generated entropy to be off by a bit or two compared to your guess - and by the time you realize you are off, another spurious event has probably happened, so you'll never end up in-sync with the entropy pool... Or is there some "obvious entropy pool guessing method" that Yes, I'm aware of that, and I'm not suggesting to use syscall rates as a generator of high amounts of high quality entropy. I'm merely suggesting that sampling syscall rates and time-between and using those numbers as the source of very low amounts of low quality entropy might be worth-while. It wouldn't hurt on machines that have other, higher quality, entropy sources.
On machines that have no other entropy sources it would ensure that we always have a steady (although I'm only talking about ...
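The "one bit per syscall interval" scheme described in the message above could be sketched roughly as follows. Everything here is hypothetical (the struct, the function name, the choice of the lowest bit of the microsecond delta), and it runs in user space on timestamps passed in by the caller; it says nothing about how much real entropy such bits contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sampler state for the interval-timing idea. */
struct interval_sampler {
    uint64_t last_us; /* timestamp of the previous event, microseconds */
    uint8_t  bits;    /* bits collected so far, newest in the LSB */
    unsigned nbits;   /* number of collected bits, 0..7 */
};

/*
 * Record one event at time now_us. The lowest bit of the interval since
 * the previous event is shifted into the accumulator. Returns 1 and
 * stores a byte in *out once eight bits have been gathered, else 0.
 */
int interval_sample(struct interval_sampler *s, uint64_t now_us,
                    uint8_t *out)
{
    uint64_t delta;

    assert(s != NULL && out != NULL);
    delta = now_us - s->last_us;
    s->last_us = now_us;
    s->bits = (uint8_t)((s->bits << 1) | (delta & 1));

    if (++s->nbits < 8)
        return 0;

    *out = s->bits;
    s->bits = 0;
    s->nbits = 0;
    return 1;
}
```

Each completed byte would then be mixed into the pool with whatever (small, possibly zero) credit policy decides.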
There are two issues that people need to separate here: - sampling noise - estimating entropy in that noise It certainly makes sense to sample network timing noise. It often does not make sense to assume that there's any entropy in those timing samples. For instance: - our clock resolution may be low enough that an attacker can guess our samples (ie it's simply HZ, very common in embedded land) - the bus involved (ISA, peripheral bus, even slow PCI) may have the same issue - it may be heavily correlated with some other measurement (ie network vs disk samples on file servers) We currently assume that IRQF_SAMPLE_RANDOM means 'this is a completely trusted unobservable entropy source' which is obviously wrong for network devices but is right for some other classes of device. I'd personally prefer to add a new interface, eg add_network_randomness(), that internalized the wisdom of what to do with network samples. Similarly, the various 'input'-like devices that use SAMPLE_RANDOM should be switched to go through the 'input' interface. -- Mathematics is the supreme nostalgia of our time. --
So I guess a good message for customers might be: Don't depend on an entropy source whose volume decreases as workload and network traffic increase. Jeff --
