Re: [PATCH resend][CRYPTO]: RSA algorithm patch

Previous thread: sched_yield proposals/rationale by Buytaert_Steven on Thursday, April 12, 2007 - 1:31 am. (15 messages)

Next thread: [AppArmor 03/41] Remove redundant check from proc_sys_setattr() by jjohansen on Thursday, April 12, 2007 - 2:08 am. (2 messages)
From: Tasos Parisinos
Date: Thursday, April 12, 2007 - 1:34 am

Well with old credit cards (magnetic) it was possible (as a matter of fact it was trivial)
With new chip cards its a lot harder but still possible, only the risk is smaller in means that

I didn't have time to read the pdf but i will. As for erasing ram it usually means to also 
scramble it and there are chips that advertise that can do this in 1 cycle. Well reaching the bus or ram 
to probe it is another thing. Most die in such chip are also hidden under protective meshes so it 
takes some time. What you said is true, systems are created and broken, the question is what 
you have to hide and whether the person that wants to break it has the money the will the knowledge and the equipment 
to do it. Because if he is to spend a million euro only to create fraud of half a million euro then he wont do it.


I understand and i agree, thing is that i dont decide about which parts are given GPL.
While the RSA module is given standalone, it can be still used by others to develop
their own signing and authenticating mechanisms. For example some will decide to do 
hashing of code with md5 while others with sha1 and some will do padding compliant
with pkcs1 other will implement other schemas e.t.c. I want to say although this is not
really ready to be used for binary signing, it has the advantage to provide basic functionality

people think that by hiding implementations (which can be reverse engineered of course)
will make it more difficult to break. Well i agree with you but... its not in my hands.
So i will come back with these parts replaced with some of my own

And a question
Can someone provide me with a full list of kernel functions where code is loaded?

Best regards 
Tasos Parisinos

-

From: Satyam Sharma
Date: Thursday, April 12, 2007 - 2:35 am

Firstly, I'd like to steer clear of all the RSA-in-kernel-or-userspace
/ MPI-in-kernel-or-userspace arguments. True, MPI / RSA / (PKI? that
even requires some knowledge of ASN.1 etc to parse certificates) do
add a lot of bloat to the kernel. That said, I don't see any other
reason to _not_ do asymmetric crypto in the kernel either. In some
situations, that in fact simplifies the overall system by avoiding the
use of additional userspace components / daemons to do the key
management stuff.

I do have some comments on the submitted patch, though:

1. First, sorry, I don't think an RSA implementation not conforming to
PKCS #1 qualifies to be called RSA at all. That is definitely a *must*
-- why break strong crypto algorithms such as RSA by implementing them
in insecure ways? It shouldn't be too difficult for you to add, and
you _can_ still allow the user to choose the appropriate padding
scheme by taking it as an argument to the encrypt / decrypt API
functions (some possibilities I can think of are PKCS#1 v1.5 padding,
v2.0 i.e. OAEP padding, or null padding).

2. Of course, you can forget about what digest was used to compute the
plaintext input to RSA. We don't really care, as RSA should never be
used to operate on multiple blocks anyway and no real-world hybrid
cryptosystem uses it in such a way either. (RSA could be thought of as
a block cipher of block size == modulus size)

3. Please, try to re-use the MPI library from GnuPG that has already
been ported to the kernel -- not the mainline tree, but most
distributions such as Red Hat, Fedora, Debian, Suse(?) have maintained
an MPI library in crypto/ for the modsign stuff for years now -- that
would make your entire RSA module merely a set of wrappers
(implementing PKCS#1) to the core modular exponentiation functions in
that MPI library. It also contains implementations of optimized
algorithms for a lot of the mathematics involved, so you have only to
benefit from utilizing a library that has been used, maintained ...
From: Indan Zupancic
Date: Thursday, April 12, 2007 - 5:22 am

Because it can all be done in userspace? Can trap all binary loading from
user-space too, though that must be done carefully, without changing the

It's still RSA, that it's not enough to get a complete and secure crypto
system doesn't mean it isn't RSA anymore. Maybe you're right and having
RSA without the rest makes no sense. But things like padding schemes change,
the core RSA algorithm does't, so I think when adding anything that's unused
by the rest of the kernel it has more chance if it's a low maintenance bare

Right, and that MPI implementation isn't merged. Perhaps a smaller, simpler,
not complexified by optimisations version has more chance. RSA is damn slow
anyway, so better to concentrate on other things than speed, IMHO.

That said, perhaps the different groups could concentrate on one implementation
they all agree on? That's the main advantage of having it mainline, that there's
a common place to cencentrate all effort into.

But keep in mind that it's a precaurious balance, at least that's how I see it.
Right now it's about a rather static 800 lines of code which implements some
common algorithms. Not much bloat, low maintenance, over all not much reason
to not merge it, except that it isn't used by the rest of the kernel.

If you're going to add the full kitchen sink, it might not look that appealing
anymore, and merging it will become much harder, especially as no one provided
good arguments why it can't be done in user space.

Perhaps, with that in mind, it's better to not merge the bare RSA version either,
as it might give false hope for more, because it looks like most implementations
should be done in user space (either because that's better anyway, or because of
legal reasons).

Greetings,

Indan


-

From: Andi Kleen
Date: Thursday, April 12, 2007 - 5:40 am

Before merging anything there would need to be clear applications
(with patches, discussions etc.) 

In general code with no users is not merged because it will just bitrot.

-Andi

-

From: Satyam Sharma
Date: Thursday, April 12, 2007 - 7:20 am

Aarghh ... just the argument I wanted to steer clear of, isn't this?
Continuing, however, we _can_ do a lot of things in userspace, but
still might choose to do them in the kernel, for various reasons in
each case (performance, convenience, application transparency etc
etc). So many things come to mind -- filesystems, readahead, the
cryptoapi itself -- some less controversial, and some more so.

Of course, once an infrastructure is indeed merged into the kernel, it
better already have (or quickly develop) some users for itself. If it
doesn't, it does end up as rot. If it does, that pretty much solves
the maintenance problem there itself.

Getting specific to *this* particular case, I _can_ think of other
users in the kernel that could gladly use general asymmetric crypto
capabilities added to the cryptoapi -- encrypting file systems, module
signing (as of now they've just implemented DSA directly for

Well, then *all* the users of the RSA cryptoapi code would end up
duplicating the PKCS padding for themselves! (because if we don't care
about padding in a "low-maintenance-bare-RSA-implementation", then we
might as well not implement any RSA crypto at all :-) Also, padding
schemes are careful cryptographic constructions -- much like ciphers
themselves. Much thought has gone into PKCS#1 over the years (and it's
later revisions). We wouldn't want users to re-invent some kind of
insecure padding for themselves.

The way I think about this is: (1) RSA without PKCS#1 makes no sense,
and (2) PKCS#1 is *defined* only for RSA. Combine (1) and (2), and it
makes *all* sense to encapsulate the two in just one module.

Coming to padding standards, RSA has been around for ~20 years now,
and only 3 PKCS#1 padding schemes were ever defined (at least AFAICR).
That's not something that changes so frequently as to not implement it
in the core RSA code itself, especially when the penalties of not

If RSA is slow, then optimizing it to make it fast (and thus have more
impact on relative ...
From: Indan Zupancic
Date: Thursday, April 12, 2007 - 8:01 am

Yes, but currently those users are more or less hidden and working

The potential for users is the bare minimum expected, next step is that


I prefer clear, simple code that is easily audited to be correct or at least
not cause problems, which is small enough to not add much bloat. I don't care
about "very slow" or merely "slow" code. RSA/DSA isn't used as a symmetric block
judged by its quality, not by its heritage.

I know you didn't want to talk about the user versus kernel space question,
but I think it's a very important and interesting one. As you said, GPG is
there around for years and tested, why making a new implementation in the kernel?

And whether they can agree on a common implementation. If they all just want to do
their own thing, and not combine resources, then it doesn't make any sense to merge
anything at all, as it would be ignored anyway.

As for the userspace issue, I get the impression that they didn't consider it very
well, ignoring things like initramfs versus main filesystem and so on. It's also a
gradual thing, some things can be done best in the kernel, and others in user space.
Another alternative is to push it up into the boot loader. ;-)

Greetings,

Indan


-

From: Satyam Sharma
Date: Thursday, April 12, 2007 - 11:38 am

Hi Indan,


Agreed, the GPG MPI library *is* significantly larger -- I'm looking
at the Fedora 4 kernel source here and their crypto/mpi/ comes to
about 5000 lines (not including comments). But there are reasons for
this -- they export far too many MPI operations. I suspect they just
ported the *entire* GNU MP lib.

For example, there's crypto/mpi/longlong.h, worth 1500 lines of
ugliness in itself. I see arch-specific assembly there for processors
that Linux doesn't even support. Wow.

There are some other better reasons for the bloat in the GNU MP lib,
though. Tasos' code uses the rsa_cipher() as a dual-purpose primitive.
Feed it the plaintext (p), public exponent (e) and public modulus (n),
and you get the ciphertext (c = p^e mod n). Feed it the ciphertext,
private exponent (d) and public modulus, and you get the plaintext (p
= c^d mod n) back. All modern RSA implementations, however, prefer
preserving the prime numbers (p, q, and their other derivatives such
as d mod (p-1), d mod (q-1) and inverse of q modulo p) generated at
the time of key generation along with the private exponent as the
complete "private key" (this is what is recommended by PKCS#1 too)
which enables us to use the chinese remainder theorem to decrypt
faster than simply do an (expensive) modulo exponentiation again.

Also, DSA signature generation and verification (which is what the
modsign guys use) doesn't exactly utilize the same MPI operations as
RSA does. If we really don't care about speed and DSA, we could strip
that library down to the basic and RSA-only operations Tasos' code
provides, and I'm quite sure we'd end up somewhere close to 1000 lines
there too.

Of course, that would then *force* other users such as modsign to
re-implement their own library for their needs again, thus defeating
the exercise of merging this bare-bones MPI library into the kernel in

Yes, that "common implementation" bit is definitely crucial, which is
why merging a solid, feature-rich and fast MPI library could ...
From: Indan Zupancic
Date: Thursday, April 12, 2007 - 12:05 pm

Hello,


Which, according to the Wikipedia page on RSA, is susceptible to timing
attacks, which requires measures to counter that (that might be needed in
Tasos' implementation too though).

Doing MPI well is already tricky. Doing it in such way that most side-channel
attacks don't work is challenging, and brings already code bloat and complexity
with it.  Doing all the previous with compact and simple code is hard. Speed is
the last thing I'm worried about.

I'd rather start with good and simple code, and speed it up without adding too
much complexity gradually, case by case.

That's nice. But then you lose more or less all advantages of using an existing
implementation, don't you? Anyway, how much different would that code be from

Not if they go the other way round and strip everything except DSA functionality.

You want feature-rich and fast, I prefer dense, clean and spartan, what would
others prefer? Perhaps deciding on a common implementation is harder than it looks.

Greetings,

Indan


-

From: Satyam Sharma
Date: Thursday, April 12, 2007 - 12:57 pm

Of course, I'd rather code to the PKCS#1 RSA Cryptography Standard
than an entry-level Wikipedia page :-) Timing attacks are particularly
problematic on smart cards (too slow, and with predictable operation
times, if not using constant-time crypto implementations) and not
really worthwhile in practice on any other platform where there's
enough noise around to make accurate timing difficult (that
hyper-threading "vulnerability" discovered some time back comes to
mind). Even so, constant-time crypto implementations do take care of

I do agree that only those parts of an MPI lib that are really needed
by any users must be included. But then we don't want to end up in a
situation where we merge such a small MPI library that code and/or
functionality are being sadly duplicated across users who want
different asymmetric cryptosystems (note the 2 DLMs in mainline, for
example). When we want to support both RSA and DSA, which require a
diverse set of MPI operations and primitives, I don't see how we can
still continue to retain the simplistic and "spartan" RSA-only MPI lib
that this code provides.

Cheers,
S
-

From: Indan Zupancic
Date: Thursday, April 12, 2007 - 1:44 pm

Noise can be filtered out, and combined attacks can give enough information.
(E.g. throw in power usage or other measurements.) There are ways to add
noise to the timing, but still using those optimisations. The point was
that they add extra code and complexity.

(Personally I think RSA-like asymmetric encryption has so many, often
subtle problems, with complex, hard to get secure implementations, that
it's something I'd avoid using as much as possible. Unfortunately there's

That's because for side-channel attacks you need physical access to the
hardware, something for most machines means security is breached anyway.
But when this code is going to be used to sign things by embedded devices
(with a local, secret key), it can be important.

For checking signatures the key is known and all this doesn't matter, but

We won't know until those users show themselves. Until then, we can only
speculate. Right now there is only user, with another one lurking in the
background.

Greetings,

Indan


-

From: Satyam Sharma
Date: Thursday, April 12, 2007 - 2:13 pm

But timing attacks are not exclusive to RSA / asymmetric
cryptosystems. Such (side channel / timing / power measurement / bus
access) attacks are possible against AES, etc too.

Of course, now we're really moving into a different realm -- I guess
in security there is always a threshold, and you really needn't care
beyond a particular threat perception level. I don't see how even the
existing cryptoapi (or *any* security measure in the kernel for that

I think the original idea was to generate signatures at a centralized
place (not on an embedded system) and only *verify* them using
*public* keys on the embedded systems? For most common
implementations, as I suggested, you only need bother yourself upto a
certain security threshold.
-

From: Indan Zupancic
Date: Thursday, April 12, 2007 - 3:51 pm

True, but those are often easier to protect, or are less vulnerable in
the first place. (E.g. it isn't very hard to make a constant time AES

True, and very specialized hardware is needed in such cases anyway, so
arguing that it's not the kernel's task to protect against such attacks
is valid. But it are interesting attacks, and people should be aware of
them, instead of blindly trusting any security measure (not implying

Yes, but it depends on how the code is used. It is supposed to be generic code,
so whether someone wants to use it for signing or not is an open question. So
far it seems only signature checking is needed, and that simplifies a lot, but
if that isn't the case more questions pop up, like where the security threshold
should be.

The user with the tightest requirements more or less dictates the implementation.

All in all, to get anything merged at all in the kernel it seems at least the
following needs to happen:

- Future users speaking up and uniting.

- Figuring out their needs (so overlapping needs can make it into common
  code, and other decisions can be made, as where the kernel and user
  space border should be.)

- Deciding on a commonly agreed security threshold, and making that explicit.

- Coding it all up and keeping it in sync with mainline.

I don't see this happening soon. But a good start would be if someone who
cares about this sets up a mailing list or website to collect all users
and information.

Good night,

Indan


-

From: Indan Zupancic
Date: Thursday, April 12, 2007 - 6:09 am

Those can be peeled off. But the paper I linked to concentrates on other kinds of attacks.

Quote:

"One of the main contributions of this thesis is fault injection attacks done in a semi-invasive
manner which can be used to modify the contents of SRAM and change the state of any
individual transistor inside the chip. That gives almost unlimited capabilities to the attacker in
getting control over the chip operation and abusing the protection mechanism.
Compared to non-invasive attacks, semi-invasive attacks are harder to implement as they
require decapsulation of the chip. However, very much less expensive equipment is needed than
for invasive attacks. These attacks can be performed in a reasonably short period of time. Also
they are scalable to a certain extent, and the skills and knowledge required to perform them can
be easily and quickly acquired. Some of these attacks, such as an exhaustive search for a"

It gives a good overview of all kind of other attacks and defences too. Personally I find the

Very true, and the right way to look at it. (Vendors should say "it costs about $X to

No, the kernel licence and others do. But you can't edit the kernel and distribute it without

What else is there except the signing and binary loading hooks? Perhaps some caching, but
not much more, is there? I'm a bit baffled about what's kept hidden here, as there doesn't
seem much to hide, except perhaps other, independent "protections", but we weren't talking
about those.

I don't know anything about your company, but I bet the marketing people are more in
favour to keeping it hidden than the security people. I'd ask the legal people for the

Counter questions:

- Why should we give that list if you're not going to release the code you're going to write?
Or in other words, why should we help you?
(But if you're going to open it up, try to find people interested in the same functionality so
you can work together on it. Not me, I'm just a passer-by.)

- That you ask for that list ...
Previous thread: sched_yield proposals/rationale by Buytaert_Steven on Thursday, April 12, 2007 - 1:31 am. (15 messages)

Next thread: [AppArmor 03/41] Remove redundant check from proc_sys_setattr() by jjohansen on Thursday, April 12, 2007 - 2:08 am. (2 messages)