Imperfect Forward Secrecy

Credit: Nick Carter
Credit: Nick Carter

Perfect Forward Secrecy is a desirable characteristic of secure communications protocols (such as TLS, the protocol that puts the “s” in “https://…”) which means that even if I manage to steal your private key, I still can’t read the terabytes of encrypted traffic that I’ve managed to capture. The reason for this is that the private key used to assure your identity was not also used to secure the session keys used to encrypt the data sent back and forth. Instead, some method of key exchange was used which uses clever mathematics to allow you and your correspondent to independently calculate the same session key while I, despite eavesdropping on all your conversations, can’t. The method that I’m going to discuss is called Ephemeral Diffie-Hellman. If you’re not interested in the mathematics, you can skip this bit. Otherwise, the actors in the drama are an Activist (A), Wikileaks (W) and the CIA (C). A is submitting a bunch of top secret documents to W and C would quite like to be able to listen in.

W has a set of fixed Diffie-Hellman parameters (fixed because generating them from scratch takes ages) consisting of a generator, g (which is typically 2 or 5) and a modulus, p, which is a large prime number such that for any integer a, there is another integer, k, with the property that:
g^k=a\bmod p
If you don’t understand the notation, “mod” stands for “modulo”, which you may have learnt at school as “clock arithmetic”: if the time is 22:00 hours and we wind the clock forward 8:00 hours, the clock reads 06:00 hours, not 30:00 hours. This is addition modulo 24.

W now picks a secret random number, x and sends A the following values: p, g and a calculated value w where:
w=g^x\bmod p
In the TLS protocol, W signs this particular message using its private key so that A can verify that the parameters are sound. A now picks a random number, y and sends W a calculated value a where:
a=g^y\bmod p

Now here’s the clever part. W derives a secret key, s:
s=a^x\bmod p
Meanwhile, A derives the same secret value:
s=w^y\bmod p
This works because, modulo p:
w^y\bmod p = g^{xy}\bmod p = g^{yx}\bmod p = a^x\bmod p

C, who has been privy to these exchanges, knows p, g, w and a but he can’t do a lot with them because, for example, to obtain W’s secret number x he would have to work backwards from w to get it. This problem is known as the discrete logarithm problem (you’re looking for an exponent and if this weren’t modular arithmetic, you’d simply take the logarithm of w to get the answer) and is fantastically hard to solve when p is a large number.

So much for the beautiful theory, the devil is in the implementation details! If you’re programming support for ephemeral Diffie-Hellman, you’re likely to be using openssl to do so and a combination of poor documentation and boneheaded behaviour makes it really easy to get this wrong.

The first thing you need to do is generate parameters. Fortunately, openssl makes this easy to do from the command line. For example, to generate a 2048-bit prime (the recommended minimum at the time of writing), you’d use:

$ openssl dhparam -out dh2048.pem 2048

If you do it like this, you need to load the parameters from the file at runtime using the PEM_read_DHparams function. I prefer not to do it this way, because using an external file gives me a bunch of possible error conditions that I need to handle and test. Instead, you can get openssl to write you a nice C-function (actually, it’s a bit ugly) that you can simply paste in your code:

$ openssl dhparam -noout -C 2048

If you go down this route, remember to refresh your parameters every so often. I use a Perl script that generates a header file with fresh parameters whenever I do a clean build.

However you get your parameters you have to tell openssl to use them. One way is to use a fixed-strength set of parameters for every connection:

#include <openssl.h>

SSL_CTX ctx = SSL_CTX_new();
/* Load Diffie-Hellman parameters */
DH *dh2048 = ...
...
if (SSL_CTX_set_tmp_dh(ctx, dh2048) != 1) {
   /* Handle the error */
   ...
}

This is OK, but it’s not very agile. If you look at this table (the relevant numbers are the “L”-values in the FF column), you’ll see that a 2048-bit prime is a match for protecting a 3DES key, since both give you 112-bits of effective security. 112-bits means that you’d expect to perform around 2112 (or 5×1033) computations to break the cryptography. Since Sweet32, 3DES is right out. This means that, realistically, you’re using AES (or maybe Camellia if you really don’t want to sell software to the US government) with a minimum strength of 128-bits. The corresponding prime is now 3072-bits. It is quite possible when using TLS to negotiate a cipher suite that uses AES-256 as the data encryption algorithm. The matching prime is a whopping 15360-bits (good luck generating that!). Since using a larger prime than required means doing more work in your overburdened server code, you’d ideally want a way of matching the parameters to the ciphersuite. openssl appears to give you the means to do so via a callback function: SSL_CTX_set_tmp_dh_callback. However, if you use this, there are a couple of serious pitfalls, one that may be of your making and one that most definitely is not.

Since this will be code that you probably don’t revisit terribly often, your code may look something like this:

#include <openssl/ssl.h>
#include <openssl/dh.h>

DH *dh512 ...;
DH *dh1024...;
DH *dh2048...;

DH *DhCallback(SSL *ssl, int isExport, int keyLength)
{
    DH *ret;
    switch (keyLength) {
        case 512:
            /* Oops, we're vulnerable to Logjam */
            ret = dh512;
            break;

        case 1024:
            /* Not good enough since 2010 */
            ret = dh1024;
            break;

        case 2048:
        default:
            /* Just right but never used */
            ret = dh2048;
            break;
    }
    return ret;
}

SSL_CTX ctx = SSL_CTX_new();
SSL_CTX_set_tmp_dh_callback(ctx, DhCallback);

If your code does look like this, then that’s the problem of your making. You have two code paths that don’t do the right thing (and, indeed, one that does a very wrong thing indeed). However, there’s another problem. You might reasonably expect the keyLength parameter to be a “sensible” value for the ciphersuite, but it isn’t. It’s always 1024 (unless isExport is non-zero, but then you have other problems to worry about!). At least the most recent documentation tells you to ignore the parameter, but somehwat older documentation is quite misleading.

If you want the agility, you can base it on the strength of the private key since, referring back to the earlier table, the “FF” and “IF” (meaning RSA keys) columns line up. Your code might look something like this:

/* Use decently strong parameters */
DH *dh2048...;
DH *dh3072...;

DH *DhCallback(SSL *ssl, int isExport, int keyLength)
{
    DH *ret;

    EVP_PKEY *pkey = SSL_get_privatekey(ssl);
    int pBits = pKey ? EVP_PKEY_bits(pkey) : 0;

    if (pBits <= 2048) {
        ret = dh2048;
    }
    else {
        ret = dh3072;
    }

    return ret;
}

SSL_CTX ctx = SSL_CTX_new();
SSL_CTX_set_tmp_dh_callback(ctx, DhCallback);

Note the test for a NULL private key. If it’s NULL, you’re using anonymous Diffie-Hellman which means you’re in something of a state of sin!

If you used the former method (SSL_CTX_set_tmp_dh), openssl used to immediately generate a session key and use for it every single client connection for the lifetime of your server process unless you explicitly told it not to be silly. Fortunately, it doesn’t do that anymore.

So that’s how to use ephemeral Diffie-Hellman safely in your server-side code. However, there’s a clearly superior alternative, elliptic curve Diffie-Hellman (you can find a gentle introduction to elliptic curve cryptography or ECC here).

Uptake of ECC was hindered until quite recently by Red Hat’s stubborn insistence on stripping ECC functionality out of openssl. I’m not going to question Red Hat’s legal concerns (especially since they got really shirty about it) but their stance badly affected Linux developers while Microsoft, Google, Mozilla, Canonical et al. didn’t seem at all bothered.

There’s another issue too: FUD about the numerology behind it all. To all intents and purposes, the world uses two curves called P-256 and P-384 which provide 128 and 192 bits of effective strength respectively (look at the “EC” column in that table I referred to earlier). The reason that everybody uses these two curves is twofold: their use is required if you want to sell software to the US government and coming up with your own curves is beyond the grasp of regular mortals. The likes of Daniel J. Bernstein can but you and I almost certainly can’t! Those two curves were published by the perfectly benevolent National Institute of Standards and Technology (NIST). Unfortunately, they had assistance from the distinctly less benevolent National Security Agency (NSA) and anything the NSA touches is tainted by association. My take on the FUD is that P-256 and P-384 are part of “suite-B” (algorithms approved for securing up to Top Secret information). If the NSA had a magic back-door for P-384 (the Top Secret one), they would have to assume that the Chinese and Russians knew about it as well. Since presumably the NSA don’t want the Chinese and Russians reading Top Secret US communications, it’s a reasonable assumption that the NIST curves are currently secure from snooping.

Therefore, my parting advice is to ditch all that code supporting DHE ciphersuites and just use elliptic curve cryptography instead. In a novel move, openssl not only makes this easy but also provides a function that does – apparently – the right thing:

#include <openssl.h>

SSL_CTX ctx = SSL_CTX_new();
SSL_CTX_set_ecdh_auto(ctx, 1);

Leave a Reply

Your email address will not be published. Required fields are marked *