On Web-Security and -Insecurity: Old attacks on new TLS implementations - or how a tiny side channel can break your crypto

This week at USENIX Security my colleague Chris Meyer will present our latest research paper on TLS attacks: "Revisiting SSL/TLS Implementations: New Bleichenbacher Side Channels and Attacks" (written by him, Eugen Weiss, Jörg Schwenk, Sebastian Schinzel, Erik Tews and me) [paper].
This blog post is intended for people who do not like boring long research papers and would like to get a comprehensive summary what's going on. As the title suggests we developed some attacks on TLS implementations. In the following, I will try to give you an intuition behind these attacks and explain why they worked.

Some prerequisites

For understanding the attacks, you need to have some background on adaptive chosen-ciphertext attacks and RSA PKCS#1 usage in TLS.

Adaptive chosen-ciphertext attacks

In general we apply a specific adaptive chosen-ciphertext attack.

In an adaptive chosen-ciphertext attack scenario, the attacker's goal is to decrypt a ciphertext without knowing the decryption key (the source of the ciphertext is not relevant, let's say the attacker eavesdropped it). Using the original ciphertext, the attacker iteratively generates new ciphertexts, which are somehow related to the original one. He sends the ciphertexts to a receiver - oracle - and observes its responses which leak specific information about the validity of the decrypted message. With each response the attacker learns some plaintext information. He repeats these steps until he achieves his goal and decrypts the whole message. See the following figure for the attack illustration.

Many of you are probably familiar with padding oracle attacks, which also belong to the family of adaptive chosen-ciphertext attacks. In padding oracle attacks the attacker exploits an information about the PKCS#5 message validity while using Cipher Block Chaining (CBC) mode of operation. In fact, we are also going to adapt a specific padding oracle attack called Bleichenbacher's attack...but this attack is based on the RSA PKCS#1 padding.

RSA PKCS#1 v1.5

RSA PKCS#1 v1.5 (in the following referenced as PKCS#1) padding is intended for messages going to be encrypted with RSA. In order to encrypt a message k, the message is first prepended with 0x00, some random bytes and 0x00 0x02. The number of random bytes is chosen in a way so that the padded message achieves specific block length (1024, 2048 or 4096 bits). See the next figure depicting a 16 byte long message padded to achieve 256 bytes (2048 bits).

We write m=00 02 ...padding... 00 k.

The message decryptor processes an encrypted message as follows. It decrypts the message, checks the message structure, searches for the first 0x00 byte after 0x02, and unpads the secret message.

PKCS#1 in TLS Handshake

PKCS#1 is used in many applications and standards. In case of SSL/TLS, it is applied in the handshake protocol to transfer a PremasterSecret:

In order to get a basic idea behind the attacks, it is not necessary to understand the whole handshake message flow. You just need to know that a TLS handshake protocol is used to establish a TLS session key. The session key is established using several public values (like client/server random) and PremasterSecret. PremasterSecret is a 48 byte long value generated by the client and sent to the server in a ClientKeyExchange message, encoded using PKCS#1:

PremasterSecret consists of two protocol version bytes (in this case 0x03 0x01) and 46 random bytes. If an attacker gets in possession of these 46 random bytes, he can decrypt a TLS session key and thus the complete TLS session (since all other values are public).

Bleichenbacher's "million message" attack

It was a very long time ago (in 1998) as Daniel Bleichenbacher presented his "million message" attack on RSA PKCS#1 [million-message-attack]. The attack works independently of the protocol using PKCS#1. Daniel Bleichenbacher showed how to apply the attack to several protocols like SSL, WTLS or IPSec. Bardou et al. improved the attack performance and showed how to apply it to crypto hardware [crypto-hardware-attacks] (Graham, thanks for the code ;)) In all cases, there is only a need for a validity oracle responding with 1 or 0 based on the PKCS#1 validity of a decrypted message.
In case of SSL/TLS, it was possible to construct an oracle based on alert messages returned by the server: In a case the client sent an invalid PKCS#1 message, the server directly responded with an alert message. Otherwise, it proceeded with the handshake flow.

Attack Intuition

I am not going into technical details about the whole attack, I just try to give you some intuition behind it. The whole attack works thanks to an RSA property called malleability. Given an RSA ciphertext, this property allows anyone to multiply the underlying plaintext - without knowing the private key.

For the description purposes, let us define a variable B=2^8(N-2), where N is the byte length of the RSA key (this just means that, for example, 2B = 0x00 02 00 ... 00). A valid PKCS#1 message always belongs to the interval <2B, 3B).

Now, imagine there is an encrypted PKCS#1 message m lying near the 3B value. Using the malleability property, the attacker can multiply m with various s values (s=2, s=3, s=4 ...).

He does this until he finds a new value m'=m*sx, which also belongs to the interval <2B,3B). In that case, the oracle responds with 1 and the attacker knows he created a valid plaintext.

The same scenario can be observed also when the original message m lies near the 2B value:

However, in that case, the number of steps (multiplications) is larger.

In general, it holds that:

Large s value indicates m is in the near of 2B
Small s value indicates m is in the near of 3B

The attacker can proceed with searching for further s values producing further valid PKCS#1 messages. This helps him to reduce the interval, where the original message m lies in.

Attack performance

Even though the attack is called "million message" attack, the number of oracle requests varies and can be much more smaller than one million. The number of requests depends on the oracle strength. In general, the oracle strength can be measured using a probability that the oracle responds with 1 when a given decrypted message starts with 0x00 02.
If a message starts with 0x00 02 and the oracle responds with 0, this is a false negative. False negatives lead to a worse attack performance.

Countermeasure

RFC-5246 (and other standards using PKCS#1) proposes the following countermeasure:

In any case, a TLS server MUST NOT generate an alert if processing an
   RSA-encrypted premaster secret message fails, or the version number
   is not as expected.  Instead, it MUST continue the handshake with a
   randomly generated premaster secret.

This should prevent the information leakage so that the attacker learns nothing about the PKCS#1 message validity.

Our challenges

The only prerequisite for executing the attack is an oracle, which allows to distinguish valid from invalid PKCS#1 messages. If such an oracle is given, it is possible to execute Bleichenbacher's attack. Thus our challenge was to analyze current TLS implementations and the possibility for turning a server into a PKCS#1 validity oracle. In addition to the direct messages, we focused on timing side channels. Once a side channel was found, we needed to analyze the oracle strength and execute a complete attack.

Attacks and side channels

Ok, so now we are ready to describe the interesting part with our practical attacks.

Side channel 1: Error messages in JSSE

We were extremely surprised as we analyzed the JSSE library. JSSE (Java Secure Socket Extension) is a default library for providing SSL functionality in Java. After sending differently formatted PKCS#1 messages, we found out JSSE responded with different errors: INTERNAL_ERROR and HANDSHAKE_FAILURE. The INTERNAL_ERROR message was caused by an internal ArrayIndexOutOfBoundsException, which was not handled by the JSSE library. Depending on the key length, this exception was thrown in specific cases, see the next figure:

As can be seen, the INTERNAL_ERROR was produced in case the message started with 0x00 02 and a 0x00 byte appeared at a specific position. Thus, when an INTERNAL_ERROR was returned, we knew the message definitely started with 0x00 02, and we were able to apply the direct Bleichenbacher attack...surprising since it should have been fixed 15 years ago.
We applied a real attack using the different error messages. We needed about 177,000 queries to a JSSE server applying 2048 bit keys and about 74,000 queries to a JSSE server applying 4096 bit keys. We needed lesser queries to a server using 4096 bit keys because the constructed oracle was less restrictive: the probability of getting a valid server response for messages starting with 0x00 02 was higher. The attack on a server using 1024 bit keys showed up not to be practical, since the probability of getting an INTERNAL_ERROR message was very low.

Side channel 2: Additional random number generation

After finding the first side channel, we decided to analyze the source code of typical TLS implementations. We found out that GnuTLS, OpenSSL and other implementations do not apply the attack countermeasure properly and produce some timing leakage with an additional random number generation:

decrypt the ciphertext: m := dec(c)
if ( (m ≠ 00||02||PS||00||k) OR
     (|k| ≠ 48) ) then
   generate a random PMSR
   proceed with PMS := PMSR
else
   proceed with PMS := k

As can be seen, a random PremasterSecret is generated only in a case the message has a valid structure. However, further analysis showed that it is not possible to execute a practical attack. First, the timing difference between valid and invalid messages was very small. Second, the probability of getting a valid message was very very low, thus executing the attack would last too long.

Side channel 3: Additional internal exception in JSSE

This attack was again applied on JSSE and it is actually my favourite one. The side channel comes from an additional exception, which was generated inside of the PKCS#1 Java method. In short, the PKCS#1 unpadding method worked as follows:

private byte [] unpadV15 (byte[] padded) throws BadPaddingException {
   if (not PKCS compliant) {
       throw new BadPaddingException();
   } else {
       return unpadded text;
   }
}

As can be seen, an additional BadPaddingException was generated in case the PKCS#1 structure was invalid. This exception was caught on different layers so the client could not observe any different message. However, exception processing in Java (and in other object oriented languages as well) is pretty expensive (think of generating the whole stack trace etc.). We evaluated that an additional exception generation lasts about 20 microseconds. And we could really exploit this behaviour.
Based on the timing observations, we could differentiate valid from invalid messages. Longer response times mean an exception was thrown (the message is invalid). Shorter response times indicated no exception was thrown and the message had a valid PKCS#1 structure:

We were able to construct an oracle using this behaviour and execute a whole Bleichenbacher attack in a LAN environment. We needed about 20 hours and 20,000 oracle requests to decrypt a PremasterSecret. To determine a message validity, for each oracle request we needed to send about 750 server requests. Thus, a few millions of messages in total were sent to the server.

Side channel 4: Unexpected behaviour by some hardware appliances

In our lab we had a possibility to test the TLS implementation of IBM Datapower (a specific XML Security hardware appliance) which had a very specific timing leakage resulting from an improper PKCS#1 structure verification. The TLS framework checked only the second 0x02 byte in the PKCS#1 message. In case the second byte was equal to 0x02, the processing lasted a few microseconds longer than in by processing different ClientKeyExchange messages:

We knew from our previous timing attacks on JSSE that such a timing leakage can be practically exploited. However, the constructed oracle was not the same as the oracle used by Bleichenbacher. The longer response times were produced not only by messages starting with 0x00 0x02, but also with messages starting with 0x01 0x02, 0x02 0x02, etc. At the end, this showed up to be a real benefit since we could adapt the original attack and make it much more performant with such an oracle. Our new attack needed only about 4700 oracle queries to decrypt a PremasterSecret. The whole attack took about 40 hours and we issued about 4 million queries in total.

During our research, we found out the problem was not caused by the IBM Datapower appliance, but by the underlying Cavium accelerator chip. Thus, the same attack was very likely applicable to different appliances from F5 (confirmed by F5), Cisco, Citrix or Juniper Networks as well.

What did we learn?

The results showed how important it is to implement crypto libraries with constant timing functions. We showed that even a tiny side channel from an additional exception could lead to catastrophic results and could be used to execute the whole Bleichenbacher attack over LAN. Constant timing becomes even more relevant since crypto is moved to the browser (see Web Crypto API), where the attacker could execute "localhost-like" attacks.
In addition, we motivate for implementation of pentesting tools for TLS and other crypto libraries.

FAQ:

Heartbleed, CCS, Tripple Handshake...why do you not have a cool name for this attack?

These attacks are not new attacks on the TLS protocol. We just show how Bleichenbacher attacks (from 1998) can still be applied to several PKCS#1 implementations using new practical side channels.

We hope that our research will be accepted by the community also without having a new attack name :)

Are the described applications fixed?

Most of the attacks have been fixed during our research:

The direct attack 1 on Java was fixed in October 2012 (with JDK 6u37).
The timing attack on Java was fixed in January 2014 (with JDK 7u45, 6u65).
Cavium Accelerators were fixed in the recent months, please take a look at your vendor's site about the current status. For example, IBM and F5 track these problems in their security bulletins and under these CVE numbers: CVE-2014-4024, CVE-2014-0852.
We also proposed a fix for the Bouncy Castle library. ~~However, our messages were ignored. The library (to the best of our knowledge) remains vulnerable to the timing attacks, C# as well as Java version.~~ Java library was fixed a few weeks ago: http://www.bouncycastle.org/jira/browse/BJA-487. C#???
We did not notify developers of other libraries since the attacks showed up not to be practical.

My application was vulnerable to this attack. Do I need to change the server's private key and certificate now?

No, this attack was not as dangerous as for example heartbleed. It does not enable an attacker to recover server's RSA private key.

However, consider that the attack could have enabled an attacker to decrypt specific TLS connections transporting secret data.

Should I use RSA PKCS#1 v1.5 in my implementation?

Even though RSA PKCS#1 v1.5 is a very old and in general insecure standard, it is still used in many many applications and standards. For example, in 2012 we showed that even if all countermeasures are correctly implemented, there still exist ways to execute Bleichenbacher attacks in XML Encryption [xml-enc]. For this reason, some of the XML frameworks restricted usage to RSA PKCS#1 v2.0 (also known as RSA-OAEP). We assume, there are more applications beyond TLS/XML Encryption which remain vulnerable.

If your implementation uses RSA PKCS#1 v1.5, we recommend you switch to RSA-OAEP.

My talk at Deepsec: https://vimeo.com/116972878

On Web-Security and -Insecurity

Wednesday, August 20, 2014

Old attacks on new TLS implementations - or how a tiny side channel can break your crypto