
The trouble with new TLS version numbers

September 28, 2016

This article was contributed by Hanno Böck

The TLS working group in the IETF is currently working on the next version of the encryption protocol: TLS 1.3. The new protocol will bring performance improvements by avoiding round trips and will deprecate a lot of dangerous cryptographic constructions. But, apart from technical improvements, it will also bring something that may seem trivial, but that could cause a lot of trouble: a new version number. That will probably lead to a redesign of the TLS version-negotiation mechanism.

When a new version of a protocol gets introduced, there must be some mechanism to keep compatibility with existing implementations. Not everyone will move to TLS 1.3; many legacy implementations will keep using TLS 1.2 or older versions for years to come.

TLS uses a version-negotiation mechanism that may seem relatively simple, but it has been the source of a surprising number of problems. When a client connects to a server, it sends the highest version number it supports in the ClientHello message. The server can reply with any version equal to or lower than that. So, if a client connects with a maximum version number of 1.2 and the server only supports TLS 1.0, the server will answer with that version. As long as the client still supports TLS 1.0, a successful connection can be established.
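The selection logic a correct server should implement can be sketched in a few lines of Python (the version constants here are purely illustrative orderings, not real wire values):

```python
# Sketch of TLS-style version negotiation: a correct server picks the
# highest version it supports that does not exceed the client's offer.
# The constants are illustrative, not on-the-wire encodings.

TLS_1_0, TLS_1_1, TLS_1_2 = 1, 2, 3

def negotiate(client_max, server_supported):
    """Return the version a correct server should select, or None."""
    candidates = [v for v in server_supported if v <= client_max]
    return max(candidates) if candidates else None
```

A client offering TLS 1.2 to a server that only knows TLS 1.0 still gets a connection: `negotiate(TLS_1_2, {TLS_1_0})` yields `TLS_1_0`.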

This ideal case often doesn't occur, however, due to faulty server implementations. Many servers simply fail when a client tries to connect with a higher TLS version than they support. The failure can happen in a variety of ways: some servers terminate the connection at the TCP level or send a TLS error alert, others simply wait until a timeout happens. Some even successfully send a TLS ServerHello and almost complete a handshake, but fail later during verification of the Finished message, which is the last part of the handshake. All of these behaviors are bugs in the server software.

Version intolerance

This problem is known as "version intolerance" and it has cropped up every time browsers and TLS implementations have introduced new protocol versions. An old web page documents the problem; it was written by Netscape in 2003 and can be found in the Mozilla wiki. Most of the affected devices were enterprise TLS appliances, although occasionally free implementations like OpenSSL were also affected.

Browser vendors have reacted to these problems with a questionable strategy: after a connection failure, the browser tries to reconnect with a lower TLS or SSL version. Back then, the only versions in widespread use were SSL 3 and TLS 1.0. While this avoided problems with broken servers, it introduced another problem: these downgrades occasionally happened because of packets dropped on bad network connections. As a result, protocol features only supported in TLS 1.0 would intermittently stop working.
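A hypothetical sketch of the client-side retry loop shows why this was fragile: the loop cannot tell a version-intolerant server from an ordinary network timeout, so flaky networks triggered spurious downgrades.

```python
# Hypothetical sketch of the browser "downgrade dance". try_connect is
# any callable returning True on a successful handshake; a timeout on a
# bad network looks exactly like a version-intolerant server here.

def connect_with_fallback(try_connect, versions):
    """Try each version, highest first; return the first that works."""
    for version in versions:
        if try_connect(version):
            return version
    return None  # every version failed
```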

One extension that TLS 1.0 introduced is called Server Name Indication (SNI); it removed a limitation of the old SSL protocol by allowing multiple domains with different certificates to be hosted on the same IP address. SNI allows shared hosting services, which often host hundreds of websites on the same IP, to deploy HTTPS. The deployment of SNI was severely hampered by the browsers' version fallbacks, because visitors would randomly see the wrong certificate due to a connection downgrade to SSL 3.

The version fallbacks also introduced security issues. If browsers try to reconnect with a lower TLS or SSL version, then a man-in-the-middle attacker can force these version downgrades by blocking ClientHello messages with higher version numbers. At the Black Hat USA conference in 2014, Antoine Delignat-Lavaud presented an attack called "virtual host confusion" (YouTube video, paper [PDF]). The attack exploited the fact that an attacker can disable SNI by a forced version downgrade.

Later that year, Bodo Möller, Thai Duong, and Krzysztof Kotowicz discovered the POODLE attack — a padding oracle attack that exploits the fact that in SSL 3 the padding of the encryption was undefined and could have any value. But that alone wouldn't have been very interesting, because at that time SSL 3 was rarely used. In combination with version fallbacks, however, POODLE became a severe issue because almost all servers and clients still supported SSL 3. With version downgrades it was easy to force a connection to use the old protocol. The POODLE paper introduced the term "protocol downgrade dance" for the downgrade behavior of browsers.

In response to these kinds of problems, a mechanism called the "Signaling Cipher Suite Value" (SCSV) was introduced. By including a special cipher-suite value in a fallback connection, a client could signal that it was retrying at a downgraded version; a server that actually supported a higher version would then refuse the connection. SCSV was standardized as RFC 7507, but it quickly became almost obsolete, because browser vendors decided that they could get rid of the questionable version fallbacks entirely.
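The server-side check from RFC 7507 can be sketched as follows; the version integers are illustrative, but 0x5600 is the real reserved cipher-suite value:

```python
# Sketch of the RFC 7507 fallback signal: a client retrying at a lowered
# version includes TLS_FALLBACK_SCSV in its cipher-suite list. A server
# that supports something higher than the offered version then knows the
# downgrade was spurious (an attacker or a glitch) and refuses.

TLS_FALLBACK_SCSV = 0x5600  # real value from RFC 7507

def server_accepts(client_version, cipher_suites, server_max_version):
    """Return False when an inappropriate fallback is detected."""
    if TLS_FALLBACK_SCSV in cipher_suites and client_version < server_max_version:
        return False  # inappropriate fallback: abort the handshake
    return True
```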

SCSV is notable, though, because it is a feature for the TLS standard that exists solely to work around buggy implementations. But it's not the only such feature. Some devices from the company F5 fail to allow connections if a handshake has a size between 256 and 512 bytes. Therefore a padding extension was introduced that simply expands the handshake to avoid those sizes. However, it later turned out that this solution would cause other implementations to fail, because they don't accept handshakes larger than 512 bytes.
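The arithmetic of that workaround (later standardized as the RFC 7685 padding extension) can be sketched simply; this simplified version ignores the corner case where fewer than four bytes of headroom remain for the extension's own header:

```python
# Simplified sketch of the ClientHello padding workaround: if the
# handshake message would fall into the range that trips up the buggy
# devices, append a padding extension that grows it to exactly 512
# bytes. (The sub-four-byte headroom corner case is ignored here.)

def padding_ext_len(client_hello_len):
    """Total size of the padding extension to append, or 0 if none."""
    if 256 <= client_hello_len < 512:
        return 512 - client_hello_len
    return 0
```

For example, a 300-byte ClientHello gets a 212-byte padding extension, landing at exactly 512 bytes, which, as noted above, is itself too large for other broken implementations.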

The return of fallbacks

Despite all the drama version fallbacks have caused, they may make a comeback. In a recent blog post, Google developer Adam Langley commented:

It's taken about 15 years to get to the point where web browsers don't have to work around broken version negotiation in TLS and that's mostly because we only have three active versions of TLS. When we try to add a fourth (TLS 1.3) in the next year, we'll have to add back the workaround, no doubt.

Langley was certain that there is no way to avoid TLS version fallbacks when TLS 1.3 gets introduced. The reason is that currently about three percent of the major web pages have problems with TLS 1.3 handshakes. In theory, browser vendors could skip the fallbacks and simply break non-compliant sites; however, that's unlikely to happen. A browser that breaks a large number of sites and devices will likely face a backlash from users and may push those users to choose another browser. Chrome has often faced heavy criticism from users when it deprecated insecure mechanisms in the past. When Google deprecated insecure Diffie-Hellman parameters, for example, it broke connections to the Cisco RV042G router. While it is obvious that Cisco was at fault, the user reactions in Chrome's public forum blamed Google for its effort to make the Internet more secure.

TLS 1.3 contains a mechanism similar to SCSV that could avoid the worst consequences of version intolerance. By sending a specific value in the random-number field of the handshake, a server can indicate that it doesn't want downgraded connections. Still, this is far from ideal, as it adds another layer of complexity. Ideally, vendors should just fix their TLS implementations.
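A sketch of the idea, using the eight-byte marker from the TLS 1.3 draft for a downgrade to TLS 1.2 (the same value was kept in the eventual RFC): a TLS 1.3-capable server that negotiates an older version embeds the marker in its ServerHello random, and a TLS 1.3 client that sees it knows the downgrade was not a genuine server limitation and can abort.

```python
# Sketch of the TLS 1.3 downgrade sentinel in the ServerHello random.
# DOWNGRADE_TLS12 is the real marker ("DOWNGRD" plus 0x01) defined for
# a TLS 1.3 server that ends up negotiating TLS 1.2.
import os

DOWNGRADE_TLS12 = bytes.fromhex("444f574e47524401")  # "DOWNGRD" + 0x01

def server_random(negotiated_at_most_tls12):
    """Build the 32-byte ServerHello random, embedding the sentinel."""
    rand = os.urandom(32)
    if negotiated_at_most_tls12:
        rand = rand[:24] + DOWNGRADE_TLS12  # overwrite the last 8 bytes
    return rand

def client_detects_downgrade(rand):
    """A TLS 1.3 client aborts if it finds the sentinel after a downgrade."""
    return rand.endswith(DOWNGRADE_TLS12)
```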

Vendor responses

The vendors responsible for broken version negotiation mostly don't seem to care much. I have tried to identify the affected vendors. Many of the buggy web pages use Citrix NetScaler devices. Citrix informed me that it is aware of the problem, although it doesn't consider it to be a security issue. Citrix was unable to give any timeline for when the bug will be fixed.

Several products from IBM, among them IBM HTTP Server and Lotus Domino, are also affected. At first, IBM security simply denied that there was a problem and claimed that the issue was already fixed in the current HTTP Server release. After I informed them that I had tested with the latest release and that it was still affected, the company looked into it. IBM informed me that it doesn't treat the issue as a security vulnerability. IBM was unable to give a concrete timeline for when a fix will be available, but said it will likely happen with the next version of its TLS implementation, GSKit, which will be released by the end of the year. A while later, IBM went back into denial mode and informed me that the issue was closed because the company was unable to reproduce it, even though it had already confirmed that it was working on a fix.

So two major vendors didn't consider this issue a security vulnerability and didn't see any urgency to tackle it. While it is true that the issue itself doesn't cause a security problem for a device's owner, past experience has shown that, down the line, these bugs can cause security issues, because they force client implementations to adopt dangerous behavior.

The third vendor that could be identified was Cisco; version intolerance affects its ACE load-balancer devices. These devices are out of support and no longer receive updates. Cisco made it clear to me that it won't consider any exceptions to its end-of-life policy, so people who still use these devices will have to live with the bug, with no way of fixing it. Cisco did promise to verify whether devices that are still supported are also affected. As the software of these devices is proprietary, there is no way for users to fix the bugs themselves.

I also tried to contact operators of major affected web pages, but with limited success. The most notable web pages that fail with a TLS 1.3 handshake are apple.com, ebay.com, and various localized versions of PayPal. In many cases, only connections without a leading www are affected. The reason is probably that the www version of a site is often served by a content delivery network, while the domain without www is handled by another device that simply forwards connections.

Apple and eBay didn't answer questions about their version intolerant web services; both sites are still affected. PayPal simply said that TLS issues aren't covered by their bug bounty program, but refused to discuss the issue any further.

Server operators can test their servers for TLS version intolerance with the SSL Labs test or with the testssl.sh tool. Both tests have limitations and don't catch all instances of version intolerance. The most reliable way to test right now is to use a Beta or Dev channel release of Chrome and manually enable TLS 1.3 (via the chrome://flags option "Maximum TLS version enabled"), or to use Firefox Nightly (set "security.tls.version.max" and "security.tls.version.fallback-limit" to "4" in about:config). Trying to access version-intolerant sites that usually support HTTPS will result in a connection failure.

Rethinking version negotiation

Given the situation, Google developer David Benjamin proposed a different route with a redesign of the whole version negotiation mechanism. He suggested that the version could be negotiated with an extension that sends a list of supported newer versions. Obviously the same problem with version intolerance could happen again with such a solution in the future: servers may simply not work if they see any version in the extension that they don't know.

To avoid this, Benjamin proposed that browsers could randomly send bogus version numbers, reserved with a guarantee that they will never be used for any real TLS version. Any correct implementation should simply ignore all unsupported version values. Bugs in servers that fail when they see an unknown version number would then be discovered much earlier, so they would probably never make it into production releases. It is still possible that vendors could implement this incorrectly by special-casing and skipping only the known reserved bogus values, rather than ignoring all unknown values. However, it is hard to imagine doing that without deliberately trying to create non-compliant software.

Benjamin also proposed a generalized variant of this mechanism under the name Generate Random Extensions And Sustain Extensibility (GREASE). The same way that bogus version numbers are sent could be used for extensions and cipher suites to avoid bugs in those areas.
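A sketch of the GREASE salting for version negotiation follows; the 0xNANA values below are the ones eventually reserved for this purpose, although at the time of the article this was still a draft.

```python
# Sketch of GREASE applied to a proposed version-list extension: the
# client salts its offered versions with a reserved bogus value, and a
# compliant server must skip anything it does not recognize instead of
# failing. The 0xNANA values are the reserved GREASE code points.
import random

GREASE_VALUES = [0x0A0A + i * 0x1010 for i in range(16)]  # 0x0a0a .. 0xfafa

def client_version_list(real_versions):
    """Offer the real versions plus one random reserved bogus value."""
    return [random.choice(GREASE_VALUES)] + list(real_versions)

def server_select(offered, supported):
    """A compliant server ignores unknown values rather than choking."""
    known = [v for v in offered if v in supported]
    return max(known) if known else None
```

A server that instead errors out on the unknown value would fail on its first day of testing against such a client, which is exactly the point.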

The proposal for TLS version negotiation via an extension was received with skepticism during the last IETF conference in Berlin. It would further complicate an already complicated handshake. The existing ClientHello already contains two version numbers: the TLS record-layer version and the real ClientHello version. The TLS record-layer version never had any real meaning, so most implementations simply set it to the version value of TLS 1.0 and ignore it. TLS 1.3 will make this official and says that it must be ignored. What further adds to the confusion is that the version numbers sent over the wire don't match the version numbers of the protocol. For historic reasons — all versions of TLS came after SSL version 3 — TLS 1.0 is indicated by the value pair {3, 1}, and TLS 1.3 will be {3, 4}.
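The wire-encoding quirk described above amounts to a small lookup table: every version since SSL 3 keeps major number 3 and just increments the minor number.

```python
# The on-the-wire {major, minor} byte pairs for each protocol version,
# per the historic quirk that all TLS versions came after SSL 3.
WIRE_VERSION = {
    "SSL 3.0": (3, 0),
    "TLS 1.0": (3, 1),
    "TLS 1.1": (3, 2),
    "TLS 1.2": (3, 3),
    "TLS 1.3": (3, 4),
}
```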

The TLS community was therefore uneasy about adding another layer of complexity. But Benjamin's latest proposal got more support on the mailing list than during the IETF conference. It now has the status of a rough consensus and will most likely be part of TLS 1.3.

The GREASE strategy is an interesting new paradigm for designing protocols in an ecosystem where many vendors ship low-quality products that implement specifications incorrectly, yet there is a need to stay compatible with an existing infrastructure of defective devices. Similar strategies have been used in other cases. HTTP/2, for example, is not negotiated over a normal HTTP request; instead, a TLS extension mechanism called Application-Layer Protocol Negotiation (ALPN) is used to negotiate the higher version.

David Benjamin's GREASE concept goes one step further and tries to anticipate potential failures. He has tried to design a protocol where bugs will show up before products are shipped. It will be interesting to see whether this leads to a less fragile TLS ecosystem.


Index entries for this article
Security: Transport Layer Security (TLS)
GuestArticles: Böck, Hanno



The trouble with new TLS version numbers

Posted Sep 29, 2016 7:47 UTC (Thu) by sourcejedi (guest, #45153) [Link]

Google have just published the draft spec for a protocol called Roughtime, which allows clients to determine the time to within the nearest 10 seconds or so without the need for an authoritative trusted timeserver. One part of their ecosystem document caught my eye – it's like a small "chaos monkey" for protocols, where their server intentionally sends out a small subset of responses with various forms of protocol error.

The trouble with new TLS version numbers

Posted Sep 30, 2016 8:17 UTC (Fri) by HybridAU (guest, #85157) [Link]

Hanno, that was a brilliant write up! I really enjoyed it, I thought I knew a lot about TLS but much of that article was news to me.

Thanks :-)

The trouble with new TLS version numbers

Posted Sep 30, 2016 17:30 UTC (Fri) by hkario (subscriber, #94864) [Link]

You can test whether a server implements version negotiation correctly when the client sends a protocol version higher than the highest supported by the server, using tlsfuzzer: https://github.com/tomato42/tlsfuzzer/blob/master/scripts...

Other ideas to mark (i.e. punish) non-conforming servers

Posted Oct 5, 2016 18:11 UTC (Wed) by robbe (guest, #16131) [Link] (1 responses)

1. decorate the addressbar in some way
2. give severe penalties (or outright de-list) in Google search results
3. make the users click through some warning before opening the site
4. mark the site as technically shoddy in search results

Other ideas to mark (i.e. punish) non-conforming servers

Posted Oct 6, 2016 21:35 UTC (Thu) by magfr (subscriber, #16052) [Link]

All of these still fail to handle enterprise firmware management systems and their users are usually well aware of their general shoddiness.


Copyright © 2016, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds
