SSL Handshake Denial of Service


#1

Notice
This bulletin was posted on March 4th, 2019 and has been updated as of March 7th, 2019

Background
On Sunday February 24th the Sanic Community was contacted about a recently
discovered issue when implementing SSL directly through Sanic. The core development
team reviewed options and impact and has identified several remediation paths.

Description
Sanic running under versions of Python prior to 3.7 is impacted by a possible
denial of service whereby the SSL handshake is left in an incomplete state,
causing connection exhaustion or memory exhaustion and preventing further
connections. This is due to Python not implementing a timeout for the SSL
handshake prior to version 3.7. Ref: https://bugs.python.org/issue29970

Who is affected?
Anyone using Sanic to terminate SSL connections on Python 3.6.x or earlier
without uvloop (for example, when Sanic was installed with SANIC_NO_UVLOOP,
as in Alpine-based containers).
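
A quick way to check whether a given environment falls into the affected group is sketched below. This is only an illustration of the criteria above, not part of Sanic: the function name and its parameters are hypothetical, and it assumes that an importable uvloop is actually the loop Sanic ends up using.

```python
import sys


def handshake_timeout_available(version_info=sys.version_info, uvloop_installed=None):
    """Return True if an SSL handshake timeout should be in place.

    Hypothetical helper: Python 3.7+ has the timeout built in, and uvloop
    provides its own on older versions.
    """
    if uvloop_installed is None:
        try:
            import uvloop  # noqa: F401  (only checking that it is importable)
            uvloop_installed = True
        except ImportError:
            uvloop_installed = False
    return tuple(version_info[:2]) >= (3, 7) or uvloop_installed
```

Calling `handshake_timeout_available()` with no arguments inspects the running interpreter; a `False` result means one of the remediation options applies.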

Remediation options, in order of preference

  • Update to Python 3.7 - the fix is included in Python itself
  • Terminate SSL at a load balancer or a proxy (such as AWS’ ALB or nginx)
  • Implement the patch created by Richard Kuesters - see: https://github.com/vltr/sslproto37
    The repository includes simple examples, including how to reproduce the DoS
    issue. Earlier revisions of this bulletin indicated an increase in overhead;
    however, the patch backports the Python 3.7 handshake change, so no
    additional impact is expected.
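
For reference, the fix in Python 3.7 surfaces in asyncio as the `ssl_handshake_timeout` argument to `loop.create_server()` and `asyncio.start_server()`. The sketch below uses plain asyncio rather than Sanic's own API; `handle` and `start_tls_server` are hypothetical names, and the 10 second value is an arbitrary choice for illustration.

```python
import asyncio
import ssl


async def handle(reader, writer):
    # Hypothetical handler: close the connection once the handshake is done.
    writer.close()


async def start_tls_server(ssl_context, host="127.0.0.1", port=0, handshake_timeout=10.0):
    # ssl_handshake_timeout exists only on Python 3.7+. A connection whose
    # TLS handshake does not finish within this many seconds is aborted.
    return await asyncio.start_server(
        handle,
        host,
        port,
        ssl=ssl_context,
        ssl_handshake_timeout=handshake_timeout,
    )
```

In real use the `ssl_context` would be an `ssl.SSLContext` with a certificate loaded through `load_cert_chain()`. When the argument is omitted, Python 3.7+ still applies a default handshake timeout, which is why upgrading alone is enough.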

#2

I’m working on finishing the patch, which is completely different from the first prototype I created. In fact, it’s just a new event loop policy that, under the hood, creates a backport of Python 3.7’s SSLProtocol, which already has this implemented.
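
To give a rough idea of what "a new event loop policy" means here, this is a minimal sketch of the mechanism only. It is not the code from the patch: `PatchedEventLoop` and `PatchedEventLoopPolicy` are hypothetical names, and the loop subclass is an empty placeholder where a real backport would plug in its SSLProtocol.

```python
import asyncio


class PatchedEventLoop(asyncio.SelectorEventLoop):
    """Placeholder loop (hypothetical).

    A real backport would make this loop build its SSL transports from a
    backported SSLProtocol that enforces a handshake timeout.
    """


class PatchedEventLoopPolicy(asyncio.DefaultEventLoopPolicy):
    def new_event_loop(self):
        # Every loop created through this policy is the patched one.
        return PatchedEventLoop()
```

An application would opt in once at startup with `asyncio.set_event_loop_policy(PatchedEventLoopPolicy())`, so nothing else in the code base has to change.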

It’s worth mentioning that uvloop implements an SSL handshake timeout on its own and, if available, is used by Sanic by default on Python 3.5 and 3.6, leaving this as a lesser problem for us to mitigate.

I’m just working on creating unit tests for this new event loop policy (which adds no overhead). As soon as it’s available, I’ll drop a word here and start working on two PRs: one for the 18.12 LTS branch and one for master.


#3

I’ll update the bulletin once you get the patches completed, @vltr!


#4

Thanks, @sjsadowski

I just uploaded the (main) code to its repository. There is some info there that you can use to expand the bulletin, as well as simple examples (in the examples folder) where you can actually see the possible DoS problem for yourself.

I have already drafted the PR for Sanic, but it needs some cleanup and proper testing, as well as taking into consideration some aspects of the backport itself so that it fits into Sanic. I’ll also send this to the mailing list that @ahopkins started so that Tom and Phil can use the code. I hope to finish it soon.

(My brain feels like a pudding today, I ate a lot of letters in the commit message … Oh my).

Richard.


#5

All - I had to update the repository to remove a print that I forgot inside sslproto37.py and to add some better unit tests that handle results properly (and don’t just “guess”).


#6

@sjsadowski, some considerations to update your bulletin:

You can add that if the user / developer is using uvloop with Sanic, then they are not vulnerable to this bug, since uvloop provides its own SSL handshake timeout and works fine under Python 3.5 and 3.6.

This new “solution” is simply a backport of Python 3.7 SSLProtocol. It does not add any overhead except for the SSL handshake timeout task itself, which is implemented in Python 3.7. If you do not use SSL, there will be no overhead (and you would not need this backport anyway).
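
To illustrate why the overhead is limited to one timeout task, here is a simplified sketch of the idea: one timer per connection that aborts it if the handshake has not completed in time. This is not the actual CPython or sslproto37 code; `HandshakeGuard` and its `abort` callback are hypothetical names.

```python
import asyncio


class HandshakeGuard:
    """Simplified illustration: abort a connection whose handshake stalls."""

    def __init__(self, loop, timeout, abort):
        self.done = False
        self._abort = abort
        # One scheduled timer per connection is the only extra work.
        self._handle = loop.call_later(timeout, self._on_timeout)

    def handshake_complete(self):
        # Called when the handshake finishes; the timer is no longer needed.
        self.done = True
        self._handle.cancel()

    def _on_timeout(self):
        if not self.done:
            self._abort()
```

A connection that finishes its handshake cancels its timer and is never touched again, while a stalled one gets aborted instead of holding a socket open indefinitely.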

:wink:

EDIT: I plan to make the Sanic PRs this weekend, if that’s OK with you :wink:


#7

I think that statement was made based on the “ugly POC” - I’ll revise!


#10

Yep, that “ugly POC” was indeed ugly as hell and prone to errors, as I later discovered :grimacing: