New streaming API and other major changes

I have reimplemented Sanic HTTP handling in pure Python and actually gained 20 % faster performance (53000 req/s against 44000 req/s without these changes), with several hundred lines less code than the old implementation. This is a large patch with implications to all Sanic users, so I would invite public discussion on how the new API is going to look like, and on how much havoc are the changes creating in existing applications.

The pull request is here:

Please do test it, particularly if you are developing extensions or larger applications based on Sanic. Let me know what problems there are, and we’ll see how they are best fixed.

The main motivation for this was to make streaming a first class feature: now all requests and responses are streamed and there is no separate implementation for non-streaming mode. Simple responses returned from handlers still work as before because Sanic automatically handles the streaming outside of the handler call.

The new streaming API does not require callbacks and also fixes a long-standing problem with request maximum size because streaming handlers now have the opportunity to adjust maximum size prior to receiving request body:"/echo", stream=True)
async def echo(request):
    # Lift the size limit = float("inf")
    # Spawn a streaming response
    response = request.respond(content_type=request.content_type)
    # Stream request body back to sender
    async for data in
        await response.send(data)

Callback-based StreamingHTTPResponse is still supported, but deprecated in favour of the new API.

Reading and writing data is implemented with async send/receive functions which avoid some buffer queues and allow for automatic backpressure control. Another interesting effect of this is that only the server code depends on asyncio facilities in any significant manner. Because of this it becomes much easier to implement alternative backends such as a Trio server.

The pull request has a more comprehensive listing of things removed, possibly broken or otherwise affected, so please have a look if you believe this might concern you. Most importantly, again, please do test this with your stuff.

If all goes well, I expect this to land in Sanic 20.6, released in June.

1 Like

Hi @Tronic
This looks great. I’m excited to look through the code to evaluate the changes.

A bit of context, I tried eliminate the need for callbacks in streaming back when I refactored streaming responses in 2018 in

However back then we were supporting Python v3.5, so couldn’t use async for or async with, so async callbacks were the only way we could do it. Now we have Python v3.6 minimum, I think its sensible to move to this new way, though it will be a big change for users with existing streaming response code, when upgrading.

I’ve already rewritten StreamingHTTPResponse to internally use the new API. It is going to get marked deprecated but should remain functional as long as we want to keep it.

Streaming requests by stream=True and then await should also function just as before from app developer viewpoint.

The breaking changes are a bit more subtle, e.g. with max request size and timeout handling. Anyone using custom protocols will be entirely broken (quite possibly that feature should be dropped entirely), and currently I am planning to drop the request body_init, body_push, body_finish methods in favour of a simpler async method receive_body that may be overridden to accomplish the same, but that can also be called by streaming handlers themselves. Response middleware are also in a churn, in case they are interested on response body and not only the headers. One possibility is to make middleware only able to access the body in a streaming manner but that is obviously a major breaking change.

The big and urgent question is what kind of mitigation and support APIs can be provided in 20.3 to ease the transition, and which functions should already be marked deprecated in this release.

I’ve made the default request maximum size for streaming handlers infinite, so that line in the example code is no longer needed. Optionally one may set on per request basis, if limits are desired. app.config.REQUEST_MAX_SIZE still affects the maximum header size and non-streaming routes’ bodies, as well as the maximum input data buffered by Sanic (e.g. if the handler is busy doing something other than reading).

A related breaking issue is that the default value of 100 MB is too large and allows crashing Sanic servers by sending in very large requests so that the server runs out of RAM. Until this patch lands, it is hard to reduce that without affecting the sizes of files that can be uploaded, but eventually (maybe at 20.9, and once form handling also supports streaming?) the default value should be reduced to something in the order of 1 MB.

@Tronic, Just want to let you know I have this on my radar. I apologize I have not been able to look closer and provide feedback. I appreciate the effort you put into this. Just been a couple of ridiculous weeks for me. :dizzy_face:

Likely I will get back with some thoughts by tomorrow evening.

Reviewing and will respond with some specifics in the PR.

In general, I am hugely in favor of this. This looks like a potential great step forward. As far as deprecation goes, I think we need to get some warnings in there ASAP. Which at this point might be tight to get into 20.3. And, without a good policy yet on deprecations, I would be hesitant to green light this for 20.6 just yet.

I am glad you took out the need to place float("inf"). It is not a hugely used part of Python and might trip up some users.

I ran this against sanic-jwt, not bad for a first pass.

FAILED tests/ - AssertionError: assert b'Error: Method GET not allow...
FAILED tests/ - AssertionError: assert b'Error: Requested URL /au...
FAILED tests/ - AssertionError: assert b'Error: Meth...
FAILED tests/ - AssertionError: assert (b'<!DOCTYPE html><meta chars...
=============================== 4 failed, 197 passed, 2 xfailed, 197 warnings in 20.45s ================================

There is a separate PR #1800 for 20.3 deprecations. Breaking existing things should be avoided as far as possible but not all internals can be protected and especially extensions are likely to use things that might be considered internal (not part of tests and not preserved in the streaming branch). Fortunately some relevant deprecations like the output functions on response objects have been in place for quite some time already.

Are the sanic-jwt tests passing against current master? Looks like the failures are about unexpected error message strings.

Of course. I am not advocating for a static internal API, or even developer facing.

As for sanic-jwt, I had not tested it since the last release. The tests seem to have nothing to do with this PR, but rather #1768.

As to the issues at hand:

Request size

but requests are no longer rejected because of body size until the body is received.

This seems like one of the more impactful changes that developers may come across. And, does raise some concerns. I know you are advocating lowering the limit (which would bring it in line with the nginx default, but it still has the potential for harm since the check will not occur until after the body is received. So, if I do want to increase to 100Mb, my server will need to receive all 100 before it fails on 101.

This brings me back to something that I really would like to implement: route level config changes.

@app.route(..., request_max_size=100)
async def upload(request):


The middleware remains a big problem.

Response middleware issues more or less resolved

What is the state of middleware support?


Since the core of this change is HTTP parsing, and while I am not calling into doubt the quality of work, I am wondering how best to introduce this change. There are sort of two competing thoughts I see on this:

  1. We aim for 20.6 which still provides 6 months to deal with stabilizing it before an LTS; or
  2. We implement it as an “opt-in” or “beta” feature

I realize that option 2 would require a lot of work, and a high level of complexity. But I am also worried that there is such a huge amount of new tests we need to cover that, as we know, we are bound to miss something.

API changes

I have found myself overriding these, from time to time. I think we can still keep these in the API if for nothing else as hooks. If not, then perhaps we add calls to them with a warning that the methods have been removed, and then completely remove them after a further release.


Final thoughts

@Tronic, what do you see as the biggest remaining gaps/obstacles? Test coverage?

Request size limit is enforced when starting to receive the body, or when the known size at any phase becomes larger than the current limit. For instance, someone may send content-length header with a very large number, causing the request to be rejected as soon as either the handler or the internal logic (for non-streaming routes) attempts to read request body. Similarly in chunked encoding at some point a chunk with very large size may appear, causing instant rejection without actually receiving until the limit is hit.

However, the asyncio protocol level receive buffer needs to be made smaller. The current code just uses REQUEST_MAX_SIZE for that but it should rather be around 64 KiB instead. Otherwise the sender may send too much before the app even tries to receive any of it. Whether this needs to be configurable, like on websocket, and whether the send buffer also needs to be, is an open question.

Middleware issues: I’ve found no adequate way of handling streaming bodies but AFAIK all standard functionality is in place.

HTTP parsing does not check for invalid header names (like httptools does) and might be more strict about newlines (CRLF vs. LF) and charsets (line-by-line autodetection of those would be slow).

The response API is still in flux and there are several ways to go. The current approach cannot support await file_stream("filename") to actually stream the response because it has no access to the request, stream or any other objects to do so. It is possible to fix that by async-local global variables, as is done by Quart because it has to emulate Flask’s thread-local request and response globals. This would also avoid the need for request.respond() as I’ve currently circumvented this issue, but there are also pretty big drawbacks in that design.

The refactoring is related. I think I’ll need a new “glue” class to hold various things now scattered across, and that really are shared between all of them.