Request.stream.read() with max file size and parsing

cx42net · November 10, 2022, 4:49pm

Hi!

I’m trying to allow uploads larger than the maximum request size configured in Sanic. I know the limit can be changed globally, but I don’t want to raise it except for this one endpoint.

So I’ve followed the suggestions in the docs but ran into a few questions:

How can I read the stream chunk by chunk as it arrives and stop as soon as the size limit is reached?
How can I make sure a sender can’t tie up the handler by starting an upload and then waiting forever?
How can I process what I have just received?

Here’s what I’ve implemented, and I have a few questions about it:

from sanic.request import parse_multipart_form, parse_content_header, DEFAULT_HTTP_CONTENT_TYPE

@app.post('/<id:int>/upload', stream=True)
async def upload(request, id):
    MAX_FILE_SIZE = 14  # In MB (megabytes)

    # Reject early when the declared Content-Length already exceeds the limit
    if (int(request.headers.getone('content-length', '0')) or 0) > MAX_FILE_SIZE * 1000 * 1000:
        abort('You can upload up to {}MB.'.format(MAX_FILE_SIZE), 400)

    body = b''
    for _ in range(MAX_FILE_SIZE * 1000):  # Assumes each chunk is at least 1 kB (a guess; I don't know the real chunk size)
        chunk = await request.stream.read()
        if not chunk:
            break
        body += chunk

        # Content-Length can be missing or wrong, so also check the bytes actually received
        if len(body) > MAX_FILE_SIZE * 1000 * 1000:
            abort('You can upload up to {}MB.'.format(MAX_FILE_SIZE), 400)

    if not body:
        abort("Please send a file.", 400)

    # Reuse Sanic's own multipart parsing helpers here
    content_type = request.headers.getone("content-type", DEFAULT_HTTP_CONTENT_TYPE)
    content_type, parameters = parse_content_header(content_type)
    form, files = parse_multipart_form(body, parameters['boundary'].encode('utf-8'))

    # Require exactly one uploaded file
    keys = list(files.keys())
    if len(keys) != 1:
        abort('Please upload one file at a time.', 400)
    key = keys[0]

    filename = files.get(key).name
    content_type = files.get(key).type
    file_content = files.get(key).body
    # ... proceed with the data
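
To make the read loop easier to discuss on its own, here is a minimal sketch of it pulled out into a standalone helper. It only assumes that the stream object has an awaitable `read()` that returns bytes, or a falsy value once the body is finished, which is how `request.stream.read()` is used in the handler above. `read_capped`, `BodyTooLarge`, and `FakeStream` are hypothetical names I made up for illustration; they are not part of Sanic.

```python
import asyncio


class BodyTooLarge(Exception):
    """Hypothetical error raised when the body exceeds the limit."""


async def read_capped(stream, limit):
    """Read chunks until the stream ends, failing once more than `limit` bytes arrive."""
    chunks = []
    total = 0
    while True:
        chunk = await stream.read()
        if not chunk:
            break  # end of body
        total += len(chunk)
        if total > limit:
            raise BodyTooLarge(f"body exceeds {limit} bytes")
        chunks.append(chunk)
    # Join once at the end instead of repeatedly concatenating bytes
    return b"".join(chunks)


class FakeStream:
    """Hypothetical stand-in for request.stream, used only to exercise the helper."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def read(self):
        # Return the next chunk, or None when nothing is left
        return self._chunks.pop(0) if self._chunks else None
```

Counting the bytes as they arrive means the loop no longer needs a guess about the minimum chunk size, and in the handler the `BodyTooLarge` case would map to the same 400 response as before.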

My questions:

Is it possible for a sender to start sending data and then stop? That would leave the server waiting, and if done a million times it could amount to a DoS attack. Is there a mechanism in place to prevent that, or do I have to implement it myself?
Do you see any issues with this implementation? If so, which ones? What can be improved?
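
For the first question, the only thing I can think of doing on my side is to put a timeout around each read. Below is a minimal sketch of that idea using `asyncio.wait_for`. I don't know whether Sanic already enforces something equivalent for streamed bodies, so this may be redundant. `read_with_timeout`, `StalledStream`, and `demo` are hypothetical names for illustration, and the timeout values are arbitrary.

```python
import asyncio


async def read_with_timeout(stream, timeout):
    """Await one chunk, giving up if nothing arrives within `timeout` seconds."""
    return await asyncio.wait_for(stream.read(), timeout=timeout)


class StalledStream:
    """Hypothetical stream that sends one chunk and then goes silent."""

    def __init__(self):
        self._sent = False

    async def read(self):
        if not self._sent:
            self._sent = True
            return b"partial"
        await asyncio.sleep(3600)  # simulate a sender that stopped sending


async def demo():
    stream = StalledStream()
    first = await read_with_timeout(stream, timeout=0.05)
    try:
        await read_with_timeout(stream, timeout=0.05)
    except asyncio.TimeoutError:
        # In a real handler this is where the request would be rejected
        return first, "timed out"
    return first, "completed"
```

In the handler, the call to `request.stream.read()` would be replaced by `read_with_timeout(request.stream, ...)`, with the timeout case turned into an error response.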

Thanks!