Is it possible to run a long process without waiting?

cx42net · November 30, 2021, 8:54am

Hi,

I was wondering if it were possible to run a long process without waiting on it, and returning a basic response such as “202 - Accepted” to the client.

I thought about create_task from Python’s Asyncio, but from my understanding, the running task could be killed by the garbage collector since the function has finished and we are not waiting for the task to complete.

Here’s what would be a long computed task:

async def long_process(request):
    await asyncio.sleep(600)

@app.post('/images/convert')
async def convert_images(request):
    task = asyncio.create_task(long_process(request))
    return text('Accepted', status=202)

Another ideal solution would be to send the response, THEN process the data, something like:

@app.post('/images/convert')
async def convert_images(request):
    await sanic.response.send('Accepted', status=202)
    await long_process(request)

But I’m not sure whether something like this is possible with Sanic. (sanic.response.send is invented here.)

Thanks for your help

ZinkLu · November 30, 2021, 10:33am

This looks like a job for signals:

import asyncio

from sanic.response import text
from sanic import Sanic

app = Sanic("longprocess")


@app.signal("app.trigger.long_process")
async def long_process():
    print("start long process")
    await asyncio.sleep(5)
    print("long process done")


@app.get('/images/convert')
async def convert_images(request):
    await app.dispatch("app.trigger.long_process")
    return text('Accepted', status=202)


app.run()

BUT, converting images looks like a CPU-bound operation? That would block the request.

ahopkins · November 30, 2021, 3:39pm

That is the use case for: Background tasks | Sanic Framework

request.app.add_task(long_process(request))

You are correct that it is not entirely immune from cancellation. If your server crashed, or someone tripped on the power cord, there is no recovery. For that level of confidence, you would need to send it off to a third-party application.

ahopkins · November 30, 2021, 10:37am

One thing that you might do (following @ZinkLu’s suggestion of using signals) is to create a signal whose job is to initiate a process somewhere else. If the work is CPU bound, then yes, a signal on its own will not really help you.

Take a look at the example here: https://github.com/ahopkins/pyconil2021-liberate-your-api/tree/main/saje_project/breakpoints/bp3/saje_project/src

That might be helpful for you to push off work that is CPU bound.

cx42net · November 30, 2021, 12:44pm

Thank you @ZinkLu & @ahopkins for your feedback!

I read about signals, but I may have misunderstood how they work (maybe the docs could be clearer about this): seeing await app.dispatch("app.trigger.long_process") made me believe the await would wait for the signal handler to run and finish before going on to the next step (returning the data).

If the await here only waits until the signal has been properly sent, without waiting for the handler to complete, that is exactly what I need.

… which leads me to wonder what is the difference between Signals and Background tasks?

I agree with @ahopkins about the immunity from cancellation, but my needs are bound to the Sanic server: it’s an asynchronous computation (image processing was just an example here) that still depends on Sanic to run, because the user will be notified once the computation is finished. So if the server crashes, the request not completing is not an issue.

Just to TLDR:

1. Could you confirm that await app.dispatch("app.trigger.long_process") does not wait for the signal handler to complete before continuing?
2. What is the difference between background tasks and signals?

Thanks

ahopkins · November 30, 2021, 12:58pm

1. Correct. There is an option to make the actual running block, but that is not default. I am not at a keyboard now, but I believe it is inline=True. In your case you would not want that.
2. They are essentially the same. They both attempt to push off work to a new asyncio task (unless you use inline as stated above). Signals are meant to be a little more reusable and might be easier to run from multiple places in your app. If you have a single handler maybe this is not helpful.
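
To make the two dispatch modes concrete, here is a minimal sketch in plain asyncio. It is not Sanic's implementation: the `dispatch` function and the `events` list are hypothetical stand-ins, used only to show the ordering difference between running a handler on its own task and awaiting it inline.

```python
import asyncio

events = []  # records the order in which things happen


async def handler():
    events.append("handler start")
    await asyncio.sleep(0.05)  # stands in for awaitable work such as a DB query
    events.append("handler done")


async def dispatch(inline=False):
    # Hypothetical stand-in for a signal dispatcher, NOT Sanic's implementation.
    if inline:
        await handler()  # the caller waits for the handler to finish
        return None
    return asyncio.create_task(handler())  # the handler runs on its own task


async def main():
    task = await dispatch()  # default mode: returns before the handler has run
    events.append("response sent")
    await task  # only so that this demo exits cleanly

    await dispatch(inline=True)  # inline mode: the handler completes first
    events.append("response sent (inline)")


asyncio.run(main())
print(events)
```

In the default mode "response sent" is recorded before the handler starts; with inline=True the handler finishes before the caller continues.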

But it is worth repeating: pushing work to a task is not enough on its own. If that task makes blocking calls and never yields back to the loop, it will still block the whole worker instance.
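
The effect can be observed with a small asyncio experiment, sketched here under some assumptions: `cpu_bound_work` is a placeholder that uses time.sleep to imitate a blocking call, and `heartbeat` is a hypothetical helper that counts how often the loop gets to run. One common way to keep the loop responsive is loop.run_in_executor with the default thread pool. Threads mainly help with blocking I/O; pure-Python CPU-heavy work would more likely need a process pool or a separate worker.

```python
import asyncio
import time


def cpu_bound_work():
    # Placeholder for blocking work; time.sleep never yields to the event loop.
    time.sleep(0.3)
    return "done"


async def heartbeat(ticks):
    # Records a tick every time the loop gets a chance to run this task.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def count_ticks(run_work):
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    await run_work()
    beat.cancel()
    return len(ticks)


async def blocking():
    cpu_bound_work()  # runs on the loop thread: nothing else can progress


async def offloaded():
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, cpu_bound_work)  # default thread pool


async def main():
    return await count_ticks(blocking), await count_ticks(offloaded)


blocked_ticks, offloaded_ticks = asyncio.run(main())
print(blocked_ticks, offloaded_ticks)
```

The first number stays tiny because the heartbeat is starved while the blocking call runs; the second is much larger because the loop keeps turning while the work runs in a thread.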

cx42net · November 30, 2021, 2:11pm

Great! It makes sense then, thank you for the clarification!

The real task is just a loop over a database, so it uses async calls and has no blocking calls.

ahopkins · November 30, 2021, 3:25pm

Perfect! That’s exactly the type of operation that works well for this use case.

cx42net · November 30, 2021, 3:40pm

One last thing though, what’s the difference between Background tasks and asyncio.create_task?

Why should I use one rather than the other?

ahopkins · November 30, 2021, 4:04pm

Nothing. The Sanic method is a wrapper around the other. It does have some safety catches so that you can still call it outside of the loop. Meaning, before you call app.run() it’s possible to add a task that will begin as soon as the server starts. Otherwise, it’s nothing more.

In the next release, there will be some utilities to help track those tasks. But it’s just candy.
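
On the garbage-collection concern raised at the start of the thread: when calling asyncio.create_task directly, the Python documentation recommends keeping a reference to the returned task so that it cannot disappear mid-execution. The following is a minimal plain-asyncio sketch of that pattern. The names `spawn`, `background_tasks` and `handle_request` are hypothetical, and this is not how Sanic tracks tasks internally.

```python
import asyncio

background_tasks = set()  # strong references keep pending tasks alive
results = []


async def long_process(job_id):
    await asyncio.sleep(0.05)  # stands in for the real asynchronous work
    results.append(job_id)


def spawn(coro):
    # Hypothetical helper: schedule a coroutine and track it until it finishes.
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)  # drop the reference when done
    return task


async def handle_request(job_id):
    spawn(long_process(job_id))
    return "Accepted", 202  # returned before long_process has finished


async def main():
    response = await handle_request(1)
    pending_at_response = len(background_tasks)
    # In a real server the loop simply keeps running; here we wait so the demo can exit.
    await asyncio.gather(*background_tasks)
    await asyncio.sleep(0)  # give done callbacks a chance to run
    return response, pending_at_response


response, pending_at_response = asyncio.run(main())
print(response, pending_at_response, results, len(background_tasks))
```

The response is produced while one task is still pending, and the set empties itself once the work completes.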

cx42net · November 30, 2021, 4:18pm

Ok great, thank you!