Very different CPU usage when proxying MP2T UDP unicast streams to a streamed Sanic response

I have developed a proxy which lets me consume the catch-up video provided by my ISP as UDP unicast MPEG-TS (MP2T) streams over HTTP. For that I am using Sanic.

It is working beautifully, yet there is something I cannot explain. Some streams use less than 8% of one CPU core (i7-3770), while other specific ones consume as much as 30%. All the streams are in principle the same and I do not do any processing on them; I just read from a UDP socket and write to a streamed Sanic response, basically this:

import asyncio_dgram
import socket

from contextlib import closing
from sanic import Sanic, response

app = Sanic()

MIME = 'video/MP2T'

@app.get('/some/url')
async def handle_url(request):
    host = socket.gethostbyname(socket.gethostname())
    client_port = 43545
    async def my_streaming_fn(response):
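        # Relay every datagram from the UDP socket straight into the HTTP response.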
        with closing(await asyncio_dgram.bind((host, client_port))) as stream:
            while True:
                data, remote_addr = await stream.recv()
                await response.write(data)

    return response.stream(my_streaming_fn, content_type=MIME)

All the streams arrive on a separate VLAN, and all the TS packets are 1344 bytes (1316 bytes without headers). I see the difference with streams of the same bitrate, even from the same original TV channel.

What am I missing that could explain the huge difference in CPU usage between some streams and others? Any ideas?

Is it consistently the same streams that are higher? Or are you seeing variance in each run?

Consistently the same streams, and it has nothing to do with their bitrate. I've seen streams of ~3 Mbps consuming 30%, while the most common 7-8 Mbps streams usually take ~8%.

I don’t think there’s anything special or peculiar happening with sanic here. It’s reading bytes and streaming them. What version are you using?

You might also try current master. There are major changes to how data is streamed to the response.

I'm using it within a Docker Alpine image, so Sanic 20.12.1.

I might as well try with master and see how it performs.

I’m running master now and I see the same 30% consumption.

Of course, I am using the deprecated API. I might benefit from the new one, but I am not clear on how to use it to do the same thing.

The new API is implemented under the hood. The streaming callback is just a compat layer now.

New style is documented here: https://github.com/sanic-org/sanic/blob/a46ea4fc59c764058cdcc83fc136032ccbf140dc/sanic/response.py

But it may be subject to change. The final API for that probably will not land until 21.6 or 21.9.
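
For reference, a handler in the new style would look roughly like this. This is just a sketch against current master, reusing your host, port and MIME values from the snippet above; the exact call names and keywords may still change before the API is finalized.

import asyncio_dgram
import socket

from contextlib import closing
from sanic import Sanic

app = Sanic(__name__)

MIME = 'video/MP2T'

@app.get('/some/url')
async def handle_url(request):
    host = socket.gethostbyname(socket.gethostname())
    client_port = 43545
    # Ask the request for a response object up front, then push each
    # datagram into it as it arrives, instead of passing a callback
    # to response.stream().
    resp = await request.respond(content_type=MIME)
    with closing(await asyncio_dgram.bind((host, client_port))) as stream:
        while True:
            data, remote_addr = await stream.recv()
            await resp.send(data)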

Wow, master does indeed improve performance.

Turns out I was not actually using master. pip install git+https://github.com/Tronic/sanic correctly clones the repo and installs sanic, but version 19.2, not master; even if I specify the branch at the end of the URL, 19.2 still gets installed.

So now I clone the repo, build a wheel and pip install the wheel.

I am also using the new API; I was initially confused, but it is cleaner and works very well. End result: the maximum CPU consumed is now 13% on the videos where it hit 35% before, so much, much better.

There is one other small gripe I've had with this streaming system: I cannot reliably close the connection when I get no more data.

I am unable to detect that there is no more data arriving on the UDP socket; even if I set a timeout on it, it makes no difference.

At the same time, the only way to detect that the client has closed the connection, so I can do some external cleanup, is in a finally block. So with the new API, something like this:

        try:
            resp = await request.respond()
            with closing(await asyncio_dgram.bind((host, int(client_port)))) as stream:
                while True:
                    data, remote_addr = await stream.recv()
                    if not data:
                        log.info(f'Stream loop ended [{request.ip}]')
                        await resp.send('', True)
                        break
                    await resp.send(data, False)
        except Exception as ex:
            return response.json({}, 500)
        finally:
            do_some_cleanup_here()

        return resp

The if not data branch is never reached, even after 30s of no data being received on the UDP socket.

The except is never reached either, only the finally block, and only when it is the client who closes the connection.

Can you help me understand this? Because I don't.

That fork was abandoned; the PR that implemented it was on a branch in the main repo.


I am not sure I understand the second question. It seems that your method of receiving data is not providing empty data to signal a close. Therefore, I think try/except/finally is appropriate.

First, I was using the wrong URL for installing sanic master, so my bad.

Second, the excessive CPU usage only happens with debug=True; that's when I see more than 30% of a core used to proxy a video stream.

Third, I was missing an await resp.send(end_stream=True) in the finally block. Without it the client did not understand the stream was over and requested it again, which resulted in a looped video playing over and over. With it, the client requests the next second of video, so it works smoothly.
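
For anyone following along, the tail of the handler now looks roughly like this (just a sketch; do_some_cleanup_here() is still my placeholder for the external cleanup):

        resp = await request.respond()
        try:
            with closing(await asyncio_dgram.bind((host, int(client_port)))) as stream:
                while True:
                    data, remote_addr = await stream.recv()
                    await resp.send(data, False)
        finally:
            # Tell the client the stream has really ended, otherwise it
            # keeps replaying the piece of video it already received.
            await resp.send(end_stream=True)
            do_some_cleanup_here()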

Finally, to detect that I am not receiving any more data from the UDP socket within a time limit, I only found two ways: signal.alarm() and asyncio.wait_for(). The latter causes a bigger spike in CPU usage; the former just kills Sanic, and I did not manage to catch the alarm without Sanic being killed.
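
The wait_for variant slots into the receive loop of the handler above and looks something like this (a sketch; UDP_TIMEOUT is just my name for the limit, and the 30 seconds match what I tested with):

import asyncio

UDP_TIMEOUT = 30  # seconds without data before the stream is considered dead

while True:
    try:
        data, remote_addr = await asyncio.wait_for(stream.recv(), timeout=UDP_TIMEOUT)
    except asyncio.TimeoutError:
        log.info(f'No data for {UDP_TIMEOUT}s, ending stream [{request.ip}]')
        break
    await resp.send(data, False)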

Seems like an issue with asyncio_dgram. Typically, when something is done streaming it should send empty bytes or some other sort of close signal.