Using Sanic to serve a Tensorflow model with multiple workers

I am having some issues running a Tensorflow model with Sanic even though it works fine with Flask. I think it has something to do with the async nature of Sanic? The specific error I get is:

2019-10-22 18:05:39.062833: E tensorflow/core/grappler/clusters/] Failed to get device properties, error code: 3
2019-10-22 18:05:39.731232: F tensorflow/stream_executor/cuda/] Check failed: err == cudaSuccess || err == cudaErrorInvalidValue Unexpected CUDA error: initialization error

and I feel it’s at least partially related to

One other thing: the Sanic app does work if I set the number of workers to 1, but that isn’t ideal for me and I’d like to be able to use more than one worker. I am using Sanic’s built-in web server.

I know it’s a pretty general description of the issue, but I guess I just need some ideas to try. Maybe I could isolate the code that serves as the entry point for Tensorflow and share that resource (i.e., the Tensorflow model) between workers/processes?
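A rough sketch of the "isolate and share" idea, using only the standard library: one dedicated loop owns the model and answers requests that the workers put on a queue. The names serve_model and load_fake_model and the (worker_id, request_id, payload) message shape are all hypothetical, and a plain function stands in for the Tensorflow session. This illustrates the pattern under those assumptions; it has not been tested with Sanic.

```python
import queue


def serve_model(requests, replies, load_model):
    """Loop meant to run in ONE dedicated process that owns the model.

    requests:   queue of (worker_id, request_id, payload) tuples, or None to stop.
    replies:    dict mapping worker_id to that worker's reply queue.
    load_model: zero-argument callable returning a callable model (hypothetical).
    """
    model = load_model()  # the only place the model (and the GPU) is touched
    handled = 0
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut down cleanly
            return handled
        worker_id, request_id, payload = item
        replies[worker_id].put((request_id, model(payload)))
        handled += 1


def load_fake_model():
    # Stand-in for building the Tensorflow graph and session: 5.0 * 6.0 * x.
    return lambda x: 5.0 * 6.0 * x


# Single-process demonstration with plain queues. Across processes these
# would be multiprocessing queues created before the workers start.
requests = queue.Queue()
replies = {0: queue.Queue(), 1: queue.Queue()}
requests.put((0, "req-a", 2.0))
requests.put((1, "req-b", 3.0))
requests.put(None)
handled = serve_model(requests, replies, load_fake_model)
print(handled, replies[0].get(), replies[1].get())
# prints: 2 ('req-a', 60.0) ('req-b', 90.0)
```

With this layout only one copy of the model exists, however many workers there are. The trade-off is that every request pays for a round trip through the queues, and a blocking get() inside an async handler would stall the event loop, so the worker side would need to wait on its reply queue in an executor or something similar.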

I’m guessing you’re initializing some things post-fork for the workers, but I don’t have a really great answer for you otherwise.

Do you have a code snippet you could post that might help us troubleshoot with you?

I came up with a minimal example that produces a similar error. The example works if workers is set to 1 in the sanic_config dict, but not when set to more than 1.

from sanic import Sanic
from sanic.response import json
import tensorflow as tf
import random


app = Sanic(__name__)
my_test = None

class Test:
    def __init__(self):
        config = tf.compat.v1.ConfigProto()
        config.gpu_options.allow_growth = True
        self.a = tf.compat.v1.constant(5.0)
        self.b = tf.compat.v1.constant(6.0)
        self.c = self.a * self.b

        self.sess = tf.compat.v1.Session(graph=self.c.graph, config=config)

    def test_it(self, rand_num=1.0):
        with self.sess.as_default():
            return float(self.c.eval() * rand_num)

@app.route("/")
async def index(request):
    return json({'tf result': my_test.test_it(random.randint(1, 100))})

if __name__ == "__main__":
    my_test = Test()

    sanic_config = {
        "host": "",
        "port": 5000,
        "workers": 3,
        "debug": False,
        "access_log": False
    }
    app.run(**sanic_config)

In my own code I’ve also tried instantiating the class (Test in the minimal example) on the first request, but that doesn’t work either. I’ve tried using the “before_server_start” and “after_server_start” listeners, but these instantiate the model once per worker, and the model I am working with is rather large, so multiple copies cause the GPU to run out of memory. I read somewhere that setting the process start method to “spawn” or “forkserver” might help, but when I set it beforehand with multiprocessing.set_start_method("forkserver"), I just get AttributeError: 'NoneType' object has no attribute 'test_it'.