A fAST new router

@ahopkins: when I pointed you towards AST, I already knew (from my benchmarks) that this could be an awesome feature for Sanic, not only because it would be pure Python (requiring no extra dependencies) but also would be in our reach to maintain.

@5onic I also like the trie model, I just don’t know if having a pure Python (or regex mixed version) would be as fast as having code written on the fly. Moreover, perhaps we could try and mix trie with the AST router and see if we can benefit from it - but this would make the router code pretty exclusive to maintain …

As of xrtr, I’m playing a little bit (when possible) with pybind11, and I really believe an interesting router could be made using plain C++ with direct bindings to Python (instead of using Cython), but, again, that would be a binary dependency for Sanic … Which I believe is a downside.

Oh, just saw that (regarding trie ideas with AST) … anding seems like a good alternative, but the speed up is almost unnoticeable :man_shrugging:

So … A trie regex perhaps? Argh, that would be hell on earth to debug (and even harder to find a good way to write it down inside the router).

That was this test: A fAST new router

There is an ever so sight benefit of you can evaluate a few levels of the hierarchy at once. Given the scope of complexity it adds for the almost irrelevant difference… :balance_scale:

That’s basically what hyprtr is.

At the end of the day, the AST has a lot of benefits. As much as I looked into it, @vltr, you were right.

I am working on another base version so that it can be used for both routing and also the signaling RFC.

1 Like

Ah, I did not dig myself completely inside the hyprtr code to check on that … Sorry.

Well, I’m glad all this effort we put into xrtr, hyprtr, the Sanic router and the benchmark suite came to some good usage inside Sanic after all :muscle:

I know I’m a bit over my head with things to do but I would love to work on the new router.

3 Likes

So, I redid the methodology for parsing the tree. But I realized an obstacle that I am not sure how to overcome.

First of all, one thing I am trying very hard to achieve is to make this a drop in replacement for the Sanic router. Sanic will allow you to do something like this:

srouter = sanic.router.Router()
srouter.add("/foo/<id:int>", ["get"], 1)
srouter.add("/foo/<id:string>", ["get"], 2)

You can run this and see it works as expected.

class Request:
    def __init__(self, path, method):
        self.path = path
        self.method = method

print(srouter.get(Request("/foo/1", "get")))
print(srouter.get(Request("/foo/abc", "get")))

Two routes have an identical path, but they have different regex. Therefore the Sanic router allows this.

The equivalent in Falcon does not work. This is not acceptable

frouter = falcon.routing.CompiledRouter()
frouter.add_route("/foo/{id:int}", method_map=["GET"], resource=1)
frouter.add_route("/foo/{id:uuid}", method_map=["GET"], resource=1)

The way the AST style routers work is that all dynamic variables are sort of the same. There is nothing to check or compare against. Later, after the route is found it can be cast to type.

Which seems like a roadblock for full compatibility.


Or is it?

Perhaps we can merge them with two optional casts, and it can try both. Of course the idea popped into my head after I started writing this…

I am going to leave the post anyway. Will report back.

Okay, ignore that last post. It is feasible. Not thrilled about adding a regex.match, but it works. This is definitely something to optimize at a further point.

def find_route(path, router, basket):
    parts = path.split(router.delimiter)
    num = len(parts)
    if num > 0:
        if parts[0] == "":
            if num > 1 and REGEX_TYPES["int"][1].match(parts[1]):
                basket[1] = parts[1]
                return router.dynamic_routes["/<foo:int>"], basket
            elif num > 1 and REGEX_TYPES["string"][1].match(parts[1]):
                basket[1] = parts[1]
                return router.dynamic_routes["/<foo>"], basket

Here is most recent version: https://gist.github.com/ahopkins/9b91b950c94c1bc3c24df741b7fb02ef#file-v2-py

2 Likes

So, quick explanation.

The new version is generating a tree that looks like this:

 Node(level=0)
     Node(part=a, level=1)
         Node(part=much, level=2)
             Node(part=longer, level=3)
                 Node(part=item, level=4)
                     Node(part=<id>, level=5, route=<Route: /a/much/longer/item/<id>>, dynamic=True)
             Node(part=<longer>, level=3, dynamic=True)
                 Node(part=item, level=4)
                     Node(part=<id>, level=5, route=<Route: /a/much/<longer>/item/<id>>, dynamic=True)
         Node(part=thing, level=2)
             Node(part=<id>, level=3, route=<Route: /a/thing/<id>>, dynamic=True)
                 Node(part=and, level=4)
                     Node(part=doit, level=5, route=<Route: /a/thing/<id>/and/doit>)
         Node(part=<banana>, level=2, route=<Route: /a/<banana>>, dynamic=True)
     Node(part=<foo>, level=1, route=<Route: /<foo>>, dynamic=True)
         Node(part=bar, level=2, route=<Route: /<foo>/bar>)
         Node(part=buzz, level=2, route=<Route: /<foo>/buzz>)
         Node(part=fuzz, level=2, route=<Route: /<foo>/fuzz>)
     Node(part=<foo:int>, level=1, route=<Route: /<foo:int>>, dynamic=True)

It is strictly ordered, so it does not matter at all the order of inserting new routes.

As you can see from this example, there does exist the potential to collapse some layers together where you have multiple consecutive static path parts.

2 Likes

Wow, nice stuff @ahopkins could you please attach examples how to use the v2 router? I’ll also play with it.

For now… this API will get cleaned up.

from functools import lru_cache
import random

# This would be the router used by Sanic still need to add some cleanup here
class Router(BaseRouter):
    @lru_cache
    def get(self, path, *args):
        return self.resolve(path, *args)


router = Router()

routes = [
    "/hello/world",
    "/<foo>",
    "/a/<banana>",
    "/a/thing/<id>/and/doit",
    "/a/thing/<id>",
    "/<foo>/fuzz",
    "/<foo>/buzz",
    "/<foo>/bar",
    "/a/much/longer/item/<id>",
    "/a/much/<longer>/item/<id>",
    "/<foo:int>",
]

# To prove that input order does not matter
random.shuffle(routes)
for i, r in enumerate(routes):
    router.add(r, i)

# These will probably be abstracted away
router._generate_tree()
router._render()

# This is a debug feature to show the tree
router.tree.display()

print("===")

# Output the generated source code
print(router.find_route_src)

print("===")

# See the routes that were attached
print(router.static_routes)
print(router.dynamic_routes)

# Do some sample fetches
print(router.get("/hello/world"))
print(router.get("/hello"))
print(router.get("/hello/bar"))
print(router.get("/a/bar"))
print(router.get("/a/thing/bar"))
print(router.get("/a/thing/bar/and/doit"))
3 Likes

I tried the new version of ast router and add two route like this:

routes = ["/foo/<some:int>/bar", "/foo/<some:ymd>/bar"]

the find_route_src shows that the def find_route looks like this

def find_route(path, router, basket):
    parts = path.split(router.delimiter)
    num = len(parts)
    if num > 0:
        if parts[0] == "":
            if num > 1:
                if parts[1] == "foo":
                    if num > 2:
                        basket[2] = parts[2]
                        if num > 3:
                            if parts[3] == "bar":
                                return router.dynamic_routes["/foo/<some:int>/bar"], basket
                        basket[2] = parts[2]
                        if num > 3:
                            if parts[3] == "bar":
                                return router.dynamic_routes["/foo/<some:ymd>/bar"], basket
    raise NotFound

So if the request url is /foo/2020-08-29/bar, the method won’t return the right handler and basket.

The method can match the dynamic part earlier but it looks very inefficient :joy:

def find_route(path, router, basket):
    parts = path.split(router.delimiter)
    num = len(parts)
    if num > 0:
        if parts[0] == "":
            if num > 1:
                if parts[1] == "foo":
                    if num > 2:
                        basket[2] = parts[2]
                        if num > 3:
                            if parts[3] == "bar":
                                if num > 4:
                                    if router.dynamic_routes["/foo/<some:int>/bar"].params[4].pattern.search(parts[4]):
                                        return router.dynamic_routes["/foo/<some:int>/bar"], basket
                        basket[2] = parts[2]
                        if num > 3:
                            if parts[3] == "bar":
                                if num > 4:
                                    if router.dynamic_routes["/foo/<some:ymd>/bar"].params[4].pattern.search(parts[4]):
                                        return router.dynamic_routes["/foo/<some:ymd>/bar"], basket
    raise NotFound

Falcon didn’t support the routes like the these , But some framwork did.
xrtr seems did’t support type match for dynamic part.

So will the ast route drop support for this scenario?

A few things:

  1. I have another version that does handle this scenario, however…
  2. Doing so has its own set of disadvantages, namely that it adds some regex into find_route
  3. I am preparing and RFC right now. This is one of the big questions that needs to be answered because as I see it there are two directions to go. One favoring performance, the other compatability. Since this is a valid route scheme in Sanic today, dropping this support will break some applications for sure.

Great. this should be seriously considered :thinking:

Referencing PR #1972 for changes that will need to be made in naming endpoints.

I have pushed the repo that I have been working on.

The goal is to get this integrated into master branch in the next week or two to try and make it for v21.3. Let me know your thoughts.

100% test coverage

image

Lots of cleanup to still do, but we are getting closer. Also, this was a nice side-by-side comparison of the router change.

In Version 20.12

-----------------------------------------------------------------------------------------------
Name (time in ns)                              Min                 Max                Mean             StdDev              Median               IQR            Outliers  OPS (Mops/s)            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_resolve_route_no_arg_string_path     183.5220 (1.0)      260.4150 (1.0)      193.0517 (1.0)       6.9385 (1.0)      191.8610 (1.0)      3.9795 (1.0)        134;97        5.1800 (1.0)        1000        1000
test_resolve_route_with_typed_args        184.8240 (1.01)     590.3360 (2.27)     201.4882 (1.04)     30.8262 (4.44)     194.9180 (1.02)     4.0070 (1.01)       40;171        4.9631 (0.96)       1000        1000
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In new router branch

----------------------------------------------------------------------------------------------- benchmark: 2 tests -----------------------------------------------------------------------------------------------
Name (time in ns)                              Min                 Max                Mean            StdDev              Median               IQR            Outliers  OPS (Mops/s)            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_resolve_route_no_arg_string_path     104.7200 (1.0)      192.7400 (1.0)      109.5443 (1.0)      7.1217 (2.22)     108.8200 (1.0)      0.3000 (1.0)        20;202        9.1287 (1.0)        1000        1000
test_resolve_route_with_typed_args        117.8700 (1.13)     193.0490 (1.00)     126.3849 (1.15)     3.2035 (1.0)      127.1450 (1.17)     3.0255 (10.08)      156;16        7.9123 (0.87)       1000        1000
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 Like

Here are some overall benchmarks.

Version 20.12

Running 30s test @ http://127.0.0.1:3333
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.48ms    5.95ms 131.35ms   91.06%
    Req/Sec    10.21k     1.94k   56.63k    76.23%
  3659132 requests in 30.09s, 411.78MB read
Requests/sec: 121614.26
Transfer/sec:     13.69MB

Current master

Running 30s test @ http://127.0.0.1:3333
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.81ms    4.55ms 107.27ms   90.82%
    Req/Sec    11.22k     2.29k   36.57k    74.27%
  4014399 requests in 30.09s, 394.33MB read
Requests/sec: 133394.06
Transfer/sec:     13.10MB

New router branch

Running 30s test @ http://127.0.0.1:3333
  12 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.74ms    4.60ms 102.41ms   90.73%
    Req/Sec    11.60k     2.61k   46.51k    75.23%
  4148337 requests in 30.08s, 407.48MB read
Requests/sec: 137923.77
Transfer/sec:     13.55MB

For the most part, we will see little change in the new routing branch because of route caching. The most noticeable place that we will see the increase is when matching on dynamic routes that are not cacheable.

Here is a super simple implementation of the new router as the beginning of the signals API:

from sanic_routing import BaseRouter, Route


class Signal(Route):
    def get_handler(self, raw_path, method, _):
        method = method or self.router.DEFAULT_METHOD
        raw_path = raw_path.lstrip(self.router.delimiter)
        try:
            return self.handlers[raw_path][method]
        except (IndexError, KeyError):
            raise self.router.method_handler_exception(
                f"Method '{method}' not found on {self}",
                method=method,
                allowed_methods=set(self.methods[raw_path]),
            )


class SignalRouter(BaseRouter):
    def __init__(self) -> None:
        super().__init__(
            delimiter=".",
            route_class=Signal,
            stacking=True,
        )

    def get(self, event: str):
        return self.resolve(f".{event}")

    def send(self, event: str) -> None:
        signal, handlers, params = self.get(event)
        for handler in handlers:
            handler(**params)


def callback1():
    print("called1")


def callback2():
    print("called2")


signals = SignalRouter()
signals.add("server.init.before", callback1)
signals.add("server.init.before", callback2)

signals.finalize()

signals.send("server.init.before")

See RFC : Signals Support for Sanic Proposal · Issue #1630 · sanic-org/sanic · GitHub

2 Likes