Route paths - How do they work?

ahopkins · April 7, 2021, 7:42am

There was a recent question on the Discord server about paths and trailing slashes. This will hopefully be a bit of an explanation about how this works. The new router mirrors the functionality from the old one, and corrects a few edge cases that had incorrect results.

There are really two cases to consider:

A root level trailing slash: https://sanicframework.org/
A path level trailing slash: https://sanicframework.org/en/

According to the HTTP spec, the first URL is the same with or without a slash. The second is technically different without the slash, although most of the Internet now treats the two forms as the same. The distinction comes from the idea that a URL points at a file path: /en means a file called en, and /en/ means a directory called en.

But, again, much of the Internet and APIs generally treat them as equivalent. Therefore in Sanic, the default is to keep them the same.
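To make the root case concrete: an HTTP client that is given a URL with an empty path sends / as the request target, so the server never sees a difference between the two root forms. Here is a minimal sketch using only the standard library. The helper name request_target_path is made up for illustration and is not part of Sanic.

```python
from urllib.parse import urlsplit


def request_target_path(url: str) -> str:
    """Return the path a client would put on the request line.

    Hypothetical helper: an empty path is sent as "/", which is why
    the two root-level URLs are indistinguishable on the server side.
    """
    path = urlsplit(url).path
    return path or "/"


for url in (
    "https://sanicframework.org",
    "https://sanicframework.org/",
    "https://sanicframework.org/en",
    "https://sanicframework.org/en/",
):
    print(f"{url!r} -> {request_target_path(url)!r}")
```

The two root URLs both come out as /, while /en and /en/ stay distinct. That second pair is where strict_slashes comes into play.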

This brings up the question of strict_slashes, how they work, and how different calls will be handled in Sanic.

Since you can define a route with or without the trailing slash, with or without strict_slashes, and at either the root or path level, there are eight possibilities to consider:

Route	Strict Slashes
`""`	`True`
`""`	`False`
`"/"`	`True`
`"/"`	`False`
`"/foo"`	`True`
`"/foo"`	`False`
`"/foo/"`	`True`
`"/foo/"`	`False`

However, not all of these can coexist since having eight handlers would lead to ambiguity.

Let’s set up these routes and see what happens:

from sanic import Sanic, text

app = Sanic("pathtest")

paths = (
    ("", True),
    # ("", False),  << Same as previous, so we cannot use both
    # ("/", True),  << Same as previous, so we cannot use both
    # ("/", False),  << Same as previous, so we cannot use both
    ("/foo", True),
    # ("/foo", False),  << Same as previous, so we cannot use both
    ("/foo/", True),
    # ("/foo/", False),  << Same as previous, so we cannot use both
)

for idx, (path, strict) in enumerate(paths):

    @app.get(path, strict_slashes=strict, name=f"path{idx}")
    def handler(request):
        return text(
            f"{request.name}: {request.path=} {request.route.path=} {request.route.parts=}"
        )


app.run(port=9999, debug=True)

As you can see, there really are only three true options. Thinking from the client perspective:

localhost:9999 or localhost:9999/ (they are the same)
localhost:9999/foo
localhost:9999/foo/

Therefore, when we try to hit these endpoints, it should look like this:

$ curl localhost:9999 localhost:9999/ localhost:9999/foo localhost:9999/foo/
pathtest.path0: request.path='/' request.route.path='' request.route.parts=('',)
pathtest.path0: request.path='/' request.route.path='' request.route.parts=('',)
pathtest.path1: request.path='/foo' request.route.path='foo' request.route.parts=('foo',)
pathtest.path2: request.path='/foo/' request.route.path='foo/' request.route.parts=('foo', '')

Sanic retains the path information from the request because it is important and potentially deterministic. Notice how the first two responses are identical?

The way that Sanic achieves that in the router is basically by ignoring the beginning slash. That is why request.route.path (what the router uses) is not the same as request.path (what came in from the HTTP request).
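The relationship between the two values in the output above can be modeled in a few lines. This is only a sketch that reproduces the printed values, not the router's actual code, and route_view is a hypothetical name.

```python
def route_view(request_path: str) -> tuple:
    """Model how a request path maps to the router's view of it.

    Illustrative only: drop the single leading slash, then split the
    remainder on "/" to get the parts.
    """
    if request_path.startswith("/"):
        route_path = request_path[1:]
    else:
        route_path = request_path
    parts = tuple(route_path.split("/"))
    return route_path, parts


for request_path in ("/", "/foo", "/foo/"):
    route_path, parts = route_view(request_path)
    print(f"{request_path=} {route_path=} {parts=}")
```

Note that the trailing slash survives as an empty final part, which is how /foo and /foo/ remain distinguishable after the leading slash is dropped.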

For the sake of completeness, here is another acceptable grouping with strict_slashes:

paths = (
    # ("", True),
    ("", False),
    # ("/", True),
    # ("/", False),
    # ("/foo", True),
    ("/foo", False),
    ("/foo/", True),
    # ("/foo/", False),
)

$ curl localhost:9999 localhost:9999/ localhost:9999/foo localhost:9999/foo/
pathtest.path0: request.path='/' request.route.path='' request.route.parts=('',)
pathtest.path0: request.path='/' request.route.path='' request.route.parts=('',)
pathtest.path1: request.path='/foo' request.route.path='foo' request.route.parts=('foo',)
pathtest.path2: request.path='/foo/' request.route.path='foo/' request.route.parts=('foo', '')

Therefore:

strict_slashes on a non-root path means:

True: accept only as defined
- /foo matches ONLY /foo
- /foo/ matches ONLY /foo/
False: match either case
- /foo matches /foo or /foo/
- /foo/ matches /foo or /foo/
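The rules in this list can be written down as a small predicate. This is a simplified model of the behavior described here for a single non-root route, not how the router is implemented. The names matches and _without_trailing_slash are invented for the example.

```python
def _without_trailing_slash(path: str) -> str:
    """Drop one trailing slash, if present (hypothetical helper)."""
    return path[:-1] if path.endswith("/") else path


def matches(defined: str, requested: str, strict_slashes: bool) -> bool:
    """Would a non-root route defined as `defined` accept `requested`?

    strict_slashes=True: accept only the path exactly as defined.
    strict_slashes=False: accept the path with or without the slash.
    """
    if strict_slashes:
        return requested == defined
    return _without_trailing_slash(requested) == _without_trailing_slash(defined)


for defined in ("/foo", "/foo/"):
    for strict in (True, False):
        accepted = [p for p in ("/foo", "/foo/") if matches(defined, p, strict)]
        print(f"{defined!r} strict_slashes={strict}: {accepted}")
```

This model looks at one route in isolation. When a strict route for the other form is also registered, as in the second grouping above, the request goes to the route that matches exactly: /foo/ was answered by the strict "/foo/" handler even though the non-strict "/foo" route could also have accepted it.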

One final note:

The reason the router “ignores” the first slash is that it is not important and not deterministic: every path has one, so it does nothing to tell routes apart. The more “public” API of request.path displays it since it is part of the HTTP request. But, since it just gets in the way for the router, we ignore it there. In 99% of use cases when you need the path, use request.path. Furthermore, request.route.path would display the route as defined, including any parameters: <foo>.

ar4hc · April 7, 2021, 2:39pm

Good explanation, but I have an edge case I can’t figure out:
What happens if the same url path has a POST and GET route that could match?
e.g. GET “/foo/” and POST “/foo/” (or /foo ?)
How is this expected to behave on these queries?

curl -X GET /foo
curl -X GET /foo/
curl -X GET /foo/123
curl -X POST /foo
curl -X POST /foo/

Does the order of definition matter here?
What is the difference here if not root, e.g. Blueprint with prefix (here “/foo”), and root, no prefix or just “/”?

I’m confused…

ahopkins · April 7, 2021, 2:55pm

A “route” in Sanic is defined as a possible path.

Therefore, these are the same route:

@app.get("/foo")
...
@app.post("/foo")

A single route has one or more handlers that are defined by the HTTP method.
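One way to picture this is a mapping from path to a second mapping keyed by HTTP method. The sketch below is a toy registry to illustrate the idea. The names routes, register, and dispatch are hypothetical and do not reflect Sanic's internal data structures.

```python
from typing import Callable, Dict

# Hypothetical registry: route path -> {HTTP method -> handler}
routes: Dict[str, Dict[str, Callable[[], str]]] = {}


def register(method: str, path: str):
    """Attach a handler to a path under the given HTTP method."""

    def decorator(handler: Callable[[], str]) -> Callable[[], str]:
        routes.setdefault(path, {})[method] = handler
        return handler

    return decorator


@register("GET", "/foo")
def get_foo() -> str:
    return "GET handler"


@register("POST", "/foo")
def post_foo() -> str:
    return "POST handler"


def dispatch(method: str, path: str) -> str:
    """Find the route by path first, then pick the handler by method."""
    return routes[path][method]()


print(len(routes), sorted(routes["/foo"]))
print(dispatch("GET", "/foo"))
print(dispatch("POST", "/foo"))
```

Both decorators land on the same entry, so there is one route with two handlers rather than two competing routes.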

The answer is: it depends. How are the routes defined? With a trailing slash or not? With strict_slashes=True or not?

Potentially, each request in your example could be directed to a different handler.

# curl -X GET /foo
@app.get("/foo", strict_slashes=True)

# curl -X POST /foo
@app.post("/foo", strict_slashes=True)

# curl -X GET /foo/
@app.get("/foo/", strict_slashes=True)

# curl -X POST /foo/
@app.post("/foo/", strict_slashes=True)

# curl -X GET /foo/123
@app.get("/foo/123", strict_slashes=True)

Nope, the order of definition does not matter. Sanic will rearrange them as needed to make sure everything gets handled. If there is a potential conflict where there could be ambiguity, you will get an error at startup.

I am not 100% sure I understand the question about blueprints and prefixes. Blueprints do not really impact routing. They are a convenience mechanism for helping you to build and organize your code. But once the server starts up, it is as if they don’t exist. The url_prefix is applied to the routes when they are created, but that is about the extent of their job. They do not impact routing after that.