Here is the POC: https://gist.github.com/ahopkins/9b91b950c94c1bc3c24df741b7fb02ef
Hey , I run Your code and It was a great experience to learn and understand experienced python developer’s thought of coding
but I run into some problems if add another route like:
router.add(“id:uuid/num:int”, handler4, “get”)
also I couldn’t understand why would generate find_route method …?!
I would appreciate your response ,thnx
Hi hamzabouissi !
I’m try to understand @ahopkins 's ast Router code, It’s quite complex on code generating part, the find_route
method. But I may answer the question.
I tried add router.add(“id:uuid/num:int”, handler4, “get”)
and print(self.find_src)
before compiled_find = compile(self.find_src, "", "exec",)
and get the genrated code:
def find_route(path, router, extra, basket):
try:
return router.static_routes[path], None
except KeyError:
pass
parts = path.split(router.delimiter)
num_parts = len(parts)
if num_parts > 0:
if parts[0] == '':
if num_parts > 1:
basket[1] = parts[1]
if num_parts > 2:
if num_parts > 2:
basket[2] = parts[2]
if num_parts > 3:
if num_parts > 3:
basket[3] = parts[3]
if num_parts == 4:
return router.dynamic_routes['/<foo:string>/<id:uuid>/<num:int>'], basket
raise NotFound
raise NotFound
raise NotFound
basket[1] = parts[1] # something goes wrong here
elif num_parts > 2:
if num_parts > 2:
basket[2] = parts[2]
if num_parts == 3:
return router.dynamic_routes['/<id:uuid>/<num:int>'], basket
raise NotFound
raise NotFound
raise NotFound
raise NotFound
Python won’t allow any code betwenn if
and elif
in the same indention.
since the method either return the result or raise the exception within a if
block, could we change elif
into if
?
I can’t expalin why would generate find_route
method very well.
The method split a dynamic url into many parts with /
, and then match each part in if
statement, It will raise earlier when the static part won’t match compare to the origin implementation Sainc Router. More route is added more if statement is generated.
-
What is the purpose of
requirement
param inRoute
class ? -
The
REGEX_TYPES
seems to be fixed. Could I write my own regex in a dynamic url? such as email? -
I can’t figure out the meaning of
if
statement below, I thought__depth__
always eaquls tolevel + 1
, Did I miss something?
if is_last:
retval += [(indent + 2) * TAB + f"raise {self.exception.__name__}"]
if branch["__depth__"] == level + 1: # TODO: What does this if mean?
retval += [(indent + 1) * TAB + f"raise {self.exception.__name__}"]
If you could learn anything from that mess of code… not exactly a paradigm of clarity, more like a scratch pad.
Yes, it needs some work. This is no where near ready to be implemented. It is meant mainly as a proof of concept. The idea behind it works and is performant. The next thing to do is really hammer out the best way to build the AST.
Not 100% sure I understand the question. Let me try, and if I am do not answer, please let me know.
The idea behind this is that Python’s regular flow control with if/else
is incredibly fast. Even if it is extremely basic, traversing through a complex web of finding a route is basically just like following a flow chart. If A, follow this course, if B, follow that instead.
So, the idea is to perform some analysis at run time and dynamically generate that code. I did it with simple strings, but Python also has an AST library that could be used to run this at server startup.
Yes. This being a first crack at it, the implementation is far from complete or being without mistakes. I had identified that mistake after posting it, and am working on other ways to correct it. There probably needs to be a lot more context per URL part. I tried to keep it super simplistic, but might need to beef it up a little to handle more complex edge cases. This should not impact the overall performance, at request time. The extra work would be up front at startup.
This is to add additional controls to routing. In particular: domain based, and header based.
For example, if you have multiple domains going to the same Sanic instance, you could seperate some of the routes based upon the domain.
router.add("/foo/3", handler3, requirements={"domain": "foobar.com"})
This is also because I want a single router that is capable of also being used in the Signals RFC. I have some thoughts on how we can implement this, and it would also need a router. Having the requirements
adds another level of flexibility for implementing with various applications.
Whatever implementation is decided upon, it needs to be backwards compatible on the surface of it. Meaning, if I am a long time Sanic user, I would not see a difference except hopefully in performance. The internals may change, but how I define routes, should not.
To that end, there does exist this ability now. For my limited example, there was no need to build that it.
Well, yes, but in that context level
is constantly changing. One thing that does have an impact on performance is trying to bail out as quickly as possible. Therefore, at every possibility where there is nothing left to evaluate, it should bail rather then waiting for higher level scopes to be completed. This is definitely something that needs some more attention as my rather simplistic approach is just that: too simplistic.
Thanks for the attention to detail on this.
Thanks for replying, I learnt a lot from this new router(specially the Ast staff)!
I can fully understand the concept there.
First of all nice stuff @ahopkins but I don’t have much time to fully check your POC, but in the first glance it looks nice, also as a contributor of Falcon
I can confirm that Falcon
uses AST
stuff for routing.
Am I wondering whether we did the research for that case? for example to find some Data Structure which will satisfy our needs. I have been thinking about Prefix-Tree but right now, I’m not sure that suitable for us, need to do some research. Becase in our case, we also need to handle url parameters
, segments
etc.
Thanks for the heads up. I agree that the data structure here needs some consideration. One of the things that both htprtr
and xrtr
do for optimization is exactly that. I agree that it should be worked in here. Also, I am using a simple structure, as opposed to what it looks like Falcon does with some custom classes. Something more robust is probably needed since there is a lot of meta-data per node level that needs to be considered, and the current implementation does not take some of this into consideration to optimize the structure (which is the errors being experienced above).
and the current implementation does not take some of this into consideration to optimize the structure (which is the errors being experienced above).
No problems at all here, just want to say that for the future implementation we should keep in mind the performance/easy scaling of the router, etc.
Also, what htprtr
and xrtr
means?
@vltr will need to explain xrtr. I imagine “extreme router”?
hyprtr is “hyperscan router”
I have some more tests to check performance on different ways to potentially handle tries to see if it is indeed faster than nested if/else statements.
A python data struct would be much more clear for me.The AST is also great but with less readable. Although the idea of AST is blowing my mind. Great work again.
I tried a few types of simple tests that could be used with a trie.
- control - just continue if/elif with no trie
-
anding -
if parts[2] == "bar" and parts[3] == "baz"
-
concat -
if parts[2] + "/" + parts[3] == "bar/baz"
-
concat_s -
if "%s/%s" % (parts[2], parts[3]) == "bar/baz"
-
concat_f -
if f"{parts[2]}/{parts[3]}" == "bar/baz"
Results:
Timy executed (control) for 1000000 times in 0.995399 seconds
Timy best time was 0.000001 seconds
Timy executed (anding) for 1000000 times in 0.977669 seconds
Timy best time was 0.000001 seconds
Timy executed (concat) for 1000000 times in 1.208060 seconds
Timy best time was 0.000001 seconds
Timy executed (concat_s) for 1000000 times in 1.320606 seconds
Timy best time was 0.000001 seconds
Timy executed (concat_f) for 1000000 times in 1.247579 seconds
Timy best time was 0.000001 seconds
The take away is that using a trie may ever so incredibly slightly impact performance. I am debating whether that is even worth the extra effort. 18/1000 of a second over a million runs.
Granted, this was not a super nested structure. As the complexity increases, so should the theoretical difference.
Hm, good tests, also after yesterday discussing here, I think probably AST
will be slightly faster than Trie
, but the readability of Trie
will be fairly better.
One more important point - AST
will be harder to debug.
In general - hard choice
The question on the last one was whether to build out a trie inside the AST.
I do not think the AST-based router will be too hard to debug. Clearly we need a heavy test suite against it to cover as many edge cases as possible. But, in the end it is just Python code that can be output, viewed, and run like anything else.
xrtr
would be simply “X router”, since I just made it as a proof of concept. No final name yet as no final verdict I came across on the best solution to the matter. I’ll add more thoughts replying another comment.
@ahopkins: when I pointed you towards AST, I already knew (from my benchmarks) that this could be an awesome feature for Sanic, not only because it would be pure Python (requiring no extra dependencies) but also would be in our reach to maintain.
@5onic I also like the trie
model, I just don’t know if having a pure Python (or regex mixed version) would be as fast as having code written on the fly. Moreover, perhaps we could try and mix trie
with the AST router and see if we can benefit from it - but this would make the router code pretty exclusive to maintain …
As of xrtr
, I’m playing a little bit (when possible) with pybind11, and I really believe an interesting router could be made using plain C++ with direct bindings to Python (instead of using Cython
), but, again, that would be a binary dependency for Sanic … Which I believe is a downside.
Oh, just saw that (regarding trie ideas with AST) … anding
seems like a good alternative, but the speed up is almost unnoticeable
So … A trie regex perhaps? Argh, that would be hell on earth to debug (and even harder to find a good way to write it down inside the router).
That was this test: A fAST new router
There is an ever so sight benefit of you can evaluate a few levels of the hierarchy at once. Given the scope of complexity it adds for the almost irrelevant difference…