Cython + OpenMP code runs faster as a Sanic backend than as a standalone terminal app?

Hello everyone! This is a niche question so I’m not expecting a lot of people to be able to help but I want to understand something that is happening.

I wrote an app, mostly in Cython, that ranks a large number of sequences in parallel. All of the memory for the app is allocated beforehand as NumPy arrays. The sequences themselves are stored as an n x m array of integers, so there are n sequences of length m. I have a function that evaluates each sequence, computing a metric for it and storing the result in a float array of length n. The evaluations run in parallel via OpenMP, so the code is running as compiled C code on a static chunk of memory. After all of the sequences have been evaluated in parallel, the code (not in parallel) ranks the sequences by metric, keeps a subset of the best ones, randomizes the remaining sequences, and repeats the process. So overall there is a cycle of parallel evaluation, then a sort, and then back to evaluation.
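To make the cycle concrete, here is a minimal pure-NumPy sketch of one iteration. It is not my actual code: the names `evaluate` and `run_cycle`, the row-sum metric, the "higher is better" ordering, and the `max_value` bound are all placeholders I made up for illustration, and the real evaluation step is the OpenMP-parallel Cython loop.

```python
import numpy as np

def evaluate(sequences, metrics):
    # Placeholder metric (an assumption): the sum of each row.
    # In the real app this step is the OpenMP-parallel Cython loop.
    metrics[:] = sequences.sum(axis=1)

def run_cycle(sequences, metrics, keep, rng, max_value):
    # 1) Evaluate every sequence, writing into the preallocated metrics array.
    evaluate(sequences, metrics)
    # 2) Rank serially, best first (assuming a higher metric is better).
    order = np.argsort(metrics)[::-1]
    # Note: fancy indexing builds a temporary copy before assigning back.
    sequences[:] = sequences[order]
    metrics[:] = metrics[order]
    # 3) Keep the top `keep` rows and randomize the rest in place.
    #    The metrics for the randomized rows are stale until the next cycle.
    rest = sequences[keep:]
    rest[:] = rng.integers(0, max_value, size=rest.shape)

# Hypothetical small sizes, just to show the loop structure.
rng = np.random.default_rng(0)
n, m, keep = 8, 4, 3
sequences = rng.integers(0, 10, size=(n, m))
metrics = np.empty(n, dtype=np.float64)
for _ in range(5):
    run_cycle(sequences, metrics, keep, rng, max_value=10)
```

The point of the sketch is only the shape of the loop: one parallel step followed by one serial step, both operating on the same preallocated arrays.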

I was under the impression that when I run this code, it uses multiple cores during the parallel evaluation and a single core while sorting by metric, which causes a short bottleneck. And since everything reads from and writes to the same chunk of memory, the only parallelization possible is whatever happens within a single running instance of the code.

I have two ways to run this app: as a standalone program in a terminal, and as the backend for a Sanic web app. I created the standalone terminal app thinking it would run faster because it does not have the overhead of client-side JavaScript code (when server and client are on the same machine). Surprisingly, I found that it runs much faster as the backend of the web app. When I use htop to investigate, the Sanic app appears to be using Python multiprocessing, and CPU utilization is much higher, close to 100%. The terminal app is in fact using all of the CPU cores, but only at about 35% each, and it runs much slower! I am baffled as to why this is happening. The terminal app should theoretically be a lot faster.
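In case it helps with comparing the two launch modes, here is a small sketch of a snapshot that could be printed from both the terminal app and the Sanic worker. The function name `runtime_snapshot` is made up for illustration. It only uses standard-library calls, and it reports the `OMP_NUM_THREADS` environment variable as Python sees it, not the thread count the OpenMP runtime actually ends up using.

```python
import os
import threading

def runtime_snapshot():
    # Collect a few process-level facts that may differ between launch modes.
    info = {
        "pid": os.getpid(),
        "cpu_count": os.cpu_count(),
        # None means the variable is not set in this process's environment.
        "omp_num_threads": os.environ.get("OMP_NUM_THREADS"),
        "python_threads": threading.active_count(),
    }
    # sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        info["affinity_cores"] = len(os.sched_getaffinity(0))
    return info

print(runtime_snapshot())
```

If the two modes report different values here, that would at least narrow down where the difference comes from.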

Can anyone help me understand why this is happening? My apologies in advance if this is not an appropriate topic for this forum. I was not sure where I should be asking. Thanks!