Hi!
I’ve experienced this a few times already, and can’t figure out what is happening.
When running Sanic (in dev or production), at some point the server stops accepting requests and just hangs. Loading a page does nothing.
What’s worse, there is nothing in the logs, or anywhere else, that hints at what is happening.
One thing I found, not sure if it helps: it seems that Sanic never closes its connections. Running netstat -a -n -o | grep "0 127.0.0.1:5000" returns a lot of “CLOSE_WAIT” lines:
root@www:~# netstat -a -n -o | grep "0 127.0.0.1:5000"
tcp 101 0 127.0.0.1:5000 0.0.0.0:* LISTEN off (0.00/0/0)
tcp 24231 0 127.0.0.1:5000 127.0.0.1:45330 CLOSE_WAIT off (0.00/0/0)
tcp 3656 0 127.0.0.1:5000 127.0.0.1:43922 CLOSE_WAIT off (0.00/0/0)
tcp 3653 0 127.0.0.1:5000 127.0.0.1:49652 CLOSE_WAIT off (0.00/0/0)
tcp 7752 0 127.0.0.1:5000 127.0.0.1:55710 CLOSE_WAIT off (0.00/0/0)
tcp 16146 0 127.0.0.1:5000 127.0.0.1:59648 CLOSE_WAIT off (0.00/0/0)
tcp 24668 0 127.0.0.1:5000 127.0.0.1:38280 CLOSE_WAIT off (0.00/0/0)
tcp 8986 0 127.0.0.1:5000 127.0.0.1:38904 CLOSE_WAIT off (0.00/0/0)
tcp 3647 0 127.0.0.1:5000 127.0.0.1:37120 CLOSE_WAIT off (0.00/0/0)
tcp 3698 0 127.0.0.1:5000 127.0.0.1:38028 CLOSE_WAIT off (0.00/0/0)
tcp 5470 0 127.0.0.1:5000 127.0.0.1:34678 CLOSE_WAIT off (0.00/0/0)
tcp 25098 0 127.0.0.1:5000 127.0.0.1:43758 CLOSE_WAIT off (0.00/0/0)
tcp 4632 0 127.0.0.1:5000 127.0.0.1:55694 CLOSE_WAIT off (0.00/0/0)
tcp 15441 0 127.0.0.1:5000 127.0.0.1:38274 CLOSE_WAIT off (0.00/0/0)
tcp 751 0 127.0.0.1:5000 127.0.0.1:46346 CLOSE_WAIT off (0.00/0/0)
tcp 4648 0 127.0.0.1:5000 127.0.0.1:60718 CLOSE_WAIT off (0.00/0/0)
tcp 4646 0 127.0.0.1:5000 127.0.0.1:48550 CLOSE_WAIT off (0.00/0/0)
tcp 3648 0 127.0.0.1:5000 127.0.0.1:47568 CLOSE_WAIT off (0.00/0/0)
tcp 3649 0 127.0.0.1:5000 127.0.0.1:43920 CLOSE_WAIT off (0.00/0/0)
tcp 7752 0 127.0.0.1:5000 127.0.0.1:35700 CLOSE_WAIT off (0.00/0/0)
tcp 7815 0 127.0.0.1:5000 127.0.0.1:40918 CLOSE_WAIT off (0.00/0/0)
tcp 3649 0 127.0.0.1:5000 127.0.0.1:54996 CLOSE_WAIT off (0.00/0/0)
tcp 3694 0 127.0.0.1:5000 127.0.0.1:57308 CLOSE_WAIT off (0.00/0/0)
tcp 3656 0 127.0.0.1:5000 127.0.0.1:51712 CLOSE_WAIT off (0.00/0/0)
tcp 5284 0 127.0.0.1:5000 127.0.0.1:57264 CLOSE_WAIT off (0.00/0/0)
tcp 3717 0 127.0.0.1:5000 127.0.0.1:53096 CLOSE_WAIT off (0.00/0/0)
tcp 4631 0 127.0.0.1:5000 127.0.0.1:38258 CLOSE_WAIT off (0.00/0/0)
tcp 3780 0 127.0.0.1:5000 127.0.0.1:37636 CLOSE_WAIT off (0.00/0/0)
tcp 24231 0 127.0.0.1:5000 127.0.0.1:44518 CLOSE_WAIT off (0.00/0/0)
tcp 3639 0 127.0.0.1:5000 127.0.0.1:57568 CLOSE_WAIT off (0.00/0/0)
tcp 24668 0 127.0.0.1:5000 127.0.0.1:40386 CLOSE_WAIT off (0.00/0/0)
tcp 5405 0 127.0.0.1:5000 127.0.0.1:45074 CLOSE_WAIT off (0.00/0/0)
tcp 3707 0 127.0.0.1:5000 127.0.0.1:45072 CLOSE_WAIT off (0.00/0/0)
tcp 3647 0 127.0.0.1:5000 127.0.0.1:37580 CLOSE_WAIT off (0.00/0/0)
tcp 3688 0 127.0.0.1:5000 127.0.0.1:57290 CLOSE_WAIT off (0.00/0/0)
tcp 24668 0 127.0.0.1:5000 127.0.0.1:43356 CLOSE_WAIT off (0.00/0/0)
tcp 25098 0 127.0.0.1:5000 127.0.0.1:34694 CLOSE_WAIT off (0.00/0/0)
tcp 3771 0 127.0.0.1:5000 127.0.0.1:48228 CLOSE_WAIT off (0.00/0/0)
tcp 3649 0 127.0.0.1:5000 127.0.0.1:52856 CLOSE_WAIT off (0.00/0/0)
tcp 3815 0 127.0.0.1:5000 127.0.0.1:48520 CLOSE_WAIT off (0.00/0/0)
tcp 4629 0 127.0.0.1:5000 127.0.0.1:38238 CLOSE_WAIT off (0.00/0/0)
tcp 3698 0 127.0.0.1:5000 127.0.0.1:55754 CLOSE_WAIT off (0.00/0/0)
tcp 3825 0 127.0.0.1:5000 127.0.0.1:43736 CLOSE_WAIT off (0.00/0/0)
tcp 5455 0 127.0.0.1:5000 127.0.0.1:35828 CLOSE_WAIT off (0.00/0/0)
tcp 25098 0 127.0.0.1:5000 127.0.0.1:55766 CLOSE_WAIT off (0.00/0/0)
tcp 3660 0 127.0.0.1:5000 127.0.0.1:46816 CLOSE_WAIT off (0.00/0/0)
tcp 3649 0 127.0.0.1:5000 127.0.0.1:55824 CLOSE_WAIT off (0.00/0/0)
tcp 25098 0 127.0.0.1:5000 127.0.0.1:55720 CLOSE_WAIT off (0.00/0/0)
tcp 3654 0 127.0.0.1:5000 127.0.0.1:58910 CLOSE_WAIT off (0.00/0/0)
tcp 16146 0 127.0.0.1:5000 127.0.0.1:43108 CLOSE_WAIT off (0.00/0/0)
tcp 16146 0 127.0.0.1:5000 127.0.0.1:38836 CLOSE_WAIT off (0.00/0/0)
tcp 24668 0 127.0.0.1:5000 127.0.0.1:41942 CLOSE_WAIT off (0.00/0/0)
tcp 3715 0 127.0.0.1:5000 127.0.0.1:48546 CLOSE_WAIT off (0.00/0/0)
tcp 7751 0 127.0.0.1:5000 127.0.0.1:38270 CLOSE_WAIT off (0.00/0/0)
tcp 3656 0 127.0.0.1:5000 127.0.0.1:40338 CLOSE_WAIT off (0.00/0/0)
tcp 24668 0 127.0.0.1:5000 127.0.0.1:49062 CLOSE_WAIT off (0.00/0/0)
tcp 751 0 127.0.0.1:5000 127.0.0.1:52056 CLOSE_WAIT off (0.00/0/0)
tcp 24231 0 127.0.0.1:5000 127.0.0.1:45064 CLOSE_WAIT off (0.00/0/0)
tcp 3697 0 127.0.0.1:5000 127.0.0.1:49406 CLOSE_WAIT off (0.00/0/0)
tcp 3690 0 127.0.0.1:5000 127.0.0.1:49160 CLOSE_WAIT off (0.00/0/0)
tcp 3711 0 127.0.0.1:5000 127.0.0.1:44446 CLOSE_WAIT off (0.00/0/0)
tcp 3786 0 127.0.0.1:5000 127.0.0.1:57258 CLOSE_WAIT off (0.00/0/0)
tcp 3740 0 127.0.0.1:5000 127.0.0.1:36070 CLOSE_WAIT off (0.00/0/0)
tcp 3770 0 127.0.0.1:5000 127.0.0.1:38014 CLOSE_WAIT off (0.00/0/0)
tcp 3786 0 127.0.0.1:5000 127.0.0.1:40910 CLOSE_WAIT off (0.00/0/0)
tcp 7815 0 127.0.0.1:5000 127.0.0.1:60462 CLOSE_WAIT off (0.00/0/0)
tcp 3800 0 127.0.0.1:5000 127.0.0.1:38900 CLOSE_WAIT off (0.00/0/0)
tcp 16146 0 127.0.0.1:5000 127.0.0.1:38030 CLOSE_WAIT off (0.00/0/0)
tcp 3718 0 127.0.0.1:5000 127.0.0.1:60706 CLOSE_WAIT off (0.00/0/0)
tcp 3700 0 127.0.0.1:5000 127.0.0.1:38250 CLOSE_WAIT off (0.00/0/0)
tcp 15441 0 127.0.0.1:5000 127.0.0.1:40552 CLOSE_WAIT off (0.00/0/0)
tcp 24231 0 127.0.0.1:5000 127.0.0.1:45076 CLOSE_WAIT off (0.00/0/0)
tcp 3699 0 127.0.0.1:5000 127.0.0.1:35816 CLOSE_WAIT off (0.00/0/0)
So maybe the sockets are full?
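To keep an eye on this over time instead of reading the raw listing, netstat output like the above could be tallied per TCP state. This is only a rough sketch: `summarize_netstat` is a hypothetical helper name, and it assumes the column layout shown above (protocol, Recv-Q, Send-Q, local address, foreign address, state).

```python
from collections import Counter


def summarize_netstat(text, local_addr="127.0.0.1:5000"):
    """Tally TCP states and total Recv-Q per state for sockets bound to local_addr.

    Assumes whitespace-separated columns in the order:
    proto, recv-q, send-q, local address, foreign address, state, ...
    """
    states = Counter()
    recv_q_by_state = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip shell prompts, headers, and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3] != local_addr:
            continue
        state = fields[5]
        states[state] += 1
        recv_q_by_state[state] += int(fields[1])
    return states, recv_q_by_state


# Three rows copied from the listing above, used as sample input.
sample = """\
tcp 101 0 127.0.0.1:5000 0.0.0.0:* LISTEN off (0.00/0/0)
tcp 24231 0 127.0.0.1:5000 127.0.0.1:45330 CLOSE_WAIT off (0.00/0/0)
tcp 3656 0 127.0.0.1:5000 127.0.0.1:43922 CLOSE_WAIT off (0.00/0/0)
"""

states, recv_q = summarize_netstat(sample)
print(dict(states))
print(dict(recv_q))
```

Feeding it the full netstat output at regular intervals would show whether the CLOSE_WAIT count only ever grows before the server stops responding.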
The only solution I have is to restart the Sanic server (not reload, full restart), which is annoying.
I have an endpoint at /ping that just returns some basic info to show that the server is running. Here’s the cURL call, made from the server itself:
root@www:~# curl --trace - http://127.0.0.1:5000/ping
== Info: Trying 127.0.0.1:5000...
== Info: connect to 127.0.0.1 port 5000 failed: Connection timed out
== Info: Failed to connect to 127.0.0.1 port 5000: Connection timed out
== Info: Closing connection 0
curl: (28) Failed to connect to 127.0.0.1 port 5000: Connection timed out
That’s it. Sanic returns absolutely nothing.
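Note that curl times out while connecting, rather than being refused or getting an empty reply. The same check can be scripted with the standard library only, for example to log the moment the server stops accepting connections. This is a sketch under assumptions: `probe` is a hypothetical helper name, and the host and port defaults are simply taken from the curl call above.

```python
import socket


def probe(host="127.0.0.1", port=5000, timeout=3.0):
    """Attempt a TCP connect and report how it ended.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection raises if the connect does not finish in `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # The connect never completed, as in the curl trace above.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe())
```

A "timeout" result while the process is still running would match what curl reports here, whereas "refused" would mean the process is no longer listening at all.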
Also, I have two Sanic instances running on the server. One is the actual service; the second is a listener that deploys new changes coming from GitHub (as a webhook request). The second one is pretty simple, with just one endpoint and no connections to other services (no db), and still, it also fails at some point.
I know there isn’t much to go on, but maybe you’ve seen this before and know what the issue is? Any help would be appreciated.