Getting original request URL behind reverse proxy?


#1

My Sanic site runs behind Apache as a reverse proxy and is publicly reachable at https://beta.print-css.rocks

Inside some template I need to construct URLs with the original public URL.

That’s what I get from the request and headers:

(Pdb)  pp(dict(request.headers))
{'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
 'accept-encoding': 'gzip, deflate, br',
 'accept-language': 'en-US,en;q=0.9,de;q=0.8,fr;q=0.7,pt;q=0.6',
 'cache-control': 'max-age=0',
 'connection': 'close',
 'dnt': '1',
 'host': '127.0.0.1:8000',
 'referer': 'https://beta.print-css.rocks/lessons',
 'upgrade-insecure-requests': '1',
 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) '
               'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 '
               'Safari/537.36',
 'x-forwarded-for': '185.236.201.126',
 'x-forwarded-host': 'beta.print-css.rocks',
 'x-forwarded-server': 'beta.print-css.rocks'}

(Pdb) pp request.url
'http://127.0.0.1:8000/lesson/lesson-chapter-numbering'

Of course I have the host information inside the request, but I am missing the request scheme and possibly the port number of the public URL.

Is there some built-in functionality for getting hold of the original URL, in this case https://beta.print-css.rocks/lesson/lesson-chapter-numbering?


#2

As a workaround, I inject a custom request_url into the template. It comes from a helper that checks whether the request came through a reverse proxy (under the assumption that HTTPS is used on the public URL):

import os

import furl
from sanic.exceptions import NotFound

# app, jinja, LESSON_ROOT and get_lesson_data are defined elsewhere in the project


@app.route('/lesson/<lesson>')
@jinja.template('lesson.html')
async def lesson(request, lesson):
    lesson_dir = os.path.join(LESSON_ROOT, lesson)
    if not os.path.exists(lesson_dir):
        raise NotFound('Lesson {} does not exist'.format(lesson))
    request_url = get_request_url(request)
    return dict(lesson=get_lesson_data(lesson), request_url=request_url)


def get_request_url(request):
    """Return the public URL, dealing with virtual hosting."""
    forwarded_host = request.headers.get('x-forwarded-host')
    if not forwarded_host:
        # no reverse proxy involved: the internal URL is the public one
        return request.url
    f = furl.furl(request.url)
    f.scheme = 'https'  # we assume public SSL/TLS
    f.host = forwarded_host
    f.port = None  # drop the internal port 8000; furl falls back to the scheme default
    return f.tostr()
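
For anyone who wants the same idea without the furl dependency, here is a minimal sketch using only the standard library. The helper name public_url and the default_scheme parameter are made up for illustration. It also looks at an x-forwarded-proto header, which is an assumption: that header does not appear in the dump above, and Apache only sends it if it is configured to do so. When it is missing, the sketch falls back to https, like the workaround does. A plain dict with lowercase keys stands in for request.headers here.

```python
from urllib.parse import urlsplit, urlunsplit


def public_url(internal_url, headers, default_scheme="https"):
    """Rebuild the public URL from reverse proxy headers (hypothetical helper)."""
    forwarded_host = headers.get("x-forwarded-host")
    if not forwarded_host:
        # Not proxied: the internal URL is already the public one.
        return internal_url
    # The header may hold a comma-separated chain of proxies; the first
    # entry is the host the client originally asked for.
    host = forwarded_host.split(",")[0].strip()
    # Assumption: the proxy may send X-Forwarded-Proto; otherwise use the default.
    scheme = headers.get("x-forwarded-proto", default_scheme).split(",")[0].strip()
    parts = urlsplit(internal_url)
    # Swap scheme and host (including the internal port), keep path and query.
    return urlunsplit((scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    headers = {"x-forwarded-host": "beta.print-css.rocks"}
    print(public_url("http://127.0.0.1:8000/lesson/lesson-chapter-numbering", headers))
```

Because the whole network location is replaced by the forwarded host, the internal port 8000 disappears, and a non-standard public port is kept only if the proxy includes it in the forwarded host value.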


#3

Hello, @zopyx! Welcome to the Sanic Community Forums! :smile:

Indeed, your “workaround” is pretty much the approach to take, since the information you need is only available from the headers injected by the Apache reverse proxy.

But, if you want to play a little bit with Sanic, you can use the url_for function (see reference here), where you can pass some kwargs (see the source code for possible combinations), as well as grab the current request information from the instance itself (request.path, request.host, request.query_string …).

Or, you can simply override the Request.host (and/or Request.scheme) properties with your own data:

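from sanic import Sanic
from sanic.request import Request


class MyRequest(Request):
    @property
    def host(self):
        # fall back to the regular host when the proxy header is absent
        return self.headers.get("x-forwarded-host") or super().host

    # *if* needed
    @property
    def scheme(self):
        return "https"  # it'll always be behind SSL, right?

app = Sanic(__name__, request_class=MyRequest)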

# the whole drill ...

And access the “correct” URL simply by calling the usual request.url property in your endpoint :wink:
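
To see why overriding those two properties is enough, here is a small stand-alone sketch of the mechanism. BaseRequest is a simplified stand-in written for this post, not Sanic's real Request class; it assumes that url is assembled from scheme, host, path and query_string, which is how I understand Sanic to build it. ProxiedRequest is likewise a hypothetical name.

```python
from urllib.parse import urlunparse


class BaseRequest:
    """Simplified stand-in for a framework request class (not Sanic's)."""

    def __init__(self, headers, path, query_string=""):
        self.headers = headers
        self.path = path
        self.query_string = query_string

    @property
    def scheme(self):
        return "http"

    @property
    def host(self):
        return self.headers.get("host", "")

    @property
    def url(self):
        # url is derived from the other properties, so overriding
        # scheme/host in a subclass changes url as well.
        return urlunparse(
            (self.scheme, self.host, self.path, "", self.query_string, "")
        )


class ProxiedRequest(BaseRequest):
    @property
    def host(self):
        # Prefer the host the client asked for, if a proxy reported it.
        return self.headers.get("x-forwarded-host") or super().host

    @property
    def scheme(self):
        # Assumption: anything that came through the proxy was HTTPS.
        return "https" if "x-forwarded-host" in self.headers else super().scheme


if __name__ == "__main__":
    headers = {"host": "127.0.0.1:8000", "x-forwarded-host": "beta.print-css.rocks"}
    print(BaseRequest(headers, "/lesson/lesson-chapter-numbering").url)
    print(ProxiedRequest(headers, "/lesson/lesson-chapter-numbering").url)
```

Unlike a hard-coded "https", this variant only switches the scheme when the forwarded header is present, so direct requests to the app during development still produce the plain internal URL.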