Get the host scheme and domain using Flask - python

I have a certain use case where a Flask app is not configured with SERVER_NAME and it resides in a subpath (say, http://example.com/subpath/subpath2), and, from within a request context, I need to extract just the http://example.com.
I noticed that, without SERVER_NAME being set in the Flask config, url_for with _external=True is still able to retrieve the full URL including the domain on which the Flask app is being served. So I figured there must be a way that Flask (or Werkzeug) retrieves the domain. I dug around inside the url_for function and came up with the following function to get the domain from within a request context.
from flask.globals import _request_ctx_stack

def get_domain():
    fragments = [
        _request_ctx_stack.top.url_adapter.url_scheme,
        _request_ctx_stack.top.url_adapter.get_host(''),
    ]
    return '://'.join(fragments)
I realise I'm accessing private members, but there is no other obvious way to do it that I could find.
Is this the correct way to do it?
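For comparison, here is a minimal sketch that relies only on public request attributes. The request object exposes scheme, host and host_url, which carry the same information without touching the context stack. The app object, the simulated request and the example.com base URL are assumptions made for illustration.

```python
from flask import Flask, request

app = Flask(__name__)  # hypothetical app with no SERVER_NAME configured


def get_domain():
    """Return scheme and host of the current request, e.g. 'http://example.com'."""
    # request.host includes the port when it is not the default for the scheme.
    return f"{request.scheme}://{request.host}"


# Simulate a request to an app mounted under /subpath (assumed for this demo).
with app.test_request_context("/subpath2", base_url="http://example.com/subpath"):
    domain = get_domain()
    host_url = request.host_url  # same information, with a trailing slash
    url_root = request.url_root  # additionally includes the mount point
```

If the trailing slash is acceptable, request.host_url alone may be enough and the helper can be dropped.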

Related

Is it possible to get flask app url without request context?

It's a URL shortener app. The app structure is as follows:
[Image: app structure]
In forms.py, I have custom validators, validate_url() and validate_short_url(), that use APP_URL; APP_URL = "localhost:5000/"
That's fine when running locally, but there are many cases where the app domain can change:
Running through a Docker image;
Hosting (e.g. on Heroku);
Changing the port value.
So every time I run this Flask app differently, I have to change the value of APP_URL, which isn't best practice.
All in all, I want to use something like flask.Request.url_root to avoid rewriting it by hand again and again.
When I just try to use flask.request I get the following traceback:
RuntimeError: Working outside of request context.
This typically means that you attempted to use functionality that needed
an active HTTP request. Consult the documentation on testing for
information about how to avoid this problem.
forms.py is posted here
The app is already hosted on Heroku, here is the link: https://qysqa.herokuapp.com/
The solution was to use flask.request inside the custom validators (validate_url() and validate_short_url()), which run while a request is being handled, so the request context is available there.
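A minimal sketch of that idea, assuming a hypothetical helper app_url() in place of the APP_URL constant: because request.url_root is read when the check runs rather than at import time, forms.py can be imported without a request context. The helper names and the simulated request below are illustrative and are not taken from the original forms.py.

```python
from flask import Flask, request

app = Flask(__name__)  # stand-in for the real application object


def app_url():
    """Base URL of the running app, resolved lazily from the current request."""
    return request.url_root


def is_own_url(url):
    """Hypothetical check a validator might perform: does `url` point at this app?"""
    return url.startswith(app_url())


# Inside a request (simulated here) the lookup needs no hard-coded host.
with app.test_request_context("/", base_url="https://qysqa.herokuapp.com"):
    root = app_url()
    own = is_own_url("https://qysqa.herokuapp.com/abc")
    foreign = is_own_url("https://example.org/abc")
```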

Weird `/` behavior when Serving Flask with Apache2 + Gunicorn

I'm trying to build multiple endpoints and subendpoints within my application, part of it as a learning exercise, and part of it is that I have 2 domains.
For simplicity I'm going to refer to them as domain1 and domain2.
My Flask listening endpoints are on /api1 and /api2 for domains 1 & 2 respectively. Gunicorn is bound to listen on a unix socket at sock/domain1.sock and sock/domain2.sock. So far everything is working this way.
My Apache2 proxies the endpoints into the proper socket as follows:
for domain1 I have:
<Location /api>
    ProxyPass unix:/var/www/socks/domain1.sock|http://127.0.0.1/api1
    ProxyPassReverse unix:/var/www/socks/domain1.sock|http://127.0.0.1/api1
</Location>
for domain2 I have:
<Location /api>
    ProxyPass unix:/var/www/socks/domain2.sock|http://127.0.0.1/api2
    ProxyPassReverse unix:/var/www/socks/domain2.sock|http://127.0.0.1/api2
</Location>
I know that I don't need to have 2 sockets, but I'm doing so just for testing.
Now when I open domain1.com/api, things work perfectly, and the same goes for domain2.com/api.
But when I open domain1.com/api/ (with a slash at the end) or domain2.com/api/ then it gives me a Site Not Found error. This is understandable since in my Flask I'm listening to the endpoint without a trailing slash. The fix for that is to implement / into my flask endpoint. So when I do that, the weird behavior occurs.
New Flask listening Endpoints are /api1/ and /api2/ (with trailing slash).
Now when I open domain.com/api/ it works as intended. But when I open domain.com/api (without the slash) it redirects me to either domain.comapi or 127.0.0.1/api, both of which are wrong. I tried adding a trailing slash in my Apache config and tried multiple Flask approaches, but they all show the same weird behavior and I can't understand why. Personally I don't mind using the endpoint without the slash; I just want to understand why this is happening. I also googled a lot but found nothing related to my problem.
Reproducible Behavior:
I'm unable to link the 2nd domain as it is a protected IP for my company, so I created multiple endpoints that you can click on to simulate the behavior.
https://thethiny.xyz/api1 -> sock|http://127.0.0.1/api1 -> internal /api1
https://thethiny.xyz/api2 -> sock|http://127.0.0.1/api2 -> internal /api2/
https://thethiny.xyz/api3 -> sock|http://127.0.0.1/api1/ -> internal /api1
https://thethiny.xyz/api4 -> sock|http://127.0.0.1/api2/ -> internal /api2
Working:
https://thethiny.xyz/api1
https://thethiny.xyz/api2/
https://thethiny.xyz/api4
Not Found:
https://thethiny.xyz/api1/
https://thethiny.xyz/api3
https://thethiny.xyz/api3/
Weird Redirect:
https://thethiny.xyz/api2
https://thethiny.xyz/api4/
Edit: I understand the problem and have come up with some solutions in the answer below. I'm not satisfied with the solutions, but I'm taking this as a limitation of mapping endpoints to different underlying endpoints. For more information, read about Reverse Proxy Pass and Redirects and Rewriting Location Header in HTTPd
I now understand the problem. The Apache proxy hands the request to Flask on the configured endpoint, 127.0.0.1/api2, so when Flask issues a redirect it redirects to 127.0.0.1/api2/, because Flask has no information about the original URL. Using ProxyPreserveHost solves this only when the external and internal paths match, as in mapping /api2/ to /api2, but not /api4/ to /api2/: on the redirect, Flask receives a request for /api2, answers with /api2/, and knows nothing about /api4. Unfortunately, I don't think there's an actual solution to this in the Apache2/Flask configuration other than handling the routes manually, exactly the way you want them. That is, do not let Flask redirect automatically, since it cannot know where to redirect to; instead either redirect manually (an external redirect) to the correct endpoint, or handle each route separately (/api and /api/stuff but not /api/).
Example:
app.add_url_rule("/api2", view_func=StubFunction(), redirect_to="/api2/")
app.add_url_rule("/api2/", view_func=ActualFunction())
And add ProxyPreserveHost On to your Apache2 config, or use the built-in ProxyFix middleware if you don't want to modify your virtual hosts:
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(app.wsgi_app, x_proto=1, x_host=1)
What happens now is that 127.0.0.1 gets translated to yourdomain.tld when delivered to your Flask app. So when you're redirecting back using redirect_to, you're redirecting to your domain externally, no longer relatively. So in the case above, /api2 is redirecting to myDomain.tld/api2/ then /api2/ is called, which is functional.
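To make the mechanism concrete, here is a simplified sketch of what a host-fixing middleware does. It is not Werkzeug's ProxyFix implementation, only an illustration of the idea: forwarded headers set by the proxy overwrite the host and scheme the WSGI app would otherwise see. The class name, the echo app and the sample headers are made up for the example.

```python
from wsgiref.util import application_uri, setup_testing_defaults


class ForwardedHostMiddleware:
    """Illustrative only: trust one proxy and copy its forwarded headers."""

    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        forwarded_host = environ.get("HTTP_X_FORWARDED_HOST")
        forwarded_proto = environ.get("HTTP_X_FORWARDED_PROTO")
        if forwarded_host:
            environ["HTTP_HOST"] = forwarded_host
        if forwarded_proto:
            environ["wsgi.url_scheme"] = forwarded_proto
        return self.wsgi_app(environ, start_response)


def echo_base_url(environ, start_response):
    """Tiny WSGI app that reports the base URL it believes it is served from."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [application_uri(environ).encode("utf-8")]


def call(app, extra_environ):
    """Invoke a WSGI app once with a synthetic environ and return the body text."""
    environ = dict(extra_environ)
    setup_testing_defaults(environ)  # fills in the remaining required keys
    body = app(environ, lambda status, headers: None)
    return b"".join(body).decode("utf-8")


# What the app receives from the proxy (values assumed for the demo).
proxied = {
    "HTTP_HOST": "127.0.0.1",
    "HTTP_X_FORWARDED_HOST": "yourdomain.tld",
    "HTTP_X_FORWARDED_PROTO": "https",
}
without_fix = call(echo_base_url, proxied)
with_fix = call(ForwardedHostMiddleware(echo_base_url), proxied)
```

Only the host and scheme change here; the path is untouched, which is consistent with the observation that a mismatch between the external and internal paths is not repaired by this kind of fix.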
You can also skip the preserve host setting and put your domain name in the redirect manually, like so:
app.add_url_rule("/api2", view_func=StubFunction(), redirect_to="https://yourDomain.tld/api2/")
But I don't like this approach in case you change your domain for some reason.
tl;dr, don't put a trailing slash in your ProxyPass Applications.

How to detect which of the two virtual hosts is being used in python and flask

I have a website developed in flask running on an apache2 server that responds on port 80 to two URLs
Url-1 http://www.example.com
Url-2 http://oer.example.com
I want to detect which of the two URLs the user is coming in from, adjust what the server does accordingly, and store the result in a config variable:
app.config['SITE'] = 'OER'
or
app.config['SITE'] = 'WWW'
Looking around on the internet, I can find lots of examples using urllib2. The issue is that you need to pass it the URL you want to slice, and I can't find a way to pull that out, as it may change between the two with each request.
I could fork the code and put up two different versions but that's as ugly as a box of frogs.
Thoughts welcome.
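Use the Flask request object (from flask import request) and one of the following in your request handler:
from urllib.parse import urlparse

# Option 1: read the Host header from the WSGI environ
hostname = request.environ.get('HTTP_HOST', '')

# Option 2: parse it out of the full request URL
url = urlparse(request.url)
hostname = url.netloc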
This will get e.g. oer.example.com or www.example.com. If there is a port number that will be included too. Keep in mind that this ultimately comes from the client request so "bad" requests might have it set wrong, although hopefully apache wouldn't route those to your app.
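One way to turn that value into the 'OER'/'WWW' flag is a small lookup keyed on the hostname with the port stripped. The sketch below uses only the standard library; the function name, the mapping and the 'WWW' fallback are assumptions for illustration. Since the host can differ on every request, the result is probably better kept per request (for example on flask.g) than in app.config, which is shared by all requests.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from hostname to site key.
SITES = {
    "www.example.com": "WWW",
    "oer.example.com": "OER",
}


def site_for_host(host_header, default="WWW"):
    """Map a Host header value (optionally with a port) to a site key."""
    # Prefixing '//' makes urlsplit treat the value as a network location,
    # so .hostname yields the lowercased host without the port.
    hostname = urlsplit("//" + host_header).hostname or ""
    return SITES.get(hostname, default)
```

In a request handler this could be called as site_for_host(request.host).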

Pyramid: How to specify the base URL for the application

Let's say my app is served at the domain www.example.com.
How (where?) should I specify this in the Pyramid configuration file so that functions like request.route_url would automatically pick it up and generate the correct URL?
(I think [server:main] is not the place for this)
The url generation functions route_url, static_url, resource_url all depend on the WSGI environ dictionary from where they take all the essential parameters required to generate a full URL.
Hence one way to do it is to modify the WSGI environment dictionary at the request creation time, and modify the required parameters. Events are great for this kind of thing:
from pyramid.events import NewRequest
from pyramid.events import subscriber

@subscriber(NewRequest)
def mysubscriber(event):
    # Override the host that the URL generation functions read from the environ
    event.request.environ['HTTP_HOST'] = 'example.com'
After this, route_url will take example.com as the base URL.
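The effect can be seen without Pyramid at all. The following sketch uses the standard library's wsgiref helpers, which rebuild the application URL from the environ in the way PEP 3333 describes. It illustrates the principle rather than Pyramid's own code, and the environ is synthetic.

```python
from wsgiref.util import application_uri, setup_testing_defaults

environ = {}
setup_testing_defaults(environ)  # synthetic environ; HTTP_HOST defaults to 127.0.0.1

before = application_uri(environ)

# Same override as in the subscriber, applied to the plain environ.
environ["HTTP_HOST"] = "example.com"
after = application_uri(environ)
```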
Yes, a proper reverse proxy will forward along the appropriate headers to your wsgi server. See the pyramid cookbook for an nginx recipe.

Using webapp2's DomainRoute on google app engine

I'm trying to use webapp2's DomainRoute to route requests to specific users. The definition of the routes looks like this:
app = webapp2.WSGIApplication([
    DomainRoute("<subdomain>." + os.environ["HTTP_HOST"], [
        webapp2.Route('/', ClientHandler)]),
    ('/', MainHandler)],
    debug=True)
The handlers all exist, and for now my ClientHandler should just spit out the current subdomain, but when I go to nosub.localhost:8090 the request doesn't even reach the server. Do I need to edit my hosts file? And if so, is it valid to add a wildcard like *.localhost so any subdomain will work?
Yes, you need to edit your hosts file - whatever.localhost does not automatically resolve to 127.0.0.1. Alternately, save yourself some time and use xip.io.
Your code has a significant problem, though: you're using os.environ["HTTP_HOST"] in a context that only gets run on the first request. This means that you extract the hostname from the first request to your app and use that as a base name for it and all subsequent requests - which is pretty definitely not what you want. For instance, if the first user to your app instance comes from subdomain.myapp.com, you'll set up a route for subdomain.subdomain.myapp.com.
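One way around this is to take the base domain from configuration rather than from whichever request happens to arrive first. The sketch below is plain Python rather than webapp2 code; BASE_DOMAIN and subdomain_of are hypothetical names, and the same constant could be used when building the DomainRoute template.

```python
BASE_DOMAIN = "myapp.com"  # assumed: fixed in configuration, not read from a request


def subdomain_of(host, base_domain=BASE_DOMAIN):
    """Return the subdomain part of `host`, or None if there is none."""
    hostname = host.split(":", 1)[0].lower()  # drop an optional port
    suffix = "." + base_domain
    if hostname.endswith(suffix):
        return hostname[: -len(suffix)]
    return None
```

With a fixed base domain, a first request arriving at subdomain.myapp.com no longer changes which hosts the route matches.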
