Proxying to another web service with Flask - python

I want to proxy requests made to my Flask app to another web service running locally on the machine. I'd rather use Flask for this than our higher-level nginx instance so that we can reuse our existing authentication system built into our app. The more we can keep this "single sign on" the better.
Is there an existing module or other code to do this? Trying to bridge the Flask app through to something like httplib or urllib is proving to be a pain.

I spent a good deal of time working on this same thing and eventually found a solution using the requests library that seems to work well. It even handles setting multiple cookies in one response, which took a bit of investigation to figure out. Here's the flask view function:
from dotenv import load_dotenv # pip package python-dotenv
import os
#
from flask import request, Response
import requests # pip package requests
load_dotenv()
API_HOST = os.environ.get('API_HOST'); assert API_HOST, 'Envvar API_HOST is required'
#api.route('/', defaults={'path': ''}) # ref. https://medium.com/#zwork101/making-a-flask-proxy-server-online-in-10-lines-of-code-44b8721bca6
#api.route('/<path>')
def redirect_to_API_HOST(path): #NOTE var :path will be unused as all path we need will be read from :request ie from flask import request
res = requests.request( # ref. https://stackoverflow.com/a/36601467/248616
method = request.method,
url = request.url.replace(request.host_url, f'{API_HOST}/'),
headers = {k:v for k,v in request.headers if k.lower() == 'host'},
data = request.get_data(),
cookies = request.cookies,
allow_redirects = False,
)
#region exlcude some keys in :res response
excluded_headers = ['content-encoding', 'content-length', 'transfer-encoding', 'connection'] #NOTE we here exclude all "hop-by-hop headers" defined by RFC 2616 section 13.5.1 ref. https://www.rfc-editor.org/rfc/rfc2616#section-13.5.1
headers = [
(k,v) for k,v in res.raw.headers.items()
if k.lower() not in excluded_headers
]
#endregion exlcude some keys in :res response
response = Response(res.content, res.status_code, headers)
return response
Update April 2021: excluded_headers should probably include all "hop-by-hop headers" defined by RFC 2616 section 13.5.1.

I have an implementation of a proxy using httplib in a Werkzeug-based app (as in your case, I needed to use the webapp's authentication and authorization).
Although the Flask docs don't state how to access the HTTP headers, you can use request.headers (see Werkzeug documentation). If you don't need to modify the response, and the headers used by the proxied app are predictable, proxying is staightforward.
Note that if you don't need to modify the response, you should use the werkzeug.wsgi.wrap_file to wrap httplib's response stream. That allows passing of the open OS-level file descriptor to the HTTP server for optimal performance.

My original plan was for the public-facing URL to be something like http://www.example.com/admin/myapp proxying to http://myapp.internal.example.com/. Down that path leads madness.
Most webapps, particularly self-hosted ones, assume that they're going to be running at the root of a HTTP server and do things like reference other files by absolute path. To work around this, you have to rewrite URLs all over the place: Location headers and HTML, JavaScript, and CSS files.
I did write a Flask proxy blueprint which did this, and while it worked well enough for the one webapp I really wanted to proxy, it was not sustainable. It was a big mess of regular expressions.
In the end, I set up a new virtual host in nginx and used its own proxying. Since both were at the root of the host, URL rewriting was mostly unnecessary. (And what little was necessary, nginx's proxy module handled.) The webapp being proxied to does its own authentication which is good enough for now.

Related

Python session SAMESITE=None not being set

I am having issues with chrome and SameSite. I am serving a webpage in a shopify iframe and when setting the session using flask-login, chrome tells me this:
A cookie associated with a cross-site resource at
URL was set without the
SameSite attribute. It been blocked, as Chrome now only delivers
cookies with cross-site requests if they are set with SameSite=None
and Secure.
Secure is set, but I tried to set SameSite in all the possible way, but without effect.
I tried setting
app.config['SESSION_COOKIE_SAMESITE'] = "None"
I tried, changing the behavior of the library, I tired setting the attribute in set_cookie() but nothing seemed to work. The response I see doesn't have the SameSite attribute.
(I have the last versions of flask, flask-login, flask-security and werkzeug)
Can you help me?
Thank you
Just to expand on this, using flask application config just as you've mentioned, you can set everything except when setting SESSION_COOKIE_SAMESITE=None Google Chrome doesn't seem to place the value as "None", which then defaults to "Lax".
How i worked around this problem was to add the cookie back into the response header. First I had to get the cookie value because using request.cookies.get("my_cookie") doesn't seem to extract the cookie value from the response and always appears as None.
secondly, using the response.set_cookie() still doesn't set the samesite=None value. I have no idea why because i'm using the latest version of flask and Werkzeug which apparently should fix the problem but it doesn't. After lots of testing, I found out using the response.headers.add() works to add a Set-Cookie: header but I needed a way to extract the cookie value to ensure I can get the same session. After looking through flask docs and other online forums. I found out that I can actually call SecureCookieSessionInterface class and get the signed session from there.
from flask import session
from flask.sessions import SecureCookieSessionInterface
# where `app` is your Flask Application name.
session_cookie = SecureCookieSessionInterface().get_signing_serializer(app)
Lastly, i had to ensure that the same session is added to the response after the request has been established rather than calling it on every route which doesn't seem feasible within a full fledged application. This is done by using the after_request decorator which runs automatically after a request.
#app.after_request
def cookies(response):
same_cookie = session_cookie.dumps(dict(session))
response.headers.add("Set-Cookie", f"my_cookie={same_cookie}; Secure; HttpOnly; SameSite=None; Path=/;")
return response
What I noticed in Chrome is that, it basically sets a duplicate cookie with the same signed value. Since both are identical with one having samesite=None in the response header and the other blocked by Chrome seems to be ignored. Thus, the session is validated with the flask app and access is allowed.
A mistake easily made (as I did) is to confuse None with "None". Be sure to use the string instead of the python literal like so:
response.set_cookie("key", value, ..., samesite="None")
samesite=None would indeed be ignored and defaults to "Lax".
From: https://github.com/GoogleChromeLabs/samesite-examples/blob/master/python-flask.md
Assuming you're on the latest version of werkzeug that includes the fix to this issue, you should be able to use
set_cookie()
like this:
from flask import Flask, make_response
app = Flask(__name__)
#app.route('/')
def hello_world():
resp = make_response('Hello, World!');
resp.set_cookie('same-site-cookie', 'foo', samesite='Lax');
resp.set_cookie('cross-site-cookie', 'bar', samesite='None', secure=True);
return resp
Otherwise, you can still
set the header
explicitly:
from flask import Flask, make_response
app = Flask(__name__)
#app.route('/')
def hello_world():
resp = make_response('Hello, World!');
resp.set_cookie('same-site-cookie', 'foo', samesite='Lax');
# Ensure you use "add" to not overwrite existing cookie headers
resp.headers.add('Set-Cookie','cross-site-cookie=bar; SameSite=None; Secure')
return resp
I had a similar issue where it kept setting SESSION_COOKIE_SAMESITE to Lax. In my case it was caused by flask-talisman, who was overwriting my config.
Specifying
session_cookie_samesite=app.config["SESSION_COOKIE_SAMESITE"]
in the Talisman constructor worked.

Falcon CORS middleware does not work properly

I'm using Falcon CORS to allow access to my web service only from several domains. But it does not work properly.
Let me explain, if we take a look at my implementation:
ALLOWED_ORIGINS = ['*']
crossdomain_origin = CORS(allow_origins_list=[ALLOWED_ORIGINS], log_level='DEBUG')
app = falcon.API(middleware=[RequireJSON(), JSONTranslator(), cors.middleware])
When I make any post request to my API service, I get this warning:
Aborting response due to origin not allowed
But, then I get the correct response from my API.
Here is an official docs about this module: https://github.com/lwcolton/falcon-cors
Your code does not match the falcon-cors documentation's example:
import falcon
from falcon_cors import CORS
cors = CORS(allow_origins_list=['http://test.com:8080'])
api = falcon.API(middleware=[cors.middleware])
# ^^^^^^^^^^^^^^^
Note the cors.middleware variable is being passed into the api call. In your code you are creating crossdomain_origin but not passing it into the API setup.
If this does not solve it, please provide a working code example, including the Falcon resource classes, that is easy to test and reproduce, and I'm happy to try to assist.
edit:
From comments below, it sounds like falcon-cors is working properly, rather the problem may be origin header was being omitted from the request.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
The Origin header indicates the origin of the cross-site access request or preflight request.
I tried as guided by lwcolton on github here
And also set allow_all_headers=True, allow_all_methods=True
i.e. same as #Ryan comment
from falcon_cors import CORS
cors = CORS(
allow_all_origins=True,
allow_all_headers=True,
allow_all_methods=True,
)
api = falcon.API(middleware=[cors.middleware])
Side note:
ORIGIN '*' does not work on some browsers.. notably IE. In the past I've had to dynamically set the ORIGIN header to the 'host' name requested in the HTTP headers in order to support a wildcard domain host for a site I setup.
There's is another way to implement this without using falcon-cors
You might want to look at this on the official documentation - how-do-i-implement-cors-with-falcon
class CORSComponent:
def process_response(self, req, resp, resource, req_succeeded):
resp.set_header('Access-Control-Allow-Origin', '*')
if (req_succeeded
and req.method == 'OPTIONS'
and req.get_header('Access-Control-Request-Method')
):
# NOTE: This is a CORS preflight request. Patch the
# response accordingly.
allow = resp.get_header('Allow')
resp.delete_header('Allow')
allow_headers = req.get_header(
'Access-Control-Request-Headers',
default='*'
)
resp.set_headers((
('Access-Control-Allow-Methods', allow),
('Access-Control-Allow-Headers', allow_headers),
('Access-Control-Max-Age', '86400'), # 24 hours
))
When using the above approach, OPTIONS requests must also be special-cased in any other middleware or hooks you use for auth, content-negotiation, etc. For example, you will typically skip auth for preflight requests because it is simply unnecessary; note that such request do not include the Authorization header in any case.
You can now put this in middleware
api = falcon.API(middleware=[
CORSComponent()
])

Parse raw HTTP request with Werkzeug

I'm writing a fuzzer for a Flask application. I have example requests stored as text files, like this get.txt:
GET /docs/index.html HTTP/1.1
Host: www.w3.org
Ideally, I'd parse this to a werkzeug.wrappers.Request object, something like this (psuedo-code):
from werkzeug.wrappers import Request
req = Request()
with open('get.txt') as f:
req.parse_raw(f.read())
However, it looks like the raw HTTP parsing isn't happening in Werkzeug. Instead, Werkzeug takes a WSGI environ from BaseHTTPServer.BaseHTTPRequestHandler, and that requires a BaseHTTPServer.HTTPServer instance to parse the request. That seems like overkill for something this simple.
I also came across http-parser, which is closer to what I want, but it duplicates most of Werkzeug's data structures with incompatible types. I'd have to transform the data from one to another.
Is there a simpler way to go from raw HTTP request to WSGI environ in Werkzeug (or using BaseHTTPRequestHandler without an HTTP server)?
Didn't find an easy way to do this, so I wrote a library called Werkzeug-Raw to parse raw HTTP requests to WSGI environments (or even to open requests on test clients).
It works like this:
from flask import Flask, request
import werkzeug_raw
app = Flask(__name__)
environ = werkzeug_raw.environ('GET /foo/bar?tequila=42 HTTP/1.1')
with app.request_context(environ):
print request.args # ImmutableMultiDict([('tequila', u'42')])
To open a raw HTTP request on a test client:
client = app.test_client()
rv = werkzeug_raw.open(client, 'GET /foo/bar?tequila=42 HTTP/1.1')

Serving HTTP client with versioned resource only if it has changed - using Flask

I am running a webserver based on Flask, which serves a resource being versioned (e.g. installation file of some versioned program). I want to serve my HTTP client with new resource only in case, it already does not have the current version available. If there is new version, I want the client to download the resource and install it.
my Flask server looks like this
import json
import redis
import math
import requests
from flask import Flask,render_template,request
app=Flask(__name__)
#app.route('/version', methods=['GET','POST'])
def getversion():
r_server=redis.Redis("127.0.0.1")
if request.method == 'POST':
jsonobj_recieve=request.data
data=json.loads(jsonobj)
currentversion=r_server.hget('version')
if data == currentversion:
#code to return a 'ok'
else:
#code to return 'not ok' also should send the updated file to the client
else:
return r_server.hget('version')
if __name__ == '__main__':
app.run(
debug=True,
host="127.0.0.1",
port=80
)
my client is very basic:
import sys
import json
import requests
url="http://127.0.0.1/version"
jsonobj=json.dumps(str(sys.argv[1]))
print jsonobj
r=requests.post(url,data=jsonobj)
I will likely have to recode the entire client, this is not a problem but I really have no idea where to start....
Requirements Review
have web app, serving a versioned resource. It can be e.g. file with an applications.
have client, which allows fetching the resource only in case, the version of resource on the server and what client has locally already available differ
the client is aware of version string of the resource
allow client to learn new version string if new version is available
HTTP like design of your solution
If you want to allow downloading an application only in case, the client does not have it already, following design could be used:
use etag header. This usually contains some string describing unique status of resource you want to get from that url. In your case it could be current version number of your application.
in your request, use header "if-none-match", providing version number of your application present at client. This will result in HTTP Status code 306 - Not Modified in case, your client and server share the same version of resource. In case it differs, you would simply provide the content of the resource and use it. Your resource shall also denote in etag current version of the resource and your client shall take note of it, or find new version name from other sources (like from the downloaded file).
This design follows HTTP principles.
Flask serving resource with declaring version in etag
This is focusing on showing the principle, you shall elaborate on providing real content of the resource.
from flask import Flask, Response, request
import werkzeug.exceptions
app = Flask(__name__)
class NotModified(werkzeug.exceptions.HTTPException):
code = 304
def get_response(self, environment):
return Response(status=304)
#app.route('/download/app')
def downloadapp():
currver = "1.0"
if request.if_none_match and currver in request.if_none_match:
raise NotModified
def generate():
yield "app_file_part 1"
yield "app_file_part 2"
yield "app_file_part 3"
return Response(generate(), headers={"etag": currver})
if __name__ == '__main__':
app.run(debug=True)
Client getting resource only, if it is new
import requests
ver = "1.0"
url = "http://localhost:5000/download/app"
req = requests.get(url, headers={"If-None-Match": ver})
if req.status_code == 200:
print "new content of resource", req.content
new_ver = req.headers["etag"]
else:
print "resource did not change since last time"
Alternative solution of web part using web server (e.g. NGINX)
Assuming the resource is static file, which updates only sometime, you shall be able configuring your web server, e.g. NGINX, to serve that resource and declaring in your configuration explicit value for etag header to the version string.
Note, that as it was not requested, this alternative solution is not elaborated here (and was not tested).
Client implementation would not be modified by that (here it pays back the design is following HTTP concepts).
There are multiple ways of achieving this but as this is a Flask app, here's one using HTTP.
If the version is OK, just return a relevant status code, like a 200 OK. You can add a JSON response in the body if that's necessary. If you return a string with flask, the status code will be 200 OK and you can inspect that in your client.
If the version differs, return the URL where the file is located. The client will have to
download the file. That's pretty simple using requests. Here's a typical example for downloading file by streaming requests:
def get(url, chunk_size=1024):
""" Download a file in chunks of n bytes """
fn = url.split("/")[-1] # if you're url is complicated, use urlparse.
stream = requests.get(url, stream=True)
with open(fn, "wb") as local:
for chunk in stream.iter_content(chunk_size=chunk_size):
if chunk:
f.write(chunk)
return fn
This is very simplified. If your file is not static and cannot live on the server (like software update patches probably shouldn't) then you'll have to figure out a way to get the file from a database or generate it on the fly.

Dumping HTTP requests with Flask

I am developing a Flask application based web application ( https://github.com/opensourcehacker/sevabot ) which has HTTP based API services.
Many developers are using and extending the API and I'd like to add a feature which prints Flask's HTTP request to Python logging output, so you can see raw HTTP payloads, source IP and headers you get.
What hooks Flask offers where this kind of HTTP request dumping would be the easiest to implement
Are there any existing solutions and best practices to learn from?
Flask makes a standard logger available at at current_app.logger, there's an example configuration in this gist, though you can centralise the logging calls in a before_request handler if you want to log every request:
from flask import request, current_app
#app.before_request
def log_request():
if current_app.config.get('LOG_REQUESTS'):
current_app.logger.debug('whatever')
# Or if you dont want to use a logger, implement
# whatever system you prefer here
# print request.headers
# open(current_app.config['REQUEST_LOG_FILE'], 'w').write('...')

Categories