Passing session from template view to python requests api call - python

I want to make multiple internal REST API calls from my Django TemplateView using the requests library, and I want to pass the session from the template view along with each API call. What is the recommended way to do that, keeping performance in mind?
Right now, I extract the cookies from the current request object in the template view and pass them to the requests.get() or requests.post() call. The problem is that I then have to pass the request object to my API client, which I don't want to do.
This is the current wrapper I'm using to route my requests:
def wrap_internal_api_call(request, requests_api, uri, data=None, params=None, cookies=None, is_json=False, files=None):
    headers = {'referer': request.META.get('HTTP_REFERER')}
    logger.debug('Request API: %s calling URL: %s', requests_api, uri)
    logger.debug('Referer header sent with requests: %s', headers['referer'])
    if cookies:
        csrf_token = cookies.get('csrftoken', None)
    else:
        csrf_token = request.COOKIES.get('csrftoken', None)
    if csrf_token:
        headers['X-CSRFToken'] = csrf_token
    if data:
        if is_json:
            return requests_api(uri, json=data, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
        elif not files:
            return requests_api(uri, data=data, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
        else:
            return requests_api(uri, data=data, files=files, params=params, cookies=cookies if cookies else request.COOKIES,
                                headers=headers)
    else:
        return requests_api(uri, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
Basically, I want to get rid of the request parameter (the first one), because otherwise I have to keep passing the request object from TemplateViews to internal services. Also, how can I keep a persistent connection across multiple calls?

REST vs Invoking the view directly
While it's possible for a web app to make a REST API call to itself, that's not what REST is designed for. Consider the following from: https://docs.djangoproject.com/ja/1.9/topics/http/middleware/
As you can see, a Django request/response cycle has quite a bit of overhead. Add to this the overhead of the web server and the WSGI container. On the client side you have the overhead associated with the requests library, but hang on a sec: the client also happens to be the same web app, so it becomes part of the web app's overhead too. And there is the problem of persistence (which I will come to shortly).
Last but not least, if you have a DNS round-robin setup, your request may actually go out on the wire before coming back to the same server. There is a better way: invoke the view directly.
Invoking another view without the REST API call is really easy:
other_app.other_view(request, **kwargs)
This has been discussed a few times here at links such as Django Call Class based view from another class based view and Can I call a view from within another view? so I will not elaborate.
Persistent requests
Persistent http requests (talking about python requests rather than django.http.request.HttpRequest) are managed through session objects (again not to be confused with django sessions). Avoiding confusion is really difficult:
The Session object allows you to persist certain parameters across
requests. It also persists cookies across all requests made from the
Session instance, and will use urllib3's connection pooling. So if
you're making several requests to the same host, the underlying TCP
connection will be reused, which can result in a significant
performance increase
Different hits to your Django view will probably come from different users, so you don't want the same cookies reused for the internal REST call. The other problem is that the requests Session object cannot be persisted between two different hits to the Django view. Sockets generally cannot be serialized, which is a requirement for putting them into memcached or Redis.
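A minimal sketch of how the two points above could be combined: build one requests.Session per hit to the Django view, seed it with that user's cookies, and reuse it for every internal call made while handling that hit. The class name InternalApiClient and its methods are hypothetical, not part of requests or Django.

```python
import requests


class InternalApiClient:
    """Hypothetical per-hit client: one requests.Session per Django view call."""

    def __init__(self, cookies, referer=None):
        self.session = requests.Session()
        # Copy the current user's cookies so they are sent on every call.
        self.session.cookies.update(cookies)
        if referer:
            self.session.headers["Referer"] = referer
        csrf_token = cookies.get("csrftoken")
        if csrf_token:
            self.session.headers["X-CSRFToken"] = csrf_token

    def get(self, uri, **kwargs):
        return self.session.get(uri, **kwargs)

    def post(self, uri, **kwargs):
        return self.session.post(uri, **kwargs)

    def close(self):
        # Release the pooled connections once the view is done.
        self.session.close()
```

In a view this might be created as InternalApiClient(request.COOKIES, request.META.get('HTTP_REFERER')) and closed before returning, so the connection pool is shared by the calls of one hit but never by two users.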
If you still want to persist with internal REST
I think @julian's answer shows how to avoid passing the Django request instance as a parameter.

If you want to avoid passing the request to wrap_internal_api_call, all you need to do is do a bit more work on the end of the TemplateView where you call the api wrapper. Note that your original wrapper is doing a lot of cookies if cookies else request.COOKIES. You can factor that out to the calling site. Rewrite your api wrapper as follows:
def wrap_internal_api_call(referer, requests_api, uri, data=None, params=None, cookies=None, is_json=False, files=None):
    cookies = cookies or {}  # cookies keeps its position but needs a default to be valid Python
    headers = {'referer': referer}
    logger.debug('Request API: %s calling URL: %s', requests_api, uri)
    logger.debug('Referer header sent with requests: %s', referer)
    csrf_token = cookies.get('csrftoken', None)
    if csrf_token:
        headers['X-CSRFToken'] = csrf_token
    if data:
        if is_json:
            return requests_api(uri, json=data, params=params, cookies=cookies, headers=headers)
        elif not files:
            return requests_api(uri, data=data, params=params, cookies=cookies, headers=headers)
        else:
            return requests_api(uri, data=data, files=files, params=params, cookies=cookies, headers=headers)
    else:
        return requests_api(uri, params=params, cookies=cookies, headers=headers)
Now, at the place of invocation, instead of
wrap_internal_api_call(request, requests_api, uri, data, params, cookies, is_json, files)
do:
cookies_param = cookies or request.COOKIES
referer_param = request.META.get('HTTP_REFERER')
wrap_internal_api_call(referer_param, requests_api, uri, data, params, cookies_param, is_json, files)
Now you are not passing the request object to the wrapper anymore. This saves a little bit of time because you don't test cookies over and over, but otherwise it doesn't make a difference for performance. In fact, you could achieve the same slight performance gain just by doing the cookies or request.COOKIES once inside the api wrapper.
Networking is often the tightest bottleneck in an application. So if these internal APIs are on the same machine as your TemplateView, your best bet for performance is to avoid making an API call at all.

Basically I want to get rid of that request parameter (1st param), because then to call it I've to keep passing request object from TemplateViews to internal services.
To pass function args without explicitly passing them into function calls you can use decorators to wrap your functions and automatically inject your arguments. Using this with a global variable and some django middleware for registering the request before it gets to your view will solve your problem. See below for an abstracted and simplified version of what I mean.
request_decorators.py
REQUEST = None
def request_extractor(func):
    def extractor(cls, request, *args, **kwargs):
        global REQUEST
        REQUEST = request  # this part registers request arg to global
        return func(cls, request, *args, **kwargs)
    return extractor
def request_injector(func):
    def injector(*args, **kwargs):
        global REQUEST
        request = REQUEST
        if len(args) > 0 and callable(args[0]):  # to make it work with class methods
            return func(args[0], request, *args[1:], **kwargs)  # class method
        return func(request, *args, **kwargs)  # function
    return injector
extract_request_middleware.py
See the django docs for info on setting up middleware
from request_decorators import request_extractor
class ExtractRequest:
    @request_extractor
    def process_request(self, request):
        return None
internal_function.py
from request_decorators import request_injector
@request_injector
def internal_function(request):
    return request
your_view.py
from internal_function import internal_function
def view_with_request(request):
    return internal_function()  # here we don't need to pass in the request arg.
def run_test():
    request = "a request!"
    ExtractRequest().process_request(request)
    response = view_with_request(request)
    return response
if __name__ == '__main__':
    assert run_test() == "a request!"
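One caveat with the module-level REQUEST: it is shared by every thread in the process, so on a multithreaded server one hit could overwrite the request of another. A hedged variant of the same idea keeps the value in threading.local instead. The helper names set_current_request and get_current_request are hypothetical, and this is only a sketch of the storage part, not a complete middleware.

```python
import threading

_state = threading.local()  # each thread sees its own attributes


def set_current_request(request):
    # Call this from the middleware instead of assigning to a global.
    _state.request = request


def get_current_request():
    # Returns None when no request was registered on this thread.
    return getattr(_state, "request", None)


def request_injector(func):
    def injector(*args, **kwargs):
        # Inject the request registered on the calling thread.
        return func(get_current_request(), *args, **kwargs)
    return injector
```

A function decorated with this request_injector receives the request registered on its own thread, and None on a thread where nothing was registered.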

Related

Python requests - cannot understand how the argument is passed

I am using this code to get data from Twitter API.
The code works, but I cannot understand how.
Specifically, I cannot understand how the auth=bearer_oauth argument works, since I am passing a function. And how the function works, since I am calling it without its argument.
Sorry if this is too basic, but I could not find an answer.
import requests
bearer_token = "AAA"
api_url = "https://api.twitter.com/2/tweets/search/recent"
def bearer_oauth(r):
    r.headers["Authorization"] = f"Bearer {bearer_token}"
    return r
def connect_to_endpoint(url, params):
    response = requests.get(url, params=params, auth=bearer_oauth)
    return response
query_params = {'query': 'test'}
json_response = connect_to_endpoint(api_url, query_params)
The bearer_oauth function is just setting the request's authorization header to the bearer token before the request is sent.
The code you provided essentially has the same functionality as this:
headers = {"Authorization": f"Bearer {bearer_token}"}
requests.get(url, headers=headers)
After you send the request, Twitter's server parses the authorization header and checks that the bearer token you supplied is valid and has access to the requested resources.
As for why your specific code works: auth=bearer_oauth passes the function object itself; it does not call it. requests treats it as an authentication handler and calls it with the request object while that request is being prepared, so you never have to supply the argument yourself.
If you're curious about the implementation, I'd read the internal code here. It looks like the request object is passed to the handler, which then modifies the request (in this case, by setting the authorization header), and then returns the modified request object back to the internal function preparing the request. Then, all of the modified request object's attributes are copied:
# Allow auth to make its changes.
r = auth(self)
# Update self to reflect the auth changes.
self.__dict__.update(r.__dict__)
Since __dict__ is an internal dictionary that holds all the attributes of a single object, everything that was changed about the request object in the handler function will be copied and included in the request before it is sent.
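To watch the handler being called without contacting Twitter, the request can be prepared but not sent. This is a small sketch reusing the names from the question; the token is the same placeholder, and nothing goes over the network.

```python
import requests

bearer_token = "AAA"  # placeholder token, as in the question


def bearer_oauth(r):
    # requests calls this with the PreparedRequest that is being built.
    r.headers["Authorization"] = f"Bearer {bearer_token}"
    return r


# Build and prepare the request; prepare() runs the auth handler.
prepared = requests.Request(
    "GET",
    "https://api.twitter.com/2/tweets/search/recent",
    params={"query": "test"},
    auth=bearer_oauth,
).prepare()

print(prepared.headers["Authorization"])
```

The header is present on the prepared request even though bearer_oauth was never called explicitly in this code.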

How to mock a url path returning response in Django / Python?

I have a function like this:
def get_some_data(api_url, **kwargs):
    # some logic on generating headers
    # some more logic
    response = requests.get(api_url, headers=headers, params=params)
    return response
I need to create a fake/mock "api_url" which, when requested, would return a valid response.
I understand how to mock the response:
def mock_response(data):
    response = requests.Response()
    response.status_code = 200
    response._content = json.dumps(data).encode()  # _content must be bytes
    return response
But I need to make the test call like this:
def test_get_some_data(api_url: some_magic_url_path_that_will_return_mock_response):
Any ideas on how to create an url path returning a response within the scope of the test (only standard Django, Python, pytest, unittest) would be very much appreciated
The documentation is well written and clear on how to mock whatever you want. But let's say you have a service that makes the third-party API call:
def foo(url, params):
    # some logic on generating headers
    # some more logic
    response = requests.get(url, headers=headers, params=params)
    return response
In your test you want to mock the return value of this service.
@patch("path_to_service.foo")
def test_api_call_response(self, mock_response):
    mock_response.return_value = ...  # whatever return value you want it to be
    # Here you call the service as usual
    response = foo(..., ...)
    # Assert your response
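An alternative sketch patches requests.get itself instead of the service function, so the code under test still runs but no request leaves the process. The body of get_some_data here is a hypothetical stand-in (the real header logic is not shown in the question), and the URL is a placeholder.

```python
from unittest.mock import Mock, patch

import requests


def get_some_data(api_url, **kwargs):
    # Stand-in for the function under test (hypothetical header logic).
    headers = {"Accept": "application/json"}
    return requests.get(api_url, headers=headers, params=kwargs)


def test_get_some_data():
    fake_response = Mock(status_code=200)
    fake_response.json.return_value = {"ok": True}

    # Replace requests.get for the duration of the block; nothing is sent.
    with patch("requests.get", return_value=fake_response) as mock_get:
        response = get_some_data("https://example.invalid/api", page=1)

    assert response.status_code == 200
    assert response.json() == {"ok": True}
    mock_get.assert_called_once_with(
        "https://example.invalid/api",
        headers={"Accept": "application/json"},
        params={"page": 1},
    )
```

With this shape, the test does not need a special URL at all: any string works, because the patched requests.get returns the fake response regardless of the address.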

Flask Middleware with both Request and Response

I want to create a middleware function in Flask that logs details from the request and the response. The middleware should run after the Response is created, but before it is sent back. I want to log:
The request's HTTP method (GET, POST, or PUT)
The request endpoint
The response HTTP status code, including 500 responses. So, if an exception is raised in the view function, I want to record the resulting 500 Response before the Flask internals send it off.
Some options I've found (that don't quite work for me):
The before_request and after_request decorators. Even if I could access the request data in after_request, my problem still wouldn't be solved, because according to the documentation
If a function raises an exception, any remaining after_request functions will not be called.
Deferred Request Callbacks - there is an after_this_request decorator described on this page, which decorates an arbitrary function (defined inside the current view function) and registers it to run after the current request. Since the arbitrary function can have info from both the request and response in it, it partially solves my problem. The catch is that I would have to add such a decorated function to every view function; a situation I would very much like to avoid.
@app.route('/')
def index():
    @after_this_request
    def add_header(response):
        response.headers['X-Foo'] = 'Parachute'
        return response
    return 'Hello World!'
Any suggestions?
My first answer is very hacky. There's actually a much better way to achieve the same result by making use of the g object in Flask. It is useful for storing information globally during a single request. From the documentation:
The g name stands for “global”, but that is referring to the data being global within a context. The data on g is lost after the context ends, and it is not an appropriate place to store data between requests. Use the session or a database to store data across requests.
This is how you would use it:
@app.before_request
def gather_request_data():
    g.method = request.method
    g.url = request.url
@app.after_request
def log_details(response: Response):
    g.status = response.status
    logger.info(f'method: {g.method}\n url: {g.url}\n status: {g.status}')
    return response
Gather whatever request information you want in the function decorated with @app.before_request and store it in the g object.
Access whatever you want from the response in the function decorated with @app.after_request. You can still refer to the information you stored in the g object from step 1. Note that you'll have to return the response at the end of this function.
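A self-contained sketch of the g-based approach that can be exercised with Flask's test client. The logged list is a hypothetical in-memory sink added only so the result can be inspected, and the /boom route simulates a failing view. As far as I can tell, recent Flask versions still run after_request functions for the 500 response produced from an unhandled exception (when the app is not in debug or testing mode), but that is worth verifying against the Flask version you run.

```python
import logging

from flask import Flask, g, request

app = Flask(__name__)
logger = logging.getLogger("request_log")
logged = []  # hypothetical in-memory sink, only for inspecting the sketch


@app.before_request
def gather_request_data():
    # Stash request details for the after_request hook.
    g.method = request.method
    g.path = request.path


@app.after_request
def log_details(response):
    line = f"{g.method} {g.path} -> {response.status_code}"
    logged.append(line)
    logger.info(line)
    return response  # after_request functions must return the response


@app.route("/")
def index():
    return "Hello World!"


@app.route("/boom")
def boom():
    raise RuntimeError("simulated failure")
```

Calling app.test_client().get("/") and then .get("/boom") should leave one entry per request in logged, which makes it easy to check what your Flask version does for error responses.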
You can use flask-http-middleware for this:
from flask import Flask
from flask_http_middleware import MiddlewareManager, BaseHTTPMiddleware
app = Flask(__name__)
class MetricsMiddleware(BaseHTTPMiddleware):
    def __init__(self):
        super().__init__()
    def dispatch(self, request, call_next):
        url = request.url
        response = call_next(request)
        response.headers.add("x-url", url)
        return response
app.wsgi_app = MiddlewareManager(app)
app.wsgi_app.add_middleware(MetricsMiddleware)
@app.get("/health")
def health():
    return {"message":"I'm healthy"}
if __name__ == "__main__":
    app.run()
Every time you make a request, it will pass through the middleware.
Okay, so the answer was staring me in the face the whole time, on the page on Deferred Request Callbacks.
The trick is to register a function to run after the current request using after_this_request from inside the before_request callback. This is the code snippet of what worked for me:
@app.before_request
def log_details():
    method = request.method
    url = request.url
    @after_this_request
    def log_details_callback(response: Response):
        logger.info(f'method: {method}\n url: {url}\n status: {response.status}')
        return response  # the callback must return the response
These are the steps:
Get the required details from the request in the before_request callback and store them in some variables.
Then access what you want from the response in the function you decorate with after_this_request, along with the variables in which you stored the request details earlier.

Change request headers between subsequent retries

Consider an http request using an OAuth token. The access token needs to be included in the header as bearer. However, if the token is expired, another request needs to be made to refresh the token and then try again. So the custom Retry object will look like:
s = requests.Session()
# token is added to the header here
s.headers.update(token_header)
retry = OAuthRetry(
    total=2,
    read=2,
    connect=2,
    backoff_factor=1,
    status_forcelist=[401],
    method_whitelist=frozenset(['GET', 'POST']),
    session=s
)
adapter = HTTPAdapter(max_retries=retry)
s.mount('http://', adapter)
s.mount('https://', adapter)
r = s.post(url, data=data)
The Retry class:
class OAuthRetry(Retry):
    def increment(self, method, url, *args, **kwargs):
        # refresh the token here. This could be by getting a reference to the session or any other way.
        return super(OAuthRetry, self).increment(method, url, *args, **kwargs)
The problem is that after the token is refreshed, HTTPConnectionPool is still using the same headers to make the request after calling increment. See: https://github.com/urllib3/urllib3/blob/master/src/urllib3/connectionpool.py#L787.
Although the instance of the pool is passed in increment, changing the headers there will not affect the call since it is using a local copy of the headers.
This seems like a use case that should come up frequently for the request parameters to change in between retries.
Is there a way to change the request headers in between two subsequent retries?
No, not in the current versions of Requests (2.18.4) and urllib3 (1.22).
Retries are ultimately handled by urlopen in urllib3, and tracing through the whole function shows no interface for changing headers between retries.
Dynamically changing the headers should not be considered a solution either. From the docs:
headers – Dictionary of custom headers to send, such as User-Agent, If-None-Match, etc. If None, pool headers are used. If provided, these headers completely replace any pool-specific headers.
headers is a parameter passed to the function, and there is no guarantee that it will not be copied after being passed. Although urlopen does not copy headers in the current version of urllib3, any solution based on mutating headers is hacky, since it relies on the implementation rather than on the documentation.
One workaround
Interrupting a function to edit a variable it is using is very dangerous.
Instead of injecting something into urllib3, a simple solution is to check the response status and try again if needed.
r = s.post(url, data=data)
if r.status_code == 401:
    # refresh the token here.
    r = s.post(url, data=data)
Why does the original approach not work?
Requests copies the headers in prepare_headers before sending them to urllib3, so when urllib3 retries it uses the copy that was made before the edit.
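The check-and-resend workaround can be wrapped in a small helper. This is a sketch under stated assumptions: request_with_refresh is a hypothetical name, refresh_token is a hypothetical callable that returns the new header dict, and the request body is assumed to be re-sendable (not a consumed stream).

```python
def request_with_refresh(session, method, url, refresh_token, max_refreshes=1, **kwargs):
    """Send a request; on 401, refresh the token and resend with new headers.

    `session` is expected to behave like requests.Session (a `headers`
    mapping and a `request(method, url, **kwargs)` method).
    """
    response = session.request(method, url, **kwargs)
    for _ in range(max_refreshes):
        if response.status_code != 401:
            break
        # Updating session.headers affects the NEXT prepared request,
        # which is why this works where editing headers mid-retry does not.
        session.headers.update(refresh_token())
        response = session.request(method, url, **kwargs)
    return response
```

Because each resend goes through session.request again, the headers are copied afresh when the new request is prepared, so the refreshed token is the one that gets sent.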

Return a requests.Response object from Flask

I'm trying to build a simple proxy using Flask and requests. The code is as follows:
@app.route('/es/<string:index>/<string:type>/<string:id>',
           methods=['GET', 'POST', 'PUT'])
def es(index, type, id):
    elasticsearch = find_out_where_elasticsearch_lives()
    # also handle some authentication
    url = '%s%s%s%s' % (elasticsearch, index, type, id)
    esreq = requests.Request(method=request.method, url=url,
                             headers=request.headers, data=request.data)
    resp = requests.Session().send(esreq.prepare())
    return resp.text
This works, except that it loses the status code from Elasticsearch. I tried returning resp (a requests.models.Response) directly, but this fails with
TypeError: 'Response' object is not callable
Is there another, simple, way to return a requests.models.Response from Flask?
Ok, found it:
If a tuple is returned the items in the tuple can provide extra information. Such tuples have to be in the form (response, status, headers). The status value will override the status code and headers can be a list or dictionary of additional header values.
(Flask docs.)
So
return (resp.text, resp.status_code, resp.headers.items())
seems to do the trick.
Using the text or content property of the Response object will not work if the server returns encoded data (such as content-encoding: gzip) and you return the headers unchanged. This happens because text and content have already been decoded, so the encoding reported in the headers no longer matches the actual encoding of the body.
According to the documentation:
In the rare case that you’d like to get the raw socket response from the server, you can access r.raw. If you want to do this, make sure you set stream=True in your initial request.
and
Response.raw is a raw stream of bytes – it does not transform the response content.
So, the following works for gzipped data too:
esreq = requests.Request(method=request.method, url=url,
headers=request.headers, data=request.data)
resp = requests.Session().send(esreq.prepare(), stream=True)
return resp.raw.read(), resp.status_code, resp.headers.items()
If you use a shortcut method such as get, it's just:
resp = requests.get(url, stream=True)
return resp.raw.read(), resp.status_code, resp.headers.items()
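If you would rather keep returning the decoded resp.content, another option is to drop the headers that no longer describe the body. A minimal sketch: proxy_headers and STALE_HEADERS are hypothetical names, and the header list is an assumption about which headers go stale after decoding, not an exhaustive one.

```python
# Headers that describe the upstream wire format rather than the decoded body.
STALE_HEADERS = {"content-encoding", "content-length", "transfer-encoding"}


def proxy_headers(upstream_headers):
    """Return (name, value) pairs that still apply to the decoded content."""
    return [
        (name, value)
        for name, value in upstream_headers.items()
        if name.lower() not in STALE_HEADERS
    ]
```

The view would then end with return resp.content, resp.status_code, proxy_headers(resp.headers), leaving the framework to describe the body it actually sends.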
Flask can return an object of type flask.wrappers.Response.
You can create one of these from your requests.models.Response object r like this:
from flask import Response
return Response(
    response=r.reason,
    status=r.status_code,
    headers=dict(r.headers)
)
I ran into the same scenario, except that in my case my requests.models.Response contained an attachment. This is how I got it to work:
return send_file(BytesIO(result.content), mimetype=result.headers['Content-Type'], as_attachment=True)
My use case is to call another API in my own Flask API. I'm just propagating unsuccessful requests.get calls through my Flask response. Here's my successful approach:
headers = {
    'Authorization': 'Bearer Muh Token'
}
try:
    response = requests.get(
        '{domain}/users/{id}'
        .format(domain=USERS_API_URL, id=hit['id']),
        headers=headers)
    response.raise_for_status()
except HTTPError as err:
    logging.error(err)
    flask.abort(flask.Response(response=response.content, status=response.status_code, headers=response.headers.items()))
