aiohttp / Getting response object out of context manager - python

I'm currently doing my first "baby-steps" with aiohttp (coming from the requests module).
I tried to simplify the requests a bit so I won't have to use a context manager for each request in my main module.
Therefore I tried this:
async def get(session, url, headers, proxies=None):
    async with session.get(url, headers=headers, proxy=proxies) as response:
        response_object = response
    return response_object
But it resulted in:
<class 'aiohttp.client_exceptions.ClientConnectionError'> - Connection closed
The response is available inside the context manager. When I access it there, within the function above, everything works.
But shouldn't it also be possible to save it in the variable response_object and return it afterwards, so I can access it outside of the context manager?
Is there any workaround to this?

If you don't mind the body being loaded inside the get function, you could read it there:
async def get(session, url, headers, proxies=None):
    async with session.get(url, headers=headers, proxy=proxies) as response:
        await response.read()
        return response
And then use the body that was read, like this:
resp = await get(session, 'http://python.org', {})
print(await resp.text())
Under the hood, the read method caches the body in a member named _body, and when you call text or json, aiohttp first checks whether the body has already been read.
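Another option, if you'd rather not pass the response object around at all, is to copy what you need into a plain object while the context manager is still open. This is only a minimal sketch: FetchedResponse and fetch are names made up for this example, and session is assumed to be an aiohttp.ClientSession (the function only relies on its get(...) context manager and on the response's status, headers and read()).

```python
from dataclasses import dataclass


@dataclass
class FetchedResponse:
    """Hypothetical snapshot of a response; stays usable after the connection is released."""
    status: int
    headers: dict
    body: bytes

    def text(self, encoding="utf-8"):
        # Decoding is done locally, no network access is needed any more.
        return self.body.decode(encoding)


async def fetch(session, url, headers=None, proxy=None):
    # `session` is assumed to be an aiohttp.ClientSession (or anything that
    # exposes the same `get(...)` async context manager).
    async with session.get(url, headers=headers, proxy=proxy) as response:
        body = await response.read()  # read while the connection is still open
        # dict(...) makes a plain copy; header lookups on it are case-sensitive.
        return FetchedResponse(response.status, dict(response.headers), body)
```

The caller then works with plain data, e.g. result = await fetch(session, url) followed by result.text(), and never touches a closed connection.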

Related

AIOHTTP having request body/content/text when calling raise_for_status

I'm using FastAPI with aiohttp, I built a singleton for a persistent session and I'm using it for opening the session at startup and closing it at shutdown.
Requirement: the response body is precious; in case of a failure I must log it along with the other details.
Because of how raise_for_status behaves, I had to write these ugly functions, one per HTTP method. This is one of them:
async def post(self, url: str, json: dict, headers: dict) -> ClientResponse:
    response = await self.session.post(url=url, json=json, headers=headers)
    response_body = await response.text()
    try:
        response.raise_for_status()
    except Exception:
        logger.exception('Request failed',
                         extra={'url': url, 'json': json, 'headers': headers, 'body': response_body})
        raise
    return response
If I could count on raise_for_status to also return the body (response.text()),
I could just create the session with ClientSession(raise_for_status=True) and write clean code:
response = await self.session.post(url=url, json=json, headers=headers)
Is there a way to make raise_for_status also return the payload/body, maybe when initializing the ClientSession?
Thanks for the help.
This is not possible with aiohttp's raise_for_status. As @Andrew Svetlov answered here:
Consider response as closed after raising an exception.
Technically it can contain a partial body but there is no any guarantee.
There is no reason to read it, the body could be very huge, 1GiB is not a limit.
If you need a response content for non-200 -- read it explicitly.
Alternatively, consider using the httpx library like this (it is widely used in conjunction with FastAPI):
import httpx
async def raise_on_4xx_5xx(response):
    # Hooks registered on an AsyncClient must be async functions. The body is
    # read first so that e.response.text is available after the exception.
    await response.aread()
    response.raise_for_status()
async with httpx.AsyncClient(event_hooks={'response': [raise_on_4xx_5xx]}) as client:
    try:
        r = await client.get('http://httpbin.org/status/418')
    except httpx.HTTPStatusError as e:
        print(e.response.text)
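If you stay with aiohttp, the "read it explicitly" advice can be pulled into one small check instead of one wrapper per HTTP method. The following is a minimal sketch, not aiohttp's own API: ResponseError and raise_with_body are hypothetical names, and response is assumed to behave like an aiohttp ClientResponse (status, url and an awaitable text()).

```python
class ResponseError(Exception):
    """Hypothetical exception type that keeps the response body for logging."""

    def __init__(self, status, url, body):
        super().__init__(f"{status} for {url}")
        self.status = status
        self.url = url
        self.body = body


async def raise_with_body(response):
    # Only failures have their body read here, so successful (possibly large)
    # responses are left for the caller to read or stream.
    if response.status >= 400:
        body = await response.text()
        raise ResponseError(response.status, str(response.url), body)
```

You would call await raise_with_body(response) where response.raise_for_status() was called before, and log e.body in a single except block. The aiohttp client reference also describes passing a coroutine as raise_for_status to ClientSession; if your version supports that, ClientSession(raise_for_status=raise_with_body) may remove the explicit call entirely, but check the documentation for your version first.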

How to post Multipart Form Data through python aiohttp ClientSession

I am trying to asynchronously send some Multipart-Encoded Form Data as a post request, mainly a file and two other fields.
Before trying to use asyncio I was doing the process synchronously with requests-toolbelt MultipartEncoder (https://github.com/requests/toolbelt) which worked great for normal requests, but did not work when using aiohttp for async. aiohttp provides 2 multipart classes, a FormData() class and a MultipartWriter() class, neither of which have given me much success.
After some testing, it seems the difference is that with the toolbelt MultipartEncoder() the request sends the data in the form section of the POST request, as it should. With aiohttp, however, the data ends up in the body section of the request. I'm not sure why they behave differently.
async def multipartencode() -> ClientResponse:
    # Using MultipartEncoder
    m = MultipartEncoder(
        fields={'type': type_str,
                'metadata': json.dumps(metadata),
                'file': (filename, file, 'application/json')}
    )
    # Using FormData
    data = FormData()
    data.add_field('file', file, filename=filename,
                   content_type='multipart/form-data')
    data.add_field('type', type_str, content_type='multipart/form-data')
    data.add_field('metadata', json.dumps(metadata),
                   content_type='multipart/form-data')
    # Using MultipartWriter
    with MultipartWriter('multipart/form-data') as mpwriter:
        part = mpwriter.append(
            file, {'CONTENT-TYPE': 'multipart/form-data'})
        part.set_content_disposition('form-data')
        part = mpwriter.append_form([('type', type_str)])
        part.set_content_disposition('form-data')
        part = mpwriter.append_form([('metadata', json.dumps(metadata))])
        part.set_content_disposition('form-data')
    # send request with ClientSession()
    resp = await session.post(url=url, data=data, headers=headers)
    return resp
How can I properly format/build the multipart-encoded request to get it to send using aiohttp?

Decorating requests module calls with pre and post method calls

I have a class called Client which uses the requests module to interact with a service. It has methods like:
def get_resource(self, url, headers):
    response = requests.get(url, headers=headers, auth=self.auth)
    return response
Now I want to call some methods before and after each call to the requests module. Something like:
def get_resource(self, url, headers):
    self.add_request_header(headers)
    response = requests.get(url, headers=headers, auth=self.auth)
    self.process_response_headers()
    return response
I'm having trouble finding a way to do this without having to rewrite all Client methods. The most straightforward way is to change the calls to the requests module into calls to self and add the method calls there.
def get_resource(self, url, headers):
    response = self.__get(url, headers, auth=self.auth)
    return response
def __get(self, url, headers, auth):
    self.add_request_header(headers)
    response = requests.get(url, headers=headers, auth=auth)
    self.process_response_headers()
    return response
But this requires me to change all the call sites and duplicate functionality of the requests module. I've tried to use decorators to add these method calls to the requests module functions, but got stuck on how to pass self into the decorator.
I'm sure there's an elegant way to do this in Python.
You can use monkey patching.
See: Python: Monkeypatching a method of an object
import requests
def get(self, url, params=None, **kwargs):
    # self is the requests.Session; the two hook methods are assumed to exist on it
    self.add_request_header(self.headers)
    response = self.request('GET', url, params=params, **kwargs)
    self.process_response_headers()
    return response
setattr(requests.Session, 'get', get)
s = requests.Session()
I think in this case decorators will not be as good as they sound, and OOP is a better approach to your problem. You could use a base class Client:
import requests
class Client(object):
    def __init__(self, auth):
        self.auth = auth
    def add_request_header(self, headers):
        pass
    def process_response_headers(self):
        pass
    def get_resource(self, url, headers):
        self.add_request_header(headers)
        response = requests.get(url, headers=headers, auth=self.auth)
        self.process_response_headers()
        return response
Then create subclasses with other implementations of add_request_header and/or process_response_headers, so that later you just need to instantiate the class that best suits your case.
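A third possibility, somewhere between the two answers above, is to leave requests untouched and route the calls through a thin proxy that runs the hooks around every HTTP verb. This is a minimal sketch under assumptions: HookedHTTP and HTTP_VERBS are names invented for this example, backend is assumed to be the requests module or a requests.Session, and the after hook is assumed to accept the response as its argument (unlike process_response_headers in the question, which takes none).

```python
import functools

HTTP_VERBS = {"get", "post", "put", "patch", "delete", "head", "options"}


class HookedHTTP:
    """Hypothetical proxy: runs hooks around each HTTP call of a requests-like backend."""

    def __init__(self, backend, before, after):
        self._backend = backend  # e.g. the `requests` module or a requests.Session
        self._before = before    # called with the mutable headers dict
        self._after = after      # called with the response object

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self._backend, name)
        if name not in HTTP_VERBS:
            return attr  # everything else is passed through untouched

        @functools.wraps(attr)
        def wrapper(url, **kwargs):
            # Copy the headers so the caller's dict is not mutated.
            headers = dict(kwargs.pop("headers", None) or {})
            self._before(headers)
            response = attr(url, headers=headers, **kwargs)
            self._after(response)
            return response

        return wrapper
```

In Client.__init__ you could then set self.http = HookedHTTP(requests, self.add_request_header, some_after_hook) and have each method call self.http.get(url, headers=headers, auth=self.auth), so the hooks live in one place and no method has to repeat them.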

How to make client request to external server avoiding cache using aiohttp

We are using aiohttp to make multiple requests to various website vendors to grab their latest data.
Some of the content providers serve the data from a cache. Is it possible to request the data from the server directly? We have tried to pass in the headers parameter with no luck.
async def fetch(url):
    global response
    headers = {'Cache-Control': 'no-cache'}
    async with ClientSession() as session:
        async with session.get(url, headers=headers, proxy="OUR-PROXY") as response:
            return await response.read()
The goal is to get the last-modified date header, which is not provided from the cache request.
Try adding an extra query parameter with a dynamic value (e.g. a timestamp) to the URL.
A cache that keys on the full URL will treat each such request as a new resource, so this should bypass the cached copy even if the server ignores Cache-Control.
Example:
from: https://example.com/test
to: https://example.com/test?timestamp=20180724181234
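The URL rewrite above can be done with the standard library so that an existing query string or fragment is preserved. This is a minimal sketch: add_cache_buster is a name made up for this example, and the parameter name timestamp is just the one from the example URL (any name the server ignores would do).

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="timestamp", value=None):
    """Return `url` with an extra query parameter that changes on every call."""
    if value is None:
        # Nanosecond clock reading, so values are unlikely to repeat between calls.
        value = str(time.time_ns())
    parts = urlsplit(url)
    # Keep the parameters that are already present, including blank ones.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, value))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

In the fetch function from the question, this would be used as session.get(add_cache_buster(url), headers=headers, proxy="OUR-PROXY").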

Passing session from template view to python requests api call

I want to make multiple internal REST API calls from my Django TemplateView, using the requests library. I also want to pass the session from the template view to the API call. What is the recommended way to do that, keeping performance in mind?
Right now, I'm extracting the cookie from the current request object in the template view and passing it to the requests.get() or requests.post() call. The problem with that is that I have to pass the request object to my API client, which I don't want.
This is the current wrapper I'm using to route my requests:
def wrap_internal_api_call(request, requests_api, uri, data=None, params=None, cookies=None, is_json=False, files=None):
    headers = {'referer': request.META.get('HTTP_REFERER')}
    logger.debug('Request API: %s calling URL: %s', requests_api, uri)
    logger.debug('Referer header sent with requests: %s', headers['referer'])
    if cookies:
        csrf_token = cookies.get('csrftoken', None)
    else:
        csrf_token = request.COOKIES.get('csrftoken', None)
    if csrf_token:
        headers['X-CSRFToken'] = csrf_token
    if data:
        if is_json:
            return requests_api(uri, json=data, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
        elif not files:
            return requests_api(uri, data=data, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
        else:
            return requests_api(uri, data=data, files=files, params=params, cookies=cookies if cookies else request.COOKIES,
                                headers=headers)
    else:
        return requests_api(uri, params=params, cookies=cookies if cookies else request.COOKIES, headers=headers)
Basically I want to get rid of that request parameter (1st param), because then to call it I've to keep passing request object from TemplateViews to internal services. Also, how can I keep persistent connection across multiple calls?
REST vs Invoking the view directly
While it's possible for a web app to make a REST API call to itself, that's not what REST is designed for. Consider the following from: https://docs.djangoproject.com/ja/1.9/topics/http/middleware/
As you can see, a Django request/response cycle has quite a bit of overhead. Add to this the overhead of the web server and WSGI container. On the client side you have the overhead associated with the requests library, but hang on a sec, the client also happens to be the same web app, so it becomes part of the web app's overhead too. And there is the problem of persistence (which I will come to shortly).
Last but not least, if you have a DNS round robin setup, your request may actually go out on the wire before coming back to the same server. There is a better way: invoke the view directly.
Invoking another view without the REST API call is really easy:
other_app.other_view(request, **kwargs)
This has been discussed a few times here at links such as Django Call Class based view from another class based view and Can I call a view from within another view? so I will not elaborate.
Persistent requests
Persistent http requests (talking about python requests rather than django.http.request.HttpRequest) are managed through session objects (again not to be confused with django sessions). Avoiding confusion is really difficult:
The Session object allows you to persist certain parameters across
requests. It also persists cookies across all requests made from the
Session instance, and will use urllib3's connection pooling. So if
you're making several requests to the same host, the underlying TCP
connection will be reused, which can result in a significant
performance increase
Different hits to your Django view will probably come from different users, so you don't want the same cookie reused for the internal REST call. The other problem is that the Python session object cannot be persisted between two different hits to the Django view. Sockets generally cannot be serialized, which is a requirement for chucking them into memcached or redis.
If you still want to persist with internal REST
I think @julian's answer shows how to avoid passing the Django request instance as a parameter.
If you want to avoid passing the request to wrap_internal_api_call, all you need to do is do a bit more work on the end of the TemplateView where you call the api wrapper. Note that your original wrapper is doing a lot of cookies if cookies else request.COOKIES. You can factor that out to the calling site. Rewrite your api wrapper as follows:
def wrap_internal_api_call(referer, requests_api, uri, data=None, params=None, cookies=None, is_json=False, files=None):
    headers = {'referer': referer}
    cookies = cookies or {}  # the caller is expected to pass the cookies explicitly
    logger.debug('Request API: %s calling URL: %s', requests_api, uri)
    logger.debug('Referer header sent with requests: %s', referer)
    csrf_token = cookies.get('csrftoken', None)
    if csrf_token:
        headers['X-CSRFToken'] = csrf_token
    if data:
        if is_json:
            return requests_api(uri, json=data, params=params, cookies=cookies, headers=headers)
        elif not files:
            return requests_api(uri, data=data, params=params, cookies=cookies, headers=headers)
        else:
            return requests_api(uri, data=data, files=files, params=params, cookies=cookies, headers=headers)
    else:
        return requests_api(uri, params=params, cookies=cookies, headers=headers)
Now, at the place of invocation, instead of
wrap_internal_api_call(request, requests_api, uri, data, params, cookies, is_json, files)
do:
cookies_param = cookies or request.COOKIES
referer_param = request.META.get('HTTP_REFERER')
wrap_internal_api_call(referer_param, requests_api, uri, data, params, cookies_param, is_json, files)
Now you are not passing the request object to the wrapper anymore. This saves a little bit of time because you don't test cookies over and over, but otherwise it doesn't make a difference for performance. In fact, you could achieve the same slight performance gain just by doing the cookies or request.COOKIES once inside the api wrapper.
Networking is usually one of the tightest bottlenecks in an application. So if these internal APIs are on the same machine as your TemplateView, your best bet for performance is to avoid doing an API call.
Basically I want to get rid of that request parameter (1st param), because then to call it I've to keep passing request object from TemplateViews to internal services.
To pass function args without explicitly passing them into function calls you can use decorators to wrap your functions and automatically inject your arguments. Using this with a global variable and some django middleware for registering the request before it gets to your view will solve your problem. See below for an abstracted and simplified version of what I mean.
request_decorators.py
REQUEST = None
def request_extractor(func):
    def extractor(cls, request, *args, **kwargs):
        global REQUEST
        REQUEST = request  # this part registers request arg to global
        return func(cls, request, *args, **kwargs)
    return extractor
def request_injector(func):
    def injector(*args, **kwargs):
        global REQUEST
        request = REQUEST
        if len(args) > 0 and callable(args[0]):  # to make it work with class methods
            return func(args[0], request, *args[1:], **kwargs)  # class method
        return func(request, *args, **kwargs)  # function
    return injector
extract_request_middleware.py
See the django docs for info on setting up middleware
from request_decorators import request_extractor
class ExtractRequest:
    @request_extractor
    def process_request(self, request):
        return None
internal_function.py
from request_decorators import request_injector
@request_injector
def internal_function(request):
    return request
your_view.py
from extract_request_middleware import ExtractRequest
from internal_function import internal_function
def view_with_request(request):
    return internal_function()  # here we don't need to pass in the request arg.
def run_test():
    request = "a request!"
    ExtractRequest().process_request(request)
    response = view_with_request(request)
    return response
if __name__ == '__main__':
    assert run_test() == "a request!"
