This is a general, best-practice question. Which of the following try-except examples is better (the function itself is a simple wrapper for requests.get()):
def get(self, url, params=params):
    try:
        response = {}
        response = requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)
    finally:
        return response
or
def get(self, url, params=params):
    try:
        return requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)
        return {}
Or perhaps both are suboptimal? I seem to write this kind of wrapper function fairly often for error logging and would like to know the most Pythonic way of doing it. Any advice would be appreciated.
It is better to return nothing on exception; I agree with Mark that there is no need to return anything in that case.
def get(self, url, params=params):
    try:
        return requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)

res = get(...)
if res is not None:
    pass  # process the data
# or
if res is None:
    pass  # abort
The second version looks ok to me, but the first one is slightly broken. For example, if the code inside try-except raises anything but ConnectionError, you'll still return {} since returning from finally suppresses any exceptions. And this latter feature is quite confusing (I had to try it myself before answering).
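The suppression described above is easy to reproduce without any network code. Here is a minimal sketch, using the built-in ConnectionError and a deliberately raised ValueError as stand-ins; both function names are made up for illustration:

```python
def risky_with_finally():
    """Mimics the first version: the return sits in the finally block."""
    response = {}
    try:
        # Stand-in for a failure that is NOT the exception type being handled.
        raise ValueError("unexpected failure")
    except ConnectionError as e:
        print("logged:", e)
    finally:
        # Returning here discards the in-flight ValueError.
        return response


def risky_without_finally():
    """Mimics the second version: unexpected errors propagate to the caller."""
    try:
        raise ValueError("unexpected failure")
    except ConnectionError as e:
        print("logged:", e)
        return {}


print(risky_with_finally())  # the ValueError never reaches the caller

try:
    risky_without_finally()
except ValueError as e:
    print("caller saw:", e)
```

In the first function the caller cannot tell a swallowed bug from a handled connection problem, which is why the second shape is usually easier to reason about.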
You can also use else clause with try:
def get(self, url, params=params):
    try:
        ...  # do some dangerous stuff here
    except requests.ConnectionError as e:
        ...  # handle the exception
    else:  # if nothing went wrong
        ...  # do some safe stuff here
        return some_result
    finally:
        ...  # do some mandatory stuff
This allows defining the exception scope more precisely.
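To make the order of execution concrete, here is a small runnable sketch of the same try/except/else/finally shape. It parses JSON instead of making a request so that it runs anywhere; read_port and the events list are hypothetical and exist only for illustration:

```python
import json


def read_port(raw):
    """Return (port, events) for a JSON string; port is None if raw is not valid JSON."""
    events = []
    try:
        # Only the call that may raise the handled exception lives in the try body.
        config = json.loads(raw)
    except json.JSONDecodeError:
        events.append("except")
        result = None
    else:
        # Runs only if the try body raised nothing. A KeyError raised here is
        # NOT caught by the except clause above.
        events.append("else")
        result = config["port"]
    finally:
        # Runs in every case.
        events.append("finally")
    return result, events


print(read_port('{"port": 8080}'))
print(read_port("not json"))
```

Because the lookup sits in the else branch, a missing "port" key surfaces as a KeyError instead of being mistaken for a parse failure.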
The second seems clearer to me.
The first version is a little confusing. At first I thought it was an error that you were assigning to the same variable twice. It was only after some thought that I understood why this works.
I'd probably look at writing a context manager.
from contextlib import contextmanager

@contextmanager
def get(url, params=params):
    try:
        yield requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)
        yield {}
    except:
        raise  # anything else stays an exception
Then:
with get(...) as res:
    print(res)  # will be actual response or empty dict
It is my first question here on Stack Overflow so I apologize if I did something stupid or missed something.
I am trying to make asynchronous aiohttp GET requests to many API endpoints at a time to check the status of these pages: the result should be a triple of the form (url, True, "200") in case of a working link and (url, False, response_status) in case of a "problematic link". This is the atomic function for each call:
async def ping_url(url, session, headers, endpoint):
    try:
        async with session.get((url + endpoint), timeout=5, headers=headers) as response:
            return url, (response.status == 200), str(response.status)
    except Exception as e:
        test_logger.info(url + ": " + e.__class__.__name__)
        return url, False, repr(e)
These are wrapped into a function using asyncio.gather() which also creates the aiohttp Session:
async def ping_urls(urllist, endpoint):
    headers = ...  # not relevant
    async with ClientSession() as session:
        try:
            results = await asyncio.gather(
                *[ping_url(url, session, headers, endpoint) for url in urllist],
                return_exceptions=True,
            )
        except Exception as e:
            print(repr(e))
    return results
The whole thing is called from a main that looks like this:
urls = ...  # not relevant
loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(ping_urls(urls, endpoint))
except Exception as e:
    pass
finally:
    loop.close()
This works most of the time, but if the list is pretty long, I noticed that as soon as I get one TimeoutError the execution loop stops and I get TimeoutError for all other urls after the first one that timed out. If I omit the timeout in the innermost function I get somewhat better results, but then it is not that fast anymore. Is there a way to control the timeouts for the single API calls instead of a big general timeout for the whole list of urls?
Any kind of help would be extremely appreciated, I got stuck with my bachelor thesis because of this issue.
You may want to try setting a session timeout for your client session. This can be done like:
async def ping_urls(urllist, endpoint):
    headers = ...  # not relevant
    timeout = ClientTimeout(total=TIMEOUT_SECONDS)
    async with ClientSession(timeout=timeout) as session:
        try:
            results = await asyncio.gather(
                *[
                    ping_url(url, session, headers, endpoint)
                    for url in urllist
                ],
                return_exceptions=True
            )
        except Exception as e:
            print(repr(e))
    return results
This should set the ClientSession instance to have TIMEOUT_SECONDS as the timeout. Obviously you will need to set that value to something appropriate!
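If the cascade of timeouts comes from too many requests competing at the same time (an assumption; the question does not confirm the cause), another option is to cap concurrency with asyncio.Semaphore and apply the timeout to each call with asyncio.wait_for. The sketch below uses only asyncio so that it runs without a network. fake_request is a hypothetical stand-in for session.get(), and the limit and timeout values are arbitrary:

```python
import asyncio


async def fake_request(url, delay):
    """Hypothetical stand-in for session.get(); it just sleeps for `delay` seconds."""
    await asyncio.sleep(delay)
    return 200


async def ping(url, delay, semaphore, timeout):
    # The semaphore caps how many calls run at once; the timeout clock for a
    # call only starts after that call has been admitted.
    async with semaphore:
        try:
            status = await asyncio.wait_for(fake_request(url, delay), timeout)
            return url, status == 200, str(status)
        except asyncio.TimeoutError as e:
            return url, False, repr(e)


async def ping_all(jobs, limit=2, timeout=0.3):
    semaphore = asyncio.Semaphore(limit)
    return await asyncio.gather(
        *(ping(url, delay, semaphore, timeout) for url, delay in jobs)
    )


# "b" is deliberately slower than the timeout; the other calls are unaffected.
results = asyncio.run(ping_all([("a", 0.01), ("b", 5), ("c", 0.01), ("d", 0.01)]))
for row in results:
    print(row)
```

Each call gets its own timeout, so one slow endpoint produces one failed triple instead of dragging the rest of the list down with it.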
I struggled with the exceptions as well. I then found a hint that I can also print the type of the exception and, with that, write appropriate exception handling.
try: ...
except Exception as e:
    print(f'Error: {e} of Type: {type(e)}')
With this you can find out what kinds of errors occur, and then catch and handle them individually.
e.g.
try: ...
except aiohttp.ClientConnectionError as e:
    ...  # deal with this type of exception
except aiohttp.ClientResponseError as e:
    ...  # handle individually
except asyncio.exceptions.TimeoutError as e:
    ...  # this kind of error happened to me as well
I'm trying to create an all-in-one function that handles all my API requests and cuts down on repeated code, especially the error handling for different error codes.
I am using a few different files to achieve this: a.py, which connects to "api a", b.py, which connects to "api b", and api.py, which contains the function.
a.py and b.py both start with
from api import *
and use
login_response = post_api_call(api_url_base + login_url, None , login_data).json()
or similar
api.py contains the code below, but it will be fleshed out with more error handling, retries, etc., which is what I don't want to repeat.
import requests
import logging

def post_api_call(url, headers, data):
    try:
        response = requests.post(url, headers=headers, data=data)
        response.raise_for_status()
    except requests.exceptions.HTTPError as errh:
        print("Http Error:", errh)
        logging.warning("Http Error: %s", errh)
    except requests.exceptions.ConnectionError as errc:
        print("Error Connecting:", errc)
        logging.warning("Error Connecting: %s", errc)
    except requests.exceptions.Timeout as errt:
        print("Timeout Error:", errt)
        logging.warning("Timeout Error: %s", errt)
    # only use the above if you want to retry certain errors; the clause below catches all of the above if needed.
    except requests.exceptions.RequestException as err:
        print("OOps: Something Else", err)
        logging.warning("OOps: Something Else %s", err)
    # retry certain errors...
    return response
The above works and isn't an issue.
The issue I'm having is that I'm trying not to have different functions for post/get/push etc. How can I pass the method through as a variable?
The other issue I am having is that some APIs need the data passed as "data=data" while others only work when I specify "json=data". Some need headers while others don't, but if I pass headers = None as a variable I get 405 errors. The only other way around it that I can think of is long nested if statements, which is nearly as bad as the repeated code.
Am I trying to oversimplify this? Is there a better way?
The scripts make a number of API calls (a minimum of 5) to a number of different APIs (currently 3, but I expect this to grow). They then combine all the received data, compare it to the database, and run any updates against the necessary APIs.
Imports:
from requests import Request, Session
Method:
def api_request(*args, **kwargs):
    if "session" in kwargs and isinstance(kwargs["session"], Session):
        local_session = kwargs["session"]
        del kwargs["session"]
    else:
        local_session = Session()
    req = Request(*args, **kwargs)
    prepared_req = local_session.prepare_request(req)
    response = None  # stays None if sending fails
    try:
        response = local_session.send(prepared_req)
    except Exception:
        # error handling
        pass
    return response
Usage:
sess = Session()
headers = {
    "Accept-Language": "en-US",
    "User-Agent": "test-app"
}
result = api_request("GET", "http://www.google.com", session=sess, headers=headers)
print(result.text)
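For the data= versus json= and optional-headers part of the question, one option is to build the keyword arguments once and drop anything that was not supplied, so no nested if statements are needed. A minimal sketch; build_request_kwargs is a hypothetical helper, not part of requests:

```python
def build_request_kwargs(headers=None, data=None, json=None, **extra):
    """Collect only the keyword arguments that were actually supplied.

    Hypothetical helper: the result is meant to be unpacked into a call such
    as requests.request(method, url, **kwargs).
    """
    candidates = {"headers": headers, "data": data, "json": json, **extra}
    return {name: value for name, value in candidates.items() if value is not None}


# Form-encoded body, no headers: the "headers" key is simply absent.
print(build_request_kwargs(data={"user": "a", "password": "b"}))

# JSON body plus headers and a timeout.
print(build_request_kwargs(json={"id": 1}, headers={"Accept": "application/json"}, timeout=10))
```

The resulting dict can be passed along with the HTTP method as a string, for example api_request("POST", url, **kwargs) with the function above, which keeps a single code path for every verb.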
How should I structure the code so I won't get this error:
UnboundLocalError: local variable 'r' referenced before assignment.
If I want to ensure I get a 200 response before returning r.json(), where should I place this code — inside or outside the try block?
if r.status_code == requests.codes['ok']:
My function:
def get_req():
    url = 'https://www.example.com/search'
    data = {'p': 'something'}
    try:
        r = requests.get(url, params=data)
        r.raise_for_status()
    except requests.exceptions.HTTPError as err:
        print(err)
    return r.json()
I don't see any way you can get that error. I see no references to r that can occur unless your call to requests.get() succeeded and set r properly. Are you sure you're seeing that happen with the code you're showing us?
Why do you want to check for status code 200? You're already calling raise_for_status(), which basically does that. raise_for_status() checks for some other codes that mean success, but you probably want your code to treat those as success for your purpose as well. So I don't think you need to check for 200 explicitly.
So by the time you call r.json() and return, that should be what you want to do.
UPDATE: Now that you've removed the sys.exit(), you have to do something specific in the error case. In the comments I've given you the four possibilities I see. The simplest one would be to declare your method as returning None if the request fails. This returns the least information to the caller, but you are printing an error already, so that might be fine. For that case, your code would look like this:
def get_req():
    url = 'https://www.example.com/search'
    data = {'p': 'something'}
    try:
        r = requests.get(url, params=data)
        r.raise_for_status()
    except requests.exceptions.HTTPError as err:
        print(err)
        return None
    return r.json()
This code is "correct" if you define returning None on failure of the request, but throwing an exception in some other cases, as being the expected behavior. You might want to catch Exception instead or as a separate case, if you never want this method to throw an exception. I don't know if it's possible for requests.get() to throw some other exception than HTTPError. Maybe it isn't, in which case this code will never throw an exception as is. IMO, it's better to assume it can, and deal with that case explicitly. The code is much more readable that way and does not require future readers to know what exceptions requests.get() is able to throw to understand the behavior of this code in all cases.
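The difference between falling through after the except block and returning from it can be shown without a network. In this sketch FakeHTTPError and fake_get are hypothetical stand-ins, and fake_get fails before anything is assigned, which is the situation that produces an UnboundLocalError:

```python
class FakeHTTPError(Exception):
    """Hypothetical stand-in for requests.exceptions.HTTPError."""


def fake_get(fail):
    """Hypothetical stand-in for a call that raises before returning a value."""
    if fail:
        raise FakeHTTPError("404 Client Error")
    return {"result": "ok"}


def get_req_broken(fail):
    try:
        r = fake_get(fail)
    except FakeHTTPError as err:
        print(err)
    # If fake_get() raised, r was never assigned, so this line fails.
    return r


def get_req_fixed(fail):
    try:
        r = fake_get(fail)
    except FakeHTTPError as err:
        print(err)
        return None  # explicit failure value; r is never touched
    return r


print(get_req_fixed(True))
print(get_req_fixed(False))
```

Callers of the fixed version then check for None, in the same spirit as the "return None on failure" contract described above.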
Your function should look like this:
import sys

def get_req():
    url = 'https://www.example.com/search'
    data = {'p': 'something'}
    try:
        r = requests.get(url, params=data)
        r.raise_for_status()
        if r.status_code == requests.codes['ok']:
            return r.json()
    except requests.exceptions.HTTPError as err:
        print(err)
        sys.exit(1)
Hope this helps.
I have the Flask code below:
from flask import Flask, request, jsonify
import requests
from werkzeug.exceptions import InternalServerError, NotFound
import sys
import json

app = Flask(__name__)
app.config['SECRET_KEY'] = "Secret!"

class InvalidUsage(Exception):
    status_code = 400

    def __init__(self, message, status_code=None, payload=None):
        Exception.__init__(self)
        self.message = message
        if status_code is not None:
            self.status_code = status_code
        self.payload = payload

    def to_dict(self):
        rv = dict(self.payload or ())
        rv['message'] = self.message
        rv['status_code'] = self.status_code
        return rv

@app.errorhandler(InvalidUsage)
def handle_invalid_usage(error):
    response = jsonify(error.to_dict())
    response.status_code = error.status_code
    return response

@app.route('/test', methods=["GET", "POST"])
def test():
    url = "https://httpbin.org/status/404"
    try:
        response = requests.get(url)
        if response.status_code != 200:
            try:
                response.raise_for_status()
            except requests.exceptions.HTTPError:
                status = response.status_code
                print(status)
                raise InvalidUsage("An HTTP exception has been raised", status_code=status)
    except requests.exceptions.RequestException as e:
        print(e)

if __name__ == "__main__":
    app.run(debug=True)
My question is: how do I get the exception string (message) and other relevant parameters from the requests.exceptions.RequestException object e?
Also, what is the best way to log such exceptions? In the case of an HTTPError exception I have the status code to refer to.
But requests.exceptions.RequestException catches all request exceptions. So how do I differentiate between them, and what is the best way to log them apart from using print statements?
Thanks a lot in advance for any answers.
RequestException is a base class for HTTPError, ConnectionError, Timeout, URLRequired, TooManyRedirects and others (the whole list is available on the GitHub page of the requests module). It seems that the best way of dealing with each error and printing the corresponding information is to handle them starting from the most specific and finishing with the most general one (the base class). This has been discussed at length in the comments in this StackOverflow topic. For your test() method this could be:
@app.route('/test', methods=["GET", "POST"])
def test():
    url = "https://httpbin.org/status/404"
    try:
        ...  # some code
    except requests.exceptions.ConnectionError as ece:
        print("Connection Error:", ece)
    except requests.exceptions.Timeout as et:
        print("Timeout Error:", et)
    except requests.exceptions.RequestException as e:
        print("Some Ambiguous Exception:", e)
This way you first catch the more specific errors that inherit from the RequestException class, and only then the base class itself.
As for an alternative to print statements: I'm not sure if that's exactly what you meant, but you can log to the console or to a file with standard Python logging in Flask or with the logging module itself (here for Python 3).
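As a rough illustration of both points, the sketch below combines the specific-to-general ordering with the standard logging module. The three exception classes are hypothetical stand-ins that only mirror the shape of the requests hierarchy (a base class with more specific subclasses), so the example runs without requests installed:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(name)s: %(message)s")
log = logging.getLogger("api")


# Hypothetical stand-ins: a base class and two more specific subclasses.
class BaseRequestError(Exception): ...
class ConnectError(BaseRequestError): ...
class TimeoutFailure(BaseRequestError): ...


def classify(exc):
    """Raise exc and report which handler caught it, logging instead of printing."""
    try:
        raise exc
    except ConnectError as e:
        log.warning("Connection error: %s", e)
        return "connection"
    except TimeoutFailure as e:
        log.warning("Timeout: %s", e)
        return "timeout"
    except BaseRequestError as e:
        # Logger.exception() logs at ERROR level and appends the traceback.
        log.exception("Unclassified request error (%s)", type(e).__name__)
        return "other"


print(classify(ConnectError("refused")))
print(classify(TimeoutFailure("too slow")))
print(classify(BaseRequestError("something else")))
```

If the base-class clause came first it would catch everything, and the more specific handlers would never run.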
This is actually not a question about using the requests library as much as it is a general Python question about how to extract the error string from an exception instance. The answer is relatively straightforward: you convert it to a string by calling str() on the exception instance. Any properly written exception class (in requests or otherwise) implements a __str__() method so that str() works on an instance. Example below:
import requests

rsp = requests.get('https://httpbin.org/status/404')
try:
    if rsp.status_code >= 400:
        rsp.raise_for_status()
except requests.exceptions.RequestException as e:
    error_str = str(e)
    # log 'error_str' to disk, a database, etc.
    print('The error was:', error_str)
Yes, in this example, we print it, but once you have the string you have additional options. Anyway, saving this to test.py results in the following output given your test URL:
$ python3 test.py
The error was: 404 Client Error: NOT FOUND for url: https://httpbin.org/status/404
I am exposing HTTP endpoints—outputting JSON exclusively—using Bottle.
Errors currently throw: {'error': %s, 'error_message': %s, 'status_code': #}.
So in all my endpoint-decorated functions I have:
try:
    someObj = <stuff>
except <MyCustomErrors> as e:
    response.status = e.response.pop('status_code', 500)
    return e.response
response.status = someObj.response.pop('status_code', 200)
return someObj.response
But I could just as easily avoid using exceptions altogether, resulting in more concise and DRYer endpoint code with reduced overhead.
There are disadvantages, however: other devs will need to read (or run) through the code at least once to understand the output format.
Documentation will work here; however, is this whole setup bad practice?
In the meantime I wrote this poor substitute:
# Not a decorator because I can't work out how to give `@route(apply=)` func args
def error_else_response(init_with):
    try:
        result = init_with(**request.query)
    except <CustomError> as e:
        response.status = e.msg.pop('status_code')
        return e.msg
    response.status = result.response.pop('status_code')
    return result.response

@route('/augment')
def augment():
    return error_else_response(<CustomClass>)
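One way the repeated try/except could be folded into a plain decorator is sketched below. It is framework-free so that it runs on its own: ApiError, FakeResponse, the module-level response object and json_errors are all hypothetical stand-ins, not Bottle APIs, and in a real app the route decorator would presumably be stacked on top of json_errors:

```python
import functools


class ApiError(Exception):
    """Hypothetical custom error carrying a JSON-ready payload."""

    def __init__(self, payload):
        super().__init__(payload.get("error_message", ""))
        self.payload = payload


class FakeResponse:
    """Hypothetical stand-in for the framework's response object."""
    status = 200


response = FakeResponse()


def json_errors(handler):
    """Turn ApiError into a JSON-style dict and set the status in one place."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            body = dict(handler(*args, **kwargs))
        except ApiError as e:
            body = dict(e.payload)
            default_status = 500
        else:
            default_status = 200
        # The status code travels inside the payload, as in the question.
        response.status = body.pop("status_code", default_status)
        return body
    return wrapper


@json_errors
def augment(value):
    if value < 0:
        raise ApiError({"error": "bad_value",
                        "error_message": "value must be >= 0",
                        "status_code": 400})
    return {"result": value + 1}


print(augment(1), response.status)
print(augment(-1), response.status)
```

The endpoint body then contains only the happy path, and the output format is defined in a single wrapper that other developers can read once.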