How should I structure the code so I won't get this error:
UnboundLocalError: local variable 'r' referenced before assignment.
If I want to ensure I get a 200 response before returning r.json(), where should I place this code — inside or outside the try block?
if r.status_code == requests.codes['ok']:
My function:
def get_req():
    url = 'https://www.example.com/search'
    data = {'p': 'something'}
    try:
        r = requests.get(url, params=data)
        r.raise_for_status()
    except requests.exceptions.HTTPError as err:
        print(err)
    return r.json()
I don't see any way you can get that error. I see no references to r that can occur unless your call to requests.get() succeeded and set r properly. Are you sure you're seeing that happen with the code you're showing us?
Why do you want to check for status code 200? You're already calling raise_for_status(), which basically does that. raise_for_status() checks for some other codes that mean success, but you probably want your code to treat those as success for your purpose as well. So I don't think you need to check for 200 explicitly.
So by the time you call r.json() and return, that should be what you want to do.
UPDATE: Now that you've removed the sys.exit(), you have to do something specific in the error case. In the comments I've given you the four possibilities I see. The simplest one would be to declare your method as returning None if the request fails. This returns the least information to the caller, but you are printing an error already, so that might be fine. For that case, your code would look like this:
def get_req():
    url = 'https://www.example.com/search'
    data = {'p': 'something'}
    try:
        r = requests.get(url, params=data)
        r.raise_for_status()
    except requests.exceptions.HTTPError as err:
        print(err)
        return None
    return r.json()
This code is "correct" if you define the expected behavior as returning None when the request fails with an HTTP error status, but throwing an exception in other cases. You might want to catch Exception instead, or as a separate case, if you never want this method to throw an exception. requests.get() can raise exceptions other than HTTPError, such as ConnectionError or Timeout for network problems; all of them inherit from requests.exceptions.RequestException. IMO, it's better to deal with that case explicitly. The code is much more readable that way and does not require future readers to know which exceptions requests.get() is able to throw to understand the behavior of this code in all cases.
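A minimal sketch of the "never throw on a failed request" variant, assuming it is acceptable to return None for any request failure. It catches requests.exceptions.RequestException, the base class of the exceptions Requests raises explicitly. The url, params, and get parameters are hypothetical additions (not in the original function); get exists only so the function can be exercised without a network.

```python
import requests


def get_req(url, params=None, get=requests.get):
    """Return the decoded JSON body, or None if the request fails.

    `get` is a hypothetical injection point for testing; by default it is
    requests.get.
    """
    try:
        r = get(url, params=params)
        r.raise_for_status()  # raises HTTPError for 4xx/5xx responses
    except requests.exceptions.RequestException as err:
        # Covers HTTPError as well as ConnectionError, Timeout, etc.
        print(err)
        return None
    return r.json()
```

Note that r.json() is deliberately left outside the try block, so a body that is not valid JSON still raises instead of being reported as a failed request.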
Your function should look like this:
import sys
import requests

def get_req():
    url = 'https://www.example.com/search'
    data = {'p': 'something'}
    try:
        r = requests.get(url, params=data)
        r.raise_for_status()
        if r.status_code == requests.codes['ok']:
            return r.json()
    except requests.exceptions.HTTPError as err:
        print(err)
        sys.exit(1)
Hope this helps.
Related
I need to capture the response body for an HTTP error in Python. I'm currently using the Python requests module's raise_for_status(). This method only reports the status code and description. I need a way to capture the response body for a detailed error log.
Please suggest alternatives to the Python requests module if a different module offers this feature. If not, please suggest what changes I can make to the existing code to capture the response body.
Current implementation contains just the following:
resp.raise_for_status()
I guess I'll write this up quickly. This is working fine for me:
try:
    r = requests.get('https://www.google.com/404')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    print(err.request.url)
    print(err)
    print(err.response.text)
You can do something like the following, which returns the content of the response as Unicode text:
response.text
or
import sys
import requests

try:
    r = requests.get('http://www.google.com/nothere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    print(err)
    sys.exit(1)
# 404 Client Error: Not Found for url: http://www.google.com/nothere
Here you'll find a full explanation of how to handle the exception. Please check out "Correct way to try/except using Python requests module?"
You can log resp.text if resp.status_code >= 400.
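A minimal sketch of that idea, assuming a response object with status_code and text attributes (as a Requests response has). The helper name log_error_body and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("http_errors")  # hypothetical logger name


def log_error_body(resp, log=logger):
    """Log the response body when the status code signals an error.

    Returns True if something was logged, False otherwise.
    """
    if resp.status_code >= 400:
        log.error("HTTP %s, body: %s", resp.status_code, resp.text)
        return True
    return False
```

Calling log_error_body(resp) right after the request, before or instead of resp.raise_for_status(), keeps the body available for the error log.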
There are also tools you could use, such as Fiddler, Charles, or Wireshark.
However, those tools only display the body of the response; they do not show why the error was raised or its stack trace.
My question is closely related to this one.
I'm using the Requests library to hit an HTTP endpoint.
I want to check if the response is a success.
I am currently doing this:
response = requests.get(url)
if 200 <= response.status_code <= 299:
    # Do something here!
Instead of doing that ugly check for values between 200 and 299, is there a shorthand I can use?
The response has an ok property. Use that:
if response.ok:
    ...
The implementation is just a try/except around Response.raise_for_status, which itself checks the status code.
@property
def ok(self):
    """Returns True if :attr:`status_code` is less than 400, False if not.

    This attribute checks if the status code of the response is between
    400 and 600 to see if there was a client error or a server error. If
    the status code is between 200 and 400, this will return True. This
    is **not** a check to see if the response code is ``200 OK``.
    """
    try:
        self.raise_for_status()
    except HTTPError:
        return False
    return True
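To see how ok differs from a strict 200 check without touching the network, the sketch below builds bare requests.Response objects and sets status_code by hand. The helper name is_ok is hypothetical, and constructing a Response directly is only for illustration.

```python
import requests


def is_ok(status_code):
    """Return Response.ok for a bare Response with the given status code."""
    resp = requests.Response()  # empty response, no network involved
    resp.status_code = status_code
    return resp.ok


for code in (200, 204, 301, 404, 500):
    print(code, is_ok(code))
```

Here 204 and 301 count as ok just like 200, while 404 and 500 do not, which matches the "less than 400" wording in the docstring.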
I am a Python newbie but I think the easiest way is:
if response.ok:
    # whatever
The Pythonic way to check whether a request succeeded is to let raise_for_status() raise an exception on failure:
try:
    resp = requests.get(url)
    resp.raise_for_status()
except requests.exceptions.HTTPError as err:
    print(err)
EAFP: It’s Easier to Ask for Forgiveness than Permission: You should just do what you expect to work and if an exception might be thrown from the operation then catch it and deal with that fact.
I am using the following code to resolve redirects and return a link's final URL:
def resolve_redirects(url):
    return urllib2.urlopen(url).geturl()
Unfortunately I sometimes get HTTPError: HTTP Error 429: Too Many Requests. What is a good way to combat this? Is the following good, or is there a better way?
def resolve_redirects(url):
    try:
        return urllib2.urlopen(url).geturl()
    except HTTPError:
        time.sleep(5)
        return urllib2.urlopen(url).geturl()
Also, what would happen if there is an exception in the except block?
It would be better to make sure the HTTP code is actually 429 before re-trying.
That can be done like this:
def resolve_redirects(url):
    try:
        return urllib2.urlopen(url).geturl()
    except HTTPError as e:
        if e.code == 429:
            time.sleep(5)
            return resolve_redirects(url)
        raise
This will also allow arbitrary numbers of retries (which may or may not be desired).
https://docs.python.org/2/howto/urllib2.html#httperror
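If unbounded retries are not desired, a bounded loop is one alternative. The sketch below uses Python 3's urllib.request and urllib.error in place of urllib2; max_retries, delay, opener, and sleep are hypothetical parameters introduced for illustration.

```python
import time
from urllib.error import HTTPError
from urllib.request import urlopen


def resolve_redirects(url, max_retries=3, delay=5, opener=urlopen, sleep=time.sleep):
    """Return the final URL after redirects, retrying on HTTP 429.

    Retries at most `max_retries` times, sleeping `delay` seconds between
    attempts; any other HTTP error is re-raised immediately.
    """
    for attempt in range(max_retries + 1):
        try:
            return opener(url).geturl()
        except HTTPError as e:
            # Give up on non-429 errors, or once the retry budget is spent
            if e.code != 429 or attempt == max_retries:
                raise
            sleep(delay)
```

The opener and sleep parameters default to the real urlopen and time.sleep; they are only there so the retry logic can be tried out without a network or real waiting.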
This is a fine way to handle the exception, though you should make sure you always sleep for an appropriate amount of time between requests for the given website (for example, Twitter limits the number of requests per minute and states the limit clearly in its API documentation). So just make sure you're always sleeping long enough.
To recover from an exception within an exception handler, you can simply embed another try/except block:
def resolve_redirects(url):
    try:
        return urllib2.urlopen(url).geturl()
    except HTTPError:
        time.sleep(5)
        try:
            return urllib2.urlopen(url).geturl()
        except HTTPError:
            return "Failed twice :S"
Edit: as @jesse-w-at-z points out, you should be returning a URL in the second error case; the code I posted is just a reference example of how to write a nested try/except.
Adding User-Agent to request header solved my issue:
from urllib import request
from urllib.request import urlopen
url = 'https://www.example.com/abc.json'
req = request.Request(url)
req.add_header('User-Agent', 'abc-bot')
response = request.urlopen(req)
Suppose I have the following code snippet:
r = requests.post(url, data=values, files=files)
Since this is making a network request, a bunch of exceptions can be thrown from this line. For completeness of the argument, I could also have file reads, sending emails, etc. To account for such errors I do:
try:
    r = requests.post(url, data=values, files=files)
    if r.status_code != 200:
        raise Exception("Could not post to " + url)
except Exception as e:
    logger.error("Error posting to " + url)
There are two problems which I see with this approach.
I have just handled a generic exception and don't know what exact exception would be raised by this line, what is the best way to find it in python.
This makes the code look ugly, which is non-Pythonic but fine, as long as it's robust and handles all the cases.
I am wondering what would be the best way to handle exceptions in python.
The best way to write try-except -- in Python or anywhere else -- is as narrow as possible. It's a common problem to catch more exceptions than you meant to handle!
In particular, at a minimum, I'd re-write your example code as something like:
try:
    r = requests.post(url, data=values, files=files)
except Exception as e:
    logger.error("Error posting to %r: %s" % (url, e))
    raise
else:
    if r.status_code != 200:
        logger.error("Could not post to %r: HTTP code %s" % (url, r.status_code))
        raise RuntimeError("HTTP code %s trying to post to %r" % (r.status_code, url))
This embodies several best-practices, such as: detailed error messages, always re-raise exceptions you don't know how to specifically handle (after logging error messages with more details as well as the exception), never raise something as generic as Exception, &c -- and, crucially, catch exceptions only on the narrowest part of code you possibly can, that's what the else: clause in try/except is for!-)
If and when you do expect -- and know how to handle -- specific exceptions, so much the better -- you put other except ThisSpecificProblem as e: clauses before the generic except Exception clause which logs and re-raises. But (from the Zen of Python -- import this at a Python interpreter prompt!) -- "Errors should never pass silently. // Unless explicitly silenced."... and you should only "explicitly silence" errors you fully expect, and fully know how to handle!
I have just handled a generic exception and don't know what exact
exception would be raised by this line, what is the best way to find
it in python.
As always, the answer is to look at the documentation:
In the event of a network problem (e.g. DNS failure, refused
connection, etc), Requests will raise a ConnectionError exception.
In the rare event of an invalid HTTP response, Requests will raise an
HTTPError exception.
If a request times out, a Timeout exception is raised.
If a request exceeds the configured number of maximum redirections, a
TooManyRedirects exception is raised.
All exceptions that Requests explicitly raises inherit from
requests.exceptions.RequestException.
Code that raises exceptions (especially if there are custom exceptions) is documented. You can also have a look at the source if the documentation is not explicit.
Your code is fine, except that you should avoid generic except clauses, as these can hide other problems with your code. You should catch only those exceptions that you can predict, and then let the others "rise up" until caught/logged.
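Based on the exceptions listed in the documentation excerpt above, a handler can order its except clauses from most specific to most general. A minimal sketch, assuming the caller only needs a short label for what went wrong; the describe_failure helper and its labels are hypothetical.

```python
import requests


def describe_failure(func):
    """Call func() and return a short label for how it ended."""
    try:
        func()
    except requests.exceptions.Timeout:
        return "timeout"
    except requests.exceptions.ConnectionError:
        return "connection error"
    except requests.exceptions.HTTPError:
        return "bad HTTP status"
    except requests.exceptions.RequestException:
        # Base class: anything else Requests raises explicitly
        return "other requests error"
    return "ok"
```

Exceptions that do not come from Requests are not caught here, so they still propagate to the caller.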
Well, answering your first question, what exact exception would be raised by this line, you are one step away.
You already write except Exception as e, but you don't use e anywhere. e contains the information about your exception, so just add a print call:
print(e)
And it works:
>>> try:
...     x = int(input('Input: '))
... except Exception as e:
...     print(e)
...
Input: 5t
invalid literal for int() with base 10: '5t'
>>>
I don't exactly see what you're asking in the 2nd, you say it is ugly/non-pythonic, but then you say it is fine. Yes, it is fine, and it is also quite pythonic, in my opinion.
You should try avoiding using except Exception as e: as much as possible.
For clarity, you can create a custom exception class that covers your "status code is not 200" scenario.
class PostingError(Exception):
    pass
And then raise PostingError only. Try catching this error only. By catching all kinds of errors, you might report the wrong information. For example, even a memory error might be caught and displayed as an "Error posting to URL".
So this is how it would finally look:
try:
    r = requests.post(url, data=values, files=files)
    if r.status_code != 200:
        raise PostingError("Could not post to " + url)
except PostingError as e:
    logger.error(e)
This is a general, best-practice question. Which of the following try-except examples is better (the function itself is a simple wrapper for requests.get()):
def get(self, url, params=params):
    try:
        response = {}
        response = requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)
    finally:
        return response
or
def get(self, url, params=params):
    try:
        return requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)
        return {}
Or perhaps both are suboptimal? I seem to write these kinds of wrapper functions fairly often for error logging and would like to know the most Pythonic way of doing this. Any advice on this would be appreciated.
It is better to return nothing on an exception, and I agree with Mark: there is no need to return anything on an exception.
def get(self, url, params=params):
    try:
        return requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)

res = get(...)
if res is not None:
    pass  # Process the data
# or
if res is None:
    pass  # Abort
The second version looks ok to me, but the first one is slightly broken. For example, if the code inside try-except raises anything but ConnectionError, you'll still return {} since returning from finally suppresses any exceptions. And this latter feature is quite confusing (I had to try it myself before answering).
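The suppression described above can be reproduced without Requests. A minimal sketch, in which the function name swallow and the ValueError are stand-ins chosen for illustration:

```python
def swallow():
    """Return from finally while an exception is in flight."""
    result = {}
    try:
        raise ValueError("not a ConnectionError")
    finally:
        # Returning here discards the pending ValueError
        return result


print(swallow())  # prints {}
```

The caller never sees the ValueError, which is exactly why the first version hides unexpected failures.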
You can also use else clause with try:
def get(self, url, params=params):
    try:
        # Do some dangerous stuff here
        pass
    except requests.ConnectionError as e:
        # Handle the exception
        pass
    else:  # If no exception was raised
        # Do some safe stuff here
        return some_result
    finally:
        # Do some mandatory stuff
        pass
This allows defining the exception scope more precisely.
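To make the order of the clauses concrete, here is a small runnable sketch that records which clauses execute. The run and fail helpers are hypothetical, and ValueError stands in for requests.ConnectionError so no network is needed.

```python
def run(action):
    """Run action() and record which clauses execute, in order."""
    steps = []
    try:
        steps.append("try")
        action()
    except ValueError:
        steps.append("except")
    else:
        # Only reached when the try body raised nothing
        steps.append("else")
    finally:
        # Always reached, with or without an exception
        steps.append("finally")
    return steps


def fail():
    raise ValueError("boom")


print(run(lambda: None))  # ['try', 'else', 'finally']
print(run(fail))          # ['try', 'except', 'finally']
```

Code placed in the else clause is not covered by the except clause, so an unexpected error there is not mistaken for a failure of the guarded call.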
The second seems clearer to me.
The first version is a little confusing. At first I thought it was an error that you were assigning to the same variable twice. It was only after some thought that I understood why this works.
I'd probably look at writing a context manager.
from contextlib import contextmanager

@contextmanager
def get(url, params=params):
    try:
        yield requests.get(url, params=params)
    except requests.ConnectionError as e:
        log.exception(e)
        yield {}
    except:
        raise  # anything else stays an exception
Then:
with get(...) as res:
    print(res)  # will be actual response or empty dict