requests.exceptions.ReadTimeout not caught - python

I've been having some issues with this piece of code, and I cannot figure out why.
When I run:
try:
    r = requests.get('https://httpbin.org/get',
                     timeout=10,
                     headers={'Cache-Control': 'nocache', 'Pragma': 'nocache'})
    r.raise_for_status()
    return r.json()
except (requests.exceptions.RequestException, ValueError):
    return False
NOTE: Host changed for privacy; the actual service is way more erratic/buggy.
I will occasionally get this error:
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='https://httpbin.org/get', port=80): Read timed out. (read timeout=10)
I can't understand what has gone wrong;
I seem to be properly catching requests.exceptions.RequestException, which is the parent class of requests.exceptions.ReadTimeout.
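One quick way to sanity-check that hierarchy (a minimal sketch, not part of the original post):
import requests

# ReadTimeout -> Timeout -> RequestException, so a handler for
# RequestException should also catch ReadTimeout.
print(issubclass(requests.exceptions.ReadTimeout,
                 requests.exceptions.RequestException))  # True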
EDIT: It seems updating requests has fixed it.

It would be nice to be able to replicate your error. So far I have tried the code below,
but it still catches the timeout.
import requests

try:
    r = requests.get('http://httpstat.us/200?sleep=1000',
                     timeout=0.01,
                     headers={'Cache-Control': 'nocache', 'Pragma': 'nocache'})
    r.raise_for_status()
    print(r.json())
except (requests.exceptions.RequestException, ValueError) as e:
    print('Error caught!')
    print(e)
prints:
Error caught!
HTTPConnectionPool(host='192.168.1.120', port=8080): Read timed out. (read timeout=0.01)
Even in a minimal form you are still catching requests.exceptions.ReadTimeout:
try:
    raise requests.exceptions.ReadTimeout
except requests.exceptions.RequestException:
    print('Caught ReadTimeout')
My best guess is that your exception arises in some other part of your code, not in this example.

Related

Handling both a ConnectionError and a status code error

I am using requests to get some data from a server, which is done in a while loop. However, every now and then, one of two errors occur. The first is that the status_code of the return from the request is not equal to 200, and this prints out an error message. The second is that a ConnectionError exception is raised.
If I receive either error, I want to keep attempting to get the data. However, I'm not sure how to do this for both types of error.
I know how to handle the ConnectionError exception, for example:
def get_data(self, path):
    # Keep trying until the connection attempt is successful
    while True:
        # Attempt a request
        try:
            request_return = requests.get(path, timeout=30)
            break
        # Handle a connection error
        except ConnectionError as e:
            pass
    # Return the data
    return request_return.json()
But how can I also handle the status_code in a similar manner? Is it something to do with the raise_for_status() method?
Seems like you could just adjust your try/except to look like this:
try:
    request_return = requests.get(path, timeout=30)
    if request_return.status_code == 200:
        break
except ConnectionError as e:
    pass
If you prefer, you can use request_return.status_code == requests.codes.ok as well.
If you're set on handling the bad status code as an exception (for whatever reason), raise_for_status() raises an HTTPError, so you can amend your try/except like this:
try:
    request_return = requests.get(path, timeout=30)
    request_return.raise_for_status()
    break
except ConnectionError as e:
    pass
except HTTPError as e:
    pass
You can test the status code and leave the loop only on a 200:
if request_return.status_code == 200:
    break
You should probably also limit the number of retries:
import requests

def get_data(path):
    # Keep trying until the connection attempt is successful
    retries = 5
    while retries > 0:
        # Attempt a request
        try:
            request_return = requests.get(path, timeout=3)
            if request_return.status_code == 200:
                break
        # Handle a connection error
        except ConnectionError as e:
            pass
        retries -= 1
    if retries == 0:
        pass  # raise an error here
    # Return the data
    return request_return.json()
get_data('https://stackoverflow.com/rep')
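If you'd rather not hand-roll the retry loop, a requests Session can retry for you via urllib3's Retry and an HTTPAdapter. A minimal sketch (the retry count, backoff factor, and status list here are only illustrative):
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry up to 5 times on connection errors and on the listed status codes,
# with an exponential backoff between attempts.
retries = Retry(total=5, backoff_factor=0.5,
                status_forcelist=[429, 500, 502, 503, 504])
session.mount('https://', HTTPAdapter(max_retries=retries))
session.mount('http://', HTTPAdapter(max_retries=retries))

response = session.get('https://stackoverflow.com/rep', timeout=3)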

How can I make this work? Should I use requests or urllib.error for exceptions?

I am trying to handle the exceptions from the http responses.
The PROBLEM with my code is that I am forced to use an if condition to catch HTTP error codes:
if page.status_code != requests.codes.ok:
    page.raise_for_status()
I do not believe this is the right way to do it, so I am trying the FOLLOWING:
import requests

url = 'http://someurl.com/404-page.html'
myHeaders = {'User-agent': 'myUserAgent'}
s = requests.Session()
try:
    page = s.get(url, headers=myHeaders)
    #if page.status_code != requests.codes.ok:
    #    page.raise_for_status()
except requests.ConnectionError:
    print("DNS problem or refused to connect")
    # Or do something with it
except requests.HTTPError:
    print("Some HTTP response error")
    # Or do something with it
except requests.Timeout:
    print("Error loading...too long")
    # Or do something with it, perhaps retry
except requests.TooManyRedirects:
    print("Too many redirects")
    # Or do something with it
except requests.RequestException as e:
    print(e)
    # Or do something with it
else:
    print("nothing happen")
    # Do something if no exception
s.close()
This ALWAYS prints "nothing happen". How would I be able to catch all of the possible exceptions related to the GET request?
You could catch a RequestException if you want to catch all the exceptions:
import requests

try:
    r = requests.get(........)
except requests.RequestException as e:
    print(e)
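Note that requests does not raise an HTTPError on its own for a 404 response; the commented-out raise_for_status() call is what turns a bad status code into an exception, which is likely why the else branch always runs. A minimal sketch of that pattern, reusing the url and headers from the question:
import requests

url = 'http://someurl.com/404-page.html'
myHeaders = {'User-agent': 'myUserAgent'}

try:
    page = requests.get(url, headers=myHeaders)
    page.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
except requests.RequestException as e:
    print(e)
else:
    print("nothing happen")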

Python HTTP Error 429 with urllib2

I am using the following code to resolve redirects and return a link's final URL:
def resolve_redirects(url):
    return urllib2.urlopen(url).geturl()
Unfortunately I sometimes get HTTPError: HTTP Error 429: Too Many Requests. What is a good way to combat this? Is the following good, or is there a better way?
def resolve_redirects(url):
    try:
        return urllib2.urlopen(url).geturl()
    except HTTPError:
        time.sleep(5)
        return urllib2.urlopen(url).geturl()
Also, what would happen if there is an exception in the except block?
It would be better to make sure the HTTP code is actually 429 before re-trying.
That can be done like this:
def resolve_redirects(url):
    try:
        return urllib2.urlopen(url).geturl()
    except HTTPError as e:
        if e.code == 429:
            time.sleep(5)
            return resolve_redirects(url)
        raise
This will also allow arbitrary numbers of retries (which may or may not be desired).
https://docs.python.org/2/howto/urllib2.html#httperror
This is a fine way to handle the exception, though you should check that you are sleeping for the appropriate amount of time between requests for the given website (for example, Twitter limits the number of requests per minute and states this limit clearly in its API documentation). So just make sure you're always sleeping long enough.
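Many services that return 429 also send a Retry-After header saying how long to wait, and honoring it (when present) is usually better than a fixed sleep. A rough Python 3 sketch of that idea, using urllib.request instead of urllib2:
import time
from urllib import request
from urllib.error import HTTPError

def resolve_redirects(url):
    try:
        return request.urlopen(url).geturl()
    except HTTPError as e:
        if e.code == 429:
            # Retry-After may also be an HTTP date; this sketch assumes a
            # number of seconds and falls back to 5 if the header is absent.
            delay = int(e.headers.get('Retry-After', 5))
            time.sleep(delay)
            return resolve_redirects(url)
        raise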
To recover from an exception raised inside the except block, you can simply nest another try/except block:
def resolve_redirects(url):
    try:
        return urllib2.urlopen(url).geturl()
    except HTTPError:
        time.sleep(5)
        try:
            return urllib2.urlopen(url).geturl()
        except HTTPError:
            return "Failed twice :S"
Edit: as @jesse-w-at-z points out, you should be returning a URL in the second error case; the code I posted is just a reference example of how to write a nested try/except.
Adding a User-Agent to the request header solved my issue:
from urllib import request

url = 'https://www.example.com/abc.json'
req = request.Request(url)
req.add_header('User-Agent', 'abc-bot')
response = request.urlopen(req)
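For completeness, the same fix with the requests library (which the rest of this page uses) is just a headers dict; a minimal sketch:
import requests

# Same header name and value as above; only the API differs.
response = requests.get('https://www.example.com/abc.json',
                        headers={'User-Agent': 'abc-bot'})
response.raise_for_status()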

Why do I receive a timeout error from Pythons requests module?

I use requests.post(url, headers, timeout=10) and sometimes I receive a ReadTimeout exception: HTTPSConnectionPool(host='domain.com', port=443): Read timed out. (read timeout=10)
Since I already set the timeout to 10 seconds, why am I still receiving a ReadTimeout exception?
Per https://requests.readthedocs.io/en/latest/user/quickstart/#timeouts, that is the expected behavior. As royhowie mentioned, wrap it in a try/except block, e.g.:
try:
    requests.post(url, headers, timeout=10)
except requests.exceptions.Timeout:
    print("Timeout occurred")
Since you asked specifically about ReadTimeout, you can wrap it in a try/except block like this:
try:
    # defined request goes here
except requests.exceptions.ReadTimeout:
    # Set up for a retry, or continue in a retry loop
Otherwise, catch all of them:
try:
    # defined request goes here
except:
    # Set up for a retry, or continue in a retry loop
Another thing you can try is to include the following at the end of your code block:
time.sleep(2)
This worked for me. The delay is longer (in seconds) but might help overcome the issue you're having.
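Putting those pieces together, a small retry loop that catches the timeout and sleeps before the next attempt might look like this (a rough sketch; the URL, retry count, and delay are only placeholders):
import time
import requests

url = 'https://domain.com/endpoint'  # placeholder URL
for attempt in range(3):
    try:
        response = requests.post(url, timeout=10)
        break  # success, stop retrying
    except requests.exceptions.ReadTimeout:
        print("Timeout on attempt %d, retrying..." % (attempt + 1))
        time.sleep(2)
else:
    print("All attempts timed out")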

Python requests exception handling

How do you handle exceptions with the Python requests library?
For example, how do you check whether the PC is connected to the internet?
When I try
try:
    requests.get('http://www.google.com')
except ConnectionError:
    # handle the exception
it gives me the error: name ConnectionError is not defined
Assuming you did import requests, you want requests.ConnectionError. ConnectionError is an exception defined by requests. See the API documentation here.
Thus the code should be:
try:
    requests.get('http://www.google.com')
except requests.ConnectionError:
    # handle the exception
The original link to the Python v2 API documentation from the original answer no longer works.
As per the documentation, I have added the below points:
In the event of a network problem (e.g. a refused connection or an internet/DNS issue), Requests will raise a ConnectionError exception.
try:
    requests.get('http://www.google.com')
except requests.ConnectionError:
    # handle the ConnectionError exception
In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception.
Response.raise_for_status() will raise an HTTPError if the HTTP request returned an unsuccessful status code.
try:
    r = requests.get('http://www.google.com/nowhere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    # handle the HTTPError here
If a request times out, a Timeout exception is raised.
You can tell Requests to stop waiting for a response after a given number of seconds, with a timeout arg.
requests.get('https://github.com/', timeout=0.001)
# timeout is not a time limit on the entire response download; rather,
# an exception is raised if the server has not issued a response for
# timeout seconds
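The timeout argument can also be a (connect, read) tuple if you want separate limits for establishing the connection and for waiting on the response; a small sketch:
import requests

# Wait at most ~3 seconds to connect and 27 seconds for the server to
# send data; these raise ConnectTimeout or ReadTimeout respectively.
r = requests.get('https://github.com/', timeout=(3.05, 27))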
All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException. So a base handler can look like,
try:
    r = requests.get(url)
except requests.exceptions.RequestException as e:
    # handle all the errors here
The original link to the Python v2 documentation no longer works, and now points to the new documentation.
Actually, there are many more exceptions that requests.get() can generate than just ConnectionError. Here are some I've seen in production:
from requests import ReadTimeout, ConnectTimeout, HTTPError, Timeout, ConnectionError

try:
    r = requests.get(url, timeout=6.0)
except (ConnectTimeout, HTTPError, ReadTimeout, Timeout, ConnectionError):
    continue
Include the requests module using import requests.
It is always good to implement exception handling. It not only helps avoid an unexpected exit of the script, but can also help log errors and info notifications. When using Python requests, I prefer to catch exceptions like this:
try:
    res = requests.get(adress, timeout=30)
except requests.ConnectionError as e:
    print("OOPS!! Connection Error. Make sure you are connected to Internet. Technical Details given below.\n")
    print(str(e))
    continue
except requests.Timeout as e:
    print("OOPS!! Timeout Error")
    print(str(e))
    continue
except requests.RequestException as e:
    print("OOPS!! General Error")
    print(str(e))
    continue
except KeyboardInterrupt:
    print("Someone closed the program")
for clarity, that is
except requests.ConnectionError:
NOT
import requests.ConnectionError
You can also catch a general exception (although this isn't recommended) with
except Exception:
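As a minimal sketch of what that looks like in context (it catches everything, including non-requests errors, which is exactly why it isn't recommended):
import requests

try:
    r = requests.get('http://www.google.com')
except Exception as e:
    # Catches ConnectionError, Timeout, HTTPError, and anything else.
    print("Request failed:", e)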
