Simple way to re-run try block if a specific exception occurs? - python

I'm using Python 3.7 and Django and trying to figure out how to rerun a try block if a specific exception is thrown. I have:
for article in all_articles:
    try:
        self.save_article_stats(article)
    except urllib2.HTTPError as err:
        if err.code == 503:
            print("Got 503 error when looking for stats on " + url)
        else:
            raise
What I would like is, if a 503 error occurs, for the section in the "try" to be re-run, a maximum of three times. Is there a simple way to do this in Python?

You can turn this into a for loop, and break in case the try block was successful:
for article in all_articles:
    for __ in range(3):
        try:
            self.save_article_stats(article)
            break
        except urllib2.HTTPError as err:
            if err.code == 503:
                print("Got 503 error when looking for stats on " + url)
            else:
                raise
If the error code is not 503, the error is re-raised, and the control flow exits both for loops.
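One subtlety: if all three attempts raise a 503, the inner loop simply runs out and the outer loop silently moves on to the next article. If you want to notice that, Python's for/else fits here. A sketch reusing the names from the question (url is assumed to be defined alongside article, as in the original snippet):
for article in all_articles:
    for attempt in range(3):
        try:
            self.save_article_stats(article)
            break   # success, stop retrying this article
        except urllib2.HTTPError as err:
            if err.code != 503:
                raise
            print("Got 503 (attempt %d) when looking for stats on %s"
                  % (attempt + 1, url))
    else:
        # reached only if the inner loop never hit break:
        # all three attempts failed with a 503
        print("Giving up on " + url)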

Related

'continue not in loop' error while trying to add an error handling mechanism

I've been trying to add an error handling mechanism to my code. However, when it runs it says 'continue' is outside of a loop, but looking at the code it should be inside the try block. What's going wrong?
def download_media_item(self, entry):
    try:
        url, path = entry
        # Get the file extension example: ".jpg"
        ext = url[url.rfind('.'):]
        if not os.path.isfile(path + ext):
            r = requests.get(url, headers=headers, timeout=15)
            if r.status_code == 200:
                open(path + ext, 'wb').write(r.content)
                self.user_log.info('File {} downloaded from {}'.format(path, url))
                return True
            elif r.status_code == 443:
                print('------------the server reported a 443 error-----------')
                return False
        else:
            self.user_log.info('File {} already exists. URL: {}'.format(path, url))
            return False
    except requests.ConnectionError:
        print("Received ConnectionError. Retrying...")
        continue
    except requests.exceptions.ReadTimeout:
        print("Received ReadTimeout. Retrying...")
        continue
It seems that what you actually want to do is keep looping until no exception is raised.
In general, you can do this by having an infinite loop that you break from in the event of a successful completion.
Either:
while True:
    try:
        # do stuff
    except requests.ConnectionError:
        # handle error
        continue
    except requests.exceptions.ReadTimeout:
        # handle error
        continue
    break
Or:
while True:
    try:
        # do stuff
    except requests.ConnectionError:
        # handle error
    except requests.exceptions.ReadTimeout:
        # handle error
    else:
        break
However, in this case, the "do stuff" seems to always end by reaching a return statement, so the break is not required, and the following reduced version would suffice:
while True:
    try:
        # do stuff
        return some_value
    except requests.ConnectionError:
        # handle error
    except requests.exceptions.ReadTimeout:
        # handle error
(The single return shown here may refer to alternative control flows, all of which lead to a return, as in your case.)
continue is specifically for immediately moving to the next iteration of a for or while loop; it is not an all-purpose move-to-the-next-statement instruction.
In a try/except statement, anytime you reach the end of a try, except, else, or finally block, execution proceeds with the next complete statement, not the next portion of the try statement.
def download_media_item(self, entry):
    # 1: try statement
    try:
        ...
        # If you get here, execution goes to #2 below, not the
        # except block below
    except requests.ConnectionError:
        print("Received ConnectionError. Retrying...")
        # Execution goes to #2 below, not the except block below
    except requests.exceptions.ReadTimeout:
        print("Received ReadTimeout. Retrying...")
        # Execution goes to #2 below
    # 2: next statement
    ...
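For completeness, here is a sketch of how the original function could be restructured so that the retry happens inside a loop, making continue legal. It assumes requests, os, headers and self.user_log exist exactly as in the question, and it folds the odd status-code-443 branch into a generic failure case:
def download_media_item(self, entry):
    url, path = entry
    ext = url[url.rfind('.'):]   # file extension, e.g. ".jpg"
    if os.path.isfile(path + ext):
        self.user_log.info('File {} already exists. URL: {}'.format(path, url))
        return False
    while True:
        try:
            r = requests.get(url, headers=headers, timeout=15)
        except (requests.ConnectionError, requests.exceptions.ReadTimeout):
            print("Transient network error. Retrying...")
            continue   # legal here: we are inside a while loop
        if r.status_code == 200:
            open(path + ext, 'wb').write(r.content)
            self.user_log.info('File {} downloaded from {}'.format(path, url))
            return True
        print('Download failed with status {}'.format(r.status_code))
        return False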

urllib request fails when page takes too long to respond

I have a simple function (in Python 3) that takes a URL and attempts to resolve it: printing an error code if there is one (e.g. 404), or resolving a shortened URL to its full URL. My URLs are in one column of a CSV file and the output is saved in the next column. The problem arises when the program encounters a URL where the server takes too long to respond: the program just crashes. Is there a simple way to force urllib to print an error code if the server is taking too long? I looked into Timeout on a function call but that looks a little too complicated as I am just starting out. Any suggestions?
e.g. (COL A) shorturl (COL B) http://deals.ebay.com/500276625
def urlparse(urlColumnElem):
    try:
        conn = urllib.request.urlopen(urlColumnElem)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    else:
        redirect = conn.geturl()
        # check redirect
        if (redirect == urlColumnElem):
            #print("same: ")
            #print(redirect)
            return (redirect)
        else:
            #print("Not the same url ")
            return (redirect)
EDIT: if anyone gets the http.client.disconnected error (like me), see this question/answer http.client.RemoteDisconnected error while reading/parsing a list of URL's
Have a look at the docs:
urllib.request.urlopen(url, data=None[, timeout])
The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used).
You can set a realistic timeout (in seconds) for your process:
conn = urllib.request.urlopen(urlColumnElem, timeout=realistic_timeout_in_seconds)
and in order for your code to stop crashing, move everything inside the try/except block:
import socket

def urlparse(urlColumnElem):
    try:
        conn = urllib.request.urlopen(
            urlColumnElem,
            timeout=realistic_timeout_in_seconds
        )
        redirect = conn.geturl()
        # check redirect
        if (redirect == urlColumnElem):
            #print("same: ")
            #print(redirect)
            return (redirect)
        else:
            #print("Not the same url ")
            return (redirect)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    except socket.timeout as e:
        return ('Connection timeout')
Now if a timeout occurs, you will catch the exception and the program will not crash.
Good luck :)
First, there is a timeout parameter that can be used to control the time allowed for urlopen. Next, a timeout in urlopen simply raises an exception, more precisely a socket.timeout. If you do not want it to abort the program, you just have to catch it:
def urlparse(urlColumnElem, timeout=5):   # allow 5 seconds by default
    try:
        conn = urllib.request.urlopen(urlColumnElem, timeout=timeout)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    except socket.timeout:
        return ('Timeout')
    else:
        ...
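To tie this back to the CSV described in the question (the file names and the one-URL-per-row layout here are assumptions, since the question only describes them in prose), either version of urlparse above can be applied row by row with the csv module:
import csv

with open('urls.csv', newline='') as src, open('resolved.csv', 'w', newline='') as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    for row in reader:
        short_url = row[0]             # column A: the short URL
        result = urlparse(short_url)   # an error code, 'URL_Error', 'Timeout', or the full URL
        writer.writerow([short_url, result])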

python requests: how can I get the "exception code"

I am using requests (2.5.1). Now I want to catch the exception and return a dict with some exception message. The dict I will return looks like this:
{
    "status_code": 61,  # exception code
    "msg": "error msg",
}
but so far I can't get the error status_code and error message. I tried to use:
try:
    .....
except requests.exceptions.ConnectionError as e:
    response = {
        u'status_code': 4040,
        u'errno': e.errno,
        u'message': (e.message.reason),
        u'strerror': e.strerror,
        u'response': e.response,
    }
but it's too redundant. How can I get the error message more simply? Can anyone give some ideas?
try:
    # the code that you want to catch errors from goes here
except:
    print("an error occurred.")
This is going to catch all errors, but it is better to name the specific errors instead of catching all of them, and print a dedicated message for each. Like:
try:
    # the code that you want to catch errors from goes here
except SyntaxError:
    print("SyntaxError occurred.")
except ValueError:
    print("ValueError occurred.")
except:
    print("another error occurred.")
Or:
try:
    # the code that you want to catch errors from goes here
except Exception as t:
    print("{} occurred.".format(t))

check ftplib response code

I have a Python application that's accessing an FTP server. There are several error cases I'd like to catch, in a fashion similar to urllib2:
try:
    urllib2.urlopen("http://google.com")
except urllib2.HTTPError, e:
    if e.code == 304:
        #do 304 stuff
    if e.code == 404:
        #do 404 stuff
    else:
        pass
Does a construct like this exist for ftplib.error_perm? I know it could return a code of 500-599 according to the docs, but I don't see anything in the docs about how to access that value. Did I miss something?
You can access the error response string using <exception_obj>.args[0]. It contains strings like '550 /no-such-dir: No such file or directory'.
To get the error code (only the leading three characters), use <exception_obj>.args[0][:3].
For example:
import ftplib

ftp = ftplib.FTP('ftp.hq.nasa.gov')
ftp.login('anonymous', 'user@example.com')
try:
    ftp.cwd('/no-such-dir')
except ftplib.error_perm as e:
    print('Error {}'.format(e.args[0][:3]))
finally:
    ftp.quit()
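If you want the code as an integer, for comparisons in the style of the urllib2 example, a small helper is enough (a sketch; the function name is made up):
def ftp_error_code(exc):
    # for ftplib errors, str(exc) is the same string as exc.args[0]
    head = str(exc)[:3]
    return int(head) if head.isdigit() else None
With that, the except block above can branch on ftp_error_code(e) == 550 and so on, mirroring the e.code comparisons in the urllib2 snippet.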

urllib2 exception handling with couchdb

I usually have a hard time nailing down how to handle urllib2 exceptions. So I'm still learning. Here is a scenario that I'd like some advice on.
I have a local couch db database. I want to know if the database exists. ie "127.0.0.1:5984/database". If it does not exist, and I can reach "127.0.0.1:5984", I want to know so I can create the new database.
Here are several cases I'm thinking about:
1) I could get a timeout.
2) My URL is wrong, in the sense that I fail to reach the database entirely, i.e. I typed 127.0.4.1:5984/database but couchdb is on 127.0.0.1:5984.
3) The database path "database" does not exist on the couch database.
So here is some code I wrote to handle it:
What I do is test the response. If everything is fine I set db_exists to True. The only time I set db_exists to False is if I get a 404. Everything else just exits the program.
request = urllib2.Request(address)
try:
    response = urllib2.urlopen(request)
except urllib2.URLError, e:
    if hasattr(e, 'reason'):
        print 'Failed to reach database'
        print 'Reason: ', e.reason
        sys.exit()
    elif hasattr(e, 'code'):
        if e.code == 404:
            db_exists = False
        else:
            print 'Failed to reach database'
            print 'Reason: ' + str(e)
            sys.exit()
else:
    try:
        # I am expecting a json response. So make sure of it.
        json.loads(response.read())
    except:
        print 'Failed to reach database at "' + address + '"'
        sys.exit()
    else:
        db_exists = True
I am following the exception handling scheme laid out in urllib2 - The Missing Manual.
So basically my questions are...
1) Is this a clean, robust way to handle this?
2) Is it common practice to sprinkle sys.exit() throughout code?
-Update-
Using couchdb-python:
def main(db_url):
    database = couchdb.Database(url=db_url)
    try:
        database.info()
    except couchdb.http.ResourceNotFound, err:
        print '"' + db_url + '" ' + err.message[0] + ', ' + err.message[1]
        return
    except couchdb.http.Unauthorized, err:
        print err.message[1]
        return
    except couchdb.http.ServerError, err:
        print err.message
        return
    except socket.error, err:
        print str(err)
        return

if __name__ == '__main__':
    # note that I did not show it here, but db_url comes from an arg.
    main(db_url)
I would argue that you're attacking this problem at too low a level. Why not use couchdb-python?
To answer your questions: 1) No, it is not an especially clean way to do this. I would at least factor the code in your except block out into a method that extracts error types suitable for your application out of the urllib2.URLError. For 2), no, it is bad practice to call sys.exit() nearly all the time. Raise an appropriate exception instead; by default it will bubble up and halt the interpreter, just like your sys.exit() but with a traceback. Or, since your Couch client is a library, the exceptions can be handled at the application's discretion. Library code should never exit the interpreter.
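To make the "raise an appropriate exception" suggestion concrete, here is a sketch in the Python 2 style of the question, assuming urllib2 and json are imported as in the original code. The exception class is made up for illustration, and catching HTTPError before URLError (HTTPError is a URLError subclass that always carries a code) is my own restructuring, not the Missing Manual's pattern:
class DatabaseUnreachable(Exception):
    """Raised when the CouchDB server cannot be reached at all."""

def db_exists(address):
    try:
        response = urllib2.urlopen(urllib2.Request(address))
    except urllib2.HTTPError, e:   # must come before the URLError handler
        if e.code == 404:
            return False
        raise DatabaseUnreachable('%s returned HTTP %d' % (address, e.code))
    except urllib2.URLError, e:
        raise DatabaseUnreachable('failed to reach %s: %s' % (address, e.reason))
    json.loads(response.read())   # let a ValueError propagate if it is not JSON
    return True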
