closing files properly opened with urllib2.urlopen()

I have the following code in a Python script:
try:
    # send the query request
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
    sf.close()
except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return
I'm concerned because if I encounter an error on sf.read(), then sf.close() is not called.
I tried putting sf.close() in a finally block, but if there's an exception on urlopen() then there's no file to close and I encounter an exception in the finally block!
So then I tried:
try:
    with urllib2.urlopen(search_query) as sf:
        search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return
but this raised an invalid syntax error on the with... line.
How can I best handle this? I feel stupid!
As commenters have pointed out, I am using PyS60, which is Python 2.5.4.

I would use contextlib.closing (in combination with from __future__ import with_statement for old Python versions):
from __future__ import with_statement  # only needed on Python 2.5
from contextlib import closing
import urllib2
with closing(urllib2.urlopen('http://blah')) as sf:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
Or, if you want to avoid the with statement:
sf = None
try:
    sf = urllib2.urlopen('http://blah')
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
finally:
    if sf:
        sf.close()
Not quite as elegant though.
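To illustrate what closing() guarantees, here is a minimal sketch in Python 3 syntax. It does not touch the network: FakeResponse and fetch are hypothetical stand-ins for the urlopen() result and the calling code, introduced only for this example. The point is that close() still runs when read() raises.

```python
from contextlib import closing


class FakeResponse:
    """Hypothetical stand-in for the object returned by urlopen()."""

    def __init__(self, fail_on_read=False):
        self.fail_on_read = fail_on_read
        self.closed = False

    def read(self):
        if self.fail_on_read:
            raise IOError("simulated read failure")
        return b"<xml/>"

    def close(self):
        self.closed = True


def fetch(response):
    """Read from response; closing() calls response.close() on any exit."""
    with closing(response) as sf:
        return sf.read()


ok = FakeResponse()
data = fetch(ok)

broken = FakeResponse(fail_on_read=True)
try:
    fetch(broken)
except IOError as err:
    error_message = str(err)
```

In both cases the closed flag ends up True, which is the behaviour the question was after: the read error propagates, but the handle is not leaked.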

finally:
if sf: sf.close()

Why not just try closing sf, and passing if it doesn't exist?
import urllib2
try:
    search_query = 'http://blah'
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except urllib2.URLError, err:
    print(err.reason)
finally:
    try:
        sf.close()
    except NameError:
        pass

Given that you are trying to use 'with', you should be on Python 2.5, and then this applies too: http://docs.python.org/tutorial/errors.html#defining-clean-up-actions

If urlopen() raises an exception, catch it and call the exception's close() method, like this:
try:
    req = urllib2.urlopen(url)
    req.close()
    print 'request {0} ok'.format(url)
except urllib2.HTTPError, e:
    e.close()
    print 'request {0} failed, http code: {1}'.format(url, e.code)
except urllib2.URLError, e:
    print 'request {0} error, error reason: {1}'.format(url, e.reason)
The HTTPError exception is also a full response object; see this issue: http://bugs.jython.org/issue1544

Looks like the problem runs deeper than I thought - this forum thread indicates urllib2 doesn't implement with until after python 2.6, and possibly not until 3.1

You could create your own generic URL opener:
import urllib2
from contextlib import contextmanager
@contextmanager
def urlopener(inURL):
    """Open a URL and yield the fileHandle, then close the connection when leaving the 'with' clause."""
    fileHandle = urllib2.urlopen(inURL)
    try:
        yield fileHandle
    finally:
        fileHandle.close()
You could then use the syntax from your original question:
with urlopener(theURL) as sf:
    search_soup = BeautifulSoup.BeautifulSoup(sf.read())
This solution gives you a clean separation of concerns. You get a clean, generic urlopener syntax that handles the complexities of properly closing the resource regardless of errors that occur underneath your with clause.
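The two failure cases from the question (opening fails, or the body fails) can be sketched without a network as follows. This is Python 3 syntax, and FakeHandle, fake_open and opener are hypothetical names used only for illustration; fake_open stands in for urlopen().

```python
from contextlib import contextmanager

events = []


class FakeHandle:
    """Hypothetical stand-in for a urlopen() response."""

    def read(self):
        return "payload"

    def close(self):
        events.append("closed")


def fake_open(url):
    """Hypothetical opener: fails for URLs starting with 'bad'."""
    if url.startswith("bad"):
        raise ValueError("cannot open %s" % url)
    events.append("opened")
    return FakeHandle()


@contextmanager
def opener(url):
    handle = fake_open(url)  # if this raises, there is nothing to close
    try:
        yield handle
    finally:
        handle.close()  # runs whether or not the with-body raised


with opener("good://example") as sf:
    body = sf.read()

try:
    with opener("bad://example") as sf:
        body = sf.read()
except ValueError as err:
    open_error = str(err)
```

Because the open call sits outside the try/finally inside the generator, a failed open never reaches close(), which avoids the "no file to close" problem described in the question.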

Why not just use multiple try/except blocks?
try:
    # send the query request
    sf = urllib2.urlopen(search_query)
except urllib2.URLError, url_error:
    sys.stderr.write("Error requesting url: %s\n" % (search_query,))
    raise
try:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err:  # Maybe catch more specific Exceptions here
    sys.stderr.write("Couldn't get programme information from url: %s\n" % (search_query,))
    raise  # or return as in your original code
finally:
    sf.close()

Related

Catch any of the errors in psycopg2 without listing them explicitly

I have a try/except block where I would like to catch only the errors in psycopg2.errors and not any other error.
The explicit way would be:
try:
    # execute a query
    cur = connection.cursor()
    cur.execute(sql_query)
except (psycopg2.errors.SyntaxError, psycopg2.errors.GroupingError) as err:
    # handle in case of error
The query will always be some SELECT statement. If the execution fails, it should be handled. Any other exception not belonging to psycopg2, e.g. ZeroDivisionError, should not be caught by the except clause. However, I would like to avoid listing all errors after the except clause. In fact, if you list the psycopg2 errors, you get quite an extensive list:
from psycopg2 import errors
dir(errors)
I have searched quite extensively and am not sure if this question has been asked already.

You can use the base class psycopg2.Error; it catches all psycopg2-related errors:
import psycopg2
try:
    cur = connection.cursor()
    cur.execute(sql_query)
except psycopg2.Error as err:
    # handle in case of error
See the official documentation.

Meanwhile, I have implemented this by catching a generic Exception and checking whether the exception belongs to the list returned by dir(errors). The solution proposed by Yannick looks simpler, though.
The function that I use prints the error details and checks, using the name of the exception (err_type.__name__), whether it is one of the psycopg2 errors:
import sys
from psycopg2 import errors
def is_psycopg2_exception(_err):
    # tb is the traceback object of the exception currently being handled
    err_type, err_obj, tb = sys.exc_info()
    print("\npsycopg2 ERROR:", _err, "on line number:", tb.tb_lineno)
    print("psycopg2 traceback:", tb, "-- type:", err_type)
    return err_type.__name__ in dir(errors)
Then, I use this function in the try/except clause:
try:
    # execute a query
    cur = connection.cursor()
    cur.execute(sql_query)
except Exception as err:
    if is_psycopg2_exception(err):
        # handle in case of psycopg error
    else:
        # other type of error
        sys.exit(1)  # quit
For my very specific case, where I need to check for other exceptions as well, I can adapt Yannick's solution as follows:
try:
    # execute a query
    cur = connection.cursor()
    cur.execute(sql_query)
except psycopg2.OperationalError as err:
    # handle some connection-related error
except psycopg2.Error as err:
    # handle in case of other psycopg error
except Exception as err:
    # any other error
    sys.exit(1)  # quit
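The ordering of the except clauses matters here: Python uses the first clause that matches, so the most specific class has to come first. The sketch below shows this with a small stand-in hierarchy, so it runs without psycopg2 or a database. Error, OperationalError, SyntaxProblem and classify are hypothetical names for this example, not psycopg2's own classes.

```python
class Error(Exception):
    """Stand-in for a library's base exception (like psycopg2.Error)."""


class OperationalError(Error):
    """Stand-in for a more specific subclass."""


class SyntaxProblem(Error):
    """Another hypothetical subclass of the base exception."""


def classify(exc):
    """Raise exc and report which except clause handled it."""
    try:
        raise exc
    except OperationalError:
        return "operational"
    except Error:
        return "other library error"
    except Exception:
        return "unrelated"


results = [
    classify(OperationalError()),
    classify(SyntaxProblem()),
    classify(ZeroDivisionError()),
]
```

If the except Error clause were listed first, it would also swallow OperationalError and the more specific handler would never run.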

Overpass a custom exception but print all the other exceptions

I am running the following try-except code:
try:
    paths = file_system_client.get_paths("{0}/{1}/0/{2}/{3}/{4}".format(container_initial_folder, container_second_folder, chronological_date[0], chronological_date[1], chronological_date[2]), recursive=True)
    list_of_paths=["abfss://{0}@{1}.dfs.core.windows.net/".format(storage_container_name, storage_account_name)+path.name for path in paths if ".avro" in path.name]
except Exception as e:
    if e=="AccountIsDisabled":
        pass
    else:
        print(e)
If my try-except runs into the following error, I want neither to print it nor to stop my program's execution:
"(AccountIsDisabled) The specified account is disabled.
RequestId:3159a59e-d01f-0091-5f71-2ff884000000
Time:2020-05-21T13:09:03.3540242Z"
I just want to skip it and print any other error/exception (e.g. TypeError, ValueError, etc.) that occurs.
Is this feasible in Python 3?
Please note that the .get_paths() method belongs to the azure.storage.filedatalake module, which enables direct connection of Python with Azure Data Lake for path extraction.
I am adding this note to point out that the exception I am trying to bypass is not a built-in exception.
[Update] In short, after following the proposed answers, I modified my code to this:
import sys
import functools
import operator
from concurrent.futures import ThreadPoolExecutor
from azure.storage.filedatalake._models import StorageErrorException
from azure.storage.filedatalake import DataLakeServiceClient, DataLakeFileClient
storage_container_name="name1" #confidential
storage_account_name="name2" #confidential
storage_account_key="password" #confidential
container_initial_folder="name3" #confidential
container_second_folder="name4" #confidential
def datalake_connector(storage_account_name, storage_account_key):
    global service_client
    datalake_client = DataLakeServiceClient(account_url="{0}://{1}.dfs.core.windows.net".format("https", storage_account_name), credential=storage_account_key)
    print("Client successfully created!")
    return datalake_client
def create_list_paths(chronological_date,
                      container_initial_folder="name3",
                      container_second_folder="name4",
                      storage_container_name="name1",
                      storage_account_name="name2"
                      ):
    list_of_paths=list()
    print("1. success")
    paths = file_system_client.get_paths("{0}/{1}/0/{2}/{3}/{4}".format(container_initial_folder, container_second_folder, chronological_date[0], chronological_date[1], chronological_date[2]), recursive=True)
    print("2. success")
    list_of_paths=["abfss://{0}@{1}.dfs.core.windows.net/".format(storage_container_name, storage_account_name)+path.name for path in paths if ".avro" in path.name]
    print("3. success")
    list_of_paths=functools.reduce(operator.iconcat, result, [])
    return list_of_paths
service_client = datalake_connector(storage_account_name, storage_account_key)
file_system_client = service_client.get_file_system_client(file_system=storage_container_name)
try:
    list_of_paths=[]
    executor=ThreadPoolExecutor(max_workers=8)
    print("Start path extraction!")
    list_of_paths=[executor.submit(create_list_paths, i, container_initial_folder, storage_container_name, storage_account_name).result() for i in date_list]
except:
    print("no success")
    print(sys.exc_info())
Unfortunately, the StorageErrorException cannot be handled for some reason; I am still getting the following stdout:

Listing [Python 3.Docs]: Compound statements - The try statement.
There are several ways of achieving this. Here's one:
try:
    # ...
except StorageErrorException:
    pass
except:
    print(sys.exc_info()[1])
Note that except: is tricky because you might silently handle exceptions that you shouldn't. Another way would be to catch any exception the code could raise explicitly.
try:
    # ...
except StorageErrorException:
    pass
except (SomeException, SomeOtherException, SomeOtherOtherException) as e:
    print(e)
Quickly browsing [MS.Docs]: filedatalake package and the sourcecode, revealed that StorageErrorException (which extends [MS.Docs]: HttpResponseError class) is the one that you need to handle.
Might want to check [SO]: About catching ANY exception.
Related to the failure of catching the exception, apparently there are 2 having the same name:
azure.storage.blob._generated.models._models_py3.StorageErrorException (currently imported)
azure.storage.filedatalake._generated.models._models_py3.StorageErrorException
I don't know the rationale (I didn't work with the package), but given the fact the package raises an exception defined in another package when it also defines one with the same name, seems lame. Anyway importing the right exception solves the problem.
As a side note, when dealing with this kind of situation, don't only import the base name, but work with the fully qualified one:
import azure.storage.filedatalake._generated.models
# then refer to azure.storage.filedatalake._generated.models.StorageErrorException
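Why the wrong import silently fails to match can be shown with a small self-contained sketch. The two namespaces below (blob_models and datalake_models) and the helper handled_by are hypothetical and only mimic two packages that each define a class called StorageErrorException; no Azure package is involved.

```python
import types

# Two hypothetical namespaces standing in for two separate packages.
blob_models = types.SimpleNamespace()
datalake_models = types.SimpleNamespace()


class StorageErrorException(Exception):
    """Hypothetical exception defined by the first package."""


blob_models.StorageErrorException = StorageErrorException


class StorageErrorException(Exception):  # same name, different class
    """Hypothetical exception defined by the second package."""


datalake_models.StorageErrorException = StorageErrorException


def handled_by(raised, expected):
    """Return True if an except clause for `expected` catches `raised`."""
    try:
        raise raised("(AccountIsDisabled) The specified account is disabled.")
    except expected:
        return True
    except Exception:
        return False


same_class = handled_by(datalake_models.StorageErrorException,
                        datalake_models.StorageErrorException)
other_class = handled_by(datalake_models.StorageErrorException,
                         blob_models.StorageErrorException)
```

An except clause matches on class identity (and inheritance), not on the class name, so the second call falls through to the generic handler even though both classes print the same name.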

You want to compare the type of the exception; change your condition to:
if type(e)==AccountIsDisabled:
Example:
class AccountIsDisabled(Exception):
    pass
print("try #1")
try:
    raise AccountIsDisabled
except Exception as e:
    if type(e)==AccountIsDisabled:
        pass
    else:
        print(e)
print("try #2")
try:
    raise Exception('hi', 'there')
except Exception as e:
    if type(e)==AccountIsDisabled:
        pass
    else:
        print(e)
Output:
try #1
try #2
('hi', 'there')

Pythonic exception handling: only catching specific errno

I often read that in Python "it is easier to ask for forgiveness than for permission", so it is sometimes considered better to use try/except instead of if.
I often have statements like
if (not os.path.isdir(dir)):
    os.mkdir(dir)
The likely replacement would be
try:
    os.mkdir(dir)
except OSError:
    pass
However, I would like to be more specific and only ignore errno.EEXIST, as this is the only error that is expected to happen, and I have no idea what else could happen.
try:
    os.mkdir(dir)
except OSError as e:
    if e.errno != errno.EEXIST:
        raise
    else:
        pass
This seems to do the trick. But it is really bulky and will 'pollute' my code and reduce readability if I need plenty of these code blocks. Is there a pythonic way to do this in Python 2.X? What is the standard procedure to handle such cases?
edits:
use raise instead of raise OSError, as pointed out by @Francisco Couzo
I use Python 2.7

I just stumbled across what is probably the most elegant solution: creating an "ignored" context manager:
import errno
import os
from contextlib import contextmanager
@contextmanager
def ignorednr(exception, *errornrs):
    try:
        yield
    except exception as e:
        if e.errno not in errornrs:
            raise
        pass
with ignorednr(OSError, errno.EEXIST):
    os.mkdir(dir)
This way I just have the ugly job of creating the context manager once; from then on the syntax is quite nice and readable.
The solution is taken from https://www.youtube.com/watch?v=OSGv2VnC0go.
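A runnable sketch of the same idea, exercised against a temporary directory, is shown below. The name ignore_errno and the paths are assumptions for this example; the logic mirrors the context manager above. It checks both sides: EEXIST is swallowed, while a different errno (ENOENT for a missing parent directory) still propagates.

```python
import errno
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def ignore_errno(exception, *errnos):
    """Swallow `exception` only when its errno is one of `errnos`."""
    try:
        yield
    except exception as e:
        if e.errno not in errnos:
            raise


base = tempfile.mkdtemp()
target = os.path.join(base, "data")

os.mkdir(target)
with ignore_errno(OSError, errno.EEXIST):
    os.mkdir(target)  # already exists: EEXIST is ignored

missing_parent = os.path.join(base, "no_such_parent", "child")
try:
    with ignore_errno(OSError, errno.EEXIST):
        os.mkdir(missing_parent)  # ENOENT is not in the list: re-raised
except OSError as e:
    propagated = e.errno
```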

If you are calling it multiple times with different args, put it in a function:
def catch(d, err):
    try:
        os.mkdir(d)
    except OSError as e:
        if e.errno != err:
            raise
Then call the function, passing in whatever args you need:
catch("foo", errno.EEXIST)
You could also allow passing multiple errnos if you wanted more:
def catch(d, *errs):
    try:
        os.mkdir(d)
    except OSError as e:
        if e.errno not in errs:
            raise
catch("foo", errno.EEXIST, errno.EPERM)

This example is for the exception OSError: 17, 'File exists'
import os
import sys
try:
    value = os.mkdir("dir")
except:
    e = sys.exc_info()[:2]
    e = str(e)
    if "17" in e:
        raise OSError  # Perform Action
    else:
        pass
Just change the number 17 to your exception number. You can get a better explanation at this link.

How to handle multiple exceptions?

I'm a Python learner, trying to handle a few scenarios:
Reading a file.
Formatting Data.
Manipulating/Copying Data.
Writing a file.
So far, I have:
try:
    # Do all
except Exception as err1:
    print err1
    # File Reading error / File Not Present
except Exception as err2:
    print err2
    # Data Format is incorrect
except Exception as err3:
    print err3
    # Copying Issue
except Exception as err4:
    print err4
    # Permission denied for writing
The idea of implementing in this fashion is to catch the exact error for all different scenarios. I can do it in all separate try/except blocks.
Is this possible? And reasonable?

Your try blocks should be as minimal as possible, so
try:
    # do all
except Exception:
    pass
is not something you want to do.
The code in your example won't work as you expect it to, because in every except block you are catching the most general exception type, Exception. In fact, only the first except block will ever be executed.
What you want to do is have multiple try/except blocks, each one responsible for as few things as possible and catching the most specific exception.
For example:
try:
    # opening the file
except FileNotFoundError:
    print('File does not exist')
    exit()
try:
    # writing to file
except PermissionError:
    print('You do not have permission to write to this file')
    exit()
However, sometimes it is appropriate to catch different types of exceptions, in the same except block or in several blocks.
try:
    ssh.connect()
except (ConnectionRefusedError, TimeoutError):
    pass
or
try:
    ssh.connect()
except ConnectionRefusedError:
    pass
except TimeoutError:
    pass
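As a concrete illustration of "small try blocks, specific exceptions", here is a minimal Python 3 sketch. The helper names read_text and parse_number and the file names are assumptions made for this example; each helper guards exactly one operation and catches only the exception that operation is expected to raise.

```python
import os
import tempfile


def read_text(path):
    """Return the file's text, or None if the file does not exist."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return None


def parse_number(text):
    """Return text as an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "good.txt")
with open(good, "w") as f:
    f.write("42")

missing = os.path.join(workdir, "missing.txt")

parsed = parse_number(read_text(good))
absent = read_text(missing)
invalid = parse_number("abc")
```

Any exception other than the one named in each clause (for example a PermissionError while reading) is deliberately left to propagate, so unexpected failures are not hidden.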

As DeepSpace stated,
Your try blocks should be as minimal as possible.
If you want to achieve
try:
    # do all
except Exception:
    pass
Then you might as well do something like
def open_file(file):
    retval = False
    try:
        # opening the file successful?
        retval = True
    except FileNotFoundError:
        print('File does not exist')
    except PermissionError:
        print('You have no permission.')
    return retval
def crunch_file(file):
    retval = False
    try:
        # conversion or whatever logical operation with your file?
        retval = True
    except ValueError:
        print('Probably wrong data type?')
    return retval
if __name__ == "__main__":
    if open_file(file1):
        open(file1)
    if open_file(file2) and crunch_file(file2):
        print('opened and crunched')

Yes, this is possible.
For example:
try:
    ...
except RuntimeError as err1:
    print(err1)
except NameError as err2:
    print(err2)
...
Just name the exact error you want to intercept.

How to identify what function call raise an exception in Python?

I need to identify which call raised an exception so that I can produce a better error message. Is there a way?
Look at my example:
try:
    os.mkdir('/valid_created_dir')
    os.listdir('/invalid_path')
except OSError, msg:
    # here I want a way to identify who raised the exception
    if is_mkdir_who_raise_an_exception:
        do some things
    if is_listdir_who_raise_an_exception:
        do other things ..
How can I handle this in Python?

If you have completely separate tasks to execute depending on which function failed, as your code seems to show, then separate try/except blocks, as the existing answers suggest, may be better (though you may need to skip the second part if the first one has failed).
If you have many things that you need to do in either case, and only a small amount of work that depends on which function failed, then separating might create a lot of duplication and repetition, so the form you suggested may well be preferable. The traceback module in Python's standard library can help in this case:
import os, sys, traceback
try:
    os.mkdir('/valid_created_dir')
    os.listdir('/invalid_path')
except OSError, msg:
    tb = sys.exc_info()[-1]
    stk = traceback.extract_tb(tb, 1)
    fname = stk[0][2]
    print 'The failing function was', fname
Of course instead of the print you'll use if checks to decide exactly what processing to do.

Wrap each function call in its own try/except block.
try:
    os.mkdir('/valid_created_dir')
except Exception, e:
    ## do something,
    ## quite probably skipping the next try statement
try:
    os.listdir('/invalid_path')
except OSError, msg:
    ## do something
This will help readability/comprehension anyway.

How about the simple solution:
try:
    os.mkdir('/valid_created_dir')
except OSError, msg:
    # it is mkdir who raised the exception:
    do some things
try:
    os.listdir('/invalid_path')
except OSError, msg:
    # it is listdir who raised the exception:
    do other things ..

Here's the clean approach: attach additional information to the exception where it happens, and then use it in a unified place:
import errno, os, sys
def func():
    try:
        os.mkdir('/dir')
    except OSError, e:
        if e.errno != errno.EEXIST:
            e.action = "creating directory"
            raise
    try:
        os.listdir('/invalid_path')
    except OSError, e:
        e.action = "reading directory"
        raise
try:
    func()
except Exception, e:
    if getattr(e, "action", None):
        text = "Error %s: %s" % (e.action, e)
    else:
        text = str(e)
    sys.exit(text)
In practice, you'd want to create wrappers for functions like mkdir and listdir if you want to do this, rather than scattering small try/except blocks all over your code.
Normally, I don't find this level of detail in error messages so important (the Python message is usually plenty), but this is a clean way to do it.
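The same pattern in Python 3 syntax, run against a temporary directory, might look like the sketch below. The function names list_new_dir and describe_failure and the directory names are assumptions for this example. Unlike the code above, this sketch also labels the "already exists" case, so that both failure paths can be demonstrated.

```python
import os
import tempfile


def list_new_dir(parent, name):
    """Create parent/name, then list parent/name/missing (which fails)."""
    target = os.path.join(parent, name)
    try:
        os.mkdir(target)
    except OSError as e:
        e.action = "creating directory"  # label the failing step
        raise
    try:
        return os.listdir(os.path.join(target, "missing"))
    except OSError as e:
        e.action = "reading directory"
        raise


def describe_failure(parent, name):
    """Return the action label attached to the exception, if any."""
    try:
        list_new_dir(parent, name)
    except OSError as e:
        return getattr(e, "action", None)
    return None


root = tempfile.mkdtemp()
first = describe_failure(root, "a")   # mkdir succeeds, listdir fails
second = describe_failure(root, "a")  # mkdir fails: directory exists
```

The handler in describe_failure does not need to know which call failed; it only reads the label that was attached where the error occurred.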
