Let's take an API as an example:

def get_abs_directory(self, path):
    if os.path.isdir(path):
        return path
    else:
        return os.path.split(os.path.abspath(path))[0]
My question is: what is the Pythonic way of validating parameters? Should I skip validation entirely? (I have observed that most Python code does no validation at all.)
Should I check that "path" is neither empty nor None?
Should I check that the type of path is always str?
In general, should I check the types of parameters? (I guess not, as Python is dynamically typed.)
This question is not specific to file I/O; file I/O is used only as an example.
As mentioned in the documentation here, Python follows an EAFP approach. This means that we usually use try/except blocks instead of trying to validate parameters up front. Let me demonstrate:
import os

def get_abs_directory(path):
    try:
        if os.path.isdir(path):
            return path
        else:
            return os.path.split(os.path.abspath(path))[0]
    except TypeError:
        print("You inserted the wrong type!")

if __name__ == '__main__':
    get_abs_directory(1)  # Using an int instead of a string, which is caught by TypeError
However, you could wish to code in an LBYL (Look Before You Leap) style, which would look something like this:
import os

def get_abs_directory(path):
    if not isinstance(path, str):
        print("You gave us the wrong type, you big meany!")
        return None
    if os.path.isdir(path):
        return path
    else:
        return os.path.split(os.path.abspath(path))[0]

if __name__ == '__main__':
    get_abs_directory(1)
EAFP is the de facto standard in Python for situations like this; at the same time, nothing prevents you from following LBYL all the way through if you prefer.
However, there are reservations about when EAFP is applicable:
When the code can still deal with exception scenarios, will break at some point anyway, or enables the caller to handle possible errors, it may be best just to follow the EAFP principle.
When using EAFP would lead to silent errors, explicit checks/validations (LBYL) may be best, as sketched below.
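A rough sketch of that second caveat, using a plain dict lookup as a stand-in (the config keys here are illustrative):

config = {"port": "8080"}

# EAFP-friendly: a missing key raises KeyError, so try/except works
try:
    port = int(config["port"])
except KeyError:
    port = 8080

# silent under EAFP: .get() returns None instead of raising, so an
# explicit LBYL check is clearer here
timeout = config.get("timeout")
if timeout is None:
    timeout = 30.0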
Regarding that, there is a Python module, parameters-validation, to ease validation of function parameters when you need to:
from parameters_validation import no_whitespaces, non_blank, non_empty, non_negative, strongly_typed, validate_parameters

@validate_parameters
def register(
    token: strongly_typed(AuthToken),
    name: non_blank(str),
    age: non_negative(int),
    nickname: no_whitespaces(non_empty(str)),
    bio: str,
):
    # do register
Disclaimer: I am the project maintainer.
Even though this has already been answered, it is too long for a comment, so I'll just add another answer.
In general, type checking is done for two reasons: making sure your function actually completes, and avoiding difficult-to-debug downstream failures from bad output.
For the first problem, the answer is always the same: EAFP is the normal method, and you don't worry about bad inputs.
For the second... the answer depends on your normal use cases, and you DO worry about bad inputs/bugs. EAFP is still appropriate (and it's easier and more debuggable) when bad inputs always generate exceptions (where 'bad inputs' can possibly be limited to the types of bad inputs your app is expected to produce). But if there's a possibility that bad inputs could create a valid output, then LBYL may make your life easier later.
Example: let's say that you call square(), put this value into a dictionary, and then (much) later extract this value from the dictionary and use it as an index. Indexes, of course, must be integers.
square(2) == 4, which is a valid integer, and so is correct. square('a') will always fail, because 'a' * 'a' is invalid and will always throw an exception. If these are the only two possibilities, then you can safely use EAFP: if you do get bad data, it will throw an exception, generate a traceback, and you can restart with pdb and get a good indication of what's wrong.
However... let's say that your app uses some floating-point math, and it's possible (assuming you have a bug! not normal operation, of course) for you to accidentally call square(1.43). This will return a valid value, 2.0449 or so. You will NOT get an exception here, so your app will happily take that 2.0449 and put it in the dictionary for you. Much later, your app will pull this value back out of the dictionary, use it as an index into a list, and crash. You'll get a traceback, you'll restart with pdb, and realize that it doesn't help you at all, because that value was calculated a long time ago; you no longer have the inputs, or any idea how that data got there. And those are not fun to debug.
In those cases, you can use asserts (a special form of LBYL) to move detection of those sorts of bugs earlier, or you can do the checks explicitly. If you never have a bug calling that function, then either one will work. But if you do... then you'll really be glad you checked the inputs, moving the failure artificially close to the bug, rather than letting it surface naturally at some random later place in your app. A sketch follows.
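A minimal sketch of the square() scenario above (the assert guard is an illustrative addition, not part of the original):

def square(x):
    # hypothetical guard: without it, square(1.43) "succeeds" and the bad
    # value only explodes much later, when used as a list index
    assert isinstance(x, int), f"square() expects an int, got {x!r}"
    return x * x

cache = {}
cache["answer"] = square(2)       # fine
# cache["answer"] = square(1.43)  # AssertionError here, near the real bug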
The code does "trap" errors, as this test code shows; an exception is raised for passing in None:
import os
import os.path
import unittest

class pathetic(unittest.TestCase):
    def setUp(self):
        if not os.path.exists("ABC"):
            os.mkdir("ABC")
        else:
            self.fail("ABC exists, can't make test fixture")

    def tearDown(self):
        if os.path.exists("ABC"):
            os.rmdir("ABC")

    def test1(self):
        mycwd = os.path.split(os.path.abspath(os.getcwd()))[0]
        self.assertEqual("/", self.get_abs_directory("/abc"))
        self.assertEqual(mycwd, self.get_abs_directory(""))
        self.assertEqual("/ABC", self.get_abs_directory("/ABC/DEF"))
        with self.assertRaises(TypeError):  # woo hoo, exception
            self.get_abs_directory(None)

    def get_abs_directory(self, path):
        if os.path.isdir(path):
            return path
        else:
            return os.path.split(os.path.abspath(path))[0]

if __name__ == '__main__':
    unittest.main()
How might I store the value of an exception in a variable? Looking at older posts, such as the one linked below, those methods do not work in Python 3. What should I do without using a pip-installed package?
My goal is to iterate through data, checking it all for various things, and retain all the errors to put into a spreadsheet.
Older post: Python: catch any exception and put it in a variable
MWE:
import sys

# one way to do this
test_val = 'stringing'
try:
    assert isinstance(test_val, int), '{} is not the right dtype'.format(test_val)
except AssertionError:
    the_type, the_value, the_traceback = sys.exc_info()
Use the as form of the except clause to bind the exception to a variable:

except AssertionError as err:

For more details, like unpacking the contents of the exception, see towards the bottom of this block of the tutorial, the paragraph that begins "The except clause may specify a variable after the exception name."
If you want to test whether a certain variable is of type int, you can use isinstance(obj, int) as you already do.
However you should never use assert to validate any data, since these checks will be turned off when Python is invoked with the -O command line option. assert statements serve to validate your program's logic, i.e. to verify certain states that you can logically deduce your program should be in.
If you want to save whether your data conforms to some criteria, just store the True or False result directly, rather than asserting it and then catching the error (beyond the reasons above, that's unnecessary extra work); a minimal sketch follows.
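A small sketch of collecting validation results for a spreadsheet (the rows, names, and output file are all illustrative):

import csv

# each row pairs a value with the type it is expected to have
rows = [("stringing", int), (42, int), (3.14, int)]

results = []
for value, expected_type in rows:
    ok = isinstance(value, expected_type)  # just keep the bool
    error = "" if ok else "{!r} is not {}".format(value, expected_type.__name__)
    results.append((value, ok, error))

# write the collected results out for the spreadsheet
with open("validation.csv", "w", newline="") as f:
    csv.writer(f).writerows(results)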
Here is the normal way of catching exceptions and raising them to callers:
def check(a):
    data = {}
    if not a:
        raise Exception("'a' was bad")
    return data

def doSomething():
    try:
        data = check(None)
    except Exception as e:
        print(e)
Here is an alternative, plus a few things I like about it:
'data' is always present; the 'check' function could set up some defaults for data, so the logic is contained within the function and doesn't have to be repeated. It also means a dev can't make the mistake of trying to access data when an error has occurred. (data could be defined at the very top of the 'doSomething' function and assigned some defaults.)
You don't have to have try/excepts everywhere cluttering up the 'doSomething' functions.
def check(a):
    errors = []
    data = {}
    if not a:
        errors.append("'a' was bad")
    return data, errors

def doSomething():
    data, errors = check(None)
    if errors:
        print(errors)
Is there anything wrong with it? What are people's opinions?
There are times when the second approach can be useful (for instance, if you're doing a sequence of relatively independent operations and just want to keep a record of which ones failed). But if your goal is to prevent continuing in an incorrect state (i.e., not "make the mistake of trying to access data when an exception has occurred"), the second way is not good. If you need to do the check at all, you probably want to do it like this:
def check(a):
    data = {}
    if not a:
        raise Exception("'a' was bad")
    return data

def doSomething():
    data = check(None)
    # continue using data
That is, do the check and, if it succeeds, just keep going. You don't need to "clutter up" the code with except. You should only use try/except if you can actually handle the error in some way. If you can't, then you want the exception to continue to propagate, and if necessary, propagate all the way up and stop the program, because that will stop you from going on to use data in an invalid way.
Moreover, if it's possible to continue after the check but still "make the mistake" of accessing invalid data, then you didn't do a very good check. The point of the check, if there is one, should be to ensure that you can confidently proceed as long as the check doesn't raise an exception. The way you're doing it, you basically do the check twice: you run check, and then you check for an exception. Instead, just run the check. If it fails, raise an exception. If it succeeds, go on with your code. If you want the check to distinguish recoverable and unrecoverable errors and log the unrecoverable ones, then just do the logging in the check.
Of course, in many cases you can make it even simpler:
def doSomething():
    data.blah()
    # use data however you're going to use it
In other words, just do what you're going to do. You will then get an exception if you do something with data that doesn't work. There's often no reason to put in a separate explicit check. (There are certainly valid reasons for checks, of course. One would be if the actual operation is expensive and might fail at a late stage, so you want to check for validity upfront to avoid wasting time on an operation that will fail after a long computation. Another reason might be if the operation involves I/O or concurrency of some kind and could leave some shared resource in an invalid state.)
Some years from now, when you read your own code again and try to figure out what on earth you were trying to do, you'll be glad for all the clutter of the try-excepts that help make perfectly obvious what is an error and what is data.
try/excepts are not clutter; they increase code readability by indicating that a piece of code "expects" an exception. If you mix your logic ifs with error-handling ifs, you may lose some readability in your code.
Further, if you know that 'a' being None is the only kind of error you will have, you can write an if and handle it this way; that makes sense in the simple example you gave.
However, if a different error happens, an exception is raised anyway!
I wouldn't recommend avoiding try/except everywhere as a general programming practice; using it acknowledges and marks the places in code where exceptions happen.
Is there a performance or code maintenance issue with using assert as part of the standard code instead of using it just for debugging purposes?
Is
assert x >= 0, 'x is less than zero'
better or worse than
if x < 0:
    raise Exception('x is less than zero')
Also, is there any way to set a business rule, like "if x < 0, raise an error," that is always checked without try/except/finally? That is, as if you could write assert x >= 0 at the start of a function, and anywhere within the function where x becomes less than 0 an exception would be raised.
Asserts should be used to test conditions that should never happen. The purpose is to crash early in the case of a corrupt program state.
Exceptions should be used for errors that can conceivably happen, and you should almost always create your own Exception classes.
For example, if you're writing a function to read from a configuration file into a dict, improper formatting in the file should raise a ConfigurationSyntaxError, while you can assert that you're not about to return None.
In your example, if x is a value set via a user interface or from an external source, an exception is best.
If x is only set by your own code in the same program, go with an assertion.
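As a rough sketch of that split (ConfigurationSyntaxError is a made-up class for illustration, and the key=value format is likewise an assumption):

class ConfigurationSyntaxError(Exception):
    pass

def load_config(path):
    config = {}
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            if "=" not in line:
                # conceivable runtime problem (bad file contents): exception
                raise ConfigurationSyntaxError(f"{path}:{lineno}: expected key=value")
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    # a bug in this function if it were ever violated: assert
    assert isinstance(config, dict)
    return config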
"assert" statements are removed when the compilation is optimized. So, yes, there are both performance and functional differences.
"The current code generator emits no code for an assert statement when optimization is requested at compile time." (Python 2 docs, Python 3 docs)
If you use assert to implement application functionality, then optimize the deployment to production, you will be plagued by "but-it-works-in-dev" defects.
See PYTHONOPTIMIZE and the -O and -OO flags.
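A tiny script makes the difference concrete (the file name is illustrative):

# demo.py
assert False, "this fires only when asserts are enabled"
print("asserts were stripped")

Running python demo.py raises AssertionError; python -O demo.py prints the message instead.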
To automatically raise an error when x becomes less than zero anywhere within the function, you can use a descriptor. Here is an example:
class LessThanZeroException(Exception):
    pass

class variable(object):
    def __init__(self, value=0):
        self.__x = value

    def __set__(self, obj, value):
        if value < 0:
            raise LessThanZeroException('x is less than zero')
        self.__x = value

    def __get__(self, obj, objType):
        return self.__x

class MyClass(object):
    x = variable()
>>> m = MyClass()
>>> m.x = 10
>>> m.x -= 20
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "my.py", line 7, in __set__
    raise LessThanZeroException('x is less than zero')
LessThanZeroException: x is less than zero
The four purposes of assert
Assume you work on 200,000 lines of code with four colleagues Alice, Bernd, Carl, and Daphne.
They call your code, you call their code.
Then assert has four roles:
Inform Alice, Bernd, Carl, and Daphne what your code expects.
Assume you have a method that processes a list of tuples and the program logic can break if those tuples are not immutable:
def mymethod(listOfTuples):
    assert all(type(tp) == tuple for tp in listOfTuples)
This is more trustworthy than equivalent information in the documentation
and much easier to maintain.
Inform the computer what your code expects.
assert enforces proper behavior from the callers of your code.
If your code calls Alice's code and Bernd's code calls yours,
then without the assert, if the program crashes in Alice's code,
Bernd might assume it was Alice's fault,
Alice investigates and might assume it was your fault,
and you investigate and tell Bernd it was in fact his.
Lots of work lost.
With asserts, whoever gets a call wrong, they will quickly be able to see it was
their fault, not yours. Alice, Bernd, and you all benefit.
Saves immense amounts of time.
Inform the readers of your code (including yourself) what your code has achieved at some point.
Assume you have a list of entries and each of them can be clean (which is good)
or it can be smorsh, trale, gullup, or twinkled (which are all not acceptable).
If it's smorsh it must be unsmorshed; if it's trale it must be baludoed;
if it's gullup it must be trotted (and then possibly paced, too);
if it's twinkled it must be twinkled again except on Thursdays.
You get the idea: It's complicated stuff.
But the end result is (or ought to be) that all entries are clean.
The Right Thing(TM) to do is to summarize the effect of your
cleaning loop as
assert all(entry.isClean() for entry in mylist)
This statement saves a headache for everybody trying to understand
what exactly it is that the wonderful loop is achieving.
And the most frequent of these people will likely be yourself.
Inform the computer what your code has achieved at some point.
Should you ever forget to pace an entry needing it after trotting,
the assert will save your day and prevent your code
from breaking dear Daphne's much later.
In my mind, assert's two purposes of documentation (1 and 3) and
safeguard (2 and 4) are equally valuable.
Informing the people may even be more valuable than informing the computer
because it can prevent the very mistakes the assert aims to catch (in case 1)
and plenty of subsequent mistakes in any case.
In addition to the other answers, asserts themselves throw exceptions, but only AssertionErrors. From a utilitarian standpoint, assertions aren't suitable for when you need fine-grained control over which exceptions you catch.
The only thing that's really wrong with this approach is that it's hard to make a very descriptive exception using assert statements. If you're looking for the simpler syntax, remember you can also do something like this:
class XLessThanZeroException(Exception):
    pass

def CheckX(x):
    if x < 0:
        raise XLessThanZeroException()

def foo(x):
    CheckX(x)
    # do stuff here
Another problem with using assert for normal condition-checking is that it becomes impossible to disable the debugging asserts using the -O flag without also disabling the real checks.
The English language word assert here is used in the sense of swear, affirm, avow. It doesn't mean "check" or "should be". It means that you as a coder are making a sworn statement here:
# I solemnly swear that here I will tell the truth, the whole truth,
# and nothing but the truth, under pains and penalties of perjury, so help me FSM
assert answer == 42
If the code is correct, barring Single-event upsets, hardware failures and such, no assert will ever fail. That is why the behaviour of the program to an end user must not be affected. Especially, an assert cannot fail even under exceptional programmatic conditions. It just doesn't ever happen. If it happens, the programmer should be zapped for it.
As has been said previously, assertions should be used when your code SHOULD NOT ever reach a point, meaning there is a bug there. Probably the most useful reason I can see to use an assertion is an invariant/pre/postcondition. These are something that must be true at the start or end of each iteration of a loop or a function.
For example, a recursive function (two separate functions, so one handles bad input and the other handles bad code, because it's hard to distinguish the two with recursion). This would make it obvious, if I forgot to write the if statement, what had gone wrong.
def SumToN(n):
    if n < 0:
        raise ValueError("n must be greater than or equal to 0")
    else:
        return RecursiveSum(n)

def RecursiveSum(n):
    # precondition: n >= 0
    assert n >= 0
    if n == 0:
        return 0
    return RecursiveSum(n - 1) + n
    # postcondition: returned sum of 1 to n
These loop invariants often can be represented with an assertion.
Well, this is an open question, and I have two aspects that I want to touch on: when to add assertions and how to write the error messages.
Purpose
To explain it to a beginner: assertions are statements which can raise errors, but you won't be catching them. They normally should not be raised, but in real life they sometimes do get raised anyway. And this is a serious situation which the code cannot recover from; we call it a 'fatal error'.
Next, it's for 'debugging purposes', which, while correct, sounds very dismissive. I like the 'declaring invariants, which should never be violated' formulation better, although it works differently on different beginners... Some 'just get it', and others either don't find any use for it, or replace normal exceptions, or even control flow with it.
Style
In Python, assert is a statement, not a function! (Remember that assert(False, 'is true') will not raise, because a non-empty tuple is truthy.) But, with that out of the way:
When, and how, to write the optional 'error message'?
This actually applies to unit testing frameworks too, which often have many dedicated methods to do assertions (assertTrue(condition), assertFalse(condition), assertEqual(actual, expected), etc.). They often also provide a way to comment on the assertion.
In throw-away code you could do without the error messages.
In some cases, there is nothing to add to the assertion:
def dump(something):
    assert isinstance(something, Dumpable)
    # ...
But apart from that, a message is useful for communication with other programmers (who are sometimes interactive users of your code, e.g. in IPython/Jupyter etc.).
Give them information, not just leak internal implementation details.
instead of:
assert meaningless_identifier <= MAGIC_NUMBER_XXX, 'meaningless_identifier is greater than MAGIC_NUMBER_XXX!!!'
write:
assert meaningless_identifier <= MAGIC_NUMBER_XXX, 'reactor temperature above critical threshold'
or maybe even:
assert meaningless_identifier <= MAGIC_NUMBER_XXX, f'reactor temperature ({meaningless_identifier}) above critical threshold ({MAGIC_NUMBER_XXX})'
I know, I know - this is not a case for a static assertion, but I want to point to the informational value of the message.
Negative or positive message?
This may be controversial, but it hurts me to read things like:
assert a == b, 'a is not equal to b'
These are two contradictory things written next to each other. So whenever I have influence on the codebase, I push for specifying what we want, using extra verbs like 'must' and 'should', instead of saying what we don't want.
assert a == b, 'a must be equal to b'
Then, getting AssertionError: a must be equal to b is also readable, and the statement looks logical in code. Also, you can get something out of it without reading the traceback (which can sometimes not even be available).
For what it's worth, if you're dealing with code which relies on assert to function properly, then adding the following code will ensure that asserts are enabled:
try:
    assert False
    raise Exception('Python assertions are not working. This tool relies on Python assertions to do its job. Possible causes are running with the "-O" flag or running a precompiled (".pyo" or ".pyc") module.')
except AssertionError:
    pass
Is there a performance issue?
Please remember to "make it work first before you make it work fast".
Very few percent of any program are usually relevant for its speed.
You can always kick out or simplify an assert if it ever proves to
be a performance problem -- and most of them never will.
Be pragmatic:
Assume you have a method that processes a non-empty list of tuples and the program logic will break if those tuples are not immutable. You should write:
def mymethod(listOfTuples):
    assert all(type(tp) == tuple for tp in listOfTuples)
This is probably fine if your lists tend to be ten entries long, but
it can become a problem if they have a million entries.
But rather than discarding this valuable check entirely you could
simply downgrade it to
def mymethod(listOfTuples):
    assert type(listOfTuples[0]) == tuple  # in fact _all_ must be tuples!
which is cheap but will likely catch most of the actual program errors anyway.
An assert checks:
1. a valid condition,
2. a valid statement,
3. true logic
of the source code. Instead of failing the whole project, it raises an alarm that something is not appropriate in your source file.
In example 1, since the variable str is not null, no assert or exception gets raised.
Example 1:

#!/usr/bin/python

str = 'hello Python!'
strNull = 'string is Null'

if __debug__:
    if not str: raise AssertionError(strNull)
print(str)

if __debug__:
    print('FileName '.ljust(30, '.'), __name__)
    print('FilePath '.ljust(30, '.'), __file__)

------------------------------------------------------
Output:

hello Python!
FileName ..................... hello
FilePath ..................... C:/Python\hello.py
In example 2, the variable str is empty, so the assert saves the user from going ahead with a faulty program.

Example 2:

#!/usr/bin/python

str = ''
strNull = 'NULL String'

if __debug__:
    if not str: raise AssertionError(strNull)
print(str)

if __debug__:
    print('FileName '.ljust(30, '.'), __name__)
    print('FilePath '.ljust(30, '.'), __file__)

------------------------------------------------------
Output:

AssertionError: NULL String
Once we no longer want the debug checks and consider the assertion issues in the source code resolved, enable the optimization flag:

python -O assertStatement.py

Nothing will be printed, because the __debug__ blocks are removed.
Both the use of assert and the raising of exceptions are about communication.
Assertions are statements about the correctness of code addressed at developers: An assertion in the code informs readers of the code about conditions that have to be fulfilled for the code being correct. An assertion that fails at run-time informs developers that there is a defect in the code that needs fixing.
Exceptions are indications of non-typical situations that can occur at run-time but cannot be resolved by the code at hand, addressed at the calling code to be handled there. The occurrence of an exception does not indicate that there is a bug in the code.
Best practice
Therefore, if you consider the occurrence of a specific situation at run-time as a bug that you would like to inform the developers about ("Hi developer, this condition indicates that there is a bug somewhere, please fix the code."), then go for an assertion. If the assertion checks input arguments of your code, you should typically add to the documentation that your code has "undefined behaviour" when the input arguments violate those conditions.
If instead the occurrence of that very situation is not an indication of a bug in your eyes, but instead a (maybe rare but) possible situation that you think should rather be handled by the client code, raise an exception. The situations in which each exception is raised should be part of the documentation of the respective code.
Is there a performance [...] issue with using assert
The evaluation of assertions takes some time. They can be eliminated at compile time, though. This has some consequences, however, see below.
Is there a [...] code maintenance issue with using assert
Normally assertions improve the maintainability of the code, since they improve readability by making assumptions explicit and regularly verifying these assumptions at run-time. This will also help catch regressions. There is one issue, however, that needs to be kept in mind: expressions used in assertions should have no side effects. As mentioned above, assertions can be eliminated at compile time, which means that any potential side effects would disappear with them. This can unintentionally change the behaviour of the code, as the sketch below shows.
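A minimal illustration of that side-effect pitfall:

pending = [1, 2, 3]
# BAD: pop() mutates state inside the assert; under -O the whole statement
# is stripped, and the list silently keeps its last element
assert pending.pop() == 3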
In IDEs such as PTVS, PyCharm, and Wing, assert isinstance() statements can be used to enable code completion for otherwise unclear objects, for example:
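A small sketch of the trick (http.client.HTTPResponse is just one convenient example of an otherwise "unclear" object):

import http.client

def read_body(resp):
    # after this assert, the IDE can infer resp's concrete type and
    # offer completion for HTTPResponse methods
    assert isinstance(resp, http.client.HTTPResponse)
    return resp.read()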
I'd add I often use assert to specify properties such as loop invariants or logical properties my code should have, much like I'd specify them in formally-verified software.
They serve the purposes of informing readers, helping me reason, and checking that I am not making a mistake in my reasoning. For example:
k = 0
for i in range(n):
    assert k == i * (i - 1) // 2  # sum of 0, 1, ..., i-1
    k += i
    # do some things
or in more complicated situations:
def is_sorted(l):  # renamed from sorted() to avoid shadowing the builtin
    return all(l1 <= l2 for l1, l2 in zip(l, l[1:]))

def mergesort(l):
    if len(l) < 2:  # Python 3.10's match-case could be used for this instead of checking length
        return l
    k = len(l) // 2
    l1 = mergesort(l[:k])
    l2 = mergesort(l[k:])
    assert is_sorted(l1)  # here the asserts let me make explicit what properties my code should have
    assert is_sorted(l2)  # I expect them to be disabled in a production build
    return merge(l1, l2)  # merge() is assumed defined elsewhere
Since asserts are disabled when python is run in optimized mode, do not hesitate to write costly conditions in them, especially if it makes your code clearer and less bug-prone
Is there a way of knowing (at coding time) which exceptions to expect when executing Python code?
I end up catching the base Exception class 90% of the time, since I don't know which exception type might be thrown. (Reading the documentation doesn't always help, since an exception can be propagated from deep inside the code, and many times the documentation is not updated or correct.)
Is there some kind of tool to check this (like by reading the Python code and libs)?
I guess a solution could only be imprecise because of the lack of static typing rules.
I'm not aware of any tool that checks exceptions, but you could come up with your own tool matching your needs (a good chance to play a little with static analysis).
As a first attempt, you could write a function that builds an AST, finds all Raise nodes, and then tries to figure out common patterns of raising exceptions (e.g., calling a constructor directly).
Let x be the following program:
x = '''\
if f(x):
    raise IOError(errno.ENOENT, 'not found')
else:
    e = g(x)
    raise e
'''
Build the AST using the compiler package (Python 2 only; Python 3 removed it in favour of the ast module, sketched after this example):
tree = compiler.parse(x)
Then define a Raise visitor class:
class RaiseVisitor(object):
    def __init__(self):
        self.nodes = []

    def visitRaise(self, n):
        self.nodes.append(n)
And walk the AST collecting Raise nodes:
v = RaiseVisitor()
compiler.walk(tree, v)

>>> print v.nodes
[
    Raise(
        CallFunc(
            Name('IOError'),
            [Getattr(Name('errno'), 'ENOENT'), Const('not found')],
            None, None),
        None, None),
    Raise(Name('e'), None, None),
]
You may continue by resolving symbols using compiler symbol tables, analyzing data dependencies, etc. Or you may just deduce that CallFunc(Name('IOError'), ...) "should definitely mean raising IOError", which is quite OK for quick practical results :)
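For Python 3, where the compiler package was removed, the same idea can be sketched with the standard ast module:

import ast

source = '''\
if f(x):
    raise IOError(errno.ENOENT, 'not found')
else:
    e = g(x)
    raise e
'''

class RaiseVisitor(ast.NodeVisitor):
    def __init__(self):
        self.nodes = []

    def visit_Raise(self, node):
        self.nodes.append(node)
        self.generic_visit(node)

v = RaiseVisitor()
v.visit(ast.parse(source))
for node in v.nodes:
    print(ast.dump(node))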
You should only catch exceptions that you will handle.
Catching all exceptions by their concrete types is nonsense. You should catch specific exceptions you can and will handle. For other exceptions, you may write a generic catch that catches "base Exception", logs it (use str() function) and terminates your program (or does something else that's appropriate in a crashy situation).
If you are really going to handle all exceptions and are sure none of them are fatal (for example, if you're running the code in some kind of sandboxed environment), then your approach of catching the generic BaseException fits your aims.
You might also be interested in the language's exception reference, not a reference for the library you're using.
If the library reference is really poor and it doesn't re-throw its own exceptions when catching system ones, the only useful approach is to run tests (maybe add them to the test suite, because if something is undocumented, it may change!). Delete a file crucial to your code and check what exception is being thrown. Supply too much data and check what error it yields.
You will have to run tests anyway, since, even if a method of getting the exceptions from source code existed, it wouldn't give you any idea how you should handle any of them. Maybe you should be showing the error message "File needful.txt is not found!" when you catch IndexError? Only a test can tell.
The correct tool to solve this problem is unittests. If you are having exceptions raised by real code that the unittests do not raise, then you need more unittests. For example:
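A small sketch of pinning exceptions down with unittest (the function under test is illustrative):

import unittest

def parse_port(s):
    return int(s)  # raises ValueError on bad input

class TestParsePort(unittest.TestCase):
    def test_bad_input_raises_value_error(self):
        with self.assertRaises(ValueError):
            parse_port("not-a-number")

if __name__ == '__main__':
    unittest.main()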
Consider this:

def f(duck):
    try:
        duck.quack()
    except ???:  # could be anything; duck can be any object
        ...
Obviously you can have an AttributeError if duck has no quack, or a TypeError if duck has a quack but it is not callable. You have no idea what duck.quack() might raise, though; maybe even a DuckError or something.
Now suppose you have code like this:
arr[i] = get_something_from_database()
If it raises an IndexError, you don't know whether it has come from arr[i] or from deep inside the database function. Usually it doesn't matter so much where the exception occurred; rather, something went wrong and what you wanted to happen didn't happen.
A handy technique is to catch and maybe re-raise the exception, like this:

try:
    arr[i] = get_something_from_database()
except Exception as e:
    # inspect e, decide what to do
    raise
No one has explained so far why you can't have a full, 100% correct list of exceptions, so I thought it worth commenting on. One of the reasons is first-class functions. Let's say that you have a function like this:
def apl(f, arg):
    return f(arg)
Now apl can raise any exception that f raises. While there are not many functions like that in the core library, anything that uses list comprehensions with custom filters, map, reduce, etc. is affected.
The documentation and the source analysers are the only "serious" sources of information here. Just keep in mind what they cannot do.
I ran into this when using socket. I wanted to find out all the error conditions I would run into (rather than trying to create errors and figure out what socket does, I just wanted a concise list). Ultimately I ended up grepping "/usr/lib64/python2.4/test/test_socket.py" for "raise":
$ grep raise test_socket.py
Any exceptions raised by the clients during their tests
raise TypeError, "test_func must be a callable function"
raise NotImplementedError, "clientSetUp must be implemented."
def raise_error(*args, **kwargs):
raise socket.error
def raise_herror(*args, **kwargs):
raise socket.herror
def raise_gaierror(*args, **kwargs):
raise socket.gaierror
self.failUnlessRaises(socket.error, raise_error,
self.failUnlessRaises(socket.error, raise_herror,
self.failUnlessRaises(socket.error, raise_gaierror,
raise socket.error
# Check that setting it to an invalid value raises ValueError
# Check that setting it to an invalid type raises TypeError
def raise_timeout(*args, **kwargs):
self.failUnlessRaises(socket.timeout, raise_timeout,
def raise_timeout(*args, **kwargs):
self.failUnlessRaises(socket.timeout, raise_timeout,
Which is a pretty concise list of errors. Now of course this only works on a case-by-case basis and depends on the tests being accurate (which they usually are). Otherwise you need to pretty much catch all exceptions, log them, dissect them, and figure out how to handle them (which with unit testing wouldn't be too difficult).
There are two ways that I found informative. The first one: run the code in IPython, which will display the exception type:
n = 2
str = 'me '
n + str

TypeError: unsupported operand type(s) for +: 'int' and 'str'
In the second way, we settle for catching too much and improve on it over time. Include a try expression in your code and catch except Exception as err. Print sufficient data to know what exception was thrown. As exceptions are thrown, improve your code by adding a more precise except clause. When you feel that you have caught all relevant exceptions, remove the all-inclusive one; removing it is a good thing to do anyway, because it swallows programming errors.
try:
    do_something()  # placeholder for the real work
except Exception as err:
    print("Some message")
    print(err.__class__)
    print(err)
    exit(1)
Normally, you'd need to catch exceptions only around a few lines of code. You wouldn't want to put your whole main function into one try/except clause. For every few lines you should always know (or be able to easily check) what kind of exception might be raised.
The docs have an exhaustive list of built-in exceptions. Don't try to except exceptions that you're not expecting; they might be handled/expected in the calling code.
Edit: what might be thrown depends, obviously, on what you're doing! Accessing a random element of a sequence: IndexError; a random element of a dict: KeyError; etc.
Just try to run those few lines in IDLE and cause an exception. But unittest would be a better solution, naturally.
This is a copy and pasted answer I wrote for How to list all exceptions a function could raise in Python 3?, I hope that is allowed.
I needed to do something similar and found this post. I decided I
would write a little library to help.
Say hello to Deep-AST. It's very early alpha but it is pip
installable. It has all of the limitations mentioned in this post
and some additional ones, but it's already off to a really good start.
For example, when parsing HTTPConnection.getresponse() from
http.client, it parses 24489 AST nodes. It finds 181 total raised
exceptions (this includes duplicates), of which 8 unique exceptions were
raised. A working code example.
The biggest flaw is that it currently does not work with a bare raise:
def foo():
    try:
        bar()
    except TypeError:
        raise
But I think this will be easy to solve and I plan on fixing it.
The library can handle more than just figuring out exceptions; what about listing all parent classes? It can handle that too!
What's better practice in a user-defined function in Python: raise an exception or return None? For example, I have a function that finds the most recent file in a folder.
def latestpdf(folder):
    # list the files and sort them
    try:
        latest = files[-1]
    except IndexError:
        # Folder is empty.
        return None  # One possibility
        raise FileNotFoundError()  # Alternative
    else:
        return somefunc(latest)  # In my case, somefunc parses the filename
Another option is to leave the exception alone and handle it in the caller's code, but I figure it's clearer to deal with a FileNotFoundError than an IndexError. Or is it bad form to re-raise an exception with a different name?
It's really a matter of semantics. What does foo = latestpdf(d) mean?
Is it perfectly reasonable that there's no latest file? Then sure, just return None.
Are you expecting to always find a latest file? Raise an exception. And yes, re-raising a more appropriate exception is fine.
If this is just a general function that's supposed to apply to any directory, I'd do the former and return None. If the directory is, e.g., meant to be a specific data directory that contains an application's known set of files, I'd raise an exception.
I would make a couple of suggestions before answering your question, as they may answer it for you.
Always name your functions descriptively. latestpdf means very little to anyone, but looking over your function, latestpdf() gets the latest pdf. I would suggest that you name it getLatestPdfFromFolder(folder).
As soon as I did this, it became clear what it should return: if there isn't a pdf, raise an exception. But wait, there's more.
Keep the functions clearly defined. Since it's not apparent what somefunc is supposed to do, and it's not (apparently) obvious how it relates to getting the latest pdf, I would suggest you move it out. This makes the code much more readable.
for folder in folders:
    try:
        latest = getLatestPdfFromFolder(folder)
        results = somefunc(latest)
    except IOError:
        pass
Hope this helps!
I usually prefer to handle exceptions internally (i.e. try/except inside the called function, possibly returning a None) because python is dynamically typed. In general, I consider it a judgment call one way or the other, but in a dynamically typed language, there are small factors that tip the scales in favor of not passing the exception to the caller:
Anyone calling your function is not notified of the exceptions that can be thrown. It becomes a bit of an art form to know what kind of exception you are hunting for (and generic except blocks ought to be avoided).
if val is None is a little easier than except ComplicatedCustomExceptionThatHadToBeImportedFromSomeNameSpace. Seriously, I hate having to remember to type from django.core.exceptions import ObjectDoesNotExist at the top of all my django files just to handle a really common use case. In a statically typed world, let the editor do it for you.
Honestly, though, it's always a judgment call, and the situation you're describing, where the called function receives an error it can't help, is an excellent reason to re-raise an exception that is meaningful. You have exactly the right idea, but unless your exception is going to provide more meaningful information in a stack trace than
AttributeError: 'NoneType' object has no attribute 'foo'
which, nine times out of ten, is what the caller will see if you return an unhandled None, don't bother.
(All this kind of makes me wish that Python exceptions had cause attributes by default, as in Java, which lets you pass exceptions into new exceptions so that you can rethrow all you want and never lose the original source of the problem. Python 3 did add exactly this: raise NewError(...) from original sets __cause__, as sketched below.)
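A minimal sketch of that chaining (function and message are illustrative):

def latest(files):
    try:
        return files[-1]
    except IndexError as exc:
        # "from exc" sets __cause__, so the traceback shows both
        # exceptions and the original source is never lost
        raise FileNotFoundError("folder is empty") from exc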
With Python 3.5's typing:
An example function returning None would be annotated:

from typing import Union

def latestpdf(folder: str) -> Union[str, None]: ...

and one raising an exception instead:

def latestpdf(folder: str) -> str: ...

Option 2 seems more readable and Pythonic.
(Plus the option to add a message to the exception, as stated earlier.)
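For what it's worth, Union[str, None] also has the more idiomatic spelling Optional[str], and from Python 3.10 the union syntax str | None works too; a quick sketch:

from typing import Optional, Union

def latestpdf_a(folder: str) -> Union[str, None]: ...  # original spelling
def latestpdf_b(folder: str) -> Optional[str]: ...     # identical meaning
# Python 3.10+: def latestpdf_c(folder: str) -> str | None: ...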
In general, I'd say an exception should be thrown if something catastrophic has occurred that cannot be recovered from (e.g., your function deals with some internet resource that cannot be connected to), and you should return None if your function should really return something but nothing would be appropriate to return (e.g., None if your function tries to match a substring in a string).