How does one indicate invalid arguments in Python? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
python: Should I use ValueError or create my own subclass to handle invalid strings?
Reading Built-in Exceptions I read:
"All user-defined exceptions should also be derived from this class" with regard to Exception.
I also see a ValueError which says:
Raised when a built-in operation or function receives an argument that has the right type but an inappropriate value, and the situation is not described by a more precise exception such as IndexError.
If I want to raise an exception for invalid arguments (the equivalent of Ruby's ArgumentError), what should I do? Should I raise ValueError directly, or preferably subclass ValueError with my own intention-revealing name?
In my case, I accept a key argument, but I want to restrict the set of characters in the key, such that only /\A[\w.]+\Z/ (Perl/Ruby regular expression) is accepted.

I think the general idea is this: ValueError should almost always denote some sort of client error (where 'client' means the programmer using your interface). There are two high-level types of exceptions in Python:
uncommon cases for otherwise normally functioning code; the client is not to blame
usage errors where some interface is used incorrectly, or where, through a series of interface calls, the system has reached an inconsistent state; time to blame the client
In my opinion, for the first case, it makes sense to create exception class hierarchies to allow client code fine-grained control over what to do in uncommon cases.
In the second case, and ValueError is an example of this, you are telling the client that they did something wrong. Fine-grained exception hierarchies are not as important here, because client code probably should be fixed to do the right thing (e.g., pass the right parameter types in the first place).
TL;DR: Just use ValueError, but include a helpful message (e.g., raise ValueError("I'm afraid I can't let you do that, Dave. -HAL 9000")). Don't subclass unless you genuinely expect someone to want to catch SubClassError but not other ValueErrors.
With that said, as you've mentioned, the Python built-in library does subclass ValueError in some cases (e.g. UnicodeError).
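As a minimal sketch of this advice applied to the key check from the question: the helper name validate_key and the wording of the messages are assumptions made for illustration. re.fullmatch stands in for the \A...\Z anchors of the Perl/Ruby pattern. Note that for str patterns Python's \w also matches non-ASCII word characters unless re.ASCII is passed.

```python
import re

# Hypothetical helper; the name and the messages are illustrative only.
_KEY_PATTERN = re.compile(r"[\w.]+")


def validate_key(key):
    """Return key unchanged if it consists only of word characters and dots."""
    if not isinstance(key, str):
        # Wrong type: TypeError is the conventional built-in for this.
        raise TypeError(f"key must be a str, got {type(key).__name__}")
    if _KEY_PATTERN.fullmatch(key) is None:
        # Right type, inappropriate value: plain ValueError with a clear message.
        raise ValueError(
            f"invalid key {key!r}: only word characters and '.' are allowed"
        )
    return key
```

Callers that care can catch ValueError; callers that do not will get a traceback whose last line says exactly what was wrong with the key.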

ValueError seems to cover your situation quite well.

Related

In what circumstances, if any, is it appropriate to raise a FileExistsError manually?

I'm writing a Python function which takes data from an online source and copies it into a local data dump. If there's a file present already on the intended path for said data dump, I'd like my program to stop abruptly, giving a short message to the user - I'm just building a tiny CLI-type program - explaining why what he or she was about to attempt could destroy valuable data.
Would it be appropriate to raise a FileExists error in the above circumstances? If so, I imagine my code would look something like this:
import os
def make_data_dump():
    if os.path.exists("path/to/dump"):
        raise FileExistsError("Must not overwrite dump at path/to/dump.")
    data = get_data_from_url()
    write_to_path(data, "path/to/dump")
Apologies if this is a silly question, but I couldn't find any guidance on when to raise a FileExistsError manually, only on what to do if one's program raises such an exception unexpectedly - hence my asking if raising said exception manually is ever good practice.

The Python documentation explicitly states that this is allowed:
User code can raise built-in exceptions. This can be used to test an exception handler or to report an error condition “just like” the situation in which the interpreter raises the same exception; but beware that there is nothing to prevent user code from raising an inappropriate error.
However, your code example is wrong for a different reason. The problem with this code is that it uses the LBYL (Look Before You Leap) pattern, which can lead to race conditions. (In the time between checking whether the file exists and writing the file, another process could have created the file, which would then be overwritten.) A better pattern for this type of scenario is EAFP (Easier to Ask for Forgiveness than Permission).
See What is the EAFP principle in Python? for more information and examples.
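One way to sketch the EAFP approach for this case is to let open() do the existence check itself by using exclusive-creation mode "x". The function name write_dump and the temporary path below are assumptions for illustration, not part of the original question.

```python
import os
import tempfile


def write_dump(path, data):
    """Create path and write data to it; fail if the file already exists."""
    # Mode "x" creates the file exclusively, so the existence check and the
    # creation happen in a single step. open() itself raises FileExistsError
    # when the path is already taken, so there is nothing to raise manually.
    with open(path, "x", encoding="utf-8") as f:
        f.write(data)


# Usage sketch in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "dump.txt")
    write_dump(target, "first")
    try:
        write_dump(target, "second")
    except FileExistsError:
        print("Refusing to overwrite the existing dump.")
```

With this shape, the CLI only has to catch FileExistsError at the top level and print a short message for the user.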
Having said that, I think that for most Python code, manually raising a FileExistsError isn't that common, since you would use the standard Python libraries that already throw this error when necessary. However, one valid reason I can think of is when you would write a wrapper for a low-level function (implemented in another language like C) and would want to translate the error to Python.
A hypothetical code example to demonstrate this:
from errno import EEXIST  # standard error code for "file exists"
def make_data_dump():
    data = get_data_from_url()
    # Assuming write_to_path() is a function in a C library, which returns an error code.
    error = write_to_path(data, "path/to/dump")
    if error == EEXIST:
        raise FileExistsError("Must not overwrite dump at path/to/dump.")
Note that some other built-in exceptions are much more commonly raised manually: for example, StopIteration when implementing an iterator, or ValueError when your method gets an argument with the wrong value.
Some exceptions you will rarely use yourself, like a SyntaxError for example.

As per NobbyNobbs's comment above: if a programmer raises standard exceptions in their own code, it's difficult to work out, during error handling, whether a given exception was raised at the application level or the system level. Therefore, it's a practice best avoided.

Finding all exceptions that a function can raise [duplicate]

This question already has answers here:
How can I know which exceptions might be thrown from a method call?
(8 answers)
Closed 6 years ago.
We are working on a medium-sized commercial Python project and have one recurring problem when using functions from the standard library.
The documentation of the standard library often does not list all (or even any) exceptions that a function can throw, so we try all the error cases that we can come up with, have a look through the source of the library and then catch whatever is plausible. But quite often we miss that one random error that can still happen, but that we didn't come up with. For example, we missed that json.loads() can raise a ValueError if any of the built-in constants are spelled the wrong way (e.g. True instead of true).
In other cases, we tried to just catch Exception, because that part of the code is so critical, that it should never break on an Exception, but should rather try again. The problem here is, that it even caught the KeyboardInterrupt.
So, is there any way to find all exceptions that a function can raise, even if the documentation does not say anything about that?
Are there any tools that can determine what exceptions can be raised?

There is no real way to do this other than reading all of the possible code paths that can be taken in that function and looking to see what exceptions can be raised there. I suppose some sort of automated tool could be written to do this, but even that is pretty tricky because, due to Python's dynamic nature, just about any exception could be raised from anywhere. (If I really wanted to, I could always patch a dependency function with a different function that raises something else entirely.)
Monkey patching aside, to actually get it right, you'd need a really good type inferencer (maybe astroid could help?) to infer various TypeError or AttributeError that could be raised from accessing non-existent members or calling functions with the wrong arguments, etc. ValueError is particularly tricky because it can still get raised when you pass something of the correct type.
In other cases, we tried to just catch Exception, because that part of the code is so critical, that it should never break on an Exception, but should rather try again. The problem here is, that it even caught the KeyboardInterrupt.
This feels like a bad idea to me. For one, retrying code should only be done for exceptions that might give you a different result if you retry it (weird connectivity issues, etc.). For your ValueError case, you'll just raise the ValueError again. The best case scenario here is that the ValueError is allowed to propagate out of the exception handler on the second call -- The worst case is that you end up in an infinite loop (or RecursionError) that you don't really get much information to help debug.
Catching Exception should be a last resort (and it shouldn't catch KeyboardInterrupt or SystemExit since those don't inherit from Exception) and should probably only format some sort of error message that somebody can use to track down the issue and fix it.
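A rough sketch of retrying only on errors that might go away: the helper name retry, its parameters, and the choice of ConnectionError and TimeoutError as the "transient" set are assumptions for illustration, not a recommendation for any particular library.

```python
import time


def retry(func, attempts=3, delay=0.0, transient=(ConnectionError, TimeoutError)):
    """Call func(), retrying only when it raises one of the `transient` types."""
    if attempts < 1:
        raise ValueError(f"attempts must be at least 1, got {attempts}")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except transient:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the real failure
            time.sleep(delay)
        # Anything else (ValueError, KeyboardInterrupt, ...) is not caught
        # here and propagates on the first failure.


# KeyboardInterrupt and SystemExit derive from BaseException, not Exception.
print(issubclass(KeyboardInterrupt, Exception))  # False
print(issubclass(SystemExit, Exception))         # False
```

The bounded loop avoids the infinite-retry case, and a ValueError from bad input surfaces immediately instead of being retried.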

Catching Python runtime errors only

I find myself handling exceptions without specifying an exception type when I call code that interacts with system libraries like shutil, http, etc, all of which can throw if the system is in an unexpected state (e.g. a file locked, network unavailable, etc)
try:
    ...  # call something
except:
    print("OK, so it went wrong.")
This bothers me because it also catches SyntaxError and other exceptions based in programmer error, and I've seen recommendations to avoid such open-ended exception handlers.
Is there a convention that all runtime errors derive from some common exception base class that I can use here? Or anything that doesn't involve syntax errors, module import failures, etc.? Even KeyError I would consider a bug, because I tend to use dict.get() if I'm not 100% sure the key will be there.
I'd hate to have to list every single conceivable exception type, especially since I'm calling a lot of supporting code I have no control over.
UPDATE: OK, the answers made me realize I'm asking the wrong question -- what I'm really wondering is if there's a Python convention or explicit recommendation for library writers to use specific base classes for their exceptions, so as to separate them from the more mundane SyntaxError & friends.
Because if there's a convention for library writers, I, as a library consumer, can make general assumptions about what might be thrown, even if specific cases may vary. Not sure if that makes more sense?
UPDATE AGAIN: Sven's answer finally led me to understand that instead of giving up and catching everything at the top level, I can handle and refine exceptions at the lower levels, so the top level only needs to worry about the specific exception type from the level below.
Thanks!

Always make the try block as small as possible.
Only catch the exceptions you want to handle. Look in the documentation of the functions you are dealing with.
This ensures that you think about what exceptions may occur, and you think about what to do if they occur. If something happens you never thought about, chances are your exception handling code won't be able to correctly deal with that case anyway, so it would be better the exception gets propagated.
You said you'd "hate to have to list every single conceivable exception type", but usually it's not that bad. Opening a file? Catch IOError (an alias of OSError since Python 3.3). Dealing with some library code? They often have their own exception hierarchies with a specific top-level exception -- just catch this one if you want to catch any of the library-specific exceptions. Be as specific as possible, otherwise errors will pass unnoticed sooner or later.
As for convention about user-defined exceptions in Python: They usually should be derived from Exception. This is also what most user-defined exceptions in the standard library derive from, so the least you should do is use
except Exception:
instead of a bare except clause, which also catches KeyboardInterrupt and SystemExit. As you have noted yourself, this would still catch a lot of exceptions you don't want to catch.
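The points above (small try blocks, specific exception types, and one top-level exception per layer) might be sketched as follows. ConfigError, load_config and main are hypothetical names invented for this example; only open() and json.loads() are real library calls.

```python
import json


class ConfigError(Exception):
    """Hypothetical top-level exception for this example's own layer."""


def load_config(path):
    # Each try block wraps only the call that can fail in a known way.
    try:
        with open(path, encoding="utf-8") as f:
            text = f.read()
    except OSError as exc:
        raise ConfigError(f"cannot read {path!r}") from exc
    try:
        return json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        raise ConfigError(f"{path!r} does not contain valid JSON") from exc


def main(path):
    # The top level only needs to know about the layer directly below it.
    try:
        load_config(path)
    except ConfigError as exc:
        print(f"error: {exc}")
        return 1
    print("config loaded")
    return 0
```

Anything load_config did not anticipate still propagates as a normal traceback, and "raise ... from exc" keeps the original error attached as __cause__ for debugging.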

Check out this list of built-in exceptions in the Python docs. It sounds like what you mostly want is to catch StandardError (Python 2 only; it was removed in Python 3, where Exception is the closest equivalent), although this also includes KeyError. There are also other base classes, such as ArithmeticError and EnvironmentError (an alias of OSError since Python 3.3), that you may find useful sometimes.
I find third-party libraries do derive their custom exceptions from Exception as documented here. Programmers also tend to raise standard Python exceptions such as TypeError, ValueError, etc. in appropriate situations. Unfortunately, this makes it difficult to consistently catch library errors separately from other errors derived from Exception. I wish Python defined e.g. a UserException base class. Some libraries do declare a base class for their exceptions, but you'd have to know what these are, import the module, and catch them explicitly.
Of course, if you want to catch everything except KeyError and, say, IndexError, along with the script-stopper exceptions, you could do this:
try:
    doitnow()
except (StopIteration, GeneratorExit, KeyboardInterrupt, SystemExit):
    raise  # these stop the script
except (KeyError, IndexError):
    raise  # we don't want to handle these ones
except Exception as e:
    handleError(e)
Admittedly, this becomes a hassle to write each time.

How about RuntimeError: http://docs.python.org/library/exceptions.html#exceptions.RuntimeError
If that isn't what you want (and it may well not be), look at the list of exceptions on that page. If you're confused by how the hierarchy fits together, I suggest you spend ten minutes examining the __bases__ property of the exceptions you're interested in, to see what base classes they share. (Note that __bases__ isn't closed over the whole hierarchy - you may need to examine superclass bases also).
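As a small sketch of that kind of exploration: the __mro__ attribute lists a class together with all of its ancestors, which avoids walking __bases__ by hand. The helper names ancestry and nearest_common_base are made up for this example.

```python
def ancestry(exc_type):
    """Return the names of exc_type and all of its base classes, nearest first."""
    return [cls.__name__ for cls in exc_type.__mro__]


def nearest_common_base(*exc_types):
    """Return the most specific class that all the given types inherit from."""
    first, *rest = exc_types
    for cls in first.__mro__:
        if all(issubclass(other, cls) for other in rest):
            return cls


print(ancestry(KeyError))
# ['KeyError', 'LookupError', 'Exception', 'BaseException', 'object']
print(nearest_common_base(KeyError, IndexError).__name__)           # LookupError
print(nearest_common_base(KeyboardInterrupt, ValueError).__name__)  # BaseException
```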

Is it bad form to raise ArgumentError by hand?

If you want to add an extra check not provided by argparse, such as:
if variable a == b then c should be not None
...is it permissible to raise ArgumentError yourself?
Or, should you raise Exception instead?
Also what is common practice for this kind of situation? Say that you add a piece of code that's almost like a local extension of the library. Should you use the same exception type(s) as those provided by the library you are extending?

There's nothing inherently wrong with raising an ArgumentError. You can use it anytime the arguments you receive are not what you expected them to be, including checking range of numbers.
Also, yes, in general it's alright for you to use the same exceptions provided by a given library if you are writing an extension to that library.
Regarding raising Exceptions, I wouldn't do that. You should always raise a specific exception so you know how to handle it in the code. Catching Exception objects should be done at the highest level in your application, to catch and log all exceptions that you missed.
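For command-line parsing specifically, one commonly used alternative is argparse's documented parser.error() method, which prints the usage line plus your message to stderr and exits with status 2. A minimal sketch, where the option names --a, --b, --c and the function parse_args are placeholders taken from the question's pseudocode:

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--a")
    parser.add_argument("--b")
    parser.add_argument("--c")
    args = parser.parse_args(argv)
    # Extra cross-argument check that argparse does not provide by itself.
    if args.a == args.b and args.c is None:
        # Reported exactly like argparse's own errors: usage message on
        # stderr, then SystemExit with status code 2.
        parser.error("--c is required when --a equals --b")
    return args
```

This keeps the extra check indistinguishable, from the user's point of view, from the validation argparse performs itself.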

How to handle "duck typing" in Python?

I usually want to keep my code as generic as possible. I'm currently writing a simple library and being able to use different types with my library feels extra important this time.
One way to go is to force people to subclass an "interface" class. To me, this feels more like Java than Python and using issubclass in each method doesn't sound very tempting either.
My preferred way is to use the object in good faith, but this will raise some AttributeErrors. I could wrap each dangerous call in a try/except block. This, too, seems kind of cumbersome:
def foo(obj):
    ...
    # it should be able to sleep
    try:
        obj.sleep()
    except AttributeError:
        # handle error
        pass
    ...
    # it should be able to wag its tail
    try:
        obj.wag_tail()
    except AttributeError:
        # handle this error as well
        pass
Should I just skip the error handling and expect people to only use objects with the required methods? If I do something stupid like [x**2 for x in 1234] I actually get a TypeError and not an AttributeError (ints are not iterable) so there must be some type checking going on somewhere -- what if I want to do the same?
This question will be kind of open ended, but what is the best way to handle the above problem in a clean way? Are there any established best practices? How is the iterable "type checking" above, for example, implemented?
Edit
While AttributeErrors are fine, the TypeErrors raised by native functions usually give more information about how to solve the errors. Take this for example:
>>> ['a', 1].sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() < str()
I'd like my library to be as helpful as possible.

I'm not a Python pro, but I believe that unless you can try an alternative when the parameter doesn't implement a given method, you shouldn't prevent exceptions from being thrown. Let the caller handle these exceptions. Otherwise, you would be hiding problems from the developers.
As I have read in Clean Code, if you want to search for an item in a collection, don't test your parameters with issubclass (of a list) but prefer to call getattr(l, "__contains__"). This will give someone who is using your code a chance to pass a parameter that isn't a list but which has a __contains__ method defined and things should work equally well.
So, I think that you should code in an abstract, generic way, imposing as few restrictions as you can. For that, you'll have to make the fewest assumptions possible. However, when you face something that you can't handle, raise an exception and let the programmer know what mistake he made!

If your code requires a particular interface, and the user passes an object without that interface, then nine times out of ten, it's inappropriate to catch the exception. Most of the time, an AttributeError is not only reasonable but expected when it comes to interface mismatches.
Occasionally, it may be appropriate to catch an AttributeError for one of two reasons. Either you want some aspect of the interface to be optional, or you want to throw a more specific exception, perhaps a package-specific exception subclass. Certainly you should never prevent an exception from being thrown if you haven't honestly handled the error and any aftermath.
So it seems to me that the answer to this question must be problem- and domain- specific. It's fundamentally a question of whether using a Cow object instead of a Duck object ought to work. If so, and you handle any necessary interface fudging, then that's fine. On the other hand, there's no reason to explicitly check whether someone has passed you a Frog object, unless that will cause a disastrous failure (i.e. something much worse than a stack trace).
That said, it's always a good idea to document your interface -- that's what docstrings (among other things) are for. When you think about it, it's much more efficient to throw a general error for most cases and tell users the right way to do it in the docstring, than to try to foresee every possible error a user might make and create a custom error message.
A final caveat -- it's possible that you're thinking about UI here -- I think that's another story. It's good to check the input that an end user gives you to make sure it isn't malicious or horribly malformed, and provide useful feedback instead of a stack trace. But for libraries or things like that, you really have to trust the programmer using your code to use it intelligently and respectfully, and to understand the errors that Python generates.

If you just want the unimplemented methods to do nothing, you can try something like this, rather than the multi-line try/except construction:
getattr(obj, "sleep", lambda: None)()
However, this isn't necessarily obvious as a function call, so maybe:
hasattr(obj, "sleep") and obj.sleep()
or if you want to be a little more sure before calling something that it can in fact be called:
hasattr(obj, "sleep") and callable(obj.sleep) and obj.sleep()
This "look-before-you-leap" pattern is generally not the preferred way to do it in Python, but it is perfectly readable and fits on a single line.
Another option of course is to abstract the try/except into a separate function.
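That last option could look roughly like this. The helper name call_if_present is an assumption for this sketch. Unlike wrapping the call itself in try/except AttributeError, it does not swallow an AttributeError raised inside the method.

```python
def call_if_present(obj, name, *args, **kwargs):
    """Call obj.<name>(...) if it exists and is callable; otherwise return None."""
    method = getattr(obj, name, None)
    if not callable(method):
        return None
    # Errors raised inside the method, including AttributeError, propagate.
    return method(*args, **kwargs)


class Dog:
    def sleep(self):
        return "zzz"


print(call_if_present(Dog(), "sleep"))     # zzz
print(call_if_present(Dog(), "wag_tail"))  # None
```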

Good question, and quite open-ended. I believe typical Python style is not to check, either with isinstance or by catching individual exceptions. Certainly, using isinstance is quite bad style, as it defeats the whole point of duck typing (though using isinstance on primitives can be OK; in Python 2, be sure to check for both int and long for integer inputs, and check for basestring, the base class of str and unicode, for strings). If you do check, you should raise a TypeError.
Not checking is generally OK, as it typically raises either a TypeError or AttributeError anyway, which is what you want. (Though it can delay those errors making client code hard to debug).
The reason you see TypeErrors is that primitive code raises it, effectively because it does an isinstance. The for loop is hard-coded to raise a TypeError if something is not iterable.
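To get the same kind of early, descriptive TypeError in your own library, one option is to ask for an iterator up front with the built-in iter(), which raises TypeError for objects that cannot be iterated. The function squares and its message below are illustrative assumptions, not how the interpreter implements the check internally.

```python
def squares(values):
    """Return the squares of all items in an iterable of numbers."""
    try:
        iterator = iter(values)  # raises TypeError for non-iterables like 1234
    except TypeError:
        # Re-raise with a message that tells the caller how to fix the call.
        raise TypeError(
            f"squares() expects an iterable of numbers, got {type(values).__name__}"
        ) from None
    return [x ** 2 for x in iterator]


print(squares([1, 2, 3]))  # [1, 4, 9]
```

Calling squares(1234) fails immediately with the custom message, rather than later at some less obvious point in the code.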

First of all, the code in your question is not ideal:
try:
    obj.wag_tail()
except AttributeError:
    ...
You don't know whether the AttributeError is from the lookup of wag_tail or whether it happened inside the function. What you are trying to do is:
try:
    f = getattr(obj, 'wag_tail')
except AttributeError:
    ...
else:
    f()
Edit: kindall rightly points out that if you are going to check this, you should also check that f is callable.
In general, this is not Pythonic. Just call and let the exception filter down, and the stack trace is informative enough to fix the problem. I think you should ask yourself whether your rethrown exceptions are useful enough to justify all of this error-checking code.
The case of sorting a list is a great example.
List sorting is very common,
passing unorderable types happens for a significant proportion of those, and
throwing AttributeError in that case is very confusing.
If those three criteria apply to your problem (especially the third), I agree with building a pretty exception rethrower.
You have to balance with the fact that throwing these pretty errors is going to make your code harder to read, which statistically means more bugs in your code. It's a question of balancing the pros and the cons.
If you ever need to check for behaviours (like __real__ and __contains__), don't forget to use the Python abstract base classes found in collections.abc (plain collections in Python 2), io, and numbers.
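A brief sketch of such a behaviour check with collections.abc: the classes Pond and the function find are invented for this example. Container recognises any class that defines __contains__, so no subclassing is required.

```python
from collections.abc import Container, Iterable


class Pond:
    """Supports `in` without inheriting from any collection type."""

    def __init__(self, *animals):
        self._animals = set(animals)

    def __contains__(self, animal):
        return animal in self._animals


def find(item, collection):
    """Return True if item is in collection, with a clear error for misuse."""
    if not isinstance(collection, Container):
        raise TypeError(
            f"find() needs an object supporting 'in', got {type(collection).__name__}"
        )
    return item in collection


pond = Pond("duck", "frog")
print(isinstance(pond, Container))  # True
print(isinstance(pond, Iterable))   # False
print(find("duck", pond))           # True
```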
