Special case of exception handling in Python - python

I am trying to parse some text. I have a binary version of the parser, and in my program's body I use call to run it, passing my sentences one by one. However, for some reason the parser is sometimes unable to parse a sentence and generates an error. It is a bit difficult to put into words: it just prints some error messages but does not crash, and it ends normally. My understanding is that the parser does some sort of exception handling itself, which is why it doesn't crash. However, I want to keep track of these problematic sentences. In other words, if the parser couldn't parse a sentence, I want to write that sentence to a file. I used normal exception handling as I do in all of my programs, but it cannot catch the exception because it has already been taken care of inside the parser program. Does anyone know how I should catch this kind of external exception?
thanks

Check the return code of call. Is it any different when you get the exception compared to normal/correct execution? If you want to get an exception you could use check_call.
Another solution could be the usage of check_output to call the parser-program and examine the output.
Documentation for all functions: Python subprocess module
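A minimal sketch of that idea, assuming the parser reads one sentence on standard input and reports problems on standard error (the question does not say how sentences are passed, so both are assumptions). FAKE_PARSER is a hypothetical stand-in for the real binary: it prints an error but still exits with status 0, which is the behaviour described in the question.

```python
import subprocess
import sys

def run_parser(cmd, sentence):
    """Run the external parser on one sentence and return (ok, stdout, stderr)."""
    result = subprocess.run(cmd, input=sentence, capture_output=True, text=True)
    # Treat a non-zero exit status OR anything written to stderr as a failure,
    # because the parser in the question exits normally even when it fails.
    ok = result.returncode == 0 and not result.stderr.strip()
    return ok, result.stdout, result.stderr

def collect_failures(cmd, sentences):
    """Return the sentences the parser could not handle."""
    return [s for s in sentences if not run_parser(cmd, s)[0]]

# Hypothetical stand-in for the real parser binary: it "fails" on any
# sentence containing the word "bad" but still exits with status 0.
FAKE_PARSER = [
    sys.executable, "-c",
    "import sys; s = sys.stdin.read(); "
    "sys.stderr.write('parse error') if 'bad' in s else print('(S ' + s.strip() + ')')",
]
```

The list returned by collect_failures() can then be written to a file. If the real parser signals failure differently (for example, with a marker on standard output), the ok test would need to be adapted.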

Related

Why catch exceptions when we can simply prevent them from happening in the first place

I've always had trouble understanding the real purpose of catching exceptions in any programming language.
Generally, when you can catch an exception, you can also prevent the exception from ever occurring. Here's an example.
What's the point of doing this:
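import argparse
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("filename", type=str, help="the filename")
    args = parser.parse_args()
    try:
        # "r+" opens an existing file for reading and writing and raises
        # FileNotFoundError if it is missing (the default read-only mode would reject write())
        with open(args.filename, "r+") as fs:
            fs.write("test")
    except FileNotFoundError:
        print("that file was not found")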
If I can do this:
import os
import argparse
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("filename", type=str, help="the filename")
    args = parser.parse_args()
    if os.path.exists(args.filename):
        # "r+" opens the existing file for reading and writing
        with open(args.filename, "r+") as fs:
            fs.write("test")
    else:
        print("that file was not found")
What is the added value of catching anything as an exception, when it is often more feasible to prevent the exception from ever happening in the first place?
Exceptions signal exceptional situations: they are thrown when something undesired or erroneous has happened. It is true that in many cases the exception pattern is unnecessary, because you can check whether the operation is possible before giving it a chance to fail. But sometimes that's not possible.
For instance, let's consider the example that you are running a database query. You cannot know whether that will succeed without actually running it. So, there are cases when you cannot prevent exceptions by doing validations.
Also, there are cases when there could be many exceptions upon many levels. Your code would become very difficult to read and work with if you validate everything. Consider the case when you have a method that receives an object and calls 100 other methods, passing the same object.
When do you validate whether that object was properly initialized? Do you validate it at the method that calls the other methods only? But then, later somebody else might call one of the 100 methods from another place without validating it. So, if we are to only validate, then we will end up writing the same validating code in 101 methods, instead of catching the exception at a single place, regardless of which method throws it.
You will also need to use third-party libraries. Are they perfect? Probably not quite and not all of them. How do you validate everything in the code written by someone else?
Summary:
sometimes there is no way to know whether an operation succeeds before running it
third-party libraries will come as they are, possibly with errors, you cannot apply validation for them, unless you get into their code and refactor the whole thing (you could as well write the whole library instead)
doing validation-only may lead to code repetition, unreadable code, code difficult to maintain and very long refactoring when the duplicated validation needs to be changed
you cannot think about all the possible errors, you need at least a layer that catches the problems you didn't foresee
when you upgrade some versions, like Python version for example, it is quite possible that something will no longer work
Validating operations is often a good idea, but it cannot substitute for exception handling. The two go hand in hand. You validate what makes sense to validate, which is often a subjective decision, but if an exception still occurs, you need to handle it properly.
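The database case mentioned earlier can be sketched roughly as follows. This uses the standard sqlite3 module with an in-memory database; the table names and the count_rows() helper are made up for illustration, and the point is only that the failure becomes known when the query actually runs.

```python
import sqlite3

def count_rows(conn, table):
    """Return the number of rows in `table`, or None if the query fails."""
    try:
        # The table name is interpolated only to keep the sketch short;
        # never build SQL this way from untrusted input.
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    except sqlite3.OperationalError as exc:
        # e.g. "no such table: ...", discovered only by running the query
        print(f"query failed: {exc}")
        return None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('ada')")
```

Calling count_rows(conn, "users") runs normally, while count_rows(conn, "missing") ends up in the except branch.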

In what circumstances, if any, is it appropriate to raise a FileExistsError manually?

I'm writing a Python function which takes data from an online source and copies it into a local data dump. If there's a file present already on the intended path for said data dump, I'd like my program to stop abruptly, giving a short message to the user - I'm just building a tiny CLI-type program - explaining why what he or she was about to attempt could destroy valuable data.
Would it be appropriate to raise a FileExists error in the above circumstances? If so, I imagine my code would look something like this:
import os

def make_data_dump():
    if os.path.exists("path/to/dump"):
        raise FileExistsError("Must not overwrite dump at path/to/dump.")
    data = get_data_from_url()
    write_to_path(data, "path/to/dump")
Apologies if this is a silly question, but I couldn't find any guidance on when to raise a FileExistsError manually, only on what to do if one's program raises such an exception unexpectedly - hence my asking if raising said exception manually is ever good practice.
The Python documentation explicitly states that this is allowed:
User code can raise built-in exceptions. This can be used to test an exception handler or to report an error condition “just like” the situation in which the interpreter raises the same exception; but beware that there is nothing to prevent user code from raising an inappropriate error.
However, your code example is wrong for a different reason. The problem is that it uses the LBYL (Look Before You Leap) pattern, which can lead to race conditions: in the time between checking whether the file exists and writing the file, another process could have created the file, which would then be overwritten. A better pattern for this type of scenario is EAFP (Easier to Ask for Forgiveness than Permission).
See What is the EAFP principle in Python? for more information and examples.
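As a rough EAFP sketch for this case: opening the file in exclusive-creation mode ("x") makes Python itself raise FileExistsError if the path is already taken, so there is no separate check that could race. The write_dump() helper and the way the data is passed in are assumptions made for illustration; the question's own get_data_from_url() is not reproduced here.

```python
import os
import tempfile

def write_dump(path, data):
    # Mode "x" creates the file and fails with FileExistsError if it already
    # exists; checking and creating happen in a single step.
    with open(path, "x") as fh:
        fh.write(data)

def make_data_dump(path, data):
    """Write the dump, refusing to overwrite an existing file."""
    try:
        write_dump(path, data)
    except FileExistsError:
        print(f"Must not overwrite dump at {path}.")
        return False
    return True
```

One consequence of this ordering is that the data has to be available before the file is opened; whether to fetch it first or to create the file first and fill it afterwards is a design choice that depends on how expensive the download is.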
Having said that, I think that for most Python code, manually raising a FileExistsError isn't that common, since you would use the standard Python libraries that already throw this error when necessary. However, one valid reason I can think of is when you would write a wrapper for a low-level function (implemented in another language like C) and would want to translate the error to Python.
A hypothetical code example to demonstrate this:
from errno import EEXIST

def make_data_dump():
    data = get_data_from_url()
    # Assuming write_to_path() is a function in a C library, which returns an error code.
    error = write_to_path(data, "path/to/dump")
    if error == EEXIST:
        raise FileExistsError("Must not overwrite dump at path/to/dump.")
Note that some other built-in exceptions are much more commonly raised manually, for example StopIteration when implementing an iterator, or ValueError when your method gets an argument with the wrong value.
Some exceptions you will rarely use yourself, like a SyntaxError for example.
As per NobbyNobbs's comment above: if programmers raise standard exceptions in their own code, it becomes difficult to work out, during error handling, whether a given exception was raised at the application level or the system level. Therefore, it's a practice best avoided.

Adding custom information to Python stack trace to be printed out

I am writing an interpreter for my simple programming language in Python. When there is an error during interpretation, a Python exception is thrown. The problem is that the stack trace is really ugly, because it contains many recursive calls made while I navigate the program's structure and interpret it. And from the stack trace you cannot really see where in my program's structure the code was when it failed.
I know that I could capture all exceptions myself and rethrow a new exception adding some information about where an exception happened, but because it is recursive, I would have to do such rethrowing again and again.
I wonder if there isn't an easier way. For example, that next to path, line numbers, and function name in interpreter's code, I could print out also information where in the interpreted program's code that function is at that moment in the stack.
You can write a sys.excepthook that prints whatever traceback you want. I regularly use such a hook that uses the traceback module to print a normal traceback but with added lines that give the values of interesting local variables (via f_locals introspection).
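A minimal sketch of such a hook, assuming the interpreter's recursive functions keep the current position in a local variable called node (a hypothetical name; use whatever your interpreter actually holds). It walks the traceback with traceback.walk_tb() and prints the chosen locals under each frame.

```python
import sys
import traceback

def format_with_locals(exc_type, exc, tb, names=("node",)):
    """Format a traceback, adding selected local variables under each frame."""
    lines = []
    for frame, lineno in traceback.walk_tb(tb):
        code = frame.f_code
        lines.append(f'  File "{code.co_filename}", line {lineno}, in {code.co_name}')
        for name in names:
            if name in frame.f_locals:
                lines.append(f"    {name} = {frame.f_locals[name]!r}")
    # Finish with the usual "ExceptionType: message" line(s).
    lines.extend(
        line.rstrip("\n") for line in traceback.format_exception_only(exc_type, exc)
    )
    return "\n".join(lines)

def hook(exc_type, exc, tb):
    print("Traceback (most recent call last):", file=sys.stderr)
    print(format_with_locals(exc_type, exc, tb), file=sys.stderr)

# Install the hook for uncaught exceptions.
sys.excepthook = hook
```

Because the hook only runs for uncaught exceptions, the interpreter code itself needs no try/except blocks for this; each recursive frame simply contributes its own node line to the output.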

Finding all exceptions that a function can raise [duplicate]

We are working on a medium-sized commercial Python project and have one recurring problem when using functions from the standard library.
The documentation of the standard library often does not list all (or even any) of the exceptions that a function can throw, so we try all the error cases that we can come up with, look through the source of the library, and then catch whatever is plausible. But quite often we miss that one random error that can still happen but that we didn't think of. For example, we missed that json.loads() can raise a ValueError if any of the built-in constants is spelled the wrong way (e.g. True instead of true).
In other cases, we tried to just catch Exception, because that part of the code is so critical, that it should never break on an Exception, but should rather try again. The problem here is, that it even caught the KeyboardInterrupt.
So, is there any way to find all exceptions that a function can raise, even if the documentation does not say anything about that?
Are there any tools that can determine what exceptions can be raised?
There is no real way to do this other than reading all of the possible code paths that can be taken in that function and looking to see what exceptions can be raised there. I suppose some sort of automated tool could be written to do this, but even that is pretty tricky because, due to Python's dynamic nature, just about any exception could be raised from anywhere (if I really wanted to, I could always patch a dependency function with a different function that raises something else entirely).
Monkey patching aside, to actually get it right you'd need a really good type inferencer (maybe astroid could help?) to infer the various TypeError or AttributeError exceptions that could be raised from accessing non-existent members, calling functions with the wrong arguments, etc. ValueError is particularly tricky because it can still be raised when you pass something of the correct type.
You wrote: "In other cases, we tried to just catch Exception, because that part of the code is so critical, that it should never break on an Exception, but should rather try again. The problem here is, that it even caught the KeyboardInterrupt."
This feels like a bad idea to me. For one, retrying code should only be done for exceptions that might give you a different result if you retry it (weird connectivity issues, etc.). For your ValueError case, you'll just raise the ValueError again. The best case scenario here is that the ValueError is allowed to propagate out of the exception handler on the second call -- The worst case is that you end up in an infinite loop (or RecursionError) that you don't really get much information to help debug.
Catching Exception should be a last resort (and it shouldn't catch KeyboardInterrupt or SystemExit since those don't inherit from Exception) and should probably only format some sort of error message that somebody can use to track down the issue and fix it.
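A small sketch of retrying only on errors that might plausibly go away. The retry() helper and its defaults are assumptions for illustration; which exception types count as transient depends on the code being called.

```python
import time

def retry(func, retriable=(ConnectionError, TimeoutError), attempts=3, delay=0.0):
    """Call func(), retrying only on exception types that may be transient."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retriable:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)
        # Anything else (ValueError, KeyboardInterrupt, ...) is not caught
        # here and propagates immediately.
```

A ValueError raised by func() therefore surfaces on the first call instead of being retried, and KeyboardInterrupt passes through untouched because it derives from BaseException rather than Exception.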

Is it bad form to raise ArgumentError by hand?

If you want to add an extra check not provided by argparse, such as:
if variable a == b then c should be not None
...is it permissible to raise ArgumentError yourself?
Or, should you raise Exception instead?
Also what is common practice for this kind of situation? Say that you add a piece of code that's almost like a local extension of the library. Should you use the same exception type(s) as those provided by the library you are extending?
There's nothing inherently wrong with raising an ArgumentError. You can use it anytime the arguments you receive are not what you expected them to be, including checking range of numbers.
Also, yes, in general it's alright for you to use the same exceptions provided by a given library if you are writing an extension to that library.
Regarding raising Exception itself, I wouldn't do that. You should always raise a specific exception so you know how to handle it in the code. Catching Exception objects should be done at the highest level in your application, to catch and log all exceptions that you missed.
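As an illustration of the check from the question, here is a minimal sketch that raises argparse.ArgumentError after parsing. The option names --a, --b and --c are placeholders for the question's a, b and c. argparse.ArgumentError takes the offending argument (or None when the error is not tied to a single argument) and a message.

```python
import argparse

def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--a", type=int)
    parser.add_argument("--b", type=int)
    parser.add_argument("--c")
    args = parser.parse_args(argv)
    # Extra check that argparse does not provide on its own:
    # if a == b, then c must be given.
    if args.a == args.b and args.c is None:
        raise argparse.ArgumentError(None, "--c is required when --a equals --b")
    return args
```

The caller can then catch argparse.ArgumentError specifically and report it in the same way as any other argument problem.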
