Python as Hive UDF - Clean Exit on Exception

Python as Hive UDF - Clean Exit on Exception - python

I'm trying to do a clean exit from the program whenever my python as Hive UDF fails with an exception.
Here is an example:
SELECT TRANSFORM (id, name) USING
'D:\Python27\python.exe streaming.py' AS (id string,
name string, count integer) FROM hivesampletable;
#streaming.py
import sys
from datetime import datetime
try:
for line in sys.stdin.readlines():
fields = line.strip().split('\t')
fields.append(len(name))
print "\t".join(fields)
except:
#I want program to break/clean exit with out printing (writing back) to table
Ideas appreciated

The pass statement will let you ignore the error and return control flow to your program.
except Exception, e:
pass
#To see your exception
print str(e)
# Alternately, you could get details of your exception using
print sys.exc_info()

If you essentially want to "swallow" the exception, then I would recommend in the except block you explicitly call sys.exit(0), which will both exit the program and indicate (from a shell level) that the program is "OK".
e.g. You will end up with a a truly clean exit that even a shell, e.g. bash, will see as "success".
Note: If you want to exit without printing anything but allow a shell to know something went awry, pass a non-zero error code to exit.
Response to comment from OP:
Hmm, I wouldn't expect that, since you're explicitly swallowing the exception...
The next step would likely be to print out what the exception is, as was suggested in the other answer, and go from there, depending on what that exception is.
Another thing that may be contributing is I don't think your Python script matches your TRANSFORM statement, so that may be contributing to the issue.
Also, you are referencing name without that being initialized (that might be the exception here -- NameError: name 'name' is not defined).

Related

How to make python not close it self after error, python exception

I have made an program in python that i converted into .exe using auto-py-to-exe and im wondering how to stop .exe stop closing it self after an error (python exception) the .exe closes it self in the speed of light and you cant read the error.
Does someone knows how to make it not close it self?
using input doesnt work if the exception happens in a pip library
Thanks

You can use a try-except based system (that you have to build suitably to catch every exception of your code) and to print the exception you can use the module traceback like so:
try:
## code with error ##
except Exception:
print("Exception in user code:")
print("-"*60)
traceback.print_exc(file=sys.stdout)
print("-"*60)
or you could use a simpler form like:
try:
## code with error ##
except Exception as e:
print(e, file='crash log.txt')
that only prints the error class (like file not found).
I should also point out that the finally keyword exists with the purpose of executing code either if an exception arose or not:
try:
## code with error ##
except: #optional
## code to execute in case of exception ##
finally:
## code executed either way ##
Something you could do on top of that is logging everything your program does with
print(status_of_program, file=open('log.txt','a'))
this is useful to see exactly at what point the program has crashed and in general to see the program in action step by step.
But a thing you should do is properly test the program while in .py form and if it works you could possibly assume the error comes from the actual exportation method and consult the documentation or try exporting simpler programs to catch the difference (and so the error).
i'd suggest to consult:
https://docs.python.org/3/library/exceptions.html
to learn the error types and the try-except construct.
If input() doesn't work try using sys.stdout().

I gather that this is a console app. If it's a GUI app, you can use a similar approach but the details will be different.
The approach I'd use is to set sys.excepthook to be your own function that prints the exception, then waits for input. You can call the existing exception hook to actually do the printing.
import sys, traceback as tb
oldhook = sys.excepthook
def waitexcepthook(type, exception, traceback):
oldhook(type, exception, traceback)
input()
sys.excepthook = waitexcepthook

Python - do something and continue when process finished with exit code 1

I have created a text file content reader - I have created several functions but I can't handle all cases - if some case is not defined, the program returns the message "Process finished with exit code 1" - Is it possible to get around this? If there is a problem - the Process finished with exit code 1 message - then I would like to save the name of the file and continue reading the rest of the files - at the moment it works so that if there is an error the program just stops working - but I want it to save the name of the file with the error and continue working. How to add such a condition to existing functions?
Please give me some advice
Example of such an error:
import pandas as pd
list=['a',2,3,4]
df = pd.DataFrame(list)
df[0] = df[0].astype(int)
print (df)
Can't convert a character into an int so an error occurs.
This is only an example of a possible error, but the name of the file where the error occurs should be saved and the loop should continue.

As you haven't included any code, I assume that the error was created by a block of code. In that case, you can enclose it within a try block and add an 'except` block to save the file's name in a file.
for example,
try :
# to read your files
except: # an exception occurred
# save the file name in a log
continue
try to share your code for better help.
EDIT 1 : Added an explanation for using plain except.
As documented, SystemExit does not inherit from Exception. You would have to use except BaseException.
However, this is for a reason:
This exception inherits from BaseException instead of StandardError or
Exception so that it is not accidentally caught by code that catches
Exception.
It is unusual to want to handle "real" exceptions in the same way you want to handle SystemExit. You might be better off catching SystemExit explicitly with except SystemExit.

why do we use 'pass' in error handling of python?

It is conventional to use pass statement in python like the following piece of code.
try:
os.makedirs(dir)
except OSError:
pass
So, 'pass' bascially does not do anything here. In this case, why would we still put a few codes like this in the program? I am confused. Many thanks for your time and attention.

It's for the parser. If you wrote this:
try:
# Code
except Error:
And then put nothing in the except spot, the parser would signal an error because it would incorrectly identify the next indentation level. Imagine this code:
def f(x):
try:
# Something
except Error:
def g(x):
# More code
The parser was expecting a statement with a greater indentation than the except statement but got a new top-level definition. pass is simply filler to satisfy the parser.

This is in case you want the code to continue right after the lines in the try block. If you won't catch it - it either skips execution until it is caught elsewhere - or fails the program altogether.
Suppose you're creating a program that attempts to print to a printer, but also prints to the standard output - you may not want it to file if the printer is not available:
try:
print_to_printer('hello world')
except NoPrinterError:
pass # no printer - that's fine
print("hello world")
If you would not use a try-catch an error would stop execution until the exception is caught (or would fail the program) and nothing would be printed to standard output.

The pass is used to tell the program what to do when it catches an error. In this particular case you're pretty much ignoring it. So you're running your script and if you experience an error keep going without worrying as to why and how.
That particular case is when you are definite on what is expected. There are other cases where you can break and end the program, or even assign the error to a variable so you can debug your program by using except Error as e.
try:
os.makedirs(dir)
except OSError:
break
or:
try:
os.makedirs(dir)
except OSError as e:
print(str(e))

try:
# Do something
except:
# again some code
# few more code
There are two uses of pass. First, and most important use :- if exception arises for the code under try, the execution will jump to except block. And if you have nothing inside the except block, it will throw IndentationError at the first place. So, to avoid this error, even if you have nothing to do when exception arises, you need to put pass inside except block.
The second use, if you have some more code pieces after the try-except block (e.g. again some code and few more code), and you don't put pass inside except, then that code piece will not be executed (actually the whole code will not be executed since compiler will throw IndentationError). So, in order to gracefully handle the scenario and tell the interpreter to execute the lines after except block, we need to put pass inside the except block, even though we don't want to do anything in case of exception.
So, here pass as indicated from name, handles the except block and then transfers the execution to the next lines below the except block.

Get Exit Status of a Nested Function?

I have a function called within a different function. In the nested function, various errors (e.g. improper arguments, missing parameters, etc.) should result in exit status 1. Something like:
if not os.path.isdir(filepath):
print('Error: could not find source directory...')
sys.exit(1)
Is this the correct way to use exit statuses within python? Should I have, instead,
return sys.exit(1)
??? Importantly, how would I reference the exit status of this nested function in the other function once the nested function had finished?

sys.exit() raises a SystemExit exception. Normally, you should not use it unless you really mean to exit your program.
You could catch this exception:
try:
function_that_uses_sys.exit()
except SystemExit as exc:
print exc.code
The .code attribute of the SystemExit exception is set to the proposed exit code.
However, you should really use a more specific exception, or create a custom exception for the job. A ValueError might be appropriate here, for example:
if not os.path.isdir(filepath):
raise ValueError('Error: could not find source directory {!r}'.format(filepath))
then catch that exception:
try:
function_that_may_raise_valueerror()
except ValueError as exc:
print "Oops, something went wrong: {}".format(exc.message)

By using sys.exit, you typically signal that you want the entire program to end. If you want to handle the error in a calling function, you should probably have the inner function raise a more specific exception instead. (You could catch the exception raised by SystemExit, but it would be a rather awkward way to pass error information out.)

I guess that the right thing to do is this:
if not os.path.isdir(filepath):
raise ValueError('the given filepath is not a directory')
However, the code as it stands still could be improved. One point is that a path to a file should never be a directory, so that is not an exceptional state. Maybe what you want is to just name it path without adding an unintended connotations.
Further, and that has actual functional implications, you are still not guaranteed to be able to access a directory there even if isdir() returns true! The reason is that something could have switched the thing under your feet, typically a malicious attacker, or, more simple, you could simply not have the rights to access it. If you care, you should rather just open the directory and handle the according errors instead of trying in advance to determine if something in the future will fail. This is in general a better approach, as the "normal" code doesn't get cluttered by such checks and you also don't pay any albeit small performance penalty except when an error occurs.

Do we use try,except in every single function?

Should we always enclose every function we write with a try...except block?
I ask this because sometimes in one function we raise Exception, and the caller that calls this function doesn't have exception
def caller():
stdout, stderr = callee(....)
def callee():
....
if stderr:
raise StandardError(....)
then our application crash. In this obvious case, I am tempted to enclose callee and caller with try..except.
But I've read so many Python code and they don't do these try..block all the time.
def cmd(cmdl):
try:
pid = Popen(cmdl, stdout=PIPE, stderr=PIPE)
except Exception, e:
raise e
stdout, stderr = pid.communicate()
if pid.returncode != 0:
raise StandardError(stderr)
return (stdout, stderr)
def addandremove(*args,**kwargs):
target = kwargs.get('local', os.getcwd())
f = kwargs.get('file', None)
vcs = kwargs.get('vcs', 'hg')
if vcs is "hg":
try:
stdout, stderr = cmd(['hg', 'addremove', '--similarity 95'])
except StandardError, e:
// do some recovery
except Exception, e:
// do something meaningful
return True
The real thing that bothers me is this:
If there is a 3rd function that calls addandremove() in one of the statements, do we also surround the call with a try..except block? What if this 3rd function has 3 lines, and each function calls itself has a try-except? I am sorry for building this up. But this is the sort of problem I don't get.

Exceptions are, as the name implies, for exceptional circumstances - things that shouldn't really happen
..and because they probably shouldn't happen, for the most part, you can ignore them. This is a good thing.
There are times where you do except an specific exception, for example if I do:
urllib2.urlopen("http://example.com")
In this case, it's reasonable to expect the "cannot contact server" error, so you might do:
try:
urllib2.urlopen("http://example.com")
except urllib2.URLError:
# code to handle the error, maybe retry the server,
# report the error in a helpful way to the user etc
However it would be futile to try and catch every possible error - there's an inenumerable amount of things that could potentially go wrong.. As a strange example, what if a module modifies urllib2 and removes the urlopen attribute - there's no sane reason to expect that NameError, and no sane way you could handle such an error, therefore you just let the exception propagate up
Having your code exit with a traceback is a good thing - it allows you to easily see where the problem originate, and what caused it (based on the exception and it's message), and fix the cause of the problem, or handle the exception in the correct location...
In short, handle exceptions only if you can do something useful with them. If not, trying to handle all the countless possible errors will only make your code buggier and harder to fix
In the example you provide, the try/except blocks do nothing - they just reraise the exception, so it's identical to the much tidier:
def cmd(cmdl):
pid = Popen(cmdl, stdout=PIPE, stderr=PIPE)
stdout, stderr = pid.communicate()
if pid.returncode != 0:
raise StandardError(stderr)
return (stdout, stderr)
# Note: Better to use actual args instead of * and **,
# gives better error handling and docs from help()
def addandremove(fname, local = None, vcs = 'hg'):
if target is None:
target = os.getcwd()
if vcs is "hg":
stdout, stderr = cmd(['hg', 'addremove', '--similarity 95'])
return True
About the only exception-handling related thing I might expect is to handle if the 'hg' command isn't found, the resulting exception isn't particularly descriptive. So for a library, I'd do something like:
class CommandNotFound(Exception): pass
def cmd(cmdl):
try:
pid = Popen(cmdl, stdout=PIPE, stderr=PIPE)
except OSError, e:
if e.errno == 2:
raise CommandNotFound("The command %r could not be found" % cmdl)
else:
# Unexpected error-number in OSError,
# so a bare "raise" statement will reraise the error
raise
stdout, stderr = pid.communicate()
if pid.returncode != 0:
raise StandardError(stderr)
return (stdout, stderr)
This just wraps the potentially confusing "OSError" exception in the clearer "CommandNotFound".
Rereading the question, I suspect you might be misunderstanding something about how Python exceptions work (the "and the caller that calls this function doesn't have exception" bit, so to be hopefully clarify:
The caller function does not need any knowledge of the exceptions that might be raised from the children function. You can just call the cmd() function and hope it works fine.
Say your code is in a mystuff module, and someone else wants to use it, they might do:
import mystuff
mystuff.addandremove("myfile.txt")
Or, maybe they want to give a nice error message and exit if the user doesn't have hg installed:
import mystuff
try:
mystuff.addandremove("myfile.txt")
except mystuff.CommandNotFound:
print "You don't appear to have the 'hg' command installed"
print "You can install it with by... etc..."
myprogram.quit("blahblahblah")

You should use a try catch block so that you can specifically locate the source of the exception. You can put these blocks around anything you want, but unless they produce some sort of useful information, there is no need to add them.

try/except clauses are really only useful if you know how to handle the error that gets raised. Take the following program:
while True:
n=raw_input("Input a number>")
try:
n=float(n)
break
except ValueError:
print ("That wasn't a number!") #Try again.
However, you may have a function like:
def mult_2_numbers(x,y):
return x*y
and the user may try to use it as:
my_new_list=mult_2_numbers([7,3],[8,7])
The user could put this in a try/except block, but then my_new_list wouldn't be defined and would probably just raise an exception later (likely a NameError). In that case, you'd make it harder to debug because the line number/information in the traceback is pointing to a piece of code which isn't the real problem.

There are a couple of programming decisions for your coding team to make regarding introspection type tools and "exception" handling.
One good place to use exception handling is with Operating System calls such as file operations. Reasoning is, for example, a file may have it's access restricted from being accessed by the client appliction. That access restriction is typically an OS-admin task, not the Python application function. So exceptions would be a good use where you application does NOT have control.
You can mutate the previous paragraph's application of exceptions to be broader, in the sense of using Exceptions for things outside the control of what your team codes, to such things as all "devices" or OS resources like OS timers, symbolic links, network connections, etc.
Another common use case is when you use a library or package that is -designed- to through lots of exceptions, and that package expects you to catch or code for that. Some packages are designed to throw as few exceptions as possible and expect you to code based on return values. Then your exceptions should be rare.
Some coding teams use exceptions as a way to log fringe cases of "events" within your application.
I find when desiding whether to use exceptions I am either programing to minimize failure by not having a lot of try/except and have calling routines expect either valid return values or expect an invalid return values. OR I program for failure. That is to say, I program for expected failure by functions, which I use alot of try/except blocks. Think of programming for "expected" failure like working with TCP; the packets aren't garenteed to get there or even in order, but there are exception handling with TCP by use of send/read retrys and such.
Personally I use try-except blocks around the smallest possible block sizes, usually one line of code.

It's up to you; the main exceptions' roles involve (a quote from this splendid book):
Error handling
Event notification
Special-case handling
Termination actions
Unusual control flows

When you know what the error will be, use try/except for debugging purpose. Otherwise, you don't have to use try/except for every function.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python as Hive UDF - Clean Exit on Exception - python

The pass statement will let you ignore the error and return control flow to your program. except Exception, e: pass #To see your exception print str(e) # Alternately, you could get details of your exception using print sys.exc_info()

Related

How to make python not close it self after error, python exception

Python - do something and continue when process finished with exit code 1

why do we use 'pass' in error handling of python?

Get Exit Status of a Nested Function?

Do we use try,except in every single function?

Categories

Resources