For example, I am using a multiprocessing pool to process files:

with Pool(5) as pool:
    results = pool.starmap(self.process_file, zip(files, repeat(channel)))
When an exception occurs inside the function process_file, the traceback says that it occurred at the pool.starmap line, not at the actual place inside the process_file function.
I am using PyCharm to develop and debug. Is there a way to change this behavior? The current error message doesn't give the position where the error actually occurred.
Multiprocessing transfers the errors between processes using the pickle module, but pickle doesn't know how to preserve the tracebacks of exceptions by default.
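You can see the root cause with nothing but the standard library; this minimal sketch (my own illustration, not from the answer) pickles a caught exception the way the pool machinery does and shows that the traceback does not survive the round trip:

import pickle

try:
    1 / 0
except ZeroDivisionError as e:
    clone = pickle.loads(pickle.dumps(e))  # roughly what multiprocessing does
    print(e.__traceback__)      # <traceback object at 0x...>
    print(clone.__traceback__)  # None -- the traceback was dropped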
I found tblib to be a very convenient way to address this shortcoming. Based on this example I suggest you try adding this code to the main module of your code:
from tblib import pickling_support

# all your setup code

pickling_support.install()

if __name__ == "__main__":
    # your pool work
    ...
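Put together, a minimal runnable sketch looks like this (assuming tblib is installed, e.g. via pip install tblib; the file names and the ValueError are made up for illustration). Because pickling_support.install() runs at module level, the workers pick it up too, and the exception re-raised in the parent carries the worker-side frames:

from itertools import repeat
from multiprocessing import Pool

from tblib import pickling_support

pickling_support.install()  # teach pickle to round-trip tracebacks

def process_file(path, channel):
    raise ValueError("failed on %s (%s)" % (path, channel))

if __name__ == "__main__":
    with Pool(5) as pool:
        # the traceback now points into process_file, not just at starmap
        pool.starmap(process_file, zip(["a.txt", "b.txt"], repeat("ch1")))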
The exception has the original exception info, but PyCharm is not ferreting it out.
Assuming there are no PyCharm configuration options to enhance its ability to ferret out all the exception information, and not just the outer exception as you are seeing, you need to extract it programmatically.
For good in-program error handling, you probably want to do that anyway. Especially with sub-processes, I very often catch Exception, log it, and re-raise unless I consider it handled (it depends); if I catch a specific exception I'm expecting and consider it handled, I don't re-raise.
Note, it's not only PyCharm showing the outer exception... I see the same thing with other tools.
The code below will show you the original problem (see "line 7" in the output below) and re-raise. Again, re-raising or not is really context dependent, so below is just an example. But the point is that the exception you are seeing has more data that PyCharm by default is apparently not showing you.
from itertools import repeat
from multiprocessing import Pool
import traceback

def process(a, b):
    print(a, b)
    raise Exception("not good")

if __name__ == '__main__':
    with Pool(5) as pool:
        try:
            results = pool.starmap(process, zip([1, 2, 3, 4, 5], repeat('A')))
        except Exception as ex:
            print("starmap failure:")
            for error_line in traceback.format_exception(type(ex), ex, ex.__traceback__):
                error_line = error_line.strip()
                if not error_line:
                    continue
                print(f"    {error_line}")
            raise  # re-raise if we do not consider this handled
Gives me this output:
starmap failure:
    multiprocessing.pool.RemoteTraceback:
    """
    Traceback (most recent call last):
    File "C:\Users\...\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
    File "C:\Users\...\multiprocessing\pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
    File "...\starmap_exception.py", line 7, in process
    raise Exception("not good")
    Exception: not good
    """
    The above exception was the direct cause of the following exception:
    Traceback (most recent call last):
    File "...\starmap_exception.py", line 12, in <module>
    results = pool.starmap(process, zip([1,2,3,4,5], repeat('A')))
    File "C:\Users\...\multiprocessing\pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
    File "C:\Users\...\multiprocessing\pool.py", line 771, in get
    raise self._value
    Exception: not good
So when I run this, the error is in the line bomb=pd.DataFrame(here,0), but the traceback shows me a bunch of code from the pandas library on the way to the error.
import traceback, sys
import pandas as pd

def error_handle(err_var, instance_name=None):  # err_var: list of variable names
    print(traceback.format_exc())
    a = sys._getframe(1).f_locals
    for i in err_var:  # selected vars for the instance
        t = a[instance_name]
        print i, "--->", getattr(t, i.split(".")[1])

here = ['foo']
err_var = ['self.needthisone', 'self.constant2']

class test:
    def __init__(self):
        self.constant1 = 'hi1'
        #self.constant2 = 'hi2'
        #self.needthisone = ':)'
        for i in err_var:
            setattr(self, i.split('.')[1], None)

    def other_function(self):
        self.other_var = 5

    def testing(self):
        self.other_function()
        vars = [self.constant1, self.constant2]
        try:
            for i in vars:
                bomb = pd.DataFrame(here, 0)
        except:
            error_handle(err_var, 'self')

t = test()
t.testing()
How do I suppress all that and have the error just look like this:

Traceback (most recent call last):
  File "C:\Users\Jason\Google Drive\python\error_handling.py", line 34, in testing
    bomb=pd.DataFrame(here,0)
TypeError: Index(...) must be called with a collection of some kind, 0 was passed
I just want what's relevant to me: the last line of code I wrote, which was the bad one.
This is the original:

Traceback (most recent call last):
  File "C:\Users\Jason\Google Drive\python\error_handling.py", line 35, in testing
    bomb=pd.DataFrame(here,0)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 330, in __init__
    copy=copy)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 474, in _init_ndarray
    index, columns = _get_axes(*values.shape)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 436, in _get_axes
    index = _ensure_index(index)
  File "C:\Python27\lib\site-packages\pandas\core\indexes\base.py", line 3978, in _ensure_index
    return Index(index_like)
  File "C:\Python27\lib\site-packages\pandas\core\indexes\base.py", line 326, in __new__
    cls._scalar_data_error(data)
  File "C:\Python27\lib\site-packages\pandas\core\indexes\base.py", line 678, in _scalar_data_error
    repr(data)))
TypeError: Index(...) must be called with a collection of some kind, 0 was passed

self.needthisone ---> None
self.constant2 ---> None
You can define how far back a traceback goes using the sys.tracebacklimit variable. If your code is only 3 levels deep (a function in a class in a file), then you can set this appropriately with the line:

sys.tracebacklimit = 3
at the top of your file. BE CAREFUL WITH THIS: as you write more code, the portion that you've written will become deeper and deeper, and you may soon find that an error is the result of something deeper in the traceback. As a general rule, I would avoid using the variable and just deal with the longer traceback for the time being.
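For illustration, here is a tiny sketch of the mechanics (the function names are made up); with the limit set, only 3 of the 4 stack entries are printed for the uncaught exception:

import sys
sys.tracebacklimit = 3  # cap the number of traceback entries printed

def a():
    raise ValueError("boom")

def b():
    a()

def c():
    b()

c()  # the printed traceback is cut down to 3 entries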
Please, don't ever think about limiting the stack trace. It is very important.
Only at this moment, in this small example of yours, the error really is in your code.
But in infinitely many other cases, an error could be triggered much deeper than that. It could be in the framework, or even outside your code altogether, like a configuration error; or it could be in the platform, like an out-of-memory error, etc.
The stack trace is there to help you. It lists all the frames the interpreter was executing, to give you all the information you need to understand what was going on.
I would highly encourage you not to limit your traceback output, because it is bad practice. You feel like there is too much info, but this is only because you have already looked at it and know which error to look for.
In most cases, the problem may be hiding elsewhere. So there has to be a better way to achieve what you're looking for.
Why not wrap your function call in a try except clause and print the exception message? Take this scenario for example:
def f():
    a = 0
    i = 1
    print i / a

def another_func():
    print 'this is another func'
    return f()

def higher_level_func():
    print 'this is higher level'
    return another_func()

if __name__ == '__main__':
    try:
        higher_level_func()
    except Exception as e:
        print 'caught the exception: {}-{}'.format(type(e).__name__, e.message)
When called, this is the output:
this is higher level
this is another func
caught the exception: ZeroDivisionError-integer division or modulo by zero
This prints only the relevant exception in your code, hiding any information about the traceback; but the traceback is still available, and you can print it as well (just re-raise the exception from your except block).
In comparison, if I re-raise the exception after printing it, the full traceback is printed as well:

this is higher level
this is another func
caught the exception: ZeroDivisionError-integer division or modulo by zero
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    higher_level_func()
  File "test.py", line 12, in higher_level_func
    return another_func()
  File "test.py", line 8, in another_func
    return f()
  File "test.py", line 4, in f
    print i/a
ZeroDivisionError: integer division or modulo by zero
You'd be better off using this technique to capture the relevant exception than limiting the traceback. If you want your program to stop, just add sys.exit(1) in the except block.
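If you do want an automated middle ground, here is a hedged sketch (Python 3, unlike the Python 2 snippets above; the helper name and the site-packages test are my own illustration, not a standard recipe) that walks the traceback and prints only the frames that come from your code:

import traceback

def print_my_frames(exc):
    # print only the traceback frames that live outside site-packages
    print('Traceback (most recent call last):')
    for frame in traceback.extract_tb(exc.__traceback__):
        if 'site-packages' in frame.filename:
            continue  # skip library internals, keep only my code
        print('  File "%s", line %s, in %s' % (frame.filename, frame.lineno, frame.name))
        print('    %s' % frame.line)
    print('%s: %s' % (type(exc).__name__, exc))

try:
    import pandas as pd
    pd.DataFrame(['foo'], 0)
except Exception as e:
    print_my_frames(e)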
I want a context manager to catch an exception, print the stack trace, and then allow execution to continue.
I want to know if I can do this with the contextlib contextmanager decorator. If not, how can I do it?
The documentation suggests the following:
At the point where the generator yields, the block nested in the with statement is executed. The generator is then resumed after the block is exited. If an unhandled exception occurs in the block, it is reraised inside the generator at the point where the yield occurred. Thus, you can use a try…except…finally statement to trap the error (if any), or ensure that some cleanup takes place. If an exception is trapped merely in order to log it or to perform some action (rather than to suppress it entirely), the generator must reraise that exception.
So I try the obvious approach that the documentation leads me to:
import contextlib
import logging

@contextlib.contextmanager
def log_error():
    try:
        yield
    except Exception as e:
        logging.exception('hit exception')
    finally:
        print 'done with contextmanager'

def something_inside_django_app():
    with log_error():
        raise Exception('alan!')

something_inside_django_app()
print 'next block of code'
This produces the output:

ERROR:root:hit exception
Traceback (most recent call last):
  File "exception_test.py", line 8, in log_error
    yield
  File "exception_test.py", line 17, in something_inside_django_app
    raise Exception('alan!')
Exception: alan!
done with contextmanager
next block of code
This loses critical information about where the exception was raised from. Consider what you get when you adjust the context manager to not suppress the exception:
Traceback (most recent call last):
  File "exception_test.py", line 20, in <module>
    something_inside_django_app()
  File "exception_test.py", line 17, in something_inside_django_app
    raise Exception('alan!')
Exception: alan!
Yes, it was able to tell me that the exception was raised from line 17, thank you very much, but the prior call at line 20 is lost information. How can I have the context manager give me the actual full call stack and not its truncated version of it? To recap, I want to fulfill two requirements:
have a python context manager suppress an exception raised in the code it wraps
print the stack trace that would have been generated by that code, had I not been using the context manager
If this cannot be done with the decorator, then I'll use the other style of context manager instead. If this cannot be done with context managers, period, I would like to know what a good pythonic alternative is.
I have updated my solution for this problem here:
https://gist.github.com/AlanCoding/288ee96b60e24c1f2cca47326e2c0af1
There was more context that the question missed. In order to obtain the full stack at the point of exception, we need both the traceback returned to the context manager, and the current context. Then we can glue together the top of the stack with the bottom of the stack.
To illustrate the use case better, consider this:
def err_method1():
    print [1, 2][4]

def err_method2():
    err_method1()

def outside_method1():
    with log_error():
        err_method2()

def outside_method2():
    outside_method1()

outside_method2()
To really accomplish what this question is looking for, we want to see both outer methods, and both inner methods in the call stack.
Here is a solution that does appear to work for this:
import logging
import StringIO
import traceback as tb

class log_error(object):
    def __enter__(self):
        return

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if exc_value:
            # We want the _full_ traceback with the context, so first we
            # get context for the current stack, and delete the last 2
            # frames of context (4 lines), which are this __exit__ method itself...
            top_stack = StringIO.StringIO()
            tb.print_stack(file=top_stack)
            top_lines = top_stack.getvalue().strip('\n').split('\n')[:-4]
            top_stack.close()
            # Now, we glue that stack to the stack from the local error
            # that happened within the context manager
            full_stack = StringIO.StringIO()
            full_stack.write('Traceback (most recent call last):\n')
            full_stack.write('\n'.join(top_lines))
            full_stack.write('\n')
            tb.print_tb(exc_traceback, file=full_stack)
            full_stack.write('{}: {}'.format(exc_type.__name__, str(exc_value)))
            sinfo = full_stack.getvalue()
            full_stack.close()
            # Log the combined stack
            logging.error('Log message\n{}'.format(sinfo))
        return True
The traceback looks like:
ERROR:root:Log message
Traceback (most recent call last):
  File "exception_test.py", line 71, in <module>
    outside_method2()
  File "exception_test.py", line 69, in outside_method2
    outside_method1()
  File "exception_test.py", line 65, in outside_method1
    err_method2()
  File "exception_test.py", line 60, in err_method2
    err_method1()
  File "exception_test.py", line 56, in err_method1
    print [1, 2][4]
IndexError: list index out of range
This is the same information that you would expect from doing logging.exception in a try-except over the same code that you wrap in the context manager.
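For what it's worth, on Python 3 the same glue idea can be condensed with traceback.format_stack and traceback.format_tb; this is my own sketch of the class above, not code from the gist:

import logging
import traceback

class log_error:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        if exc_value is not None:
            top = traceback.format_stack()[:-1]   # outer stack, minus this __exit__ frame
            bottom = traceback.format_tb(exc_tb)  # frames inside the with block
            logging.error('Log message\nTraceback (most recent call last):\n%s%s%s: %s',
                          ''.join(top), ''.join(bottom), exc_type.__name__, exc_value)
        return True  # suppress the exception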
Given a raised exception, I would like to jump into the frame where it was raised. To explain better what I mean, I wrote this MWE.
Assuming I have the following code:
from multiprocessing import Pool
import sys

# Setup debugger
def raiseDebugger(*args):
    """ http://code.activestate.com/recipes/65287-automatically-start-the-
    debugger-on-an-exception/ """
    import traceback, pdb
    traceback.print_exception(*args)
    pdb.pm()

sys.excepthook = raiseDebugger

# Now start with the question
def faulty(i):
    return 1 / i

with Pool() as pool:
    pool.map(faulty, range(6))
which unsurprisingly leads to:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/bin/conda/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/bin/conda/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "test2.py", line 19, in faulty
    return 1 / i
ZeroDivisionError: division by zero
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test2.py", line 23, in <module>
    pool.map(faulty, range(6))
  File "/home/bin/conda/lib/python3.5/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/bin/conda/lib/python3.5/multiprocessing/pool.py", line 608, in get
    raise self._value
ZeroDivisionError: division by zero
> /home/bin/conda/lib/python3.5/multiprocessing/pool.py(608)get()
-> raise self._value
(Pdb)
Now to debug the problem I would like to "jump" into the frame which originally raised the exception (ZeroDivisionError).
The original exception is still available under self._value complete with self._value.__traceback__.
What pm (or post_mortem) uses comes from the value field of sys.exc_info(), and the default invocation of post_mortem is done on the __traceback__ of that value. However, if you want to get to the underlying exception, you want to access its __context__ instead. Given this code example:
import pdb
import sys
import traceback

def top():
    value = 1
    raise Exception('this always fails')

def bottom():
    try:
        top()
    except Exception as bot_ex:
        x = {}
        return x['nothing']

try:
    bottom()
except Exception as main_ex:
    pdb.post_mortem()
Running the code, main_ex would be analogous to your self._value.
> /tmp/foo.py(14)bottom()
-> return x['nothing']
(Pdb) main_ex
KeyError('nothing',)
(Pdb) pdb.post_mortem(main_ex.__traceback__)
> /tmp/foo.py(14)bottom()
-> return x['nothing']
Note we have a new pdb prompt at the same location, which is where the exception was originally raised. Let's try it with __context__ if we need to go further up:
(Pdb) c
(Pdb) pdb.post_mortem(main_ex.__context__.__traceback__)
> /tmp/foo.py(7)top()
-> raise Exception('this always fails')
If needed, keep repeating until you get to the target context/traceback desired.
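If the chain is deep, a small helper (my own convenience wrapper, standard library only) automates the walk down to the innermost exception:

import pdb

def post_mortem_innermost(exc):
    # follow __context__ links back to the original exception, then debug there
    while exc.__context__ is not None:
        exc = exc.__context__
    pdb.post_mortem(exc.__traceback__)

# e.g. from the failing frame: post_mortem_innermost(main_ex)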
Now for the multiprocessing case, which I wasn't aware would make this much of a difference; the question reads as something general (how can I "jump" into a stack frame from an exception?), but it turns out the specifics of multiprocessing make all the difference.
In Python 3.4, a workaround was added that transfers the worker's traceback as a string. A traceback object holds a lot of state, and communicating all of it between processes proved difficult, as discussed in issue 13831 on the Python tracker. So instead a hack was done to attach a __cause__ attribute to the current exception; it is no full __traceback__, though, as it only carries the string representation, as I had suspected.
Anyway this is what would have happened:
(Pdb) !import pdb
(Pdb) !self._value.__cause__
RemoteTraceback('\n"""\nTraceback (most recent call last):...',)
(Pdb) !type(self._value.__cause__)
<class 'multiprocessing.pool.RemoteTraceback'>
(Pdb) !self._value.__cause__.__traceback__
(Pdb) !self._value.__cause__.__context__
So this isn't actually possible until they figure out how to bring all those states across process boundaries.
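Until then, a pragmatic workaround (my suggestion, not part of the answer above) is to reproduce the failure in-process while debugging, where a real traceback for post_mortem does exist:

import pdb
import sys
import traceback

def faulty(i):
    return 1 / i

try:
    list(map(faulty, range(6)))  # same calls as pool.map, but in this process
except Exception:
    traceback.print_exc()
    pdb.post_mortem(sys.exc_info()[2])  # lands right in faulty's frame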
In the following stack trace I am missing the upper frames.
Who called callback() in ioloop.py line 458?
The stacktrace comes from a unittest TestCase. All tests pass but this traceback is in the logs and reproducible.
I can't see in which test of the TestCase the exception was raised.
ERROR [25950] Exception in callback <functools.partial object at 0x5358368>
Traceback (most recent call last):
  File "/home/modwork_foo_dtg/lib/python2.7/site-packages/tornado/ioloop.py", line 458, in _run_callback
    callback()
  File "/home/modwork_foo_dtg/lib/python2.7/site-packages/tornado/stack_context.py", line 331, in wrapped
    raise_exc_info(exc)
  File "/home/modwork_foo_dtg/lib/python2.7/site-packages/tornado/stack_context.py", line 302, in wrapped
    ret = fn(*args, **kwargs)
  File "/home/modwork_foo_dtg/src/websocketrpc/websocketrpc/client.py", line 71, in connect
    self.ws = websocket_connect(self.args.url)
  File "/home/modwork_foo_dtg/src/websocketrpc/websocketrpc/client.py", line 179, in websocket_connect
    conn = websocket.WebSocketClientConnection(io_loop, request)
  File "/home/modwork_foo_dtg/lib/python2.7/site-packages/tornado/websocket.py", line 777, in __init__
    raise Exception('%s %s' % (request, request.url))
Exception: <tornado.httpclient._RequestProxy object at 0x535cb10> None
How could I use tornado to see the upper stacktrace frames?
The exception itself is not the problem.
You can override IOLoop.handle_callback_exception and print from sys.exc_info() to see what specifically is breaking.
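A sketch of what that override could look like (hedged: handle_callback_exception is Tornado's documented hook, but how you install a custom loop class depends on your Tornado version, e.g. via IOLoop.configure on older releases):

import sys
import traceback
from tornado.ioloop import IOLoop

class VerboseIOLoop(IOLoop):
    def handle_callback_exception(self, callback):
        # sys.exc_info() is still set when Tornado invokes this hook
        print('Exception in callback %r' % (callback,), file=sys.stderr)
        traceback.print_exception(*sys.exc_info())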
The callback was invoked by ioloop.py:458, just like it says. No outer stack frames are shown because the exception didn't escape that frame. The thing that's confusing you is that the callback goes on to re-raise an exception that was captured earlier.
In Python 2, preserving tracebacks to re-raise later is messy (it gets better in Python 3). Tornado usually does the right thing here, but there are some gaps where a traceback will get truncated. The main problem I'm aware of in current versions is that AsyncHTTPClient tends to throw away tracebacks (and there are some annoying backwards-compatibility issues with fixing this).
As a crude workaround while debugging, you can try printing the output of traceback.format_stack() just before throwing an exception (at least where it's feasible to modify the code, as you've done here to add an exception to websocket.py).
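Concretely, the debugging aid could look like this minimal sketch (raise_with_stack is an illustrative helper; you can just as well inline the two lines wherever you are about to raise):

import traceback

def raise_with_stack(exc):
    # print the callers above this point before raising
    print(''.join(traceback.format_stack()[:-1]))
    raise exc

# e.g. in the instrumented spot in websocket.py:
#   raise_with_stack(Exception('%s %s' % (request, request.url)))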
I just converted all my unit test data from JSON to YAML, and now an exception is raised somewhere in my code. More specifically, this is the printed traceback:
Traceback (most recent call last):
  File "tests/test_addrtools.py", line 95, in test_validate_correctable_addresses
    self.assertTrue(self.validator(addr), msg)
  File "/Users/tomas/Dropbox/Broadnet/broadpy/lib/broadpy/addrtools.py", line 608, in __call__
    self.validate(addr)
  File "/Users/tomas/Dropbox/Broadnet/broadpy/lib/broadpy/addrtools.py", line 692, in validate
    if self._correction_citytypo(addr): return
  File "/Users/tomas/Dropbox/Broadnet/broadpy/lib/broadpy/addrtools.py", line 943, in _correction_citytypo
    ratio = lev_ratio(old_city, city)
TypeError: ratio expected two Strings or two Unicodes
Now, the file "addrtools.py" on line 943 contains the answer to my problem. I want to see the type and values of old_city and city in the scope where the exception is raised. I have this sort of issue all the time, and a quick and painless method of using pdb to inspect the locals in the scope where the exception is raised would save me tons of time in the future.
I did try the solution posted in the answer to this question, but the post-mortem function places me in python2.7/unittest/main.py(231)runTests() which doesn't help me a whole lot. I guess this is because the exception is caught and re-raised from the unittest code.
Wrap it with this decorator:
import functools
import pdb
import sys

def debug_on(*exceptions):
    if not exceptions:
        exceptions = (AssertionError,)
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            try:
                return f(*args, **kwargs)
            except exceptions:
                pdb.post_mortem(sys.exc_info()[2])
        return wrapper
    return decorator
Example:

@debug_on(TypeError)
def buggy_function():
    ...
    raise TypeError
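Applied to the failing test from the question, a sketch would look like this (the test name comes from your traceback; AddressTests, self.cases, and self.validator are placeholders for your own fixture, and debug_on is the decorator above):

import unittest

class AddressTests(unittest.TestCase):
    @debug_on(TypeError)
    def test_validate_correctable_addresses(self):
        # when the TypeError fires in _correction_citytypo, pdb drops you into
        # the raising frame, where old_city and city can be inspected
        for addr, msg in self.cases:
            self.assertTrue(self.validator(addr), msg)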
The unittest superset nose has options that drop you into pdb when a test errors or fails, if it's okay for you to use nose as your test runner:

--pdb                 Drop into debugger on errors
--pdb-failures        Drop into debugger on failures