Why can't I pickle an error's Traceback in Python? - python

I've since found a workaround, but I still want to know the answer.

The traceback holds references to the stack frames of each function/method that was called on the current thread, from the topmost-frame on down to the point where the error was raised. Each stack frame also holds references to the local and global variables in effect at the time each function in the stack was called.
Since there is no way for pickle to know what to serialize and what to ignore, if you were able to pickle a traceback you'd end up pickling a moving snapshot of the entire application state: as pickle runs, other threads may be modifying the values of shared variables.
One solution is to create a picklable object to walk the traceback and extract only the information you need to save.
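As a rough illustration of that last suggestion, here is a minimal sketch that reduces a traceback to plain tuples, which pickle handles without trouble. The helper name summarize_traceback and the demo function fail are made up for this example; traceback.extract_tb is the standard-library call doing the actual walk.

```python
import pickle
import traceback

def summarize_traceback(tb):
    """Return a picklable list of (filename, lineno, function, source line)."""
    # extract_tb walks tb -> tb.tb_next for us and yields FrameSummary objects.
    return [
        (fs.filename, fs.lineno, fs.name, fs.line)
        for fs in traceback.extract_tb(tb)
    ]

def fail():  # hypothetical function that raises
    return 1 / 0

try:
    fail()
except ZeroDivisionError as exc:
    summary = summarize_traceback(exc.__traceback__)

# Plain tuples of str/int survive a pickle round trip unchanged.
restored = pickle.loads(pickle.dumps(summary))
```

Only the location information is kept here; locals and globals are deliberately left out, which is what makes the result safe to serialize.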

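You can use tblib:
import pickle
from tblib import pickling_support
pickling_support.install()  # teach pickle about tracebacks (requires tblib)
try:
    try:
        1 / 0
    except Exception as e:
        raise Exception("foo") from e
except Exception as e:
    s = pickle.dumps(e)
    raise pickle.loads(s)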

I guess you are interested in saving the complete call context (traceback + globals + locals of each frame).
That would be very useful for determining why the same function behaves differently in two different call contexts, or for building your own advanced tools to process, show, or compare those tracebacks.
The problem is that pickle doesn't know how to serialize every type of object that could be in the locals or globals.
I guess you can build your own object and save it, filtering out all the objects that are not picklable. This code can serve as a basis:
import sys, traceback
def print_exc_plus():
    """
    Print the usual traceback information, followed by a listing of all the
    local variables in each frame.
    """
    tb = sys.exc_info()[2]
    # Walk to the innermost traceback entry.
    while 1:
        if not tb.tb_next:
            break
        tb = tb.tb_next
    # Collect every frame from the innermost one back to the outermost.
    stack = []
    f = tb.tb_frame
    while f:
        stack.append(f)
        f = f.f_back
    stack.reverse()
    traceback.print_exc()
    print("Locals by frame, innermost last")
    for frame in stack:
        print()
        print("Frame %s in %s at line %s" % (frame.f_code.co_name,
                                             frame.f_code.co_filename,
                                             frame.f_lineno))
        for key, value in frame.f_locals.items():
            print("\t%20s = " % key, end=" ")
            # We have to be careful not to cause a new error in our error
            # printer! Calling str() on an unknown object could cause an
            # error we don't want.
            try:
                print(value)
            except Exception:
                print("<ERROR WHILE PRINTING VALUE>")
But instead of printing the objects, you can add them to a list using your own picklable representation (a JSON or YAML format might be better).
Maybe you want to load all this call context in order to reproduce the same situation for your function without running the complicated workflow that generates it. I don't know if this can be done (because of memory references), but in that case you would need to deserialize it from your format.
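To make the "collect instead of print" idea concrete, here is a small sketch that stores repr() strings of each frame's locals in a JSON-friendly structure. The names capture_context and divide are invented for the example, and using repr() is an assumption about what a useful representation looks like; it is lossy, so the original objects cannot be rebuilt from it.

```python
import json
import sys

def capture_context(tb):
    """Walk a traceback and return a JSON-serializable list of frames."""
    frames = []
    while tb is not None:
        frame = tb.tb_frame
        local_reprs = {}
        for key, value in frame.f_locals.items():
            # repr() on an unknown object may itself fail, so guard it.
            try:
                local_reprs[key] = repr(value)
            except Exception:
                local_reprs[key] = "<ERROR WHILE PRINTING VALUE>"
        frames.append({
            "function": frame.f_code.co_name,
            "filename": frame.f_code.co_filename,
            "lineno": tb.tb_lineno,
            "locals": local_reprs,
        })
        tb = tb.tb_next
    return frames

def divide(x, y):  # hypothetical failing function
    return x / y

try:
    divide(1, 0)
except ZeroDivisionError:
    context = capture_context(sys.exc_info()[2])

serialized = json.dumps(context)  # strings and ints only, so this is safe
```

The outermost frame comes first and the frame that raised comes last, matching the "innermost last" order of the printing version.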

Related

python how to re-raise an exception which is already caught?

import sys
def worker(a):
    try:
        return 1 / a
    except ZeroDivisionError:
        return None
def master():
    res = worker(0)
    if not res:
        print(sys.exc_info())
        raise sys.exc_info()[0]
As in the code above, I have a bunch of functions like worker. They already have their own try-except blocks to handle exceptions, and one master function calls each worker. Right now, sys.exc_info() returns None for all three elements. How can I re-raise the exceptions in the master function?
I am using Python 2.7.
One update:
I have more than 1000 workers, and some of them have very complex logic; they may deal with multiple types of exceptions at the same time. So my question is: can I just raise those exceptions from master rather than edit the workers?
In your case, the exception handler in worker returns None. Once that happens, there's no getting the exception back. If your master function knows what the return values should be for each function (for example, a ZeroDivisionError in worker returns None), you can manually re-raise an exception.
If you're not able to edit the worker functions themselves, I don't think there's too much you can do. You might be able to use some of the solutions from this answer, if they work in code as well as on the console.
krflol's code (in another answer here) is similar to how C reports errors through errno: a global variable is assigned a number whenever an error happens, and that number can later be cross-referenced to figure out what went wrong. That is also a possible solution.
If you're willing to edit the worker functions, though, then escalating an exception to the code that called the function is actually really simple:
try:
    # some code
except:
    # some response
    raise
If you use a blank raise at the end of a catch block, it'll reraise the same exception it just caught. Alternatively, you can name the exception if you need to debug print, and do the same thing, or even raise a different exception.
except Exception as e:
    # some code
    raise e
What you're trying to do won't work. Once you handle an exception (without re-raising it), the exception, and the accompanying state, is cleared, so there's no way to access it. If you want the exception to stay alive, you have to either not handle it, or keep it alive manually.
This isn't that easy to find in the docs (the underlying implementation details about CPython are a bit easier, but ideally we want to know what Python the language defines), but it's there, buried in the except reference:
… This means the exception must be assigned to a different name to be able to refer to it after the except clause. Exceptions are cleared because with the traceback attached to them, they form a reference cycle with the stack frame, keeping all locals in that frame alive until the next garbage collection occurs.
Before an except clause’s suite is executed, details about the exception are stored in the sys module and can be accessed via sys.exc_info(). sys.exc_info() returns a 3-tuple consisting of the exception class, the exception instance and a traceback object (see section The standard type hierarchy) identifying the point in the program where the exception occurred. sys.exc_info() values are restored to their previous values (before the call) when returning from a function that handled an exception.
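The clearing behaviour described in the quoted passage can be observed directly. This is only a small sketch: worker here is a variant of the function from the question, changed to return what sys.exc_info() reports while the handler is still running.

```python
import sys

def worker(a):
    try:
        return 1 / a
    except ZeroDivisionError:
        # Inside the handler the exception details are still available.
        return sys.exc_info()[0]

seen_inside = worker(0)

# Back in the caller, outside any handler, the state has been restored.
seen_outside = sys.exc_info()

print(seen_inside)
print(seen_outside)
```

The first value is the ZeroDivisionError class; the second is a tuple of three None values, which is exactly what master sees in the question.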
Also, this is really the point of exception handlers: when a function handles an exception, to the world outside that function, it looks like no exception happened. This is even more important in Python than in many other languages, because Python uses exceptions so promiscuously—every for loop, every hasattr call, etc. is raising and handling an exception, and you don't want to see them.
So, the simplest way to do this is to just change the workers to not handle the exceptions (or to log and then re-raise them, or whatever), and let exception handling work the way it's meant to.
There are a few cases where you can't do this. For example, if your actual code is running the workers in background threads, the caller won't see the exception. In that case, you need to pass it back manually. For a simple example, let's change the API of your worker functions to return a value and an exception:
def worker(a):
    try:
        return 1 / a, None
    except ZeroDivisionError as e:
        return None, e
def master():
    res, e = worker(0)
    if e:
        print(e)
        raise e
Obviously you can extend this further to return the whole exc_info triple, or whatever else you want; I'm just keeping this as simple as possible for the example.
If you look inside the covers of things like concurrent.futures, this is how they handle passing exceptions from tasks running on a thread or process pool back to the parent (e.g., when you wait on a Future).
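For comparison, here is a short sketch of how that looks from the caller's side with concurrent.futures. The worker no longer catches anything; the pool captures the exception and the Future hands it back. The worker and master names follow the question, everything else is standard library.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(a):
    return 1 / a  # no try/except: let the exception escape

def master():
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(worker, 0)
        # exception() waits for the task and returns the captured
        # exception object, or None if the task succeeded.
        error = future.exception()
        if error is not None:
            print("worker failed:", repr(error))
        # result() re-raises the captured exception in this thread.
        return future.result()
```

Calling master() prints the failure and then raises ZeroDivisionError in the calling thread, even though the division happened on a pool thread.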
If you can't modify the workers, you're basically out of luck. Sure, you could write some horrible code to patch the workers at runtime (by using inspect to get their source and then using ast to parse, transform, and re-compile it, or by diving right down into the bytecode), but this is almost never going to be a good idea for any kind of production code.
Not tested, but I suspect you could do something like this. Depending on the scope of the variable you'd have to change it, but I think you'll get the idea:
try:
    something()  # placeholder for the code that may fail
except Exception as e:
    variable_to_make_exception = e
# ... later on, use the variable
An example of handling errors this way:
errors = {}
try:
    print(foo)
except Exception as e:
    errors['foo'] = e
try:
    print(bar)
except Exception as e:
    errors['bar'] = e
print(errors)
raise errors['foo']
Output:
{'foo': NameError("name 'foo' is not defined",), 'bar': NameError("name 'bar' is not defined",)}
Traceback (most recent call last):
  File "<input>", line 13, in <module>
  File "<input>", line 3, in <module>
NameError: name 'foo' is not defined

Restoring variables from outer scope to previous values if an exception occurs

Is it possible to call a function in a kind of protected environment with the following feature: if calling function f raises an exception, then make sure all (outer) variables are restored to their previous values.
For instance, the following code:
a = 42
def f():
    global a
    a += 1
    error
f()
will obviously set a to 43 before raising the exception. I would like to build some try/except structure for calling f() where the exception would restore the variables to their previous state.
Of course, I thought of something related to sys._getframe(1).f_locals. Is it possible? Would it be portable across different versions of Python?
No major goal right now; just curious about the idea.
The short answer is no: there's no snapshot feature for these executions, and thus no way of reverting the variables.
However, there are some things you can do. One of them is the following.
(I'm writing this as I go, so it will be a resource-intensive way to solve your problem if you use it on large variables.)
from pickle import load, dump
def snapshot(v):
    with open('snapshot.bin', 'wb') as fh:
        dump(v, fh)
def restore():
    with open('snapshot.bin', 'rb') as fh:
        v = load(fh)
    return v
a = 42
snapshot(a)
def f():
    global a
    a += 1
    error
try:
    f()
except:
    a = restore()
If this were a class with initialized values, you could also snapshot the entire class, or peek inside it and pull out certain variables. But there's no way to have these things done for you automatically.
Of course, this requires you to know ahead of time which variables will be affected. I'm not sure there is a way to "peek inside" a function and see which variable names will be used, and even then you'd have to use a traceback call to see on which line you got the error and restore based on that.
One way I would solve it is to store all my critical variables in a dictionary and snapshot branches of that dictionary, or the entire dictionary itself.
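The dictionary idea could look roughly like the sketch below. It keeps the snapshot in memory with copy.deepcopy rather than writing a pickle file, which is a substitution on my part, and the names rollback_on_error and state are made up for the example.

```python
import copy
from contextlib import contextmanager

@contextmanager
def rollback_on_error(state):
    """Restore the contents of a dict if the with-block raises."""
    saved = copy.deepcopy(state)  # deep copy so nested values are covered
    try:
        yield state
    except Exception:
        # Restore in place so every reference to `state` sees the old values.
        state.clear()
        state.update(saved)
        raise

state = {"a": 42, "items": [1, 2]}

def f():
    state["a"] += 1
    state["items"].append(3)
    raise RuntimeError("error")

try:
    with rollback_on_error(state):
        f()
except RuntimeError:
    pass

print(state)
```

After the failed call, state is back to {'a': 42, 'items': [1, 2]}. Like the pickle version, this only protects values you explicitly put in the dictionary, and the deep copy gets expensive for large data.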

How to get hold of the object missing an attribute

Suppose we try to access a non-existing attribute:
>>> {'foo': 'bar'}.gte('foo') # well, I meant “get”!
Python’s AttributeError only has the attribute args with a string containing the finished error message: 'dict' object has no attribute 'gte'
Using the inspect and/or traceback modules with sys.last_traceback, is there a way to get hold of the actual dict object?
>>> offending_object = get_attributeerror_obj(sys.last_traceback)
>>> dir(offending_object)
[...
'clear',
'copy',
'fromkeys',
'get', # ah, here it is!
'items',
...]
Edit: since the cat is out of the bag anyway, I’ll share my findings and code (please don’t solve this and submit to PyPI, please ;))
The AttributeError is created here, which shows that there’s clearly no reference to the originating object attached.
Here is the code with the same placeholder function:
import sys
import re
import difflib
AE_MSG_RE = re.compile(r"'(\w+)' object has no attribute '(\w+)'")
def get_attributeerror_obj(tb):
    ???
old_hook = sys.excepthook
def did_you_mean_hook(type, exc, tb):
    old_hook(type, exc, tb)
    if type is AttributeError:
        match = AE_MSG_RE.match(exc.args[0])
        sook = match.group(2)
        raising_obj = get_attributeerror_obj(tb)
        matches = difflib.get_close_matches(sook, dir(raising_obj))
        if matches:
            print('\n\nDid you mean?', matches[0], file=sys.stderr)
sys.excepthook = did_you_mean_hook
It's not the answer you want, but I'm pretty sure you can't... at least not with sys.excepthook. This is because the reference counts are decremented as the frame is unwound, so it's perfectly valid for the object to be garbage collected before sys.excepthook is called. In fact, this is what happens in CPython:
import sys
class X:
    def __del__(self):
        print("deleting")
def error():
    X().wrong
old_hook = sys.excepthook
def did_you_mean_hook(type, exc, tb):
    print("Error!")
sys.excepthook = did_you_mean_hook
error()
#>>> deleting
#>>> Error!
That said, it isn't always the case. Because the exception object points to the frame, if your code looks like:
def error():
    x = X()
    x.wrong
x cannot yet be collected. x is owned by the frame, and the frame is alive. But since I've already shown that no explicit reference is made to this object, it's never obvious what to do. For example,
def error():
    foo().wrong
may or may not have an object that has survived, and the only feasible way to find out is to run foo... but even then you have problems with side effects.
So no, this is not possible. If you don't mind going to any lengths whatsoever, you'll probably end up having to rewrite the AST on load (akin to FuckIt.py). You don't want to do that, though.
My suggestion would be to try using a linter to get the names of all known classes and their methods. You can use this to reverse-engineer the traceback string to get the class and incorrect method, and then run a fuzzy match to find the suggestion.
Adding my 2 cents as I successfully (so far) tried to do something similar for DidYouMean-Python.
The trick here is that it is pretty much the one case where the error message contains enough information to infer what you actually meant. Indeed, what really matters here is that you tried to call gte on a dict object: you need the type, not the object itself.
If you had written {'foo': 'bar'}.get('foob') the situation would be much trickier to handle and I'd be happy to know if anyone had a solution.
Step one
Check that you are handling an AttributeError (using the first argument of the hook).
Step two
Retrieve the relevant information from the message (using the second argument). I did this with a regexp. Please note that this exception can take multiple forms depending on the version of Python, the object you are calling the method on, etc.
So far, my regexp is: "^'?(\w+)'? (?:object|instance) has no attribute '(\w+)'$"
Step three
Get the type object corresponding to the type name ('dict' in your case) so that you can call dir() on it. A dirty solution would be to just use eval(type), but you can do better and cleaner by reusing the information in the trace (third argument of your hook): the last element of the trace contains the frame in which the exception occurred, and in that frame the type was properly defined (either as a local type, a global type or a builtin).
Once you have the type object, you just need to call dir() on it and extract the suggestion you like the most.
Please let me know if you need more details on what I did.
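The three steps might be put together along these lines. This is my own sketch, not the DidYouMean-Python implementation: suggest_attribute is a made-up name, the regexp is the one quoted above without the trailing anchor, and it only handles message forms that regexp matches.

```python
import builtins
import difflib
import re

AE_MSG_RE = re.compile(
    r"^'?(\w+)'? (?:object|instance) has no attribute '(\w+)'"
)

def suggest_attribute(exc, tb):
    """Return the closest attribute name for an AttributeError, or None."""
    # Step two: pull the type name and the bad attribute out of the message.
    match = AE_MSG_RE.match(str(exc))
    if match is None:
        return None
    type_name, wrong_attr = match.groups()
    # Step three: the last traceback entry holds the frame that raised.
    while tb.tb_next:
        tb = tb.tb_next
    frame = tb.tb_frame
    # Resolve the type name as a local, then a global, then a builtin.
    for namespace in (frame.f_locals, frame.f_globals, vars(builtins)):
        if type_name in namespace:
            close = difflib.get_close_matches(wrong_attr, dir(namespace[type_name]))
            return close[0] if close else None
    return None

try:
    {'foo': 'bar'}.gte('foo')
except AttributeError as e:  # step one: only AttributeError is handled
    suggestion = suggest_attribute(e, e.__traceback__)

print(suggestion)
```

For the dict example this prints get. The same function can be called from a sys.excepthook replacement with the hook's second and third arguments.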

Getting a hold of locals() from surrounding frame

I'm trying to replace my Template(s).substitute("$a,$b", locals()) with something short like
sub("$a,$b")
However, I don't have access to locals of surrounding scope inside sub(), any idea how to get them?
One possible workaround I found is to throw an exception, catch it, and step along the frames to find the previous frame, but perhaps there's an easier way?
import traceback, sys, code
try:
    2/0
except Exception as e:
    type, value, tb = sys.exc_info()
    traceback.print_exc()
    last_frame = lambda tb=tb: last_frame(tb.tb_next) if tb.tb_next else tb
    frame = last_frame().tb_frame
    ns = dict(frame.f_globals)
Try using sys._current_frames() instead of raising exception.
Possible alternatives: sys._getframe(), inspect.currentframe(), inspect.stack()
I can't think of a better solution than analysing frames.
You can access it directly via sys._getframe(), although it's only guaranteed to work with CPython.
from string import Template
import sys
def sub(template):
    namespace = sys._getframe(1).f_locals  # caller's locals
    return Template(template).substitute(namespace)
a, b = 1, 42
print(sub("$a,$b"))  # -> 1,42
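The same helper can also be written with inspect.currentframe(), one of the alternatives mentioned above. This is a sketch only; the demo function is invented to show that the caller's local variables are picked up, and the del in the finally block avoids keeping a reference to the frame around.

```python
import inspect
from string import Template

def sub(template):
    frame = inspect.currentframe()
    try:
        caller = frame.f_back  # the frame that called sub()
        return Template(template).substitute(caller.f_locals)
    finally:
        del frame  # do not keep the frame alive longer than needed

def demo():
    a, b = 1, 42
    return sub("$a,$b")

print(demo())  # -> 1,42
```

inspect.currentframe() may return None on Python implementations without frame support, so this has the same portability caveat as sys._getframe().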

Python function local variable scope during exceptions

Background: I'm doing COM programming of National Instruments' TestStand in Python. TestStand complains if objects aren't "released" properly (it pops up an "objects not released properly" debug dialog box). The way to release the TestStand COM objects in Python is to ensure all variables no longer contain the object—e.g. del() them, or set them to None. Or, as long as the variables are function local variables, the object is released as soon as the variable goes out of scope when the function ends.
Well, I've followed this rule in my program, and my program releases objects properly as long as there are no exceptions. But if I get an exception, then I get the "objects not released" message from TestStand. This seems to indicate that function local variables aren't going out of scope normally when an exception happens.
Here is a simplified code example:
class TestObject(object):
    def __init__(self, name):
        self.name = name
        print("Init " + self.name)
    def __del__(self):
        print("Del " + self.name)
def test_func(parameter):
    local_variable = parameter
    try:
        pass
        # raise Exception("Test exception")
    finally:
        pass
        # local_variable = None
        # parameter = None
outer_object = TestObject('outer_object')
try:
    inner_object = TestObject('inner_object')
    try:
        test_func(inner_object)
    finally:
        inner_object = None
finally:
    outer_object = None
When this runs as shown, it shows what I expect:
Init outer_object
Init inner_object
Del inner_object
Del outer_object
But if I uncomment the raise Exception... line, instead I get:
Init outer_object
Init inner_object
Del outer_object
Traceback (most recent call last):
...
Exception: Test exception
Del inner_object
The inner_object is deleted late due to the exception.
If I uncomment the lines that set both parameter and local_variable to None, then I get what I expect:
Init outer_object
Init inner_object
Del inner_object
Del outer_object
Traceback (most recent call last):
...
Exception: Test exception
So when exceptions happen in Python, what exactly happens to function local variables? Are they being saved somewhere so they don't go out of scope as normal? What is "the right way" to control this behaviour?
Your exception-handling is probably creating reference loops by keeping references to frames. As the docs put it:
Note: Keeping references to frame objects, as found in the first element of the frame records these functions return [[NB: "these functions" here refers to some in module inspect, but the rest of the paragraph applies more widely!]], can cause your program to create reference cycles. Once a reference cycle has been created, the lifespan of all objects which can be accessed from the objects which form the cycle can become much longer even if Python’s optional cycle detector is enabled. If such cycles must be created, it is important to ensure they are explicitly broken to avoid the delayed destruction of objects and increased memory consumption which occurs. Though the cycle detector will catch these, destruction of the frames (and local variables) can be made deterministic by removing the cycle in a finally clause. This is also important if the cycle detector was disabled when Python was compiled or using gc.disable(). For example:
import inspect
def handle_stackframe_without_leak():
    frame = inspect.currentframe()
    try:
        # do something with the frame
        pass
    finally:
        del frame
A function's scope is for the entire function. Handle this in finally.
According to this answer for another question, it is possible to inspect local variables on the frame in an exception traceback via tb_frame.f_locals. So it does look as though the objects are kept "alive" for the duration of the exception handling.
