Raising multiple exceptions from threads in Python - python

tl;dr I spawn 3 threads, each thread throws an exception, most pythonic way to raise all 3 exceptions?
Below is a code example that is similar to what I am doing.
from multiprocessing.pool import ThreadPool
def fail_func(host):
raise Exception('{} FAILED!!!'.format(host))
hosts = ['172.1.1.1', '172.1.1.2', '172.1.1.3']
pool = ThreadPool(processes=5)
workers = [pool.apply_async(fail_func(host)) for host in hosts]
# join and close thread pool
pool.join(); pool.close()
# get the exceptions
[worker.get() for worker in workers if not worker.successful()]
What it ends up doing is just failing on the 1st host with the following traceback:
Traceback (most recent call last):
File "thread_exception_example.py", line 8, in <module>
workers = [pool.apply_async(fail_func(host)) for host in hosts]
File "thread_exception_example.py", line 4, in fail_func
raise Exception('{} FAILED!!!'.format(host))
Exception: 172.1.1.1 FAILED!!!
But what I want it to do is raise multiple exceptions for each thread that failed, like so:
Traceback (most recent call last):
File "thread_exception_example.py", line 8, in <module>
workers = [pool.apply_async(fail_func(host)) for host in hosts]
File "thread_exception_example.py", line 4, in fail_func
raise Exception('{} FAILED!!!'.format(host))
Exception: 172.1.1.1 FAILED!!!
Traceback (most recent call last):
File "thread_exception_example.py", line 8, in <module>
workers = [pool.apply_async(fail_func(host)) for host in hosts]
File "thread_exception_example.py", line 4, in fail_func
raise Exception('{} FAILED!!!'.format(host))
Exception: 172.1.1.2 FAILED!!!
Traceback (most recent call last):
File "thread_exception_example.py", line 8, in <module>
workers = [pool.apply_async(fail_func(host)) for host in hosts]
File "thread_exception_example.py", line 4, in fail_func
raise Exception('{} FAILED!!!'.format(host))
Exception: 172.1.1.3 FAILED!!!
is there any pythonic way of doing this? or do I need to wrap everything in a try/except, collect all the messages, then re-raise a single Exception?

There is no way to "raise multiple exceptions". In a given exception context, there is either an exception, or not.
So yes, you will have to create a wrapper exception that holds all of the exceptions, and raise that. But you've almost got all the code you need:
def get_exception():
try:
worker.get()
except Exception as e:
return e
Now, instead of:
[worker.get() for worker in workers if not worker.successful()]
… you can just do:
[get_exception(worker.get) for worker in workers if not worker.successful()]
And that's a list of exceptions.
Personally, I've always thought AsyncResult should have an exception method, similar to the one in concurrent.futures.Future. But then I would have used futures here in the first place (installing the backport if I were forced to use Python 2.x).

Related

How to handle initializer error in multiprocessing.Pool?

When initializer throw Error like below, script won't stop.
I would like to abort before starting main process(do not run 'do_something').
from multiprocessing import Pool
import contextlib
def initializer():
raise Exception("init failed")
def do_something(args):
# main process
pass
pool = Pool(1, initializer=initializer)
with contextlib.closing(pool):
try:
pool.map_async(do_something, [1]).get(100)
except:
pool.terminate()
The never stopping stacktrace on console is below
...
Exception: init failed
Process ForkPoolWorker-18:
Traceback (most recent call last):
File "/home/hoge/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/hoge/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/hoge/anaconda3/lib/python3.6/multiprocessing/pool.py", line 103, in worker
initializer(*initargs)
File "hoge.py", line 5, in initializer
raise Exception("init failed")
Exception: init failed
...
My workaround is suppressing initializer error and return at the beginning of the main process by using global flag like below.
But I would like to learn better one.
def initializer():
try:
raise Exception("init failed")
except:
global failed
failed = True
def do_something(args):
global failed
if failed:
# skip when initializer failed
return
# main process
After navigating through the implementation of multiprocessing using PyCharm, I'm convinced that there is no better solution, because Pool started a thread to _maintain_pool() by _repopulate_pool() if any worker process exists--either accidentally or failed to initialize.
Check this out: Lib/multiprocessing/pool.py line 244
I just came across the same woe. My first solution was to catch the exception and raise it in the worker function (see below). But on second thought it really means that initializer support of multiprocessing.Pool is broken and sould not be used. So I now prefer to do the initialization stuff directly in the worker.
from multiprocessing import Pool
import contextlib, sys
_already_inited = False
def initializer():
global _already_inited
if _already_inited:
return
_already_inited = True
raise Exception("init failed")
def do_something(args):
initializer()
# main process
pool = Pool(1)
with contextlib.closing(pool):
pool.map_async(do_something, [1]).get(100)
Both the code and the stacktrace are simpler.
Off course all your worker function need to call initializer().
My initial solution was to defer the exception to the worker function.
from multiprocessing import Pool
import contextlib, sys
failed = None
def initializer():
try:
raise Exception("init failed")
except:
global failed
failed = sys.exc_info()[1]
def do_something(args):
global failed
if failed is not None:
raise RuntimeError(failed) from failed
# main process
pool = Pool(1, initializer=initializer)
with contextlib.closing(pool):
pool.map_async(do_something, [1]).get(100)
That way the caller still gets access to the exception.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/tmp/try.py", line 15, in do_something
raise RuntimeError(failed)
RuntimeError: init failed
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/tmp/try.py", line 20, in <module>
pool.map_async(do_something, [1]).get(100)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
RuntimeError: init failed
(venv) kmkaplan#dev1:~/src/options$ python3 /tmp/try.py
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/tmp/try.py", line 7, in initializer
raise Exception("init failed")
Exception: init failed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/tmp/try.py", line 15, in do_something
raise RuntimeError(failed) from failed
RuntimeError: init failed
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/tmp/try.py", line 20, in <module>
pool.map_async(do_something, [1]).get(100)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
RuntimeError: init failed

Push a custom string to current Python stack / traceback

How can I implement a context manager with the following API:
s = "this is my message"
with PushStackFrame(s):
raise RuntimeError("something")
such that when RuntimeError is raised, I get the following message:
Traceback (most recent call last):
File "foo.py", line 4, in <module>
## PushStackFrame: this is my message ## KEY LINE
File "foo.py", line 5, in <module>
raise RuntimeError("something")
RuntimeError: something
Most importantly, I want the string passed to PushStackFrame to be inserted verbatim into the stack trace, I don't want to see just the code.
One way to do this is to catch the exception on the way out of the context manager, figure out where in the traceback the context manager was called, and insert a new traceback frame, before rethrowing the exception with traceback. I'd prefer not to do this.

debugging errors in python multiprocessing

I'm using the Pool function of the multiprocessing module in order to run the same code in parallel on different data.
It turns out that on some data my code raises an exception, but the precise line in which this happens is not given:
Traceback (most recent call last):
File "my_wrapper_script.py", line 366, in <module>
main()
File "my_wrapper_script.py", line 343, in main
results = pool.map(process_function, folders)
File "/usr/lib64/python2.6/multiprocessing/pool.py", line 148, in map
return self.map_async(func, iterable, chunksize).get()
File "/usr/lib64/python2.6/multiprocessing/pool.py", line 422, in get
raise self._value
KeyError: 'some_key'
I am aware of multiprocessing.log_to_stderr() , but it seems that it is useful when concurrency issues arise, which is not my case.
Any ideas?
If you're using a new enough version of Python, you'll actually see the real exception get printed prior to that one. For example, here's a sample that fails:
import multiprocessing
def inner():
raise Exception("FAIL")
def f():
print("HI")
inner()
p = multiprocessing.Pool()
p.apply(f)
p.close()
p.join()
Here's the exception when running this with python 3.4:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/lib/python3.4/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "test.py", line 9, in f
inner()
File "test.py", line 4, in inner
raise Exception("FAIL")
Exception: FAIL
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "test.py", line 13, in <module>
p.apply(f)
File "/usr/local/lib/python3.4/multiprocessing/pool.py", line 253, in apply
return self.apply_async(func, args, kwds).get()
File "/usr/local/lib/python3.4/multiprocessing/pool.py", line 599, in get
raise self._value
Exception: FAIL
If using a newer version isn't an option, the easiest thing to do is to wrap your worker function in a try/except block that will print the exception prior to re-raising it:
import multiprocessing
import traceback
def inner():
raise Exception("FAIL")
def f():
try:
print("HI")
inner()
except Exception:
print("Exception in worker:")
traceback.print_exc()
raise
p = multiprocessing.Pool()
p.apply(f)
p.close()
p.join()
Output:
HI
Exception in worker:
Traceback (most recent call last):
File "test.py", line 11, in f
inner()
File "test.py", line 5, in inner
raise Exception("FAIL")
Exception: FAIL
Traceback (most recent call last):
File "test.py", line 18, in <module>
p.apply(f)
File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 244, in apply
return self.apply_async(func, args, kwds).get()
File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 558, in get
raise self._value
Exception: FAIL
You need to implement your own try/except block in the worker. Depending on how you want to organize your code, you could log to stderr as you mention above, log to some other place like a file, return some sort of error code or even tag the exception with the current traceback and re-raise. Here's an example of the last technique:
import traceback
import multiprocessing as mp
class MyError(Exception):
pass
def worker():
try:
# your real code here
raise MyError("boom")
except Exception, e:
e.traceback = traceback.format_exc()
raise
def main():
pool = mp.Pool()
try:
print "run worker"
result = pool.apply_async(worker)
result.get()
# handle exceptions you expect
except MyError, e:
print e.traceback
# re-raise the rest
except Exception, e:
print e.traceback
raise
if __name__=="__main__":
main()
It returns
run worker
Traceback (most recent call last):
File "doit.py", line 10, in worker
raise MyError("boom")
MyError: boom

Python print last traceback only?

Consider the following code and traceback:
>>> try:
... raise KeyboardInterrupt
... except KeyboardInterrupt:
... raise Exception
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
KeyboardInterrupt
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
Exception
>>>
I'd like to print only the most recent traceback (the one in which Exception was raised).
How can this be achieved?
From the above example, I'd like to print the following, as if raise Exception had been called outside the except clause.
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
Exception
The perfect question for me.
You can suppress the exception context, that is the first part of the traceback, by explicitly raising the exception from None:
>>> try:
raise KeyboardInterrupt
except:
raise Exception from None
Traceback (most recent call last):
File "<pyshell#4>", line 4, in <module>
raise Exception from None
Exception
This was formalized in PEP 409 and further improved in PEP 415. The original bug request for this was filed by myself btw.
Note that suppressing the context will not actually remove the context from the new exception. So you can still access the original exception:
try:
try:
raise Exception('inner')
except:
raise Exception('outer') from None
except Exception as e:
print(e.__context__) # inner

Custom python traceback or debug output

I have a traceback print and want to customize the last part of it.
What: The error occurred in another process and traceback lies there (as is the case in multiprocessing).
Problem: I want to have the full traceback and error report.
Similar to this code:
>>> def f():
g()
>>> def g():
raise Exception, Exception(), None ## my traceback here
>>> f()
Traceback (most recent call last):
File "<pyshell#14>", line 1, in <module>
f()
File "<pyshell#8>", line 2, in f
g()
File "<pyshell#11>", line 2, in g
raise Exception, Exception(), None ## my traceback starts here
my traceback appears here
my traceback appears here
Exception
Impossible "Solutions": subclass and mock-object
>>> from types import *
>>> class CostomTB(TracebackType):
pass
Traceback (most recent call last):
File "<pyshell#125>", line 1, in <module>
class CostomTB(TracebackType):
TypeError: Error when calling the metaclass bases
type 'traceback' is not an acceptable base type
>>> class CostomTB(object):
pass
>>> try: zzzzzzzzz
except NameError:
import sys
ty, err, tb = sys.exc_info()
raise ty, err, CostomTB()
Traceback (most recent call last):
File "<pyshell#133>", line 5, in <module>
raise ty, err, CostomTB()
TypeError: raise: arg 3 must be a traceback or None
I am using python 2.7.
I guess you want the full traceback stack. See this which is having very good examples python logging module.
If some confusion comes See the logging documentation.
You mentioned a separate process: if your problem is to capture the traceback in process A and show it in process B, as if the exception was actually raised in the latter, then I'm afraid there is no clean way to do it.
I would suggest to serialize the traceback in process A, send it to process B and from there raise a new exception that includes the former in its description. The result is a somewhat longer output, but it carries information about both processes stacks.
In the following example there aren't really two separate processes, but I hope it makes my point clearer:
import traceback, StringI
def functionInProcessA():
raise Exception('Something happened in A')
class RemoteException(Exception):
def __init__(self, tb):
Exception.__init__(self, "Remote traceback:\n\n%s" % tb)
def controlProcessB():
try:
functionInProcessA()
except:
fd = StringIO.StringIO()
traceback.print_exc(file=fd)
tb = fd.getvalue()
raise RemoteException(tb)
if __name__ == '__main__':
controlProcessB()
Output:
Traceback (most recent call last):
File "a.py", line 20, in <module>
controlProcessB()
File "a.py", line 17, in controlProcessB
raise RemoteException(tb)
__main__.RemoteException: Remote traceback:
Traceback (most recent call last):
File "a.py", line 12, in controlProcessB
functionInProcessA()
File "a.py", line 4, in functionInProcessA
raise Exception('Something happened in A')
Exception: Something happened in A

Categories