I'm using a custom timeout exception to get around iter(subprocess.Popen.stdout.readline,'') blocking when there is no more output to read, but the exception isn't being caught properly. The code has both a main process and a separate child process (implemented with multiprocessing.Process), and timeouts can happen in either. The relevant sections are:
class Timeout(Exception):
    def __init__(self, message):
        self.message = message

def handle_timeout(signal, frame):
    raise Timeout("Timed out")
This custom exception is caught just fine in the main process, but in the child process, whenever the Timeout is raised, it is never caught despite using (I believe) the appropriate standard syntax:
from subprocess import Popen, PIPE
subProc = Popen(('tail', '-f', fileName), stdout=PIPE, stderr=PIPE, shell=False, close_fds=True)
lines = iter(subProc.stdout.readline,'')
for line in lines:
    try:
        process_line(line)
    except Timeout as time_out:
        print(time_out.message)
        subProc.terminate()
        break
Instead of printing the timeout message and terminating subProc, I get the following output:
Traceback (most recent call last):
File "/home/username/anaconda2/envs/Py2.7/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
self.run()
File "reader.py", line 50, in run
for line in lines:
File "reader.py", line 13, in handle_timeout
raise Timeout("Timed out")
Timeout
handle_timeout appears to be working fine since the timeout is being raised, but the exception handling is being ignored or skipped. Am I doing anything wrong syntax-wise, or do I need to define a separate custom exception, presumably within the child process?
Edit:
The second code block before was incomplete. Here it is, as it currently exists (with chepner's advice on the irrelevance of iter(stdout.readline,'') included):
import signal
from subprocess import Popen, PIPE
signal.signal(signal.SIGALRM, handle_timeout)
subProc = Popen(('tail', '-f', fileName), stdout=PIPE, stderr=PIPE, shell=False, close_fds=True)
for line in subProc.stdout:
    signal.alarm(CHILD_TIMEOUT)
    try:
        process_line(line)
    except Timeout as time_out:
        print(time_out.message)
        subProc.terminate()
        break
In the parent process (where the timeout exception works exactly as desired), the format is:
# signal masking as in last block
while True:
    try:
        signal.alarm(MASTER_TIMEOUT) # different from CHILD_TIMEOUT
        other_processing()
    except Timeout:
        shutDown(children) # method to shut down child processes
        break
SOLVED:
I've found a solution.
subProc = Popen(('tail', '-f', fileName), stdout=PIPE, stderr=PIPE, shell=False, close_fds=True)
while not exit.is_set(): # exit is defined by multiprocessing.Event()
    signal.alarm(3)
    try:
        for line in subProc.stdout:
            process_line(line)
    except Timeout:
        print("Process timed out while waiting for log output")
        subProc.terminate()
        exit.set()
Now when the alarm goes off, the timeout exception is raised and caught as it should be, ending the subprocess before triggering the exit condition, after which the child process shuts down gracefully.
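The traceback earlier shows the Timeout surfacing at the `for line in lines:` statement, which sits outside the try block, so the fix amounts to putting the blocking read itself inside the try. Below is a minimal, self-contained sketch of that arrangement. It assumes a POSIX system and Python 3, uses `signal.setitimer` rather than `signal.alarm` only to keep the demo short, and replaces `tail -f` with a hypothetical child (`quiet_child`) that prints one line and then goes silent.

```python
import signal
import subprocess
import sys


class Timeout(Exception):
    pass


def handle_timeout(signum, frame):
    raise Timeout("Timed out")


signal.signal(signal.SIGALRM, handle_timeout)

# Hypothetical stand-in for `tail -f`: prints one line, then produces no more output.
quiet_child = [
    sys.executable,
    "-c",
    "import time; print('one line', flush=True); time.sleep(30)",
]
proc = subprocess.Popen(quiet_child, stdout=subprocess.PIPE)

lines_seen = []
timed_out = False
signal.setitimer(signal.ITIMER_REAL, 1.0)  # fire SIGALRM once, after one second
try:
    # The blocking read happens in the for statement, so it must be inside the try.
    for line in proc.stdout:
        lines_seen.append(line)
except Timeout:
    timed_out = True
finally:
    signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer if it is still pending
    proc.terminate()
    proc.wait()
```

Because the handler raises in the main thread while it is blocked reading from the pipe, the exception propagates out of the for statement and into the enclosing except clause.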
You can't actually trap an error inside a subprocess the way you're writing your code. What you think of as error handling, using an event to catch it or what not, is actually a subprocess being launched, executing your code, and managing the response. Since you are using Popen to manually control the subprocess, you need to manually process its response.
When your subprocess ends it should return 0. If it returns -1 or 1, that means an error has occurred and you need to read from stderr to capture the error.
Edit1
I see your problem. The way you have it written, the handler handle_timeout will grab the error and re-raise it every time. You can't handle an exception in multiple places. As it is, you have two separate functions trying to handle the same error concurrently. This will always produce a conflict, and the first one that catches the error will cause your main process to exit. You can do a couple of different things here, but let me implore you: do not eat an error for no reason.
fix 1:
Remove your error handler
def handle_timeout(signal, frame):
    raise Timeout("Timed out")
fix 2:
try:
    process_line(line)
finally:
    subProc.terminate()
The above will guarantee the termination of the subprocess without eating an error. Also, catching the error with a custom handler like your handle_timeout is a technique almost exclusively used to deconstruct a complex run or object before re-raising the error. It's basically a last-ditch solution for when you have A LOT of cleanup after a particular error. If you want to do that, do not use an except block.
Related
Suppose we have two files, namely mymanager.py and mysub.py.
mymanager.py
import time
from multiprocessing import Process
import mysub # the process file

def main():
    xprocess = Process(
        target=mysub.main,
    )
    xprocess.start()
    xprocess.join()
    time.sleep(1)
    print(f"== Done, errorcode is {xprocess.exitcode} ==")

if __name__ == '__main__':
    main()
mysub.py
import sys

def myexception(exc_type, exc_value, exc_traceback):
    print("I want this to be printed!")
    print("Uncaught exception", exc_type, exc_value, exc_traceback)

def main():
    sys.excepthook = myexception # !!!
    raise ValueError()

if __name__ == "__main__":
    sys.exit()
When executing mymanager.py the resulting output is:
Process Process-1:
Traceback (most recent call last):
File "c:\program files\python\3.9\lib\multiprocessing\process.py", line 315, in _bootstrap
self.run()
File "c:\program files\python\3.9\lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\lx\mysub.py", line 11, in main
raise ValueError()
ValueError
== Done, errorcode is 1 ==
The output I expected would be something like:
I want this to be printed!
Uncaught exception <class 'ValueError'> <traceback object at 0x0000027B6F952780>
which is what I get if I execute main from mysub.py without the multiprocessing.Process.
I've checked the underlying cpython (reference) and the problem seems to be that the try-except in the _bootstrap function takes precedence over my child process's sys.excepthook. But from my understanding, shouldn't the excepthook from the child process fire first and then trigger the except from _bootstrap?
I need the child process to handle the exception using the sys.excepthook function.
How can I achieve that?
sys.excepthook is invoked when an exception goes uncaught (bubbling all the way out of the running program). But Process objects run their target function in a special bootstrap function (BaseProcess._bootstrap if it matters to you) that intentionally catches all exceptions, prints information about the failing process plus the traceback, then returns an exit code to the caller (a launcher that varies by start method).
When using the fork start method, the caller of _bootstrap then exits the worker with os._exit(code) (a "hard exit" command which bypasses the normal exception handling system, though since your exception was already caught and handled this hardly matters). When using 'spawn', it uses plain sys.exit over os._exit, but AFAICT the SystemExit exception that sys.exit is implemented in terms of is special cased in the interpreter so it doesn't pass through sys.excepthook when uncaught (presumably because it being implemented via exceptions is considered an implementation detail; when you ask to exit the program it's not the same as dying with an unexpected exception).
Summarizing: No matter the start method, there is no possible way for an exception raised by your code to be "unhandled" (for the purposes of reaching sys.excepthook), because multiprocessing handles all exceptions your function can throw on its own. It's theoretically possible to have an excepthook you set in the worker execute for exceptions raised after your target completes if the multiprocessing wrapper code itself raises an exception, but only if you do pathological things like replace the definition of os._exit or sys.exit (and it would only report the horrible things that happened because you replaced them, your own exception was already swallowed by that point, so don't do that).
If you really want to do this, the closest you could get would be to explicitly catch exceptions and manually call your handler. A simple wrapper function would allow this for instance:
import sys

def handle_exceptions_with(excepthook, target, /, *args, **kwargs):
    try:
        target(*args, **kwargs)
    except:
        excepthook(*sys.exc_info())
        raise # Or maybe convert to sys.exit(1) if you don't want multiprocessing to print it again
changing your Process launch to:
xprocess = Process(
    target=handle_exceptions_with,
    args=(mysub.myexception, mysub.main)
)
Or for one-off use, just be lazy and only rewrite mysub.main as:
def main():
    try:
        raise ValueError()
    except:
        myexception(*sys.exc_info())
        raise # Or maybe convert to sys.exit(1) if you don't want multiprocessing to print it again
and leave everything else untouched. You could still set your handler in sys.excepthook and/or threading.excepthook() (to handle cases where a thread launched in the worker process might die with an unhandled exception), but it won't apply to the main thread of the worker process (or more precisely, there's no way for an exception to reach it).
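As a small illustration of the threading side of that remark, the sketch below installs a `threading.excepthook` (available from Python 3.8) and lets a secondary thread die with an unhandled exception. The names `seen`, `thread_hook`, and `failing_task` are hypothetical and exist only for the demonstration; the same hook could be installed inside the worker process before it starts its threads.

```python
import threading

seen = []  # hypothetical record of what the hook received


def thread_hook(args):
    # args carries exc_type, exc_value, exc_traceback and thread attributes.
    seen.append((args.exc_type, str(args.exc_value), args.thread.name))


threading.excepthook = thread_hook


def failing_task():
    raise ValueError("boom")


worker = threading.Thread(target=failing_task, name="worker-1")
worker.start()
worker.join()  # by the time join() returns, the hook has already run
```

The hook fires here because nothing between the thread's entry point and the interpreter catches the exception, which is exactly what is missing for the main thread of a multiprocessing worker.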
I am using subprocess to execute a python script. I can successfully run the script.
import subprocess
import time
proc = subprocess.Popen(['python', 'test.py', ''], shell=True)
proc.communicate()
time.sleep(10)
proc.terminate()
test.py
import time
while True:
    print("I am inside test.py")
    time.sleep(1)
I can see the message I am inside test.py printed every second. However, I am not able to terminate the script while it is still running by proc.terminate().
Could someone kindly help me out?
proc.communicate waits for the process to finish unless you include a timeout. Your process doesn't exit and you don't have a timeout, so communicate never returns. You could verify that with a print after the call. Instead, add a timeout and catch its exception:
import subprocess
import time
proc = subprocess.Popen(['python', 'test.py', ''], shell=True)
try:
    proc.communicate(timeout=10)
except subprocess.TimeoutExpired:
    proc.terminate()
First things first: don't pass the arguments to subprocess.Popen() as a list if you're using shell=True. Change the command to a string, "python test.py".
Popen.communicate(input=None, timeout=None) is a blocking method: it interacts with the process, waits for the process to terminate, and sets the returncode attribute.
Since your test.py runs an infinite while loop, communicate() will never return!
You have 2 options to time out the process proc that you have spawned:
Pass the timeout keyword argument to the communicate method, e.g. communicate(timeout=5) to give the process 5 seconds. If the process proc does not terminate after timeout seconds, a TimeoutExpired exception will be raised. Catching this exception and retrying communication will not lose any output (in your case you don't need the child's output, but I include this in the example below). ATTENTION: the child process is not killed if the timeout expires, so in order to clean up properly a well-behaved application should kill the child process (proc) and finish communication.
Use the poll method and do the timing yourself in the calling code.
communicate with timeout
from subprocess import TimeoutExpired
try:
    outs, errs = proc.communicate(timeout=15)
except TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
poll with time.sleep
proc = subprocess.Popen('python test.py', shell=True)
t=10
while proc.poll() is None and t >= 0:
    print('Still sleeping')
    time.sleep(1)
    t -= 1
proc.kill()
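For a version of the communicate-with-timeout option that can be run as-is, here is a minimal sketch. It assumes Python 3 and swaps test.py for a hypothetical inline child (`endless_child`) started with `sys.executable`, so it does not depend on any file on disk; the one-second timeout is arbitrary.

```python
import subprocess
import sys

# Hypothetical stand-in for test.py: a child that never exits on its own.
endless_child = [sys.executable, "-c", "import time\nwhile True: time.sleep(1)"]

proc = subprocess.Popen(endless_child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
try:
    outs, errs = proc.communicate(timeout=1)
    timed_out = False
except subprocess.TimeoutExpired:
    proc.kill()                      # the timeout alone does not stop the child
    outs, errs = proc.communicate()  # collect remaining output and reap the child
    timed_out = True
```

The second communicate() call matters: it waits for the killed child to be reaped, so no zombie process is left behind and `proc.returncode` is set.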
Hi, I embedded a time constraint in my Python code, which runs a Fortran program through a function. However, I realized that the function that puts a time constraint on the other function doesn't terminate the program; it just leaves it running in the background and skips it. What I want is to terminate the program when it exceeds the time limit. Here is the code that I'm using to constrain the time, which is taken from here.
def timeout(func, args=(), kwargs={}, timeout_duration=15, default=1):
    import signal

    class TimeoutError(Exception):
        pass

    def handler(signum, frame):
        raise TimeoutError()

    # set the timeout handler
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout_duration)
    try:
        result = func(*args, **kwargs)
    except TimeoutError as exc:
        result = default
    finally:
        signal.alarm(0)
    return result
I looked up popen.terminate(), sys.exit() and atexit.register() but couldn't figure out how they would work with this piece of code. The place where I tried to add the termination is marked with a comment below.
...
    result = func(*args, **kwargs)
except TimeoutError as exc:
    result = default
    #add the terminator
finally:
...
NOTE: The function is inside a chain of for loops, so I don't want to kill the whole Python session. I just want to kill the program that this function runs, which is a Fortran program, and skip to the next element in the for loop chain.
Part below added after some comments and answers:
I tried to add SIGTERM with popen.terminate(); however, it terminated the whole Python session, whereas I just want to terminate the currently running program and skip to the next element in the for loop. What I did is as follows:
...
signal.signal(signal.SIGTERM, handler)
signal.alarm(timeout_duration)
try:
    result = func(*args, **kwargs)
except TimeoutError as exc:
    result = default
    popen.terminate()
...
You cannot expect the signal handler raising an Exception to get propagated through the call stack; it's invoked in a different context.
popen.terminate() will generate SIGTERM on POSIX systems, so you should have a signal handler for SIGTERM and not SIGALRM.
Instead of raising an exception in your signal handler, you should set some variable that you periodically check in order to halt activity.
Alternatively, note that if you don't have a signal handler for SIGTERM, the default action terminates the Python process. It is SIGINT, not SIGTERM, that Python turns into a KeyboardInterrupt exception by default.
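The "set a variable and check it periodically" suggestion can be sketched as follows. This is only an illustration under stated assumptions: a POSIX system, Python 3.5 or later (where `time.sleep` resumes after a handler that does not raise), and a timer signal standing in for whatever signal the real program receives. The names `stop_requested`, `request_stop`, and `work_until_stopped` are hypothetical.

```python
import signal
import time

stop_requested = False  # set by the handler, polled by the work loop


def request_stop(signum, frame):
    """Signal handler that only records that the signal arrived."""
    global stop_requested
    stop_requested = True


def work_until_stopped(max_steps=1000, step_seconds=0.01):
    """Do small units of work, checking the flag between them."""
    steps = 0
    while not stop_requested and steps < max_steps:
        time.sleep(step_seconds)  # stand-in for one unit of real work
        steps += 1
    return steps


signal.signal(signal.SIGALRM, request_stop)
signal.setitimer(signal.ITIMER_REAL, 0.2)  # deliver SIGALRM once, after 0.2 seconds
steps = work_until_stopped()
signal.setitimer(signal.ITIMER_REAL, 0)    # make sure no timer is left pending
```

The trade-off is latency: the loop only notices the request between units of work, so a single long blocking call (such as waiting on an external program) still needs its own timeout.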
My understanding is that finally clauses must *always* be executed if the try has been entered.
import random
from multiprocessing import Pool
from time import sleep

def Process(x):
    try:
        print x
        sleep(random.random())
        raise Exception('Exception: ' + x)
    finally:
        print 'Finally: ' + x

Pool(3).map(Process, ['1','2','3'])
Expected output is that for each of x which is printed on its own by line 8, there must be an occurrence of 'Finally x'.
Example output:
$ python bug.py
1
2
3
Finally: 2
Traceback (most recent call last):
File "bug.py", line 14, in <module>
Pool(3).map(Process, ['1','2','3'])
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 225, in map
return self.map_async(func, iterable, chunksize).get()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 522, in get
raise self._value
Exception: Exception: 2
It seems that an exception terminating one process terminates the parent and sibling processes, even though there is further work required to be done in other processes.
Why am I wrong? Why is this correct? If this is correct, how should one safely clean up resources in multiprocess Python?
Short answer: SIGTERM trumps finally.
Long answer: Turn on logging with mp.log_to_stderr():
import random
import multiprocessing as mp
import time
import logging

logger=mp.log_to_stderr(logging.DEBUG)

def Process(x):
    try:
        logger.info(x)
        time.sleep(random.random())
        raise Exception('Exception: ' + x)
    finally:
        logger.info('Finally: ' + x)

result=mp.Pool(3).map(Process, ['1','2','3'])
The logging output includes:
[DEBUG/MainProcess] terminating workers
Which corresponds to this code in multiprocessing.pool._terminate_pool:
if pool and hasattr(pool[0], 'terminate'):
    debug('terminating workers')
    for p in pool:
        p.terminate()
Each p in pool is a multiprocessing.Process, and calling terminate (at least on non-Windows machines) sends SIGTERM:
from multiprocessing/forking.py:
class Popen(object):
    def terminate(self):
        ...
        try:
            os.kill(self.pid, signal.SIGTERM)
        except OSError, e:
            if self.wait(timeout=0.1) is None:
                raise
So it comes down to what happens when a Python process in a try suite is sent a SIGTERM.
Consider the following example (test.py):
import time

def worker():
    try:
        time.sleep(100)
    finally:
        print('enter finally')
        time.sleep(2)
        print('exit finally')

worker()
If you run it, then send it a SIGTERM, then the process ends immediately, without entering the finally suite, as evidenced by no output, and no delay.
In one terminal:
% test.py
In second terminal:
% pkill -TERM -f "test.py"
Result in first terminal:
Terminated
Compare that with what happens when the process is sent a SIGINT (C-c):
In second terminal:
% pkill -INT -f "test.py"
Result in first terminal:
enter finally
exit finally
Traceback (most recent call last):
File "/home/unutbu/pybin/test.py", line 14, in <module>
worker()
File "/home/unutbu/pybin/test.py", line 8, in worker
time.sleep(100)
KeyboardInterrupt
Conclusion: SIGTERM trumps finally.
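One common way to let cleanup run anyway, which is not part of the answer above and is offered only as a sketch, is to install a SIGTERM handler that raises SystemExit so the stack unwinds normally. The example assumes a POSIX system and Python 3.7 or later; `CHILD_SOURCE` and `on_sigterm` are hypothetical names, and the child is launched with `subprocess` purely so the demo fits in one file.

```python
import subprocess
import sys
import textwrap

# Source of a hypothetical child that converts SIGTERM into SystemExit.
CHILD_SOURCE = textwrap.dedent("""
    import signal, sys, time

    def on_sigterm(signum, frame):
        # Raising SystemExit lets finally suites run while the stack unwinds.
        sys.exit(1)

    signal.signal(signal.SIGTERM, on_sigterm)
    try:
        print("ready", flush=True)
        time.sleep(100)
    finally:
        print("finally ran", flush=True)
""")

child = subprocess.Popen(
    [sys.executable, "-c", CHILD_SOURCE], stdout=subprocess.PIPE, text=True
)
child.stdout.readline()                # wait until the handler is installed ("ready")
child.terminate()                      # sends SIGTERM on POSIX
remaining_output = child.stdout.read() # everything printed after "ready"
child.wait()
```

Without the handler the child would die immediately, as in the test.py experiment above; with it, the finally suite gets a chance to run before the process exits with status 1.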
The answer from unutbu definitely explains why you get the behavior you observe. However, it should be emphasized that SIGTERM is sent only because of how multiprocessing.pool._terminate_pool is implemented. If you can avoid using Pool, then you can get the behavior you desire. Here is a borrowed example:
from multiprocessing import Process
from time import sleep
import random

def f(x):
    try:
        sleep(random.random()*10)
        raise Exception
    except:
        print "Caught exception in process:", x
        # Make this last longer than the except clause in main.
        sleep(3)
    finally:
        print "Cleaning up process:", x

if __name__ == '__main__':
    processes = []
    for i in range(4):
        p = Process(target=f, args=(i,))
        p.start()
        processes.append(p)
    try:
        for process in processes:
            process.join()
    except:
        print "Caught exception in main."
    finally:
        print "Cleaning up main."
After a SIGINT is sent, example output is:
Caught exception in process: 0
^C
Cleaning up process: 0
Caught exception in main.
Cleaning up main.
Caught exception in process: 1
Caught exception in process: 2
Caught exception in process: 3
Cleaning up process: 1
Cleaning up process: 2
Cleaning up process: 3
Note that the finally clause is run for all processes. If you need shared memory, consider using Queue, Pipe, Manager, or some external store like redis or sqlite3.
finally re-raises the original exception unless you return from it. The exception is then raised by Pool.map and kills your entire application. The subprocesses are terminated and you see no other exceptions.
You can add a return to swallow the exception:
def Process(x):
    try:
        print x
        sleep(random.random())
        raise Exception('Exception: ' + x)
    finally:
        print 'Finally: ' + x
        return
Then you should have None in your map result when an exception occurred.
I need to do the following in Python. I want to spawn a process (subprocess module?), and:
if the process ends normally, to continue exactly from the moment it terminates;
if, otherwise, the process "gets stuck" and doesn't terminate within (say) one hour, to kill it and continue (possibly giving it another try, in a loop).
What is the most elegant way to accomplish this?
The subprocess module will be your friend. Start the process to get a Popen object, then pass it to a function like this. Note that this only raises an exception on timeout. If desired, you can catch the exception and call the kill() method on the Popen object. (kill is new in Python 2.6, btw)
import time

def wait_timeout(proc, seconds):
    """Wait for a process to finish, or raise exception after timeout"""
    start = time.time()
    end = start + seconds
    interval = min(seconds / 1000.0, .25)
    while True:
        result = proc.poll()
        if result is not None:
            return result
        if time.time() >= end:
            raise RuntimeError("Process timed out")
        time.sleep(interval)
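The "kill it and possibly give it another try, in a loop" part of the question can be sketched as below. This is not the function above: it assumes Python 3.3 or later and swaps the hand-written polling loop for the built-in `Popen.wait(timeout=...)`. The helper name `run_with_retries` and the two demo commands are hypothetical.

```python
import subprocess
import sys


def run_with_retries(cmd, timeout_seconds, attempts):
    """Run cmd up to `attempts` times; return its exit code, or None if every try timed out."""
    for _ in range(attempts):
        proc = subprocess.Popen(cmd)
        try:
            return proc.wait(timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # stop the stuck process
            proc.wait()  # reap it before trying again
    return None


# Hypothetical commands: one that exits at once, one that hangs.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

quick_result = run_with_retries(quick, timeout_seconds=30, attempts=2)
stuck_result = run_with_retries(stuck, timeout_seconds=0.5, attempts=2)
```

Returning None for "never finished" keeps the caller's loop simple; raising an exception after the last attempt would be an equally reasonable design.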
There are at least 2 ways to do this by using psutil as long as you know the process PID.
Assuming the process is created as such:
import subprocess
subp = subprocess.Popen(['progname'])
...you can check its creation time in a polling loop like this:
import psutil, time

TIMEOUT = 60 * 60 # 1 hour
p = psutil.Process(subp.pid)
while subp.poll() is None: # stop looping once the process has exited
    if (time.time() - p.create_time()) > TIMEOUT:
        p.kill()
        raise RuntimeError('timeout')
    time.sleep(5)
...or simply, you can do this:
import psutil

p = psutil.Process(subp.pid)
try:
    p.wait(timeout=60*60)
except psutil.TimeoutExpired:
    p.kill()
    raise
Also, while you're at it, you might be interested in the following extra APIs:
>>> p.status()
'running'
>>> p.is_running()
True
>>>
I had a similar question and found this answer. Just for completeness, I want to add one more way to terminate a hanging process after a given amount of time: the Python signal library
https://docs.python.org/2/library/signal.html
From the documentation:
import signal, os

def handler(signum, frame):
    print 'Signal handler called with signal', signum
    raise IOError("Couldn't open device!")

# Set the signal handler and a 5-second alarm
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
# This open() may hang indefinitely
fd = os.open('/dev/ttyS0', os.O_RDWR)
signal.alarm(0) # Disable the alarm
Since you wanted to spawn a new process anyway, this might not be the best solution for your problem, though.
Another nice, passive way is to use a threading.Timer and set up a callback function.
import subprocess
from threading import Timer

# execute the command
p = subprocess.Popen(command)
# save the proc object - either make this part of a class (like the example), or 'p' can be global
self.p = p
# config and init timer
# kill_proc is a callback function which can also be added to the class or simply be a global
t = Timer(seconds, self.kill_proc)
# start timer
t.start()
# wait for the test process to return
rcode = p.wait()
t.cancel()
If the process finishes in time, wait() returns and the code continues here; cancel() stops the timer. If the timer runs out in the meantime and executes kill_proc in a separate thread, wait() will also return and cancel() will do nothing. From the value of rcode you will know whether the process timed out or not. Simplest kill_proc (you can of course do anything extra there):
import os, signal

def kill_proc(self):
    # os.kill expects a PID, not the Popen object itself
    os.kill(self.p.pid, signal.SIGTERM)
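The same Timer idea, written as a standalone function rather than as methods on a class, might look like the sketch below. It assumes Python 3; `run_with_timer` and the demo commands are hypothetical, and `Popen.kill()` is used in place of `os.kill` so the callback does not need the PID.

```python
import subprocess
import sys
from threading import Timer


def run_with_timer(cmd, seconds):
    """Run cmd; kill it from a timer thread if it outlives `seconds`."""
    proc = subprocess.Popen(cmd)
    timed_out = []  # the callback appends here so the caller can tell what happened

    def kill_proc():
        timed_out.append(True)
        proc.kill()

    timer = Timer(seconds, kill_proc)
    timer.start()
    try:
        return_code = proc.wait()
    finally:
        timer.cancel()  # harmless if the timer has already fired
    return return_code, bool(timed_out)


# Hypothetical commands: one that hangs, one that exits at once.
slow_code, slow_timed_out = run_with_timer(
    [sys.executable, "-c", "import time; time.sleep(30)"], 0.5
)
fast_code, fast_timed_out = run_with_timer([sys.executable, "-c", "pass"], 30)
```

Recording the timeout explicitly avoids having to infer it from the return code, whose value after a kill differs between platforms.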
Kudos to Peter Shinners for his nice suggestion about the subprocess module. I was using exec() before and did not have any control over the running time, and especially over terminating it. My simplest template for this kind of task is the following: I just use the timeout parameter of the subprocess.run() function to monitor the running time. Of course you can capture standard output and standard error as well if needed:
from subprocess import run, TimeoutExpired, CalledProcessError

for file in fls:
    try:
        run(["python3.7", file], check=True, timeout=7200) # 2 hours timeout
        print("scraped :)", file)
    except TimeoutExpired:
        message = "Timeout :( !!!"
        print(message, file)
        f.write("{message} {file}\n".format(file=file, message=message))
    except CalledProcessError:
        message = "SOMETHING HAPPENED :( !!!, CHECK"
        print(message, file)
        f.write("{message} {file}\n".format(file=file, message=message))