If a process is running and, for example, the user accidentally terminates it via the Task Manager, or the machine reboots (thus forcefully terminating processes), how can I register for such an event so that the process executes some task before terminating completely?
What I've tried unsuccessfully is:
from signal import signal
from signal import SIGTERM

def foo(signum, frame):
    # signal handlers receive the signal number and the current stack frame
    print('hello world')

if __name__ == '__main__':
    signal(SIGTERM, foo)
    while True:
        pass
I run this from the command line, then open the Task Manager and end the task, but foo is never called.
Based on the answer to "Can I handle the killing of my Windows process through the Task Manager?", it seems the Task Manager kills processes with an unmaskable signal (the equivalent of Linux's SIGKILL). This means that you cannot catch it.
There are other signals and console events you can handle on Windows, such as SIGBREAK, CTRL_C_EVENT and CTRL_BREAK_EVENT, but I guess the Task Manager does not use any of those when terminating a process.
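For reference, handling one of those looks the same as the SIGTERM attempt above; here is a minimal sketch for SIGBREAK (an illustrative assumption: it only fires for console events such as Ctrl+Break, and the Task Manager will still bypass it):

import signal

def on_break(signum, frame):
    # Runs when this console process receives a Ctrl+Break event
    print('caught SIGBREAK, cleaning up')

if __name__ == '__main__':
    # signal.SIGBREAK only exists on Windows
    signal.signal(signal.SIGBREAK, on_break)
    while True:
        pass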
I'm using Celery to distribute multiple long running tasks. The celery worker is launched inside a container and scaled using docker-compose.
My goal is to handle a clean shutdown on the reception of a signal like SIGINT, where a running task would be able to execute some code and quit without waiting for the end of its execution.
An example would be the following :
@app.task(ignore_result=True)
def test():
    try:
        # Long running code
        import time
        time.sleep(1000)
    except SystemExit:
        # Store Progression
        print('Exit !!!!!!!!!')
When I'm not using Celery, I'm able to catch a signal such as SIGTERM and raise a SystemExit with the following code:
import signal
import sys

def sigterm_handler(_signo, _stack_frame):
    # Raises SystemExit(0):
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
With Celery I'm only able to catch the signal at the worker level, but I'm left with no way to share this signal with a running task.
Other answers indicate the need to use the worker_shutting_down signal, but I still have the same issue of communicating with the running tasks.
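As an illustration of what that could look like, here is a minimal, hedged sketch that connects worker_shutting_down to a module-level flag which the task polls. The broker URL and names are assumptions, and it presumes a solo or threads worker pool; with the default prefork pool the flag would live in the parent process and would not be visible to the task processes:

import threading
import time
from celery import Celery
from celery.signals import worker_shutting_down

app = Celery('tasks', broker='redis://localhost')  # illustrative broker

shutdown_requested = threading.Event()  # hypothetical shared flag

@worker_shutting_down.connect
def on_worker_shutting_down(sig, how, exitcode, **kwargs):
    # Called in the worker when it starts shutting down (e.g. on SIGTERM)
    shutdown_requested.set()

@app.task(ignore_result=True)
def test():
    for _ in range(1000):
        if shutdown_requested.is_set():
            # Store progression, then return early
            print('Exit !!!!!!!!!')
            return
        time.sleep(1)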
I am creating a custom job scheduler with a web frontend in Python 3.4 on Linux. This program creates a daemon (consumer) thread that waits for jobs to become available in a PriorityQueue. These jobs can be added manually through the web interface, which puts them on the queue. When the consumer thread finds a job, it executes a program using subprocess.run and waits for it to finish.
The basic idea of the worker thread:
import subprocess
import threading

class Worker(threading.Thread):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue
        # more code here

    def run(self):
        while True:
            try:
                job = self.queue.get()
                # do some work
                proc = subprocess.run("myprogram", timeout=my_timeout)
                # do some more things
            except subprocess.TimeoutExpired:
                # do some administration
                self.queue.put(job)
However:
- This consumer should be able to receive some kind of signal from the frontend (main thread) that it should stop the current job and instead work on the next job in the queue (saving the state of the current job and adding it to the end of the queue again). This can (and most likely will) happen while blocked on subprocess.run().
- The subprocesses can simply be killed (the program that is executed saves some state in a file), but the worker thread needs to do some administration on the killed job to make sure it can be resumed later on.
- There can be multiple such worker threads.
- Signal handlers are not an option (since they are always handled by the main thread, which is a webserver and should not be bothered with this).
- Having an event loop in which the process actively polls for events (such as the child exiting, the timeout occurring or the interrupt event) is, in this context, not really a solution but an ugly hack. The jobs are performance-heavy and constant context switches are unwanted.
What synchronization primitives should I use to interrupt this thread or to make sure it waits for several events at the same time in a blocking fashion?
I think you've accidentally glossed over a simple solution: your second bullet point says that you can kill the programs that are running in subprocesses. Notice that subprocess.call returns the return code of the subprocess. This means you can let the main thread kill the subprocess and just check the return code to see whether you need to do any cleanup. Even better, you could use subprocess.check_call instead, which raises an exception for you if the return code isn't 0. I don't know what platform you're working on, but on Linux, killed processes generally don't exit with a return code of 0.
It could look something like this:
class Worker(threading.Thread):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue
        # more code here

    def run(self):
        while True:
            try:
                job = self.queue.get()
                # do some work
                subprocess.check_call("myprogram", timeout=my_timeout)
                # do some more things
            except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
                # do some administration
                self.queue.put(job)
Note that if you're using Python 3.5, you can use subprocess.run instead, and set the check argument to True.
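For example, with Python 3.5+ the except clause above stays the same and the call becomes something like this sketch (the timeout value is illustrative):

import subprocess

my_timeout = 60  # illustrative value
# check=True raises subprocess.CalledProcessError on a non-zero exit status,
# and the timeout raises subprocess.TimeoutExpired if it expires
subprocess.run("myprogram", timeout=my_timeout, check=True)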
If you have a strong need to handle the cases where the worker needs to be interrupted when it isn't running the subprocess, then I think you're going to have to use a polling loop, because I don't think the behavior you're looking for is supported for threads in Python. You can use a threading.Event object to pass the "stop working now" pseudo-signal from your main thread to the worker, and have the worker periodically check the state of that event object.
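A rough sketch of that Event-based approach (the stop_event name and the one-second poll interval are assumptions for illustration):

import subprocess
import threading

class Worker(threading.Thread):
    def __init__(self, queue, stop_event):
        super().__init__()
        self.queue = queue
        self.stop_event = stop_event  # set by the main thread to request an interrupt

    def run(self):
        while True:
            job = self.queue.get()
            proc = subprocess.Popen("myprogram")
            while True:
                if self.stop_event.is_set():
                    proc.kill()            # stop the child process
                    proc.wait()
                    self.queue.put(job)    # requeue the job so it can resume later
                    self.stop_event.clear()
                    break
                try:
                    proc.wait(timeout=1)   # poll the child once per second
                    break
                except subprocess.TimeoutExpired:
                    continue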
If you're willing to consider using multiple processes instead of threads, consider switching over to the multiprocessing module, which would allow you to handle signals. There is more overhead to spawning full-blown subprocesses instead of threads, but you're essentially looking for signal-like asynchronous behavior, and I don't think Python's threading library supports anything like that. One benefit, though, is that you would be freed from the Global Interpreter Lock, so you may actually see some speed benefits if your worker processes (formerly threads) are doing anything CPU-intensive.
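A minimal sketch of that multiprocessing variant, assuming a Unix platform and purely illustrative names; the point is that the SIGTERM handler runs inside the worker process, so per-job cleanup is possible there:

import multiprocessing
import signal
import sys
import time

def worker(queue):
    def handle_sigterm(signum, frame):
        # Runs inside the worker process: save state, do administration, exit
        print('worker: got SIGTERM, saving state and exiting')
        sys.exit(0)

    signal.signal(signal.SIGTERM, handle_sigterm)
    while True:
        job = queue.get()
        time.sleep(10)  # stand-in for the real work / subprocess call

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=worker, args=(queue,))
    proc.start()
    queue.put('job-1')
    time.sleep(2)
    proc.terminate()  # sends SIGTERM on Unix, so handle_sigterm runs
    proc.join()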
I am limited to Python 2.5, and I thought that threading.Thread was asynchronous. I run python t.py and the script does not return to the shell until 3 seconds have gone by, which means it's blocking. Why is it blocking?
My Code:
#!/usr/bin/python
import threading, time

def doit():
    time.sleep(3)
    print "DONE"

thr = threading.Thread(target=doit, args=(), kwargs={})
thr.start()  # will run "doit"
By default, threads in Python are non-daemonic. A Python application will not exit until all non-daemon threads have completed, so in your case it won't exit until doit has finished. If you want the script to exit immediately upon reaching the end of the main thread, you need to make the thread a daemon before starting it:
thr = threading.Thread(target=doit, args=(), kwargs={})
thr.setDaemon(True)  # on Python 2.6+ you can write thr.daemon = True instead
thr.start()
Threading in Python is "kind-of" asynchronous. What does this mean?
- Only one thread can be running Python code at any one time.
- Threads running CPU-intensive Python code will therefore not benefit from multiple cores.
Your issue seems to be that you think a Python thread should keep running after Python itself quits -- that's not how it works. If you do make a thread a daemon then when Python quits those threads just die, instantly -- no cleanup, no error recovery, just dead.
If you want to actually make a daemon process, something that keeps running in the background after the main application exits, you want to look at os.fork(). If you want to do it the easier way, you can try my daemon library, pandaemonium.
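As a rough illustration of the os.fork() route (a minimal sketch of the classic double fork; a real daemon would also redirect file descriptors, change the working directory, and so on, and the output path is just an example):

import os
import sys
import time

def daemonize():
    # First fork: the parent returns to the shell immediately
    if os.fork() > 0:
        sys.exit(0)
    os.setsid()  # detach from the controlling terminal
    # Second fork: prevents the daemon from reacquiring a terminal
    if os.fork() > 0:
        sys.exit(0)

if __name__ == '__main__':
    daemonize()
    time.sleep(3)  # keeps running after the shell prompt has returned
    open('/tmp/done.txt', 'w').write('DONE\n')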
I have a long-running Python process that I want to be able to terminate in the event it hangs and stops reporting progress. But I want to signal it in a way that allows it to clean up safely, in case it hasn't completely hung and there's still something running that can respond to signals gracefully. What's the best order of signals to send before outright killing it?
I'm currently doing something like:
import os
import time
from signal import SIGTERM, SIGABRT, SIGINT, SIGKILL

def safe_kill(pid):
    for sig in [SIGTERM, SIGABRT, SIGINT, SIGKILL]:
        os.kill(pid, sig)
        time.sleep(1)
        if not pid_exists(pid):  # e.g. psutil.pid_exists, or a /proc check
            return
Is there a better order? I know SIGKILL bypasses the process entirely, but is there any significant difference between SIGTERM/SIGABRT/SIGINT or do they all have the same effect as far as Python is concerned?
I believe the proper way to stop a process is SIGTERM followed by SIGKILL after a small timeout.
I don't think that SIGINT and SIGABRT are necessary if that process handles signals in a standard way. SIGINT is usually handled the same way as SIGTERM, and SIGABRT is usually used by the process itself on abort() (Wikipedia).
Anything more complex than a small script usually implements custom SIGTERM handling to shut down gracefully (cleaning up all the resources, etc.).
For example, take a look at Upstart. It is an init daemon: it starts and stops most processes in Ubuntu and some other distributions. The default Upstart behavior for stopping a process is to send SIGTERM, wait 5 seconds and send SIGKILL (source: upstart cookbook).
You probably should do some testing to determine the best timeout for your process.
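Following that advice, the poster's safe_kill could be reduced to something like this sketch (the 5-second grace period mirrors the Upstart default and is just an assumption; pid_exists is the helper already used in the question):

import os
import time
from signal import SIGTERM, SIGKILL

def safe_kill(pid, grace_period=5.0):
    os.kill(pid, SIGTERM)        # ask the process to shut down cleanly
    deadline = time.time() + grace_period
    while time.time() < deadline:
        if not pid_exists(pid):  # same helper the question already uses
            return
        time.sleep(0.1)
    os.kill(pid, SIGKILL)        # it didn't exit in time, so force it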
You need to register a signal handler, as you would do in C.
import signal
import sys

def clean_termination(signum, frame):
    # perform your cleanup
    sys.exit(1)

# register the signal handler for the signals specified in the question
signal.signal(signal.SIGTERM, clean_termination)
signal.signal(signal.SIGABRT, clean_termination)
Note that Python maps the SIGINT signal to a KeyboardInterrupt exception, which you can catch with a regular except statement.
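For instance (a minimal sketch of that default SIGINT behaviour):

import time

try:
    while True:
        time.sleep(1)          # stand-in for long-running work
except KeyboardInterrupt:      # raised when the process receives SIGINT
    print('cleaning up before exit')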
I'm using Python in a webapp (CGI for testing, FastCGI for production) that needs to send an occasional email (when a user registers or something else important happens). Since communicating with an SMTP server takes a long time, I'd like to spawn a thread for the mail function so that the rest of the app can finish up the request without waiting for the email to finish sending.
I tried using thread.start_new(func, (args)), but the parent returns and exits before the sending is complete, thereby killing the sending thread before it does anything useful. Is there any way to keep the process alive long enough for the child thread to finish?
Take a look at the thread.join() method. Basically it will block your calling thread until the child thread has returned (thus preventing it from exiting before it should).
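For example (a minimal sketch with an illustrative stand-in for the mail function):

import threading
import time

def send_mail():
    time.sleep(3)  # stand-in for the slow SMTP conversation

mail_thread = threading.Thread(target=send_mail)
mail_thread.start()
# ... finish handling the request ...
mail_thread.join()  # blocks until the mail thread has returned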
Update:
To avoid making your main thread unresponsive to new requests you can use a while loop.
while threading.active_count() > 1:  # the main thread itself counts as one
    # ... look for new requests to handle ...
    time.sleep(0.1)
    # or try joining your threads with a timeout
    # for thread in my_threads:
    #     thread.join(0.1)
Update 2:
It also looks like thread.start_new(func, args) is obsolete; it was renamed to thread.start_new_thread(function, args[, kwargs]). You can also create threads with the higher-level threading module (which is what provides the active_count() used in the previous code block):
import threading
my_thread = threading.Thread(target=func, args=(), kwargs={})
my_thread.daemon = True
my_thread.start()
You might want to use threading.enumerate, if you have multiple workers and want to see which one(s) are still running.
Other alternatives include using threading.Event: the main thread sets the event and starts the worker thread off, the worker thread clears the event when it finishes its work, and the main thread checks whether the event is still set to figure out whether it can exit.
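A rough sketch of that Event-based variant (the names and sleep durations are illustrative assumptions):

import threading
import time

def send_mail(pending):
    time.sleep(3)       # stand-in for the slow SMTP conversation
    pending.clear()     # tell the main thread that the work is done

mail_pending = threading.Event()
mail_pending.set()      # main thread marks the work as outstanding
threading.Thread(target=send_mail, args=(mail_pending,)).start()

while mail_pending.is_set():
    # the main thread could keep handling requests here instead of just waiting
    time.sleep(0.1)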