Python multiprocessing.Process calls join by itself

I have this code:
class ExtendedProcess(multiprocessing.Process):
    def __init__(self):
        super(ExtendedProcess, self).__init__()
        self.stop_request = multiprocessing.Event()

    def join(self, timeout=None):
        logging.debug("stop request received")
        self.stop_request.set()
        super(ExtendedProcess, self).join(timeout)

    def run(self):
        logging.debug("process has started")
        while not self.stop_request.is_set():
            print "doing something"
        logging.debug("proc is stopping")
When I call start() on the process it should run forever, since self.stop_request is never set. Yet after a few milliseconds join() is called by itself and breaks run(). What is going on!? Why is join() being called by itself?
Moreover, when I start a debugger and step through line by line, it suddenly works fine... What am I missing?
OK, thanks to ely's answer the reason hit me:
There is a race condition -
a new process is created...
as it is starting up and is about to run logging.debug("process has started"), the main function reaches its end.
The main function exits, and on exit Python joins all outstanding processes with join().
Since the child never actually reached "while not self.stop_request.is_set()", the overridden join() is called and runs self.stop_request.set(). Now stop_request is set and the loop exits immediately.

As mentioned in the updated question, this is because of a race condition. Below is an initial example highlighting a simple race against overall program exit, but the same thing could be caused by other kinds of scope exits or other race conditions involving your process.
I copied your class definition and added some "main" code to run it. Here is my full listing:
import logging
import multiprocessing
import time

class ExtendedProcess(multiprocessing.Process):
    def __init__(self):
        super(ExtendedProcess, self).__init__()
        self.stop_request = multiprocessing.Event()

    def join(self, timeout=None):
        logging.debug("stop request received")
        self.stop_request.set()
        super(ExtendedProcess, self).join(timeout)

    def run(self):
        logging.debug("process has started")
        while not self.stop_request.is_set():
            print("doing something")
            time.sleep(1)
        logging.debug("proc is stopping")

if __name__ == "__main__":
    p = ExtendedProcess()
    p.start()
    while True:
        pass
The above code listing runs as expected for me using both Python 2.7.11 and 3.6.4. It loops infinitely and the process never terminates:
ely@eschaton:~/programming$ python extended_process.py
doing something
doing something
doing something
doing something
doing something
... and so on
However, if I instead use this code in my main section, it exits right away (as expected):
if __name__ == "__main__":
    p = ExtendedProcess()
    p.start()
This exits right away because the interpreter reaches the end of the program. At shutdown, multiprocessing's atexit handler joins any non-daemon child processes that are still active, and since join() is overridden here, that automatic join() sets stop_request and the loop ends.
Note this could also explain why it works for you in the debugger. That is an interactive programming session, so after you start p, the debugger environment lets you wait around and inspect it; the process is not automatically joined unless you somehow exit the scope that owns it while stepping through.
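To see for yourself where the call comes from, one quick diagnostic (a sketch I am adding here, not part of the original answer) is to print the call stack inside the overridden join():

import multiprocessing
import traceback

class NoisyProcess(multiprocessing.Process):
    def join(self, timeout=None):
        # Show who called join(); at interpreter exit the printed stack
        # includes multiprocessing's atexit handler (util._exit_function).
        traceback.print_stack()
        super(NoisyProcess, self).join(timeout)

    def run(self):
        pass

if __name__ == "__main__":
    p = NoisyProcess()
    p.start()
    # No explicit join() here: any stack trace printed at shutdown
    # comes from the interpreter's cleanup, not from this script.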
Just to verify the join behavior too, I also tried with this main block:
if __name__ == "__main__":
    log = logging.getLogger()
    log.setLevel(logging.DEBUG)
    p = ExtendedProcess()
    p.start()
    st_time = time.time()
    while time.time() - st_time < 5:
        pass
    p.join()
    print("Finished!")
and it works as expected:
ely@eschaton:~/programming$ python extended_process.py
DEBUG:root:process has started
doing something
doing something
doing something
doing something
doing something
DEBUG:root:stop request received
DEBUG:root:proc is stopping
Finished!
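One way to avoid this class of surprise altogether (a design sketch of my own, not from the original answer) is to keep "request a stop" and "wait for exit" as separate operations, so the interpreter's automatic join() no longer doubles as a stop request:

import multiprocessing
import time

class StoppableProcess(multiprocessing.Process):
    def __init__(self):
        super(StoppableProcess, self).__init__()
        self.stop_request = multiprocessing.Event()

    def stop(self):
        # Explicit stop request, decoupled from join().
        self.stop_request.set()

    def run(self):
        while not self.stop_request.is_set():
            print("doing something")
            time.sleep(1)

if __name__ == "__main__":
    p = StoppableProcess()
    p.start()
    time.sleep(5)
    p.stop()   # ask the loop to finish
    p.join()   # then wait; an implicit join() alone no longer stops it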


Allow process to finish rather than be interrupted when SIGTERM is used in Python 3

I am developing some code which needs to shut down gracefully when a SIGTERM signal is sent from the command line on Unix. I found this example https://stackoverflow.com/a/31464349/7019148 which works great, but there's one problem with it.
Code:
import os
import signal
import time

class GracefulKiller:
    def __init__(self):
        signal.signal(signal.SIGTERM, self.exit_gracefully)
        self.kill_now = False

    def exit_gracefully(self, signum, frame):
        self.kill_now = True

    def run_something(self):
        print("starting")
        time.sleep(5)
        print("ending")

if __name__ == '__main__':
    killer = GracefulKiller()
    print(os.getpid())
    while True:
        killer.run_something()
        if killer.kill_now:
            break
    print("End of the program. I was killed gracefully :)")
When you run kill -15 <pid>, the run_something method is interrupted and the process is killed gracefully. However, is there a way to do this so that run_something can complete before the process is killed, i.e. to prevent the interruption?
Desired output:
>>> starting
*kill executed during the middle sleep*
>>> ending
>>> End of the program. I was killed gracefully :)
My use case is that this will be turned into a download script, and if I terminate the process, I would like it to finish the current download before exiting...
Thread.join() waits until the thread finishes, even if an exit signal was caught in the meantime.

import threading
import time

def download_for(seconds=5):
    for i in range(seconds):
        print("downloading...")
        time.sleep(1)
    print("finished download")

download_thread = threading.Thread(target=download_for, args=(3,))
download_thread.start()

# This waits till the thread finishes, even if an exit signal was received.
download_thread.join()

# This would just stop the download midway:
# download_for(seconds=5)
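Combining that idea with the question's flag-based handler, here is a minimal sketch (my own, assuming Python 3.3+, where a signal can interrupt the main thread's join() long enough for the handler to run; see the next question for the older behavior). SIGTERM only sets a flag, so the in-flight "download" always completes:

import signal
import threading
import time

stop = threading.Event()

def handle_sigterm(signum, frame):
    stop.set()   # request shutdown; never cut the worker off midway

signal.signal(signal.SIGTERM, handle_sigterm)

def download_loop():
    while not stop.is_set():
        print("downloading...")   # one complete unit of work per iteration
        time.sleep(1)
    print("finished cleanly")

t = threading.Thread(target=download_loop)
t.start()
t.join()   # main thread waits; SIGTERM only sets the flag
print("End of the program. I was killed gracefully :)")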
The answer is in the original question; I am just leaving this here for future Google searchers.
I never had an issue in the first place: my terminal was simply having a problem printing 'ending' after the kill command.

signal not handled when multiple threads join [duplicate]

This should be very simple, and I'm very surprised that I haven't been able to find this question answered already on Stack Overflow.
I have a daemon-like program that needs to respond to the SIGTERM and SIGINT signals in order to work well with upstart. I read that the best way to do this is to run the main loop of the program in a separate thread from the main thread and let the main thread handle the signals. When a signal is received, the signal handler should tell the main loop to exit by setting a sentinel flag that is routinely checked in the main loop.
I've tried doing this but it is not working the way I expected. See the code below:
from threading import Thread
import signal
import time
import sys

stop_requested = False

def sig_handler(signum, frame):
    sys.stdout.write("handling signal: %s\n" % signum)
    sys.stdout.flush()
    global stop_requested
    stop_requested = True

def run():
    sys.stdout.write("run started\n")
    sys.stdout.flush()
    while not stop_requested:
        time.sleep(2)
    sys.stdout.write("run exited\n")
    sys.stdout.flush()

signal.signal(signal.SIGTERM, sig_handler)
signal.signal(signal.SIGINT, sig_handler)

t = Thread(target=run)
t.start()
t.join()
sys.stdout.write("join completed\n")
sys.stdout.flush()
I tested this in the following two ways:
1)
$ python main.py > output.txt&
[2] 3204
$ kill -15 3204
2)
$ python main.py
ctrl+c
In both cases I expect this to be written to the output:
run started
handling signal: 15
run exited
join completed
In the first case the program exits but all I see is:
run started
In the second case the SIGINT signal is seemingly ignored when Ctrl+C is pressed and the program doesn't exit.
What am I missing here?
The problem is that, as explained in Execution of Python signal handlers:
A Python signal handler does not get executed inside the low-level (C) signal handler. Instead, the low-level signal handler sets a flag which tells the virtual machine to execute the corresponding Python signal handler at a later point (for example at the next bytecode instruction).
…
A long-running calculation implemented purely in C (such as regular expression matching on a large body of text) may run uninterrupted for an arbitrary amount of time, regardless of any signals received. The Python signal handlers will be called when the calculation finishes.
Your main thread is blocked on threading.Thread.join, which ultimately means it's blocked in C on a pthread_join call. Of course that's not a "long-running calculation", it's a block on a syscall… but nevertheless, until that call finishes, your signal handler can't run.
And, while on some platforms pthread_join will fail with EINTR on a signal, on others it won't. On linux, I believe it depends on whether you select BSD-style or default siginterrupt behavior, but the default is no.
So, what can you do about it?
Well, I'm pretty sure the changes to signal handling in Python 3.3 actually changed the default behavior on Linux so you won't need to do anything if you upgrade; just run under 3.3+ and your code will work as you're expecting. At least it does for me with CPython 3.4 on OS X and 3.3 on Linux. (If I'm wrong about this, I'm not sure whether it's a bug in CPython or not, so you may want to raise it on python-list rather than opening an issue…)
On the other hand, pre-3.3, the signal module definitely doesn't expose the tools you'd need to fix this problem yourself. So, if you can't upgrade to 3.3, the solution is to wait on something interruptible, like a Condition or an Event. The child thread notifies the event right before it quits, and the main thread waits on the event before it joins the child thread. This is definitely hacky. And I can't find anything that guarantees it will make a difference; it just happens to work for me in various builds of CPython 2.7 and 3.2 on OS X and 2.6 and 2.7 on Linux…
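For concreteness, here is a minimal sketch of that pre-3.3 workaround (my own rendering of the idea, not abarnert's exact code): the main thread waits on an Event, whose wait() is interruptible, and only calls join() once the child has signalled that it is done:

import threading
import signal
import time

stop_requested = False
finished = threading.Event()

def sig_handler(signum, frame):
    global stop_requested
    stop_requested = True

def run():
    while not stop_requested:
        time.sleep(2)
    finished.set()   # notify right before the thread quits

signal.signal(signal.SIGTERM, sig_handler)
signal.signal(signal.SIGINT, sig_handler)

t = threading.Thread(target=run)
t.start()
while not finished.is_set():
    finished.wait(timeout=1)   # wakes up regularly, so the signal
                               # handler gets a chance to run
t.join()   # guaranteed to return quickly now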
abarnert's answer was spot on. I'm still using Python 2.7, however. In order to solve this problem for myself I wrote an InterruptableThread class.
Right now it doesn't allow passing additional arguments to the thread target, and join() doesn't accept a timeout parameter either; that's just because I don't need those features. You can add them if you want. You will probably want to remove the output statements if you use this yourself; they are just there as a way of commenting and testing.
import threading
import signal
import sys

class InvalidOperationException(Exception):
    pass

# noinspection PyClassHasNoInit
class GlobalInterruptableThreadHandler:
    threads = []
    initialized = False

    @staticmethod
    def initialize():
        signal.signal(signal.SIGTERM, GlobalInterruptableThreadHandler.sig_handler)
        signal.signal(signal.SIGINT, GlobalInterruptableThreadHandler.sig_handler)
        GlobalInterruptableThreadHandler.initialized = True

    @staticmethod
    def add_thread(thread):
        if threading.current_thread().name != 'MainThread':
            raise InvalidOperationException("InterruptableThread objects may only be started from the Main thread.")
        if not GlobalInterruptableThreadHandler.initialized:
            GlobalInterruptableThreadHandler.initialize()
        GlobalInterruptableThreadHandler.threads.append(thread)

    @staticmethod
    def sig_handler(signum, frame):
        sys.stdout.write("handling signal: %s\n" % signum)
        sys.stdout.flush()
        for thread in GlobalInterruptableThreadHandler.threads:
            thread.stop()
        GlobalInterruptableThreadHandler.threads = []

class InterruptableThread:
    def __init__(self, target=None):
        self.stop_requested = threading.Event()
        self.t = threading.Thread(target=target, args=[self]) if target else threading.Thread(target=self.run)

    def run(self):
        pass

    def start(self):
        GlobalInterruptableThreadHandler.add_thread(self)
        self.t.start()

    def stop(self):
        self.stop_requested.set()

    def is_stop_requested(self):
        return self.stop_requested.is_set()

    def join(self):
        try:
            while self.t.is_alive():
                self.t.join(timeout=1)
        except (KeyboardInterrupt, SystemExit):
            self.stop_requested.set()
            self.t.join()
        sys.stdout.write("join completed\n")
        sys.stdout.flush()
The class can be used two different ways. You can sub-class InterruptableThread:
import time
import sys
from interruptable_thread import InterruptableThread

class Foo(InterruptableThread):
    def __init__(self):
        InterruptableThread.__init__(self)

    def run(self):
        sys.stdout.write("run started\n")
        sys.stdout.flush()
        while not self.is_stop_requested():
            time.sleep(2)
        sys.stdout.write("run exited\n")
        sys.stdout.flush()

foo = Foo()
foo2 = Foo()
foo.start()
foo2.start()
foo.join()
foo2.join()
sys.stdout.write("all exited\n")
sys.stdout.flush()
Or you can use it more like the way threading.Thread works. The run function has to take the InterruptableThread object as a parameter, though.
import time
import sys
from interruptable_thread import InterruptableThread

def run(t):
    sys.stdout.write("run started\n")
    sys.stdout.flush()
    while not t.is_stop_requested():
        time.sleep(2)
    sys.stdout.write("run exited\n")
    sys.stdout.flush()

t1 = InterruptableThread(run)
t2 = InterruptableThread(run)
t1.start()
t2.start()
t1.join()
t2.join()
sys.stdout.write("all exited\n")
sys.stdout.flush()
Do with it what you will.
I faced the same problem here: signal not handled when multiple threads join. After reading abarnert's answer, I switched to Python 3 and that solved the problem. But I did not want to change my whole program to Python 3, so I worked around it by avoiding any join() calls until a signal has been received. Below is my code.
It is not very elegant, but it solved the problem for me on Python 2.7. My question was marked as a duplicate, so I am putting my solution here.
import threading, signal, time, os

RUNNING = True
threads = []

def monitoring(tid, itemId=None, threshold=None):
    global RUNNING
    while RUNNING:
        print "PID=", os.getpid(), ";id=", tid
        time.sleep(2)
    print "Thread stopped:", tid

def handler(signum, frame):
    print "Signal is received:" + str(signum)
    global RUNNING
    RUNNING = False
    #global threads

if __name__ == '__main__':
    signal.signal(signal.SIGUSR1, handler)
    signal.signal(signal.SIGUSR2, handler)
    signal.signal(signal.SIGALRM, handler)
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGQUIT, handler)
    print "Starting all threads..."
    thread1 = threading.Thread(target=monitoring, args=(1,), kwargs={'itemId': '1', 'threshold': 60})
    thread1.start()
    threads.append(thread1)
    thread2 = threading.Thread(target=monitoring, args=(2,), kwargs={'itemId': '2', 'threshold': 60})
    thread2.start()
    threads.append(thread2)
    while RUNNING:
        print "Main program is sleeping."
        time.sleep(30)
    for thread in threads:
        thread.join()
    print "All threads stopped."

Stop a threading.Thread in Python

the following code:
import time
import threading

tasks = dict()

class newTask(object):
    def __init__(self, **kw):
        [setattr(self, x, kw[x]) for x in kw]
        self.object_ret()

    def object_ret(self): return self

def task_create(name, timeout, function):
    task = newTask(**{
        'timeout': int(timeout),
        'function': function,
        'start': time.time()
    })

    def set_timeout(v):
        while True:
            if (time.time() - v.start) > v.timeout:
                v.function()
                v.start = time.time()

    tasks[name] = threading.Thread(target=set_timeout, args=(task,))
    tasks[name].start()

def stop(x):
    # stops the thread in tasks[x]
    pass
is a simple task system that I am using for minor tasks such as pings and timeouts. This works for my needs, but if I ever wanted to stop a ping or task that was running, there is no way for me to do so. Is there a way for me to delete or stop that thread by any means possible? I do not care if it is bad or messy to do so; I just want it stopped.
I suggest the following:
In your newTask.__init__ function, add a line: self.alive = True
In the set_timeout function, replace while True: with while v.alive:
Store newTask objects in your tasks dictionary, not thread objects.
Give the stop(x) function one line: tasks[x].alive = False
This will cause the thread to die when you call stop(x), where x is the thread's name. It provides a mechanism that allows a thread to die without killing it in some bogus way. I know you said you don't care, but you really should care if you want your multithreaded programs to work. (A sketch putting these pieces together appears after the signal-handler example below.)
Second suggestion: read Ulrich Eckhardt's comment carefully and take it seriously; all of his points are well taken.
Signal handler:

def signal_handler(signal, frame):
    print('You pressed Ctrl+C!')
    stop(name)   # 'name' is the task to stop, using stop(x) as described above
    sys.exit(0)

In the main script, register the handler:

signal.signal(signal.SIGINT, signal_handler)
signal.pause()
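Putting those suggestions together, a minimal sketch of the whole task system might look like this (my own assembly of the answer's four points, reusing the question's names):

import threading
import time

tasks = dict()

class newTask(object):
    def __init__(self, **kw):
        for key in kw:
            setattr(self, key, kw[key])
        self.alive = True   # suggestion 1: liveness flag

def task_create(name, timeout, function):
    task = newTask(timeout=int(timeout), function=function, start=time.time())

    def set_timeout(v):
        while v.alive:      # suggestion 2: loop while the flag is set
            if (time.time() - v.start) > v.timeout:
                v.function()
                v.start = time.time()

    tasks[name] = task      # suggestion 3: store the newTask, not the thread
    threading.Thread(target=set_timeout, args=(task,)).start()

def stop(x):
    tasks[x].alive = False  # suggestion 4: thread exits on its next check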

Cannot kill Python script with Ctrl-C

I am testing Python threading with the following script:
import threading

class FirstThread(threading.Thread):
    def run(self):
        while True:
            print 'first'

class SecondThread(threading.Thread):
    def run(self):
        while True:
            print 'second'

FirstThread().start()
SecondThread().start()
This is running in Python 2.7 on Kubuntu 11.10. Ctrl+C will not kill it. I also tried adding a handler for system signals, but that did not help:
import signal
import sys

def signal_handler(signal, frame):
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)
To kill the process I end up killing it by PID after sending the program to the background with Ctrl+Z, which isn't being ignored. Why is Ctrl+C being ignored so persistently? How can I resolve this?
Ctrl+C terminates the main thread, but because your threads aren't in daemon mode, they keep running, and that keeps the process alive. We can make them daemons:
f = FirstThread()
f.daemon = True
f.start()
s = SecondThread()
s.daemon = True
s.start()
But then there's another problem - once the main thread has started your threads, there's nothing else for it to do. So it exits, and the threads are destroyed instantly. So let's keep the main thread alive:
import time

while True:
    time.sleep(1)
Now it will keep printing 'first' and 'second' until you hit Ctrl+C.
Edit: as commenters have pointed out, the daemon threads may not get a chance to clean up things like temporary files. If you need that, then catch the KeyboardInterrupt on the main thread and have it co-ordinate cleanup and shutdown. But in many cases, letting daemon threads die suddenly is probably good enough.
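For that cleanup case, a minimal sketch of the coordination (my own illustration, not from the original answer):

import threading
import time

shutdown = threading.Event()

def worker(name):
    while not shutdown.is_set():
        print(name)
        time.sleep(1)
    # cleanup (temp files, sockets, ...) would go here

threads = [threading.Thread(target=worker, args=(n,)) for n in ('first', 'second')]
for t in threads:
    t.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    shutdown.set()       # ask workers to finish their current iteration
    for t in threads:
        t.join()         # wait until their cleanup has completed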
KeyboardInterrupt and signals are only seen by the process (i.e. the main thread)... Have a look at Ctrl-c i.e. KeyboardInterrupt to kill threads in python.
I think it's best to call join() on your threads when you expect them to die. I've taken the liberty of changing your loops so they can end (you can add whatever cleanup is required there as well). The variable die is checked on each pass, and when it's True the program exits.
import threading
import time

class MyThread(threading.Thread):
    die = False

    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name

    def run(self):
        while not self.die:
            time.sleep(1)
            print(self.name)

    def join(self):
        self.die = True
        super().join()

if __name__ == '__main__':
    f = MyThread('first')
    f.start()
    s = MyThread('second')
    s.start()
    try:
        while True:
            time.sleep(2)
    except KeyboardInterrupt:
        f.join()
        s.join()
An improved version of @Thomas K's answer:
Define a helper function is_any_thread_alive(), based on this gist, which lets main() terminate automatically.
Example code:
import threading
import time

def job1():
    ...

def job2():
    ...

def is_any_thread_alive(threads):
    return True in [t.is_alive() for t in threads]

if __name__ == "__main__":
    ...
    t1 = threading.Thread(target=job1, daemon=True)
    t2 = threading.Thread(target=job2, daemon=True)
    t1.start()
    t2.start()
    while is_any_thread_alive([t1, t2]):
        time.sleep(0)
One simple 'gotcha' to beware of: are you sure CAPS LOCK isn't on?
I was running a Python script in the Thonny IDE on a Pi 4. With CAPS LOCK on, Ctrl+Shift+C is passed to the keyboard buffer, not Ctrl+C.
