os._exit(1) does not kill non-daemonic sibling processes - python

I am writing a Python script which has 2 child processes. The main logic runs in one process, and the other process waits for some time and then kills the main process even if the logic is not done.
I read that calling os._exit(1) stops the interpreter, so the entire script is killed automatically. I've used it as shown below:
import os
import time
from multiprocessing import Process, Lock
from multiprocessing.sharedctypes import Array

# Main process
def main_process(shared_variable):
    shared_variable.value = "mainprc"
    time.sleep(20)
    print("Task finished normally.")
    os._exit(1)

# Timer process
def timer_process(shared_variable):
    threshold_time_secs = 5
    time.sleep(threshold_time_secs)
    print("Timeout reached")
    print("Shared variable ", shared_variable.value)
    print("Task is shutdown.")
    os._exit(1)

if __name__ == "__main__":
    lock = Lock()
    shared_variable = Array('c', "initial", lock=lock)
    process_main = Process(target=main_process, args=(shared_variable,))
    process_timer = Process(target=timer_process, args=(shared_variable,))
    process_main.start()
    process_timer.start()
    process_timer.join()
The timer process calls os._exit but the script still waits for the main process to print "Task finished normally." before exiting.
How do I make it so that if the timer process exits, the entire program is shut down (including the main process)?
Thanks.
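For reference, a minimal sketch of one way to get that behaviour (an assumption on my part, not something from the thread): since os._exit() only ends the calling process, let the parent join the timer process and then terminate the main process explicitly. The shared array is omitted for brevity.

import time
from multiprocessing import Process

def main_process():
    time.sleep(20)
    print("Task finished normally.")

def timer_process():
    time.sleep(5)
    print("Timeout reached, task is shut down.")

if __name__ == "__main__":
    process_main = Process(target=main_process)
    process_timer = Process(target=timer_process)
    process_main.start()
    process_timer.start()
    process_timer.join()              # wait only for the timer process
    if process_main.is_alive():
        process_main.terminate()      # stop the sibling from the parent
        process_main.join()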

Related

How to terminate Python's `ProcessPoolExecutor` when parent process dies?

Is there a way to make the processes in concurrent.futures.ProcessPoolExecutor terminate if the parent process terminates for any reason?
Some details: I'm using ProcessPoolExecutor in a job that processes a lot of data. Sometimes I need to terminate the parent process with a kill command, but when I do that the processes from ProcessPoolExecutor keep running and I have to manually kill them too. My primary work loop looks like this:
with concurrent.futures.ProcessPoolExecutor(n_workers) as executor:
    result_list = [executor.submit(_do_work, data) for data in data_list]
    for id, future in enumerate(
            concurrent.futures.as_completed(result_list)):
        print(f'{id}: {future.result()}')
Is there anything I can add here or do differently to make the child processes in executor terminate if the parent dies?
You can start a thread in each worker process that terminates the process when the parent process dies:
import os
import signal
import threading
import time

def start_thread_to_terminate_when_parent_process_dies(ppid):
    pid = os.getpid()

    def f():
        while True:
            try:
                # signal 0 sends nothing; it only checks that the pid still exists
                os.kill(ppid, 0)
            except OSError:
                # the parent is gone: terminate this worker process
                os.kill(pid, signal.SIGTERM)
            time.sleep(1)

    thread = threading.Thread(target=f, daemon=True)
    thread.start()
Usage: pass initializer and initargs to ProcessPoolExecutor
with concurrent.futures.ProcessPoolExecutor(
    n_workers,
    initializer=start_thread_to_terminate_when_parent_process_dies,  # +
    initargs=(os.getpid(),),                                         # +
) as executor:
This works even if the parent process is SIGKILL/kill -9'ed, because the watcher thread only asks the OS whether the parent pid still exists; it does not rely on the parent running any cleanup code.
I would suggest two changes:
Use a kill -15 command, which the Python program can handle as a SIGTERM signal, rather than a kill -9 command.
Use a pool created with the multiprocessing.pool.Pool class, whose terminate method works quite differently from that of the concurrent.futures.ProcessPoolExecutor class: it kills all processes in the pool, so any tasks that have been submitted and are running are also terminated immediately.
Your equivalent program using the new pool and handling a SIGTERM interrupt would be:
from multiprocessing import Pool
import signal
import sys
import os

...

def handle_sigterm(*args):
    #print('Terminating...', file=sys.stderr, flush=True)
    pool.terminate()
    sys.exit(1)

# The process to be "killed", if necessary:
print(os.getpid(), file=sys.stderr)

pool = Pool(n_workers)
signal.signal(signal.SIGTERM, handle_sigterm)
results = pool.imap_unordered(_do_work, data_list)
for id, result in enumerate(results):
    print(f'{id}: {result}')
You could run the script inside a cgroup. When you need to kill the whole thing, you can do so with the cgroup's kill switch (the cgroup.kill file in cgroup v2). Even a cpu cgroup will do the trick, as you can read the group's pids from it.
Check this article on how to use cgexec.

Cancel a ProcessPoolExecutor future that has hung

I have a Python function which calls into a C library I cannot control or update. Unfortunately, there is an intermittent bug in the C library and occasionally it hangs. To protect my application from also hanging, I'm trying to isolate the function call in a ThreadPoolExecutor or ProcessPoolExecutor so only that thread or process crashes.
However, the following code hangs, because the executor cannot shut down while the submitted call is still running!
Is it possible to cancel an executor with a future that has hung?
import time
from concurrent.futures import ThreadPoolExecutor, wait

if __name__ == "__main__":
    def hang_forever(*args):
        print("Starting hang_forever")
        time.sleep(10.0)
        print("Finishing hang_forever")

    print("Starting executor")
    with ThreadPoolExecutor() as executor:
        future = executor.submit(hang_forever)
        print("Submitted future")
        done, not_done = wait([future], timeout=1.0)
        print("Done", done, "Not done", not_done)
        # with never exits because future has hung!
        if len(not_done) > 0:
            raise IOError("Timeout")
The docs say that it's not possible to shut down the executor until all pending futures are done executing:
Regardless of the value of wait, the entire Python program will not exit until all pending futures are done executing.
Calling future.cancel() won't help as it will also hang. Fortunately, you can solve your problem by using multiprocessing.Process directly instead of using ProcessPoolExecutor:
import time
from multiprocessing import Process

def hang_forever():
    while True:
        print('hang forever...')
        time.sleep(1)

def main():
    proc = Process(target=hang_forever)
    print('start the process')
    proc.start()
    time.sleep(1)

    timeout = 3
    print(f'trying to join the process in {timeout} sec...')
    proc.join(timeout)

    if proc.is_alive():
        print('timeout is exceeded, terminate the process!')
        proc.terminate()
        proc.join()

    print('done')

if __name__ == '__main__':
    main()
Output:
start the process
hang forever...
trying to join the process in 3 sec...
hang forever...
hang forever...
hang forever...
hang forever...
timeout is exceeded, terminate the process!
done
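One caveat worth adding as a hedge: proc.terminate() sends SIGTERM, which the target can ignore or block, whereas SIGKILL cannot be caught. Since Python 3.7, multiprocessing.Process also has kill(), which sends SIGKILL, so an escalation helper could look like this (the stop name and the 3-second default are just illustrative):

from multiprocessing import Process

def stop(proc: Process, timeout: float = 3.0) -> None:
    # Ask the process to exit (SIGTERM), then escalate to SIGKILL if it
    # is still alive after the timeout. Process.kill() needs Python 3.7+.
    proc.terminate()
    proc.join(timeout)
    if proc.is_alive():
        proc.kill()
        proc.join()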

Allow process to finish rather than be interrupted when SIGTERM is used in Python 3

I am developing some code which I need to shut down gracefully when a SIGTERM signal is sent from the command line in Unix. I found this example https://stackoverflow.com/a/31464349/7019148 which works great, but there's one problem with it.
Code:
import os
import signal
import time

class GracefulKiller:
    def __init__(self):
        signal.signal(signal.SIGTERM, self.exit_gracefully)
        self.kill_now = False

    def exit_gracefully(self, signum, frame):
        self.kill_now = True

    def run_something(self):
        print("starting")
        time.sleep(5)
        print("ending")

if __name__ == '__main__':
    killer = GracefulKiller()
    print(os.getpid())
    while True:
        killer.run_something()
        if killer.kill_now:
            break
    print("End of the program. I was killed gracefully :)")
When you pass the kill command kill -15 <pid>, the run_something method is interrupted and the process is killed, gracefully. However, is there a way to do this so that the run_something method can complete before the process is killed? I.e. prevent the interruption?
Desired output:
>>> starting
*kill executed during the middle sleep*
>>> ending
>>> End of the program. I was killed gracefully :)
My use case is that this will be turned into a download script, and if I want to terminate the process, I would like it to finish downloading before terminating...
thread.join() waits till the thread finishes even if an exit signal was caught.
import threading
import time

def download_for(seconds=5):
    for i in range(seconds):
        print("downloading...")
        time.sleep(1)
    print("finished download")

download_thread = threading.Thread(target=download_for, args=(3,))
download_thread.start()
# this waits till the thread finishes even if an exit signal was received
download_thread.join()
# this would just stop the download midway
# download_for(seconds=5)
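For completeness, a sketch of my own built on the snippets above, combining the two ideas: a SIGTERM handler that only records the request, plus a join on the in-flight download thread, so the current download always finishes before the program exits (POSIX and Python 3 assumed):

import signal
import threading
import time

kill_requested = False

def handle_sigterm(signum, frame):
    global kill_requested
    kill_requested = True  # just record the request; don't interrupt the work

def download_for(seconds=5):
    for _ in range(seconds):
        print("downloading...")
        time.sleep(1)
    print("finished download")

if __name__ == '__main__':
    signal.signal(signal.SIGTERM, handle_sigterm)
    while not kill_requested:
        t = threading.Thread(target=download_for, args=(3,))
        t.start()
        t.join()  # the download in progress always runs to completion
    print("End of the program. I was killed gracefully :)")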
The answer is in the original question. I am just leaving this here for future Google searchers.
I never had an issue in the first place; my terminal was just having a problem printing 'ending' after the kill command.

python multiprocessing main process waiting for daemon process

I'm trying to understand the multiprocessing module. Below is my code.
from multiprocessing import Process, current_process
#from time import time
import time

def work(delay):
    p = current_process()
    print p.name, p.pid, p.deamon
    time.sleep(delay)
    print 'Finised deamon work'

def main():
    print 'Starting Main Process'
    p = Process(target=work, args=(2,))
    p.deamon = True
    p.start()
    print 'Exiting Main Process'

if __name__ == '__main__':
    main()
Output:
Starting Main Process
Exiting Main Process
Process-1 7863 True
Finised deamon work
I expect the main process to exit before the daemon process (which sleeps for 2 secs). Since the main process exits, the daemon process should also exit. But the output is confusing me.
Expected Output:
Starting Main Process
Exiting Main Process
Process-1 7863 True
Is my understanding of the multiprocessing module wrong?
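Worth noting when reading the snippet above: the Process attribute is spelled daemon, so p.deamon = True only attaches an unrelated attribute and the child stays non-daemonic; a non-daemonic child is joined when the parent exits, which is exactly the wait being observed. A corrected sketch (Python 3 syntax) would be:

import time
from multiprocessing import Process, current_process

def work(delay):
    p = current_process()
    print(p.name, p.pid, p.daemon)
    time.sleep(delay)
    print('Finished daemon work')

def main():
    print('Starting Main Process')
    p = Process(target=work, args=(2,))
    p.daemon = True   # correct spelling: the child is now truly daemonic
    p.start()
    print('Exiting Main Process')

if __name__ == '__main__':
    main()

With a genuinely daemonic child, the parent no longer waits; multiprocessing terminates the child at parent exit, so 'Finished daemon work' should not appear (and the child may be killed before it prints anything at all).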

Python program with thread can't catch CTRL+C

I am writing a python script that needs to run a thread which listens to a network socket.
I'm having trouble killing it with Ctrl+C using the code below:
#!/usr/bin/python
import signal, sys, threading

THREADS = []

def handler(signal, frame):
    global THREADS
    print "Ctrl-C.... Exiting"
    for t in THREADS:
        t.alive = False
    sys.exit(0)

class thread(threading.Thread):
    def __init__(self):
        self.alive = True
        threading.Thread.__init__(self)

    def run(self):
        while self.alive:
            # do something
            pass

def main():
    global THREADS
    t = thread()
    t.start()
    THREADS.append(t)

if __name__ == '__main__':
    signal.signal(signal.SIGINT, handler)
    main()
I'd appreciate any advice on how to catch Ctrl+C and terminate the script.
The issue is that after execution falls off the main thread (after main() returns), the threading module pauses, waiting for the other threads to finish, using locks; and locks cannot be interrupted with signals. This is the case in Python 2.x at least.
One easy fix is to avoid falling off the main thread, by adding an infinite loop that calls some function that sleeps until some action is available, like select.select(). If you don't need the main thread to do anything at all, use signal.pause(). Example:
if __name__ == '__main__':
    signal.signal(signal.SIGINT, handler)
    main()
    while True:           # added
        signal.pause()    # added
It's because signals can only be caught by the main thread, and here the main thread ended its life long ago (the application is waiting for your thread to finish). Try adding
while True:
    sleep(1)
to the end of your main() (and of course from time import sleep at the very top).
or as Kevin said:
for t in THREADS:
    t.join(1)  # join with timeout. Without timeout signal cannot be caught.
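Expanding on that last suggestion, a sketch of how the question's __main__ block might look with a timed join loop (names as in the question; the 1-second timeout is arbitrary):

if __name__ == '__main__':
    signal.signal(signal.SIGINT, handler)
    main()
    # Keep the main thread inside Python-level code so the SIGINT handler can run;
    # in Python 2 an untimed join() blocks in C and Ctrl-C is not delivered.
    while any(t.is_alive() for t in THREADS):
        for t in THREADS:
            t.join(1)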
