When will the main thread exit in Python?

I am reading The Python Standard Library by Example and got confused when I arrived at page 509:
Up to this point, the example programs have implicitly waited to exit until all threads have completed their work. Programs sometimes spawn a thread as a daemon that runs without blocking the main program from exiting.
but after running some code, I get the opposite result. The code is like this:
#!/usr/bin/env python
# encoding: utf-8
#
# Copyright (c) 2008 Doug Hellmann All rights reserved.
#
"""Creating and waiting for a thread.
"""
#end_pymotw_header

import threading
import time

def worker():
    """thread worker function"""
    print 'Worker'
    # time.sleep(10000)
    return

threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()

print "main Exit"
and sometimes the result is like this:
Worker
Worker
WorkerWorker
main Exit
Worker
So I want to ask: when does the main thread exit in Python after it starts several threads?

The main thread will exit when it has finished executing all the code in your script that is not started in a separate thread.
Since t.start() starts the thread and then returns control to the main thread, the main thread simply continues to execute until it reaches the bottom of your script and then exits.
Since you started the other threads in non-daemon mode, they will continue running until they are finished.
If you want the main thread to wait until all the threads have finished, you should explicitly join them. A call to join makes the calling thread wait until the thread being joined is finished:
for i in range(5):
    threads[i].join()
print "main Exit"
As @codesparkle pointed out, the more Pythonic way to write this would be to skip the index variable entirely:
for thread in threads:
    thread.join()
print "main Exit"

According to the threading docs:
The entire Python program exits when only daemon threads are left
This agrees with the quote you give, but the slight difference in wording explains the result you get. The 'main thread' exits when you would expect it to. Note that the worker threads keep running at this point - you can see this in the test output you give. So the main thread has finished, but the whole process is still running because there are other threads still running.
The difference is that if some of those worker threads were daemonised, they would be forcibly killed when the last non-daemon thread finished. If all of the workers were daemon threads, then the entire process would finish - and you would be back at your system's shell prompt - very soon after you print 'main exit', and it would be very rare (though not impossible, owing to race conditions) for any worker to print after that.
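For instance, here is a minimal sketch (my own adaptation of the question's code, not from the book) that marks the workers as daemons; the whole process then exits almost immediately after the main thread, and the sleeping workers are killed:

import threading
import time

def worker():
    time.sleep(1)
    print 'Worker'  # rarely, if ever, printed

for i in range(5):
    t = threading.Thread(target=worker)
    t.daemon = True  # must be set before start()
    t.start()

print 'main Exit'  # the process dies here, taking the daemon workers with it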

Your main thread will exit as soon as the for loop completes its execution. Your main thread is starting new asynchronous threads, which means that it will not wait until a new thread finishes its execution. So in your case the main thread will start 5 threads in parallel and exit itself.

Note that the main thread does not exit when you print "main Exit", but after that. Consider this program:
import threading
import time

def worker():
    """thread worker function"""
    print 'Worker'
    time.sleep(1)
    print 'Done'
    return

class Exiter(object):
    def __init__(self):
        self.a = 5.0
        print 'I am alive'
    def __del__(self):
        print 'I am dying'

exiter = Exiter()

threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()

print "main Exit"
I have created an object whose sole purpose is to print 'I am dying' when it is finalised. I am not deleting it explicitly anywhere, so it will only die when the main thread finishes, that is, when Python starts to kill everything to return memory to the OS.
If you run this you will see that the workers are still working when the main thread is done, but the object is still alive. 'I am dying' always comes after all the workers have finished their job.

For example, as follows:
import logging
import threading
import time
from threading import Thread

# configure logging so the debug message below is visible
logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)s) %(message)s')

class ThreadA(Thread):
    def __init__(self, mt):
        Thread.__init__(self)
        self.mt = mt
    def run(self):
        print 'T1: sleeping...'
        time.sleep(4)
        print 'current thread is ', self.isAlive()
        print 'main thread is ', self.mt.isAlive()
        print 'T1: raising...'

if __name__ == '__main__':
    mt = threading.currentThread()
    ta = ThreadA(mt)
    ta.start()
    logging.debug('main end')
The output is:
T1: sleeping...
(MainThread) main end
current thread is True
main thread is False
T1: raising...
You can see that the main thread's alive state is False while the worker is still running.

Related

How check if a process has finished but without waiting?

I'm doing a small project in Python/Tkinter and I have been looking for a way to check if a process has finished, but without waiting. I have tried:
process = subprocess.Popen(command)
while process.poll() is None:
    print('Running!')
print('Finished!')
or:
process = subprocess.Popen(command)
stdoutdata, stderrdata = process.communicate()
print('Finished!')
Both snippets execute the command and print "Finished!" when the process ends, but the main program freezes (waiting), and that's what I want to avoid. I need the GUI to stay functional while the process is running and then run some code right after it finishes. Any help?
It's common to use a Thread for that purpose.
For example:
# import Thread
from threading import Thread
import time

# create a flag that tracks whether the process is still running
process = True

# create a function that checks if the process has finished
def check():
    while process:
        print('Running')
        time.sleep(1)  # here you can wait as much as you want without freezing the program
    else:
        print('Finished')

# call the function with the use of Thread
Thread(target=check).start()

# or if you want to keep a reference to it
t = Thread(target=check)
# you might also want to set the thread's daemon flag to True so the thread ends when the program closes
t.daemon = True
t.start()
This way, when you set process = False, the loop will end and the output will show 'Finished'.
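A variant of the same idea that watches the subprocess itself via poll() rather than a manual flag (a sketch only; command is assumed to be defined elsewhere, and on_finished is a placeholder callback):

import subprocess
import time
from threading import Thread

def on_finished(returncode):
    print('Finished!')  # runs in the watcher thread once the process ends

def watch(process, callback):
    while process.poll() is None:  # None means the process is still running
        time.sleep(0.5)            # sleep in the worker thread, not the GUI thread
    callback(process.returncode)

process = subprocess.Popen(command)
watcher = Thread(target=watch, args=(process, on_finished))
watcher.daemon = True  # don't keep the program alive just for the watcher
watcher.start()

Note that with Tkinter you would typically hand the result back to the GUI thread from the callback, e.g. via a queue or widget.after(), rather than touching widgets from the watcher thread.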

Weird behaviour with threads and processes mixing

I'm running the following python code:
import threading
import multiprocessing

def forever_print():
    while True:
        print("")

def main():
    t = threading.Thread(target=forever_print)
    t.start()
    return

if __name__ == '__main__':
    p = multiprocessing.Process(target=main)
    p.start()
    p.join()
    print("main process on control")
It terminates.
When I unwrapped main from the new process and just ran it directly, like this:
if __name__ == '__main__':
    main()
The script went on forever, as I thought it should. Am I wrong to assume that, given that t is a non-daemon thread, p shouldn't halt in the first case?
I basically set up this little test because I've been developing an app in which threads are spawned inside subprocesses, and it's been showing some weird behaviour (sometimes it terminates properly, sometimes it doesn't). I guess what I wanted to know, in a broader sense, is whether there is some sort of "gotcha" when mixing these two Python libs.
My running environment: Python 2.7 on Ubuntu 14.04 LTS.
For now, threads created by multiprocessing worker processes act like daemon threads with respect to process termination: the worker process exits without waiting for the threads it created to terminate. This is because worker processes shut down via os._exit(), which skips most normal shutdown processing - in particular, it skips the normal exit processing (sys.exit()) that joins non-daemon threading.Threads.
The easiest workaround is for worker processes to explicitly .join() the non-daemon threads they create.
There's an open bug report about this behavior, but it hasn't made much progress: http://bugs.python.org/issue18966
You need to call t.join() in your main function.
As your main function returns, the process gets terminated along with both of its threads.
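Applied to the question's code, that fix is a one-line change (a sketch; here join() blocks forever, which matches the run-forever behaviour the asker expected):

def main():
    t = threading.Thread(target=forever_print)
    t.start()
    t.join()  # don't return (and hit os._exit) until the thread is done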
p.join() blocks the main thread, waiting for the spawned process to end. Your spawned process then creates a thread but does not wait for it to end; it returns immediately, thus trashing the thread itself.
While Threads share memory, Processes don't. Therefore, the Thread you create in the newly spawned process remains relegated to that process; the parent process is not aware of it.
The gotcha is that the multiprocessing machinery calls os._exit() after your target function exits, which violently kills the child process, even if it has background threads running.
The code for Process.start() looks like this:
def start(self):
    '''
    Start child process
    '''
    assert self._popen is None, 'cannot start a process twice'
    assert self._parent_pid == os.getpid(), \
        'can only start a process object created by current process'
    assert not _current_process._daemonic, \
        'daemonic processes are not allowed to have children'
    _cleanup()
    if self._Popen is not None:
        Popen = self._Popen
    else:
        from .forking import Popen
    self._popen = Popen(self)
    _current_process._children.add(self)
Popen.__init__ looks like this:
def __init__(self, process_obj):
    sys.stdout.flush()
    sys.stderr.flush()
    self.returncode = None
    self.pid = os.fork()                 # This forks a new process
    if self.pid == 0:                    # This if block runs in the new process
        if 'random' in sys.modules:
            import random
            random.seed()
        code = process_obj._bootstrap()  # This calls your target function
        sys.stdout.flush()
        sys.stderr.flush()
        os._exit(code)                   # Violent death of the child process happens here
The _bootstrap method is the one that actually executes the target function you passed to the Process object. In your case, that's main. main returns right after you start your background thread - and normally the process wouldn't exit at that point, because there's still a non-daemon thread running.
However, as soon as execution hits os._exit(code), the child process is killed, regardless of any non-daemon threads still executing.
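A tiny standalone demo of that difference (my own sketch, not from the question): sys.exit() goes through the interpreter's shutdown, which waits for non-daemon threads, while os._exit() skips it:

import os
import threading
import time

def worker():
    time.sleep(1)
    print('worker done')

threading.Thread(target=worker).start()
os._exit(0)  # the process dies here; 'worker done' is never printed
# replace os._exit(0) with sys.exit(0) and the worker gets to finish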

Why doesn't the daemon program exit without join()

The answer might be right in front of me on the link below but I still don't understand. I'm sure after someone explains this to me, Darwin will be making a call to me.
The example is at this link here, although I've made some changes to try to experiment and help my understanding.
Here's the code:
import multiprocessing
import time
import sys

def daemon():
    p = multiprocessing.current_process()
    print 'Starting: ', p.name, p.pid
    sys.stdout.flush()
    time.sleep(2)
    print 'Exiting: ', p.name, p.pid
    sys.stdout.flush()

def non_daemon():
    p = multiprocessing.current_process()
    print 'Starting: ', p.name, p.pid
    sys.stdout.flush()
    time.sleep(6)
    print 'Exiting: ', p.name, p.pid
    sys.stdout.flush()

if __name__ == '__main__':
    d = multiprocessing.Process(name='daemon', target=daemon)
    d.daemon = True

    n = multiprocessing.Process(name='non-daemon', target=non_daemon)
    n.daemon = False

    d.start()
    time.sleep(1)
    n.start()
    # d.join()
And the output of the code is:
Starting: daemon 6173
Starting: non-daemon 6174
Exiting: non-daemon 6174
If the join() at the end is uncommented, then the output is:
Starting: daemon 6247
Starting: non-daemon 6248
Exiting: daemon 6247
Exiting: non-daemon 6248
I'm confused because the daemon's sleep is 2 seconds, whereas the non-daemon's is 6 seconds. Why doesn't it print the "Exiting" message in the first case? The daemon should have woken up before the non-daemon and printed the message.
The explanation from the site is as follows:
The output does not include the “Exiting” message from the daemon
process, since all of the non-daemon processes (including the main
program) exit before the daemon process wakes up from its 2 second
sleep.
but I changed it such that the daemon should have woken up before the non-daemon does. What am I missing here? Thanks in advance for your help.
EDIT: Forgot to mention I'm using Python 2.7, but apparently this problem also exists in Python 3.x.
This was a fun one to track down. The docs are somewhat misleading, in that they describe the non-daemon processes as if they are all equivalent; the existence of any non-daemon process means the process "family" is alive. But that's not how it's implemented. The parent process is "more equal" than others; multiprocessing registers an atexit handler that does the following:
for p in active_children():
    if p.daemon:
        info('calling terminate() for daemon %s', p.name)
        p._popen.terminate()

for p in active_children():
    info('calling join() for process %s', p.name)
    p.join()
So when the main process finishes, it first terminates all daemon child processes, then joins all child processes to wait on non-daemon children and clean up resources from daemon children.
Because it performs cleanup in this order, a moment after your non-daemon Process starts, the main process begins cleanup and forcibly terminates the daemon Process.
Note that fixing this can be as simple as manually joining the non-daemon process - not the daemon process, which would defeat the whole point of making it a daemon. That delays the atexit cleanup (which is what terminates the daemon child) until the non-daemon child has finished, giving the daemon child time to wake up and exit on its own.
It's arguably a bug (one that seems to exist up through 3.5.1; I reproduced it myself), but whether it's a behavior bug or a docs bug is arguable.
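Concretely, the fix sketched against the question's __main__ block would be (my sketch, not from the original answer):

d.start()
time.sleep(1)
n.start()
n.join()  # main waits ~6s here; the daemon's 2s sleep finishes long before

With that join in place, the daemon prints its "Exiting" line on its own, and the atexit cleanup finds nothing left to terminate.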

Python: Ignoring signals in background process

I am creating a Python program that calls an external command periodically. The external command takes a few seconds to complete. I want to reduce the possibility of the external command terminating badly by adding a signal handler for SIGINT. Basically, I want SIGINT to attempt to wait until the command finishes before terminating the Python program. The problem is that the external program seems to be getting the SIGINT as well, causing it to end abruptly. I am invoking the command from a separate thread, since the Python documentation for signal mentions that only the main thread receives the signal, according to http://docs.python.org/2/library/signal.html.
Can someone help with this?
Here is a stub of my code. Imagine that the external program is /bin/sleep:
import sys
import time
import threading
import signal

def sleep():
    import subprocess
    global sleeping
    cmd = ['/bin/sleep', '10000']
    sleeping = True
    p = subprocess.Popen(cmd)
    p.wait()
    sleeping = False

def sigint_handler(signum, frame):
    if sleeping:
        print 'busy, will terminate shortly'
        while sleeping:
            time.sleep(0.5)
        sys.exit(0)
    else:
        print 'clean exit'
        sys.exit(0)

sleeping = False
signal.signal(signal.SIGINT, sigint_handler)

while 1:
    t1 = threading.Thread(target=sleep)
    t1.start()
    time.sleep(500)
The expected behavior is that pressing Ctrl+C N seconds after the program starts will result in it waiting (10000 - N) seconds and then exiting. What is happening is that the program terminates immediately.
Thanks!
The problem is the way signal handlers are modified when executing a new process. From POSIX:
A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged.
So what you need to do is:
1. Ignore the SIGINT signal
2. Start the external program
3. Set the SIGINT handler as desired
That way, the external program will ignore SIGINT.
Of course, this leaves a (very) small time window when your script won't respond to SIGINT. But that's something you'll have to live with.
For example:
sleeping = False
while(1):
t1 = threading.Thread(target=sleep)
signal.signal(signal.SIGINT, signal.SIG_IGN)
t1.start()
signal.signal(signal.SIGINT, sigint_handler)
time.sleep(500)

Wait until nested python script ends before continuing in current python script

I have a python script that calls another python script. Inside the other python script it spawns some threads. How do I make the calling script wait until the called script is completely done running?
This is my code:
while len(mProfiles) < num:
    print distro + " " + str(len(mProfiles))
    mod_scanProfiles.main(distro)
    time.sleep(180)
    mProfiles = readProfiles(mFile, num, distro)
print "yoyo"
How do I wait until mod_scanProfiles.main() and all its threads are completely finished? (I used time.sleep(180) for now, but it's not a good programming habit.)
You want to modify the code in mod_scanProfiles.main to block until all its threads are finished.
Assuming you make a call to subprocess.Popen in that function, just do:
# in mod_scanProfiles.main:
p = subprocess.Popen(...)
p.wait()  # wait until the process completes
If you're not currently waiting for your threads to end, you'll also want to call Thread.join (see the threading docs) to wait for them to complete. For example:
# assuming you have a list of thread objects somewhere
threads = [MyThread(), ...]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
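Putting both pieces together, mod_scanProfiles.main could look something like this (a hypothetical sketch; scan_targets and scan_one stand in for whatever the real module does):

import threading

def main(distro):
    threads = []
    for target in scan_targets(distro):  # scan_targets is a placeholder
        t = threading.Thread(target=scan_one, args=(target,))  # scan_one is a placeholder
        threads.append(t)
        t.start()
    for t in threads:
        t.join()  # main() returns only after every scan thread has finished

With main() blocking like this, the time.sleep(180) in the calling loop can simply be removed.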
