Why does sys.exit() not shut down Python when there are background threads? - python

import sys
import time
from threading import Thread

def run_forever():
    while True:
        time.sleep(1)
        print('running')

if __name__ == '__main__':
    t = Thread(target=run_forever)
    t.start()
    sys.exit()
It seems sys.exit() can't shut down the program if I still have any background threads. In this case, how can I gracefully close this program (especially when I can't see all the background threads)?
EDIT
I know it's better to keep track of all the background threads. But in this use case I know it's safe to kill them all at once. I just can't find a way to simply close the program.

There's no real mystery here: sys.exit() only raises SystemExit in the thread it is called from, and the interpreter then waits for all non-daemon threads to finish before it actually exits, as you've demonstrated here. There are two options you can use to get the behavior you're expecting.
Use daemon threads. Daemon threads are killed automatically once all non-daemon threads (including the main thread) have exited, so they won't keep the interpreter alive. You can make your code use daemon threads like this:
if __name__ == '__main__':
    t = Thread(target=run_forever)
    t.daemon = True
    t.start()
    sys.exit()
Use os._exit. This ends your program immediately, similar to how I imagine you think sys.exit() behaves: no cleanup handlers run and every thread dies with the process. It's more of a brute-force method and is probably not a good idea, but it is possible.
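A rough sketch of the brute-force option, based on the original example (the exit status 0 is just an assumption):

import os
import time
from threading import Thread

def run_forever():
    while True:
        time.sleep(1)
        print('running')

if __name__ == '__main__':
    t = Thread(target=run_forever)
    t.start()
    # os._exit() terminates the whole process immediately, without waiting
    # for non-daemon threads, running atexit handlers, or flushing buffers.
    os._exit(0)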
I would recommend using daemon threads unless you absolutely cannot (i.e. you sometimes want to end just the main thread while the others continue, and sometimes end all of them at once).
More information on daemon threads: https://www.geeksforgeeks.org/python-daemon-threads/

Related

Why is the daemon flag useful when using the threading module in Python?

Given a simple script like the following, why is using the daemon flag useful? Maybe this is too simple, but I understand that daemon threads are generally long-running background tasks (they are not meant to be waited for, as quoted by Raymond Hettinger). So, if I have a task that I am not waiting for and simply start a non-daemon thread without joining it, is that pseudo-daemon? It seems the functionality runs the same. Or is this more a question of memory than of processing logic? With the second question I'm actually not sure how many resources this script consumes in terms of daemon vs non-daemon.
from threading import Thread
import time
import sys

def func():
    for i in range(4):
        print(f"Running Thread-{i}")
        time.sleep(1)

t = Thread(target=func)
# t.daemon = True  # nothing seems to change
t.start()
sys.exit()
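One way to actually observe the difference is to run this as a plain script rather than from an interactive session; here is a minimal sketch (the comments describe the expected behaviour, and the exact output in the daemon case depends on timing):

from threading import Thread
import time
import sys

def func():
    for i in range(4):
        print(f"Running Thread-{i}")
        time.sleep(1)

t = Thread(target=func)
t.daemon = True   # daemon: the process exits as soon as the main thread ends,
                  # so at most the first line is printed before the thread is killed
# default (non-daemon): the interpreter implicitly waits for the thread,
# so all four lines are printed even though the main thread has already finished
t.start()
sys.exit()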

What happens if I don't join() a python thread?

I have a query. I have seen examples where developers write something like the following code:
import threading

def do_something():
    return True

t = threading.Thread(target=do_something)
t.start()
t.join()
I know that join() signals the interpreter to wait till the thread is completely executed. But what if I do not write t.join()? Will the thread get closed automatically and will it be reused later?
Please let me know the answer. It's my first attempt at creating a multi-threaded application in Python 3.5.0.
A Python thread is just a regular OS thread. If you don't join it, it still keeps running concurrently with the current thread. It will eventually die when the target function completes or raises an exception. No such thing as "thread reuse" exists; once it's dead, it rests in peace.
Unless the thread is a "daemon thread" (created via the daemon constructor argument or by assigning the daemon property), it will be implicitly joined before the program exits; otherwise, it is killed abruptly.
One thing to remember when writing multithreaded programs in Python is that threads are only of limited use, due to the infamous Global Interpreter Lock (GIL). In short, using threads won't make your CPU-intensive program any faster. They are useful only when the work involves waiting (e.g. waiting in a thread for a certain file system event to happen).
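As a rough illustration of the kind of waiting-bound work threads are good for (a minimal sketch; the URLs and the choice of urllib are just illustrative assumptions):

import threading
import urllib.request

URLS = ["https://example.com", "https://example.org"]  # illustrative URLs

def fetch(url):
    # The GIL is released while waiting on the network, so these
    # downloads genuinely overlap even though they run in threads.
    with urllib.request.urlopen(url) as resp:
        print(url, len(resp.read()), "bytes")

threads = [threading.Thread(target=fetch, args=(u,)) for u in URLS]
for t in threads:
    t.start()
for t in threads:
    t.join()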
The join part means the main program will wait for the thread to end before continuing. Without join, the main program will end and the thread will continue.
Now if you set the daemon parameter to True, the thread depends on the main program, and it is ended if the main program ends first.
Here is an example to understand it better:
import threading
import time

def do_something():
    time.sleep(2)
    print("do_something")
    return True

t = threading.Thread(target=do_something)
t.daemon = True  # without the daemon parameter, the thread continues even after your main program ends
t.start()
t.join()  # with this, the main program will wait until the thread ends
print("end of main program")
no daemon, no join:
end of main program
do_something
daemon only:
end of main program
join only:
do_something
end of main program
daemon and join:
do_something
end of main program
# Note: in this case the daemon parameter is useless
Without join(), non-daemon threads keep running and complete concurrently with the main thread.
Without join(), daemon threads also run concurrently with the main thread, but when the main thread completes, any daemon threads that are still running are exited without being allowed to complete.
You can see my answer in this post, which explains it in detail.

Threading cleanup (disposing)

I am referring to this Simple threading event example.
More specifically, this piece of code:
for i in range(4):
    t = threading.Thread(target=worker)
    t.daemon = True  # thread dies when main thread (only non-daemon thread) exits.
    t.start()
From what I understand this creates 4 threads that will be used. As I am more familiar with C++ and C#, I am wondering about cleanup. Can I just leave these threads open, or is there a proper way of 'closing'/disposing of them? Please do not misread this as wanting to kill the threads. I am just wondering whether, once all work is completed, there is a proper way of cleaning up.
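For threads like these there is, as far as I know, nothing to dispose of explicitly: a thread's resources are released once its target returns, and the Thread object is garbage-collected like any other object. If you want a deterministic end to the work, the usual pattern is to join the queue and the threads; here is a minimal sketch (worker, the queue name, and the None sentinel are assumptions, not taken from the linked example):

import threading
import queue

q = queue.Queue()

def worker():
    while True:
        item = q.get()
        if item is None:      # sentinel tells the worker to stop
            q.task_done()
            break
        print("processing", item)
        q.task_done()

threads = []
for i in range(4):
    t = threading.Thread(target=worker)
    t.start()
    threads.append(t)

for item in range(10):
    q.put(item)

q.join()                      # wait until every queued item has been processed
for _ in threads:
    q.put(None)               # unblock each worker so it can exit
for t in threads:
    t.join()                  # threads clean themselves up once their target returns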

Is it a Python bug that the main thread of a process created in a daemon thread is a daemon itself?

When I call os.fork() inside a daemon thread, the main thread in the child process has the daemon property set to True. This is very confusing, since the program keeps running while the only thread is a daemon. According to the docs, if all the threads are daemons the program should exit.
Here is an example:
import os
import threading

def child():
    assert not threading.current_thread().daemon  # This shouldn't fail

def parent():
    new_pid = os.fork()
    if new_pid == 0:
        child()
    else:
        os.waitpid(new_pid, 0)

t = threading.Thread(target=parent)
t.setDaemon(True)
t.start()
t.join()
Is it a bug in the CPython implementation?
The reason for this behaviour is that daemonization is only relevant for threads other than the main thread. In the main thread, the return value of current_thread().daemon is hard-coded to be False.
See the relevant source code here:
https://github.com/python/cpython/blob/2.7/Lib/threading.py#L1097
So after a fork there is only one thread left, and it is consequently the main thread, which means it can never be a daemon thread.
I cannot point you to any documentation beyond the source, but it is most certainly not a bug - it would be a bug the other way round, if your expectation were met.
The interaction between fork and threads is complex, and as I mentioned: don't mix them.
This is very confusing, since the program keeps running while the only thread is a daemon. According to the docs, if all the threads are daemons the program should exit.
You are explicitly waiting for the thread to end. Whether the thread is a daemon or not has no effect on t.join(). And the thread won't end until the child process has terminated, because of os.waitpid().
I'm not sure about the behaviour of a forked thread, though, so I can't tell you why you experience what you do.

Threaded python application not closing cleanly

I have a small crawling application written in Python 2.7 that uses threads to fetch a lot of URLs. But it doesn't close cleanly or respond properly to a KeyboardInterrupt, although I tried to fix the latter issue with some advice I found here.
def main():
    ...
    for i in range(NUMTHREADS):
        worker = Thread(target=get_malware, args=(malq, dumpdir,))
        worker.setDaemon(True)
        worker.start()
    ...
    malq.join()

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        sys.exit()
I need to make sure that it will exit properly when I hit Ctrl-C or when it completes its run rather than having to Ctrl-Z and kill the job.
Thanks!
There are discussions about how the GIL can affect signal handling in Python applications with multiple IO-bound threads: apparently IO-bound threads can starve the main thread of processor time, so it cannot handle signals as it is supposed to. I suggest looking at alternative parallel-processing options (like the subprocess or multiprocessing modules) or asynchronous frameworks (like asyncoro).
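A rough sketch of the multiprocessing route (not the original crawler; fetch_url and the URL list are hypothetical stand-ins for get_malware and the queue):

import multiprocessing

def fetch_url(url):
    # placeholder for the real download logic (get_malware in the original)
    print("fetching", url)
    return url

if __name__ == "__main__":
    urls = ["http://example.com/a", "http://example.com/b"]  # illustrative
    pool = multiprocessing.Pool(processes=4)
    try:
        pool.map(fetch_url, urls)   # blocks until every URL has been handled
        pool.close()
        pool.join()
    except KeyboardInterrupt:
        pool.terminate()            # stop the worker processes immediately
        pool.join()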
