Threading in 'while True' loop - python

First of all, I've only started working with Python a month ago, so I don't have deep knowledge about anything.
In a project I'm trying to collect the results of multiple (simultaneous) functions in a database, infinitely until I tell it to stop.
In an earlier attempt, I successfully used multiprocessing to do what I need, but since I now need to collect all the results of those functions for a database within the main, I switched to threading instead.
Basically, what I'm trying to do is:
collect1 = Thread(target=collect_data1)
collect2 = Thread(target=collect_data2)
send1 = Thread(target=send_data1)
send2 = Thread(target=send_data2)
collect = (collect1, collect2)
send = (send1, send2)
while True:
    try:
        for thread in collect:
            thread.start()
        for thread in collect:
            thread.join()
        for thread in send:
            thread.start()
        for thread in send:
            thread.join()
    except KeyboardInterrupt:
        break
Now, obviously I can't just restart the threads. Nor explicitly kill them. The functions within the threads can theoretically be stopped at any point, so terminate() from multiprocessing was fine.
I was thinking whether something like the following could work (at least PyCharm is fine with it, so it seems to work) or if it creates a memory leak (which I assume), because the threads are never properly closed or deleted, at least as far as I can tell from researching.
Again, I'm new to Python, so I don't know anything about this aspect.
Code:
while True:
    try:
        collect1 = Thread(target=collect_data1)
        collect2 = Thread(target=collect_data2)
        send1 = Thread(target=send_data1)
        send2 = Thread(target=send_data2)
        collect = (collect1, collect2)
        send = (send1, send2)
        for thread in collect:
            thread.start()
        for thread in collect:
            thread.join()
        for thread in send:
            thread.start()
        for thread in send:
            thread.join()
    except KeyboardInterrupt:
        break
I feel like this approach seems too good to be true, especially since I've never come across a similar solution during my research.
Anyways, any input is appreciated.
Have a good day.

You explicitly wait for the threads to terminate in the loops that call thread.join(), so no memory leak takes place, and your code is fine. If you're worried that you don't dispose of the thread objects in any way after they terminate: that happens automatically as soon as they are no longer referenced, so this shouldn't be a concern either.
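To make this concrete, here is a minimal sketch of the second version from the question. The functions collect_data and send_data are hypothetical stand-ins for the question's collect_data1/collect_data2 and send_data1/send_data2, and run_phase, run_cycles and restart_fails are names invented for this illustration. It shows that a finished Thread object cannot be started again, while fresh objects created on every pass work and do not pile up.

```python
import threading
from threading import Thread

results = []                      # shared list, guarded by results_lock
results_lock = threading.Lock()

def collect_data(source):
    # Hypothetical stand-in for the question's collect_data1/collect_data2.
    with results_lock:
        results.append(("collected", source))

def send_data(target):
    # Hypothetical stand-in for the question's send_data1/send_data2.
    with results_lock:
        results.append(("sent", target))

def run_phase(func, args_list):
    # Create fresh Thread objects for this phase, start them all, then wait for all.
    threads = [Thread(target=func, args=(arg,)) for arg in args_list]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

def run_cycles(count):
    # Bounded version of the question's "while True" loop.
    for _ in range(count):
        run_phase(collect_data, [1, 2])
        run_phase(send_data, [1, 2])

def restart_fails():
    # Starting an already finished thread a second time raises RuntimeError.
    thread = Thread(target=collect_data, args=(0,))
    thread.start()
    thread.join()
    try:
        thread.start()
    except RuntimeError:
        return True
    return False
```

After any number of cycles, threading.active_count() is back to its starting value, because every thread was joined and the old Thread objects are no longer referenced.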

Related

How to find which threads have finished executing in Python

I am very new to the concept of threading and the concepts are still somewhat fuzzy.
But as of now I have a requirement in which I spin up an arbitrary number of threads from my Python program, and my Python program should then tell the user running the process which threads have finished executing. Below is my first try:
import threading
from threading import Thread
from time import sleep
def exec_thread(n):
    name = threading.current_thread().name
    filename = name + ".txt"
    with open(filename, "w+") as file:
        file.write(f"My name is {name} and my main thread is {threading.main_thread()}\n")
        sleep(n)
        file.write(f"{name} exiting\n")
t1 = Thread(name="First", target=exec_thread, args=(10,))
t2 = Thread(name="Second", target=exec_thread, args=(2,))
t1.start()
t2.start()
while len(threading.enumerate()) > 1:
    print("Waiting ... !")
    sleep(5)
print("The threads are done")
So this basically tells me when all the threads are done executing.
But I want to know as soon as any one of my threads has completed execution, so that I can tell the user to check the output file for that thread.
I cannot use thread.join() since that would block my main program, and the user would not know anything until everything is complete, which might take hours. The user wants to know as soon as some results are available.
Now I know that we can check whether an individual thread is still active with thread.is_alive(), but I was hoping for a more elegant solution in which the child threads can somehow communicate with the main thread and say "I am done!"
Many thanks for any answers in advance.
The simplest and most straightforward way to indicate a single thread is "done" is to put the required notification in the thread's implementation method, as the very last step. For example, you could print out a notification to the user.
Or, you could use events, see: https://docs.python.org/3/library/threading.html#event-objects
This is one of the simplest mechanisms for communication between
threads: one thread signals an event and other threads wait for it.
An event object manages an internal flag that can be set to true with
the set() method and reset to false with the clear() method. The
wait() method blocks until the flag is true.
So, the "final act" in your thread implementation would be to set an event object, and your main thread can wait until it's set.
Or, for an even fancier mechanism, use queues: https://docs.python.org/3/library/queue.html
Each thread writes an "I'm done" object to the queue when done, and the main thread can read those notifications from the queue in sequence as each thread completes.
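A minimal sketch of the queue approach might look like this. The names run_and_report and done_queue are made up for the example, and the sleep stands in for the real work of writing the output file.

```python
import queue
import threading
import time

def exec_thread(name, delay, done_queue):
    # Simulated work; the real thread would write its output file here.
    time.sleep(delay)
    # Final act of the thread: tell the main thread that we are done.
    done_queue.put(name)

def run_and_report(jobs):
    # jobs maps a thread name to its simulated duration in seconds.
    done_queue = queue.Queue()
    for name, delay in jobs.items():
        threading.Thread(target=exec_thread, args=(name, delay, done_queue)).start()
    finished = []
    for _ in jobs:
        name = done_queue.get()   # blocks only until the next thread finishes
        print(f"{name} is done, its results are available")
        finished.append(name)
    return finished
```

Calling run_and_report({"First": 0.3, "Second": 0.05}) reports "Second" before "First", since the main thread is woken up once per finished thread rather than once at the very end.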

How to close all threads with endless loops? (with _thread! nothing else!)

import _thread
import time
def test1():
    while True:
        time.sleep(1)
        print('TEST1')
def test2():
    while True:
        time.sleep(3)
        print('TEST2')
try:
    _thread.start_new_thread(test1,())
    _thread.start_new_thread(test2,())
except:
    print("ERROR")
How can I stop the two threads for example in case of KeyboardInterrupts?
Because for "except KeyboardInterrupt" the threads are still running :/
Important:
The question is about closing threads only with the module _thread!
Is it possible?
There's no way to directly interact with another thread, except for the main thread. While some platforms do offer thread cancel or kill semantics, Python doesn't expose them, and for good reason.1
So, the usual solution is to use some kind of signal to tell everyone to exit. One possibility is a done flag with a Lock around it:
done = False
donelock = _thread.allocate_lock()
def test1():
    while True:
        try:
            donelock.acquire()
            if done:
                return
        finally:
            donelock.release()
        time.sleep(1)
        print('TEST1')
_thread.start_new_thread(test1,())
time.sleep(3)
try:
    donelock.acquire()
    done = True
finally:
    donelock.release()
Of course the same thing is a lot cleaner if you use threading (or a different higher-level API like Qt's threads). Plus, you can use a Condition or Event to make the background threads exit as soon as possible, instead of only after their next sleep finishes.
import threading
done = threading.Event()
def test1():
    while True:
        # wait() returns True as soon as the event is set, else False after 1 second
        if done.wait(1):
            return
        print('TEST1')
t1 = threading.Thread(target=test1)
t1.start()
time.sleep(3)
done.set()
The _thread module doesn't have an Event or Condition, of course, but you can always build one yourself, or just borrow one from the threading source.
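If you really are restricted to _thread, one possible sketch uses a plain lock as the stop signal. This is only an illustration, not how threading.Event is implemented: run_workers, stop and finished_lock are names invented here, and the trick relies on lock.acquire accepting a timeout (Python 3.2 and later).

```python
import _thread
import time

def run_workers(intervals, run_for):
    stop = _thread.allocate_lock()
    stop.acquire()                      # held by main: workers keep running
    ticks = []
    finished = []

    def worker(interval, finished_lock):
        while True:
            # Wait up to `interval` seconds for the stop lock; getting it means "exit".
            if stop.acquire(timeout=interval):
                stop.release()          # pass the signal on to the other workers
                break
            ticks.append(interval)      # stand-in for print('TEST1')
        finished_lock.release()         # tell main that this worker has exited

    for interval in intervals:
        finished_lock = _thread.allocate_lock()
        finished_lock.acquire()
        finished.append(finished_lock)
        _thread.start_new_thread(worker, (interval, finished_lock))

    try:
        time.sleep(run_for)
    except KeyboardInterrupt:
        pass                            # fall through to the shutdown code
    stop.release()                      # signal every worker to stop
    for finished_lock in finished:
        finished_lock.acquire()         # _thread has no join(); wait on the lock instead
    return ticks
```

Because the workers block on the lock instead of calling time.sleep, they exit as soon as the main thread releases it, not after their next sleep finishes.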
Or, if you wanted the threads to be killed asynchronously (which obviously isn't safe if they're, e.g., writing files, but if they're just doing computation or downloads or the like that you don't care about if you're canceling, that's fine), threading makes it even easier:
t1 = threading.Thread(target=test1, daemon=True)
As a side note, the behavior you're seeing isn't actually reliable across platforms:
Background threads created with _thread may keep running, or shut down semi-cleanly, or terminate hard. So, when you use _thread in a portable application, you have to write code that can handle any of the three.
KeyboardInterrupt may be delivered to an arbitrary thread rather than the main thread. If it is, it will usually kill that thread, unless you've set up a handler. So, if you're using _thread, you usually want to handle KeyboardInterrupt and call _thread.interrupt_main().
Also, I don't think your except: is doing what you think it is. That try only covers the start_new_thread calls. If the threads start successfully, the main thread exits the try block and reaches the end of the program. If a KeyboardInterrupt or other exception is raised, the except: isn't going to be triggered. (Also, using a bare except: and not even logging which exception got handled is a really bad idea if you want to be able to understand what your code is doing.) Presumably, on your platform, background threads continue running, and the main thread blocks on them (and probably at the OS level, not the Python level, so there's no code you can write that gets involved there).
If you want your main thread to keep running to make sure it can handle a KeyboardInterrupt and do something with it (but see the caveats above!), you have to give it code to keep running:
try:
    while True:
        time.sleep(1<<31)
except KeyboardInterrupt:
    # background-thread-killing code goes here.
    pass
1. TerminateThread on Windows makes it impossible to do all the cleanup Python needs to do. pthread_cancel on POSIX systems like Linux and macOS makes it possible, but very difficult. And the semantics are different enough between the two that trying to write a cross-platform wrapper would be a nightmare. Not to mention that Python supports systems (mostly older Unixes) that don't have the full pthread API, or even have a completely different threading API.

Python script is hanging AFTER multithreading

I know there are a few questions and answers related to hanging threads in Python, but my situation is slightly different as the script is hanging AFTER all the threads have been completed. The threading script is below, but obviously the first 2 functions are simplified massively.
When I run the script shown, it works. When I use my real functions, the script hangs AFTER THE LAST LINE. So, all the scenarios are processed (and a message printed to confirm), logStudyData() then collates all the results and writes to a csv. "Script Complete" is printed. And THEN it hangs.
The script with threading functionality removed runs fine.
I have tried enclosing the main script in try...except but no exception gets logged. If I use a debugger with a breakpoint on the final print and then step it forward, it hangs.
I know there is not much to go on here, but short of including the whole 1500-line script, I don't know what else to do. Any suggestions welcome!
def runScenario(scenario):
    # Do a bunch of stuff
    with lock:
        # access global variables
        pass
    pass
def logStudyData():
    # Combine results from all scenarios into a df and write to csv
    pass
def worker():
    global q
    while True:
        next_scenario = q.get()
        if next_scenario is None:
            break
        runScenario(next_scenario)
        print(next_scenario , " is complete")
        q.task_done()
import threading
from queue import Queue
global q, lock
q = Queue()
threads = []
scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']
num_worker_threads = 6
lock = threading.Lock()
for i in range(num_worker_threads):
    print("Thread number ",i)
    this_thread = threading.Thread(target=worker)
    this_thread.start()
    threads.append(this_thread)
for scenario_name in scenario_list:
    q.put(scenario_name)
q.join()
print("q.join completed")
logStudyData()
print("script complete")
As the docs for Queue.get say:
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
In other words, there is no way get can ever return None, except by you calling q.put(None) on the main thread, which you don't do.
Notice that the example directly below those docs does this:
for i in range(num_worker_threads):
    q.put(None)
for t in threads:
    t.join()
The second one is technically necessary, but you usually get away with not doing it.
But the first one is absolutely necessary. You need to either do this, or come up with some other mechanism to tell your workers to quit. Without that, your main thread just tries to exit, which means it tries to join every worker, but those workers are all blocked forever on a get that will never happen, so your program hangs forever.
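Put together, a worker pool with sentinels could be sketched roughly like this. The function process_all is a hypothetical wrapper, and item.upper() is just a stand-in for runScenario.

```python
import threading
from queue import Queue

def process_all(items, num_workers=3):
    q = Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:            # sentinel: no more work for this worker
                q.task_done()
                break
            with lock:
                results.append(item.upper())   # stand-in for runScenario(item)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        q.put(item)
    q.join()                            # every real item has been processed
    for _ in threads:
        q.put(None)                     # one sentinel per worker
    for t in threads:
        t.join()                        # the workers have actually exited
    return sorted(results), [t.is_alive() for t in threads]
```

With the sentinels in place, every worker leaves its loop, so nothing is left blocked in get() when the main thread reaches the end of the script.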
Building a thread pool may not be rocket science (if only because rocket scientists tend to need their calculations to be deterministic and hard real-time…), but it's not trivial, either, and there are plenty of things you can get wrong. You may want to consider using one of the two already-built threadpools in the Python standard library, concurrent.futures.ThreadPoolExecutor or multiprocessing.dummy.Pool. This would reduce your entire program to:
import concurrent.futures
def work(scenario):
    runScenario(scenario)
    print(scenario , " is complete")
scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']
with concurrent.futures.ThreadPoolExecutor(max_workers=6) as x:
    results = list(x.map(work, scenario_list))
print("q.join completed")
logStudyData()
print("script complete")
Obviously you'll still need a lock around any mutable variables you change inside runScenario—although if you're only using a mutable variable there because you couldn't figure out how to return values to the main thread, that's trivial with an Executor: just return the values from work, and then you can use them like this:
for result in x.map(work, scenario_list):
    do_something(result)
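As a small self-contained illustration of returning values instead of writing to shared globals, consider the sketch below. Here work is a hypothetical stand-in that returns a (scenario, value) pair, and run_all is a name chosen for the example.

```python
import concurrent.futures

def work(scenario):
    # Hypothetical stand-in for runScenario: return the result
    # instead of storing it in a global variable.
    return scenario, len(scenario)

def run_all(scenario_list, max_workers=6):
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order, regardless of completion order.
        return dict(executor.map(work, scenario_list))
```

Leaving the with block waits for all workers to finish and shuts the pool down, so no sentinel handling is needed.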

(Python) Stop thread with raw input?

EDIT 9/15/16: In my original code (still posted below) I tried to use .join() with a function, which is a silly mistake because it can only be used with a thread object. I am trying to
(1) continuously run a thread that gets data and saves it to a file
(2) have a second thread, or incorporate queue, that will stop the program once a user enters a flag (i.e. "stop"). It doesn't interrupt the data gathering/saving thread.
I need help with multithreading. I am trying to run two threads, one that handles data and the second checks for a flag to stop the program.
I learned by trial and error that I can't interrupt a while loop without my computer exploding. Additionally, I have abandoned my GUI code because it made my code too complicated with the multithreading.
What I want to do is run a thread that gathers data from an Arduino, saves it to a file, and repeats this. The second thread will scan for a flag -- which can be a raw_input? I can't think of anything else that a user can do to stop the data acquisition program.
I greatly appreciate any help on this. Here is my code (much of it is pseudocode, as you can see):
#threading
import thread
import time
global flag
def monitorData():
    print "running!"
    time.sleep(5)
def stopdata(flag ):
    flag = raw_input("enter stop: ")
    if flag == "stop":
        monitorData.join()
flag = "start"
thread.start_new_thread( monitorData,())
thread.start_new_thread( stopdata,(flag,))
The error I am getting is this when I try entering "stop" in the IDLE.
Unhandled exception in thread started by
Traceback (most recent call last):
File "c:\users\otangu~1\appdata\local\temp\IDLE_rtmp_h_frd5", line 16, in stopdata
AttributeError: 'function' object has no attribute 'join'
Once again I really appreciate any help, I have taught myself Python so far and this is the first huge wall that I've hit.
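The error you see is a result of calling join on the function. You need to call join on a thread object, and since you don't keep a reference to the thread, you have no way to call join anyway. Note that thread.start_new_thread only returns an integer identifier, not an object with a join method, so you need the threading module for this:
import threading
th1 = threading.Thread(target=monitorData)
th1.start()
# later
th1.join()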
As for a solution, you can use a Queue to communicate between threads. The queue is used to send a quit message to the worker thread and if the worker does not pick anything up off the queue for a second it runs the code that gathers data from the arduino.
from threading import Thread
from Queue import Queue, Empty
def worker(q):
    while True:
        try:
            item = q.get(block=True, timeout=1)
            q.task_done()
            if item == "quit":
                print("got quit msg in thread")
                break
        except Empty:
            print("empty, do some arduino stuff")
def input_process(q):
    while True:
        x = raw_input("")
        if x == 'q':
            print("will quit")
            q.put("quit")
            break
q = Queue()
t = Thread(target=worker, args=(q,))
t.start()
t2 = Thread(target=input_process, args=(q,))
t2.start()
# waits for the `task_done` function to be called
q.join()
t2.join()
t.join()
It's possibly a bit more code than you hoped for, and having to detect that the queue is empty with an exception is a little ugly, but this doesn't rely on any global variables and will always exit promptly. That won't be the case with sleep-based solutions, which need to wait for any current call to sleep to finish before resuming execution.
As noted by someone else, you should really be using threading rather than the older thread module, and I would also recommend that you learn with Python 3 and not Python 2.
You're looking for something like this:
from threading import Thread
from time import sleep
# "volatile" global shared by threads
active = True
def get_data():
    while active:
        print "working!"
        sleep(3)
def wait_on_user():
    global active
    raw_input("press enter to stop")
    active = False
th1 = Thread(target=get_data)
th1.start()
th2 = Thread(target=wait_on_user)
th2.start()
th1.join()
th2.join()
You made a few obvious and a few less obvious mistakes in your code. First, join is called on a thread object, not a function. Similarly, join doesn't kill a thread, it waits for the thread to finish. A thread finishes when it has no more code to execute. If you want a thread to run until some flag is set, you normally include a loop in your thread that checks the flag every second or so (depending on how precise you need the timing to be).
Also, the threading module is preferred over the lower-level thread module. The latter was renamed to _thread in Python 3.
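For Python 3, the same flag idea can be sketched with a threading.Event instead of a global boolean. This is only an illustration: the samples list stands in for reading the Arduino and saving to a file, and the prompt parameter is an assumption added so the function that waits for the user can be swapped out.

```python
import threading

def get_data(stop_event, samples, interval=0.05):
    # Runs until stop_event is set; wait() doubles as an interruptible sleep.
    while not stop_event.is_set():
        samples.append("reading")       # stand-in for reading the Arduino and saving to a file
        stop_event.wait(interval)

def wait_on_user(stop_event, prompt=input):
    # Blocks until the user presses Enter, then asks the worker to stop.
    prompt("press enter to stop")
    stop_event.set()

def run(prompt=input):
    stop_event = threading.Event()
    samples = []
    worker = threading.Thread(target=get_data, args=(stop_event, samples))
    worker.start()
    wait_on_user(stop_event, prompt)    # the prompt runs on the main thread
    worker.join()                       # returns quickly once the flag is set
    return samples
```

Using stop_event.wait(interval) rather than sleep means the worker notices the flag immediately instead of finishing its current sleep first.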
This is not possible. The thread function has to finish. You can't join it from the outside.

How to pause and resume a thread using the threading module?

I have a long process that I've scheduled to run in a thread, because otherwise it will freeze the UI in my wxpython application.
I'm using:
threading.Thread(target=myLongProcess).start()
to start the thread and it works, but I don't know how to pause and resume the thread. I looked in the Python docs for the above methods, but wasn't able to find them.
Could anyone suggest how I could do this?
I did some speed tests as well, the time to set the flag and for action to be taken is pleasantly fast 0.00002 secs on a slow 2 processor Linux box.
Example of thread pause test using set() and clear() events:
import threading
import time
# This function gets called by our thread.. so it basically becomes the thread init...
def wait_for_event(e):
    while True:
        print('\tTHREAD: This is the thread speaking, we are Waiting for event to start..')
        event_is_set = e.wait()
        print('\tTHREAD: WHOOOOOO HOOOO WE GOT A SIGNAL : %s' % event_is_set)
        # or for Python >= 3.6
        # print(f'\tTHREAD: WHOOOOOO HOOOO WE GOT A SIGNAL : {event_is_set}')
        e.clear()
# Main code
e = threading.Event()
t = threading.Thread(name='pausable_thread',
                     target=wait_for_event,
                     args=(e,))
t.start()
while True:
    print('MAIN LOOP: still in the main loop..')
    time.sleep(4)
    print('MAIN LOOP: I just set the flag..')
    e.set()
    print('MAIN LOOP: now Im gonna do some processing')
    time.sleep(4)
    print('MAIN LOOP: .. some more processing im doing yeahhhh')
    time.sleep(4)
    print('MAIN LOOP: ok ready, soon we will repeat the loop..')
    time.sleep(2)
There is no method for other threads to forcibly pause a thread (any more than there is for other threads to kill that thread) -- the target thread must cooperate by occasionally checking appropriate "flags" (a threading.Condition might be appropriate for the pause/unpause case).
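A cooperative pause/resume built on threading.Condition could be sketched as follows. The class PausableWorker and its step callable are invented for this illustration; a real long-running process would have to be split into units of work that are short enough to check the flag between them.

```python
import threading

class PausableWorker(threading.Thread):
    def __init__(self, step):
        super().__init__(daemon=True)
        self._step = step               # callable performing one unit of work
        self._cond = threading.Condition()
        self._paused = False
        self._stopped = False

    def run(self):
        while True:
            with self._cond:
                # Cooperative check: block here while paused,
                # wake up on resume() or stop().
                while self._paused and not self._stopped:
                    self._cond.wait()
                if self._stopped:
                    return
            self._step()

    def pause(self):
        with self._cond:
            self._paused = True

    def resume(self):
        with self._cond:
            self._paused = False
            self._cond.notify_all()

    def stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify_all()
```

The UI thread would call pause() and resume() from its event handlers. The worker only pauses between two calls to step, never in the middle of one.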
If you're on a unix-y platform (anything but Windows, basically), you could use multiprocessing instead of threading -- that is much more powerful, and lets you send signals to the "other process"; SIGSTOP should unconditionally pause a process and SIGCONT continues it (if your process needs to do something right before it pauses, consider also the SIGTSTP signal, which the other process can catch to perform such pre-suspension duties). (There may be ways to obtain the same effect on Windows, but I'm not knowledgeable about them, if any.)
You can use signals: http://docs.python.org/library/signal.html#signal.pause
To avoid using signals you could use a token passing system. If you want to pause it from the main UI thread you could probably just use a Queue.Queue object to communicate with it.
Just put a message on the queue telling the thread to sleep for a certain amount of time.
Alternatively you could simply continuously push tokens onto the queue from the main UI thread. The worker should just check the queue every N seconds (0.2 or something like that). When there are no tokens to dequeue the worker thread will block. When you want it to start again just start pushing tokens on to the queue from the main thread again.
The multiprocessing module works fine on Windows. See the documentation here (end of first paragraph):
http://docs.python.org/library/multiprocessing.html
On the wxPython IRC channel, we had a couple fellows trying multiprocessing out and they said it worked. Unfortunately, I have yet to see anyone who has written up a good example of multiprocessing and wxPython.
If you (or anyone else on here) come up with something, please add it to the wxPython wiki page on threading here: http://wiki.wxpython.org/LongRunningTasks
You might want to check that page out regardless as it has several interesting examples using threads and queues.
You might take a look at the Windows API for thread suspension.
As far as I'm aware there is no POSIX/pthread equivalent. Furthermore, I cannot ascertain whether thread handles/IDs are made available from Python. There are also potential issues with Python itself: its threads are scheduled by the native scheduler, so it is unlikely to expect threads to be suspended, particularly if a thread is suspended while holding the GIL, amongst other possibilities.
I had the same issue. It is more effective to use time.sleep(1800) in the thread loop to pause the thread execution.
e.g.
MON, TUE, WED, THU, FRI, SAT, SUN = range(7) #Enumerate days of the week
Thread 1:
def run(self):
    while not self.exit:
        try:
            localtime = time.localtime(time.time())
            #Evaluate stock
            if localtime.tm_hour > 16 or localtime.tm_wday > FRI:
                # do something
                pass
            else:
                print('Waiting to evaluate stocks...')
                time.sleep(1800)
        except:
            print(traceback.format_exc())
Thread 2:
def run(self):
    while not self.exit:
        try:
            localtime = time.localtime(time.time())
            if localtime.tm_hour >= 9 and localtime.tm_hour <= 16:
                # do something
                pass
            else:
                print('Waiting to update stocks indicators...')
                time.sleep(1800)
        except:
            print(traceback.format_exc())
