How to start a thread again with an Event object in Python?

I want to create a thread and control it with an Event object. More specifically, I want the thread to execute whenever the event is set and then go back to waiting, repeatedly.
The code below shows a rough sketch of the logic I have in mind.
import threading
import time

e = threading.Event()

def start_operation():
    e.wait()
    while e.is_set():
        print('STARTING TASK')
        e.clear()

t1 = threading.Thread(target=start_operation)
t1.start()

e.set() # first set
e.set() # second set
I expected t1 to run once the first set was issued and to stop itself (due to the e.clear() inside it), and then to run again after the second set. So, according to what I expected, it should print 'STARTING TASK' two times. But it prints it only once, and I don't understand why. How should I change the code so that the while loop runs again whenever the event object is set?

The first problem is that once you exit a while loop, you've exited it. Changing the predicate back won't change anything. Forget about events for a second and just look at this code:
i = 0
while i == 0:
    i = 1
It obviously doesn't matter if you set i = 0 again later, right? You've already left the while loop, and the whole function. And your code is doing exactly the same thing.
You can fix that problem by just adding another while loop around the whole thing:
def start_operation():
    while True:
        e.wait()
        while e.is_set():
            print('STARTING TASK')
            e.clear()
However, that still isn't going to work—except maybe occasionally, by accident.
Event.set doesn't block; it just sets the event immediately, even if it's already set. So, the most likely flow of control here is:
1. The background thread hits e.wait() and blocks.
2. The main thread hits e.set() and sets the event.
3. The main thread hits e.set() again, with no effect.
4. The background thread wakes up, does the loop once, and calls e.clear() at the end.
5. The background thread waits forever on e.wait().
(The fact that there's no way to avoid missed signals with events is effectively the reason condition variables were invented, and why almost nothing newer than Win32 and Python bothers with events… But a condition isn't sufficient here either.)
If you want the main thread to block until the event is clear, and only then set it again, you can't do that. You need something extra, like a second event, which the main thread can wait on and the background thread can set.
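For illustration, here's a minimal sketch of that two-event handshake (the names work and done are mine, not from the original code):

import threading

work = threading.Event()   # main -> worker: "run the task"
done = threading.Event()   # worker -> main: "task finished"

def start_operation():
    while True:
        work.wait()        # block until the main thread signals
        work.clear()
        print('STARTING TASK')
        done.set()         # tell the main thread we're idle again

t1 = threading.Thread(target=start_operation, daemon=True)
t1.start()

for _ in range(2):
    done.clear()
    work.set()             # trigger one run
    done.wait()            # block until the worker has finished it

This prints 'STARTING TASK' exactly twice, because the main thread never sets the event again while a run is still pending.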
But if you want to keep track of multiple set calls, without missing any, you need to use a different sync mechanism. A queue.Queue may be overkill here, but it's dead simple to do in Python, so let's just use that. Of course you don't actually have any values to put on the queue, but that's OK; you can just stick a dummy value there:
import queue
import threading
q = queue.Queue()
def start_operation():
while True:
_ = q.get()
print('STARTING TASK')
t1 = threading.Thread(target=start_operation)
t1.start()
q.put(None)
q.put(None)
And if you later want to add a way to shut down the background thread, just change it to put meaningful values on the queue:
import queue
import threading

q = queue.Queue()

def start_operation():
    while True:
        if q.get():
            return
        print('STARTING TASK')

t1 = threading.Thread(target=start_operation)
t1.start()

q.put(False)
q.put(False)
q.put(True)
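If the main thread should also wait for the worker to finish shutting down, a join is all it takes (a one-line follow-up to the sketch above):

t1.join()  # returns once the worker has seen the True sentinel and returned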

Related

What is the best way to take user input while in a loop?

I have a Python program running a (nested) loop which will run for a fairly long time, and I want the user to be able to pause and/or abort it just by pressing p or c, respectively.
I am running this in an IPython console, so I don't really have access to msvcrt.getch, and I'd like to keep it platform independent anyway.
Obviously, input() blocks, which is exactly what I do not want. So I tried threading, which works when used as intended, but when hitting Ctrl-C the thread does not stop. This is likely because any legitimate method of stopping the thread (atexit, a global variable, or the lambda stop_thread) isn't executed, because the thread blocks.
import threading
import queue
import time

q = queue.SimpleQueue()
stop_thread = False

def handle_input(q, stopped):
    s = ''
    while not stopped():
        s = input()
        q.put(s)

thread = threading.Thread(target=handle_input,
                          args=[q, lambda: stop_thread])
thread.start()

for i in range(very_long_time):
    # Do something time consuming
    if not q.empty():
        s = q.get_nowait()
        if 'p' in s:
            print('Paused...', end='\r')
            s = s.replace('p', '')
            while True:
                if not q.empty():
                    s += q.get_nowait()
                    if 'p' in s or 'c' in s:
                        s = s.replace('p', '')
                        break
                time.sleep(0.5)
        if 'c' in s:
            print('\rAborted training loop...' + ' '*50, end='\r')
            s = s.replace('c', '')
            stop_thread = True
            # Another method of stopping the thread
            # thread.__getattribute__('_tstate_lock').release()
            # thread._stop()
            # thread.join()
            break
This works in principle, but breaks when interrupting.
The thread does not seem to stop, which poses a problem when running this again in the same console, because it does not even ask for user input then.
Additionally, this prints my 'c' or 'p' and a newline, which I can't get rid of, because IPython doesn't allow all ANSI escapes.
Is there a fix to my method, or even better, a cleaner alternative?
You can try using the keyboard module, which (among other things) lets you bind event hooks to keyboard presses.
In this case, I would create a set of global variables/flags (say, paused and abort), initially set to False, and then make some hotkeys for p and c respectively to toggle them:
import keyboard  # third-party module: pip install keyboard

paused = False
abort = False

def toggle_paused():
    global paused
    paused = not paused

def trigger_abort():
    global abort  # without this, the assignment would be local and have no effect
    abort = True

# pass the functions themselves; calling them here would register their
# return value (None) as the callback
keyboard.add_hotkey('p', toggle_paused)
keyboard.add_hotkey('c', trigger_abort)
Then change your loop to check paused and abort on every iteration (assuming each iteration is fairly quick). What you're already doing would more or less work: just remove the queue and threading setup (IIRC, keyboard's events run on their own threads anyway), de-indent the if conditions, and change them to if paused: and if abort: respectively. For example:
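Here is a rough sketch of what that polling loop could look like (very_long_time and the loop body are placeholders carried over from the question):

import time

for i in range(very_long_time):
    # Do something time consuming
    while paused:          # wait here until 'p' is pressed again
        time.sleep(0.5)
    if abort:
        print('Aborted training loop...')
        break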
You can also lace the rest of your code with checks for the pause and abort flags, so that your program can pause or exit gracefully at a time convenient for it. And you can extend toggle_paused() and trigger_abort() to do whatever you need (e.g. have trigger_abort() print "Trying to abort program (kill me if I'm not done in 5 seconds)" or something).
Although, as @Tomerikoo suggested in a comment, creating the thread with the daemon=True option is the best answer, if it's possible with the way your program is designed. If this is all your program does, then a daemon thread won't work, because your program would just quit immediately; but if this is a background operation, a daemon thread puts it in the background where it won't obstruct the rest of the user's experience.
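For reference, making the question's input thread a daemon is a one-argument change (a sketch against the question's own handle_input):

thread = threading.Thread(target=handle_input,
                          args=[q, lambda: stop_thread],
                          daemon=True)  # won't keep the process alive at exit
thread.start()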

Python script is hanging AFTER multithreading

I know there are a few questions and answers related to hanging threads in Python, but my situation is slightly different as the script is hanging AFTER all the threads have been completed. The threading script is below, but obviously the first 2 functions are simplified massively.
When I run the script shown, it works. When I use my real functions, the script hangs AFTER THE LAST LINE. So, all the scenarios are processed (and a message printed to confirm), logStudyData() then collates all the results and writes to a csv. "Script Complete" is printed. And THEN it hangs.
The script with threading functionality removed runs fine.
I have tried enclosing the main script in try...except but no exception gets logged. If I use a debugger with a breakpoint on the final print and then step it forward, it hangs.
I know there is not much to go on here, but short of including the whole 1500-line script, I don't know what else to do. Any suggestions welcome!
def runScenario(scenario):
    # Do a bunch of stuff
    with lock:
        # access global variables
        pass
    pass

def logStudyData():
    # Combine results from all scenarios into a df and write to csv
    pass

def worker():
    global q
    while True:
        next_scenario = q.get()
        if next_scenario is None:
            break
        runScenario(next_scenario)
        print(next_scenario, " is complete")
        q.task_done()

import threading
from queue import Queue

global q, lock
q = Queue()
threads = []
scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']
num_worker_threads = 6
lock = threading.Lock()

for i in range(num_worker_threads):
    print("Thread number ", i)
    this_thread = threading.Thread(target=worker)
    this_thread.start()
    threads.append(this_thread)

for scenario_name in scenario_list:
    q.put(scenario_name)

q.join()
print("q.join completed")
logStudyData()
print("script complete")
As the docs for Queue.get say:
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
In other words, there is no way get can ever return None, except by you calling q.put(None) on the main thread, which you don't do.
Notice that the example directly below those docs does this:
for i in range(num_worker_threads):
    q.put(None)
for t in threads:
    t.join()
The second loop isn't strictly necessary (the interpreter implicitly joins non-daemon threads at exit), so you can usually get away without it.
But the first one is absolutely necessary. You need to either do this, or come up with some other mechanism to tell your workers to quit. Without that, your main thread just tries to exit, which means it tries to join every worker, but those workers are all blocked forever on a get that will never happen, so your program hangs forever.
Building a thread pool may not be rocket science (if only because rocket scientists tend to need their calculations to be deterministic and hard real-time…), but it's not trivial, either, and there are plenty of things you can get wrong. You may want to consider using one of the two already-built threadpools in the Python standard library, concurrent.futures.ThreadPoolExecutor or multiprocessing.dummy.Pool. This would reduce your entire program to:
import concurrent.futures

def work(scenario):
    runScenario(scenario)
    print(scenario, " is complete")

scenario_list = ['s1','s2','s3','s4','s5','s6','s7','s8','s9','s10','s11','s12']

with concurrent.futures.ThreadPoolExecutor(max_workers=6) as x:
    results = list(x.map(work, scenario_list))

print("q.join completed")
logStudyData()
print("script complete")
Obviously you'll still need a lock around any mutable variables you change inside runScenario—although if you're only using a mutable variable there because you couldn't figure out how to return values to the main thread, that's trivial with an Executor: just return the values from work, and then you can use them like this:
for result in x.map(work, scenario_list):
    do_something(result)
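For comparison, here's a minimal sketch of the same program using the other standard-library pool mentioned above, multiprocessing.dummy.Pool (which, despite its name, is a pool of threads, not processes):

from multiprocessing.dummy import Pool

with Pool(6) as pool:
    results = pool.map(work, scenario_list)  # blocks until every scenario is done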

(Python) Stop thread with raw input?

EDIT 9/15/16: In my original code (still posted below) I tried to use .join() with a function, which is a silly mistake because it can only be used with a thread object. I am trying to
(1) continuously run a thread that gets data and saves it to a file
(2) have a second thread, or incorporate queue, that will stop the program once a user enters a flag (i.e. "stop"). It doesn't interrupt the data gathering/saving thread.
I need help with multithreading. I am trying to run two threads, one that handles data and the second checks for a flag to stop the program.
I learned by trial and error that I can't interrupt a while loop without my computer exploding. Additionally, I have abandoned my GUI code because it made my code too complicated with the multithreading.
What I want to do is run a thread that gathers data from an Arduino, saves it to a file, and repeats this. The second thread will scan for a flag -- which can be a raw_input? I can't think of anything else that a user can do to stop the data acquisition program.
I greatly appreciate any help on this. Here is my code (much of it is pseudocode, as you can see):
# threading
import thread
import time

global flag

def monitorData():
    print "running!"
    time.sleep(5)

def stopdata(flag):
    flag = raw_input("enter stop: ")
    if flag == "stop":
        monitorData.join()

flag = "start"

thread.start_new_thread(monitorData, ())
thread.start_new_thread(stopdata, (flag,))
This is the error I get when I enter "stop" in IDLE:
Unhandled exception in thread started by
Traceback (most recent call last):
File "c:\users\otangu~1\appdata\local\temp\IDLE_rtmp_h_frd5", line 16, in stopdata
AttributeError: 'function' object has no attribute 'join'
Once again I really appreciate any help, I have taught myself Python so far and this is the first huge wall that I've hit.
The error you see is a result of calling join on the function. You need to call join on a thread object; you also never capture a reference to the thread, so you have no way to call join anyway. Note that the low-level thread.start_new_thread only returns a bare thread identifier, not a joinable object, so to join at all you need threading.Thread:
from threading import Thread

th1 = Thread(target=monitorData)
th1.start()
# later
th1.join()
As for a solution, you can use a Queue to communicate between threads. The queue is used to send a quit message to the worker thread and if the worker does not pick anything up off the queue for a second it runs the code that gathers data from the arduino.
from threading import Thread
from Queue import Queue, Empty

def worker(q):
    while True:
        try:
            item = q.get(block=True, timeout=1)
            q.task_done()
            if item == "quit":
                print("got quit msg in thread")
                break
        except Empty:
            print("empty, do some arduino stuff")

def input_process(q):
    while True:
        x = raw_input("")
        if x == 'q':
            print("will quit")
            q.put("quit")
            break

q = Queue()
t = Thread(target=worker, args=(q,))
t.start()
t2 = Thread(target=input_process, args=(q,))
t2.start()

# waits for the `task_done` function to be called
q.join()
t2.join()
t.join()
It's possibly a bit more code than you hoped for, and having to detect an empty queue with an exception is a little ugly, but this doesn't rely on any global variables and will always exit promptly. That won't be the case with sleep-based solutions, which need to wait for any current call to sleep to finish before resuming execution.
As noted by someone else, you should really be using the threading module rather than the older thread module, and I would also recommend learning Python 3 rather than Python 2.
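For reference, the same approach in Python 3 only needs a couple of renames; a sketch:

from threading import Thread
from queue import Queue, Empty   # the module was renamed from Queue

def worker(q):
    while True:
        try:
            item = q.get(block=True, timeout=1)
            q.task_done()
            if item == "quit":
                break
        except Empty:
            print("empty, do some arduino stuff")

def input_process(q):
    while True:
        if input("") == 'q':     # input() replaces raw_input()
            q.put("quit")
            break

q = Queue()
t = Thread(target=worker, args=(q,))
t2 = Thread(target=input_process, args=(q,))
t.start()
t2.start()
t.join()
t2.join()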
You're looking for something like this:
from threading import Thread
from time import sleep

# "volatile" global shared by threads
active = True

def get_data():
    while active:
        print "working!"
        sleep(3)

def wait_on_user():
    global active
    raw_input("press enter to stop")
    active = False

th1 = Thread(target=get_data)
th1.start()
th2 = Thread(target=wait_on_user)
th2.start()

th1.join()
th2.join()
You made a few obvious and a few less obvious mistakes in your code. First, join is called on a thread object, not a function. Similarly, join doesn't kill a thread, it waits for the thread to finish. A thread finishes when it has no more code to execute. If you want a thread to run until some flag is set, you normally include a loop in your thread that checks the flag every second or so (depending on how precise you need the timing to be).
Also, the threading module is preferred over the lower-level thread module. The latter was removed in Python 3.
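For what it's worth, threading.Event expresses the same stop-flag idea a bit more explicitly than a bare global, and its wait method doubles as an interruptible sleep. A Python 3 sketch with the same structure as the code above:

from threading import Thread, Event

stop = Event()

def get_data():
    while not stop.is_set():
        print("working!")
        stop.wait(3)   # sleeps up to 3 seconds, but wakes immediately on stop.set()

def wait_on_user():
    input("press enter to stop")
    stop.set()

th1 = Thread(target=get_data)
th2 = Thread(target=wait_on_user)
th1.start()
th2.start()
th1.join()
th2.join()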
This is not possible. The thread function has to finish on its own; you can't kill it from the outside.

Python Multiprocessing Async Can't Terminate Process

I have an infinite loop running async but I can't terminate it. Here is a similar version of my code:
from multiprocessing import Pool

test_pool = Pool(processes=1)

self.button1.clicked.connect(self.starter)
self.button2.clicked.connect(self.stopper)

def starter(self):
    global test_pool
    test_pool.apply_async(self.automatizer)

def automatizer(self):
    i = 0
    while i != 0:
        self.job1()
        # safe stop point
        self.job2()
        # safe stop point
        self.job3()
        # safe stop point

def job1(self):
    # doing some stuff
    pass

def job2(self):
    # doing some stuff
    pass

def job3(self):
    # doing some stuff
    pass

def stopper(self):
    global test_pool
    test_pool.terminate()
My problem is that terminate() inside the stopper function doesn't work. I tried putting terminate() inside the job1, job2, and job3 functions; still not working. I tried putting it at the end of the loop in the starter function; again not working. How can I stop this async process?
While stopping the process at any time would be good enough, is it possible to make it stop at the points I want? I mean, if a stop command (not sure what that command would be) is given to the process, I want it to complete the steps up to a "# safe stop point" marker and then terminate.
You really should be avoiding the use of terminate() in normal operation. It should only be used in unusual cases, such as hanging or unresponsive processes. The normal way to end a process pool is to call pool.close() followed by pool.join().
These methods do require the function that your pool is executing to return, and your call to pool.join() will block your main process until it does so. I would suggest you add a multiprocessing.Queue to give yourself a way to tell your subprocess to exit:
# this import is NOT the same as multiprocessing.Queue - this is here for the
# Queue.Empty exception
import Queue

queue = multiprocessing.Queue()  # not the same as a Queue.Queue()

def stopper(self):
    # don't need the "global" keyword to call a global object's method
    # (it's only necessary if we want to modify a global)
    queue.put("Stop")
    test_pool.close()
    test_pool.join()

def automatizer(self):
    while True:  # cleaner infinite loop - yours was never executing
        for func in [self.job1, self.job2, self.job3]:  # iterate over methods
            func()  # call each one
            # between each function call, check the queue for the "poison pill"
            try:
                if queue.get(block=False) == "Stop":
                    return
            except Queue.Empty:
                pass
Since you didn't provide a more complete code sample, you'll have to figure out where to actually instantiate the multiprocessing.Queue and how to pass things around. Also, the comment from Janne Karila was correct. You should switch your code to use a single Process instead of a pool if you're only using one process at a time anyway. The Process class also uses a blocking join() method to tell it to end once it has returned. The only safe way to end processes at "known safe points" is to implement some kind of interprocess communication like I've done here. Pipes would work as well.
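To make that concrete, here's a minimal sketch of the single-Process variant (job1, job2, and job3 are hypothetical stand-ins for the question's methods, since the full class wasn't shown):

import multiprocessing
import Queue  # only for the Queue.Empty exception

def automatizer(queue):
    while True:
        for func in [job1, job2, job3]:
            func()
            # each check below is one of the "safe stop points"
            try:
                if queue.get(block=False) == "Stop":
                    return
            except Queue.Empty:
                pass

queue = multiprocessing.Queue()
proc = multiprocessing.Process(target=automatizer, args=(queue,))
proc.start()
# ... later, from the GUI's stop handler:
queue.put("Stop")
proc.join()  # blocks until automatizer returns at a safe stop point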

Python Queue waiting for thread before getting next item

I have a queue that always needs to be ready to process items when they are added to it. The function that runs on each item in the queue creates and starts a thread to execute the operation in the background so the program can go do other things.
However, the function I am calling on each item in the queue simply starts the thread and then completes execution, regardless of whether or not the thread it started completed. Because of this, the loop will move on to the next item in the queue before the program is done processing the last item.
Here is code to better demonstrate what I am trying to do:
import threading
import Queue

queue = Queue.Queue()

def addTask():
    queue.put(SomeObject())

def worker():
    while True:
        try:
            # If an item is put onto the queue, immediately execute it (unless
            # an item on the queue is still being processed, in which case wait
            # for it to complete before moving on to the next item in the queue)
            item = queue.get()
            runTests(item)
            # I want to wait for 'runTests' to complete before moving past this point
        except Queue.Empty, err:
            # If the queue is empty, just keep running the loop until something
            # is put on top of it.
            pass

def runTests(args):
    op_thread = SomeThread(args)
    op_thread.start()
    # My problem: once 'op_thread.start()' starts the thread, the 'runTests'
    # function completes, but the operation executed by that thread is not yet
    # done, because it is still running in the background. I do not want
    # 'runTests' to actually complete until the operation in op_thread is done.
    """op_thread.join()"""
    # I tried putting this line after 'op_thread.start()', but that did not
    # solve anything. I have commented it out because it is not necessary to
    # demonstrate what I am trying to do, but I just wanted to show that I tried it.

t = threading.Thread(target=worker)
t.start()
Some notes:
This is all running in a PyGTK application. Once the 'SomeThread' operation is complete, it sends a callback to the GUI to display the results of the operation.
I do not know how much this affects the issue I am having, but I thought it might be important.
A fundamental issue with Python threads is that you can't just kill them - they have to agree to die.
What you should do is:
1. Implement the thread as a class.
2. Add a threading.Event member which the join method clears and the thread's main loop occasionally checks; if it sees it's cleared, it returns. To do this, override threading.Thread.join to clear the event and then call Thread.join on itself.
3. To allow (2), make the read from the Queue block with some small timeout. This way your thread's "response time" to the kill request is the timeout, and on the other hand no CPU choking is done.
Here's some code from a socket client thread I have that has the same issue with blocking on a queue:
class SocketClientThread(threading.Thread):
    """ Implements the threading.Thread interface (start, join, etc.) and
        can be controlled via the cmd_q Queue attribute. Replies are placed in
        the reply_q Queue attribute.
    """
    def __init__(self, cmd_q=Queue.Queue(), reply_q=Queue.Queue()):
        super(SocketClientThread, self).__init__()
        self.cmd_q = cmd_q
        self.reply_q = reply_q
        self.alive = threading.Event()
        self.alive.set()
        self.socket = None

        self.handlers = {
            ClientCommand.CONNECT: self._handle_CONNECT,
            ClientCommand.CLOSE: self._handle_CLOSE,
            ClientCommand.SEND: self._handle_SEND,
            ClientCommand.RECEIVE: self._handle_RECEIVE,
        }

    def run(self):
        while self.alive.isSet():
            try:
                # Queue.get with timeout to allow checking self.alive
                cmd = self.cmd_q.get(True, 0.1)
                self.handlers[cmd.type](cmd)
            except Queue.Empty as e:
                continue

    def join(self, timeout=None):
        self.alive.clear()
        threading.Thread.join(self, timeout)
Note self.alive and the loop in run.
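Usage then looks like ordinary Thread usage, except that join doubles as the shutdown request (a sketch, assuming the ClientCommand plumbing from the snippet exists):

client = SocketClientThread()
client.start()
# ... put ClientCommand objects on client.cmd_q as work arrives ...
client.join()   # clears self.alive, so run() returns within ~0.1 seconds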
