I am working on a Python service that subscribes to real-time streaming data from one messaging broker and publishes it to another broker. In some situations I also need to fetch snapshot data from another data source, for example on network disconnection or system recovery. The streaming data arrives on one thread and some service events happen on another thread, so I decided to create a data processing thread that just pops items off a queue one by one. I got it to work, but later I tried to move the snapshot fetching logic into a separate thread, and that's where things got messy.
I know this is a long question with a lot of specific nuances, but I tried to make the example here as clear as I could.
So here is what the first attempt looks like, and it works well:
import queue
import threading
import time
def process_data(data_queue, data_store):
# data_store is my internal cache data structure.
# so for simplicity and demonstration purpose, I assume the following:
# if its type is dict, it's snapshot data
# if its type is tuple, it's a key/value pair and that's an incremental update data
# if it is -1, we terminate the queue processing
# if it is -2, we need to retrieve a snapshot
while True:
x = data_queue.get()
if isinstance(x, dict):
data_store.on_snapshot(x)
elif isinstance(x, tuple):
k, v = x
data_store.on_update(k, v)
elif isinstance(x, int):
if x == -1:
data_queue.task_done()
break
elif x == -2:
get_snapshot() # this is potentially a long blocking call
else:
print('unknown int', x)
else:
print('unknown data', x)
data_queue.task_done()
if __name__ == '__main__':
data_store = DataStore()
data_queue = queue.Queue()
# start other threads that write data to the queue
start_data_writer1(data_queue)
start_data_writer2(data_queue)
start_thread_for_some_event(data_queue) # may put -2 in the queue for snapshot
process_thread = threading.Thread(
target=process_data,
args=(data_queue, data_store))
process_thread.start()
data_queue.put(-2) # signal a snapshot fetching
do_something_else()
try:
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
print('terminating...')
finally:
# to break out of the infinite loop in process_data()
data_queue.put(-1)
process_thread.join()
data_queue.join()
This works, however I don't particularly like calling the get_snapshot() function in the processing thread. The whole idea of the processing thread is to stay busy popping data off the queue whenever there is anything to pop. In the implementation above, the queue can build up while get_snapshot() blocks, because the other writer threads keep producing.
So I tried something else, and I also wanted to be able to exit the program gracefully. That's where things got really ugly. I created a new thread for occasionally fetching the snapshot and used a Condition object for thread communication. This is what I did on top of the existing code:
snapshot_lock = threading.Lock()
cond = threading.Condition(snapshot_lock)
need_for_snapshot = False # used to trigger snapshots
keep_snapshot_thread = True # set to False when the snapshot thread should exit
# then I need to add this new function to run snapshot fetching
def fetch_snapshot(data_queue):
global need_for_snapshot
global keep_snapshot_thread
def _need_snapshot():
global need_for_snapshot
return need_for_snapshot
while True:
with cond:
cond.wait_for(_need_snapshot)
if not keep_snapshot_thread:
break
data_queue.put(get_snapshot()) # the long blocking function
need_for_snapshot = False
# in the process_data() function, everything stays the same except for the `if` branch that handles `x == -2`
def process_data(data_queue, data_store):
global need_for_snapshot
while True:
x = data_queue.get()
# omitting some old code
elif isinstance(x, int):
if x == -1:
data_queue.task_done()
break
elif x == -2:
with cond:
    need_for_snapshot = True
    cond.notify()
# more code omitted
if __name__ == '__main__':
# same code as before except for the finally part
try:
# start other threads...omitting some code
# when a snapshot is needed in these threads
# do the following
# with cond:
# need_for_snapshot = True
# cond.notify()
# start snapshot worker thread
snapshot_thread = threading.Thread(
target=fetch_snapshot, args=(data_queue,))
process_thread = threading.Thread(
target=process_data,
args=(data_queue, data_store))
snapshot_thread.start()
process_thread.start()
data_queue.put(-2) # signal fetching a snapshot
# omitting more code here...
finally:
keep_snapshot_thread = False
# we don't technically need to trigger another snapshot now
# but the code below is to unblock the cond.wait_for() part
# since keep_snapshot_thread flag is just flipped, we can use
# it to break out of the infinite loop in fetch_snapshot thread.
# This is the part that I feel hacky...
with cond:
need_for_snapshot = True
cond.notify()
snapshot_thread.join()
data_queue.put(-1) # signal the termination of process_thread
process_thread.join()
data_queue.join()
I think I got this to work, and in particular the program exits gracefully when I hit Ctrl-C, but it is so ugly and tricky that I had to fiddle with it quite a bit to get it to work correctly.
Is there some way I can write it more elegantly? Is there some sort of pattern that we generally use to solve this type of problem? Thank you so much for your help.
The standard technique for handling multiple producers and multiple consumers is to use an Event (is_done) and a joinable Queue (work).
The worker threads do nothing but:
while not is_done.is_set():
    try:
        job = work.get(timeout=5)
    except queue.Empty:
        continue
    # ... handle the job ...
    work.task_done()
Your main thread does the following:
start the jobs that produce work
wait for them to be done
work.join()  # wait for the queue to be empty
is_done.set()  # tell the workers they can exit
perform any cleanup necessary
Note that the goal is to decouple the workers and the producers as much as possible. Trying to create complicated logic tying them together is almost certain to produce race conditions.
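Sketched against those names (is_done and work), with a dummy producer and a print() standing in for the real per-item processing, the whole lifecycle might look roughly like this (a minimal sketch, not a drop-in implementation):

import queue
import threading

is_done = threading.Event()
work = queue.Queue()

def worker():
    # stays busy popping items; wakes up periodically to check is_done
    while not is_done.is_set():
        try:
            item = work.get(timeout=1)
        except queue.Empty:
            continue
        print('processed', item)   # stand-in for on_snapshot()/on_update()
        work.task_done()

def producer():
    for i in range(5):             # stand-in for a data writer thread
        work.put(i)

workers = [threading.Thread(target=worker) for _ in range(2)]
for w in workers:
    w.start()

p = threading.Thread(target=producer)
p.start()
p.join()        # wait for the producers to finish
work.join()     # wait for the queue to drain
is_done.set()   # let the workers fall out of their loop
for w in workers:
    w.join()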
An alternative would be to create a sentinel object like "END" indicating that everything is done. Once all the producers are done, the main thread would push a number of sentinel objects equal to the number of workers onto the work queue, and then call work.join(). Each worker thread would call work.get() inside a loop and exit when it saw the sentinel. Remember to call work.task_done() on the sentinel, too!
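A minimal sketch of the sentinel variant, again with placeholder work items:

import queue
import threading

END = 'END'          # sentinel signalling that everything is done
NUM_WORKERS = 2
work = queue.Queue()

def worker():
    while True:
        job = work.get()
        if job is END:
            work.task_done()   # the sentinel needs a task_done() too
            return
        print('processed', job)
        work.task_done()

workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

for i in range(5):             # the producers' real work goes here
    work.put(i)
for _ in range(NUM_WORKERS):   # one sentinel per worker
    work.put(END)

work.join()                    # every item, including the sentinels, is done
for w in workers:
    w.join()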
Again: keep the logic simple and use the tools that the threading and queue modules already provide.
Related
I'm using the multiprocessing module to split up a very large task. It works for the most part, but I must be missing something obvious with my design, because this way it's very hard for me to effectively tell when all of the data has been processed.
I have two separate tasks that run; one that feeds the other. I guess this is a producer/consumer problem. I use a shared Queue between all processes, where the producers fill up the queue, and the consumers read from the queue and do the processing. The problem is that there is a finite amount of data, so at some point everyone needs to know that all of the data has been processed so the system can shut down gracefully.
It would seem to make sense to use the map_async() function, but since the producers are filling up the queue, I don't know all of the items up front, so I have to go into a while loop and use apply_async() and try to detect when everything is done with some sort of timeout...ugly.
I feel like I'm missing something obvious. How can this be better designed?
PRODUCER
class ProducerProcess(multiprocessing.Process):
def __init__(self, item, consumer_queue):
self.item = item
self.consumer_queue = consumer_queue
multiprocessing.Process.__init__(self)
def run(self):
for record in get_records_for_item(self.item): # this takes time
self.consumer_queue.put(record)
def start_producer_processes(producer_queue, consumer_queue, max_running):
running = []
while not producer_queue.empty():
running = [r for r in running if r.is_alive()]
if len(running) < max_running:
producer_item = producer_queue.get()
p = ProducerProcess(producer_item, consumer_queue)
p.start()
running.append(p)
time.sleep(1)
CONSUMER
def process_consumer_chunk(queue, chunksize=10000):
for i in xrange(0, chunksize):
try:
# don't wait too long for an item
# if new records don't arrive in 10 seconds, process what you have
# and let the next process pick up more items.
record = queue.get(True, 10)
except Queue.Empty:
break
do_stuff_with_record(record)
MAIN
if __name__ == "__main__":
manager = multiprocessing.Manager()
consumer_queue = manager.Queue(1024*1024)
producer_queue = manager.Queue()
producer_items = xrange(0,10)
for item in producer_items:
producer_queue.put(item)
p = multiprocessing.Process(target=start_producer_processes, args=(producer_queue, consumer_queue, 8))
p.start()
consumer_pool = multiprocessing.Pool(processes=16, maxtasksperchild=1)
Here is where it gets cheesy. I can't use map, because the list to consume is being filled up at the same time. So I have to go into a while loop and try to detect a timeout. The consumer_queue can become empty while the producers are still trying to fill it up, so I can't just detect an empty queue and quit on that.
timed_out = False
timeout= 1800
while 1:
try:
result = consumer_pool.apply_async(process_consumer_chunk, (consumer_queue, ), dict(chunksize=chunksize,))
if timed_out:
timed_out = False
except Queue.Empty:
if timed_out:
break
timed_out = True
time.sleep(timeout)
time.sleep(1)
consumer_queue.join()
consumer_pool.close()
consumer_pool.join()
I thought that maybe I could get() the records in the main thread and pass those into the consumer instead of passing the queue in, but I think I end up with the same problem that way: I still have to run a while loop and use apply_async(). Thank you in advance for any advice!
You could use a manager.Event to signal the end of the work. This event can be shared between all of your processes, and when you signal it from your main process the other workers can gracefully shut down.
while not event.is_set():
...rest of code...
So your consumers would keep processing until the event is set, and then handle their cleanup and exit.
To determine when to set this flag, you can join the producer processes; once those are all complete, set the event and then join the consumer processes.
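A minimal sketch of that shutdown sequence, with placeholder producers and per-record work standing in for get_records_for_item() and do_stuff_with_record(); the consumers keep draining until the event is set and the queue is empty:

import multiprocessing
import queue  # only for the Empty exception

def consumer(work_q, done_event):
    while True:
        try:
            record = work_q.get(timeout=1)
        except queue.Empty:
            if done_event.is_set():
                return          # producers are finished and nothing is left to drain
            continue
        print('processed', record)      # stand-in for do_stuff_with_record()

def producer(work_q, item):
    for record in range(item * 10, item * 10 + 10):  # stand-in for get_records_for_item()
        work_q.put(record)

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    work_q = manager.Queue()
    done_event = manager.Event()

    producers = [multiprocessing.Process(target=producer, args=(work_q, i)) for i in range(4)]
    consumers = [multiprocessing.Process(target=consumer, args=(work_q, done_event)) for _ in range(4)]
    for p in producers + consumers:
        p.start()

    for p in producers:
        p.join()          # all of the work has been queued
    done_event.set()      # tell the consumers they can stop once the queue is drained
    for c in consumers:
        c.join()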
I would strongly recommend SimPy instead of multiprocessing/threading for doing discrete event simulation.
I want to make a thread and control it with an event object. More specifically, I want the thread to execute its task whenever the event object is set and then go back to waiting, repeatedly.
The code below shows the rough logic I have in mind.
import threading
import time
e = threading.Event()
def start_operation():
e.wait()
while e.is_set():
print('STARTING TASK')
e.clear()
t1 = threading.Thread(target=start_operation)
t1.start()
e.set() # first set
e.set() # second set
I expected t1 to run once the first set has been issued and to stop itself (due to e.clear inside it), and then to run again after the second set has been issued. So, according to what I expected, it should print 'STARTING TASK' two times. But it prints it only once, and I don't understand why. How am I supposed to change the code to make it run the while loop again whenever the event object is set?
The first problem is that once you exit a while loop, you've exited it. Changing the predicate back won't change anything. Forget about events for a second and just look at this code:
i = 0
while i == 0:
i = 1
It obviously doesn't matter if you set i = 0 again later, right? You've already left the while loop, and the whole function. And your code is doing exactly the same thing.
You can fix that problem by just adding another while loop around the whole thing:
def start_operation():
while True:
e.wait()
while e.is_set():
print('STARTING TASK')
e.clear()
However, that still isn't going to work—except maybe occasionally, by accident.
Event.set doesn't block; it just sets the event immediately, even if it's already set. So, the most likely flow of control here is:
background thread hits e.wait() and blocks.
main thread hits e.set() and sets event.
main thread hits e.set() and sets event again, with no effect.
background thread wakes up, does the loop once, calls e.clear() at the end.
background thread waits forever on e.wait().
(The fact that there's no way to avoid missed signals with events is effectively the reason conditions were invented, and why almost nothing newer than Win32, Python aside, bothers with events. But a condition isn't sufficient here either.)
If you want the main thread to block until the event is clear, and only then set it again, you can't do that with a single Event. You need something extra, like a second event, which the main thread can wait on and the background thread can set.
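For what it's worth, here is a minimal sketch of that two-event handshake; go and done are names introduced just for illustration. The main thread requests a run and then waits for the worker to report that it has finished before requesting the next one:

import threading

go = threading.Event()    # main -> worker: please run the task
done = threading.Event()  # worker -> main: that run has finished

def start_operation():
    while True:
        go.wait()
        go.clear()
        print('STARTING TASK')
        done.set()

t1 = threading.Thread(target=start_operation, daemon=True)
t1.start()

for _ in range(2):        # mirrors the two e.set() calls in the question
    done.clear()
    go.set()              # request a run
    done.wait()           # block until the worker reports it has finished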
But if you want to keep track of multiple set calls, without missing any, you need to use a different sync mechanism. A queue.Queue may be overkill here, but it's dead simple to do in Python, so let's just use that. Of course you don't actually have any values to put on the queue, but that's OK; you can just stick a dummy value there:
import queue
import threading
q = queue.Queue()
def start_operation():
while True:
_ = q.get()
print('STARTING TASK')
t1 = threading.Thread(target=start_operation)
t1.start()
q.put(None)
q.put(None)
And if you later want to add a way to shut down the background thread, just change it to put meaningful values on the queue:
import queue
import threading
q = queue.Queue()
def start_operation():
while True:
if q.get():
return
print('STARTING TASK')
t1 = threading.Thread(target=start_operation)
t1.start()
q.put(False)
q.put(False)
q.put(True)
I have the following thread, where the queue q is empty most of the time:
def run(self):
while True:
if not self.exit_flag:
items = self.q.get()
q.work_it()
else:
return 0
If the exit flag is set, it's not likely to exit immediately, because it's probably currently blocking at
items = self.q.get()
If I put a timeout on it
items = self.q.get(True, 0.1)
it will often raise the Empty exception, since the queue is empty more often than not, using more resources than I'd like.
If I do busy waiting like
def run(self):
while True:
if not self.exit_flag:
if not self.q.empty():
items = self.q.get()
q.work_it()
time.sleep(0.1)
else:
return 0
Then I am busy waiting instead of using the blocking feature of Queue.get(), which seems ugly. It seems like I'm missing the elegant solution to this problem. Is there one, or should I just use the busy-waiting solution?
Instead of using an exit_flag, consider putting a special "shutdown" indicator in the queue. When the worker dequeues the shutdown indicator, have it recognize it and shut down instead of continuing.
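A minimal sketch of that idea, mirroring your run() method; work_it() is just a stand-in for whatever the items actually do:

import queue
import threading

SHUTDOWN = object()   # unique sentinel that workers can't mistake for real work

class Worker(threading.Thread):
    def __init__(self, q):
        super().__init__()
        self.q = q

    def run(self):
        while True:
            item = self.q.get()      # blocks; no polling, no timeouts
            if item is SHUTDOWN:
                return
            item.work_it()           # stand-in for the real per-item processing

q = queue.Queue()
w = Worker(q)
w.start()
# ... put real work items on q here ...
q.put(SHUTDOWN)   # instead of setting exit_flag
w.join()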
One approach with no third-party dependencies is to use the producer/consumer pattern, where your code runs in its own thread so it doesn't block.
I am having problems using Process from the multiprocessing module. Before, I used to create a Thread instead and everything worked fine; unfortunately, I had to change it to improve performance.
This is my code to play a game. It uses computer vision for object detection, and a separate Process is used so the game can be started alongside the detection loop.
# OpenCV infinite loop for frame processing
while True:
# detect object, code omitted
k = cv2.waitKey(20) & 0xFF
# when user press key 's' start the game
if (k == ord('s') or k == ord('S')) and start is False:
start = True
info = False
# # t = Thread(target=playGame, args=(k,))
# # t = Thread(target=playGame)
# # t.start() with threads worked successfully
p = Process(target=playGame)
p.start()
# if user press 'p' capture marker position and set boolean flag to true
elif k == ord('p') or k == ord('P'):
waitForUserMove = True
This is my playGame() function containing the game loop:
def playGame():
#omitted code
while gameIsPlaying:
getUserMove()
#rest of code
And finally this is my getUserMove() function containing a while loop to wait for the user to make the move:
def getUserMove():
while waitForUserMove is False:
pass
So basically when the user makes the move and presses the key 'p', the boolean flag waitForUserMove changes to True, the while loop exits, and the rest of the code executes.
As I said before, with Threads everything worked fine; now that I have replaced the Thread with a Process I am having this problem: the boolean flag waitForUserMove changes to True, but the Process never sees the change.
In other words, once the user presses the key 'p' and the boolean flag waitForUserMove changes to True outside the Process, inside the Process waitForUserMove is still False.
So how can I send this info to the Process in order to change the flag waitForUserMove from False to True?
I hope this is clear; I couldn't find better words to describe my problem. Thank you in advance for your help.
Multiprocessing is fundamentally different from threading. With multiprocessing, the two processes have separate memory address spaces, so if one process writes to its own memory, the sibling process can't see the change to the variable.
To exchange data between different processes you should refer to Exchanging Objects Between Processes in the multiprocessing documentation.
In your case you only have one-way communication, so a Queue should work:
Setting up the queue:
q = Queue()
p = Process(target=playGame, args=(q,))
Sending in playGame:
def playGame(q):
#omitted code
while gameIsPlaying:
move = getUserMove()
q.put(move)
Receive in the main process:
def getUserMove():
move = q.get()
Note that q.get() is blocking; that means the caller is blocked until something is added to the queue. If you need to do something alongside, use q.get_nowait() and handle queue.Empty.
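Putting the pieces together, one way to wire this up is for the OpenCV loop in the main process to put() the captured move when 'p' is pressed, and for getUserMove() inside the game process to block on get(). A minimal self-contained sketch, with the game loop and the key handling stubbed out:

from multiprocessing import Process, Queue
import time

def getUserMove(q):
    return q.get()          # blocks until the main process sends a move

def playGame(q):
    for _ in range(3):      # stand-in for `while gameIsPlaying`
        move = getUserMove(q)
        print('game process received move:', move)

if __name__ == '__main__':
    q = Queue()
    p = Process(target=playGame, args=(q,))
    p.start()
    for move in ['up', 'left', 'down']:   # stand-ins for captured marker positions
        time.sleep(0.5)                   # simulate waiting for the user to press 'p'
        q.put(move)
    p.join()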
I have a queue that always needs to be ready to process items when they are added to it. The function that runs on each item in the queue creates and starts a thread to execute the operation in the background so the program can go do other things.
However, the function I am calling on each item in the queue simply starts the thread and then completes execution, regardless of whether or not the thread it started completed. Because of this, the loop will move on to the next item in the queue before the program is done processing the last item.
Here is code to better demonstrate what I am trying to do:
queue = Queue.Queue()
t = threading.Thread(target=worker)
t.start()
def addTask():
queue.put(SomeObject())
def worker():
while True:
try:
# If an item is put onto the queue, immediately execute it (unless
# an item on the queue is still being processed, in which case wait
# for it to complete before moving on to the next item in the queue)
item = queue.get()
runTests(item)
# I want to wait for 'runTests' to complete before moving past this point
except Queue.Empty, err:
# If the queue is empty, just keep running the loop until something
# is put on top of it.
pass
def runTests(args):
op_thread = SomeThread(args)
op_thread.start()
# My problem is that once this last line 'op_thread.start()' starts the thread,
# the 'runTests' function completes, but the operation executed
# by the thread is not yet done because it is still running in
# the background. I do not want the 'runTests' function to actually complete
# execution until the operation in op_thread is done executing.
"""op_thread.join()"""
# I tried putting this line after 'op_thread.start()', but that did not solve anything.
# I have commented it out because it is not necessary to demonstrate what
# I am trying to do, but I just wanted to show that I tried it.
Some notes:
This is all running in a PyGTK application. Once the 'SomeThread' operation is complete, it sends a callback to the GUI to display the results of the operation.
I do not know how much this affects the issue I am having, but I thought it might be important.
A fundamental issue with Python threads is that you can't just kill them - they have to agree to die.
What you should do is:
Implement the thread as a class
Add a threading.Event member which the join method clears and which the thread's main loop occasionally checks; if it sees the event is cleared, it returns. To support this, override threading.Thread.join to clear the event and then call Thread.join on itself
To allow (2), make the read from the Queue block with some small timeout. This way your thread's "response time" to the kill request is the timeout, and OTOH no CPU choking is done.
Here's some code from a socket client thread I have that has the same issue with blocking on a queue:
class SocketClientThread(threading.Thread):
""" Implements the threading.Thread interface (start, join, etc.) and
can be controlled via the cmd_q Queue attribute. Replies are placed in
the reply_q Queue attribute.
"""
def __init__(self, cmd_q=Queue.Queue(), reply_q=Queue.Queue()):
super(SocketClientThread, self).__init__()
self.cmd_q = cmd_q
self.reply_q = reply_q
self.alive = threading.Event()
self.alive.set()
self.socket = None
self.handlers = {
ClientCommand.CONNECT: self._handle_CONNECT,
ClientCommand.CLOSE: self._handle_CLOSE,
ClientCommand.SEND: self._handle_SEND,
ClientCommand.RECEIVE: self._handle_RECEIVE,
}
def run(self):
while self.alive.isSet():
try:
# Queue.get with timeout to allow checking self.alive
cmd = self.cmd_q.get(True, 0.1)
self.handlers[cmd.type](cmd)
except Queue.Empty as e:
continue
def join(self, timeout=None):
self.alive.clear()
threading.Thread.join(self, timeout)
Note self.alive and the loop in run.