Avoiding deadlocks due to queue overflow with multiprocessing.JoinableQueue

Suppose we have a multiprocessing.Pool whose worker processes share a multiprocessing.JoinableQueue, dequeuing work items and potentially enqueuing more work:
def worker_main(queue):
    while True:
        work = queue.get()
        for new_work in process(work):
            queue.put(new_work)
        queue.task_done()
When the queue fills up, queue.put() will block. As long as there is at least one process reading from the queue with queue.get(), it will free up space in the queue to unblock the writers. But all of the processes could potentially block at queue.put() at the same time.
Is there a way to avoid getting jammed up like this?

Depending on how often process(work) creates more items, there may be no solution besides a queue of infinite maximum size.
In short, your queue must be large enough to accommodate the entire backlog of work items that you can have at any time.
Since the queue is implemented on top of semaphores, there may indeed be a hard size limit of SEM_VALUE_MAX, which on macOS is 32767. If that's not enough, you'll need to subclass that implementation or use put(block=False) and handle queue.Full (e.g. put excess items somewhere else).
Alternatively, look at one of the third-party implementations of a distributed work-item queue for Python.
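As a rough sketch of the put(block=False) route, here is one way the worker from the question could spill excess items into a local buffer instead of blocking; worker_main and process are the question's names, the overflow deque is illustrative.

import collections
from queue import Full   # a non-blocking put() on a multiprocessing queue raises queue.Full

def worker_main(work_queue):
    overflow = collections.deque()          # local spill-over buffer, never blocks
    while True:
        work = work_queue.get()
        for new_work in process(work):
            try:
                work_queue.put(new_work, block=False)
            except Full:
                overflow.append(new_work)   # park it locally instead of deadlocking
        work_queue.task_done()
        # drain the local buffer opportunistically once space frees up
        while overflow:
            try:
                work_queue.put(overflow[0], block=False)
            except Full:
                break
            overflow.popleft()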

Related

What is the difference between a Queue and a JoinableQueue in multiprocessing in Python?

This question has already been asked here, but as some comments point out, the accepted answer is not helpful because all it does is quote the documentation. Could someone explain the difference in terms of when to use one versus the other? For example, why would one choose to use Queue over JoinableQueue if JoinableQueue is pretty much the same thing except for offering the two extra methods join() and task_done()? Additionally, the other answer in the post I linked to mentions that "Based on the documentation, it's hard to be sure that Queue is actually empty," which again raises the question: why would I want to use a Queue over JoinableQueue? What advantages does it offer?
multiprocessing patterns its queues off of queue.Queue. In that model, Queue keeps a "task count" of everything put on the queue. There are generally two ways to use this queue. Producers could just put things on the queue and ignore what happens to them in the long run. The producer may wait from time to time if the queue is full, but doesn't care if any of the things put on the queue are actually processed by the consumer. In this case the queue's task count grows, but who cares?
Alternatively, the producer can "join" the queue. That means it waits until the last task on the queue has been processed and the task count has gone to zero. But to do this, the producer needs the consumer's help. A consumer gets an item from the queue, but that doesn't decrease the task count. The consumer has to actively call task_done (typically when the task is done...), and join will wait until every put has a matching task_done.
Fast forward to multiprocessing. The task_done mechanism requires communication between processes, which is relatively expensive. If you are the first kind of producer and don't play the join game, use a multiprocessing.Queue and save a bit of CPU time. If you are the second kind, use multiprocessing.JoinableQueue. But remember that the consumer also has to play the task_done game, or the producer will hang.
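A minimal sketch of that second pattern (the names here are illustrative, not from the question): join() in the producer only returns because the consumer pairs every get() with a task_done().

import multiprocessing as mp

def consumer(jq):
    while True:
        item = jq.get()
        if item is None:        # sentinel: stop consuming
            jq.task_done()
            break
        # ... do the actual work with item ...
        jq.task_done()          # without this, the producer's join() hangs forever

if __name__ == "__main__":
    jq = mp.JoinableQueue()
    worker = mp.Process(target=consumer, args=(jq,))
    worker.start()
    for i in range(10):
        jq.put(i)
    jq.join()                   # blocks until every put() has a matching task_done()
    jq.put(None)                # now shut the consumer down
    worker.join()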

Multithreaded socket Program - Handling Critical section

I am creating a multi-threaded program in which I want only one thread at a time to enter the critical section, where it creates a socket and sends some data, while all the other threads wait for that variable to clear.
I tried threading.Event, but later realized that set() notifies all of the waiting threads, while I only wanted to notify one.
I tried locks (acquire and release). They suited my scenario well, but I learned that holding a contended lock for a long time is expensive. After acquiring the lock my thread performs many operations, so it ends up holding the lock for a long time.
Now I am trying threading.Condition. I just want to know whether acquiring and holding the condition for a long time is bad practice, since it also uses a lock underneath.
Can anyone suggest a better approach to my problem?
I would use an additional thread dedicated to sending. Use a Queue into which the other threads put the data they want sent. The socket thread gets items from the queue in a loop and sends them one after the other.
As long as the queue is empty, .get blocks and the send-thread sleeps.
The "producer" threads have no waiting time at all, they just put their data in the queue and continue.
There is no concern about possible deadlock conditions.
To stop the send-thread, put some special item (e.g. None) in the queue.
To enable returning values, put a tuple (send_data, return_queue) on the send queue. When a result is ready, return it by putting it on the return_queue.
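A minimal sketch of that design, assuming a connected socket object with sendall(); the (send_data, return_queue) item format and the None sentinel follow the description above, everything else is illustrative.

import queue
import threading

def sender_loop(sock, send_queue):
    # The only thread that ever touches the socket.
    while True:
        item = send_queue.get()
        if item is None:                    # sentinel: stop the sender
            break
        data, return_queue = item
        sock.sendall(data)
        if return_queue is not None:
            return_queue.put(len(data))     # hand a result back to the producer

send_queue = queue.Queue()
# sender = threading.Thread(target=sender_loop, args=(connected_socket, send_queue))
# sender.start()

# A producer thread just enqueues; it blocks only if it chooses to wait for its result:
reply = queue.Queue()
send_queue.put((b"hello", reply))
# bytes_sent = reply.get()
send_queue.put(None)                        # stop the sender when all producers are done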

threading.Lock() performance issues

I have multiple threads:
import threading
import Queue

dispQ = Queue.Queue()
stop_thr_event = threading.Event()

def worker(stop_event):
    while not stop_event.wait(0):
        try:
            job = dispQ.get(timeout=1)
            job.waitcount -= 1
            dispQ.task_done()
        except Queue.Empty, msg:
            continue

# create job objects and put into dispQ here
for j in range(NUM_OF_JOBS):
    j = Job()
    dispQ.put(j)

# NUM_OF_THREADS could be 10-20 ish
running_threads = []
for t in range(NUM_OF_THREADS):
    t1 = threading.Thread(target=worker, args=(stop_thr_event,))
    t1.daemon = True
    t1.start()
    running_threads.append(t1)

stop_thr_event.set()
for t in running_threads:
    t.join()
The code above was giving me some very strange behavior.
I've ended up finding out that it was due to decrementing waitcount without a lock.
I've added an attribute to the Job class, self.thr_lock = threading.Lock(), and then changed the decrement to:
with job.thr_lock:
    job.waitcount -= 1
This seems to fix the strange behavior, but it looks like it has degraded performance.
Is this expected? Is there a way to optimize locking?
Would it be better to have one global lock rather than one lock per job object?
About the only way to "optimize" threading would be to break the processing down into blocks or chunks of work that can be performed at the same time. This mostly means doing input or output (I/O), because that is the only time the interpreter will release the Global Interpreter Lock, aka the GIL.
In actuality there is often no gain or even a net slow-down when threading is added due to the overhead of using it unless the above condition is met.
It would probably be worse if you used a single global lock for all the shared resources, because it wouldn't distinguish which resource a thread actually needed, so parts of the program would end up waiting when they didn't need to.
You might find the PyCon 2015 talk David Beazley gave, titled Python Concurrency From the Ground Up, of interest. It covers threads, event loops, and coroutines.
It's hard to answer your question based on your code. Locks do have some inherent cost, nothing is free, but normally it is quite small. If your jobs are very small, you might want to consider "chunking" them, that way you have many fewer acquire/release calls relative to the amount of work being done by each thread.
A related but separate issue is threads blocking each other. You might notice large performance issues if many threads are waiting on the same lock(s); your threads then sit idle waiting on each other. In some cases this cannot be avoided because there is a shared resource that is a performance bottleneck. In other cases you can reorganize your code to avoid this penalty.
There are some things in your example code that make me think it might be very different from your actual application. First, your example code doesn't share Job objects between threads; if you're not sharing them, you shouldn't need locks on them. Second, as written, your example code might not empty the queue before finishing. It will exit as soon as you hit stop_thr_event.set(), leaving any remaining jobs in the queue. Is this by design?
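To make the "chunking" suggestion above concrete, here is one way the worker from the question could pull a batch of jobs per pass, so the queue and lock overhead is paid per batch rather than per job; chunk_size is an illustrative knob, everything else reuses the question's names.

def worker_chunked(stop_event, chunk_size=50):
    while not stop_event.wait(0):
        batch = []
        try:
            batch.append(dispQ.get(timeout=1))      # block only for the first job
            for _ in range(chunk_size - 1):
                batch.append(dispQ.get_nowait())    # take more only if already queued
        except Queue.Empty:
            pass
        for job in batch:
            with job.thr_lock:                      # short critical section per job
                job.waitcount -= 1
            dispQ.task_done()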

Worker pool where certain tasks can only be done by certain workers

I have a lot of tasks that I'd like to execute a few at a time. The normal solution for this is a thread pool. However, my tasks need resources that only certain threads have. So I can't just farm a task out to any old thread; the thread has to have the resource the task needs.
It seems like there should be a concurrency pattern for this, but I can't seem to find it. I'm implementing this in Python 2 with multiprocessing, so answers in those terms would be great, but a generic solution is fine. In my case the "threads" are actually separate OS processes and the resources are network connections (and no, it's not a server, so (e)poll/select is not going to help). In general, a thread/process can hold several resources.
Here is a naive solution: put the tasks in a work queue and turn my thread pool loose on it. Have each thread check, "Can I do this task?" If yes, do it; if no, put it back in the queue. However, if each task can only be done by one of N threads, then I'm doing ~2N expensive, wasted accesses to a shared queue just to get one unit of work.
Here is my current thought: have a shared work queue for each resource. Farm out tasks to the matching queue. Each thread checks the queue(s) it can handle.
Ideas?
A common approach is not to allocate resources to threads at all, but to queue the appropriate resource in with the data, though I appreciate that this is not always possible if a resource is bound to a particular thread.
The idea of using a queue per resource, with each thread only popping objects from the queues it can handle, may well work.
It may be possible to use a semaphore+concurrentQueue array, indexed by resource, for signaling such threads and also providing a priority system, so eliminating most of the polling and wasteful requeueing. I will have to think a bit more about that - it kinda depends on how the resources map to the threads.
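Sketching the queue-per-resource idea in multiprocessing terms, since that is the asker's setting; the resource identifiers and payloads are purely illustrative.

import multiprocessing as mp

def worker(resource_id, task_queue):
    # This process owns the connection identified by resource_id, so every task
    # it pulls from its own queue is one it can actually perform - no requeueing.
    while True:
        task = task_queue.get()
        if task is None:                # sentinel: shut down
            break
        # ... run task using the resource this process holds ...

if __name__ == "__main__":
    resources = ["conn_a", "conn_b", "conn_c"]
    queues = dict((r, mp.Queue()) for r in resources)
    workers = [mp.Process(target=worker, args=(r, queues[r])) for r in resources]
    for w in workers:
        w.start()

    # The dispatcher routes each task to the queue of the resource it needs.
    for needed, payload in [("conn_b", "task-1"), ("conn_a", "task-2")]:
        queues[needed].put(payload)

    for q in queues.values():
        q.put(None)
    for w in workers:
        w.join()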

How to synchronize python lists?

I have different threads, and after processing they put data in a common list. Is there anything built into Python so that a list or a numpy array can be accessed by only a single thread at a time? Secondly, if there is not, what is an elegant way of doing it?
According to Thread synchronisation mechanisms in Python, reading a single item from a list and modifying a list in place are guaranteed to be atomic. If this is right (although it seems to be partially contradicted by the very existence of the Queue module), then if your code is all of the form:
try:
    val = mylist.pop()
except IndexError:
    pass  # wait for a while or exit
else:
    pass  # process val
And everything put into mylist is done by .append(), then your code is already threadsafe. If you don't trust that one document on that score, use a queue.Queue, which does all the synchronisation for you, and has a better API than list for concurrent programs - in particular, it gives you the option of blocking indefinitely, or for a timeout, waiting for .get() to return something if you don't have anything else the thread could be getting on with in the meantime.
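A minimal sketch of the queue.Queue version (Python 3 module name; the producer/consumer split here is illustrative):

import queue
import threading

results = queue.Queue()

def producer(n):
    for i in range(n):
        results.put(i * i)              # put() is already threadsafe, no explicit lock

threads = [threading.Thread(target=producer, args=(5,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Consumer: block for up to a second per item instead of busy-waiting on a bare list.
while True:
    try:
        val = results.get(timeout=1)
    except queue.Empty:
        break
    print(val)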
For numpy arrays, and in general any case where you need more than a producer/consumer queue, use a Lock or RLock from threading - these implement the context manager protocol, so using them is quite simple:
with mylock:
    pass  # process as necessary
And Python will guarantee that the lock gets released once you fall off the end of the with block - including in tricky cases, like if something you do raises an exception.
Finally, consider whether multiprocessing is a better fit for your application than threading - threads in Python aren't guaranteed to actually run concurrently, and in CPython they only can when they drop into C-level code. multiprocessing gets around that issue, but may have some extra overhead - if you haven't already, you should read the docs to determine which one suits your needs better.
threading provides Lock objects if you need to protect an entire critical section, or the Queue module provides a queue that is threadsafe.
How about the standard library Queue?
