JoinableQueue join() method blocking main thread even after task_done() - python

In the code below, if I set daemon = True, the consumer quits before reading all the queue entries. If the consumer is non-daemon, the main thread stays blocked forever, even after task_done() has been called for every entry.
from multiprocessing import Process, JoinableQueue
import time

def consumer(queue):
    while True:
        final = queue.get()
        print(final)
        queue.task_done()

def producer1(queue):
    for i in "QWERTYUIOPASDFGHJKLZXCVBNM":
        queue.put(i)

if __name__ == "__main__":
    queue = JoinableQueue(maxsize=100)
    p1 = Process(target=consumer, args=((queue),))
    p2 = Process(target=producer1, args=((queue),))
    #p1.daemon = True
    p1.start()
    p2.start()
    print(p1.is_alive())
    print(p2.is_alive())
    for i in range(1, 10):
        queue.put(i)
        time.sleep(0.01)
    queue.join()

Let's see what—I believe—is happening here:
- both processes are being started.
- the consumer process starts its loop and blocks until a value is received from the queue.
- the producer1 process feeds the queue 26 times with a letter, while the main process feeds the queue 9 times with a number. The order in which letters or numbers are fed is not guaranteed—a number could very well show up before a letter.
- when both the producer1 and the main processes are done feeding their data, the queue is joined. No problem here: the queue can be joined since all the buffered data has been consumed and task_done() has been called after each read.
- the consumer process is still running but is blocked until more data to consume shows up.
Looking at your code, I believe that you are confusing the concept of joining processes with that of joining queues. What you most likely want here is to join the processes; you probably don't need a joinable queue at all.
#!/usr/bin/env python3
from multiprocessing import Process, Queue
import time

def consumer(queue):
    for final in iter(queue.get, 'STOP'):
        print(final)

def producer1(queue):
    for i in "QWERTYUIOPASDFGHJKLZXCVBNM":
        queue.put(i)

if __name__ == "__main__":
    queue = Queue(maxsize=100)
    p1 = Process(target=consumer, args=(queue,))
    p2 = Process(target=producer1, args=(queue,))
    p1.start()
    p2.start()
    print(p1.is_alive())
    print(p2.is_alive())
    for i in range(1, 10):
        queue.put(i)
        time.sleep(0.01)
    queue.put('STOP')
    p1.join()
    p2.join()
Also, your producer1 exits on its own after feeding all the letters, but you need a way to tell your consumer process to exit when there will not be any more data for it to process. You can do this by sending a sentinel; here I chose the string 'STOP', but it can be anything.
In fact, this code is not great, since the 'STOP' sentinel could be received before some of the letters, causing those letters to go unprocessed and also producing a deadlock, because the processes are being joined even though the queue still contains data. But this is a different problem.
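One way to sidestep that race, for instance, is to join producer1 before sending the sentinel, so that 'STOP' is guaranteed to be the last item the consumer sees. A minimal sketch reusing the consumer and producer1 defined above:

if __name__ == "__main__":
    queue = Queue(maxsize=100)
    p1 = Process(target=consumer, args=(queue,))
    p2 = Process(target=producer1, args=(queue,))
    p1.start()
    p2.start()
    for i in range(1, 10):
        queue.put(i)
    p2.join()            # wait until producer1 has flushed all of its letters
    queue.put('STOP')    # the sentinel is now guaranteed to arrive last
    p1.join()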

Related

multiprocessing.Process not exiting after target function returns

I spawn a subprocess which simply copies data from one queue to another. The problem is: after the subprocess's target function returns, the subprocess does not seem to exit as expected. It hangs on the pdet.join() line.
What's causing it to hang?
import numpy as np
import multiprocessing as mp

def load(qdet):
    i = 0
    while i < 500:
        im = np.zeros((480, 640, 3), 'uint8')
        i += 1
        print(i)
        qdet.put(im)
    print('load exit.')

def detect(qdet, qshw):
    while True:
        im = qdet.get()
        if im is None:
            break
        qshw.put(im)
    print('detect exit.')

def main():
    qdet = mp.Queue()
    qshw = mp.Queue()
    load(qdet)
    pdet = mp.Process(target=detect, args=(qdet, qshw,))
    pdet.start()
    qdet.put(None)
    pdet.join()

if __name__ == '__main__':
    mp.freeze_support()
    main()
This happens because a process that puts items on a queue will not exit until those items have been flushed to the underlying pipe, which cannot happen until they are consumed from the other end. From the documentation:
Bear in mind that a process that has put items in a queue will wait
before terminating until all the buffered items are fed by the
“feeder” thread to the underlying pipe. (The child process can call
the Queue.cancel_join_thread method of the queue to avoid this
behaviour.)
This means that whenever you use a queue you need to make sure that
all items which have been put on the queue will eventually be removed
before the process is joined. Otherwise you cannot be sure that
processes which have put items on the queue will terminate. Remember
also that non-daemonic processes will be joined automatically.
You should therefore make sure that all items from the queue have been removed before attempting to join. However, you can also work around this by using manager queues, which introduce some overhead but are not affected by such issues:
def main():
    with mp.Manager() as manager:
        qdet = manager.Queue()
        qshw = manager.Queue()
        load(qdet)
        pdet = mp.Process(target=detect, args=(qdet, qshw,))
        pdet.start()
        qdet.put(None)
        pdet.join()
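Alternatively, sticking with plain mp.Queue objects, you can drain qshw in the parent before joining, which is the "remove all items first" approach described above. A sketch based on the question's code; the count of 500 mirrors the loop in load:

def main():
    qdet = mp.Queue()
    qshw = mp.Queue()
    load(qdet)
    pdet = mp.Process(target=detect, args=(qdet, qshw))
    pdet.start()
    qdet.put(None)
    # Drain qshw so detect's feeder thread can flush its buffer and the process can exit.
    frames = [qshw.get() for _ in range(500)]
    pdet.join()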

Multiprocessing Queue - child processes gets stuck sometimes and does not reap

First of all, I apologize if the title is a bit weird, but I literally could not think of how to put the problem I am facing into a single line.
So I have the following code
import time
from multiprocessing import Process, current_process, Manager
from multiprocessing import JoinableQueue as Queue
# from threading import Thread, current_thread
# from queue import Queue

def checker(q):
    count = 0
    while True:
        if not q.empty():
            data = q.get()
            # print(f'{data} fetched by {current_process().name}')
            # print(f'{data} fetched by {current_thread().name}')
            q.task_done()
            count += 1
        else:
            print('Queue is empty now')
            print(current_process().name, '-----', count)
            # print(current_thread().name, '-----', count)
            break

if __name__ == '__main__':
    t = time.time()
    # m = Manager()
    q = Queue()
    # with open("/tmp/c.txt") as ifile:
    #     for line in ifile:
    #         q.put((line.strip()))
    for i in range(1000):
        q.put(i)
    time.sleep(0.1)
    procs = []
    for _ in range(2):
        p = Process(target=checker, args=(q,), daemon=True)
        # p = Thread(target=checker, args=(q,))
        p.start()
        procs.append(p)
    q.join()
    for p in procs:
        p.join()
Sample outputs
1: When the process just hangs
Queue is empty now
Process-2 ----- 501
output hangs at this point
2: When everything works just fine.
Queue is empty now
Process-1 ----- 515
Queue is empty now
Process-2 ----- 485
Process finished with exit code 0
Now the behavior is intermittent and happens sometimes but not always.
I have tried using Manager.Queue() in place of multiprocessing.Queue() as well, but with no success; both exhibit the same issue.
I tested this with both multiprocessing and multithreading and I get exactly the same behavior, with one slight difference: with multithreading the hang occurs much less often than with multiprocessing.
So I think there is something I am missing conceptually, or something I am doing wrong, but I am not able to catch it now, since I have spent way too much time on this and my mind may be overlooking something very basic.
So any help is appreciated.
I believe you have a race condition in the checker method. You check whether the queue is empty and then dequeue the next task in separate steps. It's usually not a good idea to separate these two kinds of operations without mutual exclusion or locking, because the state of the queue may change between the check and the pop. It may be non-empty, but another process may then dequeue the waiting work before the process which passed the check is able to do so.
However I generally prefer communication over locking whenever possible; it's less error prone and makes one's intentions clearer. In this case, I would send a sentinel value to the worker processes (such as None) to indicate that all work is done. Each worker then just dequeues the next object (which is always thread-safe), and, if the object is None, the sub-process exits.
The example code below is a simplified version of your program, and should work without races:
from multiprocessing import Process, Queue, current_process

def checker(q):
    while True:
        data = q.get()
        if data is None:
            print(f'process {current_process().name} ending')
            return
        else:
            pass  # do work

if __name__ == '__main__':
    q = Queue()
    for i in range(1000):
        q.put(i)
    procs = []
    for _ in range(2):
        q.put(None)  # sentinel value
        p = Process(target=checker, args=(q,), daemon=True)
        p.start()
        procs.append(p)
    for proc in procs:
        proc.join()
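If you would rather keep the question's JoinableQueue and its q.join() call, the same idea still applies: block on get() instead of polling empty(), and call task_done() for every item, including the sentinels. A minimal sketch of that variant, under the same imports as the question:

from multiprocessing import Process, current_process
from multiprocessing import JoinableQueue as Queue

def checker(q):
    while True:
        data = q.get()     # blocking get, no empty() check needed
        q.task_done()      # acknowledge every item, including the sentinel
        if data is None:
            print(f'process {current_process().name} ending')
            return
        # do work here

if __name__ == '__main__':
    q = Queue()
    for i in range(1000):
        q.put(i)
    procs = []
    for _ in range(2):
        q.put(None)        # one sentinel per worker
        p = Process(target=checker, args=(q,), daemon=True)
        p.start()
        procs.append(p)
    q.join()               # returns once task_done() has been called for every item
    for p in procs:
        p.join()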

Python 3 Multiprocessing queue deadlock when calling join before the queue is empty

I have a question about understanding the queue in the multiprocessing module in Python 3.
This is what they say in the programming guidelines:
Bear in mind that a process that has put items in a queue will wait before
terminating until all the buffered items are fed by the “feeder” thread to
the underlying pipe. (The child process can call the
Queue.cancel_join_thread
method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that all
items which have been put on the queue will eventually be removed before the
process is joined. Otherwise you cannot be sure that processes which have
put items on the queue will terminate. Remember also that non-daemonic
processes will be joined automatically.
An example which will deadlock is the following:
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    p.join()                    # this deadlocks
    obj = queue.get()
A fix here would be to swap the last two lines (or simply remove the
p.join() line).
So apparently, queue.get() should not be called after a join().
However, there are examples of using queues where get is called after a join, like:
import multiprocessing as mp
import random
import string

# define an example function
def rand_string(length, output):
    """ Generates a random string of numbers, lower- and uppercase chars. """
    rand_str = ''.join(random.choice(
                           string.ascii_lowercase
                           + string.ascii_uppercase
                           + string.digits)
                       for i in range(length))
    output.put(rand_str)

if __name__ == "__main__":
    # Define an output queue
    output = mp.Queue()

    # Set up a list of processes that we want to run
    processes = [mp.Process(target=rand_string, args=(5, output))
                 for x in range(2)]

    # Run processes
    for p in processes:
        p.start()

    # Exit the completed processes
    for p in processes:
        p.join()

    # Get process results from the output queue
    results = [output.get() for p in processes]
    print(results)
I've run this program and it works (it was also posted as a solution to the Stack Overflow question Python 3 - Multiprocessing - Queue.get() does not respond).
Could someone help me understand what the rule for the deadlock is here?
The queue implementation in multiprocessing that allows data to be transferred between processes relies on standard OS pipes.
OS pipes are not infinitely long, so the process which queues data could be blocked in the OS during the put() operation until some other process uses get() to retrieve data from the queue.
For small amounts of data, such as the one in your example, the main process can join() all the spawned subprocesses and then pick up the data. This often works well, but does not scale, and it is not clear when it will break.
But it will certainly break with large amounts of data. The subprocess will be blocked in put() waiting for the main process to remove some data from the queue with get(), but the main process is blocked in join() waiting for the subprocess to finish. This results in a deadlock.
Here is an example where a user had this exact issue. I posted some code in an answer there that helped him solve his problem.
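To make the fix from the guidelines concrete, here is the deadlocking example from the question with the last two lines swapped: the parent drains the queue first, the feeder thread in the child can then flush, and p.join() returns.

from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    obj = queue.get()   # drain the pipe so the child's feeder thread can finish
    p.join()            # no deadlock: the child can now terminate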
Don't call join() on a process object before you have retrieved all messages from the shared queue.
I used the following workaround to allow processes to exit before all of their results have been processed:
results = []
while True:
    try:
        result = resultQueue.get(False, 0.01)
        results.append(result)
    except queue.Empty:
        pass
    allExited = True
    for t in processes:
        if t.exitcode is None:
            allExited = False
            break
    if allExited and resultQueue.empty():
        break
It can be shortened, but I left it longer to be clearer for newbies.
Here resultQueue is the multiprocessing.Queue that was shared with the multiprocessing.Process objects. After this block of code you will have the results list with all the messages from the queue.
The problem is that the input buffer of the queue's pipe may become full, blocking the writer(s) indefinitely until there is enough space to receive the next message. So you have three ways to avoid blocking:
- Increase the multiprocessing.connection.BUFFER size (not so good)
- Decrease the message size or the number of messages (not so good)
- Fetch messages from the queue immediately as they come (good way)

python multi-processing zombie processes

I have a simple implementation using Python's multiprocessing module:
if __name__ == '__main__':
    jobs = []
    while True:
        for i in range(40):
            # fetch one by one from redis queue
            #item = item from redis queue
            p = Process(name='worker '+str(i), target=worker, args=(item,))
            # if p is not running, start p
            if not p.is_alive():
                jobs.append(p)
                p.start()
        for j in jobs:
            j.join()
            jobs.remove(j)

def worker(url_data):
    """worker function"""
    print url_data['link']
What I expect this code to do:
- run in an infinite loop, keep waiting for the Redis queue.
- if the Redis queue is not empty, fetch an item.
- create 40 multiprocessing.Process workers, not more, not less.
- if a process has finished processing, start a new process, so that ~40 processes are running at all times.
I read that, to avoid zombie processes, children should be joined by the parent; that is what I expected to achieve with the second loop. But the issue is that on launching it spawns 40 processes, the workers finish processing and enter a zombie state until all of the currently spawned processes have finished,
and then, in the next iteration of "while True", the same pattern continues.
So my question is:
How can I avoid zombie processes, and spawn a new process as soon as one of the 40 has finished?
For a task like the one you described, it is usually better to use a different approach based on Pool.
You can have the main process fetch the data and the workers deal with it.
Here is an example of Pool from the Python docs:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)            # start 4 worker processes
    result = pool.apply_async(f, [10])  # evaluate "f(10)" asynchronously
    print result.get(timeout=1)         # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))        # prints "[0, 1, 4,..., 81]"
I also suggest using imap instead of map, as it seems your task can be asynchronous.
Roughly, your code will be:
p = Pool(40)

while True:
    items = ...  # items from redis queue
    p.imap_unordered(worker, items)  # unordered version is faster

def worker(url_data):
    """worker function"""
    print url_data['link']
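For completeness, here is a slightly more fleshed-out sketch of that rough code; fetch_items and the example URLs are placeholders standing in for the Redis fetch, which the original code elides:

from multiprocessing import Pool

def worker(url_data):
    """worker function"""
    print(url_data['link'])

def fetch_items():
    # Placeholder for "items from redis queue" -- replace with your Redis client call.
    return [{'link': 'http://example.com/%d' % i} for i in range(40)]

if __name__ == '__main__':
    pool = Pool(40)  # 40 worker processes, reused across iterations
    while True:
        items = fetch_items()
        # Consume the iterator so this batch is fully processed before fetching more.
        for _ in pool.imap_unordered(worker, items):
            pass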
