Python ThreadPool with limited task queue size

My problem is the following: I have a multiprocessing.pool.ThreadPool object with worker_count workers and a main pqueue from which I feed tasks to the pool.
The flow is as follows: there is a main loop that gets an item of level level from pqueue and submits it to the pool using apply_async. When the item is processed, it generates items of level + 1. The problem is that the pool accepts all tasks and processes them in the order they were submitted.
More precisely, what happens is that the level 0 items are processed and each generates 100 level 1 items, which are retrieved immediately from pqueue and added to the pool; each level 1 item produces 100 level 2 items that are submitted to the pool, and so on, so the items are processed in a BFS manner.
I need to tell the pool not to accept more than worker_count items, to give the higher-level items a chance to be retrieved from pqueue, so that items are processed in a DFS manner.
The current solution I came up with is: for each submitted task, save the AsyncResult object in an asyncres_list list; before retrieving items from pqueue, remove the results that have already completed (if any) and check every 0.5 seconds whether the length of asyncres_list is lower than the number of threads in the pool. That way, only worker_count items are processed at the same time.
I am wondering if there is a cleaner way to achieve this behaviour; I can't seem to find any parameter in the documentation that limits the maximum number of tasks that can be submitted to a pool.
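For reference, a minimal sketch of that polling workaround, with placeholder names (process_item, and a trivially initialized pqueue and worker_count) standing in for the real objects:
import time
import queue
from multiprocessing.pool import ThreadPool

worker_count = 4
pqueue = queue.PriorityQueue()            # stands in for the real priority queue of items
pool = ThreadPool(worker_count)

def process_item(item):                   # placeholder for the real per-item work
    pass

asyncres_list = []
while True:
    # Drop the AsyncResult objects whose tasks have already completed.
    asyncres_list = [r for r in asyncres_list if not r.ready()]
    if not asyncres_list and pqueue.empty():
        break                             # nothing in flight and nothing queued
    if len(asyncres_list) < worker_count and not pqueue.empty():
        item = pqueue.get()
        asyncres_list.append(pool.apply_async(process_item, (item,)))
    else:
        time.sleep(0.5)                   # poll the in-flight tasks again shortly
pool.close()
pool.join()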

ThreadPool is a simple tool for a common task. If you want to manage the queue yourself to get DFS behavior, you could implement the necessary functionality on top of the threading and queue modules directly.
To prevent scheduling the next root task until all tasks spawned by the current task are done ("DFS"-like order), you could use Queue.join():
#!/usr/bin/env python3
import queue
import random
import threading
import time

def worker(q, multiplicity=5, maxlevel=3, lock=threading.Lock()):
    for task in iter(q.get, None):  # blocking get until None is received
        try:
            if len(task) < maxlevel:
                for i in range(multiplicity):
                    q.put(task + str(i))  # schedule the next level
            time.sleep(random.random())  # emulate some work
            with lock:
                print(task)
        finally:
            q.task_done()

worker_count = 2
q = queue.LifoQueue()
threads = [threading.Thread(target=worker, args=[q], daemon=True)
           for _ in range(worker_count)]
for t in threads:
    t.start()

for task in "01234":  # populate the first level
    q.put(task)

q.join()  # block until all spawned tasks are done
for _ in threads:  # signal workers to quit
    q.put(None)
for t in threads:  # wait until workers exit
    t.join()
The code example is derived from the example in the queue module documentation.
The task at each level spawns multiplicity direct child tasks that spawn their own subtasks until maxlevel is reached.
None is used to signal the workers that they should quit. t.join() is used to wait until the threads exit gracefully. If the main thread is interrupted for any reason, then the daemon threads are killed unless there are other non-daemon threads (you might want to provide a SIGINT handler, to signal the workers to exit gracefully on Ctrl+C instead of just dying).
queue.LifoQueue() is used to get "Last In, First Out" order (it is approximate because there are multiple threads).
maxsize is not set, because otherwise the workers may deadlock: the spawned tasks have to be put somewhere anyway. The worker_count background threads are running regardless of the state of the task queue.

Related

Why is my threadpool hanging even after tasks have been completed?

I have set up a thread pool executor with 4 threads. I have added 2 items to my queue to be processed. When I submit the tasks and retrieve the futures, it appears that the other 2 threads, which are not processing items from the queue, keep running and hang, even though they are not processing anything!
import time
import queue
import concurrent.futures

def _read_queue(queue):
    msg = queue.get()
    time.sleep(2)
    queue.task_done()

n_threads = 4
q = queue.Queue()
q.put('test')
q.put("test2")

with concurrent.futures.ThreadPoolExecutor(max_workers=n_threads) as pool:
    futures = []
    for _ in range(n_threads):
        future = pool.submit(_read_queue, q)
        print(future.running())
    print("Why am running forever?")
How can I adjust my code so that the threads that are not processing anything from the queue are shut down, so my program can terminate?
It happens because the queue.get() operation blocks your ThreadPoolExecutor threads.
for _ in range(n_threads):
    future = pool.submit(_read_queue, q)
    print(future.running())
Let's examine future = pool.submit(_read_queue, q) in each iteration of the for loop.
In the first iteration of the for loop, pool.submit(_read_queue, q) puts a job into the ThreadPoolExecutor's internal queue (its name is self._work_queue). When a job is put into that internal queue, the submit method creates a thread; I will call it thread1 (thread1, thread2, ... to make it easier to follow). This thread executes the _read_queue function (this can happen immediately, or it can happen after the fourth iteration of the for loop; the ordering depends on the operating system scheduler), and queue.get() returns "test". Then this thread sleeps for 2 seconds.
In the second iteration of the for loop, pool.submit(_read_queue, q) puts another job into the ThreadPoolExecutor's internal queue, and then the submit method checks whether there is any thread waiting for a job. No, there is no waiting thread; the first thread is sleeping (for 2 seconds). So the submit method does roughly the following:
if "there is a thread which will accept a job immediately": #Step 1
return
# Step 2
if numbe_of_created_threads(now this is 1) < self._max_workers:
threading.Thread().... #create a new thread
So the submit method creates a new thread, thread2, which executes the _read_queue function, and queue.get() returns "test2". Then this thread sleeps for 2 seconds. At this point the q queue object is empty, and any subsequent get() call will block the calling thread.
In the third iteration of the for loop, the submit method puts a job into the ThreadPoolExecutor's internal queue and again checks whether any thread is waiting for a job. There is no waiting thread; the first thread is sleeping (for 2 seconds) and the second thread is also sleeping, so the submit method creates a new thread, thread3 (it goes through both Step 1 and Step 2), which executes the _read_queue function just like the others. When thread3 runs, it executes queue.get(), but this blocks thread3, because q is empty and calling the get(block=True) method of an empty queue object blocks the calling thread.
In the fourth iteration of the for loop, the same thing happens as in the third case, and thread4 is blocked on the queue.get() operation as well.
Assuming the 2 seconds have not yet passed, there are now 5 threads alive (sleeping or not). After the 2 seconds have passed, thread1 and thread2 terminate*1 (their time.sleep(2) returns), but thread3 and thread4 do not, because queue.get() is blocking them. That's why your main thread (the whole program) waits for them and does not terminate.
What can we do in this situation?
We can put more elements into the q object, because q.get() blocks your thread by acquiring a lock object. We can only release this lock by calling its release() method, and to do that we need to call queue.put(something).
Here is one possible solution:
import time
import queue
from concurrent import futures

def _read_queue(queue):
    msg = queue.get()
    time.sleep(2)
    queue.put(None)  # feed the next blocked get() so that it can return

n_threads = 4
q = queue.Queue()
q.put('test')
q.put("test2")

with futures.ThreadPoolExecutor(max_workers=n_threads) as pool:
    futures = []
    for _ in range(n_threads):
        futures.append(pool.submit(_read_queue, q))
*1: I said that the ThreadPoolExecutor threads will terminate after the function finishes, but that depends on the shutdown() method being called. If we don't call the shutdown() method of the pool object, the threads do not terminate even when the function has finished, because creating and destroying a thread is costly; that is why the thread-pool concept exists. (The shutdown() method is called at the end of the with statement.)
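As a small illustration of that last point, the with statement used above is roughly equivalent to calling shutdown() yourself:
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)
try:
    future = pool.submit(print, "hello")
finally:
    # This is what the with statement does on exit: wait for running tasks
    # to finish, then let the worker threads terminate.
    pool.shutdown(wait=True)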
If I'm wrong somewhere please correct me.

How to wait for thread execution to complete before starting new thread?

I have Python code in which I can run a maximum of 10 threads at a time due to GPU and compute limitations. I have 100 folders that I want to process, and I want each thread to process one folder. Here is some sample code that I have written to achieve this.
import random
import threading
import time

def random_wait(thread_id):
    # print('Inside wait')
    rand_number = random.randint(3, 9)
    # print(f'Random number : {rand_number}')
    print(f'Thread {thread_id} waiting for {rand_number} seconds')
    time.sleep(rand_number)
    print(f'Thread {thread_id} completed execution')

if __name__ == '__main__':
    total_runs = 6
    thread_limit = 3
    running_threads = list()
    for i in range(total_runs):
        print(f'Active threads : {threading.active_count()}')
        if threading.active_count() > thread_limit:
            print(f'Active thread count exceeded')
            # check if an existing thread is no longer alive and wait for it to finish execution
            for running_thread in running_threads:
                if not running_thread.is_alive():
                    # Remove thread
                    running_threads.remove(running_thread)
                    print(f'Removing thread: {running_thread}')
        else:
            thread = threading.Thread(target=random_wait, args=(i,), kwargs={})
            running_threads.append(thread)
            print(f'Starting thread : {i}')
            thread.start()
In this code, I am checking whether the number of active threads exceeds the thread limit that I have specified, and the process refrains from creating new threads unless there's space for one more thread to be executed.
I am able to keep the process from starting new threads. However, I lose the threads that I wanted to start, and the code just ends up starting and stopping the first three threads. How can I start a new thread/process as soon as there's space for one more? Is there a better way in which I just start 10 threads, but as soon as one thread completes, I assign it to start processing another folder?
You should use a ThreadPoolExecutor from the standard library module concurrent.futures; it automatically manages a fixed number of threads. If you need to execute the same function with different arguments in parallel (as in a parallel for-loop), you can use the .map() method:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(10) as e:
    results = e.map(work, (arg_1, arg_2, ..., arg_n))
If you need to schedule different work in parallel, you should use the .submit() method:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(10) as e:
    future_1 = e.submit(work_1, arg_1)
    future_2 = e.submit(work_2, arg_2)
    result_1 = future_1.result()
    result_2 = future_2.result()
In the second case, .submit() returns a Future object which encapsulates the asynchronous execution of the work. You should store that future and get the result when needed. Note that the context manager (with statement) ensures that the .shutdown() method is called before leaving it, so all work is done after that point.
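Applied to the question (100 folders, at most 10 processed at a time), a minimal sketch; process_folder and the folder list are placeholders:
from concurrent.futures import ThreadPoolExecutor

def process_folder(folder):
    pass  # placeholder for the real per-folder work

folders = [f"folder_{i}" for i in range(100)]   # placeholder folder names

with ThreadPoolExecutor(max_workers=10) as e:
    # At most 10 folders are processed at once; as soon as one thread
    # finishes a folder it picks up the next one.
    results = list(e.map(process_folder, folders))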

How to properly implement producer consumer in python

I have two threads in a producer consumer pattern. When the consumer receives data, it calls a time-consuming function expensive() and then enters a for loop.
But if new data arrives while the consumer is working, it should abort the current work (exit the loop) and start over with the new data.
I tried something like this with a queue.Queue:
import queue

q = queue.Queue()

def producer():
    while True:
        ...
        q.put(d)

def consumer():
    while True:
        d = q.get()
        expensive(d)
        for i in range(10000):
            ...
            if not q.empty():
                break
But the problem with this code is that if the producer puts data too fast and the queue ends up holding many items, the consumer will, for each item, do the expensive(d) call plus one loop iteration and then abort, which is time consuming. The code should work, but it is not optimized.
Without modifying the code in expensive, one solution could be to run it as a separate process, which gives you the ability to terminate it prematurely. Since there's no mention of how long expensive runs, this may or may not be more time efficient, however.
import queue
import multiprocessing as mp

q = queue.Queue()

def producer():
    while True:
        ...
        q.put(d)

def consumer():
    while True:
        d = q.get()
        exp = mp.Process(target=expensive, args=(d,))  # run expensive(d) in a separate process
        exp.start()
        for i in range(10000):
            ...
            if not q.empty():
                exp.terminate()  # or exp.kill()
                break
Well, one way is to use a queue design that keeps internal lists of waiting and working threads. You can then create several consumer threads that wait on the queue and, when work arrives, assign a known consumer thread to do the work. When the thread has finished, it calls into the queue to remove itself from the working list and add itself to the waiting list.
Each consumer thread has an 'abort' atomic flag that can signal the thread to finish early. There will be some latency while the thread performs its inner loops, but that will not matter much.
If new work arrives at the queue from the producer, and the working list is not empty, the 'abort' flag of the working thread(s) can be set and their priority set to the minimum possible. The new work can then be dispatched to one of the waiting threads from the pool, setting it to work.
The waiting threads need a 'start' function that signals an event/semaphore/condition variable that the waiting thread, well, waits on. That allows the producer that supplied the work to set that specific thread running, rather than the 'usual' practice where any thread from a pool may pick up work.
Such a design allows new work to be started 'immediately', makes the previous work thread irrelevant by de-prioritizing it, and avoids the overhead of thread/process termination.
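A minimal sketch of just the abort-flag part of that design; expensive() and the inner-loop body are stand-ins for the question's code, and the waiting/working-list bookkeeping and priority changes are left out:
import threading
import time

def expensive(d):                 # stand-in for the question's costly call
    time.sleep(1)

def worker(d, abort):
    expensive(d)
    for i in range(10000):
        if abort.is_set():        # the producer signalled that newer data arrived
            return                # finish the current work early
        time.sleep(0.001)         # stand-in for one chunk of the loop body

abort = threading.Event()
t = threading.Thread(target=worker, args=("data", abort))
t.start()
time.sleep(1.5)                   # pretend new data arrives mid-loop
abort.set()                       # tell the working thread to stop early
t.join()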

End processing when all processes are trying to get from the queue and the queue is empty?

I want to set up some processes that take an input and process it, and the result of this processing is another task that I want to be handled. Essentially, each task results in zero or more new tasks (of the same type); eventually all tasks will yield no new tasks.
I figured a queue would be good for this, so I have an input queue and a results queue for the tasks that result in nothing new. At any one time, the queue might be empty, but more could be added if another process is working on a task.
Hence, I only want it to end when all processes are simultaneously trying to get from the input queue.
I am completely new to both python multiprocessing and multiprocessing in general.
Edited to add a basic overview of what I mean:
from multiprocessing import Process, Queue, cpu_count

class Consumer(Process):
    def __init__(self, name):
        super().__init__(name=name)

    def run(self):
        # This is where I would have the process try to get a new task off of the
        # queue, calculate the results and put them into the queue,
        # after which it would try to get a new task and repeat.
        # If this and all other processes are trying to get and the queue is
        # empty, that is the only time I know that everything is complete and I can
        # continue.
        pass

def start_processing():
    in_queue = Queue()
    results_queue = Queue()
    consumers = [Consumer(str(i)) for i in range(cpu_count())]
    for i in consumers:
        i.start()
    # Wait for the above mentioned conditions to be true before continuing
The JoinableQueue has been designed to fit this purpose. Joining a JoinableQueue will block while there are tasks in progress.
You can use it as follows: the main process spawns a certain number of worker processes, handing them the JoinableQueue. The worker processes use the queue to produce and consume new tasks. The main process waits by joining the queue until no more tasks are in progress. After that, it terminates the worker processes and quits.
A very simplified example (pseudocode):
def consumer(queue):
    while True:
        task = queue.get()
        results = process_task(task)
        if 'more_tasks' in results:
            for new_task in results['more_tasks']:
                queue.put(new_task)
        # signal the queue that a task has been completed
        queue.task_done()

def main():
    queue = JoinableQueue()
    processes = start_processes(consumer, queue)
    for task in initial_tasks:
        queue.put(task)
    queue.join()  # block until all work is done
    terminate_processes(processes)
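A runnable version of that sketch, where a "task" is just a string and each task spawns a few child tasks up to a fixed depth (these task semantics are made up for illustration):
from multiprocessing import JoinableQueue, Process, cpu_count

MAXLEVEL = 3

def consumer(queue):
    while True:
        task = queue.get()                 # blocks until a task is available
        if len(task) < MAXLEVEL:           # a task may spawn new tasks
            for i in range(3):
                queue.put(task + str(i))
        queue.task_done()                  # one task_done() per get()

def main():
    queue = JoinableQueue()
    workers = [Process(target=consumer, args=(queue,), daemon=True)
               for _ in range(cpu_count())]
    for w in workers:
        w.start()
    for task in "ab":                      # the initial tasks
        queue.put(task)
    queue.join()                           # returns once every put() has a matching task_done()
    for w in workers:                      # the workers are still blocked in get();
        w.terminate()                      # stop them now that all work is done

if __name__ == '__main__':
    main()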

Queue vs JoinableQueue in Python

In Python, while using the multiprocessing module, there are 2 kinds of queues:
Queue
JoinableQueue
What is the difference between them?
Queue
from multiprocessing import Queue
q = Queue()
q.put(item) # Put an item on the queue
item = q.get() # Get an item from the queue
JoinableQueue
from multiprocessing import JoinableQueue
q = JoinableQueue()
q.task_done() # Signal task completion
q.join() # Wait for completion
JoinableQueue has the methods join() and task_done(), which Queue lacks. From the documentation:
class multiprocessing.Queue([maxsize])
Returns a process shared queue implemented using a pipe and a few locks/semaphores. When a process first puts an item on the queue a feeder thread is started which transfers objects from a buffer into the pipe.
The usual Queue.Empty and Queue.Full exceptions from the standard library’s Queue module are raised to signal timeouts.
Queue implements all the methods of Queue.Queue except for task_done() and join().
class multiprocessing.JoinableQueue([maxsize])
JoinableQueue, a Queue subclass, is a queue which additionally has task_done() and join() methods.
task_done()
Indicate that a formerly enqueued task is complete. Used by queue consumer threads. For each get() used to fetch a task, a subsequent call to task_done() tells the queue that the processing on the task is complete.
If a join() is currently blocking, it will resume when all items have been processed (meaning that a task_done() call was received for every item that had been put() into the queue).
Raises a ValueError if called more times than there were items placed in the queue.
join()
Block until all items in the queue have been gotten and processed.
The count of unfinished tasks goes up whenever an item is added to the queue. The count goes down whenever a consumer thread calls task_done() to indicate that the item was retrieved and all work on it is complete. When the count of unfinished tasks drops to zero, join() unblocks.
If you use JoinableQueue then you must call JoinableQueue.task_done() for each task removed from the queue or else the semaphore used to count the number of unfinished tasks may eventually overflow, raising an exception.
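A minimal illustration of those two methods, with a single worker process and trivial work:
from multiprocessing import JoinableQueue, Process

def worker(q):
    while True:
        item = q.get()
        print("processed", item)    # stand-in for real work
        q.task_done()               # one task_done() per get()

if __name__ == '__main__':
    q = JoinableQueue()
    p = Process(target=worker, args=(q,), daemon=True)
    p.start()
    for item in range(3):
        q.put(item)
    q.join()    # returns only after task_done() has been called for all three items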
Based on the documentation, it's hard to be sure that Queue is actually empty. With JoinableQueue you can wait for the queue to empty by calling q.join(). In cases where you want to complete work in distinct batches where you do something discrete at the end of each batch, this could be helpful.
For example, perhaps you process 1000 items at a time through the queue, then send a push notification to a user that you've completed another batch. This would be challenging to implement with a normal Queue.
It might look something like:
import multiprocessing as mp

BATCH_SIZE = 1000
STOP_VALUE = 'STOP'

def consume(q):
    for item in iter(q.get, STOP_VALUE):
        try:
            process(item)
        # Be very defensive about errors since they can corrupt pipes.
        except Exception as e:
            logger.error(e)
        finally:
            q.task_done()

q = mp.JoinableQueue()
with mp.Pool() as pool:
    # Pull items off the queue as fast as we can whenever they're ready.
    for _ in range(mp.cpu_count()):
        pool.apply_async(consume, (q,))
    for i in range(0, len(URLS), BATCH_SIZE):
        # Put `BATCH_SIZE` items in the queue asynchronously.
        pool.map_async(expensive_func, URLS[i:i+BATCH_SIZE], callback=q.put)
        # Wait for the queue to empty.
        q.join()
        notify_users()
    # Stop the consumers so we can exit cleanly.
    for _ in range(mp.cpu_count()):
        q.put(STOP_VALUE)
NB: I haven't actually run this code. If you pull items off the queue faster than you put them on, you might finish early. In that case this code sends an update AT LEAST every 1000 items, and maybe more often. For progress updates, that's probably ok. If it's important to be exactly 1000, you could use an mp.Value('i', 0) and check that it's 1000 whenever your join releases.
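A minimal sketch of that mp.Value idea, using a single plain Process instead of a Pool so the shared counter can be passed directly (the batch/notification logic is left out):
import multiprocessing as mp

def consume(q, done_count):
    for item in iter(q.get, 'STOP'):
        # ... process(item) would go here ...
        with done_count.get_lock():
            done_count.value += 1      # count finished items
        q.task_done()

if __name__ == '__main__':
    q = mp.JoinableQueue()
    done_count = mp.Value('i', 0)      # shared integer, starts at 0
    worker = mp.Process(target=consume, args=(q, done_count))
    worker.start()
    for i in range(1000):
        q.put(i)
    q.join()                           # wait for the batch to finish
    print(done_count.value == 1000)    # check the batch really was 1000 items
    q.put('STOP')
    worker.join()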
