Force put in Python queue even when full

I have a thread that reads frames from a webcam and puts them in a queue. Another thread then reads the frames as needed.
My problem is that I want to keep the latest frames in the queue even if the consumer thread cannot keep up.
I can do the following in the thread that queues frames:
if q.full():
    drop = q.get()
q.put(new_frame)
But I think this can fail in two ways.
If the queue was full when full() was called but the consumer then fetches frames, the producer will discard a frame for no reason.
If the queue was full when full() was called and the consumer then fetches all frames from the queue, the producer will block on q.get(), essentially breaking the application.
In my case, would using threading.Lock with a simple list be the way to go?

I believe that approach can work. Acquire the lock in two places: in the consumer, around the call that fetches frames, and in the producer, after it has checked whether the queue is full.
Once the producer has seen that the queue is full and has acquired the lock, it can check the condition again to see whether the queue is still full, and based on that decide whether to call get() before putting the new frame.
Thread 1:
if q.full():
    # acquire lock
    if q.full():
        drop = q.get()  # drop the oldest frame to make room
    q.put(new_frame)
    # release lock
else:
    q.put(new_frame)
Thread 2:
if not q.empty():
//acquire lock
q.get()
// release lock
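A concrete version of that double-check might look like the following. This is only a sketch, assuming a single producer and a single consumer as in the question; q is the existing bounded queue.Queue and lock is a threading.Lock shared by both threads:

import threading

lock = threading.Lock()

# Producer thread
def put_frame(new_frame):
    if q.full():
        with lock:
            if q.full():            # re-check now that we hold the lock
                q.get_nowait()      # drop the oldest frame to make room
            q.put_nowait(new_frame)
    else:
        q.put_nowait(new_frame)     # only this thread adds, so no race here

# Consumer thread
def get_frame():
    if not q.empty():
        with lock:
            return q.get_nowait()   # safe: the producer never shrinks the queue
    return None

Using the *_nowait variants ensures neither thread can block inside the critical section; with one producer and one consumer the re-check under the lock removes both failure cases described in the question.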

I have implemented a queue with a non-blocking put():
from queue import Queue
from collections import deque

class NonBlockingPutQueue(Queue):
    def put(self, item, block=True, timeout=None):
        # Never blocks: the underlying deque(maxlen=...) silently drops the
        # oldest item when the queue is full.
        with self.not_full:
            self._put(item)
            self.unfinished_tasks += 1
            self.not_empty.notify()

    def _init(self, maxsize):
        self.queue = deque(maxlen=maxsize)
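For the webcam case, usage could look like the sketch below; cam.read() and process() are hypothetical stand-ins for the real capture and consumer code:

# Keep only the 5 most recent frames; the producer never blocks, and the
# oldest frame is silently dropped whenever the consumer falls behind.
q = NonBlockingPutQueue(maxsize=5)

def producer(cam):
    while True:
        q.put(cam.read())   # never blocks, even when the queue is full

def consumer():
    while True:
        frame = q.get()     # blocks until a frame is available
        process(frame)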

Related

How to properly implement producer consumer in python

I have two threads in a producer-consumer pattern. When the consumer receives data it calls a time-consuming function expensive() and then enters a for loop.
But if new data arrives while the consumer is working, it should abort the current work (exit the loop) and start on the new data.
I tried something like this with a queue.Queue:
q = queue.Queue()

def producer():
    while True:
        ...
        q.put(d)

def consumer():
    while True:
        d = q.get()
        expensive(d)
        for i in range(10000):
            ...
            if not q.empty():
                break
The problem with this code is that if the producer puts data too fast and the queue builds up many items, the consumer will run the expensive(d) call plus one loop iteration and then abort, once for each queued item, which is time consuming. The code works, but it is not optimized.
Without modifying the code in expensive(), one solution could be to run it as a separate process, which gives you the ability to terminate it prematurely. Since there's no mention of how long expensive() runs, this may or may not be more time-efficient, however.
import multiprocessing as mp
import queue

q = queue.Queue()

def producer():
    while True:
        ...
        q.put(d)

def consumer():
    while True:
        d = q.get()
        exp = mp.Process(target=expensive, args=(d,))
        exp.start()
        for i in range(10000):
            ...
            if not q.empty():
                exp.terminate()  # or exp.kill() on Python 3.7+
                break
Well, one way is to use a queue design that keeps internal lists of waiting and working threads. You can then create several consumer threads to wait on the queue and, when work arrives, pick a known consumer thread to do the work. When a thread has finished, it calls into the queue to remove itself from the working list and add itself back to the waiting list.
The consumer threads each have an 'abort' flag that can signal the thread to finish early. There will be some latency while the thread finishes its inner loops, but that will not matter much.
If new work arrives at the queue from the producer and the working list is not empty, the 'abort' flag of the working thread(s) can be set and their priority set to the minimum possible. The new work can then be dispatched onto one of the waiting threads from the pool, setting it to work.
The waiting threads need a 'start' function that signals an event/semaphore/condvar that the waiting thread, well, waits on. That allows the producer that supplied the work to start that specific thread, rather than the usual practice where any thread from a pool may pick up work.
Such a design allows new work to be started 'immediately', makes the previous work thread irrelevant by de-prioritizing it and avoids the overheads of thread/process termination.

Python raise exception by dbus signal with multithreading concept

I have a specific problem.
The main part of the program starts by creating a Process with a D-Bus loop, where I listen for signals.
The content of the signals is stored in queues. In the next part of main I have a thread pool.
When a thread takes an item from the queue, it uses a specific function (detection) to handle the request, based on the content of the item. (There are database operations involved: I fetch data and perform some operations depending on the request.)
Every thread in the thread pool starts one more thread, which should handle signals (current status and interrupt).
For example: I receive a signal which means I have to handle something about numbers. Some thread from the thread pool takes this item from the queue and starts the function that handles it, which can take a long time. At some point I may receive a signal asking for the current status, and I need to report the current status of the detection; that's why I use threads (for shared memory). I can also receive an interrupt signal over D-Bus ("it is taking too long, so stop this detection and be free for another request"). And the interrupt is the main problem...
So my main questions are:
Is there any way I can raise an exception on the interrupt signal and stop the function (detection)? (I found a solution, but it only works when catching it in main... I need to catch it in the worker thread from the thread pool and raise it in the signal-handling thread that the worker started.)
My second question is about the GIL... does my signal-receiving thread receive all the signals? I think it doesn't... (Yes, I use threads_init().)
program:
SERVICE = multiprocessing.Process(target=dbus_signal_receiver, args=(...))
SERVICE.daemon = True
SERVICE.start()

class worker(threading.Thread):
    def __init__(self, ...):
        threading.Thread.__init__(self)

    def run(self):
        while True:
            # get item from queue
            s = threading.Thread(target=curr_and_interr_signal_handle,
                                 args=(ID of item from queue, ...))
            s.daemon = True
            s.start()
            # start specific detection based on request

for i in range(number of threads):
    t = worker(...)
    t.daemon = True
    t.start()
I hoped something like this would work (but it doesn't):
...
class worker(threading.Thread):
    def __init__(self, ...):
        threading.Thread.__init__(self)

    def run(self):
        while True:
            try:
                # get item from queue
                s = threading.Thread(target=curr_and_interr_signal_handle,
                                     args=(ID of item from queue, ...))
                s.daemon = True
                s.start()
                # start specific detection based on request
            except raised_interrupt_exception:
                # continue - wait for another request from queue
                ...
Read about 18.8.1.2. Signals and threads:
Python signal handlers are always executed in the main Python thread,
even if the signal was received in another thread.
This means that signals can’t be used as a means of inter-thread communication.
You can use the synchronization primitives from the threading module instead.
Besides, only the main thread is allowed to set a new signal handler.
Read about 17.1.7. Event Objects:
This is one of the simplest mechanisms for communication between threads: one thread signals an event and other threads wait for it.
It isn't clear why you have to use a thread inside a thread.
Why can't your worker thread handle the detection itself?
For instance, the following should do it:
def run(self):
    while self.running.is_set():
        # get item from queue
        # start specific detection based on request
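A minimal sketch of that Event-based approach follows, assuming the D-Bus receiving code simply sets an Event when the interrupt signal arrives; work_queue and detect_step() are hypothetical placeholders for the real queue and detection work:

import threading
import queue

work_queue = queue.Queue()

class Worker(threading.Thread):
    def __init__(self, interrupt):
        threading.Thread.__init__(self, daemon=True)
        self.interrupt = interrupt         # one Event per worker, set on the D-Bus interrupt

    def run(self):
        while True:
            item = work_queue.get()
            self.interrupt.clear()
            for step in range(10000):      # the long-running detection
                if self.interrupt.is_set():
                    break                  # abandon this detection and wait for the next item
                detect_step(item, step)    # hypothetical unit of detection work

workers = [Worker(threading.Event()) for _ in range(4)]
for w in workers:
    w.start()
# The signal-handling code would call workers[i].interrupt.set() to stop worker i.

This avoids trying to raise an exception in another thread at all: the detection loop cooperatively checks the Event and exits on its own.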

Queue vs JoinableQueue in Python

In Python's multiprocessing module there are two kinds of queues:
Queue
JoinableQueue
What is the difference between them?
Queue
from multiprocessing import Queue
q = Queue()
q.put(item) # Put an item on the queue
item = q.get() # Get an item from the queue
JoinableQueue
from multiprocessing import JoinableQueue
q = JoinableQueue()
q.task_done() # Signal task completion
q.join() # Wait for completion
JoinableQueue has the methods join() and task_done(), which Queue lacks.
class multiprocessing.Queue([maxsize])
Returns a process shared queue implemented using a pipe and a few locks/semaphores. When a process first puts an item on the queue a feeder thread is started which transfers objects from a buffer into the pipe.
The usual Queue.Empty and Queue.Full exceptions from the standard library’s Queue module are raised to signal timeouts.
Queue implements all the methods of Queue.Queue except for task_done() and join().
class multiprocessing.JoinableQueue([maxsize])
JoinableQueue, a Queue subclass, is a queue which additionally has task_done() and join() methods.
task_done()
Indicate that a formerly enqueued task is complete. Used by queue consumer threads. For each get() used to fetch a task, a subsequent call to task_done() tells the queue that the processing on the task is complete.
If a join() is currently blocking, it will resume when all items have been processed (meaning that a task_done() call was received for every item that had been put() into the queue).
Raises a ValueError if called more times than there were items placed in the queue.
join()
Block until all items in the queue have been gotten and processed.
The count of unfinished tasks goes up whenever an item is added to the queue. The count goes down whenever a consumer thread calls task_done() to indicate that the item was retrieved and all work on it is complete. When the count of unfinished tasks drops to zero, join() unblocks.
If you use JoinableQueue then you must call JoinableQueue.task_done() for each task removed from the queue or else the semaphore used to count the number of unfinished tasks may eventually overflow, raising an exception.
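To make task_done() and join() concrete, here is a small sketch (not taken from the documentation above): the worker calls task_done() exactly once per get(), and join() in the parent returns only after every item has been processed.

import multiprocessing as mp

def work(q):
    while True:
        item = q.get()
        try:
            print('processed', item)   # stand-in for real work
        finally:
            q.task_done()              # exactly one call per get()

if __name__ == '__main__':
    q = mp.JoinableQueue()
    mp.Process(target=work, args=(q,), daemon=True).start()
    for i in range(5):
        q.put(i)
    q.join()   # blocks until task_done() has been called for all 5 items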
Based on the documentation, it's hard to be sure that Queue is actually empty. With JoinableQueue you can wait for the queue to empty by calling q.join(). In cases where you want to complete work in distinct batches where you do something discrete at the end of each batch, this could be helpful.
For example, perhaps you process 1000 items at a time through the queue, then send a push notification to a user that you've completed another batch. This would be challenging to implement with a normal Queue.
It might look something like:
import multiprocessing as mp

BATCH_SIZE = 1000
STOP_VALUE = 'STOP'

def consume(q):
    for item in iter(q.get, STOP_VALUE):
        try:
            process(item)
        # Be very defensive about errors since they can corrupt pipes.
        except Exception as e:
            logger.error(e)
        finally:
            q.task_done()

q = mp.JoinableQueue()
with mp.Pool() as pool:
    # Pull items off the queue as fast as we can whenever they're ready.
    for _ in range(mp.cpu_count()):
        pool.apply_async(consume, (q,))
    for i in range(0, len(URLS), BATCH_SIZE):
        # Put `BATCH_SIZE` items in the queue asynchronously.
        pool.map_async(expensive_func, URLS[i:i+BATCH_SIZE], callback=q.put)
        # Wait for the queue to empty.
        q.join()
        notify_users()
    # Stop the consumers so we can exit cleanly.
    for _ in range(mp.cpu_count()):
        q.put(STOP_VALUE)
NB: I haven't actually run this code. If you pull items off the queue faster than you put them on, you might finish early. In that case this code sends an update AT LEAST every 1000 items, and maybe more often. For progress updates, that's probably ok. If it's important to be exactly 1000, you could use an mp.Value('i', 0) and check that it's 1000 whenever your join releases.

Data type for a "closable" queue to handle a stream of items for multiple producers and consumers

Is there a specific type of queue that is "closable" and is suitable for when there are multiple producers and consumers and the data comes from a stream (so it's not known when it will end)?
I've been unable to find a queue that implements this sort of behavior, or a name for one, but it seems like an integral type for producer-consumer problems.
As an example, ideally I could write code where (1) each producer would tell the queue when it was done, (2) consumers would blindly call a blocking get(), and (3) when all producers were done and the queue was empty, all the consumers would unblock and receive a "done" notification.
As code, it'd look something like this:
def produce():
    for x in range(randint()):
        queue.put(x)
        sleep(randint())
    queue.close()  # called once for every producer

def consume():
    while True:
        try:
            print(queue.get())
        except ClosedQueue:
            print('done!')
            break

num_producers = randint()
queue = QueueTypeThatICantFigureOutANameFor(num_producers)

[Thread(target=produce).start() for _ in range(num_producers)]
[Thread(target=consume).start() for _ in range(randint())]
Also, I'm not looking for the "Poison Pill" solution, where a "done" value is added to the queue for every consumer -- I don't like the inelegance of producers needing to know how many consumers there are.
I'd call that a self-latching queue.
For your primary requirement, combine the queue with a condition variable check that gracefully latches (shuts down) the queue when all producers have vacated:
class SelfLatchingQueue(LatchingQueue):
    ...
    def __init__(self, num_producers):
        ...

    def close(self):
        '''Called by a producer to indicate that it is done producing.'''
        # ... perhaps check that the current thread is a known producer? ...
        with self.a_mutex:
            self._num_active_producers -= 1
            if self._num_active_producers <= 0:
                # Future put()s throw QueueLatched. get()s will empty the queue
                # and then throw QueueEmpty thereafter.
                self.latch()  # Guess what the superclass implements?
For your secondary requirement (#3 in the original post, finished producers apparently block until all consumers are finished), I'd perhaps use a barrier or just another condition variable. This could be implemented in a subclass of the SelfLatchingQueue, of course, but without knowing the codebase I'd keep this behavior separate from the automatic latching.
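For comparison, here is a small self-contained sketch of the same latching idea built directly on a Condition; Closed, ClosableQueue and its methods are names chosen here for illustration, not an existing API:

import threading
from collections import deque

class Closed(Exception):
    pass

class ClosableQueue:
    def __init__(self, num_producers):
        self._items = deque()
        self._producers_left = num_producers
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def close(self):
        # Each producer calls this exactly once when it is done producing.
        with self._cond:
            self._producers_left -= 1
            if self._producers_left == 0:
                self._cond.notify_all()   # wake every blocked consumer

    def get(self):
        with self._cond:
            while not self._items:
                if self._producers_left == 0:
                    raise Closed          # drained and fully closed
                self._cond.wait()
            return self._items.popleft()

Consumers simply call get() in a loop and break on Closed, exactly as in the question's pseudocode, and no consumer count or poison pill is needed.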

Can you join a Python queue without blocking?

Python's Queue has a join() method that will block until task_done() has been called on all the items that have been taken from the queue.
Is there a way to periodically check for this condition, or receive an event when it happens, so that you can continue to do other things in the meantime? You can, of course, check if the queue is empty, but that doesn't tell you if the count of unfinished tasks is actually zero.
The Python Queue itself does not support this, so you could try the following:
from threading import Thread

class QueueChecker(Thread):
    def __init__(self, q):
        Thread.__init__(self)
        self.q = q

    def run(self):
        self.q.join()

q_manager_thread = QueueChecker(my_q)
q_manager_thread.start()

while q_manager_thread.is_alive():
    # do other things
    pass

# When the loop exits the tasks are done, because the thread will have
# returned from blocking on q.join() and exited its run method.
q_manager_thread.join()  # to clean up the thread
A while loop on thread.is_alive() might not be exactly what you want, but at least you can see how to check on the status of q.join() asynchronously now.
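A variant of the same idea, again only a sketch: have the watcher thread set a threading.Event when q.join() returns, so the main thread can wait with a timeout instead of polling is_alive():

import threading

def watch(q, done):
    q.join()        # blocks until the unfinished-task count reaches zero
    done.set()

tasks_done = threading.Event()
threading.Thread(target=watch, args=(my_q, tasks_done), daemon=True).start()

while not tasks_done.wait(timeout=0.1):
    pass            # do other things between checks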
