Efficiency with multiple asyncio readers and a single writer (different thread) - python

I would like to combine threading and asyncio with some synchronisation.
For example: A thread write-combines frames from a camera into some variable or buffer. Multiple readers (asyncio or threads) are woken on each write to take the latest available frame.
I have tried deriving from asyncio.Event, to no avail:
class EventThreadSafe(asyncio.Event):
    def set(self):
        self._loop.call_soon_threadsafe(super().set)
Is there an existing mechanism that does this (https://github.com/aio-libs/janus?), or what is the best way to implement it?

You can use asyncio.Queue, but it is only task safe, not thread safe. If you want thread safety, use queue.Queue, but that is not task safe, since its blocking calls will block the whole thread (and with it the event loop). Personally, for multiprocess work I use a 0MQ push-pull pattern, which feeds to/from an asyncio.Queue through an adapter.
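As a rough sketch of the pattern from the original question (the LatestFrame class and its method names are made up for illustration, not from any library): a writer thread can publish the latest frame and wake asyncio readers by hopping onto the loop's thread with call_soon_threadsafe.

import asyncio
import threading
import time

class LatestFrame:
    """Holds the most recent frame and wakes asyncio readers on each write."""

    def __init__(self, loop):
        self._loop = loop
        self._event = asyncio.Event()   # task-safe only, so never touch it off-loop
        self._frame = None

    def publish(self, frame):
        """Called from the writer thread: store the frame and wake readers."""
        self._frame = frame
        # Event.set() is not thread safe, so schedule it on the loop's thread.
        self._loop.call_soon_threadsafe(self._event.set)

    async def wait_for_frame(self):
        """Called from asyncio tasks: wait for the next write, return the latest frame."""
        await self._event.wait()
        self._event.clear()
        return self._frame

async def main():
    latest = LatestFrame(asyncio.get_running_loop())

    def camera_thread():
        for i in range(3):
            time.sleep(0.1)              # pretend to grab a frame
            latest.publish(f"frame {i}")

    threading.Thread(target=camera_thread, daemon=True).start()

    for _ in range(3):
        print(await latest.wait_for_frame())

asyncio.run(main())

Note that this is write-combining by design: a slow reader only ever sees the most recent frame, and intermediate frames are dropped.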

Related

What is the point for asyncio synchronization primitives not to be thread safe?

It seems that several asyncio functions, like those shown here, for synchronization primitives are not thread safe...
Not being thread safe, considering for example asyncio.Lock, I assume that this lock won't protect a global variable when we're running multiple threads, so race conditions are a problem.
So, what's the point of having this Lock that doesn't lock? (not a criticism, but an honest doubt)
What are the use cases for these unsafe primitives?
Asyncio is not made for multithreading or multiprocessing work; it was originally made for asynchronous IO (network) operations with little overhead. A lock that only runs in a single thread (as is the case for tasks running in the asyncio event loop) doesn't need to be thread-safe or process-safe, and therefore doesn't have to suffer the extra overhead of a thread-safe or process-safe lock.
Thread and process executors were only added to allow mingling threaded futures and multiprocessing futures seamlessly with an application's event-loop futures, for example passing them to the as_completed function or awaiting their completion as you would a normal, non-threaded asyncio task.
If you want a thread-safe lock you can use a threading.Lock, and if you want a process-safe lock you should use a multiprocessing.Lock and accept the extra overhead.
Keep in mind that those locks still work inside an asyncio event loop and perform almost the same function as an asyncio.Lock; they just have higher overhead and will make your code slower, so don't use them unless you actually need your code to be thread-safe or process-safe.
To briefly explain the difference: when a thread is halted by a thread-safe lock, the thread is suspended and rescheduled by the operating system, which has a big overhead compared to an asyncio lock, which simply returns control to the event loop and continues execution instead of halting the thread.
Edit: a threading.Lock is not a direct replacement for an asyncio.Lock; instead you should use a threading.RLock followed by an asyncio.Lock to make a function both thread-safe and asyncio-safe, as this avoids a thread dead-locking itself.
Edit 2: as commented by @dano, you can wait for a threading.Lock indirectly using the answer to this question if you want a function to work both in threads and in the asyncio event loop at the same time, though it is not recommended to run a function in both at once anyway: How to use threading.Lock in async function while object can be accessed from multiple thread
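A minimal sketch of the idea in Edit 2 (the function names here are illustrative only): acquire the threading.Lock in an executor, so asyncio tasks can wait for it without blocking the event loop, while plain threads take it directly.

import asyncio
import threading

thread_lock = threading.Lock()   # OS-level lock, shared with plain threads

def blocking_worker(shared):
    with thread_lock:            # may block this thread; fine outside the loop
        shared.append("from thread")

async def async_worker(shared):
    loop = asyncio.get_running_loop()
    # Acquire the threading.Lock in the default executor so the loop keeps running.
    await loop.run_in_executor(None, thread_lock.acquire)
    try:
        shared.append("from task")
    finally:
        thread_lock.release()

async def main():
    shared = []
    t = threading.Thread(target=blocking_worker, args=(shared,))
    t.start()
    await async_worker(shared)
    t.join()
    print(shared)

asyncio.run(main())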

Is `queue.Queue` thread-safe when accessed by several subprocesses using `concurrent.futures.ProcessPoolExecutor`?

I have been using queue.Queue extensively in situations where I execute multiple threads, e.g. by using concurrent.futures.ThreadPoolExecutor.
I've read in blogs that queue.Queue should be thread-safe, but does that mean it's thread-safe only under the assumption that the Python interpreter executes one thread at a time (the GIL), or is it also safe in situations using multiprocessing, which side-steps the GIL by using subprocesses instead of threads?
https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor
ProcessPoolExecutor uses a multiprocessing.queues.Queue for the call queue and an mp_context.SimpleQueue (from multiprocessing) for the result queue; these are used to communicate between a local thread and the worker processes.
Nice graphic of ProcessPoolExecutor
concurrent.futures.ProcessPoolExecutor uses multiprocessing queues to communicate between threads and processes.
The multiprocessing.queues.Queue docs specifically state that it is thread- and process-safe.
At the bottom of the queue documentation there is a note referring to the multiprocessing.Queue object "... for use in a multi-processing (rather than multi-threading) context".
There is a Queue designed for this in the multiprocessing library:
from multiprocessing import Queue
It pickles items and sends the bytes over a pipe, protected by locks, so it is both thread- and process-safe.
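For example, a minimal sketch of passing data between processes with multiprocessing.Queue (the worker function here is just an illustration):

from multiprocessing import Process, Queue

def worker(q):
    q.put("hello from the child")    # the item is pickled and sent over the queue's pipe

if __name__ == "__main__":
    q = Queue()                      # thread- and process-safe
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())                   # blocks until the child has put an item
    p.join()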

How to perform *synchronous* read/write into/from an asyncio transport object

I am using asyncio on Windows and have a reference to a transport object of a named pipe:
class DataPipeHandler(asyncio.Protocol):
    def connection_made(self, trans):
        self.trans = trans  # <<== this is a reference to a transport object of type _ProactorDuplexPipeTransport

loop = asyncio.get_event_loop()
server = loop.start_serving_pipe(lambda: DataPipeHandler(), r'\\.\pipe\test-pipe')
Now I would like to use self.trans to synchronously write and then read data from the named pipe. How can I do this?
It's important for me to do this synchronously, as this is a kind of RPC call I am making over the pipe (writing something and getting back a response quickly), and I do want to block all other activities of the event loop until this "pipe RPC call" returns.
If I don't block all other activities of the event loop until this RPC call is done, I will have unwanted side effects, as the loop will continue to process other events I don't want it to process yet.
What I want to do (write to the pipe and then read) is very similar to calling urllib2.urlopen(urllib2.Request('http://www.google.com')).read() from the event loop thread - there, too, all event loop activities are blocked until we get a response from the remote HTTP server.
I know that I can call self.trans.write(data), but this does not write the data synchronously (as I understand it, it does not block).
Thanks.
EDIT: Following the first comment, let me add:
I understand I should never block the event loop and that I can use synchronization primitives to accomplish what I want. But let's say you have an event loop that is doing 10 different activities in parallel, one of them is doing some kind of RPC (as described above), and all the other 9 activities should be blocked until this RPC is done. So I have two options:
(1) add synchronization primitives (lock/semaphore/condition), as you suggested, to all these 10 activities in order to synchronize them;
(2) implement this RPC with a blocking write and then a blocking read to/from the pipe (assuming I trust the other side of the pipe).
I know this is not the usual way of using event loops, but in my specific case I think (2) is better (simpler logic).
I think you have to use threading synchronization primitives to make sure the whole loop (the current thread) is blocked. I think the best option is to use a thread-safe queue together with its join feature.
@Andrew Svetlov:
As you said, asyncio synchronization primitives are the normal choice.
But my problem is how to wrap and expose the event loop's coroutines as synchronous function APIs. I don't want API consumers to have to deal with asyncio concepts.
Update:
Found a good answer: how can I package a coroutine as normal function in event loop?
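One way to do that (a sketch only; the names rpc_call and sync_rpc_call are made up for illustration) is to run the event loop in a background thread and wrap the coroutine with asyncio.run_coroutine_threadsafe, so callers see a plain blocking function:

import asyncio
import threading

loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def rpc_call(request):
    await asyncio.sleep(0.1)               # stand-in for the pipe write/read
    return "response to " + request

def sync_rpc_call(request):
    """Blocking wrapper; callers never see asyncio."""
    future = asyncio.run_coroutine_threadsafe(rpc_call(request), loop)
    return future.result()                 # blocks the calling thread only

print(sync_rpc_call("ping"))
loop.call_soon_threadsafe(loop.stop)

Note that this blocks the calling thread rather than the loop itself, which is usually what you want when hiding asyncio behind a synchronous API.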

python: waiting on multiple objects (Queue, Lock, Condition, etc.)

I'm using the Python threading library. Works fine (subject to the Global Interpreter Lock, of course).
Now I have a conundrum. I have two separate sources of concurrency: either two Queues, or a Queue and a Condition. How can I wait for the first one that is ready? (They have to be separate objects since they are owned by different modular parts of my application.)
Windows has the WaitForMultipleObjects function; is there something similar for Python concurrency primitives?
There is no existing function I know of that does what you ask. However, there is threading.enumerate(), which returns a list of all currently alive threads (daemon or not), no matter the source. Once you have that list, you could iterate over it looking for the condition you want. To set a thread as a daemon, each thread has a method that can be called, like thread.setDaemon(True), before the thread is started.
I can't say for sure that this is your answer. I don't have as much experience as you apparently do, but I looked this up in a book I have, The Python Standard Library by Example by Doug Hellmann. He has 23 pages on managing concurrent operations in the section on threading, and enumerate seemed to be something that would help.
You could create a new synchronization object (queue, condition, etc.), let's call it ready_event, and one thread for each sync object you want to watch. Each thread waits for its sync object to be ready; when it is, the thread signals that via ready_event. After you have created and started the threads, you can wait on ready_event.
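A rough sketch of that idea (all names here are illustrative): one watcher thread per source forwards whatever becomes ready into a single shared queue, so the consumer only has to wait on one object.

import queue
import threading

def forward(source, ready, tag):
    while True:
        item = source.get()          # wait on this one source
        ready.put((tag, item))       # funnel it into the shared "ready" queue

q1, q2, ready = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=forward, args=(q1, ready, "q1"), daemon=True).start()
threading.Thread(target=forward, args=(q2, ready, "q2"), daemon=True).start()

q2.put("hello")
q1.put("world")

for _ in range(2):
    print(ready.get())               # first item from whichever source was ready first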

Python safe queue

Is it safe if I just use the put and get_nowait functions on a queue that is shared between threads? When do I need to use a thread lock?
The essential idea of a queue is to share it between multiple threads.
The Queue class implements all the required locking semantics, so you don't have to acquire a lock explicitly.
http://docs.python.org/library/queue.html#module-Queue
The Queue module (called queue in Python 3) is specifically designed to work in multithreaded environments.
If that's what you're using, you don't need any additional locking.
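For example, a minimal sketch of a queue.Queue shared between a producer thread and the main thread, with no extra lock needed:

import queue
import threading

q = queue.Queue()                      # handles its own locking internally

def producer():
    for i in range(3):
        q.put(i)                       # safe to call from any thread

threading.Thread(target=producer).start()

while True:
    try:
        print(q.get(timeout=1))        # or q.get_nowait() in a polling loop
    except queue.Empty:
        break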
