I'm not sure what I'm doing wrong here. I'm trying to have a class which contains a queue and uses a coroutine to consume items on that queue. The wrinkle is that the event loop is being run in a separate thread (in that thread I do loop.run_forever() to get it running).
What I'm seeing, though, is that the coroutine for consuming items is never fired:
import asyncio
from threading import Thread
import functools

# so print always flushes to stdout
print = functools.partial(print, flush=True)


def start_loop(loop):
    def run_forever(loop):
        print("Setting loop to run forever")
        asyncio.set_event_loop(loop)
        loop.run_forever()
        print("Leaving run forever")

    asyncio.set_event_loop(loop)
    print("Spawaning thread")
    thread = Thread(target=run_forever, args=(loop,))
    thread.start()


class Foo:
    def __init__(self, loop):
        print("in foo init")
        self.queue = asyncio.Queue()
        asyncio.run_coroutine_threadsafe(self.consumer(self.queue), loop)

    async def consumer(self, queue):
        print("In consumer")
        while True:
            message = await queue.get()
            print(f"Got message {message}")
            if message == "END OF QUEUE":
                print(f"exiting consumer")
                break
            print(f"Processing {message}...")


def main():
    loop = asyncio.new_event_loop()
    start_loop(loop)

    f = Foo(loop)
    f.queue.put("this is a message")
    f.queue.put("END OF QUEUE")

    loop.call_soon_threadsafe(loop.stop)
    # wait for the stop to propagate and complete
    while loop.is_running():
        pass


if __name__ == "__main__":
    main()
Output:
Spawaning thread
Setting loop to run forever
in foo init
Leaving run forever
There are several issues with this code.
First, check the warnings:
test.py:44: RuntimeWarning: coroutine 'Queue.put' was never awaited
  f.queue.put("this is a message")
test.py:45: RuntimeWarning: coroutine 'Queue.put' was never awaited
  f.queue.put("END OF QUEUE")
That means queue.put is a coroutine function, so the coroutine it returns has to be run using run_coroutine_threadsafe:
asyncio.run_coroutine_threadsafe(f.queue.put("this is a message"), loop)
asyncio.run_coroutine_threadsafe(f.queue.put("END OF QUEUE"), loop)
You could also use queue.put_nowait, which is a synchronous method. However, asyncio objects are generally not thread-safe, so every synchronous call has to go through call_soon_threadsafe:
loop.call_soon_threadsafe(f.queue.put_nowait, "this is a message")
loop.call_soon_threadsafe(f.queue.put_nowait, "END OF QUEUE")
Another issue is that the loop gets stopped before the consumer task can start processing items. You could add a join method to the Foo class to wait for the consumer to finish:
class Foo:
    def __init__(self, loop):
        [...]
        self.future = asyncio.run_coroutine_threadsafe(self.consumer(self.queue), loop)

    def join(self):
        self.future.result()
Then make sure to call this method before stopping the loop:
f.join()
loop.call_soon_threadsafe(loop.stop)
This should be enough to get the program to work as you expect. However, this code is still problematic in several respects.
First, the loop should not be set both in the main thread and the extra thread. Asyncio loops are not meant to be shared between threads, so you need to make sure that everything asyncio related happens in the dedicated thread.
Since Foo is responsible for the communication between those two threads, you'll have to be extra careful to make sure every line of code runs in the right thread. For instance, the instantiation of asyncio.Queue has to happen in the asyncio thread.
See this gist for a corrected version of your program.
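A minimal sketch of what such a corrected version could look like (the helper names here are illustrative, not necessarily the gist's actual code): the queue is created inside the loop thread, and the main thread only talks to it through thread-safe calls.

import asyncio
from threading import Thread


def start_loop():
    loop = asyncio.new_event_loop()
    Thread(target=loop.run_forever, daemon=True).start()
    return loop


class Foo:
    def __init__(self, loop):
        self.loop = loop
        # Create the queue inside the loop thread and wait for it, so the
        # queue is only ever touched from the asyncio thread.
        self.queue = asyncio.run_coroutine_threadsafe(self._make_queue(), loop).result()
        self.future = asyncio.run_coroutine_threadsafe(self._consumer(), loop)

    async def _make_queue(self):
        return asyncio.Queue()

    async def _consumer(self):
        while True:
            message = await self.queue.get()
            if message == "END OF QUEUE":
                break
            print(f"Processing {message}...")

    def put(self, message):
        # Called from the main thread; the actual put runs in the loop thread.
        self.loop.call_soon_threadsafe(self.queue.put_nowait, message)

    def join(self):
        # Block the calling thread until the consumer coroutine finishes.
        self.future.result()


if __name__ == "__main__":
    loop = start_loop()
    f = Foo(loop)
    f.put("this is a message")
    f.put("END OF QUEUE")
    f.join()
    loop.call_soon_threadsafe(loop.stop)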
Also, I'd like to point out that this is not the typical use case for asyncio. You generally want to have an asyncio loop running in the main thread, especially if you need subprocess support:
asyncio supports running subprocesses from different threads, but there are limits:
An event loop must run in the main thread
The child watcher must be instantiated in the main thread, before executing subprocesses from other threads. Call the get_child_watcher() function in the main thread to instantiate the child watcher.
I would suggest designing your application the other way, i.e. running asyncio in the main thread and use run_in_executor for the synchronous blocking code.
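For reference, that inverted design might look roughly like this (a sketch with a made-up blocking function):

import asyncio
import time


def blocking_work(x):
    # Hypothetical synchronous, blocking function standing in for real work.
    time.sleep(1)
    return x * 2


async def main():
    loop = asyncio.get_running_loop()
    # The asyncio loop stays in the main thread; the blocking call is
    # offloaded to the default thread pool executor.
    result = await loop.run_in_executor(None, blocking_work, 21)
    print(result)


asyncio.run(main())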
Related
I have a class inside a microservice that looks like this:
import asyncio
import threading


class A:
    def __init__(self):
        self.state = []
        self._flush_thread = self._start_flush()
        self.tasks = set()

    def _start_flush(self):
        threading.Thread(target=self._submit_flush).start()

    def _submit_flush(self):
        self._thread_loop = asyncio.new_event_loop()
        self._thread_loop.run_until_complete(self.flush_state())

    async def regular_func(self):
        # This function is called on an event loop that is managed by asyncio.run()
        # process self.state, fire and forget next func
        task = asyncio.create_task(B.process_inputs(self.state))  # Should call process_inputs in the main thread event loop
        self.tasks.add(task)
        task.add_done_callback(self.tasks.discard)
        pass

    async def flush_state(self):
        # flush out self.state at regular intervals, to next func
        while True:
            # flush state
            asyncio.run_coroutine_threadsafe(B.process_inputs(self.state), self._thread_loop)  # Calls process_inputs in the new thread event loop
            await asyncio.sleep(10)
        pass


class B:
    @staticmethod
    async def process_inputs(self, inputs):
        # process
Across these two threads, I have two separate event loops, to prevent async functions in the main event loop from blocking other asyncio functions from running.
I see that asyncio.run_coroutine_threadsafe is thread-safe when submitting to a given event loop. Is asyncio.run_coroutine_threadsafe(B.process_inputs()) called between different event loops still thread-safe?
Edit:
process_inputs uploads the state to an object store and calls an external API using the state we passed in.
The answer here is that asyncio.run_coroutine_threadsafe does not protect us from any thread-safety issues across different event loops. We need to implement locks to protect any shared state while it is being modified. Credits to @Paul Cornelius for the reply.
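For illustration only (the lock and helper functions here are assumptions, not part of the original code), protecting the shared list with a plain threading.Lock could look like this:

import threading

state_lock = threading.Lock()
state = []


def append_item(item):
    # Every thread (whichever event loop it runs) takes the same lock
    # before touching the shared list.
    with state_lock:
        state.append(item)


def snapshot_and_clear():
    # Copy and clear the state under the lock so the upload can work
    # on a stable view while other threads keep appending.
    with state_lock:
        items = list(state)
        state.clear()
    return items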
"The run_coroutine_threadsafe() function allows a coroutine to be run in an asyncio program from another thread."
Check out: Example of Running a Coroutine From Another Thread
import asyncio
from multiprocessing import Queue, Process
import time

task_queue = Queue()


# This is simulating the task
async def do_task(task_number):
    for progress in range(task_number):
        print(f'{progress}/{task_number} doing')
        await asyncio.sleep(10)


# This is the loop that accepts and runs tasks
async def accept_tasks():
    event_loop = asyncio.get_event_loop()
    while True:
        task_number = task_queue.get()  # <-- this blocks the event loop from running do_task()
        event_loop.create_task(do_task(task_number))


# This is the starting point of the process,
# the event loop runs here
def worker():
    event_loop = asyncio.get_event_loop()
    event_loop.run_until_complete(accept_tasks())


# Run a new process
Process(target=worker).start()

# Simulate adding tasks every 1 second
for _ in range(1, 50):
    task_queue.put(_)
    print('added to queue', _)
    time.sleep(1)
I'm trying to run a separate process that runs an event loop to do I/O operations. Now, from a parent process, I'm trying to "queue-in" tasks. The problem is that do_task() does not run. The only solution that works is polling (i.e. checking if empty, then sleeping X seconds).
After some researching, the problem seems to be that task_queue.get() isn't doing event-loop-friendly IO.
aiopipe provides a solution, but assumes both processes are running in an event loop.
I tried creating this. But the consumer isn't consuming anything...
read_fd, write_fd = os.pipe()
consumer = AioPipeReader(read_fd)
producer = os.fdopen(write_fd, 'w')
A simple workaround for this situation is to change task_number = task_queue.get() to task_number = await event_loop.run_in_executor(None, task_queue.get). That way the blocking Queue.get() function will be off-loaded to a thread pool and the current coroutine suspended, as a good asyncio citizen. Likewise, once the thread pool finishes with the function, the coroutine will resume execution.
This approach is a workaround because it doesn't scale to a large number of concurrent tasks: each blocking call "turned async" that way will take a slot in the thread pool, and those that exceed the pool's maximum number of workers will not even start executing before a thread frees up. For example, rewriting all of asyncio to call blocking functions through run_in_executor would just result in a badly written threaded system. However, if you know that you have a small number of child processes, using run_in_executor is correct and can solve the problem very effectively.
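Applied to the example above, a sketch of that change might look like this (kept minimal, and, like the original, it assumes the child process inherits task_queue, i.e. a fork-based start method):

import asyncio
from multiprocessing import Queue, Process
import time

task_queue = Queue()


async def do_task(task_number):
    print(f'doing task {task_number}')
    await asyncio.sleep(1)


async def accept_tasks():
    loop = asyncio.get_running_loop()
    while True:
        # The blocking Queue.get() is offloaded to the default thread pool,
        # so the event loop stays free to run do_task() concurrently.
        task_number = await loop.run_in_executor(None, task_queue.get)
        loop.create_task(do_task(task_number))


def worker():
    asyncio.run(accept_tasks())


if __name__ == '__main__':
    Process(target=worker).start()
    for i in range(1, 5):
        task_queue.put(i)
        time.sleep(1)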
I finally figured it out. There is a known way to do this with the aiopipe library, but it's made to run on two event loops in two different processes. In my case, only the child process runs an event loop. To solve that, I changed the writing part into an unbuffered normal write using open(fd, buffering=0).
Here is the code without any library.
import asyncio
from asyncio import StreamReader, StreamReaderProtocol
from multiprocessing import Process
import time
import os


# This is simulating the task
async def do_task(task_number):
    for progress in range(task_number):
        print(f'{progress}/{task_number} doing')
        await asyncio.sleep(1)


# This is the loop that accepts and runs tasks
async def accept_tasks(read_fd):
    loop = asyncio.get_running_loop()
    # Setup asynchronous reading
    reader = StreamReader()
    protocol = StreamReaderProtocol(reader)
    transport, _ = await loop.connect_read_pipe(
        lambda: protocol, os.fdopen(read_fd, 'rb', 0))
    while True:
        task_number = int(await reader.readline())
        await asyncio.sleep(1)
        loop.create_task(do_task(task_number))
    transport.close()


# This is the starting point of the process,
# the event loop runs here
def worker(read_fd):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(accept_tasks(read_fd))


# Create read and write pipe
read_fd, write_fd = os.pipe()
# allow inheritance to child
os.set_inheritable(read_fd, True)
Process(target=worker, args=(read_fd,)).start()

# detach from parent
os.close(read_fd)
writer = os.fdopen(write_fd, 'wb', 0)

# Simulate adding tasks every 1 second
for _ in range(1, 50):
    writer.write((f'{_}\n').encode())
    print('added to queue', _)
    time.sleep(1)
Basically, we use asynchronous reading on the child process' end, and do non-buffered synchronous write on the parent process' end. To do the former, you need to connect the event loop as shown in accept_tasks coroutine.
I've made a program which has a main thread that spawns many other threads by subclassing the threading.Thread class.
Each such child thread runs an infinite while loop, and inside the loop I check a condition. If the condition is true, I make the thread sleep for 1 second using time.sleep(1) and if it's false, then the thread performs some computation.
The program itself works fine and I've achieved what I wanted to do, my only remaining problem is that I seem unable to stop the threads after my work is done. I want the user to be able to kill all the threads by pressing a button or giving a keyboard interrupt like Ctrl+C.
For this I had tried using the signal module and inserted a condition in the threads' loops that breaks the loop when the main thread catches a signal, but it didn't work for some reason. Can anyone please help with this?
EDIT: Here are some of the relevant code snippets:
def sighandler(signal, frame):
    BaseThreadClass.stop_flag = True


class BaseThreadClass(threading.Thread):
    stop_flag = False

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self, *args):
        while True:
            if condition:
                time.sleep(1)
            else:
                # do computation and stuff
                if BaseThreadClass.stop_flag:
                    # do cleanup
                    break
Your basic method does work, but you've still not posted enough code to show the flaw. I added a few lines of code to make it runnable and produced a result like:
$ python3 test.py
thread alive
main alive
thread alive
main alive
^CSignal caught
main alive
thread alive
main alive
main alive
main alive
^CSignal caught
^CSignal caught
main alive
^Z
[2]+ Stopped python3 test.py
$ kill %2
The problem demonstrated above involves the signal handler telling all the threads to exit, except the main thread, which still runs and still catches interrupts. The full source of this variant of the sample snippet is:
import threading, signal, time


def sighandler(signal, frame):
    BaseThreadClass.stop_flag = True
    print("Signal caught")


class BaseThreadClass(threading.Thread):
    stop_flag = False

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self, *args):
        while True:
            if True:
                time.sleep(1)
                print("thread alive")
            else:
                # do computation and stuff
                pass
            if BaseThreadClass.stop_flag:
                # do cleanup
                break


signal.signal(signal.SIGINT, sighandler)
t = BaseThreadClass()
t.start()

while True:
    time.sleep(1)
    print("main alive")
The problem here is that the main thread never checks for the quit condition. But as you never posted what the main thread does, nor how the signal handler is activated, or information regarding whether threads may go a long time without checking the quit condition... I still don't know what went wrong in your program. The signal example shown in the library documentation raises an exception in order to divert the main thread.
Signals are a rather low level concept for this task, however. I took the liberty of writing a somewhat more naïve version of the main thread:
try:
    t = BaseThreadClass()
    t.start()

    while True:
        time.sleep(1)
        print("main alive")
except KeyboardInterrupt:
    BaseThreadClass.stop_flag = True
    t.join()
This version catches the exception thrown by the default interrupt handler, signals the thread to stop, and waits for it to do so. It might even be appropriate to change the except clause to a finally, since we could want to clean the threads up on other errors too.
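For instance, a small variation on the snippet above (it reuses the BaseThreadClass defined earlier):

t = BaseThreadClass()
t.start()
try:
    while True:
        time.sleep(1)
        print("main alive")
finally:
    # Runs on Ctrl+C as well as on any other error, so the thread is
    # always signalled to stop and then waited for.
    BaseThreadClass.stop_flag = True
    t.join()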
If you want to do this kind of "cooperative" polled-shutdown, you can use a threading.Event to signal:
import threading
import time


def proc1():
    while True:
        print("1")  # payload
        time.sleep(1)
        # have we been signalled to stop?
        if not ev1.wait(0):
            break
    # do any shutdown etc. here
    print("T1 exiting")


ev1 = threading.Event()
ev1.set()

thread1 = threading.Thread(target=proc1)
thread1.start()

time.sleep(3)
# signal thread1 to stop
ev1.clear()
But be aware that if the "payload" does something blocking like network or file IO, that op will not be interrupted. You can do those blocking ops with a timeout, but that obviously will complicate your code.
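For example, a socket read can be done with a timeout so that a stop flag is re-checked between attempts (a sketch with an illustrative socket, using a "stop event" convention rather than the "run event" above):

import socket
import threading

stop_event = threading.Event()


def reader(sock):
    sock.settimeout(1.0)  # wake up at least once per second
    while not stop_event.is_set():
        try:
            data = sock.recv(4096)
        except socket.timeout:
            continue  # nothing received; loop around and re-check stop_event
        if not data:
            break  # peer closed the connection
        print("got", len(data), "bytes")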
I am using the Python 3 asyncio module to create a load balancing application. I have two heavy IO tasks:
An SNMP polling module, which determines the best possible server
A "proxy-like" module, which balances the requests to the selected server.
Both processes are going to run forever, are independent from each other, and should not be blocked by the other one.
I can't use one event loop because they would block each other; is there any way to have two event loops, or do I have to use multithreading/multiprocessing?
I tried using asyncio.new_event_loop() but haven't managed to make it work.
The whole point of asyncio is that you can run many thousands of I/O-heavy tasks concurrently, so you don't need threads at all; this is exactly what asyncio is made for. Just run the two coroutines (SNMP and proxy) in the same loop and that's it.
You have to make both of them available to the event loop BEFORE calling loop.run_forever(). Something like this:
import asyncio


async def snmp():
    print("Doing the snmp thing")
    await asyncio.sleep(1)


async def proxy():
    print("Doing the proxy thing")
    await asyncio.sleep(2)


async def main():
    while True:
        await snmp()
        await proxy()


loop = asyncio.get_event_loop()
loop.create_task(main())
loop.run_forever()
I don't know the structure of your code, so the different modules might have their own infinite loop or something, in this case you can run something like this:
import asyncio


async def snmp():
    while True:
        print("Doing the snmp thing")
        await asyncio.sleep(1)


async def proxy():
    while True:
        print("Doing the proxy thing")
        await asyncio.sleep(2)


loop = asyncio.get_event_loop()
loop.create_task(snmp())
loop.create_task(proxy())
loop.run_forever()
Remember, both snmp and proxy need to be coroutines (async def) written in an asyncio-aware manner. asyncio will not make simple blocking Python functions suddenly "async".
In your specific case, I suspect that you are a little bit confused (no offense!), because well-written async modules will never block each other in the same loop. If that is the case, you don't need asyncio at all and can simply run one of them in a separate Thread without dealing with any asyncio stuff.
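A rough sketch of that last suggestion, with placeholder function names (the blocking poller here is made up for illustration):

import asyncio
import threading
import time


def blocking_snmp_poller():
    # Stand-in for a module that only exposes a blocking API.
    while True:
        print("polling (blocking)")
        time.sleep(1)


async def proxy():
    while True:
        print("Doing the proxy thing")
        await asyncio.sleep(2)


# The blocking part runs in a plain thread; asyncio stays in the main thread.
threading.Thread(target=blocking_snmp_poller, daemon=True).start()
asyncio.run(proxy())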
Answering my own question to post my solution:
What I ended up doing was creating a thread and a new event loop inside the thread for the polling module, so now every module runs in a different loop. It is not a perfect solution, but it is the only one that made sense to me (I wanted to avoid threads, but since it is only one...). Example:
import asyncio
import threading


def worker():
    second_loop = asyncio.new_event_loop()
    execute_polling_coroutines_forever(second_loop)
    return


threads = []
t = threading.Thread(target=worker)
threads.append(t)
t.start()

loop = asyncio.get_event_loop()
execute_proxy_coroutines_forever(loop)
Asyncio requires that every loop runs its coroutines in the same thread. Using this method you have one event loop for each thread, and they are totally independent: every loop will execute its coroutines in its own thread, so that is not a problem.
As I said, it's probably not the best solution, but it worked for me.
Though in most cases you don't need multiple event loops running when using asyncio, people shouldn't assume their assumptions apply to every case, or simply offer what they think is better without directly addressing your original question.
Here's a demo of what you can do to create new event loops in threads. Compared with your own answer, calling set_event_loop means you don't have to pass the loop object around every time you do an asyncio-based operation.
import asyncio
import threading


async def print_env_info_async():
    # As you can see each work thread has its own asyncio event loop.
    print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}")


async def work():
    while True:
        await print_env_info_async()
        await asyncio.sleep(1)


def worker():
    new_loop = asyncio.new_event_loop()
    asyncio.set_event_loop(new_loop)
    new_loop.run_until_complete(work())
    return


number_of_threads = 2
for _ in range(number_of_threads):
    threading.Thread(target=worker).start()
Ideally, you'll want to put heavy work in worker threads and keep the asyncio thread as light as possible. Think of the asyncio thread as the GUI thread of a desktop or mobile app: you don't want to block it. Worker threads are usually very busy, which is one of the reasons you don't want to create separate asyncio event loops in worker threads. Here's an example of how to manage heavy worker threads with a single asyncio event loop, which is the most common practice for this kind of use case:
import asyncio
import concurrent.futures
import threading
import time


def print_env_info(source_thread_id):
    # This will be called in the main thread where the default asyncio event loop lives.
    print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}, source thread: {source_thread_id}")


def work(event_loop):
    while True:
        # The following line will fail because there's no asyncio event loop running in this worker thread.
        # print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}")
        event_loop.call_soon_threadsafe(print_env_info, threading.get_ident())
        time.sleep(1)


async def worker():
    print(f"Thread: {threading.get_ident()}, event loop: {id(asyncio.get_running_loop())}")
    loop = asyncio.get_running_loop()
    number_of_threads = 2
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=number_of_threads)
    for _ in range(number_of_threads):
        asyncio.ensure_future(loop.run_in_executor(executor, work, loop))


loop = asyncio.get_event_loop()
loop.create_task(worker())
loop.run_forever()
I know it's an old thread, but it might still be helpful for someone.
I'm not good at asyncio, but here is a slightly improved version of @kissgyorgy's answer. Instead of awaiting each coroutine separately, we create a list of tasks and fire them later (Python 3.9):
import asyncio


async def snmp():
    while True:
        print("Doing the snmp thing")
        await asyncio.sleep(0.4)


async def proxy():
    while True:
        print("Doing the proxy thing")
        await asyncio.sleep(2)


async def main():
    tasks = []
    tasks.append(asyncio.create_task(snmp()))
    tasks.append(asyncio.create_task(proxy()))
    await asyncio.gather(*tasks)


asyncio.run(main())
Result:
Doing the snmp thing
Doing the proxy thing
Doing the snmp thing
Doing the snmp thing
Doing the snmp thing
Doing the snmp thing
Doing the proxy thing
An asyncio event loop runs in a single thread and will not run anything in parallel; that is how it is designed. The closest thing I can think of is using asyncio.wait.
from asyncio import coroutine
import asyncio


@coroutine
def some_work(x, y):
    print("Going to do some heavy work")
    yield from asyncio.sleep(1.0)
    print(x + y)


@coroutine
def some_other_work(x, y):
    print("Going to do some other heavy work")
    yield from asyncio.sleep(3.0)
    print(x * y)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait([asyncio.ensure_future(some_work(3, 4)),
                                          asyncio.ensure_future(some_other_work(3, 4))]))
    loop.close()
An alternate way is to use asyncio.gather() - it returns a future aggregating the results from the given list of futures.
tasks = [asyncio.Task(some_work(3, 4)), asyncio.Task(some_other_work(3, 4))]
loop.run_until_complete(asyncio.gather(*tasks))
If the proxy server is running all the time it cannot switch back and forth. The proxy listens for client requests and makes them asynchronous, but the other task cannot execute, because this one is serving forever.
If the proxy is a coroutine and is starving the SNMP poller (never awaits), aren't the client requests being starved as well?
every coroutine will run forever, they will not end
This should be fine, as long as they do await/yield from. The echo server will also run forever; it doesn't mean you can't run several servers (on different ports, though) in the same loop.
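For example, two echo servers on different ports can share one loop (a minimal sketch):

import asyncio


async def handle(reader, writer):
    # Echo whatever the client sends until it disconnects.
    while data := await reader.read(1024):
        writer.write(data)
        await writer.drain()
    writer.close()


async def main():
    # Two servers on different ports, both served by the same event loop.
    server1 = await asyncio.start_server(handle, "127.0.0.1", 8001)
    server2 = await asyncio.start_server(handle, "127.0.0.1", 8002)
    async with server1, server2:
        await asyncio.gather(server1.serve_forever(), server2.serve_forever())


asyncio.run(main())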
Python's Queue has a join() method that will block until task_done() has been called on all the items that have been taken from the queue.
Is there a way to periodically check for this condition, or receive an event when it happens, so that you can continue to do other things in the meantime? You can, of course, check if the queue is empty, but that doesn't tell you if the count of unfinished tasks is actually zero.
The Python Queue itself does not support this, so you could try the following:
from threading import Thread


class QueueChecker(Thread):
    def __init__(self, q):
        Thread.__init__(self)
        self.q = q

    def run(self):
        self.q.join()


q_manager_thread = QueueChecker(my_q)
q_manager_thread.start()

while q_manager_thread.is_alive():
    pass  # do other things

# when the loop exits the tasks are done
# because the thread will have returned
# from blocking on the q.join and exited
# its run method
q_manager_thread.join()  # to cleanup the thread
A while loop on the thread.is_alive() bit might not be exactly what you want, but at least you can see how to asynchronously check on the status of the q.join now.
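If you would rather wait on a flag (possibly with a timeout) than poll is_alive(), a small variation is to have the watcher thread set a threading.Event once join() returns (a sketch):

import queue
import threading

my_q = queue.Queue()
all_done = threading.Event()


def watcher(q):
    # Blocks in this helper thread until task_done() has been called for
    # every item that was put on the queue, then flips the event.
    q.join()
    all_done.set()


threading.Thread(target=watcher, args=(my_q,), daemon=True).start()

# Meanwhile, elsewhere:
while not all_done.wait(timeout=0.1):
    pass  # do other things between checks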