In this minimal example, I expect the program to print foo and fuu as the tasks are scheduled.
import asyncio

async def coroutine():
    while True:
        print("foo")

async def main():
    asyncio.create_task(coroutine())
    # asyncio.run_coroutine_threadsafe(coroutine(), asyncio.get_running_loop())
    while True:
        print("fuu")

asyncio.run(main())
The program will only write fuu.
My aim is to simply have two tasks executing concurrently, but it seems like the created task is never scheduled.
I also tried using run_coroutine_threadsafe with no success.
If I add await asyncio.sleep(1) to main, the created task takes over execution and the program will only write foo.
What am I supposed to do to run two tasks concurrently using asyncio?
I love this question, and explaining it shows how asyncio and Python work.
Spoilers: it works much like JavaScript's single-threaded runtime.
So, let's look at your code.
You only have one main thread running, and it runs continuously, since nothing ever forces a switch.
Now, the thing is, your main thread creates a task, which is a coroutine (a green thread, managed by the Python scheduler rather than the OS scheduler) that needs the main thread to be free in order to get executed.
But the main thread is never free: since you put while True in main(), it never becomes free to execute anything else, and your task never runs, because asyncio only switches tasks at await points and main() never awaits anything; it is busy executing the while True code.
The moment you add await asyncio.sleep(...), the event loop detects that the current task is suspended, performs the switch, and your coroutine kicks in.
My suggestion: if your tasks are I/O-heavy, use tasks/coroutines; if they are CPU-heavy, which is your case (while True), create Python threads or processes instead, and the OS scheduler will take care of running your tasks; each will get a CPU slice to run its while True.
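To make the switching mechanics concrete, here is a minimal sketch of the cooperative fix (not your thread-based suggestion, but the asyncio side of it): giving each loop an await point, even a zero-second sleep, lets the event loop alternate between the two tasks.

import asyncio

async def coroutine():
    while True:
        print("foo")
        await asyncio.sleep(0)  # yield control back to the event loop

async def main():
    task = asyncio.create_task(coroutine())  # keep a reference to the task
    while True:
        print("fuu")
        await asyncio.sleep(0)  # without this, main() never yields

asyncio.run(main())

With the sleeps in place, the output alternates between fuu and foo instead of printing only one of them.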
asyncio is causing issues in my Spyder IDE, so I would like to replace it with the concurrent.futures library.
How can I replace the code below relying only on the concurrent.futures library?
asyncio.get_event_loop().run_until_complete(api(message))
The exact function looks as follows:
def async_loop(api, message):
    return asyncio.get_event_loop().run_until_complete(api(message))
As written, you're starting up the event loop only until a particular task completes (which may or may not launch or wait on other tasks), and blocking until it completes. The only reason it's a task is that it needs to use async functions; those can only run in an event loop, and while running, they may launch other tasks or wait on other awaitables, and while waiting, the event loop can run other tasks.
In short, if not for the need to be an async task running in a non-async context, this would just be:
def async_loop(api, message):
    return api(message)
which calls api and waits for it to complete.
Really, that's it. If the things api does or calls need to run some tasks asynchronously, without blocking on them immediately, you'd have some global executor, e.g.
executor = concurrent.futures.ThreadPoolExecutor()
which would be used to launch tasks with:
fut = executor.submit(callable, 'arg1', 'arg2', kwarg1='somevalue')
and, when the result of the task is needed, someone would call:
value = fut.result()
on it (which would block if it wasn't done yet, return the result if it completed without an exception, or raise the exception it died with if it died with an exception).
Whenever you no longer need the executor, you just call .shutdown() on it and it will wait for all outstanding tasks to complete. That's it.
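Putting those pieces together, a minimal sketch; work here is a hypothetical stand-in for whatever blocking call you need, and the with block calls .shutdown() for you on exit:

import concurrent.futures

def work(x):
    # hypothetical stand-in for a blocking computation or I/O call
    return x * 2

# the with block calls executor.shutdown() on exit,
# waiting for all outstanding tasks to complete
with concurrent.futures.ThreadPoolExecutor() as executor:
    fut = executor.submit(work, 21)
    print(fut.result())  # blocks until done; prints 42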
As a side-note, the error you're experiencing is part of why they've deprecated get_event_loop() in 3.10 (and discouraged it since 3.7). In all likelihood, the simplest solution to your problem (avoiding a switch to threads, because all that means is you've got new problems) is to use the much simpler high-level API, asyncio.run (introduced in 3.7), which creates an event loop, runs the task in it to completion, does reasonable cleanup, then returns the result:
def async_loop(api, message):
    return asyncio.run(api(message))
There's also the asyncio.get_running_loop function (the exact replacement for get_event_loop), which you use when an event loop already exists (which you should typically be aware of; event loops don't pop into existence in a given thread on their own, so you should know if you launched one; in this case you hadn't, so asyncio.run is the correct one to use).
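A minimal illustration of the difference, assuming nothing beyond the standard library: get_running_loop() only works while a loop is actually running, i.e. inside a coroutine or callback.

import asyncio

async def show_loop():
    # valid here, because asyncio.run() started a loop for us
    loop = asyncio.get_running_loop()
    print(loop)

asyncio.run(show_loop())

# calling asyncio.get_running_loop() out here, outside any
# running loop, would raise RuntimeError instead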
The documentation of asyncio.create_task() states the following warning:
Important: Save a reference to the result of this function, to avoid a task disappearing mid execution. (source)
My question is: Is this really true?
I have several IO-bound "fire and forget" tasks which I want to run concurrently using asyncio by submitting them to the event loop using asyncio.create_task(). However, I do not really care about the return value of the coroutines, or even whether they run successfully, only that they do run eventually. One use case is writing data from an "expensive" calculation back to a Redis database. If Redis is available, great. If not, oh well, no harm. This is why I do not want/need to await those tasks.
Here a generic example:
import asyncio

async def fire_and_forget_coro():
    """Some random coroutine waiting for IO to complete."""
    print('in fire_and_forget_coro()')
    await asyncio.sleep(1.0)
    print('fire_and_forget_coro() done')

async def async_main():
    """Main entry point of asyncio application."""
    print('in async_main()')
    n = 3
    for _ in range(n):
        # create_task() does not block, returns immediately.
        # Note: We do NOT save a reference to the submitted task here!
        asyncio.create_task(fire_and_forget_coro(), name='fire_and_forget_coro')
    print('awaiting sleep in async_main()')
    await asyncio.sleep(2.0)  # <-- note this line
    print('sleeping done in async_main()')
    print('async_main() done.')
    # All references to tasks we *might* have go out of scope when returning from this coroutine!
    return

if __name__ == '__main__':
    asyncio.run(async_main())
Output:
in async_main()
awaiting sleep in async_main()
in fire_and_forget_coro()
in fire_and_forget_coro()
in fire_and_forget_coro()
fire_and_forget_coro() done
fire_and_forget_coro() done
fire_and_forget_coro() done
sleeping done in async_main()
async_main() done.
When commenting out the await asyncio.sleep() line, we never see fire_and_forget_coro() finish. This is to be expected: when the event loop started with asyncio.run() closes, tasks will not be executed anymore. But it appears that as long as the event loop is still running, all tasks will be taken care of, even when I never explicitly created references to them. This seems logical to me, as the event loop itself must have a reference to all scheduled tasks in order to run them. And we can even get them all using asyncio.all_tasks()!
So, I think I can trust Python to have at least one strong reference to every scheduled task as long as the event loop it was submitted to is still running, and thus I do not have to manage references myself. But I would like a second opinion here. Am I right, or are there pitfalls I have not yet recognized?
If I am right, why the explicit warning in the documentation? It is a usual Python thing that stuff is garbage-collected if you do not keep a reference to it. Are there situations where one does not have a running event loop but still has some task objects to reference? Maybe when creating an event loop manually (I never did this)?
There is an open issue about this topic at the CPython bug tracker on GitHub that I just found:
https://github.com/python/cpython/issues/88831
Quote:
asyncio will only keep weak references to alive tasks (in _all_tasks). If a user does not keep a reference to a task and the task is not currently executing or sleeping, the user may get "Task was destroyed but it is pending!".
So the answer to my question is, unfortunately, yes. One has to keep a reference to the scheduled task.
However, the GitHub issue also describes a relatively simple workaround: keep all running tasks in a set() and add a done callback to each task that removes it from the set again.
running_tasks = set()
# [...]
task = asyncio.create_task(some_background_function())
running_tasks.add(task)
task.add_done_callback(lambda t: running_tasks.remove(t))
In Python 3.11, there is a new API, asyncio.TaskGroup.create_task.
It does what the other answer describes, so you don't need to do it yourself.
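A minimal sketch of the 3.11 pattern (some_background_function is a placeholder for your coroutine): the group keeps references to its tasks, and exiting the async with block implicitly awaits all of them.

import asyncio

async def some_background_function():
    await asyncio.sleep(1.0)

async def main():
    async with asyncio.TaskGroup() as tg:
        # the group holds a strong reference to the task for us
        tg.create_task(some_background_function())
    # exiting the block awaits all tasks created in the group

asyncio.run(main())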
Hi, I am new to asyncio and the concept of event loops (non-blocking IO).
import asyncio

async def subWorker():
    ...

async def firstWorker():
    await subWorker()

async def secondWorker():
    await asyncio.sleep(1)

loop = asyncio.get_event_loop()
asyncio.ensure_future(firstWorker())
asyncio.ensure_future(secondWorker())
loop.run_forever()
Here, when the code starts, firstWorker() is executed and paused when it encounters await subWorker(). While firstWorker() is waiting, secondWorker() gets started.
The question is: when firstWorker() encounters await subWorker() and gets paused, the computer will then execute subWorker() and secondWorker() at the same time. Since the program now has only one thread, I guess that single thread does secondWorker()'s work. Then who executes subWorker()? If a single thread can only do one thing at a time, who else does the other jobs?
The assumption that subWorker and secondWorker execute at the same time is false.
The fact that secondWorker simply sleeps means that the available time will be spent in subWorker.
asyncio by definition is single-threaded; see the documentation:
This module provides infrastructure for writing single-threaded concurrent code
The event loop executes a task at a time, switching for example when one task is blocked while waiting for I/O, or, as here, voluntarily sleeping.
This is a little old now, but I've found the visualization from the gevent docs (about 1 screen down, beneath "Synchronous & Asynchronous Execution") to be helpful while teaching asynchronous flow control to coworkers: http://sdiehl.github.io/gevent-tutorial/
The most important point here is that only one coroutine is running at any one time, even though many may be in process.
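A short sketch, assuming nothing beyond the standard library, that makes this visible: control passes between coroutines only at await points, so the prints below interleave at exactly those points.

import asyncio

async def task_a():
    print("a: start")
    await asyncio.sleep(0)  # yields to the event loop
    print("a: resumed")

async def task_b():
    print("b: start")
    await asyncio.sleep(0)
    print("b: resumed")

async def main():
    # prints: a: start, b: start, a: resumed, b: resumed
    await asyncio.gather(task_a(), task_b())

asyncio.run(main())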
I wrote code that seems to do what I want, but I'm not sure if it's a good idea since it mixes threads and event loops to run an infinite loop off the main thread. This is a minimal code snippet that captures the idea of what I'm doing:
import asyncio
import threading

msg = ""

async def infinite_loop():
    global msg
    while True:
        msg += "x"
        await asyncio.sleep(0.3)

def worker():
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    asyncio.get_event_loop().run_until_complete(infinite_loop())

t = threading.Thread(target=worker, daemon=True)
t.start()
The main idea is that I have an infinite loop manipulating a global variable each 0.3 s. I want this infinite loop to run off the main thread so I can still access the shared variable in the main thread. This is especially useful in jupyter, because if I call run_until_complete in the main thread I can't interact with jupyter anymore. I want the main thread available to interactively access and modify msg. Using async might seem unnecessary in my example, but I'm using a library that has async code to run a server, so it's necessary. I'm new to async and threading in python, but I remember reading / hearing somewhere that using threading with asyncio is asking for trouble... is this a bad idea? Are there any potential concurrency issues with my approach?
I'm new to async and threading in python, but I remember reading / hearing somewhere that using threading with asyncio is asking for trouble...
Mixing asyncio and threading is discouraged for beginners because it leads to unnecessary complications and often stems from a lack of understanding of how to use asyncio correctly. Programmers new to asyncio often reach for threads by habit, using them for tasks for which coroutines would be more suitable.
But if you have a good reason to spawn a thread that runs the asyncio event loop, by all means do so - there is nothing that requires the asyncio event loop to be run in the main thread. Just be careful to interact with the event loop itself (call methods such as call_soon, create_task, stop, etc.) only from the thread that runs the event loop, i.e. from asyncio coroutines and callbacks. To safely interact with the event loop from the other threads, such as in your case the main thread, use loop.call_soon_threadsafe() or asyncio.run_coroutine_threadsafe().
Note that setting global variables and such doesn't count as "interacting" because asyncio doesn't observe those. Of course, it is up to you to take care of inter-thread synchronization issues, such as protecting access to complex mutable structures with locks.
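For example, a sketch of submitting a coroutine to a loop running in another thread with asyncio.run_coroutine_threadsafe(); say is an illustrative placeholder:

import asyncio
import threading

async def say(text):
    # executes on the event-loop thread
    print("loop thread got:", text)

loop = asyncio.new_event_loop()
t = threading.Thread(target=loop.run_forever, daemon=True)
t.start()

# safe to call from the main thread; returns a concurrent.futures.Future
fut = asyncio.run_coroutine_threadsafe(say("hello"), loop)
fut.result()  # block until the coroutine has run
loop.call_soon_threadsafe(loop.stop)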
is this a bad idea?
If unsure whether to mix threads and asyncio, you can ask yourself two questions:
Do I even need threads, given that asyncio provides coroutines that run in parallel and run_in_executor to await blocking code?
If I have threads providing parallelism, do I actually need asyncio?
Your question provides good answers to both - you need threads so that the main thread can interact with jupyter, and you need asyncio because you depend on a library that uses it.
Are there any potential concurrency issues with my approach?
The GIL ensures that setting a global variable in one thread and reading it in another is free of data races, so what you've shown should be fine.
If you add explicit synchronization, such as a multi-threaded queue or condition variable, you should keep in mind that the synchronization code must not block the event loop. In other words, you cannot just wait on, say, a threading.Event in an asyncio coroutine because that would block all coroutines. Instead, you can await an asyncio.Event, and set it using something like loop.call_soon_threadsafe(event.set) from the other thread.
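A sketch of that last pattern, with a worker thread signaling a coroutine through an asyncio.Event (the names and the timer delay are illustrative):

import asyncio
import threading

async def main():
    loop = asyncio.get_running_loop()
    event = asyncio.Event()

    def from_other_thread():
        # schedule event.set() on the event-loop thread
        loop.call_soon_threadsafe(event.set)

    threading.Timer(0.5, from_other_thread).start()
    await event.wait()  # suspends only this coroutine, not the whole loop
    print("event was set from the other thread")

asyncio.run(main())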
Here's some example code:
from threading import Thread

while True:  # main loop
    if command_received:
        thread = Thread(target=doItNOW)
        thread.start()
    ......

def doItNOW():
    some_blocking_operations()
My problem is that I need some_blocking_operations to start IMMEDIATELY (as soon as command_received is True),
but since they're blocking I can't execute them in my main loop,
and I can't change some_blocking_operations to be non-blocking either.
By "IMMEDIATELY" I mean as soon as possible, not more than a 10 ms delay
(I once got a whole second of delay).
If that's not possible, a constant delay would also be acceptable (but it MUST be constant, with very few milliseconds of error).
I'm currently working on a Linux system (Ubuntu, but it may be another one in the future; always Linux).
A Python solution would be amazing, but a different one would be better than nothing.
Any ideas?
Thanks in advance.
from threading import Thread

class worker(Thread):
    def __init__(self, someParameter=True):
        Thread.__init__(self)
        # This is how you "send" parameters/variables
        # into the thread on start-up that the thread can use.
        # This is just an example in case you need it.
        self.someVariable = someParameter
        self.start()  # Note: This makes the thread self-starting;
                      # you could also call .start() in your main loop.

    def run(self):
        # And this is how you use the variable/parameter
        # that you passed in when you created the thread.
        if self.someVariable is True:
            some_blocking_operations()

while True:  # main loop
    if command_received:
        worker()
This is a non-blocking execution of some_blocking_operations() in a threaded manner. I'm not sure if what you're looking for is to actually wait for the thread to finish or not, or if you even care?
If all you want to do is wait for a "command" to be received and then to execute the blocking operations without waiting for it, verifying that it completes, then this should work for you.
Python mechanics of threading
Python will only run on one CPU core; what you're doing here is simply running multiple executions on overlapping clock intervals in the CPU. Meaning that on one cycle the main thread gets to execute, and on the next your blocking call gets a chance to run an execution. They won't actually run in parallel.
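One quick way to see this on CPython (exact timings vary by machine, but the threaded run is not faster):

import threading
import time

def spin(n):
    # CPU-bound busy loop; holds the GIL while it runs
    while n:
        n -= 1

N = 10_000_000

start = time.perf_counter()
spin(N)
spin(N)
print("sequential: ", time.perf_counter() - start)

start = time.perf_counter()
threads = [threading.Thread(target=spin, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("two threads:", time.perf_counter() - start)
# On CPython the GIL lets only one thread execute bytecode at a time,
# so the threaded version gives no speedup for CPU-bound work.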
There are some "You can, but..." threads, like this one:
is python capable of running on multiple cores?