Is it possible to iterate a list calling async function - python

I'm new to this, so I apologize for mistakes.
I'm trying to figure out a way to iterate over a range in a for loop, calling an async function each time but without waiting for it to finish.
Here's my code:
import asyncio
from random import randint
import time
import threading

async def print_i(i):
    number = 0
    if (number % 2) == 0:  # check for even number
        time.sleep(5)
    while number != 5:
        number = randint(0, 100)
    print("id-", i)

for i in range(0, 100):
    asyncio.run(print_i(i))
    # thread = threading.Thread(target=print_i(i))
    # thread.start()
Both the asyncio.run and the thread.start() execute the called function sequentially, whereas I was hoping the for loop would fire off all iterations in one go, with only the even values of "i" getting the time.sleep(5).
Is this possible?

Here are some basic examples I made showing how to achieve concurrency in asyncio, threading, and trio. Consider the range() calls as stand-ins for your list in these cases.
If you wonder why trio is included: there's a better alternative to asyncio's design, called Structured Concurrency, and it uses a different method for spawning concurrent tasks. You might stumble on it one day.
For asyncio:
import asyncio

async def task(num: int):
    print(f"task {num} started.")
    # an async function needs something 'awaitable' to run asynchronously
    await asyncio.sleep(3)
    print(f"task {num} finished.")

async def spawn_task():
    task_list = []
    for n in range(5):
        task_list.append(asyncio.create_task(task(n)))
    await asyncio.gather(*task_list)

asyncio.run(spawn_task())
For threading:
import threading
import time

def thread_workload(num: int):
    print(f"task {num} started.")
    # most of Python's IO functions (including time.sleep) release the GIL,
    # allowing other threads to run.
    # The GIL prevents more than one thread from running Python code at a time.
    time.sleep(3)
    print(f"task {num} finished.")

def spawn_thread():
    for n in range(5):
        t = threading.Thread(target=thread_workload, args=(n,))
        t.start()

spawn_thread()
For Trio:
import trio

async def task(num: int):
    print(f"task {num} started.")
    # an async function needs something 'awaitable' to run asynchronously
    await trio.sleep(3)
    print(f"task {num} finished.")

async def spawn_task():
    async with trio.open_nursery() as nursery:
        # explicit task-spawning area. A nursery for tasks!
        for n in range(5):
            nursery.start_soon(task, n)

trio.run(spawn_task)
Output:
task 0 started.
task 1 started.
task 2 started.
task 3 started.
task 4 started.
task 0 finished.
task 1 finished.
task 2 finished.
task 3 finished.
task 4 finished.

Related

How to run a periodic task and a while loop asynchronously

I have two functions:
i = 0

def update():
    global i
    i += 1

def output():
    print(i)
I want to run output() every 3 seconds and loop update() without any interval, both of course asynchronously.
I tried using asyncio, threading, multithreading and timeloop, but I couldn't get it to work in any of these libraries. If you figure out how to do it in any of these libraries, or some other library, please help. I'm OK with working with any library.
Using AsyncIO this would resemble:
import asyncio

def update(value):
    value["int"] += 1

def output(value):
    print(value["int"])

async def update_worker(value):
    while True:
        update(value)
        await asyncio.sleep(0)

async def output_worker(value):
    while True:
        output(value)
        await asyncio.sleep(3)

async def main():
    value = {"int": 0}
    await asyncio.gather(
        update_worker(value),
        output_worker(value))

if __name__ == "__main__":
    asyncio.run(main())
Notice that I changed the global value to a shared value, since it is best practice to do so. In other programming languages it would be unsafe to share a value for both reading and writing across multiple concurrent contexts, but since most Python objects are thread-safe it is OK in this case. Otherwise, you should use a mutex or any other concurrency primitive to synchronise reads and writes.
AsyncIO concurrency is based on a cooperative multitasking model so asynchronous tasks must explicitly yield control to other concurrent tasks when they are waiting for something (noted by all await keywords). Thus, to ensure that output_worker has a chance to run one must add an await asyncio.sleep(0) in the infinite loop of the update_worker so that the AsyncIO event loop can run output_worker.
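If you did want explicit synchronisation in the AsyncIO version as well, a minimal sketch using asyncio.Lock could look like the following. This is my addition, not required for this example, since the event loop only switches tasks at await points anyway:
import asyncio

async def update_worker(value, lock):
    while True:
        async with lock:  # guard the shared value, mirroring the threaded version below
            value["int"] += 1
        await asyncio.sleep(0)  # yield control to the event loop

async def output_worker(value, lock):
    while True:
        async with lock:
            print(value["int"])
        await asyncio.sleep(3)

async def main():
    value = {"int": 0}
    lock = asyncio.Lock()  # asyncio's cooperative mutex
    await asyncio.gather(update_worker(value, lock), output_worker(value, lock))

asyncio.run(main())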
Here is the same code using multithreading instead of AsyncIO:
from time import sleep
from threading import Thread, Lock

def update(value, lock):
    with lock:
        value["int"] += 1

def output(value, lock):
    with lock:
        print(value["int"])

def update_worker(value, lock):
    while True:
        update(value, lock)

def output_worker(value, lock):
    while True:
        output(value, lock)
        sleep(3)

def main():
    value = {"int": 0}
    lock = Lock()
    t1 = Thread(target=update_worker, args=(value, lock), daemon=True)
    t2 = Thread(target=output_worker, args=(value, lock), daemon=True)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

if __name__ == "__main__":
    main()
Even though it is not necessary in this particular Python program, I used a Lock to synchronize reads and writes as it is the correct way to handle concurrency.

asyncio - Code is executing synchronously

I'm a Python beginner. Following https://www.youtube.com/watch?v=iG6fr81xHKA&t=269s about the power of asyncio, I tried to take the example shown and repurpose it to execute 10 times. Here's a code snippet:
import time

def main(x):
    print("Hello")
    time.sleep(3)
    print("World!")
And so I tried to do it in an asyncio fashion, however it doesn't execute asynchronously.
Here's what I've tried so far. What am I doing wrong?
import time
import asyncio

async def main(x):
    print(f"Starting Task {x}")
    await asyncio.sleep(3)
    print(f"Finished Task {x}")

async def async_io():
    for i in range(10):
        await main(i)

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(async_io())
    print(f"Took {time.perf_counter() - start_time} secs")
I've also tried to use queue_task in asyncio.
Using await, by definition, waits for the task main to finish. So your code as-is is no different from the synchronous code you posted above. If you want to run them at the same time (asynchronously), while waiting for the results, you should use asyncio.gather or asyncio.wait instead.
async def async_io():
    tasks = []
    for i in range(10):
        tasks += [main(i)]
    await asyncio.gather(*tasks)
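For completeness, here is what the asyncio.wait variant mentioned above might look like. This is a sketch of my own; note that since Python 3.11, asyncio.wait requires Task objects rather than bare coroutines:
async def async_io():
    # wrap each coroutine in a Task so asyncio.wait accepts it on all versions
    tasks = [asyncio.create_task(main(i)) for i in range(10)]
    await asyncio.wait(tasks)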
If you don't care to wait for all of the main() calls to finish, you can also just use asyncio.create_task(main(i)), which creates a Task object and schedules its execution in the background. Note that create_task must be called while an event loop is running, so async_io() still has to be a coroutine, and any tasks that are still pending get cancelled when asyncio.run() shuts the loop down.

How can I run two asyncio loops simultaneously in different processes?

I've been trying to run two asyncio loops in parallel, but I am failing to find meaningful instruction on how to do so. I want to execute two async functions at the same time, while both of them depend on one global variable.
My code looks something like the following:
import asyncio

#a---------------------------------------------------------------
async def foo(n):
    print("Executing foo(n)")
    return n**2

async def main_a():
    print("Executing main_a()")
    n = await foo(3)
    return n+1

x = 1

async def periodic_a():
    global x
    i = 0
    while True:
        i += 2
        x = await main_a()
        x += i
        await asyncio.sleep(1)

#b-----------------------------------------------------------------
async def periodic_b():
    global x
    while True:
        print(f"There are {x} ducks in the pond")
        await asyncio.sleep(5)

#execution---------------------------------------------------------
loop = asyncio.get_event_loop()
task = loop.create_task(periodic_a())
try:
    loop.run_until_complete(task)
except asyncio.CancelledError:
    pass
except KeyboardInterrupt:
    task.cancel()
    loop.close()
I am trying to get functions periodic_a and periodic_b to run at the same time, and provide the output of print(f"There are {x} ducks in the pond") every five seconds. Thank you in advance for any help!
You should create two tasks, one for each function you want to run concurrently, and then await them with asyncio.gather. Also note you should use asyncio.run instead of using the event loop directly; this will make your code cleaner as it handles creating and shutting down the loop for you. Modify the execution section of your code to the following:
async def main():
    periodic_a_task = asyncio.create_task(periodic_a())
    periodic_b_task = asyncio.create_task(periodic_b())
    await asyncio.gather(periodic_a_task, periodic_b_task)

asyncio.run(main())
Also note you mention multiple processes, but there isn't any need to use multiprocessing in the example you're describing. If you do need multiprocessing, you'll need a different approach for global data with shared memory.
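For illustration, if you really did need separate processes, a minimal sketch using multiprocessing.Value as the shared counter could look like this. The run_a/run_b wrappers are my own names, and the body of periodic_a is simplified; the loops run until interrupted, matching the periodic design of the question:
import asyncio
from multiprocessing import Process, Value

def run_a(x):
    async def periodic_a():
        i = 0
        while True:
            i += 2
            with x.get_lock():   # Value carries its own lock for cross-process safety
                x.value = 10 + i  # stand-in for the result of main_a()
            await asyncio.sleep(1)
    asyncio.run(periodic_a())

def run_b(x):
    async def periodic_b():
        while True:
            with x.get_lock():
                print(f"There are {x.value} ducks in the pond")
            await asyncio.sleep(5)
    asyncio.run(periodic_b())

if __name__ == "__main__":
    x = Value("i", 1)  # shared integer in shared memory, initial value 1
    procs = [Process(target=run_a, args=(x,)), Process(target=run_b, args=(x,))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()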

Cancelling asyncio task run in executor

I'm scraping some websites, parallelizing the requests library using asyncio:
def run():
    asyncio.run(scrape())

def check_link(link):
    # .... code code code ...
    response = requests.get(link)
    # .... code code code ...
    write_some_stats_into_db()

async def scrape():
    # .... code code code ...
    task = asyncio.get_event_loop().run_in_executor(None, check_link, link)
    # .... code code code ...
    if done:
        for task in all_tasks:
            task.cancel()
I only need to find one 'correct' link; after that, I can stop the program. However, because check_link is run in an executor, its threads are automatically daemonized, so even after calling task.cancel(), I have to wait for all of the other still-running check_link calls to complete.
Do you have any ideas how to 'force-kill' the other running checks in the thread executor?
You can do it the following way. Actually, from my point of view, if you do not have to use asyncio for this task, use only threads without any async loop, since asyncio makes your code more complicated (see the threads-only sketch after the code below).
import asyncio
from random import randint
import time
from functools import partial

# imagine that this is the links array
LINKS = list(range(1000))
# how many thread-workers you want to have running simultaneously
WORKERS_NUM = 10
# stops the app
STOP_EVENT = asyncio.Event()
STOP_EVENT.clear()

def check_link(link: int) -> int:
    """Checks a link in another thread and returns the result."""
    time.sleep(3)
    r = randint(1, 11)
    print(f"{link}____{r}\n")
    return r

async def check_link_wrapper(q: asyncio.Queue):
    """Async wrapper around the sync function."""
    loop = asyncio.get_event_loop()
    while not STOP_EVENT.is_set():
        link = await q.get()
        if link is None:  # "poison pill": there is no more work
            break
        value = await loop.run_in_executor(None, func=partial(check_link, link))
        if value == 10:
            STOP_EVENT.set()
            print("Hurray! We got TEN!")

async def feeder(q: asyncio.Queue):
    """Sends tasks and a "poison pill" to all workers."""
    # send tasks to workers
    for link in LINKS:
        await q.put(link)
    # ask workers to stop
    for _ in range(WORKERS_NUM):
        await q.put(None)

async def amain():
    """Main async function of the app."""
    # maxsize is one since we want the app
    # to stop as fast as possible if the stop condition is met
    q = asyncio.Queue(maxsize=1)
    # we create a separate task, since we do not want to await feeder;
    # we are interested only in the workers
    asyncio.create_task(feeder(q))
    await asyncio.gather(
        *[check_link_wrapper(q) for _ in range(WORKERS_NUM)],
    )

if __name__ == '__main__':
    asyncio.run(amain())

Combining asyncio with a multi-worker ProcessPoolExecutor

Is it possible to take a blocking function such as work and have it run concurrently in a ProcessPoolExecutor that has more than one worker?
import asyncio
from time import sleep, time
from concurrent.futures import ProcessPoolExecutor

num_jobs = 4
queue = asyncio.Queue()
executor = ProcessPoolExecutor(max_workers=num_jobs)
loop = asyncio.get_event_loop()

def work():
    sleep(1)

async def producer():
    for i in range(num_jobs):
        results = await loop.run_in_executor(executor, work)
        await queue.put(results)

async def consumer():
    completed = 0
    while completed < num_jobs:
        job = await queue.get()
        completed += 1

s = time()
loop.run_until_complete(asyncio.gather(producer(), consumer()))
print("duration", time() - s)
Running the above on a machine with more than 4 cores takes ~4 seconds. How would you write producer such that the above example takes only ~1 second?
await loop.run_in_executor(executor, work) blocks the loop until work completes; as a result, you only have one function running at a time.
To run jobs concurrently, you could use asyncio.as_completed:
async def producer():
    tasks = [loop.run_in_executor(executor, work) for _ in range(num_jobs)]
    # note: as_completed's loop= argument was removed in Python 3.10
    for f in asyncio.as_completed(tasks):
        results = await f
        await queue.put(results)
The problem is in the producer. Instead of allowing the jobs to run in the background, it waits for each job to finish, thus serializing them. If you rewrite producer to look like this (and leave consumer unchanged), you get the expected 1s duration:
async def producer():
    for i in range(num_jobs):
        fut = loop.run_in_executor(executor, work)
        fut.add_done_callback(lambda f: queue.put_nowait(f.result()))
