Given a regular generator, you can get an iterator from it that can only be consumed once, and pick up where you left off. Like this:
sync_gen = (i for i in range(10))

def fetch_batch_sync(num_tasks, job_list):
    for i, job in enumerate(job_list):
        yield job
        if i == num_tasks - 1:
            break
>>> sync_gen_iter = sync_gen.__iter__()
>>> for i in fetch_batch_sync(2, sync_gen_iter):
...     print(i)
...
0
1
>>> for i in fetch_batch_sync(3, sync_gen_iter):
...     print(i)
...
2
3
4
Is there a way to do the same with an async generator?
async def fetch_batch_async(num_tasks, job_list_iter):
    async for i, job in enumerate(job_list_iter):
        yield job
        if i == num_tasks - 1:
            break
The only real difference between regular and async generators is that an async generator's equivalents of the __next__ and __iter__ methods are themselves coroutines that must be awaited. This is why the ordinary for loop and enumerate fail to recognize them as iterables.
As with regular generators, it is possible to extract a subset of values out of an async generator, but you need to use the appropriate tools. fetch_batch_async already uses async for, but it also needs an async version of enumerate; for example:
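To make that difference concrete, here is a minimal sketch (the names are illustrative, not from the question) that drives an async generator by hand, awaiting __anext__ the way async for does under the hood:

import asyncio

async def numbers():
    for i in range(3):
        yield i

async def manual_iteration():
    it = numbers().__aiter__()            # get the async iterator
    while True:
        try:
            value = await it.__anext__()  # each step is a coroutine and must be awaited
        except StopAsyncIteration:
            break
        print(value)

asyncio.run(manual_iteration())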
async def aenumerate(aiterable, start=0):
    i = start
    async for obj in aiterable:
        yield i, obj
        i += 1
fetch_batch_async would use it exactly like enumerate:
async def fetch_batch_async(num_tasks, job_list_iter):
    async for i, job in aenumerate(job_list_iter):
        yield job
        if i == num_tasks - 1:
            break
Finally, this code uses fetch_batch_async to extract several items out of an infinite async iterator:
import asyncio, time

async def infinite():
    while True:
        yield time.time()
        await asyncio.sleep(.1)

async def main():
    async for received in fetch_batch_async(10, infinite()):
        print(received)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
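On Python 3.7+ the same example can be started more simply (a minor variation, not part of the original answer):

asyncio.run(main())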
Current versions of Python (Dec 2022) still allow using the @asyncio.coroutine decorator, and a generator can be awaited like this:
import asyncio

asyncify = asyncio.coroutine
data_ready = False  # Status of a pipe, just to test

def gen():
    global data_ready
    while not data_ready:
        print("not ready")
        data_ready = True  # Just to test
        yield
    return "done"

async def main():
    result = await asyncify(gen)()
    print(result)

loop = asyncio.new_event_loop()
loop.create_task(main())
loop.run_forever()
However, Python 3.8+ deprecates the @coroutine decorator (the asyncify alias above), so how can I wait for (await) a generator to end as above?
I tried to use async def as suggested by the deprecation warning, but it doesn't work:
import asyncio

asyncify = asyncio.coroutine
data_ready = False  # Just to test

async def gen():
    global data_ready
    while not data_ready:
        print("not ready")
        data_ready = True  # Just to test
        yield
    yield "done"
    return

async def main():
    # this has error: TypeError: object async_generator can't be used in 'await' expression
    result = await gen()
    print(result)

loop = asyncio.new_event_loop()
loop.create_task(main())
loop.run_forever()
Asynchronous generators implement the asynchronous iterator protocol and are meant for asynchronous iteration. You cannot directly await them like regular coroutines.
With that in mind, returning to your experimental case and your question "how to wait for (await) a generator to end?": to get the final yielded value, perform asynchronous iteration:
import asyncio

data_ready = False  # Just to test

async def gen():
    global data_ready
    while not data_ready:
        print("not ready")
        data_ready = True  # Just to test
        yield "processing"
    yield "done"
    return

async def main():
    a_gen = gen()
    async for result in a_gen:  # assign to result on each async iteration
        pass
    print('result:', result)

asyncio.run(main())
Prints:
not ready
result: done
Naturally, you can also advance the async generator in steps with anext:
a_gen = gen()
val_1 = await anext(a_gen)
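Note that anext() is a built-in only since Python 3.10; on earlier versions you can call the __anext__ method directly, which is equivalent:

a_gen = gen()
val_1 = await a_gen.__anext__()  # same as await anext(a_gen) on Python 3.10+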
Summing up, follow the guidelines in PEP 525 – Asynchronous Generators and try not to mix old, deprecated constructs with current ones.
I would like to listen for events from multiple instances of the same object and then merge these event streams into one stream. For example, if I use async generators:
class PeriodicYielder:
    def __init__(self, period: int) -> None:
        self.period = period

    async def updates(self):
        while True:
            await asyncio.sleep(self.period)
            yield self.period
I can successfully listen for events from one instance:
async def get_updates_from_one():
    each_1 = PeriodicYielder(1)
    async for n in each_1.updates():
        print(n)
# 1
# 1
# 1
# ...
But how can I get events from multiple async generators? In other words: how can I iterate through multiple async generators in the order they are ready to produce next value?
async def get_updates_from_multiple():
    each_1 = PeriodicYielder(1)
    each_2 = PeriodicYielder(2)
    async for n in magic_async_join_function(each_1.updates(), each_2.updates()):
        print(n)
# 1
# 1
# 2
# 1
# 1
# 2
# ...
Is there such a magic_async_join_function in the stdlib or in a 3rd-party module?
You can use the wonderful aiostream library. It'll look like this:
import asyncio
from aiostream import stream

async def test1():
    for _ in range(5):
        await asyncio.sleep(0.1)
        yield 1

async def test2():
    for _ in range(5):
        await asyncio.sleep(0.2)
        yield 2

async def main():
    combine = stream.merge(test1(), test2())
    async with combine.stream() as streamer:
        async for item in streamer:
            print(item)

asyncio.run(main())
Result:
1
1
2
1
1
2
1
2
2
2
If you wanted to avoid the dependency on an external library (or as a learning exercise), you could merge the async iterators using a queue:
def merge_async_iters(*aiters):
    # merge async iterators, proof of concept
    queue = asyncio.Queue(1)

    async def drain(aiter):
        async for item in aiter:
            await queue.put(item)

    async def merged():
        while not all(task.done() for task in tasks):
            yield await queue.get()

    tasks = [asyncio.create_task(drain(aiter)) for aiter in aiters]
    return merged()
This passes the test from Mikhail's answer, but it's not perfect: it doesn't propagate the exception in case one of the async iterators raises. Also, if the task that exhausts the merged generator returned by merge_async_iters() gets cancelled, or if the same generator is not exhausted to the end, the individual drain tasks are left hanging.
A more complete version could handle the first issue by detecting an exception and transmitting it through the queue. The second issue can be resolved by having the merged generator cancel the drain tasks as soon as its iteration is abandoned. With those changes, the resulting code looks like this:
def merge_async_iters(*aiters):
    queue = asyncio.Queue(1)
    run_count = len(aiters)
    cancelling = False

    async def drain(aiter):
        nonlocal run_count
        try:
            async for item in aiter:
                await queue.put((False, item))
        except Exception as e:
            if not cancelling:
                await queue.put((True, e))
            else:
                raise
        finally:
            run_count -= 1

    async def merged():
        try:
            while run_count:
                raised, next_item = await queue.get()
                if raised:
                    cancel_tasks()
                    raise next_item
                yield next_item
        finally:
            cancel_tasks()

    def cancel_tasks():
        nonlocal cancelling
        cancelling = True
        for t in tasks:
            t.cancel()

    tasks = [asyncio.create_task(drain(aiter)) for aiter in aiters]
    return merged()
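For reference, a minimal usage sketch of merge_async_iters (the two generators below are made-up examples, not part of the original answer):

import asyncio

async def numbers():
    for i in range(3):
        await asyncio.sleep(0.10)
        yield i

async def letters():
    for c in "abc":
        await asyncio.sleep(0.15)
        yield c

async def main():
    async for item in merge_async_iters(numbers(), letters()):
        print(item)

asyncio.run(main())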
Different approaches to merging async iterators can be found in this answer, and also this one, where the latter allows for adding new streams mid-stride. The complexity and subtlety of these implementations show that, while it is useful to know how to write one, actually doing so is best left to well-tested external libraries such as aiostream that cover all the edge cases.
I'm using python 3.5 to asynchronously return data from one method to another as follows:
async def A():
    # Need to get data here from B continuously
    val = await B()

async def B():
    # Need to get data here from C continuously as they get generated inside while loop of method C
    data = await C()
    # Modify and process the data and return to A
    return await D(data)

async def C():
    i = 0
    while i < 5:
        await asyncio.sleep(1)
        # Return this data to method B one by one, not sure how to do this ??
        return i

async def D(val):
    # Do some processing of val and return it
    return val
I want to continuously stream data from method C and return it to method B, process each item as they are received and return it to method A.
One way is to use an asyncio queue and pass it to method B from A, from where it further gets passed on to C.
Method C would keep writing the content in the queue.
Method B would read from queue, process the data and update the queue.
Method A reads the queue at the end for finally processed data.
Can we achieve this using coroutines or async methods themselves in any other way? I wish to avoid continuously reading from and writing to queues for every request.
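On Python 3.6+ one way to avoid explicit queues is to make C an async generator and have B wrap it. This is a rough sketch of that idea under those assumptions, not the asker's original code:

import asyncio

async def C():
    for i in range(5):
        await asyncio.sleep(1)
        yield i                 # stream each value instead of returning once

async def D(val):
    return val * 2              # placeholder processing

async def B():
    async for data in C():
        yield await D(data)     # process each item as it arrives

async def A():
    async for val in B():
        print('A received:', val)

asyncio.run(A())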
import asyncio
from async_generator import async_generator, yield_, yield_from_

async def fun(n):
    print("Finding %d-1" % n)
    await asyncio.sleep(n/2)
    result = n - 1
    print("%d - 1 = %d" % (n, result))
    return result

@async_generator
async def main(l):
    futures = [fun(n) for n in l]
    for i, future in enumerate(asyncio.as_completed(futures)):
        result = await future
        print("inside the main..")
        print(result)
        await yield_(result)

@async_generator
async def dealer():
    l = [2, 4, 6]
    gen = main(l)
    async for item in gen:
        print("inside the dealer....")
        await yield_(item)

async def dealer1():
    gen = dealer()
    async for item in gen:
        print("inside dealer 1")
        print(item)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    # loop.run_until_complete(cc.main())
    loop.run_until_complete(dealer1())
    loop.close()
Async generators are supported natively in Python 3.6+. If you are working with Python 3.5, you can use the async_generator library (https://pypi.python.org/pypi/async_generator/1.5).
I'm looking to be able to yield from a number of async coroutines. Asyncio's as_completed is kind of close to what I'm looking for (i.e. I want any of the coroutines to be able to yield at any time back to the caller and then continue), but that only seems to allow regular coroutines with a single return.
Here's what I have so far:
import asyncio

async def test(id_):
    print(f'{id_} sleeping')
    await asyncio.sleep(id_)
    return id_

async def test_gen(id_):
    count = 0
    while True:
        print(f'{id_} sleeping')
        await asyncio.sleep(id_)
        yield id_
        count += 1
        if count > 5:
            return

async def main():
    runs = [test(i) for i in range(3)]
    for i in asyncio.as_completed(runs):
        i = await i
        print(f'{i} yielded')

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
Replacing runs = [test(i) for i in range(3)] with runs = [test_gen(i) for i in range(3)], and having for i in asyncio.as_completed(runs) iterate on each yield, is what I'm after.
Is this possible to express in Python, and are there any third-party libraries that give you more options than the standard library for coroutine process flow?
Thanks
You can use aiostream.stream.merge:
from aiostream import stream

async def main():
    runs = [test_gen(i) for i in range(3)]
    async for x in stream.merge(*runs):
        print(f'{x} yielded')
Run it in a safe context to make sure the generators are cleaned up properly after the iteration:
async def main():
    runs = [test_gen(i) for i in range(3)]
    merged = stream.merge(*runs)
    async with merged.stream() as streamer:
        async for x in streamer:
            print(f'{x} yielded')
Or make it more compact using pipes:
from aiostream import stream, pipe

async def main():
    runs = [test_gen(i) for i in range(3)]
    await (stream.merge(*runs) | pipe.print('{} yielded'))
More examples in the documentation.
Addressing @nirvana-msu's comment
It is possible to identify the generator that yielded a given value by preparing sources accordingly:
async def main():
    runs = [test_gen(i) for i in range(3)]
    # bind i as a default argument so each lambda keeps its own index
    sources = [stream.map(xs, lambda x, i=i: (i, x)) for i, xs in enumerate(runs)]
    async for i, x in stream.merge(*sources):
        print(f'ID {i}: {x}')
I have a simple aiohttp-server with two handlers.
The first one does some computations in an async for loop. The second one just returns a text response. not_so_long_operation returns the 30th Fibonacci number using the slowest recursive implementation, which takes about one second.
def not_so_long_operation():
    return fib(30)

class arange:
    def __init__(self, n):
        self.n = n
        self.i = 0

    async def __aiter__(self):
        return self

    async def __anext__(self):
        i = self.i
        self.i += 1
        if self.i <= self.n:
            return i
        else:
            raise StopAsyncIteration

# GET /
async def index(request):
    print('request!')
    l = []
    async for i in arange(20):
        print(i)
        l.append(not_so_long_operation())
    return aiohttp.web.Response(text='%d\n' % l[0])
# GET /lol/
async def lol(request):
    print('request!')
    return aiohttp.web.Response(text='just respond\n')
When I try to fetch / and then /lol/, it gives me the response for the second one only after the first one has finished.
What am I doing wrong, and how do I make the index handler release the event loop on each iteration?
Your example has no yield points (await statements) for switching between tasks.
An asynchronous iterator allows using await inside __aiter__/__anext__, but it doesn't insert yield points into your code automatically.
Say,
class arange:
    def __init__(self, n):
        self.n = n
        self.i = 0

    async def __aiter__(self):
        return self

    async def __anext__(self):
        i = self.i
        self.i += 1
        if self.i <= self.n:
            await asyncio.sleep(0)  # insert yield point
            return i
        else:
            raise StopAsyncIteration
should work as you expected.
In a real application you most likely don't need the await asyncio.sleep(0) calls, because you will be waiting on database access and similar activities anyway.
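For illustration, here is a sketch of the same iterator awaiting a hypothetical async database call instead of sleep(0); the db object and its fetch_row coroutine are assumptions, not a real API:

class db_rows:
    def __init__(self, db, n):
        self.db = db              # assumed object exposing an async fetch_row(i) coroutine
        self.n = n
        self.i = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.i >= self.n:
            raise StopAsyncIteration
        row = await self.db.fetch_row(self.i)  # awaiting real I/O yields control to the event loop
        self.i += 1
        return row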
Since fib(30) is CPU-bound and shares little data, you should probably use a ProcessPoolExecutor (as opposed to a ThreadPoolExecutor):
async def index(request):
    loop = request.app.loop
    executor = request.app["executor"]
    result = await loop.run_in_executor(executor, fib, 30)
    return web.Response(text="%d" % result)
Set up the executor when you create the app:
app = Application(...)
app["executor"] = ProcessPoolExecutor()
An asynchronous iterator is not really needed here. Instead you can simply give control back to the event loop inside your loop. In Python 3.4, this is done with a simple yield:
@asyncio.coroutine
def index(self):
    for i in range(20):
        not_so_long_operation()
        yield
In Python 3.5, you can define an Empty object that basically does the same thing:
class Empty:
    def __await__(self):
        yield
Then use it with the await syntax:
async def index(request):
    for i in range(20):
        not_so_long_operation()
        await Empty()
Or simply use asyncio.sleep(0) that has been recently optimized:
async def index(request):
    for i in range(20):
        not_so_long_operation()
        await asyncio.sleep(0)
You could also run the not_so_long_operation in a thread using the default executor:
async def index(request, loop):
    for i in range(20):
        await loop.run_in_executor(None, not_so_long_operation)