As mentioned in the title, what is the asyncio equivalent of Twisted's defer.succeed? Alternatively how do I create a coroutine object from a plain Python value?
Note that this is different from "How do we call a normal function where a coroutine is expected?". I could always do something like the following, but I was wondering if there was a better way:
import asyncio

value_to_convert_to_coroutine_object = 5

async def foo(bar):
    return await asyncio.coroutine(lambda: value_to_convert_to_coroutine_object)()
defer.succeed() is nothing fancy and, as far as I know, there's no single asyncio function that replicates it.
from asyncio import Future

def succeed(result):
    f = Future()
    f.set_result(result)  # the future is already done when returned
    return f
I recall from an asyncio IRC channel that the authors didn't want to add functions that the user could simply write out on their own, which might be why such a function doesn't exist.
As for the second part of your question, it seems like the solution you have is as simple as it gets. I've done something similar for unit tests.
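For illustration, here is a minimal sketch of the same idea in modern asyncio (assuming Python 3.7+; the helper name succeed is borrowed from Twisted, and asyncio.sleep(0, result=...) is shown as a built-in way to await a plain value):

import asyncio

def succeed(result):
    # Build an already-completed Future, like Twisted's defer.succeed().
    f = asyncio.get_running_loop().create_future()
    f.set_result(result)
    return f

async def demo():
    value = await succeed(5)                 # resolves immediately
    also = await asyncio.sleep(0, result=5)  # built-in alternative
    print(value, also)

asyncio.run(demo())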
I am creating an asynchronous Discord library that is fully typed.
I have a method that creates objects from aiohttp GET requests, such as the following example:
async def get_bans(self):
    """|coro|

    Fetches all the bans in the guild.
    """
    data = await self._http.get(f"guilds/{self.id}/bans")
    for ban_data in data:
        yield Ban.from_dict(construct_client_dict(self._client, ban_data))
I was wondering about the return type of this code snippet and whether it should be AsyncGenerator[Ban, None] or AsyncIterator[Ban]. To be honest, I have been searching for a bit, and I couldn't find any information that would give me a clear idea on the subject.
The official documentation describes AsyncIterator as:
... that provide __aiter__ and __anext__ methods ...
The documentation for AsyncGenerator leads to PEP 525:
... generator is any function containing one or more yield expressions ...
The result of calling an asynchronous generator function is an asynchronous generator object ...
Asynchronous generators define both of these methods (__aiter__ and __anext__)
So I think that, because of the yield in your function, it's better to use AsyncGenerator.
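For a self-contained illustration of the annotation (the names countdown and main are made up for this sketch; the second type parameter is the send type):

import asyncio
from typing import AsyncGenerator

async def countdown(n: int) -> AsyncGenerator[int, None]:
    # An async def with a yield inside is an async generator function.
    while n > 0:
        yield n
        n -= 1

async def main() -> None:
    async for i in countdown(3):
        print(i)

asyncio.run(main())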
I have observed that the asyncio.run_coroutine_threadsafe function does not accept general awaitable objects, and I do not understand the reason for this restriction. Observe
import asyncio

async def native_coro():
    return

@asyncio.coroutine
def generator_based_coro():
    return

class Awaitable:
    def __await__(self):
        return asyncio.Future()

loop = asyncio.get_event_loop()

asyncio.run_coroutine_threadsafe(native_coro(), loop)
asyncio.run_coroutine_threadsafe(generator_based_coro(), loop)
asyncio.run_coroutine_threadsafe(Awaitable(), loop)
Running this with Python 3.6.6 yields
Traceback (most recent call last):
  File "awaitable.py", line 24, in <module>
    asyncio.run_coroutine_threadsafe(Awaitable(), loop)
  File "~/.local/python3.6/lib/python3.6/asyncio/tasks.py", line 714, in run_coroutine_threadsafe
    raise TypeError('A coroutine object is required')
TypeError: A coroutine object is required
where line 24 is asyncio.run_coroutine_threadsafe(Awaitable(), loop).
I know I can wrap my awaitable object in a coroutine defined like
awaitable = Awaitable()

async def wrapper():
    return await awaitable

asyncio.run_coroutine_threadsafe(wrapper(), loop)
however my expectation was that the awaitable would be a valid argument directly to run_coroutine_threadsafe.
My questions are:
What is the reason for this restriction?
Is the wrapper function defined above the most conventional way to pass an awaitable to run_coroutine_threadsafe and other APIs that demand an async def or generator-defined coroutine?
What is the reason for this restriction?
Looking at the implementation, the reason is certainly not technical. Since the code already invokes ensure_future (rather than, say, create_task), it would automatically work, and work correctly, on any awaitable object.
The reason for the restriction can be found on the tracker. The function was added in 2015 as a result of a pull request. In the discussion on the related bpo issue the submitter explicitly requests the function be renamed to ensure_future_threadsafe (in parallel to ensure_future) and accept any kind of awaitable, a position seconded by Yury Selivanov. However, Guido was against the idea:
I'm against that idea. I don't really see a great important future for this method either way: It's just a little bit of glue between the threaded and asyncio worlds, and people will learn how to use it by finding an example.
[...]
But honestly I don't want to encourage flipping back and forth between threads and event loops; I see it as a necessary evil. The name we currently have is fine from the POV of someone coding in the threaded world who wants to hand off something to the asyncio world.
Why would someone in the threaded world have an asyncio.future that they need to wait for? That sounds like they're mixing up the two worlds -- or they should be writing asyncio code instead of threaded code.
There are other comments in a similar vein, but the above pretty much sums up the argument.
Is the wrapper function defined above the most conventional way to pass an awaitable to run_coroutine_threadsafe and other APIs that demand an async def or generator-defined coroutine?
If you actually need a coroutine object, something like wrapper is certainly a straightforward and correct way to get one.
If the only reason you're creating the wrapper is to call run_coroutine_threadsafe, but you're not actually interested in the result or the concurrent.futures.Future returned by run_coroutine_threadsafe, you can avoid the wrapping by calling call_soon_threadsafe directly:
loop.call_soon_threadsafe(asyncio.ensure_future, awaitable)
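For context, here is a runnable sketch of the usual hand-off between the two worlds (the names work and submit_from_thread are invented for this example):

import asyncio
import threading

async def work():
    await asyncio.sleep(0.1)
    return 42

def submit_from_thread(loop):
    # Runs in a plain thread; schedules the coroutine on the loop's
    # thread and blocks this thread until the result is available.
    fut = asyncio.run_coroutine_threadsafe(work(), loop)
    print("result:", fut.result())

async def main():
    loop = asyncio.get_running_loop()
    t = threading.Thread(target=submit_from_thread, args=(loop,))
    t.start()
    await asyncio.sleep(0.5)  # keep the loop alive while the thread runs
    t.join()

asyncio.run(main())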
I want to make sure I understand how to create tasklets and asynchronous methods. What I have is a method that returns a list. I want it to be called from somewhere, and immediately allow other calls to be made. So I have this:
future_1 = get_updates_for_user(userKey, aDate)
future_2 = get_updates_for_user(anotherUserKey, aDate)
# Both queries are now in flight; block on each future's result.
somelist.extend(future_1.get_result())
somelist.extend(future_2.get_result())
....
@ndb.tasklet
def get_updates_for_user(userKey, lastSyncDate):
    noteQuery = ndb.GqlQuery('SELECT * FROM Comments WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastSyncDate)
    note_list = list()
    qit = noteQuery.iter()
    while (yield qit.has_next_async()):
        note = qit.next()
        noteDic = note.to_dict()
        note_list.append(noteDic)
    raise ndb.Return(note_list)
Is this code doing what I'd expect it to do? Namely, will the two calls run asynchronously? Am I using futures correctly?
Edit: Well after testing, the code does produce the desired results. I'm a newbie to Python - what are some ways to test to see if the methods are running async?
It's pretty hard to verify for yourself that the methods are running concurrently -- you'd have to put copious logging in. Also in the dev appserver it'll be even harder as it doesn't really run RPCs in parallel.
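For example, a sketch of such logging (only the logging lines are new relative to the question's code, and it assumes the App Engine runtime): if the log shows both 'start' lines before either 'done' line, the two calls overlapped.

import logging
from google.appengine.ext import ndb

@ndb.tasklet
def get_updates_for_user(userKey, lastSyncDate):
    logging.info('start: %s', userKey)
    noteQuery = ndb.gql('SELECT * FROM Comments WHERE ANCESTOR IS :1 AND modifiedDate > :2', userKey, lastSyncDate)
    note_list = []
    qit = noteQuery.iter()
    while (yield qit.has_next_async()):
        note_list.append(qit.next().to_dict())
    logging.info('done: %s', userKey)
    raise ndb.Return(note_list)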
Your code looks okay, it uses yield in the right place.
My only recommendation is to name your function get_updates_for_user_async() -- that matches the convention NDB itself uses and is a hint to the reader of your code that the function returns a Future and should be yielded to get the actual result.
An alternative way to do this is to use the map_async() method on the Query object; it would let you write a callback that just contains the to_dict() call:
@ndb.tasklet
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    note_list = yield noteQuery.map_async(lambda note: note.to_dict())
    raise ndb.Return(note_list)
Advanced tip: you can simplify this even more by dropping the @ndb.tasklet decorator and just returning the Future returned by map_async():
def get_updates_for_user_async(userKey, lastSyncDate):
    noteQuery = ndb.gql('...')
    return noteQuery.map_async(lambda note: note.to_dict())
This is a general slight optimization for async functions that contain only one yield and immediately return the value yielded. (If you don't immediately get this, you're in good company, and it runs the risk of being broken by a future maintainer who doesn't either. :-)
I'm implementing a utility library which is a sort of task manager intended to run within the distributed environment of the Google App Engine cloud computing service. (It uses a combination of task queues and memcache to execute background processing.) I plan to use generators to control the execution of tasks, essentially enforcing non-preemptive "concurrency" via the use of yield in the user's code.
The trivial example - processing a bunch of database entities - could be something like the following:
class EntityWorker(Worker):
    def setup(self):
        self.entity_query = Entity.all()

    def run(self):
        for e in self.entity_query:
            do_something_with(e)
            yield
As we know, yield is a two-way communication channel, allowing values to be passed to the code that uses the generator. This makes it possible to simulate a "preemptive API" such as the SLEEP call below:
def run(self):
    for e in self.entity_query:
        do_something_with(e)
        yield Worker.SLEEP, timedelta(seconds=1)
But this is ugly. It would be great to hide the yield within a separate function which could be invoked in a simple way:
self.sleep(timedelta(seconds=1))
The problem is that putting yield in the function sleep turns it into a generator function. The call above would therefore just return another generator. Only after adding .next() and yield again would we obtain the previous result:
yield self.sleep(timedelta(seconds=1)).next()
which is of course even more ugly and unnecessarily verbose than before.
Hence my question: is there a way to put yield into a function without turning it into a generator function, but keeping it usable by other generators to yield values computed by it?
You seem to be missing the obvious:
class EntityWorker(Worker):
    def setup(self):
        self.entity_query = Entity.all()

    def run(self):
        for e in self.entity_query:
            do_something_with(e)
            yield self.sleep(timedelta(seconds=1))

    def sleep(self, wait):
        return Worker.SLEEP, wait
It's the yield that turns functions into generators; it's impossible to leave it out.
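To make the mechanics concrete, here is a toy, self-contained driver (entirely made up; the real worker framework is assumed to consume the (Worker.SLEEP, delta) tuples that run() yields):

import time
from datetime import timedelta

class Worker:
    SLEEP = 'sleep'

def do_something_with(e):
    print('processing', e)

class EntityWorker(Worker):
    def setup(self):
        self.entity_query = [1, 2, 3]  # stand-in for Entity.all()

    def run(self):
        for e in self.entity_query:
            do_something_with(e)
            yield self.sleep(timedelta(seconds=0.1))

    def sleep(self, wait):
        return Worker.SLEEP, wait

w = EntityWorker()
w.setup()
for command, arg in w.run():  # interpret each yielded command tuple
    if command == Worker.SLEEP:
        time.sleep(arg.total_seconds())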
To hide the yield you need a higher-order function; in your example it's map:
from itertools import imap

def slowmap(f, sleep, *iters):
    # Yield a SLEEP command after each mapped row.
    for row in imap(f, *iters):
        yield Worker.SLEEP, sleep

def run():
    return slowmap(do_something_with,
                   timedelta(seconds=1),
                   self.entity_query)
Alas, this won't work. But a "middle way" could be fine:
def sleepjob(*a, **k):
    if a:
        return Worker.SLEEP, a[0]
    else:
        return Worker.SLEEP, timedelta(**k)
So
yield self.sleepjob(timedelta(seconds=1))
yield self.sleepjob(seconds=1)
look OK to me.
I would suggest you have a look at ndb. It uses generators as coroutines (as you are proposing here), allowing you to write programs that work with RPCs asynchronously.
The API does this by wrapping the generator with another function that 'primes' the generator (it calls .next() immediately so that the code begins execution). The tasklets are also designed to work with App Engine's RPC infrastructure, making it possible to use any of the existing asynchronous API calls.
With the concurrency model used in ndb, you yield either a future object (similar to what is described in PEP 3148) or an App Engine RPC object. When that RPC has completed, execution in the function that yielded the object is allowed to continue.
If you are using a model derived from ndb.model.Model then the following will allow you to asynchronously iterate over a query:
from ndb import tasklets

@tasklets.tasklet
def run():
    it = iter(Entity.query())
    # Other tasklets will be allowed to run if the next call has to wait for an RPC.
    while (yield it.has_next_async()):
        entity = it.next()
        do_something_with(entity)
Although ndb is still considered experimental (some of its error handling code still needs some work), I would recommend you have a look at it. I have used it in my last 2 projects and found it to be an excellent library.
Make sure you read through the documentation linked from the main page, and also the companion documentation for the tasklet stuff.
I'm trying to convert a synchronous library to use an internal asynchronous IO framework. I have several methods that look like this:
def foo():
    ....
    sync_call_1()  # synchronous blocking call
    ....
    sync_call_2()  # synchronous blocking call
    ....
    return bar
For each of the synchronous functions (sync_call_*), I have written a corresponding async function that takes a callback. E.g.
def async_call_1(callback=None):
    # do the I/O
    callback()
Now for the Python newbie question -- what's the easiest way to translate the existing methods to use these new async methods instead? That is, the method foo() above needs to now be:
def async_foo(callback):
    # Do the foo() stuff using async_call_*
    callback()
One obvious choice is to pass a callback into each async method which effectively "resumes" the calling foo function, and then call that callback at the very end of the method. However, that makes the code brittle and ugly, and I would need to add a new callback for every call to an async_call_* method.
Is there an easy way to do that using a python idiom, such as a generator or coroutine?
UPDATE: take this with a grain of salt, as I'm out of touch with modern Python async developments, including gevent and asyncio, and don't actually have serious experience with async code.
There are 3 common approaches to thread-less async coding in Python:
Callbacks - ugly but workable, Twisted does this well.
Generators - nice but require all your code to follow the style.
Use a Python implementation with real tasklets -- Stackless (RIP) and greenlet.
Unfortunately, ideally the whole program should use one style, or things become complicated. If you are OK with your library exposing a fully synchronous interface, you are probably OK, but if you want several calls to your library to work in parallel, especially in parallel with other async code, then you need a common event "reactor" that can work with all the code.
So if you have (or expect the user to have) other async code in the application, adopting the same model is probably smart.
If you don't want to understand the whole mess, consider using the bad old threads, as in the sketch below. They are also ugly, but they work with everything else.
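For instance, a minimal sketch of the threaded route (blocking_work is a stand-in for the synchronous calls from the question):

from concurrent.futures import ThreadPoolExecutor

def blocking_work(n):
    # Stand-in for a synchronous blocking call such as sync_call_1().
    return n * 2

# Run several blocking calls in parallel without rewriting them.
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(blocking_work, i) for i in range(3)]
    results = [f.result() for f in futures]

print(results)  # [0, 2, 4]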
If you do want to understand how coroutines might help you -- and how they might complicate things -- David Beazley's "A Curious Course on Coroutines and Concurrency" is good stuff.
Greenlets might actually be the cleanest way if you can use the extension. I don't have any experience with them, so I can't say much.
There are several ways of multiplexing tasks. We can't say which is best for your case without deeper knowledge of what you are doing. Probably the easiest and most universal way is to use threads. Take a look at this question for some ideas.
You need to make the function foo async as well. How about this approach?
@make_async
def foo(somearg, callback):
    # This function is now async. Expect a callback argument.
    ...
    # change
    # x = sync_call1(somearg, some_other_arg)
    # to the following:
    x = yield async_call1, somearg, some_other_arg
    ...
    # same transformation again
    y = yield async_call2, x
    ...
    # change
    # return bar
    # to a callback call
    callback(bar)
And make_async can be defined like this:
def make_async(f):
    """Decorator to convert a sync function to async
    using the above-mentioned transformations."""
    def g(*a, **kw):
        async_call(f(*a, **kw))
    return g
def async_call(it, value=None):
    # This function is the core of the async transformation.
    try:
        # send the current value to the iterator and
        # expect function to call and args to pass to it
        x = it.send(value)
    except StopIteration:
        return
    func = x[0]
    args = list(x[1:])
    # define callback and append it to args
    # (assuming that callback is always the last argument)
    callback = lambda new_value: async_call(it, new_value)
    args.append(callback)
    func(*args)
CAUTION: I haven't tested this
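A quick way to sanity-check the transformation is with a toy async primitive (async_call1 here simply invokes its callback synchronously; in real code it would hand off to an I/O framework):

def async_call1(a, b, callback):
    # Toy async primitive: invokes its callback immediately.
    callback(a + b)

@make_async
def foo(somearg, callback):
    x = yield async_call1, somearg, 10
    callback(x)

foo(5, callback=print)  # prints 15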