lock an asyncio function when more than 100 are running? - python

This is gonna be a bad explanation but I don't know how else to word this so bear with me please.
I have one function:
async def request():
    # this can only be called n times at once
But, as the comment says, it can only be called n times at once. Is it possible to have some sort of pool with a limited number of objects so I can do this:
async def request():
    async with poolOfOneHundred.acquire():
        # do something
and then Python would acquire 100 of these, and once it got to the 101st it would wait at the async with statement until another request() call finished, at which point one lock in the pool would be free again.
Is this a thing? If not, how could I implement something like this?
Does this make any sense?

You're looking for asyncio.Semaphore.
Here's an example of how to use it.
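A minimal sketch (the names and numbers are illustrative, not from your code): a semaphore created with 100 permits lets at most 100 request() bodies run at once, and the 101st call waits at the async with until another call releases its permit.

import asyncio

async def request(sem, i):
    async with sem:                  # the 101st caller waits here until a permit is free
        await asyncio.sleep(0.1)     # stand-in for the real work
        return i

async def main():
    sem = asyncio.Semaphore(100)     # at most 100 request() bodies run concurrently
    results = await asyncio.gather(*(request(sem, n) for n in range(500)))
    print(len(results))

asyncio.run(main())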

Related

Is this a good alternative to asyncio.sleep?

I decided not to use asyncio.sleep() and tried to create my own coroutine function, as shown below. Since time.sleep is an IO-bound function, I thought this would print 7 seconds. But it prints 11 seconds.
import time
import asyncio

async def my_sleep(delay):
    time.sleep(delay)

async def main():
    start = time.time()
    await asyncio.gather(my_sleep(4), my_sleep(7))
    print("Took", time.time()-start, "seconds")

asyncio.run(main())
# Expected: Took 7 seconds
# Got: Took 11.011508464813232 seconds
Though if I write similar code with threads, it does print 7 seconds. Do Task objects created by asyncio.gather not recognize time.sleep as an IO-bound operation, the way threads do? Please explain why this is happening.
time.sleep is a blocking operation for the event loop. Writing async in the definition of the function makes no difference, because nothing in the function gives control back to the event loop (there is no await).
These two questions might help you understand more:
Python 3.7 - asyncio.sleep() and time.sleep()
Run blocking and unblocking tasks together with asyncio
This would not work for you because time.sleep is a synchronous function.
From the 'perspective' of the event loop my_sleep might as well be doing a heavy computation within an async function, never yielding the execution context while working.
The first telltale sign of this is that you're not using an await statement when calling time.sleep.
Making a synchronous function behave as an async one is not trivial, but the common approach is moving the function call to worker threads and awaiting the results.
I'd recommend looking at the solution in anyio: they implemented a run_sync function which does exactly that.
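A minimal sketch of that approach applied to the code above, using the standard library instead of anyio (asyncio.to_thread needs Python 3.9+; on older versions loop.run_in_executor(None, time.sleep, delay) does the same thing):

import asyncio
import time

async def my_sleep(delay):
    # the blocking time.sleep runs in a worker thread; awaiting it yields to the event loop
    await asyncio.to_thread(time.sleep, delay)

async def main():
    start = time.time()
    await asyncio.gather(my_sleep(4), my_sleep(7))
    print("Took", time.time() - start, "seconds")  # now roughly 7 seconds

asyncio.run(main())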

How to use Trio for fast web api calls?

I'm trying to speed up some code that calls an api_caller(), which is a generator that you can iterate over to get results.
My synchronous code looks something like this:
def process_comment_tree(p):
    # time consuming breadth first search that makes another api call...
    return

def process_post(p):
    process_comment_tree(p)

def process_posts(kw):
    for p in api_caller(query=kw):  # possibly 1000s of results
        process_post(p)

def process_kws(kws):
    for kw in kws:
        process_posts(kw)

process_kws(kws=['python', 'threads', 'music'])
When I run this code on a long list of kws, it takes around 18 minutes to complete.
When I use threads:
with concurrent.futures.ThreadPoolExecutor(max_workers=len(KWS)) as pool:
    for result in pool.map(process_posts, ['python', 'threads', 'music']):
        print(f'result: {result}')
the code completes in around 3 minutes.
Now, I'm trying to use Trio for the first time, but I'm having trouble.
async def process_comment_tree(p):
    # same as before...
    return

async def process_post(p):
    await process_comment_tree(p)

async def process_posts(kw):
    async with trio.open_nursery() as nursery:
        for p in r.api.search_submissions(query=kw):
            nursery.start_soon(process_post, p)

async def process_kws(kws):
    async with trio.open_nursery() as nursery:
        for kw in kws:
            nursery.start_soon(process_posts, kw)

trio.run(process_kws, ['python', 'threads', 'music'])
This still takes around 18 minutes to execute. Am I doing something wrong here, or is something like trio/async not appropriate for my problem setup?
Trio, and async libraries in general, work by switching to a different task while waiting for something external, like an API call. In your code example, it looks like you start a bunch of tasks, but none of them ever awaits anything external. I would recommend reading this part of the tutorial; it gives an idea of what that means: https://trio.readthedocs.io/en/stable/tutorial.html#task-switching-illustrated
Basically, your code has to call a function that will pass control back to the run loop so that it can switch to a different task.
If your api_caller generator makes calls to an external API, that's likely to be something you can replace with async calls. You'll need to use an async HTTP library, like HTTPX or hip.
On the other hand, if there's nothing in your code that has to wait for something external, then async won't help your code go faster.
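A minimal sketch of the idea (the URLs and the processing are placeholders, not your real API): each task awaits an async HTTP call through HTTPX, so Trio can switch to another task while a request is in flight.

import httpx
import trio

async def process_post(client, url):
    resp = await client.get(url)   # awaiting the HTTP call hands control back to Trio
    resp.raise_for_status()
    print(url, len(resp.text))     # placeholder for the real processing

async def main():
    urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
    async with httpx.AsyncClient() as client:
        async with trio.open_nursery() as nursery:
            for url in urls:
                nursery.start_soon(process_post, client, url)

trio.run(main)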

Python 3.6 async await without asyncio: how to write own simplest event loop?

I just want to understand the async/await syntax, so I'm looking for some 'hello world' app that doesn't use asyncio at all.
So how do I create the simplest event loop using only Python syntax itself? The simplest code (taken from Start async function without importing the asyncio package; the further code there is much more than hello world, which is why I'm asking) looks like this:
async def cr():
    while True:
        print(1)

cr().send(None)
It prints 1 infinitely, which is not so good.
So the first question is: how do I yield from the coroutine back to the main flow? The yield keyword would turn the coroutine into an async generator, which is not what we want.
I would also appreciate a simple application like this: a coroutine which prints 1, then yields to the event loop, then prints 2, then exits with return 3, plus a simple event loop which pushes the coroutine until it returns and consumes the result.
How about this?
import types

@types.coroutine
def simple_coroutine():
    print(1)
    yield
    print(2)
    return 3

future = simple_coroutine()
while True:
    try:
        future.send(None)
    except StopIteration as returned:
        print('It has returned', returned.value)
        break
I think your biggest problem is that you're mixing concepts. An async function is not the same as a coroutine. It is more appropriate to think of it as a way of combining coroutines, just as ordinary def functions are a way of combining statements into functions. Yes, Python is a highly reflective language, so def is also a statement, and what you get from your async function is also a coroutine, but you need something at the bottom, something to start with. (At the bottom, yielding is just yielding; at any intermediate level, it is awaiting, of something else, of course.) That something is given to you through the types.coroutine decorator in the standard library.
If you have any more questions, feel free to ask.
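To make the "combining coroutines" point concrete, here is a small follow-up sketch: an ordinary async def can await the generator-based coroutine above, and the same send() loop drives the whole stack.

import types

@types.coroutine
def simple_coroutine():
    print(1)
    yield            # the "bottom": actually hands control back to whoever drives us
    print(2)
    return 3

async def composed():
    result = await simple_coroutine()   # an async function combining a lower-level coroutine
    return result * 10

future = composed()
while True:
    try:
        future.send(None)
    except StopIteration as returned:
        print('It has returned', returned.value)   # prints: It has returned 30
        break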

Using asyncio with blocking code

Firstly, I looked at this, this and this and whilst the first has some useful information, it's not relevant here because I'm trying to iterate over values.
Here's an example of something I want to be able to do:
class BlockingIter:
    def __iter__(self):
        while True:
            yield input()

async def coroutine():
    my_iter = BlockingIter()
    # Magic thing here
    async for i in my_iter:
        await do_stuff_with(i)
How would I go about this?
(Note, BlockingIter is in reality a library I'm using (chatexchange) so there might be a few other complications.)
As @vaultah says, and as also explained in the docs, awaiting the executor (await loop.run_in_executor(None, next, iter_messages)) is probably what you want.
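A minimal sketch of how that might look for an iterator (aiter_blocking is a hypothetical helper, not part of the library): each next() call runs in the default executor, so the event loop stays free while waiting for input.

import asyncio

class BlockingIter:
    def __iter__(self):
        while True:
            yield input()

async def aiter_blocking(blocking_iterable):
    # hypothetical helper: pull items from a blocking iterator in a worker thread
    loop = asyncio.get_running_loop()
    it = iter(blocking_iterable)
    sentinel = object()
    while True:
        item = await loop.run_in_executor(None, next, it, sentinel)
        if item is sentinel:
            break
        yield item

async def coroutine():
    async for i in aiter_blocking(BlockingIter()):
        print(i)   # stand-in for await do_stuff_with(i)

# asyncio.run(coroutine())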

Is asyncio.wait order guaranteed?

I'm trying to implement fair queuing in my library that is based on asyncio.
In some function, I have a statement like (assume socketX are tasks):
done, pending = await asyncio.wait(
    [socket1, socket2, socket3],
    return_when=asyncio.FIRST_COMPLETED,
)
Now, I've read the documentation for asyncio.wait many times, but it does not contain the information I'm after. Mainly, I'd like to know:
If socket1, socket2 and socket3 happened to be already ready when I issued the call, is it guaranteed that done will contain them all, or could it return only one (or two)?
In the latter case, does the order of the tasks passed to wait() matter?
I'm trying to determine whether I can just apply fair queuing to the set of done tasks (by picking one and leaving the other tasks for later resolution) or whether I also need to care about the order I pass the tasks in.
The documentation is rather silent about this. Any idea?
This is based only on the source code of Python 3.5.
If the futures are already done before calling wait, they will all be placed in the done set:
import asyncio

async def f(n):
    return n

async def main():
    (done, pending) = await asyncio.wait([f(1), f(2), f(3)], return_when=asyncio.FIRST_COMPLETED)
    print(done)     # prints a set of 3 futures
    print(pending)  # prints an empty set

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()
