I'm having trouble understanding the reason for the code below; I hope someone can shed some light on this. I'm new to async programming.
This is from the websockets documentation:
#!/usr/bin/env python
import asyncio
import websockets

async def echo(websocket, path):
    async for message in websocket:
        await websocket.send(message)

asyncio.get_event_loop().run_until_complete(
    websockets.serve(echo, 'localhost', 8765))
asyncio.get_event_loop().run_forever()
I have some questions about asyncio design and how I can take advantage of this.
First of all, looking at the last two lines: if I understand correctly, shouldn't run_until_complete shut down after finishing its work? How does run_forever keep things alive with no jobs submitted to the loop?
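To see what is going on, here is a stdlib experiment I tried, with asyncio.start_server standing in for websockets.serve so it runs without extra dependencies (the echo/client helpers are mine):

```python
import asyncio

async def echo(reader, writer):
    # trivial echo handler, analogous to the websockets echo coroutine
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()

loop = asyncio.new_event_loop()

# like websockets.serve, start_server's coroutine completes as soon as the
# listening socket is registered with the loop -- so run_until_complete
# returns almost immediately, with the server still alive on the loop
server = loop.run_until_complete(asyncio.start_server(echo, "127.0.0.1", 0))
port = server.sockets[0].getsockname()[1]

# run_forever would now just keep the loop alive so it can keep dispatching
# connections; here we drive the loop with one client instead, so this ends
async def client():
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    return reply

reply = loop.run_until_complete(client())
print(reply)  # b'hello\n'

server.close()
loop.run_until_complete(server.wait_closed())
loop.close()
```

So run_until_complete only waits for the serve() coroutine itself, which finishes once the server is set up; the server keeps handling connections as long as something keeps the loop running.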
Secondly, I'm trying to build a back end that processes data from the front end over websockets and returns the calculations in real time. There will be two kinds of tasks: one that needs a bit of computation power but happens only once per session, and a stream of data-processing jobs whose results need to be sent back immediately (90 frames per second).
For the bigger task, should I fire up another websocket server that does the longer work and have the main websocket consume from it? Or use another process to do the work in a chained async function? And for the smaller tasks, what could go wrong if I did the same?
TLDR:
Async programming is hammering my brain.
Maintain a non-blocking session
that processes heavy work and light work simultaneously.
The light work should have near-zero latency.
The heavy work should finish as fast as possible without affecting the light work.
Thanks!!
I have basically the following code and want to embed it in an async coroutine:
def read_midi():
    midi_in = pygame.midi.Input(0)
    while True:
        if midi_in.poll():
            midi_data = midi_in.read(1)[0][0]
            # do something with midi_data, e.g. putting it in a Queue..
From my understanding, since pygame is not asynchronous, I have two options here: put the whole function in an extra thread, or turn it into an async coroutine like this:
async def read_midi():
    midi_in = pygame.midi.Input(1)
    while True:
        if not midi_in.poll():
            await asyncio.sleep(0.1)  # very bad!
            continue
        midi_data = midi_in.read(1)[0][0]
        # do something with midi_data, e.g. putting it in a Queue..
So it looks like I have to either keep the busy loop and put it in a thread, wasting lots of CPU time, or put it into the (fake) coroutine above and accept a trade-off between latency and CPU usage.
Am I wrong?
Is there a way to read MIDI without a busy loop?
Or even a way to await midi.Input.read?
It is true that pygame is not asynchronous, so you must either use a separate thread or an async coroutine to process the MIDI input.
A separate thread lets the rest of the program keep running while the MIDI input is read, but the busy loop inside it will still burn CPU.
An async coroutine with an asyncio.sleep(0.1) call reduces the CPU usage but delays the MIDI input by up to 100 ms. The trade-off is between responsiveness and resource usage: sleeping inside the while loop introduces latency, so it is a poor fit for anything that must react quickly.
Another option is to use a library that provides an asynchronous or callback-based interface for MIDI input, such as rtmidi-python or mido. These libraries may offer a way to wait for MIDI input without a blocking busy loop on the event loop.
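If you stay with pygame, the thread option does not have to starve the event loop: the busy loop can run in a worker thread and hand events to asyncio through a queue, so the async side truly awaits instead of polling. A minimal sketch, where blocking_midi_reader is a hypothetical stand-in for the pygame poll loop (real code would call midi_in.poll()/midi_in.read() where the time.sleep is):

```python
import asyncio
import threading
import time

def blocking_midi_reader(loop, events, stop):
    """Hypothetical stand-in for the pygame busy loop; it runs in a worker
    thread, so the busy wait never touches the event loop."""
    count = 0
    while not stop.is_set() and count < 3:
        time.sleep(0.01)  # pretend we waited for a MIDI event here
        count += 1
        # hand the event to the asyncio side thread-safely
        loop.call_soon_threadsafe(events.put_nowait, "note-%d" % count)

async def consume_midi():
    events = asyncio.Queue()
    stop = threading.Event()
    loop = asyncio.get_running_loop()
    t = threading.Thread(target=blocking_midi_reader,
                         args=(loop, events, stop), daemon=True)
    t.start()
    received = []
    for _ in range(3):
        received.append(await events.get())  # truly awaits -- no busy loop here
    stop.set()
    t.join()
    return received

result = asyncio.run(consume_midi())
print(result)
```

The key piece is loop.call_soon_threadsafe, which is the supported way to push work from a plain thread into a running event loop.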
Firstly, an apology: I'm pretty new to Python. I come from a Java/C# coding background. I'm loving Python's simplicity in many ways, but also finding some standards hard to pin down.
For instance, I've successfully managed to get a Discord Bot running. The async methods are working well. But I would like to schedule a job to run every (say) 30 minutes. However, when I type asyncio.run(job()), Python tells me that "run" is not an attribute of asyncio. I'm really not sure why it would say that. Heck, is asyncio even the "right" way to do this?
Is it possible the discord import has masked it in some way? I think I might need to get a book or something!
Again, thanks. I did try a search on this, but nothing came up!
on_ready is called when the discord bot starts up, so one way is to attach your job to it:
import asyncio
import os

import discord

client = discord.Client()

@client.event
async def on_ready():
    while True:
        await asyncio.sleep(30 * 60)  # every 30 minutes
        job()

client.run(os.environ.get('DISCORD_BOT_SECRET'))  # provide your API token here!!
asyncio.sleep is a non-blocking sleep -- if one were to use time.sleep here, the bot would wait for time.sleep to complete and would be unresponsive to any other messages coming in. What await asyncio.sleep does instead is yield control back to the event loop, which can take care of other bot functions. Only after 30 minutes does control return to on_ready.
Note that while your job runs it will block your bot, which is a problem for jobs taking longer than a few seconds. If your job is I/O-bound (e.g. fetching websites) you can use async I/O operations (such as aiohttp) to keep the bot responsive. If your job is CPU-bound you might have to use multiple processes, for example via subprocess.Popen if your job can be invoked as a terminal command.
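The difference between the two sleeps is easy to demonstrate with a toy event loop, no discord needed (the coroutine names here are made up for the demo):

```python
import asyncio

order = []

async def periodic_job():
    # non-blocking sleep: the event loop is free to run other work meanwhile,
    # just like the bot stays responsive during `await asyncio.sleep(30*60)`
    await asyncio.sleep(0.05)
    order.append("job")

async def handle_message():
    # stands in for any other bot event arriving while the job sleeps
    order.append("message handled")

async def main():
    await asyncio.gather(periodic_job(), handle_message())

asyncio.run(main())
print(order)  # ['message handled', 'job']
```

Had periodic_job used time.sleep(0.05) instead, handle_message could not have run until the sleep finished, and "job" would come first.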
The modern way to schedule a job every 30 minutes is using discord.ext.tasks:
import contextlib
import os

import discord
from discord.ext import tasks

client = discord.Client()

@tasks.loop(minutes=30)  # every 30 minutes
async def job():
    with contextlib.suppress(Exception):
        ...  # your code here

job.start()
client.run(os.environ.get('DISCORD_BOT_SECRET'))  # provide your API token here!!
The function job() will be called every 30 minutes by the Discord client's event loop. A couple notes here:
If job() throws an exception, the ENTIRE LOOP will stop. That's right: not just this invocation, but the whole task loop will exit. That is why I wrapped the body in a with-statement to catch all exceptions; contextlib.suppress(Exception) is equivalent to try: ... except Exception: pass.
While job() is running, the other parts of your bot are blocked. This is because async is not implemented using multiple threads but using coroutines in a single thread. A way around this could be to make your code async (e.g. using aiohttp for web requests) or to start another process using subprocess.Popen.
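The scheduling-plus-suppression behaviour of tasks.loop can be sketched framework-free in a few lines; every() and job() below are my own names, mimicking what the decorator does for you:

```python
import asyncio
import contextlib

runs = []

async def job():
    # toy job: the second run deliberately fails
    runs.append(len(runs))
    if len(runs) == 2:
        raise RuntimeError("boom")  # would end a bare scheduling loop

async def every(seconds, coro_func, repeats):
    # hypothetical helper mimicking tasks.loop; the suppress() is what keeps
    # one bad run from cancelling the whole schedule
    for _ in range(repeats):
        with contextlib.suppress(Exception):
            await coro_func()
        await asyncio.sleep(seconds)

asyncio.run(every(0.01, job, 3))
print(runs)  # [0, 1, 2] -- all three runs happened despite the exception
```

Without the suppress, the RuntimeError on the second run would propagate and the third run would never happen, which is exactly the failure mode described above.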
In my Python console app, I have a simple for loop that looks like this:
for package in page.packages:
    package.load()
    # do stuff with package
Every time it's called, package.load() makes a series of HTTP requests. Since page.packages typically contains thousands of packages, the load() call is becoming a substantial bottleneck for my app.
To speed things up, I've thought about using the multiprocessing module to do parallelization, but that would still waste a lot of resources because the threads are network-bound, not CPU-bound: instead of having 1 thread waiting around doing nothing, you'd have 4 of them. Is it possible to use asynchrony instead to somehow just use one/a few threads, but make sure they're never waiting around for the network?
asyncio is an excellent fit for this, but you will need to convert your HTTP loading code Package.load to async using something like aiohttp. For example:
async def load(self):
    async with self._session.get(self.uri, params=params) as resp:
        data = await resp.read()  # etc.
The sequential loop you've had previously would be expressed as:
async def load_all_serial(page):
    for package in page.packages:
        await package.load()
But now you also have the option to start the downloads in parallel:
async def load_all_parallel(page):
    # just create the coroutine objects, do not await them yet
    coros = [package.load() for package in page.packages]
    # now let them run, and wait until all have completed
    await asyncio.gather(*coros)
Calling either of these async functions from synchronous code is as simple as:
loop = asyncio.get_event_loop()
loop.run_until_complete(load_all_parallel(page))
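With thousands of packages you will probably also want to cap how many requests are in flight at once, which asyncio.Semaphore handles neatly. A sketch under assumptions: fake_load stands in for Package.load, and the 0.01 s sleep simulates a network round-trip:

```python
import asyncio

async def fake_load(package_id, sem, loaded):
    # hypothetical stand-in for package.load(); the semaphore caps how many
    # "requests" run concurrently, so thousands of packages don't open
    # thousands of connections at the same time
    async with sem:
        await asyncio.sleep(0.01)  # simulated network round-trip
        loaded.append(package_id)

async def load_all_bounded(package_ids, limit=10):
    sem = asyncio.Semaphore(limit)
    loaded = []
    await asyncio.gather(*(fake_load(p, sem, loaded) for p in package_ids))
    return loaded

result = asyncio.run(load_all_bounded(range(25)))
print(len(result))  # 25 -- all loads completed, at most 10 at a time
```

Completion order is not guaranteed under the semaphore, so collect results into a structure keyed by package if order matters.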
I have a python server that is available through a websocket endpoint.
During serving a connection, it also communicates with some backend services. This communication is asynchronous and may trigger the send() method of the websocket.
When a single client is served, it seems to work OK. However, when multiple clients are served in parallel, some of the routines that handle the connections occasionally get stuck. More precisely, they seem to block in the recv() method.
The actual code is somewhat complex and the issue is slightly more complicated than I have described; nevertheless, here is a minimal skeleton that sketches the way I use the websockets:
import asyncio
import logging
import random
import time

logger = logging.getLogger(__name__)

class MinimalConversation(object):
    def __init__(self, ws, worker_sck, messages, should_continue_conversation, should_continue_listen):
        self.ws = ws
        self.messages = messages
        self.worker_sck = worker_sck
        self.should_continue_conversation = should_continue_conversation
        self.should_continue_listen = should_continue_listen

    async def run_conversation(self):
        serving_future = asyncio.ensure_future(self.serve_connection())
        listening_future = asyncio.ensure_future(self.handle_worker())
        await asyncio.wait([serving_future, listening_future], return_when=asyncio.ALL_COMPLETED)

    async def serve_connection(self):
        while self.should_continue_conversation():
            await self.ws.recv()
            logger.debug("Message received")
            self.sleep_randomly(10, 5)
            await self.worker_sck.send(b"Dummy")

    async def handle_worker(self):
        while self.should_continue_listen():
            self.sleep_randomly(50, 40)
            await self.worker_sck.recv()
            await self.ws.send(self.messages.pop())

    def sleep_randomly(self, mean, dev):
        delta = random.randint(1, dev) / 1000
        if random.random() < .5:
            delta *= -1
        time.sleep(mean / 1000 + delta)
Obviously, in the real code I do not sleep for random intervals and don't use given list of messages but this sketches the way I handle the websockets. In the real setting, some errors may occur that are sent over the websocket too, so parallel sends() may occur in theory but I have never encountered such a situation.
The code is run from a handler function that is passed as a parameter to websockets.serve(); it initializes the MinimalConversation object and calls its run_conversation() method.
My questions are:
Is there something fundamentally wrong with such usage of the websockets?
Are concurrent calls of the send() methods dangerous?
Can you suggest some good practices regarding usage of websockets and asyncio?
Thank you.
recv yields back only when a message is received, and it seems there are two connections awaiting messages from each other, so you may end up in a situation similar to a deadlock, where each side is waiting for the other's message and neither can send anything. You should probably rethink the overall algorithm to be safe from this.
And, of course, try adding more debug output and see what really happens.
are concurrent calls of the send() methods dangerous?
If by concurrent you mean in the same thread but in independently scheduled coroutines, then parallel send is just fine. But be careful with "parallel" recv on the same connection, because the order of coroutine scheduling can be far from obvious, and it is what decides which call to recv gets a message first.
Can you suggest some good practices regarding usage of websockets and asyncio?
In my experience, the easiest way is to create a dedicated task per incoming connection that repeatedly calls recv on that connection until it is closed. You can store the connection somewhere and delete it in a finally block; other coroutines can then use it to send something.
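A minimal shape of that pattern, with a toy Connection class standing in for a real websocket (this is not the websockets API, just an illustration of "one dedicated reader task, sends from anywhere"):

```python
import asyncio

class Connection:
    """Toy stand-in for a websocket: recv() pops from an internal queue."""
    def __init__(self):
        self._incoming = asyncio.Queue()
        self.sent = []

    async def recv(self):
        return await self._incoming.get()

    async def send(self, msg):
        self.sent.append(msg)

async def reader(conn, handled):
    # the dedicated receive task: the ONLY place recv() is called, so there
    # is never any ambiguity about which coroutine gets a message
    while True:
        msg = await conn.recv()
        if msg is None:  # sentinel meaning the connection closed
            break
        handled.append(msg)

async def main():
    conn = Connection()
    handled = []
    task = asyncio.ensure_future(reader(conn, handled))
    # any other coroutine may send on the same connection concurrently...
    await conn.send("status update")
    # ...while incoming traffic flows only through the reader task
    await conn._incoming.put("hello")
    await conn._incoming.put(None)
    await task
    return handled, conn.sent

handled, sent = asyncio.run(main())
print(handled, sent)  # ['hello'] ['status update']
```

Because only the reader task ever calls recv, the scheduling-order pitfall above simply cannot occur, and sends remain safe from any coroutine.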
I've read the docs. I've played with examples. But I'm still unable to grasp what exactly asynchronous means: when is it useful, and where is the magic that lots of people seem so crazy about?
If it is only about avoiding waits on I/O, why not simply run in threads? Why is Deferred needed?
I think I'm missing some fundamental knowledge about computing, hence these questions. If so, what is it?
like you're five... ok: threads bad, async good!
now, seriously: threads incur overhead - both in locking and switching of the interpreter, and in memory consumption and code complexity. when your program is IO-bound and does a lot of waiting for other services (APIs, databases) to return a response, you're basically idling and wasting resources.
the point of async IO is to mitigate the overhead of threads while keeping the concurrency, and keeping your program simple, avoiding deadlocks and reducing complexity.
think for example about a chat server. you have thousands of connections on the server, and you want some people to receive some messages based on which room they are. doing this with threads will be much more complicated than doing this the async way.
re deferred - it's just a way of simplifying your code: instead of giving every function a callback to invoke when the operation it's waiting for is ready.
another note - if you want a much simpler and elegant async IO framework, try tornado, which is basically an async web server, with async http client and a replacement for deferred. it's very nicely written and can be used as a general purpose async IO framework.
see http://tornadoweb.org