Python asyncio producer-consumer model stops working

For two projects I rely on an asyncio producer-consumer model to work through some tasks.
The producers work off messages that come in from either MQTT or ZeroMQ.
This is the code of interest (running Python 3.7):
async def Producer(client, topic_filter, queue):
    async with client.filtered_messages(topic_filter) as messages:
        async for message in messages:
            message = message.payload.decode()
            await queue.put(message)
            OutputText('Added element to queue.')

async def Consumer(client, queue: asyncio.Queue):
    while True:
        item = await queue.get()
        await DoTask(client, item)
        await asyncio.sleep(timedelay)
        queue.task_done()
When I first start up this code it works as expected. But after operating for some time I find that the consumer stops working. When this happens I can still send messages to the script: the log file shows the printout that the element was added to the queue, but the consumer isn't triggered and remains idle.
I found that this normally happens when the machine it is running on has to use some swap memory or hits 100% CPU usage. Therefore I am guessing that the consumer no longer has a proper connection to the queue, but I could be wrong.
Since I don't get any errors when this happens it is very hard to debug. Any ideas on how to debug this would be great.
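For what it's worth, one way to instrument this (just a sketch on top of the code above, not something already in the script; the watchdog task and its interval are my own additions) is to enable asyncio's debug mode and log the queue depth from a separate task, so the log shows whether the consumer coroutine is still being scheduled:

import asyncio
import logging

async def Watchdog(queue: asyncio.Queue, interval: float = 30.0):
    # Periodically log the queue depth; if it keeps growing while the
    # consumer logs nothing, the consumer task is stuck or no longer scheduled.
    while True:
        logging.info("queue depth: %d", queue.qsize())
        await asyncio.sleep(interval)

# At startup, next to the Producer and Consumer tasks:
# loop = asyncio.get_event_loop()
# loop.set_debug(True)  # reports callbacks that block the loop for too long
# loop.create_task(Watchdog(queue))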
Cheers,
Hilbert


How to synchronize access inside async for?

I found this library for asynchronously consuming Kafka messages: https://github.com/aio-libs/aiokafka
It gives this code example:
from aiokafka import AIOKafkaConsumer
import asyncio

async def consume():
    consumer = AIOKafkaConsumer(
        'redacted',
        bootstrap_servers='redacted',
        auto_offset_reset="earliest",
        # group_id="my-group"
    )
    # Get cluster layout and join group `my-group`
    await consumer.start()
    try:
        # Consume messages
        async for msg in consumer:
            print("consumed: ", msg.topic, msg.partition, msg.offset,
                  msg.key, msg.value, msg.timestamp)
    finally:
        # Will leave consumer group; perform autocommit if enabled.
        await consumer.stop()

asyncio.run(consume())
I would like to find the biggest Kafka message using this code. So, inside the async for loop I need to do max_size = max(max_size, len(msg.value)). But I think it won't be thread-safe, and that I need to lock access to it?
try:
    max_size = -1
    # Consume messages
    async for msg in consumer:
        max_size = max(max_size, len(msg.value))  # do I need to lock this code?
How do I do this in Python? I've checked out this page: https://docs.python.org/3/library/asyncio-sync.html and I'm confused because those synchronization primitives are not thread-safe. So I can't use them in a multithreaded context? I'm really confused. I come from a Java background and need to write this script, so pardon me for not having read all the asyncio books out there.
Is my understanding correct that the body of the async for loop is a continuation that may be scheduled on a separate thread when the asynchronous operation completes?
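For reference, a minimal sketch of what guarding the update with asyncio.Lock (from the page linked above) would look like, under the assumption that everything runs as coroutines on the single event loop started by asyncio.run; the lock coordinates coroutines, not OS threads, and whether it is needed here at all is exactly what the question asks:

import asyncio

async def consume_and_track(consumer):
    # Hypothetical wrapper around the consumer from the aiokafka example above.
    max_size = -1
    lock = asyncio.Lock()  # synchronizes coroutines on one event loop, not threads
    async for msg in consumer:
        async with lock:  # only relevant if several coroutines update max_size
            max_size = max(max_size, len(msg.value))
    return max_size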

What's wrong in my code for scheduled messages with discord.py on Replit

So I've been trying to make my Discord bot send a message every day at 12:30 UTC, but I can't seem to get my code to work. I'm not sure if it's not working because of incorrect code, or because it's on Replit, or whatever else the issue could be, as I get no errors from this. It just sends the message once it loads online and that's all.
import datetime, asyncio
import os
from discord.ext import commands

bot = commands.Bot(command_prefix="+")

@bot.event
async def on_ready():
    await schedule_daily_message()

async def schedule_daily_message():
    now = datetime.datetime.now()
    then = now + datetime.timedelta(days=1)
    then.replace(hour=12, minute=30)
    wait_time = (then - now).total_seconds()
    await asyncio.sleep(wait_time)
    channel = bot.get_channel(Channel_id)
    await channel.send("Enemies Spawned!")

bot.run(os.getenv('TOKEN'))
await asyncio.sleep is non-blocking: your script will continue executing beyond that statement. You will need to use time.sleep, which will block all execution of code until the timer has run out.
See here for a more in-depth explanation and how to use this in functions:
asyncio.sleep() vs time.sleep()
A way to implement a function that posts a message to a channel after a delay could look like this:
import time

async def send_after_delay(d, c, m):
    time.sleep(d)  # wait d seconds, then send message m to channel c
    await c.send(m)
Calling this function asynchronously allows you to continue with code execution beyond it, while still waiting out the delay after the call before the message is sent.
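For completeness, a non-blocking variant (not part of the answer above, and the argument names are placeholders) keeps the delay inside the coroutine with asyncio.sleep, so the rest of the bot's event loop stays responsive while the send is pending:

import asyncio

async def send_after_delay_nonblocking(delay, channel, message):
    # Suspends only this coroutine; other tasks keep running in the meantime.
    await asyncio.sleep(delay)
    await channel.send(message)

# Scheduled without being awaited, e.g. from another coroutine:
# asyncio.create_task(send_after_delay_nonblocking(60, channel, "Enemies Spawned!"))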

How to process WebSocket messages in parallel using Django Channels?

We're getting started with Django Channels and are struggling with the following use case:
Our app receives multiple requests from a single client (another server) in a short time. Creating each response takes a long time. The order in which responses are sent to the client doesn't matter.
We want to keep an open WebSocket connection to reduce connection overhead for sending many requests and responses from and to the same client.
Django Channels seems to process messages on the same WebSocket connection strictly in order, and won't start processing the next frame before the previous one has been responded to.
Consider the following example:
Example
Server-side
import asyncio
from channels.generic.websocket import AsyncWebsocketConsumer

class QuestionConsumer(AsyncWebsocketConsumer):
    async def websocket_connect(self, event):
        await self.accept()

    async def complicated_answer(self, question):
        await asyncio.sleep(3)
        return {
            "What is the Answer to Life, The Universe and Everything?": "42",
            "Why?": "Because.",
        }.get(question, "Don't know")

    async def receive(self, text_data=None, bytes_data=None):
        # while awaiting below, we should start processing the next WS frame
        answer = await self.complicated_answer(text_data)
        await self.send(answer)
asgi.py:
from django.urls import re_path
from channels.routing import ProtocolTypeRouter, URLRouter

application = ProtocolTypeRouter(
    {
        "websocket": URLRouter([
            re_path(r"^questions", QuestionConsumer.as_asgi(), name="questions"),
        ])
    }
)
Client-side
import asyncio
import websockets
from time import time

async def main():
    async with websockets.connect("ws://0.0.0.0:8000/questions") as ws:
        tasks = []
        for m in [
            "What is the Answer to Life, The Universe and Everything?",
            "Why?"
        ]:
            tasks.append(ws.send(m))
        # send all requests (without waiting for response)
        time_before = time()
        await asyncio.gather(*tasks)
        # wait for responses
        for t in tasks:
            print(await ws.recv())
            print("{:.1f} seconds since first request".format(time() - time_before))

asyncio.get_event_loop().run_until_complete(main())
Result
Actual
42
3.0 seconds since first request
Because.
6.0 seconds since first request
Desired
42
3.0 seconds since first request
Because.
3.0 seconds since first request
In other words, we would like the event loop to switch between async tasks not only for multiple consumers, but also for all tasks handled by the same consumer. Is this possible or is there a workaround we are overlooking? Have you used Django Channels for similar challenges and how did you solve them?
The consumer's receive function is called sequentially for each incoming WebSocket message, and when the await in the first receive is reached, the receive method has not yet been called for the second message, hence switching context to a second coroutine is not yet possible. I couldn't find a source for this, but I'm guessing that this is part of the ASGI protocol itself. For many use cases, handling WebSocket messages strictly in the order of receiving is probably desired.
The solution to handle messages asynchronously is to not send the response from the receive method, but instead send the response from a coroutine scheduled through loop.create_task.
Scheduling the long-running coroutine which generates the response allows receive to complete, and the next receive to begin. Once the second message's response generation has been scheduled, two coroutines will have been scheduled, and the interpreter can switch contexts to execute them asynchronously.
For the example in the question, this is the solution I found:
class QuestionConsumer(AsyncWebsocketConsumer):
    async def complicated_answer(self, question):
        await asyncio.sleep(3)
        answer = {
            "What is the Answer to Life, The Universe and Everything?": "42",
            "Why?": "Because.",
        }.get(question, "Don't know")
        # instead of returning the answer, send it directly to client as a response
        await self.send(answer)

    async def receive(self, text_data=None, bytes_data=None):
        # instead of awaiting, schedule the coroutine
        loop = asyncio.get_running_loop()
        loop.create_task(
            self.complicated_answer(text_data)
        )
The output of this altered consumer matches the desired output given by the question. Note that responses may be returned out of order, and clients are responsible for matching requests to responses.
Note that for Python versions <3.7, get_event_loop should be used instead of get_running_loop.
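On those older versions the scheduling lines in receive would look roughly like this (same idea, only the loop accessor differs):

# Python < 3.7: get_event_loop instead of get_running_loop
loop = asyncio.get_event_loop()
loop.create_task(self.complicated_answer(text_data))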

Microsoft Teams - infinite loop in bot main

I have developed a Teams bot that runs an infinite loop at startup in order to send proactive messages to users.
async def job():
    i = 60
    await asyncio.sleep(i)
    await _create_file_of_the_day()
    await _send_question_of_the_day()
    await job()

if __name__ == "__main__":
    try:
        loop = asyncio.get_event_loop()
        t1 = loop.create_task(job())
        t2 = loop.create_task((web.run_app(APP, host="localhost", port=CONFIG.PORT)))
        asyncio.gather(t1, t2)
        loop.run_forever()
    except Exception as error:
        raise error
This works locally with python app.py, but when I upload the bot to Azure and test it online, the infinite loop is not started and so I find it impossible to proactively send messages to users.
The two methods called do work: the first creates a file on Azure and the second creates two questions using the contents of the file, which should be sent proactively to all members of the channel.
Does anyone know how to help me? I need to send messages to users based on a time delay, not in response to their actions. The scheduling is not constant; for example, I want to send messages only on working days and not on holidays.
Thanks to all
UPDATE
I have tried this second solution suggested in the comments, but the result is always the same. Locally the application behaves correctly, but on the Azure cloud the routine that should loop seems not to be triggered.
async def job():
    i = 60
    await asyncio.sleep(i)
    await _create_file_of_the_day()
    await _send_question_of_the_day()
    await job()

async def main():
    runner = aiohttp.web.AppRunner(APP)
    await runner.setup()
    site = web.TCPSite(runner, host='localhost', port=CONFIG.PORT)
    await site.start()
    asyncio.create_task(job())
    while True:
        await asyncio.sleep(3600)

if __name__ == "__main__":
    try:
        asyncio.run(main())
    except Exception as error:
        raise error
Since I could not use the loop to schedule messages, the problem was solved by using an Azure Function with a timer trigger. This function calls an endpoint created inside the bot that, each time it is called, executes the job() method.
These two links may be useful to understand how to create an endpoint and how to query it. In the code samples, the query is triggered by a click on the link, which can easily be replaced by a GET request within the code.
Proactive message sample with endpoint creation
Article with explanation of proactive messages
The sample code is in Python, but by browsing the Git folders you can find code in other programming languages.
For developing the Azure Function I found this series of three videos on YouTube very useful:
Useful video on development of a timer trigger function in Python
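A minimal sketch of what such an endpoint inside the bot could look like (the route path and the aiohttp wiring are my assumptions, not taken from the linked samples):

from aiohttp import web

async def run_job(request: web.Request) -> web.Response:
    # Called by the Azure Function timer trigger via a plain GET request.
    await _create_file_of_the_day()
    await _send_question_of_the_day()
    return web.Response(text="job executed")

# Registered on the bot's existing aiohttp application:
# APP.router.add_get("/api/run-job", run_job)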

Using a synchronous library with asynchronous Discord.py

I am working on a bot that streams posts from the Steem blockchain (using the synchronous beem library) and sends posts that fulfil certain criteria to a Discord channel (using the asynchronous Discord.py library). This is my (simplified) code:
bot = commands.Bot(command_prefix="!")

async def send_discord(msg):
    await bot.wait_until_ready()
    await bot.send_message(bot.get_channel("mychannelid"), msg)

async def scan_post(post):
    """Scan queued Comment objects for defined patterns"""
    post.refresh()
    if post["author"] == "myusername":
        await loop.create_task(send_discord("New post found"))

async def start_blockchain():
    stream = map(blockchain.stream(opNames=["comment"]))
    for post in stream:
        await loop.create_task(scan_post(post))

if __name__ == '__main__':
    while True:
        loop.create_task(start_blockchain())
        try:
            loop.run_until_complete(bot.start(TOKEN))
        except Exception as error:
            bot.logout()
            logger.warning("Bot restarting " + repr(error))
Before I implemented discord.py I would just call the synchronous function scan_post(post) and it worked just fine, but now with the asynchronous implementation the posts are not processed fast enough and the stream has a rapidly increasing delay. If I make scan_post(post) a synchronous function, the processing time is fine, but the Discord websocket closes (or does not even open) and the bot goes offline. How can I solve this in a simple way (without rewriting the beem library)?
I solved the problem: I run the beem stream in its own thread and the asynchronous functions in a second thread. With the janus library I can then add objects from the beem thread to a queue that is processed by the asynchronous thread.
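A minimal sketch of that pattern (my own illustration of how janus bridges a blocking producer thread and an asyncio consumer; the names and items are placeholders, not the bot's actual code):

import asyncio
import threading

import janus

def blocking_producer(sync_q):
    # Runs in a plain thread (like the beem stream) and pushes items synchronously.
    for item in ["post-1", "post-2", "post-3"]:
        sync_q.put(item)
    sync_q.put(None)  # sentinel telling the consumer to stop

async def async_consumer(async_q):
    # Runs on the asyncio side (next to the Discord client).
    while True:
        item = await async_q.get()
        if item is None:
            break
        print("processing", item)

async def main():
    queue = janus.Queue()  # must be created while the event loop is running
    thread = threading.Thread(target=blocking_producer, args=(queue.sync_q,))
    thread.start()
    await async_consumer(queue.async_q)
    thread.join()
    queue.close()
    await queue.wait_closed()

asyncio.run(main())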
