Callback function timeout/disruption in Google Pub/Sub asynchronous pull subscriber

I have a subscriber application that pulls from a Google Cloud Pub/Sub subscription asynchronously using the google-cloud-pubsub Python library.
I am running into intermittent issues where my callback function doesn't finish running or is interrupted. Unfortunately I don't get any errors; I only know this happens because the callback does not finish writing data to an external source.
For example, in the following sample code:
import time
from google.cloud import pubsub_v1
# `project` and `subscription_name` are assumed to be defined elsewhere.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(
    project, subscription_name)
def callback(message):
    print('Received message: {}'.format(message))
    message.ack()
# Limit the subscriber to only have ten outstanding messages at a time.
flow_control = pubsub_v1.types.FlowControl(max_messages=10)
subscriber.subscribe(
    subscription_path, callback=callback, flow_control=flow_control)
# The subscriber is non-blocking, so we must keep the main thread from
# exiting to allow it to process messages in the background.
print('Listening for messages on {}'.format(subscription_path))
while True:
    time.sleep(60)
The code in the callback function may sometimes take a while, and in some cases I have noticed that it does not finish executing and seems to be disrupted by something.
Could that ever happen? Is there a timeout on this function?
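One way to make a silent failure visible is to wrap the callback so that any exception is logged before the message is released. The sketch below is only an illustration, not the library's own mechanism: `log_failures`, `FakeMessage`, and `write_to_store` are hypothetical names, and `FakeMessage` merely stands in for a Pub/Sub message so the example runs without the client library. It does not settle whether a timeout exists; it only helps surface an exception that would otherwise go unnoticed.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("subscriber")


def log_failures(callback):
    """Wrap a callback so exceptions are logged instead of disappearing."""
    @functools.wraps(callback)
    def wrapper(message):
        try:
            callback(message)
        except Exception:
            # logger.exception records the full traceback.
            logger.exception("callback failed for %r", message)
            message.nack()  # release the message so it can be redelivered
    return wrapper


class FakeMessage:
    """Hypothetical stand-in exposing only ack()/nack() for this demo."""
    def __init__(self, data):
        self.data = data
        self.acked = False
        self.nacked = False

    def ack(self):
        self.acked = True

    def nack(self):
        self.nacked = True


def write_to_store(data):
    """Hypothetical external write that fails for empty payloads."""
    if not data:
        raise ValueError("empty payload")


@log_failures
def callback(message):
    write_to_store(message.data)
    message.ack()  # acknowledge only after the write has finished


good, bad = FakeMessage(b"payload"), FakeMessage(b"")
callback(good)
callback(bad)
print(good.acked, bad.nacked)
```

With the real client, the same decorator could be applied to the function passed to subscriber.subscribe(...).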

Related

Asyncio - run tasks cyclically and politely stop them with ctrl+C

I am writing a pyModbus server with asyncio, based on this example.
Alongside the server I've got a serial device which I'm communicating with and a server updating task.
One task should check the status of the serial device every 500ms.
The server updating task should check if there are any changes in the status of the serial device and update the info on the server. Moreover, if there is a request waiting on the server it should call another task which will send necessary info to the serial device.
My three questions are:
How should I stop the server politely? For now the app runs only in a console, so it is stopped with Ctrl+C. How can I stop the server without causing an avalanche of errors?
How can I implement tasks that are executed cyclically (say I want to refresh the server data every 500 ms)? I've found the aiocron module, but as far as I can tell its functionality is a bit limited, as it is intended just for calling functions at intervals.
How can I politely cancel all the tasks (the infinite, cyclically running ones) before stopping the server when closing the app?
Thanks!
EDIT:
Speaking of running cyclical tasks and cancelling them: is this a proper way to do that? It doesn't raise any errors, but does it clean everything up correctly? (I put this sketch together from a dozen questions on Stack Overflow, so I am not sure it makes sense.)
import asyncio
async def periodic():
    try:
        while True:
            print('periodic')
            await asyncio.sleep(1)
    except asyncio.CancelledError as ex:
        print('task1', type(ex))
        raise
async def periodic2():
    try:
        while True:
            print('periodic2')
            await asyncio.sleep(0.5)
    except asyncio.CancelledError as ex:
        print('task2', type(ex))
        raise
async def main():
    tasks = []
    task = asyncio.create_task(periodic())
    tasks.append(task)
    task2 = asyncio.create_task(periodic2())
    tasks.append(task2)
    for task in tasks:
        await task
if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        pass
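A possible variation on the sketch above, offered as one option rather than the definitive pattern: keep the periodic tasks in a list, cancel them explicitly in a `finally` block, and wait for them with `asyncio.gather(..., return_exceptions=True)` so each task can finish unwinding. The names `make_counter` and `run_time` are hypothetical, and the short `asyncio.sleep(run_time)` only stands in for the long-running server coroutine so that the example terminates.

```python
import asyncio


async def periodic(interval, action):
    """Call action() every `interval` seconds until the task is cancelled."""
    try:
        while True:
            action()
            await asyncio.sleep(interval)
    except asyncio.CancelledError:
        # Per-task cleanup would go here; re-raise so the task ends as cancelled.
        raise


def make_counter(counts, key):
    """Return a callable that increments counts[key] (demo action only)."""
    def bump():
        counts[key] += 1
    return bump


async def main(run_time=0.3):
    counts = {"status": 0, "update": 0}
    tasks = [
        asyncio.create_task(periodic(0.05, make_counter(counts, "status"))),
        asyncio.create_task(periodic(0.1, make_counter(counts, "update"))),
    ]
    try:
        # Stand-in for the long-running server coroutine.
        await asyncio.sleep(run_time)
    finally:
        for task in tasks:
            task.cancel()
        # Wait until every task has unwound; CancelledError is collected
        # as a result instead of being raised here.
        await asyncio.gather(*tasks, return_exceptions=True)
    return counts, tasks


counts, tasks = asyncio.run(main())
print(counts)
```

Because the cancellation lives in `finally`, the same cleanup runs whether the server coroutine returns normally or is itself cancelled. Catching KeyboardInterrupt around asyncio.run(), as in the question's code, would still be needed to suppress the traceback.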

Paho MQTT (Python) - loop_start() not working

I'm writing an MQTT client which simply connects to the broker, publishes a message and then disconnects. Here's the code:
import paho.mqtt.client as paho
# `topic`, `payload`, `host` and `port` are assumed to be defined elsewhere.
def on_connect_v5(client, userdata, flags, rc, properties):
    print('connected')
    client.publish(topic, payload, 0)
def on_publish(client, userdata, mid):
    print(f'mid: {mid}')
    client.disconnect()
client = paho.Client(protocol=paho.MQTTv5)
client.on_connect = on_connect_v5
client.on_publish = on_publish
client.connect(host, port, 60)
client.loop_start()
# client.loop_forever()
My question: when I use loop_start(), the client doesn't seem to connect successfully, but loop_forever() works. Have I done something wrong with loop_start(), and what is the proper way to use it?
By the way, I have also tried the paho.mqtt.publish module and always get a "Socket timed out" error. I'd appreciate it if someone could explain that as well.

The difference is that loop_forever() blocks the program. loop_start() only starts a daemon thread and does not block, so your program continues. In the code you show, this means the program exits.
You can read more here: https://github.com/eclipse/paho.mqtt.python#network-loop
Calling loop_start() once, before or after connect*(), runs a thread in the background to call loop() automatically. This frees up the main thread for other work that may be blocking.
loop_forever(). This is a blocking form of the network loop and will not return until the client calls disconnect(). It automatically handles reconnecting.

Your main thread does not wait for loop_start() because the loop runs in a daemon thread. Daemon threads do not keep the program alive until they finish their work. When your main thread is done, the process exits, and that also kills the loop_start() thread. If your main thread has an infinite loop, or otherwise runs long enough, loop_start() works perfectly.
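The daemon-thread behaviour described in these answers can be illustrated with the standard library alone. In the sketch below, `network_loop` is a hypothetical stand-in for the background thread that loop_start() creates, and the `threading.Event` plays the role of a flag that an on_publish callback could set; no MQTT traffic is involved.

```python
import threading
import time


def network_loop(published):
    """Stand-in for the background network thread started by loop_start()."""
    time.sleep(0.1)   # pretend connecting and publishing take a moment
    published.set()   # what an on_publish callback could do


published = threading.Event()
worker = threading.Thread(target=network_loop, args=(published,), daemon=True)
worker.start()

# Without this wait the main thread would reach the end of the script and
# the daemon thread would be killed before it finished.
finished = published.wait(timeout=5)
print("published" if finished else "timed out")
```

In the Paho client, a comparable approach would be to set such an event inside on_publish and wait on it in the main thread before calling loop_stop().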

How to run a function asynchronously as “fire and forget”?

I have a Python Kafka consumer application where I consume the messages and then call an external webservice synchronously. The webservice takes a minute to process the message and send the response.
Is there a way to consume the message, send the request to the Web service and consume the next message without waiting for the response?
from kafka import KafkaConsumer
from json import loads
consumer = KafkaConsumer(
    'spring_test',
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='earliest',
    enable_auto_commit=True,
    group_id='my-group',
    value_deserializer=lambda x: loads(x.decode('utf-8')))
This is how I wait for the messages and send an external web request:
def consume_msgs():
    for message in consumer:
        message = message.value
        send('{}'.format(message))
consume_msgs()
The function send() takes one minute before I get the response. I want to start consuming the next message in the meantime, asynchronously, but I don't know where to start.
import requests
def send(pload):
    r = requests.post('someurl', data=pload)
    print(r)

Not sure if this is what you need, but could you just spin each call to send() out into a thread? Something like the code below. This way the for loop will continue without waiting for send() to return. You may have to throttle the number of threads somehow if you are consuming data far quicker than you are processing it.
from threading import Thread
def consume_msgs():
    for message in consumer:
        message = message.value
        Thread(target=send, args=('{}'.format(message),)).start()
consume_msgs()
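One way to add the throttling mentioned in this answer is a thread pool with a fixed number of workers. This is a sketch under assumptions: `send` here is a hypothetical stand-in that sleeps instead of calling the web service, the consumer is replaced by a plain list, and `max_workers=4` is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def send(payload):
    """Hypothetical stand-in for the slow web service call."""
    time.sleep(0.05)
    return "sent " + payload


def consume_msgs(messages, max_workers=4):
    # At most `max_workers` calls to send() run at the same time;
    # further submissions wait in the pool's internal queue.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(send, str(m)) for m in messages]
    # Leaving the `with` block waits for all submitted work to finish.
    return [f.result() for f in futures]


results = consume_msgs(["a", "b", "c"])
print(results)
```

The pool limits how many requests run concurrently, not how many are queued, so a consumer that is much faster than the web service would still accumulate pending work in memory.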

Python thread retry every X seconds up to Y minutes

I'm trying to implement the client-server architecture in Python, where I have:
Server application
List of clients, who can subscribe to updates via API (sending POST requests to /subscribe endpoint).
It works fine. On the server side, I have a list of subscribers' URLs.
The main idea is to send requests from the server to all subscribed clients every X seconds (something similar to a monitoring system).
I'm trying to do this part using threads:
import threading
import time
import requests
class Monitor(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)  # daemon thread: does not keep the process alive
    def send_notifications(self, subscribers):
        for subscriber in subscribers:
            requests.post(subscriber["url"], json=subscriber["data"], timeout=0.5)
    def run(self):
        subscribers = get_subscribers()  # getting list of subscribers via API call.
        while True:
            self.send_notifications(subscribers)
            time.sleep(X)  # X: notification interval in seconds
More or less it works, but I need to improve it a little.
The expected behaviour is:
The server should send notifications every X seconds to each subscribed client. If sending the notification to some client fails for Y minutes (5 minutes, for example), it should unsubscribe this unresponsive client.
Are there any best practices for this?
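The "unsubscribe after Y minutes of failures" rule can be sketched by remembering when each subscriber's current run of failures began. This is one possible approach, not an established best practice: `FailureTracker` and its `record` method are hypothetical names, and the injectable `clock` exists only so the behaviour can be checked without waiting.

```python
import time


class FailureTracker:
    """Track how long each subscriber has been failing continuously."""

    def __init__(self, max_failure_seconds, clock=time.monotonic):
        self.max_failure_seconds = max_failure_seconds
        self.clock = clock
        self.first_failure = {}  # url -> start time of the current failing streak

    def record(self, url, ok):
        """Record one attempt; return True if the subscriber should be dropped."""
        if ok:
            # Any success ends the failing streak.
            self.first_failure.pop(url, None)
            return False
        started = self.first_failure.setdefault(url, self.clock())
        return self.clock() - started >= self.max_failure_seconds


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
tracker = FailureTracker(max_failure_seconds=300, clock=lambda: now[0])
history = []
for t, ok in [(0, False), (200, True), (250, False), (549, False), (550, False)]:
    now[0] = t
    history.append(tracker.record("http://client.example/notify", ok))
print(history)
```

Inside send_notifications(), each requests.post call could be wrapped in a try/except for requests.RequestException, passing the outcome to record() and removing the subscriber when it returns True.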

How to avoid high cpu usage?

I created a zmq_forwarder.py that runs separately and passes messages from the app to a SockJS connection. I'm currently working on how a Flask app could receive a message from SockJS via ZMQ. I'm pasting the contents of my zmq_forwarder.py below. I'm new to ZMQ and I don't know why it uses 100% CPU every time I run it.
import zmq
# Prepare our context and sockets
context = zmq.Context()
receiver_from_server = context.socket(zmq.PULL)
receiver_from_server.bind("tcp://*:5561")
forwarder_to_server = context.socket(zmq.PUSH)
forwarder_to_server.bind("tcp://*:5562")
receiver_from_websocket = context.socket(zmq.PULL)
receiver_from_websocket.bind("tcp://*:5563")
forwarder_to_websocket = context.socket(zmq.PUSH)
forwarder_to_websocket.bind("tcp://*:5564")
# Process messages from both sockets
# We prioritize traffic from the server
while True:
    # forward messages from the server
    while True:
        try:
            message = receiver_from_server.recv(zmq.DONTWAIT)
        except zmq.Again:
            break
        print("Received from server: ", message)
        # recv() returns bytes on Python 3, so forward with send()
        forwarder_to_websocket.send(message)
    # forward messages from the websocket
    while True:
        try:
            message = receiver_from_websocket.recv(zmq.DONTWAIT)
        except zmq.Again:
            break
        print("Received from websocket: ", message)
        forwarder_to_server.send(message)
As you can see, I've set up four sockets. The app connects to port 5561 to push data to ZMQ, and to port 5562 to receive from ZMQ (although I'm still figuring out how to actually set it up to listen for messages sent by ZMQ). On the other hand, SockJS receives data from ZMQ on port 5564 and sends data to it on port 5563.
I've read that zmq.DONTWAIT makes receiving messages asynchronous and non-blocking, so I added it.
Is there a way to improve the code so that I don't overload the CPU? The goal is to be able to pass messages between the Flask app and the websocket using ZMQ.

You are polling your two receiver sockets in a tight loop, without any blocking (zmq.DONTWAIT), which will inevitably max out the CPU.
Note that there is some support in ZMQ for polling multiple sockets in a single thread - see this answer. I think you can adjust the timeout in poller.poll(millis) so that your code only uses lots of CPU if there are lots of incoming messages, and idles otherwise.
Your other option is to use the ZMQ event loop to respond to incoming messages asynchronously, using callbacks. See the PyZMQ documentation on this topic, from which the following "echo" example is adapted:
import zmq
from zmq.eventloop import ioloop
from zmq.eventloop.zmqstream import ZMQStream
ctx = zmq.Context()
# set up the socket, and a stream wrapped around the socket
s = ctx.socket(zmq.REP)
s.bind('tcp://localhost:12345')
stream = ZMQStream(s)
# Define a callback to handle incoming messages
def echo(msg):
    # in this case, just echo the message back again
    stream.send_multipart(msg)
# register the callback
stream.on_recv(echo)
# start the ioloop to start waiting for messages
ioloop.IOLoop.instance().start()
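The poller option mentioned in this answer might look roughly like the sketch below, which needs pyzmq installed. It is an illustration under assumptions: `forward_until_idle` is a hypothetical helper, the `inproc://` endpoints replace the TCP ports from the question so the demo is self-contained, and the loop returns after an idle period only so the example terminates. Unlike the original forwarder, it does not prioritise one socket over the other.

```python
import zmq


def forward_until_idle(routes, idle_timeout_ms=200):
    """Forward messages, blocking in poll() instead of spinning.

    `routes` maps each PULL socket to the PUSH socket it forwards to.
    Returns the number of forwarded messages once nothing has arrived
    for `idle_timeout_ms` (a real forwarder would keep looping).
    """
    poller = zmq.Poller()
    for receiver in routes:
        poller.register(receiver, zmq.POLLIN)
    forwarded = 0
    while True:
        # poll() sleeps until a socket is readable or the timeout expires,
        # so an idle forwarder uses almost no CPU.
        events = dict(poller.poll(idle_timeout_ms))
        if not events:
            return forwarded
        for receiver in events:
            routes[receiver].send(receiver.recv())
            forwarded += 1


ctx = zmq.Context()
source = ctx.socket(zmq.PUSH)
source.bind("inproc://incoming")
receiver = ctx.socket(zmq.PULL)
receiver.connect("inproc://incoming")
forwarder = ctx.socket(zmq.PUSH)
forwarder.bind("inproc://outgoing")
sink = ctx.socket(zmq.PULL)
sink.connect("inproc://outgoing")

source.send(b"hello")
count = forward_until_idle({receiver: forwarder})
received = sink.recv()
print(count, received)
ctx.destroy(linger=0)
```

For the forwarder in the question, both PULL sockets would be registered with the same poller, and the timeout could be dropped so that poll() blocks until a message arrives.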
