In my code I create threads which each call publish.single multiple times on an MQTT connection. However, this error is raised and I cannot understand or find its origin. The only place it mentions my code is line 75, in send_on_sensor.
Exception in thread Thread-639:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 917, in _bootstrap_inner
self.run()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "/Users//PycharmProjects//V3_multiTops/mt_GenPub.py", line 75, in send_on_sensor
publish.single(topic, payload, hostname=hostname)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/paho/mqtt/publish.py", line 223, in single
protocol, transport)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/paho/mqtt/publish.py", line 159, in multiple
client.connect(hostname, port, keepalive)
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/paho/mqtt/client.py", line 839, in connect
return self.reconnect()
File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/paho/mqtt/client.py", line 962, in reconnect
sock = socket.create_connection((self._host, self._port), source_address=(self._bind_address, 0))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 727, in create_connection
raise err
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 716, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused
This is the relevant code; line 75 from the traceback is the publish.single call. This method is called on a new thread whenever a new set of data (a queue of points) is to be sent.
def send_on_sensor(q, topic, delay):
    while not q.empty():  # q.empty is a method, so test its result, not the method object
        payload = json.dumps(q.get())
        publish.single(topic, payload, hostname=hostname)
        time.sleep(delay)
I get the feeling I am doing something that is not thread-safe. The issue occurs especially when the delay is a short interval (< 1 sec). From my output I can see that the next set of data (100 points) starts sending in a new thread before the first one has finished sending. I can fix that, and also this error, by increasing the time interval between two sets of data. E.g. if I determine the delay between sets using the relation set_delay = 400 * point_delay, I can safely use a point delay of 0.1 sec. However, the same relation does not hold for smaller delays, so this solution really does not satisfy me.
What can I do about this issue? I really want to get my delay below 0.1 secs and be able to adjust it.
EDIT
This is the method that creates the threads:
def send_dataset(data, labels, secs=0):
    qs = []
    for i in range(8):
        qs.append(queue.Queue())
    for value in data:
        msg = {
            "key": value,
        }
        # c is set accordingly
        qs[c].put(msg)
    for q in qs:
        topic = sensors[qs.index(q)]
        t = threading.Thread(target=send_on_sensor, args=(q, topic, secs))
        t.start()
        time.sleep(secs)
And this is where I start everything off:
output_interval = 0.01
while True:
    X, y = give_dataset()
    send_dataset(X, y, output_interval)
    time.sleep(output_interval * 2000)
Even though you added extra code, it doesn't reveal much. However, I have experience with a similar thing happening to me. I was building a heavily threaded app with MQTT, and it is reasonably safe. Not totally, but it is.
The reason you get the error when lowering the delay is that you have ONE client. When you publish a message (I can't be sure, because I don't see your code), you connect, send the message and disconnect. Since you are threading this process, you most probably have one message still being sent when you are about to publish a new one in a new thread. But the first thread finishes and disconnects the client, so the new thread tries to publish over a connection the previous thread has already closed.
Solutions:
1) Don't disconnect the client after publishing.
2) Risky, and you need more code: for every publish, create a new client, but be sure to handle this correctly. That means: create a client, publish and disconnect, again and again, but make sure you close the connections correctly and delete the clients, so you don't accumulate dead clients.
3) A cleaner take on 2): write a function that does it all (creates a client, connects, publishes and dies at the end). If you thread such a function, I guess you will not have to take care of the problems arising in solution 2).
Update:
In case your problem is something else, I still think it is not caused by the threads themselves, but by multiple threads trying to control something that should be controlled by only one thread, like the client object.
Update: template code
Be aware that this is my old code and I don't use it anymore, because my applications need particular thread behaviour and so on, so I rewrite it for each application individually. But this one works like a charm for non-threaded apps, and possibly for threaded ones too. It can publish only with qos=0.
import paho.mqtt.client as mqtt
import json

# Define Variables
MQTT_BROKER = ""
MQTT_PORT = 1883
MQTT_KEEPALIVE_INTERVAL = 5
MQTT_TOPIC = ""

class pub:
    def __init__(self, MQTT_BROKER, MQTT_PORT, MQTT_KEEPALIVE_INTERVAL, MQTT_TOPIC, transport=''):
        self.MQTT_TOPIC = MQTT_TOPIC
        self.MQTT_BROKER = MQTT_BROKER
        self.MQTT_PORT = MQTT_PORT
        self.MQTT_KEEPALIVE_INTERVAL = MQTT_KEEPALIVE_INTERVAL
        # Initiate MQTT Client
        if transport == 'websockets':
            self.mqttc = mqtt.Client(transport='websockets')
        else:
            self.mqttc = mqtt.Client()
        # Register Event Handlers
        self.mqttc.on_publish = self.on_publish
        self.mqttc.on_connect = self.on_connect
        self.connect()

    # Define on_connect event Handler (paho 1.x passes client, userdata, flags, rc)
    def on_connect(self, client, userdata, flags, rc):
        print("mqtt.thingstud.io")

    # Define on_publish event Handler
    def on_publish(self, client, userdata, mid):
        print("Message Published...")

    def publish(self, MQTT_MSG):
        MQTT_MSG = json.dumps(MQTT_MSG)
        # Publish message to MQTT Topic
        self.mqttc.publish(self.MQTT_TOPIC, MQTT_MSG)

    # Connect to MQTT Broker
    def connect(self):
        self.mqttc.connect(self.MQTT_BROKER, self.MQTT_PORT, self.MQTT_KEEPALIVE_INTERVAL)

    # Disconnect from MQTT Broker
    def disconnect(self):
        self.mqttc.disconnect()

p = pub(MQTT_BROKER, MQTT_PORT, MQTT_KEEPALIVE_INTERVAL, MQTT_TOPIC)
p.publish('some messages')
p.publish('more messages')
Note that on object creation I connect automatically, but I don't disconnect; that is something you have to do manually.
I suggest you create as many pub objects as you have sensors and publish with them.
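For instance, a rough sketch of that suggestion, reusing the pub class above together with the sensors list and hostname from the question (the port and keepalive values here are assumptions):

pubs = {topic: pub(hostname, 1883, 5, topic) for topic in sensors}  # one long-lived client per sensor

def send_on_sensor(q, topic, delay):
    p = pubs[topic]
    while not q.empty():
        p.publish(q.get())  # pub.publish() already does the json.dumps
        time.sleep(delay)

# when a whole run is finished:
# for p in pubs.values():
#     p.disconnect()

Each thread then reuses its own persistent connection instead of reconnecting for every message, which is exactly what solution 1) describes.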
Related
I am new to the python gatt module, and I am having a problem with reconnections.
Basically, what I am trying to do is establish a connection to a Bluetooth Low Energy (BLE) device with the python gatt module (https://github.com/getsenic/gatt-python) and then read the input from the /dev/input/eventX path with the evdev module. I also want to automate the reconnection process, so that when the device goes out of range and comes back, it reconnects and continues working normally.
The problem is that when the device disconnects and eventually reconnects (via a simple routine: catch the disconnect message -> try to reconnect), if the reconnection took more than 2-3 minutes, the connection process does not create a new /dev/input/eventX path. This does not happen when the reconnection succeeds within the first 1-2 minutes.
The error I am getting when the 2-3 minutes have passed is:
File "/usr/lib/python3.7/site-packages/dbus/proxies.py", line 145, in
call
File "/usr/lib/python3.7/site-packages/dbus/connection.py", line 651, in call_blocking
dbus.exceptions.DBusException:
org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible
causes include: the remote application did not send a reply, the
message bus security policy blocked the reply, the reply timeout
expired, or the network connection was broken.
The core of the script is the following:
def reconnect(mac_address):
    try:
        devices[mac_address].connect()
    except:
        print(f"thread from {mac_address} crashed")

class AnyDevice(gatt.Device):
    shut_down_flag = False

    def connect_succeeded(self):
        super().connect_succeeded()
        print(f"{self.mac_address} Connected")

    def connect_failed(self, error):
        super().connect_failed(error)
        print(f"{self.mac_address} Connection failed.")
        reconnect_thread = threading.Thread(target=reconnect, name=f'reconnect {self.mac_address}', args=(self.mac_address,))
        reconnect_thread.start()

    def disconnect_succeeded(self):
        super().disconnect_succeeded()
        print(f"{self.mac_address} Disconnected")
        if not self.shut_down_flag:
            reconnect_thread = threading.Thread(target=reconnect, name=f'reconnect {self.mac_address}', args=(self.mac_address,))
            reconnect_thread.start()

def gatt_connect_device(mac_address):
    global devices
    devices.update({f'{mac_address}': AnyDevice(mac_address=f'{mac_address}', manager=manager)})
    devices[f'{mac_address}'].connect()
#==== OPEN bd_addresses.txt JSON FILE ====#
if path.exists("bd_addresses.txt"):
    with open("bd_addresses.txt", "r") as mac_addresses_json:
        mac_addresses = json.load(mac_addresses_json)
else:
    print("bd_addresses.txt file NOT FOUND\nPlace it in the same directory as the multiple_scanners.py")
#========================================#

devices = {}
manager = gatt.DeviceManager(adapter_name='hci0')

for scanner_number in mac_addresses:
    device_instance_thread = threading.Thread(target=gatt_connect_device, name=f'device instance for {mac_addresses[scanner_number]}', args=(mac_addresses[scanner_number],))
    device_instance_thread.start()
    time.sleep(3)

manager.run()
I have a task queue in RabbitMQ with multiple producers (12) and one consumer for heavy tasks in a webapp. When I run the consumer it starts dequeuing some of the messages before crashing with this error:
Traceback (most recent call last):
File "jobs.py", line 42, in <module> jobs[job](config)
File "/home/ec2-user/project/queue.py", line 100, in init_queue
channel.start_consuming()
File "/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 1822, in start_consuming
self.connection.process_data_events(time_limit=None)
File "/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 749, in process_data_events
self._flush_output(common_terminator)
File "/usr/lib/python2.7/site-packages/pika/adapters/blocking_connection.py", line 477, in _flush_output
result.reason_text)
pika.exceptions.ConnectionClosed: (-1, "error(104, 'Connection reset by peer')")
The producers' code is:
message = {'image_url': image_url, 'image_name': image_name, 'notes': notes}
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='tasks_queue')
channel.basic_publish(exchange='', routing_key=queue_name, body=json.dumps(message))
connection.close()
And the only consumer's code (the one that is crashing):
def callback(self, ch, method, properties, body):
    """Callback invoked when a message is received."""
    message = json.loads(body)
    try:
        image = _get_image(message['image_url'])
    except:
        sys.stderr.write('Error getting image in note %s' % note['id'])
    # Crop image with PIL. Not so expensive
    box_path = _crop(image, message['image_name'], box)
    # API call. Long time function
    result = long_api_call(box_path)
    if result is None:
        sys.stderr.write('Error in note %s' % note['id'])
        return
    # update the db
    db.update_record(result)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='tasks_queue')
channel.basic_qos(prefetch_count=1)
channel.basic_consume(callback_obj.callback, queue='tasks_queue', no_ack=True)
channel.start_consuming()
As you can see, there are three expensive operations per message: a crop task, an API call and a database update. Without the API call, the consumer runs smoothly.
Thanks in advance
Your RabbitMQ log shows a message that I thought we might see:
missed heartbeats from client, timeout: 60s
What's happening is that your long_api_call blocks Pika's I/O loop. Pika is a very lightweight library and does not start threads in the background for you so you must code in such a way as to not block Pika's I/O loop longer than the heartbeat interval. RabbitMQ thinks your client has died or is unresponsive and forcibly closes the connection.
Please see my answer here which links to this example code showing how to properly execute a long-running task in a separate thread. You can still use no_ack=True, you will just skip the ack_message call.
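Since the linked code is not reproduced here, a minimal sketch of that pattern (my reconstruction, assuming pika >= 1.0, whereas the question uses the older 0.x API; it also acks explicitly instead of using no_ack=True):

import threading
import time
import pika

def long_api_call(body):
    time.sleep(600)  # stand-in for the question's expensive API call
    return body

def do_work(connection, channel, delivery_tag, body):
    long_api_call(body)  # the heavy work runs off the connection's thread
    # Channel operations are not thread-safe; schedule the ack back onto
    # the connection's own thread.
    connection.add_callback_threadsafe(
        lambda: channel.basic_ack(delivery_tag))

def on_message(channel, method, properties, body):
    worker = threading.Thread(
        target=do_work,
        args=(channel.connection, channel, method.delivery_tag, body))
    worker.start()

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='tasks_queue')
channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue='tasks_queue', on_message_callback=on_message)
channel.start_consuming()  # the I/O loop keeps servicing heartbeats while workers run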
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
Starting with RabbitMQ 3.5.5, the broker’s default heartbeat timeout
decreased from 580 seconds to 60 seconds.
See pika: Ensuring well-behaved connection with heartbeat and blocked-connection timeouts.
The simplest fix is to increase the heartbeat timeout:
rabbit_url = host + "?heartbeat=360"
conn = pika.BlockingConnection(pika.URLParameters(rabbit_url))
# or
params = pika.ConnectionParameters(host, heartbeat=360)
conn = pika.BlockingConnection(params)
I am creating a communication platform in python (3.4.4) and using the multiprocessing.managers.BaseManager class. I have isolated the problem to the code below.
The intention is to have a ROVManager(role='server') instance running in one process on the main computer and providing read/write capabilities to the system dictionary for multiple ROVManager(role='client') instances running on the same computer and a ROV (remotely operated vehicle) connected to the same network. This way, multiple clients/processes can do different tasks like reading sensor values, moving motors, printing, logging etc, all using the same dictionary. start_reader() below is one of those clients.
Code
from multiprocessing.managers import BaseManager
import multiprocessing as mp
import sys

class ROVManager(BaseManager):
    def __init__(self, role, address, port, authkey=b'abc'):
        super(ROVManager, self).__init__(address=(address, port),
                                         authkey=authkey)
        if role == 'server':  # compare strings with ==, not 'is'
            self.system = {'shutdown': False}
            self.register('system', callable=lambda: self.system)
            server = self.get_server()
            server.serve_forever()
        elif role == 'client':
            self.register('system')
            self.connect()
def start_server(server_ip, port_var):
    print('starting server')
    ROVManager(role='server', address=server_ip, port=port_var)

def start_reader(server_ip, port_var):
    print('starting reader')
    mgr = ROVManager(role='client', address=server_ip, port=port_var)
    i = 0
    while not mgr.system().get('shutdown'):
        sys.stdout.write('\rTotal while loops: {}'.format(i))
        i += 1

if __name__ == '__main__':
    server_p = mp.Process(target=start_server, args=('0.0.0.0', 5050))
    reader_p = mp.Process(target=start_reader, args=('127.0.0.1', 5050))
    server_p.start()
    reader_p.start()
    while True:
        # Check system status, restart processes etc here
        pass
Error
This results in the following output and error:
starting server
starting reader
Total while loops: 15151
Process Process-2:
Traceback (most recent call last):
  File "c:\python34\Lib\multiprocessing\process.py", line 254, in _bootstrap
    self.run()
  File "c:\python34\Lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\git\eduROV\error_test.py", line 29, in start_reader
    while not mgr.system().get('shutdown'):
  File "c:\python34\Lib\multiprocessing\managers.py", line 640, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "c:\python34\Lib\multiprocessing\managers.py", line 532, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "c:\python34\Lib\multiprocessing\connection.py", line 496, in Client
    c = SocketClient(address)
  File "c:\python34\Lib\multiprocessing\connection.py", line 629, in SocketClient
    s.connect(address)
OSError: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted
My research
The total while loop count is usually in the range 15000-16000. From my understanding, it seems like a socket is created and terminated each time mgr.system().get('shutdown') is called, and Windows then runs out of available sockets. I can't seem to find a way to set socket.SO_REUSEADDR.
Is there a way of solving this, or aren't Managers made for this kind of communication? Thanks :)
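Based on that understanding, a minimal sketch of one workaround (an untested rewrite of the question's start_reader): create the system proxy once and reuse it, so each get() does not open a fresh connection:

def start_reader(server_ip, port_var):
    print('starting reader')
    mgr = ROVManager(role='client', address=server_ip, port=port_var)
    system = mgr.system()  # one proxy object, one connection
    i = 0
    while not system.get('shutdown'):
        sys.stdout.write('\rTotal while loops: {}'.format(i))
        i += 1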
As the error suggests (Only one usage of each socket address), in general you can/should bind only a single process to a given socket address, unless you design your application accordingly by passing the SO_REUSEADDR option when creating the socket. These lines
server_p = mp.Process(target=start_server, args=('0.0.0.0', 5050))
reader_p = mp.Process(target=start_reader, args=('127.0.0.1', 5050))
create two processes on the same port, 5050, hence the error.
You can refer here to learn how to use SO_REUSEADDR and its implications, but I am quoting the main part, which should get you going (a small illustrative sketch follows the quote):
The second socket calls setsockopt with the optname parameter set to
SO_REUSEADDR and the optval parameter set to a boolean value of TRUE
before calling bind on the same port as the original socket. Once the
second socket has successfully bound, the behavior for all sockets
bound to that port is indeterminate. For example, if all of the
sockets on the same port provide TCP service, any incoming TCP
connection requests over the port cannot be guaranteed to be handled
by the correct socket — the behavior is non-deterministic.
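For reference, a minimal sketch of what the quote describes, setting SO_REUSEADDR on a plain socket before bind():

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Must be set before bind(); allows rebinding an address still in TIME_WAIT.
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('0.0.0.0', 5050))
s.listen(5)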
I am using Kombu in Python to consume a durable RabbitMQ queue.
There is only one consumer consuming the queue in Windows. This consumer produces the below error:
Traceback (most recent call last):
File ".\consumer_windows.py", line 66, in <module>
message.ack()
File "C:\Users\Administrator\Anaconda2\lib\site-packages\kombu\message.py", line 88, in ack
self.channel.basic_ack(self.delivery_tag)
File "C:\Users\Administrator\Anaconda2\lib\site-packages\amqp\channel.py", line 1584, in basic_ack
self._send_method((60, 80), args)
File "C:\Users\Administrator\Anaconda2\lib\site-packages\amqp\abstract_channel.py", line 56, in _send_method
self.channel_id, method_sig, args, content,
File "C:\Users\Administrator\Anaconda2\lib\site-packages\amqp\method_framing.py", line 221, in write_method
write_frame(1, channel, payload)
File "C:\Users\Administrator\Anaconda2\lib\site-packages\amqp\transport.py", line 182, in write_frame
frame_type, channel, size, payload, 0xce,
File "C:\Users\Administrator\Anaconda2\lib\socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 10054] An existing connection was forcibly closed by the remote host
There are at most 500 messages in the queue at any one time. Each message is small in size; however, it is a task that takes up to 10 minutes to complete (although it usually takes less than 5 minutes per message).
I have tried restarting the consumer, RabbitMQ server and deleting the queue however the error still persists.
I've seen this question; however, the answer is from 2010 and my rabbitmq.log has different entries:
=ERROR REPORT==== 24-Apr-2016::08:26:20 ===
closing AMQP connection <0.6716.384> (192.168.X.X:59602 -> 192.168.Y.X:5672):
{writer,send_failed,{error,timeout}}
There were no recent events in the rabbitmq-sasl.log.
Why is this error happening and how can I prevent it from occurring?
I'm still looking for an answer. In the meantime I restart the connection to my rabbit server:
while True:
    try:
        connection = pika.BlockingConnection(params)
        channel = connection.channel()  # start a channel
        channel.queue_declare(queue=amqp_q, durable=True)  # Declare a queue
        ...
    except pika.exceptions.ConnectionClosed:
        print('connection closed... and restarted')
I had the same issue with a MySQL server that was hosted remotely. I came to understand that it happens if the connection stays open for a long time, or sits idle for a long time. If your program keeps the connection open for the whole run, change it so that it opens the connection, writes everything, closes it, and repeats. I don't know exactly what RabbitMQ is, but I think the error you put in the title may have the same cause.
I had the same error (using the plain pika library) while trying to connect to a RabbitMQ broker through Amazon MQ. The problem was resolved by setting up the SSL configuration correctly. Please check the full guide here: https://docs.aws.amazon.com/amazon-mq/latest/developer-guide/amazon-mq-rabbitmq-pika.html
Core snippets that I used:
Define Pika Client:
import ssl
import pika

class BasicPikaClient:
    def __init__(self, rabbitmq_broker_id, rabbitmq_user, rabbitmq_password, region):
        # SSL Context for TLS configuration of Amazon MQ for RabbitMQ
        ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
        ssl_context.set_ciphers('ECDHE+AESGCM:!ECDSA')

        # Note: the userinfo separator is '@' (it often gets mangled to '#' when pasted)
        url = f"amqps://{rabbitmq_user}:{rabbitmq_password}@{rabbitmq_broker_id}.mq.{region}.amazonaws.com:5671"
        parameters = pika.URLParameters(url)
        parameters.ssl_options = pika.SSLOptions(context=ssl_context)

        self.connection = pika.BlockingConnection(parameters)
        self.channel = self.connection.channel()
Producer:
from basicClient import BasicPikaClient

class BasicMessageSender(BasicPikaClient):
    def declare_queue(self, queue_name, durable):
        print(f"Trying to declare queue({queue_name})...")
        self.channel.queue_declare(queue=queue_name, durable=durable)

    def send_message(self, exchange, routing_key, body):
        channel = self.connection.channel()
        channel.basic_publish(exchange=exchange,
                              routing_key=routing_key,
                              body=body)
        print(f"Sent message. Exchange: {exchange}, Routing Key: {routing_key}, Body: {body}")

    def close(self):
        self.channel.close()
        self.connection.close()
Calling Producer:
# Initialize Basic Message Sender which creates a connection
# and channel for sending messages.
basic_message_sender = BasicMessageSender(
    credentials["broker_id"],
    credentials["username"],
    credentials['password'],
    credentials['region'],
)

# Declare a queue
basic_message_sender.declare_queue("q_name", durable=True)

# Send a message to the queue.
basic_message_sender.send_message(exchange="", routing_key="q_name", body=b'Hello World 2!')

# Close connections.
basic_message_sender.close()
Define Consumer:
class BasicMessageReceiver(BasicPikaClient):
    def get_message(self, queue):
        method_frame, header_frame, body = self.channel.basic_get(queue)
        if method_frame:
            print(method_frame, header_frame, body)
            self.channel.basic_ack(method_frame.delivery_tag)
            return method_frame, header_frame, body
        else:
            print('No message returned')

    def close(self):
        self.channel.close()
        self.connection.close()
Calling Consumer:
# Create Basic Message Receiver which creates a connection
# and channel for consuming messages.
basic_message_receiver = BasicMessageReceiver(
    credentials["broker_id"],
    credentials["username"],
    credentials['password'],
    credentials['region'],
)

# Consume the message that was sent.
basic_message_receiver.get_message("q_name")

# Close connections.
basic_message_receiver.close()
I hope the above helps.
Thanks
I set up a function that is meant to send an email every minute. I call it using the following:
import smtplib
import threading  # needed for threading.Timer below

def messages_emailed():
    fromaddr = FROMADDRESS
    toaddrs = TOADDRESS
    msg = "this is a test message."
    username = USER
    password = PASSWORD
    server = smtplib.SMTP('smtp.gmail.com:587')
    server.starttls()
    server.login(username, password)
    server.sendmail(fromaddr, toaddrs, msg)
    server.quit()
    threading.Timer(60, messages_emailed).start()  # runs func every min

messages_emailed()
This worked perfectly. However, despite stopping the application in Terminal with Ctrl-C, I continue to receive mail every minute, and refreshing the page my application runs on in the browser (127.0.0.1:5000) still displays the application. I can edit my script to add a cancel statement, but saving did not take effect, and trying to reload my application in Terminal returned an error:
* Running on http://127.0.0.1:5000/
Traceback (most recent call last):
  File "bit.py", line 79, in <module>
    app.run()
  File "/Library/Python/2.7/site-packages/Flask-0.9-py2.7.egg/flask/app.py", line 739, in run
    run_simple(host, port, self, **options)
  File "/Library/Python/2.7/site-packages/Werkzeug-0.8.3-py2.7.egg/werkzeug/serving.py", line 613, in run_simple
    test_socket.bind((hostname, port))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 224, in meth
socket.error: [Errno 48] Address already in use
For now, I have stopped the influx of emails by deleting the mail account I used to send messages. However, I am wondering what a long-term solution would look like: ideally something I can stop from the terminal, or that stops executing when the program does. Research has suggested using sys.exit(0), though I do not know where in my program to place it or when it would quit the function.
Any help would be greatly appreciated.
First, you have to check the formatting.
If you want to use a thread, write your own threading manager that encapsulates start() and stop() methods for your threads. For example:

thread1 = threading.Timer(60, sender)  # pass the function itself, don't call it
thread1.start()

To stop it, call thread1.cancel() (Timer provides cancel(), not stop()).
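As an illustration, a minimal sketch of such a manager (the class name is mine): a repeating timer that can be started once and cancelled cleanly. With this, messages_emailed() would no longer reschedule itself, and the daemon flag means the timer dies with the main program, so Ctrl-C actually stops the emails.

import threading

class RepeatingTimer:
    """Calls func every interval seconds until stop() is called."""

    def __init__(self, interval, func):
        self.interval = interval
        self.func = func
        self._timer = None

    def _run(self):
        self.func()
        self.start()  # reschedule the next call

    def start(self):
        self._timer = threading.Timer(self.interval, self._run)
        self._timer.daemon = True  # dies with the main program
        self._timer.start()

    def stop(self):
        if self._timer is not None:
            self._timer.cancel()

sender = RepeatingTimer(60, messages_emailed)
sender.start()
# later, e.g. in a signal handler or a finally block:
# sender.stop()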
It seems that your script started a new process that reruns the email-sending function periodically. You can check the active processes by running ps aux | grep python.