RabbitMQ - Socket Closed Exception - Windows Server 2012 - python

So I have a publisher that uses the schedule Python package to read data from a file every 5-10 minutes and publish each line to a queue.
On the other side I have a consumer using something like:
self.connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
self.channel = self.connection.channel()
while True:
    method, properties, body = self.channel.basic_get(queue=conf.UNIVERSAL_MESSAGE_QUEUE, no_ack=False)
    if body is not None:
        self.assign_task(body=body)
        self.channel.basic_ack(delivery_tag=method.delivery_tag)
    else:
        self.logger.info('channel empty')
        self.move_to_done()
        time.sleep(5)
The assign_task function looks like:
def assign_task(self, body):
    <do something with the message body>
For some reason after a while it throws the following error:
2017-08-03 15:27:43,756: ERROR: base_connection.py: _handle_error: 335: Socket Error: 10054
2017-08-03 15:27:43,756: WARNING: base_connection.py: _check_state_on_disconnect: 180: Socket closed when connection was open
2017-08-03 15:27:43,756: WARNING: connection.py: _on_disconnect: 1360: Disconnected from RabbitMQ at localhost:5672 (0): Not specified
Essentially, the publisher and the consumer are two separate Python programs intended to run on a single machine with Windows Server 2012. Can the community help me understand what might be going wrong here?
The same code runs absolutely fine locally on my Windows machine.
The following is the output from my log file:
=ERROR REPORT==== 3-Aug-2017::15:06:48 ===
closing AMQP connection <0.617.0> ([::1]:53485 -> [::1]:5672):
missed heartbeats from client, timeout: 60s

The simple answer to this was to create a durable queue and set heartbeat_interval to 0 in the pika connection parameters.
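For reference, here is a minimal sketch of what that answer describes, assuming the pika client. The helper names `heartbeat_kwargs` and `open_durable_channel` are made up for illustration. The keyword is `heartbeat_interval` in pika releases before 1.0 and `heartbeat` from 1.0 on, and a value of 0 is meant to turn heartbeats off, so the broker should no longer drop a consumer that blocks in `time.sleep()`.

```python
def heartbeat_kwargs(pika_version, seconds=0):
    """Return the heartbeat keyword argument matching a pika release."""
    major = int(pika_version.split(".")[0])
    # pika 1.0 dropped `heartbeat_interval` in favour of `heartbeat`.
    name = "heartbeat" if major >= 1 else "heartbeat_interval"
    return {name: seconds}


def open_durable_channel(queue_name, host="localhost"):
    """Connect with heartbeats disabled and declare a durable queue."""
    import pika  # third-party; imported here so the helper above stays dependency-free

    params = pika.ConnectionParameters(host=host, **heartbeat_kwargs(pika.__version__))
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    # durable=True asks the broker to keep the queue definition across restarts.
    channel.queue_declare(queue=queue_name, durable=True)
    return connection, channel
```

With heartbeats off, a dead peer is only noticed when the next read or write on the socket fails, so this trades faster failure detection for tolerance of a blocking loop.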

Related

Paho MQTT client failing for special payloads (forward slashes) when using OpenVPN (Windows)

I have been struggling with a strange case of MQTT client publishes failing at random for certain payloads. It happens when trying to publish a large amount of BASE64 data.
I've finally managed to narrow it down to payloads containing a lot of consecutive forward slashes (/). I've searched the net for a good reason why this happens, but haven't found anything. Is it an MQTT feature, a Paho client feature, a broker feature, or just some bug...
Setup:
Python 3.8.8 (Windows 10)
paho-mqtt 1.5.0
mosquitto 1.6.9-1 amd64
On my setup, it fails when I send a payload of 255 '/' to a 1-character topic 'a'. A longer topic reduces the number of forward slashes that can be sent before it fails.
Code to reproduce error:
import paho.mqtt.client as mqtt_client
import time
address = 'some.server.com'
port = 1883
connected = False
def on_connect(client, userdata, flags, rc):
    global connected
    connected = True
    print("Connected!")
client = mqtt_client.Client()
client.on_connect = on_connect
client.connect(host=address, port=port, keepalive=60)
client.loop_start()
while not connected:
    time.sleep(1)
payload = '/'*205
print('Payload: {}'.format(payload))
client.publish(topic='a', payload=payload)
time.sleep(2)
client.loop_stop()
client.disconnect()
print('Done!')
This generates this output:
Connected!
Payload: /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Connected!
Done!
This produces the following error in /var/log/mosquitto/mosquitto.log for the mosquitto broker:
1616605010: New connection from a.b.c.d on port 1883.
1616605010: New client connected from a.b.c.d as auto-CEF15129-E74C-F00A-A6FA-5B5FDA0CEF1D (p2, c1, k60).
1616605011: Socket error on client auto-CEF15129-E74C-F00A-A6FA-5B5FDA0CEF1D, disconnecting.
1616605012: New connection from a.b.c.d on port 1883.
1616605012: New client connected from a.b.c.d as auto-0149B6DB-5997-9E08-366A-304F21FDF2E1 (p2, c1, k60).
1616605013: Client auto-0149B6DB-5997-9E08-366A-304F21FDF2E1 disconnected.
I observe that the client connects twice. I do not know why, but it is probably caused by a disconnect...
Any ideas?
Update 1: I've tested this on Ubuntu Linux running Python 3.7.3 with the same paho-mqtt version, and it does not produce the same error... Seems like a problem on Windows, then.
Update 2:
I also tried running mosquitto_pub and experienced the same error, so this has to be Windows-related (or system-related) in some way. Possibly the firewall? I will close the question if I manage to solve this.
"C:\Program Files\mosquitto\mosquitto_pub.exe" -h some.server.com -t a -m '/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////'
Update 3: The issue is related to OpenVPN. I closed my OpenVPN connection and the MQTT messages went through! I'm running OpenVPN Client (Windows version 3.2.2 (1455)). I have no idea what causes this conflict...
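One way to check whether the second "Connected!" line comes from an automatic reconnect after a dropped socket is to register an `on_disconnect` callback next to `on_connect`. The sketch below assumes the paho-mqtt 1.x callback signatures used in the code above; `make_callbacks` and the `events` list are illustrative names, not part of paho.

```python
def make_callbacks(events):
    """Build on_connect/on_disconnect callbacks that record what happened."""

    def on_connect(client, userdata, flags, rc):
        events.append(("connect", rc))
        print("Connected, rc={}".format(rc))

    def on_disconnect(client, userdata, rc):
        # rc == 0 means disconnect() was called; any other value is unexpected.
        kind = "clean" if rc == 0 else "unexpected"
        events.append(("disconnect", kind, rc))
        print("Disconnected ({}), rc={}".format(kind, rc))

    return on_connect, on_disconnect


events = []
on_connect, on_disconnect = make_callbacks(events)
# With a real paho-mqtt 1.x client these would be attached as:
#   client.on_connect = on_connect
#   client.on_disconnect = on_disconnect
```

An "unexpected" entry between two "connect" entries would confirm that the publish caused the socket to drop and the network loop reconnected.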

Allow RabbitMQ and Pika to keep the connection always open

I have a Python script which reads from a stream, and when a new string is read, it pushes its content (a string) to a RabbitMQ queue.
The thing is that the stream might not send any messages for 1, 2 or even 9 hours, so I would like to keep the RabbitMQ connection always open.
The problem is that when I create the connection and the channel:
self.connection = pika.BlockingConnection(pika.ConnectionParameters(host=self.host, credentials=self.credentials))
channel = self.connection.channel()
channel.exchange_declare(exchange=self.exchange_name, exchange_type='fanout')
... and if after an hour a message arrives, I get this error:
  File "/usr/local/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/var/opt/rabbitmq-agent.py", line 34, in push_to_queue
    raise Exception("Error sending the message to the queue: " + format(e))
Exception: Error sending the message to the queue: Send message to publisher error: Channel allocation requires an open connection: <SelectConnection CLOSED socket=None params=<ConnectionParameters host=x port=x virtual_host=/ ssl=False>>
I suppose this means that the connection between the RabbitMQ server and the client has been closed.
How can I avoid this? I would like something like "please, always keep the connection alive". Maybe by setting a very large heartbeat in Pika's connection parameters? Something like this:
self.connection = pika.BlockingConnection(pika.ConnectionParameters(host=self.host, credentials=self.credentials, heartbeat=6000))
Any other cooler solutions would be highly appreciated.
Thanks in advance
I would suggest checking the connection every time before sending a message and, if it is closed, simply reconnecting.
if not self.connection or self.connection.is_closed:
    self.connection = pika.BlockingConnection(pika.ConnectionParameters(host=self.host, credentials=self.credentials))
    channel = self.connection.channel()
    channel.exchange_declare(exchange=self.exchange_name, exchange_type='fanout')
You could try adding a heartbeat to your ConnectionParameters. This creates light traffic by sending a heartbeat frame every given number of seconds, which keeps the connection exercised. Some firewalls and proxies tend to drop idle connections, and even RabbitMQ has a timeout on connections that are idle.
import pika
# Set the connection parameters to connect to rabbit-server1 on port 5672
# on the / virtual host using the username "guest" and password "guest"
credentials = pika.PlainCredentials('guest', 'guest')
# Positional arguments must come before keyword arguments,
# so credentials is passed before heartbeat=60.
parameters = pika.ConnectionParameters('rabbit-server1',
                                       5672,
                                       '/',
                                       credentials,
                                       heartbeat=60)
See here for pika documentation.
Additionally, you should have code in place that copes with network disconnections. They can always happen, and they will. So apart from the heartbeat, have some exception handling ready to re-open closed connections gracefully.
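Both answers can be combined into a small wrapper that reopens the connection on demand and retries a failed send. This is only a sketch under assumptions: `ReconnectingPublisher` and its parameters are hypothetical names, and the connection object is only expected to expose an `is_closed` attribute, as pika connections do.

```python
import time


class ReconnectingPublisher:
    """Reopen the connection on demand and retry a failed send."""

    def __init__(self, connect, retry_on, max_attempts=3, delay=1.0, sleep=time.sleep):
        self._connect = connect        # zero-argument callable returning a new connection
        self._retry_on = retry_on      # exception classes that mean "connection lost"
        self._max_attempts = max_attempts
        self._delay = delay
        self._sleep = sleep
        self._connection = None

    def _ensure_open(self):
        if self._connection is None or self._connection.is_closed:
            self._connection = self._connect()
        return self._connection

    def send(self, publish):
        """Call publish(connection), reconnecting and retrying on failure."""
        for attempt in range(1, self._max_attempts + 1):
            try:
                return publish(self._ensure_open())
            except self._retry_on:
                self._connection = None  # force a fresh connection next time
                if attempt == self._max_attempts:
                    raise
                self._sleep(self._delay)
```

With pika, `connect` could be `lambda: pika.BlockingConnection(parameters)` and `retry_on` could be `(pika.exceptions.AMQPConnectionError,)`; the `publish` callable would open a channel on the connection it receives and call `basic_publish` there.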

thrift timeout for long run call: thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

I've built an RPC service using Thrift. Each call may run for a long time (minutes to hours). I've set the Thrift timeout to 2 days.
transport = TSocket.TSocket(self.__host, self.__port)
transport.setTimeout(2 * 24 * 60 * 60 * 1000)
But Thrift always closes the connection after about 600s, with the following exception:
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
Is there any other timeout I should set? (Python; Thrift server: Windows; client: Ubuntu)
The Thrift Transport connection is being disconnected. This could be due to network issues or remote service restart or time out issues. Whenever any call is made after a disconnect this results in TTransportException. This problem can be solved by reconnecting to the remote service.
Try using this, invoking it before making a remote service call.
import sys

def reopen_transport():
    # Reopen the module-level Thrift transport if it has been closed.
    try:
        if not transport.isOpen():
            transport.open()
    except Exception as msg:
        sys.stderr.write("Error reopening transport {}\n".format(msg))
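Building on that advice, the reopen step can be folded into a wrapper that also retries once when the call itself fails. This is a sketch, not part of Thrift: `call_with_reopen` is a hypothetical helper, and the only thing it assumes about the transport is the `isOpen()`, `open()` and `close()` methods that Thrift transports provide.

```python
def call_with_reopen(transport, rpc, retry_on=(Exception,), retries=1):
    """Run rpc() on an open transport, reopening and retrying if the call fails."""
    for attempt in range(retries + 1):
        if not transport.isOpen():
            transport.open()
        try:
            return rpc()
        except retry_on:
            # The socket may still look open locally, so close it explicitly.
            transport.close()
            if attempt == retries:
                raise
```

In real use, `retry_on` would be narrowed to `(TTransportException,)`. Note that a retry restarts the remote call from scratch, so this only makes sense for calls that are safe to repeat.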

Detect when Websocket is disconnected, with Python Bottle / gevent-websocket

I'm using the gevent-websocket module with Bottle Python framework.
When a client closes the browser, this code
$(window).on('beforeunload', function() { ws.close(); });
helps to close the websocket connection properly.
But if the client's network connection is interrupted, no "close" information can be sent to the server.
Then, often, even 1 minute later, the server still believes the client is connected, and the websocket is still open on the server.
Question: How to detect properly that a websocket is closed because the client is disconnected from network?
Is there a websocket KeepAlive feature available in Python/Bottle/gevent-websocket?
One answer to "Web Socket: cannot detect client connection on internet disconnect" suggests sending a heartbeat/ping packet every x seconds to tell the server "I'm still alive". The other answer suggests using a setKeepAlive(true) feature. Would this feature be available in gevent-websocket?
Example server code, taken from here:
from bottle import get, template, run
from bottle.ext.websocket import GeventWebSocketServer
from bottle.ext.websocket import websocket
users = set()
@get('/')
def index():
    return template('index')
@get('/websocket', apply=[websocket])
def chat(ws):
    users.add(ws)
    while True:
        msg = ws.receive()
        if msg is not None:
            for u in users:
                u.send(msg)
        else:
            break
    users.remove(ws)
run(host='127.0.0.1', port=8080, server=GeventWebSocketServer)
First you need to add a timeout to the receive() call.
with gevent.Timeout(1.0, False):
    msg = ws.receive()
Then the loop will not block. If you send even an empty packet and the client doesn't respond, a WebSocketError will be thrown and you can close the socket.
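The heartbeat idea mentioned in the question can be sketched independently of gevent-websocket: record when each client was last heard from and treat clients that have been silent for too long as disconnected. `HeartbeatTracker` and its methods are hypothetical names, and the 30-second default is an arbitrary assumption.

```python
import time


class HeartbeatTracker:
    """Track when each client was last heard from."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._last_seen = {}

    def touch(self, client):
        """Record that a message or ping just arrived from client."""
        self._last_seen[client] = self._clock()

    def forget(self, client):
        """Stop tracking a client that has been closed."""
        self._last_seen.pop(client, None)

    def stale(self):
        """Return the clients that have been silent for longer than the timeout."""
        now = self._clock()
        return [c for c, seen in self._last_seen.items() if now - seen > self._timeout]
```

In the chat handler, `touch(ws)` would be called whenever `ws.receive()` returns a message or a ping, and a periodic task would close every websocket returned by `stale()` and then call `forget()` on it.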

connection refused from server until I reset client machine

Below is the code I am running within a service. For the most part the script runs fine for days or weeks until it hiccups and crashes. I am not so worried about the crashing part, as I can work out the cause from the error logs and patch appropriately. The issue I am facing is that sometimes, when the service restarts and tries to connect to the server again, it gets a (10061, 'Connection refused') error, so the service is unable to start up again. The bizarre part is that there are no Python processes running while connections are being refused, i.e. no process with the image name "pythonw.exe" or "pythonservice.exe". It should be noted that I am unable to connect to the server from any other machine either until I reset the computer which runs the client script. The client machine is running Python 2.7 on Windows Server 2003. It should also be noted that the server runs on a piece of hardware whose code I do not have access to.
try:
    EthernetConfig = ConfigParser()
    EthernetConfig.read('Ethernet.conf')
    HOST = EthernetConfig.get("TCP_SERVER", "HOST").strip()
    PORT = EthernetConfig.getint("TCP_SERVER", "PORT")
    lp = LineParser()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))
    reader = s.makefile("rb")
    while(self.run == True):
        line = reader.readline()
        if line:
            line = line.strip()
            lp.parse(line)
except:
    servicemanager.LogErrorMsg(traceback.format_exc()) # if error print it to event log
    s.shutdown(2)
    s.close()
    os._exit(-1)
"Connection refused" is an error meaning that the program on the other side of the connection is not accepting your connection attempt. Most probably it hasn't noticed that you crashed, and hasn't closed its connection.
What you can do is simply sleep a little while (30-60 seconds) and try again, doing this in a loop in the hope that the other end notices that the connection is broken so it can accept new connections again.
It turns out that the network admin had closed the port that I was trying to connect to. It is open for one IP which belongs to the server. The problem is that the server has two network cards with two separate IPs. The issue is now resolved.
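The sleep-and-retry loop described above could look roughly like this. It is a sketch written for Python 3 (on Python 2.7 the exception to catch would be `socket.error`); `connect_with_retry` is a hypothetical helper, and its `connect` and `sleep` parameters exist only so that the loop can be exercised without a network.

```python
import socket
import time


def connect_with_retry(host, port, attempts=10, delay=30.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to connect, sleeping between failed attempts; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return connect((host, port), timeout=10)
        except OSError:
            # Covers "connection refused" as well as timeouts and unreachable hosts.
            if attempt == attempts:
                raise
            sleep(delay)
```

In the service above, `s = connect_with_retry(HOST, PORT)` would replace the `socket.socket(...)` and `s.connect(...)` pair.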
