Pika connection closed after 3 heartbeats - python

I'm writing a script which receives HTTP requests (using Tornado), parses them, and sends them to a RabbitMQ broker using pika.
The code looks like this:
def main():
    conn_params = pika.ConnectionParameters(
        host=BROKER_NAME,
        port=BROKER_PORT,
        ssl=True,
        virtual_host=VIRTUAL_HOST,
        credentials=pika.PlainCredentials(BROKER_USER, BROKER_PASS),
        heartbeat_interval=HEARTBEAT_INTERVAL
    )
    conn = pika.BlockingConnection(conn_params)
    channel = conn.channel()

    # Create the web server which handles application requests.
    application = tornado.web.Application([
        (URL_BILLING, SomeHandler, dict(channel=channel))
    ])

    # Start the server
    application.listen(LISTENING_PORT)
    tornado.ioloop.IOLoop.instance().start()
As you can see, I open a single connection and channel, and pass the channel to any instance of the handler which is created, the idea being to save traffic and avoid opening a new connection/channel for every request.
The issue I'm experiencing is that the connection is closed after 3 heartbeats. I used Wireshark to figure out what the problem is, but all I can see is that the server sends a PSH (I'm assuming this is the heartbeat) and my script replies with an ACK. This happens three times, with HEARTBEAT_INTERVAL between them, and then the server just sends a FIN and the connection dies.
Any idea why that happens? Also, should I keep the connection open or is it better to create a new one for every message I need to send?
Thanks for the help.
UPDATE: I looked in the RabbitMQ log, and it says:
Missed heartbeats from client, timeout: 10s
I thought the server was meant to send heartbeats to the client to make sure it answers, and that agrees with what I observed using Wireshark, but from this log it seems it is the client that is meant to report to the server, not the other way around; and the client, evidently, doesn't report. Am I getting this right?
UPDATE: Figured it out, sort of. A blocking connection (which is what I used) is unable to send heartbeats because it's, well, blocking. As mentioned in this issue, the heartbeat_interval parameter is only used to negotiate the connection with the server; the client doesn't actually send heartbeats. Since this is the case, what is the best way to keep a long-running connection with pika? Even if I don't specify heartbeat_interval, the server defaults to a heartbeat every 10 minutes, so the connection will die after 30 minutes...

For future visitors:
Pika has an async example which uses heartbeats:
http://pika.readthedocs.org/en/0.10.0/examples/asynchronous_publisher_example.html
For Tornado specifically, this example shows how to use Tornado's IOLoop with pika's async model:
http://pika.readthedocs.org/en/0.10.0/examples/tornado_consumer.html

Related

Python websockets - message send with success after 'Broken pipe' error

I have a server and a client written in Python. My server is implemented with asyncio and a library called 'websockets', so the architecture is asynchronous. The client, on the other hand, is implemented with a library called 'websocket-client'. They are two different code bases and repositories.
In the server repository I call the serve method to start a websocket server that accepts connections from clients and allows them to send messages to the server. It looks like this:
async with serve(
self.messages_loop, host, port, create_protocol=CentralRouterServerProtocol
) as ws_server:
...
The client uses the websocket-client library and connects to the websocket by calling the 'create_connection' method. Later it calls the 'send' method to send a message to the server. Code:
client = create_connection(f'ws://{central_router.public_ip}', timeout=24*60*60, header=cls.HEADERS)
cls.get_client().send(json.dumps(message_dict))  # Sent later, in a loop, after the user types something.
The main requirement is that the client can only send messages; it cannot read them. The server sends a ping every X seconds to confirm that the connection is alive, and waits another Y seconds for the client to reply. The client can't reply, because it is running a synchronous block of code. The server therefore closes the connection, but the client doesn't know about it. The client never reads from the websocket (so it can't learn that the websocket was closed; is that true?). Later, somebody types something into the input and the client sends a message to the server. AND NOW: the websocket-client send method does not raise any exception (that the connection is closed), but the message will never reach the server. If the user types a message one more time, it finally gets the exception
[Errno 32] Broken pipe
but the first message after the connection closes never raises an error/exception.
Why is that? What is going on? My first solution was to set ping_timeout to None on the server side. That stops the server from waiting those Y seconds for a response, so it never closes the connection. However, this is the wrong solution, because it can cause zombie connections on the server side.
Does anyone know why the client can send one more message successfully after the pipe was broken?
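This two-send behaviour isn't specific to websocket-client; it is plain TCP semantics and can be reproduced with stdlib sockets (a sketch; the sleeps are only there to let the FIN and RST arrive in the demo). send() merely copies bytes into the kernel buffer, so the first send after the peer closes "succeeds"; that data provokes an RST from the peer, and only the next send fails:

```python
import socket
import time

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket()
cli.connect(srv.getsockname())
conn, _ = srv.accept()

conn.close()                # server closes: client receives a FIN
time.sleep(0.1)

cli.sendall(b"first")       # "succeeds": bytes land in the kernel buffer,
                            # and the closed peer answers with an RST
time.sleep(0.1)

try:
    cli.sendall(b"second")  # the RST has arrived by now: broken pipe
    failed = False
except (BrokenPipeError, ConnectionResetError):
    failed = True
```

So the first send after close cannot fail by design; only the second one, after the RST has come back, raises the errno 32 you are seeing.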

Notification for FIN/ACK using python socket

I have a basic implementation of a TCP client using Python sockets. All the client does is connect to a server and send heartbeats every X seconds. The problem is that I don't want to send the server a heartbeat if the connection is closed, but I'm not sure how to detect this situation without actually sending a heartbeat and catching an exception.

When I turn off the server, I see a FIN/ACK arrive in the traffic capture and the client sends an ACK back; this is when I want my code to do something (or at least change some internal state of the connection). Currently, what happens is that after the server goes down and X seconds have passed since the last heartbeat, the client tries to send another heartbeat; only then do I see an RST packet in the capture and get a broken pipe exception (errno 32).

Clearly Python's socket handles the transport layer and the heartbeats are part of the application layer. The problem I want to solve is not sending the redundant heartbeat after the FIN/ACK has arrived from the server. Is there any simple way to know the connection state with a Python socket?
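One stdlib-only way to do this check (a sketch; it detects an orderly shutdown, i.e. a FIN, but not a peer that vanished without one): a socket that selects readable but peeks zero bytes is at EOF, so probe it with a zero-timeout select before each heartbeat:

```python
import select
import socket

def peer_closed(sock):
    """Return True if the peer has performed an orderly shutdown (FIN).
    A socket that is readable but peeks zero bytes is at EOF."""
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False          # nothing pending: connection looks alive
    try:
        data = sock.recv(1, socket.MSG_PEEK)
    except OSError:
        return True           # e.g. an RST was already received
    return data == b""        # b"" means EOF: the peer sent FIN
```

Calling peer_closed(sock) before each heartbeat lets you skip the send (and reconnect instead) once the FIN has arrived. MSG_PEEK consumes nothing, so application data still arrives normally; but note this cannot detect a peer that died without sending FIN, which is exactly what heartbeats (or TCP keepalive) are for.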

Flask-Sockets keepalive

I recently started using flask-sockets in my Flask application, with the native WebSocket API as the client. I would like to know if there is a proper way to send ping requests from the server at certain intervals as a keepalive.
When going through the geventwebsocket library, I noticed the definition handle_ping(...), but it's never called. Is there a way to set a ping interval on the WebSocket?
I sometimes see my sockets dying inconsistently after about a minute and a half.
@socket_blueprint.route('/ws', defaults={'name': ''})
def echo_socket(ws):
    ws_list.append(ws)
    while not ws.closed:
        msg = ws.receive()
        ws.send(msg)
I could probably spin up a separate thread and send ping opcodes manually every 30 seconds to the clients if I keep them in a list, but I feel like there'd be a better way to handle that...
In the service, create a thread, and in that thread periodically send some data (any data) to the client. If the client has already disconnected, after about 15 seconds the server will see the connection as closed.
I haven't found any ping method in gevent-websocket or flask-sockets, so I use this approach.
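The thread approach from the answer can be sketched like this (an assumption-laden sketch: it only relies on the ws object exposing send() and a closed flag, which the geventwebsocket WebSocket does; the 15-second interval is the answer's number):

```python
import threading
import time

def start_keepalive(ws, interval=15.0, payload=""):
    """Spawn a daemon thread that periodically sends a small message to
    the client. When the client is gone, send() raises, the loop exits,
    so a dead socket is noticed within roughly one interval."""
    def loop():
        while not getattr(ws, "closed", False):
            try:
                ws.send(payload)
            except Exception:     # peer disconnected: stop pinging
                break
            time.sleep(interval)

    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

One such thread per connection (started from the route handler) is the simplest version; a single thread iterating over the ws_list from the question would scale better if there are many clients.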

Telnet server: is it good practice to keep connections open?

I'm working on a NetHack clone that is supposed to be played through Telnet, like many NetHack servers. As I've said, this is a clone, so it's being written from scratch in Python.
I set up my socket server by reusing code from an SMTP server I wrote a while ago, and all of a sudden my attention jumped to this particular line of code:
s.listen(15)
My server was designed to be able to connect to 15 simultaneous clients just in case the data exchange with any took too long, but ideally listen(1) or listen(2) would be enough. But this case is different.
As it happens with Alt.org when you telnet their NetHack servers, people connected to my server should be able to play my roguelike remotely, through a single telnet session, so I guess this connection should not be interrupted. Yet, I've read here that
[...] if you are really holding more than 128 queued connect requests you are
a) taking too long to process them or b) need a heavy-weight
distributed server or c) suffering a DDoS attack.
What is the best practice here? Should I keep every connection open until the connected user disconnects, or is there another way? Should I go for listen(128) (or whatever my system's socket.SOMAXCONN is), or is that bad practice?
The number in listen(number) limits the number of pending connect requests.
A connect request is pending from the initial SYN received by the OS until you call the socket's accept method. So the number does not limit the count of open (established) connections; it limits the number of connections in the SYN_RECV state.
It is a bad idea not to answer an incoming connection, because:
the client will retransmit SYN requests until an answering SYN is received;
the client cannot distinguish between your server being unavailable and simply being stuck in the queue.
A better idea is to accept the connection, send the client a message with the rejection reason, and then close it.
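The "accept, explain, close" suggestion might look like this with stdlib sockets (a sketch; the player cap and the rejection message are made up for illustration):

```python
import socket

MAX_PLAYERS = 15  # hypothetical cap on simultaneous sessions

def accept_or_reject(server_sock, active_conns):
    """Accept the next pending connection. If the server is full, tell
    the client why and close, instead of leaving it in the SYN queue."""
    conn, addr = server_sock.accept()
    if len(active_conns) >= MAX_PLAYERS:
        conn.sendall(b"Server full, please try again later.\r\n")
        conn.close()
        return None
    active_conns.append(conn)
    return conn
```

With this in place, the listen backlog can stay generous (e.g. listen(socket.SOMAXCONN)) without implying 15 simultaneous players: the backlog only buffers handshakes, and the application-level cap is enforced after accept.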

Sleep after ZMQ connect?

In a ROUTER-ROUTER setup, after I connect one ROUTER socket to another, if I don't sleep (for say 0.1s or so) after I connect() to the other ROUTER socket, the send() usually doesn't go through (although it sometimes does, by chance).
Is there a way to make sure I am connected before I send?
Why aren't the send()s queued and properly executed until the connection is made?
Also, this is not about whether the server on the other end is alive but rather that I send() too soon after I connect() and somehow it fails. I am not sure why.
Is there a way to make sure I am connected before I send?
Not directly. The recommended approach is to use something like the Freelance Protocol and keep pinging until you receive a response. If you stop receiving responses to your pings, you should consider yourself disconnected.
Why aren't the send()s queued and properly executed until the connection is made?
A ROUTER cannot send a message to a peer until both sides have completed an internal ZeroMQ handshake. That's just the way it works, since the ROUTER requires the ID of its peer in order to "route". Apparently sleeping for 0.1 s is the right amount of time on your dev system. If you need the ability to connect and then send without sleeping or retrying, you need to use a different pattern.
For example, with DEALER-ROUTER, a DEALER client can connect and immediately send, and ZeroMQ will queue the message until it is delivered. The reason this works is that the DEALER does not require the ID of the peer, since it does not "route". When the ROUTER server receives the message, the handshake is already complete, so it can respond right away without sleeping.
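The DEALER-ROUTER point can be demonstrated with pyzmq (a sketch; the ephemeral local TCP port is just for the demo): the DEALER's send succeeds immediately after connect() because ZeroMQ queues it until the handshake completes, and the ROUTER sees the peer's identity prepended so it can route a reply back:

```python
import zmq

ctx = zmq.Context()

router = ctx.socket(zmq.ROUTER)
port = router.bind_to_random_port("tcp://127.0.0.1")

dealer = ctx.socket(zmq.DEALER)
dealer.connect(f"tcp://127.0.0.1:{port}")
dealer.send(b"hello")                   # no sleep needed: queued until
                                        # the handshake finishes

ident, msg = router.recv_multipart()    # [peer identity, payload]
router.send_multipart([ident, b"ack"])  # route the reply by identity

reply = dealer.recv()

dealer.close()
router.close()
ctx.term()
```

A ROUTER-to-ROUTER first send, by contrast, has no queued-until-connected guarantee, which is why the question's sleep "works" only by giving the handshake time to finish.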
