In my Tornado app, in some situations clients disconnect from the server, but my current code doesn't detect that the client has disconnected. I currently use pings to find out whether a client is disconnected.
Here is my ping/pong code:
from threading import Timer

from tornado import websocket
from tornado.websocket import WebSocketClosedError


class SocketHandler(websocket.WebSocketHandler):
    def __init__(self, application, request, **kwargs):
        # some code here
        self.ping_counter = 0  # reset whenever a pong arrives
        Timer(5.0, self.do_ping).start()

    def do_ping(self):
        try:
            self.ping_counter += 1
            self.ping("")
            if self.ping_counter > 2:
                self.close()
            Timer(60, self.do_ping).start()
        except WebSocketClosedError:
            pass

    def on_pong(self, data):
        self.ping_counter = 0
Now I want to set SO_RCVTIMEO in Tornado instead of using the ping/pong method, something like this:
sock.setsockopt(socket.SO_RCVTIMEO)
Is it possible to set SO_RCVTIMEO in Tornado to close client connections from the server after a specific timeout?
SO_RCVTIMEO does not do anything in an asynchronous framework like Tornado. You probably want to wrap your reads in tornado.gen.with_timeout. You'll still need to use pings to test the connection and make sure it is still working; if the connection is idle there are few guarantees about how long it will take for the system to notice. (TCP keepalives are a possibility, but these are not configurable on all platforms and generally use very long timeouts).
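As a rough illustration of that suggestion (a minimal sketch, not from the original answer; the stream object, byte count and 60-second deadline are assumptions):

import datetime

from tornado import gen


@gen.coroutine
def read_with_timeout(stream, num_bytes=1024):
    try:
        # Give up if nothing arrives within 60 seconds.
        data = yield gen.with_timeout(
            datetime.timedelta(seconds=60),
            stream.read_bytes(num_bytes, partial=True),
        )
        raise gen.Return(data)
    except gen.TimeoutError:
        # The peer sent nothing in time; treat it as gone and close.
        stream.close()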
I'm using Python with pika, and have the following two similar use cases:
Connect to RabbitMQ server A and server B (at different IP addresses with different credentials), listen on exchange A1 on server A; when a message arrives, process it and send it to an exchange on server B
Open an HTTP listener and connect to RabbitMQ server B; when a specific HTTP request arrives, process it and send it to an exchange on server B
Alas, in both of these cases, using my usual techniques, by the time I get to sending to server B the connection throws ConnectionClosed or ChannelClosed.
I assume this is the cause: while waiting on the incoming messages, the connection to server B (its "driver") is starved of CPU cycles and never gets a chance to service its connection socket, so it can't respond to heartbeats from server B, and the server shuts down the connection.
But I can't noodle out the fix. My current workaround is lame: I catch the ConnectionClosed, reopen a connection to server B, and retry sending my message.
But what is the "right" way to do this? I've considered these, but don't really feel I have all the parts to solve this:
Don't just sit forever in server A's basic_consume (my usual pattern), but rather use a timeout, and when I catch the timeout somehow "service" heartbeats on server B's driver before returning to a "consume with timeout"... but how do I do that? How do I "let server B's connection driver service its heartbeats"?
I know the socket library's select() call can wait for messages on several sockets at once, then service the socket that has packets waiting. So maybe this is what pika's SelectConnection is for? a) I'm not sure, this is just a hunch. b) Even if right, while I can find examples of how to create this connection, I can't find examples of how to use it to solve my multi-connection case.
Set up the two server connections in different processes and use Python interprocess queues to get the processed message from one process to the next. The idea is that two different RabbitMQ connections in two different processes should be able to independently service their heartbeats. Except... I think this has a fatal flaw: the process with "server B" is, instead, going to be "stuck" waiting on the interprocess queue, and the same "starvation" is going to happen.
I've checked StackOverflow and Googled this for an hour last night: I can't for the life of me find a blog post or sample code for this.
Any input? Thanks a million!
I managed to work it out, basing my solution on the documentation and an answer in the pika-python Google group.
First of all, your assumption is correct: the client process that's connected to server B, responsible for publishing, cannot reply to heartbeats if it's already blocking on something else, like waiting for a message from server A or blocking on an internal communication queue.
The crux of the solution is that the publisher should run as a separate thread and use BlockingConnection.process_data_events to service heartbeats and such. It looks like that method is supposed to be called in a loop that checks if the publisher still needs to run:
def run(self):
    while self.is_running:
        # Block at most 1 second before returning and re-checking
        self.connection.process_data_events(time_limit=1)
Proof of concept
Since proving the full solution requires having two separate RabbitMQ instances running, I have put together a Git repo with an appropriate docker-compose.yml, the application code and comments to test this solution.
https://github.com/karls/rabbitmq-two-connections
Solution outline
Below is a sketch of the solution, minus imports and such. Some notable things:
Publisher runs as a separate thread
The only "work" that the publisher does is servicing heartbeats and such, via Connection.process_data_events
The publisher registers a callback whenever the consumer wants to publish a message, using Connection.add_callback_threadsafe
The consumer takes the publisher as a constructor argument so it can publish the messages it receives, but it can work via any other mechanism as long as you have a reference to an instance of Publisher
The code is taken from the linked Git repo, which is why certain details are hardcoded, e.g. the queue name. It will work with any RabbitMQ setup needed (direct-to-queue, topic exchange, fanout, etc.).
class Publisher(threading.Thread):
    def __init__(
        self,
        connection_params: ConnectionParameters,
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.daemon = True
        self.is_running = True
        self.name = "Publisher"
        self.queue = "downstream_queue"
        self.connection = BlockingConnection(connection_params)
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=self.queue, auto_delete=True)
        self.channel.confirm_delivery()

    def run(self):
        while self.is_running:
            self.connection.process_data_events(time_limit=1)

    def _publish(self, message):
        logger.info("Calling '_publish'")
        self.channel.basic_publish("", self.queue, body=message.encode())

    def publish(self, message):
        logger.info("Calling 'publish'")
        self.connection.add_callback_threadsafe(lambda: self._publish(message))

    def stop(self):
        logger.info("Stopping...")
        self.is_running = False
        # Call .process_data_events one more time to block
        # and allow the while-loop in .run() to break.
        # Otherwise the connection might be closed too early.
        self.connection.process_data_events(time_limit=1)
        if self.connection.is_open:
            self.connection.close()
            logger.info("Connection closed")
        logger.info("Stopped")


class Consumer:
    def __init__(
        self,
        connection_params: ConnectionParameters,
        publisher: Optional["Publisher"] = None,
    ):
        self.publisher = publisher
        self.queue = "upstream_queue"
        self.connection = BlockingConnection(connection_params)
        self.channel = self.connection.channel()
        self.channel.queue_declare(queue=self.queue, auto_delete=True)
        self.channel.basic_qos(prefetch_count=1)

    def start(self):
        self.channel.basic_consume(
            queue=self.queue, on_message_callback=self.on_message
        )
        try:
            self.channel.start_consuming()
        except KeyboardInterrupt:
            logger.info("Warm shutdown requested...")
        except Exception:
            traceback.print_exception(*sys.exc_info())
        finally:
            self.stop()

    def on_message(self, _channel: Channel, m, _properties, body):
        try:
            message = body.decode()
            logger.info(f"Got: {message!r}")
            if self.publisher:
                self.publisher.publish(message)
            else:
                logger.info(f"No publisher provided, printing message: {message!r}")
            self.channel.basic_ack(delivery_tag=m.delivery_tag)
        except Exception:
            traceback.print_exception(*sys.exc_info())
            self.channel.basic_nack(delivery_tag=m.delivery_tag, requeue=False)

    def stop(self):
        logger.info("Stopping consuming...")
        if self.connection.is_open:
            logger.info("Closing connection...")
            self.connection.close()
        if self.publisher:
            self.publisher.stop()
        logger.info("Stopped")
I have an app, similar to a chat room, written in Python, that intends to do the following things:
1. Prompt the user to input a websocket server address.
2. Create a websocket client that connects to the server and sends/receives messages. Disable the ability to create another websocket client.
3. After receiving "close" from the server (NOT a close frame), the client should drop the connection and re-enable the app to create a client. Go back to 1.
4. If the user exits the app, exit the websocket client if one is running.
My approach for this is to use a main thread to deal with user input. When the user hits enter, a thread is created for the WebSocketClient using Autobahn's Twisted module, and a Queue is passed to it. The thread checks whether the reactor is running and starts it if it isn't.
The onMessage method is overridden to put a closing flag into the Queue when "close" is received. The main thread busy-checks the Queue until it receives the flag, then goes back to the start. The code looks like the following.
Main thread.
import threading
import Queue


def main_thread():
    while True:
        text = raw_input("Input server url or exit")
        if text == "exit":
            if myreactor:
                myreactor.stop()
            break
        msgq = Queue.Queue()
        threading.Thread(target=wsthread, args=(text, msgq)).start()
        is_close = False
        while True:
            if msgq.empty() is False:
                msg = msgq.get()
                if msg == "close":
                    is_close = True
                else:
                    print msg
            if is_close:
                break
        print 'Websocket client closed!'
Factory and Protocol.
class MyProtocol(WebSocketClientProtocol):
    def onMessage(self, payload, isBinary):
        msg = payload.decode('utf-8')
        self.factory.queue.put(msg)
        if msg == 'close':
            self.dropConnection(abort=True)


class WebSocketClientFactoryWithQ(WebSocketClientFactory):
    def __init__(self, *args, **kwargs):
        self.queue = kwargs.pop('queue', None)
        WebSocketClientFactory.__init__(self, *args, **kwargs)
Client thread.
def wsthread(url, q):
    global myreactor
    factory = WebSocketClientFactoryWithQ(url=url, queue=q)
    factory.protocol = MyProtocol
    connectWS(factory)
    if myreactor is None:
        myreactor = reactor
        reactor.run()
    print 'Done'
Now I have a problem. It seems that my client thread never stops. Even after I receive "close", it seems to keep running, and every time I try to create a new client, a new thread is created. I understand the first thread won't stop since reactor.run() will run forever, but from the second thread on, it should be non-blocking since I'm not starting the reactor anymore. How can I change that?
EDIT:
I ended up solving it by:
Adding stopFactory() after the disconnect.
Making protocol function calls with reactor.callFromThread() (a sketch of this pattern follows below).
Starting the reactor in the first thread, putting clients in other threads, and using reactor.callInThread() to create them.
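A minimal sketch of that callFromThread pattern (an assumption for illustration, not the original code; the protocol reference and the sendClose call are placeholders):

from twisted.internet import reactor


def request_close(protocol):
    # Called from a non-reactor thread. reactor.callFromThread is the only
    # thread-safe way to invoke Twisted/Autobahn APIs from outside the
    # reactor thread; it schedules the call to run in the reactor thread.
    reactor.callFromThread(protocol.sendClose)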
Your main_thread creates new threads running wsthread. wsthread uses Twisted APIs. The first wsthread becomes the reactor thread. All subsequent threads are different and it is undefined what happens if you use a Twisted API from them.
You should almost certainly remove the use of threads from your application. For dealing with console input in a Twisted-based application, take a look at twisted.conch.stdio (not the best documented part of Twisted, alas, but just what you want).
I was trying to build a server. Besides accepting connections from clients as normal servers do, my server also connects to another server as a client.
I've set up the protocol and endpoint as below:
p = FooProtocol()
client = TCP4ClientEndpoint(reactor, '127.0.0.1', 8080)  # without ClientFactory
Then, after calling reactor.run(), the server will listen for and accept new socket connections. When new socket connections are made (in connectionMade), the server will call connectProtocol(client, p), which acts like the pseudocode below:
while server accepts new socket:
    connectProtocol(client, p)
    # client.connect(foo_client_factory) --> connecting in this way
    # won't cause a memory leak
As the connections to the client are made, memory is gradually consumed (explicitly calling gc doesn't help).
Am I using Twisted in the wrong way?
-----UPDATE-----
My test program: the server waits for clients to connect. When a connection from a client is made, the server creates 50 connections to the other server.
Here is the code:
#! /usr/bin/env python
import sys
import gc

from twisted.internet import protocol, reactor, defer, endpoints
from twisted.internet.endpoints import TCP4ClientEndpoint, connectProtocol


class MyClientProtocol(protocol.Protocol):
    def connectionMade(self):
        self.transport.loseConnection()


class MyClientFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        p = MyClientProtocol()
        return p


class ServerFactory(protocol.Factory):
    def buildProtocol(self, addr):
        p = ServerProtocol()
        return p


client_factory = MyClientFactory()  # global
client_endpoint = TCP4ClientEndpoint(reactor, '127.0.0.1', 8080)  # global
times = 0


class ServerProtocol(protocol.Protocol):
    def connectionMade(self):
        global client_factory
        global client_endpoint
        global times

        for i in range(50):
            # 1)
            p = MyClientProtocol()
            connectProtocol(client_endpoint, p)  # cause memleak
            # 2)
            #client_endpoint.connect(client_factory)  # no memleak

        times += 1
        if times % 10 == 9:
            print 'gc'
            gc.collect()  # doesn't work

        self.transport.loseConnection()


if __name__ == '__main__':
    server_factory = ServerFactory()
    serverEndpoint = endpoints.serverFromString(reactor, "tcp:8888")
    serverEndpoint.listen(server_factory)
    reactor.run()
This program doesn't do any Twisted log initialization. This means it runs with the "log beginner" for its entire run. The log beginner records all log events it observes in a LimitedHistoryLogObserver (up to a configurable maximum).
The log beginner keeps 2 ** 16 (_DEFAULT_BUFFER_MAXIMUM) events and then begins throwing out old ones, presumably to avoid consuming all available memory if a program never configures another observer.
If you hack the Twisted source to set _DEFAULT_BUFFER_MAXIMUM to a smaller value - e.g., 10 - then the program no longer "leaks". Of course, it's really just an object leak and not a memory leak, and it's bounded by the 2 ** 16 limit Twisted imposes.
However, connectProtocol creates a new factory each time it is called. When each new factory is created, it logs a message. And the application code generates a new Logger for each log message. And the logging code puts the new Logger into the log message. This means the memory cost of keeping those log messages around is quite noticeable (compared to just leaking a short blob of text or even a dict with a few simple objects in it).
I'd say the code in Twisted is behaving just as intended ... but perhaps someone didn't think through the consequences of that behavior completely.
And, of course, if you configure your own log observer then the "log beginner" is taken out of the picture and there is no problem. It does seem reasonable to expect that all serious programs will enable logging rather quickly and avoid this issue. However, lots of short throw-away or example programs often don't ever initialize logging and rely on print instead, making them subject to this behavior.
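For example, a minimal sketch (an assumption, not part of the original answer) of starting a real observer early so the buffering log beginner is replaced:

import sys

from twisted.logger import globalLogBeginner, textFileLogObserver

# Once a real observer is registered, the log beginner stops buffering
# events in its LimitedHistoryLogObserver.
globalLogBeginner.beginLoggingTo([textFileLogObserver(sys.stdout)])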
Note: This problem was reported in #8164 and fixed in 4acde626, so Twisted 17 will not have this behavior.
I'm trying to test some code that reconnects to a server after a disconnect. This works perfectly fine outside the tests, but it fails to acknowledge that the socket has disconnected when running the tests.
I'm using a gevent StreamServer to mock a real listening server:
import socket

import gevent.server
from gevent import queue


class TestServer(gevent.server.StreamServer):
    def __init__(self, *args, **kwargs):
        super(TestServer, self).__init__(*args, **kwargs)
        self.sockets = {}

    def handle(self, socket, address):
        self.sockets[address] = (socket, queue.Queue())
        socket.sendall('testing the connection\r\n')
        gevent.spawn(self.recv, address)

    def recv(self, address):
        socket = self.sockets[address][0]
        queue = self.sockets[address][1]
        print 'Connection accepted %s:%d' % address
        try:
            for data in socket.recv(1024):
                queue.put(data)
        except:
            pass

    def murder(self):
        self.stop()
        for sock in self.sockets.iteritems():
            print sock
            sock[1][0].shutdown(socket.SHUT_RDWR)
            sock[1][0].close()
        self.sockets = {}


def run_server():
    test_server = TestServer(('127.0.0.1', 10666))
    test_server.start()
    return test_server
And my test looks like this:
def test_can_reconnect(self):
    test_server = run_server()
    client_config = {'host': '127.0.0.1', 'port': 10666}
    client = Connection('test client', client_config, get_config())
    client.connect()
    assert client.socket_connected

    test_server.murder()
    #time.sleep(4) #tried sleeping. no dice.

    assert not client.socket_connected
    assert client.server_disconnect

    test_server = run_server()
    client.reconnect()
    assert client.socket_connected
It fails at assert not client.socket_connected.
I check for "not data" during recv. If it's None, I set some variables so that other code can decide whether or not to reconnect (don't reconnect if it was a user_disconnect, and so on). This behavior works and has always worked for me in the past; I've just never tried to write a test for it until now. Is there something odd with socket connections and local function scopes or something? It's like the connection still exists even after stopping the server.
The code I'm trying to test is open: https://github.com/kyleterry/tenyks.git
If you run the tests, you will see the one I'm trying to fix fail.
Trying to run a unit test with a real socket is a tough row to hoe. It's going to be tricky, since only one set of tests can run at a time (the server port will be in use), and it's going to be slow as the sockets get set up and torn down. On top of that, if this is really a unit test you don't want to test the socket, just the code that's using the socket.
If you mock the socket calls you can throw exceptions willy nilly from the mocked code and ensure that the code making use of the socket does the right thing. You don't need a real socket to ensure that the class under test does the right thing, you can fake it if you can wrap the socket calls in an object. Pass in a reference to the socket object when constructing your class and you're ready to go.
My suggestion is to wrap the socket calls in a class that supports sendall, recv, and all the methods you call on the socket. Then you can swap out the actual Socket class with a TestReconnectSocket (or whatever) and run your tests.
Take a look at mox, a python mocking framework.
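For example, here is a minimal sketch (an assumption, not tied to the tenyks code) of wrapping the socket calls so a fake can be injected in tests:

import socket


class SocketWrapper(object):
    # Thin wrapper around the socket calls the client uses; production code
    # gets a real socket, tests pass in a fake.
    def __init__(self, sock=None):
        self.sock = sock or socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    def connect(self, address):
        self.sock.connect(address)

    def sendall(self, data):
        self.sock.sendall(data)

    def recv(self, bufsize):
        return self.sock.recv(bufsize)


class FakeDisconnectingSocket(object):
    # Test double whose recv() immediately reports a closed peer.
    def connect(self, address):
        pass

    def sendall(self, data):
        pass

    def recv(self, bufsize):
        return ''  # an empty string is how recv() signals a disconnect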
Vague response, but my immediate reaction would be that your recv() call is blocking and keeping the socket alive - have you tried making the socket non-blocking, and catching the error on close instead?
One thing to keep in mind when testing sockets like this is that operating systems don't like to reopen a socket soon after it has been in use. You can set a socket option to tell it to go ahead and reuse it anyway. Right after you create the socket, set this socket option:
mysocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
Hopefully this will fix your issue. You may have to do it on both the server and client side depending on which one is giving you the problems.
You are calling shutdown(socket.SHUT_RDWR), so this doesn't seem like a problem with recv blocking.
However, you are using gevent.socket.socket.recv, so please check your gevent version; there is an issue with recv() that causes it to block if the underlying file descriptor is closed (versions < v0.13.0).
You may still need gevent.sleep() to do a cooperative yield and give the client an opportunity to exit the recv() call, as sketched below.
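For instance, a sketch (an assumption, not verified against the question's code) of adding that yield in the test shown earlier:

import gevent


def test_can_reconnect(self):
    # ... same setup as above ...
    test_server.murder()
    gevent.sleep(0)  # cooperative yield so the recv() greenlet gets scheduled
    assert not client.socket_connected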
I want to start a simple web server locally, then launch a browser with a URL it just serves. This is roughly what I'd like to write:
from wsgiref.simple_server import make_server
import webbrowser

srv = make_server(...)
srv.blocking = False
srv.serve_forever()

webbrowser.open_new_tab(...)

try:
    srv.blocking = True
except KeyboardInterrupt:
    pass

print 'Bye'
The problem is, I couldn't find a way to set a blocking option for the wsgiref simple server. By default it's blocking, so the browser would be launched only after I stopped the server. If I launch the browser first, the request is not handled yet. I'd prefer to use an HTTP server from the standard library, not an external one like Tornado.
You either have to spawn a thread with the server, so you can continue with your control flow, or you have to use two Python processes.
Untested code, but you should get the idea:
import threading
import webbrowser
from wsgiref.simple_server import make_server


class ServerThread(threading.Thread):
    def __init__(self, port=None):
        threading.Thread.__init__(self)
        self.port = port  # pass the port through to make_server() in run()

    def run(self):
        srv = make_server(...)
        srv.serve_forever()


if '__main__' == __name__:
    ServerThread().start()
    webbrowser.open_new_tab(...)
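A fuller, self-contained sketch along the same lines (the port number, the trivial WSGI app, the daemon flag and the join loop are assumptions added for illustration):

import threading
import webbrowser
from wsgiref.simple_server import make_server


def app(environ, start_response):
    # Trivial WSGI app so the browser has something to load.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['Hello from the local server\n']


def serve(port):
    srv = make_server('127.0.0.1', port, app)
    srv.serve_forever()


if __name__ == '__main__':
    port = 8000
    server_thread = threading.Thread(target=serve, args=(port,))
    server_thread.daemon = True  # let the thread die with the main program
    server_thread.start()
    webbrowser.open_new_tab('http://127.0.0.1:%d/' % port)
    try:
        while True:
            server_thread.join(1)
    except KeyboardInterrupt:
        pass
    print 'Bye'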