How should I handle reconnections in twisted.application.internet.ClientService? - python

I am trying to make use of the recently introduced twisted.application.internet.ClientService class in a twisted application that does simple modbus-tcp polling using pymodbus. I feel my issues have nothing to do with the modbus Protocol that I am using, as I have created quite a few other working prototypes using the lower level twisted APIs; but this new ClientService looks like it fits my needs exactly, thus should reduce my code footprint and keep it neat if I can get it to work.
My tests show the ClientService handles reconnections just as it is expected to and I have easy access to the first connections Protocol. The problem that I am having is getting hold of subsequent Protocol objects for the reconnections. Here is a simplified version of the code I am having the issue with:
from twisted.application import internet, service
from twisted.internet.protocol import ClientFactory
from twisted.internet import reactor, endpoints
from pymodbus.client.async import ModbusClientProtocol
class ModbusPollingService(internet.ClientService):
def __init__(self, addrstr, numregs=5):
self.numregs=numregs
internet.ClientService.__init__(self,
endpoints.clientFromString(reactor, addrstr),
ClientFactory.forProtocol(ModbusClientProtocol))
def startService(self):
internet.ClientService.startService(self)
self._pollWhenConnected()
def _pollWhenConnected(self):
d = self.whenConnected()
d.addCallback(self._connected)
d.addErrback(self._connfail)
def _connected(self, p):
self._log.debug("connected: {p}", p=p)
self._mbp = p
self._poll()
return True
def _connfail(self, failstat):
self._log.failure('connection failure', failure=failstat)
self._mbp = None
self._pollWhenConnected()
def _poll(self):
self._log.debug("poll: {n}", n=self.numregs)
d = self._mbp.read_holding_registers(0, self.numregs)
d.addCallback(self._regs)
d.addErrback(self._connfail)
def _regs(self, res):
self._log.debug("regs: {r}", r=res.registers)
# Do real work of dealing storing registers here
reactor.callLater(1, self._poll)
return res
application = service.Application("ModBus Polling Test")
mbpollsvc = ModbusPollingService('tcp:127.0.0.1:502')
mbpollsvc.setServiceParent(application)
When the connection fails (for whatever reason) the errback of the deferred returned from read_holding_registers() gets called with the intention that my service can abandon that Protocol and go back into a state of waiting for a new connections Protocol to be returned by the whenConnected() callback... however what seems to be happening is that the ClientService does not yet realise the connection is dead and returns me the same disconnected Protocol, giving me a log full of:
2016-05-05 17:28:25-0400 [-] connected: <pymodbus.client.async.ModbusClientProtocol object at 0x000000000227b558>
2016-05-05 17:28:25-0400 [-] poll: 5
2016-05-05 17:28:25-0400 [-] connection failure
Traceback (most recent call last):
Failure: pymodbus.exceptions.ConnectionException: Modbus Error: [Connection] Client is not connected
2016-05-05 17:28:25-0400 [-] connected: <pymodbus.client.async.ModbusClientProtocol object at 0x000000000227b558>
2016-05-05 17:28:25-0400 [-] poll: 5
2016-05-05 17:28:25-0400 [-] connection failure
Traceback (most recent call last):
Failure: pymodbus.exceptions.ConnectionException: Modbus Error: [Connection] Client is not connected
or very similar, note repeated ModbusClientProtocol object address.
I'm pretty sure that I've probably just made a poor choice of pattern for this API, but I've iterated through a few different possibilities such as creating my own Protocol and Factory based on ModbusClientProtocol and handling the polling mechanism entirely within that class; but it felt a bit messy passing the persistent config and mechanism to store the polled data that way, it seems like handling this at or above the ClientService level is a cleaner approach but I can't work out the best way of keeping track of the currently connected Protocol. I guess what I'm really looking for is a best practice recommendation for usage of the ClientService class in extended polling situations.

This is an old question. But, hopefully, it will help somebody else.
The problem that I am having is getting hold of subsequent Protocol objects for the reconnections.
Supply prepareConnection callable to ClientService constructor. It will supply current connection.
In the example below MyService attaches itself to MyFactory. The main reason for this is so that MyFactory can let MyService know when ClientService disconnected. It's possible because ClientService calls Factory.stopFactory on disconnect.
Next time ClientService reconnects it will call its prepareConnection supplying current protocol instance.
(Reconnecting) ClientService:
# clientservice.py
# twistd -y clientservice.py
from twisted.application import service, internet
from twisted.internet.protocol import Factory
from twisted.internet import endpoints, reactor
from twisted.protocols import basic
from twisted.logger import Logger
class MyProtocol(basic.Int16StringReceiver):
_log = Logger()
def stringReceived(self, data):
self._log.info('Received data from {peer}, data={data}',
peer=self.transport.getPeer(),
data=data)
class MyFactory(Factory):
_log = Logger()
protocol = MyProtocol
def stopFactory(self):
# Let service know that its current connection is stale
self.service.on_connection_lost()
class MyService(internet.ClientService):
def __init__(self, endpoint, factory):
internet.ClientService.__init__(self,
endpoint,
factory,
prepareConnection=self.on_prepare_connection)
factory.service = self # Attach this service to factory
self.connection = None # Future protocol instance
def on_prepare_connection(self, connection):
self.connection = connection # Attach protocol to service
self._log.info('Connected to {peer}',
peer=self.connection.transport.getPeer())
self.send_message('Hello from prepare connection!')
def on_connection_lost(self):
if self.connection is None:
return
self._log.info('Disconnected from {peer}',
peer=self.connection.transport.getPeer())
self.connection = None
def send_message(self, message):
if self.connection is None:
raise Exception('Service is not available')
self.connection.sendString(bytes(message, 'utf-8'))
application = service.Application('MyApplication')
my_endpoint = endpoints.clientFromString(reactor, 'tcp:localhost:22222')
my_factory = MyFactory()
my_service = MyService(my_endpoint, my_factory)
my_service.setServiceParent(application)
Slightly modified echo server from twisted examples:
#!/usr/bin/env python
# echoserv.py
# python echoserv.py
# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.
from twisted.internet.protocol import Protocol, Factory
from twisted.internet import reactor
from twisted.protocols import basic
### Protocol Implementation
# This is just about the simplest possible protocol
class Echo(basic.Int16StringReceiver):
def stringReceived(self, data):
"""
As soon as any data is received, write it back.
"""
print("Received:", data.decode('utf-8'))
self.sendString(data)
def main():
f = Factory()
f.protocol = Echo
reactor.listenTCP(22222, f)
reactor.run()
if __name__ == '__main__':
main()

You're not calling self.transport.loseConnection() anywhere that I can see in response to your polling, so as far as twisted can tell, you aren't actually disconnected. It may later, when you stop doing anything on the old transport, but by then you've lost track of things.

Related

Twisted: Using connectProtocol to connect endpoint cause memory leak?

I was trying to build a server. Beside accept connection from clients as normal servers do, my server will connect other server as a client either.
I've set the protocol and endpoint like below:
p = FooProtocol()
client = TCP4ClientEndpoint(reactor, '127.0.0.1' , 8080) # without ClientFactory
Then, after call reactor.run(), the server will listen/accept new socket connections. when new socket connections are made(in connectionMade), the server will call connectProtocol(client, p), which acts like the pseudocode below:
while server accept new socket:
connectProtocol(client, p)
# client.client.connect(foo_client_factory) --> connecting in this way won't
# cause memory leak
As the connections to the client are made, the memory is gradually consumed(explicitly calling gc doesn't work).
Do I use the Twisted in a wrong way?
-----UPDATE-----
My test programe: Server waits clients to connect. When connection from client is made, server will create 50 connections to other server
Here is the code:
#! /usr/bin/env python
import sys
import gc
from twisted.internet import protocol, reactor, defer, endpoints
from twisted.internet.endpoints import TCP4ClientEndpoint, connectProtocol
class MyClientProtocol(protocol.Protocol):
def connectionMade(self):
self.transport.loseConnection()
class MyClientFactory(protocol.ClientFactory):
def buildProtocol(self, addr):
p = MyClientProtocol()
return p
class ServerFactory(protocol.Factory):
def buildProtocol(self, addr):
p = ServerProtocol()
return p
client_factory = MyClientFactory() # global
client_endpoint = TCP4ClientEndpoint(reactor, '127.0.0.1' , 8080) # global
times = 0
class ServerProtocol(protocol.Protocol):
def connectionMade(self):
global client_factory
global client_endpoint
global times
for i in range(50):
# 1)
p = MyClientProtocol()
connectProtocol(client_endpoint, p) # cause memleak
# 2)
#client_endpoint.connect(client_factory) # no memleak
times += 1
if times % 10 == 9:
print 'gc'
gc.collect() # doesn't work
self.transport.loseConnection()
if __name__ == '__main__':
server_factory = ServerFactory()
serverEndpoint = endpoints.serverFromString(reactor, "tcp:8888")
serverEndpoint.listen(server_factory)
reactor.run()
This program doesn't do any Twisted log initialization. This means it runs with the "log beginner" for its entire run. The log beginner records all log events it observes in a LimitedHistoryLogObserver (up to a configurable maximum).
The log beginner keeps 2 ** 16 (_DEFAULT_BUFFER_MAXIMUM) events and then begins throwing out old ones, presumably to avoid consuming all available memory if a program never configures another observer.
If you hack the Twisted source to set _DEFAULT_BUFFER_MAXIMUM to a smaller value - eg, 10 - then the program no longer "leaks". Of course, it's really just an object leak and not a memory leak and it's bounded by the 2 ** 16 limit Twisted imposes.
However, connectProtocol creates a new factory each time it is called. When each new factory is created, it logs a message. And the application code generates a new Logger for each log message. And the logging code puts the new Logger into the log message. This means the memory cost of keeping those log messages around is quite noticable (compared to just leaking a short blob of text or even a dict with a few simple objects in it).
I'd say the code in Twisted is behaving just as intended ... but perhaps someone didn't think through the consequences of that behavior complete.
And, of course, if you configure your own log observer then the "log beginner" is taken out of the picture and there is no problem. It does seem reasonable to expect that all serious programs will enable logging rather quickly and avoid this issue. However, lots of short throw-away or example programs often don't ever initialize logging and rely on print instead, making them subject to this behavior.
Note This problem was reported in #8164 and fixed in 4acde626 so Twisted 17 will not have this behavior.

Twisted - How can I tell the reactor to dispose a Protocol object after using adoptStreamConnection in a subprocess?

I'm trying to pass a TCP connection to a Twisted subprocess with adoptStreamConnection, but I can't figure out how to get the Process disposed in the main process after doing that.
My desired flow looks like this:
Finish writing any data the Protocol transport has waiting
When we know the write buffer is empty send the AMP message to transfer the socket to the subprocess
Dispose the Protocol instance in the main process
I tried doing nothing, loseConnection, abortConnection, and monkey patching _socketClose out and using loseConnection. See code here:
import weakref
from twisted.internet import reactor
from twisted.internet.endpoints import TCP4ServerEndpoint
from twisted.python.sendmsg import getsockfam
from twisted.internet.protocol import Factory, Protocol
import twisted.internet.abstract
class EchoProtocol(Protocol):
def dataReceived(self, data):
self.transport.write(data)
class EchoFactory(Factory):
protocol = EchoProtocol
class TransferProtocol(Protocol):
def dataReceived(self, data):
self.transport.write('main process still listening!: %s' % (data))
def connectionMade(self):
self.transport.write('this message should make it to the subprocess\n')
# attempt 1: do nothing
# everything works fine in the adopt (including receiving the written message), but old protocol still exists (though isn't doing anything)
# attempt 1: try calling loseConnection
# we lose connection before the adopt opens the socket (presumably TCP disconnect message was sent)
#
# self.transport.loseConnection()
# attempt 2: try calling abortConnection
# result is same as loseConnection
#
# self.transport.abortConnection()
# attempt 3: try monkey patching the socket close out and calling loseConnection
# result: same as doing nothing-- adopt works (including receiving the written message), old protocol still exists
#
# def ignored(*args, **kwargs):
# print 'ignored :D'
#
# self.transport._closeSocket = ignored
# self.transport.loseConnection()
reactor.callLater(0, adopt, self.transport.fileno())
class ServerFactory(Factory):
def buildProtocol(self, addr):
p = TransferProtocol()
self.ref = weakref.ref(p)
return p
f = ServerFactory()
def adopt(fileno):
print "does old protocol still exist?: %r" % (f.ref())
reactor.adoptStreamConnection(fileno, getsockfam(fileno), EchoFactory())
port = 1337
endpoint = TCP4ServerEndpoint(reactor, port)
d = endpoint.listen(f)
reactor.run()
In all cases the Protocol object still exists in the main process after the socket has been transferred. How can I clean this up?
Thanks in advance.
Neither loseConnection nor abortConnection tell the reactor to "forget" about a connection; they close the connection, which is very different; they tell the peer that the connection has gone away.
You want to call self.transport.stopReading() and self.transport.stopWriting() to remove the references to it from the reactor.
Also, it's not valid to use a weakref to test for the remaining existence of an object unless you call gc.collect() first.
As far as making sure that all the data has been sent: the only reliable way to do that is to have an application-level acknowledgement of the data that you've sent. This is why protocols that need a handshake that involves changing protocols - say, for example, STARTTLS - have a specific handshake where the initiator says "I'm going to switch" (and then stops sending), then the peer says "OK, you can switch now". Another way to handle that in this case would be to hand the data you'd like to write to the subprocess via some other channel, instead of passing it to transport.write.

Python twisted: A server that writes to the connection after an asynchronous operation?

I have a server where I have implemented a child of the NetstringReceiver protocol. I want it to perform an asynchronous operation (using txredisapi) based on the client's request and then respond with the results of the operation. A generalization of my code:
class MyProtocol(NetstringReceiver):
def stringReceived(self, request):
d = async_function_that_returns_deferred(request)
d.addCallback(self.respond)
# self.sendString(myString)
def respond(self, result_of_async_function):
self.sendString(result_of_async_function)
In the above code, the client connecting to my server does not get a response. However, it does get myString if I uncomment
# self.sendString(myString)
I also know that result_of_async_function is a non-empty string because I print it to stdout .
What can I do that will allow me to respond to the client with the result of the asynchronous function?
Update: Runnable source code
from twisted.internet import reactor, defer, protocol
from twisted.protocols.basic import NetstringReceiver
from twisted.internet.task import deferLater
def f():
return "RESPONSE"
class MyProtocol(NetstringReceiver):
def stringReceived(self, _):
d = deferLater(reactor, 5, f)
d.addCallback(self.reply)
# self.sendString(str(f())) # Note that this DOES send the string.
def reply(self, response):
self.sendString(str(response)) # Why does this not send the string and how to fix?
class MyFactory(protocol.ServerFactory):
protocol = MyProtocol
def main():
factory = MyFactory()
from twisted.internet import reactor
port = reactor.listenTCP(8888, factory, )
print 'Serving on %s' % port.getHost()
reactor.run()
if __name__ == "__main__":
main()
There's one specific feature about NetstringReceiver:
The connection is lost if an illegal message is received
Are you sure that your messages conform djb's Netstring protocol?
Obviously the client sends illegal string that could not be parsed, and connection is lost by protocol conditions. Everything else looks good in your code.
If you don't need that specific protcol, you'd better inherit LineReceiver instead of NetstringReceiver.
The reason you never get the response is because by the time it's sent the connection is closed. The reason the connection is closed is because the message you send with 'nc' is:
1:a,\n
Because you have to type a newline to get nc to send the message, but nc includes it as part of the message. That violates the NetString protocol...
I worked around it (with your code modified with some additional prints) by sending this message instead:
1:a,40:\n
blahblahblahDon't hit return here, just wait for the reply
8:RESPONSE,

Twisted client for a send only protocol that is tolerant of disconnects

I've decided to dip my toe into the world of asynchronous python with the help of twisted. I've implemented some of the examples from the documentation, but I'm having a difficult time finding an example of the, very simple, client I'm trying to write.
In short I'd like a client which establishes a tcp connection with a server and then sends simple "\n" terminated string messages off of a queue object to the server. The server doesn't ever respond with any messages so my client is fully unidirectional. I /think/ that what I want is some combination of this example and the twisted.internet.protocols.basic.LineReceiver convenience protocol. This feels like it should be just about the simplest thing one could do in twisted, but none of the documentation or examples I've seen online seem to fit quite right.
What I have done is not used a Queue but I am illustrating the code that sends a line, once a connection is made. There are bunch of print stuff that will help you understand on what is going on.
Usual import stuff:
from twisted.web import proxy
from twisted.internet import reactor
from twisted.internet import protocol
from twisted.internet.protocol import ReconnectingClientFactory
from twisted.protocols import basic
from twisted.python import log
import sys
log.startLogging(sys.stdout)
You create a protocol derived from line receiver, set the delimiter.
In this case, I simply write a string "www" once the connection is made.
The key thing is to look at protocol interface at twisted.internet.interface.py and understand the various methods of protocol and what they do and when they are called.
class MyProtocol(basic.LineReceiver):
#def makeConnection(self, transport):
# print transport
def connectionLost(self, reason):
print reason
self.sendData = False
def connectionMade(self):
print "connection made"
self.delimiter = "\n"
self.sendData = True
print self.transport
self.sendFromQueue()
def sendFromQueue(self):
while self.sendData:
msg = dataQueue.get()
self.sendLine(msg)
# you need to handle empty queue
# Have another function to resume
Finally, A protocol factory that will create a protocol instance for every connection.
Look at method : buildProtcol.
class myProtocolFactory():
protocol = MyProtocol
def doStart(self):
pass
def startedConnecting(self, connectorInstance):
print connectorInstance
def buildProtocol(self, address):
print address
return self.protocol()
def clientConnectionLost(self, connection, reason):
print reason
print connection
def clientConnectionFailed(self, connection, reason):
print connection
print reason
def doStop(self):
pass
Now you use a connector to make a connection:
reactor.connectTCP('localhost', 50000, myProtocolFactory())
reactor.run()
I ran this and connected it to an server that simply prints what it receives and hence send no ack back. Here is the output:
1286906080.08 82 INFO 140735087148064 __main__ conn_made: client_address=127.0.0.1:50277
1286906080.08 83 DEBUG 140735087148064 __main__ created handler; waiting for loop
1286906080.08 83 DEBUG 140735087148064 __main__ handle_read
1286906080.08 83 DEBUG 140735087148064 __main__ after recv
'www\n'
Recieved: 4
The above example if not fault tolerant. To reconnect , when a connection is lost, you can derive your protocol factory from an existing twisted class - ReconnectingClientFactory.
Twisted has almost all the tools that you would need :)
class myProtocolFactory(ReconnectingClientFactory):
protocol = MyProtocol
def buildProtocol(self, address):
print address
return self.protocol()
For further reference
I suggest that you read : http://krondo.com/?page_id=1327
[Edited: As per comment below]

How can I ensure closing all connection in loadbalancer when something fails or hangs?

I'm trying to write a simple load-balancer. It works ok till one of servers (BalanceServer) doesn't close connection then...
Client (ReverseProxy) disconnects but the connection in with BalanceServer stays open.
I tried to add callback (#3) to ReverseProxy.connectionLost to close the connection with one of the servers as I do with closing connection when server disconnects (clientLoseConnection), but at that time the ServerWriter is Null and I cannot terminate it at #1 and #2
How can I ensure that all connections are closed when one of sides disconnects? I guess that also some kind of timeout here would be nice when both client and one of servers hang, but how can I add it so it works on both connections?
from twisted.internet.protocol import Protocol, Factory, ClientCreator
from twisted.internet import reactor, defer
from collections import namedtuple
BalanceServer = namedtuple('BalanceServer', 'host port')
SERVER_LIST = [BalanceServer('127.0.0.1', 8000), BalanceServer('127.0.0.1', 8001)]
def getServer(servers):
while True:
for server in servers:
yield server
# this writes to one of balance servers and responds to client with callback 'clientWrite'
class ServerWriter(Protocol):
def sendData(self, data):
self.transport.write(data)
def dataReceived(self, data):
self.clientWrite(data)
def connectionLost( self, reason ):
self.clientLoseConnection()
# callback for reading data from client to send it to server and get response to client again
def transferData(serverWriter, clientWrite, clientLoseConnection, data):
if serverWriter:
serverWriter.clientWrite = clientWrite
serverWriter.clientLoseConnection = clientLoseConnection
serverWriter.sendData(data)
def closeConnection(serverWriter):
if serverWriter: #1 this is null
#2 So connection is not closed and hangs there, till BalanceServer close it
serverWriter.transport.loseConnection()
# accepts clients
class ReverseProxy(Protocol):
def connectionMade(self):
server = self.factory.getServer()
self.serverWriter = ClientCreator(reactor, ServerWriter)
self.client = self.serverWriter.connectTCP( server.host, server.port )
def dataReceived(self, data):
self.client.addCallback(transferData, self.transport.write,
self.transport.loseConnection, data )
def connectionLost(self, reason):
self.client.addCallback(closeConnection) #3 adding close doesn't work
class ReverseProxyFactory(Factory):
protocol = ReverseProxy
def __init__(self, serverGenerator):
self.getServer = serverGenerator
plainFactory = ReverseProxyFactory( getServer(SERVER_LIST).next )
reactor.listenTCP( 7777, plainFactory )
reactor.run()
You may want to look at twisted.internet.protocols.portforward for an example of hooking up two connections and then disconnecting them. Or just use txloadbalancer and don't even write your own code.
However, loseConnection will never forcibly terminate the connection if there is never any traffic going over it. So if you don't have an application-level ping or any data going over your connections, they may still never shut down. This is a long-standing bug in Twisted. Actually, the longest-standing bug. Perhaps you'd like to help work on the fix :).

Categories