Twisted: Does using connectProtocol to connect an endpoint cause a memory leak? - python

I was trying to build a server. Besides accepting connections from clients as a normal server does, my server also connects to another server as a client.
I've set up the protocol and endpoint like below:
p = FooProtocol()
client = TCP4ClientEndpoint(reactor, '127.0.0.1', 8080) # without ClientFactory
Then, after calling reactor.run(), the server will listen for and accept new socket connections. When a new socket connection is made (in connectionMade), the server calls connectProtocol(client, p), which acts like the pseudocode below:
while server accepts a new socket:
    connectProtocol(client, p)
    # client.connect(foo_client_factory) --> connecting in this way won't
    # cause a memory leak
As the connections to the client are made, memory is gradually consumed (explicitly calling gc doesn't help).
Am I using Twisted in a wrong way?
-----UPDATE-----
My test program: the server waits for clients to connect. When a connection from a client is made, the server creates 50 connections to another server.
Here is the code:
#! /usr/bin/env python
import sys
import gc

from twisted.internet import protocol, reactor, defer, endpoints
from twisted.internet.endpoints import TCP4ClientEndpoint, connectProtocol


class MyClientProtocol(protocol.Protocol):
    def connectionMade(self):
        self.transport.loseConnection()


class MyClientFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        p = MyClientProtocol()
        return p


class ServerFactory(protocol.Factory):
    def buildProtocol(self, addr):
        p = ServerProtocol()
        return p


client_factory = MyClientFactory()  # global
client_endpoint = TCP4ClientEndpoint(reactor, '127.0.0.1', 8080)  # global
times = 0


class ServerProtocol(protocol.Protocol):
    def connectionMade(self):
        global client_factory
        global client_endpoint
        global times

        for i in range(50):
            # 1)
            p = MyClientProtocol()
            connectProtocol(client_endpoint, p)  # cause memleak
            # 2)
            #client_endpoint.connect(client_factory)  # no memleak

        times += 1
        if times % 10 == 9:
            print 'gc'
            gc.collect()  # doesn't work
        self.transport.loseConnection()


if __name__ == '__main__':
    server_factory = ServerFactory()
    serverEndpoint = endpoints.serverFromString(reactor, "tcp:8888")
    serverEndpoint.listen(server_factory)
    reactor.run()

This program doesn't do any Twisted log initialization. This means it runs with the "log beginner" for its entire run. The log beginner records all log events it observes in a LimitedHistoryLogObserver (up to a configurable maximum).
The log beginner keeps 2 ** 16 (_DEFAULT_BUFFER_MAXIMUM) events and then begins throwing out old ones, presumably to avoid consuming all available memory if a program never configures another observer.
If you hack the Twisted source to set _DEFAULT_BUFFER_MAXIMUM to a smaller value - e.g., 10 - then the program no longer "leaks". Of course, it's really just an object leak and not a memory leak, and it's bounded by the 2 ** 16 limit Twisted imposes.
However, connectProtocol creates a new factory each time it is called. When each new factory is created, it logs a message. And the application code generates a new Logger for each log message. And the logging code puts the new Logger into the log message. This means the memory cost of keeping those log messages around is quite noticeable (compared to just leaking a short blob of text or even a dict with a few simple objects in it).
I'd say the code in Twisted is behaving just as intended ... but perhaps someone didn't think through the consequences of that behavior completely.
And, of course, if you configure your own log observer then the "log beginner" is taken out of the picture and there is no problem. It does seem reasonable to expect that all serious programs will enable logging rather quickly and avoid this issue. However, lots of short throw-away or example programs often don't ever initialize logging and rely on print instead, making them subject to this behavior.
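For reference, a minimal sketch of what "configure your own log observer" might look like with twisted.logger (the choice of observer here is just an illustration): once beginLoggingTo is called, the log beginner hands off to the new observer and stops accumulating events in its LimitedHistoryLogObserver.
# Minimal sketch: start real logging early so the log beginner's
# LimitedHistoryLogObserver buffer no longer accumulates events.
import sys
from twisted.logger import globalLogBeginner, textFileLogObserver

globalLogBeginner.beginLoggingTo([textFileLogObserver(sys.stdout)])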
Note: this problem was reported in #8164 and fixed in 4acde626, so Twisted 17 will not have this behavior.


Grpc python client server streaming not working as expected

A simple gRPC server and client: the client sends an int and the server streams ints back.
The client reads the messages one by one, but the server runs the generator function immediately for all responses.
server code:
import test_pb2_grpc as pb_grpc
import test_pb2 as pb2
import time
import grpc
from concurrent import futures


class test_servcie(pb_grpc.TestServicer):
    def Produce(self, request, context):
        for i in range(request.val):
            print("request came")
            rs = pb2.Rs()
            rs.st = i + 1
            yield rs


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    pb_grpc.add_TestServicer_to_server(test_servcie(), server)
    server.add_insecure_port('[::]:50051')
    print("service started")
    server.start()
    try:
        while True:
            time.sleep(3600)
    except KeyboardInterrupt:
        server.stop(0)


if __name__ == '__main__':
    serve()
client code:
import time

import grpc
import test_pb2_grpc as pb_grpc
import test_pb2 as pb


def test():
    channel = grpc.insecure_channel(
        '{host}:{port}'.format(host="localhost", port=50051))
    stub = pb_grpc.TestStub(channel=channel)
    req = pb.Rq()
    req.val = 20
    for s in stub.Produce(req):
        print(s.st)
        time.sleep(10)


test()
proto file:
syntax = "proto3";

service Test {
    rpc Produce (Rq) returns (stream Rs);
}

message Rq {
    int32 val = 1;
}

message Rs {
    int32 st = 1;
}
After starting the server, when I run the client, the server-side generator starts running and completes immediately, looping over the whole range.
What I expected is that it would advance one step at a time as the client reads, but that is not the case.
Is this expected behaviour? My client is still printing the values, but the server has already completed the function.
Yes, this behavior is expected. gRPC features flow control between the two sides of an RPC (so that generating messages too fast on one side won't exhaust memory on the other side) but there's also an allowance for a small amount of buffering (so that a reasonably small amount of data may be sent by one side before the other side explicitly asks for it). In your case the twenty messages sent from server to client all fit within this small allowance. The service-side gRPC Python runtime is calling your service-side Produce method, consuming its entire output of twenty messages, and sending all those messages across the network to your client, where they are locally held by the invocation-side gRPC Python runtime until your invocation-side test function asks for them.
If you want to see the effects of flow control in action, try using huge messages (one megabyte in size or so) or altering the size of the allowance (I think this is done with a channel argument but those are an advanced and relatively-unsupported feature so this is left as an exercise).
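As a way to actually see the flow control kick in, here is a rough sketch of the "huge messages" suggestion. It assumes a hypothetical bytes pad = 2 field added to the Rs message, which is not part of the original proto:
# Sketch only: assumes Rs has a hypothetical `bytes pad = 2` field.
# Inflating each response to ~1 MB exceeds the small buffering allowance,
# so the generator should now advance only as the client reads.
def Produce(self, request, context):
    for i in range(request.val):
        print("request came")
        rs = pb2.Rs()
        rs.st = i + 1
        rs.pad = b"\x00" * (1024 * 1024)
        yield rs
With messages this large and the client sleeping between reads, the "request came" prints should appear gradually instead of all at once.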

How should I handle reconnections in twisted.application.internet.ClientService?

I am trying to make use of the recently introduced twisted.application.internet.ClientService class in a twisted application that does simple modbus-tcp polling using pymodbus. I feel my issues have nothing to do with the modbus Protocol that I am using, as I have created quite a few other working prototypes using the lower level twisted APIs; but this new ClientService looks like it fits my needs exactly, thus should reduce my code footprint and keep it neat if I can get it to work.
My tests show the ClientService handles reconnections just as it is expected to, and I have easy access to the first connection's Protocol. The problem that I am having is getting hold of subsequent Protocol objects for the reconnections. Here is a simplified version of the code I am having the issue with:
from twisted.application import internet, service
from twisted.internet.protocol import ClientFactory
from twisted.internet import reactor, endpoints
from pymodbus.client.async import ModbusClientProtocol


class ModbusPollingService(internet.ClientService):
    def __init__(self, addrstr, numregs=5):
        self.numregs = numregs
        internet.ClientService.__init__(self,
            endpoints.clientFromString(reactor, addrstr),
            ClientFactory.forProtocol(ModbusClientProtocol))

    def startService(self):
        internet.ClientService.startService(self)
        self._pollWhenConnected()

    def _pollWhenConnected(self):
        d = self.whenConnected()
        d.addCallback(self._connected)
        d.addErrback(self._connfail)

    def _connected(self, p):
        self._log.debug("connected: {p}", p=p)
        self._mbp = p
        self._poll()
        return True

    def _connfail(self, failstat):
        self._log.failure('connection failure', failure=failstat)
        self._mbp = None
        self._pollWhenConnected()

    def _poll(self):
        self._log.debug("poll: {n}", n=self.numregs)
        d = self._mbp.read_holding_registers(0, self.numregs)
        d.addCallback(self._regs)
        d.addErrback(self._connfail)

    def _regs(self, res):
        self._log.debug("regs: {r}", r=res.registers)
        # Do the real work of storing registers here
        reactor.callLater(1, self._poll)
        return res


application = service.Application("ModBus Polling Test")
mbpollsvc = ModbusPollingService('tcp:127.0.0.1:502')
mbpollsvc.setServiceParent(application)
When the connection fails (for whatever reason), the errback of the deferred returned from read_holding_registers() gets called, with the intention that my service can abandon that Protocol and go back into a state of waiting for a new connection's Protocol to be returned by the whenConnected() callback... however, what seems to be happening is that the ClientService does not yet realise the connection is dead and returns me the same disconnected Protocol, giving me a log full of:
2016-05-05 17:28:25-0400 [-] connected: <pymodbus.client.async.ModbusClientProtocol object at 0x000000000227b558>
2016-05-05 17:28:25-0400 [-] poll: 5
2016-05-05 17:28:25-0400 [-] connection failure
Traceback (most recent call last):
Failure: pymodbus.exceptions.ConnectionException: Modbus Error: [Connection] Client is not connected
2016-05-05 17:28:25-0400 [-] connected: <pymodbus.client.async.ModbusClientProtocol object at 0x000000000227b558>
2016-05-05 17:28:25-0400 [-] poll: 5
2016-05-05 17:28:25-0400 [-] connection failure
Traceback (most recent call last):
Failure: pymodbus.exceptions.ConnectionException: Modbus Error: [Connection] Client is not connected
or very similar; note the repeated ModbusClientProtocol object address.
I'm pretty sure that I've probably just made a poor choice of pattern for this API, but I've iterated through a few different possibilities, such as creating my own Protocol and Factory based on ModbusClientProtocol and handling the polling mechanism entirely within that class; but it felt a bit messy passing the persistent config and the mechanism for storing the polled data that way. It seems like handling this at or above the ClientService level is a cleaner approach, but I can't work out the best way of keeping track of the currently connected Protocol. I guess what I'm really looking for is a best-practice recommendation for using the ClientService class in extended polling situations.
This is an old question. But, hopefully, it will help somebody else.
The problem that I am having is getting hold of subsequent Protocol objects for the reconnections.
Supply a prepareConnection callable to the ClientService constructor. It will be called with the current connection.
In the example below, MyService attaches itself to MyFactory. The main reason for this is so that MyFactory can let MyService know when the ClientService has disconnected. That's possible because ClientService calls Factory.stopFactory on disconnect.
The next time ClientService reconnects, it calls its prepareConnection, supplying the current protocol instance.
(Reconnecting) ClientService:
# clientservice.py
# twistd -y clientservice.py

from twisted.application import service, internet
from twisted.internet.protocol import Factory
from twisted.internet import endpoints, reactor
from twisted.protocols import basic
from twisted.logger import Logger


class MyProtocol(basic.Int16StringReceiver):
    _log = Logger()

    def stringReceived(self, data):
        self._log.info('Received data from {peer}, data={data}',
                       peer=self.transport.getPeer(),
                       data=data)


class MyFactory(Factory):
    _log = Logger()
    protocol = MyProtocol

    def stopFactory(self):
        # Let service know that its current connection is stale
        self.service.on_connection_lost()


class MyService(internet.ClientService):
    def __init__(self, endpoint, factory):
        internet.ClientService.__init__(self,
            endpoint,
            factory,
            prepareConnection=self.on_prepare_connection)

        factory.service = self   # Attach this service to factory
        self.connection = None   # Future protocol instance

    def on_prepare_connection(self, connection):
        self.connection = connection   # Attach protocol to service
        self._log.info('Connected to {peer}',
                       peer=self.connection.transport.getPeer())
        self.send_message('Hello from prepare connection!')

    def on_connection_lost(self):
        if self.connection is None:
            return

        self._log.info('Disconnected from {peer}',
                       peer=self.connection.transport.getPeer())
        self.connection = None

    def send_message(self, message):
        if self.connection is None:
            raise Exception('Service is not available')

        self.connection.sendString(bytes(message, 'utf-8'))


application = service.Application('MyApplication')

my_endpoint = endpoints.clientFromString(reactor, 'tcp:localhost:22222')
my_factory = MyFactory()
my_service = MyService(my_endpoint, my_factory)
my_service.setServiceParent(application)
Slightly modified echo server from twisted examples:
#!/usr/bin/env python
# echoserv.py
# python echoserv.py

# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.

from twisted.internet.protocol import Protocol, Factory
from twisted.internet import reactor
from twisted.protocols import basic

### Protocol Implementation

# This is just about the simplest possible protocol
class Echo(basic.Int16StringReceiver):
    def stringReceived(self, data):
        """
        As soon as any data is received, write it back.
        """
        print("Received:", data.decode('utf-8'))
        self.sendString(data)


def main():
    f = Factory()
    f.protocol = Echo
    reactor.listenTCP(22222, f)
    reactor.run()


if __name__ == '__main__':
    main()
You're not calling self.transport.loseConnection() anywhere that I can see in response to your polling failures, so as far as Twisted can tell, you aren't actually disconnected. It may notice later, when you stop doing anything on the old transport, but by then you've lost track of things.
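Applied to the question's service, that would mean dropping the dead transport explicitly in the errback before waiting for the next connection. A sketch, assuming the ModbusClientProtocol still exposes its transport:
def _connfail(self, failstat):
    self._log.failure('connection failure', failure=failstat)
    # Tell Twisted the old connection is really gone, so ClientService
    # reconnects and whenConnected() resolves with a fresh protocol.
    if self._mbp is not None and self._mbp.transport is not None:
        self._mbp.transport.loseConnection()
    self._mbp = None
    self._pollWhenConnected()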

Twisted - How can I tell the reactor to dispose a Protocol object after using adoptStreamConnection in a subprocess?

I'm trying to pass a TCP connection to a Twisted subprocess with adoptStreamConnection, but I can't figure out how to get the Process disposed in the main process after doing that.
My desired flow looks like this:
1. Finish writing any data the Protocol transport has waiting
2. When we know the write buffer is empty, send the AMP message to transfer the socket to the subprocess
3. Dispose of the Protocol instance in the main process
I tried doing nothing, loseConnection, abortConnection, and monkey patching _closeSocket out and using loseConnection. See the code here:
import weakref

from twisted.internet import reactor
from twisted.internet.endpoints import TCP4ServerEndpoint
from twisted.python.sendmsg import getsockfam
from twisted.internet.protocol import Factory, Protocol
import twisted.internet.abstract


class EchoProtocol(Protocol):
    def dataReceived(self, data):
        self.transport.write(data)


class EchoFactory(Factory):
    protocol = EchoProtocol


class TransferProtocol(Protocol):
    def dataReceived(self, data):
        self.transport.write('main process still listening!: %s' % (data))

    def connectionMade(self):
        self.transport.write('this message should make it to the subprocess\n')

        # attempt 1: do nothing
        # everything works fine in the adopt (including receiving the written
        # message), but the old protocol still exists (though isn't doing anything)

        # attempt 2: try calling loseConnection
        # we lose the connection before the adopt opens the socket
        # (presumably a TCP disconnect message was sent)
        #
        # self.transport.loseConnection()

        # attempt 3: try calling abortConnection
        # result is the same as loseConnection
        #
        # self.transport.abortConnection()

        # attempt 4: try monkey patching the socket close out and calling loseConnection
        # result: same as doing nothing -- the adopt works (including receiving
        # the written message), the old protocol still exists
        #
        # def ignored(*args, **kwargs):
        #     print 'ignored :D'
        #
        # self.transport._closeSocket = ignored
        # self.transport.loseConnection()

        reactor.callLater(0, adopt, self.transport.fileno())


class ServerFactory(Factory):
    def buildProtocol(self, addr):
        p = TransferProtocol()
        self.ref = weakref.ref(p)
        return p


f = ServerFactory()


def adopt(fileno):
    print "does old protocol still exist?: %r" % (f.ref())
    reactor.adoptStreamConnection(fileno, getsockfam(fileno), EchoFactory())


port = 1337
endpoint = TCP4ServerEndpoint(reactor, port)
d = endpoint.listen(f)
reactor.run()
In all cases the Protocol object still exists in the main process after the socket has been transferred. How can I clean this up?
Thanks in advance.
Neither loseConnection nor abortConnection tells the reactor to "forget" about a connection; they close the connection, which is very different: they tell the peer that the connection has gone away.
You want to call self.transport.stopReading() and self.transport.stopWriting() to remove the references to it from the reactor.
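Applied to the question's TransferProtocol, that suggestion might look roughly like this (a sketch, not a tested fix):
class TransferProtocol(Protocol):
    def connectionMade(self):
        self.transport.write('this message should make it to the subprocess\n')
        # Detach the descriptor from the reactor without sending a TCP close,
        # then hand it off; the old transport and protocol can then be released.
        self.transport.stopReading()
        self.transport.stopWriting()
        reactor.callLater(0, adopt, self.transport.fileno())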
Also, it's not valid to use a weakref to test for the remaining existence of an object unless you call gc.collect() first.
As far as making sure that all the data has been sent: the only reliable way to do that is to have an application-level acknowledgement of the data that you've sent. This is why protocols that need a handshake that involves changing protocols - say, for example, STARTTLS - have a specific handshake where the initiator says "I'm going to switch" (and then stops sending), then the peer says "OK, you can switch now". Another way to handle that in this case would be to hand the data you'd like to write to the subprocess via some other channel, instead of passing it to transport.write.

Python multithreaded ZeroMQ REQ-REP

I am looking to implement a REQ-REP pattern with Python and ZeroMQ using multithreading.
With Python, I can create a new thread when a new client connects to the server. This thread will handle all communications with that particular client, until the socket is closed:
# Thread that will handle client's requests
class ClientThread(threading.Thread):
    # Implementation...
    def __init__(self, socket):
        threading.Thread.__init__(self)
        self.socket = socket

    def run(self):
        while keep_alive:
            # Thread can receive from client
            data = self.socket.recv(1024)
            # Processing...
            # And send back a reply
            self.socket.send(reply)


while True:
    # The server accepts an incoming connection
    conn, addr = sock.accept()
    # And creates a new thread to handle the client's requests
    newthread = ClientThread(conn)
    # Starting the thread
    newthread.start()
Is it possible to do the same[*] using ZeroMQ? I have seen some examples of multithreading with ZeroMQ and Python, but in all of them a pool of threads is created with a fixed number of threads at the beginning and it seems to be more oriented to load balancing.
[*] Notice what I want is to keep the connection between a client and its thread alive, as the thread is expecting multiple REQ messages from the client and it will store information that must be kept between messages (e.g. a counter that increments its value on each new REQ message, so each thread has its own variable and no other client should ever be able to access that thread). New client = new thread.
Yes, ZeroMQ is a powerful can-do toolbox.
However, the major surprise will be that ZeroMQ <socket>-s are by far more structured than the plain sockets you use in the sample.
{ aZmqContext -> aZmqSocket -> aBehavioralPrimitive }
ZeroMQ builds a remarkable, abstraction-rich framework under the hood of a "singleton" ZMQ-Context, which is (and shall remain) the only thing used as "shared".
Threads shall not "share" any other "derived" objects, much less any of their state, as there is a strong distributed-responsibility framework architecture implemented, both for the sake of clean design and of high performance & low latency.
For all ZMQ-Socket-s one shall rather imagine a much smarter, layered sub-structure, where one receives off-loaded worries about I/O-activities (managed inside the ZMQ-Context's responsibility -- thus keep-alive issues, timing issues and fair-queue buffering / select-polling issues simply cease to be visible to you ...), with one sort of formal communication-pattern behaviour (given by the chosen ZMQ-Socket-type archetype).
Finally
ZeroMQ and the similar nanomsg library are rather LEGO-like projects that empower you, as an architect & designer, more than one typically realises at the very beginning.
One can thus focus on the distributed-system behaviour, as opposed to losing time and energy on solving just-another-socket-messaging-[nightmare].
( Definitely worth a look into both books from Pieter Hintjens, co-father of the ZeroMQ. There you find plenty Aha!-moments on this great subject. )
... and as a cherry on a cake -- you get all of this as a Transport-agnostic, universal environment, whether passing some messages on inproc://, other over ipc:// and also in parallel listening / speaking over tcp:// layers.
EDIT #1: 2014-08-19 17:00 [UTC+0000]
Kindly check comments below and further review your -- both elementary and advanced -- design-options for a <trivial-failure-prone>-spin-off processing, for a <load-balanced>-REP-worker queueing, for a <scale-able>-distributed processing and a <fault-resilient_mode>-REP-worker binary-start shaded processing.
No heap of mock-up SLOC(s), no single code-sample will do a One-Size-Fits-All.
This is exponentially valid in designing distributed messaging systems.
"""REQ/REP modified with QUEUE/ROUTER/DEALER add-on ---------------------------
Multithreaded Hello World server
Author: Guillaume Aubert (gaubert) <guillaume(dot)aubert(at)gmail(dot)com>
"""
import time
import threading
import zmq
print "ZeroMQ version sanity-check: ", zmq.__version__
def aWorker_asRoutine( aWorker_URL, aContext = None ):
"""Worker routine"""
#Context to get inherited or create a new one trick------------------------------
aContext = aContext or zmq.Context.instance()
# Socket to talk to dispatcher --------------------------------------------------
socket = aContext.socket( zmq.REP )
socket.connect( aWorker_URL )
while True:
string = socket.recv()
print( "Received request: [ %s ]" % ( string ) )
# do some 'work' -----------------------------------------------------------
time.sleep(1)
#send reply back to client, who asked --------------------------------------
socket.send( b"World" )
def main():
"""Server routine"""
url_worker = "inproc://workers"
url_client = "tcp://*:5555"
# Prepare our context and sockets ------------------------------------------------
aLocalhostCentralContext = zmq.Context.instance()
# Socket to talk to clients ------------------------------------------------------
clients = aLocalhostCentralContext.socket( zmq.ROUTER )
clients.bind( url_client )
# Socket to talk to workers ------------------------------------------------------
workers = aLocalhostCentralContext.socket( zmq.DEALER )
workers.bind( url_worker )
# --------------------------------------------------------------------||||||||||||--
# Launch pool of worker threads --------------< or spin-off by one in OnDemandMODE >
for i in range(5):
thread = threading.Thread( target = aWorker_asRoutine, args = ( url_worker, ) )
thread.start()
zmq.device( zmq.QUEUE, clients, workers )
# ----------------------|||||||||||||||------------------------< a fair practice >--
# We never get here but clean up anyhow
clients.close()
workers.close()
aLocalhostCentralContext.term()
if __name__ == "__main__":
main()

How to kill a socket in unit tests for reconnect test

I'm trying to test some code that reconnects to a server after a disconnect. This works perfectly fine outside the tests, but it fails to acknowledge that the socket has disconnected when running the tests.
I'm using a gevent StreamServer to mock a real listening server:
import socket  # needed for socket.SHUT_RDWR in murder()

import gevent.server
from gevent import queue


class TestServer(gevent.server.StreamServer):
    def __init__(self, *args, **kwargs):
        super(TestServer, self).__init__(*args, **kwargs)
        self.sockets = {}

    def handle(self, socket, address):
        self.sockets[address] = (socket, queue.Queue())
        socket.sendall('testing the connection\r\n')
        gevent.spawn(self.recv, address)

    def recv(self, address):
        socket = self.sockets[address][0]
        queue = self.sockets[address][1]
        print 'Connection accepted %s:%d' % address
        try:
            for data in socket.recv(1024):
                queue.put(data)
        except:
            pass

    def murder(self):
        self.stop()
        for sock in self.sockets.iteritems():
            print sock
            sock[1][0].shutdown(socket.SHUT_RDWR)
            sock[1][0].close()
        self.sockets = {}


def run_server():
    test_server = TestServer(('127.0.0.1', 10666))
    test_server.start()
    return test_server
And my test looks like this:
def test_can_reconnect(self):
    test_server = run_server()
    client_config = {'host': '127.0.0.1', 'port': 10666}
    client = Connection('test client', client_config, get_config())
    client.connect()
    assert client.socket_connected

    test_server.murder()
    #time.sleep(4) #tried sleeping. no dice.

    assert not client.socket_connected
    assert client.server_disconnect

    test_server = run_server()
    client.reconnect()
    assert client.socket_connected
It fails at assert not client.socket_connected.
I check for "not data" during recv; if it is, I set some variables so that other code can decide whether or not to reconnect (don't reconnect if it was a user_disconnect, and so on). This behavior works and has always worked for me in the past; I've just never tried to write a test for it until now. Is there something odd with socket connections and local function scopes or something? It's like the connection still exists even after stopping the server.
The code I'm trying to test is open: https://github.com/kyleterry/tenyks.git
If you run the tests, you will see the one I'm trying to fix fail.
Trying to run a unit test with a real socket is a tough row to hoe. It's going to be tricky, as only one set of tests can run at a time because the server port will be in use, and it's going to be slow as the sockets get set up and torn down. To top it off, if this is really a unit test, you don't want to test the socket, just the code that's using the socket.
If you mock the socket calls you can throw exceptions willy nilly from the mocked code and ensure that the code making use of the socket does the right thing. You don't need a real socket to ensure that the class under test does the right thing, you can fake it if you can wrap the socket calls in an object. Pass in a reference to the socket object when constructing your class and you're ready to go.
My suggestion is to wrap the socket calls in a class that supports sendall, recv, and all the methods you call on the socket. Then you can swap out the actual Socket class with a TestReconnectSocket (or whatever) and run your tests.
Take a look at mox, a python mocking framework.
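For illustration only, here is a rough sketch of that idea using unittest.mock (the mock package on Python 2) instead of mox; the Connection internals used here (a socket attribute, a recv-style method) are assumptions for the sketch, not the question's actual API:
# Sketch: replace the real socket with a mock so the "server closed" path
# can be exercised without binding a port. Connection internals are assumed.
from unittest import mock  # `import mock` on Python 2

def test_detects_server_disconnect():
    fake_socket = mock.Mock()
    fake_socket.recv.return_value = ''   # an empty read is how a closed peer shows up

    client = Connection('test client', {'host': '127.0.0.1', 'port': 10666}, get_config())
    client.socket = fake_socket          # swap in the fake instead of calling connect()
    client.recv_once()                   # hypothetical method that performs the recv()

    assert not client.socket_connected
    assert client.server_disconnect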
Vague response, but my immediate reaction would be that your recv() call is blocking and keeping the socket alive - have you tried making the socket non-blocking, and catching the error on close instead?
One thing to keep in mind when testing sockets like this is that operating systems don't like to reopen a socket soon after it has been in use. You can set a socket option to tell it to go ahead and reuse it anyway. Right after you create the socket, set the socket's option:
mysocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
Hopefully this will fix your issue. You may have to do it on both the server and client side depending on which one is giving you the problems.
You are calling shutdown(socket.SHUT_RDWR), so this doesn't seem like a problem with recv blocking.
However, you are using gevent.socket.socket.recv, so please check your gevent version; there is an issue with recv() that causes it to block if the underlying file descriptor is closed (version < v0.13.0).
You may still need gevent.sleep() to do a cooperative yield and give the client an opportunity to exit the recv() call.
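In the question's test, that cooperative yield would sit between the shutdown and the assertions, something like this sketch:
import gevent

test_server.murder()
gevent.sleep(0)   # yield to the client's greenlet so it can observe the closed socket
assert not client.socket_connected
assert client.server_disconnect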
