from twisted.web.resource import Resource
from twisted.web.server import Site, Session
from twisted.internet import ssl
from twisted.internet import reactor

class Echo(Resource):
    def render_GET(self, request):
        return "GET"

class WebSite(Resource):
    def start(self):
        factory = Site(self, timeout=5)
        factory.sessionFactory = Session
        self.putChild("echo", Echo())
        reactor.listenSSL(443, factory, ssl.DefaultOpenSSLContextFactory('privkey.pem', 'cacert.pem'))
        #reactor.listenTCP(8080, factory)
        self.sessions = factory.sessions

if __name__ == '__main__':
    ws = WebSite()
    ws.start()
    reactor.run()
With the code above, when I enter the URL "https://localhost/echo" in a web browser, it loads the page. If I try to reload the page more than 5 seconds later, the browser gets stuck on the reload and the page never refreshes. On a second reload attempt, it gets the page instantly.
When I run the code with reactor.listenTCP(8080, factory) instead, no such problem occurs: I can reload the page without getting stuck and it loads instantly.
The problem is reproducible with Chrome and Firefox, but with Ubuntu's Epiphany browser it does not occur.
I cannot understand why this happens.
Any comment that helps me understand or solve the problem will be appreciated.
Extra info:
When I use listenSSL, the file descriptor associated with the connection is not closed after the timeout expires. While the page is stuck reloading it stays open; on the second reload attempt it is closed, a new file descriptor is opened, and I get the page instantly.
When I use listenTCP, the file descriptor is closed once the timeout expires, and when I reload the page a new file descriptor is opened and the page is returned instantly.
Also, with a Telnet connection, the server times out connections as expected in both cases.
A Twisted client connecting to this server is also timed out as expected.
The class that times out connections is TimeoutMixin,
and it uses transport.loseConnection() to time out connections.
Somehow, DefaultOpenSSLContextFactory keeps using the connection(?), so loseConnection waits for the transport to finish writing, and during that time it doesn't process anything on the connection.
According to the Twisted documentation:
In the code above, loseConnection is called immediately after writing to the transport. The loseConnection call will close the connection only when all the data has been written by Twisted out to the operating system, so it is safe to use in this case without worrying about transport writes being lost. If a producer is being used with the transport, loseConnection will only close the connection once the producer is unregistered.
In some cases, waiting until all the data is written out is not what
we want. Due to network failures, or bugs or maliciousness in the
other side of the connection, data written to the transport may not be
deliverable, and so even though loseConnection was called the
connection will not be lost. In these cases, abortConnection can be
used: it closes the connection immediately, regardless of buffered
data that is still unwritten in the transport, or producers that are
still registered. Note that abortConnection is only available in
Twisted 11.1 and newer.
As a result, when I override the relevant TimeoutMixin method so it calls abortConnection() instead of loseConnection(), the problem no longer occurs.
When I work out exactly why loseConnection() is not enough to close the connection in this situation, I will note it here. (Any comment about it will be appreciated.)
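For reference, a minimal sketch of the override I mean, assuming Twisted >= 11.1 and that the idle timeout is handled by the Site's HTTP channel (the class name AbortingHTTPChannel is mine):

from twisted.web import http, server

class AbortingHTTPChannel(http.HTTPChannel):
    # HTTPChannel mixes in TimeoutMixin; timeoutConnection() is what fires
    # when the idle timeout expires.
    def timeoutConnection(self):
        if hasattr(self.transport, 'abortConnection'):
            self.transport.abortConnection()   # drop the TLS connection outright
        else:
            self.transport.loseConnection()    # pre-11.1 fallback

factory = server.Site(root_resource, timeout=5)  # root_resource: your WebSite()
factory.protocol = AbortingHTTPChannel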
I'm working on time series charts for 300+ clients.
It is beneficial for us to pull each client separately, because the combined data set is huge and in some cases a client's data is resampled or manipulated in a slightly different fashion.
My problem is that the function I loop through to get each client's data opens 3 new threads but never closes them (I assume the connection stays open) when the request completes and the function returns the data.
Once I have a client's results, I'd like to close that connection. I just can't figure out how to do that and haven't been able to find anything in my searches.
def solr_data_pull(submitterId):
    zookeeper = pysolr.ZooKeeper('ndhhadr1dnp11,ndhhadr1dnp12,ndhhadr1dnp13:2181/solr')
    solr = pysolr.SolrCloud(zookeeper, collection='tran_timings', timeout=60)
    query = ('SubmitterId:' + str(submitterId) + ' AND Tier:' + tier + ' AND Mode:' + mode + ' '
             'AND Timestamp:[' + str(start_period) + ' TO ' + str(end_period) + '] ')
    results = solr.search(rows=50000, q=[query], fl=[fl_list])
    return pd.DataFrame(list(results))
PySolr uses the Session object from requests as its underlying library (which in turn uses urllib3's connection pooling), so calling solr.get_session().close() should close all connections and drain the pool:
def close(self):
    """Closes all adapters and as such the session"""
(SolrCloud is an extension of Solr, which has the get_session() method.)
For disconnecting from ZooKeeper - which you probably shouldn't do if it's a long-running session, as it'll have to set up watches etc. again - you can use the .zk object directly on your SolrCloud instance; zk is a KazooClient:
stop()
Gracefully stop this Zookeeper session.
close()
Free any resources held by the client.
This method should be called on a stopped client before
it is discarded. Not doing so may result in filehandles
being leaked.
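Putting that together, a rough sketch of the pull function with explicit cleanup (get_session() is the pysolr method quoted above; the exact location of the KazooClient .zk attribute may vary by pysolr version, and tearing down the ZooKeeper session on every call is probably not what you want for a long-running job):

def solr_data_pull(submitterId):
    zookeeper = pysolr.ZooKeeper('ndhhadr1dnp11,ndhhadr1dnp12,ndhhadr1dnp13:2181/solr')
    solr = pysolr.SolrCloud(zookeeper, collection='tran_timings', timeout=60)
    try:
        query = ('SubmitterId:' + str(submitterId) + ' AND Tier:' + tier +
                 ' AND Mode:' + mode + ' AND Timestamp:[' + str(start_period) +
                 ' TO ' + str(end_period) + ']')
        results = solr.search(q=query, rows=50000, fl=fl_list)
        return pd.DataFrame(list(results))
    finally:
        solr.get_session().close()   # close the requests Session and drain the urllib3 pool
        zookeeper.zk.stop()          # KazooClient, per the answer above
        zookeeper.zk.close()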
Suppose I've got a simple Tornado web server, which starts like this:
app = ... # create an Application
srv = tornado.httpserver.HTTPServer(app)
srv.bind(port)
srv.start()
tornado.ioloop.IOLoop.instance().start()
I am writing an "end-to-end" test, which starts the server in a separate process with subprocess.Popen and then calls the server over HTTP. Now I need to make sure the server did not fail to start (e.g. because the port is busy) and then wait till server is ready.
I wrote a function to wait until the server is ready:
def wait_till_ready(port, n=10, time_out=0.5):
    for i in range(n):
        try:
            requests.get("http://localhost:" + str(port))
            return
        except requests.exceptions.ConnectionError:
            time.sleep(time_out)
    raise Exception("failed to connect to the server")
Is there a better way?
How can the parent process, which forks and execs the server, make sure that the server didn't fail to start, for example because the server port is busy? (I can change the server code if I need to.)
You could approach it in two ways:
Make a pipe / queue before you fork. Then, just before you start the io loop, notify the parent that everything went fine and you're ready for the request.
Open the port and bind to it before forking. You should make sure you close that socket on the parent side. But otherwise, the only thing which needs to run in the child is the io loop. You can handle all the other errors before the fork.
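For the second approach, here is a rough sketch, assuming the parent forks the server itself with os.fork rather than exec'ing it via subprocess.Popen (the function and variable names are illustrative):

import os
import socket
import sys

import tornado.httpserver
import tornado.ioloop

def start_server(app, port):
    # Bind in the parent so that a "port already in use" error surfaces here,
    # before any child process exists.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('localhost', port))
    sock.listen(128)
    sock.setblocking(0)

    pid = os.fork()                      # POSIX only
    if pid == 0:
        # Child: hand the already-bound socket to the HTTPServer and run.
        srv = tornado.httpserver.HTTPServer(app)
        srv.add_socket(sock)
        tornado.ioloop.IOLoop.instance().start()
        sys.exit(0)
    # Parent: close its copy of the listening socket and carry on.
    sock.close()
    return pid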
This is a simple client-server example where the server returns whatever the client sends, but reversed.
Server:
import socketserver

class MyTCPHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.data = self.request.recv(1024)
        print('RECEIVED: ' + str(self.data))
        self.request.sendall(str(self.data)[::-1].encode('utf-8'))

server = socketserver.TCPServer(('localhost', 9999), MyTCPHandler)
server.serve_forever()
Client:
import socket
import threading

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 9999))

def readData():
    while True:
        data = s.recv(1024)
        if data:
            print('Received: ' + data.decode('utf-8'))

t1 = threading.Thread(target=readData)
t1.start()

def sendData():
    while True:
        intxt = input()
        s.send(intxt.encode('utf-8'))

t2 = threading.Thread(target=sendData)
t2.start()
I took the server from an example I found on Google, but the client was written from scratch. The idea was to have a client that can keep sending and receiving data from the server indefinitely.
Sending the first message with the client works. But when I try to send a second message, I get this error:
ConnectionAbortedError: [WinError 10053] An established connection was
aborted by the software in your host machine
What am I doing wrong?
For TCPServer, the handle method of the handler gets called once to handle the entire session. This may not be entirely clear from the documentation, but socketserver is, like many libraries in the stdlib, meant to serve as clear sample code as well as to be used directly, which is why the docs link to the source, where you can clearly see that it's only going to call handle once per connection (TCPServer.get_request is defined as just calling accept on the socket).
So, your server receives one buffer, sends back a response, and then quits, closing the connection.
To fix this, you need to use a loop:
def handle(self):
    while True:
        self.data = self.request.recv(1024)
        if not self.data:
            print('DISCONNECTED')
            break
        print('RECEIVED: ' + str(self.data))
        self.request.sendall(str(self.data)[::-1].encode('utf-8'))
A few side notes:
First, using BaseRequestHandler on its own only allows you to handle one client connection at a time. As the introduction in the docs says:
These four classes process requests synchronously; each request must be completed before the next request can be started. This isn’t suitable if each request takes a long time to complete, because it requires a lot of computation, or because it returns a lot of data which the client is slow to process. The solution is to create a separate process or thread to handle each request; the ForkingMixIn and ThreadingMixIn mix-in classes can be used to support asynchronous behaviour.
Those mixin classes are described further in the rest of the introduction, and farther down the page, and at the bottom, with a nice example at the end. The docs don't make it clear, but if you need to do any CPU-intensive work in your handler, you want ForkingMixIn; if you need to share data between handlers, you want ThreadingMixIn; otherwise it doesn't matter much which you choose.
Note that if you're trying to handle a large number of simultaneous clients (more than a couple dozen), neither forking nor threading is really appropriate—which means TCPServer isn't really appropriate. For that case, you probably want asyncio, or a third-party library (Twisted, gevent, etc.).
Calling str(self.data) is a bad idea. You're just going to get the source-code-compatible representation of the byte string, like b'spam\n'. What you want is to decode the byte string into the equivalent Unicode string: self.data.decode('utf8').
There's no guarantee that each sendall on one side will match up with a single recv on the other side. TCP is a stream of bytes, not a stream of messages; it's perfectly possible to get half a message in one recv, and two and a half messages in the next one. When testing with a single connection on localhost with the system under light load, it will probably appear to "work", but as soon as you try to deploy any code that assumes that each recv gets exactly one message, your code will break. See Sockets are byte streams, not message streams for more details. Note that if your messages are just lines of text (as they are in your example), using StreamRequestHandler and its rfile attribute, instead of BaseRequestHandler and its request attribute, solves this problem trivially.
You probably want to set server.allow_reuse_address = True. Otherwise, if you quit the server and re-launch it again too quickly, it'll fail with an error like OSError: [Errno 48] Address already in use.
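Putting those side notes together, here is one hedged sketch of a threaded, line-oriented version of the server (it swaps BaseRequestHandler for StreamRequestHandler and adds ThreadingMixIn; it is not the only way to do it):

import socketserver

class MyTCPHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # rfile is a file-like wrapper around the socket, so iterating over it
        # yields one complete line at a time regardless of how the bytes were
        # split across TCP segments.
        for line in self.rfile:
            text = line.decode('utf-8').rstrip('\n')
            print('RECEIVED: ' + text)
            self.wfile.write(text[::-1].encode('utf-8') + b'\n')
        print('DISCONNECTED')

class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    # Avoid "Address already in use" errors on quick restarts.
    allow_reuse_address = True

server = ThreadedTCPServer(('localhost', 9999), MyTCPHandler)
server.serve_forever()

For this to work, the client has to terminate each message with a newline, e.g. s.send((intxt + '\n').encode('utf-8')).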
I have an application that uses PyQt4 and python-twisted to maintain a connection to another program. I am using "qt4reactor.py" as found here. This is all packaged up using py2exe. The application works wonderfully for 99% of users, but one user has reported that networking is failing completely on his Windows system. No other users report the issue, and I cannot replicate it on my own Windows VM. The user reports no abnormal configuration.
The debugging logs show that the reactor.connectTCP() call is executing immediately, even though the reactor hasn't been started yet! There's no mistaking run order because this is a single-threaded process with 60 sec of computation and multiple log messages between this line and when the reactor is supposed to start.
There's a lot of code, so I am only putting in pseudo-code, hoping that there is a general solution for this issue. I will link to the actual code below it.
import qt4reactor
qt4reactor.install()
# Start setting up main window
# ...
from twisted.internet import reactor
# Separate listener for detecting/processing multiple instances
self.InstanceListener = ListenerFactory(...)
reactor.listenTCP(LISTEN_PORT, self.InstanceListener)
# The active/main connection
self.NetworkingFactory = ClientFactory(...)
reactor.connectTCP(ACTIVE_IP, ACTIVE_PORT, self.NetworkingFactory)
# Finish setting up main window
# ...
from twisted.internet import reactor
reactor.runReturn()
The code is nested throughout the Armory project files. ArmoryQt.py (containing the above code) and armoryengine.py (containing the ReconnectingClientFactory subclass used for this connection).
So, the reactor.connectTCP() call executes immediately. The client code executes the send command and then immediately connectionLost() gets called. It does not appear to try to reconnect. It also doesn't throw any errors other than connectionLost(). Even more mysteriously, it receives messages from the remote node later on, and this app even processes them! But it believes it's not connected (and handshake never finished, so the remote node shouldn't be sending messages, but might be a bug/oversight in that program).
What on earth is going on!? How could the reactor get started before I tell it to start? I searched the code and found no other code that (I believe) could start the reactor.
The API that you're looking for is twisted.internet.reactor.callWhenRunning.
However, it wouldn't hurt to have less than 60 seconds of computation at startup, either :). Perhaps you should spread that out, or delegate it to a thread, if it's relatively independent?
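For example, a rough sketch against the pseudo-code above (the constants and the two factory objects are the placeholders from the question):

from twisted.internet import reactor

def start_networking():
    # LISTEN_PORT, ACTIVE_IP, ACTIVE_PORT and the two factories are the
    # placeholders from the pseudo-code above.
    reactor.listenTCP(LISTEN_PORT, instance_listener_factory)
    reactor.connectTCP(ACTIVE_IP, ACTIVE_PORT, networking_factory)

# Queue the connection setup; it runs only once the reactor is actually running.
reactor.callWhenRunning(start_networking)
reactor.runReturn()  # qt4reactor's non-blocking start, as in the question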
I am working on a client for a web service using pycurl. The client opens a connection to a stream service and spawns it into a separate thread. Here's a stripped down version of how the connection is set up:
def _setup_connection(self):
    self.conn = pycurl.Curl()
    self.conn.setopt(pycurl.URL, FILTER_URL)
    self.conn.setopt(pycurl.POST, 1)
    .
    .
    .
    self.conn.setopt(pycurl.HTTPHEADER, headers_list)
    self.conn.setopt(pycurl.WRITEFUNCTION, self.local_callback)

def up(self):
    if self.conn is None:
        self._setup_connection()
    self.perform()
Now, when I want to shut the connection down, if I call
self.conn.close()
I get the following exception:
error: cannot invoke close() - perform() is currently running
Which, in a way, makes sense: the connection is constantly open. I've been hunting around and can't seem to find any way to circumvent this problem and close the connection cleanly.
It sounds like you are invoking close() in one thread while another thread is executing perform(). Luckily, the library warns you rather than descending into unknown behavior-ville.
You should only use the curl session from one thread - or have the perform() thread somehow communicate when the call to perform() is complete.
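A rough sketch of that second option, reusing the wrapper from the question (the method names beyond those in the question, and the abort-via-write-callback mechanism, are one possible approach, not the only one):

import threading
import pycurl

class StreamClient:
    def __init__(self):
        self.conn = None
        self._stop = threading.Event()
        self._thread = None

    def local_callback(self, data):
        if self._stop.is_set():
            return -1             # returning a short count makes perform() abort
        # ... process the streamed chunk ...

    def _run(self):
        try:
            self.conn.perform()   # blocks until the stream ends or is aborted
        except pycurl.error:
            pass                  # expected when we abort via the callback
        finally:
            self.conn.close()     # closed on the same thread that ran perform()

    def up(self):
        if self.conn is None:
            self._setup_connection()   # as in the question
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def down(self):
        # The abort only takes effect the next time the write callback fires,
        # i.e. when the next chunk of data arrives.
        self._stop.set()
        self._thread.join()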
You obviously showed some methods from a curl wrapper class; what you need to do is let the object handle its own cleanup:
def __del__(self):
    self.conn.close()
and don't call close() explicitly. When the object finishes its job and all references to it are removed, the curl connection will be closed.