Releasing socket from SocketServer.ForkingTCPServer

I am creating a ForkingTCPServer to act as a proxy. It handles requests perfectly fine, but will not release the socket after it finishes. I have already tried .shutdown() and .server_close(), but they just cause the program to freeze up. How can I release the socket so that it can be used again?
import SocketServer
import SimpleHTTPServer
import urllib

PORT = 8080

class Proxy(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def do_GET(self):
        self.copyfile(urllib.urlopen("http://" + self.path[1:]), self.wfile)

httpd = SocketServer.ForkingTCPServer(('', PORT), Proxy)
print "serving at port", PORT
httpd.handle_request()

The handle_request method handles only one request, so the parent process answers only one request.
You can put the calls to handle_request in a loop.
Otherwise, if you mean the fact that you get an "Address already in use" error when you fire up a new copy of the server, that's because you have to wait for the TCP 2MSL timer to time out. To work around that, set SO_REUSEADDR on the listening socket: set allow_reuse_address to True on the server class. It has to be set before the server binds, i.e. before the server object is created.
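A minimal sketch combining both suggestions (the ReusableProxyServer name and the loop structure are illustrative, not from the original question): allow_reuse_address sets SO_REUSEADDR before the socket is bound, and handle_request is called in a loop so more than one request is served.

import SocketServer
import SimpleHTTPServer
import urllib

PORT = 8080

class Proxy(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def do_GET(self):
        self.copyfile(urllib.urlopen("http://" + self.path[1:]), self.wfile)

class ReusableProxyServer(SocketServer.ForkingTCPServer):
    # Checked by server_bind(): sets SO_REUSEADDR on the listening socket,
    # so a restarted server is not blocked by the TCP 2MSL timer.
    allow_reuse_address = True

httpd = ReusableProxyServer(('', PORT), Proxy)
print "serving at port", PORT
try:
    while True:
        httpd.handle_request()   # one request per iteration
except KeyboardInterrupt:
    httpd.server_close()         # release the listening socket on Ctrl-C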


TCPServer vs HTTPServer

I am writing a simple python3 webserver. I have tried various tutorials and all work pretty well. Nevertheless there is a difference that I don't understand.
In one tutorial, they use HTTPServer as follows:
server = HTTPServer(('', PORT_NUMBER), myHandler)
server.serve_forever()
In another tutorial, they use socketserver.TCPServer as follows:
with socketserver.TCPServer(('', PORT_NUMBER), myHandler) as httpd:
    httpd.serve_forever()
What is the difference between the two methods? All I need is a simple webserver that is able to receive JSON files through POSTs and answer with another JSON. In both cases, I would use the same handler:
class myHandler(BaseHTTPRequestHandler):
    def _set_headers(self):
        self.send_response(200)
        self.send_header('Content-type', 'application/json')
        self.end_headers()

    def do_GET(self):
        self._set_headers()
        self.wfile.write("{dummy:'dummy'}".encode())  # wfile expects bytes in Python 3

    def do_POST(self):
        # Doesn't do anything with posted data for this example
        self._set_headers()
        self.wfile.write("{dummy:'dummy'}".encode())
Does one of the methods suit my needs better? Is there an even better way of writing this server for my needs?
Thanks for your help and time!
TCP is an OSI Layer 4 transport protocol, whereas HTTP is a higher Layer 7 application protocol, built on top of TCP.
In fact, it looks like HTTPServer inherits directly from socketserver.TCPServer, according to the documentation:
One class, HTTPServer, is a socketserver.TCPServer subclass. It
creates and listens at the HTTP socket, dispatching the requests to a
handler.
For example, TCP will take care of establishing a connection (the three-way handshake with SYN, SYN-ACK and ACK) but it doesn't really prescribe much in terms of the structure of the data interaction (request/response). If you use the TCP protocol, you'll generally need to write all your own data processing code.
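To make that concrete, here is a purely illustrative sketch of a raw socketserver.TCPServer handler (the RawEchoHandler name and port 9999 are assumptions): without the HTTP layer you just read bytes off the connection and define yourself what they mean.

import socketserver

class RawEchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # self.request is the connected TCP socket; there is no notion of
        # requests, headers or status codes unless you implement them.
        data = self.request.recv(1024)
        self.request.sendall(data.upper())

with socketserver.TCPServer(('', 9999), RawEchoHandler) as server:
    server.serve_forever()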
All I need is a simple webserver that is able to receive JSON files
through POSTs and answer with another JSON
This suggests the HTTP server is better suited to your need here: the concept of a POST, and your description of answering with another JSON, imply a response with a body, headers, and a status code, all of which are HTTP concepts. As the documentation for http.server.BaseHTTPRequestHandler states, the handler includes an instance variable called command that holds the request method (GET, for instance).
I can't tell exactly how myHandler is assigned in your example code, but looking at some of the other documentation examples, it looks like an http.server.SimpleHTTPRequestHandler:
import http.server
import socketserver

PORT = 8000

Handler = http.server.SimpleHTTPRequestHandler

with socketserver.TCPServer(("", PORT), Handler) as httpd:
    print("serving at port", PORT)
    httpd.serve_forever()
A frequent pattern for setting up a server appears to be:
To instantiate a lower-level protocol server (i.e. TCP) and pass in a higher-level handler (like http.server.SimpleHTTPRequestHandler), hence the socketserver.TCPServer(("", PORT), myHandler).
To use it within a context manager (the with keyword), most likely because you need to tear down / deallocate resources after the server finishes execution.
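For the JSON-over-POST use case specifically, a minimal sketch might look as follows (assuming Python 3's standard http.server; the JSONHandler name and the echoed payload are illustrative): the handler reads the posted body, and both GET and POST answer with properly encoded JSON.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT_NUMBER = 8000

class JSONHandler(BaseHTTPRequestHandler):
    def _send_json(self, payload):
        body = json.dumps(payload).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        self._send_json({'dummy': 'dummy'})

    def do_POST(self):
        # Read the posted JSON body and echo a JSON answer back.
        length = int(self.headers.get('Content-Length', 0))
        incoming = json.loads(self.rfile.read(length) or b'{}')
        self._send_json({'received': incoming, 'dummy': 'dummy'})

with HTTPServer(('', PORT_NUMBER), JSONHandler) as server:
    server.serve_forever()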

How to solve twisted.internet.error.CannotListenError: Couldn't listen on any:8081: [Errno 98] Address already in use [duplicate]

This question already has answers here: Python: Binding Socket: "Address already in use"
I have a Twisted Python server which runs on port 8080, and I have written different APIs which run on this server. I want all these APIs to run on a single port number, but when I try to use the same port (for example 8081) for all the APIs and run them at the same time with the Python interpreter, I get this error: twisted.internet.error.CannotListenError: Couldn't listen on any:8081: [Errno 98] Address already in use.
As I am new to Twisted I don't know much of the details, and there is no proper documentation on Twisted. Please, someone guide me to solve this error.
Here are the code snippets:
from twisted.internet import epollreactor
epollreactor.install()
from zope.interface import implements
from twisted.internet import reactor, interfaces
from functools import partial
from pymongo import Connection
import json
from bson.objectid import ObjectId
import server_1
import replacePlus_Space

global data, data_new, datadB, coll_auth, coll_person

class Login_user(server_1.HTTPEchoFactory):
    def __init__(self):
        server_1.HTTPEchoFactory.__init__(self, "testsite")
        server_1.HTTPEchoFactory.initResources(self)
        self.initResources()

    def initResources(self):
        print "in Login"
        self.responses["/login_user"] = partial(self.user_login)

    # Connects to the DB, checks error cases and inserts into MongoDB
    def user_login(self, client):
        # some functionality here
        pass

d = Login_user()
reactor.listenTCP(8081, d)
reactor.run()
The second code snippet is:
from twisted.internet import epollreactor
epollreactor.install()
from zope.interface import implements
from twisted.internet import reactor, interfaces
from functools import partial
from pymongo import Connection
import json
from bson.objectid import ObjectId
import server_1
import replacePlus_Space

class AddFriendDB(server_1.HTTPEchoFactory):
    def __init__(self):
        server_1.HTTPEchoFactory.__init__(self, "testsite")
        server_1.HTTPEchoFactory.initResources(self)
        self.initResources()

    def initResources(self):
        print "in add friend"
        self.responses["/addFriend_DB"] = partial(self.addFriendDB)

    # Connects to the DB, checks error cases and inserts into MongoDB
    def addFriendDB(self, client):
        # some functionality here
        pass

d = AddFriendDB()
reactor.listenTCP(8081, d)
reactor.run()
Translating the question
Many of the specifics in your question don't make sense, so I'm translating your question into:
I have a Twisted Python standalone program which runs on port 8080, and I have written a different standalone program which runs on this server. I want both programs to run on a single port number. When I try to use the same port for both programs, for example port 8081, I get this error: twisted.internet.error.CannotListenError: Couldn't listen on any:8081: [Errno 98] Address already in use. As I am new to Twisted I don't know much of the things and there is no proper documentation on Twisted. Please someone guide me to solve this error.
Twisted has great documentation
You commented that:
[...] there is no proper documentation on twisted
That statement is blatantly false.
Twisted certainly has proper documentation, and as compared to most frameworks it has excellent documentation. To name just a few of the many high quality documentation sources:
Dave Peticola's (krondo) Twisted Introduction - an excellent and deep introduction to twisted that begins by explaining the technology Twisted is built upon. If you really want to understand twisted, this (long) introduction is the place to start
The high quality and extensive documentation on Twisted's primary website, twistedmatrix.com
The source itself. Twisted has well-commented and surprisingly understandable source; if the above documentation doesn't teach you what you need, figure it out from the code.
What causes "Address already in use"
As previously answered by Bart Friederichs:
Only one process can listen on a certain port. If you start process 1 to listen on port 8081, and then start another process on that same port, you get this error.
It is an error from the TCP stack, not from python or twisted.
This is a fundamental truth of TCP/IP stacks across all operating systems (or at least all process-based operating systems that could run Python).
The error is your operating system reminding you that when data arrives at an IP port, the OS/IP stack was designed to forward it to only one process: the process implementing an application-level protocol on that port. The error happens when a second program attempts to register a port that some other program has already registered.
Work-arounds in general
Upon running into a port reuse issue like this you have to ask yourself:
Are the two programs even running the same application-level protocol?
If they are the same protocol, does the protocol have routing such that, for any given transaction, the correct sub-program/routine can be identified?
Are the two programs at a different location in the routing?
If they aren't the same protocol (i.e. one is HTTP and the other is FTP), or they are the same protocol but it's a protocol that doesn't have routing (i.e. two instances of NTP), then there is no easy way to mix them, because you're trying to break the laws of IP (and the way that application-protocol implementations are built). The only solution will be to encapsulate one (or both) of the protocols into a common protocol that also has application-level routing (i.e. encapsulating FTP inside HTTP and using the URI to route).
If they are the same protocol, and the protocol provides for per-transaction routing (i.e. the URI inside HTTP transactions), and they aren't at the same routing location, then there are easier solutions, namely:
Merge the two applications into one.
If they are the same routable protocol but at a different location (i.e. HTTP protocol with the URI for the first program at /login and the second program at /addfriend), it should be trivial to pull out the post-routing logic of the two programs/processes and merge them into one program that does both functions.
Front-end the programs with a redirector (this solution is only easy with HTTP because of the tools available).
If you have HTTP-protocol programs that have routing to separate locations, but for some reason it's hard to merge the logic together (i.e. one program is written in Java, the other in Python), then you can look at front-ending both programs with a redirector like nginx.
The way you would use a redirector in a case like this is to run your two apps on two different unused ports (i.e. 12001, 12002), and then run the redirector on the port you want the service to be on, running it with a config file that will redirect to your two programs via their unique routing locations (via configs like SF: How to redirect requests to a different domain/url with nginx). A minimal Twisted-based sketch of this front-ending idea follows.
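As a purely illustrative alternative to nginx (not from the original answer), the same front-ending idea can be sketched with Twisted's own twisted.web.proxy.ReverseProxyResource, assuming the two programs already listen on the hypothetical ports 12001 and 12002:

#!/usr/bin/python
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.web.proxy import ReverseProxyResource

root = Resource()
# Requests to /login are forwarded to the program on port 12001,
# requests to /addfriend to the program on port 12002.
root.putChild("login", ReverseProxyResource("127.0.0.1", 12001, "/login"))
root.putChild("addfriend", ReverseProxyResource("127.0.0.1", 12002, "/addfriend"))

reactor.listenTCP(8080, Site(root))
reactor.run()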
Code example
The following three programs illustrate the process of merging two programs into one, so both sets of logic can be accessed from the same port.
Two separate programs:
If you run the following code a web server will be started up on localhost:8081. If you then point your web browser at http://127.0.0.1:8081/blah the blah page will be displayed.
#!/usr/bin/python
from twisted.internet import defer, protocol, reactor  # the reactor
from twisted.web.server import Site                     # make the webserver go
from twisted.web.resource import Resource

class BlahPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        return "<html><body>Blah Page!</body></html>"

class ShutdownPage(Resource):
    def render_GET(self, request):
        reactor.stop()
        # return a body so Twisted has something to send back
        return "<html><body>Shutting down</body></html>"

webroot = Resource()
webroot.putChild("blah", BlahPage())
webroot.putChild("shutdown", ShutdownPage())

def main():
    # Register the webserver (TCP server) into twisted
    webfactory = Site(webroot)
    reactor.listenTCP(8081, webfactory)
    print("Starting server")
    reactor.run()

if __name__ == '__main__':
    main()
This code will start a web server on localhost:8082. If you then point your web browser at http://127.0.0.1:8082/foo the foo page will be displayed.
#!/usr/bin/python
from twisted.internet import defer, protocol, reactor  # the reactor
from twisted.web.server import Site                     # make the webserver go
from twisted.web.resource import Resource

class FooPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        return "<html><body>Foo Page!</body></html>"

class ShutdownPage(Resource):
    def render_GET(self, request):
        reactor.stop()
        # return a body so Twisted has something to send back
        return "<html><body>Shutting down</body></html>"

webroot = Resource()
webroot.putChild("foo", FooPage())
webroot.putChild("shutdown", ShutdownPage())

def main():
    # Register the webserver (TCP server) into twisted
    webfactory = Site(webroot)
    reactor.listenTCP(8082, webfactory)
    print("Starting server")
    reactor.run()

if __name__ == '__main__':
    main()
Merging the logic
This code is the merger of the two previous programs. As you can see, it only required copying a small amount of code to glue both of the above into one program that allows access to both http://127.0.0.1:8080/blah and http://127.0.0.1:8080/foo.
#!/usr/bin/python
from twisted.internet import defer, protocol, reactor  # the reactor
from twisted.web.server import Site                     # make the webserver go
from twisted.web.resource import Resource

class BlahPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        return "<html><body>Blah Page!</body></html>"

class FooPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        return "<html><body>Foo Page!</body></html>"

class ShutdownPage(Resource):
    def render_GET(self, request):
        reactor.stop()
        # return a body so Twisted has something to send back
        return "<html><body>Shutting down</body></html>"

webroot = Resource()
webroot.putChild("foo", FooPage())
webroot.putChild("blah", BlahPage())
webroot.putChild("shutdown", ShutdownPage())

def main():
    # Register the webserver (TCP server) into twisted
    webfactory = Site(webroot)
    reactor.listenTCP(8080, webfactory)
    print("Starting server")
    reactor.run()

if __name__ == '__main__':
    main()
Only one process can listen on a certain port. If you start process 1 to listen on port 8081, and then start another process on that same port, you get this error.
It is an error from the TCP stack, not from python or twisted.
Fix it either by choosing different ports for each process, or by creating a forking server.

How to stop a Flask server running gevent-socketio

I have a flask application running with gevent-socketio that I create this way:
server = SocketIOServer(('localhost', 2345), app, resource='socket.io')
gevent.spawn(send_queued_messages_loop, server)
server.serve_forever()
I launch send_queued_messages_loop in a gevent thread that keeps polling a gevent.Queue where my program stores data to send to the socket.io-connected clients.
I tried different approaches to stop the server (such as using sys.exit) either from the socket.io handler (when the client sends a socket.io message) or from a normal route (when the client makes a request to /shutdown) but in any case, sys.exit seems to fail because of the presence of greenlets.
I tried to call gevent.shutdown() first, but this does not seem to change anything
What would be the proper way to shutdown the server?
Instead of using serve_forever(), create a gevent.event.Event and wait for it. To actually initiate shutdown, trigger the event using its set() method:
from gevent.event import Event

stopper = Event()

server = SocketIOServer(('localhost', 2345), app, resource='socket.io')
server.start()
gevent.spawn(send_queued_messages_loop)

try:
    stopper.wait()
except KeyboardInterrupt:
    print
No matter where you want to terminate your process from, all you need to do is call stopper.set().
The try..except is not really necessary but I prefer not getting a stacktrace on a clean CTRL-C exit.
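For example, a hedged sketch of wiring that up to a /shutdown route (the route name and the final server.stop() call are assumptions on my part, not from the original answer): calling stopper.set() from anywhere unblocks the main greenlet, which can then stop the server.

from gevent.event import Event
from flask import Flask
from socketio.server import SocketIOServer   # gevent-socketio

app = Flask(__name__)
stopper = Event()

@app.route('/shutdown')
def shutdown():
    # Trigger the event; the greenlet waiting on it below will wake up.
    stopper.set()
    return 'shutting down'

server = SocketIOServer(('localhost', 2345), app, resource='socket.io')
server.start()

try:
    stopper.wait()        # blocks until stopper.set() is called somewhere
except KeyboardInterrupt:
    pass

server.stop()             # close the listening socket and let the process exit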

Slow Python HTTP server on localhost

I am experiencing some performance problems when creating a very simple Python HTTP server. The key issue is that performance varies depending on which client I use to access it, where the server and all clients are run on the local machine. For instance, a GET request issued from a Python script (urllib2.urlopen('http://localhost/').read()) takes just over a second to complete, which seems slow considering that the server is under no load. Running the GET request from Excel using MSXML2.ServerXMLHTTP also feels slow. However, requesting the data from Google Chrome or from RCurl, the curl add-in for R, yields an essentially instantaneous response, which is what I would expect.
Adding further to my confusion is that I do not experience any performance problems for any client when I am on my computer at work (the performance problems are on my home computer). Both systems run Python 2.6, although the work computer runs Windows XP instead of 7.
Below is my very simple server example, which simply returns 'Hello world' for any get request.
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        print("Just received a GET request")
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()
        self.wfile.write('Hello world')
        return

    def log_request(self, code=None, size=None):
        print('Request')

    def log_message(self, format, *args):
        print('Message')

if __name__ == "__main__":
    try:
        server = HTTPServer(('localhost', 80), MyHandler)
        print('Started http server')
        server.serve_forever()
    except KeyboardInterrupt:
        print('^C received, shutting down server')
        server.socket.close()
Note that in MyHandler I override the log_request() and log_message() functions. The reason is that I read that a fully-qualified domain name lookup performed by one of these functions might be a reason for a slow server. Unfortunately setting them to just print a static message did not solve my problem.
Also, notice that I have put in a print() statement as the first line of the do_GET() routine in MyHandler. The slowness occurs prior to this message being printed, meaning that none of the stuff that comes after it is causing a delay.
The request handler issues an inverse name lookup in order to display the client name in the log. My Windows 7 machine issues a first DNS lookup that fails with no delay, followed by two successive NetBIOS name queries to the HTTP client, and each one runs into a 2-second timeout, for a 4-second total delay!
Have a look at https://bugs.python.org/issue6085
Another fix that worked for me is to override BaseHTTPRequestHandler.address_string() in my request handler with a version that does not perform the name lookup
def address_string(self):
    host, port = self.client_address[:2]
    # return socket.getfqdn(host)
    return host
Philippe
This does not sound like a problem with the code. A nifty way of troubleshooting an HTTP server is to connect to it with telnet on port 80. Then you can type something like:
GET /index.html HTTP/1.1
host: www.blah.com
<enter> <enter>
and observe the server's response. See if you get a delay using this approach.
You may also want to turn off any firewalls to see if they are responsible for the slowdown.
Try replacing localhost with 127.0.0.1. If that solves the problem, then that is a clue that the FQDN lookup may indeed be the cause.
Replacing localhost with 127.0.0.1 can solve the problem:)
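If the slow client is the urllib2 call from the question, the substitution would look like this (an assumption on my part; the same change can also be applied to the address the server binds to):

import urllib2

# Use the literal loopback address instead of the name 'localhost',
# so no name resolution is involved on the client side.
print urllib2.urlopen('http://127.0.0.1/').read()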

Python: How to shutdown a threaded HTTP server with persistent connections (how to kill readline() from another thread)?

I'm using python2.6 with HTTPServer and the ThreadingMixIn, which will handle each request in a separate thread. I'm also using HTTP 1.1 persistent connections ('Connection: keep-alive'), so neither the server nor the client will close a connection after a request.
Here's roughly what the request handler looks like
request, client_address = sock.accept()
rfile = request.makefile('rb', rbufsize)
wfile = request.makefile('wb', wbufsize)

global server_stopping
while not server_stopping:
    request_line = rfile.readline()  # 'GET / HTTP/1.1'
    # etc - parse the full request, write to wfile with server response, etc

wfile.close()
rfile.close()
request.close()
The problem is that if I stop the server, there will still be a few threads waiting on rfile.readline().
I would put a select([rfile, closefile], [], []) above the readline() and write to closefile when I want to shut down the server, but I don't think it would work on Windows because select only works with sockets there.
My other idea is to keep track of all the running requests and call rfile.close() on them, but then I get broken pipe errors.
Ideas?
You're almost there: the correct approach is to call rfile.close(), catch the broken pipe errors, and exit your loop when that happens.
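A minimal sketch of that approach, purely illustrative (the open_files set, the shutdown_server helper, and the exact exception types are assumptions; what a readline() interrupted by a close from another thread raises can vary by platform):

import socket

open_files = set()           # rfile objects of connections currently being served
server_stopping = False

def handle_connection(request):
    rfile = request.makefile('rb', -1)
    wfile = request.makefile('wb', 0)
    open_files.add(rfile)
    try:
        while not server_stopping:
            request_line = rfile.readline()   # blocks between keep-alive requests
            if not request_line:
                break                         # client closed the connection
            # ... parse the request and write the response to wfile ...
    except (socket.error, IOError, ValueError):
        # readline() on a file closed by the shutdown thread ends up here;
        # treat it as a normal signal to stop serving this connection.
        pass
    finally:
        open_files.discard(rfile)
        for f in (wfile, rfile):
            try:
                f.close()
            except Exception:
                pass
        request.close()

def shutdown_server():
    global server_stopping
    server_stopping = True
    for f in list(open_files):
        f.close()             # wakes up threads blocked in readline()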
If you set daemon_threads to true in your HTTPServer subclass, the activity of the threads will not prevent the server from exiting.
class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    daemon_threads = True
You could work around the Windows problem by making closefile a socket, too -- after all, since it's presumably something that's opened by your main thread, it's up to you to decide whether to open it as a socket or a file;-).
