TCPServer vs HTTPServer - python

I am writing a simple python3 webserver. I have tried various tutorials and all work pretty well. Nevertheless there is a difference that I don't understand.
In one tutorial, they use HTTPServer as follows:
server = HTTPServer(('', PORT_NUMBER), myHandler)
server.serve_forever()
In another tutorial, they use socketserver.TCPServer as follows:
with socketserver.TCPServer(('', PORT_NUMBER), myHandler) as httpd:
httpd.serve_forever()
What is the difference between both methods? All I need is a simple webserver that is able to receive JSON files through POSTs and aswer with another JSON. In both cases, I would use the same handler:
class myHandler(BaseHTTPRequestHandler):
def _set_headers(self):
self.send_response(200)
self.send_header('Content-type', 'application/json')
self.end_headers()
def do_GET(self):
self._set_headers()
self.wfile.write("{dummy:'dummy'}")
def do_POST(self):
# Doesn't do anything with posted data for this example
self._set_headers()
self.wfile.write("{dummy:'dummy'}")
Does one of the methods suit my needs better? Is there an even better way of writing this server for my needs?
Thanks for your help and time!

TCP is an OSI Layer 4 transport protocol, whereas HTTP is a higher Layer 7 application protocol, built on top of TCP.
In fact, it looks like HTTPServer inherits directly from socketserver.TCPServer, according to the documentation:
One class, HTTPServer, is a socketserver.TCPServer subclass. It
creates and listens at the HTTP socket, dispatching the requests to a
handler.
For example, TCP will take care of establishing a connection (the three-way handshake with SYN, SYN-ACK and ACK) but it doesn't really prescribe much in terms of the structure of the data interaction (request/response). If you use the TCP protocol, you'll generally need to write all your own data processing code.
All I need is a simple webserver that is able to receive JSON files
through POSTs and aswer with another JSON
This suggests the HTTP server is better suited for your need here, as the concept of a POST, and your description of an answer with another JSON is probably going to include a response (with a response body, response headers, and a response status). All of these will be HTTP concepts. As the documentation for http.server. BaseHTTPRequestHandler states, it will include an instance variable called command that is your request method (GET, for instance).
I can't tell exactly how myhandler is assigned in your example code, but looking at some of the other documentation examples, it looks like a http.server.SimpleHTTPRequestHandler:
import http.server
import socketserver
PORT = 8000
Handler = http.server.SimpleHTTPRequestHandler
with socketserver.TCPServer(("", PORT), Handler) as httpd:
print("serving at port", PORT)
httpd.serve_forever()
A frequently pattern of instantiating work for a server appears to be
To instantiate a lower-level protocol server (ie. TCP) and pass in a higher level handler (like http.server.SimpleHTTPRequestHandler), hence the socketserver.TCPServer(("", PORT), myhandler).
To use it within a context manager (with keyword), most likely because you need to tear down / deallocate resources after the server finishes execution.

Related

http.server not serving pages as per example from python.org

I'm trying to start a simple dir server on a local network But I am getting this error
Error response
Error code: 501
Message: Unsupported method ('GET').
Error code explanation: HTTPStatus.NOT_IMPLEMENTED - Server does not
support this operation.
This is the example given at https://docs.python.org/3/library/http.server.html If I run it from the command line it works python3 -m http.server. I need to control this server over time so I need to turn it on for a while and turn it off automatically
from http.server import BaseHTTPRequestHandler, HTTPServer
def run(server_class=HTTPServer, handler_class=BaseHTTPRequestHandler):
server_address = ('0.0.0.0', 8000)
httpd = server_class(server_address, handler_class)
httpd.serve_forever()
The answer is in the documentation that you linked to:
The HTTPServer must be given a RequestHandlerClass on
instantiation, of which this module provides three different variants:
class http.server.BaseHTTPRequestHandler(request, client_address, server)
This class is used to handle the HTTP requests that arrive at the
server. By itself, it cannot respond to any actual HTTP requests; it
must be subclassed to handle each request method (e.g. GET or
POST). ...
For your case you should use http.server.SimpleHTTPRequestHandler instead:
class http.server.SimpleHTTPRequestHandler(request, client_address, server)
This class serves files from the current directory and below, directly
mapping the directory structure to HTTP requests.

How to solve twisted.internet.error.CannotListenError: Couldn't listen on any:8081: [Errno 98] Address already in use [duplicate]

This question already has answers here:
Python: Binding Socket: "Address already in use"
(13 answers)
Closed 8 years ago.
I have a twisted python server which runs on port 8080, and i have written different API's which runs on this server. so i want all these API's to run on a single port no. but when i try to use same port all API's for ex : 8081 and run at the same time using python interpreter. at that time i am getting this error : twisted.internet.error.CannotListenError: Couldn't listen on any:8081: [Errno 98] Address already in use.
As i am new to twisted so don't know much of the things and there is no proper documentation on twisted . please someone guide me to solve this error :
Here is the Code snippets :
from twisted.internet import epollreactor
epollreactor.install()
from zope.interface import implements
from twisted.internet import reactor,interfaces
from functools import partial
from pymongo import Connection
import json
from bson.objectid import ObjectId
import server_1
import replacePlus_Space
global data, data_new, datadB, coll_auth, coll_person
class Login_user(server_1.HTTPEchoFactory):
def __init__(self):
server_1.HTTPEchoFactory.__init__(self,"testsite")
server_1.HTTPEchoFactory.initResources(self)
self.initResources()
def initResources(self):
print "in Login"
self.responses["/login_user"] = partial(self.user_login)
# To connect to DB and Check error cases and insert into mongoDB..!!!
def user_login(self, client):
# some functinality..!!!!
d = Login_user()
reactor.listenTCP(8081,d)
reactor.run()
Second code snippet is :
from twisted.internet import epollreactor
epollreactor.install()
from zope.interface import implements
from twisted.internet import reactor,interfaces
from functools import partial
from pymongo import Connection
import json
from bson.objectid import ObjectId
import server_1
import replacePlus_Space
class AddFriendDB(server_1.HTTPEchoFactory):
def __init__(self):
server_1.HTTPEchoFactory.__init__(self,"testsite")
server_1.HTTPEchoFactory.initResources(self)
self.initResources()
def initResources(self):
print "in add friend"
self.responses["/addFriend_DB"] = partial(self.addFriendDB)
# To connect to DB and Check error cases and insert into mongoDB..!!!
def addFriendDB(self, client):
#some functionality here..!!!
d = AddFriendDB()
reactor.listenTCP(8081,d)
reactor.run()
Translating the question
Many of the specifics in your question don't make sense, so I'm translating your question into:
I have a twisted python standalone program which runs on port 8080, and i have written a different standalone program which runs on this server. I want both programs to run on a single port no. When i try to use same port for all programs ex : using port 8081 for both programs. I am getting this error :twisted.internet.error.CannotListenError: Couldn't listen on any:8081: [Errno 98] Address already in use. As i am new to twisted i don't know much of the things and there is no proper documentation on twisted. please someone guide me to solve this error.
Twisted has great documentation
You commented that:
[...] there is no proper documentation on twisted
That statement is blatantly false.
Twisted certainly has proper documentation, and as compared to most frameworks it has excellent documentation. To name just a few of the many high quality documentation sources:
Dave Peticola's (krondo) Twisted Introduction - an excellent and deep introduction to twisted that begins by explaining the technology Twisted is built upon. If you really want to understand twisted, this (long) introduction is the place to start
The high quality and extensive documentation on on Twisted's primary website twistedmatrix.com
The source itself. Twisted has well commented and surprisingly understand source, if the above documentation doesn't teach you what you need figure it out from the code.
What causes "Address already in use"
As previously answered by Bart Friederichs:
Only one process can listen on a certain port. If you start process 1 to listen on port 8081, and then start another process on that same port, you get this error.
It is an error from the TCP stack, not from python or twisted.
This is a fundament truth of TCP/IP stacks across all operating systems (or at least all process based operating systems that could run Python).
The error is your operating system reminding you that when data comes to a IP port the OS/IP-stack was designed to only forward to one process, a process that is implementing an application level protocol on that port. The error happens when a second program attempts to re-register a port some other program has already registered.
Work-arounds in general
Upon running into a port reuse issue like this you have to ask yourself:
Are the two programs even running the same application-level protocol?
If they are the same protocol, does the protocol have routing such that for any given transaction the correct sub-program/routine be identified?
Are the two programs at a different location in the routing?
If they aren't the same protocol (I.E. one was HTTP and the other is FTP) or they are the same protocol but its a protocol that doesn't have routing (I.E. two instances of NTP) then there is no easy way to mix them because your trying to break the laws of IP (and the way that application protocol implementations are built). The only solution will be to encapsulate one (or both) of the protocol(s) into a common protocol that also has application-level routing (I.E. encapsulating FTP inside of HTTP and using the URI to route)
If they are the same protocol, and the protocol provides for a per-transaction routing (I.E. the URI inside of HTTP transactions) and they aren't at the same routing location, then there are easier solutions, namely:
Merge the two application into one.
If they are the same routable protocol but at a different location (I.E. HTTP protocol with the URI for the first program at /login and second program at /addfriend) it should be trivial to pull out the post-routing logic of the two programs/process and merge them into one program that does both functions
Front-end the programs with a redirector (This solution is only easy with HTTP because of the tools available).
If you have HTTP protocol programs that have routing to separate locations, but for some reason its hard to merge the logic together (I.E. one program is written in Java, the other in Python) then you can look at front-ending both programs with a redirector like nginx
The way you would use a redirector in a case like this is to run your two apps on two different unused ports (I.E. 12001, 12002), and then run the redirector on the port you want the service to be on, running it with a config file that will redirect to your two programs via their unique routing locations (via configs like SF: How to redirect requests to a different domain/url with nginx)
Code example
The following three programs illustrate the process of merging two programs into one, so both sets of logic can be accessed from the same port.
Two separate programs:
If you run the following code a web server will be started up on localhost:8081. If you then point your web browser at http://127.0.0.1:8081/blah the blah page will be displayed.
#!/usr/bin/python
from twisted.internet import defer, protocol, reactor # the reactor
from twisted.web.server import Site # make the webserver go
from twisted.web.resource import Resource
class BlahPage(Resource):
idLeaf = True
def render_GET(self, request):
return "<html><body>Blah Page!</body></html>"
class ShutdownPage(Resource):
def render_GET(self, request):
reactor.stop()
webroot = Resource()
webroot.putChild("blah", BlahPage())
webroot.putChild("shutdown", ShutdownPage())
def main():
# Register the webserver (TCP server) into twisted
webfactory = Site(webroot)
reactor.listenTCP(8081, webfactory)
print ("Starting server")
reactor.run()
if __name__ == '__main__':
main()
This code will start a web server on localhost:8082. If you then point your web browser at http://127.0.0.1:8082/foo the foo page will be displayed.
#!/usr/bin/python
from twisted.internet import defer, protocol, reactor # the reactor
from twisted.web.server import Site # make the webserver go
from twisted.web.resource import Resource
class FooPage(Resource):
idLeaf = True
def render_GET(self, request):
return "<html><body>Foo Page!</body></html>"
class ShutdownPage(Resource):
def render_GET(self, request):
reactor.stop()
webroot = Resource()
webroot.putChild("foo", FooPage())
webroot.putChild("shutdown", ShutdownPage())
def main():
# Register the webserver (TCP server) into twisted
webfactory = Site(webroot)
reactor.listenTCP(8082, webfactory)
print ("Starting server")
reactor.run()
if __name__ == '__main__':
main()
Merging the logic
This code is the merger of the two previous programs, as you can see it only required copying a small amount of code to glue both of the above into one that allows for access to http://127.0.0.1:8080/blah and http://127.0.0.1:8080/blah.
#!/usr/bin/python
from twisted.internet import defer, protocol, reactor # the reactor
from twisted.web.server import Site # make the webserver go
from twisted.web.resource import Resource
class BlahPage(Resource):
idLeaf = True
def render_GET(self, request):
return "<html><body>Blah Page!</body></html>"
class FooPage(Resource):
idLeaf = True
def render_GET(self, request):
return "<html><body>Foo Page!</body></html>"
class ShutdownPage(Resource):
def render_GET(self, request):
reactor.stop()
webroot = Resource()
webroot.putChild("foo", FooPage())
webroot.putChild("blah", BlahPage())
webroot.putChild("shutdown", ShutdownPage())
def main():
# Register the webserver (TCP server) into twisted
webfactory = Site(webroot)
reactor.listenTCP(8080, webfactory)
print ("Starting server")
reactor.run()
if __name__ == '__main__':
main()
Only one process can listen on a certain port. If you start process 1 to listen on port 8081, and then start another process on that same port, you get this error.
It is an error from the TCP stack, not from python or twisted.
Fix it either by choosing different ports for each process, or create a forking server.

How do I send and receive HTTP POST requests in Python?

I have these two Python scripts I'm using to attempt to work out how to send and receive POST requests in Python:
The Client:
import httplib
conn = httplib.HTTPConnection("localhost:8000")
conn.request("POST", "/testurl")
conn.send("clientdata")
response = conn.getresponse()
conn.close()
print(response.read())
The Server:
from BaseHTTPServer import BaseHTTPRequestHandler,HTTPServer
ADDR = "localhost"
PORT = 8000
class RequestHandler(BaseHTTPRequestHandler):
def do_POST(self):
print(self.path)
print(self.rfile.read())
self.send_response(200, "OK")
self.end_headers()
self.wfile.write("serverdata")
httpd = HTTPServer((ADDR, PORT), RequestHandler)
httpd.serve_forever()
The problem is that the server hangs on self.rfile.read() until conn.close() has been called on the client but if conn.close() is called on the client the client cannot receive a response from the server. This creates a situation where one can either get a response from the server or read the POST data but never both. I assume there is something I'm missing here that will fix this problem.
Additional information:
conn.getresponse() causes the client to hang until the response is received from the server. The response doesn't appear to be received until the function on the server has finished execution.
There are a couple of issues with your original example. The first is that if you use the request method, you should include the message body you want to send in that call, rather than calling send separately. The documentation notes send() can be used as an alternative to request:
As an alternative to using the request() method described above, you
can also send your request step by step, by using the four functions
below.
You just want conn.request("POST", "/testurl", "clientdata").
The second issue is the way you're trying to read what's sent to the server. self.rfile.read() attempts to read the entire input stream coming from the client, which means it will block until the stream is closed. The stream won't be closed until connection is closed. What you want to do is read exactly how many bytes were sent from the client, and that's it. How do you know how many bytes that is? The headers, of course:
length = int(self.headers['Content-length'])
print(self.rfile.read(length))
I do highly recommend the python-requests library if you're going to do more than very basic tests. I also recommend using a better HTTP framework/server than BaseHTTPServer for more than very basic tests (flask, bottle, tornado, etc.).
Long time answered but came up during a search so I bring another piece of answer. To prevent the server to keep the stream open (resulting in the response never being sent), you should use self.rfile.read1() instead of self.rfile.read()

Releasing socket from SocketServer.ForkingTCPServer python

I am creating a ForkingTCPServer to act as a proxy. It handles perfectly fine, but will not release the socket after executing. I have already tried .shutdown() and .server_close() but they just cause the program to freeze up. How can I release the socket so that it can be used again?
import SocketServer
import SimpleHTTPServer
import urllib
PORT = 8080
class Proxy(SimpleHTTPServer.SimpleHTTPRequestHandler):
def do_GET(self):
self.copyfile(urllib.urlopen("http://"+self.path[1:]), self.wfile)
httpd = SocketServer.ForkingTCPServer(('', PORT), Proxy)
print "serving at port", PORT
httpd.handle_request()
The handle_request method handles only one request, so the parent process answers only one request.
You can put the calls to handle_request in a loop.
Otherwise, if you mean the fact that you get an "in use" error when you fire up a new copy of the server, that's because you have to wait for the TCP 2MSL timer to time out. To work around that, set SO_REUSEADDR on the listening socket. (Set self.allow_reuse_address, although I'm not quite sure where to set it, having not done that.)

Slow Python HTTP server on localhost

I am experiencing some performance problems when creating a very simple Python HTTP server. The key issue is that performance is varying depending on which client I use to access it, where the server and all clients are being run on the local machine. For instance, a GET request issued from a Python script (urllib2.urlopen('http://localhost/').read()) takes just over a second to complete, which seems slow considering that the server is under no load. Running the GET request from Excel using MSXML2.ServerXMLHTTP also feels slow. However, requesting the data Google Chrome or from RCurl, the curl add-in for R, yields an essentially instantaneous response, which is what I would expect.
Adding further to my confusion is that I do not experience any performance problems for any client when I am on my computer at work (the performance problems are on my home computer). Both systems run Python 2.6, although the work computer runs Windows XP instead of 7.
Below is my very simple server example, which simply returns 'Hello world' for any get request.
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
class MyHandler(BaseHTTPRequestHandler):
def do_GET(self):
print("Just received a GET request")
self.send_response(200)
self.send_header("Content-type", "text/html")
self.end_headers()
self.wfile.write('Hello world')
return
def log_request(self, code=None, size=None):
print('Request')
def log_message(self, format, *args):
print('Message')
if __name__ == "__main__":
try:
server = HTTPServer(('localhost', 80), MyHandler)
print('Started http server')
server.serve_forever()
except KeyboardInterrupt:
print('^C received, shutting down server')
server.socket.close()
Note that in MyHandler I override the log_request() and log_message() functions. The reason is that I read that a fully-qualified domain name lookup performed by one of these functions might be a reason for a slow server. Unfortunately setting them to just print a static message did not solve my problem.
Also, notice that I have put in a print() statement as the first line of the do_GET() routine in MyHandler. The slowness occurs prior to this message being printed, meaning that none of the stuff that comes after it is causing a delay.
The request handler issues a inverse name lookup in order to display the client name in the log. My Windows 7 issues a first DNS lookup that fails with no delay, followed by 2 successive NetBIOS name queries to the HTTP client, and each one run into a 2 sec timeout = 4 seconds delay !!
Have a look at https://bugs.python.org/issue6085
Another fix that worked for me is to override BaseHTTPRequestHandler.address_string() in my request handler with a version that does not perform the name lookup
def address_string(self):
host, port = self.client_address[:2]
#return socket.getfqdn(host)
return host
Philippe
This does not sound like a problem with the code. A nifty way of troubleshooting an HTTP server is to connect to it to telnet to it on port 80. Then you can type something like:
GET /index.html HTTP/1.1
host: www.blah.com
<enter> <enter>
and observe the server's response. See if you get a delay using this approach.
You may also want to turn off any firewalls to see if they are responsible for the slowdown.
Try replacing 127.0.0.1 for localhost. If that solves the problem, then that is a clue that the FQDN lookup may indeed be the possible cause.
Replacing localhost with 127.0.0.1 can solve the problem:)

Categories