I have my own class derived from BaseHTTPRequestHandler, which implements my specific GET method. This works fine:
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class MyHTTPRequestHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            """ my implementation of the GET method """

    myServer = HTTPServer(("127.0.0.1", 8099), MyHTTPRequestHandler)
    myServer.handle_request()
But why do I need to pass my class MyHTTPRequestHandler to the HTTPServer? I know that it is required by documentation:
class http.server.BaseHTTPRequestHandler(request, client_address, server)
This class is used to handle the HTTP requests that arrive at the server. By itself, it cannot respond to any actual HTTP requests; it must be subclassed to handle each request method (e.g. GET or POST). BaseHTTPRequestHandler provides a number of class and instance variables, and methods for use by subclasses.
The handler will parse the request and the headers, then call a method specific to the request type. The method name is constructed from the request. For example, for the request method SPAM, the do_SPAM() method will be called with no arguments. All of the relevant information is stored in instance variables of the handler. Subclasses should not need to override or extend the __init__() method.
But I want to pass an instantiated object of my subclass instead. I don't understand why it has been designed like that, and it looks like a design flaw to me. The purpose of object-oriented programming with polymorphism is that I can subclass to implement specific behavior behind the same interface, so this seems to me an unnecessary restriction.
That is what I want:
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class MyHTTPRequestHandler(BaseHTTPRequestHandler):
        def __init__(self, myAdditionalArg):
            self.myArg = myAdditionalArg

        def do_GET(self):
            """ my implementation of the GET method """
            self.wfile.write(bytes(self.myArg, "utf-8"))
            # ...

    myReqHandler = MyHTTPRequestHandler("mySpecificString")
    myServer = HTTPServer(("127.0.0.1", 8099), myReqHandler)
    myServer.handle_request()
But if I do that, evidently I receive the expected error message:
TypeError: 'MyHTTPRequestHandler' object is not callable
How can I work around this so that I can still print a specific string?
There is also a second reason why I need this: I want MyHTTPRequestHandler to also provide more information about the client that uses the GET method to retrieve data from the server (I want to retrieve the HTTP headers sent by the client browser).
I have just one client, which sends a single request to the server. If a solution works in a more general context, I'll be happy, but I won't need it for my current project.
Does anybody have an idea how to do that?
A server needs to create request handlers as needed to handle all the requests coming in. It would be bad design to have only one request handler. If you passed in an instance, the server could only ever handle one request at a time, and if there were any side effects, it would go very badly. Any sort of change of state is beyond the scope of what a request handler should do.
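If the goal is only to hand an extra argument to the handler, one possible sketch follows. It relies on the fact that the server calls the handler "class" as handler(request, client_address, server) for every request, so any callable with that signature should do, including a functools.partial that binds an extra leading argument. GreetingHandler, fetch_once, and the greeting parameter are hypothetical names chosen for illustration, and the port is picked by the operating system.

```python
import functools
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that accepts one extra constructor argument."""

    def __init__(self, greeting, *args, **kwargs):
        # Store the attribute first: the base __init__ processes the whole
        # request (including do_GET) before it returns.
        self.greeting = greeting
        super().__init__(*args, **kwargs)

    def do_GET(self):
        body = self.greeting.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(greeting):
    """Serve exactly one request on a free local port and return its body."""
    # The partial is called once per request with
    # (request, client_address, server) appended after `greeting`.
    handler_factory = functools.partial(GreetingHandler, greeting)
    server = HTTPServer(("127.0.0.1", 0), handler_factory)
    server.timeout = 5  # do not wait forever if no client connects
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        body = conn.getresponse().read().decode("utf-8")
        conn.close()
        return body
    finally:
        worker.join()
        server.server_close()
```

A fresh handler object is still created for each request, so the per-request isolation described above is kept; only the bound argument is shared.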
BaseHTTPRequestHandler has a method to handle message logging, and an attribute self.headers containing all the header information. It defaults to logging messages to sys.stderr, so you could run $ python my_server.py 2> log_file.txt to capture the log messages. Or you could write to a file in your own handler:
    import os
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class MyHTTPRequestHandler(BaseHTTPRequestHandler):
        # saves in the current working directory
        log_file = os.path.join(os.path.abspath(''), 'log_file.txt')
        myArg = 'some fancy thing'

        def do_GET(self):
            # do things
            self.wfile.write(bytes(self.myArg, 'utf-8'))
            # do more
            msg_format = 'start header:\n%s\nend header\n'
            self.log_message(msg_format, self.headers)

        def log_message(self, format_str, *args):
            # This is basically a copy of the original method, just changing the destination
            with open(self.log_file, 'a') as logger:
                logger.write("%s - - [%s] %s\n" %
                             (self.address_string(),
                              self.log_date_time_string(),
                              format_str % args))

    handler = MyHTTPRequestHandler
    myServer = HTTPServer(("127.0.0.1", 8099), handler)
It is possible to derive a specific HTTPServer class (MyHttpServer) that has the following attributes:
myArg: the specific "message text" that the HTTP request handler should print
headers: a dictionary storing the headers set by an HTTP request handler
The server class must be shipped together with MyHTTPRequestHandler. Furthermore, the implementation only works properly under the following conditions:
only one HTTP request is handled by the server at a time (otherwise the data kept in the attributes is corrupted)
MyHTTPRequestHandler is only used with MyHttpServer and vice versa (otherwise unknown side effects such as exceptions or data corruption can occur)
That's why both classes must be packaged and shipped together, like this:
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class MyHTTPRequestHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.wfile.write(bytes(self.server.myArg, 'utf-8'))
            # ...
            # publish the request headers on the server object
            self.server.headers = self.headers

    class MyHttpServer(HTTPServer):
        def __init__(self, server_address, myArg, handler_class=MyHTTPRequestHandler):
            super().__init__(server_address, handler_class)
            self.myArg = myArg
            self.headers = dict()
These classes could be used like this, where only one request from a client (e.g. a web browser) is answered by the server:

    def get_header():
        httpd = MyHttpServer(("127.0.0.1", 8099), "MyFancyText", MyHTTPRequestHandler)
        httpd.handle_request()
        return httpd.headers
I have a custom BaseHTTPServer.BaseHTTPRequestHandler that can (partially) be seen below. It has a special method that allows me to assign a reply generator - a class that takes some data in (for example the values of some of the parameters in a GET request) and generates an XML reply that can then be sent back to the client as a reply to the request:
    class CustomHandler(BaseHTTPServer.BaseHTTPRequestHandler):
        # implementation
        # ...
        def set_reply_generator(self, generator):
            self.generator = generator

        def do_GET(self):
            # process GET request
            # ...
            # and generate reply using generator
            reply = self.generator.generate(...)  # ... are some parameters that were in the GET request
            # Encode
            reply = reply.encode()
            # set headers
            self.__set_headers__(data_type='application/xml', data=reply)
            # and send away
            self.wfile.write(reply)
            self.wfile.write('\n'.encode())
I pass this handler to the BaseHTTPServer.HTTPServer:
    def run(addr='127.0.0.1', port=8080):
        server_addr = (addr, port)
        # the handler class (not an instance) is passed as the second argument
        server = BaseHTTPServer.HTTPServer(server_addr, CustomHandler)
        server_thread = threading.Thread(target=server.serve_forever)
        server_thread.start()
Since the constructor of BaseHTTPServer.HTTPServer expects a class and not an instance to be passed as the RequestHandlerClass argument, I cannot just create an instance of my handler, call set_reply_generator() and then pass it on. Even if that worked, I would still want to be able to access the handler later on (for example, if a POST request changes the reply generator used for GET requests), and for that I need to know how to retrieve the instance that the server is using.
I've looked here but I was unable to find it (perhaps missed it). I have the bad feeling that it is private (aka __...__).
All I was able to find out so far is that the class of the handler that the server uses can be retrieved through the RequestHandlerClass member, which however is not the same as retrieving an instance of the handler that will allow me to call the set_reply_generator(...).
In addition I tried to actually create an instance of the custom handler but then I landed in a chicken-and-egg issue:
the constructor of the handler requires you to pass the instance of the server
the server requires you to pass the handler
This is a sort of indirect proof that the HTTPServer is the one responsible for instantiating the handler.
Already answered here. There is no need to access the handler instance directly; instead, create a static class member that can be accessed without an instance and is still used whenever a handler needs it.
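A minimal sketch of that idea, written for Python 3's http.server rather than the Python 2 BaseHTTPServer module used in the question. TagGenerator, get, and demo are hypothetical names used only for illustration; the real reply generator would build the XML reply instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class TagGenerator:
    """Hypothetical reply generator; stands in for the real XML generator."""

    def __init__(self, tag):
        self.tag = tag

    def generate(self, path):
        return "<%s>%s</%s>" % (self.tag, path, self.tag)


class CustomHandler(BaseHTTPRequestHandler):
    # Class attribute: visible to every handler instance the server creates.
    generator = None

    def do_GET(self):
        reply = self.generator.generate(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def get(port, path):
    """Fetch one path from the local server and return the decoded body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", path)
    body = conn.getresponse().read().decode("utf-8")
    conn.close()
    return body


def demo():
    CustomHandler.generator = TagGenerator("first")
    server = HTTPServer(("127.0.0.1", 0), CustomHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        before = get(port, "/a")
        # Swap the generator on the class while the server keeps running.
        CustomHandler.generator = TagGenerator("second")
        after = get(port, "/a")
    finally:
        server.shutdown()
        thread.join()
        server.server_close()
    return before, after
```

Because the attribute lives on the class, a do_POST method could replace it in the same way (CustomHandler.generator = ...), and later GET requests would pick up the new generator.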
Suppose I have the following application:
    import cherrypy

    class HelloWorld(object):
        def my_handler(self, a):
            assert a == "foo"

        def index(self, a):
            # Register parameter a/my_handler/...
            return "A" * 100000
        index.exposed = True

    cherrypy.quickstart(HelloWorld())
And the following client:
    import httplib

    h = httplib.HTTPConnection('localhost', 8080)
    h.request("GET", "/?a=foo")
    r = h.getresponse(True)
    print r.read(10)
I would like to deal explicitly with the case where a client tears down the connection (either politely or not) and thus hasn't received the response. So, in this example, my_handler should be called because the GET response wasn't delivered before the connection closed.
The CherryPy documentation refers to errors that are a bit higher up the stack than I'm aiming for (cf. cperror, cprequest's error_response). I dug through some of the CherryPy source, and there is some mention of socket errors in e.g. read_request_line and simple response. It is unclear to me whether these are exposed anywhere and, more importantly, whether there is a canonical way to hook into them.
Why, you might ask, would I be interested in such a thing? There's a slight abuse of HTTP semantics and it's important to verify that a GET response was delivered; otherwise some accounting needs to be done.
I have a question that could belong to Twisted or could be about Python itself.
My problem, like the other one, is related to the disconnection process in Twisted.
As I read on this site, I have to perform the following steps:
The server must stop listening.
The client connection must disconnect.
The server connection must disconnect.
According to what I read on the previous page, the first step requires calling the stopListening method.
In the example mentioned on that page, all actions are performed in the same script, which makes it easy to access the different variables and methods.
In my case, the server and the client are in different files and different locations.
I have a function that creates a server and assigns a protocol, and I want the client protocol, in another file, to make an AMP call to a method that stops the connector.
The AMP call invokes the SendMsg command.
    class TESTServer(protocol.Protocol):
        factory = None
        sUsername = ""
        credProto = None
        bGSuser = None
        slot = None

        """
        Here was uninteresting code.
        """
            # upwards=self.bGSuser, forwarded=True, tx_timestamp=iTimestamp,\
            # message=sMsg)
            log.msg("self.connector")
            log.msg(self.connector)
            return {'bResult': True}
        SendMsg.responder(vSendMsg)

        def _testfunction(self):
            logger = logging.getLogger('server')
            log.startLogging(sys.stdout)
            pf = CredAMPServerFactory()
            sslContext = ssl.DefaultOpenSSLContextFactory('key/server.pem',\
                                                          'key/public.pem',)
            self.connector = reactor.listenSSL(1234, pf, contextFactory = sslContext,)
            log.msg('Server running...')
            reactor.run()

    if __name__ == '__main__':
        TESTServer()._testfunction()
The class CredAMPServerFactory assigns the corresponding protocol.

    class CredAMPServerFactory(ServerFactory):
        """
        Server factory useful for creating L{CredReceiver} and L{SATNETServer} instances.
        This factory takes care of associating a L{Portal} with the L{CredReceiver}
        instances it creates. If the login is successfully achieved, a L{SATNETServer}
        instance is also created.
        """
        protocol = CredReceiver
In the "CredReceiver" class I have a call that assigns the protocol to the TestServer class. I do this to make calls using the AMP method "Responder".

    self.protocol = SATNETServer

My problem is that when I make the call, the program responds with an error indicating that the CredReceiver object has no connector attribute.

      File "/home/sgongar/Dev/protocol/server_amp.py", line 248, in vSendMsg
        log.msg(self.connector)
    exceptions.AttributeError: 'CredReceiver' object has no attribute 'connector'

How could I do this? Does anyone know of a similar example that I could look at?
Thank you.
EDIT.
Server side:
server_amp.py
Starts a reactor: reactor.listenSSL(1234, pf, contextFactory = sslContext,) from within the SATNETServer class.
Assigns pf to an instance of the CredAMPServerFactory class, which belongs to the module server.py, also from within the SATNETServer class.
server.py
Within the class CredAMPServerFactory, assigns the CredReceiver class to protocol.
Once the connection is established, the class SATNETServer is assigned to the protocol.
Client side:
client_amp
Makes a call to the SendMsg method belonging to the SATNETServer class.
So I've looked around at a few things involving writing an HTTP proxy using Python and the Twisted framework.
Essentially, like some other questions, I'd like to be able to modify the data that will be sent back to the browser. That is, the browser requests a resource and the proxy will fetch it. Before the resource is returned to the browser, I'd like to be able to modify any part of it (HTTP headers and content).
This ( Need help writing a twisted proxy ) was what I initially found. I tried it out, but it didn't work for me. I also found this ( Python Twisted proxy - how to intercept packets ), which I thought would work; however, I can only see the HTTP requests from the browser.
I am looking for any advice. Some thoughts I have are to use the ProxyClient and ProxyRequest classes and override their methods, but I read that the Proxy class itself is a combination of both.
For those who may ask to see some code, it should be noted that I have worked with only the above two examples. Any help is great.
Thanks.
To create a ProxyFactory that can modify the server's response headers and content, you could override the ProxyClient.handle*() methods:
    from twisted.python import log
    from twisted.web import http, proxy

    class ProxyClient(proxy.ProxyClient):
        """Mangle returned header, content here.

        Use `self.father` methods to modify request directly.
        """
        def handleHeader(self, key, value):
            # change response header here
            log.msg("Header: %s: %s" % (key, value))
            proxy.ProxyClient.handleHeader(self, key, value)

        def handleResponsePart(self, buffer):
            # change response part here
            log.msg("Content: %s" % (buffer[:50],))
            # make all content upper case
            proxy.ProxyClient.handleResponsePart(self, buffer.upper())

    class ProxyClientFactory(proxy.ProxyClientFactory):
        protocol = ProxyClient

    class ProxyRequest(proxy.ProxyRequest):
        protocols = dict(http=ProxyClientFactory)

    class Proxy(proxy.Proxy):
        requestFactory = ProxyRequest

    class ProxyFactory(http.HTTPFactory):
        protocol = Proxy
I've got this solution by looking at the source of twisted.web.proxy. I don't know how idiomatic it is.
To run it as a script or via twistd, add at the end:
    portstr = "tcp:8080:interface=localhost"  # serve on localhost:8080

    if __name__ == '__main__':  # $ python proxy_modify_request.py
        import sys
        from twisted.internet import endpoints, reactor

        def shutdown(reason, reactor, stopping=[]):
            """Stop the reactor."""
            if stopping: return
            stopping.append(True)
            if reason:
                log.msg(reason.value)
            reactor.callWhenRunning(reactor.stop)

        log.startLogging(sys.stdout)
        endpoint = endpoints.serverFromString(reactor, portstr)
        d = endpoint.listen(ProxyFactory())
        d.addErrback(shutdown, reactor)
        reactor.run()
    else:  # $ twistd -ny proxy_modify_request.py
        from twisted.application import service, strports

        application = service.Application("proxy_modify_request")
        strports.service(portstr, ProxyFactory()).setServiceParent(application)
Usage
$ twistd -ny proxy_modify_request.py
In another terminal:
$ curl -x localhost:8080 http://example.com
For two-way proxy using twisted see the article:
http://sujitpal.blogspot.com/2010/03/http-debug-proxy-with-twisted.html
I am learning network programming using Twisted 10 in Python. In the code below, is there any way to detect an HTTP request when data is received? Also, can I retrieve the domain name, subdomain, and port values from it, and discard the data if it is not HTTP?
    from twisted.internet import stdio, reactor, protocol
    from twisted.protocols import basic
    import re

    class DataForwardingProtocol(protocol.Protocol):
        def __init__(self):
            self.output = None
            self.normalizeNewlines = False

        def dataReceived(self, data):
            if self.normalizeNewlines:
                data = re.sub(r"(\r\n|\n)", "\r\n", data)
            if self.output:
                self.output.write(data)

    class StdioProxyProtocol(DataForwardingProtocol):
        def connectionMade(self):
            inputForwarder = DataForwardingProtocol()
            inputForwarder.output = self.transport
            inputForwarder.normalizeNewlines = True
            stdioWrapper = stdio.StandardIO(inputForwarder)
            self.output = stdioWrapper
            print "Connected to server. Press ctrl-C to close connection."

    class StdioProxyFactory(protocol.ClientFactory):
        protocol = StdioProxyProtocol

        def clientConnectionLost(self, transport, reason):
            reactor.stop()

        def clientConnectionFailed(self, transport, reason):
            print reason.getErrorMessage()
            reactor.stop()

    if __name__ == '__main__':
        import sys
        if not len(sys.argv) == 3:
            print "Usage: %s host port" % __file__
            sys.exit(1)
        reactor.connectTCP(sys.argv[1], int(sys.argv[2]), StdioProxyFactory())
        reactor.run()
protocol.dataReceived, which you're overriding, is too low-level to serve for the purpose without smart buffering that you're not doing -- per the docs I just quoted,
Called whenever data is received.
Use this method to translate to a higher-level message. Usually, some callback will be made upon the receipt of each complete protocol message.
Parameters
data: a string of indeterminate length. Please keep in mind that you will probably need to buffer some data, as partial (or multiple) protocol messages may be received! I recommend that unit tests for protocols call through to this method with differing chunk sizes, down to one byte at a time.
You appear to be completely ignoring this crucial part of the docs.
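To illustrate the buffering the docs ask for, here is a minimal sketch in plain Python 3, independent of Twisted. HeadBuffer and its feed method are hypothetical names; in a real protocol, dataReceived would play the role of feed. It assumes that a complete HTTP request head ends with a blank line (CRLF CRLF) and it ignores request bodies.

```python
class HeadBuffer:
    """Accumulate chunks until one or more complete request heads arrive."""

    TERMINATOR = b"\r\n\r\n"  # a blank line ends the request line and headers

    def __init__(self):
        self._buffer = b""
        self.heads = []  # complete heads seen so far, without the terminator

    def feed(self, data):
        # Chunks may hold a fraction of a message or several messages,
        # so append first and then extract every complete head.
        self._buffer += data
        while self.TERMINATOR in self._buffer:
            head, self._buffer = self._buffer.split(self.TERMINATOR, 1)
            self.heads.append(head)
```

Feeding the same bytes one at a time or all at once should produce the same list of heads, which is the kind of unit test the quoted documentation recommends.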
You could instead use LineReceiver.lineReceived (inheriting from protocols.basic.LineReceiver, of course) to take advantage of the fact that HTTP requests come in "lines" -- you'll still need to join up headers that are being sent as multiple lines, since as this tutorial says:
Header lines beginning with space or tab are actually part of the previous header line, folded into multiple lines for easy reading.
Once you have a nicely formatted/parsed response (consider studying twisted.web's sources to see one way it could be done),
retrieve Domain name, Sub Domain, Port values from this?
now the Host header (cf. RFC 2616, section 14.23) is the one containing this info.
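The unfolding and Host lookup described above could be sketched as follows, again in plain Python 3 and independent of Twisted. unfold_headers and parse_host are hypothetical helper names. The sketch assumes the complete request head is already available as text, defaults the port to 80 when the Host header has none, and does not handle IPv6 literals.

```python
def unfold_headers(lines):
    """Join continuation lines (leading space or tab) onto the previous header."""
    unfolded = []
    for line in lines:
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)
    return unfolded


def parse_host(head):
    """Return (hostname, port) from the Host header, or None if not HTTP-like."""
    lines = head.split("\r\n")
    request_line = lines[0].split(" ")
    # A request line looks like "GET /path HTTP/1.1"; anything else is discarded.
    if len(request_line) != 3 or not request_line[2].startswith("HTTP/"):
        return None
    for header in unfold_headers(lines[1:]):
        name, sep, value = header.partition(":")
        if sep and name.strip().lower() == "host":
            host, _, port = value.strip().partition(":")
            return host, int(port) if port else 80
    return None
```

Splitting the returned hostname on dots gives its labels; deciding which of them form the "subdomain" in general needs knowledge of the registered domain, which this sketch leaves out.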
Just based on what you seem to be attempting, I think the following would be the path of least resistance:
http://twistedmatrix.com/documents/10.0.0/api/twisted.web.proxy.html
That's the twisted class for building an HTTP Proxy. It will let you intercept the requests, look at the destination and look at the sender. You can also look at all the headers and the content going back and forth. You seem to be trying to re-write the HTTP Protocol and Proxy class that twisted has already provided for you. I hope this helps.