Failed to get the HTTP POST request body using Twisted - python

I was trying to get the HTTP POST request body by using twisted.protocols.basic.LineReceiver but failed. My code is listed below:
from twisted.internet import reactor, protocol
from twisted.protocols import basic
class PrintPostBody(basic.LineReceiver):
    def __init__(self):
        self.line_no = 0
    def lineReceived(self, line):
        print '{0}: {1}'.format(str(self.line_no).rjust(3), repr(line))
        self.line_no += 1
    def connectionLost(self, reason):
        print "conn lost"
class PPBFactory(protocol.ServerFactory):
    protocol = PrintPostBody
def main():
    f = PPBFactory()
    reactor.listenTCP(80, f)
    reactor.run()
if __name__ == '__main__':
    main()
But when I sent an HTTP POST request to that machine on port 80, only the HTTP request headers were printed out.
Sample output:
0: 'POST / HTTP/1.0'
1: 'Host: ###.##.##.##'
2: 'Referer: http://#.#####.###/?ssid=0&from=0&bd_page_type=1&uid=wiaui_1292470548_2644&pu=sz%40176_229,sz%40176_208'
3: 'Content-Length: 116'
4: 'Origin: http://#.#####.###'
5: 'Content-Type: application/x-www-form-urlencoded'
6: 'Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5'
7: 'User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US) AppleWebKit/534.11 (KHTML, like Gecko) Chrome/9.0.565.0 Safari/534.11'
8: 'Accept-Encoding: gzip,deflate,sdch'
9: 'Accept-Language: en-US,en;q=0.8'
10: 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3'
11: 'Via: 1.1 #####.###.###.##:8080 (squid/2.6.STABLE21)'
12: 'X-Forwarded-For: ###.##.###.###'
13: 'Cache-Control: max-age=0'
14: 'Connection: keep-alive'
15: ''
So the connection was not closed here but the POST body was not received either.
I have tested the network condition by running sudo nc -l 80 and it did print out the HTTP POST request body.
So, how could I get the HTTP POST request body using Twisted?
Thank you very much.

I suspect you didn't see the request body printed out because it didn't contain any newlines or end with a newline. So it got into the parse buffer of your PrintPostBody instance and sat there forever, waiting for a newline to indicate that a full line had been received. LineReceiver won't call the lineReceived callback until a full line is received.
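The buffering behaviour described above can be sketched without Twisted. The LineBuffer class below is a hypothetical, simplified stand-in for a line-oriented receive buffer, not Twisted's actual implementation; it only assumes the "\r\n" delimiter and a made-up request body.

```python
class LineBuffer:
    """Simplified stand-in for a line-oriented protocol's receive buffer."""

    delimiter = b"\r\n"

    def __init__(self):
        self._buffer = b""
        self.lines = []

    def data_received(self, data):
        self._buffer += data
        # Only complete, delimiter-terminated lines are delivered.
        while self.delimiter in self._buffer:
            line, self._buffer = self._buffer.split(self.delimiter, 1)
            self.lines.append(line)

    @property
    def pending(self):
        # Bytes that arrived but have not been terminated by a delimiter yet.
        return self._buffer


# Hypothetical request: two header lines, a blank line, and a 7-byte body
# that does not end with a newline.
request = (
    b"POST / HTTP/1.0\r\n"
    b"Content-Length: 7\r\n"
    b"\r\n"
    b"a=1&b=2"
)
buf = LineBuffer()
buf.data_received(request)
print(buf.lines)    # the header lines plus the empty line
print(buf.pending)  # the body, still waiting for a delimiter
```

The body stays in the buffer because no delimiter follows it. If you want to stay with LineReceiver, it also offers setRawMode() and rawDataReceived(), which could be used to read Content-Length bytes after the empty line, but letting Twisted Web parse the request is less work.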
Instead, you can let Twisted Web do this parsing for you:
from twisted.web.server import Site # Site is a server factory for HTTP
from twisted.web.resource import Resource
from twisted.internet import reactor
class PrintPostBody(Resource): # Resources are what Site knows how to deal with
    isLeaf = True # Disable child lookup
    def render_POST(self, request): # Define a handler for POST requests
        print request.content.read() # Get the request body from this file-like object
        return "" # Define the response body as empty
reactor.listenTCP(80, Site(PrintPostBody()))
reactor.run()

Related

How to make a proxy server for python requests

I have seen code like this that shows how to use a proxy for python requests.
import requests
proxies = {
    'http': 'http://localhost:7777',
    'https': 'http://localhost:7777',
}
requests.get('http://example.org', proxies=proxies)
requests.get('https://example.org', proxies=proxies)
But I am wondering how we can make a very simple proxy server in Python that would be able to respond to the GET request.
You can find many examples of how to do it, even in questions on Stack Overflow.
Some of them use the standard module socket (but that doesn't look simple).
Others use the standard module http, but they show code for Python 2, which used different module names.
Version for Python 3:
import http.server
import socketserver
import urllib.request
class MyProxy(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        print(self.path)
        url = self.path
        self.send_response(200)
        self.end_headers()
        self.copyfile(urllib.request.urlopen(url), self.wfile)
# --- main ---
PORT = 7777
httpd = None
try:
    socketserver.TCPServer.allow_reuse_address = True # solution for `OSError: [Errno 98] Address already in use`
    httpd = socketserver.TCPServer(('', PORT), MyProxy)
    print(f"Proxy at: http://localhost:{PORT}")
    httpd.serve_forever()
except KeyboardInterrupt:
    print("Pressed Ctrl+C")
finally:
    if httpd:
        httpd.shutdown()
        #httpd.socket.close()
Test using page httpbin.org
import requests
proxies = {
    'http': 'http://localhost:7777',
    'https': 'http://localhost:7777',
}
response = requests.get('http://httpbin.org/get', proxies=proxies)
print(response.text)
response = requests.get('http://httpbin.org/get?arg1=hello&arg2=world', proxies=proxies)
print(response.text)
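But it works only for HTTP.
For HTTPS, the client asks the proxy to open a tunnel with a CONNECT request, so the handler would need a do_CONNECT method that relays the raw bytes; the ssl module is only needed if the proxy has to decrypt the traffic itself.
And it works only with GET.
For POST, PUT, DELETE, etc. it would need do_POST, do_PUT, do_DELETE, etc. with different code.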
EDIT:
    def do_POST(self):
        url = self.path
        # - post data -
        content_length = int(self.headers.get('Content-Length', 0)) # <--- size of data
        if content_length:
            content = self.rfile.read(content_length) # <--- data itself
        else:
            content = None
        req = urllib.request.Request(url, method="POST", data=content)
        output = urllib.request.urlopen(req)
        # ---
        self.send_response(200)
        self.end_headers()
        self.copyfile(output, self.wfile)
But if you need a local proxy only to test your code, you could use:
mitmproxy (Man-In-The-Middle proxy), a Python module/program
Charles Proxy: not Python, not free (but works for 30 days for free), with a nice GUI
More complex tools: OWASP ZAP, Burp Suite (community edition)

Python HTTP Simple Server Persistent Connections

I'm trying to create a simple HTTP server that will receive POST messages and provide a simple response. I'm using the standard HTTPServer with Python. The client connects using a Session(), which should use a persistent connection, but after each POST I see the message below in the debug output, saying that the connection is dropping.
INFO:urllib3.connectionpool:Resetting dropped connection:
DEBUG:urllib3.connectionpool:"GET / HTTP/1.1" 200 None
The client works properly when I try it with Apache so I believe the issue is in my simple server configuration. How can I configure the simple http server to work with persistent connections?
Simple Server Python Code:
from http.server import HTTPServer, BaseHTTPRequestHandler
from io import BytesIO
import time
import datetime
import logging
class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def _set_response(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.send_header("Connection", "keep-alive")
        self.send_header("keep-alive", "timeout=5, max=30")
        self.end_headers()
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'Hello, world!')
    def do_POST(self):
        content_length = int(self.headers['Content-Length'])
        body = self.rfile.read(content_length)
        curr_time = datetime.datetime.now()
        data = ('{"msgid":"0x0002", "timestamp": "'+ str(curr_time) +'", "message":"Test http response from Raspberry Pi HTTP server"}').encode()
        self.send_response(200)
        self.end_headers()
        response = BytesIO()
        #response.write(b'This is POST request. ')
        #response.write(b'Received: ')
        response.write(data)
        self.wfile.write(response.getvalue())
print("Simple HTTP Server running...")
logging.basicConfig(level=logging.DEBUG)
httpd = HTTPServer(('', 8000), SimpleHTTPRequestHandler)
httpd.serve_forever()
Client Python code:
#!/usr/bin/env python
# Using same TCP connection for all HTTP requests
import os
import json
import time
import datetime
import logging
import requests
from requests.auth import HTTPBasicAuth
logging.basicConfig(level=logging.DEBUG)
start_time = time.time()
def get_data(limit):
    session = requests.Session()
    url = "http://localhost:8000"
    for i in range(10):
        curr_time = datetime.datetime.now()
        data = '{"msgid":"0x0001", "timestamp": "'+ str(curr_time) +'", "message":"Test http message from Raspberry Pi"}'
        print("Sending Data: " + data)
        response = session.post(url.format(limit), data)
        #response_dict = json.loads(response.text)
        print("Received Data: " + response.text)
if __name__ == "__main__":
    limit = 1
    get_data(limit)
    print("--- %s seconds ---" % (time.time() - start_time))
You aren't actually setting the Connection header in your POST handler (your _set_response helper sets it, but it is never called). In order for persistent connections to work, you'll also need to set the Content-Length header in the response so that the client knows how many bytes of the HTTP body to read before reusing the connection.
Try this POST handler, adapted from your code:
    def do_POST(self):
        content_length = int(self.headers['Content-Length'])
        body = self.rfile.read(content_length)
        # Process the request here and generate the entire response
        response_data = b'{"stuff": 1234}'
        # Send the response
        self.send_response(200)
        self.send_header("Connection", "keep-alive")
        self.send_header("Content-Length", str(len(response_data)))
        self.end_headers()
        # Write _exactly_ the number of bytes specified by the
        # 'Content-Length' header
        self.wfile.write(response_data)
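To check that a connection really is reused, here is a minimal self-contained sketch using only the standard library. It is a variation on the handler above, not the same code: it assumes you are free to set protocol_version to "HTTP/1.1" (which makes connections persistent by default, so no explicit Connection header is sent), and the names KeepAliveHandler and post_twice_on_one_connection are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Assumption: opting in to HTTP/1.1, where connections persist by default.
    protocol_version = "HTTP/1.1"

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        payload = b"received %d bytes" % len(body)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Without Content-Length the client cannot tell where the body ends.
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


def post_twice_on_one_connection():
    """Send two POSTs over a single HTTPConnection and report each result."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    results = []
    try:
        for message in (b"first", b"second message"):
            conn.request("POST", "/", body=message)
            response = conn.getresponse()
            # will_close is False when the connection can be reused.
            results.append((response.status, response.read(), response.will_close))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for status, body, will_close in post_twice_on_one_connection():
        print(status, body, "closing" if will_close else "kept alive")
```

Both requests travel over the same HTTPConnection object; if the server dropped the connection after the first response, will_close would be True for it.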

SimpleHTTPServer Custom Headers

By default SimpleHTTPServer sends its own headers.
I've been trying to figure out how to send my own headers and found this solution. I tried adapting it to my (very) simple proxy:
import SocketServer
import SimpleHTTPServer
import urllib
class Proxy(SimpleHTTPServer.SimpleHTTPRequestHandler):
    headers = ['Date: Wed, 29 Oct 2014 15:54:43 GMT', 'Server: Apache', 'Accept-Ranges: bytes', 'X-Mod-Pagespeed: 1.6.29.7-3566', 'Vary: Accept-Encoding', 'Cache-Control: max-age=0, no-cache', 'Content-Length: 204', 'Connection: close', 'Content-Type: text/html']
    def end_headers(self):
        print "Setting custom headers"
        self.custom_headers()
        SimpleHTTPServer.SimpleHTTPRequestHandler.end_headers(self)
    def custom_headers(self):
        for i in self.headers:
            key, value = i.split(":", 1)
            self.send_header(key, value)
    def do_GET(self):
        self.copyfile(urllib.urlopen(self.path), self.wfile)
httpd = SocketServer.ForkingTCPServer(('', PORT), Proxy)
httpd.serve_forever()
But end_headers() doesn't set the custom headers (confirmed in Wireshark).
Given a list of headers like the one in my little snippet, how can I override SimpleHTTPServer's default headers and serve my own?
I think you are missing something in do_GET().
SimpleHTTPServer also calls
self.send_response(200)
See the following code, or better, the source of the SimpleHTTPServer module:
self.send_response(200)
self.send_header("Content-type", ctype)
fs = os.fstat(f.fileno())
self.send_header("Content-Length", str(fs[6]))
self.send_header("Last-Modified", self.date_time_string(fs.st_mtime))
self.end_headers()
I think you should override the send_head() method for what you want to do and read the source of SimpleHTTPServer.
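As a rough illustration of the point above, here is a Python 3 sketch in which do_GET sends the status line and calls end_headers() itself, so the overridden end_headers() actually runs. The header list and the names EXTRA_HEADERS, CustomHeaderHandler and fetch_headers are hypothetical. The list is kept in a module-level constant rather than a class attribute called headers, because the handler already stores the parsed request headers in self.headers.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical header list for illustration; names and values are assumptions.
EXTRA_HEADERS = ["X-Example: demo", "Cache-Control: max-age=0, no-cache"]


class CustomHeaderHandler(BaseHTTPRequestHandler):
    def end_headers(self):
        # Add the extra headers just before the blank line that ends the headers.
        for line in EXTRA_HEADERS:
            key, value = line.split(":", 1)
            self.send_header(key.strip(), value.strip())
        super().end_headers()

    def do_GET(self):
        body = b"hello"
        # The status line must be sent first; end_headers() is only reached
        # because do_GET calls it explicitly.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


def fetch_headers():
    """Start a throwaway server, issue one GET, return (status, headers dict)."""
    server = HTTPServer(("127.0.0.1", 0), CustomHeaderHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        response.read()
        return response.status, dict(response.getheaders())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_headers())
```

The Server and Date headers added by send_response() are still sent; replacing those would mean overriding send_response() or the version_string() and date_time_string() methods as well.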

Python twisted proxy to send 2 requests

How can I modify this code so that it sends two separate requests? The requests would be in this order:
Request1 :
HEAD http://google.com
Host: google.com
... wait for reply from google server ...
Request2 :
GET http://yahoo.com HTTP/1.1
User-Agent: mozilla
Accept: */*
... second request sent from browser while first request is static for all requests ...
The code I’m trying to modify is:
from twisted.web import proxy, http
class SnifferProxy(proxy.Proxy):
    def allContentReceived(self):
        print "Received data..."
        print "method = %s" % self._command
        print "action = %s" % self._path
        print "ended content manipulation\n\n"
        return proxy.Proxy.allContentReceived(self)
class ProxyFactory(http.HTTPFactory):
    protocol = SnifferProxy
if __name__ == "__main__":
    from twisted.internet import reactor
    reactor.listenTCP(8080, ProxyFactory())
    reactor.run()
The Twisted proxy would be connecting to another external proxy.
Any help is appreciated.
I think you can get what you want by adding the call to the Proxy.allContentReceived method as a callback to a HEAD request using Agent.
from twisted.internet import reactor
from twisted.web import proxy, http
from twisted.web.client import Agent
from twisted.web.http_headers import Headers
agent = Agent(reactor)
class SnifferProxy(proxy.Proxy):
    def allContentReceived(self):
        def cbHead(result):
            print "got response for HEAD"
        def doProxiedRequest(result):
            proxy.Proxy.allContentReceived(self)
        # I assumed self._path, but it looks like the OP wants to do the
        # HEAD request to the same path always
        PATH = "http://foo.bar"
        d = agent.request(
            'HEAD', PATH, Headers({'User-Agent': ['twisted']}), None)
        d.addCallback(cbHead)
        d.addCallback(doProxiedRequest)

Socket receiving no data. Why?

I was learning socket programming and tried to write a basic HTTP client of my own. Everything seems to run fine, but I am not receiving any data. Can you please tell me what I am missing?
CODE
import socket
def create_socket():
    return socket.socket( socket.AF_INET, socket.SOCK_STREAM )
def remove_socket(sock):
    sock.close()
    del sock
sock = create_socket()
print "Connecting"
sock.connect( ('en.wikipedia.org', 80) )
print "Sending Request"
print sock.sendall ('''GET /wiki/List_of_HTTP_header_fields HTTP/1.1
Host: en.wikipedia.org
Connection: close
User-Agent: Web-sniffer/1.0.37 (+http://web-sniffer.net/)
Accept-Encoding: gzip
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7
Cache-Control: no-cache
Accept-Language: de,en;q=0.7,en-us;q=0.3
Referer: d_r_G_o_s
''')
print "Receving Reponse"
while True:
    content = sock.recv(1024)
    if content:
        print content
    else:
        break
print "Completed"
OUTPUT
Connecting
Sending Request
298
Receving Reponse
Completed
While I was expecting it show me html content of homepage of wikipedia :'(
Also, it would be great if somebody can share some web resources / books where I can read in detail about python socket programming for HTTP Request Client
Thanks!
For a minimal HTTP client, you definitely shouldn't send Accept-Encoding: gzip -- the server will most likely reply with a gzipped response you won't be able to make much sense of by eye. :)
You aren't sending the final double \r\n, nor are you actually terminating your lines with \r\n as per the spec (unless you happen to develop on Windows with Windows line endings, but that's just luck and not programming per se).
Also, del sock there does not do what you think it does: it only removes the local name sock inside remove_socket; the socket was already closed by sock.close() on the line before.
Anyway -- this works:
import socket
sock = socket.socket()
sock.connect(('en.wikipedia.org', 80))
for line in (
    "GET /wiki/List_of_HTTP_header_fields HTTP/1.1",
    "Host: en.wikipedia.org",
    "Connection: close",
):
    sock.send(line + "\r\n")
sock.send("\r\n")
while True:
    content = sock.recv(1024)
    if content:
        print content
    else:
        break
EDIT: As for resources/books/reference -- for a reference HTTP client implementation, look at Python's very own httplib.py (http/client.py in Python 3). :)
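For Python 3, where sockets take bytes rather than str, the same CRLF rule can be sketched as follows. The helper names build_request and fetch are made up for illustration, and the 10-second timeout is an arbitrary assumption.

```python
import socket


def build_request(host, path="/"):
    """Return a minimal HTTP/1.1 GET request as bytes, CRLF-terminated."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # Each line ends with CRLF; one extra CRLF marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host, path="/", port=80):
    """Send the request and return the raw response bytes (headers and body)."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty bytes means the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # Shows the exact bytes that would go on the wire, without touching the network.
    print(build_request("en.wikipedia.org", "/wiki/List_of_HTTP_header_fields"))
```

Calling fetch("en.wikipedia.org", "/wiki/List_of_HTTP_header_fields") would return the raw response; many sites now answer plain HTTP on port 80 with a redirect to HTTPS, so the body may be short.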
