I need to write a simple HTTP caching proxy with Python's socket class (not using the SimpleHTTPServer class). The basic scheme is:
-open a socket
-receive the HTTP message
-check whether we have the resource cached
-either serve the cached resource or re-request it from the remote server
-close the socket
When receiving a message I use the synchronous recv method:

message = ""
BUF_SIZE = 4096
while 1:
    chunk = connection.recv(BUF_SIZE)
    if not chunk:  # recv returns an empty string once the peer closes
        break
    message += chunk
    print "rec %i" % len(message)
And when I connect to my proxy with a browser, the browser holds the socket open until it either gets a response or I manually cancel the wait, which is exactly how HTTP is intended to work. So my control flow is stuck at recv.
-I cannot just read, say, 1000 bytes, because I don't know how large the actual message is; there doesn't seem to be any end-of-message marker in HTTP.
-the program cannot continue while the socket is open, since the read is synchronous;
-I could use either threads or some kind of async API to avoid blocking, but that doesn't help with the real problem: knowing when to stop reading the request and start handling it.
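For reference, HTTP does define where a message ends: the header block ends at the first blank line (CRLF CRLF), and a Content-Length header, when present, gives the size of the body that follows. A minimal sketch of reading one request this way (it ignores chunked encoding, and the helper name is made up for illustration):

```python
def recv_http_request(conn, bufsize=4096):
    """Read one HTTP request: headers end at the first blank line,
    then Content-Length (if any) says how much body follows."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(bufsize)
        if not chunk:
            return data  # peer closed early
        data += chunk
    headers, _, body = data.partition(b"\r\n\r\n")
    # naive header scan; a real proxy must also handle chunked encoding
    length = 0
    for line in headers.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value)
    while len(body) < length:
        chunk = conn.recv(bufsize)
        if not chunk:
            break
        body += chunk
    return headers + b"\r\n\r\n" + body
```

With this, control returns as soon as one complete request has arrived instead of waiting for the peer to close the connection.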
Related
I use the Python code from https://gist.github.com/bradmontgomery/2219997 to set up an HTTP server.
For my use I just add a few lines to the do_POST method:
def _set_headers(self):
    self.send_response(200)
    self.send_header('Content-type', 'text/html')
    self.end_headers()

def do_POST(self):
    self._set_headers()
    self.wfile.write("POST!")
    content_length = int(self.headers['Content-Length'])
    print(content_length)
    post_data = self.rfile.read(content_length)
    print(post_data)
I want to send a file through curl:
curl -F "file=#file.txt" "http://myServer.noip.me:22222/" --trace-ascii debugdump.txt
Client side: curl's response is:
curl: (52) Empty reply from server
Server side: the server prints the content_length value and then hangs completely at the line self.rfile.read(content_length). It never reaches print(post_data).
The firewall has been disabled on both sides.
Last lines from debugdump.txt (Client side):
== Info: Empty reply from server
== Info: Connection #0 to host myServer.noip.me left intact
What did I miss?
Empty reply from server means that the server closed the connection without responding at all, which is an HTTP protocol violation.
This should never happen with a proper server, so it signals that something is seriously wrong on the server side.
Sometimes this happens because the server is naively implemented and crashes or misbehaves on an unexpected request from a client. The best chance of making it work, then, is to alter the request in ways that make it more similar to what the badly written server software expects.
This is a matter of guessing; something is wrong in that server of yours.
With Python's http.server package it's your responsibility to explicitly send an HTTP status code as part of your do_POST() override.
In your example code you reply to the client first (via send_header(), end_headers() and wfile) before reading the whole POST request from the client!
See also Python's wfile documentation:
Contains the output stream for writing a response back to the client. Proper adherence to the HTTP protocol must be used when writing to this stream in order to achieve successful interoperation with HTTP clients.
So this looks racy, and you probably just need to make sure that you read the complete POST request before you start replying.
In other words, curl is just complaining that it didn't receive any HTTP status code from your server before the connection was closed. Apparently the connection isn't properly shut down on both sides, so your server blocks on the read side.
That curl error
curl: (52) Empty reply from server
would also show up if you simply forget to send a status code by omitting
self.send_response(200)
self.end_headers()
at the end of your method, or if your method exits prematurely (e.g. because it raises an exception).
However, the latter issues should show up in the stderr output of your server.
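A sketch of the handler with the order fixed, reading the complete body before replying (the class name here is illustrative, not the gist's):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # read the complete request body first ...
        content_length = int(self.headers['Content-Length'])
        post_data = self.rfile.read(content_length)
        print(post_data)
        # ... and only then send the response
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        self.wfile.write(b"POST!")
```

Note that in Python 3 wfile expects bytes, hence b"POST!".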
I have a server which takes a few minutes to process a specific request and then responds to it.
The client has to keep waiting for the response without knowing when it will complete.
Is there a way to let the client know the processing status (say, 50% completed, 80% completed) without the client having to poll for it?
Without using any of the newer techniques (WebSockets, Web Push/HTTP2, ...), I've previously used a simplified Pushlet or long-polling solution for HTTP/1.1, with various JavaScript or custom client implementations. If my solution doesn't fit your use case, you can always google those two names for other possible approaches.
Client
sends a request, reads 17 bytes (the initial HTTP response) and then reads 2 bytes at a time to get the processing status.
Server
sends a valid HTTP response, then during request progress sends 2 bytes with the percentage completed, until the last 2 bytes are "ok", and closes the connection.
UPDATED: Example uwsgi server.py
from time import sleep

def application(env, start_response):
    start_response('200 OK', [])
    def working():
        yield b'00'
        sleep(1)
        yield b'36'
        sleep(1)
        yield b'ok'
    return working()
UPDATED: Example requests client.py
import requests

response = requests.get('http://localhost:8080/', stream=True)
for r in response.iter_content(chunk_size=2):
    print(r)
Example server (only use for testing :)
import socket
from time import sleep

HOST, PORT = '', 8888

listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)

while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    client_connection.send(b'HTTP/1.1 200 OK\n\n')
    client_connection.send(b'00')  # 0%
    sleep(2)  # Your work is done here
    client_connection.send(b'36')  # 36%
    sleep(2)  # Your work is done here
    client_connection.sendall(b'ok')  # done
    client_connection.close()
If the last 2 bytes aren't "ok", handle the error some other way. This isn't beautiful HTTP status-code compliance, but more of a workaround that did work for me many years ago.
telnet client example
$ telnet localhost 8888
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
0036okConnection closed by foreign host.
This answer probably won’t help in your particular case, but it might help in other cases.
The HTTP protocol supports informational (1xx) responses:
indicates an interim response for communicating connection status or request progress prior to completing the requested action and sending a final response
There is even a status code precisely for your use case, 102 (Processing):
interim response used to inform the client that the server has accepted the complete request, but has not yet completed it
Status code 102 was removed from further editions of that standard due to lack of implementations, but it is still registered and could be used.
So, it might look like this (HTTP/2 has an equivalent binary form):
HTTP/1.1 102 Processing
Progress: 50%
HTTP/1.1 102 Processing
Progress: 80%
HTTP/1.1 200 OK
Date: Sat, 05 Aug 2017 11:53:14 GMT
Content-Type: text/plain
All done!
Unfortunately, this is not widely supported. In particular, WSGI does not provide a way to send arbitrary 1xx responses. Clients support 1xx responses in the sense that they are required to parse and tolerate them, but they usually don’t give programmatic access to them: in this example, the Progress header would not be available to the client application.
However, 1xx responses may still be useful (if the server can send them) because they have the effect of resetting the client’s socket read timeout, which is one of the main problems with slow responses.
Use Chunked Transfer Encoding, which is a standard technique to transmit streams of unknown length.
See: Wikipedia - Chunked Transfer Encoding
Here is a Python server implementation, available as a gist on GitHub:
https://gist.github.com/josiahcarlson/3250376
It sends content using chunked transfer encoding, using only standard library modules.
On the client side, if the server announced chunked transfer encoding, you only need to:
import requests
response = requests.get('http://server.fqdn:port/', stream=True)
for r in response.iter_content(chunk_size=None):
print(r)
chunk_size=None, because the chunks are dynamic and are delimited by the framing conventions of chunked transfer encoding.
See: http://docs.python-requests.org/en/master/user/advanced/#chunk-encoded-requests
When you see, for example, 100 in the content of response r, you know that the next chunk will be the actual content, sent once processing reaches 100.
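For the server side, here is a minimal standard-library sketch (not the gist's code, and the class name is illustrative) that streams progress updates as chunks. Since it writes the chunk framing by hand, protocol_version must be set to HTTP/1.1:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import time

class ProgressHandler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'  # chunked encoding requires HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Transfer-Encoding', 'chunked')
        self.end_headers()
        for msg in (b'25', b'50', b'100'):  # stand-in progress updates
            # each chunk: hex length, CRLF, payload, CRLF
            self.wfile.write(b'%x\r\n%s\r\n' % (len(msg), msg))
            self.wfile.flush()
            time.sleep(0.1)
        self.wfile.write(b'0\r\n\r\n')  # zero-length chunk terminates the body
```

The requests client above will then receive b'25', b'50', b'100' as they are produced, instead of waiting for the whole response.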
I have implemented a simple HTTP server and a client. The latter issues a PUT request using the requests library, sending some arbitrary JSON, and then exits.
When I start the server and then run the client, both the server and the client block. The server, however, appears not to have gone through the entire handler function yet.
This is what I get on server side:
$ python3 server.py
PUT / HTTP/1.1
That is, after printing the request line, the JSON content string is not printed. At this point both client and server block for some reason.
Interestingly, when I trigger a KeyboardInterrupt to the client, the server proceeds:
$ python3 server.py
PUT / HTTP/1.1
b'{"content": "Hello World"}'
127.0.0.1 - - [25/Feb/2016 11:52:54] "PUT / HTTP/1.1" 200 -
My questions:
Why is it necessary to kill the client to let the server proceed?
Am I using any of these components the wrong way?
How can I make client and server to operate (nearly) instantaneously?
This is the code of the HTTP server. It only handles PUT requests. It prints the request line and the content data, and responds with a success code to the client:
import http.server

class PrintPUTRequestHandler(http.server.BaseHTTPRequestHandler):
    def do_PUT(self):
        print(self.requestline)
        print(self.rfile.read())
        self.send_response(200)
        self.end_headers()

server_address = ('', 8000)
httpd = http.server.HTTPServer(server_address, PrintPUTRequestHandler)
httpd.serve_forever()
This is the HTTP client. It is intended to connect to the server, write the request and return as soon as possible (but it doesn't):
import requests
server_address = "http://127.1:8000"
data = '{"content": "Hello World"}'
requests.put(server_address, data, headers={"Content-type": "application/json"})
This is how I run it after the server has started (no output observable):
python client.py
The server blocks both itself and the client on this line:
print(self.rfile.read())
That happens because you didn't specify the amount of data to read, so the server reads the input stream until it is closed. And in your case the input stream is closed only once you kill the client.
Remember that the server doesn't know a priori when the data stream ends, because you may want to send data chunk by chunk (for example when sending big files).
The size of the request body is passed in the Content-Length header, so this is what you should do:
length = int(self.headers['Content-Length'])
print(self.rfile.read(length))
That's assuming that the length is small enough to fit in your memory (and in your case it is).
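Putting it together, the handler from the question with the bounded read applied (same names as in the question):

```python
import http.server

class PrintPUTRequestHandler(http.server.BaseHTTPRequestHandler):
    def do_PUT(self):
        print(self.requestline)
        length = int(self.headers['Content-Length'])
        print(self.rfile.read(length))  # bounded read: returns immediately
        self.send_response(200)
        self.end_headers()
```

With this change the client's requests.put() call returns as soon as the server responds, with no need to kill the client.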
I am writing a TLS server in Python. I accept a connection from a client, wrap the socket, and then try to read data, without success.
My server inherits from socketserver.TCPServer. My socket is non-blocking (I overrode the server_bind() method). The socket is wrapped, but the handshake has to be done separately, because of the exception which is raised otherwise:
def get_request(self):
    cli_sock, cli_addr = self.socket.accept()
    ssl_sock = ssl.wrap_socket(cli_sock,
                               server_side=True,
                               certfile='/path/to/server.crt',
                               keyfile='/path/to/server.key',
                               ssl_version=ssl.PROTOCOL_TLSv1,
                               do_handshake_on_connect=False)
    try:
        ssl_sock.do_handshake()
    except ssl.SSLError as e:
        if e.args[1].find("SSLV3_ALERT_CERTIFICATE_UNKNOWN") == -1:
            raise
    return ssl_sock, cli_addr
To handle received data, I created a class which inherits from socketserver.StreamRequestHandler (I also tried BaseRequestHandler, but with no luck; I ended up with the same problem: no data received).
When I print self.connection in the handle() method, I can see that it is of type SSLSocket, the fd is set (to some positive value), and both the local and remote IP and port have the expected values, so I assume that the client has successfully connected to my server and the connection is open. However, when I try to read data with
self.connection.read(1)
nothing comes back. There should be more bytes available; I tried reading 1, 10 and 1024 bytes, but it makes no difference, the read() method always returns nothing. I tried checking the length and printing the result, but there is nothing to print.
I was monitoring packets using Wireshark, and I can see that the data I am expecting to read arrives at my server (I checked that the IP and port are the same for self.connection and in Wireshark), which sends an ACK and then receives FIN+ACK from the client. So it looks like the data arrives and is handled properly at a low level, but somehow the read() method is not able to access it.
If I remove the wrap_socket() call, then I am able to read data, but it is the data which the client sends for authentication.
I am using Python 3.4 on a Mac.
How is it possible that I can see in Wireshark that the packets are arriving, yet I am not able to read the data in my code?
I'm learning Python at the moment and chose a WebSocket server as a learning project; this might not have been a wise decision after reading the WebSocket RFC...
The handshake and receiving single-framed packets are working, but sending data back to the client isn't.
I'm using Firefox and Chromium as clients for testing.
Both browsers cancel the connection when receiving data from the server; this is Chromium's error message:
WebSocket connection to 'ws://localhost:1337/' failed: Unrecognized frame opcode: 13
The createFrame function is supposed to frame the message text to be sent to the client.
def createFrame(text):
    length = len(text)
    if length <= 125:
        ret = bytearray([129, length])
        for byte in text.encode("utf-8"):
            ret.append(byte)
        print(ret)
        return ret
    #TODO 16 & 64Bit payload length
This is the createFrame debug output, which looks fine if I understood the RFC correctly: the FIN bit and the text (UTF-8) opcode are set, and the length is 5:
bytearray(b'\x81\x05Hello')
This is the primitive sending and receiving loop:
while 1:
    data = conn.recv(1024)  #TODO Multiple frames
    if len(data) > 0:
        print(readFrame(data))
        conn.send(createFrame("Hello"))
The whole code can be found in this Gist: https://gist.github.com/Cacodaimon/33ff6c3c4b312b074c3e
You have an error on line 99 of your code. The error that 13 is not an opcode comes from the fact that you generate an HTTP response that looks like this:
HTTP/1.1 101 Switching Protocols\r\n
(...)\r\n
Sec-WebSocket-Accept: (...)==\n\r\n\r\n
Note the extra erroneous \n, which is added by base64.encodestring. Apparently Chrome interprets \n\r\n as two correct newlines, and the next token is \r, which is 13: an invalid opcode. When you replace base64.encodestring with base64.b64encode, the \n is not added and your code works as expected.
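To illustrate the difference (base64.encodestring is an alias of encodebytes in Python 3 and was removed in 3.9; b64encode is the one to use here). The digest value below is a stand-in, not a real Sec-WebSocket-Accept key:

```python
import base64

digest = b'example-accept-key'      # stand-in for the SHA-1 digest bytes
print(base64.encodebytes(digest))   # output ends with b'\n' (the stray byte)
print(base64.b64encode(digest))     # no trailing newline
```

Only the newline-free b64encode output belongs in the Sec-WebSocket-Accept header value.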