For the HTTP 1.1 protocol, the connections are persistent (keep-alive).
The client should send a Connection: close header to close the connection.
In a Python program, this is the case for a GET request. However, a connection for a HEAD request is closed without the Connection:close header.
What is the issue?
I have also tested a Java version of a HEAD request, and the connection is persistent there.
Python program for a HEAD request:
#!/usr/bin/env python
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(("webcode.me", 80))
    s.sendall(b"HEAD / HTTP/1.1\r\nHost: webcode.me\r\nAccept: text/html\r\n\r\n")
    print(str(s.recv(1024), 'utf-8'))
Python program for a GET request:
#!/usr/bin/env python
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(("webcode.me", 80))
    s.sendall(b"GET / HTTP/1.1\r\nHost: webcode.me\r\nAccept: text/html\r\nConnection: close\r\n\r\n")
    # s.sendall(b"GET / HTTP/1.0\r\nHost: webcode.me\r\nAccept: text/html\r\n\r\n")
    while True:
        data = s.recv(512)
        if not data:
            break
        print(data.decode())
For the HTTP 1.1 protocol, the connections are persistent (keep-alive)
No, connections can be persistent only if the server also wants them to be. Even if the client signals support for persistence, the server might decide to close the connection immediately, 5 seconds later, or never on its own.
However, a connection for a HEAD request is closed without the Connection:close header.
It is your client that is closing the connection, not the server. Your client does a single recv and is then done with the socket and the program. If you modified the code to keep calling recv until no more data can be read (as in your second program), the client would hang, since the server is waiting for the next request from the client.
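To make that point concrete, here is a minimal sketch of a client that reads a HEAD response only up to the blank line ending the headers, instead of waiting for the server to close. The helper names read_headers, wants_close and head are made up for this illustration, and the 64 KiB limit is an arbitrary assumption.

```python
import socket


def read_headers(sock, limit=65536):
    """Read from sock until the blank line that ends the HTTP header block."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(1024)
        if not chunk:  # peer closed before the headers were complete
            break
        buf += chunk
        if len(buf) > limit:  # arbitrary safety limit (assumption)
            raise ValueError("header block too large")
    head_block, _, _rest = buf.partition(b"\r\n\r\n")
    return head_block.decode("iso-8859-1")


def wants_close(headers):
    """Return True if the response headers contain 'Connection: close'."""
    for line in headers.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "connection":
            return "close" in value.lower()
    return False


def head(host, port=80):
    """Send one HEAD request and return (headers, server_wants_close)."""
    with socket.create_connection((host, port), timeout=10) as s:
        request = "HEAD / HTTP/1.1\r\nHost: {}\r\n\r\n".format(host)
        s.sendall(request.encode("ascii"))
        headers = read_headers(s)
        return headers, wants_close(headers)
```

Calling head("webcode.me") should return as soon as the headers have arrived. Whether the server then keeps the connection open is its own decision; the client only learns about it from a Connection: close header or from a later empty recv.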
Related
I have an extremely simple tcp server in python the code for which is below:
#!/usr/bin/env python
import socket
sock = socket.socket()
sock.bind(('',3912))
sock.listen(100)
num_cons = 10
cons = []
for i in range(num_cons):
    con, addr = sock.accept()
    cons.append(con)
while True:
    for con in cons:
        msg = "a"* 1000
        num_sent = con.send(msg.encode())
        print("sent: {} bytes of msg:{}".format(str(num_sent), msg))
The corresponding client code is
#!/usr/bin/env python
import socket
sock = socket.socket()
sock.connect(('',3912)) # in reality here I use the IP of the host where
# I run the server since I launch the clients on a different host
while True:
    data = sock.recv(1000)
    print("received data: {} ".format(str(data)))
Now, if I start the server with
./server.py
and 10 clients in parallel from a different host:
for i in `seq 1 10`; do ./client.py 2>/dev/null 1>/dev/null & done
If I then send kill -SIGSTOP %1 to the first client, I expect the server to keep sending data successfully, because it cannot know that the client has been stopped. Instead, the server blocks when it tries to send data to client 1. I could understand this behaviour if the clients were on the same host as the server: we try to write data, but the kernel buffers are full, so the server blocks, and since the client never reads, the buffer is never freed. However, if the clients are on a different machine, the kernel buffers on the server host should only be full temporarily; the kernel should then send the data over the network card and free them. So why is my server blocking on the send call? I have not verified whether the same behaviour occurs in a different language (C, for example).
It is weird, because 1000 characters is a small size for TCP. I have no Linux machine available, but on a FreeBSD box I could send 130000 bytes on a TCP connection whose peer was stopped before the sender blocked, and more than 1000000 on Windows.
But as TCP is a connected protocol, a send call will block if it cannot queue its data because the internal TCP stack queue is full.
The gist of your problem seems to be that you're creating a SOCK_STREAM socket (i.e. TCP), and then abruptly terminating the client. As discussed in the Python Socket Programming HOWTO, a hang is expected in this situation.
TCP is a reliable protocol, meaning that every transmitted segment has to be acknowledged, and data stays in the sender's buffer until it is. If the receiving side stops reading, its receive window fills up, the sender's buffer stops draining, and send eventually blocks. Try setting a timeout and see if your send raises a socket.timeout after the expected time.
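The timeout suggestion could be tried with something like the sketch below. The function name fill_until_blocked and the default values are assumptions for illustration; it keeps sending 1000-byte chunks to a peer that is not reading and reports how many bytes the kernel accepted before send timed out.

```python
import socket


def fill_until_blocked(sock, timeout=0.5, chunk=b"a" * 1000):
    """Send chunks until send() times out; return the number of bytes queued."""
    sock.settimeout(timeout)
    total = 0
    try:
        while True:
            # send() returns how many bytes the kernel accepted
            total += sock.send(chunk)
    except socket.timeout:
        # The kernel buffers are full and the peer is not draining them.
        pass
    return total
```

The number returned depends on the operating system and its buffer settings, which is consistent with the different figures reported for FreeBSD and Windows above. In the server from the question, one could call con.settimeout(...) on each accepted connection and skip clients whose send times out.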
I'm creating an HTTP proxy in Python, but my proxy only accepts the web server's response and completely ignores the browser's next request, so the transfer of data just stops. Here's the code:
import socket
s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
bhost = '192.168.1.115'
port = 8080
s.bind((bhost, port))
s.listen(5)
def server(sock, data, host):
    p = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    p.connect((host, 80))
    p.send(data)
    rdata = p.recv(1024)
    print(rdata)
    sock.send(rdata)
while True:
    sock, addr = s.accept()
    data = sock.recv(1024)
    host = data.splitlines()[1][6:]
    server(sock, data, host)
Sorry about the code this is just a trial version and help will be much appreciated as I am only 14 and have much to learn :-)
Unfortunately, I don't really see how your code is supposed to work, so I'm putting down my thoughts on what a simple HTTP proxy should look like.
So what should a basic proxy server do?
Accept connection from a client and receive an HTTP request.
Parse the request and extract its destination.
Forward requests and responses.
(optionally) Support Connection: keep-alive.
Let's go step by step and write some very simplified code.
How does the proxy accept a client? A socket should be created and moved to passive mode:
import socket, select
sock = socket.socket()
sock.bind((your_ip, port))
sock.listen()
while True:
    client_sock, client_addr = sock.accept()  # accept() returns (socket, address)
    do_stuff(client_sock)
Once the TCP connection is established, it's time to receive a request. Let's assume we're going to get something like this:
GET /?a=1&b=2 HTTP/1.1
Host: localhost
User-Agent: my browser details
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
In TCP, message boundaries aren't preserved, so we should wait until we get at least the first two lines (for a GET request) in order to know what to do next:
def do_stuff(sock):
    data = receive_two_lines(sock)
    remote_host = parse_request(data)
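The two helpers used here are left undefined in the answer. One possible sketch is shown below; it assumes the Host header arrives within the data read so far (as in the sample request) and that the host is a plain name or IPv4 address, so an optional :port suffix can simply be cut off.

```python
import socket


def receive_two_lines(sock: socket.socket) -> bytes:
    """Read until the request line and the following header line are complete."""
    data = b""
    while data.count(b"\r\n") < 2:
        chunk = sock.recv(1024)
        if not chunk:
            raise ConnectionError("client closed before sending a full request")
        data += chunk
    return data


def parse_request(data: bytes) -> str:
    """Return the Host header value as a str, without any ':port' suffix."""
    for line in data.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii")
            return host.split(":")[0]  # assumption: no IPv6 literal
    raise ValueError("no Host header found")
```

Searching for the header by name, rather than slicing the second line at a fixed offset, keeps the proxy working when a client sends its headers in a different order.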
After we have got the remote hostname, it's time to forward the requests and responses:
def do_stuff(client_sock):
    data = receive_two_lines(client_sock)
    remote_host = parse_request(data)
    # getaddrinfo returns a list of (family, type, proto, canonname, sockaddr)
    remote_ip = socket.getaddrinfo(remote_host, 80, socket.AF_INET, socket.SOCK_STREAM)[0][4][0]
    webserver = socket.socket()
    webserver.connect((remote_ip, 80))
    webserver.sendall(data)
    while it_makes_sense():
        # wait on both sockets at once so neither side starves the other
        ready = select.select([client_sock, webserver], [], [])[0]
        if client_sock in ready:
            webserver.sendall(client_sock.recv(1024))
        if webserver in ready:
            client_sock.sendall(webserver.recv(1024))
Please note select: this is how we know whether a remote peer has sent us data. I haven't run or tested this code, and there are things left to do:
Chances are you will get several GET requests in a single client_sock.recv(1024) call because, again, message boundaries aren't preserved in TCP. You should probably look for additional GET requests each time you receive data.
Requests may differ for POST, HEAD, PUT, DELETE and other methods. Parse them accordingly.
Browsers and servers usually reuse one TCP connection by setting the Connection: keep-alive option in the headers, but they may also decide to drop it. Be ready to detect disconnects and sockets closed by a remote peer (for simplicity's sake, this is called while it_makes_sense() in the code).
bind, listen, accept, recv, send, sendall, getaddrinfo, select - all these functions can throw exceptions. It's better to catch them and act accordingly.
The code currently serves one client at a time.
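As a rough illustration of the disconnect and exception handling mentioned in these notes, the forwarding loop could be written along the following lines. The function name relay, the 30-second idle timeout and the 4096-byte read size are assumptions, and the sketch still serves a single client connection.

```python
import select
import socket


def relay(client_sock, webserver, idle_timeout=30.0):
    """Copy bytes both ways until either peer closes or the link goes idle."""
    peers = {client_sock: webserver, webserver: client_sock}
    while True:
        ready, _, _ = select.select(list(peers), [], [], idle_timeout)
        if not ready:  # nothing happened for idle_timeout seconds
            return
        for src in ready:
            try:
                chunk = src.recv(4096)
                if not chunk:  # empty read: the peer closed its side
                    return
                peers[src].sendall(chunk)
            except OSError:  # reset, broken pipe or another socket error
                return
```

Here an empty recv or a socket error plays the role of it_makes_sense() returning False. To serve several clients at once, each accepted connection could be handed to its own thread running this loop.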
I am using Python to write a simple web server and sending requests to it, with libevent as my HTTP client. But every time I send a keep-alive request, the HTTP connection gets the close callback before the success callback. I think it might be a keep-alive problem. This is my Python (server) code:
import socket
HOST, PORT = '', 8999
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 4)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 15)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print 'Serving HTTP on port %s ...' % PORT
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    print request
    http_response = """\
HTTP/1.1 200 OK

Hello, World!
"""
    client_connection.sendall(http_response)
    client_connection.close()
But every time I send a keep-alive request, ...
I think you are mixing up the application layer HTTP keep-alive and the transport layer TCP keep-alive.
HTTP keep-alive is used by the client to suggest to the server that the underlying TCP connection should be kept open for further requests from the client. But the server might decline, and your server explicitly closes the connection after it has handled the client's request, i.e. finished sending the response. Apart from that, the way the server sends the response makes HTTP keep-alive impossible: the length of the response is unknown, so the response ends only when the underlying TCP connection ends. To fix this you would need to specify a Content-Length header or use chunked transfer encoding.
TCP keep-alive, in contrast, is used to detect a loss of connectivity, e.g. one side crashed, a router died, or similar. It is not related to HTTP keep-alive at all, apart from the similar name. It is set with setsockopt, which is what you are doing. But with TCP keep-alive there is no such thing as a keep-alive request that you can explicitly send.
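For the Content-Length fix, a response could be assembled roughly as follows. This is a sketch in Python 3 rather than the Python 2 of the question, and build_response is a hypothetical helper name.

```python
def build_response(body: bytes, keep_alive: bool = True) -> bytes:
    """Build a minimal HTTP/1.1 response whose length the client can determine."""
    headers = [
        b"HTTP/1.1 200 OK",
        b"Content-Type: text/plain",
        # Content-Length tells the client where the body ends
        b"Content-Length: " + str(len(body)).encode("ascii"),
        b"Connection: " + (b"keep-alive" if keep_alive else b"close"),
    ]
    return b"\r\n".join(headers) + b"\r\n\r\n" + body
```

The server would then send build_response(b"Hello, World!\n") and, instead of calling close() right away, go back to recv on the same connection to wait for the client's next request.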
I'm trying to create a TCP socket server in Python that, after receiving a string of bytes from a client, passes the received data (without knowing what is actually inside, assuming it's a valid HTTP request) to an HTTP or HTTPS proxy and waits for the result. My code looks like this:
import socket
def test(host, port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((socket.gethostbyname(host), int(port)))
    msg = """GET / HTTP/1.1
Host: www.bing.com
User-Agent: Firefox
"""
    sent_count = sock.send(msg)
    recv_value = sock.recv(2048)
    print('received:',)
    print str(recv_value)
    pass
if __name__ == '__main__':
    test(host='x.x.x.xx', port='80') # a https proxy server
    test(host='yy.yy.yy.yy', port='80') # a http proxy server
But when I connect to the HTTP proxy server it returns something like:
HTTP/1.1 404 Not Found
And when I connect to the HTTPS proxy server it shows something like:
HTTP/1.0 400 Bad Request
So I wanted to ask whether anybody knows how I could send HTTP requests to HTTP/HTTPS servers via sockets in Python, or how I can forward arbitrary strings of data to HTTP/HTTPS proxy servers in general using sockets in Python. Any suggestions are very much appreciated, thanks in advance.
I want to connect Blender (v2.55) to a webpage through sockets.
For the web part, I can use Node.js & socket.io. I've already used a little node.js/socket.io, it's not a problem I think.
Now for Blender, it runs on Python 3.1, so I already have sockets and I can add libraries if needed. I'm new to Python sockets; can I connect a client to node.js/socket.io directly?
I tried with the basic code from the Python doc:
import socket
import sys
HOST, PORT = "127.0.0.1", 8080
data = "Hello from Blender"
# Create a socket (SOCK_STREAM means a TCP socket)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Connect to server and send data
sock.connect((HOST, PORT))
sock.send(bytes(data + "\n","utf8"))
# Receive data from the server and shut down
received = sock.recv(1024)
sock.close()
print("Sent: %s" % data)
print("Received: %s" % received)
This results in:
Sent: Hello from Blender
Received: b''
It seems that Blender is connected, but doesn't receive data. Also Node shows no new client connected…
Do I need something else ? If somebody can help me out…
You are missing a protocol/handshake. What you have there is a bare TCP socket connection. node.js/socket.io lives on top of a TCP socket. Basically when you open a connection to a socket.io server, it's expecting you to use some protocol for communication (websockets, longpolling, htmlfile, whatever). The initial handshake defines what that protocol will be. Websockets is one of the supported protocols. This blog post should help you. It doesn't look all that hard to get websockets implemented.
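To give an idea of what such a handshake involves, the sketch below computes the Sec-WebSocket-Accept value that a plain WebSocket server returns for a client's Sec-WebSocket-Key, as specified in RFC 6455. This covers only the bare WebSocket step; socket.io adds its own negotiation on top, which is not shown here, and the function name websocket_accept is made up for this example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

A bare TCP client that just sends "Hello from Blender" never goes through this exchange, which is why the server does not treat it as a connected client.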
You can try a loop like this to receive valid data.
import socket
host="127.0.0.1"
port=8088
web=socket.socket()
web.bind((host,port))
web.listen(5)
print("recycle")
while True:
    conn,addr=web.accept()
    data=conn.recv(8)
    print(data)
    conn.sendall(b'HTTP/1.1 200 OK\r\n\r\nHello world')
    conn.close()
Then use your browser to visit the host and port to check it.
I understand this thread is extremely old. But I faced the same problem recently and couldn't find an answer or any similar questions. So here is my answer.
Answer: use the Socket.IO client for Python, python-socketio.
The reason why built-in sockets or any other WebSocket library in Python won't work is explained on the socket.io website.
Socket.IO is simply not a plain WebSocket connection. Although it uses WebSockets for transport internally, the connection is established with the HTTP protocol (http://) as opposed to the WebSocket protocol (ws://). As a result, the handshake fails and the connection is never established.