I'm trying to get a 101 Switching Protocols response by upgrading the connection to HTTP/2 over a raw socket. I've tried big sites that support HTTP/2, such as google.com and twitter.com, but none of them give a proper response. Here's the approach:
import socket
import ssl
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
context = ssl.create_default_context()
sock.connect(('twitter.com', 443))
sock = context.wrap_socket(sock, server_hostname='twitter.com')
sock.sendall(bytes('GET h2://twitter.com/ HTTP/1.1\r\nHost: twitter.com\r\nConnection: Upgrade, HTTP2-Settings\r\nUpgrade: h2\r\nHTTP2-Settings: \r\nAlt-Svc: h2=":443"\r\n\r\n',encoding='utf-8'))
sock.settimeout(5)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
line = str(sock.recv(10000))
print(line)
Upgrading the HTTP/1.1 connection to HTTP/2 this way returns 400 Bad Request, not the 101 I'm expecting.
sock.connect(('www.google.com', 443))
sock = context.wrap_socket(sock, server_hostname='www.google.com')
sock.sendall(bytes('GET / HTTP/1.1\r\nHost: www.google.com\r\nConnection: Upgrade, HTTP2-Settings\r\nUpgrade: h2\r\nHTTP2-Settings: \r\n\r\n',encoding='utf-8'))
I've also tried to set the ALPN protocol manually, but google.com and twitter.com give different responses:
context.set_alpn_protocols(['h2'])
For google.com there's a frame response:
b'\x00\x00\x12\x04\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00d\x00\x04\x00\x10\x00\x00\x00\x06\x00\x01\x00\x00\x00\x00\x04\x08\x00\x00\x00\x00\x00\x00\x0f\x00\x01'
but twitter.com gives an empty response:
b''
The reason for this is that I have to perform checks against a wordlist containing a list of websites. How can I properly check whether a website supports HTTP/2 in Python?
Yes, you do need to set context.set_alpn_protocols(['h2']), and you can check whether you successfully negotiated HTTP/2 with sock.selected_alpn_protocol().
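That is also the direct answer for your wordlist check: skip the upgrade dance entirely and just look at what ALPN negotiates. A minimal sketch (offering both h2 and http/1.1, so hosts without HTTP/2 still complete the TLS handshake):

import socket
import ssl

def supports_h2(hostname, port=443, timeout=5):
    context = ssl.create_default_context()
    context.set_alpn_protocols(['h2', 'http/1.1'])
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            # 'h2' means the server agreed to speak HTTP/2 over this TLS session;
            # servers without ALPN support return None here.
            return tls.selected_alpn_protocol() == 'h2'

print(supports_h2('www.google.com'))  # True if ALPN negotiated h2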
Unfortunately, if you want to actually test an HTTP call, you need to encode the headers (method, host, etc.) yourself. RFC 7540 section 4 and section 6.2 detail the requirements. A Python library like h2 can help with the encoding, as well as properly encode and decode the other sadly necessary HTTP/2 chatter.
The whole connection-upgrade mechanism is not used by web browsers, so it is not unexpected that websites don't support it. Once you have an ALPN-negotiated h2 TLS stream, a properly encoded equivalent of
GET / HTTP/2
Host: twitter.com
should be all you need. That is why twitter just closed the socket. And the google bytes above are not garbage either: decoded, they are the server's initial SETTINGS frame (type 0x04) followed by a WINDOW_UPDATE. In other words, once ALPN negotiated h2, Google simply started speaking HTTP/2 and was waiting for the client connection preface, not an HTTP/1.1 upgrade request.
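For completeness, here is a rough sketch of such a request using the h2 package mentioned above (pip install h2); it takes care of the connection preface, SETTINGS exchange, and HPACK encoding so you don't have to:

import socket
import ssl
import h2.connection
import h2.events

HOST = 'www.google.com'

context = ssl.create_default_context()
context.set_alpn_protocols(['h2'])
sock = context.wrap_socket(socket.create_connection((HOST, 443)),
                           server_hostname=HOST)
assert sock.selected_alpn_protocol() == 'h2'

conn = h2.connection.H2Connection()
conn.initiate_connection()                  # queues client preface + SETTINGS
conn.send_headers(1, [(':method', 'GET'),   # stream 1: first client stream
                      (':path', '/'),
                      (':authority', HOST),
                      (':scheme', 'https')], end_stream=True)
sock.sendall(conn.data_to_send())

done = False
while not done:
    data = sock.recv(65535)
    if not data:
        break
    for event in conn.receive_data(data):
        if isinstance(event, h2.events.ResponseReceived):
            print(event.headers)            # includes (':status', '200')
        elif isinstance(event, h2.events.StreamEnded):
            done = True
    sock.sendall(conn.data_to_send())       # ACKs, WINDOW_UPDATEs, etc.
sock.close()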
How do I fetch an HTML page in Python using a socket? I was able to implement it with the requests library, but it needs to be rewritten with sockets and I don't understand how. The requests implementation is below, along with my (not at all correct) attempt via a socket. Please help me implement this using sockets.
import requests
reg_get = requests.get("https://stackoverflow.blog/")
text = reg_get.text
print(text)
import socket
request = b"GET / HTTP/1.1\nHost: https://stackoverflow.blog/\n\n"
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("https://stackoverflow.blog/", 80))
s.send(request)
result = s.recv(10000)
while (len(result) > 0):
    print(result)
    result = s.recv(10000)
After reading the comments, I rewrote the code as follows. However, I never got the HTML, only information about the site (the response headers). How do I get the HTML itself in Python?
import socket
import ssl
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
request = "GET /r/AccidentalRenaissance/comments/8ciibe/mr_fluffies_betrayal/ HTTP/1.1\r\nHost: www.reddit.com\r\n\r\n"
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
s = context.wrap_socket(sock, server_hostname = "www.reddit.com")
s.connect(("www.reddit.com", 443))
s.sendall(request.encode())
contest = s.recv(1024).decode()
s.close()
print(contest)
Result:
HTTP/1.1 200 OK
Connection: keep-alive
Cache-control: private, s-maxage=0, max-age=0, must-revalidate, no-store
Content-Type: text/html; charset=utf-8
X-Frame-Options: SAMEORIGIN
Accept-Ranges: bytes
Date: Sun, 03 Oct 2021 03:34:25 GMT
Via: 1.1 varnish
Vary: Accept-Encoding, Accept-Encoding
A URL is composed of a protocol, a hostname, an optional port, and an optional path. In the URL http://stackoverflow.blog/, http is the protocol, stackoverflow.blog is the hostname, and no port or path is provided. For http, the port defaults to 80 and the path defaults to /. When using sockets, first establish a connection to the host at that port using connect, then send an HTTP command to retrieve the page at the path ("GET / HTTP/1.1", a Host header, and a blank line) and receive the response from the server.
Note that I used http instead of https, because https adds security setup and negotiation that occurs after connect is done but before the "GET /" is sent. It is quite complicated, and a good reason to use Requests instead of trying to implement it yourself. If you don't want to use Requests but also don't want to go down to the level of sockets, take a look at urllib3.
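Putting that together, a minimal sketch over plain http (the blog will most likely answer with a redirect to its https version, but the mechanics are the same). Note the recv loop: a single recv returns only what has arrived so far, which is why your reddit attempt showed headers but no HTML:

import socket

# Only the hostname goes to connect(); the path goes on the request line,
# and the Host header carries just the hostname.
request = b"GET / HTTP/1.1\r\nHost: stackoverflow.blog\r\nConnection: close\r\n\r\n"

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("stackoverflow.blog", 80))
s.sendall(request)

# Keep reading until the server closes the connection.
chunks = []
while True:
    data = s.recv(4096)
    if not data:  # empty bytes => server closed the connection
        break
    chunks.append(data)
s.close()

print(b"".join(chunks).decode(errors="replace"))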
I have a server which takes a few minutes to process a specific request and then responds to it.
The client has to keep waiting for the response without knowing when it will complete.
Is there a way to let the client know about the processing status? (say 50% completed, 80% completed), without the client having to poll for the status.
Without using any of the newer techniques (WebSockets, web push, HTTP/2, ...), I've previously used a simplified pushlet or long-polling solution for HTTP/1.1, with various JavaScript or hand-rolled client implementations. If my solution doesn't fit your use case, you can always google those two names for further possible approaches.
Client
sends a request, reads 17 bytes (the initial HTTP response), and then reads 2 bytes at a time to get the processing status.
Server
sends a valid HTTP response, then during request progress sends 2 bytes of percentage completed, until the last 2 bytes are "ok", and closes the connection.
UPDATED: Example uwsgi server.py
from time import sleep
def application(env, start_response):
    start_response('200 OK', [])
    def working():
        yield b'00'
        sleep(1)
        yield b'36'
        sleep(1)
        yield b'ok'
    return working()
UPDATED: Example requests client.py
import requests
response = requests.get('http://localhost:8080/', stream=True)
for r in response.iter_content(chunk_size=2):
    print(r)
Example server (only use for testing :)
import socket
from time import sleep
HOST, PORT = '', 8888
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    client_connection.send(b'HTTP/1.1 200 OK\n\n')
    client_connection.send(b'00')     # 0%
    sleep(2)                          # your work is done here
    client_connection.send(b'36')     # 36%
    sleep(2)                          # your work is done here
    client_connection.sendall(b'ok')  # done
    client_connection.close()
If the last 2 bytes aren't "ok", handle the error some other way. This isn't beautiful HTTP status-code compliance, but more of a workaround that did work for me many years ago.
Telnet client example:
$ telnet localhost 8888
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
0036okConnection closed by foreign host.
This answer probably won’t help in your particular case, but it might help in other cases.
The HTTP protocol supports informational (1xx) responses: a 1xx code "indicates an interim response for communicating connection status or request progress prior to completing the requested action and sending a final response".
There is even a status code precisely for your use case, 102 (Processing): an "interim response used to inform the client that the server has accepted the complete request, but has not yet completed it".
Status code 102 was removed from further editions of that standard due to lack of implementations, but it is still registered and could be used.
So, it might look like this (HTTP/2 has an equivalent binary form):
HTTP/1.1 102 Processing
Progress: 50%
HTTP/1.1 102 Processing
Progress: 80%
HTTP/1.1 200 OK
Date: Sat, 05 Aug 2017 11:53:14 GMT
Content-Type: text/plain
All done!
Unfortunately, this is not widely supported. In particular, WSGI does not provide a way to send arbitrary 1xx responses. Clients support 1xx responses in the sense that they are required to parse and tolerate them, but they usually don’t give programmatic access to them: in this example, the Progress header would not be available to the client application.
However, 1xx responses may still be useful (if the server can send them) because they have the effect of resetting the client’s socket read timeout, which is one of the main problems with slow responses.
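If your server speaks raw sockets rather than WSGI, sending interim responses is just a matter of framing. A minimal test-only sketch (port 8888 is an arbitrary choice, and Progress is the non-standard header from the example above):

import socket
from time import sleep

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(('', 8888))
srv.listen(1)

conn, addr = srv.accept()
conn.recv(1024)  # read (and ignore) the request
# Interim 1xx responses: status line + headers, no body, repeatable.
conn.sendall(b'HTTP/1.1 102 Processing\r\nProgress: 50%\r\n\r\n')
sleep(2)  # work happens here
conn.sendall(b'HTTP/1.1 102 Processing\r\nProgress: 80%\r\n\r\n')
sleep(2)
# Exactly one final response follows the interim ones.
conn.sendall(b'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n'
             b'Content-Length: 9\r\n\r\nAll done!')
conn.close()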
Use Chunked Transfer Encoding, which is a standard technique to transmit streams of unknown length.
See: Wikipedia - Chunked Transfer Encoding
Here's a Python server implementation, available as a gist on GitHub:
https://gist.github.com/josiahcarlson/3250376
It sends content using chunked transfer encoding, using only standard-library modules.
On the client side, if the server has signalled chunked transfer encoding, you'd only need to:
import requests

response = requests.get('http://server.fqdn:port/', stream=True)
for r in response.iter_content(chunk_size=None):
    print(r)
chunk_size=None is used because the chunks are dynamic: their sizes are determined by the length prefixes that the chunked transfer encoding framing defines.
See: http://docs.python-requests.org/en/master/user/advanced/#chunk-encoded-requests
When you see, for example, 100 in the content of a chunk r, you know that the next chunk will be the actual content, produced once processing reached 100%.
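If you'd rather see the wire format itself than read the gist, here is a minimal raw-socket sketch of the server side (test only; each progress update travels as its own chunk, framed as a hex length line followed by the data):

import socket
from time import sleep

def chunk(data):
    # Chunked framing: <length in hex>\r\n<data>\r\n
    return ('%x\r\n' % len(data)).encode() + data + b'\r\n'

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(('', 8080))
srv.listen(1)

conn, _ = srv.accept()
conn.recv(1024)  # read (and ignore) the request
conn.sendall(b'HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n')
for status in (b'25', b'50', b'100'):
    conn.sendall(chunk(status))  # one progress update per chunk
    sleep(1)                     # work happens here
conn.sendall(chunk(b'actual response content'))
conn.sendall(b'0\r\n\r\n')       # zero-length chunk terminates the response
conn.close()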
I'm creating an HTTP proxy in Python, but I'm having trouble: my proxy only accepts the web server's response and completely ignores the browser's next request, so the transfer of data just stops. Here's the code:
import socket
s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
bhost = '192.168.1.115'
port = 8080
s.bind((bhost, port))
s.listen(5)
def server(sock, data, host):
    p = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    p.connect((host, 80))
    p.send(data)
    rdata = p.recv(1024)
    print(rdata)
    sock.send(rdata)

while True:
    sock, addr = s.accept()
    data = sock.recv(1024)
    host = data.splitlines()[1][6:]
    server(sock, data, host)
Sorry about the code this is just a trial version and help will be much appreciated as I am only 14 and have much to learn :-)
Unfortunately, I don't really see how your code is supposed to work, so here are my thoughts on what a simple HTTP proxy should look like.
So what should a basic proxy server do:
Accept a connection from a client and receive an HTTP request.
Parse the request and extract its destination.
Forward requests and responses.
(optionally) Support Connection: keep-alive.
Let's go step by step and write some very simplified code.
How does the proxy accept a client? A socket is created and moved into passive (listening) mode:
import socket, select

sock = socket.socket()
sock.bind((your_ip, port))  # your_ip and port are placeholders
sock.listen()

while True:
    client_sock, client_addr = sock.accept()  # accept() returns (socket, address)
    do_stuff(client_sock)
Once the TCP connection is established, it's time to receive a request. Let's assume we're going to get something like this:
GET /?a=1&b=2 HTTP/1.1
Host: localhost
User-Agent: my browser details
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
In TCP, message borders aren't preserved, so we should wait until we've received at least the first two lines (for a GET request) in order to know what to do later:
def do_stuff(sock):
    data = receive_two_lines(sock)
    remote_host = parse_request(data)
Once we have the remote hostname, it's time to forward the requests and responses:
def do_stuff(client_sock):
    data = receive_two_lines(client_sock)
    remote_host = parse_request(data)
    remote_ip = socket.getaddrinfo(remote_host)  # see the docs for exact use
    webserver = socket.socket()
    webserver.connect((remote_ip, 80))
    webserver.sendall(data)
    while it_makes_sense():
        # wait until either peer has data for us, then forward it
        ready = select.select([client_sock, webserver], [], [])[0]
        if client_sock in ready:
            webserver.sendall(client_sock.recv(1024))
        if webserver in ready:
            client_sock.sendall(webserver.recv(1024))
Please note select: this is how we know whether a remote peer has sent us data. I haven't run or tested this code, and there are things left to do:
Chances are you will get several GET requests in a single client_sock.recv(1024) call because, again, message borders aren't preserved in TCP. Check for additional requests each time you receive data.
Requests may differ for POST, HEAD, PUT, DELETE and other methods. Parse them accordingly.
Browsers and servers usually utilise one TCP connection by setting the Connection: keep-alive option in the headers, but they may also decide to drop it. Be ready to detect disconnects and sockets closed by the remote peer (for simplicity's sake, this is called while it_makes_sense() in the code).
bind, listen, accept, recv, send, sendall, getaddrinfo, select: all of these functions can throw exceptions. It's better to catch them and act accordingly.
The code currently serves one client at a time. A minimal runnable version with the placeholders filled in is sketched below.
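Here is that sketch: a minimal, HTTP-only version under the assumptions above (single client, no error handling). receive_headers and parse_host are my stand-ins for the receive_two_lines and parse_request placeholders:

import select
import socket

def receive_headers(sock):
    # Read until the end of the header block; TCP preserves no message borders.
    data = b''
    while b'\r\n\r\n' not in data:
        more = sock.recv(1024)
        if not more:
            break
        data += more
    return data

def parse_host(data):
    # Find the Host: header and return the bare hostname.
    for line in data.split(b'\r\n'):
        if line.lower().startswith(b'host:'):
            return line.split(b':', 1)[1].strip().decode()
    return None

sock = socket.socket()
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('', 8080))  # listen on all interfaces; use your bhost/port
sock.listen(5)

while True:
    client_sock, addr = sock.accept()
    data = receive_headers(client_sock)
    host = parse_host(data)
    webserver = socket.socket()
    webserver.connect((host, 80))
    webserver.sendall(data)
    # Shuttle bytes both ways until one side hangs up.
    open_conn = True
    while open_conn:
        ready = select.select([client_sock, webserver], [], [])[0]
        for s in ready:
            piece = s.recv(1024)
            if not piece:
                open_conn = False
                break
            (webserver if s is client_sock else client_sock).sendall(piece)
    client_sock.close()
    webserver.close()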
I am using Python to write a simple web server and sending requests to it, with libevent as my HTTP client. But every time I send a keep-alive request, the HTTP connection gets the close callback before the success callback. I think it might be a keep-alive problem. This is my Python (server) code:
import socket
HOST, PORT = '', 8999
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 4)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 15)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print 'Serving HTTP on port %s ...' % PORT
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    print request
    http_response = """\
HTTP/1.1 200 OK

Hello, World!
"""
    client_connection.sendall(http_response)
    client_connection.close()
But every time I send a keep-alive request, ...
I think you are mixing up application-layer HTTP keep-alive and transport-layer TCP keep-alive.
HTTP keep-alive is used by the client to suggest to the server that the underlying TCP connection should be kept open for further requests. But the server might decline, and your server explicitly closes the connection after it has handled the client's request, i.e. after sending the response. Apart from that, the server sends the response in a way that makes HTTP keep-alive impossible: the length of the response is unknown, so the response ends only with the end of the underlying TCP connection. To fix this you would need to specify a Content-Length or use chunked transfer encoding.
TCP keep-alive, by contrast, is used to detect loss of connectivity, i.e. one side crashed, a router died, or similar. It is not related to HTTP keep-alive at all, apart from the similar name. It is set with setsockopt, and that's what you are doing. But there is no such thing as a keep-alive request that you can explicitly send in the case of TCP keep-alive.
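To illustrate the HTTP-side fix: the server from the question could send a response with an explicit Content-Length and then keep the socket open for the next request. A minimal sketch of just the response part, reusing the question's client_connection:

body = b'Hello, World!\n'
http_response = (b'HTTP/1.1 200 OK\r\n'
                 b'Content-Type: text/plain\r\n'
                 b'Content-Length: %d\r\n'
                 b'Connection: keep-alive\r\n'
                 b'\r\n' % len(body)) + body
client_connection.sendall(http_response)
# Do NOT close the connection here; loop back to recv() instead,
# so the client can reuse the connection for its next request.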
I'm trying to create a TCP socket server in Python that, after receiving a string of bytes from a client, passes the received data (without inspecting it, assuming it's a valid HTTP request) to an HTTP or HTTPS proxy and waits for the result. My code looks like this:
import socket
def test(host, port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((socket.gethostbyname(host), int(port)))
    msg = """GET / HTTP/1.1
Host: www.bing.com
User-Agent: Firefox
"""
    sent_count = sock.send(msg.encode())
    recv_value = sock.recv(2048)
    print('received:')
    print(recv_value)
if __name__ == '__main__':
    test(host='x.x.x.xx', port='80')     # an https proxy server
    test(host='yy.yy.yy.yy', port='80')  # an http proxy server
But when I connect to the HTTP proxy server, it returns something like:
HTTP/1.1 404 Not Found
And when I connect to the HTTPS proxy server, it shows something like:
HTTP/1.0 400 Bad Request
So I wanted to ask: does anybody know how I could send HTTP requests to HTTP/HTTPS proxy servers via sockets in Python? Or, more generally, how can I forward arbitrary strings of data toward HTTP/HTTPS proxy servers using sockets? Any suggestions are much appreciated; thanks in advance.
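For reference, plain HTTP proxies expect the absolute-form request target (the full URL on the request line, per RFC 7230) and CRLF line endings terminated by an empty line, which the request above lacks. A sketch of what the proxy likely expects (test_via_proxy is a hypothetical name; for an HTTPS destination you would instead first issue a CONNECT request and then run TLS over the resulting tunnel):

import socket

def test_via_proxy(proxy_host, proxy_port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((socket.gethostbyname(proxy_host), int(proxy_port)))
    # Absolute-form target plus CRLF line endings and a terminating blank line.
    msg = (b'GET http://www.bing.com/ HTTP/1.1\r\n'
           b'Host: www.bing.com\r\n'
           b'User-Agent: Firefox\r\n'
           b'Connection: close\r\n'
           b'\r\n')
    sock.sendall(msg)
    print(sock.recv(2048).decode(errors='replace'))
    sock.close()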