How to get port number from an HTTP response, using Python?


I am trying to simulate Network Address Translation for some test code. I am mapping virtual users to high port numbers, then, when I want to test a request from user 1, I send it from port 6000 (user 2, from 6001, etc).
However, I can't see the port number in the response.
connection = httplib.HTTPConnection("the.company.lan", port=80, strict=False,
timeout=10, source_address=("10.129.38.51", 6000))
connection.request("GET", "/portal/index.html")
httpResponse = connection.getresponse()
connection.close()
httpResponse.status is 200, but I don't see the port number anywhere in the response headers.
Maybe I should be using some lower level socket functionality? If so, which is simplest and supports both HTTP and FTP? Btw, I only want to use built-in modules, nothing which I have to install.
[Update] I should have made it clearer; I really do need to get the actual port number received in the response, not just remember it.

To complement @TimSpence's answer, you can use a socket object as the interface for your connection and then use some API to treat the received data as an HTTP object.
import socket
host = 'xxx.xxx.xxx.xxx'
port = 80
address = (host, port)
MAX_CONNECTIONS = 5  # listen backlog; an arbitrary value, adjust for your test
## socket object interface for a TCP connection
listener_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                                socket.IPPROTO_TCP)
listener_socket.bind(address)
listener_socket.listen(MAX_CONNECTIONS)
## new_connection is the connection object you use to handle the data exchange
## source_address (source_host, source_port) is the address object
## source_port is what you're looking for
new_connection, source_address = listener_socket.accept()
data = new_connection.recv(65536)
## handle data as an HTTP object (for example as an HTTP request)
new_connection.close()

HTTP messages do not contain anything about ports, so the httpResponse will not have that information.
However, you will need a different connection object (which maps to a different underlying socket) for each request anyway, so you can get that information from the HTTPConnection object.
_, port = connection.source_address
Does that help?
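If it helps to see where the port actually lives, here is a minimal Python 3 sketch (http.client is the Python 3 name for httplib). It assumes a throwaway listener on the loopback interface standing in for the real server, and it reads the connection's sock attribute, which http.client sets once connect() has run. The names local_port_of and demo are made up for illustration.

```python
import http.client
import socket


def local_port_of(connection):
    """Return the local (source) port of a connected HTTPConnection."""
    # connection.sock is the underlying socket; it only exists after connect()
    return connection.sock.getsockname()[1]


def demo():
    # Stand-in server on loopback; port 0 lets the OS pick a free port
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server_port = listener.getsockname()[1]

    # source_address=(ip, 0) lets the OS choose the client port as well;
    # pass a fixed port such as 6000 to pin it, as in the question
    connection = http.client.HTTPConnection(
        "127.0.0.1", server_port, timeout=10,
        source_address=("127.0.0.1", 0))
    connection.connect()
    accepted, peer_address = listener.accept()

    client_side = local_port_of(connection)  # what the client socket reports
    server_side = peer_address[1]            # what the server saw in accept()

    accepted.close()
    connection.close()
    listener.close()
    return client_side, server_side
```

Both values come from the TCP layer (getsockname on the client, accept on the server), not from anything in the HTTP message.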

Considering your comments, I had to provide a new answer.
I thought you could also add a non-standard Host header, 'Host: domain/IP:port', to your HTTP response, so that your client can read it when it receives the response.
Server Response:
HTTP/1.1 200 OK
Date: Day, DD Month YYYY HH:MM:SS GMT
Content-Type: text/html; charset=UTF-8
Content-Encoding: UTF-8
Content-Length: LENGTH
Last-Modified: Day, DD Month YYYY HH:MM:SS GMT
Server: Name/Version (Platform)
Accept-Ranges: bytes
Connection: close
Host: domain/IP:port #example: the.company.lan:80
<html>
<head>
<title>Example Response</title>
</head>
<body>
Hello World!
</body>
</html>
Client:
connection = httplib.HTTPConnection("the.company.lan", port=80,
strict=False, timeout=10,
source_address=("10.129.38.51", 6000))
connection.request("GET", "/portal/index.html")
httpResponse = connection.getresponse()
## store a dict with the response headers
## extract your custom header 'host'
res_headers = dict(httpResponse.getheaders())
server_address = tuple(res_headers['host'].split(':'))
## read the response content, using the length announced by the server
HTMLData = httpResponse.read(int(res_headers['content-length']))
connection.close()
This way you get server_address as a tuple (domain, port).
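For a test setup where you also control the server, one way to make a port visible in the response is to have the server echo back the source port it saw. Here is a rough Python 3 sketch of that idea. It assumes a hypothetical header name, X-Client-Port, and a throwaway http.server instance on loopback; neither comes from the posts above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello World!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        # X-Client-Port is a made-up header carrying the port the server saw
        self.send_header("X-Client-Port", str(self.client_address[1]))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test server quiet


def fetch_reported_port():
    server = HTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=10)
        connection.connect()
        # read the local port now; the socket may be handed off to the
        # response object once getresponse() returns
        actual = connection.sock.getsockname()[1]
        connection.request("GET", "/")
        response = connection.getresponse()
        reported = int(response.getheader("X-Client-Port"))
        response.read()
        connection.close()
        return reported, actual
    finally:
        server.shutdown()
        server.server_close()
```

With a NAT device between client and server, the reported value would be the translated port rather than the one the client bound to.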

Related

how to get html file into code in python?

How can I get an HTML page into Python code using sockets? I was able to do it with the requests library, but it needs to be rewritten to use sockets, and I don't understand how. The requests implementation is below, followed by my attempt with a socket, pieced together from Google. That attempt is not correct at all. Please help me implement this using sockets.
import requests
reg_get = requests.get("https://stackoverflow.blog/")
text = reg_get.text
print(text)
import socket
request = b"GET / HTTP/1.1\nHost: https://stackoverflow.blog/\n\n"
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("https://stackoverflow.blog/", 80))
s.send(request)
result = s.recv(10000)
while (len(result) > 0):
    print(result)
    result = s.recv(10000)
After reading the comments and following your advice, I rewrote the code as follows. However, I still never get the HTML; I only receive header information about the site. How do I get the HTML structure in Python?
import socket
import ssl
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
request = "GET /r/AccidentalRenaissance/comments/8ciibe/mr_fluffies_betrayal/ HTTP/1.1\r\nHost: www.reddit.com\r\n\r\n"
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
s = context.wrap_socket(sock, server_hostname = "www.reddit.com")
s.connect(("www.reddit.com", 443))
s.sendall(request.encode())
contest = s.recv(1024).decode()
s.close()
print(contest)
Result:
HTTP/1.1 200 OK
Connection: keep-alive
Cache-control: private, s-maxage=0, max-age=0, must-revalidate, no-store
Content-Type: text/html; charset=utf-8
X-Frame-Options: SAMEORIGIN
Accept-Ranges: bytes
Date: Sun, 03 Oct 2021 03:34:25 GMT
Via: 1.1 varnish
Vary: Accept-Encoding, Accept-Encoding

A URL is composed of a protocol, a hostname, an optional port, and an optional path. In the URL http://stackoverflow.blog/, http is the protocol, stackoverflow.blog is the hostname, no port is provided, and the path is just /. For http, the port defaults to 80 and the path defaults to /. When using sockets, first establish a connection to the host on that port using connect, then send an HTTP command to retrieve the page at the path. Note that connect takes only the hostname (stackoverflow.blog), not the full URL, and the same goes for the Host header. The command to retrieve the page is "GET /"; after sending it, receive the response from the server.
Note that I used http instead of https because https adds security setup and negotiation, which happens after connect completes but before "GET /" is sent. It is quite complicated and a good reason to use Requests instead of trying to implement it yourself. If you don't want to use Requests but also don't want to go down to the level of sockets, take a look at urllib3.
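The steps above could be sketched roughly as follows for plain http. The helper names split_url, build_get_request and fetch are invented for this example. The sketch sends Connection: close so it can simply read until the server closes the socket, and it does not handle redirects, chunked bodies or https.

```python
import socket
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into (scheme, hostname, port, path) with HTTP defaults."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


def build_get_request(host, path):
    """Build a GET request; the Host header carries the bare hostname."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close when it is done
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(url):
    """Return the raw response (headers and body) for a plain-http URL."""
    scheme, host, port, path = split_url(url)
    if scheme != "http":
        raise ValueError("this sketch only handles plain http")
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

A single recv(1024), as in the second attempt in the question, returns only whatever has arrived so far, which is often just the headers. Looping until recv returns an empty result is what collects the HTML that follows them.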

Reading text from website using sockets Python

I am trying to go to http://www.py4inf.com/code/romeo.txt, read the contents of romeo.txt, and print them back out. I am using Python 3.6.1.
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('www.py4inf.com', 80))
mysock.send('GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n'.encode("utf8"))
while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode("utf8"))
mysock.close()
Instead of the contents of the page, it prints out:
HTTP/1.1 404 Not Found
Server: nginx
Date: Wed, 21 Jun 2017 03:00:15 GMT
Content-Type: text/html
Content-Length: 162
Connection: close
<html>
<head><title>404 Not Found</title></head>
<body bgcolor="white">
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
Why is this? Thanks in advance.

In theory, the Host header is only mandatory from HTTP 1.1 onwards, but it appears that particular server requires the Host header to be present, even for HTTP 1.0. I'm not sure if that's the default behaviour of Nginx, or whether the server admin's explicitly configured it that way.
In any case, try changing your request to the following:
mysock.send('GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\nHost: www.py4inf.com\n\n'.encode("utf8"))
I can understand your confusion - IMHO, it should be returning 400 not 404 if it is insisting on the Host header being provided (since it's a client request issue, not a matter of the resource not existing).
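Whichever status comes back, the reply arrives as a status line, headers, a blank line, and then the body. To print only the file contents you have to separate those parts yourself. Here is a small sketch of that; split_response is an invented helper name, and it assumes the whole response has already been collected into one bytes object.

```python
def split_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    # headers and body are separated by the first empty line
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # status line looks like: HTTP/1.1 404 Not Found
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body
```

Checking the status code before printing the body makes a 404 like the one above stand out straight away.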

proxy server not sending all data python

I'm creating an HTTP proxy in Python, but my proxy only accepts the web server's first response and then completely ignores the browser's next request, so the transfer of data just stops. Here's the code:
import socket
s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
bhost = '192.168.1.115'
port = 8080
s.bind((bhost, port))
s.listen(5)
def server(sock, data, host):
    p = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    p.connect((host, 80))
    p.send(data)
    rdata = p.recv(1024)
    print(rdata)
    sock.send(rdata)
while True:
    sock, addr = s.accept()
    data = sock.recv(1024)
    host = data.splitlines()[1][6:]
    server(sock, data, host)
Sorry about the code; this is just a trial version. Help will be much appreciated, as I am only 14 and have much to learn :-)

Unfortunately, I don't really see how your code is supposed to work, so here are my thoughts on what a simple HTTP proxy should look like.
A basic proxy server should do the following:
1. Accept a connection from a client and receive an HTTP request.
2. Parse the request and extract its destination.
3. Forward requests and responses.
4. (Optionally) support Connection: keep-alive.
Let's go step by step and write some very simplified code.
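How does a proxy accept a client? A socket should be created and moved to passive mode:
import socket, select
sock = socket.socket()
sock.bind((your_ip, port))  # your_ip and port are placeholders for the proxy's own address
sock.listen()
while True:
    # accept() returns a (socket, address) pair
    client_sock, client_addr = sock.accept()
    do_stuff(client_sock)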
Once the TCP connection is established, it's time to receive a request. Let's assume we're going to get something like this:
GET /?a=1&b=2 HTTP/1.1
Host: localhost
User-Agent: my browser details
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
In TCP, message boundaries aren't preserved, so we should wait until we have at least the first two lines (for a GET request) in order to know what to do next:
def do_stuff(sock):
    data = receive_two_lines(sock)
    remote_host = parse_request(data)
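Because one recv call may return only part of a request, a receive helper has to keep reading until it has enough. The answer leaves receive_two_lines undefined, so the following is only one possible sketch. It uses an invented helper, recv_headers, which reads until the blank line that ends the header block instead of stopping after two lines.

```python
import socket


def recv_headers(sock, limit=65536):
    """Read from sock until the end of the HTTP header block is seen."""
    buffer = b""
    while b"\r\n\r\n" not in buffer:
        chunk = sock.recv(1024)
        if not chunk:  # peer closed before sending a full header block
            break
        buffer += chunk
        if len(buffer) > limit:
            raise ValueError("header block too large")
    return buffer
```

The returned buffer may also contain the start of a request body or of a following request, so the caller still needs to split at the first blank line.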
After we have got the remote hostname, it's time to forward the requests and responses:
def do_stuff(client_sock):
    data = receive_two_lines(client_sock)
    remote_host = parse_request(data)
    # getaddrinfo returns a list of (family, type, proto, canonname, sockaddr)
    addr_info = socket.getaddrinfo(remote_host, 80, socket.AF_INET, socket.SOCK_STREAM)
    remote_ip = addr_info[0][4][0]
    webserver = socket.socket()
    webserver.connect((remote_ip, 80))
    webserver.sendall(data)
    while it_makes_sense():
        # one select call over both sockets, so neither peer blocks the other
        ready = select.select([client_sock, webserver], [], [])[0]
        if client_sock in ready:
            webserver.sendall(client_sock.recv(1024))
        if webserver in ready:
            client_sock.sendall(webserver.recv(1024))
Please note select - this is how we know whether a remote peer has sent us data. I haven't run or tested this code, and there are things left to do:
Chances are you will get several GET requests in a single client_sock.recv(1024) call because, again, message boundaries aren't preserved in TCP. You should probably look for additional GET requests each time you receive data.
Requests may differ for POST, HEAD, PUT, DELETE and other methods. Parse them accordingly.
Browsers and servers usually reuse one TCP connection by setting the Connection: keep-alive option in the headers, but they may also decide to drop it. Be ready to detect disconnects and sockets closed by a remote peer (for simplicity's sake, this is called while it_makes_sense() in the code).
bind, listen, accept, recv, send, sendall, getaddrinfo, select - all these functions can throw exceptions. It's better to catch them and act accordingly.
The code currently serves one client at a time.
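The answer also leaves parse_request and the it_makes_sense() loop undefined. One possible way to fill them in is sketched below, with invented names extract_host and relay. The sketch assumes hostnames or IPv4 literals in the Host header (no bracketed IPv6) and treats an empty read from either side as the end of the session.

```python
import select
import socket


def extract_host(request, default_port=80):
    """Return (host, port) taken from the Host header of a raw request."""
    for line in request.split(b"\r\n")[1:]:  # skip the request line
        if not line:  # blank line: end of headers
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            host, _, port = value.strip().decode("ascii").partition(":")
            return host, int(port) if port else default_port
    raise ValueError("no Host header found")


def relay(client_sock, server_sock):
    """Copy bytes in both directions until one side closes."""
    sockets = [client_sock, server_sock]
    while True:
        # block until at least one of the two sockets has data to read
        readable, _, _ = select.select(sockets, [], [])
        for source in readable:
            data = source.recv(4096)
            if not data:  # empty read means that peer closed the connection
                return
            target = server_sock if source is client_sock else client_sock
            target.sendall(data)
```

Searching for the header by name is less fragile than slicing a fixed line and offset, since browsers do not guarantee that Host is the second line.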

python socket handle keepalive request

I am using Python to write a simple web server and sending requests to it, with libevent as my HTTP client. But every time I send a keep-alive request, the HTTP connection fires the close callback before the success callback. I think it might be a keep-alive problem. This is my Python (server) code:
import socket
HOST, PORT = '', 8999
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 4)
listen_socket.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 15)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print 'Serving HTTP on port %s ...' % PORT
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    print request
    http_response = """\
HTTP/1.1 200 OK
Hello, World!
"""
    client_connection.sendall(http_response)
    client_connection.close()

But every time I send a keep-alive request, ...
I think you are mixing up the application layer HTTP keep-alive and the transport layer TCP keep-alive.
HTTP keep-alive is used by the client to suggest to the server that the underlying TCP connection should be kept open for further requests from the client. But the server may decline, and your server explicitly closes the connection after it has handled the client's request, i.e. finished sending the response. Apart from that, the server sends the response in a way that makes HTTP keep-alive impossible: the length of the response is unknown, so the response ends only when the underlying TCP connection ends. To fix this you would need to specify a Content-Length or use chunked transfer encoding.
TCP keep-alive, by contrast, is used to detect loss of connectivity, i.e. one side crashed, a router died, or similar. It is not related to HTTP keep-alive at all, except for the similar name. It is set with setsockopt, and that's what you are doing. But with TCP keep-alive there is no such thing as a keep-alive request that you can explicitly send.
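As a rough illustration of the HTTP side of this, here is a Python 3 sketch of a response that declares its length and a per-connection loop that keeps the socket open until the client closes it. The names build_response, handle_one and serve_connection are invented, and the sketch assumes each recv(1024) delivers exactly one complete request, which a real server cannot rely on.

```python
import socket


def build_response(body):
    """Build a response whose length is known, so the connection can stay open."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: keep-alive\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def handle_one(connection):
    """Answer one request; return False once the client has closed."""
    request = connection.recv(1024)
    if not request:
        return False
    connection.sendall(build_response(b"Hello, World!\n"))
    return True


def serve_connection(connection):
    """Keep answering on the same socket instead of closing after one reply."""
    while handle_one(connection):
        pass
    connection.close()
```

In the server from the question, serve_connection(client_connection) would take the place of the recv/sendall/close sequence inside the accept loop.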

Python socket client Post parameters

First, let me be clear: I don't want to use higher-level APIs, I only want to use socket programming.
I have written the following program to connect to a server using a POST request.
import socket
import binascii
host = "localhost"
port = 9000
message = "POST /auth HTTP/1.1\r\n"
parameters = "userName=Ganesh&password=pass\r\n"
contentLength = "Content-Length: " + str(len(parameters))
contentType = "Content-Type: application/x-www-form-urlencoded\r\n"
finalMessage = message + contentLength + contentType + "\r\n"
finalMessage = finalMessage + parameters
finalMessage = binascii.a2b_qp(finalMessage)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))
s.sendall(finalMessage)
print(s.recv(1024))
I checked online how a POST request is created.
Somehow the parameters are not getting passed to the server. Do I have to add or remove "\r\n" somewhere in the request?
Thanks in advance,
Regards,
Ganesh.

The line finalMessage = binascii.a2b_qp(finalMessage) is certainly wrong, so you should remove it completely. Another problem is that a newline is missing after Content-Length. As a result, the request sent to the socket is (I am showing the CR and LF characters here as \r\n, but also splitting lines for clarity):
POST /auth HTTP/1.1\r\n
Content-Length: 31Content-Type: application/x-www-form-urlencoded\r\n
\r\n
userName=Ganesh&password=pass\r\n
So obviously this does not make much sense to the web server.
But even after adding the newline and removing a2b_qp, there is still the problem that you are not really talking HTTP/1.1; the request must have a Host header for HTTP/1.1 (RFC 2616 14.23):
A client MUST include a Host header field in all HTTP/1.1 request
messages. If the requested URI does not include an Internet host name
for the service being requested, then the Host header field MUST be
given with an empty value. An HTTP/1.1 proxy MUST ensure that any
request message it forwards does contain an appropriate Host header
field that identifies the service being requested by the proxy. All
Internet-based HTTP/1.1 servers MUST respond with a 400 (Bad Request)
status code to any HTTP/1.1 request message which lacks a Host header
field.
Also, you do not support chunked requests, persistent connections, keep-alives or anything similar, so you must send Connection: close (RFC 2616 14.10):
HTTP/1.1 applications that do not support persistent connections MUST
include the "close" connection option in every message.
Thus, any HTTP/1.1 server that would still respond normally to your messages without Host: header is also broken.
This is the data that you should send to the socket with that request:
POST /auth HTTP/1.1\r\n
Content-Type: application/x-www-form-urlencoded\r\n
Content-Length: 29\r\n
Host: localhost:9000\r\n
Connection: close\r\n
\r\n
userName=Ganesh&password=pass
Note that you'd not add the \r\n in the body anymore (thus the length of body 29). Also you should read the response to find out whatever the error is that you're getting.
On Python 3 the working code would say:
host = "localhost"
port = 9000
headers = """\
POST /auth HTTP/1.1\r
Content-Type: {content_type}\r
Content-Length: {content_length}\r
Host: {host}\r
Connection: close\r
\r\n"""
body = 'userName=Ganesh&password=pass'
body_bytes = body.encode('ascii')
header_bytes = headers.format(
    content_type="application/x-www-form-urlencoded",
    content_length=len(body_bytes),
    host=str(host) + ":" + str(port)
).encode('iso-8859-1')
payload = header_bytes + body_bytes
# ...
# s is the connected socket object from the question, not the socket module
s.sendall(payload)
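If the form values might contain characters such as & or spaces, the body also needs URL-encoding before its length is measured. Here is a hedged sketch of building the same request that way. The helper names build_post and send_post are invented, and urllib.parse.urlencode is used only to format the body, while the transport stays a plain socket.

```python
import socket
from urllib.parse import urlencode


def build_post(host, port, path, fields):
    """Build a form-encoded POST request as bytes."""
    body = urlencode(fields).encode("ascii")
    header_lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",  # length of the encoded body in bytes
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(header_lines).encode("ascii") + body


def send_post(host, port, path, fields):
    """Send the request and return the raw response, read until close."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_post(host, port, path, fields))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

For example, send_post("localhost", 9000, "/auth", {"userName": "Ganesh", "password": "pass"}) would send the same bytes as the hand-written request above, assuming a server is listening there.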
