I know that Python's urllib3 by default reuses HTTP connections for requests sent to the same host. I wanted this to work for requests sent to an IP address as well. I ran a small test:
import logging
import requests
logging.basicConfig(level=logging.INFO)
s = requests.Session()
print(s.get('https://<ip address here>/xxx/yyy',verify=False))
print(s.get('https://<same ip address here>/xxx/yyy',verify=False))
output:
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1):...
INFO:requests.packages.urllib3.connectionpool:Resetting dropped connection: ...
second code:
import logging
import requests
logging.basicConfig(level=logging.INFO)
s = requests.Session()
print(s.get('http://httpbin.org/cookies/set/sessioncookie/123456789'))
print(s.get('http://httpbin.org/cookies/set/anothercookie/123456789'))
output:
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): httpbin.org
<Response [200]>
<Response [200]>
Obviously the HTTP connection was not reused cleanly (that is, without being closed or dropped) for the IP address host. How can I make it work? Or is it simply impossible in the first place?
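As far as I can tell, urllib3 keys its connection pool on scheme, host and port, so an IP address is pooled like any hostname. The "Resetting dropped connection" message is logged when the pooled connection turns out to have been closed before the next request, typically by the server. Here is a minimal standard-library sketch of keep-alive reuse against an IP address when the server keeps the connection open. The local test server and the names KeepAliveHandler and count_connections are illustrative and not part of requests or urllib3.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Source ports of the client sockets seen by the test server.
seen_client_ports = []


class KeepAliveHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 makes the stdlib server keep connections open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def count_connections(n_requests=2):
    """Send n_requests over one HTTPConnection to 127.0.0.1 and return
    how many distinct TCP connections the server observed."""
    seen_client_ports.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        for _ in range(n_requests):
            conn.request("GET", "/xxx/yyy")
            conn.getresponse().read()  # drain the body so the socket can be reused
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(set(seen_client_ports))
```

If the count is 1 here but the real server still logs a reset, the server (or a device in front of it) is probably closing idle connections, which the client cannot prevent.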
I'm trying to connect to a WebSocket server that is protected by CloudFlare, using the Upgrade: websocket header. The expected result is 101 Switching Protocols. Using a raw socket, I was able to connect to the server, but I intermittently hit issues such as an SSLv3 handshake failure or the server not responding at all.
import ssl
import socket
socketch = ssl._create_unverified_context().wrap_socket(socket.socket(), server_hostname='unpkg.com')
socketch.connect(('unpkg.com', 443))
socketch.sendall(b'''GET / HTTP/1.1\r
Host: identity.o2.co.uk.zainvps.tk\r
User-Agent: cpprestsdk/2.9.0\r
Upgrade: websocket\r
Connection: Upgrade\r
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==\r
Sec-WebSocket-Version: 13\r\n\r
''')
print(socketch.recv(10000))
print('')
Using a raw socket is unstable, so I think it's better to use the requests module.
import requests
heading = {'Host':'identity.o2.co.uk.zainvps.tk','Connection':'upgrade','Upgrade':'websocket','Sec-Websocket-Version':'13','Sec-Websocket-Key':'dGhlIHNhbXBsZSBub25jZQ=='}
r = requests.get('https://unpkg.com', headers=heading)
print(r.status_code)
With requests, the server responds with a 403 status code, which means the request was rejected by the CloudFlare protection, whereas the raw socket gets the correct 101 status code. I'm assuming this is because the wrapped socket sends the expected SSL hostname through server_hostname.
Can this idea also be implemented inside requests.Session()?
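Whichever client ends up sending the upgrade request, a 101 response can be checked against the key that was sent. Here is a minimal sketch using the accept-key computation from RFC 6455. The helper names expected_accept and is_valid_upgrade are made up for illustration, and the parser assumes CRLF line endings.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key):
    """Return the Sec-WebSocket-Accept value a server must send for `key`."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_upgrade(raw_response, key):
    """Check that raw response bytes are a 101 with the matching accept header."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or parts[1] != "101":
        return False
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)
```

The bytes returned by socketch.recv(10000) in the raw-socket version could be passed to is_valid_upgrade together with the Sec-WebSocket-Key that was sent.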
UPDATE 1:
Someone mentioned using the CloudScraper module to bypass the CloudFlare protection. CloudScraper still returns a 403 status code when custom headers are used.
import cloudscraper
scraper = cloudscraper.create_scraper()
url = 'https://unpkg.com'
sc = scraper.get(url, headers={"Host": "usaws1.sshstores.vip", "Connection": "upgrade", "Upgrade": "websocket","Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==", "Sec-WebSocket-Version": "13"})
print(sc.status_code)
I'm trying to send an HTTPS request through an HTTPS tunnel. That is, my proxy expects HTTPS for the CONNECT. It also expects a client certificate.
I'm using Requests' proxy features.
import requests
url = "https://some.external.com/endpoint"
with requests.Session() as session:
    response = session.get(
        url,
        proxies={"https": "https://proxy.host:4443"},
        # client certificates expected by proxy
        cert=(cert_path, key_path),
        verify="/home/savior/proxy-ca-bundle.pem",
    )
    with response:
        ...
This works, but with some limitations:
I can only set client certificates for the TLS connection with the proxy, not for the external endpoint.
The proxy-ca-bundle.pem only verifies the server certificates in the TLS connection with the proxy. The server certificates from the external endpoint are seemingly ignored.
Is there any way to use requests to address these two issues? I'd like to set a different set of CAs for the external endpoint.
I also tried using http.client and HTTPSConnection.set_tunnel but, as far as I can tell, its tunnel is done through HTTP and I need HTTPS.
Looking at the source code, it doesn't seem like requests currently supports this "TLS in TLS", i.e. providing two sets of client certificates/CA bundles for a proxied request.
We can use PycURL, which simply wraps libcurl:
from io import BytesIO
import pycurl
url = "https://some.external.com/endpoint"
buffer = BytesIO()
curl = pycurl.Curl()
curl.setopt(curl.URL, url)
curl.setopt(curl.WRITEDATA, buffer)
# proxy settings
curl.setopt(curl.HTTPPROXYTUNNEL, 1)
curl.setopt(curl.PROXY, "https://proxy.host")
curl.setopt(curl.PROXYPORT, 4443)
curl.setopt(curl.PROXY_SSLCERT, cert_path)
curl.setopt(curl.PROXY_SSLKEY, key_path)
curl.setopt(curl.PROXY_CAINFO, "/home/savior/proxy-ca-bundle.pem")
# endpoint verification
curl.setopt(curl.CAINFO, "/home/savior/external-ca-bundle.pem")
try:
    curl.perform()
except pycurl.error:
    pass # log or re-raise
else:
    status_code = curl.getinfo(curl.RESPONSE_CODE)
PycURL will use the PROXY_ settings to establish a TLS connection to the proxy and send it an HTTP CONNECT request. It will then establish a new TLS session to the external endpoint through the proxy connection and use the CAINFO bundle to verify the endpoint's server certificates.
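For comparison, the split between the two trust configurations can be sketched with the standard library's ssl module. This covers only the context setup and is not a working TLS-in-TLS client. The function name build_tls_contexts and its parameters are hypothetical, and passing None for a CA path falls back to the system trust store.

```python
import ssl


def build_tls_contexts(proxy_ca=None, endpoint_ca=None,
                       client_cert=None, client_key=None):
    """Return (proxy_ctx, endpoint_ctx) with independent trust settings.

    proxy_ctx verifies the proxy and optionally presents a client certificate;
    endpoint_ctx verifies the external endpoint reached through the tunnel.
    """
    # Outer TLS session: client <-> proxy.
    proxy_ctx = ssl.create_default_context(cafile=proxy_ca)
    if client_cert is not None:
        proxy_ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)

    # Inner TLS session: client <-> external endpoint, tunnelled via CONNECT.
    endpoint_ctx = ssl.create_default_context(cafile=endpoint_ca)
    return proxy_ctx, endpoint_ctx
```

The two contexts play the roles of the PROXY_CAINFO/PROXY_SSLCERT and CAINFO options above. Driving the inner handshake over the already encrypted proxy socket is the hard part, and it is what libcurl takes care of.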
I'm trying to access the following domain nzxj65x32vh2fkhk.onion using requests.
I have tor running and I configured the session's object proxies correctly.
import requests
session = requests.session()
session.proxies = {'http': 'socks5://localhost:9050',
'https': 'socks5://localhost:9050'}
print(session.get('http://httpbin.org/ip').text) # prints {"origin": "67.205.146.164" }
print(requests.get('http://httpbin.org/ip').text) # prints {"origin": "5.102.254.76" }
However when I try to access the URL with the .onion domain I get the following error:
session.get('http://nzxj65x32vh2fkhk.onion/all')
ConnectionError: SOCKSHTTPConnectionPool(host='nzxj65x32vh2fkhk.onion', port=80): Max retries exceeded with url: /all (Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x7f5e8c2dbbd0>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I also tried to replace localhost with 127.0.0.1 as suggested in one of the answers. The result is the same unfortunately.
Performing the same request using urllib2 works just fine.
import socks, socket, urllib2
def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9050)
socket.socket = socks.socksocket
socket.create_connection = create_connection
print(urllib2.urlopen('http://nzxj65x32vh2fkhk.onion/all').read()) # Prints the URL's contents
cURL also retrieves the contents of the page correctly.
I'm using Python 2.7.13, requests 2.13.0 & PySocks 1.6.7. Tor is running through a docker container with the following command:
sudo docker run -it -p 8118:8118 -p 9050:9050 -d dperson/torproxy
What am I doing wrong here? What do I need to do to make requests recognize the .onion URLs?
The solution is to use the socks5h scheme, which makes the proxy resolve hostnames remotely instead of resolving them locally. A .onion name cannot be resolved by the local DNS resolver, which is why the socks5 scheme fails with "Name or service not known". See https://github.com/kennethreitz/requests/blob/e3f89bf23c53b98593e4248054661472aacac820/requests/packages/urllib3/contrib/socks.py#L158
The following code works as expected:
import requests
session = requests.session()
session.proxies = {'http': 'socks5h://localhost:9050',
'https': 'socks5h://localhost:9050'}
print(session.get('http://httpbin.org/ip').text) # prints {"origin": "67.205.146.164" }
print(requests.get('http://httpbin.org/ip').text) # prints {"origin": "5.102.254.76" }
print(session.get('http://nzxj65x32vh2fkhk.onion/all').text) # Prints the contents of the page
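To make the difference between the schemes explicit, here is a small sketch of how the proxy URL scheme decides where DNS resolution happens. It only illustrates the mapping and is not urllib3's actual code; the function name resolves_remotely is made up.

```python
from urllib.parse import urlsplit

# Schemes where the SOCKS proxy resolves the hostname (needed for .onion).
REMOTE_DNS_SCHEMES = {"socks5h", "socks4a"}
# Schemes where the client resolves the hostname before contacting the proxy.
LOCAL_DNS_SCHEMES = {"socks5", "socks4"}


def resolves_remotely(proxy_url):
    """Return True if hostnames are resolved by the proxy for this proxy URL."""
    scheme = urlsplit(proxy_url).scheme.lower()
    if scheme in REMOTE_DNS_SCHEMES:
        return True
    if scheme in LOCAL_DNS_SCHEMES:
        return False
    raise ValueError("not a SOCKS proxy URL: %r" % proxy_url)
```

With socks5://localhost:9050 the .onion name is handed to the local resolver, which fails. With socks5h://localhost:9050 Tor receives the name itself.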
To begin with, I understand there are other modules such as Requests that would be better suited and simpler to use, but I want to use the socket module to better understand HTTP.
I have a simple script that does the following:
Client ---> HTTP Proxy ---> External Resource (GET Google.com)
I am able to connect to the HTTP proxy alright, but when I send the GET request headers for google.com to the proxy, it doesn't serve me any response at all.
#!/usr/bin/python
import socket
import sys
headers = """GET / HTTP/1.1\r\n
Host: google.com\r\n\r\n"""
socket = socket
host = "165.139.179.225" #proxy server IP
port = 8080 #proxy server port
try:
    s = socket.socket()
    s.connect((host,port))
    s.send(("CONNECT {0}:{1} HTTP/1.1\r\n" + "Host: {2}: {3}\r\n\r\n").format(socket.gethostbyname(socket.gethostname()),1000,port,host))
    print s.recv(1096)
    s.send(headers)
    response = s.recv(1096)
    print response
    s.close()
except socket.error,m:
    print str(m)
    s.close()
    sys.exit(1)
To make an HTTP request through a proxy, open a connection to the proxy server and send an HTTP proxy request. This request is mostly the same as a normal HTTP request, but contains the absolute URL instead of the relative URL, e.g.
> GET http://www.google.com HTTP/1.1
> Host: www.google.com
> ...
< HTTP response
To make an HTTPS request, open a tunnel using the CONNECT method and then proceed normally inside this tunnel, that is, do the SSL handshake and then send a normal non-proxy request inside the tunnel, e.g.
> CONNECT www.google.com:443 HTTP/1.1
>
< .. read response to CONNECT request, must be 200 ...
.. establish the TLS connection inside the tunnel
> GET / HTTP/1.1
> Host: www.google.com
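The two request forms above can be sketched as byte-building helpers. The function names are illustrative only. The tunnel check accepts any 2xx status, since that is what signals a successful CONNECT; in practice it is almost always 200.

```python
from urllib.parse import urlsplit


def http_proxy_request(url):
    """Build a GET request in proxy form: absolute URL in the request line."""
    parts = urlsplit(url)
    absolute = "%s://%s%s" % (parts.scheme, parts.netloc, parts.path or "/")
    if parts.query:
        absolute += "?" + parts.query
    return ("GET %s HTTP/1.1\r\n"
            "Host: %s\r\n"
            "Connection: close\r\n"
            "\r\n" % (absolute, parts.netloc)).encode("ascii")


def connect_request(host, port=443):
    """Build the CONNECT request that asks the proxy to open a tunnel."""
    return ("CONNECT %s:%d HTTP/1.1\r\n"
            "Host: %s:%d\r\n"
            "\r\n" % (host, port, host, port)).encode("ascii")


def tunnel_established(response_head):
    """Return True if the proxy answered the CONNECT with a 2xx status."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return len(parts) >= 2 and len(parts[1]) == 3 and parts[1].startswith(b"2")
```

After tunnel_established returns True, the same socket would be wrapped with TLS (for example with ssl.SSLContext.wrap_socket and server_hostname set to the target host) before sending the plain GET / request.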
Python 3 requires the request to be encoded to bytes. Expanding on David's original code, combined with Steffen's answer, here is the solution written for Python 3:
import socket
import sys

def connectThroughProxy():
    # Proxy-form request: absolute URL in the request line, CRLF line endings.
    headers = "GET http://www.example.org HTTP/1.1\r\nHost: www.example.org\r\n\r\n"
    host = "192.97.215.348" #proxy server IP
    port = 8080 #proxy server port
    s = socket.socket()
    try:
        s.connect((host,port))
        s.send(headers.encode('utf-8'))
        response = s.recv(3000)
        print (response)
        s.close()
    except socket.error as m:
        print (str(m))
        s.close()
        sys.exit(1)
This allows me to connect to the example.org host through my corporate proxy (at least for non SSL/TLS connections).
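One caveat with a single recv call: TCP may deliver the response in several pieces, so recv(3000) can return only part of it. Here is a minimal sketch of reading until the end of the header block. The helper name recv_headers and the 64 KiB limit are arbitrary choices for illustration.

```python
import socket


def recv_headers(sock, limit=65536):
    """Read from sock until the blank line that ends the HTTP headers.

    Returns (head, rest): the raw header block without the terminating
    blank line, and any body bytes that arrived in the same reads.
    """
    data = b""
    while b"\r\n\r\n" not in data and len(data) < limit:
        chunk = sock.recv(4096)
        if not chunk:  # peer closed the connection
            break
        data += chunk
    head, _, rest = data.partition(b"\r\n\r\n")
    return head, rest
```

In the function above, response = s.recv(3000) could be replaced by head, rest = recv_headers(s), with the remaining body read according to the Content-Length header.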
I'm trying to create a TCP socket server in Python that, after receiving a string of bytes from a client, passes the received data (without knowing what is actually inside, assuming it is a valid HTTP request) to an HTTP or HTTPS proxy and waits for the result. My code looks like this:
import socket
def test(host, port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((socket.gethostbyname(host), int(port)))
    msg = """GET / HTTP/1.1
Host: www.bing.com
User-Agent: Firefox
"""
    sent_count = sock.send(msg.encode())
    recv_value = sock.recv(2048)
    print('received:')
    print(recv_value)
if __name__ == '__main__':
    test(host='x.x.x.xx', port='80') # a https proxy server
    test(host='yy.yy.yy.yy', port='80') # a http proxy server
But when I connect to the HTTP proxy server, it returns something like:
HTTP/1.1 404 Not Found
And when I connect to the HTTPS proxy server, it shows something like:
HTTP/1.0 400 Bad Request
So I wanted to ask whether anybody knows how I could send HTTP requests to HTTP/HTTPS servers via sockets in Python, or how I can redirect arbitrary strings of data to HTTP/HTTPS proxy servers in general using sockets. Any suggestions are very much appreciated, thanks in advance.
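One likely cause of the 404 is that the request line carries a relative path (GET /), while an HTTP proxy expects the absolute URL. Here is a minimal sketch of rewriting a received request into proxy form before forwarding it. It assumes a plain-HTTP target, CRLF line endings, and a Host header in the request; the function name to_proxy_form is made up.

```python
def to_proxy_form(raw_request):
    """Rewrite 'GET /path' into 'GET http://host/path' using the Host header.

    Requests whose target is already absolute are returned unchanged.
    """
    head, sep, body = raw_request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    method, target, version = lines[0].split(b" ", 2)
    if target.startswith(b"/"):
        host = None
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"host":
                host = value.strip()
                break
        if host is None:
            raise ValueError("request has no Host header")
        lines[0] = b" ".join([method, b"http://" + host + target, version])
    return b"\r\n".join(lines) + sep + body
```

For HTTPS targets this rewrite does not apply: the proxy needs a CONNECT request first, and the original bytes are then sent through the tunnel after the TLS handshake.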