Easiest way to use HTTPS through a proxy [duplicate] - python

I am trying to use an HTTPS proxy in Python like this:
proxiesDict = {
    'http': 'http://' + proxy_line,
    'https': 'https://' + proxy_line
}
response = requests.get('https://api.ipify.org/?format=json', proxies=proxiesDict, allow_redirects=False)
proxy_line is a proxy read from a file in ip:port format. I checked this HTTPS proxy in a browser and it works. But in Python this code hangs for a few seconds and then I get an exception:
HTTPSConnectionPool(host='api.ipify.org', port=443): Max retries exceeded with url: /?format=json (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0425E450>: Failed to establish a new connection: [WinError 10060]
I also tried SOCKS5 proxies, and those work with PySocks installed. But for HTTPS proxies I get this exception. Can someone help me?

When specifying proxies for requests, the key is the protocol of the URL being requested, and the value is the address (domain or IP) of the proxy. You don't need to repeat http:// or https:// in the value.
So, your proxiesDict will be:
proxiesDict = {
    'http': proxy_line,
    'https': proxy_line
}
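A minimal sketch of building that dictionary from an ip:port line. It assumes (the question does not confirm this) that the proxy is an ordinary HTTP proxy that tunnels HTTPS traffic with CONNECT, so the http:// scheme is used for both keys; the scheme in the value describes how to talk to the proxy, not the site being fetched. build_proxies is a hypothetical helper name.

```python
def build_proxies(proxy_line, proxy_scheme="http"):
    """Return a requests-style proxies dict for an 'ip:port' line.

    proxy_scheme is the protocol spoken to the proxy itself
    (assumed to be plain HTTP here), not the scheme of the target URL.
    """
    proxy_line = proxy_line.strip()  # lines read from a file keep their newline
    host, sep, port = proxy_line.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected 'ip:port', got %r" % proxy_line)
    proxy_url = "%s://%s" % (proxy_scheme, proxy_line)
    return {"http": proxy_url, "https": proxy_url}


proxies_dict = build_proxies("10.0.0.1:3128\n")
print(proxies_dict)
```

The result can be passed as proxies=proxies_dict to requests.get, ideally together with a timeout argument so that a dead proxy fails quickly instead of hanging.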

You can also configure proxies by setting the environment variables:
$ export HTTP_PROXY="http://proxyIP:PORT"
$ export HTTPS_PROXY="http://proxyIP:PORT"
Then you only need to run your Python script, without passing proxies in the request.
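To check which proxies Python actually picks up from the environment, the standard library's urllib.request.getproxies_environment() can be called directly. A small sketch: the proxy address is a placeholder, and the variables are set inside the script only to keep the example self-contained (normally they would be exported in the shell as above). Lowercase names are used here because urllib prefers them when both spellings are set.

```python
import os
import urllib.request

# Placeholder address (an assumption); normally exported in the shell instead.
os.environ["http_proxy"] = "http://proxy.example:3128"
os.environ["https_proxy"] = "http://proxy.example:3128"


def env_proxies():
    """Return the http/https proxies found in the environment variables."""
    found = urllib.request.getproxies_environment()
    return {scheme: url for scheme, url in found.items()
            if scheme in ("http", "https")}


print(env_proxies())
```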
Also, you can include credentials in the proxy URL: http://user:password@host
For more information see this documentation: http://docs.python-requests.org/en/master/user/advanced/
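If the user name or password contains characters such as @ or :, the proxy URL becomes ambiguous, and percent-encoding the credentials is the usual way around that. A minimal sketch; proxy_url_with_auth is a hypothetical helper and the host, port and credentials are made up.

```python
from urllib.parse import quote


def proxy_url_with_auth(user, password, host, port, scheme="http"):
    """Build scheme://user:password@host:port with the credentials percent-encoded."""
    return "%s://%s:%s@%s:%d" % (
        scheme,
        quote(user, safe=""),      # safe="" also encodes '/' and ':'
        quote(password, safe=""),
        host,
        port,
    )


print(proxy_url_with_auth("user", "p@ss:word", "10.0.0.1", 3128))
# prints http://user:p%40ss%3Aword@10.0.0.1:3128
```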

Try using pycurl; this function may help (written for Python 3):
import pycurl
from io import BytesIO

def pycurl_downloader(url, proxy_url, proxy_usr):
    """
    Download files with pycurl
    the proxy configuration:
    proxy_url = 'http://10.0.0.0:3128'
    proxy_usr = 'user:password'
    """
    c = pycurl.Curl()
    c.setopt(pycurl.FOLLOWLOCATION, 1)
    c.setopt(pycurl.MAXREDIRS, 5)
    c.setopt(pycurl.CONNECTTIMEOUT, 30)
    c.setopt(pycurl.AUTOREFERER, 1)
    if proxy_url: c.setopt(pycurl.PROXY, proxy_url)
    if proxy_usr: c.setopt(pycurl.PROXYUSERPWD, proxy_usr)
    # pycurl hands the response body to the callback as bytes
    content = BytesIO()
    c.setopt(pycurl.URL, url)
    c.setopt(c.WRITEFUNCTION, content.write)
    try:
        c.perform()
    except pycurl.error as error:
        errno, errstr = error.args
        print('An error occurred: ', errstr)
    finally:
        # release the handle whether or not the transfer succeeded
        c.close()
    return content.getvalue()

Related

proxy works fine with http but not https

I want to use proxies with Python requests, but when I run code like this:
req = requests.get("https://httpbin.org/ip", proxies={'https': 'user:pass@host:port',
                                                      'http': 'user:pass@host:port'})
print(req.content)
I get this error:
HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ProxyError('Cannot connect to proxy.', TimeoutError(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond', None, 10060, None
But if I use "http://httpbin.org/ip" instead of "https://httpbin.org/ip", it works fine.
Also, if I run this code:
proxies = {'http': 'user:pass@host:port'}
req = requests.get("https://lumtest.com/myip.json", proxies=proxies)
print(req.content)
I get my own IP address, which means the proxy is not being used. But if I request the same URL over HTTP (without the s in https):
proxies = {'http': 'user:pass@host:port'}
req = requests.get("http://lumtest.com/myip.json", proxies=proxies)
print(req.content)
I get the IP of the proxy, which means it is working.
I wouldn't mind switching from HTTPS to HTTP, but on some websites, when I use the proxies over HTTP, I get a different response. I get this:
b''
instead of the response I wanted, which I do get without proxies over either HTTPS or HTTP.
With the proxies, the request only goes through over HTTP, and it doesn't return a valid response.
I hope someone can help me because I have been trying to solve this forever.
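The "own IP over HTTPS" result above is consistent with the proxy being chosen by the scheme of the requested URL: a dictionary with only an 'http' key has no entry for an https:// URL, so the request goes out directly. The following is a simplified sketch of that lookup, written from my understanding of how requests selects a proxy; pick_proxy is a hypothetical helper and not part of requests.

```python
from urllib.parse import urlsplit


def pick_proxy(url, proxies):
    """Return the proxy entry that applies to url, or None for a direct connection."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    candidates = [
        parts.scheme + "://" + host,  # most specific: scheme and host
        parts.scheme,                 # then the scheme alone
        "all://" + host,
        "all",
    ]
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None


proxies = {"http": "user:pass@host:port"}  # placeholder value from the question
print(pick_proxy("http://lumtest.com/myip.json", proxies))   # the proxy entry
print(pick_proxy("https://lumtest.com/myip.json", proxies))  # None
```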

Python: How can I use urllib or requests modules from a corporate domain (firewall, proxy, cntlm etc)

I am trying to do the following:
from urllib.request import urlopen
data = urlopen("https://www.duolingo.com/users/SaifullahS6").read()
I get the following error:
URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
Similarly, when I try this:
import requests
session = requests.Session()
data = {"login": "SaifullahS6", "password": "mypassword"}
req = requests.Request('POST', "https://www.duolingo.com/login", data=data,
cookies=session.cookies)
prepped=req.prepare()
returned = session.send(prepped)
I get:
ConnectionError: HTTPSConnectionPool(host='www.duolingo.com', port=443): Max retries exceeded with url: /login (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x000000000E6948D0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))
I am not sure how to give details of my internet connection.
I'm at work and I know we have a corporate proxy.
We have Windows Firewall turned on, but I have checked that python and pythonw are ticked in the "Domain" column of the control panel for allowing a program through the firewall.
When I ping google.co.uk from a command shell, all four requests time out, but I can access it from a browser.
In the Internet Options control panel, I click on the Connections tab and then LAN settings, and I have "Automatically detect settings" turned on, and also "Use a proxy server for your LAN", "Address" is "localhost" and "Port" is 3128. This is cntlm. I set it up once to download Python packages, and it appears to still be active because I have just managed to update one of my packages.
I don't even need a direct answer to my question; at this point I'll just settle for some clarity on what is actually going on behind the scenes. Any help much appreciated!
For the first case above (urllib module), I solved it by inserting the following lines before the data = urlopen(...).read() line:
import urllib.request
proxies = {"http": "localhost:3128",
           "https": "localhost:3128"}
proxy = urllib.request.ProxyHandler(proxies)
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)
For the second case (requests module), everything was the same except the last line:
proxies = { "http": "localhost:3128",
"https": "localhost:3128"}
returned = session.send(prepped, proxies=proxies)
Hope this note helps others who come across this page.
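If installing a global opener is undesirable, the same ProxyHandler can be kept local to one opener object and used through its open() method. A minimal sketch, assuming the cntlm address localhost:3128 from the question; make_proxy_opener is a hypothetical helper name.

```python
import urllib.request


def make_proxy_opener(proxy_address="localhost:3128"):
    """Build an opener that sends http and https requests through one proxy."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_address, "https": proxy_address}
    )
    # build_opener uses this handler in place of its default ProxyHandler
    return urllib.request.build_opener(handler)


opener = make_proxy_opener()
# opener.open(url).read() would then go through the proxy,
# while plain urlopen() calls elsewhere stay unaffected.
```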

Fetching a .onion domain with requests

I'm trying to access the following domain nzxj65x32vh2fkhk.onion using requests.
I have tor running and I configured the session's object proxies correctly.
import requests
session = requests.session()
session.proxies = {'http': 'socks5://localhost:9050',
'https': 'socks5://localhost:9050'}
print(session.get('http://httpbin.org/ip').text) # prints {"origin": "67.205.146.164" }
print(requests.get('http://httpbin.org/ip').text) # prints {"origin": "5.102.254.76" }
However when I try to access the URL with the .onion domain I get the following error:
session.get('http://nzxj65x32vh2fkhk.onion/all')
ConnectionError: SOCKSHTTPConnectionPool(host='nzxj65x32vh2fkhk.onion', port=80): Max retries exceeded with url: /all (Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x7f5e8c2dbbd0>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I also tried to replace localhost with 127.0.0.1 as suggested in one of the answers. The result is the same unfortunately.
Performing the same request using urllib2 works just fine.
import socks, socket, urllib2
def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9050)
socket.socket = socks.socksocket
socket.create_connection = create_connection
print(urllib2.urlopen('http://nzxj65x32vh2fkhk.onion/all').read()) # Prints the URL's contents
cURL also retrieves the contents of the page correctly.
I'm using Python 2.7.13, requests 2.13.0 & PySocks 1.6.7. Tor is running through a docker container with the following command:
sudo docker run -it -p 8118:8118 -p 9050:9050 -d dperson/torproxy
What am I doing wrong here? What do I need to do to make requests recognize the .onion URLs?
The solution is to use the socks5h scheme, which makes the SOCKS proxy (here, Tor) resolve the hostname instead of resolving it locally. Local DNS cannot resolve .onion names, which is why the socks5 scheme fails with "Name or service not known". See https://github.com/kennethreitz/requests/blob/e3f89bf23c53b98593e4248054661472aacac820/requests/packages/urllib3/contrib/socks.py#L158
The following code works as expected:
import requests
session = requests.session()
session.proxies = {'http': 'socks5h://localhost:9050',
'https': 'socks5h://localhost:9050'}
print(session.get('http://httpbin.org/ip').text) # prints {"origin": "67.205.146.164" }
print(requests.get('http://httpbin.org/ip').text) # prints {"origin": "5.102.254.76" }
print(session.get('http://nzxj65x32vh2fkhk.onion/all').text) # Prints the contents of the page
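When an existing proxies dictionary already uses socks5://, the switch to remote DNS resolution is only a change of scheme. A small sketch; use_remote_dns is a hypothetical helper, and it leaves any non-socks5 entries untouched.

```python
def use_remote_dns(proxies):
    """Return a copy of proxies with socks5:// entries rewritten to socks5h://."""
    prefix = "socks5://"
    upgraded = {}
    for key, url in proxies.items():
        if url.startswith(prefix):
            # socks5h asks the proxy to resolve hostnames (needed for .onion)
            url = "socks5h://" + url[len(prefix):]
        upgraded[key] = url
    return upgraded


print(use_remote_dns({"http": "socks5://localhost:9050",
                      "https": "socks5://localhost:9050"}))
```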

Request Max Retries TOR

I am trying to connect to TOR's localhost loopback and send data through it.
The address I am using is:
127.0.0.1:9050
I am using the following script to do this:
import requesocks, requests
session = requesocks.session()
session.proxies = {'http': 'socks5://127.0.0.1:9050',
'https': 'socks5://127.0.0.1:9050'}
print session.get("https://api.ipify.org?format=json").json()
It is supposed to retrieve my IP and print it. However, it gives the following error:
Max retries exceeded with url: https://api.ipify.org/?format=json
I can verify that TOR is up and running. What could be the problem raising this exception?
I got it working. I had to install the "expert" installer and add the exe to my PATH. Thank you
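Before digging into the library, it can help to confirm that something is actually listening on the proxy port, since "Max retries exceeded" is also what a closed or unreachable port produces. A minimal sketch using only the standard library; port_is_open is a hypothetical helper, and it only checks that a TCP connection succeeds, not that the listener speaks SOCKS.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


# Tor's default SOCKS port, as used in the question
print(port_is_open("127.0.0.1", 9050))
```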

How to make python Requests work via SOCKS proxy

I'm using the great Requests library in my Python script:
import requests
r = requests.get("http://example.com")
print(r.text)
I would like to use a SOCKS proxy, how can I do that? Requests seems to only support HTTP proxies.
The modern way:
pip install -U 'requests[socks]'
then
import requests
resp = requests.get('http://go.to',
                    proxies=dict(http='socks5://user:pass@host:port',
                                 https='socks5://user:pass@host:port'))
In case someone has tried all of these older answers, and is still running into problems like:
requests.exceptions.ConnectionError:
SOCKSHTTPConnectionPool(host='myhost', port=80):
Max retries exceeded with url: /my/path
(Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x106812bd0>:
Failed to establish a new connection:
[Errno 8] nodename nor servname provided, or not known',))
It may be because, by default, requests is configured to resolve DNS queries on the local side of the connection.
Try changing your proxy URL from socks5://proxyhost:1234 to socks5h://proxyhost:1234. Note the extra h (it stands for hostname resolution).
The PySocks package module default is to do remote resolution, and I'm not sure why requests made their integration this obscurely divergent, but here we are.
As of requests version 2.10.0, released on 2016-04-29, requests supports SOCKS.
It requires PySocks, which can be installed with pip install pysocks.
Example usage:
import requests
proxies = {'http': "socks5://myproxy:9191"}
requests.get('http://example.org', proxies=proxies)
You need to install pysocks; my version is 1.0, and this code works for me:
import socket
import socks
import requests
ip='localhost' # change your proxy's ip
port = 0000 # change your proxy's port
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, ip, port)
socket.socket = socks.socksocket
url = u'http://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=inurl%E8%A2%8B'
print(requests.get(url).text)
Once the SOCKS5 pull request is merged into python requests, it will be as simple as using the proxies dictionary:
Update: the PR has already been merged.
#proxy
# SOCKS5 proxy for HTTP/HTTPS
proxies = {
'http' : "socks5://myproxy:9191",
'https' : "socks5://myproxy:9191"
}
#headers
headers = {
}
url='http://example.com/'
res = requests.get(url, headers=headers, proxies=proxies)
See SOCKS Proxy Support
Another option, in case you cannot wait for requests to be ready and cannot use requesocks (for example on Google App Engine, which lacks the built-in pwd module), is to use PySocks, which was mentioned above:
Grab the socks.py file from the repo and put a copy in your root folder;
Add import socks and import socket;
Then configure and bind the socket before using it with urllib2, as in the following example:
import urllib2
import socket
import socks
socks.set_default_proxy(socks.SOCKS5, "myprivateproxy.example",port=9050)
socket.socket = socks.socksocket
res=urllib2.urlopen(url).read()
You can just run your script with the https_proxy environment variable set.
Install SOCKS support if necessary:
pip install PySocks
pip install pysocks5
Set the environment variable:
export https_proxy=socks5://<hostname or ip>:<port>
Run your script. This example makes a request through the proxy and shows the IP address:
echo Your real IP
python -c 'import requests;print(requests.get("http://ipinfo.io/ip").text)'
echo IP with socks-proxy
python -c 'import requests;print(requests.get("https://ipinfo.io/ip").text)'
# SOCKS5 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks5://1.2.3.4:1080",
'https' : "socks5://1.2.3.4:1080"
}
# SOCKS4 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks4://1.2.3.4:1080",
'https' : "socks4://1.2.3.4:1080"
}
# HTTP proxy for HTTP/HTTPS
proxiesDict = {
'http' : "1.2.3.4:1080",
'https' : "1.2.3.4:1080"
}
I installed pysocks and monkey patched create_connection in urllib3, like this:
import socks
import socket
import urllib3
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS4, "127.0.0.1", 1080)
def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None):
    """Connect to *address* and return the socket object.
    Convenience function. Connect to *address* (a 2-tuple ``(host,
    port)``) and return the socket object. Passing the optional
    *timeout* parameter will set the timeout on the socket instance
    before attempting to connect. If no *timeout* is supplied, the
    global default timeout setting returned by :func:`getdefaulttimeout`
    is used. If *source_address* is set it must be a tuple of (host, port)
    for the socket to bind as a source address before making the connection.
    An host of '' or port 0 tells the OS to use the default.
    """
    host, port = address
    if host.startswith('['):
        host = host.strip('[]')
    err = None
    for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
        af, socktype, proto, canonname, sa = res
        sock = None
        try:
            sock = socks.socksocket(af, socktype, proto)
            # If provided, set socket level options before connecting.
            # This is the only addition urllib3 makes to this function.
            urllib3.util.connection._set_socket_options(sock, socket_options)
            if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                sock.settimeout(timeout)
            if source_address:
                sock.bind(source_address)
            sock.connect(sa)
            return sock
        except socket.error as e:
            err = e
            if sock is not None:
                sock.close()
                sock = None
    if err is not None:
        raise err
    raise socket.error("getaddrinfo returns an empty list")
# monkeypatch
urllib3.util.connection.create_connection = create_connection
This worked for me on Linux:
$ pip3 install --user 'requests[socks]'
$ https_proxy=socks5://<hostname or ip>:<port> python3 -c \
> 'import requests;print(requests.get("https://httpbin.org/ip").text)'
