Fetching a .onion domain with requests - python

I'm trying to access the following domain nzxj65x32vh2fkhk.onion using requests.
I have tor running and I configured the session's object proxies correctly.
import requests
session = requests.session()
session.proxies = {'http': 'socks5://localhost:9050',
'https': 'socks5://localhost:9050'}
print(session.get('http://httpbin.org/ip').text) # prints {"origin": "67.205.146.164" }
print(requests.get('http://httpbin.org/ip').text) # prints {"origin": "5.102.254.76" }
However, when I try to access the URL with the .onion domain, I get the following error:
session.get('http://nzxj65x32vh2fkhk.onion/all')
ConnectionError: SOCKSHTTPConnectionPool(host='nzxj65x32vh2fkhk.onion', port=80): Max retries exceeded with url: /all (Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x7f5e8c2dbbd0>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I also tried to replace localhost with 127.0.0.1 as suggested in one of the answers. The result is the same unfortunately.
Performing the same request using urllib2 works just fine.
import socks, socket, urllib2
def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9050)
socket.socket = socks.socksocket
socket.create_connection = create_connection
print(urllib2.urlopen('http://nzxj65x32vh2fkhk.onion/all').read()) # Prints the URL's contents
cURL also retrieves the contents of the page correctly.
I'm using Python 2.7.13, requests 2.13.0 & PySocks 1.6.7. Tor is running through a docker container with the following command:
sudo docker run -it -p 8118:8118 -p 9050:9050 -d dperson/torproxy
What am I doing wrong here? What do I need to do to make requests recognize the .onion URLs?

The solution is to use the socks5h scheme, which makes the proxy (here, Tor) resolve the hostname instead of the local machine. A .onion name cannot be resolved by the local DNS resolver, which is why the plain socks5 scheme fails with "Name or service not known". See https://github.com/kennethreitz/requests/blob/e3f89bf23c53b98593e4248054661472aacac820/requests/packages/urllib3/contrib/socks.py#L158
The following code works as expected:
import requests
session = requests.session()
session.proxies = {'http': 'socks5h://localhost:9050',
'https': 'socks5h://localhost:9050'}
print(session.get('http://httpbin.org/ip').text) # prints {"origin": "67.205.146.164" }
print(requests.get('http://httpbin.org/ip').text) # prints {"origin": "5.102.254.76" }
print(session.get('http://nzxj65x32vh2fkhk.onion/all').text) # Prints the contents of the page
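As a rough illustration of how the proxy scheme decides where DNS resolution happens, the sketch below maps each SOCKS scheme to a "remote DNS" flag, following the scheme handling in the urllib3 file linked above as I understand it. The names REMOTE_DNS_BY_SCHEME and resolves_on_proxy are made up for this example; they are not part of requests or urllib3.

```python
from urllib.parse import urlparse

# Proxy URL scheme -> True if the proxy resolves the hostname (remote DNS).
# Hypothetical table for illustration; not an API of requests or urllib3.
REMOTE_DNS_BY_SCHEME = {
    "socks5": False,   # hostname resolved locally, then the IP is sent to the proxy
    "socks5h": True,   # hostname sent to the proxy, which resolves it
    "socks4": False,
    "socks4a": True,
}


def resolves_on_proxy(proxy_url):
    """Return True if the given SOCKS proxy URL implies remote DNS resolution."""
    scheme = urlparse(proxy_url).scheme
    if scheme not in REMOTE_DNS_BY_SCHEME:
        raise ValueError("not a SOCKS proxy URL: %r" % proxy_url)
    return REMOTE_DNS_BY_SCHEME[scheme]


print(resolves_on_proxy("socks5://localhost:9050"))
print(resolves_on_proxy("socks5h://localhost:9050"))
```

Only the schemes that resolve on the proxy can work for .onion hosts, since no local resolver knows those names.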

Related

proxy works fine with http but not https

I want to use proxies with python requests, but when I run code like this
req = requests.get("https://httpbin.org/ip", proxies={'https': 'user:pass@host:port',
                                                      'http': 'user:pass@host:port'})
print(req.content)
I get this error
HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ProxyError('Cannot connect to proxy.', TimeoutError(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond', None, 10060, None
but if I use "http://httpbin.org/ip" instead of "https://httpbin.org/ip", it works fine.
Similarly, if I run this code
proxies = { 'http' : 'user:pass@host:port' }
req = requests.get("https://lumtest.com/myip.json", proxies=proxies)
print(req.content)
I get my own IP address, which means the proxy is not being used. But if I request the same URL over HTTP (without the s in https)
proxies = { 'http' : 'user:pass@host:port' }
req = requests.get("http://lumtest.com/myip.json", proxies=proxies)
print(req.content)
I get the IP of the proxy, which means it is working.
I don't mind switching between HTTP and HTTPS, but on some websites, when I use the proxy over HTTP, I get an empty response:
b''
Without a proxy, those sites return the response I expect over both HTTP and HTTPS. With the proxy, the request only goes through over HTTP, and then the response is not valid.
I hope someone can help me because I have been trying to solve this forever.
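One likely explanation for the lumtest.com result: requests looks up the proxy by the scheme of the URL being requested, so a dictionary that only has an 'http' key leaves https:// requests unproxied. The sketch below is a deliberately simplified model of that lookup (the real lookup in requests also considers other keys, such as host-specific ones); pick_proxy is a hypothetical helper, not a requests function.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Return the proxy configured for the URL's scheme, or None for a direct connection.

    Simplified model for illustration only; not the actual requests implementation.
    """
    return proxies.get(urlparse(url).scheme)


proxies = {"http": "http://user:pass@host:port"}  # placeholder proxy address

# The http:// URL matches the 'http' key, so the proxy is used.
print(pick_proxy("http://lumtest.com/myip.json", proxies))
# The https:// URL has no matching key, so the request would go out directly.
print(pick_proxy("https://lumtest.com/myip.json", proxies))
```

Under this model, adding an 'https' key to the dictionary is what routes https:// requests through the proxy.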

Easiest way to use HTTPS through a proxy [duplicate]

I am trying to use an https proxy in python like this:
proxiesDict ={
'http': 'http://' + proxy_line,
'https': 'https://' + proxy_line
}
response = requests.get('https://api.ipify.org/?format=json', proxies=proxiesDict, allow_redirects=False)
proxy_line is a proxy read from a file in the format ip:port. I checked this https proxy in a browser and it works. But in python this code hangs for a few seconds and then I get an exception:
HTTPSConnectionPool(host='api.ipify.org', port=443): Max retries exceeded with url: /?format=json (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0425E450>: Failed to establish a new connection: [WinError 10060]
I also tried socks5 proxies, and those work with PySocks installed. But for https I get this exception. Can someone help me?
When specifying proxies for requests, the key is the scheme of the URL being requested, and the value is the address of the proxy. You don't need to prefix the value with http:// or https:// again.
So, your proxiesDict will be:
proxiesDict = {
'http': proxy_line,
'https': proxy_line
}
You can also configure proxies by setting the environment variables:
$ export HTTP_PROXY="http://proxyIP:PORT"
$ export HTTPS_PROXY="http://proxyIP:PORT"
Then you only need to run your python script, without passing proxies to the request.
If the proxy requires authentication, you can put the credentials in the URL: http://user:password@host
For more information see this documentation: http://docs.python-requests.org/en/master/user/advanced/
Try using pycurl. This function may help:
import pycurl
from io import BytesIO
def pycurl_downloader(url, proxy_url, proxy_usr):
    """
    Download files with pycurl
    the proxy configuration:
    proxy_url = 'http://10.0.0.0:3128'
    proxy_usr = 'user:password'
    """
    c = pycurl.Curl()
    c.setopt(pycurl.FOLLOWLOCATION, 1)
    c.setopt(pycurl.MAXREDIRS, 5)
    c.setopt(pycurl.CONNECTTIMEOUT, 30)
    c.setopt(pycurl.AUTOREFERER, 1)
    if proxy_url: c.setopt(pycurl.PROXY, proxy_url)
    if proxy_usr: c.setopt(pycurl.PROXYUSERPWD, proxy_usr)
    content = BytesIO()  # pycurl writes the response body as bytes
    c.setopt(pycurl.URL, url)
    c.setopt(c.WRITEFUNCTION, content.write)
    try:
        c.perform()
    except pycurl.error as error:
        errno, errstr = error.args
        print('An error occurred: ', errstr)
    finally:
        c.close()  # release the handle whether or not the transfer succeeded
    return content.getvalue()

Request Max Retries TOR

I am trying to connect to TOR's localhost loopback and send data through it.
The address I am using is:
127.0.0.1:9050
I am using the following script to do this:
import requesocks, requests
session = requesocks.session()
session.proxies = {'http': 'socks5://127.0.0.1:9050',
'https': 'socks5://127.0.0.1:9050'}
print session.get("https://api.ipify.org?format=json").json()
It is supposed to retrieve my IP and print it. However, it gives the following error:
Max retries exceeded with url: https://api.ipify.org/?format=json
I can verify that TOR is up and running. What could be the problem raising this exception?
I got it working. I had to install the "expert" installer and add the exe to my PATH. Thank you

python simple_salesforce proxy usage

I'm using python simple_salesforce module from this example: https://pypi.python.org/pypi/simple-salesforce. Specifically:
proxies = {
"http": "http://10.10.1.10:3128"
}
from simple_salesforce import Salesforce
sf = Salesforce(username='myemail@example.com.sandbox', password='password', security_token='token', sandbox=True, proxies=proxies)
It's failing with the error below.
requests.exceptions.ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))
If I don't use the proxy, it works fine. My requirement is to enable the proxy.
Any suggestions?
Adding the following to the beginning of the program solves this problem.
(I had previously been using urllib2 in python, which takes care of forwarding the request through the proxy.)
If the hostname and port of your proxy are xyz1-pqr01.abc.company.com and 3128, then:
import os
os.environ['http_proxy'] = 'http://xyz1-pqr01.abc.company.com:3128'
os.environ['https_proxy'] = 'http://xyz1-pqr01.abc.company.com:3128'
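To see that variables set this way are visible to proxy-aware code in the same process, here is a minimal sketch using only the standard library's urllib.request.getproxies(), which reads the <scheme>_proxy environment variables. It assumes Python 3 and reuses the example hostname from above; it does not contact the proxy.

```python
import os
import urllib.request

# Same settings as above, applied before any request is made.
os.environ['http_proxy'] = 'http://xyz1-pqr01.abc.company.com:3128'
os.environ['https_proxy'] = 'http://xyz1-pqr01.abc.company.com:3128'

# getproxies() returns a mapping of scheme -> proxy URL.
proxies = urllib.request.getproxies()
print(proxies.get('http'))
print(proxies.get('https'))
```

By default requests also honours these environment variables, so libraries built on it should pick up the proxy without a proxies argument.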

How to make python Requests work via SOCKS proxy

I'm using the great Requests library in my Python script:
import requests
r = requests.get("http://example.com")
print(r.text)
I would like to use a SOCKS proxy, how can I do that? Requests seems to only support HTTP proxies.
The modern way:
pip install -U 'requests[socks]'
then
import requests
resp = requests.get('http://go.to',
                    proxies=dict(http='socks5://user:pass@host:port',
                                 https='socks5://user:pass@host:port'))
In case someone has tried all of these older answers, and is still running into problems like:
requests.exceptions.ConnectionError:
SOCKSHTTPConnectionPool(host='myhost', port=80):
Max retries exceeded with url: /my/path
(Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x106812bd0>:
Failed to establish a new connection:
[Errno 8] nodename nor servname provided, or not known',))
It may be because, by default, requests is configured to resolve DNS queries on the local side of the connection.
Try changing your proxy URL from socks5://proxyhost:1234 to socks5h://proxyhost:1234. Note the extra h (it stands for hostname resolution).
The PySocks package module default is to do remote resolution, and I'm not sure why requests made their integration this obscurely divergent, but here we are.
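A minimal sketch of that change, for code that already holds a proxies dictionary: the helper below returns a copy with every socks5:// URL rewritten to socks5h://. The name use_remote_dns is hypothetical and exists only for this example.

```python
def use_remote_dns(proxies):
    """Return a copy of proxies with socks5:// URLs rewritten to socks5h://."""
    prefix = "socks5://"
    fixed = {}
    for scheme, proxy_url in proxies.items():
        if proxy_url.startswith(prefix):
            # Keep host, port and credentials; only the scheme changes.
            proxy_url = "socks5h://" + proxy_url[len(prefix):]
        fixed[scheme] = proxy_url
    return fixed


proxies = use_remote_dns({
    "http": "socks5://proxyhost:1234",
    "https": "socks5://proxyhost:1234",
})
print(proxies)
```

The resulting dictionary can be passed as the proxies argument to requests.get or assigned to session.proxies.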
As of requests version 2.10.0, released on 2016-04-29, requests supports SOCKS.
It requires PySocks, which can be installed with pip install pysocks.
Example usage:
import requests
proxies = {'http': "socks5://myproxy:9191"}
requests.get('http://example.org', proxies=proxies)
You need to install pysocks. My version is 1.0 and this code works for me:
import socket
import socks
import requests
ip='localhost' # change your proxy's ip
port = 0000 # change your proxy's port
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, ip, port)
socket.socket = socks.socksocket
url = u'http://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=inurl%E8%A2%8B'
print(requests.get(url).text)
As soon as the SOCKS5 pull request is merged into python requests, it will be as simple as using the proxies dictionary:
Update: PR was already merged.
#proxy
# SOCKS5 proxy for HTTP/HTTPS
proxies = {
'http' : "socks5://myproxy:9191",
'https' : "socks5://myproxy:9191"
}
#headers
headers = {
}
url='http://example.com/'
res = requests.get(url, headers=headers, proxies=proxies)
See SOCKS Proxy Support
Another option, in case you cannot wait for requests to be ready and cannot use requesocks (for example on GoogleAppEngine, which lacks the pwd built-in module), is to use PySocks, which was mentioned above:
Grab the socks.py file from the repo and put a copy in your root folder;
Add import socks and import socket;
Then configure and bind the socket before using it with urllib2, as in the following example:
import urllib2
import socket
import socks
socks.set_default_proxy(socks.SOCKS5, "myprivateproxy.example",port=9050)
socket.socket = socks.socksocket
res=urllib2.urlopen(url).read()
You can just run your script with the https_proxy environment variable set.
Install socks support if necessary:
pip install PySocks
pip install pysocks5
Set the environment variable:
export https_proxy=socks5://<hostname or ip>:<port>
Run your script. This example makes requests with and without the proxy and shows the IP address each time:
echo Your real IP
python -c 'import requests;print(requests.get("http://ipinfo.io/ip").text)'
echo IP with socks-proxy
python -c 'import requests;print(requests.get("https://ipinfo.io/ip").text)'
# SOCKS5 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks5://1.2.3.4:1080",
'https' : "socks5://1.2.3.4:1080"
}
# SOCKS4 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks4://1.2.3.4:1080",
'https' : "socks4://1.2.3.4:1080"
}
# HTTP proxy for HTTP/HTTPS
proxiesDict = {
'http' : "1.2.3.4:1080",
'https' : "1.2.3.4:1080"
}
I installed pysocks and monkey patched create_connection in urllib3, like this:
import socks
import socket
import urllib3
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS4, "127.0.0.1", 1080)
def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
                      source_address=None, socket_options=None):
    """Connect to *address* and return the socket object.
    Convenience function. Connect to *address* (a 2-tuple ``(host,
    port)``) and return the socket object. Passing the optional
    *timeout* parameter will set the timeout on the socket instance
    before attempting to connect. If no *timeout* is supplied, the
    global default timeout setting returned by :func:`getdefaulttimeout`
    is used. If *source_address* is set it must be a tuple of (host, port)
    for the socket to bind as a source address before making the connection.
    An host of '' or port 0 tells the OS to use the default.
    """
    host, port = address
    if host.startswith('['):
        host = host.strip('[]')
    err = None
    for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
        af, socktype, proto, canonname, sa = res
        sock = None
        try:
            sock = socks.socksocket(af, socktype, proto)
            # If provided, set socket level options before connecting.
            # This is the only addition urllib3 makes to this function.
            urllib3.util.connection._set_socket_options(sock, socket_options)
            if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                sock.settimeout(timeout)
            if source_address:
                sock.bind(source_address)
            sock.connect(sa)
            return sock
        except socket.error as e:
            err = e
            if sock is not None:
                sock.close()
                sock = None
    if err is not None:
        raise err
    raise socket.error("getaddrinfo returns an empty list")
# monkeypatch
urllib3.util.connection.create_connection = create_connection
I could do this on Linux.
$ pip3 install --user 'requests[socks]'
$ https_proxy=socks5://<hostname or ip>:<port> python3 -c \
> 'import requests;print(requests.get("https://httpbin.org/ip").text)'
