Get HTML content from onion links using the request library (Python 3) - python

I am trying to get the HTML of onion websites using the requests library (or urllib.request). I have tried various methods, but none of them worked properly.
At first, I simply tried to connect to a proxy with the requests library and fetch the HTML of Facebook's onion site:
import requests
session = requests.session()
session.proxie = {}
session.proxies['http'] = 'socks5h://localhost:9050'
session.proxies['https'] = 'socks5h://localhost:9050'
r = requests.get('https://facebookcorewwwi.onion/')
print(r.text)
However, when I do this, the connection to the proxy doesn't work (my IP stays the same with or without the proxy).
I get the following error:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='facebookcorewwwi.onion', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x109e8b198>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
After doing some research, I found someone attempting something similar whose solution was to set up the proxy before importing the requests/urllib.request library.
So I tried connecting using the libraries socks and socket:
import socks
import socket
def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050)
# patch the socket module
socket.socket = socks.socksocket
socket.create_connection = create_connection
import urllib.request
with urllib.request.urlopen('https://facebookcorewwwi.onion/') as response:
    html = response.read()
    print(html)
When I do this, my connection to the proxy gets refused:
urllib.error.URLError: <urlopen error Error connecting to SOCKS5 proxy 127.0.0.1:9050: [Errno 61] Connection refused>
I tried using the requests library instead, as follows (replacing everything from the line that says import urllib.request):
import requests
r = requests.get('https://facebookcorewwwi.onion/')
print(r.text)
But here I get this error:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='facebookcorewwwi.onion', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x10d93ee80>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
It seems that no matter what I do, my connection to the proxy gets refused. Does anyone have an alternative solution or a way to fix this?
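One thing worth noting about the first snippet: it assigns to session.proxie (requests pre-creates session.proxies, so the later assignments still land in the right dict), but it then calls module-level requests.get, which knows nothing about the session's proxies and so bypasses Tor entirely. A minimal sketch of the corrected approach, assuming a Tor daemon is listening on port 9050 (the Tor Browser's bundled daemon listens on 9150 instead):

```python
import requests

# socks5h:// (note the trailing "h") makes the proxy resolve the
# hostname; local DNS cannot resolve .onion addresses, which is what
# the "nodename nor servname provided" error is complaining about.
session = requests.session()
session.proxies = {
    'http': 'socks5h://localhost:9050',
    'https': 'socks5h://localhost:9050',
}

def fetch(url):
    # Use session.get, not requests.get, so the proxies configured on
    # the session are actually applied to the request.
    return session.get(url)
```

This also needs SOCKS support for requests (pip install requests[socks]), or the socks5h scheme will raise a missing-dependency error.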

Related

Requests giving errors while using HTTP proxies

So, I was sending a request using the requests library in Python 3.9.1. The problem is, when I tried to use an HTTP proxy it gave me this error:
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='google.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002B08D6BC9A0>: Failed to establish a new connection:
[WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond')))
This is my code; I would appreciate any help:
import requests
for proxy in open('proxies.txt', 'r').readlines():
    proxies = {
        'http': f'http://{proxy}',
        'https': f'http://{proxy}'
    }
    e = requests.get('https://google.com/robots.txt', proxies=proxies)
    open('uwu.txt', 'a').write(e.text)
    print(e.text)
I am pretty sure it is not a problem with my proxies, as they are good private proxies with 100 GB of bandwidth (from zenum.io).
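One likely culprit in the code above, offered as a guess: readlines() keeps the trailing newline on every entry, so each proxy URL ends with an embedded \n, which can produce exactly this kind of ProxyError. A sketch that strips it (load_proxies is an illustrative helper; file names follow the question):

```python
import requests

def load_proxies(path='proxies.txt'):
    # strip() removes the trailing newline that readlines() leaves on
    # each entry, and the filter drops blank lines.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def fetch_all(url, proxy_list, out_path='uwu.txt'):
    for proxy in proxy_list:
        proxies = {'http': f'http://{proxy}', 'https': f'http://{proxy}'}
        # A timeout keeps one dead proxy from hanging the whole loop.
        r = requests.get(url, proxies=proxies, timeout=10)
        with open(out_path, 'a') as out:
            out.write(r.text)
```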

Making requests through tor, requests.exceptions.ConnectionError Errno 61: Connection Refused

I'm trying to make a simple request to a what's-my-IP site while connected to Tor, but no matter what I try, I continue to get this error:
requests.exceptions.ConnectionError: SOCKSHTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /get (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSHTTPSConnection object at 0x1018a7438>: Failed to establish a new connection: [Errno 61] Connection refused'))
I've looked at a lot of posts on here with similar issues but I can't seem to find a fix that works.
This is the current code, but I've tried multiple ways and it's the same error every time:
import requests
def main():
    proxies = {
        'http': 'socks5h://127.0.0.1:9050',
        'https': 'socks5h://127.0.0.1:9050'
    }
    r = requests.get('https://httpbin.org/get', proxies=proxies)
    print(r.text)

if __name__ == '__main__':
    main()
Well, the error says Max retries exceeded with url:, so it could be that too many requests have been made from the Tor exit node's IP. Try again with a new Tor identity and see if that works.
If you wanted to, you could catch the exception and retry in a loop every few seconds, but this may lead to that IP address being refused by the server for longer.
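A sketch of that retry loop (attempt count and delay are arbitrary; the proxy settings follow the question):

```python
import time
import requests

PROXIES = {
    'http': 'socks5h://127.0.0.1:9050',
    'https': 'socks5h://127.0.0.1:9050',
}

def get_with_retries(url, attempts=3, delay=5):
    # Retry on connection errors, sleeping `delay` seconds between
    # tries; re-raise the last error if every attempt fails.
    last_exc = None
    for _ in range(attempts):
        try:
            return requests.get(url, proxies=PROXIES, timeout=10)
        except requests.exceptions.ConnectionError as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```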

Failed to establish a new connection error using Python requests Errno -2 Name or service unknown

I am trying to make a request to an API with Python. I am able to make the request with curl without issue, but something is wrong with my Python request.
Why does this code,
import requests
from requests.auth import HTTPBasicAuth
emailadd = 'user123@example.com'
domain = 'example.com'
spantoken = 'omitted'

def checkUserLicensed(useremail):
    url = ('https://api.spannigbackup.com/v1/users/' + useremail)
    print(url)
    response = requests.get(url, auth=(domain, spantoken))
    print(response)

checkUserLicensed(emailadd)
Return this error
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.spannigbackup.com', port=443): Max retries exceeded with url: /v1/users/user123@example.com
(Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f73ca323748>: Failed to establish a new connection: [Errno -2] Name or service not known'))

Django can't make external connections with requests or urllib2 on development server

Every time I make an external request (including to google.com), I get this response:
HTTPConnectionPool(host='EXTERNALHOSTSITE', port=8080): Max retries exceeded with url: EXTERNALHOSTPARAMS (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x105d8d6d0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
It does seem that your server cannot resolve the hostname into an IP; this is probably not a Django or Python problem but a network setup issue on your server.
Try to reach the same URL with ping/wget/curl, or troubleshoot DNS with nslookup.
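The same check can be done from Python itself (a sketch; can_resolve is just an illustrative helper):

```python
import socket

def can_resolve(hostname):
    # Quick DNS sanity check: the "[Errno -2] Name or service not
    # known" / "[Errno 8] nodename nor servname provided" errors seen
    # above mean the equivalent of this lookup fails for that host.
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False
```

If can_resolve returns False for a host that curl on another machine reaches fine, the problem is the local resolver configuration, not the HTTP client.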

Why HTTPSConnectionPool doesn't work when PoolManager does?

I have tested a 'POST' request with both PoolManager and HTTPSConnectionPool. The first one works; the other throws:
urllib3.exceptions.MaxRetryError:
HTTPSConnectionPool(host='https://some.url.com', port=443):
Max retries exceeded with url: /some-api (Caused by <class 'socket.gaierror'>:
[Errno -2] Name or service not known)
Here's my code for PoolManager:
import urllib3
HOST = 'https://some.url.com'
PORT = 443
PATH = '/some-api'
xml_request = '<some xml tree/>'
manager = urllib3.PoolManager()
res = manager.request('POST', HOST+PATH, {'req':xml_request})
and for HTTPSConnectionPool:
manager = urllib3.HTTPSConnectionPool(HOST, port=PORT)
res = manager.request('POST', PATH, {'req':xml_request})
https://some.url.com is not a hostname or IP address, it's a URL. So you're feeding the wrong information to HTTPSConnectionPool.
Furthermore, PoolManager and HTTPSConnectionPool are not at the same abstraction level. PoolManager manages ConnectionPool instances for you.
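A sketch of the corrected HTTPSConnectionPool usage (host, port, and path values are taken from the question; the request itself is left commented out since some.url.com is a placeholder):

```python
import urllib3

# HTTPSConnectionPool takes a bare hostname -- the https:// scheme is
# implied by the class itself, so passing a full URL as the host is
# what produces the "Name or service not known" resolution error.
HOST = 'some.url.com'
PORT = 443
PATH = '/some-api'
xml_request = '<some xml tree/>'

# Constructing the pool is lazy; no connection is opened yet.
pool = urllib3.HTTPSConnectionPool(HOST, port=PORT)
# res = pool.request('POST', PATH, fields={'req': xml_request})
```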
