I have deployed an AWS EC2 instance to use as a proxy. I have edited the security policies to allow my machine access to the server. I am using port 22 for SSH and port 4444 for the proxy. For some reason I still cannot start a session using the proxy.
The code:
import requests
session = requests.Session()
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
headers = {'user-agent': user_agent}
proxies = {
    'http': 'socks5h://ec2-ip-address-here.us-east-2.compute.amazonaws.com:4444',
    'https': 'socks5h://ec2-ip-address-here.us-east-2.compute.amazonaws.com:4444',
}
print(session.get('https://www.ipchicken.com/', headers=headers, proxies=proxies).content)
The error:
requests.exceptions.ConnectionError: SOCKSHTTPSConnectionPool(host='www.ipchicken.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSHTTPSConnection object at 0x107a09048>: Failed to establish a new connection: [Errno 61] Connection refused'))
I'm not sure what I'm doing wrong. I followed this video for setting up the server: https://www.youtube.com/watch?v=HOL2eg0g0Ng. Thanks in advance to all who reply.
You need to be using socks5h:// for your http and https proxies.
I get this error on macOS when using socks5://.
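For reference, a minimal sketch of a working setup, assuming the SOCKS extra is installed (pip install requests[socks]) and the tunnel is actually listening on port 4444; the hostname is a placeholder, and api.ipify.org is just one public IP echo service:

import requests

# socks5h:// asks the proxy to resolve DNS; socks5:// resolves it locally,
# which is what fails with "Connection refused" on some setups.
proxies = {
    'http': 'socks5h://ec2-ip-address-here.us-east-2.compute.amazonaws.com:4444',
    'https': 'socks5h://ec2-ip-address-here.us-east-2.compute.amazonaws.com:4444',
}
# Prints the IP the target site sees; it should be the EC2 instance's IP.
print(requests.get('https://api.ipify.org', proxies=proxies, timeout=15).text)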
I am trying to access a region-restricted site using a VPN. The proxy works fine in the browser, but when the proxy is enabled and I try to send a request using
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36 Edg/100.0.1185.50',
}
data = {
    'courtType': '',
    'countyCode': '',
    'cortCode': 'SW',
    'caseNumber': '22SL-PR00157',
}
res = requests.post('https://www.courts.mo.gov/cnet/caseNoSearch.do', data=data, headers=headers)
I get the following error:
requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.courts.mo.gov', port=443): Max retries exceeded with url: /cnet/caseNoSearch.do (Caused by ProxyError('Your proxy appears to only use HTTP and not HTTPS, try changing your proxy URL to be HTTP. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#https-proxy-error-http-proxy', SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1124)'))))
Is there any way I can send requests with the VPN enabled?
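For what it's worth, the urllib3 page linked in the traceback suggests the proxy endpoint itself speaks plain HTTP, so the proxy URL should use an http:// scheme even for https targets. A minimal sketch, with a hypothetical local proxy address:

import requests

# Hypothetical endpoint; replace with the address your VPN/proxy exposes.
proxies = {
    'http': 'http://127.0.0.1:8080',
    'https': 'http://127.0.0.1:8080',  # http:// scheme, per the urllib3 hint
}
# headers and data as defined in the question above.
res = requests.post('https://www.courts.mo.gov/cnet/caseNoSearch.do',
                    data=data, headers=headers, proxies=proxies)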
I have three Python versions that I tested: 3.10.2, 3.9.9, and 3.8.10, on different machines and even one online interpreter. In all of them I did the following:
import requests
requests.get(url, proxies=proxies, headers=headers)
testing with each of these parameter values:
url:
"https://www.icanhazip.com"
"http://www.icanhazip.com"
proxies:
{"https": "223.241.0.250:3000", "http": "223.241.0.250:3000"}
{"https": "223.241.0.250:3000", "http": "223.241.0.250:3000"}
headers:
{'User-Agent': 'Chrome'}
{"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"}
In all of them except 3.10.2 I got this error:
ProxyError: HTTPSConnectionPool(host='www.icanhazip.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x05694F40>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond')))
On 3.10.2 I got:
InvalidURL: Proxy URL had no scheme, should start with http:// or https://
But when I tried to pass the proxies like this:
{"https": "223.241.0.250:3000", "http": "223.241.0.250:3000"}
It didn't work, and it just showed my normal IP.
What am I missing? A normal request works just fine, but when I add the proxies it just doesn't work. This code was working fine a while back and now outputs this error; I can't figure out why.
Try adding a scheme to the proxy address:
{"https": "http://223.241.0.250:3000", "http": "http://223.241.0.250:3000"}
{"https": "http://223.241.0.250:3000", "http": "http://223.241.0.250:3000"}
I am currently trying to scrape a website using a proxy. I chose Luminati as the proxy provider.
I created a zone with a Data Center using Shared IPs.
[Screenshot: Luminati dashboard]
Then I installed the Luminati Proxy Manager on my local machine and set up a proxy port using the default config.
import requests

ip_json = requests.get('http://lumtest.com/myip.json',
                       proxies={"http": "http://localhost:24000/",
                                "https": "http://localhost:24000/"}).json()
proxy = "https://" + ip_json['ip'] + ':' + str(ip_json['asn']['asnum'])
proxies = {"https": proxy, "http": proxy}
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get('http://google.com/', headers=headers, proxies=proxies)
However, each time I get
ProxyError: HTTPSConnectionPool(host='x.x.x.x', port=x): Max retries exceeded with url: http://google.com/ (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x11d3b5358>: Failed to establish a new connection: [Errno 61] Connection refused',)))
I tried URLs with both http and https, but nothing changed. I tried setting the proxy address scheme to https, but still nothing.
Has anybody encountered this error and resolved it? I would really appreciate your help.
Thank you.
First, Luminati blocks Google by default and only allows specific use cases with the SERP zone.
Second, is google the only domain you target? Try lumtest.io/myip.json
The Proxy Manager should show you error codes in its logs; are there any clues there?
Try contacting Luminati live chat support from the control panel, you may have limitations on your account.
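As a side note, once the Proxy Manager is running there is no need to rebuild a proxy URL from the reported IP and ASN (the asnum field is an AS number, not a port); you can keep routing requests through the local manager port. A minimal sketch, assuming the default port 24000 from the question:

import requests

# Route all traffic through the local Luminati Proxy Manager.
proxies = {"http": "http://localhost:24000", "https": "http://localhost:24000"}

# lumtest.com/myip.json echoes the exit IP the proxy assigned.
print(requests.get('http://lumtest.com/myip.json', proxies=proxies).json())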
I'm having this issue while running the script:
(I'm using Spyder to build my script, but I tried it in a Jupyter Notebook and got the same error.)
import requests

# STEP 3.8 - Build the request URL
LIMIT = 100
radius = 50
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, CLIENT_SECRET, VERSION, neighbor_lat, neighbor_long, radius, LIMIT)

# STEP 3.9 - Send the request and examine the result
results = requests.get(url).json()
print(results)
ConnectionError: HTTPSConnectionPool(host='api.foursquare.com', port=443): Max retries exceeded with url: /v2/venues/explore?&client_id=xxx&client_secret=xxx&v=20180605&ll=43.806686299999996,-79.19435340000001&radius=500&limit=100 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
Try adding a headers parameter to your requests.get:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'}
page = requests.get(url, headers=headers)
Try catching the exception and retrying:
import time
import requests

results = None
while results is None:
    try:
        results = requests.get(url).json()
    except requests.exceptions.RequestException:
        # Network error: wait, then try again.
        time.sleep(50)
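As a design note, a bare except would also swallow KeyboardInterrupt and real bugs, so the sketch above catches requests.exceptions.RequestException, the base class for requests' connection and timeout errors; in production you would also cap the number of retries rather than looping forever.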
This is a common error: after setting up SSL on your VPS, the server only answers at the https URL.
In this case, check the connection string in your code file or .env file and change it from http://domainName.com to https://domainName.com.
I hope this solves your problem. Thanks.
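For example, if the base URL is kept in code (BASE_URL is a hypothetical setting name), the fix is just the scheme:

import requests

# Before SSL was enabled on the VPS: BASE_URL = "http://domainName.com"
BASE_URL = "https://domainName.com"
response = requests.get(BASE_URL)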
I'm trying to scrape using BeautifulSoup with Anaconda for Python 3.6.
I am trying to scrape accuweather.com to find the weather in Tel Aviv.
This is my code:
from bs4 import BeautifulSoup
import requests
data=requests.get("https://www.accuweather.com/he/il/tel-
aviv/215854/weather-forecast/215854")
soup=BeautifulSoup(data.text,"html parser")
soup.find('div',('class','info'))
I get this error:
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', OSError("(10060, 'WSAETIMEDOUT')",))
What can I do and what does this error mean?
What does this error mean?
Googling for "errno 10060" yields quite a few results. Basically, it's a low-level network error (it's not HTTP-specific; you can hit the same issue with any kind of network connection), whose canonical description is
A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
In other words, your system failed to connect to the host. This can happen for many reasons, either temporary (your internet connection is down) or not (a proxy, if you are behind one, blocking access to this host, etc.), or, as is the case here, the host blocking your requests.
The first thing to do when you get such an error is to check your internet connection, then try to get the URL in your browser. If the browser can fetch it, the host is most likely blocking you, usually based on your client's "User-Agent" header (the client here is requests). Specifying a "standard" User-Agent header, as explained in newbie's answer, should solve the problem (and it did in this case, at least for me).
NB: to set the user agent:
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
}
data = requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854", headers=headers)
The problem does not come from the code but from the website.
If you add a User-Agent field to the request headers, the request will look like it comes from a browser.
Example:
from bs4 import BeautifulSoup
import requests
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
}
data = requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854", headers=headers)
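From there, the parsing from the question should work once the parser name uses a dot; a sketch (the 'info' class comes from the question and may have changed on the live site):

soup = BeautifulSoup(data.text, "html.parser")  # "html parser" is not a valid parser name
print(soup.find('div', {'class': 'info'}))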