I am trying to access a region-restricted site through a VPN. The proxy works fine in the browser, but when the proxy is enabled and I send a request using
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36 Edg/100.0.1185.50',
}
data = {
    'courtType': '',
    'countyCode': '',
    'cortCode': 'SW',
    'caseNumber': '22SL-PR00157',
}
res = requests.post('https://www.courts.mo.gov/cnet/caseNoSearch.do', data=data, headers=headers)
I get the following error:
requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.courts.mo.gov', port=443): Max retries exceeded with url: /cnet/caseNoSearch.do (Caused by ProxyError('Your proxy appears to only use HTTP and not HTTPS, try changing your proxy URL to be HTTP. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#https-proxy-error-http-proxy', SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1124)'))))
Is there any way to send requests while the VPN is enabled?
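The error text itself suggests that the proxy URL picked up by requests uses https:// while the proxy only speaks plain HTTP. A minimal sketch of one possible workaround, assuming the VPN client exposes an HTTP proxy at 127.0.0.1:8080 (a hypothetical address, not taken from the question), is to pass the proxies explicitly with an http:// scheme for both keys:

```python
def http_proxy_map(host: str, port: int) -> dict:
    """Build a requests-style proxies mapping that talks plain HTTP to the proxy."""
    proxy_url = f"http://{host}:{port}"
    # The keys name the scheme of the *target* URL; the scheme inside the value
    # says how to talk to the proxy itself.
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical local proxy endpoint; replace it with the VPN client's real one.
proxies = http_proxy_map("127.0.0.1", 8080)
print(proxies)
```

The mapping would then be passed along with the existing arguments, for example requests.post(url, data=data, headers=headers, proxies=proxies).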
I tested three Python versions (3.10.2, 3.9.9 and 3.8.10) on different machines, and even on one online compiler. In all of them I did the following:
import requests
requests.get(url, proxies=proxies, headers=headers)
I tested these values for each parameter:
url:
"https://www.icanhazip.com"
"http://www.icanhazip.com"
proxies:
{"https": "223.241.0.250:3000", "http": "223.241.0.250:3000"}
headers:
{'User-Agent': 'Chrome'}
{"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"}
In all of them except 3.10.2 I got this error:
ProxyError: HTTPSConnectionPool(host='www.icanhazip.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x05694F40>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond')))
On 3.10.2 I got:
InvalidURL: Proxy URL had no scheme, should start with http:// or https://
But when I tried to write the proxies like this:
{"https": "223.241.0.250:3000", "http": "223.241.0.250:3000"}
It didn't work and just showed my normal IP.
What am I missing? A normal request works just fine, but when I add the proxies it doesn't work. This code was working fine a while back and now outputs this error, and I can't figure out why.
Try adding a scheme to the proxy address:
{"https": "http://223.241.0.250:3000", "http": "http://223.241.0.250:3000"}
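If the proxy addresses come from a list without schemes, the fix can be applied programmatically. The following is a rough sketch; ensure_scheme is a hypothetical helper name, and defaulting to http is an assumption about the proxy type:

```python
def ensure_scheme(proxy: str, default_scheme: str = "http") -> str:
    """Return the proxy address with a scheme, adding default_scheme if absent."""
    if "://" in proxy:
        return proxy
    return f"{default_scheme}://{proxy}"


raw = {"https": "223.241.0.250:3000", "http": "223.241.0.250:3000"}
# Normalise every entry so each value starts with a scheme.
proxies = {target: ensure_scheme(addr) for target, addr in raw.items()}
print(proxies)
```

Addresses that already carry a scheme, such as a socks5h:// URL, are left unchanged.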
I am a Python beginner and was trying to scrape a few websites (without abusing them). This site caught my attention, and this is the code I ran:
import requests
url = 'https://resultsarchives.nic.in/cbseresults/cbseresults2018/class12zpq/class12th18.asp'
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Accept-Language": "en-US,en;q=0.9,bn;q=0.8",
"X-Requested-With": "XMLHttpRequest",
"Content-Type": "application/x-www-form-urlencoded",
":authority": "resultsarchives.nic.in",
"Origin": "http://resultsarchives.nic.in",
"Referer": "https://resultsarchives.nic.in/cbseresults/cbseresults2018/class12zpq/class12th18.htm",
"sec-fetch-dest": "document",
"sec-fetch-mode": "navigate",
"sec-fetch-site": "same-origin",
"sec-fetch-user": "?1",
"upgrade-insecure-requests": "1"
}
x = range(8397, 8398)
for i in x:
    payload = {
        'regno': '6529437',
        'sch': '12345',
        'cno': str(i),
        'B2': 'Submit'
    }
    response = requests.post(url, headers=headers, data=payload)
    # Append each response to the output file and close it afterwards.
    with open('scrape.html', 'a', encoding="utf-8") as f:
        f.write(response.text)
When executed, it throws an SSL certificate-expired error. However, browsers (Chrome/Firefox) work fine and show the certificate as valid until Dec. 2022. The error was:
requests.exceptions.SSLError: HTTPSConnectionPool(host='resultsarchives.nic.in', port=443): Max retries exceeded with url: /cbseresults/cbseresults2018/class12zpq/class12th18.asp (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)')))
If I run the same code against the HTTP version of the site, it works fine!
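A browser and requests may validate against different CA stores, so one can accept a certificate chain that the other rejects. As a diagnostic sketch (tls_verify_result is a hypothetical helper, not part of requests), the handshake can be repeated with the standard ssl module to see which reason OpenSSL reports for a given CA file:

```python
import socket
import ssl
from typing import Optional


def tls_verify_result(host: str, port: int = 443, cafile: Optional[str] = None,
                      timeout: float = 10.0) -> str:
    """Do a verified TLS handshake and describe the outcome as text."""
    # cafile=None uses the system trust store; pass a bundle path to test that bundle.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                # getpeercert() returns the validated leaf certificate as a dict.
                return "ok, leaf certificate valid until " + tls.getpeercert()["notAfter"]
    except ssl.SSLCertVerificationError as exc:
        # verify_message carries OpenSSL's reason, e.g. "certificate has expired".
        return f"verification failed: {exc.verify_message}"
    except OSError as exc:
        return f"connection failed: {exc}"
```

Calling tls_verify_result('resultsarchives.nic.in') checks the system store, while passing cafile=requests.certs.where() checks the bundle that requests uses. If only the latter fails, an outdated bundle is one plausible cause, though that is an assumption to verify rather than a confirmed diagnosis.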
I have this function to access a website.
def access_website(link):
    try:
        cert = requests.certs.where()
        page = requests.get(link,
                            verify=cert,
                            headers={"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"})
        return page
    except Exception as x:
        print(x)
        return ''
access_website('https://www.davita.com/-/media/davita/project/kidneycare/pdf/corporate-governance/dva-pay-equity-disclosure-32119-final.ashx?la=en-us&hash=E5E2F4F69620F3C0BB52FFE818ABCE6CD36BFA12')
But when I do this, I get the following error:
HTTPSConnectionPool(host='www.davita.com', port=443): Max retries exceeded with url: /-/media/davita/project/kidneycare/pdf/corporate-governance/dva-pay-equity-disclosure-32119-final.ashx?la=en-us&hash=E5E2F4F69620F3C0BB52FFE818ABCE6CD36BFA12 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))
But I can easily access the site from the browser. All other links that I have do not raise this error.
How can I resolve this? I already tried providing the certificate, but it still doesn't work.
You can pass verify=False, which skips certificate verification for that request (so only use it with hosts you trust):
import requests

def access_website(link):
    HEADERS = {
        "User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"
    }
    try:
        page = requests.get(link, verify=False, headers=HEADERS)
        return page
    except Exception as x:
        print(x)
        return ""

access_website(
    "https://www.davita.com/-/media/davita/project/kidneycare/pdf/corporate-governance/dva-pay-equity-disclosure-32119-final.ashx?la=en-us&hash=E5E2F4F69620F3C0BB52FFE818ABCE6CD36BFA12"
)
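An alternative that keeps verification on: "unable to get local issuer certificate" often means the server does not send an intermediate certificate that the client needs. A possible sketch, assuming that intermediate has been saved as a PEM file (intermediate.pem and combined.pem below are hypothetical file names), is to append it to the default bundle and point verify at the combined file:

```python
from pathlib import Path


def build_ca_bundle(base_bundle: str, extra_pem: str, out_path: str) -> str:
    """Write base_bundle followed by extra_pem into out_path and return out_path."""
    base = Path(base_bundle).read_bytes()
    extra = Path(extra_pem).read_bytes()
    # PEM blocks are simply concatenated; keep exactly one newline between the files.
    Path(out_path).write_bytes(base.rstrip(b"\n") + b"\n" + extra.rstrip(b"\n") + b"\n")
    return out_path
```

Usage would look like bundle = build_ca_bundle(requests.certs.where(), 'intermediate.pem', 'combined.pem') followed by requests.get(link, verify=bundle, headers=HEADERS), since verify also accepts a path to a CA bundle.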
I'm having this issue while running the script (I'm building it in Spyder, but I get the same error in Jupyter Notebook):
#STEP 3.8 - Get the URL request
LIMIT = 100
radius = 50
url = 'https://api.foursquare-com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, CLIENT_SECRET, VERSION, neighbor_lat, neighbor_long, radius, LIMIT)
#STEP 3.9 - Send the GET request and examine the result
results = requests.get(url).json()
print(results)
ConnectionError: HTTPSConnectionPool(host='api.foursquare-com', port=443): Max retries exceeded with url: /v2/venues/explore?&client_id=xxx&client_secret=xxx&v=20180605&ll=43.806686299999996,-79.19435340000001&radius=500&limit=100 (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
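The "[Errno 11001] getaddrinfo failed" part of the traceback means the host name could not be resolved, and the URL above spells the host as api.foursquare-com, with a hyphen before com. A small sketch for checking a host name before sending the request (resolves is a hypothetical helper):

```python
import socket
from urllib.parse import urlsplit


def resolves(hostname: str) -> bool:
    """Return True if the resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        return False


url = "https://api.foursquare-com/v2/venues/explore"
# Extract just the host part so it can be inspected or passed to resolves().
host = urlsplit(url).hostname
print(host)
```

If resolves(host) returns False, the host name itself (or the DNS setup) is the thing to look at, not the request headers.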
Try adding a headers parameter to your requests.get call:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'}
page = requests.get(url, headers=headers)
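Try catching the exception and retrying:
import time
import requests

results = None
while results is None:
    try:
        results = requests.get(url).json()
    except (requests.exceptions.RequestException, ValueError):
        # Wait before trying again instead of hammering the server.
        time.sleep(50)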
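A loop like this never ends if the failure is permanent, for example a misspelled host name. A bounded variant could look like the sketch below; retry is a hypothetical helper, and the attempt count and delays are arbitrary assumptions:

```python
import time


def retry(call, attempts=5, delay=1.0, backoff=2.0, exceptions=(Exception,)):
    """Run call() until it succeeds or the given number of attempts is used up."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except exceptions:
            if attempt == attempts:
                raise  # give up and surface the last error
            time.sleep(delay)
            delay *= backoff  # wait longer before each new attempt
```

With requests this might be used as retry(lambda: requests.get(url, timeout=10).json(), exceptions=(requests.exceptions.RequestException,)).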
This is a common error. After enabling SSL on your VPS, the site is only reachable at https://domainName.com.
In that case, check the connection URL in your code or .env file and
change it from
http://domainName.com -- to -- https://domainName.com
I hope this solves your problem.
Thanks.
I have deployed an AWS EC2 instance to use as a proxy. I have edited the security policies to allow my machine to access the server. I am using port 22 for SSH and port 4444 for the proxy. For some reason I still cannot start a session using the proxy.
The code:
import requests
session = requests.Session()
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
headers = {'user-agent' : user_agent}
proxies = {
    'http' : 'socks5h://ec2-ip-address-here.us-east-2.compute.amazonaws.com:4444',
    'https' : 'socks5h://ec2-ip-address-here.us-east-2.compute.amazonaws.com:4444',
}
print(session.get('https://www.ipchicken.com/', headers=headers, proxies=proxies).content)
The error:
requests.exceptions.ConnectionError: SOCKSHTTPSConnectionPool(host='www.ipchicken.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSHTTPSConnection object at 0x107a09048>: Failed to establish a new connection: [Errno 61] Connection refused'))
I'm not sure what I am doing wrong. I followed this video https://www.youtube.com/watch?v=HOL2eg0g0Ng for setting up the server. Thanks in advance to everyone who replies.
You need to be using socks5h:// for your http and https proxies.
I get this error on macOS when using socks5://.
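Separately from the scheme, "[Errno 61] Connection refused" means nothing accepted a TCP connection at that host and port. A quick way to check this, sketched with a hypothetical port_open helper, is to try a plain TCP connection to the proxy endpoint first:

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, port_open('localhost', 4444) versus port_open with the EC2 host name. If the proxy was set up as an SSH dynamic tunnel (ssh -D 4444, which is an assumption about the video's setup), the SOCKS listener is on the local machine, so the proxy URL would be socks5h://localhost:4444 and not the EC2 address.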