Getting an error while web scraping with the Python requests module

I am trying to grab the content of a website using Python 3.6.2 and get the error below.
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.amazon.in', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: TLSV1_ALERT_ACCESS_DENIED] tlsv1 alert access denied (_ssl.c:748)'),))
Code:
import requests
from bs4 import BeautifulSoup
r=requests.get("https://www.amazon.in/")
r.content
Help me in fixing this!

Try http instead of https. It worked for me.

You can check your connection and your ability to communicate over TLS by running the following command:
openssl s_client -connect www.amazon.in:443
In any case, your Python code is correct and works for me.
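The same kind of check can be made from Python with only the standard library. The sketch below is an illustration, not part of the original answer: the helper names make_verifying_context and probe_tls are made up for this example, and the example call assumes network access to the host from the question.

```python
import socket
import ssl


def make_verifying_context():
    """Return a client context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def probe_tls(host, port=443, timeout=10.0):
    """Attempt a TLS handshake and report what was negotiated.

    Raises ssl.SSLError (or OSError) if the connection or handshake fails,
    which is the kind of failure requests reports as an SSLError.
    """
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            # cipher() returns (name, protocol, secret_bits); keep the name.
            return {"protocol": tls.version(), "cipher": tls.cipher()[0]}


# Example call (needs network access, so it is left commented out):
# print(probe_tls("www.amazon.in"))
```

If probe_tls raises the same alert outside of requests, the problem probably lies in the network path or the local TLS stack rather than in the scraping code.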

Related

SSL error only in python command window with apify request

I am trying to use an endpoint from apify.com. When I run my request in a web browser with a token, everything is fine, but when I run it with the requests library from the Python console I get the following error:
SSLError: HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: /endpoint?token=token (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)')))
Moreover, if I set verify=False in my request, then the request works. Does anyone have an idea what could be wrong? Thanks in advance.
I had this issue come up a few weeks ago.
$ pip install certifi
$ python -m certifi
I'm not certain that one needs to actually call the module to get its functionality, but I did and it solved the error. More info on Certifi here. It is also a recommended package extension to requests from their website. I added those last bits because I was wary of installing a package that ostensibly was never called after installation.
The solution was to install my company's internal SSL package for managing SSL connections from Python. There had been a recent change.
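When the chain contains a private or company CA, one common approach is to point requests at a CA bundle that includes it, either with verify= or with an environment variable. The sketch below mirrors the lookup order requests applies when verify is left at its default and environment settings are trusted. The function name resolve_ca_bundle and the .pem path in the comment are hypothetical.

```python
import os


def resolve_ca_bundle(environ=None):
    """Return the value that could be passed as verify= to requests.

    REQUESTS_CA_BUNDLE is consulted first, then CURL_CA_BUNDLE.
    True means "use the default CA store that requests ships with".
    """
    env = os.environ if environ is None else environ
    return env.get("REQUESTS_CA_BUNDLE") or env.get("CURL_CA_BUNDLE") or True


# Hypothetical usage with a company bundle:
#   export REQUESTS_CA_BUNDLE=/path/to/company-ca.pem
#   requests.get(url, verify=resolve_ca_bundle())
```

This keeps certificate verification switched on, unlike verify=False.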

What exactly causes the "unable to get local issuer certificate" error when accessing an otherwise accessible (via browser) website URL?

I'm on macOS Monterey 12.3 running Python 3.9.7 installed via brew. Given this minimal replication of my production code:
import requests
try:
    response = requests.get(website)
except requests.exceptions.SSLError as e:
    print("Error: " + str(e))
... it spits out this error:
Error: HTTPSConnectionPool(host='<SNIP>', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)')))
Unfortunately, the website URL is something that I cannot share. But it's definitely accessible via HTTPS in Chrome. I'm aware of workarounds and have successfully fixed it by following this one, but I have deployed identical code on a Linux server and it fails in the same way (so I'm assuming this isn't a macOS-specific issue). Is this a misconfiguration of the SSL cert on the server? And if so, how is it fixed?
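As a rough aid for reading these messages, the sketch below maps the OpenSSL wording to causes that are commonly reported for it. The table is an illustrative summary rather than an official list, and the name explain_ssl_error is made up for this example.

```python
# Substring of the OpenSSL message -> commonly reported cause.
# These descriptions are a hedged summary, not an authoritative diagnosis.
LIKELY_CAUSES = [
    ("unable to get local issuer certificate",
     "the client cannot build a chain to a trusted root; often the server "
     "omits an intermediate certificate, which browsers may work around"),
    ("self signed certificate in certificate chain",
     "the chain ends in a root the client does not trust; often a private "
     "or company CA, or a TLS-inspecting proxy"),
]


def explain_ssl_error(message):
    """Return a likely cause for an SSL error message, or None if unknown."""
    # Newer OpenSSL builds write "self-signed"; normalise hyphens to spaces.
    text = message.lower().replace("-", " ")
    for needle, cause in LIKELY_CAUSES:
        if needle in text:
            return cause
    return None
```

For example, passing str(e) from the except block above returns the first description when the text contains "unable to get local issuer certificate".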

Python SSL Problem with PIP, Requests and Other

I've got a problem with an SSLError that first appeared last week.
I've used Python on my machine for a few years without any problem, but now whenever I try to use a library that connects to the web, an SSLError is thrown.
I've tried other solutions to make pip and Requests work while avoiding the certificate check, but now I need to make it work to use an Azure library.
I know it's not a problem with the Wi-Fi connection I'm using, because it works fine on other machines. Could it be something I've installed on the machine? Maybe a VPN? Is there a way to check what is "blocking" the connection?
This is an example of the error when using the Azure Iot Hub library:
ClientRequestError: Error occurred in request., SSLError: HTTPSConnectionPool(host='iothubstreamdemo.azure-devices.net', port=443): Max retries exceeded with url: /devices?api-version=2020-03-01 (Caused by SSLError(SSLError(0, 'unknown error (_ssl.c:3622)'),))
And this is while using requests:
SSLError: HTTPSConnectionPool(host='example.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(0, 'unknown error (_ssl.c:3622)'),))
Thanks in advance.
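One way to start narrowing down what differs on the affected machine is to dump the TLS-related settings Python sees and compare them with a machine that works. The sketch below uses only the standard library. The function name collect_tls_diagnostics is made up for this example, and which settings actually matter in this case is an assumption.

```python
import os
import ssl
import sys
import urllib.request


def collect_tls_diagnostics():
    """Gather TLS-related settings visible to this Python interpreter."""
    paths = ssl.get_default_verify_paths()
    ca_vars = ("REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE", "SSL_CERT_FILE")
    return {
        "python": sys.version.split()[0],
        "openssl": ssl.OPENSSL_VERSION,
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
        # Proxies picked up from the environment or system settings.
        "proxies": urllib.request.getproxies(),
        # Environment variables that can redirect certificate lookups.
        "ca_env": {name: os.environ.get(name) for name in ca_vars},
    }


# for key, value in collect_tls_diagnostics().items():
#     print(key, "=", value)
```

An unexpected proxy entry or CA variable in the output would be a reasonable first thing to investigate.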

Unable to connect to https://www.anaconda.com to install a package

C:\Anaconda\Scripts>conda install matplotlib
Collecting package metadata (current_repodata.json): failed
CondaHTTPError: HTTP 000 CONNECTION FAILED for url https://repo.anaconda.com/pkgs/main/win-64/current_repodata.json
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
If your current network has https://www.anaconda.com blocked, please file
a support request with your network engineering team.
SSLError(MaxRetryError('HTTPSConnectionPool(host=\'repo.anaconda.com\', port=443): Max retries exceeded with url: /pkgs/main/win-64/current_repodata.json (Caused by SSLError("Can\'t connect to HTTPS URL because the SSL module is not available."))'))
C:\Anaconda\Scripts>
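The last line of that output says the SSL module is not available, which usually means the interpreter could not import ssl at all. A quick check, run with the same Python that conda uses, can confirm this. The helper name ssl_module_status is made up for this sketch.

```python
import importlib


def ssl_module_status():
    """Return (True, OpenSSL version) if ssl imports, else (False, error text)."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as exc:
        return False, str(exc)
    return True, ssl.OPENSSL_VERSION


# print(ssl_module_status())
```

A False result points at the Python installation itself (for example, OpenSSL libraries that cannot be loaded) rather than at the network.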

Why do HTTPS requests produce SSL CERTIFICATE_VERIFY_FAILED error?

Here is my Python code:
import requests
requests.get('https://google.com')
This is the error:
requests.exceptions.SSLError: HTTPSConnectionPool(host='google.com', port=443):
Max retries exceeded with url: / (Caused by SSLError(SSLError(1,
'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)'),))
Using Insomnia also gives me a certificate-related error.
My OS is Windows 7 Professional.
requests.get('https://google.com', verify='/path/to/certfile')
Or you can skip verification by doing this:
requests.get('https://google.com', verify=False)
You should specify your CA.
This fixed it: Python referencing old SSL version
The OpenSSL version used by Python differed from the one provided by Homebrew.
If
brew install python --with-brewed-openssl
doesn't work, uninstall Python and then try
brew install openssl
brew install python
If you are on a corporate network, contact the network administrator to apply the required configuration at the network level.
You might add a header and the verify argument to bypass SSL certificate verification.
r = requests.get(URL, headers = {'User-agent': 'your bot 0.1'}, verify=False)
You should specify the path to your certificate if you have one.
In requests.get you can set the verify flag to False. This skips certificate verification: the TLS handshake still takes place, but the server's certificate is not checked.
-- This isn't a guaranteed fix, because some servers enforce strict policies about which requests they respond to.
If you are using a proxy server, add the proxy to your request, like this:
proxies = {'http':'http://localhost:port','https':'http://localhost:port'}
requests.get('your_request_website', headers=headers, proxies=proxies)
hope this helps.
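A small helper makes the proxy mapping above less error-prone. This is only a sketch: the name build_proxies and the port 8080 in the comment are hypothetical, and the real host and port depend on your proxy.

```python
def build_proxies(host, port, scheme="http"):
    """Build the proxies mapping that requests expects.

    The same proxy URL is used for both http and https traffic.
    """
    proxy_url = f"{scheme}://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical usage:
#   requests.get(url, proxies=build_proxies("localhost", 8080))
```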
I resolved my problem by installing OpenSSL.
You can download the Light version, or any version suited to your needs, here:
https://slproweb.com/products/Win32OpenSSL.html
