Not able to download Excel Spreadsheet from a given url - python

I have written a script that takes an .xlsm file, updates it based on another file, and then plots a graph from the updated data. The script works fine. Now that it is part of an automated process, it needs to get the Excel (.xlsm) file from a URL, update it, and possibly save it back to the same URL.
I tried downloading the file to a local copy using the code below:
import requests
url = 'https://sharepoint.amr.ith.intel.com/sites/SKX/patchboard/Shared%20Documents/Forms/AllItems.aspx?RootFolder=%2Fsites%2FSKX%2Fpatchboard%2FShared%20Documents%2FReleaseInfo&FolderCTID=0x0120004C1C8CCA66D8D94FB4D7A0D2F56A8DB7&View={859827EF-6A11-4AD6-BD42-23F385D43AD6}/Copy of Patch_Release_Utilization'
r = requests.get(url)
open('Excel.xlsm', 'wb').write(r.content)
This gives the error:
Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)'),)
My understanding so far is that the server is not sending the complete certificate chain to the client for verification.
I also tried:
r=requests.get(url,verify=False)
This makes the error go away, but the file created is empty. I then checked the status code of the response:
r=requests.get(url,verify=False).status_code
The status code was 401 (Unauthorized), which means the request was not authenticated. I then tried supplying credentials:
resp = requests.get(url,auth=HTTPBasicAuth('username', 'password'),verify=False)
and
resp = requests.get(url,auth=HTTPBasicAuth('username', 'password'))
I tried both of the lines above, but the status code stayed the same.
Then I came across the question "Python requests SSL error - certificate verify failed", where the author suggests adding the missing certificates to a .pem file and then using that file. How do I find out which certificates are missing? So that did not help either.
Can somebody who has already dealt with this problem please help? I am using Python 3.6.3 and requests 2.18.4.
NOTE: When I open the link manually in Internet Explorer, I am able to download the file.
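One way to narrow down the 401 described above is to look at which authentication schemes the server advertises in its WWW-Authenticate response header. The following is a rough sketch only: the helper name offered_auth_schemes is made up for illustration, the parsing is deliberately simplified (it does not handle commas inside quoted values), and the sample header values are invented rather than captured from the server in the question.

```python
def offered_auth_schemes(www_authenticate):
    """Return the scheme names found in a WWW-Authenticate header value.

    Simplified on purpose: split on commas and keep the leading token of
    each part unless that token is a key=value parameter.
    """
    schemes = []
    for part in www_authenticate.split(","):
        tokens = part.strip().split()
        if not tokens:
            continue
        first = tokens[0]
        # Parameters such as realm="x" belong to the previous scheme.
        if "=" not in first and first not in schemes:
            schemes.append(first)
    return schemes


# Illustrative header values, not taken from a real response.
print(offered_auth_schemes("Negotiate, NTLM"))
print(offered_auth_schemes('Basic realm="intranet", NTLM'))
```

With requests, the header value would come from resp.headers.get('WWW-Authenticate', ''). If Basic is not among the schemes listed, HTTPBasicAuth is unlikely to get past the 401, whatever credentials are supplied.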

Related

Python Requests SSL error: hostname doesn't match either of

I'm trying to connect to one of my internal services at https://myservice.my-alternative-domain.com through Python Requests. I'm using Python 3.6.
I'm using a custom CA bundle to verify the request, and I'm getting the following error:
SSLError: hostname 'myservice.my-domain.com' doesn't match either of 'my-domain.com', 'my-alternative-domain.com'
The SSL certificate that the internal service uses has as CN: my-domain.com, and as SAN (Subject Alternative Names): 'my-domain.com', 'my-alternative-domain.com'
So, I'm trying to access the service through one of the alternative names (this has to be like this and it's not under my control)
I think the error is correct, and that the certificate should have also as SAN:
'*.my-alternative-domain.com'
in order for the request to work.
The only thing that puzzles me is that I can access the service through the browser.
Can somebody confirm the behavior of Python Requests is correct?
This is how I call the service:
response = requests.get('https://myservice.my-alternative-domain.com', params=params, headers=headers, verify=ca_bundle)
Thanks
Passing verify=False might work:
x = requests.get('https://myservice.my-alternative-domain.com', verify=False)
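To illustrate the matching rule the question asks about: a certificate name covers a host only if it matches exactly, or if it is a wildcard standing in for the whole left-most label. A bare parent domain in the SAN list does not cover its subdomains. The sketch below is a simplified model of that rule, not the code that requests or the ssl module actually runs, and the function names are hypothetical.

```python
def hostname_matches(hostname, pattern):
    """Simplified certificate name check.

    Matches when the names are equal, or when the pattern's left-most
    label is '*' and all remaining labels are equal.
    """
    host_labels = hostname.lower().split(".")
    pattern_labels = pattern.lower().split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def matches_any(hostname, san_entries):
    """True if any SAN entry covers the hostname."""
    return any(hostname_matches(hostname, san) for san in san_entries)


host = "myservice.my-alternative-domain.com"
san = ["my-domain.com", "my-alternative-domain.com"]

print(matches_any(host, san))                                    # False
print(matches_any(host, san + ["*.my-alternative-domain.com"]))  # True
```

Under this model the error in the question is expected, and adding the wildcard entry (or the exact host name) to the SAN list is what would make verification pass.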

Accessing data from the internet

I want to download a file automatically using Python 3. The URL is https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls
When I enter the URL manually in the browser, it offers the file for download, but I want to do this automatically in Python and load the data into a DataFrame.
Running the code below raises a URLError:
from urllib.request import urlretrieve
import pandas as pd
# Assign url of file: url
url = 'https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls'
# Save file locally
urlretrieve(url, 'my-sheet.xls')
# Read file into a DataFrame and print its head
df=pd.read_excel('my-sheet.xls')
print(df.head())
URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
$ curl https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>307 Temporary Redirect</title>
</head><body>
<h1>Temporary Redirect</h1>
<p>The document has moved here.</p>
</body></html>
You are just getting redirected. There are ways to implement this in code, but I would just change url to "https://www.dax-indices.com/document/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls"
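As a sketch of what handling the redirect in code could look like: both requests.get and urllib.request normally follow GET redirects on their own, so the loop below is mainly useful for seeing where a URL ends up. The fetch callable and the fake_site table are stand-ins invented for this example so that it runs without a network; they are not part of any library.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(url, fetch, max_hops=5):
    """Follow Location headers by hand.

    `fetch` is a hypothetical callable: fetch(url) -> (status, headers, body).
    Returns (final_url, status, body) for the first non-redirect response.
    """
    for _ in range(max_hops):
        status, headers, body = fetch(url)
        location = headers.get("Location")
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        return url, status, body
    raise RuntimeError("too many redirects")


# Stand-in for a real HTTP client (assumed data, for illustration only).
fake_site = {
    "https://example.com/documents/old/file.xls": (307, {"Location": "/document/file.xls"}, b""),
    "https://example.com/document/file.xls": (200, {}, b"spreadsheet bytes"),
}

final_url, status, body = follow_redirects(
    "https://example.com/documents/old/file.xls", fake_site.__getitem__
)
print(final_url, status)
```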
I ran your code in a jupyter environment, and it worked. No error was prompted, but the dataframe has only NaN values. I checked the xls file you are trying to read, and it seems to not contain any data...
There are other ways to retrieve xls data, such as: downloading an excel file from the web in python
import pandas as pd
import requests
url = 'https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls'
resp = requests.get(url)
# Save the downloaded bytes locally
with open('my-sheet.xls', 'wb') as output:
    output.write(resp.content)
# Read the saved file into a DataFrame and print its head
df = pd.read_excel('my-sheet.xls')
print(df.head())
You can do it directly with pandas and .read_excel method
df = pd.read_excel("https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls", sheet_name='Data', skiprows=5)
df.head(1)
Sorry mate, it works on my PC (not a very helpful comment, to be honest). Here is a list of things you can try:
Get the response object and check its status code (2xx means success and 3xx is a redirect; anything else signals a problem)
Check whether the site blocks access by bots (certain sites do that)
If bot access is blocked, use Selenium for Python
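For the status code check in the list above, a small helper can turn the number into something readable. This is a minimal sketch using only the standard library; describe_status is a made-up name, and with requests the code would come from resp.status_code.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable label for an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"
    if 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirect"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"


for code in (200, 307, 401, 403):
    print(describe_status(code))
```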

SSLError( '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)'),)) in Python while accessing an url [duplicate]

This question already has an answer here:
SSL: CERTIFICATE_VERIFY_FAILED certificate verify failed
(1 answer)
Closed 4 years ago.
I am writing a script that will update a macro-enabled Excel file located at a given URL. I am using Python 3.6 for this. I decided to first download a local copy, update it, and then push it back to the same URL. But when I run the code to download the file, I get this error:
(Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)'),))
The code I am using is:
import requests
url = 'https://sharepoint.amr.ith.intel.com/sites/SKX/patchboard/Shared%20Documents/Forms/AllItems.aspx?RootFolder=%2Fsites%2FSKX%2Fpatchboard%2FShared%20Documents%2FReleaseInfo&FolderCTID=0x0120004C1C8CCA66D8D94FB4D7A0D2F56A8DB7&View={859827EF-6A11-4AD6-BD42-23F385D43AD6}/Copy of Patch_Release_Utilization'
r = requests.get(url)
open('Excel.xlsm', 'wb').write(r.content)
I tried the solution given in "Python requests SSL error - certificate verify failed", but it is not working for me. How can I resolve this problem? Please help if you have already tackled it.
EDIT:
I have tried using:
r = requests.get(url, verify=False)
After doing this I get the warning "InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings". Also, when I try to open the resulting 'Excel.xlsm' file, I get the error message "Excel cannot open the file 'Excel.xlsm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file."
NOTE: I am trying to access a macro-enabled Excel (.xlsm) file.
You can use the verify=False parameter to ignore verifying the SSL certificate, per the documentation:
r = requests.get(url, verify=False)
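The Excel error mentioned in the question's edit suggests that the bytes written to disk are not a workbook at all. A plausible cause, though only a guess here, is that the URL (an AllItems.aspx list view) returns an HTML page instead of the file. A quick check before saving is sketched below: .xlsx and .xlsm files are ZIP archives that contain a [Content_Types].xml entry. The function name looks_like_xlsx and the demo data are invented for illustration.

```python
import io
import zipfile


def looks_like_xlsx(content):
    """True if `content` is a ZIP archive containing [Content_Types].xml,
    which Office Open XML files (.xlsx, .xlsm) always include."""
    buffer = io.BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return "[Content_Types].xml" in archive.namelist()


# Build a tiny stand-in archive in memory so the sketch runs offline.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")

print(looks_like_xlsx(demo.getvalue()))                          # True
print(looks_like_xlsx(b"<!DOCTYPE html><html>Sign in</html>"))   # False
```

With requests, the check would be applied to r.content before writing Excel.xlsm; a False result means the response should be inspected (status code, Content-Type header) rather than saved.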

SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed while generating SMS from python script

My Python script is similar to the one below. It works fine on my personal laptop.
import plivo
import sys
auth_id = "XXXXXX"
auth_token = "YYYYYYYYYYYY"
test = plivo.RestClient(auth_id, auth_token)
message_created = test.messages.create(
    src='ZZZZZZ',
    dst='+NNNNN',
    text='Testing!!'
)
However, when I run the script on our organization's PCs, it throws this error:
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='api.plivo.com', port=443): Max retries exceeded with url: /v1/Account/SXXXXXYW/Message/ (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)'),))
I tried adding ssl._create_default_https_context = ssl._create_unverified_context and setting PYTHONHTTPSVERIFY=0, but neither worked. Can anyone help me resolve this error?
Try the solution from https://github.com/locustio/locust/issues/417
How to get rid from “SSL: CERTIFICATE_VERIFY_FAILED” Error
On Windows, requests does not use the system certificate store; it uses its own bundle, located at ?\lib\site-packages\certifi\cacert.pem.
The solution to your problem:
Download the domain validation certificate as a *.crt or *.pem file
Open the file in an editor and copy its contents to the clipboard
Find your cacert.pem location:
from requests.utils import DEFAULT_CA_BUNDLE_PATH; print(DEFAULT_CA_BUNDLE_PATH)
Edit the cacert.pem file and paste your domain validation certificate at the end of the file
Save the file and enjoy requests!
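The steps above can also be scripted. The sketch below is a variation rather than the exact procedure described: instead of editing cacert.pem in place (the file would be replaced whenever certifi is upgraded), it writes a separate combined bundle. The function name build_ca_bundle and all file paths are assumptions made for this example.

```python
from pathlib import Path


def build_ca_bundle(base_bundle, extra_cert, combined):
    """Write base_bundle followed by extra_cert into a new file `combined`.

    Raises ValueError if extra_cert is not PEM text (for example a
    DER-encoded .crt file, which would need converting first).
    """
    base = Path(base_bundle).read_bytes()
    extra = Path(extra_cert).read_bytes()
    if b"-----BEGIN CERTIFICATE-----" not in extra:
        raise ValueError("extra_cert does not look like a PEM certificate")
    # Make sure the appended certificate starts on its own line.
    if not base.endswith(b"\n"):
        base += b"\n"
    Path(combined).write_bytes(base + extra)
    return str(combined)
```

The base bundle path would be the DEFAULT_CA_BUNDLE_PATH value printed in the steps above, and the returned path can then be passed to requests as verify=<combined path>.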

Tweepy SSLError regarding ssl certificate

I am calling a REST API (the Search API) with Tweepy in Python. The program ran fine at home, but now that I am working on a different network I get this error message:
SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)
My code is like this.
auth = tweepy.AppAuthHandler(consumer_key, consumer_secret)
api = tweepy.API(auth,wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
I found this post:
Python Requests throwing up SSLError
which suggests that setting verify=False may be a quick fix. Does anyone know how to do that in Tweepy, or another way around the error? Thank you.
In streaming.py, adding verify = False on line 105 did the trick for me, as shown below. This approach is not advisable, though, as it makes the connection unsafe. I haven't been able to come up with a better workaround yet.
stream = Stream(auth, listener, verify = False)
I ran into the same problem and unfortunately the only thing that worked was setting verify=False in auth.py in Tweepy (for me Tweepy is located in /anaconda3/lib/python3.6/site-packages/tweepy on my Mac):
resp = requests.post(self._get_oauth_url('token'),
                     auth=(self.consumer_key,
                           self.consumer_secret),
                     data={'grant_type': 'client_credentials'},
                     verify=False)
Edit:
Behind a corporate firewall, there is a certificate issue. In Chrome, go to Settings --> Advanced --> Certificates and export your corporate CA certificate. Then, in Tweepy's binder.py, right under session = requests.session(), add
session.verify = 'path_to_corporate_certificate.cer'
First, check whether you can access Twitter using just a proxy configuration. If so, you can modify this line in your code to include a proxy URL:
self.api = tweepy.API(self.auth)
Adding verify=False skips certificate validation. The traffic is still encrypted, but the server's identity is no longer checked, so the connection is open to man-in-the-middle attacks.
pip install certifi
The above installation fixes the bad handshake and ssl error.
For anybody that might stumble on this like I did, I had a similar problem because my company was using a proxy, and the SSL check failed while trying to verify the proxy's certificate.
The solution was to export the proxy's root certificate as a .pem file. Then you can add this certificate to certifi's trust store by doing:
import certifi
cafile = certifi.where()
with open(r'<path to pem file>', 'rb') as infile:
    customca = infile.read()
with open(cafile, 'ab') as outfile:
    outfile.write(customca)
You'll have to replace <path to pem file> with the path to the exported file. This should allow requests (and tweepy) to successfully validate the certificates.
