How to disable SSL verification for urlretrieve?

How to disable SSL verification for urlretrieve? - python

A similar question had been asked a couple times around SO, but the solutions are for urlopen. That function takes an optional context parameter which can accept a pre-configured SSL context. urlretrieve does not have this parameter. How can I bypass SSL verification errors in the following call?
urllib.request.urlretrieve(
"http://sourceforge.net/projects/libjpeg-turbo/files/1.3.1/libjpeg-turbo-1.3.1.tar.gz/download",
destFolder+"/libjpeg-turbo.tar.gz")

This solution worked as well for me: before making the call to the library, define the default SSL context:
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
# urllib.request.urlretrieve(...)
Source: http://thomas-cokelaer.info/blog/2016/01/python-certificate-verified-failed/

This does not appear to be possible with urlretrieve (in Python >=2.7.9, or Python >=3.0).
The requests package is recommended as a replacement.
Edited to add: the context parameter has been added to the code, even though it isn't mentioned in the documentation! Hat-tip #Sushisource

Related

make any internet-accessing python code work (proxy + custom .crt)

The situation
If the following is not done, all outgoing HTTP or HTTPS requests made with python ends in a WinError 10054 Connection Reset, or a SSL bad handshake error.
set the HTTP_PROXY, HTTPS_PROXY environment variable, or their counterparts
What needs to be verified must be verified with a custom .crt file.
For example, assuming the .crt file is in place, both gets me a 200 OK:
import os
os.environ['HTTP_PROXY'] = #some_appropriate_address
os.environ['HTTPS_PROXY'] = #some appropriate_address
requests.get('http://www.google.com',verify="C:\the_file.crt") # 200 OK
requests.get('http://httpbin.org',verify=False) # 200 OK, but unsafe
requests.get('http://httpbin.org') # SSL bad handshake error
The Problem
There is this massive jumble of pre-written code (heavily utilizing urllib3 and requests and possibly other pieces of internet-accessing code) I have, and I have to make it work under the conditions outlined above.
Sure, I can write verify='C:\the_file.crt' for every requests.get(), but that can very quickly get hairy, right? And the code may also be using some other library (that is not requests). So I am looking for a global setting (environment variable etc.) I should alter, so that everything works well (return a 200 OK upon a GET request to a server, whether or not the code is written in requests-py).
Also, if there is no such way, I would like an explanation as to why.
What I tried (am trying)
Maybe editing the .condarc file (via conda --config) is a solution. I tried, to no avail: python gives me a "SSL verification failed" error. On the contrary, note that the code snippet above gave me a 200 OK. To my knowledge, this does not fit nicely with many situations that were previously discussed in Stack Overflow.
By the way, setting ssl_verify to false does not solve the problem either; I still get a bad handshake error for some reason.
I am using Win 10, Python 3.7.4 (Anaconda).
Update
I have edited the question to prevent future misunderstandings about the content of this question. A few answers below are a reiteration of what was written here from the start.
The current answers are not entirely satisfactory either, as they only seem to address the case where I am using requests or urllib3.

You should be able to get any python code that uses the requests module(which is inside urllib3) to work behind a proxy without modifying the python code itself by setting the following environment variables in Windows.
http_proxy http://[<user>:<pwd>#]<http_host>:<http_port>
https_proxy http://[<user>:<pwd>#]<https_host>:<https_port>
requests_ca_bundle <path_to_ca_bundle.crt>
curl_ca_bundle <path_to_ca_bundle.crt>
You can set environment variables by doing the following:
Press Windows-Key + R, enter sysdm.cpl ,3 (mind the space before the comma) and press Enter
Click the Environment variables button
In either of the fields (User variables or System variables), add the four variables

According to Doc in Requests:
https://requests.readthedocs.io/en/master/user/advanced/#proxies
you can use proxy in this way:
proxies = { 'http': 'http://10.10.1.10:3128', 'https': 'http://10.10.1.10:1080',}
requests.get('http://example.org', proxies=proxies)
Then depending on if you want to add .crt or .pem:
requests.get('https://kennethreitz.com', cert=('/path/server.crt', '/path/key'))
requests.get('https://kennethreitz.org', cert='/path/client.pem')
https://2.python-requests.org//en/v1.0.4/user/advanced/

You are trying to make https requests to an outer url and you need to provide the proper certificate files for verification. You are trying to make these configurations inside each component. But I would suggest that you make those configurations globally and system-wide so neither of the components need to provide certificates and deal with ssl-verification stuff.
I am awful at windows related networking configurations, but I would suggest you go check Proxifier and I am pretty sure you can configure a ssl proxy with proper certificates.

Python http.client - What is the difference between request and putrequest?

The documentation I've found explaining http.client for Python seems a bit sparse. I want to use it over requests because requests has not worked for our project.
So, knowing that I'm using Python's http.client, I'm seeing again and again request and putrequest. Both methods are defined here under HTTPConnection.
HTTPConnection.request: This will send a request to the server using
the HTTP request method method and the selector url.
HTTPConnection.putrequest: This should be the first call after the
connection to the server has been made. It sends a line to the server
consisting of the method string, the url string, and the HTTP version
(HTTP/1.1). To disable automatic sending of Host: or Accept-Encoding:
headers (for example to accept additional content encodings), specify
skip_host or skip_accept_encoding with non-False values.
Also, the source code for both is defined in this file.
From my guess and reading things, it seems like request is a more high level API compared to putrequest. Is that correct?

The Answer: request() is an abstracted version of multiple functions, putrequest() being one of them.
Although this is defined in the documentation, it's easy to skip over the line that answers this question.
This is pointed out in this line of the http.client documentation:
As an alternative to using the request() method described above, you can also send your request step by step, by using the four functions below.

SSLError("bad handshake: Error([('SSL routines', 'tls_process_ske_dhe', 'dh key too small' in Python

I have seen a few links for this issue and most people want the server to be updated for security reasons. I am looking to make an internal only tool and connect to a server that is not able to be modified. My code is below and I am hopeful I can get clarity on how I can accept the small key and process the request.
Thank you all in advance
import requests
from requests.auth import HTTPBasicAuth
import warnings
import urllib3
warnings.filterwarnings("ignore")
requests.packages.urllib3.disable_warnings()
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS += 'HIGH:!DH:!aNULL'
#requests.packages.urllib3.contrib.pyopenssl.DEFAULT_SSL_CIPHER_LIST += 'HIGH:!DH:!aNULL'
url = "https://x.x.x.x/place/stuff"
userName = 'stuff'
passW = 'otherstuff'
dataR = requests.get(url,auth=HTTPBasicAuth(userName, passW),verify=False)
print(dataR.text)

The problem with too small DH keys is discussed in length at https://weakdh.org` with various remediations.
Now in your case it depends on OpenSSL which Python uses under the hood. It hardcodes thing to reject too small values.
Have a look at: How to reject weak DH parameters in an OpenSSL client?
Currently OpenSSL in client mode stops handshake only if the keylength of server selected DH parameters is less than 768 bit (hardcoded in source).
Based on the answer there, you could use SSL_CTX_set_tmp_dh_callback and SSL_set_tmp_dh_callback to control things more to your liking... except that at that time it did not seem to work at the client side only the server side.
Based on http://openssl.6102.n7.nabble.com/How-to-enforce-DH-field-size-in-the-client-td60442.html it seems that some work was added in the 1.1.0 branch for that problem. It seems to hint at a commit 2001129f096d10bbd815936d23af3e97daf7882d in 1.0.2 so first maybe try a newer version of OpenSSL (you did not specify which versions you are using).
However even if you manage to have everything working with OpenSSL you still need your Python to use it (so probably to compile python yourself) and then have the specific API inside Python to work on that... to be honest I think you will loose far less time fixing the service (even if you say you can not modify it) instead of trying to basically cripple the client, as rejecting small keys is a good thing (for reasons explained in the first link).

Using certifi module with urllib2?

I'm having trouble downloading https pages with the urllib2 module, which seems to result from urllib2's inability to access the system's certificate store.
To get around this issue, one possible solution is to download https web pages with pycurl, by using the certifi module. The following is an example of doing so:
def download_web_page_with_curl(url_website):
from pycurl import Curl, CAINFO, URL
from certifi import where
from cStringIO import StringIO
response = StringIO()
curl = Curl()
curl.setopt(CAINFO, where())
curl.setopt(URL, url_website)
curl.setopt(curl.WRITEFUNCTION, response.write)
curl.perform()
curl.close()
return response.getvalue()
Is there a way to use certifi with urllib2 (in a fashion comparable to the pycurl example above), which will permit me to download https sites? Alternatively, is there another feasible urllib2-based workaround which will remedy the permissions issue, without compromising security?

Would recommend using requests per my other answer. However, to answer the original question of how to do this with urllib2:
import urllib2
import certifi
def download_web_page_with_urllib2(url_website):
t = urllib2.urlopen(url_website, cafile=certifi.where())
return t.read()
text = download_web_page_with_urllib2('https://www.google.com/')
The same recommendations about error checking apply.

Expanding on the comment to use requests (which is built on urllib3):
def download_web_page_with_requests(url_website):
import requests
r = requests.get(url_website)
return r.text
This is so much easier than anything else and properly handles SSL verification independent of the platform's own cert lists. If certifi is found, requests will automatically use it. If not, it silently falls back to a more limited, possibly older set of built-in root certs. If ensuring that certifi is used matters to you, you can do this:
r = requests.get(url_website, verify=certifi.where())
Note that the above code does not do the error checking that you should probably do. So I'll point out that requests.get() can throw a number of exceptions for invalid ULRs, unreachable sites, communication errors, and failed certification validation, so you should be prepared to catch and deal with those. If it does successfully talk to a server, but the server returns a non-OK status code (such as for a non-existent page), then an exception won't be thrown, so you'd also want to check that r.status_code==200.

Suds ignoring proxy setting

I'm trying to use the salesforce-python-toolkit to make web services calls to the Salesforce API, however I'm having trouble getting the client to go through a proxy. Since the toolkit is based on top of suds, I tried going down to use just suds itself to see if I could get it to respect the proxy setting there, but it didn't work either.
This is tested on suds 0.3.9 on both OS X 10.7 (python 2.7) and ubuntu 12.04.
an example request I've made that did not end up going through the proxy (just burp or charles proxy running locally):
import suds
ws = suds.client.Client('file://sandbox.xml',proxy={'http':'http://localhost:8888'})
ws.service.login('user','pass')
I've tried various things with the proxy - dropping http://, using an IP, using a FQDN. I've stepped through the code in pdb and see it setting the proxy option. I've also tried instantiating the client without the proxy and then setting it with:
ws.set_options(proxy={'http':'http://localhost:8888'})
Is proxy not used by suds any longer? I don't see it listed directly here http://jortel.fedorapeople.org/suds/doc/suds.options.Options-class.html, but I do see it under transport. Do I need to set it differently through a transport? When I stepped through in pdb it did look like it was using a transport, but I'm not sure how.
Thank you!

I went into #suds on freenode and Xelnor/rbarrois provided a great answer! Apparently the custom mapping in suds overrides urllib2's behavior for using the system configuration environment variables. This solution now relies on having the http_proxy/https_proxy/no_proxy environment variables set accordingly.
I hope this helps anyone else running into issues with proxies and suds (or other libraries that use suds). https://gist.github.com/3721801
from suds.transport.http import HttpTransport as SudsHttpTransport
class WellBehavedHttpTransport(SudsHttpTransport):
"""HttpTransport which properly obeys the ``*_proxy`` environment variables."""
def u2handlers(self):
"""Return a list of specific handlers to add.
The urllib2 logic regarding ``build_opener(*handlers)`` is:
- It has a list of default handlers to use
- If a subclass or an instance of one of those default handlers is given
in ``*handlers``, it overrides the default one.
Suds uses a custom {'protocol': 'proxy'} mapping in self.proxy, and adds
a ProxyHandler(self.proxy) to that list of handlers.
This overrides the default behaviour of urllib2, which would otherwise
use the system configuration (environment variables on Linux, System
Configuration on Mac OS, ...) to determine which proxies to use for
the current protocol, and when not to use a proxy (no_proxy).
Thus, passing an empty list will use the default ProxyHandler which
behaves correctly.
"""
return []
client = suds.client.Client(my_wsdl, transport=WellBehavedHttpTransport())

I think you can do by using a urllib2 opener like below.
import suds
t = suds.transport.http.HttpTransport()
proxy = urllib2.ProxyHandler({'http': 'http://localhost:8888'})
opener = urllib2.build_opener(proxy)
t.urlopener = opener
ws = suds.client.Client('file://sandbox.xml', transport=t)

I was actually able to get it working by doing two things:
making sure there were keys in the proxy dict for http as well as https.
setting the proxy using set_options AFTER creation of the client.
So, my relevant code looks like this:
self.suds_client = suds.client.Client(wsdl)
self.suds_client.set_options(proxy={'http': 'http://localhost:8888', 'https': 'http://localhost:8888'})

I had multiple issues using Suds, even though my proxy was configured properly I could not connect to the endpoint wsdl. After spending significant time attempting to formulate a workaround, I decided to give soap2py a shot - https://code.google.com/p/pysimplesoap/wiki/SoapClient
Worked straight off the bat.

For anyone who's attempting cji's solution over HTTPS, you actually need to keep one of the handlers for the basic authentication. I also am using python3.7 so urllib2 has been replaced with urllib.request.
from suds.transport.https import HttpAuthenticated as SudsHttpsTransport
from urllib.request import HTTPBasicAuthHandler
class WellBehavedHttpsTransport(SudsHttpsTransport):
""" HttpsTransport which properly obeys the ``*_proxy`` environment variables."""
def u2handlers(self):
""" Return a list of specific handlers to add.
The urllib2 logic regarding ``build_opener(*handlers)`` is:
- It has a list of default handlers to use
- If a subclass or an instance of one of those default handlers is given
in ``*handlers``, it overrides the default one.
Suds uses a custom {'protocol': 'proxy'} mapping in self.proxy, and adds
a ProxyHandler(self.proxy) to that list of handlers.
This overrides the default behaviour of urllib2, which would otherwise
use the system configuration (environment variables on Linux, System
Configuration on Mac OS, ...) to determine which proxies to use for
the current protocol, and when not to use a proxy (no_proxy).
Thus, passing an empty list (asides from the BasicAuthHandler)
will use the default ProxyHandler which behaves correctly.
"""
return [HTTPBasicAuthHandler(self.pm)]

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.