I have Python code to call a REST service that is something like this:
import urllib
import urllib2
username = 'foo'
password = 'bar'
passwordManager = urllib2.HTTPPasswordMgrWithDefaultRealm()
passwordManager .add_password(None, MY_APP_PATH, username, password)
authHandler = urllib2.HTTPBasicAuthHandler(passwordManager)
opener = urllib2.build_opener(authHandler)
urllib2.install_opener(opener)
params= { "param1" : param1,
"param2" : param2,
"param3" : param3 }
xmlResults = urllib2.urlopen(MY_APP_PATH, urllib.urlencode(params)).read()
results = MyResponseParser.parse(xmlResults)
MY_APP_PATH is currently an HTTP url. I would like to change it to use SSL ("HTTPS"). How would I go about changing this code to use https in the simplest way possible?
Unfortunately, urllib2 and httplib, at least up to Python 2.7 don't do any certificate verification for when using HTTPS. The result is that you're exchanging information with a server you haven't necessarily identified (it's a bit like exchanging a secret with someone whose identity you haven't verified): this defeats the security purpose of HTTPS.
See this quote from httplib (in Python 2.7):
Note: This does not do any certificate
verification.
(This is independent of httplib.HTTPSConnection being able to send a client-certificate: that's what its key and cert parameters are for.)
There are ways around this, for example:
http://thejosephturner.com/blog/post/https-certificate-verification-in-python-with-urllib2/
http://code.google.com/p/python-httpclient/ (not using urllib2, so possibly not the shortest way for you)
Just using HTTPS:// instead of HTTP:// in the URL you are calling should work, at least if you are trying to reach a known/verified server. If necessary, you can use your client-side SSL certificate to secure the API transaction:
mykey = '/path/to/ssl_key_file'
mycert = '/path/to/ssl_cert_file'
opener = urllib2.build_opener(HTTPSClientAuthHandler(mykey, mycert))
opener.add_handler(urllib2.HTTPBasicAuthHandler()) # add HTTP Basic Authentication information...
opener.add_password(user=settings.USER_ID, passwd=settings.PASSWD)
Related
I am trying to scrape using proxies (this proxy server is a free one from the internet); in particular I would like to use their IP, not my private one. To test my script I am trying to Access "http://whatismyipaddress.com/" to see which IP this site sees. As it turns out it will see my private IP. Can somebody tell me what's wrong here?
import requests
from fake_useragent import UserAgent
def getMyIP(proxyServer,myPrivateIP):
scrape_website = "http://whatismyipaddress.com/"
ua = UserAgent()
headers = {'User-Agent': ua.random}
try:
response = requests.get(scrape_website,headers=headers,proxies={"https":proxyServer})
except:
faultString = proxyServer + " did not work; " + "\n"
print(faultString)
return
if myPrivateIP in str(response.content):
print("They found my private IP.")
proxyServer = "http://103.250.158.23:61219"
myPrivateIP = "xxx.xxx.xxx.xxx"
getMyIP(proxyServer,myPrivateIP)
Two things:
You set an {'https': ...} proxy configuration. This means for any HTTPS requests, it will use that proxy. You're requesting an HTTP URL however, so that proxy isn't getting used. Configure an 'http' proxy instead or in addition.
If the proxy forwards your IP in an HTTP header, and the target server heeds that header, that's tough luck and nothing you can do anything about, besides using a different proxy which doesn't forward your IP. I think point 1 is more likely the issue though.
I need to make an API call (of sorts) in Django as a part of the custom authentication system we require. A username and password is sent to a specific URL over SSL (using GET for those parameters) and the response should be an HTTP 200 "OK" response with the body containing XML with the user's info.
On an unsuccessful auth, it will return an HTTP 401 "Unauthorized" response.
For security reasons, I need to check:
The request was sent over an HTTPS connection
The server certificate's public key matches an expected value (I use 'certificate pinning' to defend against broken CAs)
Is this possible in python/django using pycurl/urllib2 or any other method?
Using M2Crypto:
from M2Crypto import SSL
ctx = SSL.Context('sslv3')
ctx.set_verify(SSL.verify_peer | SSL.verify_fail_if_no_peer_cert, depth=9)
if ctx.load_verify_locations('ca.pem') != 1:
raise Exception('No CA certs')
c = SSL.Connection(ctx)
c.connect(('www.google.com', 443)) # automatically checks cert matches host
c.send('GET / \n')
c.close()
Using urllib2_ssl (it goes without saying but to be explicit: use it at your own risk):
import urllib2, urllib2_ssl
opener = urllib2.build_opener(urllib2_ssl.HTTPSHandler(ca_certs='ca.pem'))
xml = opener.open('https://example.com/').read()
Related: Making HTTPS Requests secure in Python.
Using pycurl:
c = pycurl.Curl()
c.setopt(pycurl.URL, "https://example.com?param1=val1¶m2=val2")
c.setopt(pycurl.HTTPGET, 1)
c.setopt(pycurl.CAINFO, 'ca.pem')
c.setopt(pycurl.SSL_VERIFYPEER, 1)
c.setopt(pycurl.SSL_VERIFYHOST, 2)
c.setopt(pycurl.SSLVERSION, 3)
c.setopt(pycurl.NOBODY, 1)
c.setopt(pycurl.NOSIGNAL, 1)
c.perform()
c.close()
To implement 'certificate pinning' provide different 'ca.pem' for different domains.
httplib2 can do https requests with certificate validation:
import httplib2
http = httplib2.Http(ca_certs='/path/to/cert.pem')
try:
http.request('https://...')
except httplib2.SSLHandshakeError, e:
# do something
Just make sure that your httplib2 is up to date. The one which is shipped with my distribution (ubuntu 10.04) does not have ca_certs parameter.
Also in similar question to yours there is an example of certificate validation with pycurl.
how and with which python library is it possible to make an httprequest (https) with a user:password or a token?
basically the equivalent to curl -u user:pwd https://www.mysite.com/
thank you
use python requests : Http for Humans
import requests
requests.get("https://www.mysite.com/", auth=('username','pwd'))
you can also use digest auth...
If you need to make thread-safe requests, use pycurl (the python interface to curl):
import pycurl
from StringIO import StringIO
response_buffer = StringIO()
curl = pycurl.Curl()
curl.setopt(curl.URL, "https://www.yoursite.com/")
# Setup the base HTTP Authentication.
curl.setopt(curl.USERPWD, '%s:%s' % ('youruser', 'yourpassword'))
curl.setopt(curl.WRITEFUNCTION, response_buffer.write)
curl.perform()
curl.close()
response_value = response_buffer.getvalue()
Otherwise, use urllib2 (see other responses for more info) as it's builtin and the interface is much cleaner.
class urllib2.HTTPSHandler
A class to handle opening of HTTPS URLs.
21.6.7. HTTPPasswordMgr Objects
These methods are available on HTTPPasswordMgr and HTTPPasswordMgrWithDefaultRealm objects.
HTTPPasswordMgr.add_password(realm, uri, user, passwd)
uri can be either a single URI, or a sequence of URIs. realm, user and passwd must be strings. This causes (user, passwd) to be used as authentication tokens when authentication for realm and a super-URI of any of the given URIs is given.
HTTPPasswordMgr.find_user_password(realm, authuri)
Get user/password for given realm and URI, if any. This method will return (None, None) if there is no matching user/password.
For HTTPPasswordMgrWithDefaultRealm objects, the realm None will be searched if the given realm has no matching user/password.
Check our urllib2. The examples at the bottom will probably be of interest.
http://docs.python.org/library/urllib2.html
I'm having trouble getting my bot to login to a MediaWiki install on the intranet. I believe it is due to the http authentication protecting the wiki.
Facts:
The wiki root is: https://local.example.com/mywiki/
When visiting the wiki with a web browser, a popup comes up asking for enterprise credentials (I assume this is basic access authentication)
This is what I have in my user-config.py:
mylang = 'en'
family = 'mywiki'
usernames['mywiki']['en'] = u'Bot'
authenticate['local.example.com'] = ('user', 'pass')
This is what I have in mywiki_family.py:
# -*- coding: utf-8 -*-
import family, config
# The Wikimedia family that is known as mywiki
class Family(family.Family):
def __init__(self):
family.Family.__init__(self)
self.name = 'mywiki'
self.langs = { 'en' : 'local.example.com'}
def scriptpath(self, code):
return '/mywiki'
def version(self, code):
return '1.13.5'
def isPublic(self):
return False
def hostname(self, code):
return 'local.example.com'
def protocol(self, code):
return 'https'
def path(self, code):
return '/mywiki/index.php'
When I execute login.py -v -v, I get this:
urllib2.urlopen(urllib2.Request('https://local.example.com/w/index.php?title=Special:Userlogin&useskin=monobook&action=submit', wpSkipCookieCheck=1&wpPassword=XXXX&wpDomain=&wpRemember=1&wpLoginattempt=Aanmelden%20%26%20Inschrijven&wpName=Bot, {'Content-type': 'application/x-www-form-urlencoded', 'User-agent': 'PythonWikipediaBot/1.0'})):
(Redundant traceback info here)
urllib2.HTTPError: HTTP Error 401: Unauthorized
(I'm not sure why it has 'local.example.com/w' instead of '/mywiki'.)
I thought it might be trying to authenticate to example.com instead of example.com/wiki, so I changed the authenticate line to:
authenticate['local.example.com/mywiki'] = ('user', 'pass')
But then I get an HTTP 401.2 error back from IIS:
You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.
Any help on how to get this working would be appreciated.
Update After fixing my family file, it now says:
Getting information for site mywiki:en
('http error', 401, 'Unauthorized', )
WARNING: Could not open 'https://local.example.com/mywiki/index.php?title=Non-existing_page&action=edit&useskin=monobook'. Maybe the server or your connection is down. Retrying in 1 minutes...
I looked at the HTTP headers on a plan urllib2.ulropen call and it's using WWW-Authenticate: Negotiate WWW-Authenticate: NTLM. I'm guessing urllib2 and thus pywikipedia don't support this?
Update Added a tasty bounty for help in getting this to work. I can authenticate using python-ntlm. How do I integrate this into pywikipedia?
Well the fact that login.py tries accessing '\w' instead of your path shows that there is a family configuration issue.
Your code is indented strangely: is scriptpath a member of the new Family class? as in:
class Family(family.Family):
def __init__(self):
family.Family.__init__(self)
self.name = 'mywiki'
self.langs = { 'en' : 'local.example.com'}
def scriptpath(self, code):
return '/mywiki'
def version(self, code):
return '1.13.5'
def isPublic(self):
return False
def hostname(self, code):
return 'local.example.com'
def protocol(self, code):
return 'https'
?
I believe that something is wrong with your family file. A good way to check is to do in a python console:
import wikipedia
site = wikipedia.getSite('en', 'mywiki')
print site.login_address()
as long as the relative address is wrong, showing '/w' instead of '/mywiki', it means that the family file is still not configured correctly, and that the bot won't work :)
Update: how to integrate ntlm in pywikipedia?
I just had a look at the basic example here. I would integrate the code before that line in login.py:
response = urllib2.urlopen(urllib2.Request(self.site.protocol() + '://' + self.site.hostname() + address, data, headers))
You want to write something of the like:
from ntlm import HTTPNtlmAuthHandler
user = 'DOMAIN\User'
password = "Password"
url = self.site.protocol() + '://' + self.site.hostname()
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
# create and install the opener
opener = urllib2.build_opener(auth_NTLM)
urllib2.install_opener(opener)
response = urllib2.urlopen(urllib2.Request(self.site.protocol() + '://' + self.site.hostname() + address, data, headers))
I would test this and integrate it directly into pywikipedia codebase if only I had an available ntlm setup...
Whatever happens, please do not vanish with your solution: we're interested, at pywikipedia, by your solution :)
I am guessing the problem you have is that the server expects basic authentication and you are not handling that in your client. Michael Foord wrote a good article about handling basic authentication in Python.
You did not provide enough information for me to be sure about this, so if that does not work, please provide some additional information, like network dump of you connection attempt.
I need to create a secure channel between my server and a remote web service. I'll be using HTTPS with a client certificate. I'll also need to validate the certificate presented by the remote service.
How can I use my own client certificate with urllib2?
What will I need to do in my code to ensure that the remote certificate is correct?
Because alex's answer is a link, and the code on that page is poorly formatted, I'm just going to put this here for posterity:
import urllib2, httplib
class HTTPSClientAuthHandler(urllib2.HTTPSHandler):
def __init__(self, key, cert):
urllib2.HTTPSHandler.__init__(self)
self.key = key
self.cert = cert
def https_open(self, req):
# Rather than pass in a reference to a connection class, we pass in
# a reference to a function which, for all intents and purposes,
# will behave as a constructor
return self.do_open(self.getConnection, req)
def getConnection(self, host, timeout=300):
return httplib.HTTPSConnection(host, key_file=self.key, cert_file=self.cert)
opener = urllib2.build_opener(HTTPSClientAuthHandler('/path/to/file.pem', '/path/to/file.pem.') )
response = opener.open("https://example.org")
print response.read()
Here's a bug in the official Python bugtracker that looks relevant, and has a proposed patch.
Per Antoine Pitrou's response to the issue linked in Hank Gay's answer, this can be simplified somewhat (as of 2011) by using the included ssl library:
import ssl
import urllib.request
context = ssl.create_default_context()
context.load_cert_chain('/path/to/file.pem', '/path/to/file.key')
opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=context))
response = opener.open('https://example.org')
print(response.read())
(Python 3 code, but the ssl library is also available in Python 2).
The load_cert_chain function also accepts an optional password parameter, allowing the private key to be encrypted.
check http://www.osmonov.com/2009/04/client-certificates-with-urllib2.html