I can't seem to get SUDS to download a WSDL that requires Basic auth credentials. My code is simple:
wsdl_url = 'https://example.com/ChangeRequest.do?WSDL'
self.client = Client(wsdl_url, username=username, password=password)
I've also tried:
from suds.transport.https import HttpAuthenticated
wsdl_url = 'https://example.com/ChangeRequest.do?WSDL'
credentials = dict(username=username, password=password)
t = HttpAuthenticated(**credentials)
self.client = Client(url=wsdl_url, transport=t)
In both cases, the service returns a 403 Forbidden error. I can go down into the SUDS code in http.py and add this line to the call:
u2request.add_header('Authorization','Basic xxxxxxxxxxxxxxxxxxxx')
This works. What am I doing wrong to get SUDS to pass my credentials when downloading the WSDL?
Note: I can connect to the WSDL directly using both Chrome's Postman plugin and SoapUI, and the service works there as well, so I know the credentials are correct.
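For reference, the header that works when patched into http.py can also be injected without editing suds itself, by overriding the transport's open() in the same spirit as the workaround in the answer below. This is an untested sketch against suds 0.4's transport API, with the class name made up for illustration:
import base64
import urllib2
from suds.client import Client
from suds.transport import TransportError
from suds.transport.https import HttpAuthenticated

class WsdlAuthTransport(HttpAuthenticated):
    # Send the Basic auth header on the WSDL download itself,
    # which the stock transport does not do.
    def open(self, request):
        try:
            token = base64.b64encode('%s:%s' % (self.options.username, self.options.password))
            u2request = urllib2.Request(request.url,
                                        headers={'Authorization': 'Basic ' + token})
            self.proxy = self.options.proxy
            return self.u2open(u2request)
        except urllib2.HTTPError, e:
            raise TransportError(str(e), e.code, e.fp)

t = WsdlAuthTransport(username=username, password=password)
client = Client(wsdl_url, transport=t)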
I encountered a similar issue (suds v0.4, WSDL, 403) and found out that it was because the server I was trying to access blocks any request whose User-Agent header looks like Python-urllib* (suds uses urllib2, hence the default header). Explicitly changing the header solved the issue.
Particular to my solution: I overrode the open method of a transport class and set the client options, as in the following code snippet. Note that the header needs to be set separately for open and for subsequent requests. Please advise better ways to circumvent this if you know any; I hope this post saves someone's time in the future.
import urllib2
import suds
from suds.client import Client
from suds.transport.https import HttpAuthenticated
from suds.transport import TransportError

URL = 'https://example.com/ChangeRequest.do?WSDL'

class HttpHeaderModify(HttpAuthenticated):
    def open(self, request):
        try:
            url = request.url
            # Replace the default Python-urllib User-Agent so the server accepts the request.
            u2request = urllib2.Request(url, headers={'User-Agent': 'Mozilla'})
            self.proxy = self.options.proxy
            return self.u2open(u2request)
        except urllib2.HTTPError, e:
            raise TransportError(str(e), e.code, e.fp)

transport = HttpHeaderModify()
client = Client(URL, transport=transport, timeout=10)
# Subsequent requests' headers need to be set again here. The overridden transport
# class only handles the opening of the client.
client.set_options(headers={'User-Agent': 'Mozilla'})
P.S. Though my problem may not be the same, searching for "403 suds" pops up this SO question, so I decided to just post my solution here.
reference post that gave me the right direction: https://bitbucket.org/jurko/suds/issues/27/client-request-for-wsdl-does-not-use-given
I had this issue before; comparing suds' request with the SoapUI one, I found that suds was missing the Host header.
client.set_options(headers={'Host': 'value'})
Adding it fixed the issue.
I am trying to web-scrape data from https://www.mcmaster.com. They have provided me with a .pfx file and a passphrase. When making a GET request on Postman using their .json file, I input my website login/password and upload the .pfx certificate with its passphrase and everything works fine. Now I am trying to do this same thing but in Python, but am a bit unsure.
Here is my current Python code, I am unsure where I would put the website email/password login and how to successfully do a GET request.
from requests_pkcs12 import get

r = get('https://api.mcmaster.com/v1/login',
        pkcs12_filename='Schallert.pfx',
        pkcs12_password='mcmasterAPI#1901')
print(r.text)
Here is how I have it set up in Postman (website email/password login): [screenshot]
[screenshot: .PFX certificate page]
Postman has a built-in feature that converts requests into code. You can do it like so:
On the far right, click the Code Snippet button (</>).
Once you are on that page, there are two available Python options.
Then all you need to do is copy the code into a Python file and add your customizations (it should already be optimized).
One thing I'll warn you about, though, is the URL. Postman doesn't add http:// or https:// to the URL, which means Python will throw a "no scheme supplied" (MissingSchema) error.
Available packages for auto-conversion:
Requests
http.client
This means you will have to use a different package instead of requests_pkcs12.
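That said, if you want to keep the requests code that Postman generates and still present the .pfx file, the requests_pkcs12 package also provides a Pkcs12Adapter that can be mounted on a plain requests session. A sketch, reusing the filename and passphrase from the question:
import requests
from requests_pkcs12 import Pkcs12Adapter

session = requests.Session()
# Every request to this host now presents the client certificate
session.mount('https://api.mcmaster.com',
              Pkcs12Adapter(pkcs12_filename='Schallert.pfx',
                            pkcs12_password='mcmasterAPI#1901'))
response = session.get('https://api.mcmaster.com/v1/login')
print(response.text)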
After a quick web search, it looks like you need to create a temporary certificate as a .pem file, which is then passed to the request.
from contextlib import contextmanager
from pathlib import Path
from tempfile import NamedTemporaryFile

import requests
from cryptography.hazmat.primitives.serialization import Encoding, PrivateFormat, NoEncryption
from cryptography.hazmat.primitives.serialization.pkcs12 import load_key_and_certificates

@contextmanager
def pfx_to_pem(pfx_path, pfx_password):
    pfx = Path(pfx_path).read_bytes()
    private_key, main_cert, add_certs = load_key_and_certificates(
        pfx, pfx_password.encode('utf-8'), None)
    with NamedTemporaryFile(suffix='.pem') as t_pem:
        with open(t_pem.name, 'wb') as pem_file:
            pem_file.write(private_key.private_bytes(Encoding.PEM, PrivateFormat.PKCS8, NoEncryption()))
            pem_file.write(main_cert.public_bytes(Encoding.PEM))
            for ca in add_certs:
                pem_file.write(ca.public_bytes(Encoding.PEM))
        yield t_pem.name

# url and payload as appropriate for your request
with pfx_to_pem('your pfx file path', 'your passphrase') as cert:
    requests.get(url, cert=cert, data=payload)
The package requests_pkcs12 is a wrapper written on top of the requests package, so all the parameters that requests accepts are also accepted by requests_pkcs12.
Here is the source code proof for that. https://github.com/m-click/requests_pkcs12/blob/master/requests_pkcs12.py#L156
Also, from your screenshots, I understood that you are using POST not GET.
import json
from requests_pkcs12 import post

url = "https://api.mcmaster.com/v1/login"
payload = {'Username': 'yourusername',
           'Password': 'yourpassword'}

resp = post(url, pkcs12_filename='Schallert.pfx', pkcs12_password='mcmasterAPI#1901', data=json.dumps(payload))
print(resp)
Footnote: I hope the password mcmasterAPI#1901 is a fake one. If not, please don't share any credentials on the platform.
I am writing a script that downloads Sentinel-2 products (satellite imagery) using the sentinelsat Python API.
A product's description is structured as JSON and contains the parameter quicklook_url.
Example:
https://apihub.copernicus.eu/apihub/odata/v1/Products('862619d6-9b82-4fe0-b2bf-4e1c78296990')/Products('Quicklook')/$value
Any Sentinel API call requires credentials: retrieving a product does, and so does opening the link stored inside quicklook_url. When I open the example in my browser, I am asked to enter a username and password, and I then get the quicklook image named S2A_MSIL2A_20210625T065621_N0300_R063_T39NTJ_20210625T093748-ql.jpg.
Needless to say, I am just starting with the API, so I am probably missing something, but
requests.post(product_description['quicklook_url'], verify=False, auth=HTTPBasicAuth(username, password)).content
yields 0KB damaged file and
requests.get(product_description['quicklook_url']).content
yields 1KB damaged file.
I have looked into requests.Session
session = requests.Session()
session.auth = (username, password)
auth = session.post('URL_FOR_LOGING_IN')
img = session.get(product_description['quicklook_url']).content
The problem is that I am unable to find the URL I need to POST my session authentication to. I am fairly sure the sentinelsat API does this, but my searching has not yielded any successful result.
I am currently looking into the SentinelAPI class. It has the download_quicklook() function, which I am using right now, but I am still curious how to do this without the function.
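For reference, this is roughly how download_quicklook() is called; the directory_path argument name is taken from sentinelsat's download API and may differ between versions:
from sentinelsat import SentinelAPI

api = SentinelAPI(username, password, 'https://apihub.copernicus.eu/apihub')
# Saves <product title>-ql.jpg into the given directory
api.download_quicklook('862619d6-9b82-4fe0-b2bf-4e1c78296990', directory_path='.')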
I guess you don't need to send a POST request. Basic authentication works by sending a header along with each request. The following should work:
session = requests.Session()
session.auth = (username, password)
img = session.get(product_description['quicklook_url']).content
Your first attempt failed because it used POST, I think.
requests.get(product_description['quicklook_url'], verify=False, auth=HTTPBasicAuth(username, password)).content
should also work.
I have to work with an API that has multiple services, all of which require the JSESSIONID cookie from the authentication service below. When I call the next service, however, the client doesn't keep the cookie, and so the calls are rejected.
from suds.client import Client
url = 'http://example/ws/Authenticate?wsdl'
client = Client(url)
result = client.service.connect(username='admin', password='admin')
print client.options.transport.cookiejar
>>> <cookielib.CookieJar[<Cookie JSESSIONID=XXXXXXXXXX for localhost.local/Service/>]>
I believe that the way to get it to keep this cookie is to extract it and then provide it as a custom header, in this format:
url = 'http://example/ws/dostuffnowloggedin?wsdl'
client2 = Client(url, headers= { 'Cookie': 'JSESSIONID=value'})
But I can't figure out how to do that. I've reviewed the SUDS docs, the urllib2 and cookielib Python docs, looked over Stack, and asked on Reddit. This is the first question I've asked on Stack; I've tried to make it meaningful and specific, but if I've committed a faux pas, tell me and I'll do my best to correct it.
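For what it's worth, the extract-and-pass-a-header idea sketched above can be completed like this; an untested sketch using cookielib's Cookie attributes:
# Pull the JSESSIONID out of the authenticated client's jar
jar = client.options.transport.cookiejar
jsession = [c for c in jar if c.name == 'JSESSIONID'][0]

url2 = 'http://example/ws/dostuffnowloggedin?wsdl'
client2 = Client(url2, headers={'Cookie': '%s=%s' % (jsession.name, jsession.value)})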
Try this.
from suds.client import Client
url = 'http://example/ws/Authenticate?wsdl'
client = Client(url)
result = client.service.connect(username='admin', password='admin')
url2='url of second service'
client2=Client(url2)
client2.options.transport.cookiejar=client.options.transport.cookiejar
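With the jar shared, the JSESSIONID is sent automatically on later calls; for example (doStuff is a hypothetical service method):
result2 = client2.service.doStuff()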
I have written code to get URLs from Bing search. It gives the error mentioned above.
import urllib
import urllib2
accountKey = 'mykey'
username = accountKey
queryBingFor = "'JohnDalton'"
quoted_query = urllib.quote(queryBingFor)
rootURL = "https://api.datamarket.azure.com/Bing/Search/"
searchURL = rootURL + "Image?$format=json&Query=" + quoted_query
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, searchURL,username,accountKey)
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)
readURL = urllib2.urlopen(searchURL).read()
I have set username = accountKey, as someone told me the two have to be the same. Anyway, I didn't get a username when I made the Bing webmaster account; or is it just my email? Excuse me if I have made novice mistakes; I've just started with Python.
In the absence of any other information, it seems unlikely that what is effectively your username and password would be the same thing if this site actually needs this form of authorisation.
Are you able to make it work by doing a request in your browser like the following?
https://mykey:mykey@api.datamarket.azure.com/Bing/Search/Image?$format=json&Query=blah
If so, then at least it sounds like the credentials are right and it's the way you are using them in Python that's wrong. But more likely the above will fail with the same error, suggesting the credentials themselves are not valid.
Also see this question, which suggests there may be a problem if the site doesn't do 'standard' auth: urllib2 HTTPPasswordMgr not working - Credentials not sent error.
It also suggests that you might need to pass the top-level URL of the site to the password manager rather than the specific search URL.
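That would look something like the following sketch, where the root URL is a guess at the "top level" the password manager should cover:
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# Register the credentials for the whole host rather than the full search URL
password_mgr.add_password(None, 'https://api.datamarket.azure.com/', accountKey, accountKey)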
Finally, it might be worth adapting this code:
http://www.voidspace.org.uk/python/articles/authentication.shtml
for your site, to check that the auth realm and scheme the site is sending are supported.
I'm having trouble getting my bot to login to a MediaWiki install on the intranet. I believe it is due to the http authentication protecting the wiki.
Facts:
The wiki root is: https://local.example.com/mywiki/
When visiting the wiki with a web browser, a popup comes up asking for enterprise credentials (I assume this is basic access authentication)
This is what I have in my user-config.py:
mylang = 'en'
family = 'mywiki'
usernames['mywiki']['en'] = u'Bot'
authenticate['local.example.com'] = ('user', 'pass')
This is what I have in mywiki_family.py:
# -*- coding: utf-8 -*-
import family, config

# The Wikimedia family that is known as mywiki
class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'mywiki'
        self.langs = {'en': 'local.example.com'}

    def scriptpath(self, code):
        return '/mywiki'

    def version(self, code):
        return '1.13.5'

    def isPublic(self):
        return False

    def hostname(self, code):
        return 'local.example.com'

    def protocol(self, code):
        return 'https'

    def path(self, code):
        return '/mywiki/index.php'
When I execute login.py -v -v, I get this:
urllib2.urlopen(urllib2.Request('https://local.example.com/w/index.php?title=Special:Userlogin&useskin=monobook&action=submit', wpSkipCookieCheck=1&wpPassword=XXXX&wpDomain=&wpRemember=1&wpLoginattempt=Aanmelden%20%26%20Inschrijven&wpName=Bot, {'Content-type': 'application/x-www-form-urlencoded', 'User-agent': 'PythonWikipediaBot/1.0'})):
(Redundant traceback info here)
urllib2.HTTPError: HTTP Error 401: Unauthorized
(I'm not sure why it has 'local.example.com/w' instead of '/mywiki'.)
I thought it might be trying to authenticate to example.com instead of example.com/wiki, so I changed the authenticate line to:
authenticate['local.example.com/mywiki'] = ('user', 'pass')
But then I get an HTTP 401.2 error back from IIS:
You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.
Any help on how to get this working would be appreciated.
Update: After fixing my family file, it now says:
Getting information for site mywiki:en
('http error', 401, 'Unauthorized', )
WARNING: Could not open 'https://local.example.com/mywiki/index.php?title=Non-existing_page&action=edit&useskin=monobook'. Maybe the server or your connection is down. Retrying in 1 minutes...
I looked at the HTTP headers on a plain urllib2.urlopen call, and it's using WWW-Authenticate: Negotiate and WWW-Authenticate: NTLM. I'm guessing urllib2, and thus pywikipedia, don't support this?
Update: I've added a tasty bounty for help in getting this to work. I can authenticate using python-ntlm. How do I integrate this into pywikipedia?
Well, the fact that login.py tries accessing '/w' instead of your path shows that there is a family configuration issue.
Your code is indented strangely: is scriptpath a member of the new Family class? as in:
class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)
        self.name = 'mywiki'
        self.langs = {'en': 'local.example.com'}

    def scriptpath(self, code):
        return '/mywiki'

    def version(self, code):
        return '1.13.5'

    def isPublic(self):
        return False

    def hostname(self, code):
        return 'local.example.com'

    def protocol(self, code):
        return 'https'
?
I believe that something is wrong with your family file. A good way to check is to run this in a Python console:
import wikipedia
site = wikipedia.getSite('en', 'mywiki')
print site.login_address()
As long as the relative address is wrong, showing '/w' instead of '/mywiki', it means the family file is still not configured correctly and the bot won't work :)
Update: how to integrate ntlm in pywikipedia?
I just had a look at the basic example here. I would integrate the code before that line in login.py:
response = urllib2.urlopen(urllib2.Request(self.site.protocol() + '://' + self.site.hostname() + address, data, headers))
You want to write something like:
from ntlm import HTTPNtlmAuthHandler

user = 'DOMAIN\\User'
password = "Password"
url = self.site.protocol() + '://' + self.site.hostname()
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
# create and install the opener
opener = urllib2.build_opener(auth_NTLM)
urllib2.install_opener(opener)
response = urllib2.urlopen(urllib2.Request(self.site.protocol() + '://' + self.site.hostname() + address, data, headers))
I would test this and integrate it directly into the pywikipedia codebase if only I had an NTLM setup available...
Whatever happens, please do not vanish with your solution: we at pywikipedia are interested in it :)
I am guessing the problem you have is that the server expects basic authentication and you are not handling that in your client. Michael Foord wrote a good article about handling basic authentication in Python.
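For reference, the usual urllib2 pattern for basic authentication, roughly what that article walks through, using the wiki URL and placeholder credentials from the question:
import urllib2

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'https://local.example.com/mywiki/', 'user', 'pass')
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(password_mgr))
urllib2.install_opener(opener)
# Subsequent urllib2.urlopen calls will answer basic-auth challenges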
You did not provide enough information for me to be sure about this, so if that does not work, please provide some additional information, like a network dump of your connection attempt.