I am using Suds to access Sharepoint lists through SOAP, but I am having some trouble with malformed SOAP.
I am using the following code:
from suds.client import Client
from suds.sax.element import Element
from suds.sax.attribute import Attribute
from suds.transport.https import WindowsHttpAuthenticated
import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger('suds.client').setLevel(logging.DEBUG)
ntlm = WindowsHttpAuthenticated(username='somedomain\\username', password='password')
url = "http://somedomain/sites/somesite/someothersite/somethirdsite/_vti_bin/Lists.asmx?WSDL"
client = Client(url, transport=ntlm)
result = client.service.GetListCollection()
print repr(result)
Every time I run this, I get the result Error 400 Bad request. As I have debugging enabled I can see the resulting envelope:
<SOAP-ENV:Envelope xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns0:Body>
<ns1:GetListCollection/>
</ns0:Body>
</SOAP-ENV:Envelope>
...with this error message:
DEBUG:suds.client:http failed:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>
Running the same WSDL (and the raw envelope data as well) through SoapUI, the request returns the expected values. Can anyone see an obvious reason why I get different results with Suds than with SoapUI, and how I can correct this?
UPDATE: after testing the exact same code on a different Sharepoint site (i.e. not a subsubsubsite with whitespace in its name) and with Java (JAX-WS, which also had issues with the same site, though, different issues) it appears as if it works as expected. As a result I wonder if one of two details may be the reason for these problems:
SOAP implementations have some issues with subsubsubsites in Sharepoint?
SOAP implementations have some issues with whitespace in its name, even if using %20 as a replacement?
I still have the need to use the original URL with those issues, so any input would be highly appreciated. I assume that since SoapUI worked with the original url, it should be possible to correct whatever is wrong.
I think I narrowed down the issue, and it is specific to suds (possibly other SOAP implementations as well). Your bullet point:
SOAP implementations have some issues with whitespace in its name, even if using %20 as a replacement?
That's spot on. Turning up debug logging for suds allowed me to grab the endpoint, envelope, and headers. Mimicking the exact same call using cURL returns a valid response, but suds throws the bad request.
The issue is that suds takes your WSDL (url parameter) and parses it, but it doesn't include the URL encoded string. This leads to debug messages like this:
DEBUG:suds.transport.http:opening (https://sub.site.com/sites/Site Collection with Spaces/_vti_bin/UserGroup.asmx?WSDL)
<snip>
TransportError: HTTP Error 400: Bad Request
Piping this request through a fiddler proxy showed that it was running the request against the URL https://sub.site.com/sites/Site due to the way it parses the WSDL. The issue is that you aren't passing the location parameter to suds.client.Client. The following code gives me valid responses every time:
from ntlm3 import ntlm
from suds.client import Client
from suds.transport.https import WindowsHttpAuthenticated
# URL without ?WSDL
url = 'https://sub.site.com/sites/Site%20Collection%20with%20Spaces/_vti_bin/Lists.asmx'
# Create NTLM transport handler
transport = WindowsHttpAuthenticated(username='foo',
                                     password='bar')
# We use FBA, so this forces it to challenge us with
# a 401 so WindowsHttpAuthenticated can take over.
msg = ("%s\\%s" % ('DOM', 'foo'))
auth = 'NTLM %s' % ntlm.create_NTLM_NEGOTIATE_MESSAGE(msg).decode('ascii')
# Create the client and append ?WSDL to the URL.
client = Client(url=(url + "?WSDL"),
                location=url,
                transport=transport)
# Add the NTLM header to force negotiation.
header = {'Authorization': auth}
client.set_options(headers=header)
One caveat: quote from urllib works, but you cannot encode the entire URL; with the default settings the colon after the scheme is escaped as well, and the result is no longer recognized as a URL. You are better off just replacing spaces with %20:
url = url.replace(' ','%20')
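If you would rather use quote than a plain replace, here is a minimal sketch that encodes only the path component. It is written for Python 3 (urllib.parse; in Python 2 the same functions live in urllib and urlparse), and the helper name encode_path_only is my own, not part of suds or urllib.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_path_only(url):
    """Percent-encode the path of a URL, leaving scheme, host and query alone."""
    parts = urlsplit(url)
    # safe="/%" keeps path separators and leaves existing %20 sequences
    # untouched, so an already-encoded URL is not encoded a second time.
    # (A literal '%' that is not part of an escape would also be left as-is.)
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


raw = "https://sub.site.com/sites/Site Collection with Spaces/_vti_bin/Lists.asmx"

# Only the spaces in the path are escaped.
print(encode_path_only(raw))

# Quoting the whole URL also escapes the ':' after the scheme.
print(quote(raw))
```

The result of encode_path_only can then be used as the location argument, with "?WSDL" appended for the url argument, as in the code above.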
Hope this keeps someone else from banging their head against the wall.
I use the NASA GSFC server to retrieve data from their archives.
I send a request and receive the response as plain text.
I discovered that they changed their page so that a login is now required.
However, even after logging in I receive an error.
I read the information in the thread "how do python capture 302 redirect url"
and tried both the urllib2 and requests libraries, but I still receive an error.
Currently the part of my code responsible for downloading data looks as follows:
def getSampleData():
    import urllib
    # I approved application according to:
    # http://disc.sci.gsfc.nasa.gov/registration/authorizing-gesdisc-data-access-in-earthdata_login
    # Query: http://hydro1.sci.gsfc.nasa.gov/dods/_expr_{GLDAS_NOAH025SUBP_3H}{ave(rainf,time=00Z23Oct2016,time=00Z24Oct2016)}{17.00:25.25,48.75:54.50,1:1,00Z23Oct2016:00Z23Oct2016}.ascii?result
    sample_query = 'http://hydro1.sci.gsfc.nasa.gov/dods/_expr_%7BGLDAS_NOAH025SUBP_3H%7D%7Bave(rainf,time=00Z23Oct2016,time=00Z24Oct2016)%7D%7B17.00:25.25,48.75:54.50,1:1,00Z23Oct2016:00Z23Oct2016%7D.ascii?result'
    # I've tried also:
    # sock=urllib.urlopen(sample_query, urllib.urlencode({'username':'MyUserName','password':'MyPassword'}))
    # but I was still asked to provide credentials, so I simplified mentioned line to just:
    sock=urllib.urlopen(sample_query)
    print('\n\nCurrent url:\n')
    print(sock.geturl())
    print('\nIs it the same as sample query?')
    print(sock.geturl() == sample_query)
    returnedData=sock.read()
    # returnedData always stores simple page with 302. Why? StackOverflow suggests that urllib and urllib2 handle redirection automatically
    sock.close()
    with open("Output.html", "w") as text_file:
        text_file.write(returnedData)
Output.html content is as follows:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved here.</p>
</body></html>
If I copy-paste sample_query (sample_query from function defined above) to browser, I have no problem with receiving data.
Thus, if there's no hope for solution, I'm thinking about rewriting my code to use Selenium.
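Before reaching for Selenium, one thing that may be worth trying is an opener that keeps cookies and answers the login challenge, since login redirects of this kind usually depend on a session cookie surviving from one hop to the next. The sketch below is written for Python 3 and has not been verified against the GSFC server. The login host https://urs.earthdata.nasa.gov is an assumption on my part (it does not appear in the question), and build_earthdata_opener and fetch are hypothetical helper names.

```python
import urllib.request
from http.cookiejar import CookieJar

# Assumption: the Earthdata Login service lives at this host.
EARTHDATA_LOGIN_URL = "https://urs.earthdata.nasa.gov"


def build_earthdata_opener(username, password):
    """Build an opener that sends basic auth to the login host and keeps cookies."""
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_manager.add_password(None, EARTHDATA_LOGIN_URL, username, password)
    cookie_jar = CookieJar()
    # build_opener also installs the default redirect handler, so 302
    # responses are followed while the cookie jar carries the session along.
    return urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(password_manager),
        urllib.request.HTTPCookieProcessor(cookie_jar),
    )


def fetch(opener, url):
    """Return the final URL after redirects and the raw response body."""
    with opener.open(url) as response:
        return response.geturl(), response.read()
```

Usage would look like opener = build_earthdata_opener('MyUserName', 'MyPassword') followed by final_url, data = fetch(opener, sample_query). The body comes back as bytes, so the output file should be opened in "wb" mode.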
It seems that I figured out how to download the data:
How to authenticate on NASA gsfc server
However, I don't know how to process the dataset.
I would like to display the output (or write it to a text file) as raw data, exactly as I see it in the browser.
from suds.client import Client
url = r'http://*********?singleWsdl'
c = Client(url)
The requests work fine till here, but when I execute the below statement, I get the error message shown at the end. Please help.
c.service.Method_Name('parameter1', 'parameter2')
The Error message is :
Exception: (415, u'Cannot process the message because the content type
\'text/xml; charset=utf-8\' was not the expected type
\'multipart/related; type="application/xop+xml"\'.')
A Content-Type header of multipart/related; type="application/xop+xml" is the type used by MTOM, a message format used to efficiently send attachments to/from web services.
I'm not sure why the error claims to be expecting it, because the solution I found for my situation was to override the Content-Type header with 'application/soap+xml;charset=UTF-8'.
Example:
soap_client.set_options(headers = {'Content-Type': 'application/soap+xml;charset=UTF-8'})
If you are able, you could also try checking for MTOM encoding in the web service's configuration and changing it.
Scope:
I am currently trying to write a Web scraper for this specific page. I have a pretty strong "Web Crawling" background using C#, but httplib is getting the better of me.
Problem:
When trying to make a Http Get request for the page specified above I get a "Moved Permanently", that points to the very same URL. I can make a request using the requests lib, but I want to make it work using httplib so I can understand what I am doing wrong.
Code Sample:
I am completely new to Python, so any wrong language guideline or syntax is C#'s fault.
import httplib
# Wrapper for a "HTTP GET" Request
class HttpClient(object):
    def HttpGet(self, url, host):
        connection = httplib.HTTPConnection(host)
        connection.request('GET', url)
        return connection.getresponse().read()
# Using "HttpClient" class
httpclient = HttpClient()
# This is the full URL I need to make a get request for : https://420101.com/strain-database
httpResponseText = httpclient.HttpGet('/strain-database', 'www.420101.com')
print httpResponseText
I really want to make it work using the httplib library, instead of requests or any other fancy one because I feel like I am missing something really small here.
The problem: I've had too little or too much caffeine in my system.
To make an HTTPS request, I needed the HTTPSConnection class.
Also, there is no 'www' in the address I wanted to GET, so it shouldn't be included in the host.
Both of the wrong addresses redirect me to the correct one with a 301 status code. Had I been using requests or a more full-featured module, it would have followed the redirect automatically.
My Validation:
c = httplib.HTTPSConnection('420101.com')
c.request("GET", "/strain-database")
r = c.getresponse()
print r.status, r.reason
200 OK
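For completeness, here is a rough sketch of what following the redirect by hand could look like, since httplib does not do it for you. It is written for Python 3, where httplib is named http.client. The helper names connection_for and get_following_redirects are made up for this example, and the redirect limit of 5 is an arbitrary choice.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def connection_for(url):
    """Pick HTTPConnection or HTTPSConnection from the URL scheme."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        cls = http.client.HTTPSConnection
    else:
        cls = http.client.HTTPConnection
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # No network traffic happens here; the socket is opened on request().
    return cls(parts.netloc), target


def get_following_redirects(url, max_redirects=5):
    """GET a URL, following Location headers up to max_redirects times."""
    for _ in range(max_redirects + 1):
        conn, target = connection_for(url)
        try:
            conn.request("GET", target)
            response = conn.getresponse()
            location = response.getheader("Location")
            if response.status in REDIRECT_CODES and location:
                # Location may be relative, so resolve it against the current URL.
                url = urljoin(url, location)
                continue
            return response.status, response.read()
        finally:
            conn.close()
    raise RuntimeError("too many redirects")
```

Calling get_following_redirects('http://www.420101.com/strain-database') should then end up at whatever address the server redirects to, instead of returning the "Moved Permanently" page.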
Is it possible to use Python's requests library to send a SOAP request?
It is indeed possible.
Here is an example calling the Weather SOAP Service using plain requests lib:
import requests
url="http://wsf.cdyne.com/WeatherWS/Weather.asmx?WSDL"
#headers = {'content-type': 'application/soap+xml'}
headers = {'content-type': 'text/xml'}
body = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="http://ws.cdyne.com/WeatherWS/" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns1:Body><ns0:GetWeatherInformation/></ns1:Body>
</SOAP-ENV:Envelope>"""
response = requests.post(url,data=body,headers=headers)
print response.content
Some notes:
The headers are important. Most SOAP requests will not work without the correct headers. application/soap+xml is probably the more correct header to use (but the weather service prefers text/xml).
This will return the response as a string of xml - you would then need to parse that xml.
For simplicity I have included the request as plain text. But best practise would be to store this as a template, then you can load it using jinja2 (for example) - and also pass in variables.
For example:
from jinja2 import Environment, PackageLoader
env = Environment(loader=PackageLoader('myapp', 'templates'))
template = env.get_template('soaprequests/WeatherSericeRequest.xml')
body = template.render()
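On the earlier note about parsing the returned XML: a minimal sketch using the standard library's xml.etree.ElementTree might look like the following (Python 3). The sample response is made up for illustration. Only the namespace comes from the request above; the element names WeatherDescription, WeatherID and Description are my assumptions about the shape of the reply, so adjust them to what the service actually returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body, standing in for response.content.
sample_response = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetWeatherInformationResponse xmlns="http://ws.cdyne.com/WeatherWS/">
      <GetWeatherInformationResult>
        <WeatherDescription>
          <WeatherID>1</WeatherID>
          <Description>Sunny</Description>
        </WeatherDescription>
        <WeatherDescription>
          <WeatherID>2</WeatherID>
          <Description>Rain</Description>
        </WeatherDescription>
      </GetWeatherInformationResult>
    </GetWeatherInformationResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefixes used in the search paths below; they need not match the document's.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "w": "http://ws.cdyne.com/WeatherWS/",
}


def parse_weather_descriptions(xml_text):
    """Return (id, description) pairs found inside the SOAP body."""
    root = ET.fromstring(xml_text)
    body = root.find("soap:Body", NS)
    return [
        (
            item.findtext("w:WeatherID", namespaces=NS),
            item.findtext("w:Description", namespaces=NS),
        )
        for item in body.iterfind(".//w:WeatherDescription", NS)
    ]


print(parse_weather_descriptions(sample_response))
```

The namespace map is the part that tends to trip people up: every element in a SOAP reply is namespaced, so an unqualified find('Body') returns nothing.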
Some people have mentioned the suds library. Suds is probably the more correct way to be interacting with SOAP, but I often find that it panics a little when you have WSDLs that are badly formed (which, TBH, is more likely than not when you're dealing with an institution that still uses SOAP ;) ).
You can do the above with suds like so:
from suds.client import Client
url="http://wsf.cdyne.com/WeatherWS/Weather.asmx?WSDL"
client = Client(url)
print client ## shows the details of this service
result = client.service.GetWeatherInformation()
print result
Note: when using suds, you will almost always end up needing to use the doctor!
Finally, a little bonus for debugging SOAP; TCPdump is your friend. On Mac, you can run TCPdump like so:
sudo tcpdump -As 0
This can be helpful for inspecting the requests that actually go over the wire.
The above two code snippets are also available as gists:
SOAP Request with requests
SOAP Request with suds
Adding to the previous answer, make sure you include the following entries in the headers:
headers={"Authorization": f"bearer {token}", "SOAPAction": "http://..."}
The Authorization header is only needed when the SOAP API requires a token for access.
The SOAPAction header names the action you are going to perform with the data you are sending.
So if you don't need authorization, you can drop that entry from the headers.
That worked fine for me.
I'm currently working on an automated way to interface with a database website that has RESTful web services installed. I am having trouble figuring out how to properly format and send the requests listed on the following site using Python.
https://neesws.neeshub.org:9443/nees.html
Particular example is this:
POST https://neesws.neeshub.org:9443/REST/Project/731/Experiment/1706/Organization
<Organization id="167"/>
The biggest problem is that I do not know where to put the XML-formatted part of the above. I want to send it as a Python HTTPS request, and so far I've been trying something with the following structure:
>>> import httplib
>>> conn = httplib.HTTPSConnection("neesws.neeshub.org:9443")
>>> conn.request("POST", "/REST/Project/731/Experiment/1706/Organization")
>>> conn.send('<Organization id="167"/>')
But this appears to be completely wrong. I've never used Python for web service interfaces, so my primary question is: how exactly am I supposed to use httplib to send the POST request, particularly the XML-formatted part of it? Any help is appreciated.
You need to set some request headers before sending data, for example Content-Type to 'text/xml'. Check out these examples:
Post-XML-Python-1
Which has this code as example:
import sys, httplib
# Placeholders: replace with your own host and API path
HOST = "www.example.com"
API_URL = "/your/api/url"
def do_request(xml_location):
    """HTTP XML POST request"""
    # Read the XML payload from a file
    request = open(xml_location,"r").read()
    webservice = httplib.HTTP(HOST)
    webservice.putrequest("POST", API_URL)
    webservice.putheader("Host", HOST)
    webservice.putheader("User-Agent","Python post")
    webservice.putheader("Content-type", "text/xml; charset=\"UTF-8\"")
    webservice.putheader("Content-length", "%d" % len(request))
    webservice.endheaders()
    # The XML goes out as the request body, after the headers
    webservice.send(request)
    statuscode, statusmessage, header = webservice.getreply()
    result = webservice.getfile().read()
    print statuscode, statusmessage, header
    print result
do_request("myfile.xml")
Post-XML-Python-2
These may give you some ideas.
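As a more compact alternative, the request method of the connection classes accepts the body and headers directly, so the XML can simply be passed as the body argument. Below is a minimal sketch for Python 3 (http.client; in Python 2, httplib.HTTPSConnection.request takes the same arguments). The helper names build_xml_post and post_xml are invented for this example, and I am assuming the service accepts text/xml without further authentication, which I have not checked.

```python
import http.client


def build_xml_post(xml_text):
    """Return the encoded body and the headers for an XML POST."""
    body = xml_text.encode("utf-8")
    # Content-Length is filled in by http.client when a body is supplied.
    headers = {"Content-Type": "text/xml; charset=UTF-8"}
    return body, headers


def post_xml(host, path, xml_text):
    """POST an XML document over HTTPS and return (status, reason, data)."""
    body, headers = build_xml_post(xml_text)
    conn = http.client.HTTPSConnection(host)
    try:
        # The XML travels as the request body, not via a separate send().
        conn.request("POST", path, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, response.reason, response.read()
    finally:
        conn.close()


# Example call, using the values from the question:
# status, reason, data = post_xml(
#     "neesws.neeshub.org:9443",
#     "/REST/Project/731/Experiment/1706/Organization",
#     '<Organization id="167"/>',
# )
```

This also shows what went wrong in the question's attempt: conn.request() without a body already completes the request, so the later conn.send() has nothing to attach the XML to.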