Adding Cookie to SOAPpy Request - python

I'm trying to send a SOAP request using SOAPpy as the client. I've found some documentation explaining how to add a cookie by extending SOAPpy.HTTPTransport, but I can't get it to work.
I tried to use the example here,
but the server I'm sending the request to started returning 415 errors, so I'm trying either to accomplish this without ClientCookie or to figure out why the server returns 415s when I do use it. I suspect it might be because ClientCookie uses urllib2 and HTTP/1.1, whereas SOAPpy uses urllib and HTTP/1.0.
Does someone know how to make ClientCookie use HTTP/1.0, if that is even the problem, or a way to add a cookie to the SOAPpy headers without using ClientCookie? I've tried this code against other services; it only seems to throw errors when sending requests to Microsoft servers.
I'm still finding my footing with Python, so it could just be me doing something dumb.
import sys, os, string
from SOAPpy import WSDL,HTTPTransport,Config,SOAPAddress,Types
import ClientCookie
Config.cookieJar = ClientCookie.MozillaCookieJar()
class CookieTransport(HTTPTransport):
    def call(self, addr, data, namespace, soapaction = None, encoding = None,
             http_proxy = None, config = Config):
        if not isinstance(addr, SOAPAddress):
            addr = SOAPAddress(addr, config)
        cookie_cutter = ClientCookie.HTTPCookieProcessor(config.cookieJar)
        hh = ClientCookie.HTTPHandler()
        hh.set_http_debuglevel(1)
        # TODO proxy support
        opener = ClientCookie.build_opener(cookie_cutter, hh)
        t = 'text/xml';
        if encoding != None:
            t += '; charset="%s"' % encoding
        opener.addheaders = [("Content-Type", t),
                             ("Cookie", "Username=foobar"), # ClientCookie should handle
                             ("SOAPAction" , "%s" % (soapaction))]
        response = opener.open(addr.proto + "://" + addr.host + addr.path, data)
        data = response.read()
        # get the new namespace
        if namespace is None:
            new_ns = None
        else:
            new_ns = self.getNS(namespace, data)
        print '\n' * 4 , '-'*50
        # return response payload
        return data, new_ns
url = 'http://www.authorstream.com/Services/Test.asmx?WSDL'
proxy = WSDL.Proxy(url, transport=CookieTransport)
print proxy.GetList()

Error 415 (Unsupported Media Type) means the server rejected the Content-Type header of your request.
Install HttpFox for Firefox, or another tool such as Wireshark, Charles or Fiddler, to see which headers you are actually sending. Try Content-Type: application/xml.
...
t = 'application/xml'
if encoding != None:
    t += '; charset="%s"' % encoding
...
If you are trying to send form data to the web server, use Content-Type: application/x-www-form-urlencoded.
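To check which headers actually go out, it can help to build the request by hand. The following is a minimal sketch using only the Python 3 standard library rather than SOAPpy or ClientCookie; the endpoint URL, SOAPAction value, envelope and cookie value are placeholders, and `build_soap_request` is a hypothetical helper name. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import urllib.request


def build_soap_request(url, envelope, soap_action, cookie, encoding="utf-8"):
    """Build (but do not send) a SOAP 1.1 style POST with an explicit Cookie header.

    All argument values are supplied by the caller; nothing here is specific
    to SOAPpy or to any particular server.
    """
    headers = {
        # SOAP 1.1 services conventionally expect text/xml; swap in another
        # media type here if the server answers 415 and documents a different one.
        "Content-Type": 'text/xml; charset="%s"' % encoding,
        "SOAPAction": '"%s"' % soap_action,
        "Cookie": cookie,
    }
    return urllib.request.Request(
        url, data=envelope.encode(encoding), headers=headers, method="POST"
    )


# Placeholder values for illustration only.
req = build_soap_request(
    "http://example.invalid/Services/Test.asmx",
    "<soap:Envelope/>",
    "http://example.invalid/GetList",
    "Username=foobar",
)
# urllib normalises header names with str.capitalize(), hence the spelling below.
print(req.get_method(), req.get_header("Content-type"), req.get_header("Cookie"))
```

Comparing these headers with what a working client sends (as captured by one of the tools above) should show whether the Content-Type is really what triggers the 415.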

A nice hack to use cookies with SOAPpy calls
Using Cookies with SOAPpy calls

Related

Derive protocol from url

I have a list of URLs in a format such as ["www.bol.com ", "www.dopper.com"].
To use them as start URLs in Scrapy, I need to know the correct HTTP protocol for each one.
For example:
["https://www.bol.com/nl/nl/", "https://dopper.com/nl"]
As you can see, the protocol may be http or https, and the host may or may not include www.
I'm not sure whether there are other variations.
Is there a Python tool that can determine the right protocol?
If not, and I have to build the logic myself, which cases should I take into account?
For option 2, this is what I have so far:
def identify_protocol(url):
    try:
        r = requests.get("https://" + url + "/", timeout=10)
        return r.url, r.status_code
    except requests.HTTPError:
        r = requests.get("http://" + url + "/", timeout=10)
        return r.url, r.status_code
    except requests.HTTPError:
        r = requests.get("https://" + url.replace("www.","") + "/", timeout=10)
        return r.url, r.status_code
    except:
        return None, None
Is there any other possibility I should take into account?
There is no way to determine the protocol or full domain from the fragment alone; the information simply isn't there. To find it you need either:
1. a database of the correct protocols/domains, in which you can look up your domain fragment, or
2. to make the request and see what the server tells you.
If you do (2), you can of course gradually build your own database to avoid needing the request in future.
On many https servers, an attempted http connection is redirected to https. If you are not redirected, you can reliably use http. If the http request fails, you can try again with https and see whether it works.
The same applies to the domain: if the site usually redirects, you can make the request using the original domain and see where you are redirected.
An example using requests:
>>> import requests
>>> r = requests.get('http://bol.com')
>>> r
<Response [200]>
>>> r.url
'https://www.bol.com/nl/nl/'
As you can see, the response object's url attribute holds the final destination URL, including the protocol.
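The try-and-follow-redirects approach can be sketched as a small helper that generates the candidate URLs for a bare domain and returns the first one that resolves. This is only an illustration: `candidate_urls`, `resolve` and the injected `fetch` callable are hypothetical names, and the order of candidates (https before http, original host before the www/non-www variant) is an assumption. With requests, `fetch` could be something like `lambda u: requests.get(u, timeout=10).url`.

```python
def candidate_urls(fragment):
    """Return plausible full URLs for a bare domain such as 'www.bol.com '."""
    host = fragment.strip().rstrip("/")
    # Also try the variant with/without the leading 'www.'.
    if host.startswith("www."):
        hosts = [host, host[len("www."):]]
    else:
        hosts = [host, "www." + host]
    return ["%s://%s/" % (scheme, h) for h in hosts for scheme in ("https", "http")]


def resolve(fragment, fetch):
    """Return the final URL reported by fetch() for the first candidate that works.

    fetch(url) must return the final URL after redirects, or raise on failure.
    """
    for url in candidate_urls(fragment):
        try:
            return fetch(url)
        except Exception:
            continue  # connection error, timeout, ...: try the next candidate
    return None


print(candidate_urls("www.bol.com "))
```

Passing the fetch function in keeps the candidate logic testable offline and leaves the choice of HTTP library, timeout and error handling to the caller.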
As I understand the question, you need to retrieve the final URL after all possible redirections. This can be done with the built-in urllib.request. If the provided URL has no scheme, you can use http as the default. To parse the input URL I used a combination of urlsplit() and urlunsplit().
Code:
import urllib.request as request
import urllib.parse as parse
def find_redirect_location(url, proxy=None):
    parsed_url = parse.urlsplit(url.strip())
    url = parse.urlunsplit((
        parsed_url.scheme or "http",
        parsed_url.netloc or parsed_url.path,
        parsed_url.path.rstrip("/") + "/" if parsed_url.netloc else "/",
        parsed_url.query,
        parsed_url.fragment
    ))
    if proxy:
        handler = request.ProxyHandler(dict.fromkeys(("http", "https"), proxy))
        opener = request.build_opener(handler, request.ProxyBasicAuthHandler())
    else:
        opener = request.build_opener()
    with opener.open(url) as response:
        return response.url
Then you can call this function on every URL in the list:
urls = ["bol.com ","www.dopper.com", "https://google.com"]
final_urls = list(map(find_redirect_location, urls))
You can also use proxies:
from itertools import cycle
urls = ["bol.com ","www.dopper.com", "https://google.com"]
proxies = ["http://localhost:8888"]
final_urls = list(map(find_redirect_location, urls, cycle(proxies)))
To make it a bit faster, you can run the checks in parallel threads using ThreadPoolExecutor:
from concurrent.futures import ThreadPoolExecutor
urls = ["bol.com ","www.dopper.com", "https://google.com"]
final_urls = list(ThreadPoolExecutor().map(find_redirect_location, urls))

Curl Equivalent in Python

I have a Python program that takes pictures, and I am wondering how to write a program that sends those pictures to a particular URL.
If it matters, I am running this on a Raspberry Pi.
(Please excuse my simplicity, I am very new to all this)
Many folks turn to the requests library for this sort of thing.
For something lower level, you might use urllib2
I've been using the requests package as well. Here's an example POST from the requests documentation.
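For the picture-upload case specifically, here is a minimal standard-library sketch that wraps the raw image bytes in a POST request. The URL and the `image/jpeg` content type are assumptions, `build_upload_request` is a hypothetical helper, and many servers expect a multipart form upload instead, which requests handles through its `files=` argument.

```python
import urllib.request


def build_upload_request(url, image_bytes, content_type="image/jpeg"):
    """Build a POST request whose body is the raw image data."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Placeholder bytes standing in for a real photo read with open(path, "rb").
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image \xff\xd9"
req = build_upload_request("http://example.invalid/upload", fake_jpeg)
print(req.get_method(), len(req.data))

# To actually send it (needs a reachable server):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

Which of the two body formats is right depends entirely on what the receiving service accepts, so check its documentation before choosing.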
If you would rather use cURL, try PycURL.
Install it using:
sudo pip install pycurl
Here is an example of how to send data using it:
import pycurl
import json
import urllib
import cStringIO
url = 'your_url'
first_param = '12'
dArrayData = [{'data' : 'first'}, {'data':'second'}]
json_to_send = json.dumps(dArrayData, separators=(',',':'), sort_keys=False)
curlClient = pycurl.Curl()
curlClient.setopt(curlClient.USERAGENT, 'curl-user-agent')
# Sets the url of the service
curlClient.setopt(curlClient.URL, url)
# Sets the request to be of the type POST
curlClient.setopt(curlClient.POST, True)
# Sets the params of the post request
send_params = 'first_param=' + first_param + '&data=' + urllib.quote(json_to_send)
curlClient.setopt(curlClient.POSTFIELDS, send_params)
# Setting the buffer for the response to be written to
bufResponse = cStringIO.StringIO()
curlClient.setopt(curlClient.WRITEFUNCTION, bufResponse.write)
# Setting to fail on error
curlClient.setopt(curlClient.FAILONERROR, True)
# Sets the time out for the connections
curlClient.setopt(pycurl.CONNECTTIMEOUT, 25)
curlClient.setopt(pycurl.TIMEOUT, 25)
response = ''
try:
    # Performs the operation
    curlClient.perform()
except pycurl.error as err:
    errno, errString = err
    print '========'
    print 'ERROR sending the data:'
    print '========'
    print 'CURL error code:', errno
    print 'CURL error Message:', errString
else:
    response = bufResponse.getvalue()
    # Do whatever you want with the response: parse it as JSON, etc.
finally:
    curlClient.close()
    bufResponse.close()
The requests library is the best-supported and most advanced way to do this.

VCloud Director Org user authentication for RestAPI in python

I have a VMware setup for testing. I created one user, abc/abc123, to access the Org URL "http://localhost/cloud/org/MyOrg". I want to access the REST API of vCloud. I tried the RestClient plugin in Firefox, and it works fine.
Now I have tried with Python code:
url = 'https://localhost/api/sessions/'
req = urllib2.Request(url)
base64string = base64.encodestring('%s:%s' % ('abc#MyOrg', 'abc123'))[:-1]
authheader = "Basic %s" % base64string
req.add_header("Authorization", authheader)
req.add_header("Accept", 'application/*+xml;version=1.5')
f = urllib2.urlopen(req)
data = f.read()
print(data)
This is code I got from Stack Overflow, but in my case it gives an "urllib2.HTTPError: HTTP Error 403: Forbidden" error.
I also tried HTTP authentication for the same request.
After some googling I found the solution in the post https://stackoverflow.com/a/6348729/243031. I changed the code to suit my needs. I am posting the answer so that anyone with the same error can find it directly.
My changed code is:
import urllib2
import base64
# make a string with the request type in it:
method = "POST"
# create a handler. you can specify different handlers here (file uploads etc)
# but we go for the default
handler = urllib2.HTTPSHandler()
# create an openerdirector instance
opener = urllib2.build_opener(handler)
# build a request
url = 'https://localhost/api/sessions'
request = urllib2.Request(url)
# add any other information you want
base64string = base64.encodestring('%s:%s' % ('abc#MyOrg', 'abc123'))[:-1]
authheader = "Basic %s" % base64string
request.add_header("Authorization", authheader)
request.add_header("Accept",'application/*+xml;version=1.5')
# overload the get method function with a small anonymous function...
request.get_method = lambda: method
# try it; don't forget to catch the result
try:
    connection = opener.open(request)
except urllib2.HTTPError,e:
    connection = e
# check. Substitute with appropriate HTTP code.
if connection.code == 200:
    data = connection.read()
    print "Data :", data
else:
    print "ERROR", connection.code
I hope this helps anyone who wants to send a POST request without data.
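On Python 3 the same request can be put together without overriding get_method, because urllib.request.Request accepts a method argument. The sketch below only builds the request; `basic_auth_header` and `build_session_request` are hypothetical helper names and the credentials are placeholders.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    # b64encode adds no trailing newline, so no [:-1] slicing is needed.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_session_request(url, username, password):
    """Build a body-less POST carrying Basic credentials and the Accept header."""
    req = urllib.request.Request(url, method="POST")
    req.add_header("Authorization", basic_auth_header(username, password))
    req.add_header("Accept", "application/*+xml;version=1.5")
    return req


req = build_session_request("https://localhost/api/sessions", "user", "pass")
print(req.get_method(), req.get_header("Authorization"))
```

Because the request has no body, the method argument is what makes it a POST; without it urllib would send a GET.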

Making a POST call instead of GET using urllib2

There's a lot of stuff out there on urllib2 and POST calls, but I'm stuck on a problem.
I'm trying to do a simple POST call to a service:
url = 'http://myserver/post_service'
data = urllib.urlencode({'name' : 'joe',
'age' : '10'})
content = urllib2.urlopen(url=url, data=data).read()
print content
I can see in the server logs that I'm making GET calls, even though I'm passing the data argument to urlopen.
The library raises a 404 error (not found), which is correct for a GET call; POST calls are processed fine (I have also tried a POST from an HTML form).
Do it in stages, and modify the object, like this:
# make a string with the request type in it:
method = "POST"
# create a handler. you can specify different handlers here (file uploads etc)
# but we go for the default
handler = urllib2.HTTPHandler()
# create an openerdirector instance
opener = urllib2.build_opener(handler)
# build a request
data = urllib.urlencode(dictionary_of_POST_fields_or_None)
request = urllib2.Request(url, data=data)
# add any other information you want
request.add_header("Content-Type",'application/json')
# overload the get method function with a small anonymous function...
request.get_method = lambda: method
# try it; don't forget to catch the result
try:
    connection = opener.open(request)
except urllib2.HTTPError,e:
    connection = e
# check. Substitute with appropriate HTTP code.
if connection.code == 200:
    data = connection.read()
else:
    # handle the error case. connection.read() will still contain data
    # if any was returned, but it probably won't be of any use
    pass
This way allows you to extend to making PUT, DELETE, HEAD and OPTIONS requests too, simply by substituting the value of method or even wrapping it up in a function. Depending on what you're trying to do, you may also need a different HTTP handler, e.g. for multi file upload.
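Wrapping this up in a function might look like the following on Python 3, where urllib2 became urllib.request and Request takes the verb directly. `make_request` is a hypothetical name, and the form encoding of the body is an assumption about what the service expects.

```python
import urllib.parse
import urllib.request


def make_request(method, url, fields=None):
    """Build a request with an explicit HTTP verb and optional form fields."""
    data = None
    headers = {}
    if fields is not None:
        data = urllib.parse.urlencode(fields).encode("ascii")
        headers["Content-Type"] = "application/x-www-form-urlencoded"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


post = make_request("POST", "http://myserver/post_service", {"name": "joe", "age": "10"})
delete = make_request("DELETE", "http://myserver/post_service")
print(post.get_method(), post.data)
print(delete.get_method(), delete.data)
```

The returned object can be passed to an opener's open() in the same way as the request built above.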
This may have been answered before: Python URLLib / URLLib2 POST.
Your server is likely performing a 302 redirect from http://myserver/post_service to http://myserver/post_service/. When the 302 redirect is performed, the request changes from POST to GET (see Issue 1401). Try changing url to http://myserver/post_service/.
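The method change can be observed without a server by calling the standard library's redirect handler directly. This is a Python 3 sketch (urllib.request rather than urllib2), and calling `redirect_request` by hand with an empty header dict is done purely for illustration.

```python
import urllib.request

original = urllib.request.Request(
    "http://myserver/post_service", data=b"name=joe&age=10"
)
assert original.get_method() == "POST"

# Ask the default handler what it would send after a 302 to the slash-terminated URL.
handler = urllib.request.HTTPRedirectHandler()
followed = handler.redirect_request(
    original, None, 302, "Found", {}, "http://myserver/post_service/"
)

print(original.get_method(), "->", followed.get_method())
print(followed.full_url, followed.data)
```

The follow-up request carries no body, which is why the server logs show a GET.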
Have a read of the urllib Missing Manual. Pulled from there is the following simple example of a POST request.
url = 'http://myserver/post_service'
data = urllib.urlencode({'name' : 'joe', 'age' : '10'})
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
print response.read()
As suggested by @Michael Kent, do consider requests; it's great.
EDIT: That said, I do not know why passing data to urlopen() does not result in a POST request; it should. I suspect your server is redirecting or misbehaving.
The requests module may ease your pain.
url = 'http://myserver/post_service'
data = dict(name='joe', age='10')
r = requests.post(url, data=data, allow_redirects=True)
print r.content
It should be sending a POST if you provide a data parameter (as you are doing).
From the docs:
"the HTTP request will be a POST instead of a GET when the data parameter is provided"
So add some debug output to see what's happening on the client side.
You can modify your code to this and try again:
import urllib
import urllib2
url = 'http://myserver/post_service'
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
data = urllib.urlencode({'name' : 'joe',
'age' : '10'})
content = opener.open(url, data=data).read()
Try this instead:
url = 'http://myserver/post_service'
data = urllib.urlencode({'name' : 'joe',
'age' : '10'})
req = urllib2.Request(url=url,data=data)
content = urllib2.urlopen(req).read()
print content
url="https://myserver/post_service"
data = {}
data["name"] = "joe"
data["age"] = "20"
data_encoded = urllib.urlencode(data)
print urllib2.urlopen(url + "?" + data_encoded).read()
Maybe this can help.

In-python MediaWiki authentication from cookies

What would be the easiest way to use MediaWiki cookies in some Python CGI scripts (on the same domain, of course) for authentication (especially including MW's OpenID)?
Access to the MediaWiki database from Python is possible, too.
A very easy way to use cookies with MediaWiki is as follows:
from cookielib import CookieJar
import urllib2
import urllib
import json
cj = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
Now, requests can be made using opener. For example:
login_data = {
    'action': 'login',
    'lgname': 'Example',
    'lgpassword': 'Foobar',
    'format': 'json'
}
data = urllib.urlencode(login_data)
request = opener.open('http://en.wikipedia.org/w/api.php',data)
content = json.load(request)
# the first reply carries a token that must be sent back as lgtoken
login_data['lgtoken'] = content['login']['token']
data_2 = urllib.urlencode(login_data)
request_2 = opener.open('http://en.wikipedia.org/w/api.php',data_2)
content_2 = json.load(request_2)
print content_2['login']['result']
In the example above, if the CookieJar had not been created, the login would not complete; the API would keep asking for another token. That said, it is recommended to use an existing MediaWiki wrapper such as pywikipedia, mwhair, pytybot, simplemediawiki or wikitools, among the many other MediaWiki wrappers written in Python.
You could connect to and modify the SQL database directly, without HTTP and cookies, using the MySQLdb module, but this is often the wrong way to do MediaWiki maintenance. Read-only access should not be a problem, though.
The best way to access MediaWiki from a script is through api.php.
The best-known Python-based MediaWiki API bot is Pywikibot (formerly Pywikipediabot).
The easiest way to save cookies in Python might be to use the http.cookiejar module.
Its documentation contains some simple examples.
I extracted working example code from my own MediaWiki bot:
#!/usr/bin/python3
import http.cookiejar
import urllib.request
import urllib.parse
import json
s_login_name = 'example'
s_login_password = 'secret'
s_api_url = 'http://en.wikipedia.org/w/api.php'
s_user_agent = 'StackOverflowExample/0.0.1.2012.09.26.1'
def api_request(d_post_params):
    d_post_params['format'] = 'json'
    r_post_params = urllib.parse.urlencode(d_post_params).encode('utf-8')
    o_url_request = urllib.request.Request(s_api_url, r_post_params)
    o_url_request.add_header('User-Agent', s_user_agent)
    o_http_response = o_url_opener.open(o_url_request)
    s_reply = o_http_response.read().decode('utf-8')
    d_reply = json.loads(s_reply)
    return (o_http_response.code, d_reply)
o_cookie_jar = http.cookiejar.CookieJar()
o_http_cookie_processor = urllib.request.HTTPCookieProcessor(o_cookie_jar)
o_url_opener = urllib.request.build_opener(o_http_cookie_processor)
d_post_params = {'action': 'login', 'lgname': s_login_name}
i_code, d_reply = api_request(d_post_params)
print('http code: %d' % (i_code))
print('api reply: %s' % (d_reply))
s_login_token = d_reply['login']['token']
d_post_params = {
    'action': 'login',
    'lgname': s_login_name,
    'lgpassword': s_login_password,
    'lgtoken': s_login_token
}
i_code, d_reply = api_request(d_post_params)
print('http code: %d' % (i_code))
print('api reply: %s' % (d_reply))
Classes, error handling and sub-functions have been removed to increase readability.
The cookies saved in o_url_opener can also be used for requests to index.php.
You could also log in via index.php (faking a browser request), but this would require parsing the HTML output.
Variable name legend:
# Unicode string
s_* = 'a'
# Bytes (raw string)
r_* = b'a'
# Dictionary
d_* = {'a':1}
# Integer number
i_* = 4711
# Other objects
o_* = SomeClass()
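As a small illustration of saving cookies with http.cookiejar, the sketch below writes a jar to disk and reads it back. The cookie is constructed by hand so the example runs offline; in real use the jar is filled by an opener built with HTTPCookieProcessor, and the cookie name, value and domain here are made up.

```python
import http.cookiejar
import os
import tempfile
import time


def make_cookie(name, value, domain):
    """Create a simple persistent cookie by hand (normally the server sets these)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False,
        expires=int(time.time()) + 3600,  # one hour from now
        discard=False,
        comment=None, comment_url=None, rest={},
    )


path = os.path.join(tempfile.mkdtemp(), "cookies.txt")

jar = http.cookiejar.MozillaCookieJar(path)
jar.set_cookie(make_cookie("session", "abc123", "example.invalid"))
jar.save()  # writes the Netscape/Mozilla cookies.txt format

restored = http.cookiejar.MozillaCookieJar(path)
restored.load()
print([(c.name, c.value) for c in restored])
```

Session cookies without an expiry date are skipped by save() and load() unless ignore_discard=True is passed, which matters for login cookies that the server marks as session-only.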
