The default Python xmlrpc.client.Transport (which can be used with xmlrpc.client.ServerProxy) does not retain cookies, which are sometimes needed for cookie-based logins.
For example, the following proxy, when used with the TapaTalk API (for which the login method uses cookies for authentication), will give a permission error when trying to modify posts.
proxy = xmlrpc.client.ServerProxy(URL, xmlrpc.client.Transport())
There are some solutions for Python 2 on the net, but they aren't compatible with Python 3.
How can I use a Transport that retains cookies?
The existing answer from GermainZ works only for HTTP. After a lot of time fighting with it, here is an HTTPS adaptation. Note the context option, which is crucial.
class CookiesTransport(xmlrpc.client.SafeTransport):
    """A SafeTransport (HTTPS) subclass that retains cookies over its lifetime."""

    # Note context option - it's required for success
    def __init__(self, context=None):
        super().__init__(context=context)
        self._cookies = []

    def send_headers(self, connection, headers):
        if self._cookies:
            connection.putheader("Cookie", "; ".join(self._cookies))
        super().send_headers(connection, headers)

    def parse_response(self, response):
        # This check is required if in some responses we receive no cookies at all
        if response.msg.get_all("Set-Cookie"):
            for header in response.msg.get_all("Set-Cookie"):
                cookie = header.split(";", 1)[0]
                self._cookies.append(cookie)
        return super().parse_response(response)
The reason is that ServerProxy ignores its own context option when a transport is specified, so the context has to be passed directly to the Transport constructor.
Usage:
import xmlrpc.client
import ssl
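# Warning: an unverified context disables certificate checks; prefer
# ssl.create_default_context() when the server certificate can be validated
transport = CookiesTransport(context=ssl._create_unverified_context())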
# Note the closing slash in address as well, very important
server = xmlrpc.client.ServerProxy("https://<api_link>/", transport=transport)
# do stuff with server
server.myApiFunc({'param1': 'x', 'param2': 'y'})
This is a simple Transport subclass that will retain all cookies:
class CookiesTransport(xmlrpc.client.Transport):
    """A Transport subclass that retains cookies over its lifetime."""

    def __init__(self):
        super().__init__()
        self._cookies = []

    def send_headers(self, connection, headers):
        if self._cookies:
            connection.putheader("Cookie", "; ".join(self._cookies))
        super().send_headers(connection, headers)

    def parse_response(self, response):
        # get_all() returns None when the response sets no cookies
        for header in response.msg.get_all("Set-Cookie") or []:
            cookie = header.split(";", 1)[0]
            self._cookies.append(cookie)
        return super().parse_response(response)
Usage:
proxy = xmlrpc.client.ServerProxy(URL, CookiesTransport())
Since xmlrpc.client in Python 3 has better suited hooks for this, it's much simpler than an equivalent Python 2 version.
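One thing the list-based versions above do not handle is a server that sends the same cookie again with a new value: the list just grows and both values are sent back. Below is a rough sketch of a variant that keys cookies by name so a later Set-Cookie replaces an earlier one. The class name DedupCookiesTransport and the helper methods remember() and cookie_header() are made up for this illustration; only send_headers() and parse_response() are the real Transport hooks. For HTTPS, the same idea should carry over by subclassing xmlrpc.client.SafeTransport instead.

```python
import xmlrpc.client


class DedupCookiesTransport(xmlrpc.client.Transport):
    """Hypothetical variant: keeps one value per cookie name."""

    def __init__(self):
        super().__init__()
        self._cookies = {}  # cookie name -> latest value

    def remember(self, set_cookie_header):
        # Keep only the "name=value" part, dropping attributes such as Path
        pair = set_cookie_header.split(";", 1)[0].strip()
        name, _, value = pair.partition("=")
        self._cookies[name] = value

    def cookie_header(self):
        return "; ".join("%s=%s" % item for item in self._cookies.items())

    def send_headers(self, connection, headers):
        if self._cookies:
            connection.putheader("Cookie", self.cookie_header())
        super().send_headers(connection, headers)

    def parse_response(self, response):
        # get_all() returns None when there is no Set-Cookie header
        for header in response.msg.get_all("Set-Cookie") or []:
            self.remember(header)
        return super().parse_response(response)


# Exercise the bookkeeping without touching the network
transport = DedupCookiesTransport()
transport.remember("sid=1; Path=/; HttpOnly")
transport.remember("lang=en")
transport.remember("sid=2; Path=/")
```

It is used the same way as the original: pass an instance as the transport argument of xmlrpc.client.ServerProxy.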
I'm setting up a small Python service to act as a REST API reverse proxy, and I'm hoping there are some libraries available to help speed this process up.
I need to be able to run a function that calculates a variable to inject as a request header when the request is proxied through to the backend.
As it stands, I have a simple script that computes the variable, injects it into an Nginx config file, and then forces an Nginx hot reload via signals, but I'm trying to remove this dependency for what should be a fairly simple task.
Would a good approach be to use falcon as the listener and combine it with another approach to inject and forward requests?
Thanks for reading.
Edit: I've been reading https://aiohttp.readthedocs.io/en/stable/ as it seems to be the right direction.
Thanks to someone over at falcon, this is now the accepted answer!
import io
import falcon
import requests
class Proxy(object):
    UPSTREAM = 'https://httpbin.org'

    def __init__(self):
        self.session = requests.Session()

    def handle(self, req, resp):
        headers = dict(req.headers, Via='Falcon')
        for name in ('HOST', 'CONNECTION', 'REFERER'):
            headers.pop(name, None)

        request = requests.Request(req.method, self.UPSTREAM + req.path,
                                   data=req.bounded_stream.read(),
                                   headers=headers)
        prepared = request.prepare()
        from_upstream = self.session.send(prepared, stream=True)

        resp.content_type = from_upstream.headers.get('Content-Type',
                                                      falcon.MEDIA_HTML)
        resp.status = falcon.get_http_status(from_upstream.status_code)
        resp.stream = from_upstream.iter_content(io.DEFAULT_BUFFER_SIZE)
api = falcon.API()
api.add_sink(Proxy().handle)
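The sink above forwards the headers but does not yet show the "calculate a variable and inject it" part of the question. One way to do that might be to factor the header handling into a small function that calls your calculation on every request. This is only a sketch: build_upstream_headers, DROPPED, the X-Injected header name and the lambda standing in for the real calculation are all made up for illustration.

```python
# Header names the proxy should not pass through, compared case-insensitively
DROPPED = {"host", "connection", "referer"}


def build_upstream_headers(incoming, compute_value, header_name="X-Injected"):
    """Copy the incoming headers, drop the unwanted ones, add the computed one."""
    headers = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in DROPPED
    }
    headers["Via"] = "Falcon"
    # compute_value is called per request, so the value can change over time
    headers[header_name] = compute_value()
    return headers


incoming = {"HOST": "localhost:8000", "ACCEPT": "*/*", "CONNECTION": "keep-alive"}
upstream = build_upstream_headers(incoming, lambda: "token-123")
```

Inside handle(), the result would replace the headers dict that is passed to requests.Request.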
I'm trying to refactor some code in which many HTTP requests are made using the requests module. Many of these requests have (partially) the same headers, so I would like to 'pre-fill' these using Session objects.
However, I'm having difficulty making multiple inheritance work in this context. Here is what I've tried:
import requests, time
requestbin_URL = 'http://requestb.in/1nsaz9y1' # For testing only; remains usable for 48 hours
auth_token = 'asdlfjkwoieur182932385' # Fake authorization token
class AuthorizedSession(requests.Session):
    def __init__(self, auth_token):
        super(AuthorizedSession, self).__init__()
        self.auth_token = auth_token
        self.headers.update({'Authorization': 'token=' + self.auth_token})

class JSONSession(requests.Session):
    def __init__(self):
        super(JSONSession, self).__init__()
        self.headers.update({'content-type': 'application/json'})

class AuthorizedJSONSession(AuthorizedSession, JSONSession):
    def __init__(self, auth_token):
        AuthorizedSession.__init__(self, auth_token=auth_token)
        JSONSession.__init__(self)

""" These two commented-out requests work as expected """
# with JSONSession() as s:
#     response = s.post(requestbin_URL, data={"ts" : time.time()})
# with AuthorizedSession(auth_token=auth_token) as s:
#     response = s.post(requestbin_URL, data={"key1" : "value1"})

""" This one doesn't """
with AuthorizedJSONSession(auth_token=auth_token) as s:
    response = s.post(requestbin_URL, data={"tag" : "some_tag_name"})
If I inspect the result of the last request at http://requestb.in/1nsaz9y1?inspect, I see the following:
It seems like the Content-Type field is correctly set to application/json; however, I don't see an Authorization header with the fake authentication token. How can I combine the AuthorizedSession and JSONSession classes to see both?
I've found that the request works if I define AuthorizedJSONSession more simply as follows:
class AuthorizedJSONSession(AuthorizedSession, JSONSession):
def __init__(self, auth_token):
super(AuthorizedJSONSession, self).__init__(auth_token=auth_token)
The resulting request now has updated both the Authorization and Content-Type headers:
I've understood that when a class inherits from multiple classes which in turn inherit from the same base class, then Python is 'smart enough' to simply use super to initialize.
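To see why the explicit version loses the Authorization header, it may help to reproduce the pattern without the network. The sketch below uses a stand-in Session class whose __init__ resets the headers, on the assumption that requests.Session does something similar; ExplicitCalls and Cooperative are illustrative names for the two variants of AuthorizedJSONSession.

```python
class Session:
    """Stand-in for requests.Session: __init__ starts with fresh headers."""

    def __init__(self):
        self.headers = {}


class AuthorizedSession(Session):
    def __init__(self, auth_token):
        super().__init__()
        self.headers["Authorization"] = "token=" + auth_token


class JSONSession(Session):
    def __init__(self):
        super().__init__()
        self.headers["content-type"] = "application/json"


class ExplicitCalls(AuthorizedSession, JSONSession):
    def __init__(self, auth_token):
        # super() inside AuthorizedSession already walks the MRO and
        # runs JSONSession.__init__ once ...
        AuthorizedSession.__init__(self, auth_token)
        # ... so this second call re-runs Session.__init__ and wipes
        # the Authorization header that was just set
        JSONSession.__init__(self)


class Cooperative(AuthorizedSession, JSONSession):
    def __init__(self, auth_token):
        # One super() call visits each class in the MRO exactly once
        super().__init__(auth_token)


explicit = ExplicitCalls("abc")
cooperative = Cooperative("abc")
mro_names = [cls.__name__ for cls in Cooperative.__mro__]
```

The method resolution order is Cooperative, AuthorizedSession, JSONSession, Session, object, so the super() call inside AuthorizedSession lands on JSONSession rather than going straight to Session.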
I've got a piece of code that I can't figure out how to unit test! The module pulls content from external XML feeds (twitter, flickr, youtube, etc.) with urllib2. Here's some pseudo-code for it:
params = (url, urlencode(data),) if data else (url,)
req = Request(*params)
response = urlopen(req)
#check headers, content-length, etc...
#parse the response XML with lxml...
My first thought was to pickle the response and load it for testing, but apparently urllib's response object is unserializable (it raises an exception).
Just saving the XML from the response body isn't ideal, because my code uses the header information too. It's designed to act on a response object.
And of course, relying on an external source for data in a unit test is a horrible idea.
So how do I write a unit test for this?
urllib2 has functions called build_opener() and install_opener(), which you can use to mock the behaviour of urlopen():
import urllib2
from StringIO import StringIO
def mock_response(req):
    if req.get_full_url() == "http://example.com":
        resp = urllib2.addinfourl(StringIO("mock file"), "mock message", req.get_full_url())
        resp.code = 200
        resp.msg = "OK"
        return resp

class MyHTTPHandler(urllib2.HTTPHandler):
    def http_open(self, req):
        print "mock opener"
        return mock_response(req)

my_opener = urllib2.build_opener(MyHTTPHandler)
urllib2.install_opener(my_opener)
response=urllib2.urlopen("http://example.com")
print response.read()
print response.code
print response.msg
It would be best if you could write a mock urlopen (and possibly Request) which provides the minimum required interface to behave like urllib2's version. You'd then need to have your function/method which uses it able to accept this mock urlopen somehow, and use urllib2.urlopen otherwise.
This is a fair amount of work, but worthwhile. Remember that Python is very friendly to duck typing, so you just need to provide some semblance of the response object's properties to mock it.
For example:
class MockResponse(object):
    def __init__(self, resp_data, code=200, msg='OK'):
        self.resp_data = resp_data
        self.code = code
        self.msg = msg
        self.headers = {'content-type': 'text/xml; charset=utf-8'}

    def read(self):
        return self.resp_data

    def getcode(self):
        return self.code

    # Define other members and properties you want

def mock_urlopen(request):
    return MockResponse(r'<xml document>')
Granted, some of these are difficult to mock, because for example I believe the normal "headers" is an HTTPMessage which implements fun stuff like case-insensitive header names. But, you might be able to simply construct an HTTPMessage with your response data.
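For readers on Python 3, here is a rough sketch of the same idea with case-insensitive headers, using urllib.response.addinfourl together with an email.message.Message for the headers and unittest.mock.patch to swap out urlopen. The names make_fake_response and fetch_feed, the URL and the XML body are invented for the example; fetch_feed stands in for whatever code you are testing.

```python
import io
import urllib.request
import urllib.response
from email.message import Message
from unittest import mock


def make_fake_response(body, headers, url="http://example.com/feed", code=200):
    """Build a response object with case-insensitive headers."""
    msg = Message()
    for name, value in headers.items():
        msg[name] = value
    return urllib.response.addinfourl(io.BytesIO(body), msg, url, code)


def fetch_feed(url):
    """Stand-in for the code under test: uses headers and body."""
    with urllib.request.urlopen(url) as response:
        return response.getcode(), response.headers.get("content-type"), response.read()


fake = make_fake_response(
    b"<feed><item/></feed>",
    {"Content-Type": "text/xml; charset=utf-8"},
)

# While the patch is active, urlopen returns the canned response
with mock.patch("urllib.request.urlopen", return_value=fake):
    code, content_type, body = fetch_feed("http://example.com/feed")
```

Note that the header is stored as Content-Type but looked up as content-type, which a plain dict would not allow.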
Build a separate class or module responsible for communicating with your external feeds.
Make this class able to be a test double. You're using Python, so you're pretty golden there; if you were using C#, I'd suggest either an interface or virtual methods.
In your unit test, insert a test double of the external feed class. Test that your code uses the class correctly, assuming that the class does the work of communicating with your external resources correctly. Have your test double return fake data rather than live data; test various combinations of the data and of course the possible exceptions urllib2 could throw.
Aand... that's it.
You can't effectively automate unit tests that rely on external sources, so you're best off not doing it. Run an occasional integration test on your communication module, but don't include those tests as part of your automated tests.
Edit:
Just a note on the difference between my answer and @Crast's answer. Both are essentially correct, but they involve different approaches. In Crast's approach, you use a test double on the library itself. In my approach, you abstract the use of the library away into a separate module and test double that module.
Which approach you use is entirely subjective; there's no "correct" answer there. I prefer my approach because it allows me to build more modular, flexible code, something I value. But it comes at a cost in terms of additional code to write, something that may not be valued in many agile situations.
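A minimal Python 3 sketch of the "separate module plus test double" approach could look like the following. All names here (FeedFetcher, FakeFetcher, count_items) are hypothetical, and xml.etree.ElementTree is used in place of lxml only to keep the example dependency-free.

```python
import xml.etree.ElementTree as ET


class FeedFetcher:
    """The only place that talks to the network (not exercised in tests)."""

    def fetch(self, url):
        import urllib.request
        with urllib.request.urlopen(url) as response:
            return dict(response.headers), response.read()


class FakeFetcher:
    """Test double: returns canned headers and body, records the URLs asked for."""

    def __init__(self, headers, body):
        self.headers = headers
        self.body = body
        self.requested = []

    def fetch(self, url):
        self.requested.append(url)
        return self.headers, self.body


def count_items(fetcher, url):
    """Code under test: depends on the fetcher interface, not on urllib."""
    headers, body = fetcher.fetch(url)
    if not headers.get("Content-Type", "").startswith("text/xml"):
        raise ValueError("unexpected content type")
    return len(ET.fromstring(body).findall("item"))


fake = FakeFetcher(
    {"Content-Type": "text/xml; charset=utf-8"},
    b"<feed><item>a</item><item>b</item></feed>",
)
item_count = count_items(fake, "http://example.com/feed")
```

The production code would pass a FeedFetcher instance instead, and that class is what an occasional integration test would cover.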
You can use pymox to mock the behavior of anything and everything in the urllib2 (or any other) package. It's 2010, you shouldn't be writing your own mock classes.
I think the easiest thing to do is to actually create a simple web server in your unit test. When you start the test, create a new thread that listens on some arbitrary port and when a client connects just returns a known set of headers and XML, then terminates.
I can elaborate if you need more info.
Here's some code:
import threading, SocketServer, time
# a request handler
class SimpleRequestHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(102400) # token receive
        senddata = file(self.server.datafile).read() # read data from unit test file
        self.request.send(senddata)
        time.sleep(0.1) # make sure it finishes receiving request before closing
        self.request.close()

def serve_data(datafile):
    server = SocketServer.TCPServer(('127.0.0.1', 12345), SimpleRequestHandler)
    server.datafile = datafile
    # pass the bound method itself (no parentheses) so it runs in the new thread
    http_server_thread = threading.Thread(target=server.handle_request)
    http_server_thread.start()
To run your unit test, call serve_data() then call your code that requests a URL that looks like http://localhost:12345/anythingyouwant.
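On Python 3 the same idea can be sketched with http.server, which also takes care of the status line and headers for you. This is an illustrative sketch, not a drop-in replacement: CannedHandler, serve_once and the canned XML body are made-up names, port 0 asks the OS for any free port, and the empty ProxyHandler is there only so that a proxy configured in the environment does not intercept the loopback request.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class CannedHandler(BaseHTTPRequestHandler):
    body = b"<feed><item>hello</item></feed>"

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/xml; charset=utf-8")
        self.send_header("Content-Length", str(len(self.body)))
        self.end_headers()
        self.wfile.write(self.body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


def serve_once():
    """Serve exactly one request on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), CannedHandler)

    def run():
        server.handle_request()
        server.server_close()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return server.server_address[1], thread


port, thread = serve_once()
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open("http://127.0.0.1:%d/anythingyouwant" % port, timeout=5) as response:
    content_type = response.headers.get("Content-Type")
    body = response.read()
thread.join(timeout=5)
```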
Why not just mock a website that returns the response you expect? Then start the server in a thread in setup and kill it in the teardown. I ended up doing this for testing code that would send email by mocking an SMTP server, and it works great. Surely something more trivial could be done for HTTP...
from smtpd import SMTPServer
from threading import Thread
from time import sleep
import asyncore

SMTP_PORT = 6544

class MockSMTPServer(SMTPServer):
    def __init__(self, localaddr, remoteaddr, cb = None):
        self.cb = cb
        SMTPServer.__init__(self, localaddr, remoteaddr)

    def process_message(self, peer, mailfrom, rcpttos, data):
        print (peer, mailfrom, rcpttos, data)
        if self.cb:
            self.cb(peer, mailfrom, rcpttos, data)
        self.close()

def start_smtp(cb, port=SMTP_PORT):
    def smtp_thread():
        _smtp = MockSMTPServer(("127.0.0.1", port), (None, 0), cb)
        asyncore.loop()
    return Thread(None, smtp_thread)

def test_stuff():
    #.......snip noise
    # use a list: assigning to a plain name inside email_back would only
    # create a local variable and never update the outer one
    email_result = []
    def email_back(*args):
        email_result.append(args)
    t = start_smtp(email_back)
    t.start()
    sleep(1)
    res.form["email"]= self.admin_email
    res = res.form.submit()
    assert res.status_int == 302,"should've redirected"
    sleep(1)
    assert email_result, "didn't get an email"
Trying to improve a bit on @john-la-rooy's answer, I've made a small class allowing simple mocking for unit tests.
It should work with Python 2 and 3.
try:
    import urllib.request as urllib
except ImportError:
    import urllib2 as urllib
from io import BytesIO

class MockHTTPHandler(urllib.HTTPHandler):
    def mock_response(self, req):
        url = req.get_full_url()
        print("incoming request:", url)
        if url.endswith('.json'):
            resdata = b'[{"hello": "world"}]'
            headers = {'Content-Type': 'application/json'}
            resp = urllib.addinfourl(BytesIO(resdata), headers, url, 200)
            resp.msg = "OK"
            return resp
        raise RuntimeError('Unhandled URL', url)
    http_open = mock_response

    @classmethod
    def install(cls):
        previous = urllib._opener
        urllib.install_opener(urllib.build_opener(cls))
        return previous

    @classmethod
    def remove(cls, previous=None):
        urllib.install_opener(previous)
Used like this:
class TestOther(unittest.TestCase):
    def setUp(self):
        previous = MockHTTPHandler.install()
        self.addCleanup(MockHTTPHandler.remove, previous)
I have the following code for a simple BaseHTTPServer based server.
class myHandler(BaseHTTPRequestHandler):
    #Handler for the GET requests
    def do_GET(self):
        # Parse the query_str
        query_str = self.path.strip().lower()
        if query_str.startswith("/download?"):
            query_str = query_str[10:]
            opts = urlparse.parse_qs(query_str)
            # Send the html message and download file
            self.protocol_version = 'HTTP/1.1'
            self.send_response(200)
            self.send_header("Content-type", 'text/html')
            self.send_header("Content-length", 1)
            self.end_headers()
            self.wfile.write("0")
            # Some code to do some processing
            # ...
            # -----------
            self.wfile.write("1")
I was expecting the HTML page to show "1", but it shows "0". How can I update the response through keep alive?
I believe you are setting self.protocol_version to 'HTTP/1.1' too late. You are doing it in your do_GET() method, at which point your request handler has already been instantiated, and the server has already inspected that instance's protocol_version property.
Better to set it on the class:
class myHandler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'
Not sure what you are trying to accomplish, but if you want the 1 to be sent, you need to set your content-length to 2 or remove it entirely. The 1 is not going to overwrite the 0, so you will see 01.
https://docs.python.org/2/library/basehttpserver.html
protocol_version
This specifies the HTTP protocol version used in responses. If set to 'HTTP/1.1', the server will permit HTTP persistent connections; however, your server must then include an accurate Content-Length header (using send_header()) in all of its responses to clients. For backwards compatibility, the setting defaults to 'HTTP/1.0'.
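Putting the two points together (protocol_version on the class, and a Content-Length that matches everything you write), a Python 3 sketch using http.server might look like this. KeepAliveHandler is an illustrative name and the path is arbitrary; the client part reuses one http.client connection for two requests to show that the connection stays open.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Set on the class, not inside do_GET()
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"0" + b"1"  # both writes from the question, sent as one body
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # Must match the number of bytes actually written
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the output quiet


server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Two requests over the same TCP connection
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
results = []
for _ in range(2):
    conn.request("GET", "/download?x=1")
    response = conn.getresponse()
    results.append((response.version, response.read()))
conn.close()

server.shutdown()
server.server_close()
```

Here response.version is 11 for an HTTP/1.1 reply, and the body is the two characters together, as the answer above predicts.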
I faced the same problem. I tried setting protocol_version in my do_METHOD() function, which doesn't work.
My code looks like this:
def _handle(self, method):
    self.protocol_version = "HTTP/1.1"
    # some code here

def do_GET(self):
    self._handle("GET")
I used ss and tcpdump to inspect the network traffic and found that the server resets the connection after sending the response, even though it uses HTTP/1.1.
So I tried setting protocol_version directly on my class, which inherits from the standard library class, and it works. For lack of time, I didn't dive into the source code. I hope it works for others.
I need to create a secure channel between my server and a remote web service. I'll be using HTTPS with a client certificate. I'll also need to validate the certificate presented by the remote service.
How can I use my own client certificate with urllib2?
What will I need to do in my code to ensure that the remote certificate is correct?
Because alex's answer is a link, and the code on that page is poorly formatted, I'm just going to put this here for posterity:
import urllib2, httplib
class HTTPSClientAuthHandler(urllib2.HTTPSHandler):
    def __init__(self, key, cert):
        urllib2.HTTPSHandler.__init__(self)
        self.key = key
        self.cert = cert

    def https_open(self, req):
        # Rather than pass in a reference to a connection class, we pass in
        # a reference to a function which, for all intents and purposes,
        # will behave as a constructor
        return self.do_open(self.getConnection, req)

    def getConnection(self, host, timeout=300):
        return httplib.HTTPSConnection(host, key_file=self.key, cert_file=self.cert)

opener = urllib2.build_opener(HTTPSClientAuthHandler('/path/to/file.pem', '/path/to/file.pem'))
response = opener.open("https://example.org")
print response.read()
Here's a bug in the official Python bugtracker that looks relevant, and has a proposed patch.
Per Antoine Pitrou's response to the issue linked in Hank Gay's answer, this can be simplified somewhat (as of 2011) by using the included ssl library:
import ssl
import urllib.request
context = ssl.create_default_context()
context.load_cert_chain('/path/to/file.pem', '/path/to/file.key')
opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=context))
response = opener.open('https://example.org')
print(response.read())
(Python 3 code, but the ssl library is also available in Python 2).
The load_cert_chain function also accepts an optional password parameter, allowing the private key to be encrypted.
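On the second half of the question (making sure the remote certificate is correct): as far as I know, a context from ssl.create_default_context() already requires a valid server certificate and checks the hostname, so the main thing is not to switch those off. A small sketch wrapping this up follows; build_client_context and its parameters are names invented here, and the file paths in the usage note are placeholders.

```python
import ssl


def build_client_context(cert_file=None, key_file=None, password=None, ca_file=None):
    """Create a verifying TLS context, optionally with a client certificate."""
    # ca_file=None means the system's default trusted CAs are used
    context = ssl.create_default_context(cafile=ca_file)
    if cert_file is not None:
        # password is only needed when the private key is encrypted
        context.load_cert_chain(cert_file, keyfile=key_file, password=password)
    return context


# No client certificate here, so nothing is read from disk
context = build_client_context()
verifies_peer = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname
```

With real files you would call something like build_client_context('/path/to/file.pem', '/path/to/file.key') and pass the result to urllib.request.HTTPSHandler(context=...) as in the answer above.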
check http://www.osmonov.com/2009/04/client-certificates-with-urllib2.html