Rendering requested type in Tornado

Rendering requested type in Tornado - python

In Tornado, how do you tell apart the different request types? Also, what is the proper way to split out the requests? In the end if I go to /item/1.xml, I want xml, /item/1.html to be a proper html view etc.
Something like:
def getXML():
return self.render('somexmlresult.xml')
def getHTML():
return self.rendeR('htmlresult.html')
or
def get():
if request == 'xml':
return self.render('somexmlresult.xml')
elif request == 'html':
return self.render('htmlresult.html')
~ edit ~ I was shooting for something along the lines of rails' implementation seen here

I would prefer a self describing url like a RESTful application. An url part need not be required to represent the format of the resource. http://www.enterprise.com/customer/abc/order/123 must represent the resource irrespective of whether it is xml/html/json. A way to send the requested format is to send it as one of the request parameters.
http://www.enterprise.com/customer/abc/order/123?mimetype=application/xml
http://www.enterprise.com/customer/abc/order/123?mimetype=application/json
http://www.enterprise.com/customer/abc/order/123?mimetype=text/html
Use the request parameter to serialize to the appropriate format.

mimetype is the correct way to do this, however I can see where an end user would want a more simplistic way of accessing the data in the format they wish.
In order to maintain compatibility with standards compliant libraries etc you should ultimately determine the response type based on the mimetype requested and respond with the appropriate mimetype in the headers.
A way to achieve this while not breaking anything would be to add a parser that checks the URI that was requested for a suffix that matches a tuple of defined suffixes that the route can respond to, if it does and the mimetype is not already specified change the mimetype passed in to be the correct type for the suffix.
Make sure that the final decision is based on the supplied mimetype not the suffix.
This way others can interact with your RESTful service in the way their used to and you can still maintain ease of use for humans etc.
~ edit ~
Heres an example regexp that checks to see if it ends in .js | .html | .xml | .json. This is assuming your given the full URI.
(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*\.(?:js|html|xml|json))(?:\?([^#]*))?(?:#(.*))?
Here's an example that's easier to interpret but less robust
^https?://(?:[a-z\-]+\.)+[a-z]{2,6}(?:/[^/#?]+)+\.(?:js|html|xml|json)$
These regex's are taken from rfc2396

First, set up the handlers to count on a restful style URI. We use 2 chunks of regex looking for an ID and a potential request format (ie html, xml, json etc)
class TaskServer(tornado.web.Application):
def __init__(self, newHandlers = [], debug = None):
request_format = "(\.[a-zA-Z]+$)?"
baseHandlers = [
(r"/jobs" + request_format, JobsHandler),
(r"/jobs/", JobsHandler),
(r"/jobs/new" + request_format, NewJobsHandler),
(r"/jobs/([0-9]+)/edit" + request_format, EditJobsHandler)
]
for handler in newHandlers:
baseHandlers.append(handler)
tornado.web.Application.__init__(self, baseHandlers, debug = debug)
Now, in the handler define a reusable function parseRestArgs (I put mine in a BaseHandler but pasted it here for ease of understanding/to save space) that splits out ID's and request formats. Since you should be expecting id's in a particular order, I stick them in a list.
The get function can be abstracted more but it shows the basic idea of splitting out your logic into different request formats...
class JobsHandler(BaseHandler):
def parseRestArgs(self, args):
idList = []
extension = None
if len(args) and not args[0] is None:
for arg in range(len(args)):
match = re.match("[0-9]+", args[arg])
if match:
slave_id = int(match.groups()[0])
match = re.match("(\.[a-zA-Z]+$)", args[-1])
if match:
extension = match.groups()[0][1:]
return idList, extension
def get(self, *args):
### Read
job_id, extension = self.parseRestArgs(args)
if len(job_id):
if extension == None or "html":
#self.render(html) # Show with some ID voodoo
pass
elif extension == 'json':
#self.render(json) # Show with some ID voodoo
pass
else:
raise tornado.web.HTTPError(404) #We don't do that sort of thing here...
else:
if extension == None or "html":
pass
# self.render(html) # Index- No ID given, show an index
elif extension == "json":
pass
# self.render(json) # Index- No ID given, show an index
else:
raise tornado.web.HTTPError(404) #We don't do that sort of thing here...

Related

How to convert suds object to xml string

This is a duplicate to this question:
How to convert suds object to xml
But the question has not been answered: "totxt" is not an attribute on the Client class.
Unfortunately I lack of reputation to add comments. So I ask again:
Is there a way to convert a suds object to its xml?
I ask this because I already have a system that consumes wsdl files and sends data to a webservice. But now the customers want to alternatively store the XML as files (to import them later manually). So all I need are 2 methods for writing data: One writes to a webservice (implemented and tested), the other (not implemented yet) writes to files.
If only I could make something like this:
xml_as_string = My_suds_object.to_xml()
The following code is just an example and does not run. And it's not elegant. Doesn't matter. I hope you get the idea what I want to achieve:
I have the function "write_customer_obj_webservice" that works. Now I want to write the function "write_customer_obj_xml_file".
import suds
def get_customer_obj():
wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
service_url = r'http://someiphere/Customer'
c = suds.client.Client(wsdl_url, location=service_url)
customer = c.factory.create("ns0:CustomerType")
return customer
def write_customer_obj_webservice(customer):
wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
service_url = r'http://someiphere/Customer'
c = suds.client.Client(wsdl_url, location=service_url)
response = c.service.save(someparameters, None, None, customer)
return response
def write_customer_obj_xml_file(customer):
output_filename = r'C\temp\testxml'
# The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
xml = customer.to_xml()
fo = open(output_filename, 'a')
try:
fo.write(xml)
except:
raise
else:
response = 'All ok'
finally:
fo.close()
return response
# Get the customer object always from the wsdl.
customer = get_customer_obj()
# Since customer is an object, setting it's attributes is very easy. There are very complex objects in this system.
customer.name = "Doe J."
customer.age = 42
# Write the new customer to a webservice or store it in a file for later proccessing
if later_processing:
response = write_customer_obj_xml_file(customer)
else:
response = write_customer_obj_webservice(customer)

I found a way that works for me. The trick is to create the Client with the option "nosend=True".
In the documentation it says:
nosend - Create the soap envelope but don't send. When specified, method invocation returns a RequestContext instead of sending it.
The RequestContext object has the attribute envelope. This is the XML as string.
Some pseudo code to illustrate:
c = suds.client.Client(url, nosend=True)
customer = c.factory.create("ns0:CustomerType")
customer.name = "Doe J."
customer.age = 42
response = c.service.save(someparameters, None, None, customer)
print response.envelope # This prints the XML string that would have been sent.

You have some issues in write_customer_obj_xml_file function:
Fix bad path:
output_filename = r'C:\temp\test.xml'
The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
What's the type of customer? type(customer)?
xml = customer.to_xml() # to be continued...
Why mode='a'? ('a' => append, 'w' => create + write)
Use a with statement (file context manager).
with open(output_filename, 'w') as fo:
fo.write(xml)
Don't need to return a response string: use an exception manager. The exception to catch can be EnvironmentError.
Analyse
The following call:
customer = c.factory.create("ns0:CustomerType")
Construct a CustomerType on the fly, and return a CustomerType instance customer.
I think you can introspect your customer object, try the following:
vars(customer) # display the object attributes
help(customer) # display an extensive help about your instance
Another way is to try the WSDL URLs by hands, and see the XML results.
You may obtain the full description of your CustomerType object.
And then?
Then, with the attributes list, you can create your own XML. Use an XML template and fill it with the object attributes.
You may also found the magic function (to_xml) which do the job for you. But, not sure the XML format matches your need.

client = Client(url)
client.factory.create('somename')
# The last XML request by client
client.last_sent()
# The last XML response from Web Service
client.last_received()

Pyramid route matching and query parameters

I have a Pyramid web service, and code samples are as follows:
View declaration:
#view_config(route_name="services/Prices/GetByTicker/")
def GET(request):
ticker = request.GET('ticker')
startDate = request.GET('startDate')
endDate = request.GET('endDate')
period = request.GET('period')
Routing:
config.add_route('services/Prices/GetByTicker/', 'services/Prices/GetByTicker/{ticker}/{startDate}/{endDate}/{period}')
Now I know this is all screwed up but I don't know what the convention is for Pyramid. At the moment this works inasmuch as the request gets routed to the view successfully, but then I get a "Dictionary object not callable" exception.
The URL looks horrible:
#root/services/Prices/GetByTicker/ticker=APPL/startDate=19981212/endDate=20121231/period=d
Ideally I would like to be able to use a URL something like:
#root/services/Prices/GetByTicker/?ticker=APPL&startDate=19981212&endDate=20121231&period=d
Any Pyramid bods out there willing to take five minutes to explain what I'm doing wrong?

from you sample code, i think you use the URL Dispatch
so it should be like this
config.add_route('services/Prices/GetByTicker/', 'services/Prices/GetByTicker/')
then the URL like:
#root/services/Prices/GetByTicker/?ticker=APPL&startDate=19981212&endDate=20121231&period=d
will match it
--edit--
you don't have to use a name like "services/Prices/GetByTicker" for route_name,and you can get the GET params use request.params['key']
View declaration:
#view_config(route_name="services_Prices_GetByTicker")
def services_Prices_GetByTicker(request):
ticker = request.params['ticker']
startDate = request.params['startDate']
endDate = request.params['endDate']
period = request.params['period']
Routing:
config.add_route('services_Prices_GetByTicker', 'services/Prices/GetByTicker/')

The query string is turned into the request.GET dictionary. You are using parenthesis to call the dictionary instead of accessing items via the brackets. For a url such as
#root/services/Prices/GetByTicker/?ticker=APPL&startDate=19981212&endDate=20121231&period=d
request.GET['ticker'] # -> 'APPL' or an exception if not available
request.GET.get('ticker') # -> 'APPL' or None if not available
request.GET.get('ticker', 'foo') # -> 'APPL' or 'foo' if not available
request.GET.getall('ticker') # -> ['APPL'] or [] if not available
The last option is useful if you expect ticker to be supplied multiple times.
request.params is a combination of request.GET and request.POST where the latter is a dictionary representing the request's body in a form upload.
Anyway, the answer is that request.GET('ticker') syntactically is not one of the options I mentioned, stop doing it. :-)

How to override stuff in a package at runtime?

[EDIT: I'm running Python 2.7.3]
I'm a network engineer by trade, and I've been hacking on ncclient (the version on the website is old, and this was the version I've been working off of) to make it work with Brocade's implementation of NETCONF. There are some tweaks that I had to make in order to get it to work with our Brocade equipment, but I had to fork off the package and make tweaks to the source itself. This didn't feel "clean" to me so I decided I wanted to try to do it "the right way" and override a couple of things that exist in the package*; three things specifically:
A "static method" called build() which belongs to the HelloHandler class, which itself is a subclass of SessionListener
The "._id" attribute of the RPC class (the original implementation used uuid, and Brocade boxes didn't like this very much, so in my original tweaks I just changed this to a static value that never changed).
A small tweak to a util function that builds XML filter attributes
So far I have this code in a file brcd_ncclient.py:
#!/usr/bin/env python
# hack on XML element creation and create a subclass to override HelloHandler's
# build() method to format the XML in a way that the brocades actually like
from ncclient.xml_ import *
from ncclient.transport.session import HelloHandler
from ncclient.operations.rpc import RPC, RaiseMode
from ncclient.operations import util
# register brocade namespace and create functions to create proper xml for
# hello/capabilities exchange
BROCADE_1_0 = "http://brocade.com/ns/netconf/config/netiron-config/"
register_namespace('brcd', BROCADE_1_0)
brocade_new_ele = lambda tag, ns, attrs={}, **extra: ET.Element(qualify(tag, ns), attrs, **extra)
brocade_sub_ele = lambda parent, tag, ns, attrs={}, **extra: ET.SubElement(parent, qualify(tag, ns), attrs, **extra)
# subclass RPC to override self._id to change uuid-generated message-id's;
# Brocades seem to not be able to handle the really long id's
class BrcdRPC(RPC):
def __init__(self, session, async=False, timeout=30, raise_mode=RaiseMode.NONE):
self._id = "1"
return super(BrcdRPC, self).self._id
class BrcdHelloHandler(HelloHandler):
def __init__(self):
return super(BrcdHelloHandler, self).__init__()
#staticmethod
def build(capabilities):
hello = brocade_new_ele("hello", None, {'xmlns':"urn:ietf:params:xml:ns:netconf:base:1.0"})
caps = brocade_sub_ele(hello, "capabilities", None)
def fun(uri): brocade_sub_ele(caps, "capability", None).text = uri
map(fun, capabilities)
return to_xml(hello)
#return super(BrcdHelloHandler, self).build() ???
# since there's no classes I'm assuming I can just override the function itself
# in ncclient.operations.util?
def build_filter(spec, capcheck=None):
type = None
if isinstance(spec, tuple):
type, criteria = spec
# brocades want the netconf prefix on subtree filter attribute
rep = new_ele("filter", {'nc:type':type})
if type == "xpath":
rep.attrib["select"] = criteria
elif type == "subtree":
rep.append(to_ele(criteria))
else:
raise OperationError("Invalid filter type")
else:
rep = validated_element(spec, ("filter", qualify("filter")),
attrs=("type",))
# TODO set type var here, check if select attr present in case of xpath..
if type == "xpath" and capcheck is not None:
capcheck(":xpath")
return rep
And then in my file netconftest.py I have:
#!/usr/bin/env python
from ncclient import manager
from brcd_ncclient import *
manager.logging.basicConfig(filename='ncclient.log', level=manager.logging.DEBUG)
# brocade server capabilities advertising as 1.1 compliant when they're really not
# this will stop ncclient from attempting 1.1 chunked netconf message transactions
manager.CAPABILITIES = ['urn:ietf:params:netconf:capability:writeable-running:1.0', 'urn:ietf:params:netconf:base:1.0']
# BROCADE_1_0 is the namespace defined for netiron configs in brcd_ncclient
# this maps to the 'brcd' prefix used in xml elements, ie subtree filter criteria
with manager.connect(host='hostname_or_ip', username='username', password='password') as m:
# 'get' request with no filter - for brocades just shows 'show version' data
c = m.get()
print c
# 'get-config' request with 'mpls-config' filter - if no filter is
# supplied with 'get-config', brocade returns nothing
netironcfg = brocade_new_ele('netiron-config', BROCADE_1_0)
mplsconfig = brocade_sub_ele(netironcfg, 'mpls-config', BROCADE_1_0)
filterstr = to_xml(netironcfg)
c2 = m.get_config(source='running', filter=('subtree', filterstr))
print c2
# so far it only looks like the supported filters for 'get-config'
# operations are: 'interface-config', 'vlan-config' and 'mpls-config'
Whenever I run my netconftest.py file, I get timeout errors because in the log file ncclient.log I can see that my subclass definitions (namely the one that changes the XML for hello exchange - the staticmethod build) are being ignored and the Brocade box doesn't know how to interpret the XML that the original ncclient HelloHandler.build() method is generating**. I can also see in the generated logfile that the other things I'm trying to override are also being ignored, like the message-id (static value of 1) as well as the XML filters.
So, I'm kind of at a loss here. I did find this blog post/module from my research, and it would appear to do exactly what I want, but I'd really like to be able to understand what I'm doing wrong via doing it by hand, rather than using a module that someone has already written as an excuse to not have to figure this out on my own.
*Can someone explain to me if this is "monkey patching" and is actually bad? I've seen in my research that monkey patching is not desirable, but this answer and this answer are confusing me quite a bit. To me, my desire to override these bits would prevent me from having to maintain an entire fork of my own ncclient.
**To give a little more context, this XML, which ncclient.transport.session.HelloHandler.build() generates by default, the Brocade box doesn't seem to like:
<?xml version='1.0' encoding='UTF-8'?>
<nc:hello xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<nc:capabilities>
<nc:capability>urn:ietf:params:netconf:base:1.0</nc:capability>
<nc:capability>urn:ietf:params:netconf:capability:writeable-running:1.0</nc:capability>
</nc:capabilities>
</nc:hello>
The purpose of my overridden build() method is to turn the above XML into this (which the Brocade does like:
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<capabilities>
<capability>urn:ietf:params:netconf:base:1.0</capability>
<capability>urn:ietf:params:netconf:capability:writeable-running:1.0</capability>
</capabilities>
</hello>

So it turns out that the "meta info" should not have been so hastily removed, because again, it's difficult to find answers to what I'm after when I don't fully understand what I want to ask. What I really wanted to do was override stuff in a package at runtime.
Here's what I've changed brcd_ncclient.py to (comments removed for brevity):
#!/usr/bin/env python
from ncclient import manager
from ncclient.xml_ import *
brcd_new_ele = lambda tag, ns, attrs={}, **extra: ET.Element(qualify(tag, ns), attrs, **extra)
brcd_sub_ele = lambda parent, tag, ns, attrs={}, **extra: ET.SubElement(parent, qualify(tag, ns), attrs, **extra)
BROCADE_1_0 = "http://brocade.com/ns/netconf/config/netiron-config/"
register_namespace('brcd', BROCADE_1_0)
#staticmethod
def brcd_build(capabilities):
hello = brcd_new_ele("hello", None, {'xmlns':"urn:ietf:params:xml:ns:netconf:base:1.0"})
caps = brcd_sub_ele(hello, "capabilities", None)
def fun(uri): brcd_sub_ele(caps, "capability", None).text = uri
map(fun, capabilities)
return to_xml(hello)
def brcd_build_filter(spec, capcheck=None):
type = None
if isinstance(spec, tuple):
type, criteria = spec
# brocades want the netconf prefix on subtree filter attribute
rep = new_ele("filter", {'nc:type':type})
if type == "xpath":
rep.attrib["select"] = criteria
elif type == "subtree":
rep.append(to_ele(criteria))
else:
raise OperationError("Invalid filter type")
else:
rep = validated_element(spec, ("filter", qualify("filter")),
attrs=("type",))
if type == "xpath" and capcheck is not None:
capcheck(":xpath")
return rep
manager.transport.session.HelloHandler.build = brcd_build
manager.operations.util.build_filter = brcd_build_filter
And then in netconftest.py:
#!/usr/bin/env python
from brcd_ncclient import *
manager.logging.basicConfig(filename='ncclient.log', level=manager.logging.DEBUG)
manager.CAPABILITIES = ['urn:ietf:params:netconf:capability:writeable-running:1.0', 'urn:ietf:params:netconf:base:1.0']
with manager.connect(host='host', username='user', password='password') as m:
netironcfg = brcd_new_ele('netiron-config', BROCADE_1_0)
mplsconfig = brcd_sub_ele(netironcfg, 'mpls-config', BROCADE_1_0)
filterstr = to_xml(netironcfg)
c2 = m.get_config(source='running', filter=('subtree', filterstr))
print c2
This gets me almost to where I want to be. I still have to edit the original source code to change the message-id's from being generated with uuid1().urn because I haven't figured out or don't understand how to change an object's attributes before __init__ happens at runtime (chicken/egg problem?); here's the offending code in ncclient/operations/rpc.py:
class RPC(object):
DEPENDS = []
REPLY_CLS = RPCReply
def __init__(self, session, async=False, timeout=30, raise_mode=RaiseMode.NONE):
self._session = session
try:
for cap in self.DEPENDS:
self._assert(cap)
except AttributeError:
pass
self._async = async
self._timeout = timeout
self._raise_mode = raise_mode
self._id = uuid1().urn # Keeps things simple instead of having a class attr with running ID that has to be locked
Credit goes to this recipe on ActiveState for finally cluing me in on what I really wanted to do. The code I had originally posted I don't think was technically incorrect - if what I wanted to do was fork off my own ncclient and make changes to it and/or maintain it, which wasn't what I wanted to do at all, at least not right now.
I'll edit my question title to better reflect what I had originally wanted - if other folks have better or cleaner ideas, I'm totally open.

Best way to make subapps with Traversal

Ok so I have my apps, that takes requests from root / Almost everything is using traversal.
But i'd like to make on top of that site a rest api.
So I'm off with two choices. I either separate the that in two different apps and put that rest application to : rest.site.com, Or I can move it to site.com/rest/*traversal
If I'm doing "/rest/*traversal", I guess I'll have to add a route called rest_traversal where the traversal path will be *traversal with the route /rest/*traversal. I did that once for an admin page.
I was wondering if there was a cleanest way to do that. I tried to use virtual_root, but as I understand virtual_root is actually getting added to the path for traversal.
like having virtual_root = /cms and requesting /fun will create the following path /cms/fun
I on the other hand wish to have /cms/fun turned into /fun

I know this has been answered already, but in case someone arrives here looking for another possible way to make "subapps" and using them in pyramid, I wanted to point out that some interesting things can be done with pyramid.wsgi
"""
example of wsgiapp decorator usage
http://docs.pylonsproject.org/projects/pyramid/en/1.3-branch/api/wsgi.html
"""
from pyramid.wsgi import wsgiapp2, wsgiapp
from pyramid.config import Configurator
from webob import Request, Response
import pprint
# define some apps
def wsgi_echo(environ, start_response):
"""pretty print out the environ"""
response = Response(body=pprint.pformat({k: v for k, v in environ.items()
if k not in ["wsgi.errors",
"wsgi.input",
"SCRIPT_NAME"]}))
return response(environ, start_response)
print Request.blank("/someurl").send(wsgi_echo).body
# convert wsgi app to a pyramid view callable
pyramid_echo = wsgiapp(wsgi_echo)
pyramid_echo_2 = wsgiapp2(wsgi_echo)
# wire up a pyramid application
config = Configurator()
config.add_view(pyramid_echo, name="foo") # /foo
config.add_view(pyramid_echo, name="bar") # /bar
config.add_view(pyramid_echo_2, name="foo_2") # /foo
config.add_view(pyramid_echo_2, name="bar_2") # /bar
pyramid_app = config.make_wsgi_app()
#call some urls
foo_body = Request.blank("/foo").send(pyramid_app).body
bar_body = Request.blank("/bar").send(pyramid_app).body
foo_body_2 = Request.blank("/foo_2").send(pyramid_app).body
bar_body_2 = Request.blank("/bar_2").send(pyramid_app).body
# both should be different because we arrived at 2 different urls
assert foo_body != bar_body, "bodies should not be equal"
# should be equal because wsgiapp2 fixes stuff before calling
# application in fact there's an additional SCRIPT_NAME in the
# environment that we are filtering out
assert foo_body_2 == bar_body_2, "bodies should be equal"
# so how to pass the path along? like /foo/fuuuu should come back
# /fuuuu does it
foo_body = Request.blank("/foo_2/fuuuu").send(pyramid_app).body
assert "'/fuuuu'," in foo_body, "path didn't get passed along"
# tldr: a wsgi app that is decorated with wsgiapp2 will recieve data
# as if it was mounted at "/", any url generation it has to do should
# take into account the SCRIPT_NAME variable that may arrive in the
# environ when it is called

If you're using traversal already, why not just use it to return your "rest API root" object when Pyramid traverses to /rest/? From there, everything will work naturally.
class ApplicationRoot(object):
def __getitem__(self, name):
if name == "rest":
return RestAPIRoot(parent=self, name=name)
...
If your "application tree" and "API tree" have the same children and you want to have different views registered for them depending on which branch of the tree the child is located in, you can use containment view predicate to register your API views, so they will only match when the child is inside the "API branch":
containment
This value should be a reference to a Python class or interface that a
parent object in the context resource’s lineage must provide in order
for this view to be found and called. The resources in your resource
tree must be “location-aware” to use this feature.
If containment is not supplied, the interfaces and classes in the
lineage are not considered when deciding whether or not to invoke the
view callable.
Another approach would be not to build a separate "API tree" but to use your "main" application's "URI-space" as RESTful API. The only problem with this is that GET and possibly POST request methods are already "taken" on your resources and mapped to your "normal" views which return HTML or consume HTTP form POSTs. There are numerous ways to work around this:
register the API views with a separate name, so, say GET /users/123 would return HTML and GET /users/123/json would return a JSON object. Similarly, POST /users/123 would expect HTTP form to be posted and POST /users/123/json would expect JSON. A nice thing about this approach is that you can easily add, say, an XML serializer at GET /users/123/xml.
use custom view predicates so GET /users/123 and GET /users/123?format=json are routed to different views. Actually, there's a built-in request_param predicate for that since Pyramid 1.2
use xhr predicate to differentiate requests based on HTTP_X_REQUESTED_WITH header or accept predicate to differentiate on HTTP_ACCEPT header

How do I handle file upload via PUT request in Django?

I'm implementing a REST-style interface and would like to be able to create (via upload) files via a HTTP PUT request. I would like to create either a TemporaryUploadedFile or a InMemoryUploadedFile which I can then pass to my existing FileField and .save() on the object that is part of the model, thereby storing the file.
I'm not quite sure about how to handle the file upload part. Specifically, this being a put request, I do not have access to request.FILES since it does not exist in a PUT request.
So, some questions:
Can I leverage existing functionality in the HttpRequest class, specifically the part that handles file uploads? I know a direct PUT is not a multipart MIME request, so I don't think so, but it is worth asking.
How can I deduce the mime type of what is being sent? If I've got it right, a PUT body is simply the file without prelude. Do I therefore require that the user specify the mime type in their headers?
How do I extend this to large amounts of data? I don't want to read it all into memory since that is highly inefficient. Ideally I'd do what TemporaryUploadFile and related code does - write it part at a time?
I've taken a look at this code sample which tricks Django into handling PUT as a POST request. If I've got it right though, it'll only handle form encoded data. This is REST, so the best solution would be to not assume form encoded data will exist. However, I'm happy to hear appropriate advice on using mime (not multipart) somehow (but the upload should only contain a single file).
Django 1.3 is acceptable. So I can either do something with request.raw_post_data or request.read() (or alternatively some other better method of access). Any ideas?

Django 1.3 is acceptable. So I can
either do something with
request.raw_post_data or
request.read() (or alternatively some
other better method of access). Any
ideas?
You don't want to be touching request.raw_post_data - that implies reading the entire request body into memory, which if you're talking about file uploads might be a very large amount, so request.read() is the way to go. You can do this with Django <= 1.2 as well, but it means digging around in HttpRequest to figure out the the right way to use the private interfaces, and it's a real drag to then ensure your code will also be compatible with Django >= 1.3.
I'd suggest that what you want to do is to replicate the existing file upload behaviour parts of the MultiPartParser class:
Retrieve the upload handers from request.upload_handlers (Which by default will be MemoryFileUploadHandler & TemporaryFileUploadHandler)
Determine the request's content length (Search of Content-Length in HttpRequest or MultiPartParser to see the right way to do this.)
Determine the uploaded file's filename, either by letting the client specify this using the last path part of the url, or by letting the client specify it in the "filename=" part of the Content-Disposition header.
For each handler, call handler.new_file with the relevant args (mocking up a field name)
Read the request body in chunks using request.read() and calling handler.receive_data_chunk() for each chunk.
For each handler call handler.file_complete(), and if it returns a value, that's the uploaded file.
How can I deduce the mime type of what
is being sent? If I've got it right, a
PUT body is simply the file without
prelude. Do I therefore require that
the user specify the mime type in
their headers?
Either let the client specify it in the Content-Type header, or use python's mimetype module to guess the media type.
I'd be interested to find out how you get on with this - it's something I've been meaning to look into myself, be great if you could comment to let me know how it goes!
Edit by Ninefingers as requested, this is what I did and is based entirely on the above and the django source.
upload_handlers = request.upload_handlers
content_type = str(request.META.get('CONTENT_TYPE', ""))
content_length = int(request.META.get('CONTENT_LENGTH', 0))
if content_type == "":
return HttpResponse(status=400)
if content_length == 0:
# both returned 0
return HttpResponse(status=400)
content_type = content_type.split(";")[0].strip()
try:
charset = content_type.split(";")[1].strip()
except IndexError:
charset = ""
# we can get the file name via the path, we don't actually
file_name = path.split("/")[-1:][0]
field_name = file_name
Since I'm defining the API here, cross browser support isn't a concern. As far as my protocol is concerned, not supplying the correct information is a broken request. I'm in two minds as to whether I want say image/jpeg; charset=binary or if I'm going to allow non-existent charsets. In any case, I'm putting setting Content-Type validly as a client-side responsibility.
Similarly, for my protocol, the file name is passed in. I'm not sure what the field_name parameter is for and the source didn't give many clues.
What happens below is actually much simpler than it looks. You ask each handler if it will handle the raw input. As the author of the above states, you've got MemoryFileUploadHandler & TemporaryFileUploadHandler by default. Well, it turns out MemoryFileUploadHandler will when asked to create a new_file decide whether it will or not handle the file (based on various settings). If it decides it's going to, it throws an exception, otherwise it won't create the file and lets another handler take over.
I'm not sure what the purpose of counters was, but I've kept it from the source. The rest should be straightforward.
counters = [0]*len(upload_handlers)
for handler in upload_handlers:
result = handler.handle_raw_input("",request.META,content_length,"","")
for handler in upload_handlers:
try:
handler.new_file(field_name, file_name,
content_type, content_length, charset)
except StopFutureHandlers:
break
for i, handler in enumerate(upload_handlers):
while True:
chunk = request.read(handler.chunk_size)
if chunk:
handler.receive_data_chunk(chunk, counters[i])
counters[i] += len(chunk)
else:
# no chunk
break
for i, handler in enumerate(upload_handlers):
file_obj = handler.file_complete(counters[i])
if not file_obj:
# some indication this didn't work?
return HttpResponse(status=500)
else:
# handle file obj!

Newer Django versions allow for handling this a lot easier thanks to https://gist.github.com/g00fy-/1161423
I modified the given solution like this:
if request.content_type.startswith('multipart'):
put, files = request.parse_file_upload(request.META, request)
request.FILES.update(files)
request.PUT = put.dict()
else:
request.PUT = QueryDict(request.body).dict()
to be able to access files and other data like in POST. You can remove the calls to .dict() if you want your data to be read-only.

I hit this problem while working with Django 2.2, and was looking for something that just worked for uploading a file via PUT request.
from django.http import QueryDict
from django.http.multipartparser import MultiValueDict
from django.core.files.uploadhandler import (
SkipFile,
StopFutureHandlers,
StopUpload,
)
class PutUploadMiddleware(object):
def __init__(self, get_response):
self.get_response = get_response
def __call__(self, request):
method = request.META.get("REQUEST_METHOD", "").upper()
if method == "PUT":
self.handle_PUT(request)
return self.get_response(request)
def handle_PUT(self, request):
content_type = str(request.META.get("CONTENT_TYPE", ""))
content_length = int(request.META.get("CONTENT_LENGTH", 0))
file_name = request.path.split("/")[-1:][0]
field_name = file_name
content_type_extra = None
if content_type == "":
return HttpResponse(status=400)
if content_length == 0:
# both returned 0
return HttpResponse(status=400)
content_type = content_type.split(";")[0].strip()
try:
charset = content_type.split(";")[1].strip()
except IndexError:
charset = ""
upload_handlers = request.upload_handlers
for handler in upload_handlers:
result = handler.handle_raw_input(
request.body,
request.META,
content_length,
boundary=None,
encoding=None,
)
counters = [0] * len(upload_handlers)
for handler in upload_handlers:
try:
handler.new_file(
field_name,
file_name,
content_type,
content_length,
charset,
content_type_extra,
)
except StopFutureHandlers:
break
for chunk in request:
for i, handler in enumerate(upload_handlers):
chunk_length = len(chunk)
chunk = handler.receive_data_chunk(chunk, counters[i])
counters[i] += chunk_length
if chunk is None:
# Don't continue if the chunk received by
# the handler is None.
break
for i, handler in enumerate(upload_handlers):
file_obj = handler.file_complete(counters[i])
if file_obj:
# If it returns a file object, then set the files dict.
request.FILES.appendlist(file_name, file_obj)
break
any(handler.upload_complete() for handler in upload_handlers)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.