What is the best REST implementation when using Tornado RequestHandlers

I would like to define a REST API with a general pattern of:
mysite.com/OBJECT_ID/associations
For example:
mysite.com/USER_ID/vacations - manage a user's vacations
mysite.com/USER_ID/music - manage music in the user's music library
mysite.com/PLAYLIST_ID/music - manage music in the context of the given playlist
I am using Tornado on the server side and am looking for suggestions on how to define the RequestHandlers for this API. For instance, I want to define a handler like:
    (r"/([0-9,a-z,A-Z,-]+)/music", MusicHandler),
but I'm stuck on the implementation of MusicHandler, which needs to know whether the object specified in the URI supports music in the first place, i.e. how to guard against a call like
mysite.com/LOCATION_ID/music
Where locations have no associations with music.
Is the best fix to modify the API to include the type, i.e.:
mysite.com/users/USER_ID/music or
mysite.com/playlists/PLAYLIST_ID/music
and then a separate handler for each:
    (r"/users/([0-9,a-z,A-Z,-]+)/music", UserMusicHandler),
    (r"/playlists/([0-9,a-z,A-Z,-]+)/music", PlaylistMusicHandler)
That doesn't seem right, but I don't really understand how to make this work. I'm sure this is a simple issue; I am new to Python and Tornado.

First of all, to guard against mysite.com/LOCATION_ID/music I would make each kind of ID distinguishable from the others. For instance, make LOCATION_ID a 32-character string, PLAYLIST_ID a 34-character string, etc. This way you can check the string length as soon as the handler is called.
Alternatively, you can use regular expression groups to catch the difference right in the URI and then define a different handler for each. (Also, your ID should probably come after all of the static text in the URI, just as a matter of convention.) For instance, if your PLAYLIST_ID is a UUID and your LOCATION_ID is an alphanumeric string:
    (r"/music/([\w]{8}-[\w]{4}-[\w]{4}-[\w]{4}-[\w]{12})", PlaylistMusicHandler),  # playlistID
    (r"/music/([A-Za-z0-9]+)", LocationHandler),  # locationID
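To see how the two patterns above divide up incoming paths, here is a minimal standard-library sketch. It assumes that routes are tried in order and that a pattern must match the whole path, which is how plain Tornado route tuples behave as far as I know. The `resolve` helper, the handler names as strings, and the sample IDs are hypothetical and only illustrate the dispatch; this is not Tornado's router.

```python
import re

# Route table mirroring the two patterns above. Handler names are plain
# strings because this sketch does not import Tornado.
ROUTES = [
    (r"/music/([\w]{8}-[\w]{4}-[\w]{4}-[\w]{4}-[\w]{12})", "PlaylistMusicHandler"),
    (r"/music/([A-Za-z0-9]+)", "LocationHandler"),
]


def resolve(path):
    """Return (handler_name, captured_id) for the first matching route, or None."""
    for pattern, handler in ROUTES:
        match = re.fullmatch(pattern, path)
        if match:
            return handler, match.group(1)
    return None


playlist = resolve("/music/123e4567-e89b-12d3-a456-426614174000")
location = resolve("/music/Paris75001")
unknown = resolve("/music/not_a-valid-id")
```

Because the second pattern does not allow hyphens, a UUID can only reach the playlist handler, and anything that fits neither shape falls through and would end up as a 404.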

    if not self.db.get("SELECT 1 FROM objects WHERE music_id = %s", object_id):
        raise HTTPError(404, "Music object %s not found" % object_id)
FWIW, a mysite.com/music/MUSIC_ID scheme makes more sense to me.

Related

POST query in Tornado with multiple parameters

Code:
    import tornado.web

    class Telegram(tornado.web.RequestHandler):
        def my_f(self, number):
            return number

        def get(self, number):
            self.write(self.my_f(number))

    application = tornado.web.Application([
        (r"/number/(.*?)", Telegram),
    ])
Using this piece of code, I can trigger Telegram, providing it with whatever matches the (.*?) part.
The question is: I need to make POST queries like:
/number/messenger=telegram&phone=3332223332211
so that I can grab the messenger parameter and the phone parameter, and trigger the right class with the provided phone number (like Telegram with 3332223332211).

POST requests (usually) have a body, so if you want everything in the URL you probably want a GET instead of a POST.
The normal way to pass arguments is by form-encoding them. That starts with a ? and looks like this: /number?messenger=telegram&phone=12345. To use arguments like this in Tornado, you use self.get_argument("messenger") instead of an argument to the get() method.
A second way of passing parameters is to put them in the "path" part of the URL, without a question mark. This is when you use (.*?) in your routing pattern and an argument to get(). Use this when you want to avoid the question mark for some reason (usually aesthetics).
You can also combine the two: pass the messenger parameter in the path of the URL, as you've done here, then add ?phone=12345 and read it with get_argument. But unless you really care about what your URLs look like, I recommend the first form.
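As a framework-neutral illustration of the first form, the sketch below parses a form-encoded URL with the standard library and picks a handler by the messenger argument. The `HANDLERS` table, `handle_telegram` and `dispatch` are hypothetical names introduced here; inside a Tornado handler the same two values would come from self.get_argument("messenger") and self.get_argument("phone") instead.

```python
from urllib.parse import parse_qs, urlsplit


def handle_telegram(phone):
    # Placeholder for whatever the Telegram class would do with the number.
    return "telegram:" + phone


# Hypothetical lookup table from messenger name to the code that serves it.
HANDLERS = {"telegram": handle_telegram}


def dispatch(url):
    """Parse ?messenger=...&phone=... and call the matching handler."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; take the first value of each.
    messenger = query.get("messenger", [None])[0]
    phone = query.get("phone", [None])[0]
    if messenger not in HANDLERS or phone is None:
        raise ValueError("unknown messenger or missing phone")
    return HANDLERS[messenger](phone)


result = dispatch("/number?messenger=telegram&phone=3332223332211")
```

A lookup table like this keeps a single route and handler for every messenger, so supporting another one means adding one dictionary entry.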

Flask : understanding POST method to transmit data

My question is quite hard to describe, so I will focus on explaining the situation. Let's say I have 2 different entities, which may run on different machines. Let's call the first one Manager and the second one Generator. The Manager is the only one that can be called by the user.
The Manager has a method called getVM(scenario_Id), which takes the ID of a scenario as a parameter and retrieves the BLOB corresponding to that ID from the database. This BLOB is actually an XML structure that I need to send to the Generator. Both run Flask.
On another machine, I have my Generator with a generateVM() method, which will create a VM according to the XML structure it receives. We will not talk about how the VM is created from the XML.
Currently I have this:
Manager
    # This method will be called by the user
    # @app.route("/getVM/<int:scId>", methods=['GET'])
    def getVM(scId):
        xmlContent = db.getXML(scId)  # So here is what I want to send
        generatorAddr = sgAdd + "/generateVM"  # sgAdd is declared in Initialize() [IP of the Generator]
        # Here how should I put my data?
        # How can I transmit xmlContent?
        testReturn = urlopen(generatorAddr).read()
        return json.dumps(testReturn)
Generator
    # This method will be called by the Manager
    # @app.route("/generateVM", methods=['POST'])
    def generateVM():
        # Retrieve the XML content...
        return "Whatever"
So as you can see, I am stuck on how to transmit the data itself (the XML structure), and then how to treat it... So if you have any strategy, hint, tip, clue on how I should proceed, please feel free to answer. Maybe there are some things I do not really understand about Flask, so feel free to correct everything wrong I said.
Best regards and thank you
PS : Lines with routes are commented because they mess up the syntax coloration

Unless I'm missing something, couldn't you just transmit it in the body of a POST request? Isn't that how your generateVM method is set up?
    import json
    import urllib  # Python 2 API, as in the documentation linked below

    # @app.route("/getVM/<int:scId>", methods=['GET'])
    def getVM(scId):
        xmlContent = db.getXML(scId)
        generatorAddr = sgAdd + "/generateVM"
        # Form-encode the XML; passing data= makes urlopen issue a POST
        data_str = urllib.urlencode({'xml': xmlContent})
        testReturn = urllib.urlopen(generatorAddr, data=data_str).read()
        return json.dumps(testReturn)
http://docs.python.org/2/library/urllib.html#urllib.urlopen
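On Python 3 the same idea can be sketched with urllib.request, sending the XML as the raw request body rather than as a form field. The URL and the `build_xml_post` helper below are assumptions made for illustration. On the Generator side, Flask exposes the raw body through request.get_data().

```python
from urllib.request import Request


def build_xml_post(url, xml_text):
    """Build a POST request whose body is the XML document itself."""
    return Request(
        url,
        data=xml_text.encode("utf-8"),  # a non-None body makes urllib send a POST
        headers={"Content-Type": "application/xml"},
    )


# Hypothetical Generator address; in the question it would be sgAdd + "/generateVM".
req = build_xml_post("http://generator.example/generateVM", "<vm><name>test</name></vm>")

# Sending it takes one more call (left commented out so the sketch runs offline):
# from urllib.request import urlopen
# response_body = urlopen(req).read()
```

Sending the document as the body avoids URL-encoding a potentially large XML string, at the cost of the receiver having to parse the body itself instead of reading a form field.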

In Python App Engine How Do I Uniquely Identify An Instance Of My App Running On The Dev SDK?

My application relies on an external service that it communicates with using urlfetch. I have multiple developers each running their own instance of my application on their development computers while they add features. Each developer instance needs to be able to uniquely identify itself to the external service so that the external service can keep their data separated.
I need a way to automatically generate a unique identifier for each developer from within the application.
Yes, I could just have each developer put a unique id in a variable in their code but I would much prefer it was automatic.
Also, I could probably read some information about the hardware on the computer (like MAC address) and use that but I want this code to use only things that work on the production server so that I can use it there eventually as well.

The only trick I've seen to identify instances is using a global variable address.
    import logging

    UNIQUE_INSTANCE_ID = {}  # at module level
    logging.debug("Instance %s." % (str("%X" % id(UNIQUE_INSTANCE_ID)).zfill(16)))
That seems to work fairly well to uniquely identify an instance; but it only identifies an instance, not a machine. So if you restart your instance, you get a new identifier. That might be a "feature".
You could also use some of the META variables; if developers are all running out of a home directory, you could parse a username out of 'PATH_TRANSLATED'.
At the very least, you could make injecting a UUID into the datastore part of the data population; store a metadata kind in the datastore and the cache, and wrap that UUID into the requests.
    from uuid import uuid4

    from google.appengine.ext import db
    from google.appengine.api import memcache

    cache = memcache.Client()

    class InstanceStamp(db.Model):
        code = db.StringProperty()
        INSTANCE_STAMP_KEY = "instance_stamp"

        @classmethod
        def get_stamp(cls):
            cache_key = cls.INSTANCE_STAMP_KEY
            stamp_code = cache.get(cache_key)
            if stamp_code is None:
                # Not cached: create the stamp once, or load the existing one
                code = uuid4().hex
                stamp = cls.get_or_insert('instance_stamp', code=code)
                if stamp is not None:
                    cache.set(cache_key, stamp.code, 300)
                    stamp_code = stamp.code
            return stamp_code

The string version of each db.Key instance has the same prefix and that prefix appears to be unique per developer instance. Even though they all have the same application id the encoded version of a key is different per machine.
For instance the string key for Foo:1 on one machine is:
ahNkZXZ-bWVkaWFjb29sZXItYXBwcgkLEgNGb28YAQw
On another machine it is:
ahFzfm1lZGlhY29vbGVyLWFwcHIJCxIDRm9vGAEM
I am not sure how many characters (bits?) of the key represents the application name instead of the type and id of the object so I don't think just using a sub-string containing the first N characters is the correct way to do this.
First attempt:
    def get_unique_id():
        return str(db.Key.from_path('UNIQUE_INSTANCE_ID', 1))
What it does is create a bogus db.Key for a model type that does not exist. On different machines this gives a different value and on the same machine this consistently gives the same value.
UPDATE:
As @Nick Johnson pointed out, this does not actually work as I expected and does not solve the problem. I had assumed the appid used in the key was the one from app.yaml. However, the appid used for keys is the appid from app.yaml with a prefix that depends on whether the application is being run in the SDK or on the HR datastore, so the string representations of those keys differ because the appid in them is different.

Rewriting An URL With Regular Expression Substitution in Routes

In my Pylons app, some content is located at URLs that look like http://mysite/data/31415. Users can go to that URL directly, or search for "31415" via the search page. My constraints, however, mean that http://mysite/data/000031415 should go to the same page as the above, as should searches for "0000000000031415." Can I strip leading zeroes from that string in Routes itself, or do I need to do that substitution in the controller file? If it's possible to do it in routing.py, I'd rather do it there - but I can't quite figure it out from the documentation that I'm reading.

You can actually do that via conditional functions, since they let you modify the variables from the URL in place.
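A minimal sketch of such a condition function follows. As far as I know, Routes calls a condition function with the WSGI environ and the dictionary of matched URL variables, and the function may change that dictionary before returning True or False. The variable name record_id, the controller and action names, and the wiring shown in the comment are assumptions, not taken from the original application.

```python
def strip_leading_zeros(environ, match_dict):
    """Normalize match_dict['record_id'] so '000031415' becomes '31415'."""
    record_id = match_dict.get("record_id", "")
    if not record_id.isdigit():
        return False  # reject the route for missing or non-numeric IDs
    # lstrip would turn "000" into "", so fall back to a single zero.
    match_dict["record_id"] = record_id.lstrip("0") or "0"
    return True


# Assumed wiring in config/routing.py (requires the Routes package):
# map.connect("/data/{record_id}", controller="data", action="view",
#             conditions=dict(function=strip_leading_zeros))

urlvars = {"record_id": "000031415"}
accepted = strip_leading_zeros({}, urlvars)
```

The search page would still need to apply the same normalization to its query string, since a search term does not pass through the route matcher.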

I know I am cheating by introducing a different routing library, since I haven't used Routes, but here's how this is done with Werkzeug's routing package. It lets you specify that a given fragment of the path is an integer. You can also implement a more specialized "converter" by inheriting from werkzeug.routing.BaseConverter if you want to parse something more interesting (e.g. a UUID).
Perhaps Routes has a similar mechanism in place for specialized path-fragment-parsing needs.
    import unittest

    from werkzeug.routing import Map, Rule

    class RoutingWithInts(unittest.TestCase):
        m = Map([Rule('/data/<int:record_locator>', endpoint='data_getter')])

        def test_without_leading_zeros(self):
            urls = self.m.bind('localhost')
            endpoint, urlvars = urls.match('/data/31415')
            self.assertEqual({'record_locator': 31415}, urlvars)

        def test_with_leading_zeros(self):
            urls = self.m.bind('localhost')
            endpoint, urlvars = urls.match('/data/000031415')
            self.assertEqual({'record_locator': 31415}, urlvars)

    if __name__ == '__main__':
        unittest.main()

Recursive delete in google app engine

I'm using google app engine with django 1.0.2 (and the django-helper) and wonder how people go about doing recursive delete.
Suppose you have a model that's something like this:
    class Top(BaseModel):
        pass

    class Bottom(BaseModel):
        daddy = db.ReferenceProperty(Top)
Now, when I delete an object of type 'Top', I want all the associated 'Bottom' objects to be deleted as well.
As things are now, when I delete a 'Top' object, the 'Bottom' objects stay and then I get data that doesn't belong anywhere. When accessing the datastore in a view, I end up with:
Caught an exception while rendering: ReferenceProperty failed to be resolved.
I could of course find all objects and delete them, but since my real model is at least 5 levels deep, I'm hoping there's a way to make sure this can be done automatically.
I've found this article about how it works with Java and that seems to be pretty much what I want as well.
Anyone know how I could get that behavior in django as well?

You need to implement this manually, by looking up affected records and deleting them at the same time as you delete the parent record. You can simplify this, if you wish, by overriding the .delete() method on your parent class to automatically delete all related records.
For performance reasons, you almost certainly want to use key-only queries (allowing you to get the keys of entities to be deleted without having to fetch and decode the actual entities), and batch deletes. For example:
db.delete(Bottom.all(keys_only=True).filter("daddy =", top).fetch(1000))

Actually that behavior is GAE-specific. Django's ORM simulates "ON DELETE CASCADE" on .delete().
I know that this is not an answer to your question, but maybe it can help you from looking in the wrong places.

Reconsider the data structure. If the relationship will never change over the record's lifetime, you could use the "ancestors" feature of GAE:
class Top(db.Model): pass
class Middle(db.Model): pass
class Bottom(db.Model): pass
top = Top()
middles = [Middle(parent=top) for i in range(0,10)]
bottoms = [Bottom(parent=middle) for i in range(0,10) for middle in middles]
Then querying for ancestor=top will find all the records from all levels. So it will be easy to delete them.
descendants = list(db.Query().ancestor(top))
# should return [top] + middles + bottoms

If your hierarchy is only a small number of levels deep, then you might be able to do something with a field that looks like a file path:
daddy.ancestry = "greatgranddaddy/granddaddy/daddy/"
me.ancestry = daddy.ancestry + me.uniquename + "/"
sort of thing. You do need unique names, at least unique among siblings.
The path in object IDs sort of does this already, but IIRC that's bound up with entity groups, which you're advised not to use to express relationships in the data domain.
Then you can construct a query to return all of granddaddy's descendants using the initial substring trick, like this:
    query = Person.all()
    query.filter("ancestry >", gdaddy.ancestry + u"\u0001")
    query.filter("ancestry <", gdaddy.ancestry + u"\uffff")
Obviously this is no use if you can't fit the ancestry into a 500 byte StringProperty.
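The initial-substring trick can be tried outside the datastore with plain Python strings, since the range filter relies only on string ordering. This is a sketch with made-up ancestry values and a hypothetical `descendants_of` helper. It uses "\uffff" as the upper sentinel on the assumption that no name contains that character.

```python
def descendants_of(ancestry_values, ancestor_path):
    """Return the paths that start with ancestor_path, excluding the ancestor itself."""
    low = ancestor_path
    high = ancestor_path + "\uffff"  # sorts after any ordinary continuation
    return [path for path in ancestry_values if low < path < high]


# Made-up ancestry strings in the "parent/child/" format described above.
paths = ["a/", "a/b/", "a/b/c/", "a/b/c/d/", "a/bb/", "a/c/"]
found = descendants_of(paths, "a/b/")
```

The trailing "/" on every path is what keeps the sibling "a/bb/" out of the result: it differs from "a/b/" at the fourth character, so it sorts above the upper bound.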
