simplejson + GAE serialize objects with fields names - python

I use this code to define my class in GAE Python:
class Pair(db.Model):
    find = db.StringProperty()
    replace = db.StringProperty()
    rule = db.StringProperty()
    tags = db.StringListProperty()
    created = db.DateTimeProperty()
    updated = db.DateTimeProperty(auto_now=True)
Then I use this code to serialize objects of that class with simplejson:
class PairEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Pair):
            return [str(obj.created), str(obj.updated), obj.find, obj.replace, obj.tags, obj.rule]
Finally I use this code to output the result as the response:
pairsquery = GqlQuery("SELECT * FROM Pair")
pairs = pairsquery.fetch(1000)
pairsList = []
for pair in pairs:
    pairsList.append(json.dumps(pair, cls=PairEncoder))
serialized = json.dumps({
    'pairs': pairsList,
    'count': pairsquery.count()
})
self.response.out.write(serialized)
Here is a sample result I get:
{"count": 2, "pairs": ["[\"2010-12-06 12:32:48.140000\", \"2010-12-06 12:32:48.140000\", \"random string\", \"replacement\", [\"ort\", \"common\", \"movies\"], \"remove\"]", "[\"2010-12-06 12:37:07.765000\", \"2010-12-06 12:37:07.765000\", \"random string\", \"replacement\", [\"ort\", \"common\", \"movies\"], \"remove\"]"]}
All seems to be fine, except one thing - I need the fields in the response to have names from the class Pair, so there won't be just values but the names of the corresponding fields too. How can I do that?

class PairEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Pair):
            # Return a dict so every value is emitted together with its field name.
            return {"created": str(obj.created), "updated": str(obj.updated), "find": obj.find, "replace": obj.replace, "tags": obj.tags, "rule": obj.rule}
        # Let the base class raise TypeError for anything else.
        return json.JSONEncoder.default(self, obj)
But you are 'double encoding' here - i.e. encoding the pairs, adding that string to an object and encoding that too. If you 'double decode' on the other end it should work - but it's not the 'proper' way to do things.
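A minimal sketch of encoding only once. It assumes a plain Python stand-in for the datastore model so that it runs without App Engine; the stand-in `Pair` class and the `encode_pairs` helper are hypothetical and not part of the original code. The idea is to pass the objects themselves to a single json.dumps call and let the encoder convert each one:

```python
import json
from datetime import datetime


class Pair(object):
    """Hypothetical stand-in for the db.Model subclass, so this runs anywhere."""

    def __init__(self, find, replace, rule, tags, created, updated):
        self.find = find
        self.replace = replace
        self.rule = rule
        self.tags = tags
        self.created = created
        self.updated = updated


class PairEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Pair):
            return {
                "created": str(obj.created),
                "updated": str(obj.updated),
                "find": obj.find,
                "replace": obj.replace,
                "tags": obj.tags,
                "rule": obj.rule,
            }
        return json.JSONEncoder.default(self, obj)


def encode_pairs(pairs):
    # One dumps call: the Pair objects go in as-is and are encoded exactly once.
    return json.dumps({"pairs": pairs, "count": len(pairs)}, cls=PairEncoder)


stamp = datetime(2010, 12, 6, 12, 32, 48)
pair = Pair("random string", "replacement", "remove", ["ort", "common"], stamp, stamp)
serialized = encode_pairs([pair])
decoded = json.loads(serialized)  # a single decode is enough on the other end
```

With this shape, each entry of "pairs" arrives as a JSON object rather than as a string that has to be decoded a second time.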

I think I found a simpler solution for this: instead of serializing the object with simplejson, I just created a method inside the Pair class that looks like this:
    def return_dict(self):
        return {'find': self.find, 'replace': self.replace, 'rule': self.rule, 'tags': self.tags}
and does all I need. Thanks!
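Presumably the dicts still have to be turned into JSON for the response. A minimal sketch of that step, assuming a plain Python stand-in for the model (the stand-in `Pair` class and the sample values are hypothetical):

```python
import json


class Pair(object):
    """Hypothetical stand-in for the datastore model."""

    def __init__(self, find, replace, rule, tags):
        self.find = find
        self.replace = replace
        self.rule = rule
        self.tags = tags

    def return_dict(self):
        return {'find': self.find, 'replace': self.replace,
                'rule': self.rule, 'tags': self.tags}


pairs = [Pair('random string', 'replacement', 'remove', ['ort', 'common', 'movies'])]

# Build plain dicts first, then encode the whole structure in one call.
serialized = json.dumps({
    'pairs': [pair.return_dict() for pair in pairs],
    'count': len(pairs),
})
```

Because return_dict leaves out the datetime fields, the default encoder can handle every value here without a custom JSONEncoder subclass.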

Related

django - combine the output of json views

I am writing a simple JSON API. I use one base class and mostly write one API view per model class. What I want is to combine the output of a few views into one URL endpoint, with as little additional code as possible.
code:
# base class
class JsonView(View):
    def get(self, request):
        return JsonResponse(self.get_json())

    def get_json(self):
        return {}

class DerivedView(JsonView):
    param = None

    def get_json(self):
        # .. use param..
        return {'data': []}
urls.py:
url('/endpoint1', DerivedView.as_view(param=1))
url('/endpoint2', DerivedView2.as_view())
# What I want:
url('/combined', combine_json_views({
    'output1': DerivedView.as_view(param=1),
    'output2': DerivedView2.as_view()
}))
So /combined would give me the following json response:
{'output1': {'data': []}, 'output2': output of DerivedView2}
This is how combine_json_views could be implemented:
def combine_json_views(views_dict):
    d = {}
    for key, view in views_dict.items():
        d[key] = view()  # The problem is here
    return json.dumps(d)
The problem is that calling view() gives me the encoded JSON, so calling json.dumps again gives invalid JSON. I could call json.loads(view()), but it looks bad to decode the JSON that I just encoded.
How can I modify the code (maybe with a better base class) while keeping it elegant and short, without adding too much code? Is there any way to access the data (dict) that is used to construct the JsonResponse?
You can create a combined view that calls the get_json() methods and combines them:
class CombinedView(JsonView):
    def get_json(self):
        view1 = DerivedView(param=1)
        view2 = DerivedView2()
        d = view1.get_json()
        d.update(view2.get_json())
        return d
then:
url('/combined', CombinedView.as_view()),
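Note that d.update() merges the two dicts into one flat dict, so clashing keys overwrite each other. If the nested shape from the question is wanted, each result can be stored under its own key instead. A minimal sketch, using plain Python stand-ins so it runs without Django; the `combine_json` helper and the return value of `DerivedView2` are assumptions made for illustration:

```python
import json


class JsonView(object):
    """Hypothetical stand-in for the Django base view; only get_json matters here."""

    def get_json(self):
        return {}


class DerivedView(JsonView):
    def __init__(self, param=None):
        self.param = param

    def get_json(self):
        # .. use param..
        return {'data': []}


class DerivedView2(JsonView):
    def get_json(self):
        # Assumed payload; the question does not show what this view returns.
        return {'items': [1, 2]}


def combine_json(views):
    # Collect the plain dicts; encoding happens once, at the very end.
    return dict((key, view.get_json()) for key, view in views.items())


combined = combine_json({
    'output1': DerivedView(param=1),
    'output2': DerivedView2(),
})
body = json.dumps(combined, sort_keys=True)
```

In the Django version, the body of a CombinedView.get_json could return such a dict and leave the encoding to JsonResponse.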

Returning verbose names from query set when using json serializer

Is it possible to call
tasks = models.Conference.objects.filter(location_id=key)
data = serializers.serialize("json", tasks)
and have it return the verbose field names rather than the variable names?
One way to accomplish this is by monkey patching the methods within the django.core.serializers.python.Serializer class so that they return each field's verbose_name as opposed to the standard name attribute.
Take for example the following code...
models.py
from django.db import models

class RelatedNode(models.Model):
    name = models.CharField(max_length=100, verbose_name="related node")

class Node(models.Model):
    name = models.CharField(max_length=100, verbose_name="verbose name")
    related_node = models.ForeignKey(RelatedNode, verbose_name="verbose fk related node", related_name="related_node")
    related_nodes = models.ManyToManyField(RelatedNode, verbose_name="verbose related m2m nodes", related_name="related_nodes")
I create these model objects within the database...
RelatedNode.objects.create(name='related_node_1')
RelatedNode.objects.create(name='related_node_2')
RelatedNode.objects.create(name='related_node_fk')
Node.objects.create(name='node_1', related_node=RelatedNode.objects.get(name='related_node_fk'))
Node.objects.all()[0].related_nodes.add(RelatedNode.objects.get(name='related_node_1'))
Node.objects.all()[0].related_nodes.add(RelatedNode.objects.get(name='related_node_2'))
views.py
from testing.models import Node
from django.utils.encoding import smart_text, is_protected_type
from django.core.serializers.python import Serializer
from django.core import serializers

def monkey_patch_handle_field(self, obj, field):
    value = field._get_val_from_obj(obj)
    # Protected types (i.e., primitives like None, numbers, dates,
    # and Decimals) are passed through as is. All other values are
    # converted to string first.
    if is_protected_type(value):
        self._current[field.verbose_name] = value
    else:
        self._current[field.verbose_name] = field.value_to_string(obj)

def monkey_patch_handle_fk_field(self, obj, field):
    if self.use_natural_foreign_keys and hasattr(field.rel.to, 'natural_key'):
        related = getattr(obj, field.name)
        if related:
            value = related.natural_key()
        else:
            value = None
    else:
        value = getattr(obj, field.get_attname())
    self._current[field.verbose_name] = value

def monkey_patch_handle_m2m_field(self, obj, field):
    if field.rel.through._meta.auto_created:
        if self.use_natural_foreign_keys and hasattr(field.rel.to, 'natural_key'):
            m2m_value = lambda value: value.natural_key()
        else:
            m2m_value = lambda value: smart_text(value._get_pk_val(), strings_only=True)
        self._current[field.verbose_name] = [m2m_value(related)
                                             for related in getattr(obj, field.name).iterator()]

Serializer.handle_field = monkey_patch_handle_field
Serializer.handle_fk_field = monkey_patch_handle_fk_field
Serializer.handle_m2m_field = monkey_patch_handle_m2m_field

serializers.serialize('json', Node.objects.all())
This outputs for me...
u'[{"fields": {"verbose fk related node": 3, "verbose related m2m nodes": [1, 2], "verbose name": "node_1"}, "model": "testing.node", "pk": 1}]'
As we can see, this gives us back the verbose_name of each field as the keys in the returned dictionaries.
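Since the patch changes the Serializer class for every caller in the process, another option may be to leave the serializer untouched and rename the keys afterwards. A minimal sketch, assuming the serializer output has already been parsed with json.loads; the `rename_fields` helper and the hand-written `verbose_names` mapping are hypothetical and not part of Django:

```python
import json


def rename_fields(records, verbose_names):
    """Return copies of serializer records with field keys replaced by verbose names."""
    renamed = []
    for record in records:
        fields = dict(
            (verbose_names.get(name, name), value)
            for name, value in record["fields"].items()
        )
        # dict(record, fields=...) copies the record and swaps in the new fields.
        renamed.append(dict(record, fields=fields))
    return renamed


# Sample text in the shape the unpatched serializer produces for the Node model.
raw = ('[{"fields": {"name": "node_1", "related_node": 3, '
       '"related_nodes": [1, 2]}, "model": "testing.node", "pk": 1}]')

# Assumed mapping, written by hand here to match the model definitions above.
verbose_names = {
    "name": "verbose name",
    "related_node": "verbose fk related node",
    "related_nodes": "verbose related m2m nodes",
}

result = rename_fields(json.loads(raw), verbose_names)
```

Field names missing from the mapping are kept unchanged, so a partial mapping still works.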

ndb.Key filter for MapReduce input_reader

While experimenting with the input_reader filters in the new Google App Engine MapReduce library, I would like to know how to filter by ndb.Key.
I read this post and have tried datetime, string, int, and float values in filter tuples, but how can I filter by ndb.Key?
When I try to filter by a ndb.Key I get this error:
BadReaderParamsError: Expected Key, got u"Key('Clients', 406)"
Or this error:
TypeError: Key('Clients', 406) is not JSON serializable
I tried to pass a ndb.Key object and string representation of the ndb.Key.
Here are my two filters tuples:
Sample 1:
'input_reader': {
    'input_reader': 'mapreduce.input_readers.DatastoreInputReader',
    'entity_kind': 'model.Sales',
    'filters': [("client", "=", ndb.Key('Clients', 406))]
}
Sample 2:
'input_reader': {
    'input_reader': 'mapreduce.input_readers.DatastoreInputReader',
    'entity_kind': 'model.Sales',
    'filters': [("client", "=", "%s" % ndb.Key('Clients', 406))]
}
This is a bit tricky.
If you look at the code on Google Code you can see that mapreduce.model defines a JSON_DEFAULTS dict which determines the classes that get special-case handling in JSON serialization/deserialization: by default, just datetime. So, you can monkey-patch the ndb.Key class into there, and provide it with functions to do that serialization/deserialization - something like:
from mapreduce import model

def _JsonEncodeKey(o):
    """Json encode an ndb.Key object."""
    return {'key_string': o.urlsafe()}

def _JsonDecodeKey(d):
    """Json decode a ndb.Key object."""
    return ndb.Key(urlsafe=d['key_string'])

model.JSON_DEFAULTS[ndb.Key] = (_JsonEncodeKey, _JsonDecodeKey)
model._TYPE_IDS['Key'] = ndb.Key
You may also need to repeat those last two lines to patch mapreduce.lib.pipeline.util as well.
Also note if you do this, you'll need to ensure that this gets run on any instance that runs any part of a mapreduce: the easiest way to do this is to write a wrapper script that imports the above registration code, as well as mapreduce.main.APP, and override the mapreduce URL in your app.yaml to point to your wrapper.
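To illustrate the idea behind such a type registry, here is a self-contained sketch. It is not the MapReduce library's code: the `Key` stand-in, the `REGISTRY` and `TYPE_IDS` dicts, the `TYPE_TAG` marker name, and the encoder and hook below are all assumptions made up for this example. It only shows how a mapping from type to (encode, decode) functions lets a custom object survive a JSON round trip:

```python
import json


class Key(object):
    """Hypothetical stand-in for ndb.Key; only the urlsafe round trip is modeled."""

    def __init__(self, urlsafe):
        self._urlsafe = urlsafe

    def urlsafe(self):
        return self._urlsafe

    def __eq__(self, other):
        return isinstance(other, Key) and other._urlsafe == self._urlsafe


# type -> (encode to dict, decode from dict); plays the role of JSON_DEFAULTS.
REGISTRY = {
    Key: (lambda o: {'key_string': o.urlsafe()},
          lambda d: Key(urlsafe=d['key_string'])),
}
TYPE_IDS = {'Key': Key}
TYPE_TAG = '__type'  # assumed marker name, chosen for this sketch


class RegistryEncoder(json.JSONEncoder):
    def default(self, o):
        if type(o) in REGISTRY:
            encoded = REGISTRY[type(o)][0](o)
            encoded[TYPE_TAG] = type(o).__name__  # remember which type to rebuild
            return encoded
        return json.JSONEncoder.default(self, o)


def registry_hook(d):
    # Called for every decoded JSON object; rebuild registered types from their tag.
    if TYPE_TAG in d:
        cls = TYPE_IDS[d.pop(TYPE_TAG)]
        return REGISTRY[cls][1](d)
    return d


filters = [("client", "=", Key("abc123"))]
wire = json.dumps(filters, cls=RegistryEncoder)
restored = json.loads(wire, object_hook=registry_hook)
```

Both directions need the registration, which is why the patch has to be loaded on every instance that encodes or decodes mapreduce parameters.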
Make your own input reader based on DatastoreInputReader, which knows how to decode key-based filters:
from google.appengine.ext import db
from mapreduce import input_readers

class DatastoreKeyInputReader(input_readers.DatastoreKeyInputReader):
    """Augment the base input reader to accommodate ReferenceProperty filters"""

    def __init__(self, *args, **kwargs):
        try:
            filters = kwargs['filters']
            decoded = []
            for f in filters:
                value = f[2]
                # A list here is a key path produced by encode_filters().
                if isinstance(value, list):
                    value = db.Key.from_path(*value)
                decoded.append((f[0], f[1], value))
            kwargs['filters'] = decoded
        except KeyError:
            pass
        super(DatastoreKeyInputReader, self).__init__(*args, **kwargs)
Run this function on your filters before passing them in as options:
def encode_filters(filters):
    if filters is not None:
        encoded = []
        for f in filters:
            value = f[2]
            if isinstance(value, db.Model):
                value = value.key()
            if isinstance(value, db.Key):
                value = value.to_path()
            entry = (f[0], f[1], value)
            encoded.append(entry)
        filters = encoded
    return filters
Are you aware of the to_old_key() and from_old_key() methods?
I had the same problem and came up with a workaround with computed properties.
You can add to your Sales model a new ndb.ComputedProperty holding the key id. Ids are plain integers or strings, so you won't have any JSON problems.
client_id = ndb.ComputedProperty(lambda self: self.client.id())
And then add that condition to your mapreduce query filters
'input_reader': {
    'input_reader': 'mapreduce.input_readers.DatastoreInputReader',
    'entity_kind': 'model.Sales',
    # Key('Clients', 406).id() is the integer 406, so filter on the integer.
    'filters': [("client_id", "=", 406)]
}
The only drawback is that computed properties are not stored and indexed until you call the put() method, so you will have to traverse all the Sales entities and save them:
for sale in Sales.query().fetch():
    sale.put()

How to save data as BlobProperty instead of multiple ListProperties?

Currently I have the following code:
class User(db.Model):
    field_names = db.StringListProperty(indexed=False)
    field_values = db.StringListProperty(indexed=False)
    field_scores = db.ListProperty(int, indexed=False)

def fields_add(user_key_name, field_name, field_value, field_score):
    user = User.get(user_key_name)
    if user:
        try:
            field_index = user.field_names.index(field_name)  # (1)
            user.field_values[field_index] = field_value
            user.field_scores[field_index] = field_score
        except ValueError:
            # field wasn't added to the list before
            user.field_names.append(field_name)
            user.field_values.append(field_value)
            user.field_scores.append(field_score)
        user.put()
It works well, but I would like to optimize that - serialize field_name, field_value and field_score and store in one BlobProperty:
class User(db.Model):
    fields = db.ListProperty(indexed=False)

f = {
    'f': field_name,
    'v': field_value,
    's': field_score,
}
user.fields = simplejson.dumps(f)
But what should the code at (1) look like with this approach? How do I find the record to update?
If user.fields is a list of dicts where 'f' is the field name, this is one possible answer to your immediate question:
field_index = [field['f'] for field in user.fields].index(field_name)
It's not immediately clear why your revision is more optimal in your case, but I'll take your word for it. :)
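Putting the lookup together with the update-or-append logic from the question, a minimal sketch could look like the following. It assumes the blob holds a JSON-encoded list of such dicts; the `upsert_field` helper and the sample values are hypothetical:

```python
import json


def upsert_field(fields, field_name, field_value, field_score):
    """Update the record whose 'f' matches field_name, or append a new one."""
    record = {'f': field_name, 'v': field_value, 's': field_score}
    names = [field['f'] for field in fields]
    try:
        fields[names.index(field_name)] = record
    except ValueError:
        # field wasn't added to the list before
        fields.append(record)
    return fields


# What the stored blob might hold before the update (sample data).
blob = json.dumps([{'f': 'color', 'v': 'red', 's': 1}])

fields = json.loads(blob)                  # decode once
upsert_field(fields, 'color', 'blue', 5)   # existing name: replaced in place
upsert_field(fields, 'size', 'large', 2)   # new name: appended
blob = json.dumps(fields)                  # encode once before saving
```

If the order of the records does not matter, storing a dict keyed by field name instead of a list would make the lookup a direct key access.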
You can serialize objects with json or pickle.
For instance, if your model holds a property udata = db.BlobProperty(), you can assign the serialized object to it, e.g. udata = pickle.dumps(object).

Serializing ReferenceProperty in Appengine Datastore to JSON

I am using the following code to serialize my appengine datastore to JSON
class DictModel(db.Model):
    def to_dict(self):
        return dict([(p, unicode(getattr(self, p))) for p in self.properties()])

# Venue is defined before commonWordTweets because the ReferenceProperty refers to it.
class Venue(db.Model):
    id = db.StringProperty(required=True)
    fourSqid = db.StringProperty(required=False)
    name = db.StringProperty(required=True)
    twitter_ID = db.StringProperty(required=True)

class commonWordTweets(DictModel):
    commonWords = db.StringListProperty(required=True)
    venue = db.ReferenceProperty(Venue, required=True, collection_name='commonWords')
This returns the following JSON response
[
{
"commonWords": "[u'storehouse', u'guinness', u'badge', u'2011"', u'"new', u'mayor', u'dublin)']",
"venue": "<__main__.Venue object at 0x1028ad190>"
}
]
How can I return the actual venue name to appear?
Firstly, although it's not exactly your question, it's strongly recommended to use simplejson to produce json, rather than trying to turn structures into json strings yourself.
To answer your question, the ReferenceProperty just acts as a reference to your Venue object. So you just use its attributes as per normal.
Try something like:
cwt = commonWordTweets() # Replace with code to get the item from your datastore
d = {"commonWords":cwt.commonWords, "venue": cwt.venue.name}
jsonout = simplejson.dumps(d)
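The same idea can be folded into the to_dict method so that every entity serializes its reference by name. A minimal sketch with plain Python stand-ins so it runs without App Engine; the stand-in classes and the sample venue name are hypothetical:

```python
import json


class Venue(object):
    """Hypothetical stand-in for the Venue model."""

    def __init__(self, name):
        self.name = name


class CommonWordTweets(object):
    """Hypothetical stand-in for the commonWordTweets model."""

    def __init__(self, commonWords, venue):
        self.commonWords = commonWords
        self.venue = venue

    def to_dict(self):
        # Keep the list as a list and replace the referenced object by its name.
        return {'commonWords': list(self.commonWords), 'venue': self.venue.name}


cwt = CommonWordTweets(['storehouse', 'guinness'], Venue('Guinness Storehouse'))
jsonout = json.dumps([cwt.to_dict()])
```

Leaving commonWords as a real list, instead of passing it through unicode(), also avoids the quoted "[u'storehouse', ...]" string seen in the question's output.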
