Is there any way in GAE to query entities by one of their parent's properties, like this (which doesn't work)?
class Car(db.Model):
    title = db.StringProperty()
    type = db.StringProperty()

class Part(db.Model):
    title = db.StringProperty()

car = Car()
car.title = 'BMW X5'
car.type = 'SUV'
car.put()

part = Part(parent = car)
part.title = 'Left door'
part.put()

parts = Part.all()
parts.filter('parent.type ==', 'SUV') # this in particular
I've read about ReferenceProperty and indexes, but I'm not sure what I need.
GAE lets me set a parent on the Part entity, but do I actually need a (kind of duplicate) reference as well:
parent = db.ReferenceProperty(Car, required=True)
That would feel like duplicating what the system already does, since the entity has a parent. Or is there another way?
It's not an answer to your question as such, but NDB offers structured properties.
https://developers.google.com/appengine/docs/python/ndb/properties#structured
You can structure a model's properties. For example, you can define a model class Contact containing a list of addresses, each with internal structure.
Although the structured properties instances are defined using the same syntax as for model classes, they are not full-fledged entities. They don't have their own keys in the Datastore. They cannot be retrieved independently of the entity to which they belong. An application can, however, query for the values of their individual fields.
So here car would contain parts as a structured property. Whether this is viable in your use case depends on how you structure your data. If you want to know what parts make up a specific car, that seems viable. If you want to filter global parts regardless of what car they belong to, you can still do that, but you'll have to make the "parts" inside each car also refer to a different model, if you see what I mean (I'm not sure I do), as each car contains its own parts.
Adding the parent as an explicit property isn't going to help.
You can break it up in two parts though:
for suv in Car.all().filter('type', 'SUV'):
    # db.Query restricts to an ancestor via the .ancestor() method
    for part in Part.all().ancestor(suv):
        pass  # ...do something with part...
If you want to query on the property of another (parent) object, you gotta get that object first.
I can think of two solutions to your problem:
Guido's way is to query for the parent, and then query for the part. This way issues more queries.
The second way is to store a copy of parent.type inside your Part. The downsides are that you're storing duplicate data (more storage), and you have to be careful that the data in Part and the data in Car stay in sync. However, you only need to issue one query.
You'll have to figure out which one works better for you.
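To make the second option a bit more concrete, here is a minimal sketch in plain Python. It does not use the datastore API at all: the Car and Part classes, the car_type field and the helper functions are hypothetical stand-ins, and the lists stand in for stored entities. It only shows where the copy is written and where it has to be kept in sync.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    title: str
    car_type: str  # denormalized copy of the parent's type (hypothetical field name)


@dataclass
class Car:
    title: str
    type: str
    parts: List[Part] = field(default_factory=list)

    def set_type(self, new_type):
        # Every change to the parent's type must also be pushed to the copies.
        self.type = new_type
        for part in self.parts:
            part.car_type = new_type


def add_part(car, title):
    # Copy the parent's type onto the part at creation time.
    part = Part(title=title, car_type=car.type)
    car.parts.append(part)
    return part


def parts_by_car_type(all_parts, car_type):
    # One pass over the parts; the parent is never looked up.
    return [p for p in all_parts if p.car_type == car_type]


bmw = Car("BMW X5", "SUV")
mini = Car("Mini Cooper", "Hatchback")
all_parts = [add_part(bmw, "Left door"), add_part(mini, "Bonnet")]
suv_parts = parts_by_car_type(all_parts, "SUV")
```

The cost shows up in set_type: whenever the parent changes, every copy has to be rewritten, which in the datastore means extra writes.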
Related
I want to have several "bundles" (Mjbundle), which essentially are bundles of questions (Mjquestion). The Mjquestion has an integer "index" property which needs to be unique, but only within the bundle containing it. I'm not sure how to model something like this properly. Below I try to do it using a structured (repeated) property, but so far nothing actually constrains the uniqueness of the Mjquestion indexes. What is a better/normal/correct way of doing this?
class Mjquestion(ndb.Model):
    """This is a Mjquestion."""
    index = ndb.IntegerProperty(indexed=True, required=True)
    genre1 = ndb.IntegerProperty(indexed=False, required=True, choices=[1,2,3,4,5,6,7])
    genre2 = ndb.IntegerProperty(indexed=False, required=True, choices=[1,2,3])
    #(will add a bunch of more data properties later)

class Mjbundle(ndb.Model):
    """This is a Mjbundle."""
    mjquestions = ndb.StructuredProperty(Mjquestion, repeated=True)
    time = ndb.DateTimeProperty(auto_now_add=True)
(With the above model and having fetched a certain Mjbundle entity, I am not sure how to quickly fetch a Mjquestion from mjquestions based on the index. The explanation on filtering on structured properties looks like it works on the Mjbundle type level, whereas I already have a Mjbundle entity and was not sure how to quickly query only on the questions contained by that entity, without looping through them all "manually" in code.)
So I'm open to any suggestion on how to do this better.
I read this informational answer: https://stackoverflow.com/a/3855751/129202 It gives some thoughts about scalability and on a related note I will be expecting just a couple of bundles but each bundle will have questions in the thousands.
Maybe I should not use the mjquestions property of Mjbundle at all, but rather focus on parenting: each Mjquestion created should have a certain Mjbundle entity as parent. And then "manually" enforce uniqueness at "insert time" by doing an ancestor query.
When you use a StructuredProperty, all of the entities of that type are stored as part of the containing entity, so when you fetch your bundle, you have already fetched all of the questions. If you stick with this way of storing things, iterating to check in code is the solution.
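As a rough illustration of "iterating to check in code", here is a sketch in plain Python. The Question tuple is a hypothetical stand-in for the structured Mjquestion values already loaded with the bundle, and find_question / add_question are made-up helper names, not NDB calls.

```python
from collections import namedtuple

# Stand-in for the structured values held in Mjbundle.mjquestions.
Question = namedtuple("Question", ["index", "genre1", "genre2"])


def find_question(questions, index):
    """Return the first question with the given index, or None."""
    for q in questions:
        if q.index == index:
            return q
    return None


def add_question(questions, new_question):
    """Append new_question unless its index is already used in this bundle."""
    if find_question(questions, new_question.index) is not None:
        raise ValueError("index %d already used in this bundle" % new_question.index)
    questions.append(new_question)


bundle_questions = []
add_question(bundle_questions, Question(1, 3, 2))
add_question(bundle_questions, Question(2, 7, 1))
```

With thousands of questions per bundle, each lookup is a linear scan; building a dict keyed by index once after fetching the bundle would avoid repeating it.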
How do we implement aggregation or composition with NDB on Google App Engine? What is the best way to proceed depending on the use case?
Thanks !
I've tried to use a repeated property. In this very simple example, a Project has a list of Tag keys (I chose to code it this way instead of using StructuredProperty because many Project objects can share Tag objects).
class Project(ndb.Model):
    name = ndb.StringProperty()
    tags = ndb.KeyProperty(kind=Tag, repeated=True)
    budget = ndb.FloatProperty()
    date_begin = ndb.DateProperty(auto_now_add=True)
    date_end = ndb.DateProperty(auto_now_add=True)

    @classmethod
    def all(cls):
        return cls.query()

    @classmethod
    def addTags(cls, from_str):
        tagname_list = from_str.split(',')
        tag_list = []
        for tag in tagname_list:
            tag_list.append(Tag.addTag(tag))
        cls.tags = tag_list
--
Edited (2) :
Thanks. In the end, I chose to create a new Model class 'Relation' representing a relation between two entities. It's more of an association; I confess that my first design was ill-suited.
An alternative would be to use BigQuery. At first we used NDB, with a RawModel which stores individual, non-aggregated records, and an AggregateModel, which stores the aggregate values.
The AggregateModel was updated every time a RawModel was created, which caused some inconsistency issues. In hindsight, properly using parent/ancestor keys as Tim suggested would've worked, but in the end we found BigQuery much more pleasant and intuitive to work with.
We just have cron jobs that run every day: one to push RawModel records to BigQuery and another to create the AggregateModel records with data fetched from BigQuery.
(Of course, this is only effective if you have lots of data to aggregate)
It really does depend on the use case. For small numbers of items StructuredProperty and repeated properties may well be the best fit.
For large numbers of entities you will then look at setting the parent/ancestor in the Key for composition, and having a KeyProperty pointing to the primary entity in a many-to-one aggregation.
However, the choice will also depend heavily on the actual usage pattern. Then considerations of efficiency kick in.
The best I can suggest is to consider carefully how you plan to use these relationships: how active they are (i.e. are they constantly changing, with members being added and deleted), and whether you need to see all members of the relation most of the time or just subsets. These considerations may well require adjustments to the approach.
Is there any way of using JsonProperties in queries in NDB/GAE? I can't seem to find any information about this.
Person.query(Person.custom.eye_color == "blue").fetch()
With a model looking something like this:
class Person(ndb.Model):
    height = ndb.IntegerProperty(default=-1)
    #...
    #...
    custom = ndb.JsonProperty(indexed=False, compressed=False)
The use case is this: I'm storing data about customers, where at first we only needed to query specific data. Now we want to be able to query for any type of registered data about the persons. For example eye color, which some may have put into the system, or any other custom key/value pair in our JsonProperty.
I know about the expando class but for me, it seems a lot easier to be able to query jsonproperty and to keep all the custom properties on the same "name"; custom. That means that the front end can just loop over the properties in custom. If an expando class would be used, it would be harder to differentiate.
Rather than using a JsonProperty, have you considered using a StructuredProperty? You maintain the same structure, just stored differently, and you can filter by subcomponents of the StructuredProperty. There are some restrictions, but that may be sufficient.
See https://developers.google.com/appengine/docs/python/ndb/queries#filtering_structured_properties
for querying StructuredProperties.
I have a model Entry
class Entry(db.Model):
    year = db.StringProperty()
    .
    .
    .
and for whatever reason the last name field is stored in a different model LastName:
class LastName(db.Model):
    entry = db.ReferenceProperty(Entry, collection_name='last_names')
    last_name = db.StringProperty()
If I query Entry and sort it by year (or any other property) using .order() how would I then sort that by the last name? I'm new to python but coming from Java I would guess there's some kind of comparator equivalent; or I'm completely wrong and there's another way to do it. I for sure cannot change my model at this point in time, though that may be the solution later. Any suggestions?
EDIT: I'm currently paginating through the results using offsets (moving to cursors soon, but I think it would be the same issue). So if I try to sort outside of the datastore I would only be sorting the current set; it's possible that the first page will be all 'B's and the second page will have 'A's, so it will only be sorted by page not by overall set. Am I screwed the way my models are currently set up?
A few issues here.
There's no way to do this sorting directly in the datastore API, either in Python or Java - as you no doubt know, the datastore is non-relational, and indirect lookups like this aren't supported.
If this was just a straight one-to-one relationship, which gave you an accessor from the Entry entity to the LastName one, you could use the standard Python sort function to sort the list:
entries.sort(key=lambda e: e.last_name.last_name)
(note that this sorts the list in place but returns None, so don't try assigning from it).
However, this won't work, because what you've actually got here is a one-to-many relationship: there are potentially many LastNames for each Entry. The definition actually recognises this: the collection_name attribute, which defines the accessor from Entry to LastName, is called last_names, ie plural.
So what you're asking doesn't really make sense: which of the potentially many LastNames do you want to sort on? You can certainly do it the other way round - given a query of LastNames, sort by entry year - but given your current structure there's not really any way of doing it.
I must say though, although I don't know the rest of your models, I suspect you have actually got that relationship the wrong way round: the ReferenceProperty should probably live on Entry pointing to LastName rather than the other way round as it is now. Then it would simply be the sort call I gave above.
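If you do settle on a rule for which of the many last names counts, the in-memory sort could look roughly like the sketch below. It is plain Python, not datastore code: Entry here is a hypothetical stand-in whose last_names is already a list of strings, and "use the alphabetically first last name" is purely an assumption for illustration.

```python
from collections import namedtuple

# Stand-in for fetched entries with their last names already loaded.
Entry = namedtuple("Entry", ["year", "last_names"])


def sort_key(entry):
    # Assumption: sort on the alphabetically first last name;
    # entries without any last name sort first within their year.
    first = min(entry.last_names) if entry.last_names else ""
    return (entry.year, first)


entries = [
    Entry("2011", ["Brown"]),
    Entry("2010", ["Young", "Adams"]),
    Entry("2011", ["Archer"]),
]
entries.sort(key=sort_key)  # sorts in place and returns None
ordered = [(e.year, min(e.last_names)) for e in entries]
```

Note that this only gives an overall ordering if the whole result set is in memory; sorting one page at a time still has the per-page problem described in the question's edit.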
Assume I have a model definition like this:
class Image(db.Model):
id = db.StringProperty()
url = db.URLProperty()
Now I want to add some fields to this model to make it look like this:
class Image(db.Model):
id = db.StringProperty()
url = db.URLProperty()
width = db.IntegerProperty()
height = db.IntegerProperty()
So, this new model will be applied properly to newly added Image entities. But I also want to update already existing entities so that they contain these two new fields, and fill them with values. Will an already existing entity get these two fields automatically, so that when I refer to them it gives me empty fields, or will it cause an error? I suppose I will have to create a helper function that goes through all existing entities and sets the new field values, right? So, what should I keep in mind, and what is the best way to do this model update? I think it will happen from time to time as the application evolves, so it would be useful to have a straightforward flow for it.
This exact scenario is covered in the GAE docs (articles section):
Updating your model's schema.
Basically just change the model definition as you've done, then perform some operation to supply default values for all your extant entities. There are several ways to do the second part - the article describes one.
No, already existing entities won't get these two fields automatically, and the fields won't be assumed to be None. It will cause an error when those fields are accessed on existing objects. The only solution available now is to use remote_api and write your own script to update the existing records. It won't be a big deal: write a script that gets all the records in the datastore and sets some default values for the new attributes.
Setting_Up_remote_api
Update_schema
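The shape of such an update script can be sketched in plain Python. Nothing here is the remote_api or datastore API: the records are ordinary dicts standing in for stored Image entities, and backfill, defaults and batch_size are hypothetical names. In a real script the marked spot is where each batch would be written back.

```python
def backfill(entities, defaults, batch_size=100):
    """Fill in missing fields on stored records, one batch at a time.

    Returns the number of records that were changed.
    """
    updated = 0
    for start in range(0, len(entities), batch_size):
        batch = entities[start:start + batch_size]
        for entity in batch:
            changed = False
            for name, value in defaults.items():
                # Only touch fields that are absent or unset.
                if entity.get(name) is None:
                    entity[name] = value
                    changed = True
            if changed:
                updated += 1
        # In a real script, this is where the batch would be saved.
    return updated


images = [
    {"id": "a", "url": "http://example.com/a.png"},
    {"id": "b", "url": "http://example.com/b.png", "width": 640, "height": 480},
]
count = backfill(images, {"width": 0, "height": 0})
```

Working in batches keeps each round trip small, and skipping records that already have values makes the script safe to re-run if it is interrupted.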