Understanding ndb key class vs KeyProperty

I've looked through the ndb documentation and SO questions and answers, and I am still struggling to understand a small piece of this: which should you choose, and when?
This is what I've read so far (just a sample):
ndb documentation
movie database structure on SO
Parent Key issues
The Key class seems pretty straightforward to me. When you create an ndb entity, the datastore automatically creates a key for it, usually in the form Key(Kind, id), where the id is generated for you.
So say you have these two models:
class Blah(ndb.Model):
    last_name = ndb.StringProperty()
class Blah2(ndb.Model):
    first_name = ndb.StringProperty()
    blahkey = ndb.KeyProperty()
So, using just the key, say you want to make Blah a parent (for example, to have several family members with the same last name):
lname = Blah(last_name = "Bonaparte")
l_key = lname.put() **OR**
l_key = lname.key.id() # spits out some long id
fname_key = l_key **OR**
fname_key = ndb.Key('Blah', lname.last_name) # which is more readable..
then:
lname = Blah2( parent=fname_key, first_name = "Napoleon")
lname.put()
lname2 = Blah2( parent=fname_key, first_name = "Lucien")
lname2.put()
So far so good (I think). Now about the KeyProperty for Blah2. Assume Blah is still the same.
lname3 = Blah2( first_name = "Louis", blahkey = fname_key)
lname3.put()
Is this correct ?
How to query various things
Query Last Name:
Blah.query() # all last names
Blah.query(last_name='Bonaparte') # That specific entity.
First Name:
Blah2.query()
napol = Blah2.query(first_name = "Napoleon")
bonakey = napol.key.parent().get() # returns Bonaparte's key ??
bona = bonakey.get() # I think this might be redundant
This is where I get lost: how do I look up Bonaparte from the first name by using either the key or the KeyProperty? I didn't add it here, and perhaps should have, but there is also the discussion of parents, grandparents and great-grandparents, since keys keep track of ancestors.
How and why would you use KeyProperty vs the inherent Key class? Also, imagine you had 3 sensors: s1, s2, s3. Each sensor has thousands of readings, but you want to keep readings associated with s1 so that you could graph, say, all readings for today for s1. Which would you use, KeyProperty or the Key class? I apologize if this has been answered elsewhere, but I didn't see a clear example or guide about choosing which one and why/how.

I think the confusion comes from how a Key is used. A Key is not associated with any properties inside an entity; it is only a unique identifier used to locate a single entity. Its ID can be either a number or a string.
Fortunately, all your code looks good except for this one line:
fname_key = ndb.Key('Blah', lname.last_name) # which is more readable..
Constructing a Key takes a unique ID, which is not the same as a property. That is, it won't associate the variable lname.last_name with the property last_name. Instead, you can create your record like this:
lname = Blah(id = "Bonaparte")
lname.put()
lname_key = ndb.Key('Blah', "Bonaparte")
You are guaranteed to have only one Blah entity with that ID. In fact, if you use a string like last_name as the ID, you don't need to store it as a separate property. Think of the entity ID as an extra string property that is unique.
Next, be careful not to assume that Blah.last_name and Blah2.first_name are unique in your queries:
lname = Blah2( parent=fname_key, first_name = "Napoleon")
lname.put()
If you do this more than once, there will be multiple entities with a first_name of Napoleon (all with the same parent key).
Continuing with your code from above:
napol = Blah2.query(first_name = "Napoleon")
bonakey = napol.key.parent().get() # returns Bonaparte's key ??
bona = bonakey.get() # I think this might be redundant
napol holds a Query, not a result. You need to call napol.fetch() to get all entities with "Napoleon" (or napol.get() if you're sure there is just one entity).
bonakey is the opposite: because of the .get(), it holds the parent entity, not Bonaparte's key. If you left the .get() off, then bona would correctly hold the parent entity.
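To make the key-versus-entity distinction concrete, here is a minimal plain-Python sketch. It does not use NDB at all: the `Key` class and the `STORE` dict are hypothetical stand-ins that only mimic the idea that a key is a path of (kind, id) pairs, that `parent()` returns another key, and that `get()` returns the entity. Using "Napoleon" as the child's id is also an assumption made for readability.

```python
from typing import Optional, Tuple

# Hypothetical in-memory "datastore": maps a key path to an entity dict.
STORE = {}


class Key:
    """Stand-in for a datastore key: a path of (kind, id) pairs."""

    def __init__(self, *pairs: Tuple[str, str]):
        self.pairs = tuple(pairs)

    def parent(self) -> Optional["Key"]:
        # Dropping the last pair yields the parent's key (None for a root key).
        return Key(*self.pairs[:-1]) if len(self.pairs) > 1 else None

    def get(self):
        # Looking a key up returns the entity stored under it, not a key.
        return STORE.get(self.pairs)


bonaparte_key = Key(("Blah", "Bonaparte"))
napoleon_key = Key(("Blah", "Bonaparte"), ("Blah2", "Napoleon"))

STORE[bonaparte_key.pairs] = {"last_name": "Bonaparte"}
STORE[napoleon_key.pairs] = {"first_name": "Napoleon"}

parent_key = napoleon_key.parent()   # still a key
parent_entity = parent_key.get()     # the entity itself
```

In this sketch, calling `.get()` a second time on `parent_entity` would fail for the same reason as in the question: it is already an entity, not a key.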
Finally, your question about sensors. You may not need KeyProperty or "inherent" keys. If you have a Readings class like this:
class Readings(ndb.Model):
    sensor = ndb.StringProperty()
    reading = ndb.IntegerProperty()
then you can store them all in a single kind (the Datastore analogue of a table) without any key relationships. (You may want to include a timestamp or other attribute.) Later, you can retrieve them with this query:
s1_readings = Readings.query(Readings.sensor == 'S1').fetch()
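For the "all readings for today for s1" case, the selection logic amounts to an equality filter on the sensor plus a range filter on a timestamp. The sketch below shows that logic in plain Python rather than NDB; the `Reading` dataclass, the `timestamp` field and the `readings_for_day` helper are assumptions for illustration and are not part of the original model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    sensor: str
    value: int
    timestamp: datetime


def readings_for_day(readings, sensor, day_start):
    """Return one sensor's readings in the 24 hours from day_start, oldest first."""
    day_end = day_start + timedelta(days=1)
    selected = [r for r in readings
                if r.sensor == sensor and day_start <= r.timestamp < day_end]
    return sorted(selected, key=lambda r: r.timestamp)


readings = [
    Reading("S1", 20, datetime(2024, 1, 1, 9, 0)),
    Reading("S2", 35, datetime(2024, 1, 1, 9, 5)),
    Reading("S1", 22, datetime(2024, 1, 1, 8, 0)),
    Reading("S1", 19, datetime(2023, 12, 31, 23, 0)),  # previous day, excluded
]
s1_today = readings_for_day(readings, "S1", datetime(2024, 1, 1))
```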

I'm also new to NDB and don't understand all of it yet, but I think that when you create Blah2 with a parent for Napoleon, you need the parent to query for it, or it will not appear. For example:
napol = Blah2.query(first_name = "Napoleon")
will not get anything (and it does not use the right filter format for NDB), but using the parent will:
napol = Blah2.query(ancestor=fname_key).filter(Blah2.first_name == "Napoleon").get()
I don't know if this sheds some light on your question.

Related

Runtime Foreign Key vs Integerfield

I have a problem. I already have two solutions for it, but I was wondering which of them is faster.
I guess that the second solution is not only more convenient to use but also faster, but I want to be sure, which is why I'm asking.
My problem is that I want to group multiple rows together. The group won't hold any metadata, so I'm only interested in runtime.
On the one hand, I can use an integer field and filter on it later when I need all entries that belong to the group. I guess the runtime is O(n).
class SingleEntries(models.Model):
    name = models.CharField(max_length=20)
    group = models.IntegerField(null=True)
def find_all_group_members(id):
    return SingleEntries.objects.filter(group=id)
The second solution, and probably the more practical one, would be to create a foreign key to another model, using only the pk there.
Then i can use the reverse relation to find all the entries that belong to the group.
class Group(models.Model):
    id = models.AutoField(primary_key=True)
class SingleEntries(models.Model):
    name = models.CharField(max_length=20)
    group = models.ForeignKey(Group,on_delete=models.CASCADE,null=True)
def find_all_group_members(id):
    return Group.objects.get(id=id).singleentries_set.all()

The first is more efficient, since it uses one query, whereas the latter first fetches the Group and then runs another query for the SingleEntries.
Indeed, if you work with:
SingleEntries.objects.filter(group=id)
this will make a simple query:
SELECT appname_singleentries.*
FROM appname_singleentries
WHERE appname_singleentries.group_id = id
It thus does not first fetch the Group into memory.
The latter will however make two queries. Indeed, it will first make a query to retrieve the Group, and then it will make a query like the one above to fetch the SingleEntries.
The two are also semantically not entirely the same: if there is no such group, then the former will return an empty QuerySet, whereas the latter will raise a Group.DoesNotExist exception.
But you can model this with:
class Group(models.Model):
    pass
class SingleEntries(models.Model):
    name = models.CharField(max_length=20)
    group = models.ForeignKey(Group,on_delete=models.CASCADE,null=True)
def find_all_group_members(id):
    return SingleEntries.objects.filter(group_id=id)
So you can use a Group model without having to retrieve the Group first.
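The single-query behaviour can be checked outside Django with the standard-library `sqlite3` module. This is only a sketch of the SQL involved: the table names `app_group` and `app_singleentries` and the sample rows are hypothetical, chosen to mirror the two models above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_group (id INTEGER PRIMARY KEY);
    CREATE TABLE app_singleentries (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        group_id INTEGER REFERENCES app_group(id)
    );
    INSERT INTO app_group (id) VALUES (1), (2);
    INSERT INTO app_singleentries (name, group_id)
        VALUES ('a', 1), ('b', 1), ('c', 2), ('d', NULL);
""")


def find_all_group_members(group_id):
    # One SELECT on the entries table; the group row itself is never read.
    rows = conn.execute(
        "SELECT name FROM app_singleentries WHERE group_id = ? ORDER BY name",
        (group_id,),
    )
    return [name for (name,) in rows]


members = find_all_group_members(1)
missing = find_all_group_members(99)  # unknown group: empty list, no error
```

As with the QuerySet filter, asking for a group that does not exist simply yields an empty result rather than raising.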

If the groups are static in nature, that means if you don't see more groups coming to your system, you can use choices in Django.
Define choices as below
class GroupType(models.IntegerChoices):
    GROUP_0 = 0, "Group 0 name"
    GROUP_1 = 1, "Group 1 name"
    GROUP_2 = 2, "Group 2 name"
And use it as choices field in the SingleEntries model as below
class SingleEntries(models.Model):
    name = models.CharField(max_length=20)
    group = models.IntegerField(choices=GroupType.choices, default=<set default here>)
If the groups are dynamic, meaning users can create groups whenever they want, in that case, go with your second approach of having another model for group.

GAE NDB Accessing sub-instances (fields) within structured repeated list

I am having difficulties with accessing an instance within a structured list.
Below is my structured list:
class FavFruits(ndb.Model):
    fruit = ndb.StringProperty()
    score = ndb.IntegerProperty()
    comment = ndb.TextProperty()
class UserProfile(ndb.Model):
    uid = ndb.StringProperty(required=True)
    password = ndb.StringProperty(required=True)
    firstName = ndb.StringProperty(required=True)
    favFruits = ndb.StructuredProperty(FavFruits, repeated=True)
I want to display score under FavFruits entity.
I tried UserProfile.favFruits.score with no luck.
I also tried UserProfile.favFruits[index].score, which worked, but now requires looping and I would like to avoid it.
Ultimately, I want to do the following logic:
if UserProfile.uid == userEntering then user enters fruit name
if UserProfile.favFruits.fruit == fruitName (user entered) then display UserProfile.favFruits.score and UserProfile.favFruits.comments for UserProfile.favFruits.fruit specified by user.
Lastly, I would like to display all the fruit/scores that user enters. Say, user entered "apple" and "orange" for fruit names, then I want to loop, for example (along this line):
for x in fruitNames:
    print x
    print UserProfile.favFruits.score.query(UserProfile.favFruits.fruit == x)
Is this possible? Seemingly trivial task, but I cannot figure this out..
Thank you in advance!

Your requirements are contradictory. If you don't want to loop, then don't use repeated=True. But then you won't be able to store more than one for each entity. There's no possible way to have multiple things without looping or indexing.
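Some iteration over the repeated values is unavoidable, but it can be done once up front by building a lookup keyed on the fruit name. The sketch below uses plain dataclasses as stand-ins for the NDB models; the class and field names and the `scores_by_fruit` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FavFruit:
    fruit: str
    score: int
    comment: str = ""


@dataclass
class UserProfile:
    uid: str
    fav_fruits: List[FavFruit] = field(default_factory=list)


def scores_by_fruit(profile):
    """Index the repeated sub-records by fruit name with a single pass."""
    return {f.fruit: f for f in profile.fav_fruits}


profile = UserProfile("u1", [FavFruit("apple", 8, "crisp"), FavFruit("orange", 6)])
index = scores_by_fruit(profile)
for name in ["apple", "orange", "kiwi"]:
    entry = index.get(name)
    print(name, entry.score if entry else "not rated")
```

After the index is built, each lookup by fruit name is a dictionary access, so the per-fruit loop in the question no longer needs to scan the list.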

referencing an entity by its key before it gets saved to the ndb

I would like to be able to relate an entity of one class to another entity at the moment both entities are created (one entity will have the other as its parent, and the parent will hold a key pointing to the child). It seems I am unable to obtain the key of an entity before it gets saved to the Datastore. Is there any way to achieve the above without having to save one of the entities twice?
Below is the example:
class A(ndb.Model):
    key_of_b = ndb.KeyProperty(kind='B')
class B(ndb.Model):
    pass
What I am trying to do:
a = A()
b = B(parent=a.key)
a.key_of_b = b.key
a.put()
b.put()
If the key doesn't get assigned before the entity is saved, is there any way I could construct it myself? Or is the only solution to save one of the entities twice?

You could do this with named keys but then you have to be sure you can name the two entities with unique keys:
# It is possible to construct a key for an entity that does not yet exist.
keyname_a = 'abc'
keyname_b = 'def'
key_a = ndb.Key(A, keyname_a)
key_b = ndb.Key(A, keyname_a, B, keyname_b)
a = A(id=keyname_a)
a.key_of_b = key_b
b = B(id=keyname_b, parent=key_a)
a.put()
b.put()
However, I would suggest thinking about why you need the key_of_b property in the first place. If you only set A as the parent of B, then you will always be able to navigate from A to B and the other way around:
# If you have the A entity from somewhere and want to find B.
b = B.query(ancestor=entity_a.key).get()
# You have the B entity from somewhere and want to find A.
a = entity_b.key.parent().get()
This also gives you the opportunity to create one-to-many relationships between A and B.

How do I filter by a date range based on a dictionary value in python, otherwise filter a list by a dictionary?

I have dictionaries that look like this:
s_dates_dict = {17: datetime.datetime(2009,9,21,0,24), 19: datetime.datetime(2011,12,1,19,39,16), ....}
e_dates_dict = {17: datetime.datetime(2010,9,25,10,24), 19: datetime.datetime(2012,1,11,17,39,16), ....}
I want to use these dictionaries to find the next record of type (A or B) after that date for each record. So it seems that I would be best off creating a query similar to below:
next_record_list = Tasks.objects.filter(date__range(s_dates_dict,e_dates_dict), client__in=client_list, task_type__in=[A,B])
Whereby the dictionaries would cause a dynamically changing range to the pk referenced. I haven't found anything suggesting this is possible or efficient, so I am guessing the next best would be creating a list by eliminating the date range filter and then iterating a dictionary of the oldest values using a for statement and cutting off iteration of each record at the date referenced in dates_dict. But I haven't figured out a method to do this either. Could I get some suggestions of how to do this, or a totally different better method? Thanks.
EDIT:
client_list is a list of client objects.
Here is some of the models.py:
class Client(models.Model):
    user_name = models.CharField
class Task(models.Model):
    client = models.ForeignKey(
        'Client',
    )
    task_type = models.ForeignKey(
        'Task_Type',
    )
    date = models.DateTimeField(
        default = datetime.now,  # pass the callable so the default is evaluated per save
        blank = True,
        null = True,
    )
class Task_Type(models.Model):
    name = models.CharField

I think I understand now. I am assuming client_list is actually a list of client objects. Someone may come up with a more efficient solution, but I would try something like this:
next_record = {}
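for client in client_list:
    s = s_dates_dict[client.pk]
    e = e_dates_dict[client.pk]
    # date__range takes a (start, end) tuple; task_set is the default reverse accessor for a Task model
    next_record[client.pk] = client.task_set.filter(date__range=(s, e), task_type__in=[A, B])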
This gives you a dict (next_record) where each key corresponds to a list of tasks. I am assuming there is a ForeignKey relationship between your Tasks and Client models.
Hopefully this works, if not, can you please post the code for your models?
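The per-client range logic can also be tried without a database. The following is a plain-Python sketch under stated assumptions: the tuples stand in for Task rows, the sample dates are made up, and `next_records` is a hypothetical helper that keeps only the earliest matching task per client, which is one reading of "the next record".

```python
from datetime import datetime

# Hypothetical sample data: (client_pk, task_type, date) tuples stand in for Task rows.
tasks = [
    (17, "A", datetime(2009, 10, 1)),
    (17, "C", datetime(2009, 9, 25)),   # wrong type, ignored
    (17, "B", datetime(2009, 9, 30)),
    (19, "A", datetime(2012, 2, 1)),    # outside client 19's range
]
s_dates = {17: datetime(2009, 9, 21), 19: datetime(2011, 12, 1)}
e_dates = {17: datetime(2010, 9, 25), 19: datetime(2012, 1, 11)}


def next_records(tasks, s_dates, e_dates, wanted_types):
    """Map each client pk to its earliest matching task in that client's range."""
    result = {}
    for pk, start in s_dates.items():
        end = e_dates[pk]
        matches = [t for t in tasks
                   if t[0] == pk and t[1] in wanted_types and start <= t[2] <= end]
        result[pk] = min(matches, key=lambda t: t[2]) if matches else None
    return result


next_record = next_records(tasks, s_dates, e_dates, {"A", "B"})
```

The range check is inclusive on both ends here, matching how a `BETWEEN`-style range filter behaves.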

SQLAlchemy/Elixir - querying to check entity's membership in a many-to-many relationship list

I am trying to construct a SQLAlchemy query to get the names of all professors who are assistant professors at MIT. Note that there can be multiple assistant professors associated with a given course.
What I'm trying to do is roughly equivalent to:
uni_mit = University.get_by(name='MIT')
s = select([Professor.name],
           and_(Professor.in_(Course.assistants),
                Course.university == uni_mit))
session.execute(s)
This won't work, because in_ is only defined for an entity's fields, not for the whole entity. I can't use Professor.id.in_ because Course.assistants is a list of Professors, not a list of their ids. I also tried contains, but it didn't work either.
My Elixir model is:
class Course(Entity):
    id = Field(Integer, primary_key=True)
    assistants = ManyToMany('Professor', inverse='courses_assisted', ondelete='cascade')
    university = ManyToOne('University')
    ..
class Professor(Entity):
    id = Field(Integer, primary_key=True)
    name = Field(String(50), required=True)
    courses_assisted = ManyToMany('Course', inverse='assistants', ondelete='cascade')
    ..
This would be trivial if I could access the intermediate many-to-many entity (the condition would be and_(interm_table.prof_id == Professor.id, interm_table.course == Course.id)), but SQLAlchemy apparently hides this table from me.
I'm using Elixir 0.7 and SQLAlchemy 0.6.
Btw: This question is different from Sqlalchemy+elixir: How query with a ManyToMany relationship? in that I need to check the professors against all courses which satisfy a condition, not a single, static one.

You can find the intermediate table where Elixir has hidden it away, but note that it uses fully qualified column names (such as __package_path_with_underscores__course_id). To avoid this, define your ManyToMany using e.g.
class Course(Entity):
    ...
    assistants = ManyToMany('Professor', inverse='courses_assisted',
                            local_colname='course_id', remote_colname='prof_id',
                            ondelete='cascade')
and then you can access the intermediate table using
rel = Course._descriptor.find_relationship('assistants')
assert rel
table = rel.table
and can access the columns using table.c.prof_id, etc.
Update: Of course you can do this at a higher level, but not in a single query, because SQLAlchemy doesn't yet support in_ for relationships. For example, with two queries:
>>> mit_courses = set(Course.query.join(
... University).filter(University.name == 'MIT'))
>>> [p.name for p in Professor.query if set(
... p.courses_assisted).intersection(mit_courses)]
Or, alternatively:
>>> plist = [c.assistants for c in Course.query.join(
... University).filter(University.name == 'MIT')]
>>> [p.name for p in set(itertools.chain(*plist))]
The first step creates a list of lists of assistants. The second step flattens the list of lists and removes duplicates through making a set.
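The flatten-and-deduplicate step can be seen in isolation with plain lists. The names below are made-up sample data standing in for the per-course assistant lists; `itertools.chain.from_iterable` is used here as an equivalent of `itertools.chain(*plist)`.

```python
import itertools

# Hypothetical data: each inner list is the assistants of one MIT course.
plist = [["Ada", "Grace"], ["Grace", "Alan"], []]

# chain.from_iterable flattens one level; set() drops the duplicate "Grace".
unique_names = sorted(set(itertools.chain.from_iterable(plist)))
```

Sorting is added only to make the result order deterministic, since a set has no defined order.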
