Which query will be more efficient:
for id in user.posts:
    Post.objects.get(id=id)
or
posts = Post.objects(user=user_id)
with the following schema:
class Post(Document):
    user = ObjectIdField()

class User(Document):
    posts = ListField(ObjectIdField())
This assumes there is an index on the user field of the Post document, and an average of 20 posts per User. I am also curious about other usage patterns.
The following block fires one database query for every post in user.posts, so it will be slow regardless of indexing:
for id in user.posts:
    Post.objects.get(id=id)
but if you use it like this:
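Post.objects(id__in=user.posts)
Then the performance will be similar to using Post.objects(user=user_id), because the primary key is indexed by default. (Note that get() is only for fetching a single document; it raises an error when the filter matches more than one.)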
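To make the difference in round trips concrete, here is a minimal sketch. It uses a plain-Python stand-in for the collection: the FakeCollection class and its query_count counter are hypothetical and not part of MongoEngine. It only counts queries, it does not measure real latency.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for the Post collection; counts queries."""

    def __init__(self, docs):
        self._by_id = {doc["id"]: doc for doc in docs}
        self.query_count = 0

    def get(self, id):
        # One round trip per call, like Post.objects.get(id=...)
        self.query_count += 1
        return self._by_id[id]

    def find_in(self, ids):
        # One round trip for the whole list, like Post.objects(id__in=...)
        self.query_count += 1
        return [self._by_id[i] for i in ids if i in self._by_id]


post_ids = list(range(20))  # the question's average of 20 posts per user
posts = FakeCollection([{"id": i, "user": "u1"} for i in post_ids])

one_by_one = [posts.get(i) for i in post_ids]
queries_for_loop = posts.query_count

posts.query_count = 0
batched = posts.find_in(post_ids)
queries_for_batch = posts.query_count

print(queries_for_loop, queries_for_batch)
```

Both approaches return the same documents; the loop just pays for 20 round trips where the batched lookup pays for one.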
I believe you should also use a ReferenceField instead of a plain ObjectId. Reference fields allow lazy loading of the referenced documents.
class Post(Document):
    user = ReferenceField("User")

class User(Document):
    name = StringField()

    @property
    def posts(self):
        # All posts that reference this user
        return Post.objects(user=self)

john = User(name='John').save()
post = Post(user=john).save()
print(john.posts) # [<Post: Post object>]
Related
I've got the following models...
class User(ndb.Model):
    email = ndb.StringProperty()
    username = ndb.StringProperty(indexed=True)
    password = ndb.StringProperty()

class Rel(ndb.Model):
    user = ndb.KeyProperty(kind=User, indexed=True)
    follows = ndb.KeyProperty(kind=User, indexed=True)
    blocks = ndb.KeyProperty(kind=User)
I'm trying to make it so a user can follow or block any other number of users.
Using the above setup I'm finding it hard to perform tasks that would have been easy with a traditional DBMS.
As a simple example, how would I find all of a given user's followers AND order by username-- keeping in mind when I perform a query on Rel, I'm getting back keys and not user objects?
Am I going about this the wrong way?
You have to do a fetch, but you can design this in a better way: the follows and blocks fields can be lists instead of single keys.
follows = ndb.KeyProperty(kind=User, repeated=True)
blocks = ndb.KeyProperty(kind=User, repeated=True)
After this, when you need the follows of a user, you can read the keys and call ndb.get_multi(keys_list) to get all the followed or blocked entities you need.
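A minimal sketch of the batch-get idea, using a plain dict as a stand-in for the datastore (the get_multi function below is a hypothetical imitation, not the ndb call itself). It assumes, as a batch get typically does, that results come back in key order with None for missing entities, and that the username ordering then has to happen in memory:

```python
# Hypothetical key -> entity store standing in for the User kind.
USERS = {
    "k1": {"username": "zoe"},
    "k2": {"username": "adam"},
    "k3": {"username": "mia"},
}


def get_multi(keys):
    """Stand-in for a batch get: one call, results in key order, None if missing."""
    return [USERS.get(key) for key in keys]


follows = ["k1", "k2", "k3", "k_deleted"]  # keys stored on the repeated property

# Drop keys whose entity no longer exists, then sort by username in memory.
entities = [entity for entity in get_multi(follows) if entity is not None]
sorted_usernames = sorted(entity["username"] for entity in entities)

print(sorted_usernames)
```

Sorting in memory is fine for short lists, but it does not paginate, which is the limitation the alternative design addresses.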
OR
A better way of doing this:
If you care about the order and want to paginate, you will have to store each follow/block relationship as a separate entity.
For example, for a user 'a', there will be one FollowEntity record for each person 'a' follows:
class FollowEntity(ndb.Model):
    user = ndb.KeyProperty(kind=User)
    follow = ndb.KeyProperty(kind=User)
    follow_username = ndb.StringProperty()
A query can then be written as follows, assuming user is an entity of your User kind:
query = FollowEntity.query(FollowEntity.user == user.key).order(FollowEntity.follow_username)
You can run this query and get the results sorted by username; it works well with fetch_page to display the results in batches.
Do the same for BlockEntity.
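The paging behaviour can be sketched in plain Python. This is only an illustration under assumptions: the rows are dicts rather than ndb entities, and an integer offset stands in for the opaque cursor that a real paged query would hand back.

```python
from operator import itemgetter

# Hypothetical FollowEntity rows; follow_username is the denormalized copy
# of the followed user's username.
FOLLOW_ROWS = [
    {"user": "a", "follow": "k1", "follow_username": "zoe"},
    {"user": "a", "follow": "k2", "follow_username": "adam"},
    {"user": "a", "follow": "k3", "follow_username": "mia"},
    {"user": "b", "follow": "k1", "follow_username": "zoe"},
]


def fetch_page(rows, user, page_size, offset=0):
    """Return (page, next_offset, more) for one user's follows, ordered by username."""
    matching = sorted(
        (row for row in rows if row["user"] == user),
        key=itemgetter("follow_username"),
    )
    page = matching[offset:offset + page_size]
    next_offset = offset + len(page)
    return page, next_offset, next_offset < len(matching)


first, cursor, more = fetch_page(FOLLOW_ROWS, "a", page_size=2)
second, cursor2, more2 = fetch_page(FOLLOW_ROWS, "a", page_size=2, offset=cursor)

first_names = [row["follow_username"] for row in first]
second_names = [row["follow_username"] for row in second]
```

Because the username lives on the follow record itself, the ordering and the paging both happen on one kind, without fetching the User entities first.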
I have the following models:
class Roles(ndb.Model):
    email = ndb.StringProperty(required=True)
    type = ndb.StringProperty(choices=['writer', 'editor', 'admin'])

class Book(ndb.Model):
    uid = ndb.StringProperty(required=True)
    user = ndb.UserProperty(auto_current_user_add=True)
    name = ndb.StringProperty(required=True)
    shared_with = ndb.StructuredProperty(Roles, repeated=True, indexed=True)

class Page(ndb.Model):
    uid = ndb.StringProperty(required=True)
    user = ndb.UserProperty(auto_current_user_add=True)
    title = ndb.StringProperty(required=True)
    parent_uid = ndb.ComputedProperty(lambda self: self.key.parent().get().uid)
    shared_with = ndb.ComputedProperty(lambda self: self.key.parent().get().shared_with)
The structure I am using is:
Book1     Book2   - (parent)
  |         |
  ^         ^
pages     pages   - (child)
When a Book is created, the shared_with is filled with a list of emails/roles.
For example:
Book.uid = user.user_id()
Book.user = user
Book.name = "learning appengine NDB"
Book.shared_with = [Roles(email="user_1@domain.tld", type="admin"), Roles(email="user_2@domain.tld", type="editor")]
When a user creates a Page, the user.user_id() is stored as uid.
Example when user_2@domain.tld (role type: editor) creates a page:
Page.title = "understanding ComputedProperty"
Page.uid = user.user_id()
Page.user = user
With this schema, if I want to show user_2@domain.tld only the pages he has created, I can do a simple query filtering by uid, with something like:
# supposing user_2@domain.tld is logged in
user2_pages = Page.query(Page.uid == user.user_id())
But for the other users listed in the shared_with property of the Book, how can I continue to show them their own pages (the ones they created), and all the rest only if they have a role of admin or editor?
For example, if I want to allow other users (admins, editors) to see a list of the latest pages created across all the books, how could I write that query?
What I have tried so far, without success, is a ComputedProperty; I can't make it work as expected.
To verify that I get the correct values, I do a query like:
page = Page.query().get()
print page.parent_uid
I do get the parent uid, and the same goes for the shared_with values, but for an unknown reason I can't filter on them when using something like:
query = Page.query(
    Page.parent_uid == user.user_id()
)
# query returns None
A probably better and simpler approach is to show pages per book but I would like to know if it is possible to do it for all the books, so that admins and editors can just see a list of last pages created in general, instead of going into each book.
Any ideas?
Your computed property cannot work because it is only updated when the Page entity is put. See https://stackoverflow.com/a/12630991/1756187. Any changes to Book entities have no effect on the computed properties of Page entities that are already stored.
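The staleness can be illustrated with a toy model in plain Python. The Book and Page classes below are hypothetical stand-ins, not ndb models; the only behaviour they imitate is that the derived value is evaluated and stored at put() time.

```python
class Book:
    def __init__(self, shared_with):
        self.shared_with = list(shared_with)


class Page:
    def __init__(self, book):
        self.book = book
        self.stored_shared_with = None  # what the stored entity (and its index) holds

    def put(self):
        # The derived value is computed and saved only when the page is written.
        self.stored_shared_with = list(self.book.shared_with)


book = Book(["user_1"])
page = Page(book)
page.put()

book.shared_with.append("user_2")  # the book changes, the page is not re-put
stale = page.stored_shared_with    # still the old list

page.put()                         # only a new write refreshes the stored copy
fresh = page.stored_shared_with
```

Queries filter on the stored copy, so a filter on the page's shared_with keeps seeing the old value until every affected page is written again.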
You can try to use Model hooks to maintain Page.shared_with. See https://developers.google.com/appengine/docs/python/ndb/entities#hooks.
I wonder, though, whether this is the best approach. If you keep the sharing info at the Book level, you can use its index to retrieve the list of book keys with a keys-only query, then retrieve all pages for these parent keys. That way you don't have to add a shared_with attribute to the Page model at all. The cost of the query will be slightly higher, but the Page entities will be smaller and cheaper to maintain.
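A sketch of that two-step lookup, with plain dicts and lists standing in for the Book and Page kinds. All names and sample data here are hypothetical; in ndb the first step would be the keys-only query on Book and the second a query for pages under those parents.

```python
# Hypothetical data: book key -> {email: role}
BOOKS = {
    "book1": {"admin@example.com": "admin", "editor@example.com": "editor"},
    "book2": {"admin@example.com": "admin", "writer@example.com": "writer"},
}
PAGES = [
    {"title": "p1", "parent": "book1", "created": 1},
    {"title": "p2", "parent": "book2", "created": 2},
    {"title": "p3", "parent": "book1", "created": 3},
]


def shared_book_keys(email, roles=("admin", "editor")):
    """Step 1: keys of the books shared with this email in a privileged role."""
    return [key for key, shared in BOOKS.items() if shared.get(email) in roles]


def latest_pages(email, limit=10):
    """Step 2: pages whose parent is one of those books, newest first."""
    keys = set(shared_book_keys(email))
    visible = [page for page in PAGES if page["parent"] in keys]
    visible.sort(key=lambda page: page["created"], reverse=True)
    return [page["title"] for page in visible[:limit]]


admin_view = latest_pages("admin@example.com")
editor_view = latest_pages("editor@example.com")
writer_view = latest_pages("writer@example.com")
```

The sharing rules are read from the books every time, so changing a book's shared_with takes effect immediately without touching any page.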
Suppose I got two models like this:
class Article(models.Model):
    article_title = models.CharField(max_length=100)

class EventRecord(models.Model):
    article = models.ForeignKey(Article)
In a view, I select a certain EventRecord and want to show the Title of the Article it is related to as well. The following does not work:
def classify(request, pk):
    event = get_object_or_404(EventRecord, pk=pk)
    article_id = event.article
    article = get_object_or_404(Article, pk=article_id)
How do I make this work?
Any help is really appreciated!
Django automatically handles this for you. For example:
>>> record = EventRecord.objects.get(...)
>>> isinstance(record.article, Article)
True
>>> record.article.article_title
u'title here'
This is one of the magical things Django does (nothing is magic, but anyway...). Please keep in mind that for this to work Django will usually execute some extra database queries. To eliminate them, you can use the select_related method. Below is a snippet which eliminates the extra queries and does what you want:
def classify(request, pk):
    record = EventRecord.objects.filter(pk=pk).select_related()
    # the above returns a queryset, hence you have to extract the record manually
    if not len(record):
        raise Http404()
    else:
        record = record[0]
    # now use record as usual and no extra queries will be executed
    title = record.article.article_title
    ...
event.article returns the actual Article object, not the primary key, so you don't need to do another database query.
def classify(request, pk):
    event = get_object_or_404(EventRecord, pk=pk)
    if not event.article:
        raise Http404
    print event.article.article_title
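For intuition, here is a toy imitation of that behaviour in plain Python. It is not Django code: the ARTICLES dict and the lookups list are hypothetical stand-ins for the table and the query log. It mimics a foreign-key attribute that resolves to the related object on first access and caches it, while the raw key stays available as article_id.

```python
ARTICLES = {1: {"article_title": "title here"}}  # hypothetical Article table
lookups = []  # records every simulated database hit


class EventRecord:
    def __init__(self, article_id):
        self.article_id = article_id  # the raw primary key
        self._article_cache = None

    @property
    def article(self):
        # The first access performs the lookup; later accesses reuse the result.
        if self._article_cache is None:
            lookups.append(self.article_id)
            self._article_cache = ARTICLES[self.article_id]
        return self._article_cache


event = EventRecord(article_id=1)
title = event.article["article_title"]
title_again = event.article["article_title"]
```

The second access does not hit the store again, which is why passing event.article to another lookup, as in the question, is both unnecessary and wrong.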
Basically, what I'm trying to make is a data structure that holds the user's name, id, and date joined. Then I want a "sub-structure" that holds the user's "text" and the date it was modified. Each user will have multiple instances of this text.
class User(db.Model):
    ID = db.IntegerProperty()
    name = db.StringProperty()
    datejoined = db.DateTimeProperty(auto_now_add=True)

class Content(db.Model):
    text = db.StringProperty()
    datemod = db.DateTimeProperty(auto_now_add=True)
Is the code set up correctly?
One problem you will have is that making User.ID unique will be non-trivial. The problem is that two writes to the database could occur on different shards, both check at about the same time for existing entries that match the uniqueness constraint and find none, then both create identical entries (with regard to the unique property) and then you have an invalid database state. To solve this, appengine provides a means of ensuring that certain datastore entities are always placed on the same physical machine.
To do this, you make use of the entity keys to tell Google how to organize the entities. Let's assume you want the username to be unique. Change User to look like this:
class User(db.Model):
    datejoined = db.DateTimeProperty(auto_now_add=True)
Yes, that's really it. There's no username since that's going to be used in the key, so it doesn't need to appear separately. If you like, you can do this...
class User(db.Model):
    datejoined = db.DateTimeProperty(auto_now_add=True)

    @property
    def name(self):
        return self.key().name()
To create an instance of a User, you now need to do something a little different: you need to specify a key_name in the constructor.
someuser = User(key_name='john_doe')
...
someuser.save()
Well, really you want to make sure that users don't overwrite each other, so you need to wrap the user creation in a transaction. First define a function that does the necessary check:
def create_user(username):
    checkeduser = User.get_by_key_name(username)
    if checkeduser is not None:
        raise db.Rollback('User already exists!')
    newuser = User(key_name=username)
    # more code
    newuser.put()
Then, invoke it in this way
db.run_in_transaction(create_user, 'john_doe')
To find a user, you just do this:
someuser = User.get_by_key_name('john_doe')
Next, you need some way to associate the content with its user, and vice versa. One solution is to put the content into the same entity group as the user by declaring the user as the parent of the content. To do this, you don't need to change the Content model at all, but you create instances a little differently (much like you did with User):
somecontent = Content(parent=User.get_by_key_name('john_doe'))
So, given a content item, you can look up the user by examining its key:
someuser = User.get(somecontent.key().parent())
Going in reverse, looking up all of the content for a particular user is only a little trickier.
allcontent = Content.gql('where ancestor is :user', user=someuser).fetch(10)
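Both lookups work because a child's key contains its parent's key as a prefix. A rough sketch of that idea, modelling keys as tuples of (kind, name-or-id) pairs; the helper functions are hypothetical and not part of the db API:

```python
def parent_of(key):
    """Drop the last (kind, id) pair of a key path; root keys have no parent."""
    return key[:-1] if len(key) > 1 else None


def has_ancestor(key, ancestor):
    """True if the ancestor's path is a proper prefix of the key's path."""
    return len(key) > len(ancestor) and key[:len(ancestor)] == ancestor


john = (("User", "john_doe"),)
jane = (("User", "jane_doe"),)

content_keys = [
    john + (("Content", 1),),
    john + (("Content", 2),),
    jane + (("Content", 3),),
]

owner = parent_of(content_keys[0])                                  # content -> user
johns_content = [k for k in content_keys if has_ancestor(k, john)]  # user -> content
```

Because the relationship is encoded in the key itself, the parent cannot be changed after the content is created.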
Yes, and if you need more documentation, you can check here for database types and here for more info about your model classes.
An alternative solution you may see is using ReferenceProperty.
class User(db.Model):
    name = db.StringProperty()
    datejoined = db.DateTimeProperty(auto_now_add=True)

class Content(db.Model):
    user = db.ReferenceProperty(User, collection_name='matched_content')
    text = db.StringProperty()
    datemod = db.DateTimeProperty(auto_now_add=True)

content = db.get(content_key)
user_name = content.user.name
# looking up all of the content for a particular user
user_content = content.user.matched_content
# create new content for a user
new_content = Content(user=content.user)
This is a follow up on my previous question.
I set up the models with ReferenceProperty:
class User(db.Model):
    userEmail = db.StringProperty()

class Comment(db.Model):
    user = db.ReferenceProperty(User, collection_name="comments")
    comment = db.StringProperty(multiline=True)

class Venue(db.Model):
    user = db.ReferenceProperty(User, collection_name="venues")
    venue = db.StringProperty()
In datastore I have this entry under User:
Entity Kind: User
Entity Key: ag1oZWxsby0xLXdvcmxkcgoLEgRVc2VyGBUM
ID: 21
userEmail: az@example.com
And under Venue:
Entity Kind: Venue
Entity Key: ag1oZWxsby0xLXdvcmxkcgsLEgVWZW51ZRhVDA
ID: 85
venue (string): 5 star venue
user (Key): ag1oZWxsby0xLXdvcmxkcgoLEgRVc2VyGBUM
User: id=21
I am trying to display the item in hw.py like this
query = User.all()
results = query.fetch(10)
self.response.out.write("<html><body><ol>")
for result in results:
    self.response.out.write("<li>")
    self.response.out.write(result.userEmail)
    self.response.out.write(result.venues)
    self.response.out.write("</li>")
self.response.out.write("</ol></body></html>")
This line works:
self.response.out.write(result.userEmail)
But this line does not work:
self.response.out.write(result.venues)
But as per vonPetrushev's answer result.venues should grab the venue associated with this userEmail.
Sorry if this is confusing: I am just trying to access the tables linked to the userEmail with ReferenceProperty. The linked tables are Venue and Comment. How do I access an item in Venue or in Comment? Thanks.
venues is actually a query object. So you'll need to fetch or iterate over it.
Try replacing the self.response.out.write(result.venues) line with a loop:
query = User.all()
users = query.fetch(10)
self.response.out.write("<html><body><ol>")
for user in users:
    self.response.out.write("<li>")
    self.response.out.write(user.userEmail)
    self.response.out.write("<ul>")
    # Iterate over the venues
    for venue in user.venues:
        self.response.out.write("<li>%s</li>" % venue.venue)
    self.response.out.write("</ul></li>")
self.response.out.write("</ol></body></html>")
However, this will not scale very well. If there are 10 users, it will run 11 queries: one for the users, plus one venues query per user. Make sure you are using Appstats to look for performance issues like these, and try to minimize the number of RPC calls you make.
A better solution might be to denormalize and store the user's email on the Venue kind. That way you will only need one query to print the venue information and the user's email.
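The query counts can be checked with a small simulation. This is plain Python with lists standing in for the User and Venue kinds; the run_query helper, the stats counter, and the user_email field are assumptions made for the illustration.

```python
stats = {"queries": 0}

USERS = [{"id": i, "userEmail": "user%d@example.com" % i} for i in range(10)]
# Denormalized venues: each row carries a copy of its owner's email.
VENUES = [
    {"user_id": u["id"], "user_email": u["userEmail"], "venue": "venue %d" % u["id"]}
    for u in USERS
]


def run_query(rows, predicate=lambda row: True):
    """Stand-in for one datastore query (one RPC)."""
    stats["queries"] += 1
    return [row for row in rows if predicate(row)]


# Normalized access: one query for the users, then one venues query per user.
normalized = []
for user in run_query(USERS):
    for venue in run_query(VENUES, lambda v, uid=user["id"]: v["user_id"] == uid):
        normalized.append((user["userEmail"], venue["venue"]))
normalized_queries = stats["queries"]

# Denormalized access: a single query over venues yields both fields.
stats["queries"] = 0
denormalized = [(v["user_email"], v["venue"]) for v in run_query(VENUES)]
denormalized_queries = stats["queries"]

print(normalized_queries, denormalized_queries)
```

The trade-off is that the copied email has to be updated on every venue if a user ever changes it.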