Google App Engine - adding an index - python

We have a Google App Engine model class (in Python) in which the is_admin field was not indexed. We now want it indexed so that we can filter users by role: administrator or not (is_admin true or false). The original class looked like this:
class DomainUser(db.Expando, ExpandoEntity, SocialIconsEntity):
    """
    User domain DB Model
    """
    domain = db.StringProperty(required=True)
    ...
    is_admin = db.BooleanProperty(default=False, indexed=False)
    ...
And I changed it to this:
class DomainUser(db.Expando, ExpandoEntity, SocialIconsEntity):
    """
    User domain DB Model
    """
    domain = db.StringProperty(required=True)
    ...
    is_admin = db.BooleanProperty(default=False)
    ...
But I read in the documentation that we have to save each object again for the index to be created. Is it possible to create the index without saving all the objects again? We filter all the users by a specific domain and then filter (or sort) by the is_admin field. Can we add an index to index.yaml which will work? Currently if we filter users by is_admin then we receive an empty result.

The two options you listed are pretty much all you get:
Using indexes: re-save every entity after removing indexed=False (potentially very expensive if you have many existing DomainUser entities, but a one-time cost).
Not using indexes at all: query by the domain property only and filter the results in Python (potentially expensive on every query if each domain has many users, depending on how often you run it), i.e.:
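# DomainUser is a db model, so use the db query API here
# (the NDB equivalent would be DomainUser.query(DomainUser.domain == 'xxx').fetch())
all_users = list(DomainUser.all().filter('domain =', 'xxx').run())
admins = [u for u in all_users if u.is_admin]
users = [u for u in all_users if not u.is_admin]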
You will need to decide for yourself which option would be cheaper / faster if that's what you're after.
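If you go with the first option, the re-save can be done in batches so that no single request has to load every entity. The loop might look roughly like the sketch below. Note that `fetch_batch` and `put_batch` are hypothetical callables introduced for illustration, not part of any App Engine API; in a real handler they would presumably wrap a cursor-based query on DomainUser and a batch put, and a long run would likely have to be split across several requests or tasks.

```python
def resave_all(fetch_batch, put_batch, batch_size=100):
    """Re-save every entity in batches and return how many were written.

    fetch_batch(cursor, limit) -> (entities, next_cursor) and
    put_batch(entities) are hypothetical callables supplied by the caller.
    """
    total = 0
    cursor = None
    while True:
        entities, cursor = fetch_batch(cursor, batch_size)
        if not entities:
            break
        put_batch(entities)  # writing an entity again rebuilds its index rows
        total += len(entities)
    return total


# In-memory stand-ins so the sketch can run outside App Engine.
stored = [{"domain": "xxx", "is_admin": i % 3 == 0} for i in range(250)]
written = []


def fake_fetch(cursor, limit):
    start = cursor or 0
    batch = stored[start:start + limit]
    return batch, start + len(batch)


def fake_put(entities):
    written.extend(entities)


resaved = resave_all(fake_fetch, fake_put, batch_size=100)
```

The batch size is an arbitrary choice here; smaller batches keep each write short at the cost of more round trips.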

Django Form Does Not Find Right ID int Increment on Production vs LH

For some reason on Localhost (LH) everything works fine but on my Production server, my form does not add a new user submission properly. The error I am getting is:
duplicate key value violates unique constraint "..."
DETAIL: Key (id)=(8) already exists.
Is there some sort of production command, like "sudo systemctl restart gunicorn", that I need to run (I have already tried that one)? Maybe it only works on LH because I have tested more there, so the auto-increment happened to keep pace with the total number of users? I am really out of ideas here.
models.py
class Lead(models.Model):
    username = models.CharField(max_length=15, blank=True, null=True)
    email = models.CharField(unique=True, max_length=150, validators=[validate_email])
    created = models.DateTimeField(auto_now_add=True)
    ...
forms.py
class LeadCaptureForm1(forms.ModelForm):
    birth_date = forms.DateField(widget=SelectDateWidget(years=range(1999, 1910, -1)))
    class Meta:
        model = Lead
        widgets = {
            'email': forms.TextInput(attrs={'class': 'form-control'}),
        }
        fields = ('email', 'birth_date',)
views.py
def iframe1(request):
    ip = get_real_ip(request)
    created = timezone.now()
    if request.method == 'POST':
        form = LeadCaptureForm1(request.POST)
        if form.is_valid():
            # Save lead
            lead = form.save()
            # attempt at fixing it
            #lead.id = Lead.objects.get(all).count()
            #print(lead.id)
            lead.created = created
            lead.birth_date = form.cleaned_data.get('birth_date')
            lead.ipaddress = get_real_ip(request)
            lead.joinmethod = "Iframe1"
            lead.save()
            print(lead)
I'm not sure why you are setting the ID manually, and especially why you are setting it to the count of items. You should always let the database manage the primary key itself - it is an autoincrement field, and is completely opaque to your data.
The reason why you are getting this conflict is that items can be deleted, so that there can be 8 entries in the database but ID 8 already exists. But as I say, don't do this at all.
Also, don't set created manually; that is done automatically because the model field has auto_now_add=True. And birth_date is already set by the form's save. Finally, you should call save with commit=False if you want to set some other fields manually before writing to the database.
So just do:
lead = form.save(commit=False)
lead.ipaddress = get_real_ip(request)
lead.joinmethod = "Iframe1"
lead.save()
This is due to the fact that we uploaded leads manually and Django uses PostgreSQL’s SERIAL data type to store auto-incrementing primary keys.
"A SERIAL column is populated with values from a sequence that keeps track of the next available value. Manually assigning a value to an auto-incrementing field doesn’t update the field’s sequence, which might later cause a conflict."
https://docs.djangoproject.com/en/2.0/ref/databases/#manually-specifying-values-of-auto-incrementing-primary-keys
To solve this we can either force a new serial number or build an exception handler to fix the serial. The latter would be ideal, since we may upload users manually again in the future. For now, however, we'll try to force the serial number.
Run: python manage.py sqlsequencereset [app_name]
On its own this did NOT work: the command only prints the SQL statements that reset the sequences, and they still have to be executed (for example by piping the output to python manage.py dbshell). I was about to figure out how to build some sort of Python exception handling, but first found this post (IntegrityError duplicate key value violates unique constraint - django/postgres), which helped me update the sequence directly with setval:
SELECT setval('tablename_id_seq', (SELECT MAX(id) FROM tablename)+1)
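To see why manually uploaded rows cause this error, here is a toy model of a table whose primary key comes from a sequence. `Sequence` and `Table` are hypothetical classes written for this illustration; they are not Django or PostgreSQL APIs and only mimic the behaviour described in the quoted documentation.

```python
class Sequence:
    """Toy stand-in for a PostgreSQL sequence."""

    def __init__(self):
        self.last_value = 0

    def nextval(self):
        self.last_value += 1
        return self.last_value

    def setval(self, value):
        self.last_value = value


class Table:
    """Toy table whose primary key defaults to the next sequence value."""

    def __init__(self):
        self.rows = {}
        self.id_seq = Sequence()

    def insert(self, data, pk=None):
        if pk is None:
            pk = self.id_seq.nextval()
        if pk in self.rows:
            raise ValueError("duplicate key: id=%d already exists" % pk)
        self.rows[pk] = data
        return pk


leads = Table()
for n in range(7):
    leads.insert("lead %d" % n)          # ids 1..7 come from the sequence
for pk in (8, 9, 10):
    leads.insert("uploaded", pk=pk)      # manual ids: the sequence stays at 7

try:
    leads.insert("new signup")           # the sequence hands out 8 again
    conflict = None
except ValueError as exc:
    conflict = str(exc)

leads.id_seq.setval(max(leads.rows))     # resync the sequence with MAX(id)
new_id = leads.insert("new signup")      # now gets MAX(id) + 1
```

In this sketch the sequence is set to MAX(id), so the next value is MAX(id) + 1. The query quoted above sets it to MAX(id) + 1 instead, which works as well and simply skips one id.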

Many to many relationship with NDB on Google App Engine

I've got the following models...
class User(ndb.Model):
    email = ndb.StringProperty()
    username = ndb.StringProperty(indexed=True)
    password = ndb.StringProperty()
class Rel(ndb.Model):
    user = ndb.KeyProperty(kind=User, indexed=True)
    follows = ndb.KeyProperty(kind=User, indexed=True)
    blocks = ndb.KeyProperty(kind=User)
I'm trying to make it so a user can follow or block any other number of users.
Using the above setup, I'm finding it hard to perform tasks that would have been easy with a traditional DBMS.
As a simple example, how would I find all of a given user's followers and order them by username, keeping in mind that a query on Rel gives me back keys and not user objects?
Am I going about this the wrong way?
You have to do a fetch, but you can design the models in a better way. The follows and blocks fields can be lists instead of single keys:
follows = ndb.KeyProperty(kind=User, repeated=True)
blocks = ndb.KeyProperty(kind=User, repeated=True)
Then, when you need the users that this user follows, take the keys and call ndb.get_multi(keys_list) to fetch whichever follow/block entities you need.
OR
A better way of doing this:
If you care about the order and want to paginate, you will have to store each follow/block relationship as a separate entity. For example, for a user 'a', the follow kind holds one record for each person 'a' follows:
class FollowEntity(ndb.Model):
    user = ndb.KeyProperty(kind=User)
    follow = ndb.KeyProperty(kind=User)
    follow_username = ndb.StringProperty()
A query, assuming user is an entity of your User kind, can be:
query = FollowEntity.query(FollowEntity.user == user.key).order(FollowEntity.follow_username)
Running this query returns the results sorted by username. It works well with fetch_page if you display the results in batches.
Do the same for BlockEntity too.
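The reason this layout helps is that the username is copied onto each relationship record, so sorting and paging never need to load the User entities. A plain-Python illustration of the idea follows. It is not NDB code: `Follow` and `following_page` are hypothetical names, and the in-memory list only stands in for the datastore.

```python
from collections import namedtuple

# One record per "user follows follow" relationship. follow_username is a
# copy of the followed user's username, stored so that results can be
# sorted without loading each User.
Follow = namedtuple("Follow", ["user", "follow", "follow_username"])


def following_page(records, user, page_size, offset=0):
    """Return one page of the users that `user` follows, sorted by username."""
    mine = [r for r in records if r.user == user]
    mine.sort(key=lambda r: r.follow_username)
    return mine[offset:offset + page_size]


records = [
    Follow("a", "u3", "zoe"),
    Follow("a", "u1", "adam"),
    Follow("b", "u1", "adam"),
    Follow("a", "u2", "mia"),
]
first_page = [r.follow_username for r in following_page(records, "a", 2)]
second_page = [r.follow_username for r in following_page(records, "a", 2, offset=2)]
```

The trade-off is that the copied username has to be rewritten on every relationship record whenever a user changes their username.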

NDB: how to get child entities that depend on values stored on a parent structured property

I have the following models:
class Roles(ndb.Model):
    email = ndb.StringProperty(required=True)
    type = ndb.StringProperty(choices=['writer', 'editor', 'admin'])
class Book(ndb.Model):
    uid = ndb.StringProperty(required=True)
    user = ndb.UserProperty(auto_current_user_add=True)
    name = ndb.StringProperty(required=True)
    shared_with = ndb.StructuredProperty(Roles, repeated=True, indexed=True)
class Page(ndb.Model):
    uid = ndb.StringProperty(required=True)
    user = ndb.UserProperty(auto_current_user_add=True)
    title = ndb.StringProperty(required=True)
    parent_uid = ndb.ComputedProperty(lambda self: self.key.parent().get().uid)
    shared_with = ndb.ComputedProperty(lambda self: self.key.parent().get().shared_with)
The structure I am using is:
Book1      Book2    - (parent)
  |          |
  ^          ^
pages      pages    - (child)
When a Book is created, the shared_with is filled with a list of emails/roles.
For example:
Book.uid = user.user_id()
Book.user = user
Book.name = "learning appengine NDB"
Book.shared_with = [Roles(email="user_1@domain.tld", type="admin"), Roles(email="user_2@domain.tld", type="editor")]
When a user creates a Page, the user.user_id() is stored as uid.
Example when user_2@domain.tld (role type: editor) creates a page:
Page.title = "understanding ComputedProperty"
Page.uid = user.user_id()
Page.user = user
With this schema, if I want to show user_2@domain.tld only the pages he has created, I can do a simple query filtering by uid, with something like:
# supposing user_2@domain.tld is logged in
user2_pages = Page.query(Page.uid == user.user_id())
But for other users listed in the shared_with property of the Book, how could I keep showing them their own pages (the ones they created), and show all the rest only if they have an admin or editor role?
For example, if I want to allow other users (admins, editors) to see a list of the latest pages created across all the books, how could I perform a query to do so?
What I have been trying so far is to use a ComputedProperty, but I can't make it work as expected.
To verify that I get the correct values, I run a query like:
query = Page.query().get()
print query.parent_uid
I do get the parent uid, and likewise the shared_with values, but for an unknown reason I can't filter on them when using something like:
query = Page.query(
    Page.parent_uid == user.user_id()
)
# query returns None
A probably better and simpler approach is to show pages per book but I would like to know if it is possible to do it for all the books, so that admins and editors can just see a list of last pages created in general, instead of going into each book.
Any ideas?
Your computed property cannot work because it is only updated when the Page entity is put. See https://stackoverflow.com/a/12630991/1756187. Changes to Book entities have no effect on the computed properties of Page.
You can try to use model hooks to maintain Page.shared_with. See https://developers.google.com/appengine/docs/python/ndb/entities#hooks.
I'm wondering, though, whether this is the best approach. If you keep the sharing info at the Book level, you can use its index to retrieve the list of book keys with a keys-only query. Then you can retrieve all the pages for these parent keys. That way you don't have to add a shared_with attribute to the Page model at all. The query will cost slightly more, but the Page entities will be smaller and cheaper to maintain.
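The two-step lookup described above can be sketched in plain Python as follows. This only illustrates the logic, not NDB code: `visible_pages` and the dictionary layouts are assumptions made for the example, and the in-memory lists stand in for the keys-only Book query and the per-parent Page queries.

```python
def visible_pages(books, pages, email, user_id, roles=("admin", "editor")):
    """Pages the user created, plus all pages of books shared with them.

    books: dicts with "key" and "shared_with" (list of (email, role) pairs).
    pages: dicts with "parent", "uid", "title" and "created".
    """
    # Step 1: keys of the books shared with this user in a qualifying role
    # (this plays the part of the keys-only query on Book).
    shared_keys = {
        b["key"] for b in books
        if any(e == email and r in roles for e, r in b["shared_with"])
    }
    # Step 2: pages under those books, plus the user's own pages.
    result = [p for p in pages if p["parent"] in shared_keys or p["uid"] == user_id]
    # Newest first, as for a "last pages created" list.
    return sorted(result, key=lambda p: p["created"], reverse=True)


books = [
    {"key": "book1", "shared_with": [("user_2@domain.tld", "editor")]},
    {"key": "book2", "shared_with": [("user_2@domain.tld", "writer")]},
]
pages = [
    {"parent": "book1", "uid": "uid_1", "title": "p1", "created": 1},
    {"parent": "book2", "uid": "uid_1", "title": "p2", "created": 2},
    {"parent": "book2", "uid": "uid_2", "title": "p3", "created": 3},
]
titles = [p["title"] for p in visible_pages(books, pages, "user_2@domain.tld", "uid_2")]
```

Here user_2 sees p1 because book1 is shared with an editor role, and p3 because they created it; p2 stays hidden because the role on book2 is only writer.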

Google App Engine Python Datastore

Basically, what I'm trying to make is a data structure that holds the user's name, id, and datejoined. Then I want a "sub-structure" that holds the user's "text" and the date it was modified. Each user will have multiple instances of this text.
class User(db.Model):
    ID = db.IntegerProperty()
    name = db.StringProperty()
    datejoined = db.DateTimeProperty(auto_now_add=True)
class Content(db.Model):
    text = db.StringProperty()
    datemod = db.DateTimeProperty(auto_now_add=True)
Is the code set up correctly?
One problem you will have is that making User.ID unique is non-trivial. Two writes to the database could occur on different shards; both check at about the same time for existing entries that match the uniqueness constraint and find none, then both create identical entries (with regard to the unique property), and you end up with an invalid database state. To solve this, App Engine provides a means of ensuring that certain datastore entities are always placed on the same physical machine.
To do this, you make use of the entity keys to tell Google how to organize the entities. Let's assume you want the username to be unique. Change User to look like this:
class User(db.Model):
    datejoined = db.DateTimeProperty(auto_now_add=True)
Yes, that's really it. There's no username since that's going to be used in the key, so it doesn't need to appear separately. If you like, you can do this...
class User(db.Model):
    datejoined = db.DateTimeProperty(auto_now_add=True)
    @property
    def name(self):
        return self.key().name()
To create an instance of a User, you now need to do something a little different: specify a key_name in the init method.
someuser = User(key_name='john_doe')
...
someuser.save()
Well, really you want to make sure that users don't overwrite each other, so you need to wrap the user creation in a transaction. First define a function that does the necessary check:
def create_user(username):
    checkeduser = User.get_by_key_name(username)
    if checkeduser is not None:
        raise db.Rollback('User already exists!')
    newuser = User(key_name=username)
    # more code
    newuser.put()
Then, invoke it in this way
db.run_in_transaction(create_user, 'john_doe')
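The point of the transaction is that the existence check and the write behave as a single atomic step. The toy sketch below shows that idea outside App Engine. `UserStore` is a hypothetical class, and the `threading.Lock` merely stands in for the transaction; it is not how the datastore works internally.

```python
import threading


class UserStore:
    """Toy key-value store; the lock stands in for a datastore transaction."""

    def __init__(self):
        self._users = {}
        self._lock = threading.Lock()

    def create_user(self, username):
        # The existence check and the write happen as one atomic step,
        # so two concurrent sign-ups cannot both pass the check.
        with self._lock:
            if username in self._users:
                return False
            self._users[username] = {"name": username}
            return True


store = UserStore()
results = []


def sign_up():
    results.append(store.create_user("john_doe"))


threads = [threading.Thread(target=sign_up) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Of the eight concurrent attempts, exactly one creates the user and the rest are rejected, which is the guarantee the transactional create_user above is meant to give.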
To find a user, you just do this:
someuser = User.get_by_key_name('john_doe')
Next, you need some way to associate the content with its user, and vice versa. One solution is to put the content into the same entity group as the user by declaring the user as the parent of the content. To do this, you don't need to change the Content model at all, but you create instances a little differently (much like you did with User):
somecontent = Content(parent=User.get_by_key_name('john_doe'))
So, given a content item, you can look up the user by examining its key:
someuser = User.get(somecontent.key().parent())
Going in reverse, looking up all of the content for a particular user is only a little trickier.
allcontent = Content.gql('where ancestor is :user', user=someuser).fetch(10)
Yes, and if you need more documentation, you can check here for database types and here for more info about your model classes.
An alternative solution you may see is using ReferenceProperty.
class User(db.Model):
    name = db.StringProperty()
    datejoined = db.DateTimeProperty(auto_now_add=True)
class Content(db.Model):
    user = db.ReferenceProperty(User, collection_name='matched_content')
    text = db.StringProperty()
    datemod = db.DateTimeProperty(auto_now_add=True)
content = db.get(content_key)
user_name = content.user.name
# looking up all of the content for a particular user
user_content = content.user.matched_content
# create new content for a user (the reference property is named "user")
new_content = Content(user=content.user)

How can I query for records based on an attribute of a ReferenceProperty? (Django on App Engine)

If I have the following models in a Python (+ Django) App Engine app:
class Album(db.Model):
    private = db.BooleanProperty()
    ...
class Photo(db.Model):
    album = db.ReferenceProperty(Album)
    title = db.StringProperty()
...how can I retrieve all Photos that belong to a public Album (that is, an Album with private == False)?
To further explain my intention, I thought it would be:
public_photos = Photo.all().filter('album.private =', False)
and then I could do something like:
photos_for_homepage = public_photos.fetch(30)
but the query does not match anything, which tells me I'm going down the wrong path.
You can't. App Engine doesn't support joins.
One approach is to implement the join manually. For example you could fetch all photos, then filter out the private ones in code. Or fetch all public albums, and then fetch each of their photos. It depends on your data as to whether this will perform okay or not.
The alternative approach is to denormalize your data. Put another field in the Photo model, e.g.:
class Photo(db.Model):
    album = db.ReferenceProperty(Album)
    album_private = db.BooleanProperty()
    title = db.StringProperty()
Then you can filter for public photos with:
public_photos = Photo.all().filter('album_private =', False)
This improves query performance, but at the expense of write performance. You will need to keep the album_private field of the photos updated whenever you change the private flag of the album. It depends on your data and read/write patterns as to whether this will be better or worse.
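The write-side cost of denormalizing can be illustrated with plain dictionaries standing in for the Album and Photo entities. `set_album_private` is a hypothetical helper, not a datastore API; in a real app the changed photos would also have to be saved again.

```python
def set_album_private(album, photos, private):
    """Change an album's flag and refresh the copy stored on its photos.

    Returns the photos that were changed and would need to be saved again.
    """
    album["private"] = private
    changed = []
    for photo in photos:
        if photo["album"] == album["id"] and photo["album_private"] != private:
            photo["album_private"] = private
            changed.append(photo)
    return changed


album = {"id": 1, "private": True}
photos = [
    {"title": "beach", "album": 1, "album_private": True},
    {"title": "city", "album": 2, "album_private": False},
    {"title": "hills", "album": 1, "album_private": True},
]
updated = set_album_private(album, photos, False)
public_titles = [p["title"] for p in photos if not p["album_private"]]
```

Every change to an album's flag touches all of its photos, so this pays off mainly when albums change rarely and public photos are read often.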
