Working with ancestors in GAE - python

I'd just like someone to confirm that I'm doing things the right way.
I have this structure: Books have Chapters (ancestor=Book), which have Pages (ancestor=Chapter).
It is clear to me that, to look up a Chapter by ID, I need the Book so that I can pass it as the parent.
My doubt is: do I need the whole Book-Chapter chain to look up a Page?
For example (I'm in NDB):
class Book(ndb.Model):
    # Search by id
    @classmethod
    def by_id(cls, id):
        return Book.get_by_id(long(id))
class Chapter(ndb.Model):
    # Search by id
    @classmethod
    def by_id(cls, id, book):
        return Chapter.get_by_id(long(id), parent=book.key)
class Page(ndb.Model):
    # Search by id
    @classmethod
    def by_id(cls, id, chapter):
        return Page.get_by_id(long(id), parent=chapter.key)
Currently, when I need to look up a Page to display its contents, I pass the complete chain in the URL, like this:
getPage?bookId=5901353784180736&chapterId=5655612935372800&pageId=1132165198169
So, in the controller, I do this:
def get(self):
    # Get the id parameters
    bookId = self.request.get('bookId')
    chapterId = self.request.get('chapterId')
    pageId = self.request.get('pageId')
    if bookId and chapterId and pageId:
        # Must be a digit
        if bookId.isdigit() and chapterId.isdigit() and pageId.isdigit():
            # Get the book
            book = Book.by_id(bookId)
            if book:
                # Get the chapter
                chapter = Chapter.by_id(chapterId, book)
                if chapter:
                    # Get the page
                    page = Page.by_id(pageId, chapter)
Is this the right way? Must I always have the complete chain in the URL to get the final element of the chain?
If this is right, I suppose that working this way with NDB does not have any impact on the datastore, because repeated calls to this page always hit the NDB cache for the same book, chapter and page (I'm getting by id, not running a fetch). Is my assumption correct?

No, there's no need to do that. The point is that keys are paths: you can build them up dynamically and only hit the datastore when you have a complete one. In your case, it's something like this:
page_key = ndb.Key(Book, long(bookId), Chapter, long(chapterId), Page, long(pageId))  # request values are strings; the ids are integers
page = page_key.get()
See the NDB docs for more examples.
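As a rough sketch of the idea (not the answerer's code), the three URL parameters can be validated and turned into one flat key path in plain Python, with no datastore reads along the way. The helper name `page_key_path` is hypothetical, and the kinds are given as strings, which `ndb.Key` also accepts.

```python
def page_key_path(book_id, chapter_id, page_id):
    """Build the flat (kind, id, kind, id, ...) path for a Page key.

    Returns None when any id is missing or is not a plain digit string.
    """
    raw_ids = (book_id, chapter_id, page_id)
    if not all(raw and raw.isdigit() for raw in raw_ids):
        return None
    path = []
    for kind, raw in zip(('Book', 'Chapter', 'Page'), raw_ids):
        path.extend([kind, int(raw)])  # integer ids, not key-name strings
    return path


path = page_key_path('5901353784180736', '5655612935372800', '1132165198169')
# Inside the handler this would become a single datastore lookup:
#     page = ndb.Key(*path).get()
```

Note that the full chain of ids is still needed to build the key; what goes away is fetching the Book and Chapter entities first.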

Related

Django app view - how to get an entry from the django database in the url and to obtain more information from the database for the view?

I have a Django application, 'test_app'. The test_app page, "domain/test_app/", has a link to another page within the app, "Genes", which takes me to "domain/test_app/genes/". On this page I have a list of links that comes from a table in my Django database called "gene", defined in test_app/models.py as class Gene. Another table, "Variants", links to it via its 'gene' attribute, which is a foreign key.
This is all in test_app/models.py:
class Gene(models.Model):
    gene = models.CharField(primary_key=True, max_length=110)
    transcript = models.CharField(max_length=100)
    def __str__(self):
        return self.gene
class Variant(models.Model):
    variant = models.CharField(primary_key=True, max_length=210)
    gene = models.ForeignKey(Gene, on_delete=models.CASCADE)
    cds = models.CharField(max_length=150)
    def __str__(self):
        return self.variant
My test_app/urls.py urlpatterns look like this:
urlpatterns = [
    url(r'^$', views.test_app, name='test_app'),
    url(r'^genes/$', views.genes, name='genes'),
    url(r'^genes/(?P<gene>)', views.variant_info, name='variant_info'),
]
and test_app/views.py looks like this (the templates are in the test_app/templates/test_app/ directory):
def test_app(request):
    template = loader.get_template('test_app/index.html')
    return render(request, 'test_app/index.html')
def genes(request):
    gene_list = Gene.objects.all()
    context = {'gene_list': gene_list}
    return render(request, 'test_app/genes.html', context)
def variant_info(request, gene):
    variant_info = Variant.objects.filter(gene=gene)
    return render(request, 'test_app/gene_info.html', {'variant_info': variant_info})
I have a list of genes in the gene table in my database. Obtaining these genes with 'gene_list = Gene.objects.all()' works fine. I then render this list to a template as a series of links on my "domain/test_app/genes" page. For example, when I click on TP53 on this page, it goes to "domain/test_app/genes/TP53/". This takes me to the correct page.
However, I clearly do not have the regex for this urlpattern correct, as when I type in more digits or characters, such as /TP53098120918, it still goes to the same page. Even when I put r'^genes/(?P)$' this still happens, and I don't understand why.
But the main problem I am having is obtaining the information from the variant table in my database and rendering it on the specific gene page, such as domain/test_app/genes/TP53.
When I click on TP53, it is not passed through to my view function 'variant_info' as the gene argument, because the line:
variant_info = Variant.objects.filter(gene=gene)
does not obtain any information from the database. I thought this would pass gene='TP53' when I click on the link, and that this would therefore be:
variant_info = Variant.objects.filter(gene='TP53')
which would return the information from the variant table with the gene as TP53. I don't know how to properly access the TP53 from the genes function in views.py.
The problem is that your (?P<gene>) is not matching any characters, so you will always have gene='' in your view.
Try changing your url pattern to:
url(r'^genes/(?P<gene>\w+)/$', views.variant_info, name='variant_info'),
This will accept letters A-Z, a-z, digits 0-9 and underscore for the gene argument. I've added the trailing slash for consistency with your other views, and the dollar because it's usually a good idea to do this.
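To see the difference outside Django, the two patterns can be tried with the standard `re` module. This is only an illustration, assuming the path is matched after the leading `test_app/` prefix has been stripped; the helper `captured_gene` and the sample paths are made up for the example.

```python
import re

broken = re.compile(r'^genes/(?P<gene>)')      # empty group, no end anchor
fixed = re.compile(r'^genes/(?P<gene>\w+)/$')  # one or more word characters


def captured_gene(pattern, path):
    """Return the captured gene argument, or None when the path does not match."""
    match = pattern.search(path)
    return match.group('gene') if match else None


# The broken pattern matches anything starting with "genes/" and captures nothing.
print(repr(captured_gene(broken, 'genes/TP53/')))
print(repr(captured_gene(broken, 'genes/TP53098120918')))

# The fixed pattern captures the whole path segment, or rejects the path.
print(repr(captured_gene(fixed, 'genes/TP53/')))
print(repr(captured_gene(fixed, 'genes/TP53-x/')))
```

The fixed pattern still accepts any word-like segment, such as TP53098120918, so the view has to decide what to do when no such gene exists.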
For the URL, you'd need something like this to match only digits at the end:
r'^genes/(?P<gene>TP\d+)$'
(I think that otherwise it will try to match a string and just accept the first match.)
As for your second question: sorry, but I don't get what you're asking.
But looking at your code, I assume you want to filter by properties of the foreign model:
variant_info = Variant.objects.filter(gene__gene__contains=gene)
This searches the gene field of the related Gene object.
P.S. That starts to look confusing; I would suggest changing the model to something like:
class Gene(models.Model):
    name = models.CharField(primary_key=True, max_length=110)
    ...

Create a url with an article's title

I have articles in MongoDB. I want the URLs for the articles to be readable. If I have an article named "How to Use Flask and MongoDB Seamlessly with Heroku", I want the URL to be something like localhost:5000/blog/how-to-use-flask-and-mongodb-seamlessly-with-heroku.
What is the best way to accomplish this? Any pointers in the right direction are appreciated. I wasn't sure exactly where to start on this one.
You are looking for a way to generate a "slug" and use that to identify the post.
If you want to use just a slug, all post titles will have to have a unique slug (which approximately means a unique title). This also means that if you change the post's title, the url could change, which would invalidate bookmarks and other outside links.
A better method is to do something like what Stack Overflow does for questions. If you look at this question's URL, you'll notice it has a unique id and a slug. In fact, the slug is optional, you can still get to this page by removing it from the url.
You'll need a way to generate slugs, and a custom url converter. The inflection library provides a nice way to slugify strings with the parameterize method. The following url converter takes an object and returns a url with the_object.id and the_object.title as a slug. When parsing a url, it will just return the object's id, since the slug is optional.
from inflection import parameterize
from werkzeug.routing import BaseConverter
class IDSlugConverter(BaseConverter):
    """Matches an int id and optional slug, separated by "/".
    :param attr: name of field to slugify, or None for default of str(instance)
    :param length: max length of slug when building url
    """
    regex = r'-?\d+(?:/[\w\-]*)?'
    def __init__(self, map, attr='title', length=80):
        self.attr = attr
        self.length = int(length)
        super(IDSlugConverter, self).__init__(map)
    def to_python(self, value):
        id, slug = (value.split('/') + [None])[:2]
        return int(id)
    def to_url(self, value):
        raw = str(value) if self.attr is None else getattr(value, self.attr, '')
        slug = parameterize(raw)[:self.length].rstrip('-')
        return '{}/{}'.format(value.id, slug).rstrip('/')
Register the converter so it can be used in routes:
app.url_map.converters['id_slug'] = IDSlugConverter
Use it in a route:
@app.route('/blog/<id_slug:id>')
def blog_post(id):
    # get post by id, do stuff
    pass
Generate a url for a post. Note that you pass the object ('post'), not just the id, to the id parameter:
url_for('blog_post', id=post)
# /blog/1234/the-post-title
Converter written by me for the Stack Overflow Python chat room site.
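For readers who want to see the two halves of the converter without Flask or inflection installed, here is a minimal standard-library sketch. `slugify` and `split_id_slug` are hypothetical names, and this ASCII-only `slugify` is a simplification, not a reimplementation of `inflection.parameterize`.

```python
import re


def slugify(title, length=80):
    """Lowercase the title and join runs of letters/digits with hyphens."""
    slug = re.sub(r'[^a-z0-9]+', '-', title.lower()).strip('-')
    return slug[:length].rstrip('-')


def split_id_slug(value):
    """Split "1234/some-slug" into (1234, "some-slug"); the slug may be absent."""
    post_id, _, slug = value.partition('/')
    return int(post_id), slug or None


print(slugify("How to Use Flask and MongoDB Seamlessly with Heroku"))
print(split_id_slug('1234/the-post-title'))
print(split_id_slug('1234'))
```

Only the integer id is used to look the post up, which is why a stale or missing slug in a bookmarked URL keeps working.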

NDB: how to get child entities that depend on values stored on a parent structured property

I have the following models:
class Roles(ndb.Model):
    email = ndb.StringProperty(required=True)
    type = ndb.StringProperty(choices=['writer', 'editor', 'admin'])
class Book(ndb.Model):
    uid = ndb.StringProperty(required=True)
    user = ndb.UserProperty(auto_current_user_add=True)
    name = ndb.StringProperty(required=True)
    shared_with = ndb.StructuredProperty(Roles, repeated=True, indexed=True)
class Page(ndb.Model):
    uid = ndb.StringProperty(required=True)
    user = ndb.UserProperty(auto_current_user_add=True)
    title = ndb.StringProperty(required=True)
    parent_uid = ndb.ComputedProperty(lambda self: self.key.parent().get().uid)
    shared_with = ndb.ComputedProperty(lambda self: self.key.parent().get().shared_with)
The structure I am using is:
Book1     Book2    - (parent)
  |         |
  ^         ^
pages     pages    - (child)
When a Book is created, the shared_with is filled with a list of emails/roles.
For example:
Book.uid = user.user_id()
Book.user = user
Book.name = "learning appengine NDB"
Book.shared_with = [Roles(email="user_1@domain.tld", type="admin"), Roles(email="user_2@domain.tld", type="editor")]
When a user creates a Page, the user.user_id() is stored as uid.
Example when user_2@domain.tld (role type: editor) creates a page:
Page.title = "understanding ComputedProperty"
Page.uid = user.user_id()
Page.user = user
With this schema, if I want to show user_2@domain.tld only the pages he has created, I can do a simple query filtering by uid, with something like:
# supposing user_2@domain.tld is logged in
user2_pages = Page.query(Page.uid == user.user_id())
But for the other users listed in the shared_with property of the Book, how can I keep showing them their own pages (the ones they created), and all the rest only if they have a role of admin or editor?
For example, if I want to allow other users (admins, editors) to see a list of the latest pages created across all the books, how could I write such a query?
What I have been trying so far, without success, is to use a ComputedProperty; I can't make it work as expected.
To verify that I get the correct values, I run a query like:
query = Page.query().get()
print query.parent_uid
I do get the parent uid, and the same goes for the shared_with values, but for some unknown reason I can't filter on them when using something like:
query = Page.query(
    Page.parent_uid == user.user_id()
)
# query returns None
A probably better and simpler approach is to show pages per book but I would like to know if it is possible to do it for all the books, so that admins and editors can just see a list of last pages created in general, instead of going into each book.
Any ideas?
Your computed property cannot work because it's only updated when the Page entity is put. See https://stackoverflow.com/a/12630991/1756187. Any changes to Book entities have no effect on Page computed properties.
You can try to use model hooks to maintain Page.shared_with. See https://developers.google.com/appengine/docs/python/ndb/entities#hooks.
I'm wondering, though, whether this is the best approach. If you have the sharing info at the Book level, you can use its index to retrieve the list of book keys with a keys-only query. Then you can retrieve all the pages for these parent keys. That way you don't have to add a shared_with attribute to the Page model at all. The cost of the query will be slightly higher, but the Page entities will be smaller and cheaper to maintain.
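The two-step lookup can be sketched in plain Python, with lists of dicts standing in for the Book and Page kinds. This is not NDB code: in the datastore, step 1 would be the keys-only query on Book and step 2 the page queries for those parent keys. The function name, the dict layout and the sample data are all assumptions made for illustration.

```python
ALLOWED_ROLES = {'admin', 'editor'}


def visible_pages(email, user_id, books, pages):
    """Return pages the user created plus pages of books shared with them.

    books: list of dicts with 'key' and 'shared_with' (list of (email, role)).
    pages: list of dicts with 'parent' (a book key), 'uid' and 'title'.
    """
    # Step 1 (stand-in for the keys-only Book query): books granting a role.
    shared_book_keys = {
        book['key'] for book in books
        if any(e == email and role in ALLOWED_ROLES
               for e, role in book['shared_with'])
    }
    # Step 2: pages under those parent keys, plus the user's own pages.
    return [
        page for page in pages
        if page['parent'] in shared_book_keys or page['uid'] == user_id
    ]


books = [
    {'key': 'book1', 'shared_with': [('user_1@domain.tld', 'admin'),
                                     ('user_2@domain.tld', 'editor')]},
    {'key': 'book2', 'shared_with': [('user_2@domain.tld', 'writer')]},
]
pages = [
    {'parent': 'book1', 'uid': 'u1', 'title': 'p1'},
    {'parent': 'book2', 'uid': 'u2', 'title': 'p2'},
    {'parent': 'book2', 'uid': 'u3', 'title': 'p3'},
]
titles = [p['title'] for p in visible_pages('user_2@domain.tld', 'u2', books, pages)]
print(titles)
```

Because the sharing data is read from Book at query time, a change to a book's shared_with list takes effect without rewriting any Page entity.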

Dereference Models from many to many relationship

In my schema, as described in the test data generation example below, I want to know a good way to:
Dereference all instances of Favourite that hold reference keys to instances of Picture that have been deleted. In other words, just delete any Favourite that links to a deleted Picture.
The Person class is a user
The Picture class is something that can be a Favourite
The Favourite class is an example of the Link-Model way of having many-to-many relationships.
Why this question?
First, I hope it doesn't fall outside the scope here; second, because this can happen; and third, because it's interesting.
How?
Let's say that a person can have up to thousands of favourites, something like Likes on social networks or, to make it worse, orders, accounts or invalid data in a scientific application.
In our example, for some reason (and these reasons do happen), a person is experiencing a lot of dead favourite links, or I know that there are dead favourites.
What would be a good way to do this while reducing ndb get() operations and not iterating through every Favourite?
Let's not complicate things. Let's assume that only one user is suffering from dead favourites. He is an instance of Person with a stubbed user_id property of '123'.
In the following example you can use the following handlers and their corresponding functions.
import time
import sys
import logging
import random
import cgi
import webapp2
from google.appengine.ext import ndb
class Person(ndb.Expando):
    pass
class Picture(ndb.Expando):
    pass
class Favourite(ndb.Expando):
    user_id = ndb.StringProperty(required=True)
    #picture = ndb.KeyProperty(kind=Picture, required=True)
    pass
class GenerateDataHandler(webapp2.RequestHandler):
    def get(self):
        try:
            number_of_models = abs(int(cgi.escape(self.request.get('n'))))
        except ValueError:
            number_of_models = 10
            logging.info("GET ?n=parameter not defined. Using default.")
        user_id = '123' #stub
        person = Person.query().filter(ndb.GenericProperty('user_id') == user_id).get()
        if not person:
            person = Person()
            person.user_id = user_id #Stub
            person.put()
            logging.info("Created Person instance")
        if not self._gen_data(person, number_of_models):
            return
        self.response.write("Data generated successfully")
    def _gen_data(self, person, number_of_models):
        first, last = Picture.allocate_ids(number_of_models)
        picture_keys = [ndb.Key(Picture, id) for id in range(first, last+1)]
        pictures = []
        favourites = []
        for picture_key in picture_keys:
            picture = Picture(key=picture_key)
            pictures.append(picture)
            favourite = Favourite(parent=person.key,
                                  user_id=person.user_id,
                                  picture=picture_key
                                  )
            favourites.append(favourite)
        entities = favourites
        entities[1:1] = pictures
        ndb.put_multi(entities)
        return True
class CorruptDataHandler(webapp2.RequestHandler):
    def get(self):
        if not self._corrupt_data(0.5):#50% corruption
            return
        self.response.write("Data corruption completed successfully")
    def _corrupt_data(self, n):
        picture_keys = Picture.query().fetch(99999, keys_only=True)
        random_picture_keys = random.sample(picture_keys, int(float(len(picture_keys))*n))
        ndb.delete_multi(random_picture_keys)
        return True
class FixDataHandler(webapp2.RequestHandler):
    def get(self):
        user_id = '123' #stub
        person = Person.query().filter(ndb.GenericProperty('user_id') == user_id).get()
        self._dereference(person)
    def _dereference(self, person):
        # Here is where you implement your answer
        pass
The handlers are separate because of eventual consistency in the NDB Datastore. More info: GAE put_multi() entities using backend NDB
Of course, I am posting an answer as well, to show that I tried something before posting this.
A ReferenceProperty is just a key, so if you have the key of the deleted Person, you can use that to query the Favourite.
Otherwise, there's no easy way. You'll have to filter through all Favourites and find ones that have an invalid Picture. It's very simple in a mapreduce job, but could be an expensive query if you have a lot of Favourites.
You could use a pre-delete hook (look here for a way to implement it).
Of course, this could be done more easily if you use the NDB API instead of the Datastore API (hooks on NDB), but then you'll have to change the way you make the references.
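The "filter through all Favourites" approach can at least be limited to one batched lookup for the pictures. Below is a minimal plain-Python sketch of that logic. The function name and the toy data are hypothetical; `batch_get` stands in for a call such as `ndb.get_multi`, which returns None for keys that no longer exist.

```python
def find_dead_favourites(favourites, batch_get):
    """Return the favourites whose picture no longer exists.

    favourites: list of (favourite_key, picture_key) pairs.
    batch_get: callable taking a list of keys and returning a list of the
        same length, with None in place of every missing entity.
    """
    picture_keys = sorted({picture_key for _, picture_key in favourites})
    entities = batch_get(picture_keys)  # one batched lookup for all pictures
    missing = {key for key, entity in zip(picture_keys, entities)
               if entity is None}
    return [fav for fav in favourites if fav[1] in missing]


# Toy stand-in for the datastore: pictures 2 and 4 have been deleted.
stored_pictures = {1: 'pic-1', 3: 'pic-3'}
favourites = [('fav-a', 1), ('fav-b', 2), ('fav-c', 3), ('fav-d', 4), ('fav-e', 2)]
dead = find_dead_favourites(
    favourites, lambda keys: [stored_pictures.get(k) for k in keys])
print(dead)
```

In the handler above, the returned favourites would then be removed in one call, for example with `ndb.delete_multi` on their keys.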

Query Set for Models Retrieved from External API

I am currently developing a web app that uses the Amazon Product API to get information on books. I use a Book model that contains only the ASIN (Amazon's identification code) and looks a bit like this:
class Book(models.Model):
    asin = models.CharField(max_length=10, unique=True)
    def retrieve(self, **kwargs):
        kwargs['ItemId'] = self.asin
        self.xml = amazon_lookup(**kwargs) # returns BeautifulSoup
    @property
    def title(self):
        try:
            return self.xml.ItemAttributes.Title.text
        except AttributeError: # happens when xml has not been populated
            return None
    ...
class BookManager(models.Manager):
    def retrieve(self, **kwargs):
        kwargs['SearchIndex'] = 'Books'
        book_search = amazon_search(**kwargs)
        books = []
        for item in book_search:
            book = self.get_or_create(asin=item.ASIN.text)[0]
            book.xml = item
            books.append(book)
        return books
And then I can call it with
books = Book.objects.retrieve(Keywords="foo bar") # returns a list of Book instances
b = books[0]
b.retrieve(ResponseGroup="Images,ItemAttributes,...")
t = b.title
This works well enough for tests, but I want a more robust system for future use.
What I'd really like to do is be able to perform searches with query sets so that frequently accessed results can be cached. As it stands, every request for a book detail view creates a new amazon API call. My job would be a lot easier if all the API calls were handled inside Query Sets alongside database calls. Unfortunately, I've found the Django Query Set documentation pretty cryptic and lacking when it comes to customization. This surely isn't a rare use case.
Can anyone provide the idiomatic way of handling a problem like this, or a good resource on the subject?
