Fetching data from parent table in Django in database hierarchy - python

Following this answer, I tried to split my SQL Story table into parent/children - with the children holding the specific user data, the parent more generic data. Now I've run into a problem that betrays my lack of experience in Django. My user page attempts to show a list of all the stories that a user has written. Before, when my user page was only pulling data from the story table, it worked fine. Now I need to pull data from two tables with linked info and I just can't work out how to do it.
Here's my user_page view before attempts to pull data from the parent story table too:
def user_page(request, username):
    user = get_object_or_404(User, username=username)
    userstories = user.userstory_set.order_by('-id')
    variables = RequestContext(request, {
        'username': username,
        'userstories': userstories,
        'show_tags': True
    })
    return render_to_response('user_page.html', variables)
Here is my models.py:
class story(models.Model):
    title = models.CharField(max_length=400)
    thetext = models.TextField()

class userstory(models.Model):
    main = models.ForeignKey(story)
    date = models.DateTimeField()
    user = models.ForeignKey(User)
I don't really know where to start in terms of looking up the appropriate information in the parent table too and assigning it to a variable. What I need to do is follow the 'main' key of the userstory table into the story table and assign the matching story row to a variable. But I just can't see how to implement that in the view definition.
EDIT: I've tried story = userstory.objects.get(user=user) but I get 'userstory matching query does not exist.'

Reading through the previous question that you linked to, I've discovered where the confusion lies. I was under the impression that a Story may have many UserStorys associated with it. Note that I'm capitalizing the class names, which is the usual Python convention. I made this assumption because your model structure allows it to happen through the ForeignKey in your UserStory model. Your model structure should look like this instead:
class Story(models.Model):
    title = models.CharField(max_length=400)
    thetext = models.TextField()

class UserStory(models.Model):
    story = models.OneToOneField(Story)  # renamed field to story as convention suggests
    date = models.DateTimeField()
    user = models.ForeignKey(User)

class ClassicStory(models.Model):
    story = models.OneToOneField(Story)
    date = models.DateTimeField()
    author = models.CharField(max_length=200)
See the use of OneToOne relationships here. A OneToOne field denotes a 1-to-1 relationship, meaning that a Story has one, and only one, UserStory. This also means that a UserStory is related to exactly one Story. This is the "parent-child" relationship, with the extra constraint that a parent has only a single child. Your use of a ForeignKey before means that a Story has multiple UserStories associated with it, which is wrong for your use case.
Now your queries (and attribute accessors) will behave like you expected.
# get all of the user's UserStories:
user = request.user
stories = UserStory.objects.filter(user=user).select_related('story')
# print all of the stories:
for s in stories:
    print(s.story.title)
    print(s.story.thetext)
Note that select_related will create a SQL join, so you're not executing another query each time you print out the story text. Read up on this, it is very very very important!
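To make the point about the join concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table names, the sample rows and the `query` helper are assumptions made up for the illustration; the only purpose is to count how many statements each approach sends to the database.

```python
import sqlite3

# Hypothetical tables standing in for the Story / UserStory models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE story (id INTEGER PRIMARY KEY, title TEXT, thetext TEXT);
    CREATE TABLE userstory (
        id INTEGER PRIMARY KEY,
        story_id INTEGER UNIQUE REFERENCES story(id),
        user_id INTEGER
    );
    INSERT INTO story VALUES (1, 'First', 'aaa'), (2, 'Second', 'bbb'), (3, 'Third', 'ccc');
    INSERT INTO userstory VALUES (1, 1, 7), (2, 2, 7), (3, 3, 7);
""")

log = []  # every SQL statement we send ends up here


def query(sql, params=()):
    """Run a SELECT and remember that we ran it."""
    log.append(sql)
    return conn.execute(sql, params).fetchall()


# Without a join: one query for the user's rows, then one more per story.
titles_lazy = []
for (story_id,) in query(
    "SELECT story_id FROM userstory WHERE user_id = ? ORDER BY id", (7,)
):
    titles_lazy.append(query("SELECT title FROM story WHERE id = ?", (story_id,))[0][0])
lazy_count = len(log)

log.clear()

# With a join (roughly what select_related asks the database for): one query.
titles_joined = [
    title
    for (title,) in query(
        "SELECT story.title FROM userstory "
        "JOIN story ON story.id = userstory.story_id "
        "WHERE userstory.user_id = ? ORDER BY userstory.id",
        (7,),
    )
]
joined_count = len(log)

print(lazy_count, joined_count)
```

With three stories the first approach sends four statements and the joined one sends a single statement, and the gap grows with every extra story.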
Your previous question mentions that you have another table, ClassicStories. It should also have a OneToOneField, just like the UserStories. Using OneToOne fields in this way makes it very difficult to iterate over the Story model, as it may be a "ClassicStory" but it might be a "UserStory" instead:
# iterate over ALL stories
allstories = Story.objects.all()
for s in allstories:
    print(s.title)
    print(s.thetext)
    print(s.userstory)  # this might error!
    print(s.classicstory)  # this might error!
See the issue? You don't know what kind of story it is. You need to check the type of story before accessing the fields in the sub-table. There are projects that help manage this kind of inheritance; one example is the InheritanceManager from django-model-utils, but that's a little advanced. If you never need to iterate over the Story model and access its sub-tables, you don't need to worry though. As long as you only access Story from ClassicStories or UserStories, you will be fine.
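In Django itself, the reverse one-to-one accessor raises an exception when no child row exists, so the check is usually a try/except or a hasattr around s.userstory. At the SQL level, the same "which kind is it?" question can be answered for all stories in one pass by LEFT JOINing both child tables and looking at which side came back non-NULL. The sketch below shows that idea with sqlite3; the table layout, the sample rows and the `kind_of` helper are assumptions for the illustration, not what any particular library does internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE story (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE userstory (id INTEGER PRIMARY KEY, story_id INTEGER UNIQUE, user_id INTEGER);
    CREATE TABLE classicstory (id INTEGER PRIMARY KEY, story_id INTEGER UNIQUE, author TEXT);
    INSERT INTO story VALUES (1, 'My holiday'), (2, 'Moby-Dick');
    INSERT INTO userstory VALUES (1, 1, 7);
    INSERT INTO classicstory VALUES (1, 2, 'Herman Melville');
""")

# One query: each story row carries the primary key of its child row, or NULL.
rows = conn.execute("""
    SELECT story.title, userstory.id, classicstory.id
    FROM story
    LEFT JOIN userstory ON userstory.story_id = story.id
    LEFT JOIN classicstory ON classicstory.story_id = story.id
    ORDER BY story.id
""").fetchall()


def kind_of(user_pk, classic_pk):
    """Decide the story type from which child table produced a row."""
    if user_pk is not None:
        return "user"
    if classic_pk is not None:
        return "classic"
    return "plain"


kinds = [(title, kind_of(user_pk, classic_pk)) for title, user_pk, classic_pk in rows]
print(kinds)
```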

Related

Getting data from related tables using Django ORM

Using the Django ORM, how does one access data from related tables without effectively making a separate call for each record (or redundantly denormalizing data to make it more easily accessible)?
Say I have 3 Models:
class Tournament(models.Model):
    name = models.CharField(max_length=250)
    active = models.BooleanField(null=True, default=1)

class Team(models.Model):
    name = models.CharField(max_length=250)
    coach_name = models.CharField(max_length=250)
    active = models.BooleanField(null=True, default=1)

class Player(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.DO_NOTHING
    )
    number = models.PositiveIntegerField()
    age = models.PositiveIntegerField()
    active = models.BooleanField(null=True, default=1)
Note that this Player model is important in the application as it's a major connection to most of the models - from registration to teams to stats to results to prizes. But this Player model doesn't actually contain the person's name as the model contains a user field which is the foreign key to a custom AUTH_USER_MODEL ('user') model which contains first/last name information. This allows the player to log in to the application and perform certain actions.
In addition to these base models, say that since a player can play on different teams in different tournaments, I also have a connecting ManyToMany model:
class PlayerToTeam(models.Model):
    player = models.ForeignKey(
        Player,
        on_delete=models.DO_NOTHING
    )
    team = models.ForeignKey(
        Team,
        on_delete=models.DO_NOTHING
    )
    tournament = models.ForeignKey(
        Tournament,
        on_delete=models.DO_NOTHING
    )
As an example of one of the challenges I'm encountering, let's say I'm trying to create a form that allows coaches to select their starting lineup. So I need my form to list the names of the Players on a particular Team at a particular Tournament.
Given the tournament and team IDs, I can easily pull back the necessary QuerySet to describe the initial records I'm interested in.
playersOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id])
This returns the QuerySet of the IDs (but only the IDs) of the team, the tournament, and the players. However, the name data is two models away:
PlayerToTeam->[player_id]->Player->[user_id]->User->[first_name] [last_name]
Now, if I pull back only a single record, I could simply do
onlyPlayerOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id]).filter(player=[Player_id]).get()
onlyPlayerOnTeam.player.user.first_name
So if I was only needing to display the names, I believe I could pass the QuerySet in the view return and loop through it in the template and display what I need. But I can't figure out if you can do something similar when I need the names to be displayed as part of a form.
To populate the form, I believe I could loop through the initial QuerySet and build a new datastructure:
playersOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id])
allPlayersData = []
for nextPlayer in playersOnTeam:
    playerDetails = {
        "player_id": nextPlayer.player.id,
        "first_name": nextPlayer.player.user.first_name,
        "last_name": nextPlayer.player.user.last_name,
    }
    allPlayersData.append(playerDetails)
form = StartingLineupForm(allPlayersData)
However, I fear that would result in a separate database call for every player/user!
And while that may be tolerable for 6-10 players, for larger datasets, that seems less than ideal. Looping through performing a query for every user seems completely wrong.
Furthermore, what's frustrating is that this would be simple enough with a straight SQL query:
SELECT User.first_name, User.last_name
FROM PlayerToTeam
INNER JOIN Player ON PlayerToTeam.player_id = Player.id
INNER JOIN User ON Player.user_id = User.id
WHERE PlayerToTeam.tournament_id=[tourney_id] AND PlayerToTeam.team_id=[team_id]
But I'm trying to stick to the Django ORM best practices as much as I can and avoid just dropping to SQL queries when I can't immediately figure something out, and I'm all but certain that this isn't so complicated of a situation that I can't accomplish this without resorting to direct SQL queries.
I'm starting to look at select_related and prefetch_related, but I'm having trouble wrapping my head around how those work for relations more than a single table connection away. Like it seems like I could access the Player.age data using the prefetch, but I don't know how to get to User.first_name from that.
Any help would be appreciated.
I would suggest two approaches:
A) select related (one DB query):
objects = PlayerToTeam.objects.filter(
    ...
).select_related(
    'player__user',
).only('player__user__first_name')
# the name lives two hops away: PlayerToTeam -> Player -> User
name = objects.first().player.user.first_name
B) annotate (one DB query):
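from django.db.models import F

objects = PlayerToTeam.objects.filter(
    ...
).annotate(
    # copy the related user's first name onto each PlayerToTeam row
    player_name=F('player__user__first_name'),
)
name = objects.first().player_name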
To be sure you have only one object for a specific player, team and tournament, I would suggest adding unique_together:
class PlayerToTeam(models.Model):
    ...

    class Meta:
        unique_together = ('player', 'team', 'tournament', )

django: need to design models/forms for a 'multiple level nested' structures

Assume some Company with Employees. There are Name and Contact information bound to each Employee. Each Contact contains Street and Phones fields.
What I want is a page which lists the employees within a company. But everything must be listed as forms, because I want to be able to modify a particular Employee's information and, most importantly, I want to be able to add new Employees (clicking an "Add new employee" button must add a new empty "Employee form"). It must also be possible to add a new phone number to an existing Employee's Contact information at any time.
The data model looks like:
--Company
----Employee1
------Name
------Contact
--------Street
--------Phones
----------Phone1
----------Phone2
----Employee2
------Name
------Contact
--------Street
--------Phones
----------Phone1
----------Phone2
----------Phone3
...
Could someone please help to design Models and Forms for such a task? Your help is very much appreciated. Many thanks!
P.S. Forgot to mention that I want all the data "collected" in the Company object at the end of the day. I mean that when I serialize c = Company.objects.all()[0] on the back end, the entire employee information must be visible, so c.employees[0].contact.phones[0] must be the first employee's first phone number. Thanks.
P.P.S.
It is not the case that I'm just handing off my project. This is just a hypothetical example I created to present the problem. I'm a Django newbie and I'm trying to understand how the framework gets things rolling.
I've spent a lot of time on this. I've found several ways to go, but none of them got me all the way. For instance, a wonderful blog post about nested formsets, http://yergler.net/blog/2013/09/03/nested-formsets-redux/, helped with forms and rendering. But it solved only half of the problem. The data I mentioned above is not "being collected" into an object. At the end of the day I want to serialize a Company object and save it in YAML format using pyyaml (see my previous post, django: want to have a form for dynamically changed sequence data).
Django is perfect with "static" models and forms, and ModelForms are awesome. But what if your model needs to change dynamically? There seems to be no standard way to go, and I could not find any appropriate documentation. Thus, I'd like to hear how experts imagine the solution to such a problem.
Try this:
from django.db import models


class _Contact(object):
    # plain container used by Employee.contact below
    pass


class Company(models.Model):
    name = models.CharField(max_length=255)
    created_at = models.DateTimeField(auto_now_add=True)

    @property
    def employees(self):
        return self.employee_set.prefetch_related('phones').order_by('-created_at')


class Phone(models.Model):
    number = models.CharField(max_length=255)
    created_at = models.DateTimeField(auto_now_add=True)


class Employee(models.Model):
    name = models.CharField(max_length=255)
    street = models.CharField(max_length=255)
    phones = models.ManyToManyField('Phone', through='EmployeePhone', blank=True)
    created_at = models.DateTimeField(auto_now_add=True)
    company = models.ForeignKey(Company)

    @property
    def contact(self):
        _contact = _Contact()
        _contact.street = self.street
        _contact.phones = self.phones.order_by('-employeephone__created_at')
        return _contact


class EmployeePhone(models.Model):
    employee = models.ForeignKey(Employee)
    phone = models.ForeignKey(Phone)
    created_at = models.DateTimeField(auto_now_add=True)
However, you should just use employee.street and employee.phones. employee.contact is redundant.

Django Threaded Commenting System

(and sorry for my english)
I am learning Python and Django. My current challenge is developing a threaded generic comment system. There are two models, Post and Comment.
-A Post can be commented on.
-A Comment can be commented on. (endless/threaded)
-There should be no N+1 query problem in the system. (The number of queries should not grow with the number of comments.)
My current models are like this:
class Post(models.Model):
    title = models.CharField(max_length=100)
    content = models.TextField()
    child = generic.GenericRelation(
        'Comment',
        content_type_field='parent_content_type',
        object_id_field='parent_object_id'
    )

class Comment(models.Model):
    content = models.TextField()
    child = generic.GenericRelation(
        'self',
        content_type_field='parent_content_type',
        object_id_field='parent_object_id'
    )
    parent_content_type = models.ForeignKey(ContentType)
    parent_object_id = models.PositiveIntegerField()
    parent = generic.GenericForeignKey(
        "parent_content_type", "parent_object_id")
Are my models right? And how can I get all the comments (with their hierarchy) of a post without the N+1 query problem?
Note: I know about django-mptt and other modules, but I want to learn how such a system works.
Edit: I ran "Post.objects.all().prefetch_related("child").get(pk=1)" and this gave me the post and its child comments. But when I want to get the child comments of a child comment, a new query is run. I can change the call to ...prefetch_related("child__child__child...")... but then a new query is still run for every depth of the child-parent relationship. Does anyone have an idea how to resolve this problem?
If you want to get all comments on a post with a single query then it would be good to have every comment link to the associated post. You can use a separate link to indicate the parent comment.
Basically:
class Post(models.Model):
    ...
    comments = models.ManyToManyField('Comment')
    # link to all comments, even children of comments

class Comment(models.Model):
    ...
    child_comments = models.ManyToManyField('Comment')
    # You may find it easier to organise these into a tree
    # if you use a parent_comment ForeignKey. That way the
    # top level comments have no parent and can be easily spotted.

Post.objects.prefetch_related('comments').get(pk=1)
Note that select_related only follows ForeignKey and OneToOneField relations, so a ManyToManyField like this one has to be loaded with prefetch_related. That costs one extra query for all of the post's comments, no matter how deeply they are nested. The many-to-many also takes a little extra work to create the association, as it uses an intermediate table. If you want a pure one-to-many, put a ForeignKey to Post on the Comment instead; the reverse side of that relation is also loaded with prefetch_related, again with a single extra database hit.
This is also better in that you do not have an untyped foreign key reference (your PositiveIntegerField).
You then need to arrange the comments into a tree structure, but that is outside the scope of your question.
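For what it's worth, arranging the flat result into a tree can be done in plain Python once all of a post's comments have been loaded. The sketch below assumes each comment has been reduced to a (comment_id, parent_id, content) tuple, with parent_id set to None for top-level comments; the function name and the dict layout are made up for the illustration.

```python
from collections import defaultdict


def build_comment_tree(rows):
    """Turn flat (comment_id, parent_id, content) rows into nested dicts.

    Top-level comments have parent_id None. Replies keep the order
    in which they appear in `rows`.
    """
    # Group comments by their parent in a single pass over the rows.
    children = defaultdict(list)
    for comment_id, parent_id, content in rows:
        children[parent_id].append((comment_id, content))

    def attach(parent_id):
        # Recursively collect the replies below one parent.
        return [
            {"id": cid, "content": content, "replies": attach(cid)}
            for cid, content in children.get(parent_id, [])
        ]

    return attach(None)


rows = [
    (1, None, "first"),
    (2, 1, "reply to first"),
    (3, 2, "nested reply"),
    (4, None, "second"),
]
tree = build_comment_tree(rows)
print(tree[0]["replies"][0]["replies"][0]["content"])
```

The number of database queries stays fixed however deep the thread goes, because the nesting is rebuilt in memory. The recursion depth equals the thread depth, so extremely deep threads would need an iterative version.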

Database structure in Django for voting app

I'm trying to wrap my head around how I would structure my database tables in the Django webapp I'm writing. I'm a relative newbie to web development, but this is the very first time I've tried to use a database, so bear with me if it's a stupid question.
The webapp goes through each Oscar the Academy gives out and allows the user to select which of some (varying) number of nominations will win an Oscar. The data from each individual session will be publicly available by going to a url like [url].com/answers/[unique id]. The overall data will also be available on a results page. So I've started writing my models file, and this is what I have so far:
from django.db import models
class Nominee(models.Model):
    award = models.CharField(max_length=50)
    title = models.CharField(max_length=50)
    key = models.CharField(max_length=50)
    subtitle = models.CharField(max_length=50)
    numVotes = models.IntegerField()

class Session(models.Model):
    id = models.IntegerField() # unique id of visitor
    bpictureVote = models.ForeignKey(Nominee, related_name = 'nom')
    bactorVote = models.ForeignKey(Nominee, related_name = 'nom')
    # ... for each award
# ... for each award
I was originally thinking of having
class Award(models.Model):
    name = models.CharField(max_length=50)
and at the beginning of Nominee,
    award = models.ForeignKey(Award, related_name = 'award')
but I couldn't figure out why that would be better than just having award be a part of the Nominee class.
This is really just a start, because I've gotten a bit stuck. Am I on the right track? Should I be doing this totally differently (as I probably should...)? Any thoughts?
Thanks!
You are on the right track.
You need a separate Award class to avoid having to type in the award's name every time you create a Nominee. By having a ForeignKey reference you make sure that you can safely rename your award and add additional information about it (let's say in the future you decide to give each award a separate page with a description and a list of nominees). You also avoid the errors that come from different spellings and typos ("Best Engineer Award" and "Best Engineer award"). It also makes sense conceptually: your application operates on a set of objects, namely user sessions, nominees and awards.
A few unrelated notes:
-You don't need an explicit Session.id field, the Django ORM creates it for you.
-Attribute names should be name_with_underscores, not camelCase (per PEP 8).
-No spaces around "=" in an argument list: models.ForeignKey(Nominee, related_name='nom').
-Indent with 4 spaces instead of 2 (unless your project explicitly specifies otherwise).
I am not entirely sure, but because you mention multiple nominees per award (assuming this is something like a poll before the actual nomination), a ManyToMany would be the relation you need, since it also lets you store the additional user data.
If you have implemented this as a specific app for nominations with a custom user model, this would have to be refactored into something else.
Anyway, based on your current implementation:
class Nominee(models.Model):
    title = models.CharField(max_length=50)
    key = models.CharField(max_length=50)
    subtitle = models.CharField(max_length=50)
    ...

class Award(models.Model):
    name = models.CharField(max_length=50)
    nominees = models.ManyToManyField(Nominee, through='AwardNominees')
    ...

class AwardNominees(models.Model):
    nominee = models.ForeignKey(Nominee)
    award = models.ForeignKey(Award)
    user = models.ForeignKey(User)
    numVotes = models.IntegerField()
    ....
So it turned out I was thinking about this entirely wrong. I've now completely changed things, and now it's fully functional (!). But in the spirit of full disclosure, I should say that it definitely may not be the best solution. It sure seems like a good one, though, because it's really simple. Now I have only one model:
class Vote(models.Model):
    award = models.CharField(...) # Name of the award
    title = models.CharField(...) # Title of the nominee
    subtitle = models.CharField(...) # Subtitle of the nominee
    uid = models.CharField(...) # A 6 character user ID for future access
When I want to show the results of one user's votes, I can use Django's database tools to filter for a certain uid captured in the URL. When I want to tally the votes, I can use a combination of filters and Django's count() to determine how many votes each nominee had for a certain award. Sounds reasonable enough to me!
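The two lookups described above (one user's votes, and the tally per nominee for an award) boil down to a filter plus a count. Here is a rough plain-Python sketch of that logic with collections.Counter; it assumes the votes have been reduced to (award, title, uid) tuples, and the sample data and function names are made up for the illustration. In the real app the filtering and counting would be done by the database rather than in memory.

```python
from collections import Counter

# Hypothetical sample data: (award, nominee title, user id).
votes = [
    ("Best Picture", "Film A", "abc123"),
    ("Best Picture", "Film B", "def456"),
    ("Best Picture", "Film A", "ghi789"),
    ("Best Actor", "Actor X", "abc123"),
]


def votes_for_user(votes, uid):
    """All (award, title) picks made under one user ID."""
    return [(award, title) for award, title, voter in votes if voter == uid]


def tally(votes, award):
    """Number of votes each nominee received for one award."""
    return Counter(title for a, title, _voter in votes if a == award)


print(votes_for_user(votes, "abc123"))
print(tally(votes, "Best Picture").most_common())
```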

A good data model for finding a user's favorite stories

Original Design
Here's how I originally had my Models set up:
class UserData(db.Model):
    user = db.UserProperty()
    favorites = db.ListProperty(db.Key) # list of story keys
    # ...

class Story(db.Model):
    title = db.StringProperty()
    # ...
On every page that displayed a story I would query UserData for the current user:
user_data = UserData.all().filter('user =', users.get_current_user()).get()
story_is_favorited = (story in user_data.favorites)
New Design
After watching this talk: Google I/O 2009 - Scalable, Complex Apps on App Engine, I wondered if I could set things up more efficiently.
class FavoriteIndex(db.Model):
    favorited_by = db.StringListProperty()
The Story Model is the same, but I got rid of the UserData Model. Each instance of the new FavoriteIndex Model has a Story instance as a parent. And each FavoriteIndex stores a list of user IDs in its favorited_by property.
If I want to find all of the stories that have been favorited by a certain user:
index_keys = FavoriteIndex.all(keys_only=True).filter('favorited_by =', users.get_current_user().user_id())
story_keys = [k.parent() for k in index_keys]
stories = db.get(story_keys)
This approach avoids the serialization/deserialization that's otherwise associated with the ListProperty.
Efficiency vs Simplicity
I'm not sure how efficient the new design is, especially after a user decides to favorite 300 stories, but here's why I like it:
1. A favorited story is associated with a user, not with her user data.
2. On a page where I display a story, it's pretty easy to ask the story if it's been favorited (without calling up a separate entity filled with user data).
    fav_index = FavoriteIndex.all().ancestor(story).get()
    fav_of_current_user = users.get_current_user().user_id() in fav_index.favorited_by
3. It's also easy to get a list of all the users who have favorited a story (using the method in #2).
Is there an easier way?
Please help. How is this kind of thing normally done?
What you've described is a good solution. You can optimise it further, however: For each favorite, create a 'UserFavorite' entity as a child entity of the relevant Story entry (or equivalently, as a child entity of a UserInfo entry), with the key name set to the user's unique ID. This way, you can determine if a user has favorited a story with a simple get:
UserFavorite.get_by_key_name(user_id, parent=a_story)
get operations are 3 to 5 times faster than queries, so this is a substantial improvement.
I don't want to tackle your actual question, but here's a very small tip: you can replace this code:
if story in user_data.favorites:
    story_is_favorited = True
else:
    story_is_favorited = False
with this single line:
story_is_favorited = (story in user_data.favorites)
You don't even need to put the parentheses around the story in user_data.favorites if you don't want to; I just think that's more readable.
You can make the favorite index like a join on the two models
class FavoriteIndex(db.Model):
    user = db.UserProperty()
    story = db.ReferenceProperty()
or
class FavoriteIndex(db.Model):
    user = db.UserProperty()
    story = db.StringListProperty()
Then your query by user returns one FavoriteIndex object for each story the user has favorited.
You can also query by story to see how many users have Favorited it.
You don't want to be scanning through anything unless you know it is limited to a small size
With your new Design you can lookup if a user has favorited a certain story with a query.
You don't need the UserFavorite class entities.
It is a keys_only query, so not as fast as a get(key) but faster than a normal query.
The FavoriteIndex classes all have the same key_name='favs'.
You can filter based on __key__.
a_story = ......
a_user_id = users.get_current_user().user_id()
favIndexKey = db.Key.from_path('Story', a_story.key().id_or_name(), 'FavoriteIndex', 'favs')
doesFavStory = FavoriteIndex.all(keys_only=True).filter('__key__ =', favIndexKey).filter('favorited_by =', a_user_id).get()
If you use multiple FavoriteIndex entities as children of a Story, you can use the ancestor filter:
doesFavStory = FavoriteIndex.all(keys_only=True).ancestor(a_story).filter('favorited_by =', a_user_id).get()
