What I have
I have an app that archives chess tournaments. The app includes the following models:
from django.db import models

class Tournament(models.Model):
    name = models.CharField(max_length=128)

class Player(models.Model):
    name = models.CharField(max_length=128)

# Abstract base class
class Match(models.Model):
    tournament = models.ForeignKey(Tournament)
    playerA = models.ForeignKey(Player, related_name='%(class)s_A')  # e.g. mastertournament_A
    playerB = models.ForeignKey(Player, related_name='%(class)s_B')
    score = models.CharField(max_length=16)

    class Meta:
        abstract = True

# here are tables of ``Match`` instances played out in a particular
# tournaments. All ``Match`` instances share the same fields
# so, I could also have one big table for all matches but I want to keep
# each Tournament in separate table for easiness.
class MasterTournament(Match):
    pass

class AmateurTournament(Match):
    pass
Now, I plan to have two different views: tournament_view (lists all matches played in a tournament) and player_view (lists all matches a player has played across all tournaments).
Problem to solve
Given the views I mentioned, I need to perform two different queries for each.
In a tournament_view I will have filters (Choice Filter) playerA and playerB and I need to dynamically populate choices for them. This can easily be done with:
playersA_all = MasterTournament.objects.values_list('playerA')
playersB_all = MasterTournament.objects.values_list('playerB')
However, I am struggling to come up with the query for player_view. This view is very similar with Choice Filters playerA and playerB but now, for the choices I need to query all Tournament tables to get all opponents of the player who is being viewed. This will result in a bunch of database hits each time and in the process I'll need to introduce a temporary list to save and append results from different tables.
That's why I am feeling like I need to reorganize my models, but the only solution that comes to my mind is to have that huge one table with all tournaments' matches packed together, something I wanted to prevent from happening.
My question is: do you have any ideas on how to tweak my models, or does Django perhaps provide a way to perform the query I need for player_view?
I've actually done something like this before, though I wasn't using Django to do it. The concept of getting all the opponents is a problem when the number of matches gets large. I was able to leverage my solution to also keep track of wins and losses, without having to calculate on the fly.
See www.eurosportscoreboard.com.
Anyway, the way I solved it was with triggers. You could do the same with a save signal.
Create an Opponent model with a fk relationship with Player and Match. When a Match is saved, create an Opponent for each player. The write will be a little slow, but the reads will be very fast.
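The trade-off described above (pay a little on every write so that reads are cheap) can be illustrated without Django. What follows is only a minimal plain-Python sketch; the names OpponentIndex, record_match and opponents_of are invented for the illustration, and in a real project the same bookkeeping would live in an Opponent table filled from a save signal.

```python
from collections import defaultdict


class OpponentIndex:
    """Hypothetical write-time index: player -> set of opponents."""

    def __init__(self):
        self._opponents = defaultdict(set)

    def record_match(self, player_a, player_b):
        # Done once per saved match (the slightly slower write).
        self._opponents[player_a].add(player_b)
        self._opponents[player_b].add(player_a)

    def opponents_of(self, player):
        # A single lookup, with no scan over the matches (the fast read).
        return sorted(self._opponents.get(player, ()))


index = OpponentIndex()
index.record_match("alice", "bob")
index.record_match("alice", "carol")
index.record_match("bob", "carol")
print(index.opponents_of("alice"))
```

The same structure could also carry win and loss counters per pair, which is how the answer avoids calculating them on the fly.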
Instead of having two ForeignKey fields, have one ManyToMany Field:
class Match(models.Model):
    tournament = models.ForeignKey(Tournament)
    players = models.ManyToManyField(Player, through='Participate')
    score = models.CharField(max_length=16)

class Participate(models.Model):
    player = models.ForeignKey(Player)
    match = models.ForeignKey(Match)
    visitor = models.BooleanField()
I think it solves most of your problem, and it also makes a lot more sense, since there's no point in defining one player as A and the other as B. Both are players, and nothing really distinguishes them.
You can't query multiple tables at a time! You can make several queries and union the results.
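As a rough illustration of "several queries, unioned", the sketch below uses Python's standard sqlite3 module rather than the Django ORM. The table and column names (master_match, amateur_match, player_a, player_b) are assumptions standing in for the two tournament tables, and the helper opponents_of is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_match  (player_a TEXT, player_b TEXT);
    CREATE TABLE amateur_match (player_a TEXT, player_b TEXT);
    INSERT INTO master_match  VALUES ('ann', 'bob'), ('cid', 'ann');
    INSERT INTO amateur_match VALUES ('ann', 'dee'), ('bob', 'cid');
""")

MATCH_TABLES = ("master_match", "amateur_match")  # assumed table names


def opponents_of(conn, player):
    """Collect a player's opponents from every match table via UNION."""
    parts, params = [], []
    for table in MATCH_TABLES:
        # Each table contributes two SELECTs, one per side of the board.
        parts.append(f"SELECT player_b FROM {table} WHERE player_a = ?")
        parts.append(f"SELECT player_a FROM {table} WHERE player_b = ?")
        params += [player, player]
    sql = " UNION ".join(parts) + " ORDER BY 1"
    return [row[0] for row in conn.execute(sql, params)]


print(opponents_of(conn, "ann"))
```

Note that every additional tournament table adds two more SELECTs to the statement, which is the growing cost of keeping one table per tournament.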
# so, I could also have one big table for all matches but I want to keep
# each Tournament in separate table for easiness.
This is a bad decision if you will need information from both tables in your query results. Think about how you would query (with 2 tables) for the matches of one tournament, for matches with a positive score, or for matches with specific players: you must query both tables and union the results, which doubles the DB load. In this case I think you should create one table for matches, with one field for the match type, Master or Amateur:
class Match(models.Model):
    tournament = models.ForeignKey(Tournament)
    playerA = models.ForeignKey(Player, related_name='%(class)s_A')  # resolves to match_A here
    playerB = models.ForeignKey(Player, related_name='%(class)s_B')
    score = models.CharField(max_length=16)
    master_or_amauter = models.BooleanField(default=True)  # master by default
And with one table you have no problem in player_view...
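To make the single-table claim concrete, here is a small sketch using the standard sqlite3 module instead of the ORM. The table name chess_match, its columns and the helper opponents_of are assumptions chosen for this illustration; is_master plays the role of the boolean match-type field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chess_match (
        tournament TEXT, player_a TEXT, player_b TEXT, is_master INTEGER
    );
    INSERT INTO chess_match VALUES
        ('masters-1', 'ann', 'bob', 1),
        ('masters-1', 'cid', 'ann', 1),
        ('open-1',    'ann', 'dee', 0),
        ('open-1',    'bob', 'cid', 0);
""")


def opponents_of(conn, player, master_only=False):
    """One statement finds every opponent, whichever side the player sat on."""
    sql = """
        SELECT DISTINCT
            CASE WHEN player_a = ? THEN player_b ELSE player_a END AS opponent
        FROM chess_match
        WHERE (player_a = ? OR player_b = ?)
    """
    if master_only:
        # The match type is just another filter on the same table.
        sql += " AND is_master = 1"
    sql += " ORDER BY opponent"
    return [row[0] for row in conn.execute(sql, (player, player, player))]


print(opponents_of(conn, "ann"))
print(opponents_of(conn, "ann", master_only=True))
```

The query does not change when new tournaments are added, because a tournament is a value in a column rather than a table of its own.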
Related
Using the Django ORM, how does one access data from related tables without effectively making a separate call for each record (or redundantly denormalizing data to make it more easily accessible)?
Say I have 3 Models:
class Tournament(models.Model):
    name = models.CharField(max_length=250)
    active = models.BooleanField(null=True, default=1)

class Team(models.Model):
    name = models.CharField(max_length=250)
    coach_name = models.CharField(max_length=250)
    active = models.BooleanField(null=True, default=1)

class Player(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.DO_NOTHING
    )
    number = models.PositiveIntegerField()
    age = models.PositiveIntegerField()
    active = models.BooleanField(null=True, default=1)
Note that this Player model is important in the application as it's a major connection to most of the models - from registration to teams to stats to results to prizes. But this Player model doesn't actually contain the person's name as the model contains a user field which is the foreign key to a custom AUTH_USER_MODEL ('user') model which contains first/last name information. This allows the player to log in to the application and perform certain actions.
In addition to these base models, say that since a player can play on different teams in different tournaments, I also have a connecting ManyToMany model:
class PlayerToTeam(models.Model):
    player = models.ForeignKey(
        Player,
        on_delete=models.DO_NOTHING
    )
    team = models.ForeignKey(
        Team,
        on_delete=models.DO_NOTHING
    )
    tournament = models.ForeignKey(
        Tournament,
        on_delete=models.DO_NOTHING
    )
As an example of one of the challenges I'm encountering, let's say I'm trying to create a form that allows coaches to select their starting lineup. So I need my form to list the names of the Players on a particular Team at a particular Tournament.
Given the tournament and team IDs, I can easily pull back the necessary QuerySet to describe the initial records I'm interested in.
playersOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id])
This returns the QuerySet of the IDs (but only the IDs) of the team, the tournament, and the players. However, the name data is two models away:
PlayerToTeam->[player_id]->Player->[user_id]->User->[first_name] [last_name]
Now, if I pull back only a single record, I could simply do
onlyPlayerOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id]).filter(player=[Player_id]).get()
onlyPlayerOnTeam.player.user.first_name
So if I was only needing to display the names, I believe I could pass the QuerySet in the view return and loop through it in the template and display what I need. But I can't figure out if you can do something similar when I need the names to be displayed as part of a form.
To populate the form, I believe I could loop through the initial QuerySet and build a new datastructure:
playersOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id])
allPlayersData = []
for nextPlayer in playersOnTeam:
    playerDetails = {
        "player_id": nextPlayer.player.id,
        "first_name": nextPlayer.player.user.first_name,
        "last_name": nextPlayer.player.user.last_name,
    }
    allPlayersData.append(playerDetails)
form = StartingLineupForm(allPlayersData)
However, I fear that would result in a separate database call for every player/user!
And while that may be tolerable for 6-10 players, for larger datasets, that seems less than ideal. Looping through performing a query for every user seems completely wrong.
Furthermore, what's frustrating is that this would be simple enough with a straight SQL query:
SELECT User.first_name, User.last_name
FROM PlayerToTeam
INNER JOIN Player ON PlayerToTeam.player_id = Player.id
INNER JOIN User ON Player.user_id = User.id
WHERE PlayerToTeam.tournament_id=[tourney_id] AND PlayerToTeam.team_id=[team_id]
But I'm trying to stick to the Django ORM best practices as much as I can and avoid just dropping to SQL queries when I can't immediately figure something out, and I'm all but certain that this isn't so complicated of a situation that I can't accomplish this without resorting to direct SQL queries.
I'm starting to look at select_related and prefetch_related, but I'm having trouble wrapping my head around how those work for relations more than a single table connection away. Like it seems like I could access the Player.age data using the prefetch, but I don't know how to get to User.first_name from that.
Any help would be appreciated.
I would suggest two approaches:
A) select related (one DB query):
objects = PlayerToTeam.objects.filter(
    ...
).select_related(
    'player__user',
).only('player__user__first_name')
# The path goes through player, so the attribute access must too.
name = objects.first().player.user.first_name
B) annotate (one DB query):
from django.db.models import F

objects = PlayerToTeam.objects.filter(
    ...
).annotate(
    player_name=F('player__user__first_name'),
)
name = objects.first().player_name
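To see why both approaches are described as "one DB query", the sketch below contrasts lazy per-row lookups with a single JOIN. It uses the standard sqlite3 module, not Django; the table names, the CountingDB wrapper and both helper functions are assumptions made up for the illustration.

```python
import sqlite3


class CountingDB:
    """Tiny wrapper that counts executed statements (illustration only)."""

    def __init__(self):
        self.queries = 0
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript("""
            CREATE TABLE auth_user (id INTEGER PRIMARY KEY, first_name TEXT);
            CREATE TABLE player (id INTEGER PRIMARY KEY, user_id INTEGER);
            CREATE TABLE player_to_team (player_id INTEGER, team_id INTEGER);
            INSERT INTO auth_user VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
            INSERT INTO player VALUES (10, 1), (20, 2), (30, 3);
            INSERT INTO player_to_team VALUES (10, 7), (20, 7), (30, 7);
        """)

    def run(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def names_lazy(db, team_id):
    # One query for the link rows, then two more per row (player, then user).
    names = []
    links = db.run(
        "SELECT player_id FROM player_to_team WHERE team_id = ? ORDER BY player_id",
        (team_id,),
    )
    for (player_id,) in links:
        (user_id,) = db.run("SELECT user_id FROM player WHERE id = ?", (player_id,))[0]
        (name,) = db.run("SELECT first_name FROM auth_user WHERE id = ?", (user_id,))[0]
        names.append(name)
    return names


def names_joined(db, team_id):
    # A single statement that follows both foreign keys.
    rows = db.run(
        """
        SELECT auth_user.first_name
        FROM player_to_team
        JOIN player ON player_to_team.player_id = player.id
        JOIN auth_user ON player.user_id = auth_user.id
        WHERE player_to_team.team_id = ?
        ORDER BY player.id
        """,
        (team_id,),
    )
    return [row[0] for row in rows]


db = CountingDB()
lazy = names_lazy(db, 7)
lazy_queries = db.queries
db.queries = 0
joined = names_joined(db, 7)
joined_queries = db.queries
print(lazy_queries, joined_queries)
```

Both functions return the same names; only the number of statements differs, and the lazy version grows with the size of the team.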
To be sure you have only one object for specific player, team and tournament, I would suggest adding unique_together:
class PlayerToTeam(models.Model):
    ...

    class Meta:
        unique_together = ('player', 'team', 'tournament', )
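At the database level this kind of constraint amounts to a multi-column UNIQUE index. The following sketch shows the effect with the standard sqlite3 module; the table layout is an assumption that mirrors the three foreign keys and is not taken from an actual migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE player_to_team (
        player_id INTEGER, team_id INTEGER, tournament_id INTEGER,
        UNIQUE (player_id, team_id, tournament_id)
    )
""")
conn.execute("INSERT INTO player_to_team VALUES (1, 7, 3)")
conn.execute("INSERT INTO player_to_team VALUES (1, 8, 3)")  # other team: allowed

try:
    conn.execute("INSERT INTO player_to_team VALUES (1, 7, 3)")  # exact duplicate
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM player_to_team").fetchone()[0]
print(duplicate_rejected, row_count)
```

With the constraint in place, a lookup by player, team and tournament can match at most one row.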
An assembled Team may be deployed over time to multiple Projects
In this example I want to query for Teams assigned to 'active' projects only. The code I am using is working, but I would like to know if there is a more efficient / compact means of doing so.
Models
class Team(ndb.Model):
    """Model for representing a project team."""
    teamid = ndb.StringProperty(required=True)
    # ndb spells the option "repeated"; it cannot be combined with required.
    project = ndb.KeyProperty(kind='Project', repeated=True)

class Project(ndb.Model):
    """Model for representing a Project"""
    name = ndb.StringProperty(required=True)
    status = ndb.StringProperty(required=True)
Query
status = 'active'
project_query = Project.query()\
    .filter(Project.status == status)
active_projects = project_query.fetch(1000, keys_only=True)
# The property on Team is named "project", so filter on that.
team_query = Team.query().order(Team.teamid)\
    .filter(Team.project.IN(active_projects))
results = team_query.fetch(max_results, offset=start_at)
Compact form (essentially the same thing)
team_query = Team.query().order(Team.teamid)\
    .filter(Team.project.IN(Project.query().filter(Project.status == status)
                            .fetch(1000, keys_only=True)))
Is there a better way?
You could de-normalize your data model by "redundantly" having a project_status property on the Team entity as well, maintained to be the same as the status property on the corresponding Project entity.
The downside of course is that "changing a Project's status" also needs to find all Teams assigned to that Project and correspondingly change their project_status (and you may need a multiple-entity-groups transaction for that). But the big upside is that locating all Teams assigned to Projects in a certain status becomes much faster, a single simple query.
This trade-off is pretty typical for decisions related to de-normalizing data models. If, in your application, Project status changes relatively rarely (and perhaps typically few Teams are assigned to a given Project), while queries for "all Teams assigned to Projects in a certain status" are frequent and need to be fast, then the de-normalization will be a worthy optimization.
Waxing even more abstract, which de-normalizations are worthwhile is always highly dependent on specific application constraints -- which queries or updates are rare or frequent, what performance goals are there for each operation. You also always pay the price of having slightly more data around, since a few things are duplicated. On the other hand, many applications have relatively rare writes/updates, and pretty frequent reads/queries, which does tend to favor judicious de-normalizations.
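The trade-off can be sketched in plain Python, without ndb. Everything below is a hypothetical illustration: the dataclasses, the project_statuses field and both helper functions are invented names, and because a Team here can belong to several Projects, the denormalized copy is a set of statuses rather than a single value.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    status: str


@dataclass
class Team:
    teamid: str
    projects: list = field(default_factory=list)
    # Denormalized copy of the statuses of all assigned projects.
    project_statuses: set = field(default_factory=set)


def set_project_status(project, new_status, all_teams):
    """The expensive write: touch every team that copies this project's status."""
    project.status = new_status
    for team in all_teams:
        if any(p is project for p in team.projects):
            team.project_statuses = {p.status for p in team.projects}


def teams_with_status(all_teams, status):
    """The cheap read: no Project lookup is needed at all."""
    return [t.teamid for t in all_teams if status in t.project_statuses]


alpha = Project("alpha", "active")
beta = Project("beta", "closed")
teams = [
    Team("t1", [alpha], {"active"}),
    Team("t2", [beta], {"closed"}),
    Team("t3", [alpha, beta], {"active", "closed"}),
]
before = teams_with_status(teams, "active")
set_project_status(alpha, "closed", teams)
after = teams_with_status(teams, "active")
print(before, after)
```

The read never consults Project, while the write has to visit every affected Team, which is the cost the answer asks you to weigh against query frequency.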
Because you have a many-to-many relationship between Teams and Projects, it's your choice whether to have Teams with:
project = ndb.KeyProperty(kind='Project', repeated=True)
or Projects with:
team = ndb.KeyProperty(kind='Team', repeated=True)
You can improve on your query-in-a-query by using only one query and an ndb.get_multi(). Queries are slow, Gets are faster (especially when memcached). By changing your models, you can improve your queries.
class Team(ndb.Model):
    """Model for representing a project team."""
    teamid = ndb.StringProperty(required=True)

class Project(ndb.Model):
    """Model for representing a Project"""
    name = ndb.StringProperty(required=True)
    status = ndb.StringProperty(required=True)
    # ndb spells the option "repeated"; it cannot be combined with required.
    team = ndb.KeyProperty(kind='Team', repeated=True)
Now you can query by status and perform a get:
status = 'active'
project_query = Project.query()\
    .filter(Project.status == status)
# Fetch full entities (not keys_only) so that project.team is available.
active_projects = project_query.fetch(1000)
teams = set()
for project in active_projects:
    teams.update(project.team)  # a set has update(), not extend()
results = ndb.get_multi(list(teams))
(Credit to Tim Hoffman for commenting on this approach.)
Before I start: my understanding of Django is at a beginner level, and I could not find adequate help through Google.
I'll start with an example:
class Player(models.Model):
    ...

class Tournament(models.Model):
    ...
    first_place = models.ForeignKey(Player)
    second_place = models.ForeignKey(...)
    third_place = models.ForeignKey(...)
My problem is: there are multiple people in first place, second place and so on. How can I realize the model in a way which lets me add my own number of Players every time?
I already tried ManyToMany instead of ForeignKey, but then I get an error in the admin when I try to save a Tournament object, stating that an ID has to be present for the object, even when I do not select any Players to be added.
I don't know if I understand the question correctly, but if you want to make the places optional and also want to add multiple Players, you could use a ManyToManyField with blank=True (null has no effect on a ManyToManyField, so it can be left out):
class Tournament(...):
    ...
    first_place = models.ManyToManyField(Player, blank=True)
    ...
This would solve your problem: you can now freely assign any place to any player. Calling .count() in your view gives you the number of objects for the selected players. There is no need for a ManyToManyField when you can simply record a place per object, with every player able to hold an unlimited number of places in each category, if I understand what you're trying to do here. Comment if you need more help.
class FirstPlace(models.Model):
    first = models.ForeignKey(Player)

class SecondPlace(models.Model):
    second = models.ForeignKey(Player)

class ThirdPlace(models.Model):
    third = models.ForeignKey(Player)
I've a Django model
class Person(models.Model):
    name = models.CharField(max_length=50)
    team = models.ForeignKey(Team)
And a team model
class Team(models.Model):
    name = models.CharField(max_length=50)
Then, I would like to add a 'coach' property which is a one to one relationship to person. If I am not wrong, I have two ways of doing it.
The first approach would be adding the field to Team:
class Team(models.Model):
    name = models.CharField(max_length=50)
    coach = models.OneToOneField(Person, related_name='master')
The second one would be creating a new model:
class TeamCoach(models.Model):
    team = models.OneToOneField(Team)
    coach = models.OneToOneField(Person)
Is this right? Is there a big difference for practical purposes? What are the pros and cons of each approach?
I would say NEITHER: every Person has a Team, and if every Team also has a Coach, the relationship becomes circular and somewhat redundant.
It is cleaner and more direct to add a field called type to Person, something like:
class Person(models.Model):
    # use _ if you care about i18n
    # choices must be a sequence of (value, label) pairs
    TYPES = (
        ('member', 'member'),
        ('coach', 'coach'),
    )
    name = models.CharField(max_length=50)
    team = models.ForeignKey(Team)
    type = models.CharField(max_length=20, choices=TYPES)
Although I would seriously consider refactoring Person to be more generic and get Team to have a ManyToMany to Person... in that case, you can re-use Person in other areas, like Cheerleaders.
class Person(models.Model):
    # use _ if you care about i18n
    # choices must be a sequence of (value, label) pairs
    TYPES = (
        ('member', 'member'),
        ('coach', 'coach'),
    )
    name = models.CharField(max_length=50)
    type = models.CharField(max_length=20, choices=TYPES)

class Team(models.Model):
    name = models.CharField(max_length=50)
    member = models.ManyToManyField(Person, related_name='master')
Make your models more generic and DRY, they should be easily manageable and not tightly coupled to certain fields (unless absolutely necessary), then the models are more future proof and will not fall under migration nightmare that easily.
Hope this helps.
I can't agree so easily with @Anzel, and since the title of the question is
What are the benefits of having two models instead of one?
I'll try to give my two cents. But before I start, I want to quote a few passages from the docs.
It doesn’t matter which model has the ManyToManyField, but you should
only put it in one of the models – not both.
Generally, ManyToManyField instances should go in the object that’s
going to be edited on a form. In the above example, toppings is in
Pizza (rather than Topping having a pizzas ManyToManyField ) because
it’s more natural to think about a pizza having toppings than a
topping being on multiple pizzas. The way it’s set up above, the Pizza
form would let users select the toppings.
Basically, that's the first thing to keep in mind when creating an M2M relation (your TeamCoach model is one, but more on that in a second): which object holds the relation? What would be more suitable for your problem: choosing a coach for a team when you create the team, or choosing a team for a person when you create the person? If you ask me, I would prefer the second variant and keep the teams inside the Person class.
Now let's go to the next section of the docs
Extra fields on many-to-many relationships
When you’re only dealing with simple many-to-many relationships such
as mixing and matching pizzas and toppings, a standard ManyToManyField
is all you need. However, sometimes you may need to associate data
with the relationship between two models.
For example, consider the case of an application tracking the musical
groups which musicians belong to. There is a many-to-many relationship
between a person and the groups of which they are a member, so you
could use a ManyToManyField to represent this relationship. However,
there is a lot of detail about the membership that you might want to
collect, such as the date at which the person joined the group.
For these situations, Django allows you to specify the model that will
be used to govern the many-to-many relationship. You can then put
extra fields on the intermediate model. The intermediate model is
associated with the ManyToManyField using the through argument to
point to the model that will act as an intermediary.
That's actually the answer to your question: an intermediate model gives you the ability to store additional data about the relationship. Consider the situation where a coach moves to another team next season: if you just update the M2M relation, you will lose track of the teams he coached in the past, and you will never be able to answer the question of who coached a given team in year XXX. So if you need more data, go with an intermediate model. This is also where @Anzel goes wrong: the type field is additional data belonging to that intermediate model, so its place is inside it.
Now here is how I would probably create the relations:
class Person(models.Model):
    name = models.CharField(max_length=50)
    teams = models.ManyToManyField('Team', through='TeamRole')

class Team(models.Model):
    name = models.CharField(max_length=50)

class TeamRole(models.Model):
    COACH = 1
    PLAYER = 2
    CHEERLEADER = 3
    ROLES = (
        (COACH, 'Coach'),
        (PLAYER, 'Player'),
        (CHEERLEADER, 'Cheerleader'),
    )
    team = models.ForeignKey(Team)
    person = models.ForeignKey(Person)
    role = models.IntegerField(choices=ROLES)
    date_joined = models.DateField()
    date_left = models.DateField(blank=True, null=True, default=None)
How will I query this? Well, I can use the role to get what type of persons I'm looking for, and I can also use the date_left field to get the current persons participating in that team right now. Here are a few example methods:
class Person(models.Model):
    #...
    def get_current_team(self):
        return self.teams.filter(teamrole__date_left__isnull=True).get()

class Team(models.Model):
    #...
    def _get_persons_by_role(self, role, only_active):
        persons = self.person_set.filter(teamrole__role=role)
        if only_active:
            return persons.filter(teamrole__date_left__isnull=True)
        return persons

    def get_coaches(self, only_active=True):
        return self._get_persons_by_role(TeamRole.COACH, only_active)

    def get_players(self, only_active=True):
        return self._get_persons_by_role(TeamRole.PLAYER, only_active)
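The date_joined and date_left fields are what make the historical question ("who coached this team in year XXX?") answerable. Here is a minimal plain-Python sketch of that lookup, independent of Django; the dataclass mirrors the TeamRole fields, while the function people_on and the sample data are assumptions made up for the illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

COACH, PLAYER = 1, 2


@dataclass
class TeamRole:
    team: str
    person: str
    role: int
    date_joined: date
    date_left: Optional[date] = None  # None means "still active"


def people_on(roles, team, role, day):
    """Return who held `role` on `team` on the given day."""
    return [
        r.person
        for r in roles
        if r.team == team
        and r.role == role
        and r.date_joined <= day
        and (r.date_left is None or day < r.date_left)
    ]


roles = [
    TeamRole("Lions", "Ann", COACH, date(2012, 1, 1), date(2014, 1, 1)),
    TeamRole("Lions", "Bob", COACH, date(2014, 1, 1)),
    TeamRole("Tigers", "Ann", COACH, date(2014, 1, 1)),
]
coach_2013 = people_on(roles, "Lions", COACH, date(2013, 6, 1))
coach_2015 = people_on(roles, "Lions", COACH, date(2015, 6, 1))
print(coach_2013, coach_2015)
```

Because Ann's move to another team is recorded as a closed row plus a new row, nothing about her earlier stint is overwritten.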
Is there a way for a user in localhost/admin to create a new class automatically, rather than having to add the code manually?
For example, let's say I have an app that tracks the statistics of players on a baseball team, and I have defined a class for the statistics of the players on the team. If I want the user to be able to create a new table to house statistics of players on a different team (with exactly the same fields as the previous team), is there a way for the user to do this in localhost/admin?
You have to think of models as representations of your database. You don't want to dynamically create tables in your database. It needs consistent structure for sanity's sake.
That said, what you are describing can be done with related models. For example, a structure like this will allow you to add as many teams as you'd like:
class Team(models.Model):
    team_name = ...
    # Whatever other attributes.

class Player(models.Model):
    first_name = ...
    last_name = ...
    # Assign players to a team.
    team = models.ForeignKey(Team)
    # Whatever stats you want to keep on the players
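With this structure, a new team is a new row rather than a new table. The sketch below shows that with the standard sqlite3 module; the hits column, the helper add_team and the team names are hypothetical stand-ins for "whatever stats you want to keep".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team (id INTEGER PRIMARY KEY, team_name TEXT);
    CREATE TABLE player (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        team_id INTEGER REFERENCES team(id),
        hits INTEGER  -- hypothetical stat column
    );
""")


def add_team(conn, name, players):
    """Adding a team inserts rows; the schema itself never changes."""
    cursor = conn.execute("INSERT INTO team (team_name) VALUES (?)", (name,))
    team_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO player (first_name, team_id, hits) VALUES (?, ?, ?)",
        [(first_name, team_id, hits) for first_name, hits in players],
    )
    return team_id


add_team(conn, "Hawks", [("Ann", 12), ("Bob", 7)])
add_team(conn, "Owls", [("Cid", 9)])

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
hits_by_team = conn.execute("""
    SELECT team.team_name, SUM(player.hits)
    FROM player JOIN team ON player.team_id = team.id
    GROUP BY team.team_name
    ORDER BY team.team_name
""").fetchall()
print(tables, hits_by_team)
```

After two teams have been added there are still only two tables, and per-team statistics come from grouping on the foreign key.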