I have a Django app with the following models:
class Person(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
class Job(models.Model):
    title = models.CharField(max_length=100)
class PersonJob(models.Model):
    # on_delete is a required argument for ForeignKey since Django 2.0
    person = models.ForeignKey(Person, related_name='person_jobs', on_delete=models.CASCADE)
    job = models.ForeignKey(Job, related_name='person_jobs', on_delete=models.CASCADE)
    is_active = models.BooleanField()
Multiple Person instances can hold the same job at once. I have a Job queryset and am trying to annotate or through some other method attach the names of each person with that job onto each item in the queryset. I want to be able to loop through the queryset and get those names without doing an additional query for each item. The closest I have gotten is the following:
from django.db.models import F
qs = (Job.objects.all().annotate(first_names=F('person_jobs__person__first_name'))
      .annotate(last_names=F('person_jobs__person__last_name')))
This will store the name on the Job instance as I would like; however, if a job has multiple people in it, the queryset will have multiple copies of the same Job in it, each with the name of one person. Instead, I need there to only ever be one instance of a given Job in the queryset, which holds the names of all people in it. I don't care how the values are combined and stored; a list, delimited char field, or really any other standard data type would be fine.
I'm using Django 2.1 and Postgres 10.3. I would strongly prefer to not use any Postgres specific features.
You can use either ArrayAgg or StringAgg. Note that both are PostgreSQL-specific (they live in django.contrib.postgres):
from django.contrib.postgres.aggregates import ArrayAgg, StringAgg
Job.objects.all().annotate(first_names=StringAgg('person_jobs__person__first_name', delimiter=','))
Job.objects.all().annotate(people=ArrayAgg('person_jobs__person__first_name'))
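If the Postgres-specific aggregates are not an option, one portable alternative is to fetch the (job, name) pairs in a single query and group them in Python. This is only a minimal sketch: it assumes the rows come from something like PersonJob.objects.values_list('job_id', 'person__first_name', 'person__last_name'), the helper name group_names_by_job is hypothetical, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def group_names_by_job(rows):
    """Group (job_id, first_name, last_name) tuples into {job_id: [full names]}."""
    names = defaultdict(list)
    for job_id, first_name, last_name in rows:
        names[job_id].append(f"{first_name} {last_name}")
    return dict(names)


# Stand-in for the result of one values_list() query (illustrative data only).
sample_rows = [
    (1, "Ada", "Lovelace"),
    (1, "Grace", "Hopper"),
    (2, "Alan", "Turing"),
]
names_by_job = group_names_by_job(sample_rows)
```

Looping over the Job queryset and reading names_by_job.get(job.pk, []) would then cost two queries in total, however many jobs there are.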
Related
Using the Django ORM, how does one access data from related tables without effectively making a separate call for each record (or redundantly denormalizing data to make it more easily accessible)?
Say I have 3 Models:
class Tournament(models.Model):
    name = models.CharField(max_length=250)
    active = models.BooleanField(null=True, default=True)
class Team(models.Model):
    name = models.CharField(max_length=250)
    coach_name = models.CharField(max_length=250)
    active = models.BooleanField(null=True, default=True)
class Player(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.DO_NOTHING
    )
    number = models.PositiveIntegerField()
    age = models.PositiveIntegerField()
    active = models.BooleanField(null=True, default=True)
Note that this Player model is important in the application as it's a major connection to most of the models - from registration to teams to stats to results to prizes. But this Player model doesn't actually contain the person's name as the model contains a user field which is the foreign key to a custom AUTH_USER_MODEL ('user') model which contains first/last name information. This allows the player to log in to the application and perform certain actions.
In addition to these base models, say that since a player can play on different teams in different tournaments, I also have a connecting ManyToMany model:
class PlayerToTeam(models.Model):
    player = models.ForeignKey(
        Player,
        on_delete=models.DO_NOTHING
    )
    team = models.ForeignKey(
        Team,
        on_delete=models.DO_NOTHING
    )
    tournament = models.ForeignKey(
        Tournament,
        on_delete=models.DO_NOTHING
    )
As an example of one of the challenges I'm encountering, let's say I'm trying to create a form that allows coaches to select their starting lineup. So I need my form to list the names of the Players on a particular Team at a particular Tournament.
Given the tournament and team IDs, I can easily pull back the necessary QuerySet to describe the initial records I'm interested in.
playersOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id])
This returns the QuerySet of the IDs (but only the IDs) of the team, the tournament, and the players. However, the name data is two models away:
PlayerToTeam->[player_id]->Player->[user_id]->User->[first_name] [last_name]
Now, if I pull back only a single record, I could simply do
onlyPlayerOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id]).filter(player=[Player_id]).get()
onlyPlayerOnTeam.player.user.first_name
So if I was only needing to display the names, I believe I could pass the QuerySet in the view return and loop through it in the template and display what I need. But I can't figure out if you can do something similar when I need the names to be displayed as part of a form.
To populate the form, I believe I could loop through the initial QuerySet and build a new datastructure:
playersOnTeam = PlayerToTeam.objects.filter(tournament=[Tournament_id]).filter(team=[Team_id])
allPlayersData = []
for nextPlayer in playersOnTeam:
    playerDetails = {
        "player_id": nextPlayer.player.id,
        "first_name": nextPlayer.player.user.first_name,
        "last_name": nextPlayer.player.user.last_name,
    }
    allPlayersData.append(playerDetails)
form = StartingLineupForm(allPlayersData)
However, I fear that would result in a separate database call for every player/user!
And while that may be tolerable for 6-10 players, for larger datasets, that seems less than ideal. Looping through performing a query for every user seems completely wrong.
Furthermore, what's frustrating is that this would be simple enough with a straight SQL query:
SELECT User.first_name, User.last_name
FROM PlayerToTeam
INNER JOIN Player ON PlayerToTeam.player_id = Player.id
INNER JOIN User ON Player.user_id = User.id
WHERE PlayerToTeam.tournament_id=[tourney_id] AND PlayerToTeam.team_id=[team_id]
But I'm trying to stick to the Django ORM best practices as much as I can and avoid just dropping to SQL queries when I can't immediately figure something out, and I'm all but certain that this isn't so complicated of a situation that I can't accomplish this without resorting to direct SQL queries.
I'm starting to look at select_related and prefetch_related, but I'm having trouble wrapping my head around how those work for relations more than a single table connection away. Like it seems like I could access the Player.age data using the prefetch, but I don't know how to get to User.first_name from that.
Any help would be appreciated.
I would suggest two approaches:
A) select_related (one DB query):
objects = PlayerToTeam.objects.filter(
    ...
).select_related(
    'player__user',
).only('player__user__first_name')
# The user is reached through the player, not directly from PlayerToTeam
name = objects.first().player.user.first_name
B) annotate (one DB query):
from django.db.models import F
objects = PlayerToTeam.objects.filter(
    ...
).annotate(
    player_first_name=F('player__user__first_name'),
)
name = objects.first().player_first_name
To be sure you have only one object for specific player, team and tournament, I would suggest adding unique_together:
class PlayerToTeam(models.Model):
    ...
    class Meta:
        unique_together = ('player', 'team', 'tournament', )
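For the starting-lineup form specifically, rows fetched with select_related('player__user') can be turned into choices without further queries. The following is a rough sketch rather than a finished form: build_lineup_choices and fake_assignment are hypothetical helpers, and SimpleNamespace objects stand in for PlayerToTeam instances so that the snippet runs without a database.

```python
from types import SimpleNamespace


def build_lineup_choices(assignments):
    """Return (player_id, "First Last") pairs suitable for a form's choices."""
    return [
        (a.player.id, f"{a.player.user.first_name} {a.player.user.last_name}")
        for a in assignments
    ]


def fake_assignment(player_id, first_name, last_name):
    """Build an object shaped like a PlayerToTeam row with player and user loaded."""
    user = SimpleNamespace(first_name=first_name, last_name=last_name)
    return SimpleNamespace(player=SimpleNamespace(id=player_id, user=user))


# Illustrative data only; a real call would pass the select_related() queryset.
choices = build_lineup_choices([
    fake_assignment(7, "Sam", "Lee"),
    fake_assignment(9, "Kim", "Park"),
])
```

With a real queryset, the same function would receive PlayerToTeam.objects.filter(...).select_related('player__user'), and the resulting list could presumably be assigned to the choices of a ChoiceField on the form.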
Suppose I have the two models below and I want to get a queryset of all developers that have games where the platform field matches a certain value. How would I go about that?
class Developer(models.Model):
    name = models.CharField(max_length=100, default="Unknown")
class Game(models.Model):
    name = models.CharField(max_length=300)
    developer = models.ForeignKey(Developer, related_name="games", on_delete=models.CASCADE)
    platform = models.CharField(max_length=40)
I tried a few approaches but can't seem to figure out anything that works.
You can query this with:
Developer.objects.filter(games__platform='name-of-platform').distinct()
Without the .distinct(), the same developer will be returned multiple times if they developed multiple Games for the same platform. If that is not a problem, you can of course omit the .distinct().
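The duplication comes from the JOIN itself, which can be seen with plain SQL. The sketch below uses the standard-library sqlite3 module with simplified table names (Django would normally prefix them with the app label) and made-up rows; the SQL is only an approximation of what the ORM generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE developer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE game (
        id INTEGER PRIMARY KEY,
        name TEXT,
        developer_id INTEGER REFERENCES developer(id),
        platform TEXT
    );
    INSERT INTO developer VALUES (1, 'Studio A'), (2, 'Studio B');
    INSERT INTO game VALUES
        (1, 'Game 1', 1, 'pc'),
        (2, 'Game 2', 1, 'pc'),
        (3, 'Game 3', 2, 'console');
""")

join_sql = """
    SELECT {distinct} developer.name
    FROM developer
    INNER JOIN game ON game.developer_id = developer.id
    WHERE game.platform = ?
    ORDER BY developer.name
"""

# Without DISTINCT, Studio A appears once per matching game.
with_duplicates = [row[0] for row in conn.execute(join_sql.format(distinct=""), ("pc",))]
# With DISTINCT, each developer appears once.
deduplicated = [row[0] for row in conn.execute(join_sql.format(distinct="DISTINCT"), ("pc",))]
conn.close()
```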
I want to implement something like the pattern introduced in this answer. For example, I have four models like this:
class Property(models.Model):
    property_type = models.CharField(choices=TYPE_CHOICES, default='float', max_length=100)
    property_name = models.CharField(max_length=100)
class FloatProperty(models.Model):
    property_id = models.ForeignKey(Property, related_name='value')
    value = models.FloatField(default=0.0)
class IntProperty(models.Model):
    property_id = models.ForeignKey(Property, related_name='value')
    value = models.IntegerField(default=0)
class StringProperty(models.Model):
    property_id = models.ForeignKey(Property, related_name='value')
    value = models.CharField(max_length=100, blank=True, default='')
After defining these classes, I do not know how to implement the serializer or view classes. For example, in the serializer I want a single value field whose type depends on the object currently being serialized or deserialized (property_type determines it).
I am new to django and rest framework too, please give me some suggestions for implementing such models and serializers.
Edit:
In general, I want to build models that let me store properties defined at run time, with values of different types, and query them later. For example, I have a store with different goods, each of which has its own specific properties, and I want to query the goods by a specific property.
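One possible starting point, independent of Django REST framework, is to keep the type dispatch in a small function that a serializer could call when converting the value field. This is only a sketch under assumptions: TYPE_CHOICES is not shown in the question, so the keys 'int' and 'string' are guesses (only 'float' appears above), and CASTERS and coerce_value are hypothetical names.

```python
# Hypothetical mapping from property_type to a Python caster; the keys
# 'int' and 'string' are assumed, only 'float' appears in the question.
CASTERS = {"float": float, "int": int, "string": str}


def coerce_value(property_type, raw_value):
    """Convert raw_value to the Python type named by property_type."""
    try:
        caster = CASTERS[property_type]
    except KeyError:
        raise ValueError(f"unknown property type: {property_type!r}")
    return caster(raw_value)
```

For instance, coerce_value("float", "1.5") yields a float and coerce_value("string", 7) yields a str; the same lookup could also decide which of the three value tables a row belongs in.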
Suppose I have the following models:
class User(models.Model):
    # ... some fields
class Tag(models.Model):
    # ... some fields
class UserTag(models.Model):
    user = models.ForeignKey(User, related_name='tags')
    tag = models.ForeignKey(Tag, related_name='users')
    date_removed = models.DateTimeField(null=True, blank=True)
Now let's say I want to get all the users that have a given tag that has not yet been removed (i.e. date_removed=None). If I didn't have to worry about the date_removed constraint, I could do:
User.objects.filter(tags__tag=given_tag)
But I want to get all users who have that given tag and have the tag without a date_removed on it. Is there an easy way in Django to get that in a single queryset? And assume I have millions of Users, so getting any sort of list of User IDs and keeping it in memory is not practical.
Your filter() call can include multiple constraints:
User.objects.filter(tags__tag=given_tag, tags__date_removed=None)
When they match, both conditions apply to the same related row (the same UserTag, reached through tags), not to two possibly different ones.
See the documentation on spanning multi-valued relationships;
in particular, the difference between filter(a, b) and filter(a).filter(b).
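The difference can be illustrated at the SQL level. The sketch below uses the standard-library sqlite3 module with a simplified usertag table (the user and tag tables are left out) and made-up rows; the two statements only approximate the shape of the SQL the ORM would produce for each form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usertag (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        tag_id INTEGER,
        date_removed TEXT
    );
    -- User 1: tag 10 was removed, tag 20 is still active.
    -- User 2: tag 10 is still active.
    INSERT INTO usertag VALUES
        (1, 1, 10, '2020-01-01'),
        (2, 1, 20, NULL),
        (3, 2, 10, NULL);
""")

# Shape of filter(tags__tag=10, tags__date_removed=None): a single usertag
# alias, so both conditions must hold on the same row.
same_row = [r[0] for r in conn.execute("""
    SELECT DISTINCT t.user_id FROM usertag AS t
    WHERE t.tag_id = 10 AND t.date_removed IS NULL
    ORDER BY t.user_id
""")]

# Shape of filter(tags__tag=10).filter(tags__date_removed=None): two aliases,
# so each condition may be satisfied by a different usertag row.
any_rows = [r[0] for r in conn.execute("""
    SELECT DISTINCT a.user_id FROM usertag AS a
    INNER JOIN usertag AS b ON b.user_id = a.user_id
    WHERE a.tag_id = 10 AND b.date_removed IS NULL
    ORDER BY a.user_id
""")]
conn.close()
```

Here the chained form also returns user 1, whose tag 10 was removed but who still has some other active tag, which is not what the question asks for.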
I have a model 'Status' with a ManyToManyField 'groups'. Each group has a ManyToManyField 'users'. I want to get all the users for a certain status. I know I can do a for loop on the groups and add all the users to a list. But the users in the groups can overlap, so I have to check whether the user is already in the list. Is there a more efficient way to do this using queries?
edit: The status has a list of groups. Each group has a list of users. I want to get the list of users from all the groups for one status.
Models
class Status(geomodels.Model):
    class Meta:
        ordering = ['-date']
    def __unicode__(self):
        username = self.user.user.username
        return "{0} - {1}".format(username, self.text)
    user = geomodels.ForeignKey(UserProfile, related_name='statuses')
    date = geomodels.DateTimeField(auto_now=True, db_index=True)
    groups = geomodels.ManyToManyField(Group, related_name='receivedStatuses', null=True, blank=True)
class Group(models.Model):
    def __unicode__(self):
        return self.name + " - " + self.user.user.username
    name = models.CharField(max_length=64, db_index=True)
    members = models.ManyToManyField(UserProfile, related_name='groupsIn')
    user = models.ForeignKey(UserProfile, related_name='groups')
I ended up creating a list of the groups I was looking for and then querying all users that were in any of those groups. This should be pretty efficient, as it takes two queries in total (one for the groups, one for the users) regardless of the number of groups. The .distinct() keeps a user who belongs to several of the groups from being returned more than once.
statusGroups = list(status.groups.all())
users = UserProfile.objects.filter(groupsIn__in=statusGroups).distinct()
As you haven't posted your models, it's a bit difficult to give you a Django queryset answer, but you can solve your overlapping problem by adding your users to a set, which doesn't allow duplicates. For example:
from collections import defaultdict
users_by_status = defaultdict(set)
for i in Status.objects.all():
    for group in i.group_set.all():
        users_by_status[i].add(group.user.pk)
Based on your posted model code, the query for a given status is:
UserProfile.objects.filter(groupsIn__receivedStatuses=some_status).distinct()
I'm not 100% sure that the distinct() call is necessary, but I seem to recall that you'd risk duplicates if a given UserProfile were in multiple groups that share the same status. The main point is that filtering across many-to-many relationships works using the usual double-underscore notation, with the names defined either by related_name or by the default related name.
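Whether duplicates appear can be checked against the join that such a filter implies. The sketch below uses the standard-library sqlite3 module with two simplified many-to-many join tables (the table and column names are illustrative, not Django's actual generated names) and made-up rows in which one user belongs to two groups that received the same status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status_groups (status_id INTEGER, group_id INTEGER);
    CREATE TABLE group_members (group_id INTEGER, userprofile_id INTEGER);
    -- Status 1 was sent to groups 1 and 2; user 100 is a member of both.
    INSERT INTO status_groups VALUES (1, 1), (1, 2);
    INSERT INTO group_members VALUES (1, 100), (1, 101), (2, 100);
""")

sql = """
    SELECT {distinct} gm.userprofile_id
    FROM group_members AS gm
    INNER JOIN status_groups AS sg ON sg.group_id = gm.group_id
    WHERE sg.status_id = ?
    ORDER BY gm.userprofile_id
"""
# One row per (user, group) pair, so user 100 shows up twice.
raw_ids = [r[0] for r in conn.execute(sql.format(distinct=""), (1,))]
# DISTINCT collapses the repeated user.
unique_ids = [r[0] for r in conn.execute(sql.format(distinct="DISTINCT"), (1,))]
conn.close()
```

Under these assumptions the plain join does repeat the shared member, which suggests keeping the distinct() call.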