Joining and iterating through foreign-key tables with Django - python

I have the following 2 Django models:
from django.db import models
class Stock(models.Model):
    symbol = models.CharField(db_index=True, max_length=5, null=False, editable=False, unique=True)
class PriceHistory(models.Model):
    stock = models.ForeignKey(Stock, related_name='StockHistory_stock', editable=False)
    trading_date = models.DateField(db_index=True, null=False, editable=False)
    price = models.DecimalField(max_digits=12, db_index=True, decimal_places=5, null=False, editable=False)
    class Meta:
        # The constraint must name the actual field, trading_date
        unique_together = ('stock', 'trading_date')
Obviously this leads to two DB tables being created: myapp_stock and myapp_pricehistory. These tables have 2 and 4 columns respectively. The first table contains thousands of rows. The second table contains millions of rows.
I want to join the tables, sort the resulting rows, and iterate through them one by one, printing each. This is how I plan to do it:
for i in PriceHistory.objects.all().order_by('stock__symbol', 'trading_date'):
    print('{} {}: {}'.format(i.stock.symbol, i.trading_date, i.price))
Is this the most efficient way to do it to minimize calls to the database? I want it to run only one SQL query. I'm concerned that the above code will run a separate query of the myapp_stock table each time it goes through the for loop. Is this concern valid? If so, how to avoid that?
Basically, I know the ideal SQL would look something like this. How can I get Django to execute something similar?
select
    s.symbol,
    ph.trading_date,
    ph.price
from
    myapp_stock as s,
    myapp_pricehistory as ph
where
    ph.stock_id = s.id
order by
    s.symbol asc,
    ph.trading_date asc

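Your concern is valid: without it, each access to i.stock triggers its own query. You need to use select_related, which follows the foreign key with a JOIN in the same SQL query, to avoid making an additional query for each item in the loop:
histories = PriceHistory.objects.all().select_related('stock')\
    .order_by('stock__symbol', 'trading_date')
for i in histories:
    print('{} {}: {}'.format(i.stock.symbol, i.trading_date, i.price))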
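To make the difference concrete, here is a rough sketch outside Django, using only the standard-library sqlite3 module. It compares the "one lookup per row" pattern with the single JOIN from the question. The table names mirror the question; the sample rows and the run helper are made up for illustration, and this only approximates what the ORM does internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_stock (id INTEGER PRIMARY KEY, symbol TEXT UNIQUE NOT NULL);
    CREATE TABLE myapp_pricehistory (
        id INTEGER PRIMARY KEY,
        stock_id INTEGER NOT NULL REFERENCES myapp_stock(id),
        trading_date TEXT NOT NULL,
        price REAL NOT NULL,
        UNIQUE (stock_id, trading_date)
    );
    INSERT INTO myapp_stock (id, symbol) VALUES (1, 'AAA'), (2, 'BBB');
    INSERT INTO myapp_pricehistory (stock_id, trading_date, price) VALUES
        (2, '2020-01-02', 20.5), (1, '2020-01-02', 10.5), (1, '2020-01-01', 10.25);
""")

query_log = []  # every statement sent to the database is recorded here


def run(sql, params=()):
    """Hypothetical helper: execute one statement, log it, return all rows."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


# Pattern 1: fetch the price rows, then look up each stock separately (N+1 queries).
lazy = []
for stock_id, trading_date, price in run(
    "SELECT stock_id, trading_date, price FROM myapp_pricehistory"
):
    symbol = run("SELECT symbol FROM myapp_stock WHERE id = ?", (stock_id,))[0][0]
    lazy.append((symbol, trading_date, price))
lazy.sort(key=lambda row: (row[0], row[1]))
lazy_query_count = len(query_log)

# Pattern 2: one JOIN, sorted by the database.
query_log.clear()
joined = run("""
    SELECT s.symbol, ph.trading_date, ph.price
    FROM myapp_pricehistory AS ph
    INNER JOIN myapp_stock AS s ON ph.stock_id = s.id
    ORDER BY s.symbol ASC, ph.trading_date ASC
""")
join_query_count = len(query_log)

print(lazy_query_count, join_query_count)
```

Both patterns produce the same rows; the first needs one query plus one per price row, the second needs exactly one. With millions of price rows that is the difference that matters.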

Related

How to get data from both sides of a many to many join in django

Let's say I have the following models:
class Well(TimeStampMixin, models.Model):
    plate = models.ForeignKey(Plate, on_delete=models.CASCADE, related_name="wells")
    row = models.TextField(null=False)
    column = models.TextField(null=False)
    class Meta:
        unique_together = [["plate", "row", "column"]]
class Antibiotic(TimeStampMixin, models.Model):
    name = models.TextField(null=True, default=None)
class WellConditionAntibiotic(TimeStampMixin, models.Model):
    wells = models.ManyToManyField(Well, related_name="well_condition_antibiotics")
    volume = models.IntegerField(null=True, default=None)
    stock_concentration = models.IntegerField(null=True, default=None)
    dosage = models.FloatField(null=True, default=None)
    antibiotic = models.ForeignKey(
        Antibiotic, on_delete=models.RESTRICT, related_name="antibiotics"
    )
In plain English, there is a set of wells, and each well can contain several different types of antibiotics.
I'm trying to fetch the data of a given well and all of the antibiotics contained inside it.
I've tried WellConditionAntibiotic.objects.filter(wells__id=1).select_related('antibiotic')
which gives me this query:
SELECT
"kingdom_wellconditionantibiotic"."id",
"kingdom_wellconditionantibiotic"."created_at",
"kingdom_wellconditionantibiotic"."updated_at",
"kingdom_wellconditionantibiotic"."volume",
"kingdom_wellconditionantibiotic"."stock_concentration",
"kingdom_wellconditionantibiotic"."dosage",
"kingdom_wellconditionantibiotic"."antibiotic_id",
"kingdom_antibiotic"."id",
"kingdom_antibiotic"."created_at",
"kingdom_antibiotic"."updated_at",
"kingdom_antibiotic"."name"
FROM
"kingdom_wellconditionantibiotic"
INNER JOIN "kingdom_wellconditionantibiotic_wells" ON (
"kingdom_wellconditionantibiotic"."id" = "kingdom_wellconditionantibiotic_wells"."wellconditionantibiotic_id"
)
INNER JOIN "kingdom_antibiotic" ON (
"kingdom_wellconditionantibiotic"."antibiotic_id" = "kingdom_antibiotic"."id"
)
WHERE
"kingdom_wellconditionantibiotic_wells"."well_id" = 1
This gives me all of the antibiotic data, but none of the well data. So I tried
Well.objects.filter(pk=1).select_related(['well_condition_antibiotics', 'antibiotic']).query which errored.
How can I write a Django query that includes all of the well data and all of the well's antibiotic data?
Building on your second attempt using Well: select_related only follows single-valued relations (foreign keys and one-to-one fields), so it cannot traverse the many-to-many relation well_condition_antibiotics. Instead, you have to prefetch the WellConditionAntibiotic rows and select the related antibiotic inside that prefetch, like this:
from django.db.models import Prefetch
well = Well.objects.prefetch_related(
    Prefetch(
        "well_condition_antibiotics",
        queryset=WellConditionAntibiotic.objects.select_related("antibiotic"),
    )
).get(pk=1)  # get() returns a single Well instance rather than a QuerySet
Then you can iterate through the related WellConditionAntibiotic entries, each with its corresponding antibiotic:
for well_condition_antibiotic in well.well_condition_antibiotics.all():
    print(well_condition_antibiotic.antibiotic.name)
You can find more information in the Django documentation for prefetch_related and Prefetch [Django-doc].
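As a rough picture of what a prefetch amounts to, here is a sketch with the standard-library sqlite3 module rather than Django: one query for the wells, one query for the related rows (already joined to their antibiotics), and the stitching done in Python. The simplified table and column names and the sample data are assumptions for illustration, not the schema Django generates.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE well (id INTEGER PRIMARY KEY, well_row TEXT, well_col TEXT);
    CREATE TABLE antibiotic (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE well_condition (
        id INTEGER PRIMARY KEY,
        dosage REAL,
        antibiotic_id INTEGER NOT NULL REFERENCES antibiotic(id)
    );
    CREATE TABLE well_condition_wells (
        well_condition_id INTEGER NOT NULL REFERENCES well_condition(id),
        well_id INTEGER NOT NULL REFERENCES well(id)
    );
    INSERT INTO well VALUES (1, 'A', '1'), (2, 'B', '2');
    INSERT INTO antibiotic VALUES (1, 'ampicillin'), (2, 'kanamycin');
    INSERT INTO well_condition VALUES (1, 0.5, 1), (2, 1.5, 2), (3, 2.0, 1);
    INSERT INTO well_condition_wells VALUES (1, 1), (2, 1), (3, 2);
""")

# Query 1: the wells themselves.
wells = conn.execute(
    "SELECT id, well_row, well_col FROM well WHERE id = ?", (1,)
).fetchall()
well_ids = [w[0] for w in wells]

# Query 2: every related condition for those wells, joined to its antibiotic
# (the JOIN to antibiotic plays the role of select_related inside the prefetch).
placeholders = ", ".join("?" for _ in well_ids)
rows = conn.execute(
    f"""
    SELECT cw.well_id, a.name, c.dosage
    FROM well_condition AS c
    INNER JOIN well_condition_wells AS cw ON cw.well_condition_id = c.id
    INNER JOIN antibiotic AS a ON a.id = c.antibiotic_id
    WHERE cw.well_id IN ({placeholders})
    ORDER BY c.id
    """,
    well_ids,
).fetchall()

# Stitch the two result sets together in Python.
by_well = defaultdict(list)
for well_id, name, dosage in rows:
    by_well[well_id].append((name, dosage))

result = [
    {"id": wid, "row": r, "column": c, "antibiotics": by_well[wid]}
    for wid, r, c in wells
]
print(result)
```

The point of the sketch is that the well data and the antibiotic data arrive in two queries in total, no matter how many conditions a well has.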

Django ORM filter by group with where clause

I have a model of students and opinions. Each student can change their opinion over time. What I ultimately want to do is create a plot showing the number of students with each opinion on a given day, but as a first step I want to count the number of students with each opinion on a given day.
My models are as follows (abbreviated):
class Student(models.Model):
    first_name = models.CharField(max_length=30, null=True, blank=True)
    surname = models.CharField(max_length=30, null=True, blank=True)
class Opinion(models.Model):
    student = models.ForeignKey('Student', on_delete=models.CASCADE, null=True)
    opdate = models.DateField(null=True, blank=True)
    sentiment_choice = [
        ('Positive', 'Positive'),
        ('Negative', 'Negative'),
    ]
    sentiment = models.CharField(
        max_length=40,
        choices=sentiment_choice,
        default="Positive",
        null=True, blank=True
    )
My approach is to loop over all the dates in a range, filter the opinion table to get all the data up to that date, find the latest opinion per student, count these, and load the results into an array.
I know how to filter the opinion table as follows (where start_date is my iterator):
Opinion.objects.filter(opdate__lte=start_date)
I also know how to pick up the latest opinion date for each student:
Opinion.objects.values('student').annotate(latest_date=Max('opdate'))
How would I combine these so that I get the latest opinion for each student that is on or before my iterator date?
I'm working on Django 3.2.12 with an SQLite database.
You can use a Subquery expression [Django-doc], restricting the inner queryset to opinions dated on or before the date you are iterating over:
from django.db.models import OuterRef, Subquery
Student.objects.annotate(
    last_sentiment=Subquery(
        Opinion.objects.filter(
            student_id=OuterRef('pk'),
            opdate__lte=start_date,  # only opinions up to the date being examined
        ).order_by('-opdate').values('sentiment')[:1]
    )
)
Each Student in the resulting queryset will have an extra attribute .last_sentiment containing the sentiment of the most recent related Opinion, or NULL/None if there is no such Opinion.
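The correlated-subquery idea can be checked outside Django with the standard-library sqlite3 module. The sketch below adds a date cutoff to the inner query (in the ORM this would presumably be an opdate__lte=start_date condition in the inner filter) and then counts students per sentiment. The simplified table names, the sample data, and the sentiment_counts helper are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE opinion (
        id INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES student(id),
        opdate TEXT,          -- ISO dates, so string comparison matches date order
        sentiment TEXT
    );
    INSERT INTO student (id, first_name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO opinion (student_id, opdate, sentiment) VALUES
        (1, '2022-01-01', 'Positive'),
        (1, '2022-01-10', 'Negative'),
        (2, '2022-01-05', 'Positive'),
        (3, '2022-02-01', 'Negative');
""")


def sentiment_counts(cutoff):
    """Count students by their latest sentiment on or before `cutoff`.

    Students with no opinion yet are counted under the key None.
    """
    rows = conn.execute(
        """
        SELECT last_sentiment, COUNT(*)
        FROM (
            SELECT (
                SELECT o.sentiment
                FROM opinion AS o
                WHERE o.student_id = s.id AND o.opdate <= ?
                ORDER BY o.opdate DESC
                LIMIT 1
            ) AS last_sentiment
            FROM student AS s
        )
        GROUP BY last_sentiment
        """,
        (cutoff,),
    ).fetchall()
    return dict(rows)


for day in ("2022-01-07", "2022-01-15", "2022-02-01"):
    print(day, sentiment_counts(day))
```

Calling the helper once per date in the range gives the per-day counts needed for the plot. Whether to count students without any opinion (the None key) is a choice left to the plotting code.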

Django order_by query runs incredibly slow in Python, but fast in DB

I have the following models:
class Shelf(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=200, db_index=True)
    slug = models.SlugField(max_length=200, editable=False)
    games = models.ManyToManyField(Game, blank=True, through='SortedShelfGames')
    objects = ShelfManager()
    description = models.TextField(blank=True, null=True)
class SortedShelfGames(models.Model):
    game = models.ForeignKey(Game, on_delete=models.CASCADE)
    shelf = models.ForeignKey(Shelf, on_delete=models.CASCADE)
    date_added = models.DateTimeField()
    order = models.IntegerField(blank=True, null=True)
    releases = models.ManyToManyField(Release)
    objects = SortedShelfGamesManager.as_manager()
class Game(models.Model):
    name = models.CharField(max_length=300, db_index=True)
    sort_name = models.CharField(max_length=300, db_index=True)
    ...
I have a view where I want to get all of a user's SortedShelfGames, distinct on the Game relationship. I then want to be able to sort that list of SortedShelfGames on a few different fields. So right now, I'm doing the following inside of the SortedShelfGamesManager (which inherits from models.QuerySet) to get the list:
games = self.filter(
    pk__in=Subquery(
        # The order_by here picks the row with the earliest date_added per game, for display
        self.filter(shelf__user=user)
            .distinct('game')
            .order_by('game', 'date_added')
            .values('pk')
    )
)
That works the way it's supposed to. However, whenever I add order_by('game__sort_name'), the query takes forever when run from Python; on my site it simply times out. If I take the generated SQL and run it directly against my database, it returns all of the results in a fraction of a second. I can't figure out what I'm doing wrong here. The SortedShelfGames table has millions of records in it, if that matters.
This is the generated SQL:
SELECT
"collection_sortedshelfgames"."id", "collection_sortedshelfgames"."game_id", "collection_sortedshelfgames"."shelf_id", "collection_sortedshelfgames"."date_added", "collection_sortedshelfgames"."order",
(SELECT U0."rating" FROM "reviews_review" U0 WHERE (U0."game_id" = "collection_sortedshelfgames"."game_id" AND U0."user_id" = 1 AND U0."main") LIMIT 1) AS "score",
"games_game"."id", "games_game"."created", "games_game"."last_updated", "games_game"."exact", "games_game"."date", "games_game"."year", "games_game"."quarter", "games_game"."month", "games_game"."name", "games_game"."sort_name", "games_game"."rating_id", "games_game"."box_art", "games_game"."description", "games_game"."slug", "games_game"."giantbomb_id", "games_game"."ignore_giantbomb", "games_game"."ignore_front_page", "games_game"."approved", "games_game"."user_id", "games_game"."last_edited_by_id", "games_game"."dlc", "games_game"."parent_game_id"
FROM
"collection_sortedshelfgames"
INNER JOIN
"games_game"
ON
("collection_sortedshelfgames"."game_id" = "games_game"."id")
WHERE
"collection_sortedshelfgames"."id"
IN (
SELECT
DISTINCT ON (U0."game_id") U0."id"
FROM
"collection_sortedshelfgames" U0
INNER JOIN
"collection_shelf" U1 ON (U0."shelf_id" = U1."id")
WHERE
U1."user_id" = 1
ORDER
BY U0."game_id" ASC, U0."date_added" ASC
)
ORDER BY
"games_game"."sort_name" ASC
I think you don't need a Subquery for this.
Here's what I ended up doing to solve this. Instead of using a Subquery, I evaluated the queryset I had been passing to Subquery into a list of primary keys, then fed that list into my query. It looks like this:
pks = list(self.filter(shelf__user=user).distinct('game').values_list('pk', flat=True))
games = self.filter(pk__in=pks)
games = games.order_by('game__sort_name')
This ended up being pretty fast. It is essentially the same thing as the Subquery method, but whatever was going on under the hood in Python/Django was slowing the Subquery version way down.

django filter get parent to child

I am using Django (3.1.5) and I am trying to query from a parent model through to its child model with a filter.
My models look like this:
class Product(models.Model):
    product_name = models.CharField(max_length=255, unique=True)
    is_feature = models.BooleanField(default=False)
    is_approved = models.BooleanField(default=False)
    created_at = models.DateTimeField(auto_now_add=True)
class ProductGalleryImage(models.Model):
    product = models.ForeignKey(Product, on_delete=models.CASCADE)
    product_gallery_image = models.FileField(upload_to='path')
    is_feature = models.BooleanField(default=False)
I am currently getting the data with this MySQL query:
SELECT * FROM products_product AS pp INNER JOIN products_productgalleryimage AS ppgi ON ppgi.product_id = pp.id WHERE ppgi.is_feature=1 AND pp.is_feature=1 AND is_approved=1 ORDER BY pp.created_at LIMIT 4
How can I get the same data with a Django filter query?
First, you can add a related_name to the foreign key on ProductGalleryImage so that the reverse lookup is easier to read:
product = models.ForeignKey(Product, on_delete=models.CASCADE, related_name='product_images')
Then your query would look like this:
products = Product.objects.filter(is_approved=True, is_feature=True, product_images__is_feature=True).order_by('created_at')[:4]
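One thing to watch with a filter that spans the reverse relation like this: the underlying JOIN returns one row per matching image, so a product with two featured images can appear twice. In the ORM the usual remedy is to add .distinct() before slicing. The sketch below shows the effect at the SQL level with the standard-library sqlite3 module; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products_product (
        id INTEGER PRIMARY KEY,
        product_name TEXT UNIQUE,
        is_feature INTEGER,
        is_approved INTEGER,
        created_at TEXT
    );
    CREATE TABLE products_productgalleryimage (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES products_product(id),
        product_gallery_image TEXT,
        is_feature INTEGER
    );
    INSERT INTO products_product VALUES
        (1, 'lamp', 1, 1, '2021-01-01'),
        (2, 'desk', 1, 1, '2021-01-02'),
        (3, 'sofa', 1, 0, '2021-01-03');   -- not approved, must be excluded
    INSERT INTO products_productgalleryimage
        (product_id, product_gallery_image, is_feature) VALUES
        (1, 'lamp-a.jpg', 1), (1, 'lamp-b.jpg', 1),
        (2, 'desk-a.jpg', 1), (2, 'desk-b.jpg', 0),
        (3, 'sofa-a.jpg', 1);
""")

# The same join as in the question, selecting only product columns.
JOIN_SQL = """
    SELECT {distinct} pp.created_at, pp.product_name
    FROM products_product AS pp
    INNER JOIN products_productgalleryimage AS ppgi ON ppgi.product_id = pp.id
    WHERE ppgi.is_feature = 1 AND pp.is_feature = 1 AND pp.is_approved = 1
    ORDER BY pp.created_at
    LIMIT 4
"""

plain = [name for _, name in conn.execute(JOIN_SQL.format(distinct=""))]
deduped = [name for _, name in conn.execute(JOIN_SQL.format(distinct="DISTINCT"))]

print(plain)
print(deduped)
```

Without DISTINCT the duplicate rows also eat into the LIMIT 4, so fewer than four different products may come back.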
You can simply loop over the related model's objects like so:
for product_gallery_image in product_instance.productgalleryimage_set.all():
    print(product_gallery_image.product_gallery_image)
The productgalleryimage_set here is simply the related model's name in lowercase with _set appended. You can change this by setting the related_name attribute on the foreign key.
Note: this performs one additional query per product instance to fetch that product's ProductGalleryImage objects.
If you want to get the first object only:
product_gallery_image = product_instance.productgalleryimage_set.first()
If you want to perform a join as in your example, so that only one query is run, you can use select_related. This only works in the forward direction (from ProductGalleryImage to Product); for the reverse direction, look at prefetch_related:
product_gallery_images = ProductGalleryImage.objects.all().select_related('product')
for product_gallery_image in product_gallery_images:
    print(product_gallery_image.product.product_name)
    print(product_gallery_image.product_gallery_image)

Need help in Django Orm query

I have three models, as follows:
class Table(models.Model):
    waiter = models.ForeignKey(get_user_model(), on_delete=models.CASCADE,
                               related_name='restaurant_table')
    table_no = models.IntegerField()
    objects = TableManager()
class Order(models.Model):
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    food = models.ManyToManyField(OrderFood, related_name='ordered_food')
    order_status = models.ForeignKey(OrderStatus, on_delete=models.CASCADE)
    table = models.ForeignKey(Table, on_delete=models.CASCADE)
    datetime = models.DateTimeField(default=now)
class OrderStatus(models.Model):
    CHOOSE = (
        ('Received', 'Received'),
        ('Cooking', 'Cooking'),
        ('WaiterHand', 'In Waiter Hand'),
        ('Delivered', 'Delivered'),
        ('Paid', 'Payment Completed'),
        ('Rejected', 'Rejected')
    )
    status = models.CharField(max_length=30, null=False, blank=False, choices=CHOOSE)
    created_at = models.DateTimeField(auto_now=True)
    updated_at = models.DateTimeField()
I am building a restaurant management system. A restaurant has tables, each associated with one or more waiters. I now need a new feature: table status. When an order is actively associated with a table, that table is booked. Implementing this is not the problem, as I can do it in several ways.
One way is to count the active orders associated with the table and, if there are any, report the table as booked.
Another way is to add an extra boolean field to the table as a flag that stores whether the table is booked.
My question is not how to implement it but which approach is better, or whether there is another good solution. Please explain briefly which solution is better and why.
You can add a method decorated with @property to the Table class; you can then use it directly on any Table object, including in templates.
@property
def check_table_status(self):
    status = 'Not Booked'
    # order_set is the default reverse accessor for Order.table
    if self.order_set.all().exists():
        status = 'Booked'
    return status
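Here is a small sketch of why a computed property can be preferable to a stored flag: it is derived from the orders on every access, so it cannot go stale when an order changes. Plain dataclasses stand in for the Django models, and the set of "finished" statuses is an assumption on my part (the question does not say which statuses count as active). In the ORM the same check would presumably be something like self.order_set.exclude(order_status__status__in=[...]).exists().

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: orders in these statuses no longer occupy the table.
FINISHED_STATUSES = {"Paid", "Rejected"}


@dataclass
class Order:
    status: str


@dataclass
class Table:
    table_no: int
    orders: List[Order] = field(default_factory=list)

    @property
    def is_booked(self) -> bool:
        """True while at least one order on this table is still active."""
        return any(o.status not in FINISHED_STATUSES for o in self.orders)


table = Table(table_no=7)
before = table.is_booked               # no orders yet

table.orders.append(Order("Cooking"))
during = table.is_booked               # an active order exists

table.orders[0].status = "Paid"
after = table.is_booked                # the only order is finished

print(before, during, after)
```

A stored boolean would need to be updated in every code path that creates, completes, or rejects an order; the derived version costs one cheap lookup per check instead.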
