I'm working on a web application with Django & PostgreSQL as Backend tech stack.
My models.py has 2 crucial Models defined. One is Product, and the other one Timestamp.
There are thousands of products and every product has multiple timestamps (60+) inside the DB.
The timestamps hold information about the product's performance for a certain date.
class Product:
    owner = models.ForeignKey(AmazonProfile, on_delete=models.CASCADE, null=True)
    state = models.CharField(max_length=8, choices=POSSIBLE_STATES, default="St.Less")
    budget = models.FloatField(null=True)
    product_type = models.CharField(max_length=17, choices=PRODUCT_TYPES, null=True)
    name = models.CharField(max_length=325, null=True)
    parent = TreeForeignKey('self', on_delete=models.CASCADE, null=True, blank=True, related_name="children")

class Timestamp:
    product = models.ForeignKey(Product, null=True, on_delete=models.CASCADE)
    product_type = models.CharField(max_length=35, choices=ADTYPES, blank=True, null=True)
    owner = models.ForeignKey(AmazonProfile, null=True, blank=True, on_delete=models.CASCADE)
    clicks = models.IntegerField(default=0)
    spend = models.IntegerField(default=0)
    sales = models.IntegerField(default=0)
    acos = models.FloatField(default=0)
    cost = models.FloatField(default=0)
    cpc = models.FloatField(default=0)
    orders = models.IntegerField(default=0)
    ctr = models.FloatField(default=0)
    impressions = models.IntegerField(default=0)
    conversion_rate = models.FloatField(default=0)
    date = models.DateField(null=True)
I'm using the data for a dashboard, where users are supposed to be able to view their products & the performance of the products for a certain daterange inside a table.
For example, a user might have 100 products inside the table and would like to view all data from the past 2 weeks. For this scenario, I'll describe the code's procedure below:
1. Make call to the backend / server
2. Server has to filter & aggregate all Timestamps for each Product
3. Server sends data back to client
4. Client updates table values
The problem is that step 2 takes a huge amount of time, and I do not know how to improve the performance.
from django.db.models import Sum

products = Product.objects.filter(name="example")
for product in products:
    # One aggregate query per product: this is the slow part
    product.report_set.filter(date__gte="2011-01-01", date__lte="2011-01-14").aggregate(
        Sum("clicks"),
        Sum("cost"),
        Sum("sales"))
That is how the server is currently retrieving the timestamp values for the displayed products.
Any ideas how to retrieve & structure the data in a more efficient way?
It's slow because of the multiple queries you make to the database (one per product, inside the loop).
See if grouping and annotating is better (one query for the sums, then perhaps further queries to fetch each product):-
Timestamp.objects.filter(date__range=["2011-01-01", "2011-01-15"]).values('product').annotate(sum_clicks=Sum("clicks")).annotate(sum_cost=Sum("cost")).annotate(sum_sales=Sum("sales"))
I don't know if this is possible but if it is it would be even better:-
Timestamp.objects.filter(date__range=["2011-01-01", "2011-01-15"]).values('product').annotate(sum_clicks=Sum("clicks")).annotate(sum_cost=Sum("cost")).annotate(sum_sales=Sum("sales")).select_related('product')
Edit:-
After looking back perhaps this might be better:-
products = Product.objects.filter(name="example", report_set__date__range=["2011-01-01", "2011-01-15"]).annotate(sum_clicks=Sum("report_set__clicks")).annotate(sum_cost=Sum("report_set__cost")).annotate(sum_sales=Sum("report_set__sales"))
Without more detail, all I can recommend is general optimization. For database optimization I would follow the instructions listed here, but be aware that as you speed up the query, memory usage will likely increase.
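To make the grouping idea concrete, here is a minimal sketch of roughly what a `values('product').annotate(Sum(...))` query asks the database to do: one `GROUP BY` statement instead of one aggregate query per product. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL, and the table name, column names, and sample rows are assumptions made up for the illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the Timestamp table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_timestamp (
        product_id INTEGER, date TEXT, clicks INTEGER, cost REAL, sales INTEGER
    );
    INSERT INTO product_timestamp VALUES
        (1, '2011-01-02', 10, 1.5, 2),
        (1, '2011-01-03', 5, 0.5, 1),
        (2, '2011-01-04', 7, 2.0, 0),
        (2, '2011-02-01', 100, 9.0, 9);  -- outside the date range
""")

# One grouped query returns the sums for every product at once.
rows = conn.execute(
    """
    SELECT product_id, SUM(clicks), SUM(cost), SUM(sales)
    FROM product_timestamp
    WHERE date BETWEEN ? AND ?
    GROUP BY product_id
    ORDER BY product_id
    """,
    ("2011-01-01", "2011-01-15"),
).fetchall()

# Index the result by product id so the table view can look rows up directly.
totals = {
    product_id: {"clicks": clicks, "cost": cost, "sales": sales}
    for product_id, clicks, cost, sales in rows
}
```

With 100 products on screen, this is a single round trip to the database rather than 100, which is usually where most of the time goes.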
I have the following models:
class Shelf(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=200, db_index=True)
    slug = models.SlugField(max_length=200, editable=False)
    games = models.ManyToManyField(Game, blank=True, through='SortedShelfGames')
    objects = ShelfManager()
    description = models.TextField(blank=True, null=True)

class SortedShelfGames(models.Model):
    game = models.ForeignKey(Game, on_delete=models.CASCADE)
    shelf = models.ForeignKey(Shelf, on_delete=models.CASCADE)
    date_added = models.DateTimeField()
    order = models.IntegerField(blank=True, null=True)
    releases = models.ManyToManyField(Release)
    objects = SortedShelfGamesManager.as_manager()

class Game(models.Model):
    name = models.CharField(max_length=300, db_index=True)
    sort_name = models.CharField(max_length=300, db_index=True)
    ...
I have a view where I want to get all of a user's SortedShelfGames, distinct on the Game relationship. I then want to be able to sort that list of SortedShelfGames on a few different fields. So right now, I'm doing the following inside of the SortedShelfGamesManager (which inherits from models.QuerySet) to get the list:
games = self.filter(
    pk__in=Subquery(
        self.filter(shelf__user=user).distinct('game').order_by('game', 'date_added').values('pk')  # The order_by statement in here is to get the earliest date_added field for display
    )
)
That works the way it's supposed to. However, whenever I try and do an order_by('game__sort_name'), the query takes forever in my python. When I'm actually trying to use it on my site, it just times out. If I take the generated SQL and just run it on my database, it returns all of my results in a fraction of a second. I can't figure out what I'm doing wrong here. The SortedShelfGames table has millions of records in it if that matters.
This is the generated SQL:
SELECT
"collection_sortedshelfgames"."id", "collection_sortedshelfgames"."game_id", "collection_sortedshelfgames"."shelf_id", "collection_sortedshelfgames"."date_added", "collection_sortedshelfgames"."order",
(SELECT U0."rating" FROM "reviews_review" U0 WHERE (U0."game_id" = "collection_sortedshelfgames"."game_id" AND U0."user_id" = 1 AND U0."main") LIMIT 1) AS "score",
"games_game"."id", "games_game"."created", "games_game"."last_updated", "games_game"."exact", "games_game"."date", "games_game"."year", "games_game"."quarter", "games_game"."month", "games_game"."name", "games_game"."sort_name", "games_game"."rating_id", "games_game"."box_art", "games_game"."description", "games_game"."slug", "games_game"."giantbomb_id", "games_game"."ignore_giantbomb", "games_game"."ignore_front_page", "games_game"."approved", "games_game"."user_id", "games_game"."last_edited_by_id", "games_game"."dlc", "games_game"."parent_game_id"
FROM
"collection_sortedshelfgames"
INNER JOIN
"games_game"
ON
("collection_sortedshelfgames"."game_id" = "games_game"."id")
WHERE
"collection_sortedshelfgames"."id"
IN (
SELECT
DISTINCT ON (U0."game_id") U0."id"
FROM
"collection_sortedshelfgames" U0
INNER JOIN
"collection_shelf" U1 ON (U0."shelf_id" = U1."id")
WHERE
U1."user_id" = 1
ORDER
BY U0."game_id" ASC, U0."date_added" ASC
)
ORDER BY
"games_game"."sort_name" ASC
I think you don't need a Subquery for this.
Here's what I ended up doing to solve this. Instead of using a Subquery, I created a list of primary keys by evaluating the queryset I had been using as the Subquery, then fed that list into my query. It looks like this:
# Evaluate the inner queryset once and keep only the primary keys
pks = list(self.filter(shelf__user=user).distinct('game').values_list('pk', flat=True))
games = self.filter(pk__in=pks)
games = games.order_by('game__sort_name')
This ended up being pretty fast. This is essentially the same thing as the Subquery method, but whatever was going on underneath the hood in python/Django was slowing this way down.
I'm a Python/Django beginner. I decided to build an e-commerce website using Django for an academic project. I've been able to develop enough of the project and build an understanding of the subject, but right now I'm having trouble finding a way to subtract the number of items listed in the inventory whenever an order is made.
That's the code for the models; every product has its own stock quantity, called inventory:
class Product(models.Model):
    name = models.CharField(max_length=200, null=True)
    price = models.FloatField()
    description = models.TextField(default='', null=True, blank=True)
    digital = models.BooleanField(default=False, null=True, blank=True)
    image = models.ImageField(null=True, blank=True)
    inventory = models.IntegerField(default=0)

    def __str__(self):
        return self.name

    def has_inventory(self):
        return self.inventory > 0
This is the code I wrote to subtract based on the quantity of the item ordered, but I can't make it work: it won't subtract the number of items from the product's inventory.
class OrderItem(models.Model):
    product = models.ForeignKey(Product, on_delete=models.SET_NULL, null=True)
    order = models.ForeignKey(Order, on_delete=models.SET_NULL, null=True)
    quantity = models.IntegerField(default=0, null=True, blank=True)
    date_added = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return str(self.product) + " x " + str(self.quantity)

    def inventory(self):
        product.inventory = self.inventory
        product.inventory -= int(self.quantity)
        return inventory
What could I do to make it work?
All of this logic should live in your views.py file. You could write a view function that takes the request, reads the values submitted through the form, uses filter() to select the products whose inventory you want to reduce, and then uses Django's update() query to update the inventory.
It should look something like this inside your views function:
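from django.db.models import F
# The model field is lowercase "inventory"; the variable inventory is the amount to subtract
Product.objects.filter(name=name, description=description, digital=digital).update(inventory=F('inventory') - inventory)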
Here is Django's documentation on queries: Django's Making Queries
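As a rough illustration of why the F() update is attractive: it becomes a single UPDATE statement, so the subtraction happens inside the database rather than in Python. The sketch below uses the standard-library `sqlite3` module as a stand-in; the table layout, the `reserve_stock` helper, and the extra "enough stock" condition are assumptions added for the example, not part of the original answer. In Django the counterpart would be something like `Product.objects.filter(pk=product_id, inventory__gte=quantity).update(inventory=F('inventory') - quantity)`, which returns the number of rows matched.

```python
import sqlite3

# Assumption: a one-row in-memory table stands in for the Product model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, inventory INTEGER)")
conn.execute("INSERT INTO product VALUES (1, 'widget', 5)")


def reserve_stock(conn, product_id, quantity):
    """Hypothetical helper: decrement inventory in one UPDATE.

    Returns True if a row was updated, False if there was not enough stock.
    """
    cur = conn.execute(
        "UPDATE product SET inventory = inventory - ? "
        "WHERE id = ? AND inventory >= ?",
        (quantity, product_id, quantity),
    )
    conn.commit()
    return cur.rowcount == 1


ok_first = reserve_stock(conn, 1, 3)   # enough stock: 5 becomes 2
ok_second = reserve_stock(conn, 1, 3)  # only 2 left: nothing is updated
remaining = conn.execute("SELECT inventory FROM product WHERE id = 1").fetchone()[0]
```

Because the check and the subtraction are in the same statement, two orders arriving at the same time cannot both take the last items.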
I think there are a few problems with the snippet above.
First, OrderItem.inventory does not refer to the right value. It should look like the snippet below.
def inventory(self):
    # remember the current stock is stored on Product.inventory
    return self.product.inventory - self.quantity
Second, the method should be named remaining_stock rather than inventory, to prevent misunderstanding.
def remaining_stock(self):
    return self.product.inventory - self.quantity
Also, if you want to store the product's new inventory, don't forget to assign it and call the product's save() method after successfully inserting the OrderItem.
I have 3 models, and they are as follows:
class Table(models.Model):
    waiter = models.ForeignKey(get_user_model(), on_delete=models.CASCADE,
                               related_name='restaurant_table')
    table_no = models.IntegerField()
    objects = TableManager()

class Order(models.Model):
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    food = models.ManyToManyField(OrderFood, related_name='ordered_food')
    order_status = models.ForeignKey(OrderStatus, on_delete=models.CASCADE)
    table = models.ForeignKey(Table, on_delete=models.CASCADE)
    datetime = models.DateTimeField(default=now)

class OrderStatus(models.Model):
    CHOOSE = (
        ('Received', 'Received'),
        ('Cooking', 'Cooking'),
        ('WaiterHand', 'In Waiter Hand'),
        ('Delivered', 'Delivered'),
        ('Paid', 'Payment Completed'),
        ('Rejected', 'Rejected')
    )
    status = models.CharField(max_length=30, null=False, blank=False, choices=CHOOSE)
    created_at = models.DateTimeField(auto_now=True)
    updated_at = models.DateTimeField()
I am creating a restaurant management system. A restaurant has tables, each associated with one or more waiters. I now need a new feature: table status. When an order is actively associated with a table, that table counts as booked. Implementing this is not the problem, as I can do it in several ways.
One way is to count the active orders associated with the table; if I find any active order, I report the table as booked.
Another way is to add an extra boolean field to the table as a flag that stores whether the table is booked or not.
My question is not how to implement it. My question is which approach is better, or whether there are other good solutions. Please explain briefly which solution is better and why.
You can put a @property method on the Table class, which you can then use directly on any Table object, including in templates.
@property
def check_table_status(self):
    status = 'Not Booked'
    if self.order_set.all().exists():
        status = 'Booked'
    return status
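Note that the property above treats a table as booked as soon as it has any order at all, including finished ones. A sketch of the "derive it from active orders" approach is below, again using the standard-library `sqlite3` module in place of the real database. The table names, the `is_booked` helper, and especially the choice of which statuses count as finished are assumptions for illustration. In Django this would roughly correspond to `self.order_set.exclude(order_status__status__in=[...]).exists()`.

```python
import sqlite3

# Assumption: orders in these states no longer occupy a table.
FINISHED = ("Paid", "Rejected")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurant_table (id INTEGER PRIMARY KEY, table_no INTEGER);
    CREATE TABLE restaurant_order (id INTEGER PRIMARY KEY, table_id INTEGER, status TEXT);
    INSERT INTO restaurant_table VALUES (1, 10), (2, 11);
    INSERT INTO restaurant_order VALUES (1, 1, 'Cooking'), (2, 2, 'Paid');
""")


def is_booked(conn, table_id):
    """Hypothetical helper: True if the table has at least one unfinished order."""
    placeholders = ", ".join("?" for _ in FINISHED)
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM restaurant_order "
        f"WHERE table_id = ? AND status NOT IN ({placeholders}))",
        (table_id, *FINISHED),
    ).fetchone()
    return bool(row[0])


booked = {table_id: is_booked(conn, table_id) for table_id in (1, 2)}
```

The trade-off between the two options in the question: a derived status like this cannot drift out of sync with the orders, while a stored boolean flag is cheaper to read but has to be updated correctly every time an order changes state.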
I have a Review model that is in one to one relationship with Rating model. A user can give a rating according to six different criteria -- cleanliness, communication, check_in, accuracy, location, and value -- which are defined as fields in the Rating model.
class Rating(models.Model):
    cleanliness = models.PositiveIntegerField()
    communication = models.PositiveIntegerField()
    check_in = models.PositiveIntegerField()
    accuracy = models.PositiveIntegerField()
    location = models.PositiveIntegerField()
    value = models.PositiveIntegerField()

class Review(models.Model):
    room = models.ForeignKey('room.Room', on_delete=models.SET_NULL, null=True)
    host = models.ForeignKey('user.User', on_delete=models.CASCADE, related_name='host_reviews')
    guest = models.ForeignKey('user.User', on_delete=models.CASCADE, related_name='guest_reviews')
    rating = models.OneToOneField('Rating', on_delete=models.SET_NULL, null=True)
    content = models.CharField(max_length=2000)
I am thinking of a way to calculate the overall rating, which would be the average of the averages of each column in the Rating model. One way could be using Django's aggregate() function, and another option could be prefetching all reviews and looping through each review to manually calculate the overall rating.
For example,
from django.db.models import Avg

for room in Room.objects.all():
    # One aggregate() call computes all six column averages for this room
    ratings_dict = Review.objects.filter(room=room).aggregate(
        *[Avg(field) for field in ['rating__cleanliness', 'rating__communication',
                                   'rating__check_in', 'rating__accuracy',
                                   'rating__location', 'rating__value']])
    ratings_sum = 0
    for key in ratings_dict.keys():
        ratings_sum += ratings_dict[key] if ratings_dict[key] else 0
Or, simply looping through,
rooms = Room.objects.prefetch_related('review_set')
for room in rooms:
    reviews = room.review_set.all()
    ratings = 0
    for review in reviews:
        ratings += (review.rating.cleanliness + review.rating.communication + review.rating.check_in +
                    review.rating.accuracy + review.rating.location + review.rating.value) / 6
Which way would be more efficient in terms of time complexity and result in less DB calls?
Does aggregate(Avg('field_name')) produce one Avg query at the database level per function call?
Will first calling all rooms with prefetch_related() help reduce number of queries later when calling room.review_set.all()?
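As a quick check of the arithmetic behind the two approaches, the sketch below compares "average each column, then average the six column averages" with "average each review's six scores, then average over reviews". The sample ratings are made up, and it assumes every review has all six scores filled in; with missing values the two numbers can differ. Plain Python with exact fractions is used so the comparison is not blurred by floating-point rounding.

```python
from fractions import Fraction

FIELDS = ["cleanliness", "communication", "check_in", "accuracy", "location", "value"]

# Assumption: two hypothetical reviews for one room, all six scores present.
ratings = [
    {"cleanliness": 5, "communication": 4, "check_in": 3, "accuracy": 5, "location": 4, "value": 3},
    {"cleanliness": 2, "communication": 3, "check_in": 4, "accuracy": 1, "location": 2, "value": 3},
]


def mean(values):
    """Exact arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return Fraction(sum(values), len(values))


# Approach 1: one average per column, then the average of those six averages.
column_averages = [mean(r[field] for r in ratings) for field in FIELDS]
overall_from_columns = mean(column_averages)

# Approach 2: one average per review, then the average over all reviews.
review_averages = [mean(r[field] for field in FIELDS) for r in ratings]
overall_from_reviews = mean(review_averages)
```

Under that assumption both approaches give the same overall rating, so the choice between them can be made purely on the number of queries each one issues.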
I am trying to compare two querysets based on a single field, but I can't figure out the most efficient way to do it.
This is my model. I want to check whether the old and new room_scans (ForeignKey) have PriceDatum objects with the same checkin date. If not, I want to create a PriceDatum with that checkin date related to the new room_scan.
class PriceDatum(models.Model):
    """
    Stores a price for a date for a given currency for a given
    listingscan
    Multiple such PriceData objects for each day for next X months are created in each Frequent listing scan
    """
    room_scan = models.ForeignKey(RoomScan, default=1, on_delete=models.CASCADE)
    room = models.ForeignKey(Room, on_delete=models.CASCADE)
    checkin = models.DateField(db_index=True, help_text="Check in date", null=True)
    checkout = models.DateField(db_index=True, help_text="checkout date", null=True)
    price = models.PositiveSmallIntegerField(help_text="Price in the currency stated")
    refund_status = models.CharField(max_length=100, default="N/A")
    # scanned = models.DateTimeField(db_index=True, help_text="Check in date", null=True)
    availability_count = models.PositiveSmallIntegerField(help_text="How many rooms are available for this price")
    max_people = models.PositiveSmallIntegerField(help_text="How many people can stay in the room for this price")
    meal = models.CharField(max_length=100, default="N/A", help_text="Tells if breakfast is included in room price")
Below is the code for what I am trying to do:
# Assumption: the first statement was cut off in the original post;
# its filter is assumed to mirror the one used for current_prices below.
previous_prices_final = previous_prices.filter(
    refund_status='refund', max_people=max_people_count, meal=meal).order_by().order_by('checkin')
current_prices_final = current_prices.filter(
    refund_status='refund', max_people=max_people_count, meal=meal).order_by().order_by('checkin')
if len(previous_prices_final) > len(current_prices_final):
    difference = previous_prices_final.difference(current_prices_final)
    for x in difference:
        PriceDatum.objects.create(room_scan=x.room_scan,
                                  room=x.room,
                                  checkin=x.checkin,
                                  checkout=x.checkout,
                                  price=0,
                                  refund_status='refund',
                                  availability_count=0,
                                  max_people=max_people_count,
                                  meal='not_included',
                                  )
The thing is that every row comes back as different, because the room_scan foreign key points to scans with different creation times.
My question is: how do I use difference() based only on the checkin field?
Don't select the field that contains the creation time. Limit your queryset with values().
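A minimal sketch of comparing on checkin alone, using plain Python sets and made-up dates in place of the two querysets. With the ORM, the same idea would be to reduce both querysets to that one column first, for example with `values_list('checkin', flat=True)` on each side, before calling difference(); the variable names and dates below are assumptions for illustration only.

```python
from datetime import date

# Assumption: hypothetical checkin dates taken from the old and the new scan.
previous_checkins = [date(2021, 1, 1), date(2021, 1, 2), date(2021, 1, 3)]
current_checkins = [date(2021, 1, 1), date(2021, 1, 3)]

# Only the checkin values are compared, so differing room_scan rows
# no longer make every record look different.
missing_checkins = sorted(set(previous_checkins) - set(current_checkins))
```

Each date in `missing_checkins` is then a checkin for which a new PriceDatum would need to be created against the new room_scan.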