Django __str__ reduce SQL queries - python

I have two models in my Django project
class BookSerie(models.Model):
    id = models.AutoField(primary_key=True)
    title = models.CharField(max_length=255)
class BookVolume(models.Model):
    isbn = models.CharField(max_length=17)
    volumeNumber = models.PositiveIntegerField()
    serie = models.ForeignKey(BookSerie)
    def __str__(self):
        return self.serie.title+" volume "+str(self.volumeNumber)+" - "+str(self.isbn)
I only use __str__ for my admin panel, but when I use this code in my view (the serie with id=1 has 5 volumes):
def serieDetails(request, title):
    try:
        seriequery = BookSerie.objects.get(slug=title)
        BookVolume.objects.filter(serie=seriequery).order_by('volumeNumber')
    except BookSerie.DoesNotExist:
        raise Http404("Serie does not exist")
    return render(request, 'book/serieDetails.html', {'serie': seriequery, 'volumes' : volumesquery})
I have a significant problem:
The query SELECT ••• FROM "book_bookserie" WHERE "book_bookserie"."id" = '1' is performed 5 times (Django Debug Toolbar points to this line of code: return self.serie.title+" volume "+str(self.volumeNumber)+" - "+str(self.isbn)).
The query
SELECT ••• FROM "book_bookvolume" WHERE "book_bookvolume"."serie_id" = '1' ORDER BY "book_bookvolume"."volumeNumber" ASC
is performed 2 times.

In your BookVolume's __str__ you access self.serie.title. This hits the database every time, because the corresponding BookSerie record must be retrieved for each volume. One way to reduce the number of queries is to use select_related when you query your BookVolume objects:
# any reason why you don't store this QuerySet to a variable?
BookVolume.objects.filter(serie=seriequery).order_by('volumeNumber').select_related('serie')
# better:
seriequery.bookvolume_set.order_by('volumeNumber').select_related('serie')
From the docs:
select_related...
... will “follow” foreign-key relationships, selecting additional related-object data when it executes its query. This is a performance booster which results in a single more complex query but means later use of foreign-key relationships won’t require database queries.
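The difference can be reproduced outside Django. The following is only a sketch using Python's built-in sqlite3 module rather than the ORM: the table names mirror the ones in the question, while the sample rows and the run() counting helper are made up for illustration. It compares one lookup per volume with a single JOIN, which is roughly the kind of query select_related produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book_bookserie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE book_bookvolume (
        id INTEGER PRIMARY KEY,
        isbn TEXT,
        volumeNumber INTEGER,
        serie_id INTEGER REFERENCES book_bookserie (id)
    );
    INSERT INTO book_bookserie VALUES (1, 'Example Serie');
    INSERT INTO book_bookvolume (isbn, volumeNumber, serie_id) VALUES
        ('isbn-1', 1, 1), ('isbn-2', 2, 1), ('isbn-3', 3, 1),
        ('isbn-4', 4, 1), ('isbn-5', 5, 1);
""")

query_count = 0


def run(sql, params=()):
    """Hypothetical helper: execute one statement and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def labels_without_join():
    # One query for the volumes, then one extra query per volume for its serie.
    volumes = run(
        "SELECT volumeNumber, isbn, serie_id FROM book_bookvolume "
        "WHERE serie_id = ? ORDER BY volumeNumber", (1,))
    labels = []
    for number, isbn, serie_id in volumes:
        [(title,)] = run(
            "SELECT title FROM book_bookserie WHERE id = ?", (serie_id,))
        labels.append(f"{title} volume {number} - {isbn}")
    return labels


def labels_with_join():
    # A single query that fetches every volume together with its serie title.
    rows = run(
        "SELECT s.title, v.volumeNumber, v.isbn FROM book_bookvolume AS v "
        "JOIN book_bookserie AS s ON s.id = v.serie_id "
        "WHERE v.serie_id = ? ORDER BY v.volumeNumber", (1,))
    return [f"{title} volume {number} - {isbn}" for title, number, isbn in rows]


query_count = 0
lazy_labels = labels_without_join()
lazy_queries = query_count

query_count = 0
joined_labels = labels_with_join()
joined_queries = query_count

print(lazy_queries, joined_queries)
```

Both functions build the same labels; only the number of statements sent to the database differs.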

Related

LEFT JOIN with other param in ON Django ORM

I have the following models:
class Customer(models.Model):
    name = models.CharField(max_length=255)
    email = models.EmailField(max_length=255, default='example@example.com')
    authorized_credit = models.IntegerField(default=0)
    balance = models.IntegerField(default=0)
class Transaction(models.Model):
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    payment_amount = models.IntegerField(default=0)  # can be 0 or have a value
    exit_amount = models.IntegerField(default=0)  # can be 0 or have a value
    transaction_date = models.DateField()
I want to query all customer information together with the date of each customer's last payment.
I have this query in Postgres that is correct and returns exactly what I need:
select e.*, max(l.transaction_date) as last_date_payment
from app_customer as e
left join app_transaction as l
on e.id = l.customer_id and l.payment_amount != 0
group by e.id
order by e.id
But I need this query in Django for a serializer. I tried the following, but it produces a different query.
In Python:
print(Customer.objects.filter(transaction__isnull=True).order_by('id').query)
>>> SELECT app_customer.id, app_customer.name, app_customer.email, app_customer.balance FROM app_customer
LEFT OUTER JOIN app_transaction
ON (app_customer.id = app_transaction.customer_id)
WHERE app_transaction.id IS NULL
ORDER BY app_customer.id ASC
But what I need are rows like these:
example
Whether or not you are working with a serializer, you can reuse the same view/function for both tasks.
First, to get the transaction details for the customer object you have, you need to be aware of related_name. related_name has a default value, but you can set something unique so that you remember it.
Change your model:
class Transaction(models.Model):
    customer = models.ForeignKey(Customer, related_name="transac_set", on_delete=models.CASCADE)
related_name is Django's way of naming the reverse relationship from Customer to Transaction. With it, for a Customer instance cus, cus.transac_set.all() fetches all the transactions of that customer.
Since you might need transaction details for multiple customers, you can use select_related() when querying. It fetches each transaction's customer in the same query, so the database is hit as few times as possible.
Create a function definition to get the data of all transactions of the given customers:
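def get_cus_transac(cus_id_list):
    # here cus_id_list is the list of customer ids you need to fetch
    # select_related() takes the field name as a string; filter on the customer's id, not the transaction's
    cus_transac_list = Transaction.objects.select_related('customer').filter(customer_id__in=cus_id_list)
    return cus_transac_list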
For your purpose you need another approach, prefetch_related(), which is the reason you needed related_name.
Create a function definition to get the data of the latest transaction of the customers. Warning: I typed this answer right before going to sleep, so the code below does not actually fetch the latest transaction yet. I will add that later, but you can work along similar lines to get it done.
def get_latest_transac(cus_id_list):
    # here cus_id_list is the list of ids you need to fetch
    latest_transac_list = Customer.objects.filter(id__in=cus_id_list).prefetch_related('transac_set')
    return latest_transac_list
Now, coming to the serializer: you need 3 serializers (strictly you only need 2, but the third can serialize the Customer data plus the latest transaction that you need): one for Transaction, one for Customer, and a third to combine them.
There might be some mistakes in the code, or I might have missed some details, as I have not tested it. I am assuming you know how to write the serializers and views for this.
One approach is to use subqueries:
from django.db.models import OuterRef, Subquery
transaction_subquery = Transaction.objects.filter(
    customer=OuterRef('pk'), payment_amount__gt=0,
).order_by('-transaction_date')
Customer.objects.annotate(
    last_date_payment=Subquery(
        transaction_subquery.values('transaction_date')[:1]
    )
)
This will get all customer data and annotate each customer with the date of their last transaction that has a non-zero (here: positive) payment_amount, in one query.
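Why the extra condition has to live in the ON clause (or in a correlated subquery) rather than in WHERE can be checked with plain SQL. This is a sketch using Python's built-in sqlite3 module instead of Postgres or the ORM; the table names follow the question, and the three customers and their transactions are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_transaction (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES app_customer (id),
        payment_amount INTEGER DEFAULT 0,
        transaction_date TEXT
    );
    INSERT INTO app_customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO app_transaction (customer_id, payment_amount, transaction_date) VALUES
        (1, 50, '2021-01-05'),
        (1, 20, '2021-02-10'),
        (1, 0,  '2021-03-01'),  -- not a payment, must be ignored
        (2, 0,  '2021-01-20');  -- Bob has never paid
""")

# Condition inside ON: customers without payments are kept, with a NULL date.
on_filter = conn.execute("""
    SELECT e.id, MAX(l.transaction_date) AS last_date_payment
    FROM app_customer AS e
    LEFT JOIN app_transaction AS l
        ON e.id = l.customer_id AND l.payment_amount != 0
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

# Same condition moved to WHERE: customers without payments disappear.
where_filter = conn.execute("""
    SELECT e.id, MAX(l.transaction_date) AS last_date_payment
    FROM app_customer AS e
    LEFT JOIN app_transaction AS l ON e.id = l.customer_id
    WHERE l.payment_amount != 0
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

# Correlated subquery: the shape the Subquery/OuterRef annotation expresses.
subquery = conn.execute("""
    SELECT c.id,
           (SELECT t.transaction_date
            FROM app_transaction AS t
            WHERE t.customer_id = c.id AND t.payment_amount > 0
            ORDER BY t.transaction_date DESC
            LIMIT 1) AS last_date_payment
    FROM app_customer AS c
    ORDER BY c.id
""").fetchall()

print(on_filter)
print(where_filter)
print(subquery)
```

On this sample data the ON-clause query and the correlated subquery return the same rows, while the WHERE variant only returns the customer who has at least one payment.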
To solve your problem:
I want to query all customer information together with the date of each customer's last payment.
You can try combining order_by with distinct:
Customer.objects.prefetch_related('transaction_set').values(
    'id', 'name', 'email', 'authorized_credit', 'balance',
    'transaction__transaction_date',
).order_by('-transaction__transaction_date').distinct('transaction__transaction_date')
Note:
Passing field names to distinct() only works on PostgreSQL.
Usage of distinct: https://docs.djangoproject.com/en/3.2/ref/models/querysets/#distinct

Aggregation by Foreign Key and other field in Django Admin

I'm working in Django and having an issue displaying something properly in my Admin site. These are the models
class IndexSetSize(models.Model):
    """ A time series of sizes for each index set """
    index_set = models.ForeignKey(IndexSet, on_delete=models.CASCADE)
    byte_size = models.BigIntegerField()
    timestamp = models.DateTimeField()
class IndexSet(models.Model):
    title = models.CharField(max_length=4096)
    # ... some other stuff that isn't really important
    def __str__(self):
        return f"{self.title}"
It is displaying all the appropriate data I need, but I want to display the sum of the IndexSetSize byte sizes, grouped by the index_set key and also by the timestamp (there can be multiple occurrences of an IndexSet for a given timestamp, so I want to add up all the byte_sizes). Currently it just shows every single record. Additionally, I would prefer the total_size field to be sortable.
Current Admin model looks like:
class IndexSetSizeAdmin(admin.ModelAdmin):
    """ View-only admin for index set sizes """
    fields = ["index_set", "total_size", "byte_size", "timestamp"]
    list_display = ["index_set", "total_size", "timestamp"]
    search_fields = ["index_set"]
    list_filter = ["index_set__title"]
    def total_size(self, obj):
        """ Returns human readable size """
        if obj.total_size:
            return humanize.naturalsize(obj.total_size)
        return "-"
    total_size.admin_order_field = 'total_size'
    def get_queryset(self, request):
        queryset = super().get_queryset(request).select_related()
        queryset = queryset.annotate(
            total_size=Sum('byte_size', filter=Q(index_set__in_graylog=True)))
        return queryset
It seems the proper way to do a group by in Django is to use .values(), although if I use that in get_queryset, an error is thrown saying Cannot call select_related() after .values() or .values_list(). I'm having trouble finding in the documentation if there's a 'correct' way to values/annotate/aggregate that will work correctly with get_queryset. It's a pretty simple sum/group by query I'm trying to do, but I'm not sure what the "Django way" is to accomplish it.
Thanks
I don't think you can return the full queryset and group by index_set in get_queryset, because in SQL you can't select all columns while grouping by a single column:
SELECT *, SUM(byte_size) FROM indexsetsize GROUP BY index_set -- doesn't work
You could perform an extra query in the total_size method to get the aggregated value. However, this would perform the query for every row returned and slow your page load down.
def total_size(self, obj):
    """ Returns human readable size """
    return humanize.naturalsize(sum(IndexSetSize.objects.filter(
        index_set=obj.index_set).values_list(
        'byte_size', flat=True)))
total_size.admin_order_field = 'total_size'
It would be better to perform this annotation within the IndexSetAdmin, as the sizes are already grouped per index set through the reverse foreign key. This means you can perform the annotation in get_queryset. I would also set related_name on the foreign key in IndexSetSize so you can access the related IndexSetSize objects from IndexSet using that name.
class IndexSetSize(models.Model):
    index_set = models.ForeignKey(IndexSet, on_delete=models.CASCADE, related_name='index_set_sizes')
    ...
class IndexSetAdmin(admin.ModelAdmin):
    ...
    def total_size(self, obj):
        """ Returns human readable size """
        if obj.total_size:
            return humanize.naturalsize(obj.total_size)
        return "-"
    def get_queryset(self, request):
        queryset = super().get_queryset(request).prefetch_related('index_set_sizes').annotate(
            total_size=Sum('index_set_sizes__byte_size')).order_by('total_size')
        return queryset
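For reference, the two groupings discussed here look like this in plain SQL. This is a sketch using Python's built-in sqlite3 module, not the admin or the ORM; the table names and sample rows are hypothetical. In Django, the first shape roughly corresponds to .values('index_set', 'timestamp').annotate(total_size=Sum('byte_size')), and the second to the annotation on IndexSet shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE indexset (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE indexsetsize (
        id INTEGER PRIMARY KEY,
        index_set_id INTEGER REFERENCES indexset (id),
        byte_size INTEGER,
        timestamp TEXT
    );
    INSERT INTO indexset VALUES (1, 'logs'), (2, 'metrics');
    INSERT INTO indexsetsize (index_set_id, byte_size, timestamp) VALUES
        (1, 100, '2021-01-01 00:00'),
        (1, 50,  '2021-01-01 00:00'),  -- same index set, same timestamp
        (1, 70,  '2021-01-02 00:00'),
        (2, 10,  '2021-01-01 00:00');
""")

# One row per (index set, timestamp), sortable by the computed total.
per_set_and_timestamp = conn.execute("""
    SELECT s.title, z.timestamp, SUM(z.byte_size) AS total_size
    FROM indexsetsize AS z
    JOIN indexset AS s ON s.id = z.index_set_id
    GROUP BY s.id, s.title, z.timestamp
    ORDER BY total_size DESC
""").fetchall()

# One row per index set: the grouping the IndexSetAdmin annotation produces.
per_set = conn.execute("""
    SELECT s.title, SUM(z.byte_size) AS total_size
    FROM indexset AS s
    LEFT JOIN indexsetsize AS z ON z.index_set_id = s.id
    GROUP BY s.id, s.title
    ORDER BY total_size DESC
""").fetchall()

print(per_set_and_timestamp)
print(per_set)
```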

Django: Filter a Queryset made of unions not working

I defined 3 models related through M2M relationships:
class Suite(models.Model):
    name = models.CharField(max_length=250)
    title = models.CharField(max_length=250)
    icon = models.CharField(max_length=250)
    def __str__(self):
        return self.title
class Role(models.Model):
    name = models.CharField(max_length=250)
    title = models.CharField(max_length=250)
    suites = models.ManyToManyField(Suite)
    services = models.ManyToManyField(Service)
    Actions = models.ManyToManyField(Action)
    users = models.ManyToManyField(User)
    def __str__(self):
        return self.title
In one of my views I tried to collect all the Suites related to a specific User. The user may be related to several Roles, each of which can contain many Suites. I then want to filter the Suites by name, but the filter seems to have no effect:
queryset = Suite.objects.union(*(role.suites.all() for role in
self.get_user().role_set.all()))
repr(self.queryset)
'<QuerySet [<Suite: energia>, <Suite: waste 4 thing>]>'
self.queryset = self.queryset.filter(name="energia")
repr(self.queryset)
'<QuerySet [<Suite: energia>, <Suite: waste 4 thing>]>'
The query attribute of the queryset is identical before and after executing the filter:
(SELECT "navbar_suite"."id", "navbar_suite"."name", "navbar_suite"."title", "navbar_suite"."icon" FROM "navbar_suite") UNION (SELECT "navbar_suite"."id", "navbar_suite"."name", "navbar_suite"."title", "navbar_suite"."icon" FROM "navbar_suite" INNER JOIN "navbar_role_suites" ON ("navbar_suite"."id" = "navbar_role_suites"."suite_id") WHERE "navbar_role_suites"."role_id" = 1)
(SELECT "navbar_suite"."id", "navbar_suite"."name", "navbar_suite"."title", "navbar_suite"."icon" FROM "navbar_suite") UNION (SELECT "navbar_suite"."id", "navbar_suite"."name", "navbar_suite"."title", "navbar_suite"."icon" FROM "navbar_suite" INNER JOIN "navbar_role_suites" ON ("navbar_suite"."id" = "navbar_role_suites"."suite_id") WHERE "navbar_role_suites"."role_id" = 1)
As stated in the Django docs, only count(), order_by(), values(), values_list() and slicing are allowed on a union queryset. You can't filter a union queryset.
That means you have to apply the filters to the individual querysets before combining them with union().
Also, you can achieve your goal without using union() at all:
Suite.objects.filter(role__users=self.get_user(), name="energia")
Here role is the default reverse query name for the suites M2M field. You may need to adjust it if you've used related_name or related_query_name in the definition of the suites M2M field in the Role model.
I had the same issue and ended up using the union query as a subquery so that the filters could work:
yourModelUnionSubQuerySet = YourModelQS1.union(YourModelQS2)
yourModelUnionQuerySet = YourModel.objects.filter(id__in=yourModelUnionSubQuerySet.values('id'))
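In SQL terms, this wraps the union in an IN (...) subquery so that an ordinary WHERE clause can be applied on top of it. The following is a sketch with Python's built-in sqlite3 module rather than the ORM; the table names are taken from the SQL in the question, and the roles and suites are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE navbar_suite (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE navbar_role_suites (role_id INTEGER, suite_id INTEGER);
    INSERT INTO navbar_suite VALUES (1, 'energia'), (2, 'waste'), (3, 'water');
    -- role 1 has suites 1 and 2, role 2 has suite 2, suite 3 is unassigned
    INSERT INTO navbar_role_suites VALUES (1, 1), (1, 2), (2, 2);
""")

# The plain union of the suites of both roles (duplicates removed by UNION).
union_ids = conn.execute("""
    SELECT suite_id FROM navbar_role_suites WHERE role_id = 1
    UNION
    SELECT suite_id FROM navbar_role_suites WHERE role_id = 2
    ORDER BY suite_id
""").fetchall()

# The union used as a subquery, with a normal filter applied on top of it.
filtered = conn.execute("""
    SELECT id, name
    FROM navbar_suite
    WHERE id IN (
        SELECT suite_id FROM navbar_role_suites WHERE role_id = 1
        UNION
        SELECT suite_id FROM navbar_role_suites WHERE role_id = 2
    )
    AND name = ?
""", ("energia",)).fetchall()

print(union_ids)
print(filtered)
```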
There is a simple solution. Just use
self.queryset = self.queryset | <querySet you want to append>
instead of
self.queryset = self.queryset.union(<QuerySet you want to append>)
Worked for me. I hope this is understandable. After this you will be able to use filter.

Django: Creating a non-distinct union of two querysets

I am writing an accounting app in Django with Orders, which have a date when the invoice was created and an optional date when a credit note was created.
class Order(models.Model):
date_invoice_created = models.DateTimeField(null=True, blank=True)
date_credit_note_created = models.DateTimeField(null=True, blank=True)
I'm currently developing the view for our accountant, and she'd like to have the invoice and the credit note on separate rows in the admin panel, sorted by their respective creation dates.
So basically I'd like to show the same model twice, in different rows, sorted by different fields. In SQL, this would be something like:
SELECT id, create_date FROM (
SELECT id, date_invoice_created AS create_date, 'invoice' AS type FROM order
UNION
SELECT id, date_credit_note_created AS create_date, 'creditnote' AS type FROM order
) ORDER BY create_date
Don't mind my SQL-fu not being up-to-date, but I guess you understand what I mean.
So I've tried to get django to do this for me, by overriding the date in the second queryset, because django does not support the union of two extra'd querysets:
invoices = Order.objects.filter(date_invoice_created__isnull=False)
credit_notes = Order.filter_valid_orders(qs
    ).filter(
        date_credit_note_created__isnull=False
    ).extra(
        select={'date_invoice_created': 'date_credit_note_created'}
    )
return (invoices | credit_notes).order_by('date_invoice_created')
Unfortunately, the bitwise-or operation for union always makes sure that the IDs are distinct, but I really want them not to be. How can I get a union that keeps duplicate rows?
I have now found the solution to my problem using a SQL-View.
I've created a new migration (using south), which contains the above SQL query mentioned in the question as a view, which returns all rows twice, each with a create_date and type respectively for the credit note and the invoice.
accounting/migrations/00xx_create_invoice_creditnote_view.py:
class Migration(SchemaMigration):
    def forwards(self, orm):
        query = """
            CREATE VIEW invoiceoverview_invoicecreditnoteunion AS
            SELECT * FROM (
                SELECT *,
                    date_invoice_created AS create_date,
                    'invoice' AS type
                FROM accounting_order
                WHERE date_invoice_created NOT NULL
                UNION
                SELECT *,
                    date_credit_note_created AS date,
                    'creditnote' AS type
                FROM accounting_order
                WHERE date_credit_note_created NOT NULL
            );
        """
        db.execute(query)
    def backwards(self, orm):
        query = """
            DROP VIEW invoiceoverview_invoicecreditnoteunion;
        """
        db.execute(query)
    # ...
    # the rest of the migration model
    # ...
Then I've created a new model for this view, which sets managed = False in its Meta so that Django uses the model without caring about its creation. It has all the same fields as the original Order model, but also includes the two new fields from the SQL view:
invoiceoverview/models.py:
class InvoiceCreditNoteUnion(models.Model):
    """ This class is a SQL-view to Order, so that the credit note and
    invoice can be displayed independently.
    """
    class Meta:
        managed = False  # do not manage the model in the DB
    # fields of the view
    date = models.DateTimeField()
    type = models.CharField(max_length=255)
    # ...
    # all the other fields of the original Order
    # ...
Now I can use this model for the contrib.admin.ModelAdmin and display the appropriate content by checking the type field, e.g.:
class InvoiceAdmin(admin.ModelAdmin):
    list_display = ['some_special_case']
    def some_special_case(self, obj):
        if obj.type == 'creditnote':
            return obj.credit_note_specific_field
        else:
            return obj.invoice_specific_field
admin.site.register(InvoiceCreditNoteUnion, InvoiceAdmin)
This finally allows me to use all the other features provided by the admin-panel, e.g. overriding the queryset method, sorting etc.
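The row-doubling query behind such a view can be tried out in isolation. This is a sketch using Python's built-in sqlite3 module, with a reduced accounting_order table and made-up dates. It uses UNION ALL, which keeps every row, whereas plain UNION removes exact duplicates; because the type literal differs between the two halves, both variants return the same rows here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounting_order (
        id INTEGER PRIMARY KEY,
        date_invoice_created TEXT,
        date_credit_note_created TEXT
    );
    INSERT INTO accounting_order VALUES
        (1, '2020-01-01', '2020-01-15'),  -- invoice and credit note
        (2, '2020-01-10', NULL);          -- invoice only
""")

rows = conn.execute("""
    SELECT id, create_date, type FROM (
        SELECT id, date_invoice_created AS create_date, 'invoice' AS type
        FROM accounting_order
        WHERE date_invoice_created IS NOT NULL
        UNION ALL
        SELECT id, date_credit_note_created AS create_date, 'creditnote' AS type
        FROM accounting_order
        WHERE date_credit_note_created IS NOT NULL
    ) AS documents
    ORDER BY create_date
""").fetchall()

for row in rows:
    print(row)
```

Order 1 appears twice, once per document type, and the rows are interleaved by creation date.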

optimizing no. of queries being fired in django orm for a given model

I have an object that is uploaded by the user. It contains several details, but for the sake of clarity it can be described by the simplified model shown below.
Other users can then upvote and downvote what this user has uploaded, hence the Vote model.
Now I want to display the upvotes and downvotes of all the objects in the template, so I added two properties to the ObjectDetail class, upvote and downvote.
The trouble with this model is that, for each object, 2 queries are fired: one to get the upvotes and one to get the downvotes. With 20 objects, that makes 40 queries.
What would be a good way to reduce the number of queries and still display the upvotes and downvotes of each object?
class ObjectDetail(models.Model):
    title = models.CharField()
    img = models.ImageField()
    description = models.TextField()
    uploaded_by = models.ForeignKey(User, related_name='voted_by')
    @property
    def upvote(self):
        upvote = Vote.objects.filter(shared_object__id = self.id,
                                     vote_type = True).count()
        return upvote
    @property
    def downvote(self):
        downvote = Vote.objects.filter(shared_object__id = self.id,
                                       vote_type = False).count()
        return downvote
class Vote(models.Model):
    vote_type = models.BooleanField(default = False)
    voted_by = models.ForeignKey(User, related_name='voted_by')
    voted_for = models.ForeignKey(User, related_name='voted_for')
    shared_object = models.ForeignKey(ObjectDetail, null=True, blank=True)
    dtobject = models.DateTimeField(auto_now_add=True)
On one hand, django does give you the capability to write raw SQL when you have to. But this example is simple, you should not have to use raw SQL to get this information.
Django will put off making a query until you access the results of a queryset. So you can try to compose the whole query using querysets and Q objects and then access the results on the composed query - this should trigger one DB query (or one per model, rather than one per instance) for all the results.
So, how to do that? You want to get all the Vote records for a given set of ObjectDetail records. I'm going to assume you have a list of ids of ObjectDetail records.
Unfortunately, your upvote and downvote properties return the result of "count" on their query sets. This counts as an "access to the results" of the queryset produced by the "filter" call. I would change those method definitions to refer to the backwards-relation object manager vote_set, like so:
@property
def upvote(self):
    answer = 0
    for vote in self.vote_set.all():
        if vote.vote_type:
            answer += 1
    return answer
@property
def downvote(self):
    answer = 0
    for vote in self.vote_set.all():
        if not vote.vote_type:
            answer += 1
    return answer
Note that we just access the queryset of votes for the current object. At this stage, we are assuming that the ORM can serve it from cached (prefetched) results.
Now, in the view and/or template, we want to assemble the big complex query.
My example is a functional view:
def home(request):
    # just assigning a constant list for simplicity.
    # Also was lazy and did 10 examples rather than 20.
    objids = [ 1, 5, 15, 23, 48, 52, 55, 58, 59, 60 ]
    # make a bunch of Q objects, one for each object id:
    q_objs = []
    for objid in objids:
        q_objs.append(Q(id__exact = objid))
    # 'or' them together into one big Q object.
    # There's probably a much nicer way to do this.
    big_q = q_objs[0]
    for q_obj in q_objs[1:]:
        big_q |= q_obj
    # Make another queryset that will ask for the Vote objects
    # along with the ObjectDetail objects.
    # Try commenting out this line and uncommenting the one below.
    the_objects = ObjectDetail.objects.filter(big_q).prefetch_related('vote_set')
    # the_objects = ObjectDetail.objects.filter(big_q)
    template = 'home.html'
    context = {
        'the_objects' : the_objects,
    }
    context_instance = RequestContext(request)
    return render_to_response(template, context, context_instance)
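The idea behind this prefetch approach is two queries in total (one for the objects, one for all of their votes) followed by counting in Python. The sketch below imitates that with Python's built-in sqlite3 module instead of the ORM; the table names, the sample rows and the summary dictionary are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objectdetail (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE vote (
        id INTEGER PRIMARY KEY,
        vote_type INTEGER,
        shared_object_id INTEGER REFERENCES objectdetail (id)
    );
    INSERT INTO objectdetail VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO vote (vote_type, shared_object_id) VALUES
        (1, 1), (1, 1), (0, 1), (0, 2);
""")

objids = [1, 2, 3]
placeholders = ", ".join("?" for _ in objids)

# Query 1: the objects themselves.
objects = conn.execute(
    f"SELECT id, title FROM objectdetail WHERE id IN ({placeholders}) ORDER BY id",
    objids,
).fetchall()

# Query 2: every vote for those objects, fetched in one go.
votes = conn.execute(
    f"SELECT shared_object_id, vote_type FROM vote "
    f"WHERE shared_object_id IN ({placeholders})",
    objids,
).fetchall()

# Group the votes per object in memory; no further queries are needed.
votes_by_object = defaultdict(list)
for object_id, vote_type in votes:
    votes_by_object[object_id].append(bool(vote_type))

# Map each object id to an (upvotes, downvotes) pair.
summary = {}
for object_id, _title in objects:
    object_votes = votes_by_object.get(object_id, [])
    upvotes = sum(1 for is_up in object_votes if is_up)
    summary[object_id] = (upvotes, len(object_votes) - upvotes)

print(summary)
```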
Here are some pointers to related documentation:
https://docs.djangoproject.com/en/1.5/topics/db/queries/#querysets-are-lazy
https://docs.djangoproject.com/en/1.5/ref/models/querysets/#when-querysets-are-evaluated
https://docs.djangoproject.com/en/1.5/topics/db/queries/#complex-lookups-with-q-objects
https://docs.djangoproject.com/en/1.5/topics/db/queries/#following-relationships-backward
I am using the extra() clause to inject some raw SQL here. It's in the docs at https://docs.djangoproject.com/en/1.5/ref/models/querysets/#django.db.models.query.QuerySet.extra
EDIT: this works with an app called 'vot' and at least with SQLite. Change the vot_* table names to match your app.
from django.db.models import Count
objects = ObjectDetail.objects.all().extra(
    select={ 'upvotes': '''SELECT COUNT(*) FROM vot_vote
                 WHERE vot_vote.shared_object_id = vot_objectdetail.id
                 AND vot_vote.vote_type = 1''',
             'downvotes': '''SELECT COUNT(*) FROM vot_vote
                 WHERE vot_vote.shared_object_id = vot_objectdetail.id
                 AND vot_vote.vote_type = 0'''})
Now each element in objects has an upvotes and a downvotes attribute.
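An alternative to two correlated subqueries is a single grouped query with conditional sums. This is a sketch with Python's built-in sqlite3 module and invented sample data, not the code of the answer; the vot_* table names follow the EDIT above. In recent Django versions (2.0 and later) a similar result can be expressed with annotate() and Count('vote', filter=Q(vote__vote_type=True)), assuming the default reverse name vote.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vot_objectdetail (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE vot_vote (
        id INTEGER PRIMARY KEY,
        vote_type INTEGER,
        shared_object_id INTEGER REFERENCES vot_objectdetail (id)
    );
    INSERT INTO vot_objectdetail VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO vot_vote (vote_type, shared_object_id) VALUES
        (1, 1), (1, 1), (0, 1), (0, 2);
""")

# One query for all objects: LEFT JOIN keeps objects without any vote,
# and the CASE expressions count upvotes and downvotes separately.
counts = conn.execute("""
    SELECT o.id,
           SUM(CASE WHEN v.vote_type = 1 THEN 1 ELSE 0 END) AS upvotes,
           SUM(CASE WHEN v.vote_type = 0 THEN 1 ELSE 0 END) AS downvotes
    FROM vot_objectdetail AS o
    LEFT JOIN vot_vote AS v ON v.shared_object_id = o.id
    GROUP BY o.id
    ORDER BY o.id
""").fetchall()

for object_id, upvotes, downvotes in counts:
    print(object_id, upvotes, downvotes)
```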
