I have 30 instances of the Room objects, i.e. 30 rows in the database table.
In Python code I have Room.objects.all().delete().
I see that Django ORM translated it into the following PostgreSQL query: DELETE FROM "app_name_room" WHERE "app_name_room"."id" IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30).
Why doesn't the Django ORM use a more parsimonious DELETE FROM app_name_room query? Is there any way to switch to it and avoid listing all IDs?
Interesting question. It got me thinking, so I went a little deeper. The main reason is probably that a bare DELETE FROM app_name_room would not take care of cascading deletes, which Django handles itself rather than leaving to the database.
However, to answer your question:
Is there any way to switch to it and avoid listing all IDs?
You can do this using the private method _raw_delete. For instance:
objects_to_delete = Room.objects.all()
objects_to_delete._raw_delete(objects_to_delete.db)
This will execute the following query:
DELETE FROM "app_name_room"
PS: According to the function docstring:
Delete objects found from the given queryset in single direct SQL query. No signals are sent and there is no protection for cascades.
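To illustrate the cascade point, here is a minimal sketch using Python's built-in sqlite3 module rather than Django or PostgreSQL. The room and booking tables are hypothetical stand-ins, and the foreign key is deliberately declared without ON DELETE CASCADE. Under those assumptions, a bare DELETE on the parent table is rejected while dependent rows exist, so something has to deal with the related rows first:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE room (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE booking ("
    "id INTEGER PRIMARY KEY, "
    "room_id INTEGER NOT NULL REFERENCES room(id))"  # no ON DELETE CASCADE
)
conn.executemany("INSERT INTO room (id) VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO booking (id, room_id) VALUES (10, 1)")


def bare_delete_fails(connection):
    """Return True if a bare DELETE on the parent table is rejected."""
    try:
        connection.execute("DELETE FROM room")
    except sqlite3.IntegrityError:
        return True
    return False


blocked = bare_delete_fails(conn)

# Deleting the dependent rows first, as a cascade-aware delete would, succeeds.
conn.execute("DELETE FROM booking")
conn.execute("DELETE FROM room")
remaining = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
```

This is only an analogy for why a cascade-unaware single statement is risky; it does not reproduce what Django's deletion code actually does.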
Related
You can add to an existing Postgresql tsvector value using ||, for example:
UPDATE acme_table
SET my_tsvector = my_tsvector ||
to_tsvector('english', 'some new words to add to existing ones')
WHERE id = 1234;
Is there any way to access this functionality via the Django ORM? I.e. incrementally add to an existing SearchVectorField value rather than reconstruct from scratch?
The issue I'm having is that the SearchVectorField property returns the tsvector as a string. So when I use + in place of the || operator, e.g.:
from django.contrib.postgres.search import SearchVector
instance.my_tsvector_prop += SearchVector(
["new", "fields"],
weight="A",
config='english'
)
I get the error:
TypeError: SearchVector can only be combined with other SearchVector instances, got str.
Because:
type(instance.my_tsvector_prop) == str
A fix to this open Django bug whereby a SearchVectorField property returns a SearchVector instance would probably enable this, if possible. (Although less efficient than combining in the database. In our case the update will run asynchronously so performance is not too important.)
I also tried combining the values in the database with an F() expression:
MyModel.objects.filter(pk=1234).update(
    my_tsvector_prop=F("my_tsvector_prop") + SearchVector(
        ["new_field_name"],
        weight="A",
        config='english',
    )
)
Returns:
FieldError: Cannot resolve expression type, unknown output_field
Another solution would be to run a raw SQL UPDATE, although I'd rather do it through the Django ORM if possible as our tsvector fields often reference values many joins away, so it'd be nice to find a sustainable solution.
I am trying to query my PostgreSQL database from Django. The query I'm using is:
s = Booking.objects.all().filter(modified_at__range=[last_run, current_time], coupon_code__in=l)
Now I am changing this object in some ways in my script, without saving it to the database. Is it possible to query the changed object?
say, I changed my variable as
s.modified_at = '2016-02-22'
Is it still possible to query this object as:
s.objects.all()
or something similar?
The model manager (Booking.objects) and the QuerySets it returns are Django's interface to the database (ORM). By definition this means you can only query data that has been stored in the database.
So, in short: "no". You cannot do queries on unsaved data.
Thinking about why you are even asking this, especially looking at the example using "modified_at": why do you not want to save your data?
(You might want to use auto_now=True for your "modified_at" field, btw.)
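If the goal is just to select among objects that were already fetched and then changed in memory, ordinary Python filtering is an option. The sketch below assumes a hypothetical Booking dataclass standing in for the model instances; it never touches the database, so it only sees what is already loaded:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Booking:  # hypothetical stand-in, not a Django model
    coupon_code: str
    modified_at: date


# Pretend these came from list(Booking.objects.filter(...)).
bookings = [
    Booking("SPRING", date(2016, 2, 20)),
    Booking("SUMMER", date(2016, 2, 21)),
]

# Change an object in memory without saving it.
bookings[0].modified_at = date(2016, 2, 22)

# "Query" the unsaved state with plain Python instead of the ORM.
recent = [b for b in bookings if b.modified_at >= date(2016, 2, 22)]
recent_codes = [b.coupon_code for b in recent]
```

Anything that goes back through the manager would hit the database again and ignore these unsaved changes.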
You could do something like this:
bookings = Booking.objects.all().filter(modified_at__range=[last_run, current_time], coupon_code__in=l)
for booking in bookings:
    booking.modified_at = 'some value'
    booking.save()  # now the booking object will have the updated value
Consider this query:
query = Novel.objects.< ...some filtering... >.annotate(
latest_chapter_id=Max("volume__chapter__id")
)
Actually what I need is to annotate each Novel with its latest Chapter object, so after this query, I have to execute another query to select actual objects by annotated IDs. IMO this is ugly. Is there a way to combine them into a single query?
Yes, it's possible.
To get a queryset containing all Chapters which are the last in their Novels, simply do:
from django.db.models import F, Max

Chapter.objects.annotate(
    last_chapter_pk=Max('novel__chapter__pk')
).filter(pk=F('last_chapter_pk'))
Tested on Django 1.7.
Possible with Django 3.2+
Make use of django.db.models.functions.JSONObject (added in Django 3.2) to combine multiple fields. In this example I'm fetching the latest object; however, it is possible to fetch any arbitrary object, provided that you can get LIMIT 1 to yield your object:
from django.db.models import OuterRef
from django.db.models.functions import JSONObject

MainModel.objects.annotate(
    last_object=RelatedModel.objects.filter(mainmodel=OuterRef("pk"))
    .order_by("-date_created")
    .values(
        data=JSONObject(
            id="id", body="body", date_created="date_created"
        )
    )[:1]
)
Yes, using Subqueries, docs: https://docs.djangoproject.com/en/3.0/ref/models/expressions/#subquery-expressions
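from django.db.models import OuterRef, Subquery

# Order descending so that the [:1] slice picks the latest chapter.
latest_chapters = Chapter.objects.filter(novel=OuterRef("pk")).order_by("-chapter_order")
novels_with_chapter = Novel.objects.annotate(
    latest_chapter=Subquery(latest_chapters.values("chapter")[:1])
)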
Tested on Django 3.0
The subquery creates a select statement inside the select statement for the novels, then adds this as an annotation. This means you only hit the database once.
I also prefer this to Rune's answer as it actually annotates a Novel object.
Hope this helps anyone who came looking much later, like I did.
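For readers who want to see what "a select statement inside the select statement" looks like, here is a rough sketch run with Python's built-in sqlite3 module. The novel and chapter tables and their columns are made up for illustration, and the SQL is hand-written, not what Django actually generates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE novel (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE chapter ("
    "id INTEGER PRIMARY KEY, novel_id INTEGER, chapter_order INTEGER, title TEXT)"
)
conn.executemany("INSERT INTO novel VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany(
    "INSERT INTO chapter VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "a1"), (2, 1, 2, "a2"), (3, 2, 1, "b1")],
)

# One query: the correlated scalar subquery is evaluated per novel row
# and picks that novel's chapter with the highest chapter_order.
rows = conn.execute(
    """
    SELECT n.id, n.title,
           (SELECT c.title
              FROM chapter c
             WHERE c.novel_id = n.id
             ORDER BY c.chapter_order DESC
             LIMIT 1) AS latest_chapter
      FROM novel n
     ORDER BY n.id
    """
).fetchall()
```

A novel without chapters still appears in the result, with NULL (None in Python) as its annotation.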
No, it's not possible to combine them into a single query.
You can read the following blog post to find two workarounds.
I'm trying to use the Django ORM for a task that requires a JOIN in SQL. I
already have a workaround that accomplishes the same task with multiple queries
and some off-DB processing, but I'm not satisfied by the runtime complexity.
First, I'd like to give you a short introduction to the relevant part of my
model. After that, I'll explain the task in English, SQL and (inefficient) Django ORM.
The Model
In my CMS model, posts are multi-language: For each post and each language, there can be one instance of the post's content. Also, when editing posts, I don't UPDATE, but INSERT new versions of them.
So, PostContent is unique on post, language and version. Here's the class:
class PostContent(models.Model):
    """Contains all versions of a post, in all languages."""
    language = models.ForeignKey(Language)
    # The Post object itself only contains slug and id.
    post = models.ForeignKey(Post)
    version = models.IntegerField(default=0)
    # further metadata and content left out

    class Meta:
        unique_together = (("post", "language", "version"),)
The Task in SQL
And this is the task: I'd like to get a list of the most recent versions of all posts in each language, using the ORM. In SQL, this translates to a JOIN on a subquery that does GROUP BY and MAX to get the maximum of version for each unique pair of post and language. The perfect answer to this question would be a number of ORM calls that produce the following SQL statement:
SELECT
id,
post_id,
version,
v
FROM
cms_postcontent,
(SELECT
post_id as p,
max(version) as v,
language_id as l
FROM
cms_postcontent
GROUP BY
post_id,
language_id
) as maxv
WHERE
post_id=p
AND version=v
AND language_id=l;
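As a sanity check of what this statement returns, here is a small sketch that runs it against an in-memory SQLite database using Python's built-in sqlite3 module. The sample rows are invented, the table is reduced to the four columns the query needs, and an ORDER BY is added only to make the output order predictable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cms_postcontent ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, language_id INTEGER, version INTEGER)"
)
conn.executemany(
    "INSERT INTO cms_postcontent VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 0), (2, 1, 1, 1),                # post 1, language 1: versions 0..1
        (3, 1, 2, 0),                              # post 1, language 2: version 0
        (4, 2, 1, 0), (5, 2, 1, 1), (6, 2, 1, 2),  # post 2, language 1: versions 0..2
    ],
)

# Join each row against the per-(post, language) maximum version.
rows = conn.execute(
    """
    SELECT id, post_id, version, v
      FROM cms_postcontent,
           (SELECT post_id AS p, max(version) AS v, language_id AS l
              FROM cms_postcontent
             GROUP BY post_id, language_id) AS maxv
     WHERE post_id = p AND version = v AND language_id = l
     ORDER BY id
    """
).fetchall()
```

Only the newest row per post and language survives the join; all older versions are filtered out.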
Solution in Django
My current solution using the Django ORM does not produce such a JOIN, but two separate SQL
queries, and one of those queries can become very large. I first execute the subquery (the inner SELECT from above):
maxv = PostContent.objects.values('post','language').annotate(
max_version=Max('version'))
Now, instead of joining maxv, I explicitly ask for every single post in maxv, by
filtering PostContent.objects.all() for each tuple of post, language, max_version. The resulting SQL looks like
SELECT * FROM PostContent WHERE
post=P1 and language=L1 and version=V1
OR post=P2 and language=L2 and version=V2
OR ...;
In Django:
from functools import reduce
from operator import or_
from django.db.models import Q

# One conjunction (post AND language AND version) per row of maxv.
conjunc = [
    Q(post=pc['post'], language=pc['language'], version=pc['max_version'])
    for pc in maxv
]
# OR all conjunctions together into a single filter.
result = PostContent.objects.filter(reduce(or_, conjunc))
If maxv is sufficiently small, e.g. when retrieving a single post, this might be
a good solution, but the size of the query and the time to create it grow linearly with
the number of posts. The complexity of parsing the query is also at least linear.
Is there a better way to do this, apart from using raw SQL?
You can join (in the sense of union) querysets with the | operator, as long as the querysets query the same model.
However, it sounds like you want something like PostContent.objects.order_by('version').distinct('language'); as you can't quite do that in 1.3.1, consider using values in combination with distinct() to get the effect you need.
The documentation has this example:
Book.objects.annotate(num_authors=Count('authors')).filter(num_authors__gt=1)
How can I filter authors, before executing annotation on authors?
For example, I want to count only those authors whose name is "John".
I don't believe you can make this selective count with the Django database-abstraction API without including some SQL. You make additions to a QuerySet's SQL using the extra method.
Assuming that the example in the documentation is part of an app called "inventory" and using syntax that works with PostgreSQL (you didn't specify a database, and it's the one I'm more familiar with), the following should do what you're asking for:
Book.objects.extra(
select={"john_count":
"""SELECT COUNT(*) from "inventory_book_authors"
INNER JOIN "inventory_author" ON ("inventory_book_authors"."author_id"="inventory_author"."id")
WHERE "inventory_author"."name"=%s
AND "inventory_book_authors"."book_id"="inventory_book"."id"
"""},
select_params=('John',)
)
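To show what the correlated subquery in that extra() call computes, here is a sketch using Python's built-in sqlite3 module instead of PostgreSQL. The table names follow the "inventory" assumption above, the rows are invented, and the outer SELECT is hand-written rather than generated by Django:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory_book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE inventory_author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE inventory_book_authors ("
    "id INTEGER PRIMARY KEY, book_id INTEGER, author_id INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory_book VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")]
)
conn.executemany(
    "INSERT INTO inventory_author VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "Mary")],
)
conn.executemany(
    "INSERT INTO inventory_book_authors (book_id, author_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 3), (3, 3)],
)

# For each book, count only the linked authors whose name matches the parameter.
rows = conn.execute(
    """
    SELECT b.id,
           (SELECT COUNT(*)
              FROM inventory_book_authors ba
              INNER JOIN inventory_author a ON ba.author_id = a.id
             WHERE a.name = ?
               AND ba.book_id = b.id) AS john_count
      FROM inventory_book b
     ORDER BY b.id
    """,
    ("John",),
).fetchall()
```

Books without a matching author get a count of 0 instead of being dropped. Newer Django versions (2.0 and later) can express the same idea without raw SQL through the filter argument of aggregates, e.g. Count('authors', filter=Q(authors__name='John')).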