I need to retrieve a value from a Many-To-Many query. Let's say I have 3 models: Toy, Part, and ToyParts
ToyParts has a field called "part_no". I need to be able to get the value of this.
class Toy(models.Model):
parts = models.ManyToManyField(Part, through="ToyParts")
class Part(models.Model):
pass
class ToyParts(models.Model):
toy = models.ForeignKey(Toy, ...)
part = models.ForeignKey(Part, ...)
part_no = models.CharField(...)
I've tried using:
toy.parts.all().first().part_no
which obviously doesn't work as Part does not have a field called "part_no"
I've also tried just simply using:
ToyParts.objects.filter(toy=..., part=...)
but that adds additional queries.
How would I be able to get part_no without querying ToyParts directly?
I've tried using: toy.parts.all().first().part_no
The part_no field is declared on the model ToyParts. You therefore need to get an instance of ToyParts to access this field. Assuming you have a Toy instance you can use the reverse relation to ToyParts, which defaults to toyparts_set, as follows:
toy.toyparts_set.first().part_no
How would I be able to get part_no without querying ToyParts directly?
You can't. If you want to reduce the number of queries you can use select_related:
for tp in toy.toyparts_set.select_related('part').all():
print(tp.part_no, tp.part.id)
In this example tp.part doesn't require an extra query as the part instance is already fetched by select_related.
Related
I have a concrete base model, from which other models inherit (all models in this question have been trimmed for brevity):
class Order(models.Model):
state = models.ForeignKey('OrderState')
Here are a few examples of the "child" models:
class BorrowOrder(Order):
parts = models.ManyToManyField('Part', through='BorrowOrderPart')
class ReturnOrder(Order):
parts = models.ManyToManyField('Part', through='ReturnOrderPart')
As you can see from these examples, each child model has a many-to-many relationship of Parts through a custom table. Those custom through-tables look something like this:
class BorrowOrderPart(models.Model):
borrow_order = models.ForeignKey('BorrowOrder', related_name='borrowed_parts')
part = models.ForeignKey('Part')
qty_borrowed = models.PositiveIntegerField()
class ReturnOrderPart(models.Model):
return_order = models.ForeignKey('ReturnOrder', related_name='returned_parts')
part = models.ForeignKey('Part')
qty_returned = models.PositiveIntegerField()
Note that the "quantity" field in each through table has a custom name (unfortunately): qty_borrowed or qty_returned. I'd like to be able to query the base table (so that I'm searching across all order types), and include an annotated field for each that sums these quantity fields:
# Not sure what I specify in the Sum() call here, given that the fields
# I'm interested in are different depending on the child's type.
qs = models.Order.objects.annotate(total_qty=Sum(???))
# For a single model, I would do something like:
qs = models.BorrowOrder.objects.annotate(
total_qty=Sum('borrowed_parts__qty_borrowed'))
So I guess I have two related questions:
Can I annotate a child-model's data through a query on the parent model?
If so, can I conditionally specify the field to be annotated, given that the actual field name changes depending on the model in question?
This feels to me like a place where using When() and Case() might be helpful, but I'm not sure how I'd build the necessary logic.
The problem is that, when you are querying the base model (in multi-table inheritance), it's hard to find out which subclass the object actually is. See How to know which is the child class of a model.
The query might be achievable in theory, with something like
SELECT
CASE
WHEN child1.base_ptr_id IS NOT NULL THEN ...
WHEN child2.base_ptr_id IS NOT NULL THEN ...
END AS ...
FROM base
LEFT JOIN child1 ON child1.base_ptr_id = base.id
LEFT JOIN child2 ON child2.base_ptr_id = base.id
...
but I don't know how to translate that in Django and I think it would be too much trouble to do it. It could be done, if not anything else using raw queries.
Another solution would be to add to the base class a field that specifies which actual subclass each object is; in that case, you'd need to make as many queries as there are subclasses and join them. I don't like this solution either. Update: After I slept on this I conclude that the most Django-like solution would be not to query the parent model in the first place; simply query the submodels and join the results. I would explore the third option below only if there were performance or other practical problems.
Another idea is to create a database view (with CREATE VIEW) based on the above SQL query and translate it into a Django model with managed = False, and query that one. Maybe this is somewhat cleaner than the other solutions, but it is a bit non-standard.
Context
I am quite new to Django and I am trying to write a complex query that I think would be easily writable in raw SQL, but for which I am struggling using the ORM.
Models
I have several models named SignalValue, SignalCategory, SignalSubcategory, SignalType, SignalSubtype that have the same structure like the following model:
class MyModel(models.Model):
id = models.BigAutoField(primary_key=True)
name = models.CharField()
fullname = models.CharField()
I also have explicit models that represent the relationships between the model SignalValue and the other models SignalCategory, SignalSubcategory, SignalType, SignalSubtype. Each of these relationships are named SignalValueCategory, SignalValueSubcategory, SignalValueType, SignalValueSubtype respectively. Below is the SignalValueCategory model as an example:
class SignalValueCategory(models.Model):
signal_value = models.OneToOneField(SignalValue)
signal_category = models.ForeignKey(SignalCategory)
Finally, I also have the two following models. ResultSignal stores all the signals related to the model Result:
class Result(models.Model):
pass
class ResultSignal(models.Model):
id = models.BigAutoField(primary_key=True)
result = models.ForeignKey(
Result
)
signal_value = models.ForeignKey(
SignalValue
)
Query
What I am trying to achieve is the following.
For a given Result, I want to retrieve all the ResultSignals that belong to it, filter them to keep the ones of my interest, and annotate them with two fields that we will call filter_group_id and filter_group_name. The values of two fields are determined by the SignalValue of the given ResultSignal.
From my perspective, the easiest way to achieve this would be first to annotate the SignalValues with their corresponding filter_group_name and filter_group_id, and then to join the resulting QuerySet with the ResultSignals. However, I think that it is not possible to join two QuerySets together in Django. Consequently, I thought that we could maybe use Prefetch objects to achieve what I am trying to do, but it seems that I am unable to make it work properly.
Code
I will now describe the current state of my queries.
First, annotating the SignalValues with their corresponding filter_group_name and filter_group_id. Note that filter_aggregator in the following code is just a complex filter that allows me to select the wanted SignalValues only. group_filter is the same filter but as a list of subfilters. Additionally, filter_name_case is a conditional expression (Case() construct):
# Attribute a group_filter_id and group_filter_name for each signal
signal_filters = SignalValue.objects.filter(
filter_aggregator
).annotate(
filter_group_id=Window(
expression=DenseRank(),
order_by=group_filters
),
filter_group_name=filter_name_case
)
Then, trying to join/annotate the SignalResults:
prefetch_object = Prefetch(
lookup="signal_value",
queryset=signal_filters,
to_attr="test"
)
result_signals: QuerySet = (
last_interview_result
.resultsignal_set
.filter(signal_value__in=signal_values_of_interest)
.select_related(
'signal_value__signalvaluecategory__signal_category',
'signal_value__signalvaluesubcategory__signal_subcategory',
'signal_value__signalvaluetype__signal_type',
'signal_value__signalvaluesubtype__signal_subtype',
)
.prefetch_related(
prefetch_object
)
.values(
"signal_value",
"test",
category=F('signal_value__signalvaluecategory__signal_category__name'),
subcategory=F('signal_value__signalvaluesubcategory__signal_subcategory__name'),
type=F('signal_value__signalvaluetype__signal_type__name'),
subtype=F('signal_value__signalvaluesubtype__signal_subtype__name'),
)
)
Normally, from my understanding, the resulting QuerySet should have a field "test" that is now available, that would contain the fields of signal_filter, the first QuerySet. However, Django complains that "test" is not found when calling .values(...) in the last part of my code: Cannot resolve keyword 'test' into field. Choices are: [...]. It is like the to_attr parameter of the Prefetch object was not taken into account at all.
Questions
Did I missunderstand the functioning of annotate() and prefetch_related() functions? If not, what am I doing wrong in my code for the specified parameter to_attr to not exist in my resulting QuerySet?
Is there a better way to join two QuerySets in Django or am I better off using RawSQL? An alternative way would be to switch to Pandas to make the join in-memory, but it is very often more efficient to do such transformations on the SQL side with well-designed queries.
You're on the right path, but just missing what prefetch does.
Your annotations are correct, but the "test" prefetch isn't really an attribute. You batch up the SELECT * FROM signal_value queries so you don't have to execute the select per row. Just drop the "test" annotation and you should be fine. https://docs.djangoproject.com/en/3.2/ref/models/querysets/#prefetch-related
Please don't use pandas, it's definitely not necessary and is a ton of overhead. As you say yourself, it's more efficient to do the transforms on the sql side
From the docs on prefetch_related:
Remember that, as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query.
It's not obvious but the values() call is part of these chained methods that imply a different query, and will actually cancel prefetch_related. This should work if you remove it.
I have three models
class A(Model):
...
class B(Model):
id = IntegerField()
a = ForeignKey(A)
class C(Model):
id = IntegerField()
a = ForeignKey(A)
I want get the pairs of (B.id, C.id), for which B.a==C.a. How do I make that join using the django orm?
Django allows you to reverse the lookup in much the same way that you can use do a forward lookup using __:
It works backwards, too. To refer to a “reverse” relationship, just use the lowercase name of the model.
This example retrieves all Blog objects which have at least one Entry whose headline contains 'Lennon':
Blog.objects.filter(entry__headline__contains='Lennon')
I think you can do something like this, with #Daniel Roseman's caveat about the type of result set that you will get back.
ids = B.objects.prefetch_related('a', 'a__c').values_list('id', 'a__c__id')
The prefetch related will help with performance in older versions of django if memory serves.
How can I apply annotations and filters from a custom manager queryset when filtering via a related field? Here's some code to demonstrate what I mean.
Manager and models
from django.db.models import Value, BooleanField
class OtherModelManager(Manager):
def get_queryset(self):
return super(OtherModelManager, self).get_queryset().annotate(
some_flag=Value(True, output_field=BooleanField())
).filter(
disabled=False
)
class MyModel(Model):
other_model = ForeignKey(OtherModel)
class OtherModel(Model):
disabled = BooleanField()
objects = OtherModelManager()
Attempting to filter the related field using the manager
# This should only give me MyModel objects with related
# OtherModel objects that have the some_flag annotation
# set to True and disabled=False
my_model = MyModel.objects.filter(some_flag=True)
If you try the above code you will get the following error:
TypeError: Related Field got invalid lookup: some_flag
To further clarify, essentially the same question was reported as a bug with no response on how to actually achieve this: https://code.djangoproject.com/ticket/26393.
I'm aware that this can be achieved by simply using the filter and annotation from the manager directly in the MyModel filter, however the point is to keep this DRY and ensure this behaviour is repeated everywhere this model is accessed (unless explicitly instructed not to).
How about running nested queries (or two queries, in case your backend is MySQL; performance).
The first to fetch the pk of the related OtherModel objects.
The second to filter the Model objects on the fetched pks.
other_model_pks = OtherModel.objects.filter(some_flag=...).values_list('pk', flat=True)
my_model = MyModel.objects.filter(other_model__in=other_model_pks)
# use (...__in=list(other_model_pks)) for MySQL to avoid a nested query.
I don't think what you want is possible.
1) I think you are miss-understanding what annotations do.
Generating aggregates for each item in a QuerySet
The second way to generate summary values is to generate an
independent summary for each object in a QuerySet. For example, if you
are retrieving a list of books, you may want to know how many authors
contributed to each book. Each Book has a many-to-many relationship
with the Author; we want to summarize this relationship for each book
in the QuerySet.
Per-object summaries can be generated using the annotate() clause.
When an annotate() clause is specified, each object in the QuerySet
will be annotated with the specified values.
The syntax for these annotations is identical to that used for the
aggregate() clause. Each argument to annotate() describes an aggregate
that is to be calculated.
So when you say:
MyModel.objects.annotate(other_model__some_flag=Value(True, output_field=BooleanField()))
You are not annotation some_flag over other_model.
i.e. you won't have: mymodel.other_model.some_flag
You are annotating other_model__some_flag over mymodel.
i.e. you will have: mymodel.other_model__some_flag
2) I'm not sure how familiar SQL is for you, but in order to preserve MyModel.objects.filter(other_model__some_flag=True) possible, i.e. to keep the annotation when doing JOINS, the ORM would have to do a JOIN over subquery, something like:
INNER JOIN
(
SELECT other_model.id, /* more fields,*/ 1 as some_flag
FROM other_model
) as sub on mymodel.other_model_id = sub.id
which would be super slow and I'm not surprised they are not doing it.
Possible solution
don't annotate your field, but add it as a regular field in your model.
The simplified answer is that models are authoritative on the field collection and Managers are authoritative on collections of models. In your efforts to make it DRY you made it WET, cause you alter the field collection in your manager.
In order to fix it, you would have to teach the model about the lookup and need to do that using the Lookup API.
Now I'm assuming that you're not actually annotating with a fixed value, so if that annotation is in fact reducible to fields, then you may just get it done, because in the end it needs to be mapped to database representation.
I'd like to set up a ForeignKey field in a django model which points to another table some of the time. But I want it to be okay to insert an id into this field which refers to an entry in the other table which might not be there. So if the row exists in the other table, I'd like to get all the benefits of the ForeignKey relationship. But if not, I'd like this treated as just a number.
Is this possible? Is this what Generic relations are for?
This question was asked a long time ago, but for newcomers there is now a built in way to handle this by setting db_constraint=False on your ForeignKey:
https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.db_constraint
customer = models.ForeignKey('Customer', db_constraint=False)
or if you want to to be nullable as well as not enforcing referential integrity:
customer = models.ForeignKey('Customer', null=True, blank=True, db_constraint=False)
We use this in cases where we cannot guarantee that the relations will get created in the right order.
EDIT: update link
I'm new to Django, so I don't now if it provides what you want out-of-the-box. I thought of something like this:
from django.db import models
class YourModel(models.Model):
my_fk = models.PositiveIntegerField()
def set_fk_obj(self, obj):
my_fk = obj.id
def get_fk_obj(self):
if my_fk == None:
return None
try:
obj = YourFkModel.objects.get(pk = self.my_fk)
return obj
except YourFkModel.DoesNotExist:
return None
I don't know if you use the contrib admin app. Using PositiveIntegerField instead of ForeignKey the field would be rendered with a text field on the admin site.
This is probably as simple as declaring a ForeignKey and creating the column without actually declaring it as a FOREIGN KEY. That way, you'll get o.obj_id, o.obj will work if the object exists, and--I think--raise an exception if you try to load an object that doesn't actually exist (probably DoesNotExist).
However, I don't think there's any way to make syncdb do this for you. I found syncdb to be limiting to the point of being useless, so I bypass it entirely and create the schema with my own code. You can use syncdb to create the database, then alter the table directly, eg. ALTER TABLE tablename DROP CONSTRAINT fk_constraint_name.
You also inherently lose ON DELETE CASCADE and all referential integrity checking, of course.
To do the solution by #Glenn Maynard via South, generate an empty South migration:
python manage.py schemamigration myapp name_of_migration --empty
Edit the migration file then run it:
def forwards(self, orm):
db.delete_foreign_key('table_name', 'field_name')
def backwards(self, orm):
sql = db.foreign_key_sql('table_name', 'field_name', 'foreign_table_name', 'foreign_field_name')
db.execute(sql)
Source article
(Note: It might help if you explain why you want this. There might be a better way to approach the underlying problem.)
Is this possible?
Not with ForeignKey alone, because you're overloading the column values with two different meanings, without a reliable way of distinguishing them. (For example, what would happen if a new entry in the target table is created with a primary key matching old entries in the referencing table? What would happen to these old referencing entries when the new target entry is deleted?)
The usual ad hoc solution to this problem is to define a "type" or "tag" column alongside the foreign key, to distinguish the different meanings (but see below).
Is this what Generic relations are for?
Yes, partly.
GenericForeignKey is just a Django convenience helper for the pattern above; it pairs a foreign key with a type tag that identifies which table/model it refers to (using the model's associated ContentType; see contenttypes)
Example:
class Foo(models.Model):
other_type = models.ForeignKey('contenttypes.ContentType', null=True)
other_id = models.PositiveIntegerField()
# Optional accessor, not a stored column
other = generic.GenericForeignKey('other_type', 'other_id')
This will allow you use other like a ForeignKey, to refer to instances of your other model. (In the background, GenericForeignKey gets and sets other_type and other_id for you.)
To represent a number that isn't a reference, you would set other_type to None, and just use other_id directly. In this case, trying to access other will always return None, instead of raising DoesNotExist (or returning an unintended object, due to id collision).
tablename= columnname.ForeignKey('table', null=True, blank=True, db_constraint=False)
use this in your program