I have models that look like this:
class Keyword(models.Model):
    name = models.CharField(max_length=255, unique=True)  # max_length value is illustrative
class Post(models.Model):
    title = models.CharField(max_length=255)
    keywords = models.ManyToManyField(
        Keyword, related_name="posts_that_have_this_keyword"
    )
Now I want to migrate all Posts of a wrongly named Keyword to a new properly named Keyword. And there are multiple wrongly named Keywords.
I can do the following, but it leads to a large number of SQL queries.
for keyword in Keyword.objects.filter(is_wrongly_named=True).iterator():
    old = keyword
    new, _ = Keyword.objects.get_or_create(name='some proper name')
    for post in old.posts_that_have_this_keyword.all():
        post.keywords.add(new)
    old.delete()
Is there a way I can achieve this while minimizing the SQL queries executed?
I would prefer a Django ORM solution to a raw SQL one: I jumped straight into the Django ORM without studying SQL in depth, so I am not very familiar with it.
Thank you.
If you want to perform bulk operations with M2M relationships I suggest that you act directly on the table that joins the two objects. Django allows you to access this otherwise anonymous table by using the through attribute on the M2M attribute on an object.
So, to get the table that joins Keywords and Posts you could reference either Keyword.posts_that_have_this_keyword.through or Post.keywords.through. I'd suggest you assign a nicely named variable to this like:
KeywordPost = Post.keywords.through
Once you have a handle on that table, bulk operations can be performed.
bulk create new entries (run this first, while the old rows still exist)
invalid_keyword_posts = KeywordPost.objects.filter(keyword__is_wrongly_named=True)
# list() evaluates the query now, before anything is deleted
post_ids_to_update = list(invalid_keyword_posts.values_list("post_id", flat=True).distinct())
# new_keyword is the properly named Keyword instance, fetched or created beforehand
new_keyword_posts = [KeywordPost(post_id=p_id, keyword=new_keyword) for p_id in post_ids_to_update]
KeywordPost.objects.bulk_create(new_keyword_posts)
bulk remove bad entries
KeywordPost.objects.filter(keyword__is_wrongly_named=True).delete()
Basically you get access to all the features that the ORM provides on this join table. You should be able to achieve much better performance that way.
You can read up more on the through attribute here: https://docs.djangoproject.com/en/3.0/ref/models/fields/#django.db.models.ManyToManyField.through
Good luck!
I have three models
class A(Model):
    ...
class B(Model):
    id = IntegerField()
    a = ForeignKey(A)
class C(Model):
    id = IntegerField()
    a = ForeignKey(A)
I want to get the pairs of (B.id, C.id) for which B.a == C.a. How do I make that join using the Django ORM?
Django allows you to reverse the lookup in much the same way that you do a forward lookup, using __:
It works backwards, too. To refer to a “reverse” relationship, just use the lowercase name of the model.
This example retrieves all Blog objects which have at least one Entry whose headline contains 'Lennon':
Blog.objects.filter(entry__headline__contains='Lennon')
I think you can do something like this, with @Daniel Roseman's caveat about the type of result set that you will get back.
ids = B.objects.prefetch_related('a', 'a__c').values_list('id', 'a__c__id')
The prefetch_related call will help with performance in older versions of Django, if memory serves.
I am using Django 1.9 and am trying the bulk_create to create many new model objects and associate them with a common related many_to_many object.
My models are as follows
# Computational Job object
class OT_job(models.Model):
    is_complete = models.BooleanField()
    is_submitted = models.BooleanField()
    user_email = models.EmailField()
# Many sequences
class Seq(models.Model):
    sequence = models.CharField(max_length=100)
    ot_job = models.ManyToManyField(OT_job)
I have thousands of Seq objects that are submitted and have to be associated with their job. Previously I was using an iterator and saving them in a for loop, but after some reading I realized that Django 1.9 has bulk_create.
Currently I am doing
DNASeqs_list = []
for a_seq in some_iterable_with_my_data:
    # I create new model instances and add them to the list
    DNASeqs_list.append(Seq(sequence=..., ))
I now want to bulk_create these sequences and associate them with the current_job_object.
created_dnaseqs = Seq.objects.bulk_create(DNASeqs_list)
# How do I streamline this part below
for a_seq in created_dnaseqs:
    # Had to call save here otherwise got an error
    a_seq.save()
    a_seq.ot_job.add(curr_job_obj)
I had to call "a_seq.save()" in the for loop because I got an error at the point where I was doing "a_seq.ot_job.add(curr_job_obj)", which said
....needs to have a value for field "seq" before this many-to-many relationship can be used.
Despite reading the other questions on this topic, I am still confused because, unlike the others, I do not have a custom "through" model. I am not sure how best to associate the OT_job with many Seqs with minimal hits to the database.
From the docs https://docs.djangoproject.com/en/1.9/ref/models/querysets/#bulk-create:
If the model’s primary key is an AutoField it does not retrieve and set the primary key attribute, as save() does.
It does not work with many-to-many relationships.
bulk_create literally just creates the objects; it does not retrieve the PK into the variable as save does. You would have to re-query the db to get your newly created objects, and then create the M2M relationships, but it sounds like that would not be appropriate and that your current method is the best solution.
Consider the following:
class Tag(Model):
    ...
class Post(Model):
    tags = ManyToManyField(Tag) # a join table "post_tags" is created
post = Post.objects.get(pk=1)
post.tags.all() # this will cause django to join "tag" with "post_tags"
post.tags.values('pk') # even though pk is already in post_tags, django will still join with "tag" table
I only need the list of PKs. Does anyone know of a supported way, or a clean hack, to get just the PKs from an M2M without an additional join to the actual related table?
You can check out the Django docs on prefetch_related. Quoting the docs:
prefetch_related, on the other hand, does a separate lookup for each
relationship, and does the ‘joining’ in Python. This allows it to
prefetch many-to-many and many-to-one objects, which cannot be done
using select_related, in addition to the foreign key and one-to-one
relationships that are supported by select_related.
So it should be:
post = Post.objects.filter(pk=1).prefetch_related('tags')[0]
You can define the relation using the through argument:
class Tag(Model):
    pass
class Post(Model):
    tags = ManyToManyField(Tag, through='PostTag')
class PostTag(Model):
    # on_delete is required since Django 2.0
    post = models.ForeignKey(Post, on_delete=models.CASCADE)
    tag = models.ForeignKey(Tag, on_delete=models.CASCADE)
then
PostTag.objects.filter(post_id=1).values('tag_id')
will perform in a single query, like this:
SELECT `appname_posttag`.`tag_id` FROM `appname_posttag` WHERE `appname_posttag`.`post_id` = 1
I'd like to set up a ForeignKey field in a django model which points to another table some of the time. But I want it to be okay to insert an id into this field which refers to an entry in the other table which might not be there. So if the row exists in the other table, I'd like to get all the benefits of the ForeignKey relationship. But if not, I'd like this treated as just a number.
Is this possible? Is this what Generic relations are for?
This question was asked a long time ago, but for newcomers there is now a built in way to handle this by setting db_constraint=False on your ForeignKey:
https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.db_constraint
customer = models.ForeignKey('Customer', on_delete=models.CASCADE, db_constraint=False)  # on_delete is required since Django 2.0
or, if you want it to be nullable as well as not enforcing referential integrity:
customer = models.ForeignKey('Customer', on_delete=models.CASCADE, null=True, blank=True, db_constraint=False)
We use this in cases where we cannot guarantee that the relations will get created in the right order.
I'm new to Django, so I don't know if it provides what you want out of the box. I thought of something like this:
from django.db import models

class YourModel(models.Model):
    my_fk = models.PositiveIntegerField()

    def set_fk_obj(self, obj):
        self.my_fk = obj.id

    def get_fk_obj(self):
        if self.my_fk is None:
            return None
        try:
            # YourFkModel stands for the model the id may refer to
            return YourFkModel.objects.get(pk=self.my_fk)
        except YourFkModel.DoesNotExist:
            return None
I don't know if you use the contrib admin app. With PositiveIntegerField instead of ForeignKey, the field would be rendered as a text field on the admin site.
This is probably as simple as declaring a ForeignKey and creating the column without actually declaring it as a FOREIGN KEY. That way, you'll get o.obj_id, o.obj will work if the object exists, and--I think--raise an exception if you try to load an object that doesn't actually exist (probably DoesNotExist).
However, I don't think there's any way to make syncdb do this for you. I found syncdb to be limiting to the point of being useless, so I bypass it entirely and create the schema with my own code. You can use syncdb to create the database, then alter the table directly, e.g. ALTER TABLE tablename DROP CONSTRAINT fk_constraint_name.
You also inherently lose ON DELETE CASCADE and all referential integrity checking, of course.
To apply @Glenn Maynard's solution via South, generate an empty South migration:
python manage.py schemamigration myapp name_of_migration --empty
Edit the migration file then run it:
def forwards(self, orm):
    db.delete_foreign_key('table_name', 'field_name')

def backwards(self, orm):
    sql = db.foreign_key_sql('table_name', 'field_name', 'foreign_table_name', 'foreign_field_name')
    db.execute(sql)
(Note: It might help if you explain why you want this. There might be a better way to approach the underlying problem.)
Is this possible?
Not with ForeignKey alone, because you're overloading the column values with two different meanings, without a reliable way of distinguishing them. (For example, what would happen if a new entry in the target table is created with a primary key matching old entries in the referencing table? What would happen to these old referencing entries when the new target entry is deleted?)
The usual ad hoc solution to this problem is to define a "type" or "tag" column alongside the foreign key, to distinguish the different meanings (but see below).
Is this what Generic relations are for?
Yes, partly.
GenericForeignKey is just a Django convenience helper for the pattern above; it pairs a foreign key with a type tag that identifies which table/model it refers to (using the model's associated ContentType; see contenttypes).
Example:
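# Django 1.7+ import path; older releases used django.contrib.contenttypes.generic
from django.contrib.contenttypes.fields import GenericForeignKey
from django.db import models

class Foo(models.Model):
    other_type = models.ForeignKey('contenttypes.ContentType', null=True, on_delete=models.CASCADE)
    other_id = models.PositiveIntegerField()
    # Optional accessor, not a stored column
    other = GenericForeignKey('other_type', 'other_id')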
This will allow you to use other like a ForeignKey, to refer to instances of your other model. (In the background, GenericForeignKey gets and sets other_type and other_id for you.)
To represent a number that isn't a reference, you would set other_type to None, and just use other_id directly. In this case, trying to access other will always return None, instead of raising DoesNotExist (or returning an unintended object, due to id collision).
fieldname = models.ForeignKey('table', null=True, blank=True, db_constraint=False)
Use this in your model definition.