I have a Django model with a recursive field; a simplified version is below. The idea is roughly to store a tree data structure in SQL. The problem I have is that, apparently, Django (or rather, the database) does not treat NULLs as equal. Since every tree's root necessarily has a 'null pointer', I can have two identical trees, but Django will treat them as different because of the NULL value. How can I implement the UniqueConstraint below so that two Link objects with NULL link values and equal node values are treated as identical and fail the uniqueness check? Thank you.
class Link(models.Model):
    node = models.ForeignKey(Node, on_delete=models.CASCADE)
    link = models.ForeignKey('self', on_delete=models.CASCADE, null=True)

    class Meta:
        constraints = [
            models.UniqueConstraint(fields=['node', 'link'], name='pipe_unique'),
        ]
EDIT
Of course, ideally the constraint would be enforced by the database. But even enforcing it in application logic, by hooking in somewhere or using a custom constraint, would be good enough.
You may be able to do this with a conditional (partial) unique constraint:

UniqueConstraint(fields=['node'], condition=Q(link__isnull=True), name='unique_root_node')
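Put together, the Meta could combine the original pair constraint with the partial one. A sketch (assuming the Node model from the question; on Django 5.0+ with a database that supports it, such as PostgreSQL 15+, `nulls_distinct=False` on a single constraint would achieve the NULLs-compare-equal behaviour directly instead):

```python
from django.db import models
from django.db.models import Q


class Link(models.Model):
    node = models.ForeignKey('Node', on_delete=models.CASCADE)
    link = models.ForeignKey('self', on_delete=models.CASCADE, null=True)

    class Meta:
        constraints = [
            # Uniqueness for (node, link) pairs where link is non-null
            models.UniqueConstraint(fields=['node', 'link'], name='pipe_unique'),
            # At most one root Link (link IS NULL) per node
            models.UniqueConstraint(
                fields=['node'],
                condition=Q(link__isnull=True),
                name='unique_root_node',
            ),
        ]
```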
EDIT:
If you wished to add the check manually, you could do it in the save method of Link, and also in the clean method so that it gets run by any model forms before you even get to saving the instance:
def clean(self):
    if self.node_id and not self.link_id:
        if self.__class__.objects.exclude(pk=self.pk).filter(node=self.node, link__isnull=True).exists():
            raise ValidationError(f'A root node already exists for {self.node}')
Excluding pk=self.pk avoids conflicts with the object itself when updating an existing instance.
def save(self, *args, **kwargs):
    self.clean()
    super().save(*args, **kwargs)
TLDR
When creating a new object with the Django ORM, can I set a field's value, in a transactionally safe / race-condition-free manner, based on an already existing object's value, say F('sequence_number') + 1, where F('sequence_number') refers not to the current object (which does not exist yet) but to the most recent object with that prefix in the table?
Longer version
I have a model Issue with properties sequence_number and sequence_prefix. There is a unique constraint on (sequence_prefix, sequence_number) (e.g. DATA-1).
class Issue(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    sequence_prefix = models.CharField(blank=True, default="", max_length=32)
    sequence_number = models.IntegerField(null=False)

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["sequence_prefix", "sequence_number"], name="unique_sequence"
            )
        ]
The idea is that issues (for auditing purposes) have unique sequence numbers for each variable, user-determined prefix: when creating an issue the user selects a prefix, e.g. REVIEW or DATA, and the sequence number is the incremented value of the previous issue with that same prefix. So it's like an AutoField, but dependent on the value of another field. There cannot be two issues DATA-1, but REVIEW-1, DATA-1 and OTHER-1 may all exist at the same time.
How can I tell Django when creating an Issue, that it must find the most recent object for that given sequence_prefix, take the sequence_number + 1 and use that for the new object's sequence_number value, in a way that is safe of any race-condition?
A good way to achieve this is to override the save() method of the Issue model.
For example:
from django.db.models import Max


class Issue(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    sequence_prefix = models.CharField(blank=True, default="", max_length=32)
    sequence_number = models.IntegerField(null=False)

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["sequence_prefix", "sequence_number"], name="unique_sequence"
            )
        ]

    def save(self, *args, **kwargs):
        if self.sequence_number is None:
            # Find the highest sequence_number for this prefix (None if no rows yet)
            current_max = Issue.objects.filter(
                sequence_prefix=self.sequence_prefix
            ).aggregate(Max("sequence_number"))["sequence_number__max"]
            self.sequence_number = (current_max or 0) + 1
        super().save(*args, **kwargs)
In this way, before saving the object, you take the maximum sequence_number for the sequence_prefix you are saving. Note that this read-then-write is still a race window between concurrent inserts; the unique constraint will reject the loser, so you may want to retry on IntegrityError.
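One way to close that race window is to let the unique constraint itself be the arbiter and retry on conflict. A minimal sketch (create_issue is a hypothetical helper, not part of the original code):

```python
from django.db import IntegrityError, transaction
from django.db.models import Max


def create_issue(prefix, attempts=5):
    """Allocate the next sequence number for `prefix`, retrying on collisions."""
    for _ in range(attempts):
        current_max = Issue.objects.filter(
            sequence_prefix=prefix
        ).aggregate(m=Max("sequence_number"))["m"] or 0
        try:
            # atomic() ensures a failed INSERT doesn't poison an outer transaction
            with transaction.atomic():
                return Issue.objects.create(
                    sequence_prefix=prefix, sequence_number=current_max + 1
                )
        except IntegrityError:
            # Another writer took this number; recompute the max and try again
            continue
    raise RuntimeError("could not allocate a sequence number")
```

Under contention this does a few extra round trips, but it never hands out a duplicate number, because the database-level constraint is the final authority.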
Unless you want to use database sequences (AutoField), I believe you will need to implement something on your own. There are two options:

Prevent concurrent inserts per specific sequence_prefix with some locking mechanism (I would use Redis for a distributed lock, to support a multi-processing setup)

Implement your own sequencing (again, Redis is a perfect choice), which will provide you with an auto-incrementing sequence_number per prefix. For example:
sequence_number = redis_client.incr('sequence:REVIEW')
I want to model pair-wise relations between all members of a set.
class Match(models.Model):
    foo_a = models.ForeignKey(Foo, on_delete=models.CASCADE, related_name='foo_a')
    foo_b = models.ForeignKey(Foo, on_delete=models.CASCADE, related_name='foo_b')
    relation_value = models.IntegerField(default=0)

    class Meta:
        unique_together = ('foo_a', 'foo_b')
When I add a pair A-B, it successfully prevents me from adding A-B again, but does not prevent me from adding B-A.
I tried following, but to no avail.
unique_together = (('foo_a', 'foo_b'), ('foo_b', 'foo_a'))
Edit:
I need the relation_value to be unique for every pair of items.
If you define a model like the one you defined, it's not just a ForeignKey; it is effectively a many-to-many relation. The Django docs explicitly state that the unique_together constraint cannot include a many-to-many relation.
From the docs,
A ManyToManyField cannot be included in unique_together. (It’s not clear what that would even mean!) If you need to validate uniqueness related to a ManyToManyField, try using a signal or an explicit through model.
EDIT
After a lot of searching and some trial and error, I think I have found a solution for your scenario. Yes, as you said, the present schema is not as trivial as we all think. In this context, the many-to-many relation is not the direction to pursue. The solution (or what I think the solution is) is the model clean method:
class Match(models.Model):
    foo_a = models.ForeignKey(Foo, on_delete=models.CASCADE, related_name='foo_a')
    foo_b = models.ForeignKey(Foo, on_delete=models.CASCADE, related_name='foo_b')

    def clean(self):
        # Check for the pair in both orders
        a_to_b = Match.objects.filter(foo_a=self.foo_a, foo_b=self.foo_b)
        b_to_a = Match.objects.filter(foo_a=self.foo_b, foo_b=self.foo_a)
        if a_to_b.exists() or b_to_a.exists():
            raise ValidationError('This pair already exists (in either order).')
For more details about the model clean method, refer to the docs here...
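One thing to keep in mind: clean() is only run automatically by ModelForm validation. When creating instances directly through the ORM, you need to trigger it yourself via full_clean(), roughly like so:

```python
match = Match(foo_a=a, foo_b=b)
match.full_clean()  # runs clean() and raises ValidationError on a duplicate pair
match.save()
```

Also note this check is not race-proof: two requests validating at the same time can both pass clean() and then both save, so it complements rather than replaces a database-level guarantee.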
I've overridden the save method of the object to save two records every time: if the user wants to add a pair A-B, a record B-A with the same parameters is automatically added.
Note: This solution affects the querying speed. For my project, it is not an issue, but it needs to be considered.
def save(self, *args, **kwargs):
    if not Match.objects.filter(foo_a=self.foo_a, foo_b=self.foo_b).exists():
        super(Match, self).save(*args, **kwargs)
    if not Match.objects.filter(foo_a=self.foo_b, foo_b=self.foo_a).exists():
        # Mirror record with the same value, so lookups work in either order
        Match.objects.create(foo_a=self.foo_b, foo_b=self.foo_a,
                             relation_value=self.relation_value)
EDIT: Update and remove methods need to be overridden too of course.
In MS SQL Server there is a feature to create a calculated column: a table column that is calculated on the fly at retrieval time. This more-or-less maps on to using a method on a Django model to retrieve a calculated value (the common example being retrieving Full Name, based on stored Forename/Surname fields).
For expensive operations, SQL Server provides a Persisted option. This populates the table column with the results of the calculation, and updates those results when the table is updated - a very useful feature when the calculation is not quick but does not change often compared to access.
However, in Django I cannot find a way to duplicate this functionality. Am I missing something obvious? My best guess would be some sort of custom Field that takes a function as a parameter, but I couldn't see a pre-existing one of those. Is there a better way?
One approach is just to use a regular model field that is calculated whenever an object is saved, e.g.,:
class MyModel(models.Model):
    first_name = models.CharField(max_length=255)
    surname = models.CharField(max_length=255)
    # This is your 'persisted' field
    full_name = models.CharField(max_length=255, blank=True)

    def save(self, *args, **kwargs):
        # Set the full name whenever the object is saved
        self.full_name = '{} {}'.format(self.first_name, self.surname)
        super(MyModel, self).save(*args, **kwargs)
You could make this special field read-only in the admin and similarly exclude it from any model forms.
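For instance, that could look like the following (MyModelAdmin and MyModelForm are illustrative names, not from the original answer):

```python
from django import forms
from django.contrib import admin


class MyModelAdmin(admin.ModelAdmin):
    # Show the computed value in the admin, but don't allow editing it
    readonly_fields = ('full_name',)


class MyModelForm(forms.ModelForm):
    class Meta:
        model = MyModel
        # Keep the computed field out of user-facing forms entirely
        exclude = ('full_name',)


admin.site.register(MyModel, MyModelAdmin)
```

The benefit over a plain method is that full_name lives in the database, so you can filter, order, and index on it; the cost is that it only updates through save(), so bulk updates and raw SQL can leave it stale.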
I have a simple model with a generic foreign key:
class Generic(models.Model):
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey('content_type', 'object_id')
I would like to filter all entries in this table that have non-null content_object values, i.e. filter out all instances of Generic whose content objects no longer exist:
Generic.objects.filter(~Q(content_object=None))
This doesn't work, giving the exception:
django.core.exceptions.FieldError: Field 'content_object' does not
generate an automatic reverse relation and therefore cannot be used
for reverse querying. If it is a GenericForeignKey, consider adding a
GenericRelation.
Adding GenericRelation to the referenced content type models makes no difference.
Any help on how to achieve this would be appreciated, many thanks.
EDIT: I realise I could cascade the delete, however this is not an option in my situation (I wish to retain the data).
If you want to filter some records out, it's often better to use the exclude() method:
Generic.objects.exclude(object_id__isnull=True)
Note, though, that your model now doesn't allow empty content_object fields. To change this behaviour, use the null=True argument to both object_id and content_type fields.
Update
Okay, since the question has shifted from filtering out null records to determining broken RDBMS references without help of RDBMS itself, I'd suggest a (rather slow and memory hungry) workaround:
broken_items = []
for ct in ContentType.objects.all():
    broken_items.extend(
        Generic.objects
        .filter(content_type=ct)
        .exclude(object_id__in=ct.model_class().objects.all())
        .values_list('pk', flat=True)
    )
This would work as a one-time script, but not as a robust solution. If you absolutely want to retain the data, the only fast way I can think of is having an is_deleted boolean flag in your Generic model and setting it in a (post|pre)_delete signal.
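A sketch of that signal-based approach, assuming an is_deleted BooleanField has been added to Generic (the receiver name is illustrative):

```python
from django.contrib.contenttypes.models import ContentType
from django.db.models.signals import post_delete
from django.dispatch import receiver


@receiver(post_delete)
def flag_orphaned_generics(sender, instance, **kwargs):
    """Whenever any object is deleted, flag Generic rows that pointed at it."""
    if sender is Generic:
        return  # don't react to deletions of Generic itself
    ct = ContentType.objects.get_for_model(sender)
    Generic.objects.filter(
        content_type=ct, object_id=instance.pk
    ).update(is_deleted=True)
```

With the flag in place, the original query becomes a cheap `Generic.objects.filter(is_deleted=False)` instead of a scan over every content type.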
I'd like to set up a ForeignKey field in a django model which points to another table some of the time. But I want it to be okay to insert an id into this field which refers to an entry in the other table which might not be there. So if the row exists in the other table, I'd like to get all the benefits of the ForeignKey relationship. But if not, I'd like this treated as just a number.
Is this possible? Is this what Generic relations are for?
This question was asked a long time ago, but for newcomers there is now a built in way to handle this by setting db_constraint=False on your ForeignKey:
https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.db_constraint
customer = models.ForeignKey('Customer', db_constraint=False)
or, if you want it to be nullable as well as not enforcing referential integrity:
customer = models.ForeignKey('Customer', null=True, blank=True, db_constraint=False)
We use this in cases where we cannot guarantee that the relations will get created in the right order.
I'm new to Django, so I don't know if it provides what you want out of the box. I thought of something like this:
from django.db import models

class YourModel(models.Model):
    my_fk = models.PositiveIntegerField(null=True)

    def set_fk_obj(self, obj):
        self.my_fk = obj.id

    def get_fk_obj(self):
        if self.my_fk is None:
            return None
        try:
            return YourFkModel.objects.get(pk=self.my_fk)
        except YourFkModel.DoesNotExist:
            return None
I don't know if you use the contrib admin app. Using PositiveIntegerField instead of ForeignKey, the field would be rendered as a plain text field on the admin site.
This is probably as simple as declaring a ForeignKey and creating the column without actually declaring it as a FOREIGN KEY. That way, you'll get o.obj_id, o.obj will work if the object exists, and--I think--raise an exception if you try to load an object that doesn't actually exist (probably DoesNotExist).
However, I don't think there's any way to make syncdb do this for you. I found syncdb to be limiting to the point of being useless, so I bypass it entirely and create the schema with my own code. You can use syncdb to create the database, then alter the table directly, eg. ALTER TABLE tablename DROP CONSTRAINT fk_constraint_name.
You also inherently lose ON DELETE CASCADE and all referential integrity checking, of course.
To do the solution by #Glenn Maynard via South, generate an empty South migration:
python manage.py schemamigration myapp name_of_migration --empty
Edit the migration file then run it:
def forwards(self, orm):
    db.delete_foreign_key('table_name', 'field_name')

def backwards(self, orm):
    sql = db.foreign_key_sql('table_name', 'field_name', 'foreign_table_name', 'foreign_field_name')
    db.execute(sql)
Source article
(Note: It might help if you explain why you want this. There might be a better way to approach the underlying problem.)
Is this possible?
Not with ForeignKey alone, because you're overloading the column values with two different meanings, without a reliable way of distinguishing them. (For example, what would happen if a new entry in the target table is created with a primary key matching old entries in the referencing table? What would happen to these old referencing entries when the new target entry is deleted?)
The usual ad hoc solution to this problem is to define a "type" or "tag" column alongside the foreign key, to distinguish the different meanings (but see below).
Is this what Generic relations are for?
Yes, partly.
GenericForeignKey is just a Django convenience helper for the pattern above; it pairs a foreign key with a type tag that identifies which table/model it refers to (using the model's associated ContentType; see contenttypes).
Example:
class Foo(models.Model):
    other_type = models.ForeignKey('contenttypes.ContentType', null=True)
    other_id = models.PositiveIntegerField()

    # Optional accessor, not a stored column
    other = generic.GenericForeignKey('other_type', 'other_id')
This will allow you to use other like a ForeignKey, to refer to instances of your other model. (In the background, GenericForeignKey gets and sets other_type and other_id for you.)
To represent a number that isn't a reference, you would set other_type to None, and just use other_id directly. In this case, trying to access other will always return None, instead of raising DoesNotExist (or returning an unintended object, due to id collision).
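In use, the two cases might look like this (some_bar_instance stands in for any saved model instance):

```python
# Reference case: assigning to the accessor fills in other_type and other_id
foo = Foo()
foo.other = some_bar_instance  # any saved model instance
foo.save()

# Plain-number case: no type tag, just store the raw id
foo2 = Foo(other_type=None, other_id=12345)
foo2.save()
assert foo2.other is None  # the accessor returns None rather than raising
```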
field_name = models.ForeignKey('TableName', null=True, blank=True, db_constraint=False)

Use this in your model.