Django cross table model structure - python

I have a System model, and an Interface model. An Interface is the combination between two systems. Before, this interface was represented as an Excel sheet (cross table). Now I'd like to store it in a database.
I tried creating an Interface model with two foreign keys to System. This doesn't work because:
It creates two different reverse relationships on the target model
It doesn't prevent duplicates (the same pair with the first and second systems swapped)
I used this code:
class SystemInterface(Interface):
    assigned_to = models.ManyToManyField(User)
    first_system = models.ForeignKey(System)
    second_system = models.ForeignKey(System)
Isn't there a better way to do this?
I need symmetrical relations: it shouldn't matter whether a System is the "first" or the "second" in an Interface.

I think the simplest way to represent those models would be like this:
class System(models.Model):
    pass
class Interface(models.Model):
    assigned_to = models.ManyToManyField(to=User)
    system = models.ForeignKey(System)
    @property
    def systems(self):
        return Interface.objects.get(system=self.system).interfacedsystem_set.all()
class InterfacedSystem(models.Model):
    interface = models.ForeignKey(Interface)
    system = models.ForeignKey(System)
Adding and removing interfaced systems is left as an exercise to the reader, but it should be fairly easy.

You can use a many-to-many relationship with extra fields, but it can't be symmetrical.
The table used for a many-to-many relation contains one row per relation between two models. The table used for a many-to-many relation from System to itself has one row per relation between two Systems. This is consistent with the fact that your model fits the structure of a model used for ManyToManyField.through.
Using an intermediary model lets you add fields such as assigned_to to the many-to-many table.
It might be tricky to understand, but it should prevent the creation of SystemInterface(left_system=system_a, right_system=system_b). Note that I renamed "first" to "left" and "second" to "right" to represent a row of the many-to-many table, which has a "left" side and a "right" side.
Because such relations can't be symmetrical, this won't solve the problem of having one SystemInterface(left_system=system_a, right_system=system_b) and another SystemInterface(left_system=system_b, right_system=system_a). You should prevent that in the clean() method of SystemInterface, or of whichever model you use as the ManyToManyField.through model.
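The check that clean() would need can be sketched in plain Python, independent of Django. This is only a minimal sketch: the helper names canonical_pair and is_duplicate are hypothetical, and the list of existing pairs stands in for a lookup against the SystemInterface table.

```python
def canonical_pair(left_id, right_id):
    """Return the two ids in a fixed order so (a, b) and (b, a) compare equal."""
    return (left_id, right_id) if left_id <= right_id else (right_id, left_id)


def is_duplicate(existing_pairs, left_id, right_id):
    """Return True if the pair, in either order, is already present.

    existing_pairs is an iterable of (left_id, right_id) tuples; in a real
    clean() method this would be a database query instead.
    """
    seen = {canonical_pair(a, b) for a, b in existing_pairs}
    return canonical_pair(left_id, right_id) in seen


# Made-up ids for illustration.
existing = [(1, 2), (3, 5)]
print(is_duplicate(existing, 2, 1))  # swapped copy of an existing pair
print(is_duplicate(existing, 1, 3))  # a pair that is not stored yet
```

Inside clean(), a True result would typically be turned into a ValidationError so that the swapped row is rejected before it is saved.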

Since Django doesn't support symmetrical many-to-many relationships with extra data, you probably need to enforce this yourself.
If you have a convenient immutable value in the system (e.g. the system id), you can define a predictable rule for which system is stored in which column of your table. If the systems have always been saved by the time you create the Interface object, you can use the primary key.
Then, write a function to create the interface. For example:
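class System(models.Model):
    def addInterface(self, other_system, user):
        system_interface = SystemInterface()
        # Always store the system with the lower id first, so that a pair
        # has exactly one possible representation.
        if other_system.id < self.id:
            system_interface.first_system = other_system
            system_interface.second_system = self
        else:
            system_interface.first_system = self
            system_interface.second_system = other_system
        system_interface.save()
        # assigned_to is a ManyToManyField, so it can only be populated
        # after the instance has been saved.
        system_interface.assigned_to.add(user)
        return system_interface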
Using this design, you can do the usual validation, duplication detection, etc. on the SystemInterface object. The main point is that you enforce the constraint in your code rather than in the data model.
Does this make sense?

Related

Retrieving a child model's annotation via a query to the parent in Django?

I have a concrete base model, from which other models inherit (all models in this question have been trimmed for brevity):
class Order(models.Model):
    state = models.ForeignKey('OrderState')
Here are a few examples of the "child" models:
class BorrowOrder(Order):
    parts = models.ManyToManyField('Part', through='BorrowOrderPart')
class ReturnOrder(Order):
    parts = models.ManyToManyField('Part', through='ReturnOrderPart')
As you can see from these examples, each child model has a many-to-many relationship of Parts through a custom table. Those custom through-tables look something like this:
class BorrowOrderPart(models.Model):
    borrow_order = models.ForeignKey('BorrowOrder', related_name='borrowed_parts')
    part = models.ForeignKey('Part')
    qty_borrowed = models.PositiveIntegerField()
class ReturnOrderPart(models.Model):
    return_order = models.ForeignKey('ReturnOrder', related_name='returned_parts')
    part = models.ForeignKey('Part')
    qty_returned = models.PositiveIntegerField()
Note that the "quantity" field in each through table has a custom name (unfortunately): qty_borrowed or qty_returned. I'd like to be able to query the base table (so that I'm searching across all order types), and include an annotated field for each that sums these quantity fields:
# Not sure what I specify in the Sum() call here, given that the fields
# I'm interested in are different depending on the child's type.
qs = models.Order.objects.annotate(total_qty=Sum(???))
# For a single model, I would do something like:
qs = models.BorrowOrder.objects.annotate(
    total_qty=Sum('borrowed_parts__qty_borrowed'))
So I guess I have two related questions:
Can I annotate a child-model's data through a query on the parent model?
If so, can I conditionally specify the field to be annotated, given that the actual field name changes depending on the model in question?
This feels to me like a place where using When() and Case() might be helpful, but I'm not sure how I'd build the necessary logic.
The problem is that, when you are querying the base model (in multi-table inheritance), it's hard to find out which subclass the object actually is. See How to know which is the child class of a model.
The query might be achievable in theory, with something like
SELECT
    CASE
        WHEN child1.base_ptr_id IS NOT NULL THEN ...
        WHEN child2.base_ptr_id IS NOT NULL THEN ...
    END AS ...
FROM base
LEFT JOIN child1 ON child1.base_ptr_id = base.id
LEFT JOIN child2 ON child2.base_ptr_id = base.id
...
but I don't know how to translate that into Django, and I think it would be too much trouble to do so. If nothing else, it could be done with raw queries.
Another solution would be to add to the base class a field that specifies which actual subclass each object is; in that case, you'd need to make as many queries as there are subclasses and join them. I don't like this solution either.
Update: After sleeping on this, I conclude that the most Django-like solution would be not to query the parent model in the first place; simply query the submodels and join the results. I would explore the third option below only if there were performance or other practical problems.
Another idea is to create a database view (with CREATE VIEW) based on the above SQL query and translate it into a Django model with managed = False, and query that one. Maybe this is somewhat cleaner than the other solutions, but it is a bit non-standard.
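The "query the submodels and join the results" approach can be sketched in plain Python. The two lists below are made-up stand-ins for what each per-subclass annotated query might return as (order id, total quantity) rows; combine_totals is a hypothetical helper name, not part of Django.

```python
from itertools import chain

# Stand-ins for the rows each per-subclass query would return, as
# (order_id, total_qty) tuples. The values are invented for illustration.
borrow_totals = [(1, 5), (4, 2)]
return_totals = [(2, 7), (3, 1)]


def combine_totals(*per_type_rows):
    """Merge the per-subclass results into one list ordered by order id."""
    return sorted(chain(*per_type_rows), key=lambda row: row[0])


all_totals = combine_totals(borrow_totals, return_totals)
print(all_totals)
```

Each subclass keeps its own field name in its own query, and only the already-computed totals are merged, so no conditional logic on the parent model is needed.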

Using a single ManyToMany relation table instead of ManyToMany & ForeignKey field on multiple models?

I have a Django application that handles data analysis workflows, with database models that look something like this:
class Workflow(models.Model):
    execution_id = models.UUIDField()
class WorkflowItem(models.Model):
    workflow = models.ForeignKey(Workflow)
    type = models.CharField(choices=["input", "output"])
    files = models.ManyToManyField(File)
class File(models.Model):
    path = models.CharField()
class FileMetadata(models.Model):
    metadata = models.JSONField()
    file = models.ForeignKey(File)
    version = models.IntegerField()
A given Workflow will have many WorkflowItems, each of which corresponds to Files that can be used by WorkflowItems across many Workflows. Each File can have many associated FileMetadata entries, of which the one with the max version value is typically used for a given operation.
As the application has been growing, it's getting tedious to build out all the different combinations of logic needed to find the entries in one table based on a given entry in another table just by using each table's foreign key interface (Workflow <-> WorkflowItem <-> File <-> FileMetadata).
I am considering just building a table that holds all the foreign keys for every relationship in a single place. Something like this:
class WorkflowFile(models.Model):
workflow = models.ForeignKey(Workflow)
workflow_item = models.ForeignKey(WorkflowItem)
file = models.ForeignKey(File)
file_metadata = models.ForeignKey(FileMetadata)
However, I am not sure if this is a good idea or not. It's not clear to me if implementing a table like this is advantageous compared to just following all the foreign key relationships individually per table. It's also not clear to me how I should set up such a table through Django, and if the new requirement for manually entering values into this table all the time would outweigh the reduced need for unique query logic every time I want to query these relationships. My end goal is to provide a simpler, more consistent way to get all of the items in the relationship based on any of the other items in the relationship.
This question seems similar in premise, but I am not clear that the problem or proposed solution is relevant to what I am looking for here.
I'm not sure this will actually answer your question, but if you want to go with multiple foreign keys, you could use a through table in combination with the m2m_changed signal to fill in the proper foreign keys on that model after M2M records are added to WorkflowItem.
It would be something like:
from django.db.models.signals import m2m_changed
class WorkflowItem(models.Model):
    workflow = models.ForeignKey(Workflow)
    type = models.CharField(choices=["input", "output"])
    # The through model is defined below, so refer to it by name.
    files = models.ManyToManyField(File, through='IntermediateTable')
class IntermediateTable(models.Model):
    file = models.ForeignKey(File, related_name='file')
    workflow_item = models.ForeignKey(WorkflowItem, related_name='worflowitem')
    workflow = models.ForeignKey(Workflow, null=True)
    file_metadata = models.ForeignKey(FileMetadata)
def workflow_item_changed(sender, instance, action, pk_set, **kwargs):
    # sender is the through model; instance is the object whose relation changed
    # (assumed here to be a WorkflowItem) and pk_set holds the added File ids.
    if action == "post_add":
        for row in sender.objects.filter(workflow_item=instance, file_id__in=pk_set):
            row.workflow = instance.workflow
            ...
            row.save()
m2m_changed.connect(workflow_item_changed, sender=WorkflowItem.files.through)

ForeignKey vs CharField

I have an idea for a data model in Django and I was wondering if someone could point out the pros and cons of these two setups.
Setup 1: This would be an obvious one. Using CharFields for each field of each object
class Person(models.Model):
    name = models.CharField(max_length=255)
    surname = models.CharField(max_length=255)
    city = models.CharField(max_length=255)
Setup 2: This is the one I am thinking about. Using a ForeignKey to Objects that contain the values that current Object should have.
class Person(models.Model):
    name = models.ForeignKey('Name')
    surname = models.ForeignKey('Surname')
    city = models.ForeignKey('City')
class Chars(models.Model):
    value = models.CharField(max_length=255)
    def __str__(self):
        return self.value
    class Meta:
        abstract = True
class Name(Chars): pass
class Surname(Chars): pass
class City(Chars): pass
So in setup 1, I would create an Object with:
Person.objects.create(name='Name', surname='Surname', city='City')
and each object would have its own data. In setup 2, I would have to do this:
_name = Name.objects.get_or_create(value='Name')[0]
_surname = Surname.objects.get_or_create(value='Surname')[0]
_city = City.objects.get_or_create(value='City')[0]
Person.objects.create(name=_name, surname=_surname, city=_city)
Question: Main purpose for this would be to reuse existing values for multiple objects, but is this something worth doing, when you take into consideration that you need multiple hits on the database to create an Object?
Choosing the correct design pattern for your application is a very wide area which is influenced by many factors that are even possibly out of scope in a Stack Overflow question. So in a sense your question could be a bit subjective and too broad.
Nevertheless, I would say that assigning a separate model (class) for first name, another for last name, etc. is overkill. You might essentially end up overengineering your app.
The main reasoning behind the above recommendation is that you probably do not want to treat a name as a separate entity and possibly attach additional properties to it. Unless you really would need such a feature, a name is usually a plain string that some users happen to have identical.
There is no benefit in keeping name and surname as separate objects/models/DB tables. In your setup, if you don't make name and surname unique, then it doesn't make any sense to put them in separate models. Even worse, it will incur additional DB work and decrease performance. And if you do make them unique, then you have to deal with the situation where, e.g., one user changes their name and by default it changes for all users with that name.
City is different: there are not that many cities, so it's a good idea to keep it as a separate object and refer to it via a foreign key from the user. This saves disk space and makes it easy to get all users from the same city. Even better, you can prepopulate the cities table and provide autocompletion for users entering their city. For performance, though, you might still want to keep city as a string on the user model.
As for a field like 'gender': since there are not many possible choices for this data, it's worth using an enumeration in your code and storing a value in the DB, i.e. use choices instead of a ForeignKey to a separate DB table.
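As a rough illustration of the enumeration idea, here is a minimal sketch using only the standard library. The class name Gender, the stored codes and the labels are all assumptions made up for the example; the resulting list has the (stored value, display label) shape that a choices argument expects.

```python
from enum import Enum


class Gender(Enum):
    # The stored codes ("F", "M", "X") and the labels are illustrative only.
    FEMALE = ("F", "Female")
    MALE = ("M", "Male")
    OTHER = ("X", "Other")

    @property
    def db_value(self):
        """Short code that would be stored in the database column."""
        return self.value[0]

    @property
    def label(self):
        """Human-readable text shown to users."""
        return self.value[1]


# (stored value, display label) pairs built from the enumeration.
GENDER_CHOICES = [(g.db_value, g.label) for g in Gender]
print(GENDER_CHOICES)
```

The list could then be passed to a field definition such as models.CharField(max_length=1, choices=GENDER_CHOICES), which keeps the set of values in code and avoids an extra table and join.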

Nice pythonic way to specify django model field choices with extra attributes and methods

How can you specify choices on a django model such that the "choice" carries more information than just the database value and display value (as required by django's choices spec)? Suppose the different choices have a number of config options that I want to set (in code) as well as some methods, and the methods may be different between different choices. Here's an example:
class Reminder(models.Model):
    frequency = models.CharField(choices=SOME_CHOICES)
    next_reminder = models.DateTimeField()
    ...
How should we specify SOME_CHOICES for something like "weekly" and "monthly" reminders? So far my best solution is to write a class for each frequency choice, and store the class name in the database and then import the class by name whenever I need the methods or config data.
Ideally I would like to specify all of these config values and define the methods for each choice all in one place, rather than have the logic scattered all over a bunch of Model methods with long if/elif/elif...else structures. Python is object-oriented... these frequency choices seem to have "data" and "methods" so they seem like good candidates for classes...
class ReminderFrequency(object):
    pass
class Weekly(ReminderFrequency):
    display_value = "Every week"
    def get_next_reminder_time(self):
        return <now + 7 days>
class Monthly(ReminderFrequency):
    display_value = "Every month"
    def get_next_reminder_time(self):
        return <now + 1 month>
SOME_CHOICES = ((freq.__name__, freq.display_value) for freq in [Weekly, Monthly])
And suppose that in addition to get_next_reminder_time, I want to specify something like first_reminder (let's say for "weekly" your first reminder comes three days from now, and the next one 7 days after that, but for "monthly", the first reminder comes 7 days from now, and the next one month after that, etc). Plus 5 other config values or methods that depend on the choice.
One thought is to make frequency a FK to some other model Frequency where I set the config values, but that does not allow different choices to have different methods or logic (like the weekly vs monthly example). So that's out.
My current approach feels very cumbersome because of the requirement to load each ReminderFrequency subclass by name according to the class name stored in the database. Any other ideas? Or is this "the Right and Good Pythonic Way"?
I think the most natural way of handling this in Django would be to create a custom model field. You would use it like so:
class Reminder(models.Model):
    frequency = models.FrequencyField()
    next_reminder = models.DateTimeField()
reminder = Reminder.objects.get()
reminder_time = reminder.frequency.get_next_reminder_time()
To implement it, review the relevant documentation. Briefly, you'd probably:
Inherit from CharField
Supply the choices in the field definition
Implement get_prep_value(). You could represent values as actual class names, like you have above, or by some other value and use a lookup table.
Implement to_python(). This is where you'll convert the database representation into an actual Python instance of your classes.
It's a little more involved than that, but not much.
(The above assumes that you want to define the behavior in code. If you need to configure the behavior by supplying configuration values to the database (as you suggested above with the ForeignKey idea) that's another story.)
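The two conversion steps listed above can be sketched without Django, using a lookup table between the stored string and the class. Everything here is an assumption for illustration: the FREQUENCIES table, the module-level to_python and get_prep_value functions (which mirror what the field methods of the same names would do), and the treatment of a month as 30 days.

```python
from datetime import datetime, timedelta


class ReminderFrequency:
    interval = None  # set by subclasses

    def get_next_reminder_time(self, now):
        return now + self.interval


class Weekly(ReminderFrequency):
    display_value = "Every week"
    interval = timedelta(days=7)


class Monthly(ReminderFrequency):
    display_value = "Every month"
    # Simplification for this sketch: a "month" is treated as 30 days.
    interval = timedelta(days=30)


# Lookup table between the string stored in the database and the class.
FREQUENCIES = {"weekly": Weekly, "monthly": Monthly}


def to_python(value):
    """Database string -> frequency instance."""
    if value is None or isinstance(value, ReminderFrequency):
        return value
    return FREQUENCIES[value]()


def get_prep_value(value):
    """Frequency instance -> database string."""
    if value is None:
        return None
    for key, cls in FREQUENCIES.items():
        if type(value) is cls:
            return key
    raise ValueError("unknown frequency: %r" % (value,))


freq = to_python("weekly")
print(get_prep_value(freq))
print(freq.get_next_reminder_time(datetime(2024, 1, 1)))
```

In a real custom field these two functions would become methods on the CharField subclass. In current Django versions, values loaded from the database pass through from_db_value(), so that method would likely need to route to the same conversion.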

App Engine, Cross reference between two entities

I would like to have two types of entities referring to each other,
but Python doesn't yet know the name of the second entity class in the body of the first.
So how should I code this?
class Business(db.Model):
    bus_contact_info_ = db.ReferenceProperty(reference_class=Business_Info)
class Business_Info (db.Model):
    my_business_ = db.ReferenceProperty(reference_class=Business)
If you advise using a reference in only one of them and using the implicitly created property
(which is a query object) in the other,
then I am concerned about the CPU quota penalty of using a query versus calling get() directly on a key.
Please advise how to write this code in Python.
Queries are a little slower, and so they do use a bit more resources. ReferenceProperty does not require reference_class. So you could always define Business like:
class Business(db.Model):
    bus_contact_info_ = db.ReferenceProperty()
There may also be better options for your data structure. Check out the modelling relationships article for some ideas.
Is this a one-to-one mapping? If this is a one-to-one mapping, you may be better off denormalizing your data.
Does it ever change? If not (and it is one-to-one), perhaps you could use entity groups and structure your data so that you could just directly use the keys / key names. You might be able to do this by making BusinessInfo a child of Business, then always use 'i' as the key_name. For example:
business = Business().put()
business_info = BusinessInfo(key_name='i', parent=business).put()
# Get business_info from business:
business_info = db.get(db.Key.from_path('BusinessInfo', 'i', parent=business))
# Get business from business_info:
business = db.get(business_info.parent())
