I've got a situation where I have one base Django model Rolls, mapped to the table rolls. There are multiple types of rolls, controlled by a column called type. In the older codebase I'm writing a v2 for (used to be PHP), I created subclasses for each type, that controlled setting their own type value, and it worked fine. I can't figure out how to set this up in Django.
I'd like them all to use the same table, and each will derive methods from the base model, but have different implementations for many of those methods. I figure I can write a manager to handle getting back the right values, but I can't figure out how to setup the models.
I tried setting a single base model and then derived other models from it, but those created different tables for each of them. Using managed = False seems the wrong way to go, given the subclasses don't represent tables of their own.
You're on the right track, but I believe you want proxy models and not unmanaged ones, e.g. proxy = True:
Sometimes, however, you only want to change the Python behavior of a model – perhaps to change the default manager, or add a new method.
This is what proxy model inheritance is for: creating a proxy for the original model. You can create, delete and update instances of the proxy model and all the data will be saved as if you were using the original (non-proxied) model. The difference is that you can change things like the default model ordering or the default manager in the proxy, without having to alter the original.
Then you could override each subclass' save method to set the correct type, and each subclass' default query manager to filter on that type.
tl;dr
Is there a simple alternative to multi-table inheritance for implementing the basic data-model pattern depicted below, in Django?
Premise
Please consider the very basic data-model pattern in the image below, based on e.g. Hay, 1996.
Simply put: Organizations and Persons are Parties, and all Parties have Addresses. A similar pattern may apply to many other situations.
The important point here is that the Address has an explicit relation with Party, rather than explicit relations with the individual sub-models Organization and Person.
Note that each sub-model introduces additional fields (not depicted here, but see code example below).
This specific example has several obvious shortcomings, but that is beside the point. For the sake of this discussion, suppose the pattern perfectly describes what we wish to achieve, so the only question that remains is how to implement the pattern in Django.
Implementation
The most obvious implementation, I believe, would use multi-table-inheritance:
class Party(models.Model):
""" Note this is a concrete model, not an abstract one. """
name = models.CharField(max_length=20)
class Organization(Party):
"""
Note that a one-to-one relation 'party_ptr' is automatically added,
and this is used as the primary key (the actual table has no 'id'
column). The same holds for Person.
"""
type = models.CharField(max_length=20)
class Person(Party):
favorite_color = models.CharField(max_length=20)
class Address(models.Model):
"""
Note that, because Party is a concrete model, rather than an abstract
one, we can reference it directly in a foreign key.
Since the Person and Organization models have one-to-one relations
with Party which act as primary key, we can conveniently create
Address objects setting either party=party_instance,
party=organization_instance, or party=person_instance.
"""
party = models.ForeignKey(to=Party, on_delete=models.CASCADE)
This seems to match the pattern perfectly. It almost makes me believe this is what multi-table-inheritance was intended for in the first place.
However, multi-table-inheritance appears to be frowned upon, especially from a performance point-of-view, although it depends on the application. Especially this scary, but ancient, post from one of Django's creators is quite discouraging:
In nearly every case, abstract inheritance is a better approach for the long term. I’ve seen more than few sites crushed under the load introduced by concrete inheritance, so I’d strongly suggest that Django users approach any use of concrete inheritance with a large dose of skepticism.
Despite this scary warning, I guess the main point in that post is the following observation regarding multi-table inheritance:
These joins tend to be "hidden" — they’re created automatically — and mean that what look like simple queries often aren’t.
Disambiguation: The above post refers to Django's "multi-table inheritance" as "concrete inheritance", which should not be confused with Concrete Table Inheritance on the database level. The latter actually corresponds better with Django's notion of inheritance using abstract base classes.
I guess this SO question nicely illustrates the "hidden joins" issue.
Alternatives
Abstract inheritance does not seem like a viable alternative to me, because we cannot set a foreign key to an abstract model, which makes sense, because it has no table. I guess this implies that we would need a foreign key for every "child" model plus some extra logic to simulate this.
Proxy inheritance does not seem like an option either, as the sub-models each introduce extra fields. EDIT: On second thought, proxy models could be an option if we use Single Table Inheritance on the database level, i.e. use a single table that includes all the fields from Party, Organization and Person.
GenericForeignKey relations may be an option in some specific cases, but to me they are the stuff of nightmares.
As another alternative, it is often suggested to use explicit one-to-one relations (eoto for short, here) instead of multi-table-inheritance (so Party, Person and Organization would all just be subclasses of models.Model).
Both approaches, multi-table-inheritance (mti) and explicit one-to-one relations (eoto), result in three database tables. So, depending on the type of query, of course, some form of JOIN is often inevitable when retrieving data.
By inspecting the resulting tables in the database, it becomes clear that the only difference between the mti and eoto approaches, on the database level, is that an eoto Person table has an id column as primary-key, and a separate foreign-key column to Party.id, whereas an mti Person table has no separate id column, but instead uses the foreign-key to Party.id as its primary-key.
Question(s)
I don't think the behavior from the example (especially the single direct relation to the parent) can be achieved with abstract inheritance, can it? If it can, then how would you achieve that?
Is an explicit one-to-one relation really that much better than multi-table-inheritance, except for the fact that it forces us to make our queries more explicit? To me the convenience and clarity of the multi-table approach outweighs the explicitness argument.
Note that this SO question is very similar, but does not quite answer my questions. Moreover, the latest answer there is almost nine years old now, and Django has changed a lot since.
[1]: Hay 1996, Data Model Patterns
While awaiting a better one, here's my attempt at an answer.
As suggested by Kevin Christopher Henry in the comments above, it makes sense to approach the problem from the database side. As my experience with database design is limited, I have to rely on others for this part.
Please correct me if I'm wrong at any point.
Data-model vs (Object-Oriented) Application vs (Relational) Database
A lot can be said about the object/relational mismatch,
or, more accurately, the data-model/object/relational mismatch.
In the present
context I guess it is important to note that a direct translation between data-model,
object-oriented implementation (Django), and relational database implementation, is not always
possible or even desirable. A nice three-way Venn-diagram could probably illustrate this.
Data-model level
To me, a data-model as illustrated in the original post represents an attempt to capture the essence of a real world information system. It should be sufficiently detailed and flexible to enable us to reach our goal. It does not prescribe implementation details, but may limit our options nonetheless.
In this case, the inheritance poses a challenge mostly on the database implementation level.
Relational database level
Some SO answers dealing with database implementations of (single) inheritance are:
How can you represent inheritance in a database?
How do you effectively model inheritance in a database?
Techniques for database inheritance?
These all more or less follow the patterns described in Martin Fowler's book
Patterns of Application Architecture.
Until a better answer comes along, I am inclined to trust these views.
The inheritance section in chapter 3 (2011 edition) sums it up nicely:
For any inheritance structure there are basically three options.
You can have one table for all the classes in the hierarchy: Single Table Inheritance (278) ...;
one table for each concrete class: Concrete Table Inheritance (293) ...;
or one table per class in the hierarchy: Class Table Inheritance (285) ...
and
The trade-offs are all between duplication of data structure and speed of access. ...
There's no clearcut winner here. ... My first choice tends to be Single Table Inheritance ...
A summary of patterns from the book is found on martinfowler.com.
Application level
Django's object-relational mapping (ORM) API
allows us to implement these three approaches, although the mapping is not
strictly one-to-one.
The Django Model inheritance docs
distinguish three "styles of inheritance", based on the type of model class used (concrete, abstract, proxy):
abstract parent with concrete children (abstract base classes):
The parent class has no database table. Instead each child class has its own database
table with its own fields and duplicates of the parent fields.
This sounds a lot like Concrete Table Inheritance in the database.
concrete parent with concrete children (multi-table inheritance):
The parent class has a database table with its own fields, and each child class
has its own table with its own fields and a foreign-key (as primary-key) to the
parent table.
This looks like Class Table Inheritance in the database.
concrete parent with proxy children (proxy models):
The parent class has a database table, but the children do not.
Instead, the child classes interact directly with the parent table.
Now, if we add all the fields from the children (as defined in our data-model)
to the parent class, this could be interpreted as an implementation of
Single Table Inheritance.
The proxy models provide a convenient way of dealing with the application side of
the single large database table.
Conclusion
It seems to me that, for the present example, the combination of Single Table Inheritance with Django's proxy models may be a good solution that does not have the disadvantages of "hidden" joins.
Applied to the example from the original post, it would look something like this:
class Party(models.Model):
""" All the fields from the hierarchy are on this class """
name = models.CharField(max_length=20)
type = models.CharField(max_length=20)
favorite_color = models.CharField(max_length=20)
class Organization(Party):
class Meta:
""" A proxy has no database table (it uses the parent's table) """
proxy = True
def __str__(self):
""" We can do subclass-specific stuff on the proxies """
return '{} is a {}'.format(self.name, self.type)
class Person(Party):
class Meta:
proxy = True
def __str__(self):
return '{} likes {}'.format(self.name, self.favorite_color)
class Address(models.Model):
"""
As required, we can link to Party, but we can set the field using
either party=person_instance, party=organization_instance,
or party=party_instance
"""
party = models.ForeignKey(to=Party, on_delete=models.CASCADE)
One caveat, from the Django proxy-model documentation:
There is no way to have Django return, say, a MyPerson object whenever you query for Person objects. A queryset for Person objects will return those types of objects.
A potential workaround is presented here.
Baseline: I'm building a django based application that is heavily using the admin interface as it spares me a lot of work in developing own CRUD routines. By now I came across several situations where i have models that hold some general information (say parents) and often have foreignkey-relations to derived models (say childs).
I realized that i sometimes implemented my routines to create child objects within the admin-class, sometimes within the model class(method being called from within some admin routine) or sometimes even within view-classes (e.g. as reaction to POST requests on some custom forms). It feels now, that my design is not very consistent (the effects of changing some model parameters being distributed over a lot of files) and i should refactor before it gets to big a mess.
So what is the best approach? Where should one concentrate methods that create/modify related objects (keeping in mind that i often want to give some feedback-messages related to process) ?
If your code is about a Model class, add it to models.py. This makes sense when the classes have to be added to database (migrations)
If your code is related to views, attach it to views.py. This makes sense when the code handles requests.
If your code is related to admin, attach it to admins.py. This makes sense when the code is related to admin interface.
If your code is generic, used in multiple places, refactor it into a separate file, and import that file elsewhere.
Your use case isn't exactly clear to me, so I'm taking a stab in the dark here - You can have the Models in models.py and create a separate file to create objects for models with child objects containing the code that's related to creating parent-child objects with given data. Then use this as an import in the admin and views wherever applicable.
Something like:
# foo in views.py and admin.py
def foo():
data = {} # get all data
make_parent_child(data) # create parent-child objects
We have a django application that is, at its core, a series of webpages with Forms which our users fill out in order. (We'll call the particular series of pages with forms on them a "flow".)
We will be white-labeling this application for a Partner -- the Partner will want to add some fields and even add some webpages with their own new Forms. This may result in a new order in which the Forms are filled out. (A new "flow", in addition to changes to existing Forms/Models or new Forms/Models.)
What is the best way to extend our existing, simple Forms-and-Models structure to use different Forms and Models depending on the running instance of the app (e.g. an environment variable)? Some things we thought about:
implement something like get_user_model for every Model and Form use in the app, which would look at the current environment
implement a more generic key-value store so that we're not bound by the current implementation's field types (i.e., have the data field name be part of the data as well)
a data model which tracks this particular environment's "flow" and which models it needs to use
subclass existing Models and Forms for each new white-label implementation
Model Field injection may be what you are looking for, take a look of this article
The approach boils down to three concepts:
Dynamically adding fields to model classes Ensuring Django’s model
system respects the new fields
Getting the load ordering correct for the above to work
Mezzanine has done a beautiful job implementing this model field injection with dynamic extra models via EXTRA_MODEL_FIELDS
What is the rationale behind the decision to not support Single Table Inheritance in Django?
Is STI a bad design? Does it result in poor performance? Would it conflict with the Django ORM as it is?
Just wondering because it's been a missing feature for like ten years now and so there must have been a conscious decision made that it would never be supported.
One reason is possibly that Django does not (currently) have the ability to modify database tables after creation.
You can 'kind-of' do STI using proxy models. This will not allow you to have different fields on the different models, but it will allow you to attach different behaviour (via model methods) to different subclasses.
However, if you decide to create a subclass with extra fields, Django will not be able to update the database to reflect that.