Mapping a class hierarchy to a database - python

Prerequisite
I want to implement abstract inheritance:
class Base(models.Model):
    title = models.CharField()
    slug = models.SlugField()
    description = models.TextField()
    class Meta:
        abstract = True
class ChildA(Base):
    foo = models.CharField()
class ChildB(Base):
    bar = models.CharField()
For multiple reasons I need a db representation of the hierarchy of these classes. So I want to create a model like this one (left and right attributes allow us to identify instance's place in the node tree):
class Node(models.Model):
    app_label = models.CharField()
    model_name = models.CharField()
    parent = models.ForeignKey('self', blank=True, null=True)
    right = models.PositiveIntegerField()
    left = models.PositiveIntegerField()
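To make the role of left and right concrete, here is a minimal sketch of nested-set numbering in plain Python, with no Django involved. The `number_tree` and `descendants` helpers and the example hierarchy are hypothetical; the only point is that a node's descendants are exactly the nodes whose (left, right) interval lies inside its own.

```python
def number_tree(children, root):
    """Assign nested-set (left, right) values with a depth-first walk.

    `children` maps a node name to the list of its child names.
    """
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        # The right value is taken after all descendants have been numbered.
        bounds[node] = (left, counter)
        counter += 1

    visit(root)
    return bounds


def descendants(bounds, node):
    """Return the names whose interval lies strictly inside `node`'s interval."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)


# Hypothetical hierarchy: base -> childa -> grandchild, base -> childb
tree = {"base": ["childa", "childb"], "childa": ["grandchild"]}
bounds = number_tree(tree, "base")
# bounds["base"] == (1, 8), bounds["childa"] == (2, 5)
# descendants(bounds, "childa") == ["grandchild"]
```

With this numbering, "all submodels of X" becomes a single range comparison on left and right instead of a recursive walk over parent links.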
The Problem
I need something similar to this:
class Base(models.Model):
    ...
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        app_label = cls._meta.app_label
        model_name = cls._meta.model_name
        parent_id = None  # placeholder: I am not sure how to get the parent's id yet, but it should be manageable
        Node.objects.create(app_label=app_label, model_name=model_name, parent_id=parent_id)
So, whenever we subclass the abstract model, a new node is created that represents the new model in the hierarchy tree. Unfortunately, this doesn't work. It seems __init_subclass__ is invoked before the Model class is fully initialized, so cls._meta.model_name returns an incorrect value (the parent's model_name, in fact). Can we work around this (or use some other hook)?
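The symptom can be reproduced without Django. The sketch below is a toy model of the mechanism, not Django's actual code: it assumes the metaclass attaches `_meta` only after the class object has been created, which is what the observed behaviour suggests. `Options` and `ModelBase` here are simplified stand-ins written for this example.

```python
class Options:
    """Toy stand-in for a model's _meta object."""

    def __init__(self, model_name):
        self.model_name = model_name


class ModelBase(type):
    """Toy metaclass that sets _meta only after the class object exists."""

    def __new__(mcs, name, bases, namespace, **kwargs):
        # type.__new__ calls the parent's __init_subclass__ before returning,
        # so at that moment the new class has no _meta of its own yet.
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)
        cls._meta = Options(name.lower())
        return cls


seen = []


class Base(metaclass=ModelBase):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # cls._meta is resolved through inheritance here, so it is Base's.
        seen.append((cls.__name__, cls._meta.model_name))


class ChildA(Base):
    pass


# seen == [("ChildA", "base")]         -> the hook saw the parent's name
# ChildA._meta.model_name == "childa"  -> correct once the class is finished
```

Any hook that runs after the metaclass has finished (for example, code executed once all classes are loaded) sees the correct value.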
Other concerns
I am not sure if this whole idea is sensible. I previously used multi-table inheritance, but at some point the SQL queries became really ugly, so I am trying to fix that by using abstract models instead. We still need to tie the models to each other, though, and a node tree seems appealing. This way I get a concrete model to manage multiple tables simultaneously, like this:
class NodeManager(models.Manager):
    def get_queryset(self):
        root = self._get_root_node()
        fields = root._meta.get_fields()
        field_names = [field.name for field in fields]
        descendants = list(self._get_descendant_models(root))
        queryset_list = []
        for model in descendants:
            qs = model.objects.values(*field_names)
            annotated_qs = qs.annotate(
                resource_model=models.Value(
                    root._meta.model_name,
                    models.CharField(max_length=120)
                )
            )
            queryset_list.append(annotated_qs)
        if len(queryset_list) > 1:
            merged_queryset = queryset_list[0].union(*queryset_list[1:])
        elif len(queryset_list) == 1:
            merged_queryset = queryset_list[0]
        else:
            merged_queryset = None
        return merged_queryset
This is not how managers are supposed to be used, I guess, so I am not sure if it's fine.
I don't want to focus on this, it's mainly to give a better idea what I am aiming for. But if you let me know whether you think it's fine or not, I will greatly appreciate it.

Django's contenttypes framework (the ContentType model) can be used directly for this purpose, but you would need to do some extra work to get what you need.
You need to hook into the ready() method defined by the AppConfig class. The problem is that ready() has to be handled for each app; instead of adding this code to every app, you can put it in a base class.
import inspect
import os
from django.apps import AppConfig
from django.db.models import Model
class BaseAppConfig(AppConfig):
    def add_or_update_node(self, model, super_classes):
        # assuming the Node model is defined in a separate app called node;
        # imported lazily, since models may not be loaded when apps.py is imported
        from node.models import Node
        # getmro lists the nearest ancestor first; reverse to walk root -> leaf
        super_classes = super_classes[::-1]
        super_classes.append(model)
        prev_node = None
        for super_class in super_classes:
            node, _ = Node.objects.get_or_create(app_label=super_class._meta.app_label,
                                                 model_name=super_class._meta.model_name)
            if node.parent is None and prev_node is not None:
                node.parent = prev_node
                node.save()
            prev_node = node
    def ready(self):
        # if the variable is not set, do not do anything
        if os.getenv('CREATE_NODE') is None:
            return
        for model_name, model in self.models.items():
            super_classes = []
            for clazz in inspect.getmro(model):
                if clazz == Model or clazz == object or clazz == model:
                    continue
                super_classes.append(clazz)
            # no super class
            if len(super_classes) == 0:
                continue
            self.add_or_update_node(model, super_classes)
This handles only the case of a single inheritance chain. The case where a model extends multiple abstract models is left for the OP to work out.
You need to extend this class in your apps.py file.
class CoreConfig(BaseAppConfig):
    name = 'Core'
    def ready(self):
        super(CoreConfig, self).ready()
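For the multiple-inheritance case mentioned above, one possible starting point is to record direct parent/child edges from `__bases__` instead of flattening the MRO into a single chain. This is a plain-Python sketch under stated assumptions: `Model` is a stand-in for django.db.models.Model, and `parent_edges` and the example classes are hypothetical names.

```python
class Model:
    """Stand-in for django.db.models.Model in this sketch."""


def parent_edges(model, root=Model):
    """Return sorted (parent_name, child_name) pairs for all ancestor links.

    Only classes deriving from `root` are considered, so `object` and
    plain mixins are ignored.
    """
    edges = set()
    stack = [model]
    while stack:
        cls = stack.pop()
        for base in cls.__bases__:
            if base is root or not issubclass(base, root):
                continue
            edges.add((base.__name__, cls.__name__))
            stack.append(base)
    return sorted(edges)


class Timestamped(Model):
    pass


class Sluggable(Model):
    pass


class Article(Timestamped, Sluggable):
    pass


class Review(Article):
    pass


edges = parent_edges(Review)
# [("Article", "Review"), ("Sluggable", "Article"), ("Timestamped", "Article")]
```

Note that a model with two parents no longer fits a single parent foreign key, so storing these edges would likely need a separate edge table or a many-to-many relation on Node.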

Related

Passing parameters through factory-boy Factory to SubFactory without specifying it

I'm using Python's factory_boy package to create instances of models for testing purposes.
I want to pass the parameters used when calling Factory.create() to all the SubFactories in the Factory being called.
Here's how I do it now:
Example 1:
I have to explicitly set the company when calling the SubFactory (the BagFactory)
class BagTrackerFactory(BaseFactory):
    company = factory.SubFactory(CompanyFactory)
    bag = factory.SubFactory(BagFactory, company=factory.SelfAttribute("..company"))
class BagFactory(BaseFactory):
    company = factory.SubFactory(CompanyFactory)
Example 2:
In this example, I have to add company to Params in the BagFactory, so I can pass it down to ItemFactory which has the company parameter.
class BagTrackerFactory(BaseFactory):
    company = factory.SubFactory(CompanyFactory)
    bag = factory.SubFactory(BagFactory, company=factory.SelfAttribute("..company"))
class BagFactory(BaseFactory):
    item = factory.SubFactory(ItemFactory, company=factory.SelfAttribute("..company"))
    class Params:
        company = factory.SubFactory(CompanyFactory)
class ItemFactory(BaseFactory):
    company = factory.SubFactory(CompanyFactory)
The reason why I do it like this is that it saves time and it makes sense that the Bag belongs to the same company as the BagTracker when created by the same Factory.
Note: the BaseFactory is factory.alchemy.SQLAlchemyModelFactory
Question:
What I would like is to have company (and all the other parameters) from the parent Factory be passed down to SubFactories without having to pass it explicitly. And this continues downstream all the way to the last SubFactory, so every model has the same company, from the topmost parent Factory to the lowest child SubFactory. I hope you understand what I'm saying.
Is there an easy way to do this? Like some option in the factory-boy package?
EDIT:
I ended up doing it the long way, passing down parameters manually. In this example, I'm showing both cases: when the parent factory has the company parameter (BagTrackerFactory), and when it doesn't have it but must pass it downstream (BagFactory).
class CompanyFactory(BaseFactory):
    id = get_sequence()
    class Meta:
        model = Company
class ItemFactory(BaseFactory):
    id = get_sequence()
    owner = factory.SubFactory(CompanyFactory)
    owner_id = factory.SelfAttribute("owner.id")
    class Meta:
        model = Item
class BagFactory(BaseFactory):
    id = get_sequence()
    item = factory.SubFactory(ItemFactory, owner=factory.SelfAttribute("..company"))
    item_id = factory.SelfAttribute("item.id")
    class Params:
        company = factory.SubFactory(CompanyFactory)
    class Meta:
        model = Bag
class BagTrackerFactory(BaseFactory):
    id = get_sequence()
    company = factory.SubFactory(CompanyFactory)
    company_id = factory.SelfAttribute("company.id")
    item = factory.SubFactory(ItemFactory, owner=factory.SelfAttribute("..company"))
    item_id = factory.SelfAttribute("item.id")
    bag = factory.SubFactory(BagFactory, company=factory.SelfAttribute("..company"))
    bag_id = factory.SelfAttribute("bag.id")
    class Meta:
        model = BagTracker
This is possible, but will have to be done specifically for your codebase.
At its core, a factory has no knowledge of your models' specific structure, hence can't forward the company field — for instance, some models might not accept that field in their __init__, and providing the field would crash.
However, if you've got a chain where the field is always accepted, you may use the following pattern:
class WithCompanyFactory(factory.BaseFactory):
    class Meta:
        abstract = True
    company = factory.Maybe(
        "factory_parent",  # Is there a parent factory?
        yes_declaration=factory.SelfAttribute("..company"),
        no_declaration=factory.SubFactory(CompanyFactory),
    )
This works thanks to the factory_parent attribute of the stub used when building a factory's parameters: https://factoryboy.readthedocs.io/en/stable/reference.html#parents
This field either points to the parent (when the current factory is called as a SubFactory), or is None. With a factory.Maybe, we can copy the value through a factory.SelfAttribute when a parent is defined, and instantiate a new value otherwise.
This can be used afterwards in your code:
class ItemFactory(WithCompanyFactory):
    pass
class BagFactory(WithCompanyFactory):
    item = factory.SubFactory(ItemFactory)
class BagTrackerFactory(WithCompanyFactory):
    bag = factory.SubFactory(BagFactory)
>>> tracker = BagTrackerFactory()
>>> tracker.company == tracker.bag.company == tracker.bag.item.company
True
# It also works starting anywhere in the chain:
>>> company = Company(...)
>>> bag = BagFactory(company=company)
>>> bag.company == bag.item.company == company
True
If some models must pass the company value to their subfactories, but without a company field themselves, you may also split the special WithCompanyFactory into two classes: WithCompanyFieldFactory and CompanyPassThroughFactory:
class WithCompanyFieldFactory(factory.Factory):
    """Automatically fill this model's `company` from the parent factory."""
    class Meta:
        abstract = True
    company = factory.Maybe(
        "factory_parent",  # Is there a parent factory?
        yes_declaration=factory.SelfAttribute("..company"),
        no_declaration=factory.SubFactory(CompanyFactory),
    )
class CompanyPassThroughFactory(factory.Factory):
    """Expose the parent model's `company` field to subfactories declared in this factory."""
    class Meta:
        abstract = True
    class Params:
        company = factory.Maybe(
            "factory_parent",  # Is there a parent factory?
            yes_declaration=factory.SelfAttribute("..company"),
            no_declaration=factory.SubFactory(CompanyFactory),
        )

Update field in model for all "submodels" changes

I know how to update a field on a ForeignKey target. For example, when I want to change the last_modified field every time a Configuration or SomeOtherImportantClass is changed:
class Configuration(models.Model):
    title = models.TextField()
    last_modified = models.DateTimeField(auto_now=True)
class SomeOtherImportantClass(models.Model):
    conf = models.ForeignKey(Configuration)
    important_number = models.IntegerField()
    def save(self, *args, **kwargs):
        self.conf.last_modified = timezone.now()  # I'm not sure if it is necessary
        self.conf.save()
        return super().save(*args, **kwargs)
but in my real situation the Configuration model is the ForeignKey target of more than 30 other models. In each of them I want to update the configuration.last_modified field on every change made to them, or when another model (which has a ForeignKey to some model which has a ForeignKey to Configuration) is changed. So it looks like this:
class Configuration(models.Model):
    title = models.TextField()
    last_modified = models.DateTimeField(auto_now=True)
class A(models.Model):
    conf = models.ForeignKey(Configuration)  # conf.last_modified must be updated on every change on A model object.
class B(models.Model):
    conf = models.ForeignKey(Configuration)  # same
...
class Z(models.Model):
    conf = models.ForeignKey(Configuration)  # same
class AA(models.Model):
    some_field = models.TextField()
    a = models.ForeignKey(A)
...
class ZZ(models.Model):
    some_field = models.TextField()
    z = models.ForeignKey(Z)
so even if the AA object's field "some_field" is changed, I want to update Configuration's last_modified field. Is there any recursive way to declare this once, in Configuration or somewhere else?
UPDATE: Great-grandchildren like the AAA and AAAA classes can exist too.
Use abstract base classes as explained in the docs. For A-Z it's quite easy:
class ConfigurationChild(Model):
    conf = ForeignKey(Configuration)
    class Meta:
        abstract = True
    def save(self, *args, **kwargs):
        self.conf.last_modified = ...
        self.conf.save()
        super().save(*args, **kwargs)
class A(ConfigurationChild):
    # other fields, without conf
    pass
For the grandchildren it's a bit more complex because they don't have a direct reference to conf. Set an attribute on the base class that you populate on each child class:
class ConfigurationDescendant(Model):
    conf_handle = None
    class Meta:
        abstract = True
    def get_conf(self):
        if not self.conf_handle:
            return None  # or raise an error
        parent = getattr(self, self.conf_handle)
        if isinstance(parent, ConfigurationDescendant):
            return parent.get_conf()  # recursion
        else:
            # reached `ConfigurationChild` class, might want to check this
            return parent.conf if parent else None
    def save(self, *args, **kwargs):
        conf = self.get_conf()
        # you might want to handle the case that the attribute is None
        if conf:
            conf.last_modified = ...
            conf.save()
        super().save(*args, **kwargs)
class AA(ConfigurationDescendant):
    conf_handle = 'a'
    a = ForeignKey(A)
class AAA(ConfigurationDescendant):
    conf_handle = 'aa'
    aa = ForeignKey(AA)
The above code handles the case where the chain breaks because conf_handle is missing on one of the parents: None is returned and nothing happens. I'm not checking whether the handle is set wrongly (i.e. not pointing in the right direction towards the parent Configuration); that will raise an exception, which you probably want so you can catch mistakes.
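The handle-walking logic can be tried in isolation. The following sketch strips out the ORM so it runs standalone: plain classes replace the models, a hypothetical `touch()` method stands in for updating last_modified and saving, and `Orphan` is an invented class with no handle set.

```python
class Configuration:
    def __init__(self):
        self.touched = 0

    def touch(self):
        # Stands in for setting last_modified and calling save().
        self.touched += 1


class ConfigurationChild:
    def __init__(self, conf):
        self.conf = conf


class ConfigurationDescendant:
    conf_handle = None

    def get_conf(self):
        if not self.conf_handle:
            return None
        parent = getattr(self, self.conf_handle)
        if isinstance(parent, ConfigurationDescendant):
            return parent.get_conf()  # keep walking up the chain
        return parent.conf if parent else None

    def save(self):
        conf = self.get_conf()
        if conf:
            conf.touch()


class A(ConfigurationChild):
    pass


class AA(ConfigurationDescendant):
    conf_handle = "a"

    def __init__(self, a):
        self.a = a


class AAA(ConfigurationDescendant):
    conf_handle = "aa"

    def __init__(self, aa):
        self.aa = aa


class Orphan(ConfigurationDescendant):
    pass  # no conf_handle: the chain is broken on purpose


conf = Configuration()
leaf = AAA(AA(A(conf)))
leaf.save()      # walks AAA -> AA -> A -> conf and touches it once
Orphan().save()  # get_conf() returns None, so nothing happens
```

In the real models each hop through `getattr` would follow a foreign key, so a deep chain may cost one query per level unless the related objects are already loaded.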

django has more than 1 foreignkey error

My models file works just fine. As soon as I replace every models.Model with MyModel (a child-class of models.Model), one of my models raises a
<class 'puppy.cms.models.Appearance'> has more than 1 ForeignKey to <class 'puppy.cms.models.Segment'>
exception. The only thing that I am doing in the child class is overriding the clean method.
What could I be doing wrong?
class SansHashUrl(object):
    """ Upon each call to clean, iterates over every field,
    and deletes all '#/' and '#!/' occurrences.
    IMPORTANT: This mixin must be listed first in the inheritance list to work
    properly. """
    def clean(self):
        attrs = (field.attname for field in self.__class__._meta.fields
                 if isinstance(field, models.CharField)
                 or isinstance(field, models.TextField))
        for attr in attrs:
            attr_value = self.__getattribute__(attr)
            tokens = attr_value.split()
            for i, token in enumerate(tokens):
                if has_internal_domain(token):
                    suggested_url = re.sub('#!?/','', token)
                    tokens[i] = suggested_url
            self.__setattr__(attr, ' '.join(tokens))
class MyModel(SansHashUrl, models.Model):
    pass
Model that throws the error:
class Appearance(MyModel):
    appearance_type = models.CharField(max_length=20,
                                       choices=APPEARANCE_TYPE_CHOICES)
    person = models.ForeignKey(Person, related_name='person_appearance')
    item = models.ForeignKey(ManagedItem)
    class Meta:
        unique_together = (('person', 'item'),)
    def __unicode__(self):
        return self.person.__unicode__()
In reference to:
class Segment(Story, HasStatsTags, HasFullUrl):
    ...
It might be useful to note that Story is a subclass of ManagedItem (a subclass of MyModel).
You need to declare MyModel (and probably ManagedItem) as an abstract model in its Meta class, otherwise Django will create a separate table for them and define FKs between them.
class MyModel(SansHashUrl, models.Model):
    class Meta:
        abstract = True

How to move stored procedure to django model class and use them in filter/exclude?

As said in "What is the best way to access stored procedures in Django's ORM", it should be possible.
In other words, how can I accomplish something like this:
class Project(models.Model):
    name = models.CharField()
    def is_finished(self):
        count = self.task_set.all().count()
        count2 = self.task_set.filter(finished=True).count()
        if count == count2:
            return True
        else:
            return False
class Task(models.Model):
    name = models.CharField()
    finished = models.BooleanField()
    project = models.ForeignKey(Project)
#somewhere else in the code...
finished_projects = Project.objects.filter(is_finished=True)
Not sure why you are referring to stored procedures in this context.
But if I understand your example correctly, your problem is that you can only filter by model fields that have a corresponding column in a database table.
Therefore you can't use Django's ORM to filter by methods and properties.
But you can achieve what you want using a list comprehension:
finished_projects = [p for p in Project.objects.all() if p.is_finished()]
One solution is denormalization:
class Project(models.Model):
    name = models.CharField()
    is_finished = models.BooleanField()
    def _is_finished(self):
        # finished means: no task is left unfinished
        return not self.task_set.exclude(finished=True).exists()
    def update_finished(self):
        self.is_finished = self._is_finished()
        self.save()
class Task(models.Model):
    name = models.CharField()
    finished = models.BooleanField()
    project = models.ForeignKey(Project)
    def save(self, *args, **kwargs):
        res = super(Task, self).save(*args, **kwargs)
        self.project.update_finished()
        return res
#somewhere else in the code...
finished_projects = Project.objects.filter(is_finished=True)
This works well if you have many more reads than writes, because reads will be very fast (faster than e.g. using stored procedures). But you have to take care of consistency yourself.
Django's aggregates or 'raw' support can often be used to implement stored procedure logic.
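At the SQL level, "projects with no unfinished task" is a NOT EXISTS query. The sketch below shows it with the standard-library sqlite3 module so it runs on its own; the table and column names are hypothetical and would differ from the ones Django generates. A statement of this kind could in principle be run through Django's raw SQL support instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        name TEXT,
        finished INTEGER NOT NULL,
        project_id INTEGER NOT NULL REFERENCES project(id)
    );
    INSERT INTO project (id, name) VALUES (1, 'done'), (2, 'in progress'), (3, 'empty');
    INSERT INTO task (name, finished, project_id) VALUES
        ('a', 1, 1), ('b', 1, 1),
        ('c', 1, 2), ('d', 0, 2);
""")

# A project is finished when it has no task with finished = 0.
FINISHED_PROJECTS = """
    SELECT p.name
    FROM project AS p
    WHERE NOT EXISTS (
        SELECT 1 FROM task AS t
        WHERE t.project_id = p.id AND t.finished = 0
    )
    ORDER BY p.id
"""

finished = [row[0] for row in conn.execute(FINISHED_PROJECTS)]
# finished == ["done", "empty"]
```

A project without any tasks counts as finished here, which matches the count comparison in the question's is_finished method.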

How to retrieve a polymorphic model from a single table in django - or how to implement polymorphic behaviour in django

Can I read polymorphic models from a single database table, with their behaviour depending on a (boolean) field of the model?
In one of my models the behaviour is slightly different if the instance is 'forward' vs. 'backward' or 'left' vs. 'right'. That leads to a lot of if-clauses and code duplication. So I want to have a Forward- and a Backward-variant of the model that encapsulate the different behaviours.
But how can I make the model's manager return instances of the right classes? Do I have to override __init__ of the model?
Maybe it's easier to explain with an example. What I'm doing:
class Foo(models.Model):
    forward = models.BooleanField()
    other_fields = ...
    def do_foobar(self, bar):
        if self.forward:
            gap = bar.end_pos - bar.current_pos
            self.do_forward_move(max=gap)
            if self.pos == bar.end_pos:
                ...  # and so on
        else:
            gap = bar.current_pos - bar.start_pos
            self.do_backward_move(max=gap)
            if self.pos == bar.start_pos:
                ...  # and so on
What I want to do:
class Foo(models.Model):
    forward = models.BooleanField()
    other_fields = ...
    def __init__(self, *args, **kwargs):
        """ return ForwardFoo or BackwardFoo
        depending on the value of 'forward'"""
        # How?
    def do_foobar(self, bar):
        gap = self.calculate_gap(bar)
        self.do_move(max=gap)
        if self.end_point_reached():
            ...  # and so on
class ForwardFoo(Foo):
    def calculate_gap(self, bar):
        return bar.end_pos - bar.current_pos
    # and so on ...
for f in Foo.objects.all():
    f.do_foobar(bar)
Or is there a totally different way to avoid this kind of code duplication?
Proxy models:
class Foo(models.Model):
    # all model attributes here
    forward = models.BooleanField()
class ForwardFooManager(models.Manager):
    # get_queryset was named get_query_set before Django 1.6
    def get_queryset(self, *args, **kwargs):
        qs = super(ForwardFooManager, self).get_queryset(*args, **kwargs)
        return qs.filter(forward=True)
class ForwardFoo(Foo):
    class Meta:
        proxy = True
    objects = ForwardFooManager()
    # methods for forward model
class BackwardFooManager(models.Manager):
    def get_queryset(self, *args, **kwargs):
        qs = super(BackwardFooManager, self).get_queryset(*args, **kwargs)
        return qs.filter(forward=False)
class BackwardFoo(Foo):
    class Meta:
        proxy = True
    objects = BackwardFooManager()
    # methods for backward model
The above creates two proxy models: one for forward, one for backward. Proxy models do not have their own database table; they use the same database table as the model they inherit from. (This also means you cannot add any additional fields to the proxy model, only methods.)
There's also a custom manager for each one, to force it to return only the subset of items that belongs to it. Just add whatever specific methods you need and you're done.
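The split into one shared method plus direction-specific overrides does not depend on Django. Here is a plain-Python sketch of that structure: the `load` helper is a hypothetical stand-in for choosing between the two proxy managers, `step` is an invented helper, and `bar` is a simple dict rather than a model instance.

```python
class Foo:
    def __init__(self, forward, pos):
        self.forward = forward
        self.pos = pos

    def do_foobar(self, bar):
        # Shared algorithm: the direction-specific parts live in subclasses.
        self.pos += self.step() * self.calculate_gap(bar)
        return self.pos


class ForwardFoo(Foo):
    def calculate_gap(self, bar):
        return bar["end_pos"] - bar["current_pos"]

    def step(self):
        return 1


class BackwardFoo(Foo):
    def calculate_gap(self, bar):
        return bar["current_pos"] - bar["start_pos"]

    def step(self):
        return -1


def load(rows):
    """Pick the class per row from its boolean 'forward' flag."""
    return [(ForwardFoo if row["forward"] else BackwardFoo)(**row) for row in rows]


rows = [{"forward": True, "pos": 5}, {"forward": False, "pos": 5}]
bar = {"start_pos": 0, "current_pos": 5, "end_pos": 8}
results = [foo.do_foobar(bar) for foo in load(rows)]
# results == [8, 0]
```

The calling loop never inspects the flag; the if/else from the original do_foobar has moved into the choice of class.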
