I have an idea for a data model in Django, and I was wondering if someone could point out the pros and cons of these two setups.
Setup 1: This is the obvious one: a CharField for each field of each object.
class Person(models.Model):
    name = models.CharField(max_length=255)
    surname = models.CharField(max_length=255)
    city = models.CharField(max_length=255)
Setup 2: This is the one I am considering: a ForeignKey to objects that hold the values the current object should have.
class Person(models.Model):
    name = models.ForeignKey('Name')
    surname = models.ForeignKey('Surname')
    city = models.ForeignKey('City')

class Chars(models.Model):
    value = models.CharField(max_length=255)

    def __str__(self):
        return self.value

    class Meta:
        abstract = True

class Name(Chars):
    pass

class Surname(Chars):
    pass

class City(Chars):
    pass
So in setup 1, I would create an Object with:
Person.objects.create(name='Name', surname='Surname', city='City')
and each object would have its own data. In setup 2, I would have to do this:
_name = Name.objects.get_or_create(value='Name')[0]
_surname = Surname.objects.get_or_create(value='Surname')[0]
_city = City.objects.get_or_create(value='City')[0]
Person.objects.create(name=_name, surname=_surname, city=_city)
Question: The main purpose of this would be to reuse existing values across multiple objects, but is it worth doing, given that you need multiple hits on the database to create an object?
Choosing the right design for your application is a broad topic, influenced by many factors that may be out of scope for a Stack Overflow question, so your question is somewhat subjective and broad.
Nevertheless, I would say that a separate model (class) for first name, another for last name, and so on is overkill. You would essentially end up overengineering your app.
The main reasoning behind this recommendation is that you probably do not want to treat a name as a separate entity and attach additional properties to it. Unless you really need such a feature, a name is usually a plain string that some users happen to share.
There is no benefit in keeping name and surname as separate objects/models/DB tables. In your setup, if name and surname are not unique, putting them in separate models makes no sense; worse, it incurs additional DB work and decreases performance. If you do make them unique, you have to handle the case where, e.g., one user changes their name and, by default, the change applies to every user with that name.
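To make the rename problem concrete, here is a minimal sketch. It uses Python's built-in sqlite3 instead of the Django ORM so that it runs on its own, and the table and column names are my own assumptions that only loosely mirror Setup 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE name (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name_id INTEGER REFERENCES name(id)
    );
""")

# Two people share the single 'Alex' row, which is what get_or_create would arrange.
conn.execute("INSERT INTO name (id, value) VALUES (1, 'Alex')")
conn.executemany("INSERT INTO person (id, name_id) VALUES (?, ?)", [(1, 1), (2, 1)])

# Person 1 "changes their name" by editing the shared row in place...
conn.execute("UPDATE name SET value = 'Alexander' WHERE id = 1")

# ...and person 2 is renamed as a side effect.
names = dict(conn.execute(
    "SELECT person.id, name.value FROM person "
    "JOIN name ON name.id = person.name_id ORDER BY person.id"
))
print(names)  # {1: 'Alexander', 2: 'Alexander'}
```

Avoiding this means giving person 1 a new (or different existing) name row instead of editing the shared one, which is extra logic you never need with a plain CharField.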
City is different: there are not that many cities, so it is a good idea to keep it as a separate object and refer to it via a foreign key from the user. This saves disk space and makes it easy to get all users from the same city. Even better, you can prepopulate the cities table and provide autocompletion for users entering their city. For performance, though, you might still want to keep the city as a string on the user model.
As for a 'gender' field: since there are not many possible values, it is worth using an enumeration in your code and storing a value in the DB, i.e. use choices instead of a ForeignKey to a separate DB table.
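As a rough sketch of the enumeration idea, the snippet below builds a choices list from a plain Python Enum. The member names, stored values, and the label property are illustrative assumptions; in a model you would pass the resulting list as choices=GENDER_CHOICES to a CharField (newer Django versions also ship models.TextChoices for the same purpose).

```python
from enum import Enum


class Gender(Enum):
    # The enum value is what would be stored in the database column.
    FEMALE = "F"
    MALE = "M"
    OTHER = "O"

    @property
    def label(self):
        # Human-readable label derived from the member name.
        return self.name.capitalize()


# (stored value, display label) pairs, the shape Django's `choices` expects.
GENDER_CHOICES = [(g.value, g.label) for g in Gender]
print(GENDER_CHOICES)  # [('F', 'Female'), ('M', 'Male'), ('O', 'Other')]

# Turning a stored value back into the enum member needs no database lookup.
print(Gender("F"))  # Gender.FEMALE
```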
Related
Is there a way to set foreign key relationship using the integer id of a model? This would be for optimization purposes.
For example, suppose I have an Employee model:
class Employee(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    type = models.ForeignKey('EmployeeType')
and
class EmployeeType(models.Model):
    type = models.CharField(max_length=100)
I want the flexibility of having unlimited employee types, but in the deployed application there will likely be only a single type so I'm wondering if there is a way to hardcode the id and set the relationship this way. This way I can avoid a db call to get the EmployeeType object first.
Yep:
employee = Employee(first_name="Name", last_name="Name")
employee.type_id = 4
employee.save()
ForeignKey fields store their value in an attribute with _id at the end, which you can access directly to avoid visiting the database.
The _id version of a ForeignKey is a particularly useful aspect of Django, one that everyone should know and use from time to time when appropriate.
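The reason this works is that, at the database level, the foreign key is just an integer column (Django names it after the field with an _id suffix by default). A minimal sketch with plain sqlite3, where the table names and the 'Regular' row are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_type (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        type_id INTEGER REFERENCES employee_type(id)
    );
    INSERT INTO employee_type (id, type) VALUES (4, 'Regular');
""")

# Only the integer is needed to set the relationship;
# the employee_type table is never queried here.
conn.execute(
    "INSERT INTO employee (first_name, last_name, type_id) VALUES (?, ?, ?)",
    ("Name", "Name", 4),
)

stored_type_id = conn.execute("SELECT type_id FROM employee").fetchone()[0]
print(stored_type_id)  # 4
```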
Caveat (Django < 2.1):
@RuneKaagaard points out that employee.type is not accurate afterwards in recent Django versions, even after calling employee.save() (it holds its old value). Using it would of course defeat the purpose of the above optimisation, but I would prefer an accidental extra query to being incorrect. So be careful: only use this when you are finished working on your instance (e.g. employee).
Note: As @humcat points out below, the bug is fixed in Django 2.1.
An alternative that uses create to create the object and save it to the database in one line:
employee = Employee.objects.create(first_name='first', last_name='last', type_id=4)
Context
Hey guys,
So let's say I have two models: Person and Attribute connected by a ManyToMany relationship (one person can have many attributes, one attribute can be shared by many people)
class Attribute(models.Model):
    attribute_name = models.CharField(max_length=100)
    attribute_type = models.CharField(max_length=1)

class Person(models.Model):
    article_name = models.CharField(max_length=100)
    attributes = models.ManyToManyField(Attribute)
Attributes can be things like hair colour, location, university degree.
So for example, an attribute may have an 'attribute_name' of 'Computer Science' and an 'attribute_type' of 'D' (for degree).
Another example would be 'London', 'L'.
The Issue
On this web page, users can select people by attributes. For example, they may want to see all people who live in London and who have degrees in both History and Biology (all AND relationships).
I understand that this could be represented in the following (breaks for legibility):
(
    Person.objects
    .filter(attributes__attribute_name='London', attributes__attribute_type='L')
    .filter(attributes__attribute_name='History', attributes__attribute_type='D')
    .filter(attributes__attribute_name='Biology', attributes__attribute_type='D')
)
However, the user could equally ask for users who have four different degrees. The point being, we don't know how many attributes the user will ask for in the search function.
Questions
As such, which would be the best way to append these filters if we don't know how many, and what types of attributes the user will request?
Is appending filters like this the best way?
Thanks!
Nick
You could collect all the attributes selected by the user and then iterate over them, adding one filter per attribute:
# sel_att holds the user selected attributes.
result = Person.objects.all()
for att in sel_att:
    result = result.filter(
        attributes__attribute_name=att.attribute_name,
        attributes__attribute_type=att.attribute_type
    )
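It may help to see why one filter() per attribute gives the "has all of these" behaviour. On a many-to-many relation, each chained filter() call gets its own join, so each condition can be satisfied by a different Attribute row. The sketch below imitates that in plain sqlite3; the table names, sample rows, and the people_with_all helper are assumptions for illustration and not the SQL Django actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE person_attributes (person_id INTEGER, attribute_id INTEGER);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO attribute VALUES
        (1, 'London', 'L'), (2, 'History', 'D'), (3, 'Biology', 'D');
    INSERT INTO person_attributes VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2);
""")


def people_with_all(conn, wanted):
    """Return names of people who have every (name, type) pair in `wanted`."""
    joins, params = [], []
    for i, (attr_name, attr_type) in enumerate(wanted):
        # One pair of joins per requested attribute, each with its own alias.
        joins.append(
            f"JOIN person_attributes pa{i} ON pa{i}.person_id = p.id "
            f"JOIN attribute a{i} ON a{i}.id = pa{i}.attribute_id "
            f"AND a{i}.name = ? AND a{i}.type = ?"
        )
        params += [attr_name, attr_type]
    sql = (
        "SELECT DISTINCT p.name FROM person p "
        + " ".join(joins)
        + " ORDER BY p.name"
    )
    return [row[0] for row in conn.execute(sql, params)]


print(people_with_all(conn, [("London", "L"), ("History", "D")]))
# ['Ann', 'Bob']
print(people_with_all(conn, [("London", "L"), ("History", "D"), ("Biology", "D")]))
# ['Ann']
```

The number of joins grows with the number of requested attributes, which is worth keeping in mind if users can pick a lot of them.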
Use Q objects for complex lookups.
For example:
from django.db.models import Q
Person.objects.filter(Q(attributes__attribute_name='London') | Q(attributes__attribute_name='History'))
Within a query, a | acts as an OR and a , acts as an AND, pretty much as expected.
The problem with chaining filters is that you can only express AND logic between them; for complex AND, OR, NOT logic, Q objects are the better way to go.
How can you specify choices on a django model such that the "choice" carries more information than just the database value and display value (as required by django's choices spec)? Suppose the different choices have a number of config options that I want to set (in code) as well as some methods, and the methods may be different between different choices. Here's an example:
class Reminder(models.Model):
    frequency = models.CharField(choices=SOME_CHOICES)
    next_reminder = models.DateTimeField()
    ...
How should we specify SOME_CHOICES for something like "weekly" and "monthly" reminders? So far my best solution is to write a class for each frequency choice, and store the class name in the database and then import the class by name whenever I need the methods or config data.
Ideally I would like to specify all of these config values and define the methods for each choice all in one place, rather than have the logic scattered all over a bunch of Model methods with long if/elif/elif...else structures. Python is object-oriented... these frequency choices seem to have "data" and "methods" so they seem like good candidates for classes...
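from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta  # third-party: python-dateutil

class ReminderFrequency(object):
    pass

class Weekly(ReminderFrequency):
    display_value = "Every week"

    def get_next_reminder_time(self):
        return datetime.now() + timedelta(days=7)

class Monthly(ReminderFrequency):
    display_value = "Every month"

    def get_next_reminder_time(self):
        return datetime.now() + relativedelta(months=1)

# A list rather than a generator, so the choices can be iterated more than once.
SOME_CHOICES = [(freq.__name__, freq.display_value) for freq in [Weekly, Monthly]]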
And suppose that in addition to get_next_reminder_time, I want to specify something like first_reminder (let's say for "weekly" your first reminder comes three days from now, and the next one 7 days after that, but for "monthly", the first reminder comes 7 days from now, and the next one month after that, etc). Plus 5 other config values or methods that depend on the choice.
One thought is to make frequency a FK to some other model Frequency where I set the config values, but that does not allow different choices to have different methods or logic (like the weekly vs monthly example). So that's out.
My current approach feels very cumbersome because of the requirement to load each ReminderFrequency subclass by name according to the class name stored in the database. Any other ideas? Or is this "the Right and Good Pythonic Way"?
I think the most natural way of handling this in Django would be to create a custom model field. You would use it like so:
class Reminder(models.Model):
    frequency = FrequencyField()  # your custom field, not part of django.db.models
    next_reminder = models.DateTimeField()

reminder = Reminder.objects.get()
reminder_time = reminder.frequency.get_next_reminder_time()
To implement it, review the relevant documentation. Briefly, you'd probably:
Inherit from CharField
Supply the choices in the field definition
Implement get_prep_value(). You could represent values as actual class names, like you have above, or by some other value and use a lookup table.
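Implement to_python() (and, on Django 1.8 and later, from_db_value(), which is what gets called when values are loaded from the database). This is where you'll convert the database representation into an actual Python instance of your classes.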
It's a little more involved than that, but not much.
(The above assumes that you want to define the behavior in code. If you need to configure the behavior by supplying configuration values to the database (as you suggested above with the ForeignKey idea) that's another story.)
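The lookup-table variant mentioned in the steps above could look roughly like this. It is plain Python with no Django dependency: to_python and get_prep_value here are standalone functions that only mirror what the field methods of the same name would do, and the registry, key, interval, and first_delay names are my own. I also approximate "one month" as 30 days to stay within the standard library.

```python
from datetime import datetime, timedelta


class ReminderFrequency:
    registry = {}       # stored key -> frequency class
    key = None          # short string saved in the database
    display_value = None
    interval = None     # gap between consecutive reminders
    first_delay = None  # gap before the very first reminder

    def __init_subclass__(cls, **kwargs):
        # Every subclass registers itself under its key when it is defined.
        super().__init_subclass__(**kwargs)
        ReminderFrequency.registry[cls.key] = cls

    def first_reminder(self, now):
        return now + self.first_delay

    def get_next_reminder_time(self, now):
        return now + self.interval


class Weekly(ReminderFrequency):
    key = "weekly"
    display_value = "Every week"
    interval = timedelta(days=7)
    first_delay = timedelta(days=3)


class Monthly(ReminderFrequency):
    key = "monthly"
    display_value = "Every month"
    interval = timedelta(days=30)  # assumption: a month approximated as 30 days
    first_delay = timedelta(days=7)


def to_python(value):
    """Database string -> frequency instance."""
    return ReminderFrequency.registry[value]()


def get_prep_value(frequency):
    """Frequency instance -> database string."""
    return frequency.key


SOME_CHOICES = [(k, cls.display_value) for k, cls in ReminderFrequency.registry.items()]

now = datetime(2024, 1, 1)
freq = to_python("weekly")
print(freq.first_reminder(now))          # 2024-01-04 00:00:00
print(freq.get_next_reminder_time(now))  # 2024-01-08 00:00:00
```

Storing a short key instead of the class name means you can rename or move the classes later without touching existing rows.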
I am using appengine with python 2.7 and webapp2 framework. I am not using ndb.model.
I have the following model:
class Story(db.Model):
    name = db.StringProperty()

class UserProfile(db.Model):
    name = db.StringProperty()
    user = db.UserProperty()

class Tracking(db.Model):
    user_profile = db.ReferenceProperty(UserProfile)
    story = db.ReferenceProperty(Story)
    upvoted = db.BooleanProperty()
    flagged = db.BooleanProperty()
A user can upvote and/or flag a story but only once. Hence I came up with the above model.
Now, when a user clicks on the upvote link, I check in the database whether the user has already voted on the story. To do that, I try the following:
get the user instance with his id as up = db.get(db.Key.from_path('UserProfile', uid))
then get the story instance as follows s_ins = db.get(db.Key.from_path('Story', uid))
Now it is the turn to check if a Tracking based on these two exist, if yes then don't allow voting, else allow him to vote and update the Tracking instance.
What is the most convenient way to fetch a Tracking instance given the ids (db.key().id()) of a user_profile and a story?
What is the most convenient way to save a Tracking model given a user profile id and a story id?
Is there a better way to implement tracking?
You can try tracking using lists of keys versus having a separate entry for track/user/story:
class Story(db.Model):
    name = db.StringProperty()

class UserProfile(db.Model):
    name = db.StringProperty()
    user = db.UserProperty()

class Tracking(db.Model):
    story = db.ReferenceProperty(Story)
    upvoted = db.ListProperty(db.Key)
    flagged = db.ListProperty(db.Key)
So when you want to see if a user upvoted for a given story:
Tracking.all().filter('story =', db.Key.from_path('Story', uid)).filter('upvoted =', db.Key.from_path('UserProfile', uid)).get(keys_only=True)
Now the only problem here is the size of the upvoted/flagged lists can't grow too large (I think the limit is 5000), so you'd have to make a class to manage this (that is, when adding to the upvoted/flagged lists, detect if X entries exists, and if so, start a new tracking object to hold additional values). You will also have to make this transactional and with HR you have a 1 write per second threshold. This may or may not be an issue depending on your expected use case. A way around the write threshold would be to implement upvotes/flags using pull-queues and to have a cron job that pulls and batch updates tracking objects as needed.
This method has its pros/cons. The most obvious cons are the ones I just listed. The pros, however, may be worth it. You can get a full list of users who upvoted/flagged a story from a single list (or multiple depending on how popular the story is). You can get a full list of users with a lot fewer queries to the datastore. This method should also take less storage, index, and metadata space. Additionally, adding a user to a tracking object will be cheaper, instead of writing a new object + 2 writes for each property, you would just be charged 1 write for the object + 2 writes for the entry to the list (9 vs 3 writes for adding users to a pre-existing tracked story, or 9 vs 7 for untracked stories)
What you propose sounds reasonable.
Don't use the app engine generated key for Tracking. Because the combination of story/user should be unique, create your own key as a combination of the story/user. Something like
tracking = Tracking.get_or_insert(str(story.id) + "-" + str(user.id), **params)
If you know the story/user, then you can always fetch the tracking by key name.
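Here is a small sketch of why a composite key name gives you the "only once" guarantee. It does not touch the datastore: InMemoryTrackingStore is a made-up stand-in whose get_or_insert only imitates the behaviour described above, and tracking_key_name is a hypothetical helper.

```python
def tracking_key_name(story_id, user_id):
    """Build the key name shared by every lookup for this story/user pair."""
    return "%s-%s" % (story_id, user_id)


class InMemoryTrackingStore:
    """Stand-in for the datastore: at most one record per key name."""

    def __init__(self):
        self._rows = {}

    def get_or_insert(self, key_name, **defaults):
        # Returns the existing record if present, otherwise stores the defaults.
        return self._rows.setdefault(key_name, dict(defaults))


store = InMemoryTrackingStore()
first = store.get_or_insert(tracking_key_name(12, 7), upvoted=True, flagged=False)
second = store.get_or_insert(tracking_key_name(12, 7), upvoted=False, flagged=True)

print(second)           # {'upvoted': True, 'flagged': False}
print(first is second)  # True
```

The second call cannot create a duplicate record for the same story/user pair; it just hands back the one that already exists.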
Howdy. I'm working on migrating an internal system to Django and have run into a few wrinkles.
Intro
Our current system (a billing system) tracks double-entry bookkeeping while allowing users to enter data as invoices, expenses, etc.
Base Objects
So I have two base objects/models:
JournalEntry
JournalEntryItems
defined as follows:
class JournalEntry(models.Model):
    gjID = models.AutoField(primary_key=True)
    date = models.DateTimeField('entry date')
    memo = models.CharField(max_length=100)

class JournalEntryItem(models.Model):
    journalEntryID = models.AutoField(primary_key=True)
    gjID = models.ForeignKey(JournalEntry, db_column='gjID')
    amount = models.DecimalField(max_digits=10, decimal_places=2)
So far, so good. It works quite smoothly on the admin side (inlines work, etc.)
On to the next section.
We then have two more models
InvoiceEntry
InvoiceEntryItem
An InvoiceEntry is a superset of / it inherits from JournalEntry, so I've been using a OneToOneField (which is what we're using in the background on our current site). That works quite smoothly too.
class InvoiceEntry(JournalEntry):
    invoiceID = models.AutoField(primary_key=True, db_column='invoiceID', verbose_name='')
    journalEntry = models.OneToOneField(JournalEntry, parent_link=True, db_column='gjID')
    client = models.ForeignKey(Client, db_column='clientID')
    datePaid = models.DateTimeField(null=True, db_column='datePaid', blank=True, verbose_name='date paid')
Where I run into problems is when trying to add an InvoiceEntryItem (which inherits from JournalEntryItem) to an inline related to InvoiceEntry. I'm getting the error:
<class 'billing.models.InvoiceEntryItem'> has more than 1 ForeignKey to <class 'billing.models.InvoiceEntry'>
The way I see it, InvoiceEntryItem has a ForeignKey directly to InvoiceEntry. And it also has an indirect ForeignKey to InvoiceEntry through the JournalEntry 1->M JournalEntryItems relationship.
Here's the code I'm using at the moment.
class InvoiceEntryItem(JournalEntryItem):
    invoiceEntryID = models.AutoField(primary_key=True, db_column='invoiceEntryID', verbose_name='')
    invoiceEntry = models.ForeignKey(InvoiceEntry, related_name='invoiceEntries', db_column='invoiceID')
    journalEntryItem = models.OneToOneField(JournalEntryItem, db_column='journalEntryID')
I've tried removing the journalEntryItem OneToOneField. Doing that then removes my ability to retrieve the dollar amount for this particular InvoiceEntryItem (which is only stored in journalEntryItem).
I've also tried removing the invoiceEntry ForeignKey relationship. Doing that removes the relationship that allows me to see the InvoiceEntry 1->M InvoiceEntryItems in the admin inline. All I see are blank fields (instead of the actual data that is currently stored in the DB).
It seems like option 2 is closer to what I want to do. But my inexperience with Django seems to be limiting me. I might be able to filter the larger pool of journal entries to see just invoice entries. But it would be really handy to think of these solely as invoices (instead of a subset of journal entries).
Any thoughts on how to do what I'm after?
First, inheriting from a model automatically creates a OneToOneField from the child model to its parent, so you don't need to add one yourself. Remove the explicit ones if you really want to use this form of model inheritance.
If you only want to share the fields of the model, you can use an abstract base class (abstract = True in the parent's Meta), which creates the inherited columns in the table of each child model. This would split your journal entries across two tables, but it would make it easy to retrieve only the invoices.
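For intuition, here is roughly what multi-table inheritance looks like at the table level, sketched with plain sqlite3. The table names and sample rows are assumptions; journalentry_ptr_id follows the naming Django uses by default for the automatic parent link, but treat the whole schema as an illustration and not the real generated DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE journal_entry (gjID INTEGER PRIMARY KEY, memo TEXT);
    -- The child table's primary key doubles as the one-to-one link to the parent.
    CREATE TABLE invoice_entry (
        journalentry_ptr_id INTEGER PRIMARY KEY REFERENCES journal_entry(gjID),
        clientID INTEGER
    );
    INSERT INTO journal_entry VALUES
        (1, 'plain journal entry'), (2, 'invoice for client 9');
    INSERT INTO invoice_entry VALUES (2, 9);
""")

# Fetching invoices means joining the child table back to its parent,
# which is how the parent's fields become available on the child.
invoices = conn.execute("""
    SELECT j.gjID, j.memo, i.clientID
    FROM invoice_entry i
    JOIN journal_entry j ON j.gjID = i.journalentry_ptr_id
""").fetchall()
print(invoices)  # [(2, 'invoice for client 9', 9)]
```

With an abstract base class there would be no journal_entry table to join against; each child table would carry its own copy of the gjID and memo columns.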
All fields in the superclass also exist on the subclass, so having an explicit relation is unnecessary.
Model inheritance in Django is terrible. Don't use it. Python doesn't need it anyway.