Appending filters to django models - python

Context
Hey guys,
So let's say I have two models: Person and Attribute connected by a ManyToMany relationship (one person can have many attributes, one attribute can be shared by many people)
class Attribute(models.Model):
    attribute_name = models.CharField(max_length=100)
    attribute_type = models.CharField(max_length=1)

class Person(models.Model):
    article_name = models.CharField(max_length=100)
    attributes = models.ManyToManyField(Attribute)
Attributes can be things like hair colour, location, university degree.
So for example, an attribute may have an 'attribute_name' of 'Computer Science' and an 'attribute_type' of 'D' (for degree).
Another example would be 'London', 'L'.
The Issue
On this web page, users can select people by attributes. For example, they may want to see all people who live in London and who have degrees in both History and Biology (all AND relationships).
I understand that this could be represented in the following (breaks for legibility):
(Person.objects
    .filter(attributes__attribute_name='London', attributes__attribute_type='L')
    .filter(attributes__attribute_name='History', attributes__attribute_type='D')
    .filter(attributes__attribute_name='Biology', attributes__attribute_type='D'))
However, the user could equally ask for users who have four different degrees. The point being, we don't know how many attributes the user will ask for in the search function.
Questions
As such, which would be the best way to append these filters if we don't know how many, and what types of attributes the user will request?
Is appending filters like this the best way?
Thanks!
Nick

You could obtain all attributes selected by the user and then iterate over:
# sel_att holds the user selected attributes.
result = Person.objects.all()
for att in sel_att:
    result = result.filter(
        attributes__attribute_name=att.attribute_name,
        attributes__attribute_type=att.attribute_type
    )
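The loop above can be wrapped in a small helper. This is only a sketch, not the answerer's code: the name apply_attribute_filters and the (name, type) tuple format for the user's selection are assumptions made for illustration. It relies only on the object having a Django-style .filter(**kwargs) method that returns a new queryset.

```python
def apply_attribute_filters(queryset, selected):
    """Chain one .filter() call per (name, type) pair.

    `queryset` is anything with a Django-style .filter(**kwargs) method,
    for example Person.objects.all(). `selected` is a hypothetical list of
    (attribute_name, attribute_type) tuples taken from the search form.
    """
    for name, type_code in selected:
        # One separate filter() call per pair, as in the loop above.
        queryset = queryset.filter(
            attributes__attribute_name=name,
            attributes__attribute_type=type_code,
        )
    return queryset
```

With the models from the question this could be called as apply_attribute_filters(Person.objects.all(), [('London', 'L'), ('History', 'D')]). Each pair gets its own filter() call on purpose: for a many-to-many lookup, conditions inside a single filter() call must all match the same related Attribute row, so putting London and History into one call would match nobody.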

Use Q objects for complex lookups.
For example:
from django.db.models import Q
Person.objects.filter(Q(attributes__attribute_name='London') | Q(attributes__attribute_name='History'))
Within a filter() call, a | between Q objects acts as an OR and a , acts as an AND, pretty much as expected.
The problem with chaining filters is that you can only express AND logic between them; for complex AND, OR, NOT logic, Q objects are the better way to go.
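When the number of OR conditions is not known in advance, they can be folded together with reduce. The sketch below is generic on purpose: any_of is a hypothetical helper name, and it works on any objects that support the | operator, which includes Django's Q objects. The Django usage is shown only in comments, with made-up input values.

```python
from functools import reduce
import operator


def any_of(conditions):
    """OR together a non-empty list of objects that support `|`."""
    return reduce(operator.or_, conditions)


# With Django (not imported here), the conditions would be Q objects:
#   from django.db.models import Q
#   names = ['London', 'History']   # hypothetical user input
#   query = any_of([Q(attributes__attribute_name=n) for n in names])
#   people = Person.objects.filter(query)
```

Swapping operator.or_ for operator.and_ gives the AND version of the same fold.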

Related

Best way to model current version/single elevated list item in Django?

Suppose I'm Santa Claus and I'm making a site to keep track of kids' Christmas lists. I might have something like this:
class Kid(models.Model):
    name = models.CharField()

class Gift(models.Model):
    name = models.CharField()

class WishList(models.Model):
    date_received = models.DateTimeField()
    kid = models.ForeignKey(Kid, on_delete=models.CASCADE)
I have implemented the WishList model, instead of just linking Gifts to Kids with ForeignKeys directly, because I would like to keep track of individual Christmas wish lists discretely. I want to archive all the old Christmas lists as I received them. The most obvious way to implement this would be to add something like current = models.BooleanField() to the WishList class, and then filter for wishlist.current=true when I want to get the current wishlist for a given kid. However, there are two problems with this: 1. I don't know how the database query algorithm works behind the scenes, and I don't want to make the computer look through every list to find the current one, when 99% of the time that's the only one I'll be looking at, and 2. this doesn't prevent me from having two current wish lists, and I don't want to have to write special validator code to ensure that.
The second option I thought of would be to simply display the most recently received WishList. However, this doesn't satisfy all of my usage needs--I might have a kid write me saying, "wait, actually, disregard the list I sent in October, and use the one I sent in August instead." I could, of course, make a copy of the August list marked with today's date, but that feels both unnecessary and confusing.
The third option I thought of was that I could get rid of the ForeignKey field in WishList, and instead put the links in the Kid model:
class Kid(models.Model):
    name = models.CharField()
    current_wishlist = models.OneToOneField(WishList)
    archived_wishlists = models.ManyToManyField(WishList)
This last one seems the most promising, but I am unfamiliar with OneToOneFields and ManyToManyFields, and am unsure if it is best practice. It also feels bad to have two separate fields for one type of model relation. Can anyone give me some guidance on what the best way to accomplish this would be?
I'm not sure what would be the best way, but you could also do something like:
class Kid(models.Model):
    name = models.CharField()
    # Two relations to the same model need distinct related_name values.
    current_wishlist = models.ManyToManyField(Gift, related_name='current_for')
    archived_wishlists = models.ManyToManyField(Gift, related_name='archived_for')
and when archiving a gift just do
kidObj.current_wishlist.remove(giftObj)
kidObj.archived_wishlists.add(giftObj)
and viewing them
listofgifts = kidObj.current_wishlist.all()
powerrangergifts = kidObj.current_wishlist.filter(name__icontains='Power Ranger')
The standard approach is a ForeignKey on WishList pointing to Kid, as in your first version. The alternative,
archived_wishlists = models.ManyToManyField(WishList)
works, but it would let you add one wishlist to multiple kids.
Kid to WishList is a one-to-many relation, so the foreign key has to go into the "many" model.
"I don't know how the database query algorithm works behind the scenes, and I don't want to make the computer look through every list to find the current one, when 99% of the time that's the only one I'll be looking at"
Well, you don't have to know.
"this doesn't prevent me from having two current wish lists, and I don't want to have to write special validator code to ensure that."
Then go for the OneToOneField, but you have to ensure that the kid matches the one referenced in the wishlist.

ForeignKey vs CharField

I have an idea for data model in django and I was wondering if someone can point out pros and cons for these two setups.
Setup 1: This would be an obvious one. Using CharFields for each field of each object
class Person(models.Model):
    name = models.CharField(max_length=255)
    surname = models.CharField(max_length=255)
    city = models.CharField(max_length=255)
Setup 2: This is the one I am thinking about. Using a ForeignKey to Objects that contain the values that current Object should have.
class Person(models.Model):
    name = models.ForeignKey('Name')
    surname = models.ForeignKey('Surname')
    city = models.ForeignKey('City')

class Chars(models.Model):
    value = models.CharField(max_length=255)

    def __str__(self):
        return self.value

    class Meta:
        abstract = True

class Name(Chars): pass
class Surname(Chars): pass
class City(Chars): pass
So in setup 1, I would create an Object with:
Person.objects.create(name='Name', surname='Surname', city='City')
and each object would have its own data. In setup 2, I would have to do this:
_name = Name.objects.get_or_create(value='Name')[0]
_surname = Surname.objects.get_or_create(value='Surname')[0]
_city = City.objects.get_or_create(value='City')[0]
Person.objects.create(name=_name, surname=_surname, city=_city)
Question: Main purpose for this would be to reuse existing values for multiple objects, but is this something worth doing, when you take into consideration that you need multiple hits on the database to create an Object?
Choosing the correct design pattern for your application is a very wide area which is influenced by many factors that are even possibly out of scope in a Stack Overflow question. So in a sense your question could be a bit subjective and too broad.
Nevertheless, I would say that assigning a separate model (class) for first name, another separate for last name etc. is an overkill. You might essentially end up overengineering your app.
The main reasoning behind the above recommendation is that you probably do not want to treat a name as a separate entity and possibly attach additional properties to it. Unless you really would need such a feature, a name is usually a plain string that some users happen to have identical.
There is no benefit in keeping name and surname as separate objects/models/DB tables. In your setup, if you don't make name and surname unique, then it makes no sense to put them in separate models. Even worse, it will incur additional DB work and decrease performance. If you do make them unique, then you have to handle the situation where, e.g., one user changes their name and, by default, it changes for every user with that name.
City is different: there are not that many cities, so it is a good idea to keep it as a separate object and refer to it via a foreign key from the user. This will save disk space and make it easy to get all users from the same city. Even better, you can prepopulate the cities table and provide autocompletion for users entering their city. For performance, though, you might still want to keep the city as a string on the user model.
Also, regarding a 'gender' field: since there are not many possible values for this data, it is worth using an enumeration in your code and storing a value in the DB, i.e. use choices instead of a ForeignKey to a separate DB table.

GAE Datastore ndb models accessed in 5 different ways

I run an online marketplace. I don't know the best way to access NDB models. I'm afraid it's a real mess and I really don't know which way to turn. If you don't have time for a full response, I'm happy to read an article on NDB best practices
I have these classes, which are interlinked in different ways:
User(webapp2_extras.appengine.auth.models.User) controls seller logins
Partner(ndb.Model) contains information about sellers
menuitem(ndb.Model) contains information about items on menu
order(ndb.Model) contains buyer information & information about an order (all purchases are "guest" purchases)
Preapproval(ndb.Model) contains payment information saved from PayPal
How they're linked.
User - Partner
A 1-to-1 relationship. Both have "email address" fields. If these match, then can retrieve user from partner or vice versa. For example:
user = self.user
partner = model.Partner.get_by_email(user.email_address)
Where in the Partner model we have:
@classmethod
def get_by_email(cls, partner_email):
    query = cls.query(Partner.email == partner_email)
    return query.fetch(1)[0]
Partner - menuitem
menuitems are children of Partner. Created like so:
myItem = model.menuitem(parent=model.partner_key(partner_name))
menuitems are referenced like this:
menuitems = model.menuitem.get_by_partner_name(partner.name)
where get_by_partner_name is this:
@classmethod
def get_by_partner_name(cls, partner_name):
    query = cls.query(
        ancestor=partner_key(partner_name)).order(ndb.GenericProperty("itemid"))
    return query.fetch(300)
and where partner_key() is a function just floating at the top of the model.py file:
def partner_key(partner_name=DEFAULT_PARTNER_NAME):
    return ndb.Key('Partner', partner_name)
Partner - order
Each Partner can have many orders. order has a parent that is Partner. How an order is created:
partner_name = self.request.get('partner_name')
partner_k = model.partner_key(partner_name)
myOrder = model.order(parent=partner_k)
How an order is referenced:
myOrder_k = ndb.Key('Partner', partnername, 'order', ordernumber)
myOrder = myOrder_k.get()
and sometimes like so:
order = model.order.get_by_name_id(partner.name, ordernumber)
(where in model.order we have:
@classmethod
def get_by_name_id(cls, partner_name, id):
    return ndb.Key('Partner', partner_name, 'order', int(id)).get()
)
This doesn't feel particularly efficient, particularly as I often have to look up the partner in the datastore just to pull up an order. For example:
user = self.user
partner = model.Partner.get_by_email(user.email_address)
order = model.order.get_by_name_id(partner.name, ordernumber)
Have tried desperately to get something simple like myOrder = order.get_by_id(ordernumber) to work, but it seems that having a partner parent stops that working.
Preapproval - order.
a 1-to-1 relationship. Each order can have a 'Preapproval'. Linkage: a field in the Preapproval class: order = ndb.KeyProperty(kind=order).
creating a Preapproval:
item = model.Preapproval( order=myOrder.key, ...)
accessing a Preapproval:
preapproval = model.Preapproval.query(model.Preapproval.order == order.key).get()
This seems like the easiest method to me.
TL;DR: I'm linking & accessing models in many ways, and it's not very systematic.
User - Partner
You could replace:
@classmethod
def get_by_email(cls, partner_email):
    query = cls.query(Partner.email == partner_email)
    return query.fetch(1)[0]
with:
@classmethod
def get_by_email(cls, partner_email):
    # get() returns the first match, or None if there is none
    return cls.query(Partner.email == partner_email).get()
But because of transaction issues, it is better to use entity groups: User should be the parent of Partner.
In this case instead of using get_by_email you can get user without queries:
user = partner.key.parent().get()
Or do an ancestor query for getting the partner object:
partner = Partner.query(ancestor=user_key).get()
Query
Don't use fetch() if you don't need it. Use queries as iterators.
Instead of:
return query.fetch(300)
just:
return query
And then use query as:
for something in query:
    blah
Relationships: Partner-Menu Item and Partner - Order
Why are you using entity groups? Ancestors are not (necessarily) used for modeling 1-to-N relationships. Ancestors are used for transactions, by defining entity groups. They are useful in composition relationships (e.g. partner - user).
You can use a KeyProperty for the relationship (multi-valued, i.e. repeated=True, or not, depending on the direction of the relationship).
Have tried desperately to get something simple like myOrder = order.get_by_id(ordernumber) to work, but it seems that having a partner parent stops that working.
No problem if you stop using ancestors in this relationship.
TL;DR: I'm linking & accessing models in many ways, and it's not very systematic
There is no single systematic way of linking models. It depends on many factors: cardinality, the number of possible items on each side, the need for transactions, composition relationships, indexes, the complexity of future queries, denormalization for optimization, etc.
Ok, I think the first step in cleaning this up is as follows:
At the top of your .py file, import all your models, so you don't have to keep using model.ModelName. That cleans up a bit of the code. model.ModelName becomes ModelName.
First best practice in cleaning this up is to always use a capital letter as the first letter to name a class. A model name is a class. Above, you have mixed model names, like Partner, order, menuitem. It makes it hard to follow. Plus, when you use order as a model name, you may end up with conflicts. Above you redefined order as a variable twice. Use Order as the model name, and this_order as the lookup, and order_key as the key, to clear up some conflicts.
Ok, let's start there

Django: Grouping and ordering across foreign keys with conditions

I have some Django models that record people's listening habits (a bit like Last.fm), like so:
class Artist(models.Model):
    name = models.CharField()

class Song(models.Model):
    artist = models.ForeignKey(Artist)
    title = models.CharField()

class SongPlay(models.Model):
    song = models.ForeignKey(Song)
    user = models.ForeignKey(User)
    time = models.DateTimeField()

class User(models.Model):
    # doesn't really matter!
    pass
I'd like to have a user page where I can show the top songs that they've listened to in the past month. What's the best way to do this?
The best I've come up with so far is:
(SongPlay.past_month
    .filter(user=user)
    .values('song__title', 'song__id', 'song__artist__name')
    .annotate(plays=Count('song'))
    .order_by('-plays')[:20])
Above, past_month is a manager that just filters plays from the last month. Assume that we've already got the correct user object to filter by as well.
I guess my two questions are:
How can I get access to the original object as well as the plays annotation?
This just gives me certain values, based on what I pass to values. I'd much rather have access to the original object – the model has methods I'd like to call.
How can I group from SongPlay to Artist?
I'd like to show a chart of artists, as well as a chart of songs.
You can use the same field in both values and annotate.
You have the primary key of the Song object (you could just use song instead of song__id), so use
Song.objects.get(id=...)
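Calling get() once per row costs one query per song. As a rough sketch of an alternative, the rows could be paired with their Song instances in a single lookup. The helper name attach_objects and its parameters are assumptions for illustration; the only Django call assumed is Song.objects.in_bulk, which takes a list of ids and returns a dict mapping each id to its object.

```python
def attach_objects(rows, fetch_by_ids, id_key="song__id"):
    """Pair each row from a values() query with its model instance.

    `rows` is a list of dicts such as the values()/annotate() query yields.
    `fetch_by_ids` takes a list of ids and returns {id: object}; with
    Django this could be Song.objects.in_bulk.
    """
    ids = [row[id_key] for row in rows]
    objects = fetch_by_ids(ids)  # one lookup for all ids
    # Keep the order of `rows`, which is already sorted by play count.
    return [(objects[row[id_key]], row["plays"]) for row in rows]
```

Assuming the top-20 query from the question is stored in a variable called top_rows, usage would look like attach_objects(list(top_rows), Song.objects.in_bulk), giving a list of (song, plays) tuples in chart order.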
For your second question, do a separate query with song__artist as the field in values and annotate:
from django.db.models import Count
(SongPlay.past_month
    .filter(user=user)
    .values('song__artist')
    .annotate(plays=Count('song__artist'))
    .order_by('-plays')[:20])
agf has already shown you how to group by song__artist. What I would do to get the actual Song object is store it in memcached or, if the method you are calling is rather simple, make it a static method or a class method. You could also initialize a Song object with the data from the query, without saving it, to get access to the method. It would help to know the details of the methods you want to call on the Song object.

modelling the google datastore/python

Hi I am trying to build an application which has models resembling something like the below ones:-(While it would be easy to merge the two models into one and use them , but that is not feasible in the actual app)
class User(db.Model):
    username = db.StringProperty()
    email = db.StringProperty()

class UserLikes(db.Model):
    username = db.StringProperty()
    food = db.StringProperty()
The objective- The user after logging in enters the food that he likes and the app in turn returns all the other users who like that food.
Now suppose a user Alice enters that she likes "Pizzas" , it gets stored in the datastore. She logs out and logs in again.At this point we query the datastore for the food that she likes and then query again for all users who like that food. This as you see are two datastore queries which is not the best way. I am sure there would definitely be a better way to do this. Can someone please help.
[Update:-Or can something like this be done that I change the second model such that usernames become a multivalued property in which all the users that like that food can be stored.. however I am a little unclear here]
[Edit:-Hi Thanks for replying but I found both the solutions below a bit of a overkill here. I tried doing it like below.Request you to have a look at this and kindly advice. I maintained the same two tables,however changed them like below:-
class User(db.Model):
    username = db.StringProperty()
    email = db.StringProperty()

class UserLikes(db.Model):
    username = db.ListProperty(basestring)
    food = db.StringProperty()
Now when 2 users update same food they like, it gets stored like
'pizza' ----> 'Alice','Bob'
And my db query to retrieve data becomes quite easy here
query=db.Query(UserLikes).filter('username =','Alice').get()
which I can then iterate over as something like
for elem in query.username:
    print elem
Now if there are two foods like below:-
'pizza' ----> 'Alice','Bob'
'bacon'----->'Alice','Fred'
I use the same query as above , and iterate over the queries and then the usernames.
I am quite new to this , to realize that this just might be wrong. Please Suggest!
Besides the relation model you have, you could handle this in two other ways, depending on your exact use case. You have a good idea in your update: use a ListProperty. Check out Brett Slatkin's talk on Relation Indexes for some background.
You could use a child entity (Relation Index) on user that contains a list of foods:
class UserLikes(db.Model):
    food = db.StringListProperty()
Then when you are creating a UserLikes instance, you will define the user it relates to as the parent:
likes = UserLikes(parent=user)
That lets you query for other users who like a particular food nicely:
like_apples_keys = UserLikes.all(keys_only=True).filter('food =', 'apples')
user_keys = [key.parent() for key in like_apples_keys]
users_who_like_apples = db.get(user_keys)
However, what may suit your application better, would be to make the Relation a child of a food:
class WhoLikes(db.Model):
    users = db.StringListProperty()
Set the key_name to the name of the food when creating the like:
food_i_like = WhoLikes(key_name='apples')
Now, to get all users who like apples:
apple_lover_key_names = WhoLikes.get_by_key_name('apples')
apple_lovers = UserModel.get_by_key_name(apple_lover_key_names.users)
To get all users who like the same stuff as a user:
same_likes = WhoLikes.all().filter('users', current_user_key_name)
like_the_same_keys = set()
for keys in same_likes:
    # update() changes the set in place; union() would return a new set that gets discarded
    like_the_same_keys.update(keys.users)
same_like_users = UserModel.get_by_key_name(list(like_the_same_keys))
If you will have lots of likes, or lots users with the same likes, you will need to make some adjustments to the process. You won't be able to fetch 1,000s of users.
The relation between Food and User is a so-called many-to-many relationship, typically handled with a join table; in this case, a db.Model that links User and Food.
Something like this:
class User(db.Model):
    name = db.StringProperty()

    def get_food_I_like(self):
        # self.foods yields UserFood entities, so follow .food to reach the name
        return (entity.food.name for entity in self.foods)

class Food(db.Model):
    name = db.StringProperty()

    def get_users_who_like_me(self):
        # self.users yields UserFood entities, so follow .user to reach the name
        return (entity.user.name for entity in self.users)

class UserFood(db.Model):
    user = db.ReferenceProperty(User, collection_name='foods')
    food = db.ReferenceProperty(Food, collection_name='users')
For a given User's entity you could retrieve preferred food with:
userXXX.get_food_I_like()
For a given Food's entity, you could retrieve users that like that food with:
foodYYY.get_users_who_like_me()
There's also another approach to handle many to many relationship storing a list of keys inside a db.ListProperty().
class Food(db.Model):
    name = db.StringProperty()

class User(db.Model):
    name = db.StringProperty()
    food = db.ListProperty(db.Key)
Remember that a ListProperty is limited to 5,000 keys and that, again, you can't attach extra properties that would fit naturally in a join table (e.g. a number of stars representing how much a User likes a Food).
