How to get an extra count field with Django ORM?

My Django Models are like this:
class User(models.Model):
    username = models.CharField(max_length=32)

class Message(models.Model):
    content = models.TextField()

class UserMessageRel(models.Model):
    user = models.ForeignKey(User)
    message = models.ForeignKey(Message)
    is_read = models.BooleanField()
Now I want to get all messages, and for each message I need to know how many of the users who received it have read it.
The naive way to do it is:
msgs = Message.objects.all()
messages = []
for msg in msgs:
    reads = UserMessageRel.objects.filter(message=msg, is_read=True).count()
    messages.append((msg, reads))
But this is very inefficient: it runs one SQL query per message to get the number of reads.
I am not sure if this can be done with annotations or aggregations in ORM?
What I want is something like this:
msgs_with_reads = Message.objects.all().annotate(
    number_of_reads=Count("user_message_rel_with_is_read_true"))
which can be translated into one nice SQL query.
Is this achievable?

I'm interpreting your question to be that you want to improve query time for this count. Unfortunately, with the current setup, a full table scan is necessary. There are ways to improve it, the easiest being indexing. You could add an index on the Message id column in UserMessageRel, which would speed up the read time (at the cost of space, of course). The most readable way to access this count though, is Pieter's answer.

You can do a reverse lookup from the Message object. I would put a helper method like this on the Message model, so that you can call it on any message instance:
def get_read_count(self):
    return self.usermessagerel_set.filter(is_read=True).count()

message_obj.get_read_count()

I didn't find a way to use the Django ORM to generate one SQL query for my requirement, but the following code generates 2 queries:
from django.db.models import Count

messages = Message.objects.all()
messageReads = (
    UserMessageRel.objects.filter(is_read=True)
    .values("message_id")
    .annotate(cnt=Count("user"))
)
Then I can map the messages to their read counts in Python.
This solution is good enough for me.
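The mapping step is not shown above, so here is a minimal sketch of it in plain Python. The two lists stand in for the rows the two querysets would return; the sample data and the `reads_by_message` name are made up for illustration.

```python
# Stand-ins for the two query results (illustrative data, not from a real database).
messages = [
    {"id": 1, "content": "hello"},
    {"id": 2, "content": "bye"},
    {"id": 3, "content": "ping"},
]
# Same shape as rows from .values("message_id").annotate(cnt=Count("user")).
message_reads = [
    {"message_id": 1, "cnt": 2},
    {"message_id": 3, "cnt": 1},
]

# Build a lookup from message id to read count in one pass.
reads_by_message = {row["message_id"]: row["cnt"] for row in message_reads}

# Messages nobody has read are absent from the second query, so default to 0.
result = [(msg, reads_by_message.get(msg["id"], 0)) for msg in messages]
```

On Django 2.0 and later, a single query should also be possible with conditional aggregation, for example `Message.objects.annotate(number_of_reads=Count("usermessagerel", filter=Q(usermessagerel__is_read=True)))`. That goes beyond what this answer describes.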

Related

Is there a better way to design the Message model?

I have a Message model:
class Message(models.Model):
    """
    message
    """
    title = models.CharField(max_length=64, help_text="title")
    content = models.CharField(max_length=1024, help_text="content")
    is_read = models.BooleanField(default=False, help_text="whether message is read")
    create_user = models.ForeignKey(User, related_name="messages", help_text="creator")
    receive_user = models.CharField(max_length=1024, help_text="receive users' id")

    def __str__(self):
        return self.title

    def __unicode__(self):
        return self.title
As you can see, I use a models.CharField to store the users' ids, so I know which users should receive each message.
I don't know whether this design is good. Is there a better way to do this?
I have considered using a ManyToManyField, but I think that if there are many users, creating one message in the admin will create as many rows as there are users, so I don't think this is a good idea.

I would definitely use ManyToManyField for your receive_user. You're going to find that keeping a CharField updated and sanitised with user_ids is going to be a nightmare that will involve re-implementing vast swathes of existing Django functionality.
I'm not sure I understand your potential issue with using ManyToManyField. Users of the admin will be able to select which users are to be recipients of the message; it doesn't automatically create a message for each user.
Edit: Also, depending on which version of Python you're using (2 or 3), you only need one of either __str__ or __unicode__
__unicode__ is the method to use for python2, __str__ for python3: See this answer for more details

So it actually depends on your needs in which direction I would change your message Model.
General Changes
Based on the guess that you don't ever need an index on the content field:
I would change content to a TextField (also because a length of 1024 is already too large for a proper index on MySQL, for example). See https://docs.djangoproject.com/en/1.11/ref/databases/#textfield-limitations for more information on this topic.
I would probably increase the size of the title field, just because it seems convenient to me.
1. Simple -> One User to One User
The single read field indicates a one to one message:
I would change the receiver to also be a ForeignKey and adapt the related names of the sender and receiver fields to represent these connections, with something like sent_messages and received_messages (related names must be valid Python identifiers, so underscores rather than hyphens).
Like @sebastian-fleck already suggested, I'd also change the read field to a datetime field. It only changes your querysets from filter(read=True) to filter(read_datetime__isnull=False) to get the same results, and you could create a property representing read as a boolean for convenience, e.g.
@property
def read(self):
    return bool(self.read_datetime)  # assumes the read datetime field is named read_datetime
2. More Complex: One User to Multiple Users
This can get a lot more complex; here is the least complex solution I could think of.
Conditions:
- there are only messages and no conversation-like structure
- a message should have a read status for every receiver
(I removed descriptions for an easier overview and changed the models according to my opinions from before, this is based on my experience and the business needs I assumed from your example and answers)
from django.utils.encoding import python_2_unicode_compatible

@python_2_unicode_compatible
class Message(models.Model):
    title = models.CharField(max_length=160)
    content = models.TextField()
    create_user = models.ForeignKey(User, related_name="sent_messages")
    # The through model is defined below, so it is referenced by name.
    receive_users = models.ManyToManyField(User, through="MessageReceiver")

    def __str__(self):
        return 'Message: %s' % self.title

# No __str__ is defined here, so the decorator is not needed.
class MessageReceiver(models.Model):
    is_read = models.DateTimeField(null=True, blank=True)
    receiver = models.ForeignKey(User)
    message = models.ForeignKey(Message)
This structure uses the power of ManyToMany with a custom through model. Check it out, it is very powerful: https://docs.djangoproject.com/en/1.11/ref/models/fields/#django.db.models.ManyToManyField.through.
tldr: we want every receiver to have a read status, so we modeled this in a separate object
Longer version: we utilize the power of a custom ManyToMany through model to have a separate read status for every receiver. This means we need to change some parts of our code to work for the many to many structure, e.g. if we want to know if a message was read by all receivers:
def did_all_receiver_read_the_message(message):
    # The read status lives on the through model, so query its rows directly.
    unread_count = message.messagereceiver_set.filter(is_read__isnull=True).count()
    # Everyone has read the message only if no unread rows remain.
    return unread_count == 0
if we want to know if a specific user read a specific message:
def did_user_read_this_message(user, message):
    receiver = message.messagereceiver_set.get(receiver=user)
    return bool(receiver.is_read)
3. Conversations + Messages + Participants
This is something that would exceed my time limit but some short hints:
Conversation holds everything together
Message is written by a Participant and holds a created timestamp
Participant allows access to a conversation and links a User to the Conversation object
the Participant holds a last_read timestamp, which can be used to calculate whether a message was read or not by comparing it with the messages' created timestamps (-> annoyingly complex part & milliseconds are important)
Everything else would probably need to be adapted to your specific business needs. This scenario is probably the most flexible, but it's a lot of work (based on personal experience) and adds quite a bit of complexity to your architecture. I only recommend this if it's really, really needed ^^.
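The last_read comparison can be sketched in plain Python. This is only an illustration under assumptions: the `Message` and `Participant` dataclasses below are hypothetical stand-ins for the models hinted at above, and a message is treated as read when it was created at or before the participant's last_read timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    text: str
    created: datetime


@dataclass
class Participant:
    name: str
    last_read: datetime


def is_read(message, participant):
    # A message counts as read if it already existed when the participant last looked.
    return message.created <= participant.last_read


def unread_count(messages, participant):
    return sum(1 for m in messages if not is_read(m, participant))


start = datetime(2020, 1, 1, 12, 0, 0)
msgs = [
    Message("a", start),
    Message("b", start + timedelta(milliseconds=500)),  # sub-second gaps matter
    Message("c", start + timedelta(seconds=2)),
]
alice = Participant("alice", last_read=start + timedelta(seconds=1))
unread = unread_count(msgs, alice)
```

Keeping one timestamp per participant avoids storing a read flag per message and receiver, at the cost of not being able to mark individual messages as unread.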
Disclaimer:
This could be an overall structure. Most design decisions I made for the examples are based on assumptions; I could only mention some of them or the text would get too long, but feel free to ask.
Please excuse any typos and errors, I didn't have the chance to run the code.

Filter queryset to return only the best result for each user

I have a couple of django models, one of which holds a number of user results for different events. I'm looking for a way to generate a queryset consisting of only the best (highest) result for each user that also has the other attributes attached (like the date of the result).
My models are shown below; I also use the built-in User model:
class CombineEvents(models.Model):
    team = models.ForeignKey(Team)
    event = models.CharField(max_length=100)
    metric = models.CharField(max_length=100)
    lead_order = models.IntegerField()

    def __unicode__(self):
        return self.event

class CombineResults(models.Model):
    user = models.ForeignKey(User)
    date = models.DateField()
    event = models.ForeignKey(CombineEvents)
    result = models.FloatField()

    def __unicode__(self):
        return str(self.date) + " " + str(self.event)
I am iterating through each event and attaching a queryset of that event's results, which is working fine, but I want that sub-queryset to only include one object for each user, and that object should be that user's best result. My queryset code is below:
combine_events = CombineEvents.objects.filter(team__id=team_id)
for event in combine_events:
    event.results = CombineResults.objects.filter(event=event)
I'm not sure how to filter down to just those best results for each user. I want to use these querysets to create leaderboards, so I'd still like to have the date of that best result and the user name, but I don't want the leaderboard to allow more than one spot per user. Any ideas?

Since your CombineResults model has a FK relation to CombineEvents, you can do something like this:
combine_events = CombineEvents.objects.filter(team__id=team_id)
for event in combine_events:
    result = event.combineresults_set.order_by('-result')[0]
The combineresults_set attribute is auto-generated by the FK field, though you can set it to something more helpful by specifying the related_name keyword argument:
class CombineResults(models.Model):
    event = models.ForeignKey(CombineEvents, related_name='results')
would enable you to call event.results.order_by(...). There is more in the documentation here:
https://docs.djangoproject.com/en/1.9/topics/db/queries/#following-relationships-backward
Note that this isn't the most DB-friendly approach, as you will effectively hit the database once to get combine_events (as soon as you start iterating), and then again for each event in that list. It will probably be better to use prefetch_related(), which lets you make only two DB queries. Documentation can be found here.
prefetch_related() will, however, default to a queryset.all() for the related objects, which you can further control by using Prefetch objects as documented here.
Edit:
Apologies for getting the question wrong. Getting every user's best result per event (which is what I think you want) is not quite as simple. I'd probably do something like this:
from django.db.models import Q, Max

combine_events = CombineEvents.objects \
    .filter(team_id=team_id) \
    .prefetch_related('combineresults_set')
for event in combine_events:
    # Get the value of the best result per user
    result = event.combineresults_set.values('user').annotate(best=Max('result'))
    # Now construct a Q() object, note this will evaluate the result query
    base_q = Q()
    for res in result:
        # this is the equivalent to base_q = base_q | ....
        base_q |= (Q(user_id=res['user']) & Q(result=res['best']))
    # Now you're ready to filter results
    result = event.combineresults_set.filter(base_q)
You can read more about Q objects here, or alternatively write your own SQL using RawSQL and the likes. Or wait for someone with a better idea..
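If the results for an event are already in memory, the best row per user can also be picked in plain Python. The sketch below is an alternative to the Q-object approach, not a replacement the answer proposes; the tuples are made-up stand-ins for CombineResults rows, and it assumes a higher result is better.

```python
from datetime import date

# (user, date, result) rows standing in for the CombineResults of one event.
rows = [
    ("ann", date(2016, 1, 5), 10.5),
    ("bob", date(2016, 1, 6), 12.0),
    ("ann", date(2016, 2, 1), 11.2),
    ("bob", date(2016, 2, 3), 9.8),
]

best = {}
for user, day, result in rows:
    # Keep the row only if it beats the user's best so far.
    if user not in best or result > best[user][2]:
        best[user] = (user, day, result)

# One entry per user, highest result first, with the date still attached.
leaderboard = sorted(best.values(), key=lambda row: row[2], reverse=True)
```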

Django 'likes' - ManyToManyField vs new model

I'm implementing likes on profiles for my website and I'm not sure which would be the best practice, a ManyToManyField like so:
class MyUser(AbstractBaseUser):
    ...
    likes = models.ManyToManyField('self', symmetrical=False, null=True)
    ...
or just creating a class Like, like so:
class Like(models.Model):
    liker = models.ForeignKey(MyUser, related_name='liker')
    liked = models.ForeignKey(MyUser, related_name='liked')
Is one of them a better choice than the other? If so, why?
thanks

The first option should be preferred. If you need some additional fields to describe the likes, you can still use through="Likes" in your ManyToManyField and define the model Likes.
Manipulating the data entries would also be somewhat more pythonic:
# returns an object collection
likes_for_me = MyUser.objects.get(pk=1).likes.all()
instead of:
me = MyUser.objects.get(pk=1)
likes_for_me = Like.objects.filter(liked=me)

The second option is basically what is done internally: a new table is created, which is used to create the links between the entities.
For the first option, you let django do the job for you.
The choice is really about how you want to write your queries. With the second option, you have to query the Like objects that match your model, while with the first one you only have to fetch the MyUser, from which you can access the connections.
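To make the "new table" point concrete, here is a small sqlite3 sketch of a link table between users. The table and column names (`myuser`, `user_likes`, `liker_id`, `liked_id`) are assumptions for illustration and are not the names Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myuser (id INTEGER PRIMARY KEY, name TEXT);
    -- Link table: one row per (liker, liked) pair.
    CREATE TABLE user_likes (
        liker_id INTEGER REFERENCES myuser(id),
        liked_id INTEGER REFERENCES myuser(id),
        UNIQUE (liker_id, liked_id)
    );
    INSERT INTO myuser VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO user_likes VALUES (2, 1), (3, 1), (1, 2);
""")

# Who likes user 1? Join the link table back to the user table.
likers = [
    row[0]
    for row in conn.execute(
        "SELECT u.name FROM user_likes l JOIN myuser u ON u.id = l.liker_id "
        "WHERE l.liked_id = ? ORDER BY u.name",
        (1,),
    )
]
```

Either option ends up with a table of this shape; the difference is whether you define the model for it yourself.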

The second option is more flexible and extensible. For example, you'll probably want to track when a like was created (just add a Like.date_created field). You'll also probably want to send a notification to the content author when the content is liked, but only on the first like (add a Like.cancelled boolean field and wrap it with some logic...).
So I'll go with separate model.

I think the one you choose totally depends on the one you find easier to implement or better. I tend to always use the first approach, as it is more straightforward and logical, at least to me. I also disagree with Igor on that it's not flexible and extensible, you can also initiate notifications when it happens. If you are going to use the Django rest framework, then I totally suggest using the first method, as the second could be a pain.
class Post(models.Model):
    like = models.ManyToManyField(settings.AUTH_USER_MODEL, blank=True, related_name='post_like')
Then in your view, you just do this.
from rest_framework import status
from rest_framework.decorators import api_view, permission_classes
from rest_framework.permissions import IsAuthenticated
from rest_framework.response import Response

@api_view(['GET'])
@permission_classes([IsAuthenticated])
def like(request, id):
    signed_in = request.user
    post = Post.objects.get(id=id)
    if signed_in and post:
        post.like.add(signed_in)
        # For unlike, remove instead of add
        return Response("Successful")
    else:
        return Response("Unsuccessful", status.HTTP_404_NOT_FOUND)
Then you can use the response however you like on the front end.

Autoincrement with MongoEngine

I'm developing a blog engine with Flask and MongoEngine, and I need sequential IDs for my posts.
I need MongoEngine to create a new ID for each new post, so I was thinking of doing something like this:
class Post(Document):
    title = StringField(required=True)
    content = StringField(required=True)
    published_at = datetime.utcnow()
    id = Post.objects.count() + 1
Will this work? is there a better way to do this?

First, you need to understand why you need incremental IDs. What do they solve?
There's no native solution in MongoDB. Please read: http://www.mongodb.org/display/DOCS/How+to+Make+an+Auto+Incrementing+Field
As you already have a unique identifier with the pk of the Post, why not use that?
Finally, if I haven't dissuaded you from folly, there is a SequenceField in mongoengine that handles incrementing for you.

Edit: This is an incorrect solution, as others pointed out that this approach causes a race condition. I have only left it here so others would know why this is bad. (multiple clients can access this same object and increment it, resulting in inconsistent results).
Old answer:
I figured it out.
The Post class looks like this:
class Post(Document):
    title = StringField(required=True)
    content = StringField(required=True)
    published_at = datetime.utcnow()
    ID = IntField(min_value=1)
And in the function that inserts the post, I count the available records and then increment them by 1, like so:
def create_post(title, content):
    Post(title=title, content=content, ID=Post.objects.count() + 1).save()
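The race described in the edit comes from reading the count and writing the new ID as two separate steps. A minimal sketch of the alternative is to make read-and-increment a single atomic step. Here a `threading.Lock` stands in for the atomic update a database would provide, and `SequenceCounter` is a hypothetical name, not a MongoEngine class.

```python
import threading


class SequenceCounter:
    """Hands out consecutive integers; the lock makes read-and-increment one step."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            self._value += 1
            return self._value


counter = SequenceCounter()
ids = []
ids_lock = threading.Lock()


def create_post():
    # Each caller gets its own ID, even when many run concurrently.
    new_id = counter.next_id()
    with ids_lock:
        ids.append(new_id)


threads = [threading.Thread(target=create_post) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With count() + 1, two concurrent writers can both read the same count and end up with the same ID; with an atomic increment, each writer receives a distinct value.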

You can use mongoengine.signals and post_init to auto-increment a field. I haven't tested it, though.

In my case I needed to create a sequential number for each invoice generated at the POS.
I used the mongoengine.fields.SequenceField class:
class Model(Document):
    .....
    sequenceSale = db.SequenceField(required=False)  # (default int)
NOTE: If the counter is defined in an abstract document, it will be common to all inherited documents, and the default sequence name will be the class name of the abstract document.
More about mongoengine.fields.SequenceField

Perform a SQL JOIN on Django models that are not related?

I have 2 Models, User (django.contrib.auth.models.User) and a model named Log. Both contain an "email" field. Log does not have a ForeignKey pointing to the User model. I'm trying to figure out how I can perform a JOIN on these two tables using the email field as the commonality.
There are basically 2 queries I want to be able to perform. A basic join for filtering:
# Get all the User objects that have related Log objects with the level parameter set to 3.
User.objects.filter(log__level=3)
I'd also like to do some aggregates.
User.objects.all().annotate(Count('log'))
Of course, it would be nice to be able to do the reverse as well.
log = Log.objects.get(pk=3)
log.user...
Is there a way to do this with the ORM? Maybe something I can add to the model's Meta class to "activate" the relation?
Thanks!

You can add an extra method onto the User class, using MonkeyPatching/DuckPunching:
def logs(user):
    return Log.objects.filter(email=user.email)

from django.contrib.auth.models import User
User.logs = property(logs)
Now, you can query a User, and ask for the logs attached (for instance, in a view):
user = request.user
logs = user.logs
This type of process is common in the Ruby world, but seems to be frowned upon in Python.
(I came across the DuckPunching term the other day. It is based on Duck Typing, where we don't care what class something is: if it quacks like a duck, it is a duck as far as we are concerned. If it doesn't quack when you punch it, keep punching until it quacks).
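The same patching technique can be tried without Django. In this sketch the `User` and `Log` classes and the `ALL_LOGS` list are plain-Python stand-ins invented for illustration; only the `property(logs)` assignment mirrors the answer.

```python
class User:
    def __init__(self, email):
        self.email = email


class Log:
    def __init__(self, email, level):
        self.email = email
        self.level = level


# Stand-in for the Log table.
ALL_LOGS = [Log("a@example.com", 3), Log("b@example.com", 1), Log("a@example.com", 2)]


def logs(user):
    # Match on the shared email value, since there is no foreign key.
    return [log for log in ALL_LOGS if log.email == user.email]


# Attach the function to the existing class as a read-only property.
User.logs = property(logs)

user = User("a@example.com")
levels = [log.level for log in user.logs]
```

Because the property is set on the class, every existing and future instance gains the attribute, which is part of why this kind of patching can surprise other code.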

Why not use extra()?
Example (untested):
User.objects.extra(
    select={
        'log_count': 'SELECT COUNT(*) FROM myapp_log WHERE myapp_log.email = auth_user.email'
    },
)
For the User.objects.filter(log__level=3) portion, here is the equivalent with extra (untested):
User.objects.extra(
    select={
        'log_level_3_count': 'SELECT COUNT(*) FROM myapp_log WHERE (myapp_log.email = auth_user.email) AND (myapp_log.level=3)'
    },
).filter(log_level_3_count__gt=0)
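The SQL behind these two snippets can be checked directly with sqlite3. This is a sketch under assumptions: the table names `auth_user` and `myapp_log` follow the answer, while the columns and sample rows are invented and the real tables have more fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE myapp_log (id INTEGER PRIMARY KEY, email TEXT, level INTEGER);
    INSERT INTO auth_user VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO myapp_log VALUES
        (1, 'a@example.com', 3), (2, 'a@example.com', 1), (3, 'b@example.com', 2);
""")

# Per-user log count via a correlated subquery on the shared email column.
counts = conn.execute(
    "SELECT u.email, "
    "(SELECT COUNT(*) FROM myapp_log l WHERE l.email = u.email) AS log_count "
    "FROM auth_user u ORDER BY u.id"
).fetchall()

# Users with at least one level-3 log.
level3 = conn.execute(
    "SELECT u.email FROM auth_user u WHERE EXISTS "
    "(SELECT 1 FROM myapp_log l WHERE l.email = u.email AND l.level = 3)"
).fetchall()
```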

Do the Log.email values always correspond to a User? If so, how about just adding a ForeignKey(User) to the Log object?
class Log(models.Model):
    # ...
    user = models.ForeignKey(User)
With the FK to User, it becomes fairly straightforward to find what you want:
User.objects.filter(log__level=3)
User.objects.all().annotate(Count('log'))
user.log_set.all()
user.log_set.count()
log.user
If the Log.email value does not have to belong to a user, you can try adding a method to a model manager.
class LogManager(models.Manager):
    def for_user(self, user):
        # get_queryset() was named get_query_set() before Django 1.6
        return super(LogManager, self).get_queryset().filter(email=user.email)

class Log(models.Model):
    # ...
    objects = LogManager()
And then use it like this:
user = User.objects.get(pk=1)
logs_for_user = Log.objects.for_user(user)
