Persisted data with Django and Algolia search model indexing - python

This is a curious one for Django+Algolia. I'm using the Algolia specific Django package:
$ pip install algoliasearch-django
I have the following model schema:
import os
import datetime
from channels import Group
from django.db import models
from django.conf import settings
from django.utils.six import python_2_unicode_compatible
from django.utils.translation import ugettext_lazy as _
from django.core.files.storage import FileSystemStorage
from django.contrib.humanize.templatetags.humanize import naturaltime
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
SITE_UPLOAD_LOC = FileSystemStorage(location=os.path.join(BASE_DIR, 'uploads/site'))
USER_UPLOAD_LOC = FileSystemStorage(location=os.path.join(BASE_DIR, 'uploads/user'))
@python_2_unicode_compatible
class Room(models.Model):
    """
    This model class sets up the room that people can chat within - much like a forum topic.
    """
    title = models.CharField(max_length=255)
    staff = models.BooleanField(default=False)
    slug = models.SlugField(max_length=250, default='')
    banner = models.ImageField(storage=USER_UPLOAD_LOC, null=True, blank=True)
    def last_activity(self):
        """
        Return a humanized string (seconds, minutes, or hours ago) describing when the last
        message was sent (i.e., persisted to the database) relative to the current timestamp.
        """
        last_persisted_message = Messages.objects.filter(where=self.slug).order_by('-sent_at').first()
        if last_persisted_message is not None:
            # First we can store the "last persisted message" time in ISO format (could be useful for sitemap.xml generation, SEO tasks, etc.)
            last_persisted_message_iso = last_persisted_message.sent_at.isoformat()
            # Use naturaltime from contrib.humanize to convert our datetime to a string.
            last_persisted_message = naturaltime(last_persisted_message.sent_at)
            return last_persisted_message
        else:
            return "No activity to report"
Which is indexed as:
from algoliasearch_django import AlgoliaIndex
class RoomIndex(AlgoliaIndex):
    fields = ('title', 'last_activity')
    settings = {
        'searchableAttributes': ['title'],
        'attributesForFaceting': ['title', 'last_activity'],
        'hitsPerPage': 15,
    }
    index_name = 'Room Index'
Essentially, to bring the 'last_activity' value to the front end, it needs to pass through the index, which as far as I can tell is only updated by running:
$ python manage.py algolia_reindex
However, the last activity comes from the last time a Message was sent over a websocket connection and persisted to the database (converted to humanized Django naturaltime, e.g. '3 days ago'). All of this functionality works, except that to update the index I need to run the algolia_reindex command.
I'm rather unsure how the index could be updated at the same time as a message is persisted...?

Ok, so this one was slightly more complex as I was using websockets. When a message is sent and persisted to the database we can also do the following within the relevant "consumer" method (really, consumers.py is the websocket equivalent of the views.py file so I should have known this!)
The following lines of code worked:
client = algoliasearch.Client(settings.ALGOLIA['APPLICATION_ID'], settings.ALGOLIA['API_KEY'])
index = client.init_index('Room Index')
res = index.partial_update_objects([{"last_activity": naturaltime(datetime.datetime.now()), "objectID": your_object_id}])
The trick, for anyone following along, is to derive your_object_id from the message data that the client side passes in to the consumer.
Don't forget to add:
import datetime
from algoliasearch import algoliasearch
from django.conf import settings
from django.contrib.humanize.templatetags.humanize import naturaltime
At the top of the consumers.py file!
I also found the Python-specific incremental updates documentation from Algolia extremely useful:
https://www.algolia.com/doc/tutorials/indexing/synchronization/incremental-updates/
To render the updated time in "real time", use whichever front-end tool floats your boat. I used jQuery, but Vue.js or React.js would work equally well.
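One caveat with the snippet above: naturaltime(datetime.datetime.now()) is evaluated when the message is written, so the stored string reads as "now" and goes stale in the index. Here is a minimal sketch of an alternative, assuming you store an ISO timestamp and let the front end humanize it. The helper name build_last_activity_update and the attribute last_activity_iso are made up for illustration and are not part of the Algolia client; the resulting list is what you would hand to index.partial_update_objects(...).

```python
import datetime


def build_last_activity_update(object_id, sent_at):
    """Build one partial-update record: only the changed attribute plus objectID."""
    return {
        "objectID": str(object_id),
        # Hypothetical attribute name; an ISO string does not go stale like "3 days ago".
        "last_activity_iso": sent_at.isoformat(),
    }


sent_at = datetime.datetime(2018, 1, 15, 12, 30, 0)
payload = [build_last_activity_update(42, sent_at)]
print(payload)
```

The front end can then turn the timestamp into "3 minutes ago" each time it renders, without another index update.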

Related

Python: Calculate time between current time and last login. (Automated Communication)

I'm trying to make a celery task that would send a basic reminder to our users. So in our automated communication project, we have these tasks:
As you can see, a few of the actions differ. For now I have written logic that fetches all the users from the DB and then checks the time difference, but I have only set it up for the "2 hours or more" case. How should I structure this correctly? I do not want to write a separate if statement for each case because that's bad practice. How can I make the code clear and reduce the system load?
@app.task
def check_registered_users():
    from apps.users.models import User
    from apps.notifications.models import AutomatedCommunicationNotifications
    day_start = datetime.utcnow().date()
    day_end = day_start + timedelta(days=1)
    users = User.objects.filter(is_active=True, date_joined__range=(day_start, day_end))
    users_that_received_notification = AutomatedCommunicationNotifications.objects.all().values('user__id')
    excluded_users = users.exclude(id__in=users_that_received_notification)
    for user in excluded_users:
        if user.last_login < user.last_login + timedelta(hours=2):
            # Sign-up uncompleted Push notification 2 hours after last login
            template = SiteConfiguration.get_solo().automated_comms_signup_uncompleted
            send_plain_email_task(
                email=user.email,
                subject=template.subject,
                html_message=template.content,
                from_email=f'{settings.EMAIL_FROM_PREFIX} <{settings.DEFAULT_FROM_EMAIL}>',
            )
P.S. The AutomatedCommunicationNotifications table lets us track which users have already received a notification.
class AutomatedCommunicationNotifications(BaseModel):
    """ Model for automated comms notifications """
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    type = models.CharField(
        max_length=255,
        choices=NotificationTypes.get_choices(),
        default=NotificationTypes.EMAIL_NOTIFICATION
    )
    def __str__(self):
        return str(self.user.phone)
You'll have to iterate over your queried users at least once, but here are some tips that may help:
models.py
class User(...):
    # add a field to determine if the user has registered or not
    # set this to `True` when a User successfully registers:
    is_registered = models.BooleanField(default=False)
class AutomatedCommunicationNotifications(BaseModel):
    # add a related name for easier coding:
    user = models.ForeignKey(..., related_name='notifications')
tasks.py
# import packages outside of your function so this only runs once on startup:
from datetime import datetime, timedelta
from django.db.models import F
from apps.users.models import User
from apps.notifications.models import AutomatedCommunicationNotifications
@app.task
def check_registered_users():
    # timestamps:
    two_hours_ago = datetime.now() - timedelta(hours=2)
    # query for unregistered users who have not received a notification:
    users = User.objects.filter(
        is_registered=False,
        last_login__lt=two_hours_ago  # last logged in 2 or more hours ago
    ).exclude(
        notifications__type="the type"
    ).prefetch_related(
        'notifications'  # fetches the related notifications in one extra query
    )
    for user in users:
        # send email
        ...
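To avoid one if statement per reminder, the thresholds could live in a small table. Note that the check in the question compares last_login with itself plus two hours, which is always true; the comparison needs to be against the current time. Below is a minimal sketch of that idea in plain Python. The rule names and thresholds are hypothetical (the question's task list did not survive), and in the real task each rule would map to a template and a queryset filter.

```python
from datetime import datetime, timedelta

# Hypothetical rules: (notification type, time elapsed since last login).
REMINDER_RULES = [
    ("signup_uncompleted", timedelta(hours=2)),
    ("inactive_3_days", timedelta(days=3)),
    ("inactive_7_days", timedelta(days=7)),
]


def due_reminders(last_login, already_sent, now):
    """Return the rule names whose threshold has passed and that were not sent yet."""
    idle = now - last_login
    return [
        name
        for name, threshold in REMINDER_RULES
        if idle >= threshold and name not in already_sent
    ]


now = datetime(2021, 6, 10, 12, 0)
print(due_reminders(datetime(2021, 6, 7, 11, 0), {"signup_uncompleted"}, now))
```

Adding a new reminder then means adding one row to REMINDER_RULES rather than another branch.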
I would do this with a cron job. You can let it run whenever you want; how often depends on how soon after your given time frame you want to send the reminder.
You start with making a folder in your app:
/django/yourapp/management/commands
There you make a Python file which contains your logic (named reminder.py here, to match the command in the crontab entry further down). Make sure to import the modules you need from your app.
from django.core.management.base import BaseCommand, CommandError
from yourapp.models import every, module, you, need
from django.utils import timezone
from datetime import datetime, date, timedelta
from django.core.mail import send_mail, EmailMessage
class Command(BaseCommand):
    help = 'YOUR HELP TEXT FOR INTERNAL USE'
    def handle(self, *args, **options):
        # Your logic
        pass
I added the crontab to the www-data users crontab like this:
# m h dom mon dow command
45 3 * * * /websites/vaccinatieplanner/venv/bin/python /websites/vaccinatieplanner/manage.py reminder
You can edit that crontab entry to tune how often the check runs. As written, it runs once a day at 03:45. If you replace the 3 with a *, it will run once an hour, at 45 minutes past the hour.

Using Python 3.7 contextvars to pass state between Django views

I'm building a single database/shared schema multi-tenant application using Django 2.2 and Python 3.7.
I'm attempting to use the new contextvars api to share the tenant state (an Organization) between views.
I'm setting the state in a custom middleware like this:
# tenant_middleware.py
from organization.models import Organization
import contextvars
import tenant.models as tenant_model
tenant = contextvars.ContextVar('tenant', default=None)
class TenantMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
    def __call__(self, request):
        response = self.get_response(request)
        user = request.user
        if user.is_authenticated:
            organization = Organization.objects.get(organizationuser__is_current_organization=True, organizationuser__user=user)
            tenant_object = tenant_model.Tenant.objects.get(organization=organization)
            tenant.set(tenant_object)
        return response
I'm using this state by having my app's models inherit from a TenantAwareModel like this:
# tenant_models.py
from django.contrib.auth import get_user_model
from django.db import models
from django.db.models.signals import pre_save
from django.dispatch import receiver
from organization.models import Organization
from tenant_middleware import tenant
User = get_user_model()
class TenantManager(models.Manager):
    def get_queryset(self, *args, **kwargs):
        tenant_object = tenant.get()
        if tenant_object:
            return super(TenantManager, self).get_queryset(*args, **kwargs).filter(tenant=tenant_object)
        else:
            return None
@receiver(pre_save)
def pre_save_callback(sender, instance, **kwargs):
    tenant_object = tenant.get()
    instance.tenant = tenant_object
class Tenant(models.Model):
    organization = models.ForeignKey(Organization, null=False, on_delete=models.CASCADE)
    def __str__(self):
        return self.organization.name
class TenantAwareModel(models.Model):
    tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE, related_name='%(app_label)s_%(class)s_related', related_query_name='%(app_label)s_%(class)ss')
    objects = models.Manager()
    tenant_objects = TenantManager()
    class Meta:
        abstract = True
In my application the business logic can then retrieve querysets using .tenant_objects... on a model class rather than .objects...
The problem I'm having is that it doesn't always work - specifically in these cases:
In my login view after login() is called, the middleware runs and I can see the tenant is set correctly. When I redirect from my login view to my home view, however, the state is (initially) empty again and seems to get set properly after the home view executes. If I reload the home view, everything works fine.
If I log out and then log in again as a different user, the state from the previous user is retained, again until I reload the page. This seems related to the previous issue, as it almost seems like the state is lagging (for lack of a better word).
I use Celery to spin off shared_tasks for processing. I have to manually pass the tenant to these, as they don't pick up the context.
Questions:
Am I doing this correctly?
Do I need to manually reload the state somehow in each module?
Frustrated, as I can find almost no examples of doing this and very little discussion of contextvars. I'm trying to avoid passing the tenant around manually everywhere or using threading.local.
Thanks.
You're only setting the context after the response has been generated, so the view has already run by the time tenant.set() is called. That means it will always lag. You probably want to set it before calling get_response, then check afterwards whether the user has changed.
Note though that I'm not really sure this will ever work exactly how you want. Context vars are by definition local; in an environment like Django you can never guarantee that consecutive requests from the same user will be served by the same server process, and, similarly, one process can serve requests from multiple users. Plus, as you've noted, Celery is yet another separate process, which won't share the context.
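To illustrate the ordering point, here is a minimal, framework-free sketch that sets the variable before the view runs and restores it afterwards. The lookup_tenant callable and the dict-shaped requests are stand-ins invented for this example; in the real middleware the lookup would be the Organization/Tenant queries from the question.

```python
import contextvars

tenant = contextvars.ContextVar('tenant', default=None)


class TenantMiddleware:
    def __init__(self, get_response, lookup_tenant):
        self.get_response = get_response
        # Hypothetical callable: request -> tenant object or None.
        self.lookup_tenant = lookup_tenant

    def __call__(self, request):
        # Set the tenant BEFORE the view runs so the view can see it.
        token = tenant.set(self.lookup_tenant(request))
        try:
            return self.get_response(request)
        finally:
            # Restore the previous value so nothing leaks into the next request.
            tenant.reset(token)


def view(request):
    return f"tenant seen by view: {tenant.get()}"


middleware = TenantMiddleware(view, lookup_tenant=lambda request: request.get("org"))
print(middleware({"org": "acme"}))
print(middleware({"org": "globex"}))
print(tenant.get())
```

In a Django project this would need to sit after AuthenticationMiddleware so that request.user is available. It still would not cover the login view itself (the user only becomes authenticated inside that view) or Celery tasks, which need the tenant passed explicitly.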

Extend django-import-export's import form to specify fixed value for each imported row

I am using django-import-export 1.0.1 with admin integration in Django 2.1.1. I have two models
from django.db import models
class Sector(models.Model):
    code = models.CharField(max_length=30, primary_key=True)
class Location(models.Model):
    code = models.CharField(max_length=30, primary_key=True)
    sector = models.ForeignKey(Sector, on_delete=models.CASCADE, related_name='locations')
and they can be imported/exported just fine using model resources
from import_export import resources
from import_export.fields import Field
from import_export.widgets import ForeignKeyWidget
class SectorResource(resources.ModelResource):
    code = Field(attribute='code', column_name='Sector')
    class Meta:
        model = Sector
        import_id_fields = ('code',)
class LocationResource(resources.ModelResource):
    code = Field(attribute='code', column_name='Location')
    sector = Field(attribute='sector', column_name='Sector',
                   widget=ForeignKeyWidget(Sector, 'code'))
    class Meta:
        model = Location
        import_id_fields = ('code',)
and import/export actions can be integrated into the admin by
from django.contrib import admin
from import_export.admin import ImportExportModelAdmin
class SectorAdmin(ImportExportModelAdmin):
    resource_class = SectorResource
class LocationAdmin(ImportExportModelAdmin):
    resource_class = LocationResource
admin.site.register(Sector, SectorAdmin)
admin.site.register(Location, LocationAdmin)
For Reasons™, I would like to change this set-up so that a spreadsheet of Locations which does not contain a Sector column can be imported; the value of sector (for each imported row) should be taken from an extra field on the ImportForm in the admin.
Such a field can indeed be added by overriding import_action on the ModelAdmin as described in Extending the admin import form for django import_export. The next step, to use this value for all imported rows, is missing there, and I have not been able to figure out how to do it.
EDIT(2): Solved through the use of sessions. Having a get_confirm_import_form hook would still really help here, but even better would be having the existing ConfirmImportForm carry across all the submitted fields & values from the initial import form.
EDIT: I'm sorry, I thought I had this nailed, but my own code wasn't working as well as I thought it was. This doesn't solve the problem of passing along the sector form field in the ConfirmImportForm, which is necessary for the import to complete. Currently looking for a solution which doesn't involve pasting the whole of import_action() into an ImportMixin subclass. Having a get_confirm_import_form() hook would help a lot here.
Still working on a solution for myself, and when I have one I'll update this too.
Don't override import_action. It's a big complicated method that you don't want to replicate. More importantly, as I discovered today: there are easier ways of doing this.
First (as you mentioned), make a custom import form for Location that allows the user to choose a Sector:
from django import forms
from import_export.forms import ImportForm
class LocationImportForm(ImportForm):
    sector = forms.ModelChoiceField(required=True, queryset=Sector.objects.all())
In the Resource API, there's a before_import_row() hook that is called once per row. So, implement that in your LocationResource class, and use it to add the Sector column:
def before_import_row(self, row, **kwargs):
    sector = self.request.POST.get('sector', None)
    if sector:
        self.request.session['import_context_sector'] = sector
    else:
        # if this raises a KeyError, we want to know about it.
        # It means that we got to a point of importing data without
        # sector context, and we don't want to continue.
        try:
            sector = self.request.session['import_context_sector']
        except KeyError as e:
            raise Exception("Sector context failure on row import, " +
                            f"check resources.py for more info: {e}")
    row['sector'] = sector
(Note: This code uses Django sessions to carry the sector value from the import form to the import confirmation screen. If you're not using sessions, you'll need to find another way to do it.)
This is all you need to get the extra data in, and it works for both the dry-run preview and the actual import.
Note that self.request doesn't exist in the default ModelResource - we have to install it by giving LocationResource a custom constructor:
def __init__(self, request=None):
    super().__init__()
    self.request = request
(Don't worry about self.request sticking around. Each LocationResource instance doesn't persist beyond a single request.)
The request isn't usually passed to the ModelResource constructor, so we need to add it to the kwargs dict for that call. Fortunately, Django Import/Export has a dedicated hook for that. Override ImportExportModelAdmin's get_resource_kwargs method in LocationAdmin:
def get_resource_kwargs(self, request, *args, **kwargs):
    rk = super().get_resource_kwargs(request, *args, **kwargs)
    rk['request'] = request
    return rk
And that's all you need.

How to generate HASH for Django model

I am trying to generate unique 10-character hash values for my Django models. I have tried the methods below, but I am getting this error:
return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: column hash_3 is not unique
Here is what I have tried:
import os
import time
import hashlib
from os import path
from binascii import hexlify
from django.db import models
from django.contrib import admin
from django.core.files.storage import FileSystemStorage
#------------------------------------------------------------------------------
def _createHash():
    """This function generate 10 character long hash"""
    hash = hashlib.sha1()
    hash.update(str(time.time()))
    return hash.hexdigest()[:-10]
class tags(models.Model):
    """ This is the tag model """
    seo_url1 = models.URLField()
    seo_url2 = models.URLField()
    seo_url3 = models.URLField()
    tagDescription = models.TextField() # Tag Description
    tag = models.CharField(max_length=200) # Tag name
    tagSlug = models.CharField(max_length=400) # Extra info can be added to the existing tag using this field
    updatedAt = models.DateTimeField(auto_now=True) # Time at which tag is updated
    createdAt = models.DateTimeField(auto_now_add=True) # Time at which tag is created
    hash_1 = models.CharField(max_length=10,default=_createHash(),unique=True)
    hash_2 = models.CharField(max_length=10,default=_createHash(),unique=True)
    hash_3 = models.CharField(max_length=10,default=_createHash(),unique=True)
I have also tried this method:
def _createHash():
    """This function generate 10 character long hash"""
    return hexlify(os.urandom(5))
I have a script that inserts data into this model, and every time I run it I get the error mentioned above. Is there any other way of doing this? I want to store unique hash values in the columns hash_1, hash_2 and hash_3.
Don't call the _createHash() function in your field, but just pass the reference to the callable in your model, e.g.
hash_1 = models.CharField(max_length=10,default=_createHash,unique=True)
As Lennart Regebro mentioned in his answer, in your attempt the value is computed only once, each time the server starts, so every new object gets the same default.
The Django docs say this about it:
Field.default
The default value for the field. This can be a value or
a callable object. If callable it will be called every time a new
object is created.
_createHash() is called when you define the model, so you have the same default every time you create a new object.
You can look at creating the hash in the save() method of the model, that's probably the easiest.
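The difference between calling the function and passing it can be shown without Django. Below is a minimal sketch in which FakeField is a made-up stand-in that only mimics how a default is resolved (it is not Django's Field class), and create_hash uses the standard library's secrets module to get 10 hex characters.

```python
import secrets


def create_hash():
    """Return a random 10-character hexadecimal string."""
    return secrets.token_hex(5)  # 5 random bytes -> 10 hex characters


class FakeField:
    """Tiny stand-in for a model field, only to show how a default is resolved."""

    def __init__(self, default):
        self.default = default

    def get_default(self):
        # A callable default is invoked for every new object; a plain value is reused.
        return self.default() if callable(self.default) else self.default


called_once = FakeField(default=create_hash())     # value frozen at definition time
callable_default = FakeField(default=create_hash)  # function called per new object

print(called_once.get_default() == called_once.get_default())
print(callable_default.get_default(), callable_default.get_default())
```

The first field hands out the same string for every object, which is what trips the unique constraint. Random values can still collide in principle, so with unique=True a retry on IntegrityError may be worth considering.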

Django, how to see session data in the admin interface

I'm using Django sessions and I would like a way of seeing the session data in the admin interface. Is this possible?
I.e. for each session I want to see the data stored in the session database (which is essentially a dictionary as far as I can gather).
Currently I can just see a hash in the Session data field, such as:
gAJ9cQEoVQ5zb3J0aW5nX2Nob2ljZXECVQJQT3EDVQxnYW1lc19wbGF5ZWRxBH1xBVgLAAAAcG9z
dG1hbi1wYXRxBksDc1UKaXBfYWRkcmVzc3EHVQkxMjcuMC4wLjFxCFUKdGVzdGNvb2tpZXEJVQZ3
b3JrZWRxClUKZ2FtZV92b3Rlc3ELfXEMdS4wOGJlMDY3YWI0ZmU0ODBmOGZlOTczZTUwYmYwYjE5
OA==
I have put the following into admin.py to achieve this:
from django.contrib.sessions.models import Session
...
admin.site.register(Session)
In particular I was hoping to be able to see at least an IP address for each session. (Would be nice too if I could count how many sessions per IP address and order the IPs based on number of sessions in total for each.)
Thank you for your help :-)
You can do something like this:
from django.contrib import admin
from django.contrib.sessions.models import Session
class SessionAdmin(admin.ModelAdmin):
    def _session_data(self, obj):
        return obj.get_decoded()
    list_display = ['session_key', '_session_data', 'expire_date']
admin.site.register(Session, SessionAdmin)
It might even be that get_decoded can be used directly in list_display. And in case there's some catch that prevents this from working, you can decode the session data yourself, based on the linked Django source.
Continuing from Tomasz's answer, I went with:
import pprint
from django.contrib import admin
from django.contrib.sessions.models import Session
class SessionAdmin(admin.ModelAdmin):
    def _session_data(self, obj):
        return pprint.pformat(obj.get_decoded()).replace('\n', '<br>\n')
    _session_data.allow_tags=True
    list_display = ['session_key', '_session_data', 'expire_date']
    readonly_fields = ['_session_data']
    exclude = ['session_data']
    date_hierarchy='expire_date'
admin.site.register(Session, SessionAdmin)
Session data is contained in a base64 encoded pickled dictionary. That is what you're seeing in the admin because that data is stored in a TextField in the Session model.
I don't think any distributed Django code stores the IP address in the session, but you could do it yourself if you can access it.
In order to display the real session information, you may write your own form field that presents the decoded information. Keep in mind that you'll have to also overwrite the save method if you want to modify it. You can take a look at the encode and decode methods in django/contrib/sessions/models.py.
EB's otherwise great answer left me with the error "Database returned an invalid value in QuerySet.dates(). Are time zone definitions and pytz installed?". (I do have db tz info and pytz installed, and my app uses timezones extensively.) Removing the 'date_hierarchy' line resolved the issue for me. So:
import pprint
from django.contrib.sessions.models import Session
class SessionAdmin(admin.ModelAdmin):
    def _session_data(self, obj):
        return pprint.pformat(obj.get_decoded()).replace('\n', '<br>\n')
    _session_data.allow_tags=True
    list_display = ['session_key', '_session_data', 'expire_date']
    readonly_fields = ['_session_data']
    exclude = ['session_data']
admin.site.register(Session, SessionAdmin)
Adding to the previous answers, we can also show the user for each session, which helps identify whose session it is.
class SessionAdmin(admin.ModelAdmin):
    def user(self, obj):
        session_user = obj.get_decoded().get('_auth_user_id')
        if session_user is None:
            # anonymous session, so there is no user to look up
            return None
        user = User.objects.get(pk=session_user)
        return user.email
    def _session_data(self, obj):
        return pprint.pformat(obj.get_decoded()).replace('\n', '<br>\n')
    _session_data.allow_tags = True
    list_display = ['user', 'session_key', '_session_data', 'expire_date']
    readonly_fields = ['_session_data']
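For the "sessions per IP address" part of the question, the counting can be done on the decoded dictionaries once you have them. This is a minimal sketch, assuming your own code stores the address under an 'ip_address' key in the session (Django does not do this for you) and that the list of dicts comes from something like [s.get_decoded() for s in Session.objects.all()]. The sample data is invented.

```python
from collections import Counter


def sessions_per_ip(decoded_sessions):
    """Count sessions per IP address, most frequent first."""
    counts = Counter(
        data["ip_address"] for data in decoded_sessions if "ip_address" in data
    )
    return counts.most_common()


# Hypothetical decoded session dictionaries.
decoded = [
    {"ip_address": "127.0.0.1", "testcookie": "worked"},
    {"ip_address": "10.0.0.5"},
    {"ip_address": "127.0.0.1"},
    {"testcookie": "worked"},  # no IP stored in this session
]
print(sessions_per_ip(decoded))
```

Decoding every session in Python is fine for a small table; for a large one it may be better to record the IP in a model of your own.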
