How do I cascade delete in this scenario using MongoEngine? - python

I have this simple model:
from mongoengine import *
from datetime import datetime
class Person(Document):
    firstname = StringField(required=True)

    @property
    def comments(self):
        return Comment.objects(author=self).all()

class Comment(Document):
    text = StringField(required=True)
    # pass the callable, not datetime.now(), so the default is evaluated per document
    timestamp = DateTimeField(required=True, default=datetime.now)
    author = ReferenceField('Person', required=True, reverse_delete_rule=CASCADE)

class Program(Document):
    title = StringField(required=True)
    comments = ListField(ReferenceField('Comment'))

class Episode(Document):
    title = StringField(required=True)
    comments = ListField(ReferenceField('Comment'))
As you can see, both Programs and Episodes can have comments. Initially, I tried to embed the comments but I seemed to run into a brick wall. So I'm trying Comments as a Document class instead. My question is, how do I model it so that:
When a Person is deleted, so are all their comments
When a Comment is deleted (either directly or indirectly), it is removed from its parent
When a Program or Episode is deleted, so are the Comment objects
I'm used to doing all this manually in MongoDB (and SQLa, for that matter), but I'm new to MongoEngine and I'm struggling a bit. Any help would be awesome!

Not all of these are possible without writing application code to handle the logic. I would write signals to handle some of the edge cases.
The main issue you have is that global updates / removes aren't handled - so you'd have to ensure that the API you write is always used, to keep the database in a clean state.
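For the cases the delete rule doesn't cover, a minimal sketch using MongoEngine's pre_delete signal could look like this (the handler names are mine; the Person → Comment cascade is already handled by reverse_delete_rule=CASCADE on author):

from mongoengine import signals

def comment_pre_delete(sender, document, **kwargs):
    # pull the comment out of any Program or Episode that lists it
    Program.objects(comments=document).update(pull__comments=document)
    Episode.objects(comments=document).update(pull__comments=document)

def parent_pre_delete(sender, document, **kwargs):
    # delete every Comment referenced by the Program / Episode being removed
    for comment in document.comments:
        comment.delete()

signals.pre_delete.connect(comment_pre_delete, sender=Comment)
signals.pre_delete.connect(parent_pre_delete, sender=Program)
signals.pre_delete.connect(parent_pre_delete, sender=Episode)

Keep in mind that these signals only fire for deletes that go through the documents (e.g. document.delete()); bulk queryset removes bypass them, which is exactly the caveat about global updates / removes above.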

Related

Extend django-import-export's import form to specify fixed value for each imported row

I am using django-import-export 1.0.1 with admin integration in Django 2.1.1. I have two models
from django.db import models

class Sector(models.Model):
    code = models.CharField(max_length=30, primary_key=True)

class Location(models.Model):
    code = models.CharField(max_length=30, primary_key=True)
    sector = models.ForeignKey(Sector, on_delete=models.CASCADE, related_name='locations')
and they can be imported/exported just fine using model resources
from import_export import resources
from import_export.fields import Field
from import_export.widgets import ForeignKeyWidget

class SectorResource(resources.ModelResource):
    code = Field(attribute='code', column_name='Sector')

    class Meta:
        model = Sector
        import_id_fields = ('code',)

class LocationResource(resources.ModelResource):
    code = Field(attribute='code', column_name='Location')
    sector = Field(attribute='sector', column_name='Sector',
                   widget=ForeignKeyWidget(Sector, 'code'))

    class Meta:
        model = Location
        import_id_fields = ('code',)
and import/export actions can be integrated into the admin by
from django.contrib import admin
from import_export.admin import ImportExportModelAdmin

class SectorAdmin(ImportExportModelAdmin):
    resource_class = SectorResource

class LocationAdmin(ImportExportModelAdmin):
    resource_class = LocationResource

admin.site.register(Sector, SectorAdmin)
admin.site.register(Location, LocationAdmin)
For Reasons™, I would like to change this set-up so that a spreadsheet of Locations which does not contain a Sector column can be imported; the value of sector (for each imported row) should be taken from an extra field on the ImportForm in the admin.
Such a field can indeed be added by overriding import_action on the ModelAdmin as described in Extending the admin import form for django import_export. The next step, to use this value for all imported rows, is missing there, and I have not been able to figure out how to do it.
EDIT(2): Solved through the use of sessions. Having a get_confirm_import_form hook would still really help here, but even better would be having the existing ConfirmImportForm carry across all the submitted fields & values from the initial import form.
EDIT: I'm sorry, I thought I had this nailed, but my own code wasn't working as well as I thought it was. This doesn't solve the problem of passing along the sector form field in the ConfirmImportForm, which is necessary for the import to complete. Currently looking for a solution which doesn't involve pasting the whole of import_action() into an ImportMixin subclass. Having a get_confirm_import_form() hook would help a lot here.
Still working on a solution for myself, and when I have one I'll update this too.
Don't override import_action. It's a big complicated method that you don't want to replicate. More importantly, as I discovered today: there are easier ways of doing this.
First (as you mentioned), make a custom import form for Location that allows the user to choose a Sector:
from import_export.forms import ImportForm

class LocationImportForm(ImportForm):
    sector = forms.ModelChoiceField(required=True, queryset=Sector.objects.all())
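To have the admin actually use this form, you also need to point the ModelAdmin at it. A small sketch, assuming your django-import-export version still exposes the get_import_form() hook on ImportMixin (newer releases use an import_form_class attribute instead, so check the version you have installed):

class LocationAdmin(ImportExportModelAdmin):
    resource_class = LocationResource

    # hand the admin our custom form instead of the default ImportForm
    def get_import_form(self):
        return LocationImportForm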
In the Resource API, there's a before_import_row() hook that is called once per row. So, implement that in your LocationResource class, and use it to add the Sector column:
def before_import_row(self, row, **kwargs):
    sector = self.request.POST.get('sector', None)
    if sector:
        # first pass (the import form itself): stash the value in the session
        self.request.session['import_context_sector'] = sector
    else:
        # if this raises a KeyError, we want to know about it.
        # It means that we got to a point of importing data without
        # sector context, and we don't want to continue.
        try:
            sector = self.request.session['import_context_sector']
        except KeyError as e:
            raise Exception("Sector context failure on row import, " +
                            f"check resources.py for more info: {e}")
    row['sector'] = sector
(Note: This code uses Django sessions to carry the sector value from the import form to the import confirmation screen. If you're not using sessions, you'll need to find another way to do it.)
This is all you need to get the extra data in, and it works for both the dry-run preview and the actual import.
Note that self.request doesn't exist in the default ModelResource - we have to install it by giving LocationResource a custom constructor:
def __init__(self, request=None):
    super().__init__()
    self.request = request
(Don't worry about self.request sticking around. Each LocationResource instance doesn't persist beyond a single request.)
The request isn't usually passed to the ModelResource constructor, so we need to add it to the kwargs dict for that call. Fortunately, Django Import/Export has a dedicated hook for that. Override ImportExportModelAdmin's get_resource_kwargs method in LocationAdmin:
def get_resource_kwargs(self, request, *args, **kwargs):
    rk = super().get_resource_kwargs(request, *args, **kwargs)
    rk['request'] = request
    return rk
And that's all you need.

An efficient way to save parsed XML content to Django Model

This is my first question so I will do my best to conform to the question guidelines. I'm also learning how to code so please ELI5.
I'm working on a django project that parses XML to django models. Specifically Podcast XMLs.
I currently have this code in my model:
from django.db import models
import feedparser

class Channel(models.Model):
    channel_title = models.CharField(max_length=100)

    def __str__(self):
        return self.channel_title

class Item(models.Model):
    channel = models.ForeignKey(Channel, on_delete=models.CASCADE)
    item_title = models.CharField(max_length=100)

    def __str__(self):
        return self.item_title

radiolab = feedparser.parse('radiolab.xml')

if Channel.objects.filter(channel_title='Radiolab').exists():
    pass
else:
    channel_title = radiolab.feed.title
    a = Channel.objects.create(channel_title=channel_title)
    a.save()

for episode in radiolab.entries:
    item_title = episode.title
    channel_title = Channel.objects.get(channel_title="Radiolab")
    b = Item.objects.create(channel=channel_title, item_title=item_title)
    b.save()
radiolab.xml is a feed I've saved locally from Radiolab Podcast Feed.
Because this code runs whenever I python manage.py runserver, the parsed XML content is sent to my database just like I want, but it happens on every run, creating duplicate records.
I'd love some help finding a way to make this happen just once, and also a DRY mechanism for adding different feeds so they're parsed and saved to the database, preferably with the feed URL submitted via forms.
If you don't want it run every time, don't put it in models.py. The only thing that belongs there are the model definitions themselves.
Stuff that happens in response to a user action on the site goes in a view. Or, if you want this to be done from the admin site, it should go in the admin.py file.
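For example, the parsing could live in a plain function that a view calls when a feed URL is submitted. A rough sketch under those assumptions (the form field, template, and URL names here are hypothetical):

import feedparser
from django.shortcuts import redirect, render
from .models import Channel, Item

def import_feed(url):
    # get_or_create avoids the duplicate-channel problem
    parsed = feedparser.parse(url)
    channel, created = Channel.objects.get_or_create(channel_title=parsed.feed.title)
    if created:
        for entry in parsed.entries:
            Item.objects.create(channel=channel, item_title=entry.title)
    return channel

def add_feed(request):
    if request.method == 'POST':
        import_feed(request.POST['feed_url'])  # 'feed_url' is a hypothetical form field
        return redirect('feed_list')           # 'feed_list' is a hypothetical URL name
    return render(request, 'add_feed.html')    # hypothetical template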

How do I test a Django model with ImageSpecField?

I got a following model:
from django.db import models
from django.forms.models import model_to_dict
from imagekit.models import ImageSpecField
from imagekit.processors import ResizeToFit

class Room(models.Model):
    order = models.SmallIntegerField()
    name = models.CharField(max_length=20)
    background = models.ImageField(upload_to='room_background', blank=False, null=False)
    background_preview = ImageSpecField(source='background', processors=[ResizeToFit(300, 400)])

    def serialize(self):
        room_dict = model_to_dict(self)
        room_dict['background_preview_url'] = self.background_preview.url
        return room_dict
I'm not using room object directly on my views, instead I convert them to dict, extending with the 'background_preview_url' key.
Now I want to write some Django tests, using serialized room objects. The issue is that if I just do:
test_room = Room(order=1)
test_room.save()
test_room.serialize()
The ImageKit throws a MissingSource error, as there's no background image in my test room to generate preview from.
How do I best overcome that in my tests? Should I carry a fixture with the background images?
Or should I write a second serialize_for_test() method?
Or maybe I can instantiate the Room() with some test value for the background_preview field? - I tried this but the direct Room(background_preview='fake_url') didn't work.
Thanks.
The solution, which worked for me:
from django.core.files.uploadedfile import SimpleUploadedFile
test_room.background = SimpleUploadedFile(name='foo.gif', content=b'GIF87a\x01\x00\x01\x00\x80\x01\x00\x00\x00\x00ccc\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00')
test_room.save()
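For context, a minimal sketch of how that might sit inside a test case, assuming the Room model above (the test names are mine):

from django.core.files.uploadedfile import SimpleUploadedFile
from django.test import TestCase

# the same 1x1 GIF payload as above
TINY_GIF = b'GIF87a\x01\x00\x01\x00\x80\x01\x00\x00\x00\x00ccc\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00'

class RoomSerializeTest(TestCase):
    def test_serialize_includes_preview_url(self):
        room = Room(order=1, name='test',
                    background=SimpleUploadedFile(name='foo.gif', content=TINY_GIF))
        room.save()
        data = room.serialize()
        self.assertIn('background_preview_url', data)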

How to get a collection_name without having an instance of the referencing object?

I'm doing a simple program about customers, products and drafts.
Since they reference each other in some way, when I delete an entity of one kind, an entity of another kind might give an error.
Here's what I have:
-customer.py
class Customer(db.Model):
    """Defines the Customer entity or model."""
    c_name = db.StringProperty(required=True)
    c_address = db.StringProperty()
    c_email = db.StringProperty() ...
-draft.py
class Draft(db.Model):
    """Defines the draft entity or model."""
    d_customer = db.ReferenceProperty(customer.Customer,
                                      collection_name='draft_set')
    d_address = db.StringProperty()
    d_country = db.StringProperty() ...
Ok, now what I want to do is check if a customer has any Draft referencing to him, before deleting him.
This is the code I'm using:
def deleteCustomer(self, customer_key):
    '''Deletes an existing Customer'''
    # Get the customer by its key
    customer = Customer.get(customer_key)
    if customer.draft_set:  # (or customer.draft_set.count > 0...)
        customer.delete()
    else:
        do_something_else()
And now, it comes the problem.
If I have a draft previously created with the selected customer on it, there's no problem at all, and it does what it has to do. But if I haven't created any draft that references that customer, then when trying to delete him, it shows this error:
AttributeError: 'Customer' object has no attribute 'draft_set'
What am I doing wrong? Do I always need to create a Draft that includes a Customer for him to have the collection_name property "available"?
EDIT: I found out what the error was.
Since I have both classes in different .py files, it seems that GAE loads the entities into the datastore at the same moment as it "goes through" the file that contains that model.
Therefore, if I'm executing the program, and never use or import that file, the datastore is not updated until then.
Now what I'm doing is:
from draft import Draft
inside the deleteCustomer() function, and it's finally working fine, but I get an ugly "unused import" warning because of it.
Is there any other way I can fix this?
The collection_name property is a query, so it should always be available.
What you may be missing is the reference_class parameter (check the ReferenceProperty docs)
class Draft(db.Model):
    """Defines the draft entity or model."""
    d_customer = db.ReferenceProperty(reference_class=customer.Customer,
                                      collection_name='draft_set')
The following should work:
if customer.draft_set.count():
    customer.delete()
Note that customer.draft_set will always evaluate as true, since it is the generated Query object, so you MUST use count().
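Putting that together, the deleteCustomer() method from the question might look like this (a sketch following the same logic; do_something_else() is the question's own placeholder):

def deleteCustomer(self, customer_key):
    '''Deletes an existing Customer.'''
    customer = Customer.get(customer_key)
    # draft_set is a Query object, so test its count rather than the query itself
    if customer.draft_set.count():
        customer.delete()
    else:
        do_something_else()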
There were two possible solutions:
Ugly, bad one: as described in my edited question.
Best practice: put all the models together inside one file (e.g. models.py) that looks like this:
class Customer(db.Model):
    """Defines the Customer entity or model."""
    c_name = db.StringProperty(required=True)
    c_address = db.StringProperty()
    c_email = db.StringProperty() ...

class Draft(db.Model):
    """Defines the draft entity or model."""
    d_customer = db.ReferenceProperty(Customer,
                                      collection_name='draft_set')
    d_address = db.StringProperty()
    d_country = db.StringProperty() ...
Easy!

Google App Engine Python Datastore

Basically, what I'm trying to make is a data structure that has the user's name, ID, and date joined. Then I want a "sub-structure" that has the user's "text" and the date it was modified, and the user will have multiple instances of this text.
class User(db.Model):
    ID = db.IntegerProperty()
    name = db.StringProperty()
    datejoined = db.DateTimeProperty(auto_now_add=True)

class Content(db.Model):
    text = db.StringProperty()
    datemod = db.DateTimeProperty(auto_now_add=True)
Is the code set up correctly?
One problem you will have is that making User.ID unique will be non-trivial. The problem is that two writes to the database could occur on different shards, both check at about the same time for existing entries that match the uniqueness constraint and find none, then both create identical entries (with regard to the unique property) and then you have an invalid database state. To solve this, appengine provides a means of ensuring that certain datastore entities are always placed on the same physical machine.
To do this, you make use of the entity keys to tell Google how to organize the entities. Let's assume you want the username to be unique. Change User to look like this:
class User(db.Model):
    datejoined = db.DateTimeProperty(auto_now_add=True)
Yes, that's really it. There's no username since that's going to be used in the key, so it doesn't need to appear separately. If you like, you can do this...
class User(db.Model):
    datejoined = db.DateTimeProperty(auto_now_add=True)

    @property
    def name(self):
        return self.key().name()
To create an instance of a User, you now need to do something a little different: you need to specify a key_name when constructing it.
someuser = User(key_name='john_doe')
...
someuser.save()
Well, really you want to make sure that users don't overwrite each other, so you need to wrap the user creation in a transaction. First define a function that does the necessary check:
def create_user(username):
    checkeduser = User.get_by_key_name(username)
    if checkeduser is not None:
        raise db.Rollback, 'User already exists!'
    newuser = User(key_name=username)
    # more code
    newuser.put()
Then, invoke it in this way
db.run_in_transaction(create_user, 'john_doe')
To find a user, you just do this:
someuser = User.get_by_key_name('john_doe')
Next, you need some way to associate the content to its user, and vice versa. One solution is to put the content into the same entity group as the user by declaring the user as a parent of the content. To do this, you don't need to change the content at all, but you create it a little differently (much like you did with User):
somecontent = Content(parent=User.get_by_key_name('john_doe'))
So, given a content item, you can look up the user by examining its key:
someuser = User.get(somecontent.key().parent())
Going in reverse, looking up all of the content for a particular user is only a little trickier.
allcontent = Content.gql('where ancestor is :user', user=someuser).fetch(10)
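The same ancestor lookup can also be written with the Query interface instead of GQL, which some people find clearer (a small sketch, using the same models as above):

# equivalent to the GQL ancestor query above
allcontent = Content.all().ancestor(someuser).fetch(10)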
Yes, and if you need more documentation, check the App Engine datastore documentation for the property types and the Model class reference for more info about your model classes.
An alternative solution you may see is using referenceproperty.
class User(db.Model):
    name = db.StringProperty()
    datejoined = db.DateTimeProperty(auto_now_add=True)

class Content(db.Model):
    user = db.ReferenceProperty(User, collection_name='matched_content')
    text = db.StringProperty()
    datemod = db.DateTimeProperty(auto_now_add=True)
content = db.get(content_key)
user_name = content.user.name

# looking up all of the content for a particular user
user_content = content.user.matched_content

# create new content for a user
new_content = Content(user=content.user)
