ZODB multiple object references - python

I'm developing an application that is used to fill in some huge forms. There are several projects that a form can belong to. The form also has two sections that can be filled in many times, objectives and activities, so a form can have many objectives and activities defined.
I have a class to represent the projects, another for the form and two simple classes to represent the objective and activities. Project has a list of forms, and Form has a list of activities and objectives.
class Project(persistent.Persistent):
    forms = PersistentList()
    ...
class Form(persistent.Persistent):
    objectives = PersistentList()
    activities = PersistentList()
    ...
My question is this: I'm planning on storing the data in ZODB like so:
db['projects'] = OOBTree()
db['forms'] = OOBTree()
db['activities'] = OOBTree()
db['objectives'] = OOBTree()
project = Project(...)  # fill data with some parameters
form = Form(...)  # fill data with some parameters
objective1 = Objective(...)
objective2 = Objective(...)
activity1 = Activity(...)
activity2 = Activity(...)
form.addObjective(objective1)
form.addObjective(objective2)
form.addActivity(activity1)
form.addActivity(activity2)
project.addForm(form)
db['projects']['projectID'] = project
db['forms']['formID'] = form
# each object needs its own key, otherwise the second assignment overwrites the first
db['activities']['activity1ID'] = activity1
db['activities']['activity2ID'] = activity2
db['objectives']['objective1ID'] = objective1
db['objectives']['objective2ID'] = objective2
transaction.commit()
I know that when storing the project, the list of forms gets persisted as well, and the corresponding list of objectives and activities from the form too.
But what happens in the case of the other OOBTrees, 'forms', 'activities' and 'objectives'?
I'm doing this to make it easier to traverse or look up individual forms, objectives and activities. But I'm not sure whether ZODB will recognize those objects, persist each of them only once when saving the project, and keep a reference to that single object, so that when any of them is modified, all references see the change.
That is, when doing db['forms']['formID'] = form, the OOBTree would point to the same object as the one reachable from the projects OOBTree, so the same object is not persisted twice.
Is that the way it works? Or will I get duplicated persisted objects that are all independent instances?
I know there's repoze.catalog to handle indexing and so on, but I don't need that much; I just want to be able to access a form without having to iterate over projects.
Thanks!

Yes, as long as the target objects you are storing have classes that subclass persistent.Persistent somewhere in their inheritance, any references to the same object will point to exactly the same (persistent) object. You should not get the duplication you describe.
The short-long version: ZODB uses special pickling techniques. When serializing the referencing object, it sees that the reference points to a persistent object and, instead of storing that object again, stores a tuple of the class's dotted name and the internal OID of the target object.
Caveat: this only works within the same object database. You should not have cross-database references in your application.
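To make the "store a reference, not a copy" idea concrete, here is a minimal sketch that uses only the standard library's pickle hooks (persistent_id and persistent_load). It is not ZODB's actual code; Record, RefPickler, RefUnpickler, store and the oid attribute are hypothetical names made up for the illustration.

```python
import io
import pickle


class Record:
    """Hypothetical stand-in for a persistent object; oid is an invented identifier."""

    def __init__(self, oid, title):
        self.oid = oid
        self.title = title


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # For Record instances, write a small reference instead of the full state.
        if isinstance(obj, Record):
            return (type(obj).__name__, obj.oid)
        return None  # everything else is pickled normally


class RefUnpickler(pickle.Unpickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store  # maps oid -> the single live object

    def persistent_load(self, pid):
        _class_name, oid = pid
        return self.store[oid]


def dumps(obj):
    buf = io.BytesIO()
    RefPickler(buf).dump(obj)
    return buf.getvalue()


def loads(data, store):
    return RefUnpickler(io.BytesIO(data), store).load()


form = Record(1, "form A")
store = {form.oid: form}

# Two containers referencing the same object, serialized separately.
project_forms = dumps([form])
forms_index = dumps({"formID": form})

# Both references resolve to the very same object after loading.
same = loads(project_forms, store)[0] is loads(forms_index, store)["formID"]
```

Neither serialized container holds the record's state, only the small reference tuple, which is the behaviour described above for references between persistent objects.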

Related

Names of instances and loading objects from a database

I have, for example, the following class structure.
class Company(object):
    Companycount = 0
    _registry = {}
    def __init__(self, name):
        Company.Companycount +=1
        self._registry[Company.Companycount] = [self]
        self.name = name
k = Company("a firm")
b = Company("another firm")
Whenever I need the objects I can access them by using
Company._registry
which gives out a dictionary of all instances.
Do I need reasonable names for my objects since the name of the company is a class attribute, and I can iterate over Company._registry?
When loading the data from the database does it matter what the name of the instance (here k and b) is? Or can I just use arbitrary strings?
Both your Company._registry and the names k and b are just references to your actual instances. Neither plays any role in what you'd store in the database.
Python's object model has all objects living on a big heap, and your code interacts with the objects via such references. You can make as many references as you like, and objects are automatically deleted when there are no references left. See the excellent Facts and myths about Python names and values article by Ned Batchelder.
You need to decide, for yourself, whether the Company._registry structure needs to have names or not. Iterating over a list is slow if you already have the name of the company you want to access, but a dictionary gives you instant access.
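As a rough sketch of the dictionary option, the registry could be keyed by company name instead of a running counter. This is only one possible layout and assumes that names are unique, which the question does not state.

```python
class Company:
    _registry = {}  # maps company name -> instance (assumes names are unique)

    def __init__(self, name):
        self.name = name
        Company._registry[name] = self


k = Company("a firm")
b = Company("another firm")

# Direct lookup by name; the variable names k and b play no part in it.
found = Company._registry["another firm"]
```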
If you are going to use an ORM, then you don't really need that structure anyway. Leave it to the ORM to help you find your objects, or give you a sequence of all objects to iterate over. I recommend using SQLAlchemy for this.
The name doesn't matter, but if you are going to initialize a lot of objects, you will still want to make the names reasonable somehow.

how can I store and read a Python list of Django model objects from and to the session?

I am looking for a way to put Django model objects into a list and then store this in the session. I found out that it requires serialization before it can be stored in the session. So when reading the list from the session I first deserialize it.
But then I was hoping to be able to access my initial objects as before, but it turns out to be a DeserializedObject.
Does anyone know how I should approach my requirements?
In a nutshell, this is the code I tried, unsuccessfully:
team1_list = generate_random_playerlist()  # generates a list of Player() objects
request.session['proposed_team_1'] = serializers.serialize('json', team1_list)
# some code in between
des_list = serializers.deserialize('json', request.session['proposed_team_1'])
for player in des_list:
    print("name of player:" + player.last_name)  # fails
I would not serialize the model instances themselves. Just write their primary keys to the session which is easy and requires no serialization. Then you can retrieve the queryset from the database in a single query and don't have to worry about changed data, missing instances and the like.
request.session['player_ids'] = list(players.values_list('pk', flat=True))
players = Player.objects.filter(pk__in=request.session.get('player_ids', []))
If your Player instances are not yet saved to the database, you may have to stick with your approach. In that case, you get the actual model instance from the DeserializedObject via its .object attribute.
for player in des_list:
    print("name of player:" + player.object.last_name)
See the docs on deserialization.

Adding a value to an Object retrieved via a QuerySet

In Django I often call Objects via QuerySets and put them in a dict to give them to a template.
ObjectInstance = Object.objects.get(pk=pk)
ObjectsInstanceDict = {'value1': ObjectInstance.value1,
                       'value2': ObjectInstance.value2,
                       'specialvalue': SomeBusinessLogic(Data_to_aggregate)}
Sometimes specialvalue is just a timestamp converted to a string; other times some analysis is done on the data.
Instead of creating a dict, I would like to add the special value to the ObjectInstance itself, so I don't have to repeat all the existing values and only add the newly computed ones.
How would I do this?
And please tell me if I am working around a fundamental mistake here.
Django model instances are normal Python objects. Like almost any other Python object, you can freely add attributes to them at any time.
object_instance = Object.objects.get(pk=pk)
object_instance.special_value = SomeBusinessLogic(data_to_aggregate)
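The same thing can be tried without Django. The sketch below uses a plain class as a stand-in for a model instance; Article, created and format_timestamp are hypothetical names, and the timestamp formatting is just an example of "business logic".

```python
from datetime import datetime, timezone


class Article:
    """Hypothetical stand-in for a Django model instance."""

    def __init__(self, value1, created):
        self.value1 = value1
        self.created = created


def format_timestamp(ts):
    # Hypothetical business logic: render a timestamp as text.
    return ts.strftime("%Y-%m-%d %H:%M")


article = Article("hello", datetime(2020, 1, 2, 3, 4, tzinfo=timezone.utc))

# Attach a computed value to the existing object; its other attributes are untouched.
article.special_value = format_timestamp(article.created)
```

The attribute exists only on that in-memory object. It is not a model field, so it is not saved to the database and is not present on a freshly queried instance.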

Django cross table model structure

I have a System model, and an Interface model. An Interface is the combination between two systems. Before, this interface was represented as an Excel sheet (cross table). Now I'd like to store it in a database.
I tried creating an Interface model with two foreign keys to System. This doesn't work because:
It creates two different reverse relationships on the target model
It doesn't avoid having duplicates (first and second rel swapped)
I used this code:
class SystemInterface(Interface):
    assigned_to = models.ManyToManyField(User)
    first_system = models.ForeignKey(System)
    second_system = models.ForeignKey(System)
Isn't there a better way to do this?
I need symmetrical relations: it shouldn't matter whether a System is the "first" or the "second" in an Interface.
I think the simplest way to represent those models would be like this:
class System(models.Model):
    pass
class Interface(models.Model):
    assigned_to = models.ManyToManyField(to=User)
    system = models.ForeignKey(System)
    @property
    def systems(self):
        return Interface.objects.get(system=self.system).interfacedsystem_set.all()
class InterfacedSystem(models.Model):
    interface = models.ForeignKey(Interface)
    system = models.ForeignKey(System)
Adding and removing interfaced systems is left as an exercise for the reader, but should be fairly easy.
You can use a many-to-many relationship with extra fields, but it can't be symmetrical.
The table used for a many-to-many relation contains one row per relation between two models. The table used for a many-to-many relation from System to itself has one row per relation between two Systems. This is consistent with the fact that your model fits the structure of a model used for ManyToManyField.through.
Using an intermediary model lets you add fields like assigned_to to the many-to-many table.
It might be tricky to understand, but it should prevent the creation of a second SystemInterface(left_system=system_a, right_system=system_b). Note that I replaced "first" with "left" and "second" with "right" to reflect that a row (instance) of a many-to-many relation has a "left" side and a "right" side.
Because these relations can't be symmetrical, this won't solve the problem of having one SystemInterface(left_system=system_a, right_system=system_b) and another SystemInterface(left_system=system_b, right_system=system_a). You should prevent that from happening in the clean() method of SystemInterface, or of whatever model represents the many-to-many table via ManyToManyField.through.
Since Django doesn't support symmetrical many-to-many relationships with extra data, you probably need to enforce this yourself.
If you have a convenient immutable value in the system (e.g. system id), you can create a predictable algorithm for which system will be stored in which entry in your table. If the systems are always persistent by the time you create the Interface object, you can use the primary key.
Then, write a function to create the interface. For example:
class System(models.Model):
    def addInterface(self, other_system, user):
        system_interface = SystemInterface()
        # Store the system with the lower id first so each pair has one canonical order.
        if other_system.id < self.id:
            system_interface.first_system = other_system
            system_interface.second_system = self
        else:
            system_interface.first_system = self
            system_interface.second_system = other_system
        system_interface.save()
        # A many-to-many field can only be populated after the instance is saved.
        system_interface.assigned_to.add(user)
        return system_interface
Using this design, you can do the usual validation, duplication detection, etc. on the SystemInterface object. The main point is that you enforce the constraint in your code rather than in the data model.
Does this make sense?
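The ordering rule itself can be checked without Django. In this sketch, canonical_pair is a hypothetical helper, the set stands in for a unique constraint on (first_system, second_system), and rejecting a system paired with itself is an extra assumption not taken from the question.

```python
def canonical_pair(system_a_id, system_b_id):
    """Return the two ids in a fixed order so (a, b) and (b, a) compare equal."""
    if system_a_id == system_b_id:
        # Assumption: an interface always connects two different systems.
        raise ValueError("an interface needs two different systems")
    return (min(system_a_id, system_b_id), max(system_a_id, system_b_id))


# A set of canonical pairs stands in for a unique constraint on
# (first_system, second_system).
interfaces = set()
interfaces.add(canonical_pair(7, 3))
interfaces.add(canonical_pair(3, 7))  # same interface, arguments swapped
```

Because both calls produce the same tuple, the swapped pair is detected as a duplicate instead of being stored as a second interface.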

App Engine, Cross reference between two entities

I would like to have two types of entities referring to each other,
but Python doesn't yet know the name of the second entity class in the body of the first.
How should I code this?
class Business(db.Model):
    bus_contact_info_ = db.ReferenceProperty(reference_class=Business_Info)
class Business_Info (db.Model):
    my_business_ = db.ReferenceProperty(reference_class=Business)
If you advise using a reference in only one of them and the implicitly created property
(which is a query object) in the other,
then I'm concerned about the CPU quota penalty of using a query versus calling get() directly on a key.
Please advise how to write this code in Python.
Queries are a little slower, and so they do use a bit more resources. ReferenceProperty does not require reference_class. So you could always define Business like:
class Business(db.Model):
bus_contact_info_ = db.ReferenceProperty()
There may also be better options for your data structure. Check out the modelling relationships article for some ideas.
Is this a one-to-one mapping? If this is a one-to-one mapping, you may be better off denormalizing your data.
Does it ever change? If not (and it is one-to-one), perhaps you could use entity groups and structure your data so that you could just directly use the keys / key names. You might be able to do this by making BusinessInfo a child of Business, then always use 'i' as the key_name. For example:
business = Business().put()
business_info = BusinessInfo(key_name='i', parent=business).put()
# Get business_info from business:
business_info = db.get(db.Key.from_path('BusinessInfo', 'i', parent=business))
# Get business from business_info:
business = db.get(business_info.parent())
