Django global ManyToMany Relation (more than symmetrical) - python

I'm having trouble making a ManyToMany relation in which every object is connected to every other object.
Here's my example for better understanding:
class Animal(models.Model):
    animaux_lies = models.ManyToManyField("self", verbose_name="Animaux liés", blank=True)
If I have only two animals linked together, it works fine: they are correctly linked in both directions (because the relation is symmetrical).
But if I have 3 or more animals, I don't get the result I want. If Animal1 is linked to Animal2 and Animal3, I would like Animal2 to not only be linked to Animal1 but also to Animal3 (and Animal3 linked to 1 and 2).
How can I do that? Even with a through table I don't see how to do this correctly

It sounds like you're trying to represent graphs and ask questions about connections within those graphs. That is extremely challenging using a relational database, which is what Django's ORM works with. It is possible using a graph database, but that will require different libraries to query the data.
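If the link is meant to be fully transitive (an assumption based on the Animal1/Animal2/Animal3 example in the question), one way to avoid graph queries is to treat each set of linked animals as a single group, so that "linked" just means "in the same group". The sketch below is plain Python, not Django; `add_animal`, `link` and `linked_to` are hypothetical helper names. In a Django schema the same idea could be stored as a foreign key from Animal to a hypothetical group model.

```python
# Plain-Python sketch (no Django): every animal belongs to exactly one group,
# and two animals are "linked" when they share a group.
group_of = {}   # animal name -> group id
members = {}    # group id -> set of animal names in that group


def add_animal(name):
    """Put a new animal in a group of its own (no-op if already known)."""
    if name not in group_of:
        group_of[name] = name      # the first member's name doubles as group id
        members[name] = {name}


def link(a, b):
    """Link a and b by merging their two groups into one."""
    add_animal(a)
    add_animal(b)
    keep, drop = group_of[a], group_of[b]
    if keep == drop:
        return                     # already in the same group
    for name in members.pop(drop):
        group_of[name] = keep
        members[keep].add(name)


def linked_to(name):
    """Every animal in the same group as name, excluding name itself."""
    return members[group_of[name]] - {name}


link("Animal1", "Animal2")
link("Animal1", "Animal3")
print(sorted(linked_to("Animal2")))  # ['Animal1', 'Animal3']
```

Merging groups on every link keeps lookups trivial, at the cost of not remembering which individual pairs were linked by hand.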

Related

Generic data handling in Python

The situation
While reading the Bible (as context) I'd like to record certain dependencies, e.g. between people and locations. Because it is quick to extend, I'm choosing Python to handle this varied data. Currently my database is a collection of many feature vectors, independent of each other, each containing various pieces of information.
In the end I'd like to type in a keyword to search in this whole database, which shall return everything that is in touch with it. Something simple as
results = database(key)
What I'm looking for
Unfortunately I'm no expert on the different ways to handle a database, and I hope you can help me find an appropriate option.
Are there possibilities that can be used out of the box or do I need to create all logic by myself?
This is a little vague so I'll try to handle the People and Location bit of it to help you get started.
One possibility is to build a SQLite database. (The sqlite3 library + documentation is relatively friendly). Also here's a nice tutorial on getting started with SQLite.
To start, you can create two entity tables:
People: contains details about every person in the Bible.
Locations: contains details about every location in the Bible.
You can then create two relationship tables that reference people and locations (as Foreign Keys). For example, one of these relationship tables might be
People_Visited_Locations: contains information about where each person visited in their lifetime. The schema might look something like this:
| person (Foreign Key)| location (Foreign Key) | year |
Remember that Foreign Key refers to an entry in another table. In our case, person is an existing unique ID from your entity table People, location is an existing unique ID from your entity table Locations, and year could be the year that person went to that location.
Then, to fetch every place that some person (say, Adam) visited, you can create a Select statement that returns all entries in People_Visited_Locations with Adam as person.
I think the key (pun intended) takeaway is how relationship tables can help you map relationships between entities.
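As a rough illustration of the tables and the Select statement described above, here is a minimal sketch using the standard-library sqlite3 module with an in-memory database. The column names `id` and `name`, the helper `places_visited`, and the single sample row are assumptions made for the example, not part of any real dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE People    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Locations (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE People_Visited_Locations (
        person   INTEGER NOT NULL REFERENCES People(id),
        location INTEGER NOT NULL REFERENCES Locations(id),
        year     INTEGER
    );
""")

# Illustrative rows only; the year is left empty rather than guessed.
conn.execute("INSERT INTO People (id, name) VALUES (1, 'Adam')")
conn.execute("INSERT INTO Locations (id, name) VALUES (1, 'Eden')")
conn.execute(
    "INSERT INTO People_Visited_Locations (person, location, year) "
    "VALUES (1, 1, NULL)"
)


def places_visited(person_name):
    """Return the names of all locations the given person visited."""
    rows = conn.execute(
        """
        SELECT l.name
        FROM People_Visited_Locations AS v
        JOIN People    AS p ON p.id = v.person
        JOIN Locations AS l ON l.id = v.location
        WHERE p.name = ?
        ORDER BY l.name
        """,
        (person_name,),
    )
    return [name for (name,) in rows]


print(places_visited("Adam"))  # ['Eden']
```

A `database(key)` style search could then be built by running a handful of such queries, one per relationship table.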
Hope this helps get you started :)

Storing multiple values into a single field in mysql database that preserve order in Django

I've been trying to build a Tutorial system that we usually see on websites. Like the ones we click next -> next -> previous etc to read.
All Posts are stored in a table(model) called Post. Basically like a pool of post objects.
Post.objects.all() will return all the posts.
Now there's another table (model) called Tutorial that stores the following:
class Tutorial(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    tutorial_heading = models.CharField(max_length=100)
    tutorial_summary = models.CharField(max_length=300)
    series = models.CharField(max_length=40)  # <---- Here [10,11,12]
    ...
Here entries in this series field are post_ids stored as a string representation of a list.
example: series will have [10,11,12] where 10, 11 and 12 are post_id that correspond to their respective entries in the Post table.
So my table entry for Tutorial model looks like this.
| id | heading | summary | series |
| "5" | "Series 3 Tutorial" | "lorem on ullt consequat." | "[12, 13, 14]" |
So I just read the series field and get all the Posts with the ids in this list then display them using pagination in Django.
Now, I've read from several stackoverflow posts that having multiple entries in a single field is a bad idea. And having this relationship to span over multiple tables as a mapping is a better option.
What I want to have is the ability to insert new posts into this series anywhere I want. Maybe in the front or middle. This can be easily accomplished by treating this series as a list and inserting as I please. Altering "[14,12,13]" will reorder the posts that are being displayed.
My question is: is this way of storing multiple values in a single field okay for my use case, or will it take a performance hit, or is it generally a bad idea? If it is not okay, is there a way to preserve or alter the order by spanning the relationship over another table, or is there an entirely better way to accomplish this in Django or MySQL?
Here entries in this series field are post_ids stored as a string representation of a list.
(...)
So I just read the series field and get all the Posts with the ids in this list then display them using pagination in Django.
DON'T DO THIS !!!
You are working with a relational database. There is one proper way to model relationships between entities in a relational database, which is to use foreign keys. In your case, depending on whether a post can belong only to a single tutorial ("one to many" relationship) or to many tutorials at the same time ("many to many" relationship), you'll want either to add to post a foreign key on tutorial, or to use an intermediate "post_tutorials" table with foreign keys on both post and tutorial.
Your solution doesn't allow the database to do its job properly. It cannot enforce integrity constraints (what if you delete a post that's referenced by a tutorial?), it cannot optimize read access (with a proper schema the database can retrieve a tutorial and all its posts in a single query), it cannot follow reverse relationships (given a post, access the tutorial(s) it belongs to), etc. And it requires an external program (Python code) to interact with your data, while with proper modeling you just need standard SQL.
Finally - but this is django-specific - using proper schema works better with the admin features, and with django rest framework if you intend to build a rest API.
Regarding the ordering problem: it's a long-known (and solved) issue. You just need to add an "order" field (a small int should be enough). There are a couple of third-party Django apps that add support for this to both your models and the admin, so it's almost plug and play.
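To make the "order" field idea concrete, here is a minimal sketch using the standard-library sqlite3 module rather than Django or MySQL. The table name `post_tutorials` comes from the text above; the column names, the helpers `series` and `insert_post`, and the post id 99 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE post_tutorials (
        tutorial_id INTEGER NOT NULL,
        post_id     INTEGER NOT NULL,
        position    INTEGER NOT NULL  -- the "order" field (ORDER is reserved in SQL)
    )
""")


def series(tutorial_id):
    """Return the post ids of a tutorial in display order."""
    rows = conn.execute(
        "SELECT post_id FROM post_tutorials "
        "WHERE tutorial_id = ? ORDER BY position",
        (tutorial_id,),
    )
    return [post_id for (post_id,) in rows]


def insert_post(tutorial_id, post_id, position):
    """Insert a post at the given position, shifting later posts down by one."""
    with conn:  # run both statements in a single transaction
        conn.execute(
            "UPDATE post_tutorials SET position = position + 1 "
            "WHERE tutorial_id = ? AND position >= ?",
            (tutorial_id, position),
        )
        conn.execute(
            "INSERT INTO post_tutorials (tutorial_id, post_id, position) "
            "VALUES (?, ?, ?)",
            (tutorial_id, post_id, position),
        )


# Tutorial 5 with posts 12, 13, 14, as in the question's example row.
for index, post_id in enumerate([12, 13, 14]):
    insert_post(5, post_id, index)

insert_post(5, 99, 1)  # hypothetical new post, inserted in the middle
print(series(5))       # [12, 99, 13, 14]
```

Reordering becomes an UPDATE on `position`, and the database can still enforce foreign keys and join posts to tutorials in one query.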
IOW, there is absolutely no good reason to denormalize your schema this way and only good reasons to use proper relational modeling. FWIW I once had to work on a project based on some obscure (and hopefully long dead) PHP CMS that had the brilliant idea to use your "serialized lists" anti-pattern, and I can tell you it was both a disaster wrt/ performance and a complete nightmare to maintain. So do yourself and the world a favour: don't try to be creative, follow well-known and established best practices instead, and your life will be much happier. My 2 cents...
I can think of two approaches:
Approach One: Linked List
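One way is to use a linked list, like this:
class Tutorial(models.Model):
    ...
    # on_delete is required from Django 2.0 onwards; SET_NULL is one reasonable choice here
    previous = models.OneToOneField('self', null=True, blank=True, on_delete=models.SET_NULL, related_name="next")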
In this approach, you can walk each series from its first tutorial to its last like this:
for tutorial in Tutorial.objects.filter(previous__isnull=True):
    print(tutorial)
    # A reverse one-to-one raises DoesNotExist on the last item, so test with hasattr()
    while hasattr(tutorial, "next"):
        print(tutorial.next)
        tutorial = tutorial.next
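This is a rather complicated approach. For example, whenever you want to add a new tutorial in the middle of a linked list, you need to make changes in two places. Like:
post = Tutorial.objects.first()
next_post = post.next  # raises Tutorial.DoesNotExist if post is the last one
# Detach next_post first: previous is one-to-one, so two rows cannot point at post
next_post.previous = None
next_post.save()
new = Tutorial.objects.create(...)
new.previous = post
new.save()
next_post.previous = new
next_post.save()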
But there is a huge benefit in this approach, you don't have to create a new table for creating series. Also, there is possibility that the order in tutorials will not be modified frequently, which means you don't need to take too much hassle.
Approach Two: Create a new Model
You can simply create a new model and FK to Tutorial, like this:
class Series(models.Model):
    name = models.CharField(max_length=255)
class Tutorial(models.Model):
    ...
    # on_delete is required from Django 2.0 onwards; SET_NULL is one reasonable choice here
    series = models.ForeignKey(Series, null=True, blank=True, on_delete=models.SET_NULL, related_name='tutorials')
    order = models.IntegerField(default=0)
    class Meta:
        unique_together = ('series', 'order')  # makes sure the same order value cannot appear twice in one series
Then you can access tutorials in series by:
series = Series.objects.first()
series.tutorials.all().order_by('order')
The advantage of this approach is that it's much more flexible to access Tutorials through a series, but an extra table will be created for this, plus one extra field to maintain the order.

Reuse existing objects in django ORM

We want to reduce the amount of same object instances in one python interpreter.
Example:
class Blog(models.Model):
    author = models.ForeignKey(User)
If we iterate over a thousand blogs, the same author (same id, but a different Python object) gets created several times.
Is there a way to make the django ORM reuse the already created user instances?
Example:
for blog in Blog.objects.all():
    print(blog.author.username)
If author "foo-writer" has 100 blogs, there are 100 author objects in the memory. That's what we want to avoid.
I think solutions like mem-cached/redis won't help here, since we want to optimize the python process.
I'm not sure if it's database calls or memory usage you're concerned about here.
If the former, then using select_related will help you:
Blog.objects.all().select_related('author')
which will get all the blogs and their associated authors.
If you want to optimize memory, then the best way to do it is to get the relevant author objects manually in one go, store them in a dict, then manually annotate that object on each blog:
blogs = Blog.objects.all()
author_ids = set(b.author_id for b in blogs)
authors = User.objects.in_bulk(list(author_ids))  # author is a ForeignKey to User
for blog in blogs:
    blog._author = authors[blog.author_id]
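The effect of the dict-based approach can be seen without a database. The sketch below is plain Python; the `Author` and `Blog` dataclasses and the sample rows are hypothetical stand-ins for the Django models and query results, used only to show that each author ends up as a single shared object.

```python
from dataclasses import dataclass


@dataclass
class Author:
    id: int
    username: str


@dataclass
class Blog:
    id: int
    author_id: int
    author: "Author | None" = None


# Stand-ins for database rows: three blogs written by two distinct authors.
author_rows = {1: "foo-writer", 2: "bar-writer"}
blogs = [Blog(1, 1), Blog(2, 1), Blog(3, 2)]

# Build each author exactly once, keyed by id (in_bulk() also returns a dict).
author_ids = {b.author_id for b in blogs}
authors = {pk: Author(pk, author_rows[pk]) for pk in author_ids}

# Attach the shared instance instead of creating a new one per blog.
for blog in blogs:
    blog.author = authors[blog.author_id]

print(blogs[0].author is blogs[1].author)  # True: one object, two references
```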

Query entities with a specific parent's property in GAE and python?

Is it possible in any way to query entities using one of their parent's property in GAE, like this (which doesn't work)?
class Car(db.Model):
    title = db.StringProperty()
    type = db.StringProperty()
class Part(db.Model):
    title = db.StringProperty()
car = Car()
car.title = 'BMW X5'
car.type = 'SUV'
car.put()
part = Part(parent = car)
part.title = 'Left door'
part.put()
parts = Part.all()
parts.filter('parent.type ==', 'SUV') # this in particular
I've read about ReferenceProperty, and Indexes but I'm not sure what I need.
GAE lets me set a parent on the Part entity, but do I actually need a (kind of duplicate) property like this:
parent = db.ReferenceProperty(Car, required=True)
That would feel like duplicating what the system already does, since the entity has a parent. Or is there another way?
It's not an answer to your question as such, but NDB offers structured properties.
https://developers.google.com/appengine/docs/python/ndb/properties#structured
You can structure a model's properties. For example, you can define a model class Contact containing a list of addresses, each with internal structure.
Although the structured properties instances are defined using the same syntax as for model classes, they are not full-fledged entities. They don't have their own keys in the Datastore. They cannot be retrieved independently of the entity to which they belong. An application can, however, query for the values of their individual fields.
So here car would contain parts as a structured property. Whether this is viable in your use case depends on how you structure your data. If you want to know what parts make up a specific car, that seems viable. If you want to filter global parts regardless of what car they belong to, then you can still do that, but you'll have to make the "parts" inside each car also refer to a different model. If you see what I mean (I'm not sure I do), as each car contains its own parts.
Adding the parent as an explicit property isn't going to help.
You can break it up in two parts though:
for suv in Car.all().filter('type', 'SUV'):
    for part in Part.all(ancestor=suv):
        ...do something with part...
If you want to query on the property of another (parent) object, you gotta get that object first.
I can think of two solutions to your problem:
Guido's way is to query for the parent, and then query for the part. This way issues more queries.
The second way is to store a copy of parent.type inside your Part. The downsides are that you're storing duplicate data (more storage), and you have to be careful that the data in Part and the data in Car match up. However, you only need to issue one query.
You'll have to figure out which one works better for you.
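The trade-off between the two solutions can be sketched with plain Python lists standing in for the datastore. No GAE API is used here; the record layout, the `car_type` field and all three helper names are hypothetical.

```python
# In-memory stand-ins for the two entity kinds.
cars = [
    {"key": "car1", "title": "BMW X5", "type": "SUV"},
]
parts = [
    # car_type is a copy of the parent's type, stored on the part itself.
    {"title": "Left door", "parent": "car1", "car_type": "SUV"},
]


def parts_via_parent(car_type):
    """First way: find the matching cars, then their parts (two lookups)."""
    keys = {car["key"] for car in cars if car["type"] == car_type}
    return [p["title"] for p in parts if p["parent"] in keys]


def parts_via_copy(car_type):
    """Second way: filter parts directly on the duplicated field (one lookup)."""
    return [p["title"] for p in parts if p["car_type"] == car_type]


def set_car_type(car_key, new_type):
    """With the copy, changing a car means updating all of its parts too."""
    for car in cars:
        if car["key"] == car_key:
            car["type"] = new_type
    for part in parts:
        if part["parent"] == car_key:
            part["car_type"] = new_type


print(parts_via_parent("SUV"))  # ['Left door']
print(parts_via_copy("SUV"))    # ['Left door']
```

The extra work in `set_car_type` is the price of the single-query read: forget to update the parts and the two ways start returning different answers.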

Grouping model objects in Django

I have an app with 2 models: Product and Photo, each of which corresponds to a MySQL table (backed by MyISAM).
Product is a ForeignKey field of Photo. Several Photo objects may share a single Product object.
Now, the question: I need Photo objects to be further subgrouped into sets representing the same real-world object (an instance of a product) photographed from different aspects. I want to differentiate these different real-world objects, but still have them all connected to their Product object.
What's the best way to group the photos in terms of efficiency of both database querying and manual data input?
Thanks to @shawnwall, more ideas:
Maybe Product should be connected not to individual Photo objects but to sets of photos.
There should be a set for each real-world object, even if there's only 1 photo of it now.
The set should be represented by an ID field on the Photo table, common between certain Photo objects and a Product object. (What kind of field is that?)
Seems like your approach is decent so far. I considered a ManyToMany field from product to photos, but it doesn't sound like it's needed. You could add an 'aspect' column that relates to a list of 'choices' on the photo table. Also remember Django will let you query both ways:
Photo.objects.filter(product__id=1)
or
Product.objects.filter(photo__id=2)
You can also access them through instances:
photo.product
or
product.photo_set
