I am currently using the Django framework, including its models mechanism, to abstract the database schema declaration and general DB access, and it works fine for most scenarios.
However, my application also requires tables to be created and accessed dynamically at runtime, which, as far as I can tell, Django does not support out of the box.
These tables usually have an identical structure and can basically be abstracted by the same Model class, but Django doesn't let you change the underlying db_table of a given model query, as it is declared on the Model class and not on the Manager.
My solution is to run the following process whenever I need a new table to be created, populated, and accessed:
Create and populate the table using raw sql
Add indexes to the table using raw sql
When I need to access the table (via the Django QuerySet API), I declare a new type dynamically and return it as the model for the query, using this code:
table_name = ...  # name of the table created by raw SQL
model_name = '%d_%s' % (connection.tenant.id, table_name)
try:
    model = apps.get_registered_model('myapp', model_name)
    return model
except LookupError:
    pass
logger.debug("no model exists for model %s, creating one" % model_name)

class Meta:
    db_table = table_name
    managed = False

attrs = {
    'field1': models.CharField(max_length=200),
    'field2': models.CharField(max_length=200),
    'field3': models.CharField(max_length=200),
    '__module__': 'myapp.models',
    'Meta': Meta,
}
model = type(str(model_name), (models.Model,), attrs)
return model
Note that I do check whether the model is already registered with Django, and I reuse the existing model if it is. The model name is always unique per table. Since I'm using multiple tenants, the tenant ID is also part of the model name, to avoid conflicts with similar tables declared in different schemas.
In case it's not clear: the dynamically created tables will and should be persisted permanently for future sessions.
This solution works fine for me so far.
However, the application will need to support a large number of these tables: 10,000 to 100,000 of them (and corresponding model classes), with up to a million rows per table.
Assuming the underlying db is fine with this load, my questions are:
Do you see any problem with this solution, with or without regard to the expected scale?
Does anybody have a better solution for this scenario?
Thanks.
There is a wiki page on creating models dynamically, although it has been a while since it was last updated:
DynamicModels Django
There are also a few apps that are designed for this use case, but I don't think any of them is being actively maintained:
Django Packages: Dynamic models
I understand that if you are already committed to Django this isn't very helpful, but this is a use case for which Django isn't really suited. It might be more costly to fight against the abstractions provided by Django's model layer than to just use psycopg2 or whatever other adapter is appropriate for your data.
Depending on what sort of operations you are going to perform on your data, it may also be more reasonable to use a single model with an indexed field that lets you distinguish which table a row would belong to, and then shard the data by that column.
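A minimal sketch of that single-model idea, with illustrative field names:
from django.db import models

class Record(models.Model):
    # indexed discriminator for which logical "table" a row belongs to
    table_key = models.CharField(max_length=100, db_index=True)
    field1 = models.CharField(max_length=200)
    field2 = models.CharField(max_length=200)
    field3 = models.CharField(max_length=200)

# every query is then scoped by the discriminator, e.g.:
# Record.objects.filter(table_key='42_mytable')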
If you still need to do this, the general idea would be:
Create a metaclass that extends Django's ModelBase, and use that metaclass as a factory for your actual models.
Consider the issues mentioned on that wiki page, like working around the app_label issue.
Generate and execute the SQL for the creation of the model, as also shown on the wiki page.
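A hedged sketch of how these steps might fit together. It uses a plain factory function with type() (as in the question's code) rather than a full ModelBase subclass, and the schema editor available in Django 1.7+ instead of hand-generated SQL; all names are illustrative:
from django.apps import apps
from django.db import connection, models

def get_dynamic_model(table_name):
    model_name = 'Dyn_%s' % table_name
    try:
        # reuse the class if it was already created in this process
        return apps.get_registered_model('myapp', model_name)
    except LookupError:
        pass

    class Meta:
        app_label = 'myapp'  # works around the app_label issue from the wiki page
        db_table = table_name

    attrs = {
        '__module__': 'myapp.models',
        'Meta': Meta,
        'field1': models.CharField(max_length=200),
    }
    model = type(model_name, (models.Model,), attrs)

    # create the backing table; this raises if the table already exists
    with connection.schema_editor() as editor:
        editor.create_model(model)
    return model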
Related
I am writing a model for an external Oracle DB, from which I need to pull information into my project. One of the fields in that Oracle DB is an XMLType, and it holds a large portion of data that needs to be pulled via the .getClobVal() method instead of being queried as-is.
I was wondering whether Django supports any sort of custom logic for a Model's specific fields. Something along the lines of a getter for each field, which I could override to query that specific field as XMLType.getClobVal() instead of just XMLType. At the same time, I don't want to touch the logic for querying the rest of the Model's fields.
Any ideas?
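One possible direction, offered only as an untested sketch: newer Django versions let a custom field override Field.select_format(), which wraps the column's SQL whenever it is selected, leaving every other field on the model untouched. The exact Oracle expression may need adjusting for your setup:
from django.db import models

class XMLTypeField(models.TextField):
    def select_format(self, compiler, sql, params):
        # wrap only this column's SELECT expression; the rest of the
        # model's fields are queried as usual
        return '(%s).getClobVal()' % sql, params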
I have a model with as many as 20 fields. It is also referenced as a ManyToMany in another model, which references it using a through table. Let me lay out a scenario showing my case:
class Class1(models.Model):
    some_field = .....
    myfield1 = models.ManyToManyField(Class2, through='Another')
    ......

class Another(models.Model):
    class1 = models.ForeignKey(Class1, related_name='class1_class2')
    class2 = models.ForeignKey(Class2, related_name='class1_class2')
"Another" is an admin inline field, using default Admin UI of Django. The problem is that if there are too many objects of "Another" which loads lot of other objects of class1 and class2, NGINX gives me 502: Bad Gateway.
I am not willing to increase the NGINX time, I have already done that many times. What I want to know is that, if there is a way I can say Django Admin to load the inlines only after all other contents are loaded, or say Lazy Load the inlines.
I have gone through almost every post that says Lazy Loading in Django, but it all applies to a particular view or a field, I found nothing close to what I need.
I would be very appreciable if anyone can shed some light on this.
Regards.
Using defer may give you what you need...
https://docs.djangoproject.com/en/dev/ref/models/querysets/#defer
defer
defer(*fields)
In some complex data-modeling situations, your models might contain a lot of fields, some of which could contain a lot of data (for example, text fields), or require expensive processing to convert them to Python objects. If you are using the results of a queryset in some situation where you don’t know if you need those particular fields when you initially fetch the data, you can tell Django not to retrieve them from the database.
This is done by passing the names of the fields to not load to defer():
Entry.objects.defer("headline", "body")
A queryset that has deferred fields will still return model instances. Each deferred field will be retrieved from the database if you access that field (one at a time, not all the deferred fields at once).
You can make multiple calls to defer(). Each call adds new fields to the deferred set:
# Defers both the body and headline fields.
Entry.objects.defer("body").filter(rating=5).defer("headline")
The order in which fields are added to the deferred set does not matter. Calling defer() with a field name that has already been deferred is harmless (the field will still be deferred).
You can defer loading of fields in related models (if the related models are loading via select_related()) by using the standard double-underscore notation to separate related fields:
Blog.objects.select_related().defer("entry__headline", "entry__body")
If you want to clear the set of deferred fields, pass None as a parameter to defer():
# Load all fields immediately.
my_queryset.defer(None)
Some fields in a model won't be deferred, even if you ask for them. You can never defer the loading of the primary key. If you are using select_related() to retrieve related models, you shouldn't defer the loading of the field that connects from the primary model to the related one; doing so will result in an error.
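To tie this back to the inline question above, one hedged possibility (untested; the inline class name is illustrative, and the field names match the question's models) is to defer heavy fields in the inline's queryset:
from django.contrib import admin

class AnotherInline(admin.TabularInline):
    model = Another

    def get_queryset(self, request):
        qs = super(AnotherInline, self).get_queryset(request)
        # defer heavy related fields; adjust the names to your models
        return qs.select_related('class1', 'class2').defer('class1__some_field')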
We have a django application that is, at its core, a series of webpages with Forms which our users fill out in order. (We'll call the particular series of pages with forms on them a "flow".)
We will be white-labeling this application for a Partner -- the Partner will want to add some fields and even add some webpages with their own new Forms. This may result in a new order in which the Forms are filled out. (A new "flow", in addition to changes to existing Forms/Models or new Forms/Models.)
What is the best way to extend our existing, simple Forms-and-Models structure to use different Forms and Models depending on the running instance of the app (e.g. an environment variable)? Some things we thought about:
implement something like get_user_model for every Model and Form used in the app, which would look at the current environment (see the sketch after this list)
implement a more generic key-value store so that we're not bound by the current implementation's field types (i.e., have the data field name be part of the data as well)
a data model which tracks this particular environment's "flow" and which models it needs to use
subclass existing Models and Forms for each new white-label implementation
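A minimal sketch of the first option, assuming a hypothetical PARTNER_FORMS setting that maps form names to dotted import paths per deployment:
from django.conf import settings
from django.utils.module_loading import import_string

def get_flow_form(name):
    # e.g. PARTNER_FORMS = {'signup': 'partner_x.forms.SignupForm'}
    return import_string(settings.PARTNER_FORMS[name])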
Model field injection may be what you are looking for; take a look at this article.
The approach boils down to three concepts:
Dynamically adding fields to model classes
Ensuring Django's model system respects the new fields
Getting the load ordering correct for the above to work
Mezzanine has done a beautiful job implementing this kind of model field injection, with dynamically injected extra fields via its EXTRA_MODEL_FIELDS setting.
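A hedged sketch of the injection idea the article describes, built on the class_prepared signal and contribute_to_class; the target model and field name are illustrative:
from django.db import models
from django.db.models.signals import class_prepared

def add_partner_fields(sender, **kwargs):
    # inject an extra field once the target model class is ready;
    # this must be connected before the model's module is imported
    # (the load-ordering concept above)
    if sender.__name__ == 'Profile':  # hypothetical target model
        field = models.CharField(max_length=100, null=True)
        field.contribute_to_class(sender, 'partner_code')

class_prepared.connect(add_partner_fields)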
I am beginning to learn Python and Django. I want to know how, if I have a simple "player" class with some properties like name, points, and inventory, I would make the class also write the values to the database when they are changed. My thinking is that I create Django data models and then call the .save() method within my classes. Is this correct?
You are correct that you call the save() method to save models to your DB, but you don't have to define the save method within your model classes if you don't want to. It would be extremely helpful to go through the Django tutorial, which explains it all.
https://docs.djangoproject.com/en/dev/intro/tutorial01/
https://docs.djangoproject.com/en/dev/topics/db/models/
These explain Django's models.
Django uses its own ORM (object-relational mapper).
This does exactly what it sounds like: it maps your Django/Python objects (models) to your backend.
It provides a sleek, intuitive, Pythonic, very easy to use interface for creating models (tables in your RDBMS), adding data, and retrieving data.
First you would define your model:
class Player(models.Model):
    points = models.IntegerField()
    name = models.CharField(max_length=255)
Django provides commands for turning this Python object into a table:
python manage.py syncdb
You could also use python manage.py sql <appname> to show the actual SQL that Django generates to turn this object into a table.
Once you have storage for this object, you can create new instances in the same manner you would create Python objects:
new_player = Player(points=100, name='me')
new_player.save()
Calling save() actually writes the object to your backend.
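Retrieving works the same way; a quick illustrative example:
saved_player = Player.objects.get(name='me')
saved_player.points += 10
saved_player.save()  # writes the updated points back to the database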
You're spot on...
Start at https://docs.djangoproject.com/en/dev/intro/tutorial01/
Make sure you have the Python bindings for MySQL and work your way through it... Then if you have specific problems, ask again...
How do I make syncdb execute SQL queries (for table creation) defined by me, rather than generating the tables automatically?
I'm looking for this solution because some particular models in my app represent SQL views of legacy database tables.
So, I've created their SQL views in my Django DB like this:
CREATE VIEW legacy_series AS SELECT * FROM legacy.series;
I have a reverse-engineered model that represents the above view/legacy table. But whenever I run syncdb, I have to create all the views first by running SQL scripts; otherwise syncdb simply creates tables for them (if a view is not found).
How do I make syncdb run the above-mentioned SQL?
There are two possible approaches I know of to adapt your models to a legacy database table (without using views, that is):
1) Run python manage.py inspectdb within your project. This will generate models for existing database tables, which you can then continue to work with.
2) Modify your models with some specific settings. First, define the table name in your model by setting the db_table option in its Meta options. Second, for each field, define the column name to match your legacy database by setting the db_column option. Note there are other db_* options listed that you could possibly use to match your legacy database.
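A brief sketch of the second approach, with illustrative legacy names:
from django.db import models

class Series(models.Model):
    title = models.CharField(max_length=200, db_column='SERIES_TITLE')

    class Meta:
        db_table = 'legacy_series'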
If you really want the views approach, an (ugly) workaround is possible: you can define custom SQL commands per application model. The file lives at "application"/sql/"model".sql. Django runs these SQL files after it has created all the tables. You can try specifying DROP statements for the generated tables, followed by your CREATE VIEW statement, in this file. Note that this will be a bit tricky for the tables with foreign keys, as Django guarantees no order of execution for these files (so stuffing all statements into one .sql will be the easiest way, I think; I've never tried this before).
You could use unmanaged models for your reverse-engineered models, and initial SQL scripts for creating your views.
EDIT:
A bit more detail: when you use unmanaged models, syncdb will not create your database tables for you, so you have to take care of that yourself. An important point is the table name, and how Django maps Model classes to table names; I suggest you read the doc on that point.
Basically, your Series model will look like this:
class Series(models.Model):
    # model fields...
    ...

    class Meta:
        managed = False
        db_table = "legacy_series"
Then, you can put your SQL commands in the yourapp/sql/series.sql file:
### yourapp/sql/series.sql
CREATE VIEW legacy_series AS SELECT * FROM legacy.series;
You can then run syncdb as usual, and start using your legacy models.