In Django, can I specify database when creating an object? - python

Look at this Django ORM code:
my_instance = MyModel()
my_instance.some_related_object = OtherModel.objects.using('other_db').get(id)
At this point, in the second line, Django will throw an error:
ValueError: Cannot assign "<OtherModel: ID>": instance is on database "default", value is on database "other_db"
To me, it doesn't make much sense. How can Django tell which database my_instance is on if I haven't even called:
my_instance.save(using='some_database')
yet?
I guess that during the construction of an object Django automatically assigns it to the default database. Can I change that? Can I specify the database when creating an object by passing an argument to its constructor? According to the documentation, the only arguments I can pass when creating an object are the values of its fields. So how can I solve my problem?
In Django 1.8 there is a new method called Model.from_db (https://docs.djangoproject.com/en/1.8/ref/models/instances/), but I'm using an earlier version of Django and can't switch to the newer one now. Looking at the implementation, all it does is set two attributes on the model instance:
instance._state.adding = False
instance._state.db = db
So would it be enough to change my code to:
my_instance = MyModel()
my_instance._state.adding = False
my_instance._state.db = 'other_db'
my_instance.some_related_object = OtherModel.objects.using('other_db').get(id)
Or is it too late to do that, because those flags are used in the constructor and have to be set there?

You might want to look into database routing, which has been supported since Django 1.2. This will let you set up multiple databases (with "routers" deciding which one to use) for different models.
You can create a custom database router (a class inheriting from the built-in object type), with db_for_read and db_for_write methods that return the name of the database (as defined in the DATABASES setting) that should be used for the model passed into that method. Return None to let Django figure it out.
It's usually used for handling master-slave replication, so you can have a separate read-only database from your writeable one, but the same logic would apply to let you specify that certain models live in certain databases.
You would probably also want to define an allow_syncdb method (renamed allow_migrate in Django 1.7) so that only the models you want to appear in database B will appear there, and everything else will appear in database A.
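A minimal sketch of such a router, written for the pre-1.7 API described above. The class name AppLabelRouter, the app label "other_app" and the alias "other_db" are hypothetical; the alias has to match a key in your DATABASES setting.

```python
class AppLabelRouter(object):
    """Send models from one app to a second database; defer on everything else."""

    other_db_apps = {"other_app"}  # hypothetical app label
    other_db_alias = "other_db"    # assumed to be a key in settings.DATABASES

    def db_for_read(self, model, **hints):
        if model._meta.app_label in self.other_db_apps:
            return self.other_db_alias
        return None  # no opinion: let Django try the next router or the default

    def db_for_write(self, model, **hints):
        # Same rule for writes as for reads in this sketch.
        return self.db_for_read(model, **hints)

    def allow_syncdb(self, db, model):
        # Models from other_db_apps are created only in other_db;
        # every other model is created only in the remaining databases.
        in_other_app = model._meta.app_label in self.other_db_apps
        return in_other_app == (db == self.other_db_alias)
```

The router is activated by listing its dotted path in the DATABASE_ROUTERS setting.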

Django knows which database each object comes from because it records this in the object's internal state (the _state.db attribute mentioned in the question). The QuerySet stores this information too.
Actually, database routing isn't really needed to achieve what you want here.
Consider the following code fragment:
my_instance = MyModel()
my_instance.some_related_object_id = OtherModel.objects.using('other_db').get(id).id
Note how I assign just the ID, not the object itself.
You will lose the actual object here, but gain the ability to store referential data.
AFAIK there's no API to change an object's associated database.

Related

How can I create multiple polymorphic Django models using the same table

I've got a situation where I have one base Django model Rolls, mapped to the table rolls. There are multiple types of rolls, controlled by a column called type. In the older codebase I'm writing a v2 for (used to be PHP), I created subclasses for each type, that controlled setting their own type value, and it worked fine. I can't figure out how to set this up in Django.
I'd like them all to use the same table, and each will derive methods from the base model, but have different implementations for many of those methods. I figure I can write a manager to handle getting back the right values, but I can't figure out how to set up the models.
I tried setting a single base model and then derived other models from it, but those created different tables for each of them. Using managed = False seems the wrong way to go, given the subclasses don't represent tables of their own.
You're on the right track, but I believe you want proxy models and not unmanaged ones, e.g. proxy = True:
Sometimes, however, you only want to change the Python behavior of a model – perhaps to change the default manager, or add a new method.
This is what proxy model inheritance is for: creating a proxy for the original model. You can create, delete and update instances of the proxy model and all the data will be saved as if you were using the original (non-proxied) model. The difference is that you can change things like the default model ordering or the default manager in the proxy, without having to alter the original.
Then you could override each subclass' save method to set the correct type, and each subclass' default query manager to filter on that type.

In Django, ok to set Model.objects to another manager outside of model definition?

Suppose, in Django 1.6, you have the following model code:
from django.db import models

class FooManager(models.Manager):
    def get_queryset(self):
        return ...  # i.e. return a custom queryset

class Foo(models.Model):
    foo_manager = FooManager()
If, outside the Foo model definition (e.g. in view code or in the shell), you do:
Foo.objects = FooManager()
Foo.objects.all()
you'll get an exception in the Django internal code on Foo.objects.all() due to a variable named lookup_model being None.
However, if you instead do:
Foo.objects = Foo.foo_manager
Foo.objects.all()
The Foo.objects.all() will work as expected, i.e. as if objects had been defined to be FooManager() in the model definition in the first place.
I believe this behavior is due to Django working its "magic" in creating managers during model definition (just as it works magic in creating model fields).
My question: is there any reason NOT to assign objects to an alternate manager in this way outside of the model definition? It seems to work fine, but I don't fully understand the internals so want to make sure.
In case you are wondering, the context is that I have a large code base with many typical references to objects. I want to have this code base work on different databases dynamically, i.e. based on a request URL parameter. My plan is to use middleware that sets objects for all relevant models to managers that point to the appropriate database. The rest of the app code would then go on its merry way, using objects without ever having to know anything has changed.
The trouble is that this is not at all thread safe. Doing this will change the definition for all requests being served by that process, until something else changes it again. That is very likely to have all sorts of unexpected effects.
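One alternative that is sometimes used instead of reassigning class attributes is to keep the per-request choice in thread-local storage, so each thread only sees its own value. The sketch below is plain Python and only shows the isolation. The names set_current_db, get_current_db and handle_request are hypothetical, and how you wire this into middleware or a router is left open.

```python
import threading

_request_state = threading.local()  # each thread gets its own attributes


def set_current_db(alias):
    """Remember the database alias for the current thread only."""
    _request_state.db = alias


def get_current_db():
    """Return this thread's alias, or "default" if none was set."""
    return getattr(_request_state, "db", "default")


def handle_request(alias, results, barrier):
    """Simulate one request: choose a database, then read the choice back."""
    set_current_db(alias)
    barrier.wait()  # both threads have set their alias before either reads
    results[alias] = get_current_db()
```

Unlike Foo.objects = ..., a value set this way in one thread is not visible to requests served by other threads.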

Is __new__ a good way to retrieve a SQLAlchemy object

I am using SQLAlchemy and I just read about the __new__ function. I also read the other posts here about __new__ so I am aware of the difference to __init__, the order they get called and their purpose and the main message for me was: Use __new__ to control the creation of a new instance.
So with that in mind, when I work with SQLAlchemy and want to retrieve an instance (and create one if it does not already exist), e.g. retrieve a User object, I normally do this:
user = DBSession.query(User).filter(User.id==user_id).first()
if not user:
    user = User()
This would either return the current user or give me a new one. Now with my new knowledge about magic, I thought something like this could be a good idea:
user = User(id=user_id)
And in my database class, I would call:
def __new__(cls, id=0):
    if id:
        user = DBSession.query(User).filter(User.id==id).first()
    if not id or not user:
        user = super(User, cls).__new__(cls, id=id)
    return user
Now this code is only a quick draft (e.g. a call to super is missing) but it should clearly point out the idea.
Now my question: Is this a good practice or should I avoid this? If it should be avoided: Why?
Based on your question and your comments, I would suggest you not do this, because it doesn't appear you have any reason to do so, and you don't seem to understand what you're doing.
You say that you will put certain code in __new__. But in the __new__ of what? If you have this:
class User(Base):
    def __new__(cls, id=0):
        if id:
            user = DBSession.query(User).filter(User.id==id).first()
        if not user:
            user = User()
        return user
. . . then when you try to create a User instance, its __new__ will try to create another instance, and so on, leading to infinite recursion.
Using user = User.__init__() solves nothing. __init__ always returns None, so you will just be trying to create a None object.
The appropriate use case for __new__ is when you want to change what kind of object is returned when you instantiate a class by doing SomeClass(). It is rare to need to do this. The most common case is when you want to create a user-defined class that mimics a builtin type such as dict, but even then you might not need to do this.
If your code works without overriding __new__, don't override __new__. Only override it if you have a specific problem or task that can't be solved in another way.
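To make the recursion concrete, and to show one way of getting "fetch or create" behaviour without touching __new__, here is a small sketch in plain Python. It does not use SQLAlchemy: RecursiveUser, _FAKE_TABLE and get_or_create are hypothetical names, and the dict stands in for a database query.

```python
class RecursiveUser:
    """Broken on purpose: __new__ calls the class, which calls __new__ again."""

    def __new__(cls, id=0):
        return RecursiveUser()  # re-enters __new__ and never bottoms out


# Stand-in for a table; a real application would query a session here.
_FAKE_TABLE = {}


class User:
    def __init__(self, id=0):
        self.id = id

    @classmethod
    def get_or_create(cls, id):
        """Return the stored user with this id, or build and store a new one."""
        user = _FAKE_TABLE.get(id)
        if user is None:
            user = cls(id=id)  # ordinary construction, __new__ is untouched
            _FAKE_TABLE[id] = user
        return user
```

Calling RecursiveUser() raises RecursionError, while User.get_or_create(7) returns the same object on repeated calls and plain User() still builds a fresh, unsaved instance.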
From what I see and understand, there is no reason not to put your code into __init__ instead of __new__. There are only a few, very limited (but valid) use cases for __new__, and you should really know what you are doing. So unless you have a very good reason, stick with __init__.
There is a very distinct difference between the first example (checking the return value) and the second (using the constructor immediately); and that difference is the free variable: DBSession.
In some cases this difference is not interesting: if you are only using your SQLAlchemy mapped objects for database persistence, and only in contexts where sqlalchemy.orm.scoped_session is permissible (exactly one session per thread), then the difference does not matter much.
I have found it unusual for both of these conditions to hold, and often neither holds.
By doing this you are preventing the objects from being useful outside the context of database persistence. By disconnecting your models from the database, your application can answer questions like "what if this object had this attribute?" in addition to questions like "does this object have this attribute?" This gets to the crux of why we map database values as python objects, so that they can have interesting behaviors, instead of just as dicts, which are merely bags of attributes.
For instance, in addition to using a regular database persistent login; you might allow users to log into your site with something like OAuth. Although you don't need to persist the users' name and password to your database, you still need to create the User object for the rest of your application to work (so that the user's gravatar shows up in the template).
The other issue, implicitly accessing a particular database context by default, is usually a bad idea. As applications grow, managing how the database is accessed gets more complicated. Objects may be partitioned across several database hosts; you may be managing several concurrent transactions in the same thread; you might want to reuse a particular session for caching performance reasons. The SQLAlchemy Session class exists to address all of these peculiarities; managing sessions explicitly, even when you are just using the most common pattern, makes dealing with the occasional variation much easier.
A really common example of that in web apps is start-up code. Sometimes it's necessary to pull some key bits of data out of the database before an application is ready to serve any requests, but since there is no request to serve, where does the database connection come from? How do you get rid of it once you've finished starting up? These questions are usually non-issues with explicitly managed sessions.
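The start-up case can be sketched with an explicitly passed connection. This uses the standard-library sqlite3 module as a stand-in for a SQLAlchemy Session, so it shows the shape of the pattern rather than SQLAlchemy's API. The settings table and the function names load_settings and startup are hypothetical.

```python
import sqlite3
from contextlib import closing


def load_settings(conn):
    """Read key/value pairs using whatever connection the caller hands in."""
    rows = conn.execute("SELECT key, value FROM settings").fetchall()
    return dict(rows)


def startup(db_path=":memory:"):
    """Start-up code owns its connection: open it, use it, close it."""
    with closing(sqlite3.connect(db_path)) as conn:
        # Demo data only; a real application would already have this table.
        conn.execute("CREATE TABLE IF NOT EXISTS settings (key TEXT, value TEXT)")
        conn.execute("INSERT INTO settings VALUES ('site_name', 'demo')")
        return load_settings(conn)
```

Because load_settings receives the connection as an argument, the same function works at start-up, inside a request, or in a test, and whoever created the connection decides when it is closed.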

How to use a python class with data objects in mysql

I am beginning to learn Python and Django. If I have a simple "player" class with some properties, like name, points and inventory, how would I make the class also write the values to the database when they change? My thinking is that I create Django data models and then call the .save method within my classes. Is this correct?
You are correct that you call the save() method to save models to your db, but you don't have to define the save method within your model classes if you don't want to. It would be extremely helpful to go through the Django tutorial, which explains all of this.
https://docs.djangoproject.com/en/dev/intro/tutorial01/
https://docs.djangoproject.com/en/dev/topics/db/models/
The second link explains Django models.
Django uses its own ORM (object-relational mapping).
This does exactly what it sounds like: it maps your Django/Python objects (models) to your backend.
It provides a sleek, intuitive, Pythonic, very easy to use interface for creating models (tables in your RDBMS), adding data and retrieving data.
First you would define your model:
from django.db import models

class Player(models.Model):
    points = models.IntegerField()
    name = models.CharField(max_length=255)
Django provides commands for changing this Python object into a table:
python manage.py syncdb
You could also use python manage.py sql <appname> to show the actual SQL that Django generates to turn this object into a table.
Once you have storage for this object, you can create new ones the same way you would create any Python object:
new_player = Player(points=100, name='me')
new_player.save()
Calling save() actually writes the object to your backend.
You're spot on...
Start at https://docs.djangoproject.com/en/dev/intro/tutorial01/
Make sure you have the python bindings for MySQL and work your way through it... Then if you have specific problems, ask again...

Django get_query_set override is being cached

I'm overriding Django's get_query_set function on one of my models dynamically. I'm doing this to forcibly filter the original query set returned by Model.objects.all/filter/get by a "scenario" value, using a decorator. Here's the decorator's function:
# Get the base QuerySet for these models before we modify their
# QuerySet managers. This prevents infinite recursion since the
# get_query_set function doesn't rely on itself to get this base QuerySet.
all_income_objects = Income.objects.all()
# Figure out what scenario the user is using.
current_scenario = Scenario.objects.get(user=request.user, selected=True)
# Modify the imported income class to filter based on the current scenario.
Expense.objects.get_query_set = lambda: all_expense_objects.filter(scenario=current_scenario)
# Call the method that was initially supposed to
# be executed before we were so rudely interrupted.
return view(request, **arguments)
I'm doing this to DRY up the code, so that all of my queries aren't littered with an additional filter. However, if the scenario changes, no objects are being returned. If I kill all of my python processes on my server, the objects for the newly selected scenario appear. I'm thinking that it's caching the modified class, and then when the scenario changes, it's applying another filter that will never make sense, since objects can only have one scenario at a time.
This hasn't been an issue with user-based filters because the user never changes for my session. Is passenger doing something stupid to hold onto class objects between requests? Should I be bailing on this weird design pattern and just implement these filters on a per-view basis? There must be a best practice for DRYing filters up that apply across many views based on something dynamic, like the current user.
What about creating a Manager for the model with a method that takes the user as an argument and does the filtering there? My understanding of being DRY w/ Django querysets is to use a model manager:
#### view code:
def some_view(request):
    expenses = Expense.objects.filter_by_cur_scenario(request.user)
    # add additional filters here, or add to manager via more params
    expenses = expenses.filter(something_else=True)

#### models code:
class ExpenseManager(models.Manager):
    def filter_by_cur_scenario(self, user):
        # Use the user passed in; there is no request object inside the manager.
        current_scenario = Scenario.objects.get(user=user, selected=True)
        return self.filter(scenario=current_scenario)

class Expense(models.Model):
    objects = ExpenseManager()
Also, one quick caveat on the manager (which may apply to overriding get_query_set): foreign relationships will not take into account any filtering done at this level. For example, you override the MyObject.objects.filter() method to always filter out deleted rows; A model w/ a foreignkey to that won't use that filter function (at least from what I understand -- someone please correct me if I'm wrong).
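The stacking filters suspected in the question can be reproduced without Django. In this sketch FakeManager and patch_for_request are hypothetical stand-ins for the model manager and the decorator. The point is only that the manager lives for the whole process, so each patch is built on top of the previous one.

```python
class FakeManager:
    """Stand-in for a model manager holding rows as dicts (not Django's API)."""

    def __init__(self, rows):
        self._rows = rows

    def get_query_set(self):
        return list(self._rows)


# One manager per process, shared by every request that process serves.
manager = FakeManager([{"id": 1, "scenario": "A"}, {"id": 2, "scenario": "B"}])


def patch_for_request(scenario):
    """Mimic the decorator: capture the current rows, then replace get_query_set."""
    base = manager.get_query_set()  # already filtered if an earlier request patched it
    manager.get_query_set = lambda: [r for r in base if r["scenario"] == scenario]


patch_for_request("A")
first = manager.get_query_set()   # rows for scenario A
patch_for_request("B")
second = manager.get_query_set()  # A-rows filtered by B: nothing left
```

After the second patch the result is empty until the process restarts, which matches the behaviour described in the question. Passing the user or scenario in explicitly, as in the manager method above, avoids mutating shared state.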
I was hoping to have this implementation happen without having to code anything in other views. Essentially, after the class is imported, I want to modify it so that no matter where it's referenced using Expense.objects.get/filter/all it's already been filtered. As a result, there is no implementation required for any of the other views; it's completely transparent. And, even in cases where I'm using it as a ForeignKey, when an object is retrieved using the aforementioned Expense.objects.get/filter/all, they'll be filtered as well.
