Bulk update using model's method in SQLAlchemy - python

I'm developing an application with SQLAlchemy and I've run into a bit of an issue. I would like to run a method on all models returned by a query and make all that in a single SQL query, while preserving the readability the ORM offers.
The method in question is very simple and doesn't depend on any external data nor makes any more queries. It's also fine if all the models in the bulk update use the same exact value, so the value itself needs to be evaluated only once.
Here's my model:
from datetime import datetime

class Item(db.Model):
    last_fetch = db.Column(db.DateTime)

    def refresh(self):
        self.last_fetch = datetime.utcnow()
I would like to call the refresh() function on all models returned by a query - for the sake of example let's assume it's Item.query.all().
I can iterate through them and run the method on each model but that would run a separate query for each one of them:
items = Item.query.all()
for item in items:
    item.refresh()
Or I could do the following, which works; however, I've now moved my refresh() logic from the model into the code that would otherwise just call that method:
# update() is a method of the query itself, not of the list returned by all()
items = Item.query
items.update({Item.last_fetch: datetime.utcnow()})
Is there a better solution? A way to define a "smart" method on the model that would somehow allow the ORM to run it in bulk while still keeping it a model method?
Regards.

SQLAlchemy doesn't recreate queried objects using __init__ but __new__, so you could either override __new__ on your model or see whether the @orm.reconstructor decorator explained in Constructors and Object Initialization would work.
Using the reconstructor:
class Item(db.Model):
    last_fetch = db.Column(db.DateTime)

    @orm.reconstructor
    def set_last_fetch_on_load(self):
        self.last_fetch = datetime.datetime.now()
Overriding __new__:
class Item(db.Model):
    last_fetch = db.Column(db.DateTime)

    def __new__(cls, *args, **kwargs):
        # object.__new__ rejects extra arguments here, so pass only cls
        obj = super().__new__(cls)
        obj.last_fetch = datetime.datetime.now()
        return obj
Note: Haven't tested it.

Related

How does Model Class work, Django?

Before posting this question, I have read through the Official Django Documentation, scouring it for a comprehensive explanation for beginners. I have read the code of the actual Model Class, and searched around on StackOverflow.
When working with databases in Django, you work with classes inheriting from the Model class in the models module. This helps programmers avoid double-typing everything, jumping between database specific syntax and python. As I have read, 'the model class that each model inherits from automatically takes care of translation'.
How does this work? How does the Model class convert model attributes to database columns? I suppose some methods inherited from the parent Model class are able to use the variables specified in each new model, but I would like a better explanation if possible!
Also, why write 'models.Model' if the Model class is within models.base?
LINK TO MODEL CLASS: https://docs.djangoproject.com/en/1.11/_modules/django/db/models/base/#Model
EDIT:
Figured out why writing models.Model works.
How Does the Model Class convert model attributes to database columns?
The Model class doesn't really do any conversion itself. You create a subclass of Model that has some column information, which Django's ORM uses when building the database query corresponding to your Django ORM query. The conversion is done by a database driver when it actually communicates with your specific database.
Here's a toy ORM that behaves a little like Django's Model. You can implement QuerySet for fun if you want:
class Column:
    '''
    Represents a database column.
    This is used to create the underlying table in the database
    and to translate database types to Python types.
    '''
    def __init__(self, type):
        self.type = type

class Manager:
    '''
    Accessed via `YourModel.objects`. This is what constructs
    a `QuerySet` object in Django.
    '''
    def __init__(self, model):
        self.model = model

    def get(self, id):
        '''
        Pretend `YourModel.objects.get(id=123)` queries the database directly.
        '''
        # Create an instance of the model. We only keep track of the model class.
        instance = self.model()
        # Populate the instance's attributes with the result of the database query
        for name in self.model._columns:
            # Pretend we load the values from the database
            value = 123
            setattr(instance, name, value)
        # This would be done above if we actually queried the database
        instance.id = id
        # Finally return the instance of `self.model`
        return instance

class ModelBase(type):
    def __new__(cls, name, bases, attrs):
        new_cls = super().__new__(cls, name, bases, attrs)
        # The `Manager` instance is made a class attribute
        new_cls.objects = Manager(new_cls)
        # Keep track of the columns for convenience
        new_cls._columns = {}
        for name, attr in attrs.items():
            if isinstance(attr, Column):
                new_cls._columns[name] = attr
        # The class is now ready
        return new_cls

class Model(metaclass=ModelBase):
    '''
    Django's `Model` is more complex.
    This one only uses `ModelBase` as its metaclass so you can just inherit from it
    '''
    pass

class MyModel(Model):
    id = Column(int)
    column2 = Column(float)
    column3 = Column(str)

if __name__ == '__main__':
    print(MyModel._columns)
    instance = MyModel.objects.get(id=5)
    print(instance.id)
The main functionality is provided by Model having ModelBase as a metaclass. The metaclass's __new__ method is called when Model or any subclass is created (not an instance, the class itself), which allows the metaclass to modify the class arbitrarily.
Each Model subclass contains information about its own columns and gets an objects class attribute that queries the database for it.
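As a small, self-contained illustration of that timing (this is not Django's actual code), the sketch below uses a metaclass whose __new__ records every class built with it. The names RegistryMeta, Base, Article and created are invented for the example.

```python
created = []  # names of classes built by the metaclass, in creation order


class RegistryMeta(type):
    def __new__(mcs, name, bases, attrs):
        new_cls = super().__new__(mcs, name, bases, attrs)
        # Runs once per `class` statement, not once per instance
        created.append(name)
        return new_cls


class Base(metaclass=RegistryMeta):
    pass


class Article(Base):
    pass


# Both classes were registered when their `class` statements ran
print(created)  # ['Base', 'Article']

# Creating instances does not call the metaclass's __new__ again
Article()
Article()
print(created)  # ['Base', 'Article']
```

Subclasses inherit the metaclass automatically, which is why inheriting from Model is enough to get the column bookkeeping in the toy ORM above.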
Also, why write 'models.Model' if the Model class is within models.base?
models/__init__.py imports Model from models/base.py so you don't have to write models.base.Model.
When you create a model class and run
python manage.py makemigrations
Django creates the corresponding migration scripts for creating a table in your database. You can find these scripts in your app's "migrations" folder.
When you then run
python manage.py migrate
these scripts are translated into the appropriate commands and executed on the database by Django.

How do I make class variables/methods created and used only when needed?

I'm working on a model that uses player data. During development, only certain data is needed. My thought was to create a PlayerData class but my amateur mind doesn't understand/know how to do this properly.
I understand this code is basic, but it's just for example...
class PlayerData(object):
    def __init__(self, player_id):
        self.player_id = player_id

    def past_games(self):
        # only if requested, query DB for data
        pass

    def vital_info(self):
        # only if requested, query DB for data
        pass

    def abilities(self):
        # only if requested, query DB for data
        pass

pd = PlayerData(235)
If I call pd.vital_info for the first time, I only want to execute the query at that point. How do I structure this so the requested query is run while the other queries are not (unless needed later on)?
The following code should help you understand how functions of a class are called in Python.
class PlayerData:
    def __init__(self, player_id):
        print("Calling __init__. Yay!")
        self.player_id = player_id

    def past_games(self):
        print("Calling past_games")
        # only if requested, query DB for data

    def vital_info(self):
        print("Calling vital_info")
        # only if requested, query DB for data

    def abilities(self):
        print("Calling Abilities")
        # only if requested, query DB for data
>>> p = PlayerData(1)
Calling __init__. Yay!  # No other method was called, so nothing else is printed
>>> p.past_games()
Calling past_games
>>> p.vital_info()
Calling vital_info
>>> p.abilities()
Calling Abilities
As you can see, the class's methods need to be called explicitly. Only a handful of methods are called automatically when an instance is created. One of them is __init__.
You don't need to do anything at all. If your queries are in individual methods, then only the methods you actually call will be executed.
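If you also want each query to run at most once per object, and to be readable as pd.vital_info without parentheses, one possible sketch uses functools.cached_property (Python 3.8+). The _query_db helper, the queries_run list and the returned dictionary are hypothetical stand-ins for a real database call.

```python
from functools import cached_property


class PlayerData:
    def __init__(self, player_id):
        self.player_id = player_id
        self.queries_run = []  # records which fake queries actually ran

    def _query_db(self, what):
        # Hypothetical stand-in for a real database query
        self.queries_run.append(what)
        return {"player_id": self.player_id, "kind": what}

    @cached_property
    def vital_info(self):
        # Runs on first access only; the result is then stored on the instance
        return self._query_db("vital_info")

    @cached_property
    def past_games(self):
        return self._query_db("past_games")


pd = PlayerData(235)
print(pd.queries_run)   # [] - nothing is queried at construction
first = pd.vital_info   # triggers the vital_info query
second = pd.vital_info  # served from the cached value, no second query
print(pd.queries_run)   # ['vital_info'] - past_games was never queried
```

Plain methods, as in the answer above, are already lazy; the caching only matters if the same data may be requested more than once.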

Django model operating on a queryset

I'm new to Django and somewhat new to Python as well. I'm trying to find the idiomatic way to loop over a queryset and set a variable on each model. Basically, my model depends on a value from an API, and a model method must multiply one of its attributes by this API value to get an up-to-date, correct value.
At the moment I am doing it in the view and it works, but I'm not sure it's the correct way to achieve what I want. I have to replicate this looping elsewhere.
Is there a way I can encapsulate the looping logic into a queryset method so it can be used in multiple places?
NOTE: the variable I am setting on each model instance is just a regular attribute, not saved to db. I just need to be able to set that variable, not save it.
I have this atm (I am using django-rest-framework):
class FooViewSet(viewsets.ModelViewSet):
    model = Foo
    serializer_class = FooSerializer
    bar = ...  # placeholder: some call to an api

    def get_queryset(self):
        # Dynamically set the bar variable on each instance!
        foos = Foo.objects.filter(baz__pk=1).order_by('date')
        for item in foos:
            item.needs_bar = self.bar
        return foos
I would think something like so would be better:
def get_queryset(self):
    bar = ...  # placeholder: some call to an api
    # Dynamically set the bar variable on each instance!
    return Foo.objects.filter(baz__pk=1).order_by('date').set_bar(bar)
I'm thinking the API hit should be in the controller and then injected into instances of the model, but I'm not sure how to do this. I've been looking at querysets and managers but still can't figure it out, or decide whether it's the best way to achieve what I want.
Can anyone suggest the correct way to model this with django?
Thanks.
You can set new attributes on the items of a queryset, but they will not update the database (they exist only on the in-memory objects). I suppose that you want to recalculate a field of your model by multiplying it by some value:
class Foo(models.Model):
    calculated_field = models.BigIntegerField(default=0)

    def save(self, *args, **kwargs):
        # Pop the custom argument so it is not passed on to Model.save()
        foo = kwargs.pop('foo', None)
        if self.pk is not None and foo:  # it's not a new record
            self.calculated_field = self.calculated_field * int(foo)
        super(Foo, self).save(*args, **kwargs)  # Call the "real" save() method.

# In the view:
def get_queryset(self):
    bar = ...  # placeholder: some call to an api
    # Dynamically set the bar variable on each instance!
    foos = Foo.objects.filter(baz__pk=1).order_by('date')
    for item in foos:
        item.save(foo=bar)
    # return updated data
    return Foo.objects.filter(baz__pk=1).order_by('date')
You might need to use transactions if this code can run concurrently.

Automatic GUID key_name in model

I want my model to get a GUID as key_name automatically and I'm using the code below. Is that a good approach to solve it? Does it have any drawbacks?
class SyncModel(polymodel.PolyModel):
    def __init__(self, key_name=None, key=None, **kwargs):
        super(SyncModel, self).__init__(key_name=str(uuid.uuid1()) if not key else None, key=key, **kwargs)
Overriding __init__ on a Model subclass is dangerous, because the constructor is used by the framework to reconstruct instances from the datastore, in addition to being used by user code. Unless you know exactly how the constructor is used to reconstruct existing entities - something which is an internal detail and may change in future - you should avoid overriding it.
Instead, define a factory method, like this:
class MyModel(db.Model):
    @classmethod
    def new(cls, **kwargs):
        return cls(key_name=str(uuid.uuid4()), **kwargs)
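To see why the factory leaves reconstruction alone, here is a framework-free sketch. FakeModel is a hypothetical stand-in for the datastore's db.Model (it is not the App Engine API) and only stores the key_name it is given.

```python
import uuid


class FakeModel:
    """Hypothetical stand-in for a datastore model base class."""

    def __init__(self, key_name=None, **kwargs):
        self.key_name = key_name
        self.properties = kwargs


class MyModel(FakeModel):
    @classmethod
    def new(cls, **kwargs):
        # Only user code calls new(), so only new entities get a fresh GUID
        return cls(key_name=str(uuid.uuid4()), **kwargs)


fresh = MyModel.new(title="hello")
# Parsing succeeds only if key_name is a well-formed UUID string
print(uuid.UUID(fresh.key_name).version)  # 4

# Code that rebuilds an existing entity calls the constructor directly,
# so the stored key_name is passed through untouched
restored = MyModel(key_name="existing-key", title="hello")
print(restored.key_name)  # existing-key
```

Because the constructor itself is unchanged, whatever the framework does with it internally keeps working.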
There is an article by Nick about pre- and post-put hooks, which can be used to set the key_name. I don't know whether your current method is valid, but you should at least be aware of the other options.

How to write a common get_by_id() method for all kinds of models in Sqlalchemy?

I'm using Pylons with SQLAlchemy. I have several models, and I found myself writing code like this again and again:
question = Session.query(Question).filter_by(id=question_id).one()
answer = Session.query(Answer).filter_by(id=answer_id).one()
...
user = Session.query(User).filter_by(id=user_id).one()
Since the models all extend the Base class, is there any way to define a common get_by_id() method?
So that I can use it as:
question = Question.get_by_id(question_id)
answer = Answer.get_by_id(answer_id)
...
user = User.get_by_id(user_id)
If id is your primary key column, you just do:
session.query(Foo).get(id)
which has the advantage of not querying the database if that instance is already in the session.
Unfortunately, SQLAlchemy doesn't allow you to subclass Base without a corresponding table declaration. You could define a mixin class with get_by_id as a classmethod, but then you'd need to specify it for each class.
A quicker-and-dirtier solution is to just monkey-patch it into Base:
def get_by_id(cls, id, session=session):
    return session.query(cls).filter_by(id=id).one()

Base.get_by_id = classmethod(get_by_id)
This assumes you've got a session object available at definition-time, otherwise you'll need to pass it as an argument each time.
class Base(object):
    @classmethod
    def get_by_id(cls, session, id):
        q = session.query(cls).filter_by(id=id)
        return q.one()

Question.get_by_id(Session, question_id)
