I'd like to be notified whenever a model is saved, then do some processing and save another model. I need the model to already have an ID assigned by the database during the processing stage.
With Django one would override the model's .save() method or use signals like:
from django.db.models.signals import post_save
from django.dispatch import receiver
from .models import MyModel, OtherModel

@receiver(post_save, sender=MyModel)
def do_stuff(sender, instance, created, **kwargs):
    assert instance.id is not None
    ...
    OtherModel.objects.create(related=instance, data=...)
How can I do something similar with SQLAlchemy and Flask? I looked at ORM Events and it seemed that the expire InstanceEvent would fit the bill. It seems to fire whenever a model instance is saved, but when I try to do the same kind of thing:
from sqlalchemy import event
from . import db
from .models import MyModel, OtherModel

@event.listens_for(MyModel, "expire")
def do_stuff(target, attrs):
    assert target.id is not None
    ...
    db.session.add(OtherModel(related=target, data=...))
    db.session.commit()
It fails on assert target.id is not None with:
InvalidRequestError: This session is in 'committed' state; no further SQL can be emitted within this transaction.
It might be that I'm just approaching this the wrong way or I'm missing something crucial, but I cannot figure it out. The documentation is split among Flask, Flask-SQLAlchemy and SQLAlchemy, and I have a hard time piecing it together.
How should I make this kind of post-save trigger with SQLAlchemy?
The event you want to listen for is 'after_insert', not 'expire':
@event.listens_for(MyModel, 'after_insert')
def do_stuff(mapper, connection, target):
    assert target.id is not None
    ...
Also, after creating the OtherModel instance inside the listener and calling db.session.add, don't call db.session.commit, as it will raise a ResourceClosedError exception.
Have a look at the accepted answer to this question which gives an example of using SQLAlchemy's after_insert mapper event. It should do what you want, but using raw SQL rather than your session object is recommended.
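As a sketch of that recommendation, the second row can be written through the connection the event hands you instead of the session; the OtherModel column names (related_id, data) are assumptions here:
from sqlalchemy import event

@event.listens_for(MyModel, 'after_insert')
def do_stuff(mapper, connection, target):
    # target.id has been populated by the flush that fired this event;
    # write the second row through the event's connection, not the session
    connection.execute(
        OtherModel.__table__.insert().values(related_id=target.id, data='...')
    )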
I have a Django project with some third-party APIs inside it. I have a script that changes product stocks in my store via a marketplace's API. In my views.py I have a CreateAPIView class that calls the marketplace's API method for getting product stocks and writes them to a MySQL DB. Now I have to add a signal that starts the CreateAPIView class (to fetch and add the changed stock data) immediately after the marketplace's change-stocks method has finished. I know how to add a Django signal with pre_save and post_save, but I don't know how to add a signal on a request.
I found something like this:
from django.core.signals import request_finished
from django.dispatch import receiver

@receiver(request_finished)
def my_callback(sender, **kwargs):
    print("Request finished!")
But it is not what I'm looking for. I need a signal that starts the CreateAPIView class after another API class has finished its request. I will be very thankful for any advice on how to solve this problem.
You could create a Custom Signal, which can be called after the marketplace API is hit.
from django.dispatch import Signal, receiver

custom_signal = Signal(providing_args=["some_data"])
Send the signal when the marketplace API is hit:
def marketplace_api():
    data = 'some_data'
    custom_signal.send(sender=None, some_data=data)
Then simply define a receiver function which will contain the logic you need:
@receiver(custom_signal)
def run_create_api_view(sender, **kwargs):
    data = kwargs["some_data"]
    if data is not None:
        view = CreateAPIView()
        view.dispatch(data)
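Note that providing_args was deprecated in Django 3.0 and removed in 4.0; on current versions the signal is declared without it, for example:
from django.dispatch import Signal

custom_signal = Signal()  # carries "some_data" in its kwargs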
How to do something after a Django user model save, including related changes to m2m fields like django.contrib.auth.models.Group?
Situation
I have a custom Django user model and want to trigger some actions after a user instance, together with related changes such as m2m group memberships, is successfully saved to the database.
The use case here is a Wagtail CMS where I create a ProfilePage for each user instance. Depending on the group memberships of the user instance, I need to do something.
Problem
In a custom model save() method I'm not able to reference the changed group memberships, as m2m relations are saved after the user instance itself. Even when running my custom function after the super().save() call, the new group memberships are not yet available.
But I need to get the new group memberships in order to do something depending of the new groups for that user.
What I've tried
[✘] Custom model save()
# file: users/models.py
class CustomUser(AbstractUser):
    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)
        do_something()
[✘] Signal post_save
As the above simple save() method did not do the trick, I tried the post_save signal of the user model:
# file: users/signals.py
@receiver(post_save, sender=get_user_model())
def handle_profilepage(sender, instance, created, **kwargs):
    action = 'created' if created else 'updated'
    do_something()
... but even here I always get the "old" values from the group membership back.
[✔] Signal: m2m_changed
I learnt that there is an m2m_changed signal with which I can monitor changes to the User.groups(.through) table.
My following implementation did what I need:
@receiver(m2m_changed, sender=User.groups.through)
def user_groups_changed_handler(sender, instance, **kwargs):
    user_groups = instance.groups.values_list('name', flat=True)
    if set(user_groups) & set(settings.PROFILE_GROUPS):
        do_something_because_some_groups_match()
    else:
        do_something_else()
My wishlist
I would happily stay away from signals if there were a chance to solve this problem in the model's save() method - but I'm stuck...
You asked after the dark magic. Tell me, what are you willing to sacrifice to avoid using signals? Would you embrace a greater evil? What if I were to suggest that you could start a new thread and sleep for a few seconds until you thought that the m2m relations were probably saved before doing anything?
from threading import Thread
from time import sleep

# file: users/models.py
class CustomUser(AbstractUser):
    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)

        def do_something(obj):
            sleep(3)  # hope the m2m relations have been saved by then
            # stuff

        thread = Thread(target=do_something, args=[self], daemon=True)
        thread.start()
I'm using Flask, Flask-SQLAlchemy, and Flask-Marshmallow + marshmallow-sqlalchemy, trying to implement a REST API PUT method. I haven't found any tutorial using SQLAlchemy and Marshmallow that implements update.
Here is the code:
class NodeSchema(ma.Schema):
    # ...

class NodeAPI(MethodView):
    decorators = [login_required, ]
    model = Node

    def get_queryset(self):
        if g.user.is_admin:
            return self.model.query
        return self.model.query.filter(self.model.owner == g.user)

    def put(self, node_id):
        json_data = request.get_json()
        if not json_data:
            return jsonify({'message': 'Invalid request'}), 400

        # Here is the part which I can't make work for me
        data, errors = node_schema.load(json_data)
        if errors:
            return jsonify(errors), 422

        queryset = self.get_queryset()
        node = queryset.filter(Node.id == node_id).first_or_404()

        # Here I need some way to update this object
        node.update(data)  #=> raises AttributeError: 'Node' object has no attribute 'update'

        # Also tried:
        # node = queryset.filter(Node.id == node_id)
        # node.update(data)  # <-- It doesn't know if there is any object
        # Wrote a testcase where user1 tries to modify a node of user2. The node
        # doesn't change (OK), but user1 gets status code 200 (NOT OK).

        db.session.commit()
        return jsonify(), 200
UPDATED, 2022-12-08
If you extend ModelSchema from marshmallow-sqlalchemy instead of Flask-Marshmallow, you can use the load method, which is defined like this:
load(data, *, session=None, instance=None, transient=False, **kwargs)
Putting that to use, it should look like this (or a similar query):
node_schema.load(json_data, session=current_app.session, instance=Node.query.get(node_id))
And if you want to load without all the required fields of the Model, you can add the partial=True argument, like this:
node_schema.load(json_data, instance=Node.query.get(node_id), partial=True)
See the docs for more info (they do not include the definition of ModelSchema.load).
See the code for the load definition.
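Putting the pieces together, a PUT handler using this approach could look like the following sketch (node_schema, Node, and db come from the question; error handling is omitted):
node = Node.query.get_or_404(node_id)
node = node_schema.load(json_data, instance=node, partial=True, session=db.session)
db.session.commit()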
I wrestled with this issue for some time and, in consequence, came back again and again to this post. In the end, what made my situation difficult was a confounding issue involving SQLAlchemy sessions. I figure this is common enough with Flask, Flask-SQLAlchemy, SQLAlchemy, and Marshmallow to put down a discussion. I certainly do not claim to be an expert on this, yet I believe what I state below is essentially correct.
The db.session is, in fact, closely tied to the process of updating the DB with Marshmallow, and because of that I decided to give the details, but first the short of it.
Short Answer
Here is the answer I arrived at for updating the database using Marshmallow. It is a different approach from the very helpful post of Jair Perrut. I did look at the Marshmallow API, yet I was unable to get his solution working in the code presented, because at the time I was experimenting with it I was not managing my SQLAlchemy sessions properly. To go a bit further, one might say that I wasn't managing them at all. The model can be updated in the following way:
user_model = user_schema.load(user)
db.session.add(user_model.data)
db.session.commit()
Give session.add() a model with a primary key and it will assume an update; leave the primary key out and a new record is created instead. This isn't all that surprising, since MySQL has an ON DUPLICATE KEY UPDATE clause which performs an update if the key is present and creates a new row if not.
Details
SQLAlchemy sessions are handled by Flask-SQLAlchemy during a request to the application. At the beginning of the request a session is opened, and when the request is closed that session is also closed. Flask provides hooks for setting up and tearing down the application, where code for managing sessions and connections may be placed. In the end, though, the SQLAlchemy session is managed by the developer; Flask-SQLAlchemy just helps. Here is a particular case that illustrates the management of sessions.
Consider a function that gets a user dictionary as an argument and uses it with Marshmallow to load the dictionary into a model. In this case, what is required is not the creation of a new object, but the update of an existing one. There are 2 things to keep in mind at the start:
- The model classes are defined in a Python module separate from the application code, and these models require a session. Often the developer (following the Flask documentation) will put the line db = SQLAlchemy() at the head of this file to meet this requirement. This, in fact, creates a session for the model.
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()
- In some other separate file there may be a need for an SQLAlchemy session as well. For example, the code may need to update the model, or create a new entry, by calling a function there. Here is where one might find db.session.add(user_model) and db.session.commit(). This session is created in the same way as in the bullet point above.
There are 2 SQLAlchemy sessions created. The model sits in one (a SignallingSession) and the module uses its own (a scoped_session). In fact, there are 3: the Marshmallow UserSchema has sqla_session = db.session, so a session is attached to it as well. This, then, is the third, and the details are found in the code below:
from marshmallow_sqlalchemy import ModelSchema
from donate_api.models.donation import UserModel
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

class UserSchema(ModelSchema):
    class Meta(object):
        model = UserModel
        strict = True
        sqla_session = db.session

def some_function(user):
    user_schema = UserSchema()
    user['customer_id'] = '654321'
    user_model = user_schema.load(user)

    # Debug code:
    user_model_query = UserModel.query.filter_by(id=3255161).first()
    print(db.session.object_session(user_model_query))
    print(db.session.object_session(user_model.data))
    print(db.session)

    db.session.add(user_model.data)
    db.session.commit()
    return
At the head of this module the model is imported, which creates its session, and then the module creates its own. Of course, as pointed out, there is also the Marshmallow session. This is entirely acceptable to some degree, because SQLAlchemy allows the developer to manage the sessions. Consider what happens when some_function(user) is called, where user['id'] is assigned some value that exists in the database.
Since the user includes a valid primary key, db.session.add(user_model.data) knows that it is not creating a new row, but updating an existing one. This behavior should not be surprising and is at least somewhat expected, given this from the MySQL documentation:
13.2.5.2 INSERT ... ON DUPLICATE KEY UPDATE Syntax
If you specify an ON DUPLICATE KEY UPDATE clause and a row to be inserted would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row occurs.
The snippet of code is then seen to be updating the customer_id on the dictionary for the user with primary key 3255161. The new customer_id is '654321'. The dictionary is loaded with Marshmallow and a commit is done to the database. Examining the database, it can be found that it was indeed updated. You might try two ways of verifying this:
- In the code: db.session.query(UserModel).filter_by(id=3255161).first()
- In MySQL: select * from user
If you were to consider the following:
- In the code: UserModel.query.filter_by(id=3255161).first().customer_id
You would find that the query brings back None. The model is not synchronized with the database. I had failed to manage the SQLAlchemy sessions correctly. In an attempt to bring clarity to this, consider the output of the print statements when the imports are made separately:
<sqlalchemy.orm.session.SignallingSession object at 0x7f81b9107b90>
<sqlalchemy.orm.session.SignallingSession object at 0x7f81b90a6150>
<sqlalchemy.orm.scoping.scoped_session object at 0x7f81b95eac50>
In this case the UserModel.query session is different from the Marshmallow session. The Marshmallow session is what gets loaded and added. This means that querying the model will not show our changes. In fact, if we do:
db.session.object_session(user_model.data).commit()
The model query will now bring back the updated customer_id! Consider the second alternative where the imports are done through flask_essentials:
from flask_sqlalchemy import SQLAlchemy
from flask_marshmallow import Marshmallow
db = SQLAlchemy()
ma = Marshmallow()
<sqlalchemy.orm.session.SignallingSession object at 0x7f00fe227910>
<sqlalchemy.orm.session.SignallingSession object at 0x7f00fe227910>
<sqlalchemy.orm.scoping.scoped_session object at 0x7f00fed38710>
And the UserModel.query session is now the same as the user_model.data (Marshmallow) session. Now the UserModel.query does reflect the change in the database: the Marshmallow and UserModel.query sessions are the same.
A note: the signalling session is the default session that Flask-SQLAlchemy uses. It extends the default session system with bind selection and modification tracking.
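The practical takeaway, then, is to create a single SQLAlchemy() instance in one module and import it everywhere else, rather than calling SQLAlchemy() per file. A minimal sketch, with the module paths being assumptions:
# file: app/extensions.py -- the one and only SQLAlchemy instance
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

# file: app/schemas.py -- reuse that instance instead of creating another
from marshmallow_sqlalchemy import ModelSchema
from app.extensions import db
from app.models import UserModel

class UserSchema(ModelSchema):
    class Meta(object):
        model = UserModel
        sqla_session = db.session  # now the same session as UserModel.query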
I have rolled out my own solution. Hope it helps someone else. The solution implements an update method on the Node model.
Solution:
class Node(db.Model):
    # ...

    def update(self, **kwargs):
        # For py2 & py3 compatibility use:
        #   from six import iteritems
        #   for key, value in iteritems(kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

class NodeAPI(MethodView):
    decorators = [login_required, ]
    model = Node

    def get_queryset(self):
        if g.user.is_admin:
            return self.model.query
        return self.model.query.filter(self.model.owner == g.user)

    def put(self, node_id):
        json_data = request.get_json()
        if not json_data:
            abort(400)

        data, errors = node_schema.load(json_data)  # validate with marshmallow
        if errors:
            return jsonify(errors), 422

        queryset = self.get_queryset()
        node = queryset.filter(self.model.id == node_id).first_or_404()
        node.update(**data)
        db.session.commit()
        return jsonify(message='Successfully updated'), 200
Latest Update [2020]:
You might be facing the issue of mapping keys to the database models. Your request body has only the updated fields, so you want to change only those without affecting the others. There is an option to write multiple if conditions, but that's not a good approach.
Solution
You can implement patch or put methods using the SQLAlchemy library alone.
For example:
YourModelName.query.filter_by(
    your_model_column_id=12  # change 12: the where condition that finds the particular row
).update(request_data)
request_data should be a dict object, for example:
{
    "your_model_column_name_1": "Hello",
    "your_model_column_name_2": "World",
}
In the above case, only two columns will be updated: your_model_column_name_1 and your_model_column_name_2.
The update function maps request_data to the database model and creates the UPDATE query for you. Check out: https://docs.sqlalchemy.org/en/13/core/dml.html#sqlalchemy.sql.expression.update
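For example, a complete PATCH endpoint built on this pattern might look like the sketch below (the route, the Node model, and the db handle are assumptions; note that .update() only accepts keys that match mapped columns):
from flask import jsonify, request

@app.route('/nodes/<int:node_id>', methods=['PATCH'])
def patch_node(node_id):
    request_data = request.get_json()
    # translates the dict directly into UPDATE node SET ... WHERE id = :id
    Node.query.filter_by(id=node_id).update(request_data)
    db.session.commit()
    return jsonify(message='Updated'), 200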
The previous answer seems to be outdated, as ModelSchema is now deprecated.
You should instead use SQLAlchemyAutoSchema with the proper options.
class NodeSchema(SQLAlchemyAutoSchema):
    class Meta:
        model = Node
        load_instance = True
        sqla_session = db.session

node_schema = NodeSchema()

# then when you need to update a Node orm instance:
node_schema.load(node_data, instance=node, partial=True)
db.session.commit()
Below is my solution with the Flask-Marshmallow + marshmallow-sqlalchemy bundle, as the author requested initially.
schemas.py
from flask import current_app
from flask_marshmallow import Marshmallow
from app.models import Node

ma = Marshmallow(current_app)

class NodeSchema(ma.SQLAlchemyAutoSchema):
    class Meta:
        model = Node
        load_instance = True
load_instance is the key point here; it is what makes the update possible.
routes.py
from flask import jsonify, request
from marshmallow import ValidationError

from app import db

@bp.route("/node/<node_uuid>/edit", methods=["POST"])
def edit_node(node_uuid):
    json_data = request.get_json(force=True, silent=True)
    node = Node.query.filter_by(
        node_uuid=node_uuid
    ).first()
    if node:
        try:
            schema = NodeSchema()
            json_data["node_uuid"] = node_uuid
            node = schema.load(json_data, instance=node)
            db.session.commit()
            return schema.jsonify(node)
        except ValidationError as err:
            return jsonify(err.messages), 422
    else:
        return jsonify("Not found"), 404
You have to check for the existence of the Node first; otherwise a new instance will be created.
I have a Python module, UserManager, that takes care of all things user-management related: users, groups, rights, authentication. Access to these assets is provided via a master class that is passed an SQLAlchemy engine parameter at its constructor. The engine is needed to make the table-class mappings (using mapper objects) and to emit sessions.
This is how the global variables are established in the app module:
class UserManager:
    def __init__(self, db):
        self.db = db
        self._db_session = None
        meta = MetaData(db)
        user_table = Table(
            'USR_User', meta,
            Column('field1'),
            Column('field3')
        )
        mapper(User, user_table)

    @property
    def db_session(self):
        if self._db_session is None:
            self._db_session = scoped_session(sessionmaker())
            self._db_session.configure(bind=self.db)
        return self._db_session

class User(object):
    def __init__(self, um):
        self.um = um

from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy(app)
um = UserManager(db.engine)
This module as such is designed to be context-agnostic on purpose, so that it can be used both for locally run and web applications.
But here the problems arise: from time to time I get the dreaded "Can't reconnect until invalid transaction is rolled back" error, presumably caused by some failed transaction in the UserManager code.
I am now trying to identify the source of the problem. Maybe this is not the right way to handle the database in the dynamic context of a web server? Perhaps I have to pass db.session to the um object so that I can be sure the db connections are not mixed up?
In a web context you should consider each user's request isolated. For this you must use flask.g:
To share data that is valid for one request only from one function to another, a global variable is not good enough because it would break in threaded environments. Flask provides you with a special object that ensures it is only valid for the active request and that will return different values for each request. In a nutshell: it does the right thing, like it does for request and session.
You can read more about it here.
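A minimal sketch of that idea, reusing the UserManager from the question: build the per-request objects lazily on flask.g and dispose of them when the request context tears down (the helper names are assumptions):
from flask import g

def get_user_manager():
    # one UserManager, and therefore one session, per request
    if 'um' not in g:
        g.um = UserManager(db.engine)
    return g.um

@app.teardown_appcontext
def cleanup_user_manager(exc):
    um = g.pop('um', None)
    if um is not None and um._db_session is not None:
        um._db_session.remove()  # release the scoped session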
There are many Stack Overflow posts about recursion using the post_save signal, to which the comments and answers are overwhelmingly: "why not override save()" or a save that is only fired upon created == True.
Well I believe there's a good case for not using save() - for example, I am adding a temporary application that handles order fulfillment data completely separate from our Order model.
The rest of the framework is blissfully unaware of the fulfillment application and using post_save hooks isolates all fulfillment related code from our Order model.
If we drop the fulfillment service, nothing about our core code has to change. We delete the fulfillment app, and that's it.
So, are there any decent methods to ensure the post_save signal doesn't fire the same handler twice?
You can use update instead of save in the signal handler:
queryset.filter(pk=instance.pk).update(....)
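For example, a full handler following that advice (the Order model comes from the question; the processed field is a placeholder):
from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=Order)
def mark_processed(sender, instance, created, **kwargs):
    # .update() issues a plain UPDATE and does not send post_save again
    Order.objects.filter(pk=instance.pk).update(processed=True)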
What do you think about this solution?
@receiver(post_save, sender=Article)
def generate_thumbnails(sender, instance=None, created=False, **kwargs):
    if not instance:
        return
    if hasattr(instance, '_dirty'):
        return
    do_something()
    try:
        instance._dirty = True
        instance.save()
    finally:
        del instance._dirty
You can also create a decorator:
from functools import wraps

def prevent_recursion(func):
    @wraps(func)
    def no_recursion(sender, instance=None, **kwargs):
        if not instance:
            return
        if hasattr(instance, '_dirty'):
            return
        func(sender, instance=instance, **kwargs)
        try:
            instance._dirty = True
            instance.save()
        finally:
            del instance._dirty
    return no_recursion
@receiver(post_save, sender=Article)
@prevent_recursion
def generate_thumbnails(sender, instance=None, created=False, **kwargs):
    do_something()
Don't disconnect signals. If any new instance of the same model is created while the signal is disconnected, the handler function won't be fired. Signals are global across Django, and several requests can run concurrently, so some would fail while others run their post_save handler.
I think creating a save_without_signals() method on the model is more explicit:
class MyModel(models.Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._disable_signals = False

    def save_without_signals(self):
        """
        This allows for updating the model from code running inside post_save()
        signals without going into an infinite loop.
        """
        self._disable_signals = True
        self.save()
        self._disable_signals = False

def my_model_post_save(sender, instance, *args, **kwargs):
    if not instance._disable_signals:
        ...  # Execute the code here.

post_save.connect(my_model_post_save, sender=MyModel)
How about disconnecting and then reconnecting the signal within your post_save function:
def my_post_save_handler(sender, instance, **kwargs):
    post_save.disconnect(my_post_save_handler, sender=sender)
    instance.do_stuff()
    instance.save()
    post_save.connect(my_post_save_handler, sender=sender)

post_save.connect(my_post_save_handler, sender=Order)
You should use queryset.update() instead of Model.save(), but you need to take care of something else: if you want to use the updated object afterwards, you have to fetch it again, because .update() does not change the object you already hold in memory. For example:
>>> MyModel.objects.create(pk=1, text='')
>>> el = MyModel.objects.get(pk=1)
>>> MyModel.objects.filter(pk=1).update(text='Updated')
>>> el.text
''
So, if you want to use the updated object you should fetch it again:
>>> MyModel.objects.create(pk=1, text='')
>>> el = MyModel.objects.get(pk=1)
>>> MyModel.objects.filter(pk=1).update(text='Updated')
>>> el = MyModel.objects.get(pk=1)  # Fetch it again
>>> el.text
'Updated'
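On Django 1.8+ there is also refresh_from_db(), which reloads the existing instance in place instead of querying for a new one:
>>> el.refresh_from_db()
>>> el.text
'Updated'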
You could also check the raw argument in post_save and then call save_base instead of save.
The model's .objects.update() method bypasses the post_save signal.
Try something like this:
from django.db import models
from django.db.models.signals import post_save

class MyModel(models.Model):
    name = models.CharField(max_length=200)
    num_saves = models.PositiveSmallIntegerField(default=0)

    @classmethod
    def post_save(cls, sender, instance, created, *args, **kwargs):
        MyModel.objects.filter(id=instance.id).update(num_saves=instance.num_saves + 1)

post_save.connect(MyModel.post_save, sender=MyModel)
In this example, an object has a name, and each time .save() is called the .num_saves field is incremented, but without recursion.
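As a design note, an F() expression would make the increment atomic on the database side, avoiding a read-then-write race between concurrent saves (same model as above):
from django.db.models import F

MyModel.objects.filter(id=instance.id).update(num_saves=F('num_saves') + 1)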
Check this out...
Each signal has its own benefits, as you can read in the docs here, but I wanted to share a couple of things to keep in mind with the pre_save and post_save signals.
Both are called every time .save() on a model is called. In other words, if you save the model instance, the signals are sent.
Running save() on the instance within a post_save handler can often create a never-ending loop and therefore cause a "maximum recursion depth exceeded" error; this happens only if you don't use .save() correctly.
pre_save is great for changing just instance data, because you never have to call save(), which eliminates the possibility above. The reason you don't have to call save() is that a pre_save signal literally fires right before the instance is saved.
Signals can call other signals and/or run delayed tasks (e.g. for Celery), which can be huge for usability.
Source: https://www.codingforentrepreneurs.com/blog/post-save-vs-pre-save-vs-override-save-method/
Regards!!
In a Django post_save signal, an 'if created' check is required to avoid recursion:
from django.dispatch import receiver
from django.db.models.signals import post_save

@receiver(post_save, sender=DemoModel)
def _post_save_receiver(sender, instance, created, **kwargs):
    if created:
        print('hi..')
        instance.save()
I was using the save_without_signals() method by @Rune Kaagaard until I updated my Django to 4.1. On Django 4.1 this method started raising an IntegrityError on the database that gave me 4 days of headaches and that I couldn't fix.
So I started to use the queryset.update() method and it worked like a charm. It doesn't trigger pre_save() or post_save(), and you don't need to override the save() method of your model. One line of code:
@receiver(pre_save, sender=Your_model)
def any_name(sender, instance, **kwargs):
    Your_model.objects.filter(pk=instance.pk).update(model_attribute=any_value)