force object to be `dirty` in sqlalchemy - python

Is there a way to force an object mapped by sqlalchemy to be considered dirty? For example, given the context of sqlalchemy's Object Relational Tutorial the problem is demonstrated,
a=session.query(User).first()
a.__dict__['name']='eh'
session.dirty
yielding,
IdentitySet([])
i am looking for a way to force the user a into a dirty state.
This problem arises because the class that is mapped using sqlalchemy takes control of the attribute getter/setter methods, and this preventing sqlalchemy from registering changes.

I came across the same problem recently and it was not obvious.
Objects themselves are not dirty, but their attributes are. As SQLAlchemy will write back only changed attributes, not the whole object, as far as I know.
If you set an attribute using set_attribute and it is different from the original attribute data, SQLAlchemy founds out the object is dirty (TODO: I need details how it does the comparison):
from sqlalchemy.orm.attributes import set_attribute
set_attribute(obj, data_field_name, data)
If you want to mark the object dirty regardless of the original attribute value, no matter if it has changed or not, use flag_modified:
from sqlalchemy.orm.attributes import flag_modified
flag_modified(obj, data_field_name)

The flag_modified approach works if one know that attribute have a value present. SQLAlchemy documentation states:
Mark an attribute on an instance as ‘modified’.
This sets the ‘modified’ flag on the instance and establishes an
unconditional change event for the given attribute. The attribute must
have a value present, else an InvalidRequestError is raised.
Starting with version 1.2, if one wants to mark an entire instance then flag_dirty is the solution:
Mark an instance as ‘dirty’ without any specific attribute mentioned.

Related

DetachedInstanceError: SQLAlchemy wants to refresh the DateTime attribute of an expunged instance

I'm using a session with autocommit=True and expire_on_commit=False. I use the session to get an object A with a foreign key that points to an object B. I then call session.expunge(a.b); session.expunge(a).
Later, when trying to read the value of b.some_datetime, SQLAlchemy raises a DetachedInstanceError. No attribute has been configured for lazy-loading. The error happens randomly.
How is this possible? I assumed that all scalar attributes would be eagerly loaded and available after the object is expunged.
For what it's worth, the objects get expunged so they can be used in another thread, after all interactions with the database are over.
One of the mapped class's fields had an onupdate attribute, which caused it to expire whenever the object is changed.
The solution is to call session.refresh(myobj) between the flush and the call to session.expunge().

SQLAlchemy introspection of declarative classes

I'm writing a small sqlalchemy shim to export data from a MySQL database with some lightweight data transformations—mostly changing field names. My current script works fine but requires me to essentially describe my model twice—once in the class declaration and once as a list of field names to iterate over.
I'm trying to figure out how to use introspection to identify properties on row-objects that are column accessors. The following works almost perfectly:
for attr, value in self.__class__.__dict__.iteritems():
if isinstance(value, sqlalchemy.orm.attributes.InstrumentedAttribute):
self.__class__._columns.append(attr)
except that my to-many relation accessors are also instances of sqlalchemy.orm.attributes.InstrumentedAttribute, and I need to skip those. Is there any way to distinguish between the two while I am inspecting the class dictionary?
Most of the documentation I'm finding on sqlalchemy introspection involves looking at metadata.table, but since I'm renaming columns, that data isn't trivially mappable.
The Mapper of each mapped entity has an attribute columns with all column definitions. For example, if you have a declarative class User you can access the mapper with User.__mapper__ and the columns with:
list(User.__mapper__.columns)
Each column has several attributes, including name (which might not be the same as the mapped attribute named key), nullable, unique and so on...
I'd still like to see an answer to this question, but I've worked around it by name-mangling the relationship accessors (e.g. '_otherentity' instead of 'otherentity') and then filtering on the name. Works fine for my purposes.
An InstrumentedAttribute instance has an an attribute called impl that is in practice a ScalarAttributeImpl, a ScalarObjectAttributeImpl, or a CollectionAttributeImpl.
I'm not sure how brittle this is, but I just check which one it is to determine whether an instance will ultimately return a list or a single object.

SqlAlchemy optimizations for read-only object models

I have a complex network of objects being spawned from a sqlite database using sqlalchemy ORM mappings. I have quite a few deeply nested:
for parent in owner.collection:
for child in parent.collection:
for foo in child.collection:
do lots of calcs with foo.property
My profiling is showing me that the sqlalchemy instrumentation is taking a lot of time in this use case.
The thing is: I don't ever change the object model (mapped properties) at runtime, so once they are loaded I don't NEED the instrumentation, or indeed any sqlalchemy overhead at all. After much research, I'm thinking I might have to clone a 'pure python' set of objects from my already loaded 'instrumented objects', but that would be a pain.
Performance is really crucial here (it's a simulator), so maybe writing those layers as C extensions using sqlite api directly would be best. Any thoughts?
If you reference a single attribute of a single instance lots of times, a simple trick is to store it in a local variable.
If you want a way to create cheap pure python clones, share the dict object with the original object:
class CheapClone(object):
def __init__(self, original):
self.__dict__ = original.__dict__
Creating a copy like this costs about half of the instrumented attribute access and attribute lookups are as fast as normal.
There might also be a way to have the mapper create instances of an uninstrumented class instead of the instrumented one. If I have some time, I might take a look how deeply ingrained is the assumption that populated instances are of the same type as the instrumented class.
Found a quick and dirty way that seems to at least somewhat work on 0.5.8 and 0.6. Didn't test it with inheritance or other features that might interact badly. Also, this touches some non-public API's, so beware of breakage when changing versions.
from sqlalchemy.orm.attributes import ClassManager, instrumentation_registry
class ReadonlyClassManager(ClassManager):
"""Enables configuring a mapper to return instances of uninstrumented
classes instead. To use add a readonly_type attribute referencing the
desired class to use instead of the instrumented one."""
def __init__(self, class_):
ClassManager.__init__(self, class_)
self.readonly_version = getattr(class_, 'readonly_type', None)
if self.readonly_version:
# default instantiation logic doesn't know to install finders
# for our alternate class
instrumentation_registry._dict_finders[self.readonly_version] = self.dict_getter()
instrumentation_registry._state_finders[self.readonly_version] = self.state_getter()
def new_instance(self, state=None):
if self.readonly_version:
instance = self.readonly_version.__new__(self.readonly_version)
self.setup_instance(instance, state)
return instance
return ClassManager.new_instance(self, state)
Base = declarative_base()
Base.__sa_instrumentation_manager__ = ReadonlyClassManager
Usage example:
class ReadonlyFoo(object):
pass
class Foo(Base, ReadonlyFoo):
__tablename__ = 'foo'
id = Column(Integer, primary_key=True)
name = Column(String(32))
readonly_type = ReadonlyFoo
assert type(session.query(Foo).first()) is ReadonlyFoo
You should be able to disable lazy loading on the relationships in question and sqlalchemy will fetch them all in a single query.
Try using a single query with JOINs instead of the python loops.

SQLAlchemy: Shallow copy avoiding lazy loading

I'm trying to automatically build a shallow copy of a SA-mapped
object.. At the moment my function is just:
newobj = src.__class__()
for prop in class_mapper(src.__class__).iterate_properties:
setattr(newobj, prop.key, getattr(src, prop.key))
but I'm having troubles with lazy relations... Obviously getattr
triggers the lazy loading, but since I don't need their values right
away, I'd like to just copy the "this should be lazy loaded"-state of
the attribute... Is this possible?
Edit: I need this for a "data logging" system.. That is, whenever someone updates a persisted entity, I have to generate a new record and then mark the old one as such.
In order to do this I create a shallow copy of the entity (so SQLA issues an INSERT instead of an UPDATE) and work from there..
The system works quite nicely (it's been in production use for months) but now I'd like to enhance it so that it won't need that all the relations get lazy-loaded first..
What you need is to copy column properties only, which can be easily filtered using isinstance(prop, sqlalchemy.orm.ColumnProperty). Note, that you HAVE to copy externally stored relations (all many-to-many), since there is no columns corresponding to them in the main table. This can't be done with high-level interface without lazy-loading, so I'd prefer to accept this trade-off. Many-to-many relations can be determined with isinstance(prop, RelationProperty) and prop.secondary test. The resulting code will look like the following:
from sqlalchemy.orm import object_mapper, ColumnProperty, RelationProperty
newobj = type(src)()
for prop in object_mapper(src).iterate_properties:
if (isinstance(prop, ColumnProperty) or
isinstance(prop, RelationProperty) and prop.secondary):
setattr(newobj, prop.key, getattr(src, prop.key))
Also note, that SQLAlchemy is designed to maintain single object loaded for each identity, while your copy breaks this when identity (primary key) properties are copied too, but this is probably not your case if your are storing with new (versioned) identifier.

How do I get the key value of a db.ReferenceProperty without a database hit?

Is there a way to get the key (or id) value of a db.ReferenceProperty, without dereferencing the actual entity it points to? I have been digging around - it looks like the key is stored as the property name preceeded with an _, but I have been unable to get any code working. Examples would be much appreciated. Thanks.
EDIT: Here is what I have unsuccessfully tried:
class Comment(db.Model):
series = db.ReferenceProperty(reference_class=Series);
def series_id(self):
return self._series
And in my template:
more
The result:
more
Actually, the way that you are advocating accessing the key for a ReferenceProperty might well not exist in the future. Attributes that begin with '_' in python are generally accepted to be "protected" in that things that are closely bound and intimate with its implementation can use them, but things that are updated with the implementation must change when it changes.
However, there is a way through the public interface that you can access the key for your reference-property so that it will be safe in the future. I'll revise the above example:
class Comment(db.Model):
series = db.ReferenceProperty(reference_class=Series);
def series_id(self):
return Comment.series.get_value_for_datastore(self)
When you access properties via the class it is associated, you get the property object itself, which has a public method that can get the underlying values.
You're correct - the key is stored as the property name prefixed with '_'. You should just be able to access it directly on the model object. Can you demonstrate what you're trying? I've used this technique in the past with no problems.
Edit: Have you tried calling series_id() directly, or referencing _series in your template directly? I'm not sure whether Django automatically calls methods with no arguments if you specify them in this context. You could also try putting the #property decorator on the method.

Categories